Ruby "Speedup" hints?

Sorry for replying so soon but I had to :wink:

"Sounds like time to repost (yet again, can't we add this to a FAQ
somewhere?)

http://rubygarden.org/Ruby/page/show/RubyOptimization"

I think collecting it in one place would be best for both
newcomers and old rubyistas.

Personally I think all useful things (to a general user
base of ruby) should be collected at the official homepage - or,
in case it would not qualify, there could be links.

External servers should be somewhat reliable too though.
The following result rather supports my point when I try to
visit the above URL ... :wink:

"Proxy Error

The proxy server received an invalid response from an upstream server.
The proxy server could not handle the request GET
/Ruby/page/show/RubyOptimization.

Reason: Error reading from remote server

Apache/2.2.3 (Unix) DAV/2 PHP/5.1.6 SVN/1.1.4 Server at rubygarden.org
Port 80"

Regards

···

--
Posted via http://www.ruby-forum.com/.

The problem with simple truths is that they often aren't simple, and almost
never stay true. There was/is a large percentage of hackers in the
C/C++ community who learned certain "simple truths" about how to make
programs go faster - and continued to use them religiously for
decades, even after the compilers had made those techniques irrelevant
or even detrimental. I believe the same has happened in Java, as
JITing and other optimizations have made a lot of assumptions about
performance moot.

The other problem with "simple truths" is that in performance, it
really isn't true that every little bit helps. I can't tell you how
many times I've seen a design which was complicated and obfuscated by
some coder's (potentially superstitious) beliefs about certain
techniques being faster than others (e.g. "always avoid dynamic
dispatch"). Almost invariably it turns out that the real performance
gains to be made are algorithmic and/or have to do with IO, and are
orders of magnitude larger than tiny gains made by following
performance "truths" - to the point that the latter gains are lost in
the statistical noise. It's often the case that those little
rule-based optimizations only served to make the code more difficult
to refactor, and thus harder to apply the real optimizations to.

At a time in Ruby's history when its implementation is in flux, and
there are multiple alternate implementations coming on line, I think
it is very dangerous to start compiling "rules of thumb" or "simple
truths" about Ruby performance hacks. And when such rules are stated,
they need to be quantified with benchmarks and qualified with very
specific information about the platforms those benchmarks were
gathered on.

...and then you should *still* write your code to be clear and
well-factored, and put off any optimizations until after you profile.
Chances are you'll discover that your real slowdown is in IO, or in
some database interaction, not in instantiating objects.
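
A minimal sketch of the "measure first" advice, assuming nothing beyond Ruby's standard benchmark library. The methods build_report, slow_io and time_stages are hypothetical stand-ins, not code from this thread; the sleep only simulates waiting on IO or a database.

```ruby
require 'benchmark'

# Hypothetical CPU-side stage: builds some rows in memory.
def build_report(n)
  (1..n).map { |i| "row #{i}" }
end

# Hypothetical IO-side stage: the sleep stands in for a slow disk or database.
def slow_io(rows)
  sleep 0.05
  rows.size
end

# Time each stage separately so the slow one is measured, not guessed.
def time_stages(n)
  rows = nil
  timings = {}
  timings[:build] = Benchmark.realtime { rows = build_report(n) }
  timings[:io]    = Benchmark.realtime { slow_io(rows) }
  timings
end

time_stages(1_000).each do |stage, seconds|
  printf("%-6s %.4fs\n", stage, seconds)
end
```

Only once the numbers point at a stage does it make sense to start tuning it.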

I saw a pertinent quote the other day: "It is easier to optimize
correct code than to correct optimized code." --Bill Harlan

···

On Tue, Mar 18, 2008 at 6:43 AM, Marc Heiler <shevegen@linuxmail.org> wrote:

Personally I like such simple truths. :slight_smile:

--
Avdi

The difference is even more stark on 1.9:

cout@bean:~/tmp$ time ruby1.9 a.rb 10_000_000 > /dev/null

real 0m48.103s
user 0m44.203s
sys 0m0.240s
cout@bean:~/tmp$ time ruby1.9 c.rb 10_000_000 > /dev/null

real 0m10.934s
user 0m10.425s
sys 0m0.076s

Paul

···

On Wed, Mar 19, 2008 at 10:25:32PM +0900, Paul Brannan wrote:

cout@bean:~/tmp$ time ruby a.rb 10_000_000 > /dev/null

real 0m27.841s
user 0m25.694s
sys 0m0.112s
cout@bean:~/tmp$ time ruby b.rb 10_000_000 > /dev/null

real 0m54.888s
user 0m11.333s
sys 0m0.504s
cout@bean:~/tmp$ time ruby c.rb 10_000_000 > /dev/null

real 0m13.141s
user 0m11.749s
sys 0m0.100s

>>
>>>> Anyone of you has a few hints on how to speed up ruby code?
>>>> (Note - not writing it, but running it :wink: )
>>>>
>>>> I only have a few hints, like 5... would love to extend it.
>>>
>>> When generating text output, using StringIO is faster than using
>>> puts
>>
>> This doesn't make sense. One can easily call #puts on a StringIO
>> object.
>>
>> Do you have a concrete example/benchmark that demonstrates the
>> difference?
>>
>> Paul
>>
>>
>>
>
> $ time ruby a.rb 10_000_000 > /tmp/a && sleep 3 && time ruby b.rb
> 10_000_000 > /tmp/b; printf "\a"
>
> real 1m45.305s
> user 1m27.581s
> sys 0m17.715s
>
> real 0m59.049s
> user 0m41.984s
> sys 0m16.997s
> $ cat a.rb
> times = ARGV[0].to_i
> times.times { puts "hola mundo" }
> $ cat b.rb
> require 'stringio'
>
> times = ARGV[0].to_i
> output = StringIO.new
> times.times { output.write("hola mundo\n") }
> output.rewind
> print output.read
>
>
> The difference increases in real world reports.
> --
> Gerardo Santana
>

Do you not realize that this is fundamentally telling you your
terminal is super-slow? This has nothing to do with ruby itself,
really. Try it on some other platforms and tool stacks, for goodness
sake.

For goodness sake, read before replying. I'm not using any terminal.

Oh, and this is only really true for short strings and relatively
small amounts of memory. Try doing that whilst say, stream editing a
1TB file.

As I already said, this has proven true in real world reports.

This whole direction of thought is broken, honestly.

Fine. Don't take the hint.

···

On 3/19/08, James Tucker <jftucker@gmail.com> wrote:

On 18 Mar 2008, at 05:40, Gerardo Santana Gómez Garrido wrote:
> On Mon, Mar 17, 2008 at 6:08 PM, Paul Brannan <pbrannan@atdesk.com> wrote:
>> On Tue, Mar 18, 2008 at 02:17:22AM +0900, Gerardo Santana Gómez Garrido wrote:
>>> On Mon, Mar 17, 2008 at 12:42 PM, Marc Heiler <shevegen@linuxmail.org> wrote:

--
Gerardo Santana

Replace output.write("hola mundo\n") with output.puts("hola mundo")
and you'll get something similar.

···

On 3/19/08, Paul Brannan <pbrannan@atdesk.com> wrote:

On Tue, Mar 18, 2008 at 02:40:43PM +0900, Gerardo Santana Gómez Garrido wrote:
> $ time ruby a.rb 10_000_000 > /tmp/a && sleep 3 && time ruby b.rb
> 10_000_000 > /tmp/b; printf "\a"
>
> real 1m45.305s
> user 1m27.581s
> sys 0m17.715s
>
> real 0m59.049s
> user 0m41.984s
> sys 0m16.997s
> $ cat a.rb
> times = ARGV[0].to_i
> times.times { puts "hola mundo" }
> $ cat b.rb
> require 'stringio'
>
> times = ARGV[0].to_i
> output = StringIO.new
> times.times { output.write("hola mundo\n") }
> output.rewind
> print output.read

I think you are demonstrating the difference between IO#puts and
IO#write (which still surprised me).
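
To separate the two effects being discussed (buffering in a StringIO versus puts versus write), here is a small sketch that holds the StringIO constant and varies only the method. The names via_puts and via_write are made up for illustration, and the iteration count is arbitrary; timings will differ by Ruby version and platform.

```ruby
require 'stringio'
require 'benchmark'

# Build the output with puts, which appends the newline itself.
def via_puts(n)
  out = StringIO.new
  n.times { out.puts "hola mundo" }
  out.string
end

# Build the same output with write and an explicit newline.
def via_write(n)
  out = StringIO.new
  n.times { out.write "hola mundo\n" }
  out.string
end

n = 200_000
Benchmark.bm(6) do |x|
  x.report("puts")  { via_puts(n) }
  x.report("write") { via_write(n) }
end
```

Both methods produce identical bytes, so any gap in the report comes from the call itself rather than from where the output goes.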

--
Gerardo Santana

Guys,

Given that we are talking about optimization, a big part of it is optimizing
page-download times. I am told that this can be partly accomplished using
the following -

1. Strip spaces, tabs, CR/LF from the HTML
2. Enable HTTP Compression

Is there a way in RoR w/ Mongrel to accomplish the two?
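
Not a Rails or Mongrel recipe, but a plain-Ruby sketch of what the two steps do to a page, assuming only the standard zlib library. The helper squeeze_html and the sample page are hypothetical; the regexp is deliberately naive and would damage whitespace-sensitive markup such as <pre> blocks.

```ruby
require 'zlib'

# Hypothetical helper: drop whitespace runs that sit between two tags.
# Naive on purpose; it does not know about <pre>, <textarea> or inline text.
def squeeze_html(html)
  html.gsub(/>\s+</, "><").strip
end

# Hypothetical helper: deflate a string the way HTTP compression would.
def deflate_html(html)
  Zlib::Deflate.deflate(html)
end

page = "<html>\n  <body>\n" +
       ("    <p>hola mundo</p>\n" * 200) +
       "  </body>\n</html>\n"

squeezed = squeeze_html(page)
deflated = deflate_html(squeezed)

puts "original: #{page.bytesize} bytes"
puts "squeezed: #{squeezed.bytesize} bytes"
puts "deflated: #{deflated.bytesize} bytes"
```

On repetitive markup like this the compression step tends to matter far more than the whitespace stripping, which is worth measuring on real pages before doing either.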

Rajat

···

On Tue, Mar 18, 2008 at 7:30 AM, Avdi Grimm <avdi@avdi.org> wrote:

On Tue, Mar 18, 2008 at 6:43 AM, Marc Heiler <shevegen@linuxmail.org> wrote:
> Personally I like such simple truths. :slight_smile:

The problem with simple truths is that they often aren't simple, and almost
never stay true. There was/is a large percentage of hackers in the
C/C++ community who learned certain "simple truths" about how to make
programs go faster - and continued to use them religiously for
decades, even after the compilers had made those techniques irrelevant
or even detrimental. I believe the same has happened in Java, as
JITing and other optimizations have made a lot of assumptions about
performance moot.

The other problem with "simple truths" is that in performance, it
really isn't true that every little bit helps. I can't tell you how
many times I've seen a design which was complicated and obfuscated by
some coder's (potentially superstitious) beliefs about certain
techniques being faster than others (e.g. "always avoid dynamic
dispatch"). Almost invariably it turns out that the real performance
gains to be made are algorithmic and/or have to do with IO, and are
orders of magnitude larger than tiny gains made by following
performance "truths" - to the point that the latter gains are lost in
the statistical noise. It's often the case that those little
rule-based optimizations only served to make the code more difficult
to refactor, and thus harder to apply the real optimizations to.

At a time in Ruby's history when its implementation is in flux, and
there are multiple alternate implementations coming on line, I think
it is very dangerous to start compiling "rules of thumb" or "simple
truths" about Ruby performance hacks. And when such rules are stated,
they need to be quantified with benchmarks and qualified with very
specific information about the platforms those benchmarks were
gathered on.

...and then you should *still* write your code to be clear and
well-factored, and put off any optimizations until after you profile.
Chances are you'll discover that your real slowdown is in IO, or in
some database interaction, not in instantiating objects.

I saw a pertinent quote the other day: "It is easier to optimize
correct code than to correct optimized code." --Bill Harlan

--
Avdi

--
Rajat Garg

Ph: 206-499-9495
Add: 1314 Spring Street, #412
Seattle, WA 98104
Web: http://www.pilotoutlook.com
-----------------------------------------------------------------------------------------------------
Flying is the second greatest thrill known to man. Landing is the first!

Personally I like such simple truths. :slight_smile:

The problem with simple truths is that they often aren't simple, and almost
never stay true. There was/is a large percentage of hackers in the
C/C++ community who learned certain "simple truths" about how to make
programs go faster - and continued to use them religiously for
decades, even after the compilers had made those techniques irrelevant
or even detrimental. I believe the same has happened in Java, as
JITing and other optimizations have made a lot of assumptions about
performance moot.

Yes, that's true. But I'm inclined to say that even in Java there is a single operation that is always the most expensive (not counting IO of course) and this is object creation. The reason is fairly simple: object creation has significant overhead (memory allocation or GC, management overhead for GC) which won't easily go away - although Sun has done a tremendous job in improving this throughout the course of JVM evolution!
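
For the Ruby side of that claim, allocations can be counted rather than assumed. A rough sketch, assuming MRI: GC.stat(:total_allocated_objects) is MRI-specific and other implementations may not report it. The helper allocations and the constant PIECE are made up for illustration.

```ruby
# Frozen constant so the literal itself is not re-allocated on every pass.
PIECE = "x".freeze

# Hypothetical helper: how many objects did the block allocate? (MRI only)
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# String#+ returns a brand-new String on every iteration.
def concat_allocations(n)
  allocations do
    s = String.new
    n.times { s = s + PIECE }
  end
end

# String#<< appends to the existing String in place.
def append_allocations(n)
  allocations do
    s = String.new
    n.times { s << PIECE }
  end
end

n = 10_000
puts "String#+  : #{concat_allocations(n)} objects allocated"
puts "String#<< : #{append_allocations(n)} objects allocated"
```

Whether the difference shows up in wall-clock time is a separate question that only a benchmark on the target Ruby version and platform can answer.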

The other problem with "simple truths" is that in performance, it
really isn't true that every little bit helps. I can't tell you how
many times I've seen a design which was complicated and obfuscated by
some coder's (potentially superstitious) beliefs about certain
techniques being faster than others (e.g. "always avoid dynamic
dispatch"). Almost invariably it turns out that the real performance
gains to be made are algorithmic and/or have to do with IO, and are
orders of magnitude larger than tiny gains made by following
performance "truths" - to the point that the latter gains are lost in
the statistical noise. It's often the case that those little
rule-based optimizations only served to make the code more difficult
to refactor, and thus harder to apply the real optimizations to.

I could not agree more. In fact, I said the same - just not as well elaborated as you did. :slight_smile:

At a time in Ruby's history when its implementation is in flux, and
there are multiple alternate implementations coming on line, I think
it is very dangerous to start compiling "rules of thumb" or "simple
truths" about Ruby performance hacks.

Simple truths are so compelling because they are so simple - but they often are deceptive, too. Just think about the numerous articles about Java performance (e.g. [1]). Yes, people need to take them with a huge grain of salt - and such rules should at least state the Ruby version and platform.

And when such rules are stated,
they need to be quantified with benchmarks and qualified with very
specific information about the platforms those benchmarks were
gathered on.

Good point.

...and then you should *still* write your code to be clear and
well-factored, and put off any optimizations until after you profile.
Chances are you'll discover that your real slowdown is in IO, or in
some database interaction, not in instantiating objects.

I saw a pertinent quote the other day: "It is easier to optimize
correct code than to correct optimized code." --Bill Harlan

Great, I have to print that one out! But wait, after optimization correct code will become optimized code - and sometimes you don't know whether it's still correct (ok, that should be avoided by unit tests). :slight_smile:

Kind regards

  robert

[1] http://www-128.ibm.com/developerworks/java/library/j-jtp09275.html

···

On 18.03.2008 15:30, Avdi Grimm wrote:

On Tue, Mar 18, 2008 at 6:43 AM, Marc Heiler <shevegen@linuxmail.org> wrote:

OK, I avoided saying something on the previous message, but can you (and anyone else who posts benchmarks like this) please use names better than a.rb, b.rb, and c.rb so it actually makes some sense? Particularly when quoted or forwarded!

-Rob

Rob Biedenharn http://agileconsultingllc.com
Rob@AgileConsultingLLC.com

···

On Mar 19, 2008, at 9:29 AM, Paul Brannan wrote:

On Wed, Mar 19, 2008 at 10:25:32PM +0900, Paul Brannan wrote:

cout@bean:~/tmp$ time ruby a.rb 10_000_000 > /dev/null

real 0m27.841s
user 0m25.694s
sys 0m0.112s
cout@bean:~/tmp$ time ruby b.rb 10_000_000 > /dev/null

real 0m54.888s
user 0m11.333s
sys 0m0.504s
cout@bean:~/tmp$ time ruby c.rb 10_000_000 > /dev/null

real 0m13.141s
user 0m11.749s
sys 0m0.100s

The difference is even more stark on 1.9:

cout@bean:~/tmp$ time ruby1.9 a.rb 10_000_000 > /dev/null

real 0m48.103s
user 0m44.203s
sys 0m0.240s
cout@bean:~/tmp$ time ruby1.9 c.rb 10_000_000 > /dev/null

real 0m10.934s
user 0m10.425s
sys 0m0.076s

Paul

Not much difference, and the StringIO solution uses more memory. Not
surprising, since both File#puts and StringIO#puts are implemented by
rb_io_puts.

The difference may be more dramatic on your platform, depending on the
implementation of your C library's stdio functions.

[pbrannan@zaphod tmp]$ ruby -v
ruby 1.8.6 (2007-09-24 patchlevel 111) [i686-linux]
[pbrannan@zaphod tmp]$ ruby test.rb
Rehearsal --------------------------------------------------
IO#puts         10.320000   0.010000  10.330000 ( 10.412573)
IO#write         5.110000   0.020000   5.130000 (  5.171733)
StringIO#puts    9.740000   0.060000   9.800000 (  9.847963)
StringIO#write   4.430000   0.040000   4.470000 (  4.504434)
---------------------------------------- total: 29.730000sec

                     user     system      total        real
IO#puts         10.220000   0.010000  10.230000 ( 10.258458)
IO#write         5.090000   0.020000   5.110000 (  5.164432)
StringIO#puts   10.150000   0.160000  10.310000 ( 10.438876)
StringIO#write   4.630000   0.090000   4.720000 (  4.769139)
[pbrannan@zaphod tmp]$ cat test.rb
require 'stringio'
require 'benchmark'

Benchmark.bmbm(15) do |x|
  devnull = File.open('/dev/null', 'w')
  n = 2_000_000

  x.report("IO#puts") {
    n.times do
      devnull.puts "hola"
    end
  }

  x.report("IO#write") {
    n.times do
      devnull.write "hola\n"
    end
  }

  x.report("StringIO#puts") {
    s = StringIO.new
    n.times do
      s.puts "hola"
    end
    s.rewind
    devnull.print s.read
  }

  x.report("StringIO#write") {
    s = StringIO.new
    n.times do
      s.write "hola\n"
    end
    s.rewind
    devnull.print s.read
  }
end

···

On Thu, Mar 20, 2008 at 01:44:11AM +0900, Gerardo Santana Gómez Garrido wrote:

Replace output.write("hola mundo\n") with output.puts("hola mundo")
and you'll get something similar.

Oh, and this is only really true for short strings and relatively
small amounts of memory. Try doing that whilst say, stream editing a
1TB file.

As I already said, this has proven true in real world reports.

I'd be very impressed if you could afford the RAM to do the recommended trick with a 1TB file...

I'm not saying you can't speed some things up using these methods, I'm saying that any serious and sane requirement for performance will not get significant gains from these kinds of suggestions, as this kind of stuff is so rarely the real cause of performance loss.

It's about how much indirection you have going on. Where and when does puts clear buffers and so on?
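
The memory concern can be sketched as a line-by-line stream edit: each line is written out as soon as it is read, so nothing close to the whole input is held at once. The method stream_edit and the hola/hello substitution are hypothetical; StringIO objects stand in for real files here only so the example is self-contained.

```ruby
require 'stringio'

# Hypothetical stream editor: read one line, transform it, write it out.
# Memory use depends on the longest line, not on the size of the input.
def stream_edit(input, output)
  count = 0
  input.each_line do |line|
    output.write(line.sub("hola", "hello"))
    count += 1
  end
  count
end

# Stand-ins for real IO objects; with files this would be File.open handles.
source = StringIO.new("hola mundo\n" * 3)
sink   = StringIO.new

stream_edit(source, sink)
print sink.string
```

The same method works unchanged with File objects or $stdin and $stdout, since it only relies on each_line and write.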

This whole direction of thought is broken, honestly.

Fine. Don't take the hint.

No worries, I can live with that. :slight_smile:

raggi@mbk:~/tmp/bench$ ruby runner.rb 10_000_000
time ruby a.rb 10000000 > /tmp/a

real 0m11.273s
user 0m10.042s
sys 0m0.483s
time ruby b.rb 10000000 > /tmp/b

real 0m7.473s
user 0m3.978s
sys 0m0.743s
time ruby c.rb 10000000 > /tmp/c

real 0m6.366s
user 0m5.403s
sys 0m0.491s
raggi@mbk:~/tmp/bench$ cat c.rb
times = ARGV[0].to_i
times.times { print "hola mundo\n" }

And the above will work with a significantly sized stream and with bigger strings. If you benchmark at the boundaries, you'll notice that the other solutions have serious issues at various not-uncommon sizes.

Whoops.

Even in the 60's computer scientists knew that:

"Premature optimization is the root of all evil."
- C. A. R. Hoare

In the embedded version of Java, where it is possible to pre-allocate
objects and object pools, even this may not be a reliable truism.

···

On Tue, Mar 18, 2008 at 3:10 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

Yes, that's true. But I'm inclined to say that even in Java there is a
single operation that is always the most expensive (not counting IO of
course) and this is object creation.

--
Avdi

I would normally do that, but I didn't want to cause more confusion by
renaming the author's original code.

Paul

···

On Wed, Mar 19, 2008 at 10:53:03PM +0900, Rob Biedenharn wrote:

OK, I avoided saying something on the previous message, but can you
(and anyone else who posts benchmarks like this) please use names
better than a.rb, b.rb, and c.rb so it actually makes some sense?
Particularly when quoted or forwarded!

> Replace output.write("hola mundo\n") with output.puts("hola mundo")
> and you'll get something similar.

Not much difference, and the StringIO solution uses more memory. Not
surprising, since both File#puts and StringIO#puts are implemented by
rb_io_puts.

The difference may be more dramatic on your platform, depending on the
implementation of your C library's stdio functions.

I'd say the difference is not worth the added complexity and
memory overhead of StringIO:

18:35:19 /cygdrive/c/SCMws/
$ ruby <<XX

require 'stringio'
require 'benchmark'

File.open('/dev/null', 'wb') do |devnull|
  Benchmark.bmbm(15) do |x|
    n = 2_000_000

    x.report("IO#puts") {
      n.times do
        devnull.puts "hola"
      end
    }

    x.report("IO#write") {
      n.times do
        devnull.write "hola\n"
      end
    }

    x.report("StringIO#puts") {
      s = StringIO.new
      n.times do
        s.puts "hola"
      end
      s.rewind
      devnull.print s.read
    }

    x.report("StringIO#write") {
      s = StringIO.new
      n.times do
        s.write "hola\n"
      end
      s.rewind
      devnull.print s.read
    }
  end
end
XX

Rehearsal --------------------------------------------------
IO#puts          5.984000   0.016000   6.000000 (  6.000000)
IO#write         2.875000   0.000000   2.875000 (  2.885000)
StringIO#puts    4.219000   0.016000   4.235000 (  4.240000)
StringIO#write   2.078000   0.031000   2.109000 (  2.113000)
---------------------------------------- total: 15.219000sec

                     user     system      total        real
IO#puts          6.000000   0.000000   6.000000 (  5.996000)
IO#write         2.875000   0.000000   2.875000 (  2.871000)
StringIO#puts    4.203000   0.015000   4.218000 (  4.224000)
StringIO#write   2.140000   0.047000   2.187000 (  2.175000)
18:36:46 /cygdrive/c/SCMws/
$

Note: I added proper closing of devnull and made the stream binary.

Kind regards

robert

···

2008/3/19, Paul Brannan <pbrannan@atdesk.com>:

On Thu, Mar 20, 2008 at 01:44:11AM +0900, Gerardo Santana Gómez Garrido wrote: