Ruby performance

Hi,

I'm new to Ruby, and I've just tried to measure its performance
compared to Python, and got some interesting results.

The Ruby test script is:

#!/usr/bin/env ruby

s = ""
i = 0
while line = gets
  s += line
  i += 1
  puts(i) if i % 1000 == 0
end

and the Python one:

#!/usr/bin/env python

import sys

s = ""
i = 0
for line in sys.stdin:
  s += line
  i += 1
  if i % 1000 == 0: print i

I fed a 1 MB file as input to each of these scripts. The strange
thing is that the Ruby version starts to output progress slower and
slower as the s string grows. The Python version went smoothly.

Looks like a memory management issue with Ruby. I wonder if this is
going to be significantly improved in subsequent releases of Ruby? I
was testing with Ruby 1.8.6, and Python 2.5.1. (I also tried Python
2.3 and it was significantly slower than 2.5)

As I said, I'm new to Ruby, but want to use it for a long-term
project, and would like to know more about its performance specifics.

Thanks for any clarification in advance!

This is because "string" + "otherstring" allocates a new string, "stringotherstring".

Use << instead, and it will concatenate "otherstring" to the original "string".

s = ""
i = 0
while line = gets
  s << line
  i += 1
  puts(i) if i % 1000 == 0
end

Performance should be a lot better with that version.

Kirk Haines
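
For anyone who wants to see the difference on their own machine, here is a rough sketch of a timing comparison. The method names build_with_plus and build_with_append, the 81-byte chunk, and the iteration count are all made up for illustration and are not from the posts above; the numbers will vary with Ruby version and hardware.

```ruby
require "benchmark"

# Accumulate `chunk` into one string `times` times using +=.
# Each pass builds a brand-new String and rebinds s to it.
def build_with_plus(chunk, times)
  s = "".dup                 # unfrozen empty string
  times.times { s += chunk }
  s
end

# Same accumulation using <<, which appends to the existing String.
def build_with_append(chunk, times)
  s = "".dup
  times.times { s << chunk }
  s
end

chunk = ("x" * 80) + "\n"    # hypothetical 81-byte "line"
times = 3_000                # arbitrary; raise it to widen the gap

Benchmark.bm(8) do |bm|
  bm.report("s += x") { build_with_plus(chunk, times) }
  bm.report("s << x") { build_with_append(chunk, times) }
end
```

Both methods return the same string; only the amount of allocation and copying differs.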

···

On Sat, 29 Sep 2007, Vasyl Smirnov wrote:

I fed a 1 MB file as input to each of these scripts. The strange
thing is that the Ruby version starts to output progress slower and
slower as the s string grows. The Python version went smoothly.

Some things that help when optimizing Python don't work in Ruby, except
to make the Ruby code run more slowly:

1). Read the entire contents of the file using a single operation.
2). Use list comprehensions (LC) to iterate over the entire file, for
example to count or gather lines.
3). Use Psyco to dramatically boost the speed of the Python code to
near machine code speeds.
4). Use the string translate function to process the contents of the
file in cases where this sort of thing is needed.

I was able to read a 20 MB file using Python code that ran 10x to 30x
faster than the fastest Ruby code that attempted to read the file into
an array of 64-bit values. In my case my code needed to process every
single character in the 20 MB file by setting the MSB of each
character and then write the file back out.
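
To make the task concrete, here is one way the "set the MSB of each character" step could look in Ruby. This is only a sketch: it is not the code that was benchmarked above, the method name set_msb and the file names in the comment are hypothetical, and no claim is made that this is the fastest approach. It treats the input as raw bytes and ORs each one with 0x80.

```ruby
# Set the most significant bit (0x80) of every byte in `data`
# and return the result as a new binary string.
def set_msb(data)
  data.unpack("C*").map { |byte| byte | 0x80 }.pack("C*")
end

# Hypothetical usage with placeholder file names:
#   File.binwrite("output.bin", set_msb(File.binread("input.bin")))

p set_msb("AB").unpack("C*")   # "A" is 0x41, "B" is 0x42
```

String#tr is probably the nearer Ruby counterpart to Python's translate, but the unpack/pack route makes the bit operation explicit.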

···

On Sep 28, 8:59 am, Vasyl Smirnov <vasyl.smir...@gmail.com> wrote:

I fed a 1 MB file as input to each of these scripts. The strange
thing is that the Ruby version starts to output progress slower and
slower as the s string grows. The Python version went smoothly.

Indeed, fast as hell!

Thank you, Kirk!

···

On Sep 28, 7:06 pm, khai...@enigo.com wrote:

Use << instead, and it will concatenate "otherstring" to the original
"string".

Kirk Haines

That should be on a Ruby Optimization list somewhere!

···

On Sep 28, 2007, at 11:06 AM, khaines@enigo.com wrote:

This is because "string" + "otherstring" allocates a new string, "stringotherstring".

Use << instead, and it will concatenate "otherstring" to the original "string".

> Some things that help when optimizing Python don't work in Ruby, except
> to make the Ruby code run more slowly:

> 1). Read the entire contents of the file using a single operation.
> 2). Use list comprehensions (LC) to iterate over the entire file, for
> example to count or gather lines.
> 3). Use Psyco to dramatically boost the speed of the Python code to
> near machine code speeds.

Bull. Psyco isn't even as fast as LuaJIT.

> 4). Use the string translate function to process the contents of the
> file in cases where this sort of thing is needed.

> I was able to read a 20 MB file using Python code that ran 10x to 30x
> faster than the fastest Ruby code that attempted to read the file into
> an array of 64-bit values. In my case my code needed to process every
> single character in the 20 MB file by setting the MSB of each
> character and then write the file back out.

Since Psyco only had to execute one command, "translate", in order to
convert the entire contents of the file, the program was very fast.
Most programs are much slower under Psyco.

···

On Sep 28, 2:28 pm, Ruby Maniac <rubyman...@gmail.com> wrote:

On Sep 28, 8:59 am, Vasyl Smirnov <vasyl.smir...@gmail.com> wrote:

Weird. The Python version is extremely slow compared to Kirk's Ruby version. Does it suffer from the same allocation problems as the initial Ruby version?

/C

···

On 28 Sep 2007, at 19:21, John Joyce wrote:

That should be on a Ruby Optimization list somewhere!

Hey all

I've been meaning to send out this email for a while.

In light of recent criticism and desperation for faster Ruby code, what are some common practices (little things) that will make code run faster? IE, if-then or case-when, etc.

So far all I have is:
<< instead of +

-------------------------------------------|
~ Ari

From now on, when giving examples, instead of ie use ff.

Christoffer Lernö wrote:

Weird. The Python version is extremely slow compared to Kirk's Ruby version. Does it suffer from the same allocation problems as the initial Ruby version?

Yes. For a + b, Python allocates a new string of length(a) + length(b) and then copies the contents of both a and b into it; the only optimization is for the case where a or b is empty. In Ruby, a << b first grows a's buffer by length(b) and then copies only b's contents into a; if b is empty, Ruby does nothing. Python's concatenation is the more expensive operation because Python strings are immutable.

···

--
Florian Frank
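
The new-object versus in-place distinction described above can be checked directly in Ruby. The following is a minimal sketch; the variable names are made up for illustration. It keeps a second reference to each string and looks at what that reference sees after += and after <<.

```ruby
a = "foo".dup
alias_of_a = a             # second reference to the same String object

a += "bar"                 # builds a new String and rebinds a to it
plus_made_new_object = !a.equal?(alias_of_a)

b = "foo".dup
alias_of_b = b

b << "bar"                 # appends to the existing object in place
append_kept_object = b.equal?(alias_of_b)

puts "+= made a new object:    #{plus_made_new_object}"
puts "<< kept the same object: #{append_kept_object}"
puts "alias after +=: #{alias_of_a.inspect}"
puts "alias after <<: #{alias_of_b.inspect}"
```

The flip side of the speed gain is that << is visible through every other reference to the same string, so it is only a drop-in replacement for += when nothing else holds on to the original.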

The most general advice is to minimize the number of method calls you
make, and try to create fewer intermediate objects. Using << instead of
+ to accumulate strings satisfies the latter.

Another specific instance would be that it usually performs better to
do:

def foo(x)
   x.y { yield }
end

than:

def foo(x, &block)
   x.y &block
end

(Of course the latter form is still required if #y retains the block as
a Proc object, rather than calling it directly.)

-mental
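
A quick way to compare the two forms is sketched below. The Receiver class, the method names foo_yield and foo_block, and the iteration count are hypothetical and chosen only for this illustration. Newer Ruby releases have made passing &block along much cheaper than it was in 1.8, so the gap may be small or absent depending on the interpreter; it is worth measuring rather than assuming.

```ruby
require "benchmark"

# Hypothetical receiver whose #y simply calls the block it is given.
class Receiver
  def y
    yield
  end
end

# Form 1: hand #y a new literal block that yields to the caller's block.
def foo_yield(x)
  x.y { yield }
end

# Form 2: capture the caller's block as a parameter and pass it along.
def foo_block(x, &block)
  x.y(&block)
end

receiver = Receiver.new
n = 200_000                  # arbitrary iteration count

Benchmark.bm(10) do |bm|
  bm.report("yield")  { n.times { foo_yield(receiver) { :ok } } }
  bm.report("&block") { n.times { foo_block(receiver) { :ok } } }
end
```

Both forms return whatever the caller's block returns, so they are interchangeable whenever #y only calls the block.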

···

On Sun, 2007-09-30 at 06:34 +0900, Ari Brown wrote:

In light of recent criticism and desperation for faster Ruby code,
what are some common practices (little things) that will make code
run faster? IE, if-then or case-when, etc.

So far all I have is:
<< instead of +