Basic Ruby performance

Dmitry_Nikiforov · 2 February 2012 22:20

Hello all!

First, I should point out that I'm new to Ruby, although it seems pretty
similar in some regards to JavaScript and Perl.

Anyway, I'm not sure if it's normal, or if it's specific to Ruby on Mac
OS X, or if I haven't compiled it properly (I used rvm to install on my
OS X Lion to get the latest version: Lion comes with 1.8.7 by default, I
believe, so I installed 1.9.3), but most things I used to do in Perl run
about twice as slow when I replicate them in Ruby. So I ran some basic
benchmarks. Here's one example:

Ruby:

for a in 0..1E8
a*2
end

Perl:

for $a ( 0..1E8 ) {
$a*2;
}

Ruby takes 22 seconds, Perl - 9 seconds to execute this. This is very
similar to all other scenarios I tried (one of which is splitting
millions of comma separated rows into arrays).

I would really appreciate any useful suggestions: I would LOVE to be
able to use Ruby for most of the stuff I do (it's not that I don't like
Perl, but I love Ruby's syntax )

Thanks!

--
Posted via http://www.ruby-forum.com/.

Dmitry_Nikiforov · 2 February 2012 22:55

Here's another example with significantly bigger performance difference:

Ruby:

s = "This is a test string"

re = Regexp.new( / test / )

for a in 0..1E7
re.match( s )
end

Perl:

my $s = "This is a test string";

for my $a ( 0..1E7 ) {
$s =~ / test /;
}

Perl takes about 1.5 seconds to execute this, while Ruby takes a
whopping 16!!! :((( I have a very strong feeling that I didn't compile
Ruby properly - there can't be such a huge difference in regexp matching

Ryan_Davis1 · 2 February 2012 23:15

Choosing the right language is a lot less important than choosing the right algorithm:

5461 % time ruby -e 'n = 10**8; p (n + 3*n**2 + 2*n**3)/6'
333333338333333350000000

real 0m0.009s
user 0m0.004s
sys 0m0.004s

In most cases (depending on the domain, of course (*)), ruby is "fast enough". Often, with my slower ruby, I'll finish coding long before you would in your faster language. This coding-time difference is usually sufficient to deal with run-time differences.

*) your domain is fast enough unless you work for the IRS, NASA, wallstreet **, or pixar ***.
**) sufficient examples exist to show that those domains are also fast enough.
***) prolly not here tho.
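
For anyone wondering where the one-liner above comes from: it appears to be the closed form for the sum of squares 1² + 2² + ... + n², i.e. n(n+1)(2n+1)/6 expanded. A minimal sketch that checks it against a brute-force loop follows. The method names are made up for illustration, and note that this is not the same computation as the loop in the original post, which multiplies by 2 and throws the result away.

```ruby
# Closed form from the one-liner above: (n + 3n^2 + 2n^3) / 6.
def sum_of_squares_closed(n)
  (n + 3 * n**2 + 2 * n**3) / 6
end

# Brute-force equivalent: n iterations instead of a handful of operations.
def sum_of_squares_loop(n)
  total = 0
  i = 1
  while i <= n
    total += i * i
    i += 1
  end
  total
end

n = 1_000
puts sum_of_squares_closed(n) == sum_of_squares_loop(n)
```

The closed form takes the same time for n = 10**8 as for n = 10, which is the point being made about algorithms.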

Dmitry_Nikiforov · 3 February 2012 02:27

I have to say, however, that string.split() works much faster in Ruby
than it does in Perl, for some odd reason.

botp1 · 3 February 2012 03:52

Ruby:

for a in 0..1E8
a*2
end

omg, careful w big numbers in ruby..

x=Time.now();for a in 0..1E3;end; puts(Time.now-x)

0.000380635

x=Time.now();for a in 0..1E4;end; puts(Time.now-x)

0.00373148

x=Time.now();for a in 0..1E5;end; puts(Time.now-x)

0.029043426

x=Time.now();for a in 0..1E6;end; puts(Time.now-x)

0.201745265

x=Time.now();for a in 0..1E7;end; puts(Time.now-x)

1.939860867

x=Time.now();for a in 0..1E8;end; puts(Time.now-x)

19.276653266

note, the jump...

best regards -botp

Reid_Thompson1 · 3 February 2012 04:10

$ cat mult.rb
#for a in 0..100000000
# a*2
#end
require 'rubygems'
require 'inline'

class Multiply
  inline do |builder|
    # Run the loop in C (via RubyInline) instead of in the interpreter.
    builder.c "
        long mult(int max) {
        long ctr = 0;
        unsigned long long result = 0;
        while (ctr < max){ result = (ctr++ * 2);}
        return result;
        }"
  end
end

#

puts ARGV[0]
m = Multiply.new()
start_time = Time.now
a = m.mult(ARGV[0].to_i)
puts a.to_s
end_time = Time.now
duration = ((end_time.to_f - start_time.to_f) * 1000.0).to_i
puts "You took " + duration.to_s + " seconds."

my linux box
model name : Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz quad core
$ ruby -v
ruby 1.8.7 (2011-12-28 patchlevel 357) [x86_64-linux]

[23:05:57] rthompso@raker2>~
$ time ruby mult.rb 999999991
999999991
1999999980
You took 0 seconds.

real 0m0.059s
user 0m0.050s
sys 0m0.008s

my windows xp laptop using cygwin ruby
T7100 @ 1.80GHZ dual core
Reid.Thompson@lt-lat-4960 ~
$ ruby -v
ruby 1.8.7 (2008-08-11 patchlevel 72) [i386-cygwin]

Reid.Thompson@lt-lat-4960 ~
$ time ruby rbtest.rb 999999991
999999991
1999999980
You took 0 seconds.

real 0m0.303s
user 0m0.109s
sys 0m0.170s

Dmitry_Nikiforov · 3 February 2012 17:17

Here's a question: when I say "for a in ( 0..1E8 )" - does Ruby create
an array and populate it with values from 0 through 1E8, or does it
merely create a counter similar to "for( a = 0; a <= 1e8; a++ )" ?

Dmitry_Nikiforov · 3 February 2012 18:56

Tried rubinius and jruby. Rubinius is the fastest one so far, but still
slower than Perl. The empty "for" loop runs about as fast as "while" or
.each (10 seconds in rubinius vs. 3.5 seconds in Perl), although .times
takes only 6 seconds.

Regexp match using / test /.match works about the same as s =~ / test /,
and takes about 5 seconds vs. Perl's 1.4s (1e7 repetitions). Then again,
seeing as there's such a huge difference in the empty loop alone, it's
hard to tell whether regexps themselves are slower in Ruby or whether
it's just the loop overhead...

jruby is just as slow as regular ruby 1.9.3.

Dmitry_Nikiforov · 3 February 2012 19:00

Curious: rubinius reports itself as 2.0.0dev ( 1.8.7 ), which is strange
- Ruby 1.8.7 does not support \p{} regexps (like \p{Alnum} for example),
and rubinius does..

Ryan_Davis1 · 2 February 2012 23:20

It's all the parens, whitespace, and use of tabs that slows ruby down:

# takes 26.6 seconds on my laptop:

s = "This is a test string"

re = Regexp.new( / test / )

for a in 0..1E7
re.match( s )
end

# takes 8.67 seconds on my laptop:

s = "This is a test string"

for a in 0..1E7
s =~ / test /
end

Dmitry_Nikiforov · 3 February 2012 02:14

Ryan Davis wrote in post #1043801:

> Choosing the right language is a lot less important than choosing the
> right algorithm:
>
> 5461 % time ruby -e 'n = 10**8; p (n + 3*n**2 + 2*n**3)/6'
> 333333338333333350000000
>
> real 0m0.009s
> user 0m0.004s
> sys 0m0.004s

My code was merely an example of a very simple loop. Its purpose was not
to calculate something, but to run through the loop and execute a
multiplication on every iteration.

My main area of development is processing of rather large amounts of
data (billions of entries, primarily processed by regular expressions,
with some statistical analysis on top, and potentially - addition of NLP
later). You _have_ to iterate through every entry of the incoming data
(which might already be in the database, plain text file, or might be
just a "fire hose" of data pouring into the system in real time).

While I'd LOVE to have a nice and clean syntax, performance is still
number 1 on my list of priorities, therefore I asked if maybe there are
ways to improve Ruby performance.

botp1 · 3 February 2012 04:03

> x=Time.now();for a in 0..1E3;end; puts(Time.now-x)
>
> 0.000380635
>
> x=Time.now();for a in 0..1E4;end; puts(Time.now-x)
>
> 0.00373148
>
> x=Time.now();for a in 0..1E5;end; puts(Time.now-x)
>
> 0.029043426
>
> x=Time.now();for a in 0..1E6;end; puts(Time.now-x)
>
> 0.201745265
>
> x=Time.now();for a in 0..1E7;end; puts(Time.now-x)
>
> 1.939860867
>
> x=Time.now();for a in 0..1E8;end; puts(Time.now-x)
>
> 19.276653266
>
> note, the jump...

using plain fixnum may help, but the jump is still there

(3..8).each do |x|
t=Time.now();for a in 0..10**x;end; puts("#{x} #{Time.now()-t}")
end

3 0.000142406
4 0.001344933
5 0.014539207
6 0.076141941
7 0.737979205
8 7.359555691
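
One thing that may account for part of the gap between the two sets of timings: `1E8` is a Float literal, whereas `10**8` is an Integer, so in the first case the loop counter is presumably being compared against a float end point on every step. A small sketch to check the types involved (I haven't measured how much of the difference this explains, and `range_values` is just a helper name made up here):

```ruby
float_end = 1E3      # scientific notation always gives a Float
int_end   = 10**3    # integer exponentiation stays an Integer

puts float_end.class
puts int_end.class

# Both ranges yield the same Integer values, because iteration starts at
# the Integer 0 and steps with succ; only the end point's type differs.
def range_values(last)
  (0..last).to_a
end

puts range_values(float_end) == range_values(int_end)
```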

best regards -botp

Eric_Christopherson · 3 February 2012 17:32

It might use a counter-based loop internally, but at a higher level the
for loop is translated into a call to the range's each method
(Range#each). The iterator knows how to produce the next number in the
range each time. It doesn't create an array with 1E8+1 elements.
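
A quick way to see this for yourself: `for` ends up calling `each` on whatever object it is given, and a Range hands out its values one at a time. A minimal sketch; the `Countdown` class and `collect_with_for` are made-up examples, not anything from the standard library:

```ruby
# Any object with an #each method can be used in a for loop.
class Countdown
  attr_reader :each_calls

  def initialize(from)
    @from = from
    @each_calls = 0
  end

  def each
    @each_calls += 1
    @from.downto(1) { |i| yield i }
  end
end

def collect_with_for(enumerable)
  seen = []
  for x in enumerable   # dispatches to enumerable.each
    seen << x
  end
  seen
end

c = Countdown.new(3)
p collect_with_for(c)   # values arrive one by one from #each
p c.each_calls

# A Range only stores its end points, so building a huge one is cheap;
# no array exists unless you explicitly ask for one (e.g. with to_a).
big = (0..10**8)
p big.first(3)
```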

Chuck_Remes · 3 February 2012 19:01

The thing you need to know about Rubinius and JRuby is that they both JIT (just-in-time) compile the code, but they need to collect statistics on the runtime profile first. That usually takes a few seconds. So any test that runs for under 10s or so doesn't give the runtime much opportunity to optimize the code.

If you really are going to be working on big datasets, then try to benchmark something that takes at least a minute or so to run. You *cannot* reliably extrapolate performance from a test that runs 4s versus 2s.
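
The standard library's Benchmark module has a helper aimed at roughly this problem: `Benchmark.bmbm` runs every block once as a rehearsal and then again for the reported numbers. Here is a sketch with a deliberately small, arbitrary iteration count. Whether a single rehearsal pass is enough to warm up a given JIT is an assumption you'd want to verify on your own runtime, and `count_matches` is just an illustrative name.

```ruby
require 'benchmark'

N = 200_000                      # arbitrary; raise it for real measurements
S = "This is a test string"

def count_matches(n, s)
  hits = 0
  i = 0
  while i < n
    hits += 1 if s =~ / test /
    i += 1
  end
  hits
end

# bmbm prints a "Rehearsal" section first, then the real timings,
# and returns an array of Benchmark::Tms objects for the real runs.
results = Benchmark.bmbm do |bm|
  bm.report("regexp loop") { count_matches(N, S) }
end

puts results.first.real
```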

cr

Charles_Nutter · 15 February 2012 08:57

I cannot reproduce. Post your jruby -v...I suspect it's running with
the non-optimizing JVM mode.

This is JRuby master + Java 7, which is the fastest of all Ruby impls
I have installed:

system ~/projects/jruby $ jruby -rbenchmark -e "3.times { puts Benchmark.measure { for a in 0..1E8; a * 2; end } }"
  8.945000 0.000000 8.945000 ( 8.945000)
  8.405000 0.000000 8.405000 ( 8.405000)
  8.341000 0.000000 8.341000 ( 8.341000)

system ~/projects/jruby $ rvm 1.9.3 do ruby -rbenchmark -e "3.times { puts Benchmark.measure { for a in 0..1E8; a * 2; end } }"
17.540000 0.000000 17.540000 ( 17.552517)
17.440000 0.010000 17.450000 ( 17.440687)
17.490000 0.000000 17.490000 ( 17.498955)

system ~/projects/jruby $ macruby -rbenchmark -e "3.times { puts Benchmark.measure { for a in 0..1E8; a * 2; end } }"
15.800000 0.010000 15.810000 ( 15.803463)
15.750000 0.000000 15.750000 ( 15.751863)
15.850000 0.000000 15.850000 ( 15.860579)

system ~/projects/jruby $ ../rubinius/bin/rbx -rbenchmark -e "3.times { puts Benchmark.measure { for a in 0..1E8; a * 2; end } }"
11.190136 0.002077 11.192213 ( 11.193023)
11.087971 0.003063 11.091034 ( 11.091239)
7.020758 0.001578 7.022336 ( 7.022235)

It's a pretty meaningless benchmark, though.

- Charlie

Peter_Vandenabeele1 · 2 February 2012 23:33

> Here's another example with significantly bigger performance difference:
>
> Ruby:
>
> s = "This is a test string"
>
> re = Regexp.new( / test / )
>
> for a in 0..1E7
> re.match( s )
> end
>
> Perl:
>
> my $s = "This is a test string";
>
> for my $a ( 0..1E7 ) {
> $s =~ / test /;
> }
>
> Perl takes about 1.5 seconds to execute this, while Ruby takes a
> whopping 16!!! :((( I have a very strong feeling that I didn't compile
> Ruby properly - there can't be such a huge difference in regexp matching
>

> It's all the parens, whitespace, and use of tabs that slows ruby down:

Euhmmm, I doubt that ...

> # takes 26.6 seconds on my laptop:
>
> s = "This is a test string"
>
> re = Regexp.new( / test / )
>
> for a in 0..1E7
> re.match( s )
> end
>
> # takes 8.67 seconds on my laptop:
>
> s = "This is a test string"
>
> for a in 0..1E7
> s =~ / test /
> end

The same "formatted" code with just replacing re.match( s) by
s =~ /test/ also causes the same change from 22 to 7 seconds
on my system (with the same formatting, spaces, etc.).

I rather expect it's because

`match` and `=~` do quite different things ...

`match` returns a complete MatchData object

`=~` returns the index (position) of the first match

017:0> re.match( s )
=> #<MatchData " test ">
018:0> s =~ /test/
=> 10

<speculation>
Maybe (speculation) the MatchData object takes more
dynamic Object allocation and thus more calls to the GC ?
</speculation>
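
The difference in return values is easy to check. A small sketch follows (the helper names are made up); it only shows what each call hands back, not where the time actually goes:

```ruby
STR = "This is a test string"
RE  = / test /

# Regexp#match builds and returns a MatchData object (or nil).
def via_match(str)
  RE.match(str)
end

# String#=~ returns just the Integer offset of the match (or nil).
def via_operator(str)
  str =~ RE
end

m = via_match(STR)
puts m.class              # MatchData
puts m[0].inspect         # the matched text
puts m.pre_match.inspect  # everything before the match
puts via_operator(STR).inspect
```

With the spaces in the pattern the offset is 9, one less than the 10 shown in the irb session above for /test/.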

HTH,

Peter

Jeremy_Bopp · 2 February 2012 23:36

Ryan is being a little facetious about the parentheses and whitespace,
in case that isn't clear. He has strong preferences about coding style.

Your test above runs in about 10 seconds on my system under Ruby 1.9.2.
The following equivalent code runs in about 6 seconds and is fairly
idiomatic Ruby:

s = "This is a test string"
(0..1E7).each do
s =~ / test /
end

This code runs in about 4 seconds, but it is a bit less pretty to my eyes:

s = "This is a test string"
i = 0
while i < 1E7 do
s =~ / test /
i += 1
end

I'm sure there are other solutions as well. The thing to keep in mind
is that method calls in Ruby are relatively expensive, so if you need
speed, you should try to avoid them.
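
To compare the loop styles on your own interpreter, something along these lines should do. It is only a sketch: the iteration count is arbitrary, the method names are made up here, and the relative timings will depend on the Ruby version and implementation.

```ruby
require 'benchmark'

TEXT = "This is a test string"

def with_for(n)
  hits = 0
  for _ in 1..n
    hits += 1 if TEXT =~ / test /
  end
  hits
end

def with_each(n)
  hits = 0
  (1..n).each { hits += 1 if TEXT =~ / test / }
  hits
end

def with_times(n)
  hits = 0
  n.times { hits += 1 if TEXT =~ / test / }
  hits
end

def with_while(n)
  hits = 0
  i = 0
  while i < n
    hits += 1 if TEXT =~ / test /
    i += 1
  end
  hits
end

n = 100_000
%w[with_for with_each with_times with_while].each do |name|
  secs = Benchmark.realtime { send(name, n) }
  puts format("%-10s %.4fs", name, secs)
end
```

All four variants do the same work per iteration, so any difference in the printed times comes from the looping construct itself.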

Don't get hung up on micro benchmarks like the above though! They can
really be deceiving with respect to real world applications.

-Jeremy

Su_Zhang · 3 February 2012 02:40

One thing you can do is to replace for loops with while loops. For loops in Ruby are translated into calls to the each method of the object being iterated (Range#each here), and in Ruby 1.9 each is slower than an ordinary while loop because of the overhead of processing enumerators. It is actually even slower than each in Ruby 1.8, because 1.8 does not have enumerators.

botp1 · 3 February 2012 04:25

pls ignore. i think it is consistent at 10
-botp

On Fri, Feb 3, 2012 at 12:03 PM, botp <botpena@gmail.com> wrote:

> (3..8).each do |x|
> t=Time.now();for a in 0..10**x;end; puts("#{x} #{Time.now()-t}")
> end
>
> 3 0.000142406
> 4 0.001344933
> 5 0.014539207
> 6 0.076141941
> 7 0.737979205
> 8 7.359555691

Robert_K1 · 3 February 2012 10:40

> Ryan Davis wrote in post #1043801:
>
> > Choosing the right language is a lot less important than choosing the
> > right algorithm:
> >
> > 5461 % time ruby -e 'n = 10**8; p (n + 3*n**2 + 2*n**3)/6'
> > 333333338333333350000000
> >
> > real 0m0.009s
> > user 0m0.004s
> > sys 0m0.004s
>
> My code was merely an example of a very simple loop. Its purpose was not
> to calculate something, but to run through the loop and execute a
> multiplication on every iteration.

Yes, and maybe it was a bad example for what you are trying to do:

> My main area of development is processing of rather large amounts of
> data (billions of entries, primarily processed by regular expressions,
> with some statistical analysis on top, and potentially - addition of NLP
> later). You _have_ to iterate through every entry of the incoming data
> (which might already be in the database, plain text file, or might be
> just a "fire hose" of data pouring into the system in real time).

Question: the data needs to come from somewhere. Are you sure that
your processing is CPU bound? If it is IO bound the difference
between Perl and Ruby won't really show. I reckon it's better to
create a more realistic example of what you are trying to do and
measure again. (And take care to run tests between Ruby and Perl
alternating in order to prevent OS IO caching from preferring one or
the other.)
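
One rough way to tell the two cases apart is to compare CPU time with wall-clock time for the same piece of work; `Benchmark.measure` reports both. The sketch below writes a throwaway temp file so that it is self-contained. The file contents, the pattern, the row count, and the `count_matching_lines` helper are all invented for illustration.

```ruby
require 'benchmark'
require 'tempfile'

# Count lines matching a pattern, reading the file line by line.
def count_matching_lines(path, pattern)
  hits = 0
  File.foreach(path) { |line| hits += 1 if line =~ pattern }
  hits
end

Tempfile.create('rows') do |f|
  50_000.times { |i| f.puts "#{i},alpha,#{i.even? ? 'test' : 'other'},omega" }
  f.flush

  hits = nil
  tms = Benchmark.measure { hits = count_matching_lines(f.path, /,test,/) }

  puts "matching rows: #{hits}"
  # total = user + system CPU time; real = elapsed wall-clock time.
  # If real is much larger than total, the run mostly waited (often on IO).
  puts format("cpu %.3fs, wall %.3fs", tms.total, tms.real)
end
```

A small file like this will usually be served from the OS cache, so for a realistic picture you would point the helper at data of the size and source you actually expect.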

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
