Basic Ruby performance

Hello all!

First, I should point out that I'm new to Ruby, although it seems pretty
similar in some regards to JavaScript and Perl.

I used rvm to install Ruby 1.9.3 on OS X Lion (Lion comes with 1.8.7 by
default, I believe), and most things I used to do in Perl run about
twice as slow when I replicate them in Ruby. I'm not sure whether that's
normal, something specific to Ruby on Mac OS X, or a sign that I haven't
compiled it properly. So I ran some basic benchmarks. Here's one example:

Ruby:

for a in 0..1E8
        a*2
end

Perl:

for $a ( 0..1E8 ) {
        $a*2
}

Ruby takes 22 seconds to execute this; Perl takes 9. The results were
very similar in the other scenarios I tried (one of which is splitting
millions of comma-separated rows into arrays).

I would really appreciate any useful suggestions: I would LOVE to be
able to use Ruby for most of the stuff I do (it's not that I don't like
Perl, but I love Ruby's syntax :) )

Thanks!

···

--
Posted via http://www.ruby-forum.com/.

Here's another example with a significantly bigger performance difference:

Ruby:

s = "This is a test string"

re = Regexp.new( / test / )

for a in 0..1E7
        re.match( s )
end

Perl:

my $s = "This is a test string";

for my $a ( 0..1E7 ) {
       $s =~ / test /;
}

Perl takes about 1.5 seconds to execute this, while Ruby takes a
whopping 16!!! :((( I have a very strong feeling that I didn't compile
Ruby properly - there can't be such a huge difference in regexp matching
:(

···

Choosing the right language is a lot less important than choosing the right algorithm:

5461 % time ruby -e 'n = 10**8; p (n + 3*n**2 + 2*n**3)/6'
333333338333333350000000

real 0m0.009s
user 0m0.004s
sys 0m0.004s

In most cases (depending on the domain, of course (*)), ruby is "fast enough". Often, with my slower ruby, I'll finish coding long before you would in your faster language. This coding-time difference is usually sufficient to deal with run-time differences.

*) your domain is fast enough unless you work for the IRS, NASA, wallstreet **, or pixar ***.
**) sufficient examples exist to show that those domains are also fast enough.
***) prolly not here tho.
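
For what it's worth, the expression in the one-liner looks like the closed form n(n+1)(2n+1)/6 for the sum of squares from 1 to n, rather than a transcription of the original loop body. A minimal sketch, assuming that reading is right, checks the formula against a brute-force loop for a small n (the method names are made up for illustration):

```ruby
# Closed form taken from the one-liner above.
def closed_form(n)
  (n + 3 * n**2 + 2 * n**3) / 6
end

# Brute-force sum of squares 1..n (assumed to be what the formula stands for).
def sum_of_squares(n)
  (1..n).inject(0) { |sum, a| sum + a * a }
end

n = 1_000
puts closed_form(n) == sum_of_squares(n)  # true
puts closed_form(10**8)                   # same number as in the post
```

The loop costs time proportional to n, while the closed form is a handful of multiplications regardless of n.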

···

On Feb 2, 2012, at 14:20 , Dmitry Nikiforov wrote:

I have to say, however, that String#split works much faster in Ruby
than split does in Perl, for some odd reason :)

···

Ruby:

for a in 0..1E8
a*2
end

omg, careful w big numbers in ruby..

x=Time.now();for a in 0..1E3;end; puts(Time.now-x)

0.000380635

x=Time.now();for a in 0..1E4;end; puts(Time.now-x)

0.00373148

x=Time.now();for a in 0..1E5;end; puts(Time.now-x)

0.029043426

x=Time.now();for a in 0..1E6;end; puts(Time.now-x)

0.201745265

x=Time.now();for a in 0..1E7;end; puts(Time.now-x)

1.939860867

x=Time.now();for a in 0..1E8;end; puts(Time.now-x)

19.276653266

note, the jump...

best regards -botp

···

On Fri, Feb 3, 2012 at 6:20 AM, Dmitry Nikiforov <dniq@dniq-online.com> wrote:

$ cat mult.rb
# Pure-Ruby loop, kept for reference:
#for a in 0..100000000
# a*2
#end
require 'rubygems'
require 'inline'  # RubyInline

class Multiply
  inline do |builder|
    # Run the same loop in C
    builder.c "
        long mult(int max) {
        long ctr = 0;
        unsigned long long result = 0;
        while (ctr < max){ result = (ctr++ * 2);}
        return result;
        }"
  end
end

puts ARGV[0]
m = Multiply.new()
start_time = Time.now
a = m.mult(ARGV[0].to_i)
puts a.to_s
end_time = Time.now
# Note: this value is in milliseconds, although the message below says "seconds".
duration = ((end_time.to_f - start_time.to_f) * 1000.0).to_i
puts "You took " + duration.to_s + " seconds."

my linux box
model name : Intel(R) Core(TM) i5-2500 CPU @ 3.30GHz quad core
$ ruby -v
ruby 1.8.7 (2011-12-28 patchlevel 357) [x86_64-linux]

[23:05:57] rthompso@raker2>~
$ time ruby mult.rb 999999991
999999991
1999999980
You took 0 seconds.

real 0m0.059s
user 0m0.050s
sys 0m0.008s

my windows xp laptop using cygwin ruby
T7100 @ 1.80GHZ dual core
Reid.Thompson@lt-lat-4960 ~
$ ruby -v
ruby 1.8.7 (2008-08-11 patchlevel 72) [i386-cygwin]

Reid.Thompson@lt-lat-4960 ~
$ time ruby rbtest.rb 999999991
999999991
1999999980
You took 0 seconds.

real 0m0.303s
user 0m0.109s
sys 0m0.170s

Here's a question: when I say "for a in ( 0..1E8 )", does Ruby create
an array and populate it with values from 0 through 1E8, or does it
merely create a counter similar to "for( a = 0; a <= 1e8; a++ )"?

···

Tried rubinius and jruby. Rubinius so far is the fastest one, but still
slower than Perl. The empty "for" loop runs about as fast as "while" or
.each (10 seconds - rubinius, 3.5 seconds - Perl), although .times
takes only 6 seconds.

Regexp match using / test /.match works about the same as s =~ / test /,
and takes about 5 seconds, vs. Perl's 1.4s (1e7 repetitions). Although,
seeing as there's such a huge difference in the empty loop alone, it's
hard to tell if it's because regexps themselves are slower in Ruby, or
if it's because of the regexp engine... :(

jruby is just as slow as regular ruby 1.9.3.

···

Curious: rubinius reports itself as 2.0.0dev ( 1.8.7 ), which is strange
- Ruby 1.8.7 does not support \p{} regexps (like \p{Alnum} for example),
and rubinius does..

···

It's all the parens, whitespace, and use of tabs that slows ruby down:

# takes 26.6 seconds on my laptop:

s = "This is a test string"

re = Regexp.new( / test / )

for a in 0..1E7
       re.match( s )
end

# takes 8.67 seconds on my laptop:

s = "This is a test string"

for a in 0..1E7
  s =~ / test /
end

···

On Feb 2, 2012, at 14:55 , Dmitry Nikiforov wrote:

Ryan Davis wrote in post #1043801:

Choosing the right language is a lot less important than choosing the
right algorithm:

5461 % time ruby -e 'n = 10**8; p (n + 3*n**2 + 2*n**3)/6'
333333338333333350000000

real 0m0.009s
user 0m0.004s
sys 0m0.004s

My code was merely an example of a very simple loop. Its purpose was not
to calculate something, but to run through the loop and execute a
multiplication on every iteration.

My main area of development is processing of rather large amounts of
data (billions of entries, primarily processed by regular expressions,
with some statistical analysis on top, and potentially - addition of NLP
later). You _have_ to iterate through every entry of the incoming data
(which might already be in the database, plain text file, or might be
just a "fire hose" of data pouring into the system in real time).

While I'd LOVE to have a nice and clean syntax, performance is still
number 1 on my list of priorities, therefore I asked if maybe there are
ways to improve Ruby performance.

···

x=Time.now();for a in 0..1E3;end; puts(Time.now-x)

0.000380635

x=Time.now();for a in 0..1E4;end; puts(Time.now-x)

0.00373148

x=Time.now();for a in 0..1E5;end; puts(Time.now-x)

0.029043426

x=Time.now();for a in 0..1E6;end; puts(Time.now-x)

0.201745265

x=Time.now();for a in 0..1E7;end; puts(Time.now-x)

1.939860867

x=Time.now();for a in 0..1E8;end; puts(Time.now-x)

19.276653266
note, the jump...

using plain fixnum may help, but the jump is still there

(3..8).each do |x|
t=Time.now();for a in 0..10**x;end; puts("#{x} #{Time.now()-t}")
end

3 0.000142406
4 0.001344933
5 0.014539207
6 0.076141941
7 0.737979205
8 7.359555691

best regards -botp
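
One thing worth noting about these loops: 1E8 is a Float literal in Ruby, while 10**8 is an Integer, so the two ranges have different endpoint types even though they cover the same values. The sketch below is scaled down to 1E5 so it finishes quickly. It only shows the type difference and times both loops. Whether that explains the gap in the timings above will depend on the interpreter and version.

```ruby
require 'benchmark'

float_end = 1E5    # Float literal, as in the original loop
int_end   = 10**5  # Integer with the same value

puts float_end.class  # Float
puts int_end.class    # Integer

float_count = 0
int_count   = 0

# Time the same empty-ish loop with each endpoint type.
float_time = Benchmark.realtime { for a in 0..float_end; float_count += 1; end }
int_time   = Benchmark.realtime { for a in 0..int_end; int_count += 1; end }

puts "Float end:   #{float_count} iterations in #{float_time}s"
puts "Integer end: #{int_count} iterations in #{int_time}s"
```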

···

On Fri, Feb 3, 2012 at 11:52 AM, botp <botpena@gmail.com> wrote:

On Fri, Feb 3, 2012 at 6:20 AM, Dmitry Nikiforov <dniq@dniq-online.com> wrote:

It might make a counter-based loop internally, but at a higher level
the for loop is translated into a call to the range's each method
(Range#each). The iterator knows how to produce the next number in the
range each time. It doesn't create an array with 1E8+1 elements.
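
A small sketch of that behaviour: `for` works on any object that responds to `each`, and a range hands out values one at a time. The `Countdown` class is hypothetical and exists only for illustration.

```ruby
# Hypothetical class with nothing but an #each method.
class Countdown
  def initialize(from)
    @from = from
  end

  # Yields from, from-1, ..., 1 without building an array.
  def each
    n = @from
    while n > 0
      yield n
      n -= 1
    end
  end
end

seen = []
for n in Countdown.new(3)  # `for` calls Countdown#each under the hood
  seen << n
end
p seen  # [3, 2, 1]

# Taking the first few values of a huge range returns immediately,
# which would not be the case if the range were expanded into an array first.
first_three = (0..10**8).first(3)
p first_three  # [0, 1, 2]
```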

···

On Fri, Feb 3, 2012 at 11:17 AM, Dmitry Nikiforov <dniq@dniq-online.com> wrote:

The thing you need to know about Rubinius and JRuby is that they both JIT (just-in-time) compile the code, but they need to collect statistics on the runtime profile first. That usually takes a few seconds. So any test that runs for under 10s or so doesn't give the runtime much opportunity to optimize the code.

If you really are going to be working on big datasets, then try to benchmark something that takes at least a minute or so to run. You *cannot* reliably extrapolate performance from a test that runs 4s versus 2s.

cr
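
One way to see warm-up effects is to time the same block several times in one process and compare the early runs with the later ones. This is only a sketch: `timed_runs` is a made-up helper, and the iteration count is kept small here so the example finishes quickly, whereas a real measurement would use a much longer workload as suggested above.

```ruby
require 'benchmark'

# Hypothetical helper: run the given block `runs` times in the same process
# and return the wall-clock time of each run, in seconds.
def timed_runs(runs)
  Array.new(runs) { Benchmark.realtime { yield } }
end

s = "This is a test string"
times = timed_runs(5) { 100_000.times { s =~ / test / } }

times.each_with_index do |t, i|
  puts "run #{i + 1}: #{'%.4f' % t}s"
end
```

If later runs are consistently faster than the first ones, the runtime is probably still optimizing, and the early numbers should not be used for comparison.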

···

On Feb 3, 2012, at 12:56 PM, Dmitry Nikiforov wrote:

jruby is just as slow as regular ruby 1.9.3.

I cannot reproduce. Post your jruby -v...I suspect it's running with
the non-optimizing JVM mode.

This is JRuby master + Java 7, which is the fastest of all Ruby impls
I have installed:

system ~/projects/jruby $ jruby -rbenchmark -e "3.times { puts
Benchmark.measure { for a in 0..1E8; a * 2; end } }"
  8.945000 0.000000 8.945000 ( 8.945000)
  8.405000 0.000000 8.405000 ( 8.405000)
  8.341000 0.000000 8.341000 ( 8.341000)

system ~/projects/jruby $ rvm 1.9.3 do ruby -rbenchmark -e "3.times {
puts Benchmark.measure { for a in 0..1E8; a * 2; end } }"
17.540000 0.000000 17.540000 ( 17.552517)
17.440000 0.010000 17.450000 ( 17.440687)
17.490000 0.000000 17.490000 ( 17.498955)

system ~/projects/jruby $ macruby -rbenchmark -e "3.times { puts
Benchmark.measure { for a in 0..1E8; a * 2; end } }"
15.800000 0.010000 15.810000 ( 15.803463)
15.750000 0.000000 15.750000 ( 15.751863)
15.850000 0.000000 15.850000 ( 15.860579)

system ~/projects/jruby $ ../rubinius/bin/rbx -rbenchmark -e "3.times
{ puts Benchmark.measure { for a in 0..1E8; a * 2; end } }"
11.190136 0.002077 11.192213 ( 11.193023)
11.087971 0.003063 11.091034 ( 11.091239)
  7.020758 0.001578 7.022336 ( 7.022235)

It's a pretty meaningless benchmark, though.

- Charlie

···

On Fri, Feb 3, 2012 at 12:56 PM, Dmitry Nikiforov <dniq@dniq-online.com> wrote:

jruby is just as slow as regular ruby 1.9.3.

> Here's another example with significantly bigger performance difference:
>
> Ruby:
>
> s = "This is a test string"
>
> re = Regexp.new( / test / )
>
> for a in 0..1E7
> re.match( s )
> end
>
> Perl:
>
> my $s = "This is a test string";
>
> for my $a ( 0..1E7 ) {
> $s =~ / test /;
> }
>
> Perl takes about 1.5 seconds to execute this, while Ruby takes a
> whopping 16!!! :((( I have a very strong feeling that I didn't compile
> Ruby properly - there can't be such a huge difference in regexp matching
> :frowning:

It's all the parens, whitespace, and use of tabs that slows ruby down:

Euhmmm, I doubt that ...

# takes 26.6 seconds on my laptop:

s = "This is a test string"

re = Regexp.new( / test / )

for a in 0..1E7
      re.match( s )
end

# takes 8.67 seconds on my laptop:

s = "This is a test string"

for a in 0..1E7
s =~ / test /
end

The same "formatted" code with just replacing re.match( s) by
s =~ /test/ also causes the same change from 22 to 7 seconds
on my system (with the same formatting, spaces, etc.).

I rather expect it's because

`match` and `=~` do quite different things ...

`match` returns a complete MatchData object

`=~` returns the index (position) of the first match

017:0> re.match( s )
=> #<MatchData " test ">
018:0> s =~ /test/
=> 10

<speculation>
Maybe (speculation) the MatchData object takes more
dynamic Object allocation and thus more calls to the GC ?
</speculation>
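
A short sketch of the difference in return values, using the string and pattern from the original post. The last line assumes Ruby 2.4 or later, where `Regexp#match?` is available. That method postdates this thread and is shown only as an option for cases where a yes/no answer is enough.

```ruby
s  = "This is a test string"
re = / test /

match_data = re.match(s)  # returns a MatchData object
index      = s =~ re      # returns the Integer offset of the match

p match_data.class  # MatchData
p match_data[0]     # " test "
p index             # 9

# Assumes Ruby 2.4+: returns true/false only.
p re.match?(s)      # true
```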

HTH,

Peter

···

On Fri, Feb 3, 2012 at 12:20 AM, Ryan Davis <ryand-ruby@zenspider.com>wrote:

On Feb 2, 2012, at 14:55 , Dmitry Nikiforov wrote:

Ryan is being a little facetious about the parentheses and whitespace, in
case that isn't clear. He has strong preferences about coding style. :)

Your test above runs in about 10 seconds on my system under Ruby 1.9.2.
The following equivalent code runs in about 6 seconds and is fairly
idiomatic Ruby:

s = "This is a test string"
(0..1E7).each do
  s =~ / test /
end

This code runs in about 4 seconds, but it is a bit less pretty to my eyes:

s = "This is a test string"
i = 0
while i < 1E7 do
  s =~ / test /
  i += 1
end

I'm sure there are other solutions as well. The thing to keep in mind
is that method calls in Ruby are relatively expensive, so if you need
speed, you should try to avoid them.

Don't get hung up on micro benchmarks like the above though! They can
really be deceiving with respect to real world applications.

-Jeremy
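
To compare the three loop styles mentioned above on your own machine, a sketch along these lines using the standard Benchmark library may help. The iteration count is reduced from 1E7 so it runs quickly, and the numbers it prints will vary by Ruby version and implementation.

```ruby
require 'benchmark'

s = "This is a test string"
n = 1_000_000  # scaled down from the 1E7 used in the thread

results = Benchmark.bm(6) do |x|
  x.report("for")   { for a in 0..n; s =~ / test /; end }
  x.report("each")  { (0..n).each { s =~ / test / } }
  x.report("while") do
    i = 0
    while i <= n
      s =~ / test /
      i += 1
    end
  end
end
```

Each row shows user, system, total and real time for one loop style. As noted above, differences on a micro benchmark like this may not carry over to a real workload.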

···

On 02/02/2012 05:20 PM, Ryan Davis wrote:

On Feb 2, 2012, at 14:55 , Dmitry Nikiforov wrote:

One thing you can do is to replace for loops with while loops. A for loop in Ruby is translated into a call to the each method of the object being iterated (Range#each here), and in Ruby 1.9 that is slower than an ordinary while loop because of the overhead of processing enumerators. It is actually even slower than each in Ruby 1.8, because 1.8 does not have enumerators.

···

On 2/2/2012 9:14 PM, Dmitry Nikiforov wrote:

pls ignore. i think it is consistent at 10
-botp

···

On Fri, Feb 3, 2012 at 12:03 PM, botp <botpena@gmail.com> wrote:

Ryan Davis wrote in post #1043801:

Choosing the right language is a lot less important than choosing the
right algorithm:

5461 % time ruby -e 'n = 10**8; p (n + 3*n**2 + 2*n**3)/6'
333333338333333350000000

real 0m0.009s
user 0m0.004s
sys 0m0.004s

My code was merely an example of very simple loop. Its purpose was not
to calculate something, but run through the loop, and execute
multiplication on every iteration.

Yes, and maybe it was a bad example for what you are trying to do:

My main area of development is processing of rather large amounts of
data (billions of entries, primarily processed by regular expressions,
with some statistical analysis on top, and potentially - addition of NLP
later). You _have_ to iterate through every entry of the incoming data
(which might already be in the database, plain text file, or might be
just a "fire hose" of data pouring into the system in real time).

Question: the data needs to come from somewhere. Are you sure that
your processing is CPU bound? If it is IO bound the difference
between Perl and Ruby won't really show. I reckon it's better to
create a more realistic example of what you are trying to do and
measure again. (And take care to run tests between Ruby and Perl
alternating in order to prevent OS IO caching from preferring one or
the other.)
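
One rough way to check this is to compare CPU time with wall-clock time for a realistic read-and-parse loop. The sketch below writes a small temporary file of comma-separated rows (made-up data, purely for illustration) and measures reading and splitting it. If wall time is much larger than CPU time on the real data, the job is likely waiting on IO rather than on the interpreter.

```ruby
require 'benchmark'
require 'tempfile'

rows = 0
tms  = nil

Tempfile.create('rows') do |f|
  # Made-up sample data; a real test would use the actual input source.
  10_000.times { |i| f.puts "#{i},alpha,beta,gamma" }
  f.flush

  tms = Benchmark.measure do
    File.foreach(f.path) do |line|
      line.chomp.split(',')
      rows += 1
    end
  end
end

cpu = tms.utime + tms.stime
puts "rows: #{rows}"
puts "cpu:  #{cpu}s"
puts "wall: #{tms.real}s"
```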

Kind regards

robert

···

On Fri, Feb 3, 2012 at 3:14 AM, Dmitry Nikiforov <dniq@dniq-online.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/