Basic Ruby performance

Ummmm... yeah... every time you tell it to do ten times as many loops,
it takes almost ten times as long. What's so surprising?

If you tweak the code to make it say how long each loop took, you'll
see that it actually gets FASTER at first, presumably due to assorted
constant overhead, then a tiny pinch slower (possibly due to the
switch to a different kind of number) but little enough that IMHO
that's lost in the noise.

puts("Pwr Tot Secs uS/Loop ")
(3..8).each do |x|
  limit = 10**x
  start_time = Time.now()
  for a in 0 .. limit
    # do nothing here, just timing how long the loops take
  end
  elapsed = Time.now() - start_time
  puts(" #{x} #{elapsed} #{elapsed * 1000000.0 / limit} ")
end

will output, with a tiny bit of post-formatting:

Pwr    Tot Secs   uS/Loop
  3    0.000292   0.292
  4    0.001869   0.1869
  5    0.014103   0.14103
  6    0.1058     0.1058
  7    1.051723   0.1051723
  8   10.584073   0.10584073

Of course, as someone mentioned later, the runtime may matter; this is
MRI 1.9.3.
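
For anyone repeating this on a newer Ruby, here is a rough sketch of the same measurement using a monotonic clock instead of Time.now, so that system clock adjustments cannot skew the numbers. It assumes Ruby 2.1 or later (Process.clock_gettime is not available on the 1.9.3 used above); the helper name time_empty_loop is made up for illustration, and the powers are kept small so it finishes quickly.

```ruby
# Time an empty loop of `limit` iterations and return the elapsed seconds.
# `time_empty_loop` is a hypothetical helper name, not something from the thread.
def time_empty_loop(limit)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  for a in 0...limit
    # do nothing here, just timing how long the loops take
  end
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Collect [power, total seconds, microseconds per loop] for a few sizes.
results = (3..5).map do |power|
  limit = 10**power
  elapsed = time_empty_loop(limit)
  [power, elapsed, elapsed * 1_000_000.0 / limit]
end

puts format("%3s %12s %10s", "Pwr", "Tot Secs", "uS/Loop")
results.each do |power, elapsed, per_loop|
  puts format("%3d %12.6f %10.4f", power, elapsed, per_loop)
end
```

The absolute numbers will differ by machine and runtime; only the trend in the last column is worth reading.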

-Dave

···

On Thu, Feb 2, 2012 at 23:03, botp <botpena@gmail.com> wrote:

(3..8).each do |x|
t=Time.now();for a in 0..10**x;end; puts("#{x} #{Time.now()-t}")
end

3 0.000142406
4 0.001344933
5 0.014539207
6 0.076141941
7 0.737979205
8 7.359555691

--
Dave Aronson: Available Cleared Ruby on Rails Freelancer
(NoVa/DC/Remote) -- see www.DaveAronson.com, and blogs at
www.Codosaur.us, www.Dare2XL.com, www.RecruitingRants.com

And your other bench. Again, pretty meaningless.

MacRuby does especially poorly here.

system ~/projects/jruby $ cat other_bench.rb
s = "This is a test string"

re = Regexp.new( / test / )

5.times {
  puts Benchmark.measure {
    for a in 0..1E7
      re.match( s )
    end
  }
}

system ~/projects/jruby $ jruby -rbenchmark other_bench.rb
  3.088000 0.000000 3.088000 ( 3.088000)
  2.359000 0.000000 2.359000 ( 2.359000)
  2.412000 0.000000 2.412000 ( 2.412000)

system ~/projects/jruby $ rvm 1.9.3 do ruby -rbenchmark other_bench.rb
12.880000 0.010000 12.890000 ( 12.894050)
12.840000 0.000000 12.840000 ( 12.845842)
12.800000 0.010000 12.810000 ( 12.804021)

system ~/projects/jruby $ macruby -rbenchmark other_bench.rb
65.710000 2.300000 68.010000 ( 56.906346)
^C

system ~/projects/jruby $ ../rubinius/bin/rbx -rbenchmark other_bench.rb
  4.682926 0.003170 4.686096 ( 4.686355)
  4.719460 0.002050 4.721510 ( 4.721482)
  4.181200 0.001947 4.183147 ( 4.183126)

- Charlie

You're replacing a method call (a.match b) with a syntactic construct (a =~ b); the latter bypasses method dispatch and goes straight to the C implementation. Nothing else is really different, just a more direct code path. The match data is still available via the usual globals.

IMHO re.match is just as useless as Regexp.new, Array.new, and Hash.new (assuming no args/blocks passed). They're throwbacks to Java devs and serve no purpose but to make things more verbose. In this specific case, there are tangible reasons to use =~ over #match.

···

On Feb 2, 2012, at 15:33 , Peter Vandenabeele wrote:

The same "formatted" code with just replacing re.match( s) by
s =~ /test/ also causes the same change from 22 to 7 seconds
on my system (with the same formatting, spaces, etc.).

I tried to drive that point home by showing a Ruby solution that was 1000x faster than his Perl solution, but unfortunately, rationality and micro-benchmarking don't often play well together.

···

On Feb 2, 2012, at 15:36 , Jeremy Bopp wrote:

Don't get hung up on micro benchmarks like the above though! They can
really be deceiving with respect to real world applications.

Jeremy Bopp wrote in post #1043805:

(0..1E7).each do
  s =~ / test /
end

Hmmm... Didn't realize this would make a difference :) Thanks!

Don't get hung up on micro benchmarks like the above though! They can
really be deceiving with respect to real world applications.

Well, I started doing these benchmarks after I tried to rewrite parts
of the project I'm working on in Ruby. The project is rather
complicated, so it seemed as if Ruby's neat, clean syntax would make it
easier to handle, but the performance was dreadful. Initially I tried
1.8.7, which comes natively with OS X Lion, then installed 1.9.3, without
much difference in performance: it's still mostly multiple times slower
than the Perl version I have :( The problem with the Perl version, though,
is that once it reaches a certain limit it becomes rather hard to manage
(especially if you focus on performance the most: there are tricks
in Perl that make code run significantly faster, but make it virtually
unreadable).

···

--
Posted via http://www.ruby-forum.com/.

Su Zhang wrote in post #1043831:

···

On 2/2/2012 9:14 PM, Dmitry Nikiforov wrote:

One thing you can do is to replace for loops with while loops. For loops
in Ruby will be translated to method calls to Enumerable#each, and in
Ruby 1.9, Enumerable#each is slower than using ordinary while loops
because of the overhead of processing enumerators. It is actually even
slower than Ruby 1.8's Enumerable#each because 1.8 does not have
enumerators.
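
To make the for-versus-while point concrete, here is a minimal sketch. It relies on the fact that `for x in obj` is evaluated by calling `each` on the object after `in` (Range#each for the ranges used in this thread); LoggingRange is a hypothetical class written only to make that call visible.

```ruby
# A tiny collection that counts how often #each is called on it.
# `LoggingRange` is a made-up class for illustration only.
class LoggingRange
  attr_reader :each_calls

  def initialize(first, last)
    @first = first
    @last = last
    @each_calls = 0
  end

  def each
    @each_calls += 1
    i = @first
    while i <= @last
      yield i
      i += 1
    end
  end
end

r = LoggingRange.new(0, 3)

for_sum = 0
for a in r          # `for` calls r.each behind the scenes
  for_sum += a
end

while_sum = 0
i = 0
while i <= 3        # plain `while`: no method call and no block per iteration
  while_sum += i
  i += 1
end
```

Both loops compute the same sum; whether the `while` form is measurably faster is something to benchmark on your own runtime.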

Hmmm... Thanks! Definitely useful advice!


Robert Klemme wrote in post #1043884:

Question: the data needs to come from somewhere. Are you sure that
your processing is CPU bound? If it is IO bound the difference
between Perl and Ruby won't really show. I reckon it's better to
create a more realistic example of what you are trying to do and
measure again. (And take care to run tests between Ruby and Perl
alternating in order to prevent OS IO caching from preferring one or
the other.)
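
One rough way to check the CPU-bound question from inside Ruby is to compare CPU time with wall-clock time for a piece of work. The sketch below uses Benchmark.measure for that; cpu_fraction is a hypothetical helper name, the sleep only stands in for waiting on IO, and the loop is an arbitrary stand-in for real processing.

```ruby
require 'benchmark'

# Returns CPU time (user + system) divided by wall-clock time for the block.
# Values near 1.0 suggest CPU-bound work; values near 0.0 suggest waiting.
# `cpu_fraction` is a made-up helper name.
def cpu_fraction
  tms = Benchmark.measure { yield }
  cpu = tms.utime + tms.stime
  tms.real.zero? ? 0.0 : cpu / tms.real
end

waiting   = cpu_fraction { sleep 0.2 }                     # stands in for IO wait
computing = cpu_fraction { 200_000.times { |i| i * i } }   # pure computation

puts format("waiting:   %.2f", waiting)
puts format("computing: %.2f", computing)
```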

Yep, I'm sure it's CPU bound: the CPU load is at 100%. At this point,
unfortunately, the data comes in faster than the Ruby script can process
it. I'm trying to optimize it, of course, but so far the Perl version
beats Ruby hands down. There don't seem to be many options at my
disposal, though. In my examples, the "for" seems to be the major culprit:
it alone, without ANYTHING within the loop, takes 19 seconds to execute
1E8 times. The (0..1E8).each only saves about 1 second for me, which
doesn't really matter, since most of the loops in my scripts are "while"
loops anyway. Still, the regexps themselves run very slowly. I wish Ruby
used Perl's standard PCRE library: that would make at least regexps run
as fast as they do in Perl, and I would be able to write my scripts in
Ruby :)

···


i noticed the multiply-by-10 too late. what i was emphasizing is that,
given the simple loop above, your point of acceptance should be less than
10**7 iterations. beyond that, you'd get unacceptable response time.
just imagine, 7 seconds! this would not be acceptable for database
apps, for example, with response-time targets of less than 5 seconds.

kind regards -botp

···

On Sat, Feb 4, 2012 at 12:27 AM, Dave Aronson <rubytalk2dave@davearonson.com> wrote:


Ummmm... yeah... every time you tell it to do ten times as many loops,
it takes almost ten times as long. What's so surprising?

You're replacing a method call (a.match b) with a syntactic construct a =~
b, the latter of which bypasses method dispatch and goes straight to the
C implementation.

Wow, I never knew that. I don't understand how it accomplishes this:
`a` could be any kind of object with =~ defined anywhere on it, so how
can it bypass method dispatch?

IMHO re.match is just as useless as Regexp.new, Array.new, and Hash.new
(assuming no args/blocks passed).

I usually use `meth Hash.new` instead of `meth({})`; I think it looks
cleaner.

···

On Thu, Feb 2, 2012 at 7:01 PM, Ryan Davis <ryand-ruby@zenspider.com> wrote:

Ryan Davis wrote in post #1043813:

IMHO re.match is just as useless as Regexp.new, Array.new, and Hash.new
(assuming no args/blocks passed). They're throwbacks to java devs and
serve no purpose but to make things more verbose. In this specific case,
there are tangible reasons to use =~ over #match.

The reason I tried to use Regexp.new is because I figured it would
pre-compile the regexp - the way "qr/ test /" in Perl would do, so that
it doesn't have to re-compile it on every iteration.

···


Hard to believe that this thread has gone this long without a mention of other Ruby runtimes.

You may want to also benchmark with JRuby (jruby.org) or with Rubinius (rubini.us). For ease of installation, you may want to consider using "rvm" to manage your Rubies (google for it to figure out how to install it).

cr

···

On Feb 2, 2012, at 8:19 PM, Dmitry Nikiforov wrote:

Jeremy Bopp wrote in post #1043805:

(0..1E7).each do
s =~ / test /
end

Hmmm... Didn't realize this would make a difference :) Thanks!

Don't get hung up on micro benchmarks like the above though! They can
really be deceiving with respect to real world applications.

Well, I started doing these benchmarks after I've tried to rewrite parts
of the project I'm working on in Ruby. The project is rather
complicated, so it seemed as if Ruby's neat, clean syntax would make it
easier to handle, but the performance was dreadful. Initially I tried
1.8.7 that came natively with OS X Lion, then installed 1.9.3, without
much difference in performance - it's still mostly multiple times slower
than the Perl version I have :( The problem with Perl version, though,
is that once it reaches certain limit - it becomes rather hard to manage
(especially so if you focus on performance the most - there are tricks
in Perl that make code run significantly faster, but make it virtually
unreadable).

It creates a Range, which just iterates, not an array. A more idiomatic way
would probably be (1+10**8).times { ... }
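
A quick sketch of that equivalence, with a much smaller limit than the 10**8 used in the thread so it runs instantly. It only checks that the range is a Range (not an Array) and that both forms run the block the same number of times.

```ruby
limit = 10**4 # the thread used 10**8; kept small here

range = 0..limit
range_count = 0
range.each { range_count += 1 }          # iterates 0, 1, ..., limit

times_count = 0
(1 + limit).times { times_count += 1 }   # same number of iterations

puts "#{range.class}: #{range_count} iterations, times: #{times_count}"
```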

As an aside, if all the processing is happening in the loop, then it might
make more sense that the loop just delegates work out to other processes
(e.g. parse a line or process a parsed set of data). This could be pretty
simple if done with a thread pool in a single Ruby script (you'll want one
of the alternate implementations here since you're CPU bound and MRI has a
GIL), or as arbitrarily complex as you like.
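
For what the simple version might look like, here is a minimal thread-pool sketch built on the core Queue class. The sample lines, the worker count and the per-line work are all invented for illustration; as noted above, on MRI the GIL means this will not speed up CPU-bound work, so treat it as a shape to adapt rather than a fix.

```ruby
# Sample data, worker count and the regexp check are made up for illustration.
lines   = Array.new(20) { |i| "line #{i} has a test string" }
workers = 4

jobs    = Queue.new
results = Queue.new
lines.each { |line| jobs << line }
workers.times { jobs << :done }   # one stop marker per worker

pool = Array.new(workers) do
  Thread.new do
    # Each worker pulls lines until it sees its stop marker.
    while (line = jobs.pop) != :done
      results << [line, !!(line =~ / test /)]
    end
  end
end
pool.each(&:join)

# Drain the results queue into a plain array.
matched = []
matched << results.pop until results.empty?
puts "processed #{matched.size} lines"
```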

···

On Fri, Feb 3, 2012 at 11:13 AM, Dmitry Nikiforov <dniq@dniq-online.com> wrote:

Robert Klemme wrote in post #1043884:

> Question: the data needs to come from somewhere. Are you sure that
> your processing is CPU bound? If it is IO bound the difference
> between Perl and Ruby won't really show. I reckon it's better to
> create a more realistic example of what you are trying to do and
> measure again. (And take care to run tests between Ruby and Perl
> alternating in order to prevent OS IO caching from preferring one or
> the other.)

Yep, I'm sure it's CPU bound: the CPU load is at 100%. The data comes in
faster than the Ruby script can process it, unfortunately, at this
point. I'm trying to optimize it, of course, but so far Perl version
beats Ruby hands down. But there aren't too many options at my disposal,
it seems. In my examples, the "for" seems to be the major culprit: it
alone, without ANYTHING within the loop, takes 19 seconds to execute 1E8
times. The (0..1E8).each only saves about 1 second for me. Which doesn't
really matter - most of the loops in my scripts are "while" loops
anyway. Still, the regexps themselves run very slow. I wish Ruby used
standard Perl's PCRE library - that would make at least regexps run as
fast as they do in Perl, and I would be able to write my scripts in Ruby
:)


You're replacing a method call (a.match b) with a syntactic construct a =~
b, the latter of which bypasses method dispatch and goes straight to the
C implementation.

Wow, I never knew that. I don't understand how it accomplishes this, a
could be any kind of object with =~ defined anywhere on it, how can it
bypass method dispatch?

MAGIC!

The code does extra type-checking at runtime.
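
A small sketch of what that means in practice. My understanding (an assumption worth checking against the interpreter source) is that the fast path is only taken for built-in String/Regexp operands and everything else goes through ordinary dispatch. The code below demonstrates only the second half: a user-defined =~ is still honored. AlwaysMatch is a hypothetical class.

```ruby
# `AlwaysMatch` is a made-up class for illustration.
class AlwaysMatch
  def =~(_other)
    :custom_match
  end
end

obj     = AlwaysMatch.new
custom  = (obj =~ /anything/)      # dispatches to AlwaysMatch#=~
builtin = ("a test" =~ /test/)     # String =~ Regexp, returns the match index

puts custom.inspect
puts builtin.inspect
```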

IMHO re.match is just as useless as Regexp.new, Array.new, and Hash.new
(assuming no args/blocks passed).

I usually use `meth Hash.new` instead of `meth({})` I think it looks
cleaner.

def meth h = {}
  # ...
end

takes care of this entirely.
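
A short sketch of why the default argument is safe to rely on: the default expression is evaluated on each call, so every call without an argument gets its own fresh hash. The body of meth here is invented purely to make that visible.

```ruby
# `meth` mirrors the name used above; its body is made up for illustration.
def meth(h = {})
  h[:calls] = h.fetch(:calls, 0) + 1
  h
end

first  = meth                   # default {} is built fresh for this call
second = meth                   # and again here, so nothing leaks between calls
custom = meth({ calls: 5 })     # an explicit hash is used as given
```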

···

On Feb 2, 2012, at 17:36 , Josh Cheek wrote:

On Thu, Feb 2, 2012 at 7:01 PM, Ryan Davis <ryand-ruby@zenspider.com> wrote:

That's only because it *does* look cleaner.

···

On Fri, Feb 03, 2012 at 10:36:11AM +0900, Josh Cheek wrote:

I usually use `meth Hash.new` instead of `meth({})` I think it looks
cleaner.

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

Everything in Ruby is an object, even regexps, so you can save your
regexp to a variable or a constant to avoid a recompile. In addition,
the // expression is pretty much just syntactic sugar for
Regexp.new("some string") or Regexp.new(/some regexp/), so you can
forgo that noise. The sugar is probably faster too since it should
avoid Ruby method calls, unlike Regexp.new, not that it should be an
issue in this example.

To see if this helps at all, try changing the code to the following:

s = "This is a test string"
re = / test /
for a in 0..1E7
  s =~ re
end

Try a similar change to the other looping variations that have been
discussed and see if and how much they may improve. For me I didn't
really see any difference between using re as above or using the simple
regexp directly; however, the code was almost an order of magnitude
slower when I replaced the comparison as follows:

  s =~ / test#{} /

It seems that Ruby is smart enough to see that the simple regexp will
never need to be re-evaluated. The regexp used above must force that
optimization off because #{}, while always evaluating to the empty
string, is technically dynamic; thus the regexp needs to be re-evaluated
in every iteration of the loop.
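
One way to see the difference without timing anything is to check object identity. In the sketch below a plain literal hands back the same Regexp object every time it is evaluated, while an interpolated one is rebuilt on each evaluation. The variable suffix is a hypothetical stand-in for the empty #{} above, used so the interpolation is unambiguously dynamic.

```ruby
suffix = "" # hypothetical stand-in for the empty #{} in the post above

def literal_regexp
  / test /
end

def dynamic_regexp(suffix)
  / test#{suffix} /
end

# Same object on every evaluation of the literal.
same_literal = literal_regexp.equal?(literal_regexp)

# A new Regexp object on every evaluation of the interpolated form.
same_dynamic = dynamic_regexp(suffix).equal?(dynamic_regexp(suffix))

# The two are still equal by value (same source and options).
equal_by_value = (dynamic_regexp(suffix) == literal_regexp)
```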

If you *really* need performance in the end, however, you might want to
consider coding your critical code paths in something like C and then
calling those from Ruby as a direct extension or using something like
ffi to call into a DLL containing the logic. Your overall code base may
be a little messy, but sometimes the speed you need requires such a
trade-off. Hopefully, you can keep the mess limited to only a small set
of your overall application logic. Of course, the same holds true for
Perl in this regard.

-Jeremy

···

On 02/02/2012 08:21 PM, Dmitry Nikiforov wrote:

Ryan Davis wrote in post #1043813:

IMHO re.match is just as useless as Regexp.new, Array.new, and Hash.new
(assuming no args/blocks passed). They're throwbacks to java devs and
serve no purpose but to make things more verbose. In this specific case,
there are tangible reasons to use =~ over #match.

The reason I tried to use Regexp.new is because I figured it would
pre-compile the regexp - the way "qr/ test /" in Perl would do, so that
it doesn't have to re-compile it on every iteration.

Not necessary in Ruby: regexp literals are treated specially and are
not recompiled. Usually it's faster to do

io.each do |line|
  if line =~ /foo/
  end
end

than

rx = /foo/

io.each do |line|
  if line =~ rx
  end
end

If there is dynamic content, use /o:

input = gets

io.each do |line|
  if line =~ /foo:#{input}/o
  end
end
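
A word of caution on /o, with a small sketch: the interpolation happens only the first time that regexp is evaluated, and the result is reused afterwards even if the interpolated value changes. So it fits the case above (input is read once) but not a value that varies between calls. matcher_for is a hypothetical helper name.

```ruby
# `matcher_for` is a made-up helper; with /o the pattern is built only once.
def matcher_for(word)
  /foo:#{word}/o
end

first  = matcher_for("bar")
second = matcher_for("baz")   # interpolation is NOT redone here

puts first.source
puts second.source
```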

Kind regards

robert

···

On Fri, Feb 3, 2012 at 3:21 AM, Dmitry Nikiforov <dniq@dniq-online.com> wrote:

Ryan Davis wrote in post #1043813:

IMHO re.match is just as useless as Regexp.new, Array.new, and Hash.new
(assuming no args/blocks passed). They're throwbacks to java devs and
serve no purpose but to make things more verbose. In this specific case,
there are tangible reasons to use =~ over #match.

The reason I tried to use Regexp.new is because I figured it would
pre-compile the regexp - the way "qr/ test /" in Perl would do, so that
it doesn't have to re-compile it on every iteration.

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Josh Cheek wrote in post #1043970:

It creates a Range, which just iterates, not an array. A more idiomatic
way would probably be (1+10**8).times { ... }

Phew... :)

As an aside, if all the processing is happening in the loop, then it
might make more sense that the loop just delegates work out to other
processes (e.g. parse a line or process a parsed set of data). This could
be pretty simple if done with a thread pool in a single Ruby script
(you'll want one of the alternate implementations here since you're CPU
bound and MRI has a GIL), or as arbitrarily complex as you like.

Yeah, that's how it works in my Perl version - it all runs on Amazon,
with workload delegated to "worker" servers in a MapReduce-like fashion,
using Redis for inter-server communication.

···


. . . unless you have a different default for the method's argument.

···

On Fri, Feb 03, 2012 at 11:03:28AM +0900, Ryan Davis wrote:

On Feb 2, 2012, at 17:36 , Josh Cheek wrote:
>
> I usually use `meth Hash.new` instead of `meth({})` I think it looks
> cleaner.

def meth h = {}
  # ...
end

takes care of this entirely.

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

The o flag tells Ruby to only interpolate the first time, and then cache
the regex

s =~ / test#{} /o

···

On Thu, Feb 2, 2012 at 8:57 PM, Jeremy Bopp <jeremy@bopp.net> wrote:

Try a similar change to the other looping variations that have been
discussed and see if and how much they may improve. For me I didn't
really see any difference between using re as above or using the simple
regexp directly; however, the code was almost an order of magnitude
slower when I replaced the comparison as follows:

s =~ / test#{} /

It seems that Ruby is smart enough to see that the simple regexp will
never need to be re-evaluated. The regexp used above must force that
optimization off because #{} while constantly evaluated to the empty
string is technically dynamic, thus the regexp needs to be re-evaluated
in every iteration of the loop.

Jeremy Bopp wrote in post #1043834:

Thank you for the advice!

If you *really* need performance in the end, however, you might want to
consider coding your critical code paths in something like C and then
calling those from Ruby as a direct extension or using something like
ffi to call into a DLL containing the logic. Your overall code base may
be a little messy, but sometimes the speed you need requires such a
trade-off. Hopefully, you can keep the mess limited to only a small set
of your overall application logic. Of course, the same holds true for
Perl in this regard.

Well, the performance of Perl has so far been very satisfactory. In
fact, as far as regexps are concerned, I could barely match Perl's
performance in C++ (and even then had to mix in some plain C code). So
far, it seems that I'm stuck with Perl :( Not that it's really a bad
thing: I've been developing in it since 1997, so I know it pretty well,
while I've only spent about 2 weeks with Ruby.

I guess I will have to wait and see if the Ruby interpreter becomes more
efficient :( But I have to confess: I'm REALLY tempted to, in some
cases, forgo the performance in favor of handsome code :)

···
