Methods defined with module_eval are... faster?

Hi, just let me show this code:

···

-------------------
require "benchmark"

module M
  self.module_eval("
    def self.kk(var)
      if var == 123
        a=var*2
      else
        a=var*4
      end
    end
")

  def self.aa(var)
    if var == 123
      b=var*2
    else
      b=var*4
    end
  end
end

printf "M.kk: "
puts Benchmark.realtime { 4000000.times { M.kk(123) ; M.kk(4) } }
printf "M.aa: "
puts Benchmark.realtime { 4000000.times { M.aa(123) ; M.aa(4) } }
printf "M.kk: "
puts Benchmark.realtime { 4000000.times { M.kk(123) ; M.kk(4) } }
printf "M.aa: "
puts Benchmark.realtime { 4000000.times { M.aa(123) ; M.aa(4) } }
-------------------

Results (ruby 1.9.2):

M.kk: 1.4799699783325195
M.aa: 1.6037561893463135
M.kk: 1.4918322563171387
M.aa: 1.612684965133667

This is, M.kk (defined with module_eval) is faster than M.aa (which is
a commonly defined method). ¿?¿?

--
Iñaki Baz Castillo
<ibc@aliax.net>

Yes that's correct. In order to understand the logic it's important to take a look at the resulting bytecode, as benchmarks can get a little weird. First the non-module_eval version:

Resulting bytecode:

As shown here, the method is created, then attached to the module. Now the module_eval version:

Resulting bytecode:

Now here, instead of creating a method and attaching it the module, it's instead just putting a string with the code on the stack and sending it off to module_eval. This hits the C side of ruby in the following manner:

So in essence it means more time is spent in the C side instead of a back and forth chain of bytecode calls. Note that all bytecode is produced via MRI 1.9.3. Honestly though the difference between the two using the benchmark was minimal enough that I don't think it's anything to be worried over. Also note that syntax highlighters will see the eval string as what it is, a literal string. This will make your code less readable!

Regards,
Chris White
Twitter: http://www.twitter.com/cwgem

Hi,

When I run this using Benchmark.bmbm (to account for GC) I don't see
any appreciable difference between the two methods.

I used this code with ruby 1.9.2:

TIMES=4000000
Benchmark.bmbm {|x| x.report("M.kk") { TIMES.times { M.kk(123) ; M.kk(4) } } }
Benchmark.bmbm {|x| x.report("M.aa") { TIMES.times { M.aa(123) ; M.aa(4) } } }
Benchmark.bmbm {|x| x.report("M.kk") { TIMES.times { M.kk(123) ; M.kk(4) } } }
Benchmark.bmbm {|x| x.report("M.aa") { TIMES.times { M.aa(123) ; M.aa(4) } } }

and got these results:

M.kk 3.850000 0.010000 3.860000 ( 4.149469)
M.aa 3.960000 0.010000 3.970000 ( 3.995387)
M.kk 3.900000 0.020000 3.920000 ( 4.154616)
M.aa 3.900000 0.010000 3.910000 ( 4.201047)

I then re-ran the original code but changed the order of the tests. I
got these results:

Original order:
M.kk: 3.75673246383667
M.aa: 3.9386367797851562
M.kk: 3.531486988067627
M.aa: 3.7896509170532227

Swapped:
M.aa: 4.207374095916748
M.kk: 4.187262296676636
M.aa: 3.915318727493286
M.kk: 4.421187162399292

Apart from showing that I have a ~much~ slower machine than yours, I
don't see anything significant in these results.
I think what you're seeing is an artefact of garbage collection.

Regards,
Sean

I got that aa was slightly quicker on 1.9.2p180
M.kk: 1.5347199440002441
M.aa: 1.506194829940796
M.kk: 1.527411937713623
M.aa: 1.4843168258666992

···

On Sun, Aug 7, 2011 at 7:45 AM, Iñaki Baz Castillo <ibc@aliax.net> wrote:

Hi, just let me show this code:

-------------------
require "benchmark"

module M
self.module_eval("
   def self.kk(var)
     if var == 123
       a=var*2
     else
       a=var*4
     end
   end
")

def self.aa(var)
   if var == 123
     b=var*2
   else
     b=var*4
   end
end
end

printf "M.kk: "
puts Benchmark.realtime { 4000000.times { M.kk(123) ; M.kk(4) } }
printf "M.aa: "
puts Benchmark.realtime { 4000000.times { M.aa(123) ; M.aa(4) } }
printf "M.kk: "
puts Benchmark.realtime { 4000000.times { M.kk(123) ; M.kk(4) } }
printf "M.aa: "
puts Benchmark.realtime { 4000000.times { M.aa(123) ; M.aa(4) } }
-------------------

Results (ruby 1.9.2):

M.kk: 1.4799699783325195
M.aa: 1.6037561893463135
M.kk: 1.4918322563171387
M.aa: 1.612684965133667

This is, M.kk (defined with module_eval) is faster than M.aa (which is
a commonly defined method). ¿?¿?

--
Iñaki Baz Castillo
<ibc@aliax.net>

Okay so I took another look at this. Since the bytecode I showed before only showed the string of code being sent to module_eval, I needed to find a way to get the bytecode for the result of running it. Fortunately in vm_eval.c there's a line that's set to `if (0)`, where the code it's surrounding shows the resulting eval bytecode (with a comment saying use it for debug purposes only, which, guess what I'm doing!) Horray, so I set this to 1, recompiled and this is what I got:

Line for line, it's all exactly the same (okay with the exception of variable names). This breaks my theory that eval's behind the scenes generation did something unique with the code. Here's my results running it on Mac OSX 1.9.2-p290 source compile:

M.kk: 1.056156873703003
M.aa: 1.0313341617584229
M.kk: 1.0313379764556885
M.aa: 1.0262627601623535

In this case there's less and less of a difference. I'd almost go so far to say that it's just slight inaccuracies involved in benchmark timings, with the bytecode being the same and all.

Regards,
Chris White
http://www.twitter.com/cwgem

···

On Aug 7, 2011, at 5:45 AM, Iñaki Baz Castillo wrote:

Hi, just let me show this code:

-------------------
require "benchmark"

module M
self.module_eval("
   def self.kk(var)
     if var == 123
       a=var*2
     else
       a=var*4
     end
   end
")

def self.aa(var)
   if var == 123
     b=var*2
   else
     b=var*4
   end
end
end

printf "M.kk: "
puts Benchmark.realtime { 4000000.times { M.kk(123) ; M.kk(4) } }
printf "M.aa: "
puts Benchmark.realtime { 4000000.times { M.aa(123) ; M.aa(4) } }
printf "M.kk: "
puts Benchmark.realtime { 4000000.times { M.kk(123) ; M.kk(4) } }
printf "M.aa: "
puts Benchmark.realtime { 4000000.times { M.aa(123) ; M.aa(4) } }
-------------------

Results (ruby 1.9.2):

M.kk: 1.4799699783325195
M.aa: 1.6037561893463135
M.kk: 1.4918322563171387
M.aa: 1.612684965133667

Amazing explanation.

Yes. the difference is not so critical/important, not at all. And of
course writing methods as strings to be later evaluated is a pain :slight_smile:

Thanks a lot.

···

2011/8/7 Chris White <cwprogram@live.com>:

So in essence it means more time is spent in the C side instead of a back and forth chain of bytecode calls. Note that all bytecode is produced via MRI 1.9.3. Honestly though the difference between the two using the benchmark was minimal enough that I don't think it's anything to be worried over. Also note that syntax highlighters will see the eval string as what it is, a literal string. This will make your code less readable!

--
Iñaki Baz Castillo
<ibc@aliax.net>

This doesn't explain at all why running the method `kk` is faster, it explains why
the small snippet *creating* the method `kk` might be faster than creating the
method `aa`. You didn't show and compare the bytecode of the method `kk`
after it had been compiled by module_eval, which is what is actually run by
the loop and is measured to be faster than the bytecode inside `aa`.

Michael Edgar
adgar@carboni.ca
http://carboni.ca/

···

On Aug 7, 2011, at 11:01 AM, Chris White wrote:

Yes that's correct. In order to understand the logic it's important to take a look at the resulting bytecode, as benchmarks can get a little weird. First the non-module_eval version:

non-module_eval version · GitHub

Resulting bytecode:

non-module_eval version bytecode · GitHub

As shown here, the method is created, then attached to the module. Now the module_eval version:

Using module eval · GitHub

Resulting bytecode:

module_eval version bytecode · GitHub

Now here, instead of creating a method and attaching it the module, it's instead just putting a string with the code on the stack and sending it off to module_eval. This hits the C side of ruby in the following manner:

module eval call chain · GitHub

So in essence it means more time is spent in the C side instead of a back and forth chain of bytecode calls. Note that all bytecode is produced via MRI 1.9.3. Honestly though the difference between the two using the benchmark was minimal enough that I don't think it's anything to be worried over. Also note that syntax highlighters will see the eval string as what it is, a literal string. This will make your code less readable!

Not only that, but the differences are so small, that you have to
correct for inaccuracies of the internal clock's resolution.

Tenths of milliseconds is, generally speaking, too small to be
measures accurately by non-specialist (i.e. lab) tools.

···

On Tue, Aug 9, 2011 at 5:41 AM, Chris White <cwprogram@live.com> wrote:

In this case there's less and less of a difference. I'd almost go so far to say that
it's just slight inaccuracies involved in benchmark timings, with the bytecode
being the same and all.

--
Phillip Gawlowski

phgaw.posterous.com | twitter.com/phgaw | gplus.to/phgaw

A method of solution is perfect if we can forsee from the start,
and even prove, that following that method we shall attain our aim.
-- Leibniz

Good experiment :wink:

···

2011/8/9 Chris White <cwprogram@live.com>:

Bytecode output · GitHub

Line for line, it's all exactly the same (okay with the exception of variable names). This breaks my theory that eval's behind the scenes generation did something unique with the code. Here's my results running it on Mac OSX 1.9.2-p290 source compile:

M.kk: 1.056156873703003
M.aa: 1.0313341617584229
M.kk: 1.0313379764556885
M.aa: 1.0262627601623535

In this case there's less and less of a difference. I'd almost go so far to say that it's just slight inaccuracies involved in benchmark timings, with the bytecode being the same and all.

--
Iñaki Baz Castillo
<ibc@aliax.net>