Array performance question

Here's a real Array performance mystery. I can't help my friend figure
this one out. He writes,

···

========

Is there a known bug in Ruby array performance? I spent a lot of time
yesterday boiling down the following example

http://blogs.codehaus.org/people/geir/archives/001768_baffled_with_ruby.html

only because I still can't believe I'm not doing something incredibly
stupid. (I'm a newbie, "Reluctant Rubyist")

I'm just not used to arrays behaving this way :slight_smile: Anyone have any
insight?

========

He'd love a response in his blog, but with permission I'll copy
answers here over there. He's not a regular ruby-talk reader...yet.

Jim
--
Jim Menard, jimm@io.com, jim.menard@gmail.com
http://www.io.com/~jimm/

Slightly modified test (attached):

15:44:06 Temp$ ruby bm.rb bm; ruby bm.rb bmbm; ruby19 bm.rb bm; ruby19
bm.rb bmbm
1.8.7
                                    user system total real
size = 10000 10000 0.125000 0.000000 0.125000 ( 0.114000)
size = 100000 100000 0.859000 0.000000 0.859000 ( 0.863000)
size = 1000000 1000000 24.969000 0.000000 24.969000 ( 24.961000)
1.8.7
Rehearsal -----------------------------------------------------------------
size = 10000 10000 24.235000 0.032000 24.267000 ( 24.308000)
size = 100000 100000 18.078000 0.000000 18.078000 ( 18.150630)
size = 1000000 1000000 14.297000 0.000000 14.297000 ( 14.289000)
------------------------------------------------------- total: 56.642000sec

                                    user system total real
size = 10000 10000 12.031000 0.000000 12.031000 ( 12.039000)
size = 100000 100000 10.250000 0.000000 10.250000 ( 10.290950)
size = 1000000 1000000 8.985000 0.000000 8.985000 ( 8.989000)
1.9.1
                                    user system total real
size = 10000 10000 0.016000 0.000000 0.016000 ( 0.027000)
size = 100000 100000 0.031000 0.000000 0.031000 ( 0.028000)
size = 1000000 1000000 0.047000 0.000000 0.047000 ( 0.043000)
1.9.1
Rehearsal -----------------------------------------------------------------
size = 10000 10000 0.032000 0.000000 0.032000 ( 0.043000)
size = 100000 100000 0.032000 0.000000 0.032000 ( 0.043000)
size = 1000000 1000000 0.031000 0.000000 0.031000 ( 0.043000)
-------------------------------------------------------- total: 0.095000sec

                                    user system total real
size = 10000 10000 0.031000 0.000000 0.031000 ( 0.039000)
size = 100000 100000 0.047000 0.000000 0.047000 ( 0.043000)
size = 1000000 1000000 0.032000 0.000000 0.032000 ( 0.043000)

Strange that the rehearsal has this dramatic effect. Normally it
should simply execute the block twice to warm up the system.

Kind regards

robert

bm.rb (537 Bytes)

···

2009/2/27 Jim Menard <jim.menard@gmail.com>:

Here's a real Array performance mystery. I can't help my friend figure
this one out. He writes,

========

Is there a known bug in Ruby array performance? I spent a lot of time
yesterday boiling down the following example

http://blogs.codehaus.org/people/geir/archives/001768_baffled_with_ruby.html

only because I still can't believe I'm not doing something incredibly
stupid. (I'm a newbie, "Reluctant Rubyist")

I'm just not used to arrays behaving this way :slight_smile: Anyone have any
insight?

========

He'd love a response in his blog, but with permission I'll copy
answers here over there. He's not a regular ruby-talk reader...yet.

--
remember.guy do |as, often| as.you_can - without end

Jim Menard wrote:

Here's a real Array performance mystery. I can't help my friend figure
this one out. He writes,

========

Is there a known bug in Ruby array performance? I spent a lot of time
yesterday boiling down the following example

http://blogs.codehaus.org/people/geir/archives/001768_baffled_with_ruby.html

only because I still can't believe I'm not doing something incredibly
stupid. (I'm a newbie, "Reluctant Rubyist")

I'm just not used to arrays behaving this way :slight_smile: Anyone have any
insight?

========

He'd love a response in his blog, but with permission I'll copy
answers here over there. He's not a regular ruby-talk reader...yet.

Jim

This variation exhibits the same behavior.

require 'benchmark'

class BufferContainer

def initialize( initial_data )
   @buf = initial_data
end

def put_array( array )
   @buf[0, array.size] = array
end
end

size = 999
the_array = (0..4).to_a

(1..3).each{|i|
  size *= 10
  a = [nil] * size

# RUN WITH THIS COMMENTED OUT FIRST
# a << 9

  puts "size = #{ size } #{ a.size }"
  puts Benchmark.measure {
    10_000.times {
      buf = BufferContainer.new( a )
      buf.put_array( the_array )
    }
  }

}

I don't have time to dig into this, but I believe he's hitting a GC threshold by going over N objects.

···

On Feb 27, 2009, at 05:16 , Jim Menard wrote:

Is there a known bug in Ruby array performance? I spent a lot of time
yesterday boiling down the following example

http://blogs.codehaus.org/people/geir/archives/001768_baffled_with_ruby.html

Jim Menard wrote:

Here's a real Array performance mystery. I can't help my friend figure
this one out. He writes,

========

Is there a known bug in Ruby array performance? I spent a lot of time
yesterday boiling down the following example

http://blogs.codehaus.org/people/geir/archives/001768_baffled_with_ruby.html

only because I still can't believe I'm not doing something incredibly
stupid. (I'm a newbie, "Reluctant Rubyist")

I'm just not used to arrays behaving this way :slight_smile: Anyone have any
insight?

========

He'd love a response in his blog, but with permission I'll copy
answers here over there. He's not a regular ruby-talk reader...yet.

Jim
  

I am betting Ryan is right about GC issues. JRuby does not fall down doing this either (which probably also backs up the GC theory):

ruby ~/jruby/scripts/arr_ben.rb
size = 10000 10000
  0.050000 0.000000 0.050000 ( 0.057190)
size = 100000 100000
  0.660000 0.000000 0.660000 ( 0.672178)
size = 1000000 1000000
34.430000 0.340000 34.770000 ( 35.562454)

jruby --server ~/jruby/scripts/arr_ben.rb
size = 10000 10000
  0.372000 0.000000 0.372000 ( 0.266000)
size = 100000 100000
  0.099000 0.000000 0.099000 ( 0.099000)
size = 1000000 1000000
  0.154000 0.000000 0.154000 ( 0.154000)

-Tom

Here's a better test

16:16:32 Temp$ ruby bm.rb; ruby19 bm.rb
1.8.7
                                    user system total real
size = 10000 10000 0.110000 0.000000 0.110000 ( 0.118000)
size = 100000 100000 0.781000 0.000000 0.781000 ( 0.769000)
size = 1000000 1000000 25.172000 0.000000 25.172000 ( 25.185000)
size = 10000 10000 0.328000 0.000000 0.328000 ( 0.329000)
size = 100000 100000 0.547000 0.000000 0.547000 ( 0.542000)
size = 1000000 1000000 18.156000 0.000000 18.156000 ( 18.151000)
1.9.1
                                    user system total real
size = 10000 10000 0.031000 0.000000 0.031000 ( 0.027000)
size = 100000 100000 0.031000 0.000000 0.031000 ( 0.028000)
size = 1000000 1000000 0.031000 0.000000 0.031000 ( 0.043000)
size = 10000 10000 0.031000 0.000000 0.031000 ( 0.031000)
size = 100000 100000 0.032000 0.000000 0.032000 ( 0.023000)
size = 1000000 1000000 0.031000 0.000000 0.031000 ( 0.043000)

"better" in the sense that we're not suffering from bmbm's magic.

Kind regards

robert

bm.rb (545 Bytes)

···

2009/2/27 Robert Klemme <shortcutter@googlemail.com>:

Strange that the rehearsal has this dramatic effect. Normally it
should simply execute the block twice to warm up the system.

--
remember.guy do |as, often| as.you_can - without end

Poke around gc.c and see what you can figure out. Also, look at the 1.9 version or through the changelog to find hints...

#ifndef GC_MALLOC_LIMIT
#if defined(MSDOS) || defined(__human68k__)
#define GC_MALLOC_LIMIT 200000
#else
#define GC_MALLOC_LIMIT 8000000
#endif

···

On Feb 28, 2009, at 21:32 , Ryan Davis wrote:

On Feb 27, 2009, at 05:16 , Jim Menard wrote:

Is there a known bug in Ruby array performance? I spent a lot of time
yesterday boiling down the following example

http://blogs.codehaus.org/people/geir/archives/001768_baffled_with_ruby.html

I don't have time to dig into this, but I believe he's hitting a GC threshold by going over N objects.