Below are some benchmark results from four different implementations
(excluding the *zip.flatten approach, since it's way too slow). The
key and value arrays just contain integers.
The second set of results was run with GC disabled, and the time for
the GC after each run is reported on a separate "(GC)" line. It's
interesting that the times are so different: your size.times approach
is under 3.5 seconds either way, but the other three take around 28
seconds with GC enabled versus roughly 3.5 to 5 seconds (plus about a
second of GC) with it disabled. I assume this means that the other
three are causing a lot of temporary objects to be created.
I suppose these objects could be the pair argument arrays for each of
the block invocations. Can anybody confirm this or suggest otherwise?
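One way to check the temporary-object guess directly, rather than inferring it from timings, is to count allocations around each approach. This is only a rough sketch: the `allocations` helper is made up for illustration, and it assumes an interpreter where `GC.stat` accepts the `:total_allocated_objects` key (MRI 2.2 or later, as far as I know). The counts will depend heavily on the Ruby version, so they may not match the interpreter used for the timings below.

```ruby
# Hypothetical helper: number of objects allocated while the block runs.
# Assumes GC.stat(:total_allocated_objects) is available (MRI 2.2+).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

n = 100_000
keys = (0...n).to_a
values = keys.dup

zip_allocs = allocations do
  h = {}
  keys.zip(values) { |k, v| h[k] = v }
end

times_allocs = allocations do
  h = {}
  keys.size.times { |i| h[keys[i]] = values[i] }
end

puts "zip with block: #{zip_allocs} objects"
puts "size.times:     #{times_allocs} objects"
```

If the per-invocation pair arrays are the culprit, the first count should be on the order of n larger than the second.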
I realise now that my original tests (where I said that the
zip-with-block solution was twice as fast as each_with_index and
inject) were flawed since I wasn't using a large enough data set.
GC enabled:
                          user     system      total        real
each_with_index      28.090000   0.110000  28.200000 ( 28.794794)
inject               27.180000   0.040000  27.220000 ( 27.742072)
zip with block       27.610000   0.030000  27.640000 ( 28.192968)
size.times            3.270000   0.060000   3.330000 (  3.381180)

GC disabled:
                          user     system      total        real
each_with_index       4.040000   0.260000   4.300000 (  4.421269)
each_with_index(GC)   0.720000   0.000000   0.720000 (  0.767091)
inject                4.470000   0.090000   4.560000 (  4.760866)
inject(GC)            1.150000   0.010000   1.160000 (  1.157983)
zip with block        3.500000   0.000000   3.500000 (  3.630590)
zip with block(GC)    1.130000   0.000000   1.130000 (  1.136200)
size.times            2.920000   0.010000   2.930000 (  3.017372)
size.times(GC)        0.470000   0.000000   0.470000 (  0.478058)
···
On Wed, 08 Dec 2004 03:15:10 +0900, Florian Frank wrote:
Jonathan Paisley wrote:
I did some benchmark tests of all four implementations (my original, your
two, and the above). The last is about twice as fast as each_with_index
and inject, and at least an order of magnitude faster than my original
method involving flattening.
Try this implementation, too:
hash = {}
keys.size.times { |i| hash[ keys[i] ] = values[i] }
=================================================================
require 'benchmark'

n = 1000000
keys = (0...n).to_a
values = keys.dup

def Hash.from_pairs_a(keys,values)
  h = {}
  keys.each_with_index {|e,i| h[e] = values[i]}
  h
end

def Hash.from_pairs_b(keys,values)
  Hash[*keys.zip(values).flatten]
end

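def Hash.from_pairs_c(keys,values)
  h = {}
  # shift is destructive, so consume a copy; otherwise the shared values
  # array is left empty for the benchmarks that run after this one
  keys.inject(values.dup) do |v,k|
    h[k] = v.shift
    v
  end
  h
end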

def Hash.from_pairs_d(keys,values)
  h = {}
  keys.zip(values) do |k,v|
    h[k] = v
  end
  h
end

def Hash.from_pairs_e(keys,values)
  hash = {}
  keys.size.times { |i| hash[ keys[i] ] = values[i] }
  hash
end

# Pass any argument on the command line to run with GC disabled
$no_gc = ARGV[0]

Benchmark.bm(20) do |x|
  # Like report, but when GC is disabled, start each run from a freshly
  # collected heap and time the deferred collection as a separate "(GC)" line
  def x.gcreport(label,&block)
    if $no_gc then GC.enable; GC.start; GC.disable; end
    report(label,&block)
    if $no_gc then
      report(label + "(GC)") { GC.enable; GC.start; GC.disable}
    end
  end

  x.gcreport("each_with_index") { Hash.from_pairs_a(keys,values) }
  #x.report("*zip.flatten") { Hash.from_pairs_b(keys,values) }
  x.gcreport("inject") { Hash.from_pairs_c(keys,values) }
  x.gcreport("zip with block") { Hash.from_pairs_d(keys,values) }
  x.gcreport("size.times") { Hash.from_pairs_e(keys,values) }
end
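
Since the script reuses the same keys and values arrays for every report, it's worth a quick sanity check that each approach builds the same hash and leaves its inputs alone. The sketch below is not part of the benchmark: `BUILDERS` and `check_builders` are names invented here, and the lambdas are compact stand-ins for the four timed approaches. The inject stand-in calls shift directly on the array it is handed, to show what that does to the caller's data.

```ruby
# Hypothetical stand-ins for the four approaches, keyed by benchmark label.
BUILDERS = {
  "each_with_index" => lambda { |ks, vs| h = {}; ks.each_with_index { |k, i| h[k] = vs[i] }; h },
  "inject"          => lambda { |ks, vs| h = {}; ks.inject(vs) { |v, k| h[k] = v.shift; v }; h },
  "zip with block"  => lambda { |ks, vs| h = {}; ks.zip(vs) { |k, v| h[k] = v }; h },
  "size.times"      => lambda { |ks, vs| h = {}; ks.size.times { |i| h[ks[i]] = vs[i] }; h },
}

# Returns [label, resulting hash, whether the values array was left intact]
# for each builder, giving every builder its own copy of values.
def check_builders(keys, values)
  BUILDERS.map do |label, build|
    vs = values.dup
    [label, build.call(keys, vs), vs == values]
  end
end

check_builders([:a, :b, :c], [1, 2, 3]).each do |label, hash, intact|
  puts "#{label}: #{hash.inspect} (values intact: #{intact})"
end
```

All four produce the same hash, but the shift-based inject empties the array it was given, so anything timed after it on the same array would be reading nils.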