Dear list,
a few days ago I wrote a script that stored values in a Hash table with
different buckets. It was some string-parsing grunt work.
I noticed that Ruby became pretty slow when the hash table grew to about 10 million
entries. To make sure it is actually Ruby that is to blame, I wrote comparison
script snippets in Perl and Python, and yes, Ruby was just plain slower.
By reducing the amount of code step-by-step I was able to narrow the problem
down. Currently, I can reproduce the behaviour with the following simple
scriptlet:
---%<---
tbl = { }
10_000_000.times do |i|
tbl['last'] = i*i*i
end
--->%---
Execution times are as follows:
+ ruby loop0r.rb
real 0m10.741s
user 0m10.596s
sys 0m0.107s
+ perl loop0r.pl
real 0m4.503s
user 0m4.420s
sys 0m0.048s
+ python loop0r.py
real 0m6.704s
user 0m6.367s
sys 0m0.274s
Replacing the String I use as Hash key with a Symbol, i.e., :last, lowers
Ruby's execution time dramatically, so that it becomes faster than Python (and
is still slower than Perl, but that's ok).
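For anyone who wants to compare the key types side by side, here is a rough sketch. The helper time_fill, the constants N and FROZEN_KEY, and the reduced iteration count are my own choices for illustration and are not part of the original script; absolute numbers will differ by machine and Ruby version.

```ruby
# Hypothetical helper: runs the loop from the scriptlet and measures it.
# The block supplies the key, so each call can use a different kind of key.
def time_fill(n)
  tbl = {}
  started = Time.now
  n.times { |i| tbl[yield] = i * i * i }
  [tbl, Time.now - started]
end

N = 1_000_000               # assumption: smaller than 10_000_000 to keep the run short
FROZEN_KEY = 'last'.freeze  # one frozen String, reused on every iteration

_, t_literal = time_fill(N) { 'last' }      # String literal evaluated per iteration
_, t_frozen  = time_fill(N) { FROZEN_KEY }  # frozen String
_, t_symbol  = time_fill(N) { :last }       # Symbol

printf("String literal: %.3fs\n", t_literal)
printf("frozen String:  %.3fs\n", t_frozen)
printf("Symbol:         %.3fs\n", t_symbol)
```

The block call adds the same overhead to all three variants, so only the differences between the three lines are meaningful.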
I was baffled and ran callgrind against the interpreter. It came up with this:
---%<---
183,248,191 /usr/local/rvm/src/ruby-2.0.0-p247/vm.inc:vm_exec_core'2
155,033,405 /usr/local/rvm/src/ruby-2.0.0-p247/siphash.c:ruby_sip_hash24 [/home/eveith/.rvm/rubies/ruby-2.0.0-p247/lib/libruby.so.2.0.0]
112,110,714 /usr/local/rvm/src/ruby-2.0.0-p247/insns.def:vm_exec_core'2
93,170,146 /usr/local/rvm/src/ruby-2.0.0-p247/string.c:str_replace [/home/eveith/.rvm/rubies/ruby-2.0.0-p247/lib/libruby.so.2.0.0]
78,086,663 /usr/local/rvm/src/ruby-2.0.0-p247/st.c:st_update [/home/eveith/.rvm/rubies/ruby-2.0.0-p247/lib/libruby.so.2.0.0]
74,991,856 /usr/local/rvm/src/ruby-2.0.0-p247/gc.c:slot_sweep [/home/eveith/.rvm/rubies/ruby-2.0.0-p247/lib/libruby.so.2.0.0]
73,137,809 /usr/local/rvm/src/ruby-2.0.0-p247/vm_insnhelper.c:vm_call_cfunc_with_frame'2 [/home/eveith/.rvm/rubies/ruby-2.0.0-p247/lib/libruby.so.2.0.0]
63,043,562 /usr/local/rvm/src/ruby-2.0.0-p247/vm_insnhelper.c:vm_push_frame [/home/eveith/.rvm/rubies/ruby-2.0.0-p247/lib/libruby.so.2.0.0]
61,029,008 /usr/local/rvm/src/ruby-2.0.0-p247/vm.c:rb_yield
--->%---
Now: Why does Ruby call string.c:str_replace? There is obviously no operation
that modifies a string here. In fact, I even use the same String as hash key
over and over again.
Of course, I can also freeze the String, which amounts to the same as using a
Symbol (performance-wise). Still, I don't understand why Ruby is calling
str_*replace*.
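A small probe along these lines might help narrow it down. It checks whether the Hash stores the very String object it was handed or a copy of it. The helper name stored_key is made up for this sketch, and it only looks at object identity from the Ruby side; it does not prove what str_replace does internally.

```ruby
# Hypothetical helper: returns the key object the Hash actually holds
# after a single assignment with the given key.
def stored_key(key)
  tbl = {}
  tbl[key] = 1
  tbl.keys.first
end

unfrozen = 'last'.dup          # explicitly mutable String
frozen   = 'last'.dup.freeze   # same content, but frozen

# equal? compares object identity, not content.
puts "unfrozen key stored as the same object? #{stored_key(unfrozen).equal?(unfrozen)}"
puts "frozen key stored as the same object?   #{stored_key(frozen).equal?(frozen)}"
puts "stored copy of the unfrozen key frozen? #{stored_key(unfrozen).frozen?}"
```

If the unfrozen case reports a different object, that would suggest Hash#[]= copies unfrozen String keys on insertion, which would fit the per-iteration string work showing up in the profile.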
What is happening here? And how can I circumvent it?
Thanks a lot for any replies!
--- Eric