time = [Time.new]
c = ''
# build a test string from every four-letter combination 'aaaa'..'zzzz' (about 1.8 MB)
'aaaa'.upto('zzzz') {|e| c << e}
# then double it three times, roughly 14 MB in total
3.times { c << c }
time << Time.new
File.open('out.file','w') { |f| f.write(c) }
time << Time.new
c = File.open('out.file','r') { |f| f.read }
time << Time.new
# print the elapsed time of each step: build, write, read
0.upto(time.size - 2) {|i| p "#{i} #{time[i+1]-time[i]}" }
I'm pretty sure from your other message that it's using Ruby 1.9 with UTF-8 encoded strings that kills performance. I get equivalent results on OS X comparing 1.8.6 to 1.9.1 on common string operations like String#[](a_fixnum).
It's understandable that correct character-wise operations are much harder for Ruby to implement efficiently on variable-byte-length encodings like UTF-8. Unfortunately, though I was delighted to get proper encoding support in 1.9, the performance hit on strings was a killer on my string-intensive GUI app.
There are sometimes ways round this (e.g. I rewrote diff/lcs to use string character enumerators rather than repeated expensive calls to String#[]).
I expect the Ruby team are already aware of the problem, and I hope this is a top priority for 1.9.2.
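A rough sketch of that workaround (my own illustration, not the actual diff/lcs change; the file name and the character-counting task are just stand-ins):
s = File.read('out.file')            # assume a UTF-8 string
# slow on 1.9 with UTF-8: every s[i] has to walk the string from the start
count = 0
s.length.times { |i| count += 1 if s[i] == 'a' }
# faster: a single pass with a character enumerator
count = 0
s.each_char { |ch| count += 1 if ch == 'a' }
# or materialize the characters once, then index the array cheaply
chars = s.chars.to_a
count = chars.count { |ch| ch == 'a' }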
Do you have a virus scanner on that Windows box? If so, how do the
measurements look with it switched off?
Cheers
robert
2009/3/27 Damjan Rems <d_rems@yahoo.com>:
This simple code shows everything:
time = [Time.new]
c = ''
'aaaa'.upto('zzzz') {|e| c << e}
3.times { c << c }
time << Time.new
File.open('out.file','w') { |f| f.write(c) }
time << Time.new
c = File.open('out.file','r') { |f| f.read }
time << Time.new
0.upto(time.size - 2) {|i| p "#{i} #{time[i+1]-time[i]}" }
That is about a 5x slower write and a 500x slower read. Times are the
same if I do:
f = File.new('out.file','r')
c = f.read
f.close
P.S.
Where can I officially report a Ruby bug?
There are sometimes ways round this (e.g. I rewrote diff/lcs to use string
character enumerators rather than repeated expensive calls to
String#[]).
I expect the Ruby team are already aware of the problem, and I hope this
is a top priority for 1.9.2.
Unfortunately it appears to be a designed-in pessimization, though I
initially thought it was a mistake when I saw it. Guido van Rossum (the
Python creator) probably did too, as he asked Matz about it at the
40-minute mark here
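To put a rough number on that, here is a small self-contained benchmark sketch (my own, not from the thread) contrasting repeated character indexing with a single enumerator pass over a multi-byte string:
require 'benchmark'
s = "\u00e9" * 50_000                 # 50,000 two-byte UTF-8 characters
Benchmark.bm(10) do |bm|
  # each s[i] scans from the start of the string, so the whole loop is O(n^2)
  bm.report('s[i] loop') { s.length.times { |i| s[i] } }
  # one pass over the same characters
  bm.report('each_char') { s.each_char { |ch| ch } }
end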
At Fri, 27 Mar 2009 23:16:38 +0900,
Alex Fenton wrote in [ruby-talk:332264]:
I'm pretty sure from your other message that it's using Ruby 1.9 with
UTF-8 encoded strings that kills performance. I get equivalent results
on OS X comparing 1.8.6 to 1.9.1 on common string operations like
String#[](a_fixnum).
The slow read is caused by universal-newline conversion (in text mode on Windows, "\r\n" is translated to "\n" while reading).
Shouldn't f.read be reading the file as binary, or should that be forced
with the 'rb' mode parameter to File.open?
I did a quick test on Ubuntu x64 Linux, and the problem does not exist
there, although that was Ruby version 1.9.0.
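If universal-newline conversion really is the cause, opening in binary mode should sidestep it. A minimal sketch (assuming out.file was already written by the benchmark above) comparing the two read paths:
require 'benchmark'
Benchmark.bm(12) do |bm|
  # text mode: subject to newline translation on Windows
  bm.report('text read')   { File.open('out.file', 'r')  { |f| f.read } }
  # binary mode: raw bytes, no newline conversion
  bm.report('binary read') { File.open('out.file', 'rb') { |f| f.read } }
end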