Bug: Ruby 1.9.1 File io on Windows is slooooow

This simple code shows everything:

time = [Time.new]
c = ''
'aaaa'.upto('zzzz') {|e| c << e}
3.times { c << c }
time << Time.new
File.open('out.file','w') { |f| f.write(c) }
time << Time.new
c = File.open('out.file','r') { |f| f.read }
time << Time.new
0.upto(time.size - 2) {|i| p "#{i} #{time[i+1]-time[i]}" }

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"0 0.537075"
"1 0.696244"
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"0 0.551"
"1 0.133"
"2 0.087"

That is about 5x slower write and 500x read operation. Times are the
same if I do:
f = File.new('out.file','r')
c = f.read
f.close

Tried on two machines. Vista SP1 and XP SP3. Same results.

by
TheR

P.S.
Where can I officaly report a Ruby bug?

···

--
Posted via http://www.ruby-forum.com/.

Damjan Rems wrote:

This simple code shows everything:

time = [Time.new]
c = ''
'aaaa'.upto('zzzz') {|e| c << e}
3.times { c << c }
time << Time.new
File.open('out.file','w') { |f| f.write(c) }
time << Time.new
c = File.open('out.file','r') { |f| f.read }
time << Time.new
0.upto(time.size - 2) {|i| p "#{i} #{time[i+1]-time[i]}" }

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"0 0.537075"
"1 0.696244"
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"0 0.551"
"1 0.133"
"2 0.087"

I'm pretty sure from your other message that it's using Ruby 1.9 with UTF-8 encoded strings that kills performance. I get equivalent results on OS X comparing 1.8.6 to 1.9.1 on common string operations like String#(a_fixnum).

It's understandable that correct character-wise operations are much harder for Ruby to implement efficiently on variable-byte-length encodings like UTF-8. Unfortunately, though I was delighted to get proper encoding support in 1.9, the performance hit on strings was a killer on my string-intensive GUI app.

There are sometimes ways round this (eg I rewrote diff/lcs to use string character enumerators rather than repeated expensive calls to String#).

I expect the Ruby team are already aware of the problem, and I hope this is a top priority for 1.9.2.

btw, you can report bugs at:

alex

Do you have a virus scanner on that Windows box? If so, how are
measurements when switched off?

Cheers

robert

···

2009/3/27 Damjan Rems <d_rems@yahoo.com>:

This simple code shows everything:

time = [Time.new]
c = ''
'aaaa'.upto('zzzz') {|e| c << e}
3.times { c << c }
time << Time.new
File.open('out.file','w') { |f| f.write(c) }
time << Time.new
c = File.open('out.file','r') { |f| f.read }
time << Time.new
0.upto(time.size - 2) {|i| p "#{i} #{time[i+1]-time[i]}" }

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"0 0.537075"
"1 0.696244"
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"0 0.551"
"1 0.133"
"2 0.087"

That is about 5x slower write and 500x read operation.

--
remember.guy do |as, often| as.you_can - without end

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"2 0.087"

That is about 5x slower write and 500x read operation. Times are the
same if I do:
f = File.new('out.file','r')
c = f.read
f.close
P.S.
Where can I officaly report a Ruby bug?

Looks like this bug is http://redmine.ruby-lang.org/issues/show/1332
Can you try with mode 'rb' and is it fast then?
-r

···

--
Posted via http://www.ruby-forum.com/\.

Damjan Rems wrote:

This simple code shows everything:

...

That is about 5x slower write and 500x read operation.

You can try with some patches at

http://www.ruby-forum.com/topic/204113

then it's only 36x as slow for reads :slight_smile:

-r

···

--
Posted via http://www.ruby-forum.com/\.

Alex Fenton wrote:

There are sometimes ways round this (eg I rewrote diff/lcs to use string
character enumerators rather than repeated expensive calls to
String#).

I expect the Ruby team are already aware of the problem, and I hope this
is a top priority for 1.9.2.

Unfortunately it appears to be a designed-in pessimization, though I
initially thought it was a mistake when I saw it. Guido van Rossum (the
Python creator) probably did too, as he asked Matz about it at the
40-minute mark here

···

--
Posted via http://www.ruby-forum.com/\.

Hi,

At Fri, 27 Mar 2009 23:16:38 +0900,
Alex Fenton wrote in [ruby-talk:332264]:

I'm pretty sure from your other message that it's using Ruby 1.9 with
UTF-8 encoded strings that kills performance. I get equivalent results
on OS X comparing 1.8.6 to 1.9.1 on common string operations like
String#(a_fixnum).

Slow-read is caused by universal-newline.

···

--
Nobu Nakada

Robert Klemme wrote:

···

2009/3/27 Damjan Rems <d_rems@yahoo.com>:

c = File.open('out.file','r') { |f| f.read }
"1 0.133"
"2 0.087"

That is about 5x slower write and 500x read operation.

Do you have a virus scanner on that Windows box? If so, how are
measurements when switched off?

Cheers

robert

I swithed it off and times are pretty much the same.

by
TheR
--
Posted via http://www.ruby-forum.com/\.

Roger Pack wrote:

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"2 0.087"

That is about 5x slower write and 500x read operation. Times are the
same if I do:
f = File.new('out.file','r')
c = f.read
f.close
P.S.
Where can I officaly report a Ruby bug?

Looks like this bug is http://redmine.ruby-lang.org/issues/show/1332
Can you try with mode 'rb' and is it fast then?
-r

Also this one is much much slower on ruby 1.9 then on ruby 1.8:
Tested on Windows XP. No virus scanner.

time = [Time.new]
n = 1000
lines =
(1..n).each { |nr| lines << "line" }

time << Time.new
File.open('out2.file','w') { |f|
  lines.each { |l|
    f.print(l)
    f.flush
  }
}
time << Time.new

txt = ['construct','write','read']
0.upto(time.size - 2) {|i| p "#{txt[i]} #{time[i+1]-time[i]}" }

#ruby 1.9
#"construct 0.0"
#"write 22.453125"

#ruby 1.8
#"construct 0.0"
#"write 0.015"

···

--
Posted via http://www.ruby-forum.com/\.

Nobuyoshi Nakada wrote:

Hi,

At Fri, 27 Mar 2009 23:16:38 +0900,
Alex Fenton wrote in [ruby-talk:332264]:

I'm pretty sure from your other message that it's using Ruby 1.9 with
UTF-8 encoded strings that kills performance. I get equivalent results
on OS X comparing 1.8.6 to 1.9.1 on common string operations like
String#(a_fixnum).

Slow-read is caused by universal-newline.

Should'nt fread be reading binary file or should that be forced by 'rb'
parameter at file.open()

I did fast test on Ubuntu x64 Linux. And the problem does not exist.
Althow it was Ruby version 1.9.0.

by
TheR

···

--
Posted via http://www.ruby-forum.com/\.

How many test runs did you do? Because of caching figures from a single run are usually not very telling.

Kind regards

  robert

···

On 02/05/2010 10:53 PM, Ad Gr wrote:

Roger Pack wrote:

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"2 0.087"

That is about 5x slower write and 500x read operation. Times are the
same if I do:
f = File.new('out.file','r')
c = f.read
f.close
P.S.
Where can I officaly report a Ruby bug?

Looks like this bug is http://redmine.ruby-lang.org/issues/show/1332
Can you try with mode 'rb' and is it fast then?
-r

Also this one is much much slower on ruby 1.9 then on ruby 1.8:
Tested on Windows XP. No virus scanner.

time = [Time.new]
n = 1000
lines =
(1..n).each { |nr| lines << "line" }

time << Time.new
File.open('out2.file','w') { |f|
  lines.each { |l|
    f.print(l)
    f.flush
  }
}
time << Time.new

txt = ['construct','write','read']
0.upto(time.size - 2) {|i| p "#{txt[i]} #{time[i+1]-time[i]}" }

#ruby 1.9
#"construct 0.0"
#"write 22.453125"

#ruby 1.8
#"construct 0.0"
#"write 0.015"

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

    f.print(l)
    f.flush

If you remove the flush it seems to work *much* faster.
This might be a bug in core. I'd probably report it.

I think to avoid the slowdown with reads you can "hack a work around"
like

c = File.open('out.file','rb') { |f| f.read }
c.gsub!("\r\n", "\n")

But there might be a bug in there, too. [1]

-rp
[1] http://redmine.ruby-lang.org/issues/show/1332

···

--
Posted via http://www.ruby-forum.com/\.