Bug: Ruby 1.9.1 File io on Windows is slooooow

Damjan_Rems · 27 March 2009 13:34

This simple code shows everything:

time = [Time.new]
c = ''
'aaaa'.upto('zzzz') {|e| c << e}
3.times { c << c }
time << Time.new
File.open('out.file','w') { |f| f.write(c) }
time << Time.new
c = File.open('out.file','r') { |f| f.read }
time << Time.new
0.upto(time.size - 2) {|i| p "#{i} #{time[i+1]-time[i]}" }

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"0 0.537075"
"1 0.696244"
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"0 0.551"
"1 0.133"
"2 0.087"

That is about 5x slower write and 500x read operation. Times are the
same if I do:
f = File.new('out.file','r')
c = f.read
f.close

Tried on two machines. Vista SP1 and XP SP3. Same results.

by
TheR

P.S.
Where can I officaly report a Ruby bug?

···

--
Posted via http://www.ruby-forum.com/.

Alex_Fenton2 · 27 March 2009 14:16

Damjan Rems wrote:

This simple code shows everything:

time = [Time.new]
c = ''
'aaaa'.upto('zzzz') {|e| c << e}
3.times { c << c }
time << Time.new
File.open('out.file','w') { |f| f.write(c) }
time << Time.new
c = File.open('out.file','r') { |f| f.read }
time << Time.new
0.upto(time.size - 2) {|i| p "#{i} #{time[i+1]-time[i]}" }

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"0 0.537075"
"1 0.696244"
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"0 0.551"
"1 0.133"
"2 0.087"

I'm pretty sure from your other message that it's using Ruby 1.9 with UTF-8 encoded strings that kills performance. I get equivalent results on OS X comparing 1.8.6 to 1.9.1 on common string operations like String#(a_fixnum).

It's understandable that correct character-wise operations are much harder for Ruby to implement efficiently on variable-byte-length encodings like UTF-8. Unfortunately, though I was delighted to get proper encoding support in 1.9, the performance hit on strings was a killer on my string-intensive GUI app.

There are sometimes ways round this (eg I rewrote diff/lcs to use string character enumerators rather than repeated expensive calls to String#).

I expect the Ruby team are already aware of the problem, and I hope this is a top priority for 1.9.2.

btw, you can report bugs at:

alex

Robert_K1 · 29 March 2009 11:05

Do you have a virus scanner on that Windows box? If so, how are
measurements when switched off?

Cheers

robert

···

2009/3/27 Damjan Rems <d_rems@yahoo.com>:

This simple code shows everything:

time = [Time.new]
c = ''
'aaaa'.upto('zzzz') {|e| c << e}
3.times { c << c }
time << Time.new
File.open('out.file','w') { |f| f.write(c) }
time << Time.new
c = File.open('out.file','r') { |f| f.read }
time << Time.new
0.upto(time.size - 2) {|i| p "#{i} #{time[i+1]-time[i]}" }

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"0 0.537075"
"1 0.696244"
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"0 0.551"
"1 0.133"
"2 0.087"

That is about 5x slower write and 500x read operation.

--
remember.guy do |as, often| as.you_can - without end

Roger_Pack4 · 2 December 2009 16:37

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"2 0.087"

That is about 5x slower write and 500x read operation. Times are the
same if I do:
f = File.new('out.file','r')
c = f.read
f.close
P.S.
Where can I officaly report a Ruby bug?

Looks like this bug is http://redmine.ruby-lang.org/issues/show/1332
Can you try with mode 'rb' and is it fast then?
-r

···

--
Posted via http://www.ruby-forum.com/\.

Roger_Pack4 · 15 February 2010 14:28

Damjan Rems wrote:

This simple code shows everything:

...

That is about 5x slower write and 500x read operation.

You can try with some patches at

http://www.ruby-forum.com/topic/204113

then it's only 36x as slow for reads

-r

···

--
Posted via http://www.ruby-forum.com/\.

The_Higgs_bozo · 27 March 2009 15:23

Alex Fenton wrote:

There are sometimes ways round this (eg I rewrote diff/lcs to use string
character enumerators rather than repeated expensive calls to
String#).

I expect the Ruby team are already aware of the problem, and I hope this
is a top priority for 1.9.2.

Unfortunately it appears to be a designed-in pessimization, though I
initially thought it was a mistake when I saw it. Guido van Rossum (the
Python creator) probably did too, as he asked Matz about it at the
40-minute mark here

···

--
Posted via http://www.ruby-forum.com/\.

Nobuyoshi_Nakada1 · 28 March 2009 02:22

Hi,

At Fri, 27 Mar 2009 23:16:38 +0900,
Alex Fenton wrote in [ruby-talk:332264]:

I'm pretty sure from your other message that it's using Ruby 1.9 with
UTF-8 encoded strings that kills performance. I get equivalent results
on OS X comparing 1.8.6 to 1.9.1 on common string operations like
String#(a_fixnum).

Slow-read is caused by universal-newline.

···

--
Nobu Nakada

Damjan_Rems · 30 March 2009 05:43

Robert Klemme wrote:

···

2009/3/27 Damjan Rems <d_rems@yahoo.com>:

c = File.open('out.file','r') { |f| f.read }
"1 0.133"
"2 0.087"

That is about 5x slower write and 500x read operation.

Do you have a virus scanner on that Windows box? If so, how are
measurements when switched off?

Cheers

robert

I swithed it off and times are pretty much the same.

by
TheR
--
Posted via http://www.ruby-forum.com/\.

Ad_Gr · 5 February 2010 21:53

Roger Pack wrote:

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"2 0.087"

That is about 5x slower write and 500x read operation. Times are the
same if I do:
f = File.new('out.file','r')
c = f.read
f.close
P.S.
Where can I officaly report a Ruby bug?

Looks like this bug is http://redmine.ruby-lang.org/issues/show/1332
Can you try with mode 'rb' and is it fast then?
-r

Also this one is much much slower on ruby 1.9 then on ruby 1.8:
Tested on Windows XP. No virus scanner.

time = [Time.new]
n = 1000
lines =
(1..n).each { |nr| lines << "line" }

time << Time.new
File.open('out2.file','w') { |f|
  lines.each { |l|
    f.print(l)
    f.flush
  }
}
time << Time.new

txt = ['construct','write','read']
0.upto(time.size - 2) {|i| p "#{txt[i]} #{time[i+1]-time[i]}" }

#ruby 1.9
#"construct 0.0"
#"write 22.453125"

#ruby 1.8
#"construct 0.0"
#"write 0.015"

···

--
Posted via http://www.ruby-forum.com/\.

Damjan_Rems · 29 March 2009 07:20

Nobuyoshi Nakada wrote:

Hi,

At Fri, 27 Mar 2009 23:16:38 +0900,
Alex Fenton wrote in [ruby-talk:332264]:

I'm pretty sure from your other message that it's using Ruby 1.9 with
UTF-8 encoded strings that kills performance. I get equivalent results
on OS X comparing 1.8.6 to 1.9.1 on common string operations like
String#(a_fixnum).

Slow-read is caused by universal-newline.

Should'nt fread be reading binary file or should that be forced by 'rb'
parameter at file.open()

I did fast test on Ubuntu x64 Linux. And the problem does not exist.
Althow it was Ruby version 1.9.0.

by
TheR

···

--
Posted via http://www.ruby-forum.com/\.

Robert_K1 · 6 February 2010 13:45

How many test runs did you do? Because of caching figures from a single run are usually not very telling.

Kind regards

robert

···

On 02/05/2010 10:53 PM, Ad Gr wrote:

Roger Pack wrote:

ruby 1.9.1p0 (2009-01-30 revision 21907) [i386-mswin32]
"2 40.188834"

ruby 1.8.6 (2007-09-24 patchlevel 111) [i386-mswin32]
"2 0.087"

That is about 5x slower write and 500x read operation. Times are the
same if I do:
f = File.new('out.file','r')
c = f.read
f.close
P.S.
Where can I officaly report a Ruby bug?

Looks like this bug is http://redmine.ruby-lang.org/issues/show/1332
Can you try with mode 'rb' and is it fast then?
-r

Also this one is much much slower on ruby 1.9 then on ruby 1.8:
Tested on Windows XP. No virus scanner.

time = [Time.new]
n = 1000
lines =
(1..n).each { |nr| lines << "line" }

time << Time.new
File.open('out2.file','w') { |f|
  lines.each { |l|
    f.print(l)
    f.flush
  }
}
time << Time.new

txt = ['construct','write','read']
0.upto(time.size - 2) {|i| p "#{txt[i]} #{time[i+1]-time[i]}" }

#ruby 1.9
#"construct 0.0"
#"write 22.453125"

#ruby 1.8
#"construct 0.0"
#"write 0.015"

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Roger_Pack4 · 6 February 2010 21:17

f.print(l)
f.flush

If you remove the flush it seems to work *much* faster.
This might be a bug in core. I'd probably report it.

I think to avoid the slowdown with reads you can "hack a work around"
like

c = File.open('out.file','rb') { |f| f.read }
c.gsub!("\r\n", "\n")

But there might be a bug in there, too. [1]

-rp
[1] http://redmine.ruby-lang.org/issues/show/1332

···

--
Posted via http://www.ruby-forum.com/\.

Topic		Replies	Views
File.mtime: Bug on Windows? ruby-talk	1	109	2 April 2006
Newbie question:read file speed ruby-talk	35	169	7 April 2003
File.mtime: Bug on Windows? ruby-talk	0	113	31 March 2006
Read efficiency? ruby-talk	3	102	22 February 2010
File.utime, Windows and Ruby 1.8 ruby-talk	2	107	15 May 2005

Bug: Ruby 1.9.1 File io on Windows is slooooow

Related topics