Why is I/O slow?

Ok, folk, time to try again. It’s nothing to do with SHA-1.
I deleted the SHA-1 update from the code so it’s the same
as “cat file >/dev/null”, and it takes ages:

$ time cat /tmp/ten_megabytes >/dev/null
real 0m0.040s
user 0m0.000s
sys 0m0.040s
$ time ruby cat.rb /tmp/ten_megabytes
real 0m1.262s
user 0m1.230s
sys 0m0.030s
$ time ruby sha1.rb /tmp/ten_megabytes
8c206a1a87599f532ce68675536f0b1546900d7a /tmp/ten_megabytes
real 0m1.693s
user 0m1.600s
sys 0m0.090s
$ time perl cat.pl < /tmp/ten_megabytes >/dev/null
real 0m0.044s
user 0m0.010s
sys 0m0.030s

I gotta say folk, this is going to make it very hard to convince others
here to use ruby, and that’s a shame… 30 times slower than Perl?!!
Code available on request.

Clifford Heath

Clifford Heath cjh_nospam@managesoft.com writes:

Ok, folk, time to try again. It’s nothing to do with SHA-1.
I deleted the SHA-1 update from the code so it’s the same
as “cat file >/dev/null”, and it takes ages:

(snip)

dave[/tmp 0:58:39] cat >cat.rb
print $stdin.read

dave[/tmp 0:59:42] time cat <12meg >/dev/null
cat < 12meg > /dev/null 0.00s user 0.05s system 102% cpu 0.049 total

dave[/tmp 0:59:52] time ruby cat.rb <12meg >/dev/null
ruby cat.rb < 12meg > /dev/null 1.78s user 0.14s system 100% cpu 1.917 total

Slower than ‘cat’, but still a lot less than 86s.

Dave

Clifford Heath cjh_nospam@managesoft.com writes:

I gotta say folk, this is going to make it very hard to convince others
here to use ruby, and that’s a shame…

Well, I was intrigued by this, so I tried some of my own:

jenny:~/tmp> time cat one_gig > /dev/null
0.090u 5.100s 0:53.85 9.6% 0+0k 0+0io 122pf+0w
0.050u 5.310s 0:54.18 9.8% 0+0k 0+0io 122pf+0w
0.120u 5.090s 0:54.34 9.5% 0+0k 0+0io 122pf+0w

jenny:~/tmp> time ./cat.rb one_gig > /dev/null

Using IO#read with block size 8192 bytes

114.530u 6.810s 2:39.15 76.2% 0+0k 0+0io 338pf+0w
114.560u 6.590s 2:41.99 74.7% 0+0k 0+0io 338pf+0w
113.600u 6.840s 2:31.38 79.5% 0+0k 0+0io 336pf+0w

Using IO#sysread with block size 8192 bytes

1.030u 10.940s 0:51.41 23.2% 0+0k 0+0io 343pf+0w
0.950u 11.590s 0:50.36 24.9% 0+0k 0+0io 343pf+0w
0.920u 11.860s 0:51.41 24.8% 0+0k 0+0io 343pf+0w

Using IO#read with block size 16384 bytes

113.530u 6.670s 2:30.63 79.7% 0+0k 0+0io 336pf+0w
114.390u 6.370s 2:43.17 74.0% 0+0k 0+0io 336pf+0w
114.460u 7.340s 2:44.17 74.1% 0+0k 0+0io 336pf+0w

Using IO#sysread with block size 16384 bytes

0.810u 10.570s 0:50.25 22.6% 0+0k 0+0io 343pf+0w
0.670u 10.240s 0:50.04 21.8% 0+0k 0+0io 343pf+0w
0.850u 10.720s 0:50.26 23.0% 0+0k 0+0io 343pf+0w

As can be seen, ruby’s version can be as fast as cat (slightly faster
in wall-clock time here). When comparing things, it is important to
compare apples with apples: cat uses the read syscall (man 2 read),
so use IO#sysread instead of IO#read. But by using sysread, you’ll
miss the niceness of readline, etc.

CPU usage in ruby’s sysread test is higher than cat’s, though. That
illustrates the overhead of ruby compared to compiled C.

IO#read reads a character at a time using getc (man 3 getc); that’s
why it’s so slow.
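
For reference, the two read loops being compared can be sketched roughly as below. This is not the cat.rb attached to this thread, only a minimal illustration of a fixed-block copy through IO#read versus IO#sysread. The method names and the BLOCK_SIZE constant are made up for the sketch.

```ruby
# Hypothetical sketch, not the cat.rb attached to the original post.
BLOCK_SIZE = 8192

# Copy src to dst in fixed-size blocks through the buffered IO#read path.
# Returns the number of bytes copied.
def copy_with_read(src, dst, block_size = BLOCK_SIZE)
  total = 0
  while (chunk = src.read(block_size)) # IO#read(n) returns nil at EOF
    dst.write(chunk)
    total += chunk.bytesize
  end
  total
end

# Copy src to dst in fixed-size blocks through IO#sysread, which bypasses
# Ruby's read buffering. Returns the number of bytes copied.
def copy_with_sysread(src, dst, block_size = BLOCK_SIZE)
  total = 0
  loop do
    chunk = src.sysread(block_size) # raises EOFError at EOF
    dst.write(chunk)
    total += chunk.bytesize
  end
rescue EOFError
  total
end

# Usage (script name is hypothetical):
#   ruby cat_sketch.rb sysread /tmp/ten_megabytes > /dev/null
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  mode, path = ARGV
  File.open(path, 'rb') do |f|
    mode == 'sysread' ? copy_with_sysread(f, $stdout) : copy_with_read(f, $stdout)
  end
end
```

Both loops hold at most one block in memory at a time, so the only intended difference between them is which read call is used.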

30 times slower than Perl?!!

Your benchmark shows that perl’s version is slower than cat. But
ruby’s version is faster than cat. So, perl’s IO is actually slower
than ruby’s. This is nice.

YS.

Hi,

At Thu, 20 Jun 2002 13:47:37 +0900, Clifford Heath wrote:

Ok, folk, time to try again. It’s nothing to do with SHA-1.
I deleted the SHA-1 update from the code so it’s the same
as “cat file >/dev/null”, and it takes ages:
(snip)
I gotta say folk, this is going to make it very hard to convince others
here to use ruby, and that’s a shame… 30 times slower than Perl?!!
Code available on request.

What version do you use? Read is slow in 1.6, but improved in
1.7.

Nobu Nakada

I gotta say folk, this is going to make it very hard to convince others
here to use ruby, and that’s a shame… 30 times slower than Perl?!!
Code available on request.

Clifford Heath

What version? I believe from prior postings to this newsgroup that 1.6.7
and even better 1.7.2 has fixed it.

Bob

Dave Thomas Dave@PragmaticProgrammer.com writes:

Slower than ‘cat’, but still a lot less than 86s.

Forget it - I misread the times in your original post.

Dave

Yohanes Santoso ruby-talk@jenny-gnome.dyndns.org writes:

Well, I was intrigued by this, so I tried some of my own:

Forgot to attach my test:

cat.rb (789 Bytes)

Dave Thomas wrote:

print $stdin.read

Thanks Dave. Hopefully someone will profile the code and find some
simple problem… though perhaps not with a simple fix. While I’m
here the above reads the whole file into memory then spits it out,
which though simple might not be preferable :-). That’s why I used
a fixed size 8K block (in my OP and in my cat.rb). It seems to be
about the same speed as yours (though faster than repeated readline).

Clifford Heath

nobu.nokada@softhome.net wrote:

What version do you use? Read is slow in 1.6, but improved in 1.7.

1.6.6, which I think is the latest published Debian package.


Clifford Heath

Yohanes Santoso wrote:

IO#read reads a character at a time using getc (man 3 getc), that’s
why it’s so slow.

Sigh. Why do they always do it the haaard way… :-)

Seriously though, a simple fread() call would improve things enormously:

$ time /bin/cat < /tmp/ten_megabytes > /dev/null
real 0m0.035s
user 0m0.000s
sys 0m0.030s
$ time ./cat_read < /tmp/ten_megabytes > /dev/null
real 0m0.085s
user 0m0.000s
sys 0m0.080s
$ time ./cat_fread < /tmp/ten_megabytes > /dev/null
real 0m0.086s
user 0m0.000s
sys 0m0.080s
$ time ./cat_getchar < /tmp/ten_megabytes > /dev/null
real 0m2.017s
user 0m1.980s
sys 0m0.030s
$ time ruby cat_sysread.rb /tmp/ten_megabytes > /dev/null
real 0m0.154s
user 0m0.060s
sys 0m0.090s
$ time ruby cat_read.rb /tmp/ten_megabytes > /dev/null
real 0m1.294s
user 0m1.270s
sys 0m0.030s

Note that Linux “cat” doesn’t move the data twice. Instead it mmap’s
the file and writes that, which apparently in this case does actually
transfer the data at least once - which it shouldn’t need to… I would
have thought it would mmap pages set to fault on read, so that untouched
pages never get read.

Either way, ruby can’t do this, but it can use fread! What about it Matz?
Since fread is almost as fast as read, the restriction on not mixing
sysread and read could perhaps be relaxed too?
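
The cat_getchar versus cat_fread comparison above has a rough Ruby-level analogue: reading one byte per call versus reading a block per call. The sketch below is only an illustration with made-up method names, and it claims no timings. On current interpreters IO#getbyte is served from Ruby's own buffer, so any gap it shows comes from per-call overhead rather than from per-byte system calls.

```ruby
require 'benchmark'

# Count bytes by reading one byte per call (analogous to a getchar loop).
def count_bytes_bytewise(io)
  count = 0
  count += 1 while io.getbyte # getbyte returns nil at EOF
  count
end

# Count bytes by reading fixed-size blocks (analogous to an fread/read loop).
def count_bytes_blockwise(io, block_size = 8192)
  count = 0
  while (chunk = io.read(block_size)) # read(n) returns nil at EOF
    count += chunk.bytesize
  end
  count
end

# Usage (script name is hypothetical): ruby read_styles.rb /tmp/ten_megabytes
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  path = ARGV[0]
  bytewise  = Benchmark.realtime { File.open(path, 'rb') { |f| count_bytes_bytewise(f) } }
  blockwise = Benchmark.realtime { File.open(path, 'rb') { |f| count_bytes_blockwise(f) } }
  printf("getbyte loop:    %.3fs\n", bytewise)
  printf("read(8192) loop: %.3fs\n", blockwise)
end
```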

Clifford Heath

Bob X wrote:

What version? I believe from prior postings to this newsgroup that 1.6.7
and even better 1.7.2 has fixed it.

Yes. “ruby cat.rb < /tmp/ten_megabytes > /dev/null” takes 0.15s with 1.6.7,
down from 1.3s with 1.6.6.

Perl does it in .037s, and /bin/cat in .035s, and my C program with a 65K
buffer and read/write does it in .063s - no idea why /bin/cat is faster.
But ruby is only four times slower, so that’s not too bad.

Clifford Heath

Clifford Heath wrote:

[about /bin/cat using mmap to read files]
Either way, ruby can’t do this.

Sorry to keep banging on about this, but a quick check reveals that
Perl 5.6.1 must also be using mmap, so perhaps Ruby can/should.

Clifford Heath

Clifford Heath cjh_nospam@managesoft.com writes:

Note that Linux “cat” doesn’t move the data twice. Instead it mmap’s

That’s strange. My GNU cat does not use mmap. It uses read() and
write().

Either way, ruby can’t do this, but it can use fread! What about it Matz?

$ time ./cat_read < /tmp/ten_megabytes > /dev/null
real 0m0.085s
user 0m0.000s
sys 0m0.080s
$ time ./cat_fread < /tmp/ten_megabytes > /dev/null
real 0m0.086s
user 0m0.000s
sys 0m0.080s

In the above benchmark, fread(3) and read(2) do not differ by
much. But, at least in glibc 2.2.4, fread(3) is merely a portability
layer on top of read(2). So, theoretically, using read(2) should
result in faster performance than fread(3).

On another note, mmap cannot be used as a generic reading
mechanism. It requires an fd. Accessing $stdin will be done
differently than accessing other IO objects. Too much hassle, and for
the case of cat-ing, there won’t be any improvement since the file
access is linear, not random. In fact, hunting down mmap() in
filemap.c (which gave me a headache) from linux 2.4.18 code makes me
think that for strictly linear access, mmap will suffer because of the
overhead.

Since fread is almost as fast as read, the restriction on not mixing
sysread and read could perhaps be relaxed too?

You’re not the only one confused about the existence of both #sysread
and #read; I am too. Both rb_io_read and rb_io_sysread do basically
the same thing. The only difference is that one uses getc(3) and the
other read(2). Since they are not on the same layer, calling one after
the other confuses the system. Simply changing #sysread to use
fread(3) would eliminate the confusion, and the price is a very small
overhead. But Matz didn’t do it.

Is there anything that can be done with read(2) but can’t be done with
fread(3)? If not, then the only reason I can think of is #sysread is
there for you to utilise the maximum capability of the OS. Could this
be true?

YS.

On Thu, Jun 20, 2002 at 05:07:51PM +0900, Clifford Heath wrote:

nobu.nokada@softhome.net wrote:

What version do you use? Read is slow in 1.6, but improved in 1.7.

1.6.6, which I think is the latest published Debian package.

Then you must live in an unupgraded world.

radek@kvark:~$ dpkg -l ruby
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Installed/Config-files/Unpacked/Failed-config/Half-installed
|/ Err?=(none)/Hold/Reinst-required/X=both-problems (Status,Err: uppercase=bad)
||/ Name           Version        Description
+++-==============-==============-============================================
ii  ruby           1.6.7-3        An interpreter of object-oriented scripting
radek@kvark:~$

Debian Woody.

Potato has 1.4.3-6. In Sid you can try ruby1.7
(1.7.2.0cvs2002.01.18-1). But I don’t live on the edge.


Radek Hnilica

No matter how far down the wrong road you’ve gone, turn back.
Turkish proverb

Note that Linux “cat” doesn’t move the data twice. Instead it mmap’s
the file and writes that, which apparently in this case does actually
transfer the data at least once - which it shouldn’t need to… I would
have thought it would mmap pages set to fault on read, so that untouched
pages never get read.

Either way, ruby can’t do this,

It is possible to use mmap from Ruby (this requires Ruby 1.7):

require 'dl/import'
require 'dl/struct'

module Mmap
  extend DL::Importable

  dlload 'libc.so.6'

  typealias 'size_t', 'unsigned long'
  typealias 'ssize_t', 'long'
  typealias 'off_t', 'unsigned long'
  extern 'void * mmap(void *, size_t, int, int, int, off_t)'
  extern 'ssize_t write(int, void *, size_t)'

  # Values as defined in Linux <sys/mman.h>
  PROT_READ = 0x1; PROT_WRITE = 0x2; PROT_EXEC = 0x4; PROT_NONE = 0x0
  MAP_SHARED = 0x1; MAP_PRIVATE = 0x2; MAP_FIXED = 0x10
  MAP_FAILED = DL::PtrData.new(-1)
end

# Map the input file read-only, then write the mapping straight to the output.
infile = File.open("test.dat", File::RDONLY)
size = infile.stat.size
inmem = Mmap.mmap(nil, size, Mmap::PROT_READ, Mmap::MAP_PRIVATE, infile.to_i, 0)
raise "unable to mmap test.dat" if inmem == Mmap::MAP_FAILED

outfile = File.open("out.dat", File::WRONLY | File::CREAT | File::TRUNC)
Mmap.write(outfile.to_i, inmem, size)

Some systems have “copy file” system call that could be used this way,
too.
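
For readers on Ruby 1.9 or later (not the 1.6/1.7 interpreters discussed in this thread), IO.copy_stream is one way to hand the whole copy to the interpreter instead of looping in Ruby code. Which system call it ends up using depends on the platform and the Ruby build, so nothing here should be read as a promise of zero-copy behaviour. The copy_file helper name is hypothetical.

```ruby
# Hypothetical helper: copy src_path to dst_path without an explicit
# read/write loop in Ruby code. Returns the number of bytes copied.
def copy_file(src_path, dst_path)
  File.open(src_path, 'rb') do |src|
    File.open(dst_path, 'wb') do |dst|
      IO.copy_stream(src, dst)
    end
  end
end

# Usage (script name is hypothetical): ruby copy_sketch.rb test.dat out.dat
if __FILE__ == $PROGRAM_NAME && ARGV.size == 2
  bytes = copy_file(ARGV[0], ARGV[1])
  warn "copied #{bytes} bytes"
end
```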

but it can use fread! What about it Matz?
Since fread is almost as fast as read, the restriction on not mixing
sysread and read could perhaps be relaxed too?

Sounds like a good idea to me. I’m sure there’s a good reason for it
being written the way it is.

Paul

On Thu, Jun 20, 2002 at 05:07:53PM +0900, Clifford Heath wrote:

Hi,

In message “Re: Why is I/O slow?” on 02/06/20, Clifford Heath cjh_nospam@managesoft.com writes:

Either way, ruby can’t do this, but it can use fread! What about it Matz?
Since fread is almost as fast as read, the restriction on not mixing
sysread and read could perhaps be relaxed too?

It’s due to stdio buffering. In the future version, Ruby will do its
own IO buffering without using stdio, so that it can tweak IO buffer
freely without portability problem.

						matz.
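
The buffering restriction can still be observed from Ruby code. The sketch below assumes a current interpreter, where a buffered read such as IO#getbyte pulls a block of the file into Ruby's own buffer and a following IO#sysread on the same object is refused with IOError. The 1.6 and 1.7 interpreters discussed in this thread may behave differently, and the method name is made up.

```ruby
# Hypothetical demonstration: returns the exception raised when sysread
# follows a buffered read on the same IO object, or nil if none is raised.
# The file at path should hold more than one byte.
def sysread_after_buffered_read(path)
  File.open(path, 'rb') do |f|
    f.getbyte # buffered read: leaves unread data in Ruby's buffer
    begin
      f.sysread(16) # unbuffered read on the same object
      nil
    rescue IOError => e
      e
    end
  end
end

# Usage (script name is hypothetical): ruby mixing_sketch.rb /tmp/ten_megabytes
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  err = sysread_after_buffered_read(ARGV[0])
  puts(err ? "#{err.class}: #{err.message}" : "no error raised")
end
```

The error is raised because the bytes already sitting in the buffer would otherwise be skipped by the direct read.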

Yohanes Santoso ruby-talk@jenny-gnome.dyndns.org writes:

In my previous post, please ignore the paragraph that starts with:

On another note, mmap cannot be used as a generic reading
mechanism.

Here is my correction:

On another note, mmap cannot be used as a generic reading
mechanism. mmap requires something of fixed size. So, accessing $stdin,
network socket, etc. will have to be done differently than accessing
other file-based objects. In the case of cat-ing, there won’t be any
improvement since the file access is linear, not random. In fact,
hunting down mmap() in filemap.c (which gave me a headache) from linux
2.4.18 code makes me think that for strictly linear access, mmap will
suffer because of the overhead.
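
The fixed-size requirement is easy to check from Ruby before even considering a mapping. The helper below is a hypothetical sketch (the name mappable? is not from any library): it treats an IO as a candidate for mmap only when it is backed by a regular, non-empty file, which rules out pipes, sockets and terminals.

```ruby
# Hypothetical helper: true when io is backed by a regular, non-empty file,
# i.e. something of fixed size that mmap could map.
def mappable?(io)
  st = io.stat
  st.file? && st.size > 0
end

# Usage (script name is hypothetical):
#   ruby mappable.rb < /tmp/ten_megabytes    # stdin redirected from a file
#   cat /tmp/ten_megabytes | ruby mappable.rb  # stdin is a pipe
if __FILE__ == $PROGRAM_NAME
  puts "stdin mappable: #{mappable?($stdin)}"
end
```

A reader built on mmap would still need an ordinary read loop as the fallback whenever this returns false.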

YS.

Yukihiro Matsumoto wrote:

It’s due to stdio buffering.

But our tests on Linux with fread and glibc indicate that there is no
extra buffering happening. stdio is capable of buffering, but in this
case it isn’t actually buffering (it’s not possible to move A->B->C
as fast as A->C, from my argument about bandwidth). stdio libraries that
still do double copies must surely be rare now…?

In the future version, Ruby will do its
own IO buffering without using stdio, so that it can tweak IO buffer
freely without portability problem.

With respect, this doesn’t sound like a smart idea. The glibc folk have
spent vast effort making it fly, and you want to bypass it?

Anyhow, thanks to Radek for the advice, I’ve got 1.6.7 and the performance
of sysread seems comparable to read now.

Clifford Heath

On Fri, 21 Jun 2002 10:48:39 +0900, Clifford Heath wrote:

Yukihiro Matsumoto wrote:

In the future version, Ruby will do its own IO buffering without
using stdio, so that it can tweak IO buffer freely without
portability problem.

With respect, this doesn’t sound like a smart idea. The glibc folk
have spent vast effort making it fly, and you want to bypass it?

Why not? It is, after all, dependent upon the presence of a
particular glibc – which I won’t have, since I do most of my Ruby
on Windows.

-austin
– Austin Ziegler, austin@halostatue.ca on 2002.06.20 at 22.28.14

With respect, this doesn’t sound like a smart idea. The glibc folk have
spent vast effort making it fly, and you want to bypass it?

There are other systems out there that don’t have the luxury of glibc…