I'm wondering if anyone knows much about Ruby's efficiency with IO#read.
Specifically, I'm wondering about libraries I might use to speed up disk
reads.
To see what I mean, here's some test code that reads through an
11-megabyte file. All it does is call IO#read repeatedly with a buffer
size set on the command line until the file is exhausted, and times it.
#--- readspeed.rb
# NOTE: the filename below is a placeholder for my ~11 MB test file.
buf_size = ARGV[0].to_i
fd = File.open("test.dat", "rb")
start = Time.now
while fd.read(buf_size)
end
stop = Time.now
fd.close
puts (stop - start).to_s + " seconds"
#--- EOF
Running this on my system yields:
$ ruby readspeed.rb 4096
0.014 seconds
$ ruby readspeed.rb 1
7.547 seconds
Obviously a big difference! This is a simplified version of the test I
was actually running, which tried to account for the increased call
overhead when reading 1 byte at a time. Even so, there's still an
order-of-magnitude difference between the two: reading one byte at a
time is *slow*, slow enough to bog down an entire program.
I know this is supposed to be the case with unbuffered input, such as
the raw read() system call in C, but isn't IO#read supposed to be
buffered? What's causing this slowdown? I'm writing a class that will
hopefully speed up smaller reads from binary files by explicitly caching
data in memory, but I'm wondering if there are any pre-built (i.e.,
tested) solutions that Ruby programmers might be using.
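For the record, what I have in mind is just a thin wrapper that reads in
big chunks and doles them out in small ones. A minimal sketch (untested;
the class name and the 64 KB chunk size are arbitrary choices of mine):

class BufferedReader
  CHUNK = 64 * 1024            # arbitrary read-ahead size

  def initialize(io)
    @io  = io
    @buf = "".b                # binary cache (String#b needs Ruby 2.0+)
  end

  # Return up to n bytes, like IO#read(n), refilling the cache in
  # CHUNK-sized reads so the underlying IO#read is called far less often.
  def read(n)
    while @buf.size < n
      chunk = @io.read(CHUNK)
      break if chunk.nil?      # EOF
      @buf << chunk
    end
    return nil if @buf.empty?
    @buf.slice!(0, n)
  end
end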
I hate it when I put my foot in my mouth. After further testing, I'm
almost entirely sure this is just due to overhead, not a problem with
disk access.
Who would have thought looping 11*2**20 times would incur a performance
hit?
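A quick sanity check, in the same style as the script above: time a
loop with the same iteration count that does no I/O at all.

start = Time.now
(11 * 2**20).times { }         # same iteration count as 1-byte reads
stop = Time.now
puts (stop - start).to_s + " seconds (no reads at all)"

If that alone takes seconds, the 1-byte figure is mostly loop and
method-dispatch cost, not the disk.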
I believe that Rubinius has less per-call overhead ("hit time") -- as
long as you define everything as methods so it can JIT them.
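i.e., something like this (just a sketch of the shape, not a benchmark):

def read_all(fd, buf_size)
  # keep the hot loop inside a method so the JIT can compile it
  while fd.read(buf_size)
  end
end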
GL!
-rp