I’ve found that in the real world the speed advantages of using mmap
are more significant than you seem to allow for. (see “My experiments”
below.)
This is interesting.
When you say your original version used “buffered IO”, do you mean it was
using getc() and friends? (i.e. buffered I/O at the C stdio layer, not a Ruby
IO object)
It’s interesting to note that the gets function in Ruby 1.8.0p2 is quite
different from 1.6.8; where available, it pokes into the FILE* structure to
work out how many characters are available. In 1.6.8 it did individual
getc()'s, each one of which had to interact with threading, which was quite
an overhead. And from what others have posted, that seemed to work really
badly under Windoze.
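For concreteness, the 1.6.8-style inner loop amounts to something like this
(a rough C sketch; Ruby’s real version also had the per-call thread
bookkeeping on top):

    #include <stdio.h>

    /* Read one line, one character at a time - roughly the shape of the
       old getc()-based approach. */
    static int read_line(FILE *fp, char *buf, size_t len)
    {
        size_t i = 0;
        int c = EOF;

        while (i + 1 < len && (c = getc(fp)) != EOF) {
            buf[i++] = (char)c;
            if (c == '\n')
                break;                  /* end of line */
        }
        buf[i] = '\0';
        return i > 0 || c != EOF;       /* 0 only when nothing was read */
    }

Even without the thread checks, that is a function call (or macro expansion)
per byte; 1.8’s trick of peeking at the FILE* internals lets it grab
everything already buffered in one go.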
- The code that handles the mmap in C is much simpler, because it
eliminates the need for a buffer. As far as the rest of my code is
concerned, I’m simply iterating through a char*.
But remember that it is also less general, as you can’t mmap() a fifo, a
character device etc.
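For ordinary files, though, the C side really can be tiny. A minimal sketch
of the whole mmap pattern (error handling mostly elided; the line count
stands in for the real parser):

    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <sys/stat.h>

    int main(int argc, char **argv)
    {
        struct stat st;
        int fd = open(argv[1], O_RDONLY);
        if (fd < 0 || fstat(fd, &st) < 0)
            return 1;

        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED)            /* e.g. a fifo or character device */
            return 1;

        /* the parser is then just a walk along a char* */
        long lines = 0;
        for (char *q = p, *end = p + st.st_size; q < end; q++)
            if (*q == '\n')
                lines++;
        printf("%ld lines\n", lines);

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }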
- I was unable to use a buffer of more than 10k (10,240 chars) to
read data without getting a segfault. (I was allocating it as an
automatic (stack) array at the top of my function – I did not try
using malloc.)
You couldn’t allocate more than 10K on the stack?? I do remember someone
posting a message here a month or two back about the default stack size for
MacOS X applications being extraordinarily small, and I think someone showed
how to raise it. Or just malloc, as you mentioned.
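The malloc version is only a couple of lines anyway, and sidesteps the stack
limit completely (a sketch; the 64K figure is arbitrary):

    #include <stdlib.h>

    enum { BUFSIZE = 64 * 1024 };       /* on the heap, so no stack limit */

    char *buf = malloc(BUFSIZE);
    if (buf == NULL)
        return -1;                      /* out of memory */
    /* ... read and parse using buf ... */
    free(buf);

(From the shell, ‘ulimit -s’ shows the soft stack limit in kbytes, and
‘ulimit -s 16384’ or similar raises it, up to whatever hard limit the system
allows.)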
- Since I’m analyzing the file byte by byte, using a buffer requires
testing for the end of the buffer/end of the file after each byte,
which is slightly more complicated than simply comparing an
incremented value to a limit.
This is true, although you would have the same overhead if you wanted to
read an enormous file which was too big to mmap in at once, so that you had
to mmap it in, say, 10MB segments.
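Continuing the earlier sketch, the segmented version looks roughly like
this. The only real subtleties are that the offset passed to mmap() must be
page-aligned (10MB is a multiple of the page size on common systems), and
that a record straddling a segment boundary has to be carried over:

    #include <sys/types.h>
    #include <sys/mman.h>

    #define SEGMENT (10 * 1024 * 1024)  /* 10MB per mapping */

    off_t off = 0;
    while (off < st.st_size) {
        size_t len = st.st_size - off;
        if (len > SEGMENT)
            len = SEGMENT;

        char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, off);
        /* ... scan p[0] .. p[len-1], carrying any partial record
           over to the next segment ... */
        munmap(p, len);
        off += len;
    }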
I believe the main advantage that mmap has over the buffered I/O libraries
is that it doesn’t actually need to copy the data from the stdio buffers
(the vbufs) into your application’s buffer.
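A read()-based scanner, by contrast, pays for that copy and has to
interleave refill checks with the per-byte scan - roughly (a sketch, with an
arbitrary 64K buffer):

    #include <unistd.h>

    static char buf[64 * 1024];         /* static, to stay off the stack */
    ssize_t n;

    while ((n = read(fd, buf, sizeof buf)) > 0) {   /* data copied into buf */
        for (ssize_t i = 0; i < n; i++) {
            /* per-byte work on buf[i]; a record that spans the end of
               the buffer needs special handling */
        }
    }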
While I was developing, I gauged performance using a file with 20,000
lines, 424,250 words, and 3,443,296 characters (which I do still have).
I did both a bulk read (returning an array of arrays of the entire
file) and an iterative read (where each row is yielded to a block as an
array; the block in this case joins the row with a tab.) For a file
this size, there was no noticeable difference. Performance parsing
this file using either read method (iterative or bulk) was as follows:
- Using C open/close with a file descriptor: approx. 10 seconds to
process
- Using mmap: just over 2 seconds to process
Simple profiling gave the following results:
- Running ruby -rprofile for the bulk read gave me the following:

       %   cumulative   self              self     total
      time   seconds   seconds    calls  ms/call  ms/call  name
     71.59      3.78      3.78        1  3780.00  5200.00  Text_CSV.decode
     26.89      5.20      1.42    20000     0.07     0.07  Array#clone
      0.95      5.25      0.05        5    10.00    16.00  Kernel.require
     …
(Keep in mind that the program runs significantly slower under
-rprofile, so the times indicated above do not reflect the real,
non-profiled execution time.)
- Running ruby -rprofile for the iterative read gave me the following:

       %   cumulative   self              self     total
      time   seconds   seconds    calls  ms/call  ms/call  name
     75.94      4.64      4.64        1  4640.00  6050.00  Text_CSV.decode
     23.08      6.05      1.41    20000     0.07     0.07  Array#join
      0.49      6.08      0.03        5     6.00    12.00  Kernel.require
     …
(The same caveat about -rprofile slowing things down applies here, too.)
Ruby’s profiler is a bit dodgy IMO, as I’ve seen it do strange things in the
past, so you might be better off running a C profiler on the Ruby
interpreter itself - or just use the ‘time’ command,

    time ./foo …

although of course this doesn’t factor out how much time is spent in
‘decode’ and how much in the rest of the program.
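If you do try a C profiler, the usual gprof recipe (assuming gcc; the
details vary by platform) is roughly:

    CFLAGS=-pg ./configure      # build the interpreter with profiling hooks
    make
    ./ruby foo.rb               # writes gmon.out in the current directory
    gprof ./ruby gmon.out | head -40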
Clearly you had a faster implementation with mmap. It might, however, have
been possible to narrow the difference by tweaking the code which used
standard I/O (with a 64K buffer, say).
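The buffer is a one-line change with setvbuf(), which has to be called
before anything else is done on the stream (a sketch; the filename is
illustrative):

    #include <stdio.h>

    static char iobuf[64 * 1024];

    FILE *fp = fopen("data.csv", "r");
    if (fp != NULL)
        setvbuf(fp, iobuf, _IOFBF, sizeof iobuf);   /* 64K, fully buffered */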
Lastly, if my memory serves me correctly, the Perl Text_CSV module
(which uses stdio.h) is about the same speed as my Ruby routine that used
the unistd.h functions, and so it is much slower than my Ruby routine that
uses mmap.
This is all perhaps worth revisiting anyway, as I don’t really like the API
to the CSV functions which are in RAA (but I don’t really have the time to
rewrite it myself). So if you want to release a super-CSV, that would be
great…
Cheers,
Brian.
On Fri, Apr 18, 2003 at 01:11:07AM +0900, David King Landrith wrote: