while start < fileSize
  count = (fileSize - start) if (start + chunk) > fileSize
  buf = input.sysread count
  output.syswrite buf
  buf = nil
We found that when we did it the way that you propose, the system would
grind to a halt when it hit an unusually large file. E.g., if we hit a
(say) 1.4MB file, it would take exponentially longer to process it than
if we looped through it with a smaller buffer. (We’re on a dual Athlon
with 2GB RAM.) We chose 20k as our buffer size, because that’s a
little over twice the typical file size.
Almost all of the files are 8k or less, so we very rarely loop.
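For anyone following along, here is a minimal sketch of the chunked approach described above. It is not our production routine: the name `append_in_chunks` and the `CHUNK` constant are made up for illustration, and only the 20k buffer size comes from the discussion.

```ruby
CHUNK = 20 * 1024 # 20k buffer, the size mentioned above

# Append the contents of the file at in_path to the already-open IO `output`,
# reading at most `chunk` bytes per sysread. Returns the number of bytes copied.
# (Hypothetical helper, for illustration only.)
def append_in_chunks(in_path, output, chunk = CHUNK)
  copied = 0
  File.open(in_path, 'rb') do |input|
    begin
      loop do
        buf = input.sysread(chunk) # raises EOFError once the input is exhausted
        written = 0
        # syswrite may write fewer bytes than requested, so keep going until done
        written += output.syswrite(buf.byteslice(written..-1)) while written < buf.bytesize
        copied += buf.bytesize
      end
    rescue EOFError
      # normal end of input
    end
  end
  copied
end
```

With files of 8k or less, the loop body runs once per file; only the rare large file takes more than one pass.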
Again, we’re not having a performance problem with the copying in
general. Moreover, we can reproduce the problem no matter what the
size of the input files. We are finding that if we have 50,000 files
in a directory, then we spend (something like) .002 seconds per input
file–not an average. When we have a directory with 200,000 files, we
are getting upwards of .5 seconds per file, and the first file isn’t processing
any quicker than the last one, so it’s not an issue of things
“grinding” to a halt.
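In case it helps anyone reproduce this, the per-file cost can be measured along the following lines. This is only a sketch: `per_file_times` and its `limit` parameter are hypothetical, it assumes the directory contains only plain readable files, and it times just an open plus a one-byte read rather than our whole routine.

```ruby
require 'benchmark'

# Time the open-and-read of each file individually, so that the cost of the
# first file can be compared with the last, and one directory with another.
# Returns an array of elapsed seconds, one entry per file examined.
# (Hypothetical helper; assumes `dir` holds only plain, readable files.)
def per_file_times(dir, limit = 1000)
  Dir.children(dir).first(limit).map do |name|
    path = File.join(dir, name)
    Benchmark.realtime do
      File.open(path, 'rb') { |f| f.read(1) }
    end
  end
end
```

Comparing `times.first` with `times.last` within one directory, and then the typical value between a 50,000-file and a 200,000-file directory, separates a gradual slowdown from a flat per-file cost that depends on directory size.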
  start += count
end
output.syswrite("\f")
ensure
  # you probably want to _know_ if the system is having probs
  # closing files
In practice, failure to close files has happened so very rarely
(basically, only because of NFS hiccups) that we don’t want it to
interfere with processing. Out of the 9 million input files processed
in the last month, fewer than 10 actually failed to close.
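One possible middle ground, sketched below, is to report a failed close without letting it raise. The helper name `close_quietly` and its `label` argument are invented for this example; it is not what our routine currently does.

```ruby
# Close an IO, reporting (but not raising) any failure.
# Returns true if the IO is closed afterwards, false if the close failed.
# (Hypothetical helper, for illustration only.)
def close_quietly(io, label = 'file')
  return true if io.nil? || io.closed?
  io.close
  true
rescue SystemCallError, IOError => e
  # Log the problem to stderr and carry on with processing
  warn "could not close #{label}: #{e.class}: #{e.message}"
  false
end
```

That way the rare NFS hiccup still shows up on stderr, but it does not interrupt the run.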
  input.close if input
  output.close if output
If we close the output file here, then we won’t be able to put the next
input file into it.
alternatively you might be able to use the
  open(path) do |f|
    …
  end
idiom with ‘output’.
This idiom would create empty output files when there were problems
with the input. We’ve designed the above routine so that it wouldn’t
do that.
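To make that concrete, here is a rough sketch of the idea: the output file is opened only after an input has been opened successfully, and it stays open across inputs. The name `concat_files` is made up, and using `IO.copy_stream` in place of the sysread loop is a simplification for the example, not what our routine does.

```ruby
# Append every readable input to out_path, each followed by a form feed.
# The output file is created only once the first input has opened
# successfully, so a run in which every input fails leaves no empty output.
# Returns true if the output file was opened at all.
# (Hypothetical helper, for illustration only.)
def concat_files(in_paths, out_path)
  output = nil
  in_paths.each do |path|
    begin
      input = File.open(path, 'rb')
    rescue SystemCallError => e
      warn "skipping #{path}: #{e.message}"
      next
    end
    begin
      output ||= File.open(out_path, 'ab') # opened lazily, then reused
      IO.copy_stream(input, output)
      output.write("\f")                   # form feed separator between inputs
    ensure
      input.close
    end
  end
  !output.nil?
ensure
  output.close if output # closed once, after the last input
end
```

Wrapping the output in an `open(path) do |f| … end` block up front would create the file before any input had been checked, which is exactly the case we wanted to avoid.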
i suspect that you were grinding to a halt with too many
output files open (they were never closed)…
There’s only one output file in this routine and it is closed. And the
performance curve is steady; things are not grinding to a halt.