When I run the program:
STDIN.each_line { |line| line.split("\t") }
Is it really only this line? How do you feed the big file to stdin? Does the big file actually _have_ lines? If it doesn't, you would likely see exactly that behavior, because each_line has to keep reading until it finds a line terminator.
with a big input file, the process size grows rapidly, reaching 1G in
less than 1 minute.
If I let it go, it continues to grow at this rate until the whole
system fails.
What does "fails" mean here? Does the kernel panic, or does the Ruby process terminate with an error?
This is Ruby 1.8.6 on Fedora 9.
You haven't accidentally switched GC off completely, have you? (A quick check is sketched below.)
Is this a known problem?
Is there some way to work around it?
1.8.6 is not really current anymore; I would upgrade if possible, or at least get the latest patch level of the package. For more specific advice we have too little information, I'm afraid.
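Regarding the GC question: if you want to verify it from inside the script, GC.enable reports whether collection had been switched off. A tiny illustrative check (nothing in your one-liner disables GC by itself):

  # GC.enable re-enables collection and returns true if it had been disabled.
  was_disabled = GC.enable
  warn "GC was disabled!" if was_disabled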
When I try this with my Cygwin version, memory consumption stays at roughly 3 MB:
robert@fussel ~
$ perl -e 'foreach $i (1..10000000) {print $i, "--\t--\t--\t--\n";}' | \
> ruby -e 'STDIN.each_line { |line| line.split("\t") }'
robert@fussel ~
$ ruby -v
ruby 1.8.7 (2008-08-11 patchlevel 72) [i386-cygwin]
robert@fussel ~
$
But if I read a file that does not have any lines, the behavior is as you describe and memory goes up and up:
ruby -e 'STDIN.each_line { |line| line.split("\t") }' </dev/zero
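If your real input can indeed lack line terminators, one way to keep memory bounded is to read fixed-size chunks instead of lines. A minimal sketch, assuming the records are tab-separated as in your one-liner and that a 64 KB chunk size is acceptable (both are assumptions, adjust as needed):

  # Read STDIN in fixed-size chunks so memory stays bounded even when the
  # input never contains a line terminator.
  buffer = ""
  while chunk = STDIN.read(64 * 1024)   # returns nil at EOF
    buffer << chunk
    fields = buffer.split("\t", -1)     # -1 keeps a trailing empty field
    buffer = fields.pop                 # possibly incomplete last field, kept for next round
    # ... process fields here ...
  end
  # buffer now holds whatever came after the last tab, if anything

That only makes sense if the tab-separated fields are what you are really after; if the file does have newlines, the original one-liner should not grow without bound and the cause must be something else.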
Kind regards
robert
On 31.12.2008 17:28, cchayden.nyt@gmail.com wrote:
--
remember.guy do |as, often| as.you_can - without end