Park and Sean....Possible memory bug in Ruby? I'm stumped!

Thanks a lot, Park and Sean.

Two questions:

  • Sean, what do you mean by CG when you write in your message “This is exactly the type of error that CG causes”?

  • Is there a set limit on the number of files that can be open? If so, how many?

  • Park, you write that one option is this:

     inserta = File.new("./input/intro_subtemplate.txt").read
    
     should be
    
     f = File.new("./input/intro_subtemplate.txt")
     inserta = f.read
     f.close
    

    but these statements seem (to this newbie) to be logically equivalent. Does the one-line method I used create a phantom file that can’t be closed?

Thanks again!

-Kurt

···

-----Original Message-----
From: Park Heesob [mailto:phasis@kornet.net]
Sent: Sunday, June 09, 2002 1:38 AM
To: ruby-talk@ruby-lang.org
Subject: Re: Possible memory bug in Ruby? I’m stumped!

Hi,

“Kurt Euler” keuler@portal.com wrote in message
news:C47CCC6238EFD4119C5200508B95A100074AF63B@cup1ex1.portal.com

All-

The code as written after the dashed line way below (with file name t7.rb)
correctly inserts the contents of file “intro_subtemplate.txt” into a
template file during each iteration, resulting in the creation of some 189
identical but differently named files in directory “./output”. (There are
189 filenames in the 189 rows in file “packages.txt”.)

t7.rb:6:in ‘initialize’: Invalid argument -
“./input/intro_subtemplate.txt” (Errno::EINVAL)
from t7.rb:6:in ‘new’
from t7.rb:6
from t7.rb:2:in ‘foreach’
from t7.rb:2

t7.rb:15:in ‘initialize’: Invalid argument -
“./output/SOL_BroadbandManager_6.2_SP1” (Errno::EINVAL)
from t7.rb:15:in ‘new’
from t7.rb:15
from t7.rb:2:in ‘foreach’
from t7.rb:2

These error messages are caused by too many open files.

Thanks in advance!!!

-Kurt Euler
(I’m using Ruby 1.6.7)

File t7.rb:

content = File.new("./input/main_readme_template.txt").read
IO.foreach("./input/packages.txt") { |x|
  field = x.chop.split(':')
  content2 = content

  inserta = File.new("./input/intro_subtemplate.txt").read
  content2.gsub!(/<intro_subtemplate.txt>/, inserta)

  #insertb = File.new("./input/install_unix_subtemplate.txt").read
  #content2.gsub!(/<install_unix_subtemplate.txt>/, insertb)

  #insertc = File.new("./input/files_changed_sp_num-1_subtemplate.txt").read
  #content2.gsub!(/<files_changed_sp_num-1_subtemplate.txt">/, insertc)

  File.new("./output/"+field[0],"w+").write(content2)
  puts field[0]
}

The following three lines should be placed before the IO.foreach loop:

inserta = File.new("./input/intro_subtemplate.txt").read
insertb = File.new("./input/install_unix_subtemplate.txt").read
insertc = File.new("./input/files_changed_sp_num-1_subtemplate.txt").read

Or

inserta = File.new("./input/intro_subtemplate.txt").read

should be

f = File.new("./input/intro_subtemplate.txt")
inserta = f.read
f.close
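
For illustration only, here is a minimal sketch that combines both suggestions: the template and subtemplate are read once before the loop, and every file is closed as soon as it has been used. The temporary directory, the two package names, and the file contents are stand-ins invented for this example, not the real input files.

```ruby
require "tmpdir"
require "fileutils"

# Stand-in input tree (hypothetical contents, just enough to run the sketch).
root       = Dir.mktmpdir
input_dir  = File.join(root, "input")
output_dir = File.join(root, "output")
FileUtils.mkdir_p([input_dir, output_dir])
File.write(File.join(input_dir, "main_readme_template.txt"),
           "Readme\n<intro_subtemplate.txt>\n")
File.write(File.join(input_dir, "intro_subtemplate.txt"), "Intro text")
File.write(File.join(input_dir, "packages.txt"), "pkg_a:1.0\npkg_b:2.0\n")

# Read the template and the subtemplate once, before the loop.
# The block form of File.open closes each file when the block returns.
content = File.open(File.join(input_dir, "main_readme_template.txt")) { |f| f.read }
inserta = File.open(File.join(input_dir, "intro_subtemplate.txt")) { |f| f.read }

written = []
IO.foreach(File.join(input_dir, "packages.txt")) do |line|
  field = line.chomp.split(":")
  # Non-destructive gsub leaves the shared template string untouched.
  page = content.gsub(/<intro_subtemplate\.txt>/) { inserta }
  # The output file is also closed as soon as the block finishes.
  File.open(File.join(output_dir, field[0]), "w") { |out| out.write(page) }
  written << field[0]
end

first_page = File.read(File.join(output_dir, written.first))
puts written.inspect
FileUtils.remove_entry(root)
```

With this shape, only a couple of files are open at any moment, however many rows packages.txt has.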

Park Heesob.

Thanks a lot, Park and Sean.

Two questions:

  • Sean, what do you mean by CG when you write in your message “This
    is exactly the type of error that CG causes”?

I believe he meant GC, as in garbage collection. There is nothing
intrinsic about garbage collection that causes too many open files. =)
Perhaps he meant that with a language that has garbage collection, it’s
a lot easier to write code that leaves files open and never closes
them? I won’t put words in his mouth, but that may have been what he
was referring to.

  • Is there a set limit on the number of files that can be open? If
    so, how many?

It depends on the operating system and various process limits that may
be set. The limits I’ve seen on various systems range from tens to
thousands. Usually having more than a handful of files open at a time
means trouble.
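
As a rough way to see the limit on a given machine, the sketch below asks the operating system for the per-process cap on open file descriptors. It assumes a Unix-like system and a Ruby version that provides Process.getrlimit; on other platforms the call may not be available.

```ruby
# Query the per-process limit on open file descriptors.
# The soft limit is the one a running process actually hits;
# the hard limit is the ceiling the soft limit can be raised to.
soft_limit, hard_limit = Process.getrlimit(Process::RLIMIT_NOFILE)

puts "soft limit: #{soft_limit}"
puts "hard limit: #{hard_limit}"
```

The numbers vary from system to system, which matches the range described above.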

  • Park, you write that one option is this:

     inserta = File.new("./input/intro_subtemplate.txt").read
    
     should be
    
     f = File.new("./input/intro_subtemplate.txt")
     inserta = f.read
     f.close
    

The first one creates a file object, does a read, then the file object
is floating. If and when it gets garbage collected, the file will be
closed. But if you run a ton of these commands in a row, it’s extremely
easy to open more files than is allowed by the OS before the garbage
collector ever kicks in. (Perhaps ruby ought to catch the error and run
the garbage collector, then try again transparently?)

The second one closes the file explicitly, and thus can never have the
problem the first one does. But it requires an extra variable and you
have to remember to close it.

I usually use something like this instead:

inserta = '' # assuming inserta isn’t already a local variable
File.open("./input/intro_subtemplate.txt") { |f| inserta = f.read }

Since a block is passed to File.open, the file is automatically closed
after the block is executed. Not so with the first example you typed
above. =)
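
To make the difference concrete, here is a small sketch that checks whether the file is still open after each style of read. The temporary file stands in for intro_subtemplate.txt, and the variable names are made up for the example. It uses the return value of the block rather than assigning inside it, and the last line shows File.read, which current Ruby versions provide for the same job.

```ruby
require "tmpdir"
require "fileutils"

dir  = Dir.mktmpdir
path = File.join(dir, "intro_subtemplate.txt") # stand-in for the real file
File.write(path, "Intro text\n")

# Style 1: File.new without close. The object stays open until it is
# closed by hand or reclaimed by the garbage collector.
file       = File.new(path)
inserta    = file.read
still_open = !file.closed?
file.close # closed here only so the example cleans up after itself

# Style 2: File.open with a block. The file is closed when the block
# returns, and File.open hands back the block's value.
seen    = nil
insertb = File.open(path) { |f| seen = f; f.read }
closed_after_block = seen.closed?

# Style 3: File.read opens, reads, and closes in one call.
insertc = File.read(path)

puts "File.new left the file open:  #{still_open}"
puts "File.open block closed it:    #{closed_after_block}"
FileUtils.remove_entry(dir)
```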

···

On Sunday 09 June 2002 02:07 pm, Kurt Euler wrote:


Wesley J. Landaker - wjl@icecavern.net
OpenPGP FP: C99E DF40 54F6 B625 FD48 B509 A3DE 8D79 541F F830

I believe you were using a file that had been opened inside a block. I
think you are using a file that should have fallen out of scope, but due
to the nature of GC, whenever it does its cleanup, it will destroy that
file. So sometime through your program, the GC mechanism is run and it
destroys that file, seemingly at a random point in your file. Read up on
how scoping works in Ruby. This is what I gathered from the other
responses to the problem; I haven’t really looked at your code too hard.

PS, please don’t change the name of discussions on here.

more comments are below:

Kurt Euler wrote:

Thanks a lot, Park and Sean.

Two questions:

  • Sean, what do you mean by CG when you write in your message “This is exactly the type of error that CG causes”?

  • Is there a set limit on the number of files that can be open? If so, how many?

  • Park, you write that one option is this:

     inserta = File.new("./input/intro_subtemplate.txt").read
    
     should be
    
     f = File.new("./input/intro_subtemplate.txt")
     inserta = f.read
     f.close
    

    but these statements seem (to this newbie) to be logically equivalent. Does the one-line method I used create a phantom file that can’t be closed?

They are, he was just giving you another way to write it.

You know, this might be the problem I’m experiencing with DBI and
dbd_oracle.

I never execute StatementHandle#finish – could I be running out of
statement handles? (Remember, when I explicitly added “GC.start”
to my code, the crash went away.)

Shouldn’t DBI tell me I’ve run out of statement handles, rather than
segfaulting Ruby?

Maybe I can create a reproducible test case now, if this is why it’s
dying …

– Dossy

···

On 2002.06.10, Wesley J Landaker wjl@icecavern.net wrote:

The first one creates a file object, does a read, then the file object
is floating. If and when it gets garbage collected, the file will be
closed. But if you run a ton of these commands in a row, it’s extremely
easy to open more files than is allowed by the OS before the garbage
collector ever kicks in. (Perhaps ruby ought to catch the error and run
the garbage collector, then try again transparently?)


Dossy Shiobara mail: dossy@panoptic.com
Panoptic Computer Network web: http://www.panoptic.com/
“He realized the fastest way to change is to laugh at your own
folly – then you can let go and quickly move on.” (p. 70)

It does:

file = fopen(fname, mode);
if (!file) {

»·······if (errno == EMFILE || errno == ENFILE) {
»······· rb_gc();
»······· file = fopen(fname, mode);
»·······}

Spot the great use of mixed tabs and spaces to indent there ;)
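
The same idea can be sketched at the Ruby level. This is not what the interpreter does internally, just an illustration of the pattern in the C excerpt: if opening fails because the descriptor table is full, run the garbage collector once and try again. The helper name open_with_gc_retry is made up for this example.

```ruby
require "tempfile"

# Hypothetical helper: open a file, and if the process or system is out
# of file descriptors, run the GC once and retry the open.
def open_with_gc_retry(path, mode = "r")
  retried = false
  begin
    File.open(path, mode)
  rescue Errno::EMFILE, Errno::ENFILE
    raise if retried   # give up if the GC pass did not free anything
    retried = true
    GC.start           # lets unreferenced File objects be closed
    retry
  end
end

tmp = Tempfile.new("demo")
f = open_with_gc_retry(tmp.path)
opened_ok = !f.closed?
f.close
tmp.close!
puts "opened: #{opened_ok}"
```

Other errors, such as a missing file, are not rescued and propagate as usual. Closing files explicitly is still the more dependable fix; the retry only helps when the leaked objects are no longer referenced.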

···

If and when it gets garbage collected, the file will be closed. But if
you run a ton of these commands in a row, it’s extremely easy to open
more files than is allowed by the OS before the garbage collector ever
kicks in. (Perhaps ruby ought to catch the error and run the garbage
collector, then try again transparently?)


Thomas ‘Freaky’ Hurst - freaky@aagh.net - http://www.aagh.net/

To one large turkey add one gallon of vermouth and a demijohn of Angostura
bitters. Shake.
– F. Scott Fitzgerald, recipe for turkey cocktail.

The first one creates a file object, does a read, then the file
object is floating. If and when it gets garbage collected, the file
will be closed. But if you run a ton of these commands in a row,
it’s extremely easy to open more files than is allowed by the OS
before the garbage collector ever kicks in. (Perhaps ruby ought to
catch the error and run the garbage collector, then try again
transparently?)

You know, this might be the problem I’m experiencing with DBI and
dbd_oracle.

I never execute StatementHandle#finish – could I be running out of
statement handles? (Remember, when I explicitly added “GC.start”
to my code, the crash went away.)

Shouldn’t DBI tell me I’ve run out of statement handles, rather than
segfaulting Ruby?

I’m not real familiar with DBI and the dbd_oracle extension, but
assuming it’s using file descriptors for the statement handles (file
descriptors are used for files, sockets, fifos, pipes, etc) the same
sort of thing could be happening – where you have sort of a race
between your code and the garbage collector. =)

Maybe I can create a reproducible test case now, if this is why it’s
dying …

Good luck, it’s a definite possibility. =)
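
One way to take the garbage collector out of that race is to release each handle in an ensure clause. The sketch below uses a made-up FakeHandle class as a stand-in for a statement handle; it is not the DBI API, only an illustration of the pattern with a finish method like the one mentioned above.

```ruby
# Hypothetical stand-in for a scarce resource such as a statement handle.
class FakeHandle
  @open_count = 0
  class << self
    attr_accessor :open_count
  end

  def initialize
    self.class.open_count += 1
    @finished = false
  end

  # Release the resource; calling it twice is harmless.
  def finish
    return if @finished
    @finished = true
    self.class.open_count -= 1
  end
end

# Hypothetical helper: yields a handle and always finishes it,
# even if the block raises.
def with_handle
  handle = FakeHandle.new
  yield handle
ensure
  handle.finish if handle
end

1000.times { with_handle { |h| h } }
puts "handles still open: #{FakeHandle.open_count}"
```

Because every handle is finished before the next one is created, the count of open handles never grows, no matter when the garbage collector runs.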

···

On Sunday 09 June 2002 02:53 pm, Dossy wrote:

On 2002.06.10, Wesley J Landaker wjl@icecavern.net wrote:


Wesley J. Landaker - wjl@icecavern.net
OpenPGP FP: C99E DF40 54F6 B625 FD48 B509 A3DE 8D79 541F F830