Newbie Mem Leak Issue

I am fairly new to Ruby and a program that I have created seems to have
a memory leak. I have generated a small program which does the basics
of my program that leaks, and the test program also seems to leak. Can
anyone spot the design flaw?

Any help you can provide is greatly appreciated.

Thanks,
Keith

=================================
class LeakTest
  def initialize
    print("initialized\n")
    # fill up the standard 8M of space so the leak becomes apparent faster
    str = "0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 "
    @huge_mem = Array.new(1000000, str)
  end

  def get_files(pathname)
    listing = Array.new

    Dir.glob("#{pathname}/*",0) do |f|
      # add the files
      listing << f
      # now add directories.
      if(File.directory?(f))
        returnlist = get_files(f)
        returnlist.each {|filename| listing << filename }
      end
    end
    return listing
  end

  include GC
  def cleanup
    printf("cleaning up")
    GC.start
    5.times do
      print"."
      sleep(1)
    end
    print"\n"
  end

  def monitor
    listing = get_files("C:/Program Files")
    printf("Found %d files\n", listing.length)
    5.times do
      print"."
      sleep(1)
    end
    print"\n"
  end
end

lt = LeakTest.new
while true
  lt.monitor
  lt.cleanup
end

--
Posted via http://www.ruby-forum.com/.

Keith Barr wrote:

    str = "0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 "
    @huge_mem = Array.new(1000000, str)

That's an array of 1000000 references to the same string. Did you want lots of strings? This is the way to do that:

Array.new(1000000) {str}

------------------------------------------------------------- Array::new
      Array.new(size=0, obj=nil)
      Array.new(array)
      Array.new(size) {|index| block }

------------------------------------------------------------------------
      Returns a new array. In the first form, the new array is empty. In
      the second it is created with _size_ copies of _obj_ (that is,
      _size_ references to the same _obj_).

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Keith Barr wrote:

I am fairly new to Ruby and a program that I have created seems to have
a memory leak. I have generated a small program which does the basics
of my program that leaks, and the test program also seems to leak. Can
anyone spot the design flaw?

Any help you can provide is greatly appreciated.

Does not seem to leak with ruby-1.8.6p111 on Linux. (I replaced "C:/Program Files" with the mount point of my NTFS partition.) It stays under 18 MB of VM.

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Dunno about the memory leak, but are you aware that in get_files you only go one level of sub-directory deep, rather than down the whole depth of the directory tree? E.g. (with apologies to the excellent Xwin32 product):

if I have:
c:\program files\
    +--> Starnet
              +---->xwin32
                  +-----> xwin32.exe

I'll never see xwin32.exe in the list of files?
If you want full depth my suggestion is that you
recursively call get_files (when it encounters a directory)
and then append the results it returns to the original result.

Ron.
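
For anyone checking this point: the get_files in the original post does call itself for each directory it finds (the `returnlist = get_files(f)` line), so it should already walk the whole tree. Here is a minimal sketch to confirm that against a throwaway temp directory. The Starnet/xwin32 names are only borrowed from the example above, and `concat` is my substitution for the each/<< loop.

```ruby
require "tmpdir"
require "fileutils"

# Same recursive shape as the get_files in the original post.
def get_files(pathname)
  listing = []
  Dir.glob("#{pathname}/*") do |f|
    listing << f
    # descend into directories and append whatever they contain
    listing.concat(get_files(f)) if File.directory?(f)
  end
  listing
end

# Build a three-level tree in a temp dir (names are made up for the check).
found = Dir.mktmpdir do |root|
  deep = File.join(root, "Starnet", "xwin32")
  FileUtils.mkdir_p(deep)
  FileUtils.touch(File.join(deep, "xwin32.exe"))
  get_files(root).map { |f| f.sub("#{root}/", "") }.sort
end

p found
# => ["Starnet", "Starnet/xwin32", "Starnet/xwin32/xwin32.exe"]
```

So xwin32.exe does show up, three levels down.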

Joel VanderWerf wrote:

Keith Barr wrote:

    str = "0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 "
    @huge_mem = Array.new(1000000, str)

That's an array of 1000000 references to the same string. Did you want
lots of strings? This is the way to do that:

Array.new(1000000) {str}

------------------------------------------------------------- Array::new
      Array.new(size=0, obj=nil)
      Array.new(array)
      Array.new(size) {|index| block }

oops, yeah, good catch.

Joel VanderWerf wrote:

Keith Barr wrote:

I am fairly new to Ruby and a program that I have created seems to have
a memory leak. I have generated a small program which does the basics
of my program that leaks, and the test program also seems to leak. Can
anyone spot the design flaw?

Any help you can provide is greatly appreciated.

Does not seem to leak with ruby-1.8.6p111 on linux. (I replaced
"C:/Program Files" with the mount point of my ntfs partition.) It stays
under 18Mb of VM.

Interesting. I am running (obviously) on Windows and I also have a
Linux box, both on patch 111, and both leaking. It isn't a quick leak;
if you run it for a while it might show up. How long did you let it go?

If you don't ever see anything leaking, I wonder what could be different
in our setup or builds.

I certainly don't see anything in there that looks like a memory leak,
but it certainly grows over time on my boxes. Very difficult problem.

KB

Array.new(1000000) {str}

Isn't the above still just an array of 1000000 refs to the same string? Perhaps you meant str.dup?

For example:

str = "asdf"
puts Array.new(3){str}.collect{|s|s.object_id}

-605717778
-605717778
-605717778
# the above three object_ids are the same

puts Array.new(3){str.dup}.collect{|s|s.object_id}

-605741458
-605741468
-605741478
# the above three object_ids are different

-----Original Message-----
From: Keith Barr
Sent: 02/21/2008 02:16 PM

Joel VanderWerf wrote:

Keith Barr wrote:

    str = "0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 0123456789 "
    @huge_mem = Array.new(1000000, str)

That's an array of 1000000 references to the same string. Did you want
lots of strings? This is the way to do that:

Array.new(1000000) {str}

------------------------------------------------------------- Array::new
      Array.new(size=0, obj=nil)
      Array.new(array)
      Array.new(size) {|index| block }

oops, yeah, good catch.

Keith Barr wrote:

Interesting. I am running (obviously) on Windows and I also have a Linux box, both on patch 111, and both leaking. It isn't a quick leak; if you run it for a while it might show up. How long did you let it go?

Only 10 cycles, but it didn't seem to be growing...

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

John Woods wrote:

>> Array.new(1000000) {str}

Isn't the above still just an array of 1000000 refs to the same string? Perhaps you meant str.dup?

Oops, you're right. Using #dup or the following will work:

a = Array.new(5) {"str"}
p a.map {|s| s.object_id}.uniq.size # => 5
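
To put the three forms discussed in this thread side by side, here is a small sketch. The helper name `distinct` is made up for the illustration, and the mutation at the end is only there to show why sharing one object can matter.

```ruby
str = String.new("0123456789")

shared  = Array.new(3, str)         # three references to one String
blocked = Array.new(3) { str }      # the block returns the same object each time
copies  = Array.new(3) { str.dup }  # a fresh String per slot

# Number of distinct objects held by an array.
def distinct(arr)
  arr.map { |s| s.object_id }.uniq.size
end

p distinct(shared)   # => 1
p distinct(blocked)  # => 1
p distinct(copies)   # => 3

copies[0] << "!"     # only the first copy changes
shared[0] << "!"     # changes str itself, so every slot pointing at it sees the change

p copies[1]          # => "0123456789"
p shared[2]          # => "0123456789!"
p blocked[2]         # => "0123456789!"
```

For the original goal of filling memory, only the `dup` form (or a string literal inside the block) allocates one string per element.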

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Joel VanderWerf wrote:

Keith Barr wrote:

Interesting. I am running (obviously) on Windows and I also have a Linux box, both on patch 111, and both leaking. It isn't a quick leak; if you run it for a while it might show up. How long did you let it go?

Only 10 cycles, but it didn't seem to be growing...

After 150 cycles, still under 18M. (This is the same code as yours with the one dir path modification.)

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

I have run it under Windows XP; the code and results can be found at
http://bryvecer.sk/leak/
It seems that it might slowly leak something, but I don't know for sure.

I have used pslist from sysinternals.com for memory measurements.
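
pslist is a Windows tool. On a Linux box, a rough way to take similar readings from inside the script is to read the resident set size from /proc. This is only a sketch of that idea, not what was used for the measurements above; the method name `rss_kb` is made up, and it returns nil on systems without /proc.

```ruby
# Linux-only sketch: read this process's resident set size from /proc.
# Returns kilobytes as an Integer, or nil where /proc is not available.
def rss_kb
  status = "/proc/self/status"
  return nil unless File.readable?(status)
  line = File.foreach(status).find { |l| l.start_with?("VmRSS:") }
  line && line[/\d+/].to_i
end

puts "RSS: #{rss_kb.inspect} kB"
```

Printing this once per monitor/cleanup cycle gives a series that can be compared across runs and Ruby patch levels.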

On Fri, Feb 22, 2008 at 7:32 AM, Joel VanderWerf <vjoel@path.berkeley.edu> wrote:

Joel VanderWerf wrote:
> Keith Barr wrote:
>> Interesting. I am running (obviously) on Windows and I also have a
>> Linux box, both on patch 111, and both leaking. It isn't a quick
>> leak; if you run it for a while it might show up. How long did you
>> let it go?
>
> Only 10 cycles, but it didn't seem to be growing...

After 150 cycles, still under 18M. (This is the same code as yours with
the one dir path modification.)

I have run Ruby MemoryValidator by SoftwareVerify on your code:

This version doesn't leak:

#DATA_PATH = "C:/Program Files" # 80_000 files
DATA_PATH = "C:/Program Files/Microsoft Visual Studio 8" # 19_000 files

class LeakTest
  def get_files(pathname)
    listing = Array.new

    Dir.glob("#{pathname}/*", 0) do |f|
      # add the files
      listing << f
      # now add directories.
      if File.directory?(f)
        returnlist = get_files(f)
        returnlist.each { |filename| listing << filename }
      end
    end
    return listing
  end

  include GC

  def cleanup
    GC.start
    sleep(5)
  end

  def monitor
    listing = get_files(DATA_PATH)
    sleep(5)
  end
end

lt = LeakTest.new
while true
  lt.monitor
  lt.cleanup
end

This is my version, which seems to use fewer objects. I'm (1) using an
accumulator, and (2) leaving the arrays as they are until they are all
ready and then flattening them, so the GC does its work once for all
objects. It's possible that 1 without 2 is better.

#DATA_PATH = "C:/Program Files" # 80_000 files
DATA_PATH = "C:/Program Files/Microsoft Visual Studio 8" # 19_000 files

class LeakTest
  def get_files(listing, pathname)
    files = Dir.glob("#{pathname}/*", 0)
    # add the files
    listing << files
    # now add directories.
    files.each do |f|
      get_files(listing, f) if File.directory?(f)
    end
    return listing
  end

  include GC

  def cleanup
    GC.start
    sleep(5)
  end

  def monitor
    # accumulator shared by all recursive calls
    listing = []
    get_files(listing, DATA_PATH)
    listing.flatten!
    sleep(5)
  end
end

lt = LeakTest.new
while true
  lt.monitor
  lt.cleanup
end

It seems that the problem is in the little debug messages, which are
hard to collect.
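
The "1 without 2" variant mentioned above might look roughly like this: every level appends straight into the same accumulator, so nothing needs flattening afterwards. It is an untested-for-memory sketch of the idea, checked only for correctness against a throwaway temp directory with made-up file names.

```ruby
require "tmpdir"
require "fileutils"

# "1 without 2": every level appends to the same array, so there are
# no nested arrays to flatten afterwards.
def get_files(listing, pathname)
  Dir.glob("#{pathname}/*") do |f|
    listing << f
    get_files(listing, f) if File.directory?(f)
  end
  listing
end

# Throwaway tree for a quick check (directory and file names are made up).
count = Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, "a", "b"))
  FileUtils.touch(File.join(root, "a", "b", "deep.txt"))
  FileUtils.touch(File.join(root, "top.txt"))
  get_files([], root).length
end

puts count  # => 4
```

Whether this actually allocates less than the flatten! version would have to be measured.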

On Fri, Feb 22, 2008 at 11:35 AM, Jano Svitok <jan.svitok@gmail.com> wrote:

On Fri, Feb 22, 2008 at 7:32 AM, Joel VanderWerf <vjoel@path.berkeley.edu> wrote:

> Joel VanderWerf wrote:
> > Keith Barr wrote:
> >> Interesting. I am running (obviously) on Windows and I also have a
> >> Linux box, both on patch 111, and both leaking. It isn't a quick
> >> leak; if you run it for a while it might show up. How long did you
> >> let it go?
> >
> > Only 10 cycles, but it didn't seem to be growing...
>
> After 150 cycles, still under 18M. (This is the same code as yours with
> the one dir path modification.)

I have run it under winxp, the code and results can be found at
http://bryvecer.sk/leak/
It seems that it might slowly leak something, but I don't know for sure.

I have used pslist from sysinternals.com for memory measurements.

I meant: there's no leak in ruby land - i.e. the object count stays
the same. However, there might be a leak in C land.
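
For those without MemoryValidator, a crude way to watch the Ruby-side object count is ObjectSpace. This is only a sketch and not the tool used above; `live_string_count` and `one_cycle` are made-up names, and `one_cycle` is a stand-in for one monitor pass.

```ruby
# Count live String objects after forcing a collection.
def live_string_count
  GC.start
  # With a block, each_object returns the number of objects it visited.
  ObjectSpace.each_object(String) { }
end

# Hypothetical stand-in for one monitor cycle: allocate garbage, keep nothing.
def one_cycle
  listing = Array.new(10_000) { |i| "file#{i}" }
  listing.length
end

counts = Array.new(5) do
  one_cycle
  live_string_count
end

puts counts.inspect
```

If the numbers level off from cycle to cycle while the process size keeps climbing, that points away from Ruby objects and toward memory held on the C side.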

On Fri, Feb 22, 2008 at 12:38 PM, Jano Svitok <jan.svitok@gmail.com> wrote:

This version doesn't leak:

Jano Svitok wrote:

On Fri, Feb 22, 2008 at 12:38 PM, Jano Svitok <jan.svitok@gmail.com> wrote:

This version doesn't leak:

I meant: there's no leak in ruby land - i.e. the object count stays
the same. However, there might be a leak in C land.

Thanks for everyone's ideas. After further testing, it seems like the
move from p110 to p111 made a difference.

I appreciate the help.

Keith