[ANN] Mongrel 0.3.13.4 Pre-Release -- Ruby's LEAK Fixed (Death To Mutex!)

<snip>

me too :wink: i took a few pages from your book and included some of the other
suggestions before re-running the tests.

before i dump all the output i'll summarize:

   - the script new counts threads in objectspace before/after thread
     creation/deletion

   - time to create/kill threads under mutex/sync is tracked

   - memory usage is tracked via pct_mem, vsz, and rss

   - no gc voodoo, gc is run only once :wink:

   - no waiting on Thread.list. it's assumed the kill works.

with those mods my results are largely the same: mutex shows no leak at all
and sync show a gradual trend up. with respect to sync, however, the trend is
from a small amount of initial memory and it may well be the case that it
eventually gets cleared so i am not asserting a leak in sync at this point -
i'm doing a long run to see which is the case. one strange thing, while it's
the sync code that shows a gradual trend up - it also seems to result in
better cycling of threads through object space - which is to say all threads
seem to actually get killed instantly with sync while with mutex some stick
around (but are eventually cleared). this may well be a timing issue as kirk
suggested. in both cases there is not trend up - just different cycles.

one important note - while making my mods i manged to make both mutex/sync
leak by accidentally holding a reference to the created threads. of course
it's not either lib that was leakingbut my own damn code! it's still my
feeling that any perceived leak is due to programmer error but, as i said, the
verdict is still out on sync for me...

here's the code and output from 10 iterations using 420 threads under both
mutex and sync:

harp:~ > cat a.rb
require 'thread'
require 'sync'

class TestThreads
   def initialize which, n, iter
     c = case which
       when /mutex/io
         Mutex
       when /sync/io
         Sync
     end
     @guard = c.new
     @n = Integer n
     @iter = Integer iter
     puts "using: #{ c.name }"
     puts "n: #{ @n }"
     puts "iter: #{ @iter }"
   end

   def memory_usage
     stdout = `ps u #{ Process.pid }`
     last_line = stdout.split(%r/\n/).last.strip
     fields = last_line.split(%r/\s+/)
     v = fields[3,3]
     suffixes = %w( % b b )
     prefixes = %w( pct_mem vsz rss )
     prefixes.map{|k| "#{ k }: #{ v.shift }#{ suffixes.shift }"}.join ', '
   end

   def gc_start!
     GC.start
   end

   def n_threads_in_object_space
     (t = 0 and ObjectSpace.each_object{|o| t += 1 if Thread === o} and t)
   end

   def tq
     q = Queue.new
     t = Thread.new{
       mb = @guard.synchronize{ 0.chr * (2 ** 20) }
       q.push :ready
       Thread.stop
     }
     [t, q]
   end

   def elapsed
     a = Time.now
     yield
     b = Time.now
     b.to_f - a.to_f
   end

   def run
     list =
     @iter.times do |i|
       puts "---"
       puts "iter: #{ i }"

       e = elapsed{
         # load 1000 threads up
         @n.times{ list << tq }

         # wait for all threads to init memory with mb of data
         list.each{|t,q| q.pop}
       }

       # show memory usage
       gc_start!
       puts "threads loaded in #{ '%4.4f' % e }."
       puts "memory_usage : #{ memory_usage }"
       puts "n_threads_in_object_space : #{ n_threads_in_object_space }"

       # kill all threads
       list.each{|t,q| t.kill}.clear

       # show memory usage
       gc_start!
       puts "threads killed."
       puts "memory_usage : #{ memory_usage }"
       puts "n_threads_in_object_space : #{ n_threads_in_object_space }"
     end
   end
end

$VERBOSE = nil
STDOUT.sync = true
Thread.abort_on_exception = true
trap('INT'){ exit }

which, n, iter, ignored = ARGV
which ||= 'mutex'
n ||= 42
iter ||= 4242
TestThreads.new(which, n, iter).run

harp:~ > ruby a.rb mutex 420 10
using: Mutex
n: 420
iter: 10

···

On Sun, 27 Aug 2006 khaines@enigo.com wrote:

On Sun, 27 Aug 2006 ara.t.howard@noaa.gov wrote:

I have been playing with this all morning.

---
iter: 0
threads loaded in 24.7623.
memory_usage : pct_mem: 44.3%, vsz: 457056b, rss: 455528b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 13.3%, vsz: 138208b, rss: 136720b
n_threads_in_object_space : 121
---
iter: 1
threads loaded in 24.8212.
memory_usage : pct_mem: 44.4%, vsz: 458048b, rss: 456464b
n_threads_in_object_space : 422
threads killed.
memory_usage : pct_mem: 13.5%, vsz: 140616b, rss: 139128b
n_threads_in_object_space : 123
---
iter: 2
threads loaded in 24.6988.
memory_usage : pct_mem: 44.3%, vsz: 457848b, rss: 455268b
n_threads_in_object_space : 422
threads killed.
memory_usage : pct_mem: 12.9%, vsz: 134452b, rss: 132664b
n_threads_in_object_space : 117
---
iter: 3
threads loaded in 24.9785.
memory_usage : pct_mem: 44.1%, vsz: 457124b, rss: 453824b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 13.3%, vsz: 139572b, rss: 136768b
n_threads_in_object_space : 122
---
iter: 4
threads loaded in 24.5234.
memory_usage : pct_mem: 44.2%, vsz: 457020b, rss: 454120b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 13.5%, vsz: 141616b, rss: 138772b
n_threads_in_object_space : 124
---
iter: 5
threads loaded in 24.4355.
memory_usage : pct_mem: 44.3%, vsz: 457888b, rss: 455536b
n_threads_in_object_space : 422
threads killed.
memory_usage : pct_mem: 13.3%, vsz: 138532b, rss: 136856b
n_threads_in_object_space : 121
---
iter: 6
threads loaded in 24.3171.
memory_usage : pct_mem: 44.3%, vsz: 458184b, rss: 455076b
n_threads_in_object_space : 422
threads killed.
memory_usage : pct_mem: 13.3%, vsz: 140256b, rss: 137596b
n_threads_in_object_space : 123
---
iter: 7
threads loaded in 24.1998.
memory_usage : pct_mem: 44.3%, vsz: 458028b, rss: 455164b
n_threads_in_object_space : 422
threads killed.
memory_usage : pct_mem: 13.2%, vsz: 139228b, rss: 136468b
n_threads_in_object_space : 122
---
iter: 8
threads loaded in 24.2478.
memory_usage : pct_mem: 44.2%, vsz: 457156b, rss: 453976b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 13.3%, vsz: 139228b, rss: 137052b
n_threads_in_object_space : 122
---
iter: 9
threads loaded in 24.1954.
memory_usage : pct_mem: 44.3%, vsz: 458084b, rss: 455028b
n_threads_in_object_space : 422
threads killed.
memory_usage : pct_mem: 13.8%, vsz: 144472b, rss: 141732b
n_threads_in_object_space : 126

harp:~ > ruby a.rb sync 420 10
using: Sync
n: 420
iter: 10
---
iter: 0
threads loaded in 26.5464.
memory_usage : pct_mem: 36.5%, vsz: 454628b, rss: 375700b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 0.9%, vsz: 11328b, rss: 9276b
n_threads_in_object_space : 1
---
iter: 1
threads loaded in 25.9357.
memory_usage : pct_mem: 41.4%, vsz: 454524b, rss: 425824b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 1.9%, vsz: 23956b, rss: 20180b
n_threads_in_object_space : 1
---
iter: 2
threads loaded in 25.9797.
memory_usage : pct_mem: 43.3%, vsz: 454600b, rss: 445488b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 2.3%, vsz: 31544b, rss: 24364b
n_threads_in_object_space : 1
---
iter: 3
threads loaded in 24.9242.
memory_usage : pct_mem: 43.8%, vsz: 454316b, rss: 450620b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 3.5%, vsz: 39480b, rss: 35952b
n_threads_in_object_space : 1
---
iter: 4
threads loaded in 24.3832.
memory_usage : pct_mem: 43.9%, vsz: 454396b, rss: 451548b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 4.8%, vsz: 52036b, rss: 49352b
n_threads_in_object_space : 1
---
iter: 5
threads loaded in 24.1944.
memory_usage : pct_mem: 43.9%, vsz: 454596b, rss: 451692b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 5.9%, vsz: 63872b, rss: 61260b
n_threads_in_object_space : 1
---
iter: 6
threads loaded in 24.1334.
memory_usage : pct_mem: 43.9%, vsz: 454388b, rss: 451068b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 6.8%, vsz: 73560b, rss: 70336b
n_threads_in_object_space : 1
---
iter: 7
threads loaded in 24.0867.
memory_usage : pct_mem: 43.8%, vsz: 454252b, rss: 450204b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 7.5%, vsz: 81920b, rss: 78000b
n_threads_in_object_space : 1
---
iter: 8
threads loaded in 24.0251.
memory_usage : pct_mem: 43.8%, vsz: 454356b, rss: 450000b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 8.6%, vsz: 92836b, rss: 88532b
n_threads_in_object_space : 1
---
iter: 9
threads loaded in 24.2115.
memory_usage : pct_mem: 43.9%, vsz: 454260b, rss: 451804b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 9.8%, vsz: 103864b, rss: 101528b
n_threads_in_object_space : 1

kind regards.

-a
--
to foster inner awareness, introspection, and reasoning is more efficient than
meditation and prayer.
- h.h. the 14th dalai lama

Zed Shaw wrote:

For the software engineering philosophers on the list, what's the
difference between a language that forces the engineer to explicitly
manage dynamic memory allocation and de-allocation, and one that
supposedly relieves the engineer from that need -- until you crash in a
production system that worked on smaller test cases a couple of months
ago? :slight_smile:

This is exactly the problem I'm complaining about. It's not that Ruby's
GC isn't collecting the ram fast enough or anything like that. If Ruby
runs out or I call GC.start then dammit clear the dead objects. It's
not a suggestion, it's a command. Collect the RAM. Don't do it lazy
until my OS or process crashes.

And when a lazy garbage collector meets a lazy OS memory manager, all
heck breaks loose. Three syncs, a garbage collect, repeat thrice? This
*is* the 21st century, right? Do the &$%^# I/O, dagnabbit!! :slight_smile:

So, for me, despite everyone talking about timeouts and stuff (and I
know people are just trying to find out why Ruby does this) it's all
useless since I still have a program crashing and I'm now looking at a
total redesign to work around a GC.

And yet, all I really need is a single option I can pass to ruby that
says "never go above X" instead of this psuedo-leak-until-crash crap.

If I may use the J word ... the Java run-time has stuff like that. Set
your RAM size too small and the garbage collector is about all that
runs. Set it too large, and the JRE and its workload gets paged or
swapped out.

So you buy more RAM. Guess what? Some Linux kernels perform *worse* the
more RAM they need to manage! Tables to manage high memory (> 1 GB) are
kept in low memory (< 1 GB). It doesn't take much for the tables to fill
up low memory. So you buy 64-bit processors. :slight_smile:

But Windows has similar problems -- it's not just Linux. I don't know
about BSD or Mac OS X. As the marine biologist said, "Bah Humpback!"

5. Can somebody run these Ruby leak tests/demos on a Windows XP or 2003
Server with multiple processors? I'm really curious what happens.

Luis Lavena ran our tests on Win32 and found that Mutex works but Sync
doesn't, so I'm back to square one. I'm probably going to do a fast
patch so that people can just pick either and see if the leak goes away.

Yeah, I figured Windows was gonna fail one way or another. And it
probably varies depending on whether you use MinGW or Visual Stupido to
compile Ruby <weg>

Well, enough trolling for one night ... I'm going to see if I can write
a web server in GForth in a 16-bit address space.

<ducking>

···

On Sun, 2006-08-27 at 12:04 +0900, M. Edward (Ed) Borasky wrote:

Zed, this might be really bad or no problem at all. Is there a safe time for you to clobber the Mutex (or Sync I'd think)?

Cheers,
Bob

···

On Aug 27, 2006, at 9:53 AM, Bob Hutchison wrote:

The Mutex was somehow holding the Thread object in memory and that was holding other stuff.

----
Bob Hutchison -- blogs at <http://www.recursive.ca/hutch/&gt;
Recursive Design Inc. -- <http://www.recursive.ca/&gt;
Raconteur -- <http://www.raconteur.info/&gt;
xampl for Ruby -- <http://rubyforge.org/projects/xampl/&gt;

after a few hours i've finally seen the sync code cycle down after growing
slowly for a looooong time, so it does not appear to have a leak either.

except from output:

···

On Mon, 28 Aug 2006 Ara.T.Howard@noaa.gov wrote:

with those mods my results are largely the same: mutex shows no leak at all
and sync show a gradual trend up. with respect to sync, however, the trend
is from a small amount of initial memory and it may well be the case that it
eventually gets cleared so i am not asserting a leak in sync at this point -
i'm doing a long run to see which is the case.

---
iter: 449
threads loaded in 23.4420.
memory_usage : pct_mem: 44.0%, vsz: 453944b, rss: 452456b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 22.3%, vsz: 230740b, rss: 229380b
n_threads_in_object_space : 1
---
iter: 450
threads loaded in 23.2819.
memory_usage : pct_mem: 44.0%, vsz: 453776b, rss: 452372b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 22.3%, vsz: 230740b, rss: 229380b
n_threads_in_object_space : 1
---
iter: 451
threads loaded in 23.4827.
memory_usage : pct_mem: 44.0%, vsz: 453776b, rss: 452416b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 8.0%, vsz: 83740b, rss: 82380b
n_threads_in_object_space : 1
---
iter: 452
threads loaded in 23.3958.
memory_usage : pct_mem: 44.1%, vsz: 454336b, rss: 452932b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 8.5%, vsz: 89336b, rss: 87976b
n_threads_in_object_space : 1
---
iter: 453
threads loaded in 23.4515.
memory_usage : pct_mem: 44.0%, vsz: 454300b, rss: 452856b
n_threads_in_object_space : 421
threads killed.
memory_usage : pct_mem: 9.4%, vsz: 98472b, rss: 97116b
n_threads_in_object_space : 1

cheers.

-a
--
to foster inner awareness, introspection, and reasoning is more efficient than
meditation and prayer.
- h.h. the 14th dalai lama