Mem leak without add_heap()?

Okay. I did a little simple hacking on a Ruby 1.8.4 build so that it writes debugging info to stdout for various GC- and memory-related actions.

Using the info I gathered from that, I tuned a real application of mine that consistently leaks a small amount of RAM despite having no object leaks. I verified the absence of object leaks by setting up a signal handler that dumps complete object counts at any time. RAM usage grows slowly, but fairly deterministically: for a given number of units of work that I ask of the code, it increases by a predictable amount and never seems to go back down, even across very long runtimes (I recently killed and restarted some processes that had been running since sometime in 2005), yet the object counts stay the same. So that memory use is coming from somewhere else.
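A minimal sketch of that kind of signal-triggered object census, for anyone who wants to try the same check. The choice of USR1, the helper names object_counts and dump_object_counts, the top-20 cutoff, and writing to $stderr are all assumptions made for illustration; the original application's handler is not shown in this thread.

```ruby
# Count live objects per class by walking the heap with ObjectSpace.
# Restricting the walk to Object keeps every yielded value safe to call
# #class on.
def object_counts
  counts = Hash.new(0)
  ObjectSpace.each_object(Object) { |obj| counts[obj.class] += 1 }
  counts
end

# Print the 20 most numerous classes, largest count first (hypothetical
# output format).
def dump_object_counts(io = $stderr)
  object_counts.sort_by { |_klass, n| -n }.first(20).each do |klass, n|
    io.puts "#{n}\t#{klass}"
  end
end

# Assumption: USR1 is free for this purpose and the platform supports it.
if Signal.list.key?("USR1")
  Signal.trap("USR1") { dump_object_counts }
end
```

With something like this installed, sending the process USR1 (for example with kill -USR1 and its pid) at two different points in a long run gives two censuses to compare. If the per-class counts match while the process size has grown, the growth is not coming from Ruby objects piling up.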

Anyway, by manually calling GC.start() at a modest interval within the application, I can prevent rb_newobj() from ever encountering an empty freelist and having to call garbage_collect() itself. By doing so, the freed count stays very consistent and well above FREE_MIN, so add_heap() is never invoked there. I also put debugging output on each of the other two locations where add_heap() can be called. It never is.
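The periodic collection described above can be sketched as follows. The class name PeriodicCollector, the interval of 100 units, and the throwaway array standing in for a unit of work are hypothetical; the thread only says the interval is "modest".

```ruby
# Runs a full collection every `interval` units of work, so the
# interpreter is less likely to hit an empty freelist on its own.
class PeriodicCollector
  attr_reader :collections

  def initialize(interval)
    @interval = interval
    @units_done = 0
    @collections = 0
  end

  # Call once after each unit of work.
  def tick
    @units_done += 1
    if @units_done % @interval == 0
      GC.start
      @collections += 1
    end
  end
end

collector = PeriodicCollector.new(100) # assumed interval
250.times do
  Array.new(1000) { |i| "item #{i}" }  # stand-in for a real unit of work
  collector.tick
end
puts collector.collections             # prints 2
```

The right interval depends on how many objects one unit of work allocates; the goal is simply to collect before allocation exhausts the freelist.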

However, despite this, RAM usage of the process continues to creep upward.
So. Why? A leak somewhere else, in some usage of ruby_xmalloc/ruby_xcalloc/ruby_xrealloc?

Thanks,

Kirk Haines

I've been using valgrind on my leaking code, and so far:

==31968== 21,364 bytes in 871 blocks are possibly lost in loss record 23 of 38
==31968== at 0x401A6C2: malloc (vg_replace_malloc.c:149)
==31968== by 0x806A2B5: ruby_xmalloc (gc.c:122)
==31968== by 0x805F41D: scope_dup (eval.c:7971)
==31968== by 0x805FB10: proc_alloc (eval.c:8254)
==31968== by 0x805FBF4: proc_s_new (eval.c:8289)
==31968== by 0x8065F86: call_cfunc (eval.c:5550)
==31968== by 0x805B65E: rb_call0 (eval.c:5692)
==31968== by 0x805BECC: rb_call (eval.c:5920)
==31968== by 0x80575FD: rb_eval (eval.c:3383)
==31968== by 0x8056B21: rb_eval (eval.c:3109)
==31968== by 0x8056FC9: rb_eval (eval.c:3551)
==31968== by 0x805B987: rb_call0 (eval.c:5826)
==31968== by 0x805BECC: rb_call (eval.c:5920)
==31968== by 0x80575FD: rb_eval (eval.c:3383)
==31968== by 0x805B987: rb_call0 (eval.c:5826)
==31968== by 0x805BECC: rb_call (eval.c:5920)
==31968== by 0x805C068: rb_f_send (ruby.h:638)
==31968== by 0x8065F86: call_cfunc (eval.c:5550)
==31968== by 0x805B65E: rb_call0 (eval.c:5692)
==31968== by 0x805BECC: rb_call (eval.c:5920)
==31968== by 0x80575FD: rb_eval (eval.c:3383)
==31968== by 0x8059EA4: rb_yield_0 (eval.c:4897)
==31968== by 0x805A421: rb_yield (eval.c:4979)
==31968== by 0x80B3686: rb_ary_each (array.c:1128)
==31968==
==31968== 53,832 (14,496 direct, 39,336 indirect) bytes in 906 blocks are definitely lost in loss record 26 of 38
==31968== at 0x401A6C2: malloc (vg_replace_malloc.c:149)
==31968== by 0x80A4179: st_init_table_with_size (st.c:154)
==31968== by 0x80A41B3: st_init_table (st.c:167)
==31968== by 0x806C96F: hash_alloc (hash.c:235)
==31968== by 0x806CABF: rb_hash_s_create (hash.c:328)
==31968== by 0x8065F86: call_cfunc (eval.c:5550)
==31968== by 0x805B65E: rb_call0 (eval.c:5692)
==31968== by 0x805BECC: rb_call (eval.c:5920)
==31968== by 0x80575FD: rb_eval (eval.c:3383)
==31968== by 0x8056FC9: rb_eval (eval.c:3551)
==31968== by 0x8055FAC: rb_eval (eval.c:2851)
==31968== by 0x805B987: rb_call0 (eval.c:5826)
==31968== by 0x805BECC: rb_call (eval.c:5920)
==31968== by 0x80575FD: rb_eval (eval.c:3383)
==31968== by 0x805769E: rb_eval (ruby.h:643)
==31968== by 0x805B987: rb_call0 (eval.c:5826)
==31968== by 0x805BECC: rb_call (eval.c:5920)
==31968== by 0x805C068: rb_f_send (ruby.h:638)
==31968== by 0x8065F86: call_cfunc (eval.c:5550)
==31968== by 0x805B65E: rb_call0 (eval.c:5692)
==31968== by 0x805BECC: rb_call (eval.c:5920)
==31968== by 0x80575FD: rb_eval (eval.c:3383)
==31968== by 0x8059EA4: rb_yield_0 (eval.c:4897)
==31968== by 0x805A421: rb_yield (eval.c:4979)

So memory really is leaking.
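When the full report has many loss records, totalling them by kind makes the picture easier to read. This is a rough sketch that assumes the summary-line wording seen in the two records above; the constant LOSS_RECORD and the method summarize_losses are hypothetical names, and the sample text is trimmed from the output quoted in this post.

```ruby
# Matches valgrind loss-record summary lines such as
#   ==31968== 21,364 bytes in 871 blocks are possibly lost in loss record 23 of 38
# including the "(N direct, M indirect)" variant.
LOSS_RECORD = /^==\d+==\s+([\d,]+)(?: \([^)]*\))? bytes in ([\d,]+) blocks are (definitely|possibly) lost/

# Returns a hash mapping "definitely" / "possibly" to total bytes lost.
def summarize_losses(report)
  totals = Hash.new(0)
  report.each_line do |line|
    if line =~ LOSS_RECORD
      totals[$3] += $1.delete(",").to_i
    end
  end
  totals
end

report = <<EOS
==31968== 21,364 bytes in 871 blocks are possibly lost in loss record 23 of 38
==31968== at 0x401A6C2: malloc (vg_replace_malloc.c:149)
==31968== 53,832 (14,496 direct, 39,336 indirect) bytes in 906 blocks are definitely lost in loss record 26 of 38
==31968== at 0x401A6C2: malloc (vg_replace_malloc.c:149)
EOS

totals = summarize_losses(report)
puts "possibly lost:   #{totals['possibly']} bytes"
puts "definitely lost: #{totals['definitely']} bytes"
```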

So far I haven't had any success writing a small test program that reproduces these results, but they reproduce reliably with my complex piece of code. If I extrapolate those lost-byte counts (which accumulated over about 300 units of work) out to the RAM usage growth that I see after 100000 units, it fits.
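The extrapolation is simple linear scaling, sketched below. It assumes that only the two loss records quoted above count toward the leak, that the "possibly lost" record is a real leak, and that loss grows in proportion to units of work; none of that is established by the report itself.

```ruby
# Byte counts taken from the two valgrind loss records quoted above.
possibly_lost   = 21_364
definitely_lost = 53_832

units_observed = 300      # "about 300 units of work"
units_target   = 100_000

# Assumption: the leak is linear in the number of units of work.
bytes_per_unit  = (possibly_lost + definitely_lost) / units_observed.to_f
projected_bytes = bytes_per_unit * units_target

printf("%.1f bytes per unit of work\n", bytes_per_unit)
printf("%.1f MiB projected after %d units\n",
       projected_bytes / (1024.0 * 1024.0), units_target)
```

Under those assumptions the two records work out to roughly 250 bytes per unit of work, or about 24 MiB after 100000 units.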

Kirk Haines


On Mon, 28 Aug 2006 khaines@enigo.com wrote:

Anyway, by manually calling GC.start() at a modest interval within the application, I can prevent rb_newobj() from ever encountering an empty freelist and having to call garbage_collect() itself. By doing so, the freed count stays very consistent and well above FREE_MIN, so add_heap() is never invoked there. I also put debugging output on each of the other two locations where add_heap() can be called. It never is.

However, despite this, RAM usage of the process continues to creep upward.
So. Why? A leak somewhere else, in some usage of ruby_xmalloc/ruby_xcalloc/ruby_xrealloc?