Force_recycle

(Kroeger Simon (ext)) #1

Hi Nobu,

I really appreciate your input, but that's not the kind of problem
I'm facing (and I probably should have elaborated on the issue
when asking such questions).
At one point in our project a large number of plain Ruby objects
get allocated (no relation to external resources) and stored in
an array (all of the same class). This happens from various threads
but in a (hopefully) thread-safe manner. Later this array gets
#clear(ed). After GC there are still instances of the class in the
ObjectSpace even though this special class is only used for
elements of this array. This process repeats itself over and
over again, and after some hours we have a memory leak of
hundreds of MB.

So, what I'm really looking for is a way to ask: 'hey almighty
shepherd of objects, why are you still clinging to xyz?'
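
The symptom described above can be made measurable by counting live instances of the class after a forced GC run. A minimal sketch, assuming a stand-in class `Item` and a helper `live_count` (both hypothetical names, not taken from the project):

```ruby
# Item stands in for the leaking class; live_count is a hypothetical helper.
class Item; end

# Force a GC run, then count the instances ObjectSpace still knows about.
# each_object with a block returns the number of objects it visited.
def live_count(klass)
  GC.start
  ObjectSpace.each_object(klass) { }
end

items = Array.new(1000) { Item.new }
puts "while referenced: #{live_count(Item)}"

items.clear
# Ruby's GC scans the machine stack conservatively, so a few instances may
# survive a little longer. A count that keeps growing over many cycles is
# the real warning sign.
puts "after clear:      #{live_count(Item)}"
```

The count after `clear` is not guaranteed to be zero, so it is more useful to log it repeatedly and watch the trend than to compare a single reading.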

Simon

-----Original Message-----
From: nobuyoshi nakada [mailto:nobuyoshi.nakada@ge.com]
Sent: Thursday, August 11, 2005 3:29 AM
To: ruby-talk ML
Subject: Re: force_recycle

Hi,

At Thu, 11 Aug 2005 00:16:10 +0900,
Kroeger Simon (ext) wrote in [ruby-talk:151514]:
> but that's ok. I have a rather large project here and it is eating
> memory. I thought it would be nice to delete the objects we think are
> obsolete and see where it goes down.

The problem is that you are misusing the GC to manage expensive
(or external) resources. Use an explicit release method, and use
blocks to ensure it gets called.
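
The release-method-plus-block idiom can be sketched as follows. `Resource` and its methods are hypothetical names chosen for illustration; the point is only that `ensure` runs the release step whether the block returns normally or raises.

```ruby
# Hypothetical resource with an explicit release method.
class Resource
  attr_reader :name

  def initialize(name)
    @name = name
    @closed = false
  end

  # Release the expensive or external resource here.
  def close
    @closed = true
  end

  def closed?
    @closed
  end

  # Block form: the resource is released when the block exits,
  # whether it returns normally or raises.
  def self.open(name)
    res = new(name)
    return res unless block_given?
    begin
      yield res
    ensure
      res.close
    end
  end
end

Resource.open("scratch") do |res|
  puts "using #{res.name}"
end
```

This mirrors the shape of `File.open` with a block, so the release no longer depends on when (or whether) the GC collects the object.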

<Ensuring post process>
http://www.rubyist.net/~matz/slides/oscon2005/mgp00047.html
http://www.rubyist.net/~matz/slides/oscon2005/mgp00048.html
http://www.rubyist.net/~matz/slides/oscon2005/mgp00049.html

--
Nobu Nakada

(Nakada, Nobuyoshi) #2

Hi,

At Thu, 11 Aug 2005 17:23:18 +0900,
Kroeger Simon (ext) wrote in [ruby-talk:151641]:

> At one point in our project a large number of plain Ruby objects
> get allocated (no relation to external resources) and stored in
> an array (all of the same class). This happens from various threads
> but in a (hopefully) thread-safe manner. Later this array gets
> #clear(ed). After GC there are still instances of the class in the
> ObjectSpace even though this special class is only used for
> elements of this array. This process repeats itself over and
> over again, and after some hours we have a memory leak of
> hundreds of MB.

Have all of those threads terminated and been freed?
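
A thread that has not finished keeps everything reachable from its block and local variables alive, which is what this question probes. A small sketch for checking it, where `lingering_threads` is a hypothetical helper name:

```ruby
# Threads other than the current one that are still alive. Anything their
# blocks or local variables reference stays reachable for the GC.
def lingering_threads
  Thread.list.select { |t| t != Thread.current && t.alive? }
end

worker = Thread.new do
  data = Array.new(1000) { Object.new } # kept alive while the thread lives
  sleep                                 # never finishes on its own
  data.size
end

puts "alive workers: #{lingering_threads.size}"

worker.kill
worker.join
puts "alive workers after join: #{lingering_threads.size}"
```

If the list is not empty when the array has been cleared, the objects may still be referenced from one of those threads.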

--
Nobu Nakada

(Joel VanderWerf) #3

Kroeger Simon (ext) wrote:

> Hi Nobu,
>
> I really appreciate your input, but that's not the kind of problem
> I'm facing (and I probably should have elaborated on the issue
> when asking such questions).
> At one point in our project a large number of plain Ruby objects
> get allocated (no relation to external resources) and stored in
> an array (all of the same class). This happens from various threads
> but in a (hopefully) thread-safe manner. Later this array gets
> #clear(ed). After GC there are still instances of the class in the
> ObjectSpace even though this special class is only used for
> elements of this array. This process repeats itself over and
> over again, and after some hours we have a memory leak of
> hundreds of MB.
>
> So, what I'm really looking for is a way to ask: 'hey almighty
> shepherd of objects, why are you still clinging to xyz?'

I patched the ruby interpreter to divulge this information, but that was
back at 1.6 and 1.7, so it would probably need some work for 1.8.

If you're interested, there's a mention of the patch at

http://www.rubygarden.org/ruby?GCAndMemoryManagement

and the patch itself is at

http://redshift.sourceforge.net/debugging-GC/

Briefly, the patch adds a method

GC.reachability_paths obj

which returns a list of all the ways you can reference that object,
starting from the basic references such as C and ruby globals, stack
variables, and others.
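
On an interpreter without this patch, part of the question can be approximated from Ruby itself. The sketch below is a different technique from the patch: it uses `ObjectSpace.reachable_objects_from` from the `objspace` standard library (available in later Ruby versions, not the 1.6 to 1.8 interpreters discussed here) to find direct referrers only, not full paths from the roots. `Holder` and `referrers_of` are hypothetical names.

```ruby
require 'objspace'

# Holder is a hypothetical container class used only for this example.
Holder = Struct.new(:item)

# Return the instances of klass that hold a direct reference to target.
def referrers_of(target, klass)
  ObjectSpace.each_object(klass).select do |candidate|
    refs = ObjectSpace.reachable_objects_from(candidate) || []
    refs.any? { |ref| ref.equal?(target) }
  end
end

target = Object.new
holder = Holder.new(target)
other  = Holder.new(Object.new)

referrers_of(target, Holder).each do |h|
  puts "referenced by #{h.inspect}"
end
```

Restricting the scan to one class keeps it cheap. Repeating the search on each referrer walks one step closer to a root, which is roughly what the patched method reports in a single call.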

--
      vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

(Nakada, Nobuyoshi) #4

Hi,

At Fri, 12 Aug 2005 04:25:35 +0900,
Joel VanderWerf wrote in [ruby-talk:151751]:

> Briefly, the patch adds a method
>
> GC.reachability_paths obj
>
> which returns a list of all the ways you can reference that object,
> starting from the basic references such as C and ruby globals, stack
> variables, and others.

Interesting.

A patch for CVS trunk.

Index: gc.c
===================================================================
RCS file: /cvs/ruby/src/ruby/gc.c,v
retrieving revision 1.206
diff -U2 -p -r1.206 gc.c
--- gc.c 12 Aug 2005 08:13:28 -0000 1.206
+++ gc.c 12 Aug 2005 10:17:15 -0000
@@ -92,5 +92,10 @@ static unsigned long malloc_limit = GC_M
static void run_final();
static VALUE nomem_error;
+#ifdef DEBUG_REACHABILITY
+static VALUE garbage_collect0 _((VALUE));
+#define garbage_collect() garbage_collect0(0)
+#else
static void garbage_collect();
+#endif

void
@@ -698,4 +703,25 @@ rb_gc_mark_maybe(obj)
#define GC_LEVEL_MAX 250

+#ifdef DEBUG_REACHABILITY
+VALUE rb_reach_test_obj = Qnil;
+VALUE rb_reach_test_result = Qnil;
+VALUE rb_reach_test_path = Qnil;
+
+static void
+rb_gc_unmark()
+{
+ RVALUE *p, *pend;
+ int i, used = heaps_used;
+
+ for (i = 0; i < used; i++) {
+ p = heaps[i].slot; pend = p + heaps[i].limit;
+ while (p < pend) {
+ RBASIC(p)->flags &= ~FL_MARK;
+ p++;
+ }
+ }
+}
+#endif
+
static void
gc_mark(ptr, lev)
@@ -704,8 +730,25 @@ gc_mark(ptr, lev)
{
     register RVALUE *obj;
+#ifdef DEBUG_REACHABILITY
+ long saved_len = 0;
+#endif

     obj = RANY(ptr);
     if (rb_special_const_p(ptr)) return; /* special const not marked */
     if (obj->as.basic.flags == 0) return; /* free cell */
+#ifdef DEBUG_REACHABILITY
+ if (!NIL_P(rb_reach_test_obj) &&
+ (obj->as.basic.flags & T_MASK) != T_NODE) {
+ saved_len = RARRAY(rb_reach_test_path)->len;
+ if ((VALUE)obj == rb_reach_test_obj) {
+ rb_warn(" ...found, after %ld steps!", saved_len);
+ rb_ary_push(rb_reach_test_result,
+ rb_ary_dup(rb_reach_test_path));
+ }
+ else if (!(obj->as.basic.flags & FL_MARK)) {
+ rb_ary_push(rb_reach_test_path, (VALUE)obj);
+ }
+ }
+#endif
     if (obj->as.basic.flags & FL_MARK) return; /* already marked */
     obj->as.basic.flags |= FL_MARK;
@@ -724,4 +767,9 @@ gc_mark(ptr, lev)
     }
     gc_mark_children(ptr, lev+1);
+#ifdef DEBUG_REACHABILITY
+ if (!NIL_P(rb_reach_test_path)) {
+ RARRAY(rb_reach_test_path)->len = saved_len;
+ }
+#endif
}

@@ -1287,7 +1335,17 @@ int rb_setjmp (rb_jmp_buf);
#endif /* __GNUC__ */

+#ifdef DEBUG_REACHABILITY
+static VALUE
+garbage_collect0(obj)
+ VALUE obj;
+#else
static void
garbage_collect()
+#endif
{
+#ifdef DEBUG_REACHABILITY
+ int i = 0;
+ VALUE result;
+#endif
     struct gc_list *list;
     struct FRAME * volatile frame; /* gcc 2.7.2.3 -O2 bug?? */
@@ -1300,29 +1358,73 @@ garbage_collect()
     }
#endif
- if (dont_gc || during_gc) {
- if (!freelist) {
- add_heap();
+#ifdef DEBUG_REACHABILITY
+#define IF_DEBUG_REACHABILITY(does) if (obj) {does;}
+ if (obj) {
+ if (!NIL_P(rb_reach_test_obj) ||
+ !NIL_P(rb_reach_test_result) ||
+ !NIL_P(rb_reach_test_path)) {
+ rb_raise(rb_eRuntimeError, "reachability_paths called recursively");
   }
- return;
+
+ rb_reach_test_obj = obj;
+ rb_reach_test_result = rb_ary_new();
+ rb_reach_test_path = rb_ary_new();
+ }
+ else
+#else
+#define IF_DEBUG_REACHABILITY(does)
+#endif
+ {
+ if (dont_gc || during_gc) {
+ if (!freelist) {
+ add_heap();
+ }
+#ifdef DEBUG_REACHABILITY
+ return 0;
+#else
+ return;
+#endif
+ }
+ during_gc++;
     }
- if (during_gc) return;
- during_gc++;

     init_mark_stack();

     /* mark frame stack */
+ IF_DEBUG_REACHABILITY(rb_warn("Checking frame stack..."));
     for (frame = ruby_frame; frame; frame = frame->prev) {
+ IF_DEBUG_REACHABILITY(
+ NODE *node = frame->node;
+ if (node) {
+ rb_ary_push(rb_reach_test_path,
+ rb_sprintf("frame %d: %s line %d", i, node->nd_file, nd_line(node)));
+ });
   rb_gc_mark_frame(frame);
+ IF_DEBUG_REACHABILITY((rb_ary_pop(rb_reach_test_path), i++));
   if (frame->tmp) {
       struct FRAME *tmp = frame->tmp;
+#ifdef DEBUG_REACHABILITY
+ int ti = 0;
+#endif
       while (tmp) {
+ IF_DEBUG_REACHABILITY(
+ NODE *node = tmp->node;
+ if (node) {
+ rb_ary_push(rb_reach_test_path,
+ rb_sprintf("tmp frame %d: %s line %d",
+ ti, node->nd_file, nd_line(node)));
+ });
     rb_gc_mark_frame(tmp);
+ IF_DEBUG_REACHABILITY((rb_ary_pop(rb_reach_test_path), ti++));
     tmp = tmp->prev;
       }
   }
     }
+ IF_DEBUG_REACHABILITY(rb_warn("Checking ruby_scope..."));
     gc_mark((VALUE)ruby_scope, 0);
+ IF_DEBUG_REACHABILITY(rb_warn("Checking ruby_dyna_vars..."));
     gc_mark((VALUE)ruby_dyna_vars, 0);
     if (finalizer_table) {
+ IF_DEBUG_REACHABILITY(rb_warn("Checking finalizer_table..."));
   mark_tbl(finalizer_table, 0);
     }
@@ -1331,5 +1433,7 @@ garbage_collect()
     /* This assumes that all registers are saved into the jmp_buf (and stack) */
     setjmp(save_regs_gc_mark);
+ IF_DEBUG_REACHABILITY(rb_warn("Checking save_regs_gc_mark..."));
     mark_locations_array((VALUE*)save_regs_gc_mark, sizeof(save_regs_gc_mark) / sizeof(VALUE *));
+ IF_DEBUG_REACHABILITY(rb_warn("Checking stack_start..."));
#if STACK_GROW_DIRECTION < 0
     rb_gc_mark_locations((VALUE*)STACK_END, rb_gc_stack_start);
@@ -1371,23 +1475,34 @@ garbage_collect()
        (VALUE*)((char*)rb_gc_stack_start + 2));
#endif
+ IF_DEBUG_REACHABILITY(rb_warn("Checking threads..."));
     rb_gc_mark_threads();

     /* mark protected global variables */
+ IF_DEBUG_REACHABILITY(rb_warn("Checking C globals..."));
     for (list = global_List; list; list = list->next) {
+ IF_DEBUG_REACHABILITY(rb_ary_push(rb_reach_test_path, rb_sprintf("C global %d", i)));
   rb_gc_mark_maybe(*list->varptr);
+ IF_DEBUG_REACHABILITY((rb_ary_pop(rb_reach_test_path), i++));
     }
+ IF_DEBUG_REACHABILITY(rb_warn("Checking end_proc..."));
     rb_mark_end_proc();
+ IF_DEBUG_REACHABILITY(rb_warn("Checking global_tbl..."));
     rb_gc_mark_global_tbl();

+ IF_DEBUG_REACHABILITY(rb_warn("Checking class_tbl..."));
     rb_mark_tbl(rb_class_tbl);
+ IF_DEBUG_REACHABILITY(rb_warn("Checking trap_list..."));
     rb_gc_mark_trap_list();

     /* mark generic instance variables for special constants */
+ IF_DEBUG_REACHABILITY(rb_warn("Checking generic_ivar_tbl..."));
     rb_mark_generic_ivar_tbl();

+ IF_DEBUG_REACHABILITY(rb_warn("Checking mark parser..."));
     rb_gc_mark_parser();

     /* gc_mark objects whose marking are not completed*/
- while (!MARK_STACK_EMPTY){
+ IF_DEBUG_REACHABILITY(rb_warn("Checking mark stack..."));
+ while (!MARK_STACK_EMPTY) {
   if (mark_stack_overflow){
       gc_mark_all();
@@ -1397,4 +1512,19 @@ garbage_collect()
   }
     }
+
+ IF_DEBUG_REACHABILITY(
+ rb_warn("Unmarking...");
+ rb_gc_unmark();
+
+ rb_warn("Done.");
+
+ result = rb_reach_test_result;
+
+ rb_reach_test_obj = Qnil;
+ rb_reach_test_result = Qnil;
+ rb_reach_test_path = Qnil;
+
+ return result);
+
     gc_sweep();
}
@@ -1917,4 +2047,7 @@ Init_GC()
     rb_define_singleton_method(rb_mGC, "enable", rb_gc_enable, 0);
     rb_define_singleton_method(rb_mGC, "disable", rb_gc_disable, 0);
+#ifdef DEBUG_REACHABILITY
+ rb_define_singleton_method(rb_mGC, "reachability_paths", garbage_collect0, 1);
+#endif
     rb_define_method(rb_mGC, "garbage_collect", rb_gc_start, 0);

@@ -1941,3 +2074,8 @@ Init_GC()
     nomem_error = rb_exc_new2(rb_eNoMemError, "failed to allocate memory");
     rb_global_variable(&nomem_error);
+
+#ifdef DEBUG_REACHABILITY
+ rb_global_variable(&rb_reach_test_result);
+ rb_global_variable(&rb_reach_test_path);
+#endif
}
Index: variable.c
===================================================================
RCS file: /cvs/ruby/src/ruby/variable.c,v
retrieving revision 1.125
diff -U2 -p -r1.125 variable.c
--- variable.c 27 Jul 2005 07:27:17 -0000 1.125
+++ variable.c 12 Aug 2005 09:12:29 -0000
@@ -446,4 +446,11 @@ readonly_setter(val, id, var)
}

+#ifdef DEBUG_REACHABILITY
+extern VALUE rb_reach_test_path;
+#define IF_DEBUG_REACHABILITY(does) do {if (!NIL_P(rb_reach_test_path)) {does;}} while (0)
+#else
+#define IF_DEBUG_REACHABILITY(does)
+#endif
+
static int
mark_global_entry(key, entry)
@@ -453,9 +460,23 @@ mark_global_entry(key, entry)
     struct trace_var *trace;
     struct global_variable *var = entry->var;
-
+#ifdef DEBUG_REACHABILITY
+ int i = 0;
+#endif
+
+ IF_DEBUG_REACHABILITY(
+ rb_ary_push(rb_reach_test_path,
+ rb_sprintf("Ruby global %s", rb_id2name(key))));
     (*var->marker)(var->data);
+ IF_DEBUG_REACHABILITY(rb_ary_pop(rb_reach_test_path));
+
     trace = var->trace;
     while (trace) {
- if (trace->data) rb_gc_mark_maybe(trace->data);
+ if (trace->data) {
+ IF_DEBUG_REACHABILITY(
+ rb_ary_push(rb_reach_test_path,
+ rb_sprintf("Ruby global %s trace %d", rb_id2name(key), i++)));
+ rb_gc_mark_maybe(trace->data);
+ IF_DEBUG_REACHABILITY(rb_ary_pop(rb_reach_test_path));
+ }
   trace = trace->next;
     }

--
Nobu Nakada
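
The path bookkeeping that the patch adds to gc_mark can be modelled in a few lines of Ruby. This is a toy model under stated assumptions, not the interpreter's code: the object graph is a plain Hash, the roots are labelled by hand, and the names `reachability_paths`, `roots` and `graph` are hypothetical.

```ruby
# Toy model of the patched mark phase: a depth-first walk from labelled
# roots that records the current path every time the target is reached.
def reachability_paths(roots, graph, target)
  marked = {}
  path = []
  found = []

  visit = lambda do |node|
    saved_len = path.length
    if node == target
      found << path.dup            # one result per edge that reaches target
    elsif !marked[node]
      path.push(node)              # extend the path while descending
    end
    unless marked[node]
      marked[node] = true
      graph.fetch(node, []).each { |child| visit.call(child) }
    end
    path.slice!(saved_len..-1)     # restore the path, like saved_len in gc_mark
  end

  roots.each do |label, node|
    path.push(label)               # e.g. "frame 0" or "Ruby global $cache"
    visit.call(node)
    path.pop
  end
  found
end

roots = { "Ruby global $cache" => :cache, "frame 0" => :worker }
graph = { cache: [:entry], worker: [:queue], queue: [:entry], entry: [] }

reachability_paths(roots, graph, :entry).each { |p| puts p.inspect }
```

As in the patch, an already marked object is not descended into again, so each reported path is the route by which the walk first reached the referrer.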

(Simon Kröger) #5

Very cool!

It looks like Nobu has already done the work to integrate it into the
current version. I will give it a try as soon as possible.

thanks

Simon

(Nobuyoshi Nakada) #6

Hi,

At Fri, 12 Aug 2005 19:19:49 +0900,
nobuyoshi nakada wrote in [ruby-talk:151854]:

> @@ -1917,4 +2047,7 @@ Init_GC()
>      rb_define_singleton_method(rb_mGC, "enable", rb_gc_enable, 0);
>      rb_define_singleton_method(rb_mGC, "disable", rb_gc_disable, 0);
> +#ifdef DEBUG_REACHABILITY
> + rb_define_singleton_method(rb_mGC, "reachability_paths", garbage_collect0, 1);
> +#endif
>      rb_define_method(rb_mGC, "garbage_collect", rb_gc_start, 0);

Oops, a wrapper function is needed.

static VALUE
rbx_reachability_paths(mod, obj)
    VALUE mod;
    VALUE obj;
{
    if (rb_special_const_p(obj)) return Qnil;
    return garbage_collect0(obj);
}
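
From the Ruby side, the method only exists on an interpreter built with this patch and with DEBUG_REACHABILITY defined, so a caller may want to guard for it. A minimal sketch, where `explain_retention` is a hypothetical helper and not part of the patch:

```ruby
# explain_retention is a hypothetical helper, not part of the patch.
# It returns the recorded paths, or nil when the interpreter lacks the
# patch (or, with the wrapper above, when obj is a special constant).
def explain_retention(obj)
  return nil unless GC.respond_to?(:reachability_paths)
  GC.reachability_paths(obj)
end

suspect = Object.new
paths = explain_retention(suspect)
if paths.nil?
  puts "no reachability information available"
else
  paths.each { |path| puts path.inspect }
end
```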

--
Nobu Nakada

(Simon Kröger) #7

Thanks a lot!

I will try this, for sure.
I was busy and didn't have the time; this looks promising.

Simon
