Rb_gc_register_address problem

We ran into a problem today with the garbage collector (caused by our
own bug). We had a function that did something like this:
std::vector<char *> foo;
void func(int argc, VALUE * argv, VALUE self) {
foo.push_back(STR2CSTR(argv[0]));
rb_global_variable(&argv[0]);
}

This is a bug, because once the function returns, argv[0] is invalid,
and global_List contains an invalid VALUE* that may or may not point to
the VALUE we want to keep global. When the GC tries to mark this
VALUE*, it ends up marking an invalid object.

So my question is: why does rb_gc_register_address() store VALUE*'s
instead of VALUE’s? I suppose that if it stored a VALUE, then it would
be possible to register an object as a global but not keep a reference
to that object; this may not be desirable. In our case, though, we just
wanted the string to remain valid.

If I do have a bug like this in the future, what techniques can I use to
track it down?

Paul

Paul Brannan wrote:

If I do have a bug like this in the future, what techniques can I use to
track it down?

Not sure if this would help, but most memory debugging tools (like
Purify on Windows, or Valgrind on Linux) should trap it as a dangling
pointer and give you some information about where the original reference
was created – in your case, during the call to rb_gc_register_address().

std::vector<char *> foo;
void func(int argc, VALUE * argv, VALUE self) {
foo.push_back(STR2CSTR(argv[0]));
rb_global_variable(&argv[0]);
}

can i see more of this?

So my question is: why does rb_gc_register_address() store VALUE*'s
instead of VALUE’s? I suppose that if it stored a VALUE, then it would
be possible to register an object as a global but not keep a reference
to that object; this may not be desirable. In our case, though, we just
wanted the string to remain valid.

i’m not positive, but if you did not store pointers to pointers to objects,
you would not be able to change, in the gc, what that pointerer was pointing
to! you would only be able to change the thing pointed to itself! eg. the gc
would never be able to set an object pointer to nil, or anywhere else…

the difference is illustrated here :

#include <stdlib.h>
#include <stdio.h>

void
i_can_change_what_memory_you_point_to (ptr)
char **ptr;
{
*ptr = “barfoo”;
}

void
i_can_change_the_memory_you_point_to (ptr)
char *ptr;
{
*ptr = ‘b’;
}

int
main (int argc, char **argv, char **env)
{
char *buf = malloc(42);
char *ptr;

ptr = buf;
strcpy (ptr, "foobar");
printf("%s\n", ptr);                  /* -->> "foobar" */
i_can_change_what_memory_you_point_to (&ptr);
printf("%s\n", ptr);                  /* -->> "barfoo" */

ptr = buf;
strcpy (ptr, "foo");
printf("%s\n", ptr);                  /* -->> "foo" */
i_can_change_the_memory_you_point_to (ptr);
printf("%s\n", ptr);                  /* -->> "boo"    */

return 0;

}

the gc must have a prototype like ‘i_can_change_what_memory_you_point_to’

If I do have a bug like this in the future, what techniques can I use to
track it down?

gcc has built in memory leak debugging - but it only works with c. a good
reason not to use c++ imo. :wink:

i actually have a modification, somewhere, which works with c++, want it?

-a

···

On Tue, 22 Oct 2002, Paul Brannan wrote:

====================================

Ara Howard
NOAA Forecast Systems Laboratory
Information and Technology Services
Data Systems Group
R/FST 325 Broadway
Boulder, CO 80305-3328
Email: ahoward@fsl.noaa.gov
Phone: 303-497-7238
Fax: 303-497-7259
====================================

Hi Paul,

I was puzzled by the example itself; hopefully it is really just an
example:

  1. Why do you have a vector of char* instead of std::vector in
    C++? In this case you don’t get the advantage of the string copy
    constructor when you call the push_back() on the vector, which then
    requires you to store the corresponding VALUE.

  2. Why do you want store the char* of the VALUE instead of the VALUE
    itself? You can then store the VALUE in a global ruby array instead of a
    global vector so that you don’t need rb_global_variable(&argv[0]).

  3. By its name, rb_global_variable() is supposed to store a C global
    variable (of type VALUE); so when we call rb_global_variable() for a
    temporary C variable, at that point the C code design is questionable.

Therefore, based on 1), 2), and 3) I don’t see any problem that
rb_global_variable() (or rb_gc_register_address()) stores VALUE* instead
of VALUE.

Finally, if there is a technique that can track this kind of bug down
easily, please let us all know, since it will make C programming in
Ruby much safer (and much more fun :slight_smile: ).

Regards,

Bill

···

====================================================================
Paul Brannan pbrannan@atdesk.com wrote:

We ran into a problem today with the garbage collector (caused by our
own bug). We had a function that did something like this:
std::vector<char *> foo;
void func(int argc, VALUE * argv, VALUE self) {
foo.push_back(STR2CSTR(argv[0]));
rb_global_variable(&argv[0]);
}

This is a bug, because once the function returns, argv[0] is invalid,
and global_List contains an invalid VALUE* that may or may not point to
the VALUE we want to keep global. When the GC tries to mark this
VALUE*, it ends up marking an invalid object.

So my question is: why does rb_gc_register_address() store VALUE*'s
instead of VALUE’s? I suppose that if it stored a VALUE, then it would
be possible to register an object as a global but not keep a reference
to that object; this may not be desirable. In our case, though, we just
wanted the string to remain valid.

If I do have a bug like this in the future, what techniques can I use to
track it down?

Paul

Hi,

We ran into a problem today with the garbage collector (caused by our
own bug). We had a function that did something like this:
std::vector<char *> foo;
void func(int argc, VALUE * argv, VALUE self) {
foo.push_back(STR2CSTR(argv[0]));
rb_global_variable(&argv[0]);
}

This is a bug, because once the function returns, argv[0] is invalid,
and global_List contains an invalid VALUE* that may or may not point to
the VALUE we want to keep global. When the GC tries to mark this
VALUE*, it ends up marking an invalid object.

If this “func” will be called from ruby as a method, you don’t
need to call rb_global_variable() here.

So my question is: why does rb_gc_register_address() store VALUE*'s
instead of VALUE’s? I suppose that if it stored a VALUE, then it would
be possible to register an object as a global but not keep a reference
to that object; this may not be desirable. In our case, though, we just
wanted the string to remain valid.

And introduce memory bugs? Just store the VALUE in a local
variable instead.

···

At Tue, 22 Oct 2002 03:54:24 +0900, Paul Brannan wrote:


Nobu Nakada

Paul Brannan wrote:

I was about to fire up valgrind to find the problem. I've been told
that it doesn't play nicely with Ruby's GC, though... is this true?

Yes, but Tanaka Akira posted a valgrind suppressions file that you can
use with Ruby to tell it (valgrind) to shut up about those problems. See:

  http://www.ruby-talk.org/52065

Hope this helps,

Lyle

Hi Paul,

I was puzzled by the example itself; hopefully it is really just an
example:

  1. Why do you have a vector of char* instead of std::vector in
    C++? In this case you don’t get the advantage of the string copy
    constructor when you call the push_back() on the vector, which then
    requires you to store the corresponding VALUE.

I’m storing command-line arguments for later use, to pass to a function
that expects an array of char*.

  1. Why do you want store the char* of the VALUE instead of the VALUE
    itself? You can then store the VALUE in a global ruby array instead of a
    global vector so that you don’t need rb_global_variable(&argv[0]).

See above.

  1. By its name, rb_global_variable() is supposed to store a C global
    variable (of type VALUE); so when we call rb_global_variable() for a
    temporary C variable, at that point the C code design is questionable.

Agreed.

But if the GC stores a VALUE* instead of a VALUE, this is an extra (and
seemingly unnecessary) indirection. I was curious if there was a reason
for it, other than to prevent me from doing what I did.

Paul

···

On Tue, Oct 22, 2002 at 07:15:13AM +0900, William Djaja Tjokroaminata wrote:

We ran into a problem today with the garbage collector (caused by our
own bug). We had a function that did something like this:
std::vector<char *> foo;
void func(int argc, VALUE * argv, VALUE self) {
foo.push_back(STR2CSTR(argv[0]));
rb_global_variable(&argv[0]);
}

If this “func” will be called from ruby as a method, you don’t
need to call rb_global_variable() here.

foo is global, so the call is necessary, so that the string object will
still be valid after func() returns.

So my question is: why does rb_gc_register_address() store VALUE*'s
instead of VALUE’s? I suppose that if it stored a VALUE, then it would
be possible to register an object as a global but not keep a reference
to that object; this may not be desirable. In our case, though, we just
wanted the string to remain valid.

And introduce memory bugs? Just store the VALUE in a local
variable instead.

What memory bugs would be introduced?

(My goal here isn’t to get the interpreter to change, but to understand
why it is written the way it is).

Paul

···

On Tue, Oct 22, 2002 at 10:02:50AM +0900, nobu.nokada@softhome.net wrote:

At Tue, 22 Oct 2002 03:54:24 +0900, > Paul Brannan wrote:

i'm not _positive_, but if you did not store pointers to pointers to
objects, you would not be able to change, in the gc, what that
pointerer was pointing to! you would only be able to change the thing
pointed to itself! eg. the gc would never be able to set an object
pointer to nil, or anywhere else...

I see! That makes perfect sense.

I had not considered creating global variables that actually change. I
generally leave them alone once they are created. I guess I should be
using a constant instead.

gcc has built in memory leak debugging - but it only works with c. a
good reason not to use c++ imo. :wink:

Where can I read more about this facility?

i actually have a modification, somewhere, which works with c++, want
it?

What does the modification involve?

Paul

···

On Mon, Oct 21, 2002 at 09:39:10PM +0000, ahoward wrote:

But if the GC stores a VALUE* instead of a VALUE, this is an extra (and
seemingly unnecessary) indirection. I was curious if there was a reason
for it, other than to prevent me from doing what I did.

    a = Qnil;
    rb_gc_register_address(a);
    a = rb_ary_new();

Guy Decoux

> gcc has built in memory leak debugging - but it only works with c. a
> good reason not to use c++ imo. :wink:

Where can I read more about this facility?

i told a little lie, it works with glibc, whcih on our systems is pretty much
synonymous with gcc (eg our linux systems). you can read about it in

http://www.gnu.org/manual/glibc-2.2.3/html_mono/libc.html#SEC37

it's **really** usefull, and worth finding a machine with glibc on, at least
to debug with.

> i actually have a modification, somewhere, which works with c++, want
> it?

What does the modification involve?

you'll see in the docs above, that turning tracing on outputs a file which a
perl postprocessor then handles. it consists of lines like

i malloc'd 10 bytes to ptr 0x377
i free'd 0x377

etc..

the postprocessor scans this files and makes sure mallocs and frees are
balanced. it also uses some backtracing functions to note *where* the leaks
occur. the problem with c++ is that the leaks will always appear to occur
from the method 'new' (or whatever the internal name is). note that this is
only because new and delete are implemented in terms of malloc and free in
glibc - maybe this is normal - i don't really know. anyhow, i added some
checking that works using this logic

  if im called from new
    trace who called me
  else
    normal

in other words, when called from new i output information from further up the
call stack, so you can see where your new is leaking...

this modification requires rebuilding glibc! however, the file affected is
the mtrace.c file, which is only ever used for memory debugging, so i assure
you nothing would break! anyhow, it's been six months since i looked at that,
but if you want i'll scan my notes and send you a tar ball of what i did.
when time allows i will contrib this to gcc - but time has not yet allowed!

-a

···

On Tue, 22 Oct 2002, Paul Brannan wrote:

--

====================================
> Ara Howard
> NOAA Forecast Systems Laboratory
> Information and Technology Services
> Data Systems Group
> R/FST 325 Broadway
> Boulder, CO 80305-3328
> Email: ahoward@fsl.noaa.gov
> Phone: 303-497-7238
> Fax: 303-497-7259

Hi Paul,

I’m storing command-line arguments for later use, to pass to a function
that expects an array of char*.

In this case then what you want is probably something like

std::vector<char *> foo;
void func(int argc, VALUE * argv, VALUE self) {
int length;
char *str;
rb_str2cstr (argv[0], &length);
str = malloc (sizeof(char) * (length + 1));
strcpy (str, STR2CSTR(argv[0]));
foo.push_back(str);
}

Of course, I have created a macro/function that performs a task as above
so that I can simply write

std::vector<char *> foo;
void func(int argc, VALUE * argv, VALUE self) {
foo.push_back(create_str_from_rubystr (argv[0]));
}

My philosophy in Ruby-C integration: separate the C-part and the Ruby-part
as much and as far as possible; we are better-off in the long run this
way.

(deleted)

But if the GC stores a VALUE* instead of a VALUE, this is an extra (and
seemingly unnecessary) indirection. I was curious if there was a reason
for it, other than to prevent me from doing what I did.

I think this is because all Ruby VALUE’s of type pointer (not immediate
object) is allocated from a single heap. As you call
rb_global_variable(), Matz created a linked list containing all the
addresses of the global variables. During the gc marking process, Matz
simply goes through this list and marks the VALUE pointed by each
VALUE*. If you had given Matz a VALUE instead, during the gc marking
process Matz has to go through the heap to find the exact VALUE and mark
it for each global variable, and now the process is of O(N**2) instead of
O(N).

Regards,

Bill

···

Paul Brannan pbrannan@atdesk.com wrote:

I was about to fire up valgrind to find the problem. I’ve been
told that it doesn’t play nicely with Ruby’s GC, though… is this
true?

Yes, but Tanaka Akira posted a valgrind suppressions file that you can
use with Ruby to tell it (valgrind) to shut up about those problems. See:

http://www.ruby-talk.org/52065

What are the odds of getting this committed? Having ruby play nice
with debugging tools would be REALLY favorable when it’s 5am and
you’ve been wading through a debugger for 36hrs strait. :slight_smile: -sc

···


Sean Chittenden

Well, as long as after each new object creation we call
rb_gc_register_address() (and optionally call rb_gc_unregister_address()),
this is fine:

 a = Qnil;
 rb_gc_register_address(a);
 rb_gc_unregister_address(a);    /* optional */
 a = rb_ary_new();
 rb_gc_register_address(a);

Of course, we are here assuming rb_gc_register_address(VALUE) instead of
rb_gc_register_address(VALUE*) and Matz has to do the O(N**2) process in
rb_gc().

Bill

···

===========================================================================
ts decoux@moulon.inra.fr wrote:

a = Qnil;
rb_gc_register_address(a);
a = rb_ary_new();

Guy Decoux

In this case then what you want is probably something like

std::vector<char *> foo;
void func(int argc, VALUE * argv, VALUE self) {
int length;
char *str;
rb_str2cstr (argv[0], &length);
str = malloc (sizeof(char) * (length + 1));
strcpy (str, STR2CSTR(argv[0]));
foo.push_back(str);
}

This is one possible solution. We went a different route, but thanks
for the suggestion.

Note, though, that this is not exception-safe. vector::push_back can
throw an exception (std::bad_alloc). If it does, then str is left
allocated. One solution is to push a dummy onto the vector before
allocating memory. Using your helper fuction:

std::vector<char *> foo;
void func(int argc, VALUE * argv, VALUE self) {
foo.push_back(0);
foo[foo.length()-1] = create_str_from_rubystr (argv[0]));
}

The solution we went with was to leave the memory allocation to Ruby. I
prefer to let Ruby’s GC handle as much memory as possible, so that I
don’t have to worry quite so hard about exception-safety. The only
exceptions to this rule I see are:

  1. If I have many objects being allocated, I allocate them myself
  2. If I have resources allocated other than memory (such as open files
    or sockets), I also handle that myself.

My philosophy in Ruby-C integration: separate the C-part and the Ruby-part
as much and as far as possible; we are better-off in the long run this
way.

I agree. This is particularly important if the API ever changes. It
also allows porting extensions easily to other languages. This is part
of the philosophy behind SWIG.

Paul

···

On Tue, Oct 22, 2002 at 11:57:05PM +0900, William Djaja Tjokroaminata wrote:

sounds complicated - why not this simple explanation illustrated here :

#include <stdlib.h>
#include <stdio.h>

void
i_can_change_what_memory_you_point_to (ptr)
char **ptr;
{
*ptr = “barfoo”;
}

void
i_can_change_the_memory_you_point_to (ptr)
char *ptr;
{
*ptr = ‘b’;
}

int
main (int argc, char **argv, char **env)
{
char *buf = malloc(42);
char *ptr;

ptr = buf;
strcpy (ptr, "foobar");
printf("%s\n", ptr);                  /* -->> "foobar" */
i_can_change_what_memory_you_point_to (&ptr);
printf("%s\n", ptr);                  /* -->> "barfoo" */

ptr = buf;
strcpy (ptr, "foo");
printf("%s\n", ptr);                  /* -->> "foo" */
i_can_change_the_memory_you_point_to (ptr);
printf("%s\n", ptr);                  /* -->> "boo"    */

return 0;

}

the gc must have a prototype like ‘i_can_change_what_memory_you_point_to’ so
it may, for example, point an object to nil. otherwise it could only change
the objects pointed to themselves!

-a

···

On Tue, 22 Oct 2002, William Djaja Tjokroaminata wrote:

(deleted)

But if the GC stores a VALUE* instead of a VALUE, this is an extra (and
seemingly unnecessary) indirection. I was curious if there was a reason
for it, other than to prevent me from doing what I did.

I think this is because all Ruby VALUE’s of type pointer (not immediate
object) is allocated from a single heap. As you call
rb_global_variable(), Matz created a linked list containing all the
addresses of the global variables. During the gc marking process, Matz
simply goes through this list and marks the VALUE pointed by each
VALUE*. If you had given Matz a VALUE instead, during the gc marking
process Matz has to go through the heap to find the exact VALUE and mark
it for each global variable, and now the process is of O(N**2) instead of
O(N).

====================================

Ara Howard
NOAA Forecast Systems Laboratory
Information and Technology Services
Data Systems Group
R/FST 325 Broadway
Boulder, CO 80305-3328
Email: ahoward@fsl.noaa.gov
Phone: 303-497-7238
Fax: 303-497-7259
====================================

====================================

Ara Howard
NOAA Forecast Systems Laboratory
Information and Technology Services
Data Systems Group
R/FST 325 Broadway
Boulder, CO 80305-3328
Email: ahoward@fsl.noaa.gov
Phone: 303-497-7238
Fax: 303-497-7259
====================================

Uuups, sorry, reading too fast; in C the precedence is rather different:

rb_gc_mark(*list->varptr);    /* '->' is stronger than '*' */

So in fact, at the end Matz is giving back (&VALUE), which is the same as
VALUE. So forget about the O(N**2) process. It seems that you (Paul) are
correct, that we can have either VALUE or VALUE
. And based on your
example (I don’t know about other cases) it seems that
rb_gc_register_address is better off using VALUE instead of VALUE*.

Using VALUE, this is still valid:

 a = Qnil;
 rb_gc_register_address(a);
 rb_gc_unregister_address(a);    /* optional */
 a = rb_ary_new();
 rb_gc_register_address(a);

Using VALUE*, on the other hand, we just save some typing:

 a = Qnil;
 rb_gc_register_address(&a);
 a = rb_ary_new();
 /* no need for another rb_gc_register_address (&a) here */

Regards,

Bill

What are the odds of getting this committed? Having ruby play nice
with debugging tools would be REALLY favorable when it's 5am and
you've been wading through a debugger for 36hrs strait. :slight_smile: -sc

Because, for me, it don't solve the problem. I've seen many times, on this
mailing list, extension writers saying that there are bugs in ruby when
generally the bugs were just in the extension.

You don't need this modification, if extension writers just trust the ruby
source.

Guy Decoux

See [ruby-talk:53833] and [ruby-talk:53834]. I got my answer as to why
this is not feasible.

Paul

···

On Wed, Oct 23, 2002 at 12:57:18AM +0900, William Djaja Tjokroaminata wrote:

And based on your example (I don’t know about other cases) it seems
that rb_gc_register_address is better off using VALUE instead of
VALUE*.

Your concept is true for C programming in general, but here we are
specifically talking about Ruby gc. Ruby gc manages a heap of VALUE’s
(pointers) that may point to either a valid object or a free object/data
area (not an exact terminology here).

During the mark phase, given a VALUE (not &VALUE) Ruby gc just marks the
object pointed by VALUE. During the sweep phase, Ruby gc just goes
through the heap linearly, and for any VALUE that does not point to a
marked object, Ruby gc just changes the VALUE to point to a free
object/data area (to be recycled later) and/or “freeing” what the VALUE
pointed to before.

Therefore, we just need to tell Ruby gc a VALUE (a pointer) and not the
address of VALUE (a pointer to pointer). We gives Ruby a pointer, and
Ruby manages the pointer to pointers. Therefore, we can as well use a
function such as rb_gc_register_value(VALUE) instead of
rb_gc_register_address(VALUE*) (although the usage will slightly changes,
but this has nothing to do with feasibility).

Bill

···

==========================================================================
ahoward ahoward@fsl.noaa.gov wrote:

sounds complicated - why not this simple explanation illustrated here :

(deleted)

the gc must have a prototype like ‘i_can_change_what_memory_you_point_to’ so
it may, for example, point an object to nil. otherwise it could only change
the objects pointed to themselves!