A question about Circular References

Hello folks,

I have a question about Circular References in classes that I create
from an extension that I’ve been playing with. Since I’m very new to
Ruby and much newer to extensions, after thinking about my problem for
quite some time I decided it would probably be much more efficient to
ask someone than try to figure it out myself. So here goes.

At Apple’s Worldwide Developers Conference in May, a presenter
demonstrated a way to access the Quartz drawing system that is part of
Mac OS X from Python. I thought that was pretty neat, but was having
lots of fun with Ruby and thought it would be a pain to learn another
scripting language. Logically, of course, I decided that I might try
to write my own extension for Ruby to do the same thing.

As part of working with that I have created two classes that wrap C
structures commonly found in Quartz. The first is CGSize whose C type
looks something like:

struct CGSize
{
    float width;
    float height;
};

The second is equally simple, CGPoint:

struct CGPoint
{
    float x;
    float y;
};

No big deal.

The problem came in when I started looking at CGRect which is defined
as:

struct CGRect
{
    CGPoint origin;
    CGSize size;
};

Now it seems likely that from Ruby one might want to do something like
this:

# Create a CGRect with its origin at (0,0) and its extent 100 points
# wide and 200 points tall.
someRect = CGRect.make(0, 0, 100, 200)
theSize = someRect.size   # Get the size out of the rectangle.

When I create the CGRect object I use Data_Make_Struct. When I want to
return the size object out of that rectangle, my thought was to use
Data_Wrap_Struct.

The thing that worries me is that the CGSize object I think I want to
return from the “size()” method of the CGRect class would be wrapping a
piece of the CGRect structure itself. I’m concerned that, should the
CGRect be garbage collected, the CGSize object, which is an object
wrapper around part of the CGRect’s memory, will no longer be valid
because the CGRect’s memory was reclaimed.

In other words, the pointer to the memory that the CGSize object wraps
will be halfway through the memory owned by the CGRect. I’m worried
that the system might dispose of the CGRect not knowing that the CGSize
object still needs that memory.
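
A minimal sketch of the concern above, in plain C with no Ruby headers. The struct layouts mirror the ones described earlier; interior_offset() and demo_interior_pointer() are hypothetical names made up for illustration. It only shows that the address of the size member lies inside the rect's single allocation, so releasing the rect releases the storage behind the size pointer as well.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { float x, y; } CGPoint;
typedef struct { float width, height; } CGSize;
typedef struct { CGPoint origin; CGSize size; } CGRect;

/* Byte distance from the start of a rect to its embedded size member. */
static ptrdiff_t interior_offset(const CGRect *rect, const CGSize *size)
{
    return (const char *)size - (const char *)rect;
}

/* Allocates one rect and reports where its size member sits inside it.
   There is only one allocation: the size has no storage of its own. */
static ptrdiff_t demo_interior_pointer(void)
{
    CGRect *rect = malloc(sizeof *rect);
    assert(rect != NULL);

    CGSize *size = &rect->size;   /* points into rect's block */
    ptrdiff_t offset = interior_offset(rect, size);

    free(rect);                   /* 'size' is dangling from here on */
    return offset;
}
```

The offset returned is the same value as offsetof(CGRect, size), which is why a wrapper built around &rect->size cannot outlive the rect unless something keeps the rect alive.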

I hope my concern is clear… I’m struggling to find the correct words
to explain it.

At any rate, this leads to the following questions:

First, do I even need to worry about it? Is the Garbage Collector in
ruby clever enough to see that one of my objects has a pointer into the
memory that makes up the CGRect object and, therefore, will not release
the memory even though the CGRect object itself is not referenced?

If the Garbage Collector is not clever enough to do that, it seems
reasonable that the CGRect object might create references to a CGPoint
object and a CGSize object within itself. At the same time, however,
the CGPoint and CGSize objects want to keep a reference to the CGRect
so that its memory won’t go away until they are done with it. Will
setting up that kind of circular reference prevent Ruby from being able
to garbage collect the rectangle and its origin/size pair?

Any insight would be appreciated. If you’ve made it this far, thank
you for your time!

Scott Thompson

Scott Thompson wrote:

When I create the CGRect object I use Data_Make_Struct. When I want to
return the size object out of that rectangle, my thought was to use
Data_Wrap_Struct.

Yes.

The thing that worries me is that the CGSize object I think I want to
return from the “size()” method of the CGRect class would be wrapping a
piece of the CGRect structure itself. I’m concerned that, should the
CGRect be garbage collected, the CGSize object, which is an object
wrapper around part of the CGRect’s memory, will no longer be valid
because the CGRect’s memory was reclaimed.

Correct. When the CGRect Ruby instance is garbage-collected, the CGRect
C struct that it “wraps” will be freed as well. And that in turn
invalidates the pointers to its contained CGPoint and CGSize structs,
even if there are still outstanding Ruby objects that wrap those structs.

In other words, the pointer to the memory that the CGSize object wraps
will be halfway through the memory owned by the CGRect. I’m worried
that the system might dispose of the CGRect not knowing that the CGSize
object still needs that memory.

Yes, this is a good thing to worry about ;-)

At any rate, this leads to the following questions:

First, do I even need to worry about it? Is the Garbage Collector in
ruby clever enough to see that one of my objects has a pointer into the
memory that makes up the CGRect object and, therefore, will not release
the memory even though the CGRect object itself is not referenced?

No. This is your responsibility.

If the Garbage Collector is not clever enough to do that, it seems
reasonable that the CGRect object might create references to a CGPoint
object and a CGSize object within itself…

Not sure I follow you here. Ruby only knows what you tell it about your
extension module’s classes. For example, even though /you/ know that a
CGRect object contains references to a CGPoint and a CGSize, Ruby
doesn’t deduce that. This is part of the motivation for writing a “mark”
function for your extension objects; the garbage collector calls that
mark function to ask, say, your CGRect object what /other/ Ruby objects
are reachable from there. In your case, you’d want to be sure to call
rb_gc_mark() for the Ruby objects that are wrapping the CGPoint and
CGSize objects:

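void mark_CGRect(CGRect *rect)
{
    /* rubyObjectFor() stands in for whatever lookup you use to get
       from a C struct to the Ruby object that wraps it. */
    VALUE pointObj, sizeObj;
    pointObj = rubyObjectFor(rect->origin);
    sizeObj = rubyObjectFor(rect->size);
    rb_gc_mark(pointObj);
    rb_gc_mark(sizeObj);
}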

At the same time, however, the CGPoint and CGSize objects want
to keep a reference to the CGRect so that its memory won’t
go away until they are done with it. Will setting up that kind of
circular reference prevent Ruby from being able to garbage collect
the rectangle and its origin/size pair?

No, not as long as the mark functions for CGSize and CGPoint mark their
parent CGRect as still being “reachable”. If you still have some
outstanding references to the point or size objects for a given
rectangle, their mark functions will get called and thus the rectangle
won’t get garbage-collected.

Yes, it is a little complicated ;-)

Hope this helps,

Lyle

If the Garbage Collector is not clever enough to do that, it seems
reasonable that the CGRect object might create references to a
CGPoint object and a CGSize object within itself…

Not sure I follow you here. Ruby only knows what you tell it about
your extension module’s classes. For example, even though /you/ know
that a CGRect object contains references to a CGPoint and a CGSize,
Ruby doesn’t deduce that. This is part of the motivation for writing a
“mark” function for your extension objects; the garbage collector
calls that mark function to ask, say, your CGRect object what /other/
Ruby objects are reachable from there. In your case, you’d want to be
sure to call rb_gc_mark() for the Ruby objects that are wrapping the
CGPoint and CGSize objects:

void mark_CGRect(CGRect *rect)
{
    VALUE pointObj, sizeObj;
    pointObj = rubyObjectFor(rect->origin);
    sizeObj = rubyObjectFor(rect->size);
    rb_gc_mark(pointObj);
    rb_gc_mark(sizeObj);
}

The problem I’m running into is in the mark routine for a CGSize object
that was created as part of a CGRect. The tricky part is that when my
mark routine is called, it receives a pointer to the CGSize structure
in C. I don’t have any way to go from that pointer to a CGSize, to a
pointer to a CGRect, to the VALUE that represents the CGRect object. To
put it another way, given a pointer to a CGSize structure, I don’t
have a way to recover the CGRect object that contains that structure.

In the original message I was talking about code like this:

VALUE rb_CGRect_initialize(VALUE self)
{
    CGRect *owningRect = NULL;

    Data_Get_Struct(self, CGRect, owningRect);

    if(NULL != owningRect) {
        VALUE originObject = Data_Wrap_Struct(
                                gClass_CGPoint,
                                0,
                                0,
                                &(owningRect->origin));

        VALUE sizeObject = Data_Wrap_Struct(
                                gClass_CGSize,
                                0,
                                0,
                                &(owningRect->size));

        rb_iv_set(self, "@size", sizeObject);
        rb_iv_set(self, "@origin", originObject);

        rb_iv_set(sizeObject, "@parentRect", self);
        rb_iv_set(originObject, "@parentRect", self);
    }
    return self;
}

This is the initialize method for CGRect. In the initialize method, I
define two instance variables “@size” and “@origin” that are part of
the CGRect object. At the same time, I set instance variables in the
size and origin objects that point back to the rectangle.

My theory is as follows:

  1. When Ruby is trying to mark the CGRect it will “automagically” mark
    all of its instance variables so the size and origin objects will get
    marked.
  2. Similarly, when Ruby tries to mark a size or origin object that has
    a parent rectangle, it will “automagically” mark the parent rectangle
    as well (preventing it from being Garbage Collected).
  3. If neither the CGRect nor the CGSize/CGPoint have “external
    references” then none of the objects will be marked and the GC can suck
    up the whole tangled mess in one swell foop.

That way, if there are “external references” to the CGRect, everybody
gets marked. If there are no “external references” to the CGRect, but
there are “external references” to the size, or the origin, the CGRect
will still get marked so its memory won’t be reclaimed.
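
The three cases in the theory can be sketched with a toy mark phase in plain C. This is not Ruby's gc.c and uses no Ruby API; Obj, mark(), Root and survivors() are hypothetical names invented for the illustration, and each object is reduced to a mark bit plus its outgoing references. The only point is that a tracing collector starts from external references, so a cycle by itself keeps nothing alive.

```c
#include <assert.h>
#include <stddef.h>

/* Toy object: a mark bit plus up to two outgoing references. */
typedef struct Obj {
    int marked;
    struct Obj *refs[2];
} Obj;

/* Mark obj and everything reachable from it. The 'marked' check makes
   cycles safe: an object that is already marked is not visited again. */
static void mark(Obj *obj)
{
    if (obj == NULL || obj->marked)
        return;
    obj->marked = 1;
    for (size_t i = 0; i < 2; i++)
        mark(obj->refs[i]);
}

/* Which object, if any, has an "external reference". */
typedef enum { ROOT_NONE, ROOT_RECT, ROOT_SIZE } Root;

/* Build the rect <-> size and rect <-> origin cycles, mark from the
   chosen root, and count how many objects would survive a sweep. */
static int survivors(Root root)
{
    Obj rect   = {0, {NULL, NULL}};
    Obj size   = {0, {NULL, NULL}};
    Obj origin = {0, {NULL, NULL}};

    rect.refs[0]   = &size;    /* plays the role of @size */
    rect.refs[1]   = &origin;  /* plays the role of @origin */
    size.refs[0]   = &rect;    /* plays the role of @parentRect */
    origin.refs[0] = &rect;    /* plays the role of @parentRect */

    if (root == ROOT_RECT) mark(&rect);
    if (root == ROOT_SIZE) mark(&size);

    /* The origin is reachable exactly when the rect is. */
    assert(rect.marked == origin.marked);

    return rect.marked + size.marked + origin.marked;
}
```

In this model an external reference to the rect, or to the size alone, leaves all three objects marked, and no external reference leaves none marked. Whether Ruby really follows the instance variables of Data objects this way is a separate question that the sketch does not answer.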

Will that work out or am I trying to do things the hard way?

Scott

Scott Thompson wrote:

The problem I'm running into is in the mark routine for a CGSize object that was created as part of a CGRect. The tricky part is that when my mark routine is called, it receives a pointer to the CGSize structure in C. I don't have any way to go from that pointer to a CGSize, to a pointer to a CGRect, to the VALUE that represents the CGRect object. To put it another way, given a pointer to a CGSize _structure_, I don't have a way to recover the CGRect _object_ that contains that structure.

No, you definitely don't get this "for free". Obviously, this is not a problem unique to Ruby; in an arbitrary C/C++ program, if I have a pointer to a struct or object that happens to be member data for some other struct/object, there's no magic for navigating back to the parent object. As you have already deduced, you'll need to do some extra bookkeeping of your own to keep up with these relationships.
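
For the C half of that problem only, a minimal sketch in plain C: when a CGSize is known to be the size member of a CGRect, the standard offsetof macro gives the containing struct's address. rect_from_size() and round_trip_ok() are hypothetical names for illustration, and the struct layouts are assumed to match the ones discussed earlier. This does not produce the Ruby VALUE for the rect, and it is wrong for a CGSize that was allocated by itself, so the bookkeeping described above is still needed.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y; } CGPoint;
typedef struct { float width, height; } CGSize;
typedef struct { CGPoint origin; CGSize size; } CGRect;

/* Step back from an embedded size member to the rect that contains it.
   Only valid when 'size' really is the size field of a CGRect. */
static CGRect *rect_from_size(CGSize *size)
{
    assert(size != NULL);
    return (CGRect *)((char *)size - offsetof(CGRect, size));
}

/* Returns 1 if going rect -> size -> rect lands on the same address. */
static int round_trip_ok(void)
{
    CGRect rect = {{0.0f, 0.0f}, {100.0f, 200.0f}};
    return rect_from_size(&rect.size) == &rect;
}
```

Because the function cannot tell an embedded CGSize from a standalone one, the caller has to track that distinction somewhere, which is again a form of bookkeeping.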

In the original message I was talking about code like this:

VALUE rb_CGRect_initialize(VALUE self)
{
    CGRect *owningRect = NULL;

    Data_Get_Struct(self, CGRect, owningRect);

    if(NULL != owningRect) {
        VALUE originObject = Data_Wrap_Struct(
                                gClass_CGPoint,
                                0,
                                0,
                                &(owningRect->origin));

        VALUE sizeObject = Data_Wrap_Struct(
                                gClass_CGSize,
                                0,
                                0,
                                &(owningRect->size));

        rb_iv_set(self, "@size", sizeObject);
        rb_iv_set(self, "@origin", originObject);

        rb_iv_set(sizeObject, "@parentRect", self);
        rb_iv_set(originObject, "@parentRect", self);
    }
    return self;
}

This is the initialize method for CGRect. In the initialize method, I define two instance variables "@size" and "@origin" that are part of the CGRect object. At the same time, I set instance variables in the size and origin objects that point back to the rectangle.

My theory is as follows:

1) When Ruby is trying to mark the CGRect it will "automagically" mark all of its instance variables so the size and origin objects will get marked.

I'm not sure that this is in fact true for "Data" objects like these. I took a quick look at the Ruby source code (see rb_gc_mark_children() in gc.c) and it's not clear to me that instance variables for Data (extension) objects get marked automatically. You should add some debugging statements (or whatever) to convince yourself that this is what actually happens.

2) Similarly, when Ruby tries to mark a size or origin object that has a parent rectangle, it will "automagically" mark the parent rectangle as well (preventing it from being Garbage Collected).

This will only happen if instance variables for Data objects are automatically marked (see previous comments).

3) If neither the CGRect nor the CGSize/CGPoint have "external references" then none of the objects will be marked and the GC can suck up the whole tangled mess in one swell foop.

Yes, this should hold true.

That way, if there are "external references" to the CGRect, everybody gets marked. If there are no "external references" to the CGRect, but there are "external references" to the size, or the origin, the CGRect will still get marked so its memory won't be reclaimed.

Yes.

Will that work out or am I trying to do things the hard way?

Your plan sounds good to me if Ruby's GC does in fact mark instance variables for Data objects, but as stated, I'm not sure that this is the case. If the GC doesn't automatically handle this for you, I still think you're on the right track by storing those values as instance variables (i.e. this is as good a bookkeeping system as any). It does mean, however, that you'll still need to write mark() functions for CGRect, CGPoint and CGSize that extract these instance variables and mark them.

Good luck,

Lyle

I'm not sure that this is in fact true for "Data" objects like these. I
took a quick look at the Ruby source code (see rb_gc_mark_children() in
gc.c) and it's not clear to me that instance variables for Data
(extension) objects get marked automatically. You should add some
debugging statements (or whatever) to convince yourself that this is
what actually happens.

line 622 in gc.c (ruby 1.8.0)

Guy Decoux

I’m not sure that this is in fact true for “Data” objects like these. I
took a quick look at the Ruby source code (see rb_gc_mark_children() in
gc.c) and it’s not clear to me that instance variables for Data
(extension) objects get marked automatically. You should add some
debugging statements (or whatever) to convince yourself that this is
what actually happens.

line 622 in gc.c (ruby 1.8.0)

Guy Decoux

Do you really mean line 662 (two 6’s not two 2’s)? On that line if a
ruby object has any “generic instance variables” then those variables
are marked. Then after that, 'round line 846, the mark routine for the
data pointer is called.

Looks like my strategy will work. Thank you both for your time and
expertise!

Scott

line 622 in gc.c (ruby 1.8.0)

Do you really mean line 662 (two 6's not two 2's)?

yes (rb_mark_generic_ivar(ptr)), sorry

Guy Decoux