How can I pin a Ruby object in memory?

I have some data that I'm storing in a T_DATA VALUE. Is the data
that's stored there part of the GC heap - IOW can it move in memory?
If so, is there a way to pin it so that it doesn't move while I'm
using it?

Thanks!
-John
http://www.iunknown.com

Not sure if I understand the question. A Data object has a pointer
(RDATA(obj)->data) to some block of memory that you've allocated, and
no, Ruby's GC process isn't going to assign some new value to that
pointer.

If you're asking whether Ruby will move the address of the Data object
itself: I'm guessing that that's possible.

···

On 6/1/06, John Lam <drjflam@gmail.com> wrote:

I have some data that I'm storing in a T_DATA VALUE. Is the data
that's stored there part of the GC heap - IOW can it move in memory?
If so, is there a way to pin it so that it doesn't move while I'm
using it?

Hi.

···

On 6/2/06, Lyle Johnson <lyle.johnson@gmail.com> wrote:

If you're asking whether Ruby will move the address of the Data object
itself: I'm guessing that that's possible.

I guess this is not true, because Ruby's GC does not compact
memory (at least up to now).

- Minkoo Seo

I was wondering about the latter. I couldn't find any APIs for pinning
objects in memory so I was worried that the object might move out from
underneath me. But on second thought I'd have the DATA pointer cached in a
register / call stack in any event so it probably doesn't matter if the
object moves in the future.

Cheers,
-John

···

On 6/1/06, Lyle Johnson <lyle.johnson@gmail.com> wrote:

If you're asking whether Ruby will move the address of the Data object
itself: I'm guessing that that's possible.

Lyle Johnson wrote:

If you're asking whether Ruby will move the address of the Data object
itself: I'm guessing that that's possible.

If ruby moved objects like that (whether T_DATA or T_OBJECT, T_STRING,
etc), it would be a disaster. Every VALUE that referred to the object
(in other words every reference to it in a variable, array, hash, etc.)
would become invalid, since the VALUE type is actually a pointer in
these cases. (I may be misunderstanding the question though...)
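
To make the point above concrete, here is a toy Ruby model of it. It is a sketch, not interpreter code: the "heap" is a plain array, and a "reference" is a raw slot index standing in for a VALUE that holds an object's address. All variable names are made up for illustration.

```ruby
# Toy model: the "heap" is an array of slots, and a reference is a raw slot
# index, standing in for a VALUE that holds an object's address.
heap = ["a string", "an array", nil, nil]

ref_in_variable = 0          # e.g. a local variable holding the VALUE
ref_in_hash     = { key: 0 } # e.g. the same VALUE stored inside a Hash

# A non-moving collector never changes slot positions, so both references
# keep resolving to the same object.
before = [heap[ref_in_variable], heap[ref_in_hash[:key]]]

# Now "move" the object to slot 2 without updating any reference.
heap[2] = heap[0]
heap[0] = nil

after = [heap[ref_in_variable], heap[ref_in_hash[:key]]]

puts before.inspect # both entries are "a string"
puts after.inspect  # both entries are nil: every old reference is stale
```

A moving collector therefore has to find and rewrite every such reference, which is the part a non-moving collector gets to skip.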

···

--
      vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

No, Ruby does not move objects in memory. As to how horrible it would be if it did: there are GCs that work this way (copying GCs). Believe it or not, copying GCs have speed advantages, in that the algorithm's runtime is proportional to the number of reachable objects rather than to the size of the heap, as with mark-and-sweep (which is what Ruby uses). Copying collectors also compact memory, reducing fragmentation. A copying GC would be difficult in the current Ruby implementation, since a copying GC cannot really be conservative (it has to change things in the root set), and Ruby uses the C stack, so it is difficult to be sure that something is definitely _not_ a pointer. With mark-and-sweep, false positives are OK, since nothing ever gets moved. A copying GC could mistake an int on the C stack for a pointer, "collect" the "object" it "pointed" to, and then change the int's value, which of course would cause many odd and subtle bugs in Ruby code.
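
The false-positive argument can be sketched with a toy conservative mark-and-sweep in Ruby. This is only an illustration under stated assumptions: the Slot struct, the stack_words array and the slot layout are invented here, and a "pointer" is just a slot index. It shows a stray integer keeping a dead slot alive, which is harmless because the collector never rewrites the stack.

```ruby
# Toy model of conservative mark-and-sweep. Slots are indexed 0..n-1 and a
# "pointer" is just a slot index. All names here are made up for illustration.
Slot = Struct.new(:payload, :children, :marked)

heap = [
  Slot.new("live object",   [1], false), # slot 0, points at slot 1
  Slot.new("child object",  [],  false), # slot 1
  Slot.new("garbage",       [],  false), # slot 2, unreachable
  Slot.new("also garbage?", [],  false)  # slot 3, unreachable in reality
]

# The "C stack": one genuine reference (0) and one plain integer (3) that
# merely looks like a valid slot index.
stack_words = [0, 3]

# Mark a slot and everything reachable from it.
def mark(heap, index)
  slot = heap[index]
  return if slot.nil? || slot.marked
  slot.marked = true
  slot.children.each { |child| mark(heap, child) }
end

# Conservative root scan: anything that could be a slot index is treated as one.
stack_words.each do |word|
  mark(heap, word) if word.is_a?(Integer) && word.between?(0, heap.size - 1)
end

# Sweep: visits every slot, so the cost tracks heap size, and nothing is moved.
freed = []
heap.each_with_index do |slot, i|
  if slot.marked
    slot.marked = false
  else
    freed << i
  end
end

puts "freed slots: #{freed.inspect}"
puts "stack words after GC: #{stack_words.inspect}"
```

Slot 3 survives only because of the look-alike integer, so a little memory is wasted. A copying collector in the same situation would have to overwrite that stack word with the object's new location, corrupting what was really an integer.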

···

On Jun 1, 2006, at 6:47 PM, Joel VanderWerf wrote:

Lyle Johnson wrote:

If you're asking whether Ruby will move the address of the Data object
itself: I'm guessing that that's possible.

If ruby moved objects like that (whether T_DATA or T_OBJECT, T_STRING,
etc), it would be a disaster. Every VALUE that referred to the object
(in other words every reference to it in a variable, array, hash, etc.)
would become invalid, since the VALUE type is actually a pointer in
these cases. (I may be misunderstanding the question though...)

--
      vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

You know, I did know that, but it didn't occur to me at the time. Good point.

···

On 6/1/06, Joel VanderWerf <vjoel@path.berkeley.edu> wrote:

If ruby moved objects like that (whether T_DATA or T_OBJECT, T_STRING,
etc), it would be a disaster. Every VALUE that referred to the object
(in other words every reference to it in a variable, array, hash, etc.)
would become invalid, since the VALUE type is actually a pointer in
these cases.

So I would guess that Ruby memory allocation is relatively expensive?
Certainly nowhere near as fast as allocating memory off of the "end"
of the heap or the stack, right? Does it have to search a free list of
blocks itself or does it delegate allocation to the system's malloc()
implementation?

It's tricky doing the interop with the CLR since things like boxed
value type objects *can* be moved in memory, so I need to create a pinned
GCHandle object to keep the GC from moving the object (this is also
bad, as you could imagine, since it leads to heap fragmentation). So
after spending most of the day thinking about the CLR side of the
house, I was a bit surprised to find that Ruby doesn't move objects
around.

This makes me a bit happier in a way since I don't have to worry about
the issues on both sides of the house, but since I figured out how to
do it on the CLR side, I was hoping to reuse that new-found experience
on the Ruby side :)

Thanks for the insights.
-John

···


Lyle Johnson wrote:

···

On 6/1/06, Joel VanderWerf <vjoel@path.berkeley.edu> wrote:

If ruby moved objects like that (whether T_DATA or T_OBJECT, T_STRING,
etc), it would be a disaster. Every VALUE that referred to the object
(in other words every reference to it in a variable, array, hash, etc.)
would become invalid, since the VALUE type is actually a pointer in
these cases.

You know, I did know that, but it didn't occur to me at the time. Good
point.

I had no doubt that you knew it; we would not have FXRuby otherwise :)

--
      vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Speaking without any knowledge of Ruby's internals, I imagine it is actually just allocating from the end of some pre-allocated buffer until it reaches the end of the buffer. So if you never run out of room in the buffer, the allocation is just incrementing a pointer. When you reach the end you do the first GC, and subsequent allocations have to search the freelist for a big enough chunk.

···

On Jun 1, 2006, at 8:23 PM, John Lam wrote:

So I would guess that Ruby memory allocation is relatively expensive?
Certainly nowhere near as fast as allocating memory off of the "end"
of the heap or the stack, right? Does it have to search a free list of
blocks itself or does it delegate allocation to the system's malloc()
implementation?

Ruby does not use a compacting GC, nor does it manage raw memory itself (the
way a normal memory allocator does).
There are two parts to allocating an object:
* each non-immediate object takes a sizeof(RVALUE)-sized slot (typically 20
  bytes) from one of the heaps managed by ruby (look for RVALUE and heaps in
  gc.c). It's sizeof(RVALUE) for any object so there's no problem with "chunk
  sizes" and fragmentation (iow. all chunks are ~20 bytes long). A freelist
  is used to find unused slots in said heaps. Additional heaps of increasing
  size will be created when there are no free slots or too few were freed in a
  GC run.
* most objects need additional memory (pointed to by fields in their
  corresponding slots): instance variable tables, char* for Strings, VALUE*
  for Arrays... these are allocated with malloc and will be freed when the
  corresponding object is reclaimed.

ruby relies on malloc(3) for low-level allocation, instead of doing it all
with sbrk(2) and friends.
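
The first part of that description (fixed-size slots, a freelist, extra heaps of increasing size) can be sketched as a toy Ruby model. This is not a transcription of gc.c: the class name ToySlotHeap, the initial heap size of 4 and the growth factor of 2 are assumptions picked for illustration, and the second part (per-object memory obtained from malloc) is not modeled at all.

```ruby
# Toy model of fixed-size slot allocation with a free list. The class name,
# sizes and growth factor are assumptions for illustration, not gc.c values.
class ToySlotHeap
  attr_reader :heap_sizes, :slots

  def initialize(first_heap_size = 4, growth = 2)
    @next_heap_size = first_heap_size
    @growth = growth
    @slots = []      # every slot has the same size, so any free slot fits
    @free_list = []  # indices of unused slots
    @heap_sizes = []
    add_heap
  end

  # Take a slot from the free list; add a larger heap if none is free.
  def allocate(payload)
    add_heap if @free_list.empty?
    index = @free_list.pop
    @slots[index] = payload
    index
  end

  # Return a slot to the free list (what a sweep would do for dead objects).
  def release(index)
    @slots[index] = nil
    @free_list.push(index)
  end

  private

  # Append a new heap of slots and put all of them on the free list.
  def add_heap
    start = @slots.size
    @slots.concat(Array.new(@next_heap_size))
    @free_list.concat((start...(start + @next_heap_size)).to_a)
    @heap_sizes << @next_heap_size
    @next_heap_size *= @growth
  end
end

heap = ToySlotHeap.new
indices = 5.times.map { |i| heap.allocate("obj#{i}") }
heap.release(indices[1])
reused = heap.allocate("obj5")

puts heap.heap_sizes.inspect
puts reused == indices[1]
```

Because every slot is the same size, allocation never has to search for a "big enough" chunk: popping any entry off the free list is sufficient, and a released slot is reused by the next allocation.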

···

On Fri, Jun 02, 2006 at 04:57:54PM +0900, Logan Capaldo wrote:

On Jun 1, 2006, at 8:23 PM, John Lam wrote:

>So I would guess that Ruby memory allocation is relatively expensive?
>Certainly nowhere near as fast as allocating memory off of the "end"
>of the heap or the stack, right? Does it have to search a free list of
>blocks itself or does it delegate allocation to the system's malloc()
>implementation?

Speaking without any knowledge of ruby's internals I imagine it's
actually just allocating from the end of some pre-allocated buffer
until it reaches the end of the buffer. So if you never run out of
room in the buffer the allocation is just incrementing a pointer.
When you reach the end you do the first GC and subsequent allocations
have to search the freelist for a big enough chunk.

--
Mauricio Fernandez - http://eigenclass.org - singular Ruby

Interesting. (-- takes notes --). Almost seems like cheating :). But in a good way. I'm going to have to read gc.c. Speaking of reading Ruby source, is there an order you would recommend? Every time I look at it I get overwhelmed by a) not knowing where to start and b) K&R C. I can power through the K&R C for the most part, I think, but figuring out what to read when is tougher.

···

On Jun 2, 2006, at 4:44 AM, Mauricio Fernandez wrote:

On Fri, Jun 02, 2006 at 04:57:54PM +0900, Logan Capaldo wrote:

On Jun 1, 2006, at 8:23 PM, John Lam wrote:

So I would guess that Ruby memory allocation is relatively expensive?
Certainly nowhere near as fast as allocating memory off of the "end"
of the heap or the stack, right? Does it have to search a free list of
blocks itself or does it delegate allocation to the system's malloc()
implementation?

Speaking without any knowledge of ruby's internals I imagine it's
actually just allocating from the end of some pre-allocated buffer
until it reaches the end of the buffer. So if you never run out of
room in the buffer the allocation is just incrementing a pointer.
When you reach the end you do the first GC and subsequent allocations
have to search the freelist for a big enough chunk.

Ruby does not use a compacting GC and doesn't manage memory itself (the way a
normal memory allocator does) either.
There are two parts to allocating an object:
* each non-immediate object takes a sizeof(RVALUE)-sized slot (typically 20
  bytes) from one of the heaps managed by ruby (look for RVALUE and heaps in
  gc.c). It's sizeof(RVALUE) for any object so there's no problem with "chunk
  sizes" and fragmentation (iow. all chunks are ~20 bytes long). A freelist
  is used to find unused slots in said heaps. Additional heaps of increasing
  size will be created when there are no free slots or too few were freed in a
  GC run.
* most objects need additional memory (pointed to by fields in their
  corresponding slots): instance variable tables, char* for Strings, VALUE*
  for Arrays... these are allocated with malloc and will be freed when the
  corresponding object is reclaimed.

ruby relies on malloc(3) for low-level allocation, instead of doing it all
with sbrk(2) and friends.

--
Mauricio Fernandez - http://eigenclass.org - singular Ruby

_why had an interesting article last summer about the internals of
Ruby's memory management and how to use it efficiently:
http://whytheluckystiff.net/articles/theFullyUpturnedBin.html

···

2006/6/2, Logan Capaldo <logancapaldo@gmail.com>:

Interesting. (-- takes notes --). Almost seems like cheating :). But
in a good way. I'm going to have to read gc.c. Speaking of reading ruby
source, is there an order you would recommend? Every time I look at
it I get overwhelmed by a) not knowing where to start and b) K&R C. I
can power through the K&R C for the most part I think, but figuring
out what to read when is tougher.

Hi,

···

On Fri, 02 Jun 2006 20:25:02 +0200, Logan Capaldo <logancapaldo@gmail.com> wrote:

Interesting. (-- takes notes --). Almost seems like cheating :). But in a good way. I'm going to have to read gc.c. Speaking of reading ruby source, is there an order you would recommend? Every time I look at it I get overwhelmed by a) not knowing where to start and b) K&R C. I can power through the K&R C for the most part I think, but figuring out what to read when is tougher.

Have you seen the "Ruby Hacking Guide" translation at http://rhg.rubyforge.org/ ?

It's not complete, but it should definitely get you started.

Dominik

[...]

>ruby relies on malloc(3) for low-level allocation, instead of doing it all
>with sbrk(2) and friends.
>
Interesting. (-- takes notes --). Almost seems like cheating :). But
in a good way. I'm going to have to read gc.c. Speaking of reading ruby
source, is there an order you would recommend? Every time I look at
it I get overwhelmed by a) not knowing where to start and b) K&R C. I
can power through the K&R C for the most part I think, but figuring
out what to read when is tougher.

It depends on what you're interested in (/me slaps self). The easiest starting
points would be array.c, hash.c (st.c if you really want to see the underlying
st_table implementation, but it's just your regular hash table), string.c...
that is, the core data structures. They are very easy to read, but maybe not
that interesting ultimately due to this very straightforwardness.

As for the more interesting stuff, here are some functions to begin with:
* eval.c:
  * rb_eval: the basic AST walker
  * rb_call, rb_get_method_body: method dispatching (+method cache) at work
  * rb_add_method: managing the method tables (m_tbl)
  * rb_include_module: to see how proxy classes (T_ICLASS) work; bits of
    Ruby's object model
  ....
* parse.y: the grammar + yylex (*tricky*)

This is what I answered to a similar question 3 years ago in [74002]:
    
    Ruby Core
     * dln.c: wraps dlopen or the equiv. function of your platform, not very
       interesting
     * gc.c: quite easy to follow, of interest only if you want to know how
       the GC works internally, but it's just mark & sweep doing "common
       sense" things so you can safely skip it.
     * st.c: a hash table implementation used internally by Ruby, quite
       straightforward
     * eval.c: much harder to read as you have to know the node types to
       follow it; several functions are essentially a big switch() statement
       for a node
     * parse.y: this can help you see what different node types correspond
       to by having a look at the grammar.
     * regex.c: whatever, don't read it :)

    some other .c files contain only support code

    Built-in classes
    Take the class you like, scroll down to the Init_xxx() function and
    locate the C function that implements the method you want to study. No
    particular order required.

Hope this helps,

···

On Sat, Jun 03, 2006 at 03:25:02AM +0900, Logan Capaldo wrote:

On Jun 2, 2006, at 4:44 AM, Mauricio Fernandez wrote:

--
Mauricio Fernandez - http://eigenclass.org - singular Ruby

[snip my "homework" for the rest of the summer]

Hope this helps,

It does, thanks.

···

On Jun 5, 2006, at 4:01 PM, Mauricio Fernandez wrote:

--
Mauricio Fernandez - http://eigenclass.org - singular Ruby