Data_Make_Struct Considered Dangerous?

Hi,

I don’t know whether any of us has encountered this kind of problem before
in developing Ruby C extension. The problem is the interaction between
memory allocation, the garbage collector (gc), and the mark
functions. And I think this can occur only in struct/class containing
another struct/class.

This problem can occur only with a slight chance. However, in my case, a
code that has been running for a while suddenly gives a segmentation fault
gave me a great headache.

The problem is like this. I have an outer structure, such as

typedef struct
{
    void  *data1;
    VALUE data2;
} sOuter;

which corresponds to an outer class, such as

obj = Data_Make_Struct (cOuter, sOuter, mark_Outer, free_Outer, ptr);

After creating the outer object, I create an inner object:

....
ptr->data2 = Data_Make_Struct (cInner, sInner, mark_Inner, free, p);
....

which corresponds to an inner structure, such as

typedef struct
{
    int   data3;
    VALUE data4;
}

But this is exactly the problem! After debugging my code, my conclusion
is that because Data_Make_Struct implicitly calls Ruby’s ALLOC, which may
result in the invocation of the gc, in the lines above, THE MARK FUNCTIONS
MAY BE CALLED BEFORE THE STRUCT/OBJECT ITSELF IS SETUP PROPERLY, with a
result of segmentation fault. I have been rather careful by checking the
pointers whether they are NULL in the mark functions (and I initialized
them to NULL), but I don’t think it is a foolproof way.

In the problem above, it just happened that after thousands of
iterations, when Data_Make_Struct (cInner…) was called, the gc is
invoked. Basically the gc tries to invoke rb_gc_mark (data2), which at
that point still does not contain a valid object.

Right now, the solution is simply to add this line:

....
ptr->data2 = Qnil;
ptr->data2 = Data_Make_Struct (cInner...);
....

I don’t know whether this is really all there is to it. At least, for
people who don’t understand, the code may look funny, because it seems
the Qnil assignment is just a waste. To me, rather than dealing with
these intricacies in a much more complex data structure, probably it is
better just to follow these simple principles:

1) NEVER use Data_Make_Struct.  Use Data_Wrap_Struct instead.
2) NEVER use Ruby's ALLOC.  Use, at least, C's malloc instead.

Regarding point 1), because Data_Wrap_Struct will not invoke the gc, it is
safer. Regarding point 2), Ruby’s ALLOC may have the same problem as
Data_Wrap_Struct, i.e., we will never know whether the gc will be invoked
at that point or not. Because the corresponding free function is usually
the C’s standard free () function anyway (there is no Ruby’s FREE
function), probably it is more consistent to use malloc instead of
ALLOC. When we use malloc, we know that the gc will not be invoked and we
can proceed in C as usual, without worrying about the gc trigger and
object states.

Finally, probably it is a good idea to remove Data_Make_Struct and ALLOC
from the Ruby C API. We will not lose any functionality while making the
C extensions safer.

Regards,

Bill

Right now, the solution is simply to add this line:

    ....

ptr-> data2 = Qnil;
ptr-> data2 = Data_Make_Struct (cInner...);

    ....

No, this is not the solution. If you make a right use of Data_Make_Struct
ptr->data2 == NULL before the call to Data_Make_Struct() and you can test
this in the mark functions.

    1) NEVER use Data_Make_Struct. Use Data_Wrap_Struct instead.
    2) NEVER use Ruby's ALLOC. Use, at least, C's malloc instead.

Don't say this : this is *stupid*

First learn how to use the ruby API, and remove *your* bugs in the source
of your extension before trying to say such stupidities.

Guy Decoux

Hi,

my two cents on that issue:

From: William Djaja Tjokroaminata [mailto:billtj@y.glue.umd.edu]
Sent: Tuesday, August 13, 2002 3:50 PM
To: ruby-talk ML
Subject: Data_Make_Struct Considered Dangerous?

Right now, the solution is simply to add this line:

....
ptr->data2 = Qnil;
ptr->data2 = Data_Make_Struct (cInner...);
....

I don’t know whether this is really all there is to it. At least, for
people who don’t understand, the code may look funny, because it seems
the Qnil assignment is just a waste.

To me, this doesn’t look funny, this looks just THE right thing to do:
secure already existing data structures reachable from the GC
before possibly triggering the GC.

Maybe some more information about this issue in README.ext wouldn’t hurt,
but I don’t want a restricted, “safer” C API.
For instance, ALLOC is an useful wrapper on top of malloc,
because if needed, it can free memory by triggering a GC.

– Christian

···

-----Original Message-----

Hi,

···

At Tue, 13 Aug 2002 22:50:08 +0900, William Djaja Tjokroaminata wrote:

1) NEVER use Data_Make_Struct.  Use Data_Wrap_Struct instead.
2) NEVER use Ruby's ALLOC.  Use, at least, C's malloc instead.

Regarding point 1), because Data_Wrap_Struct will not invoke the gc, it is
safer. Regarding point 2), Ruby’s ALLOC may have the same problem as
Data_Wrap_Struct, i.e., we will never know whether the gc will be invoked
at that point or not. Because the corresponding free function is usually
the C’s standard free () function anyway (there is no Ruby’s FREE
function), probably it is more consistent to use malloc instead of
ALLOC. When we use malloc, we know that the gc will not be invoked and we
can proceed in C as usual, without worrying about the gc trigger and
object states.

It sounds strange. Data_Make_Struct() fills the allocated
structure with 0 and rb_gc_mark() ignores 0 which is equal to
Qfalse. Also, since GC may be invoked while Data_Wrap_Struct()
allocates new Data object, your conclusion isn’t so meaningful.

Additionaly, you can use ruby_xfree for free.


Nobu Nakada

Hi,

Based on my experience, I will not use ALLOC unless it is necessary
because of the Ruby gc overhead. Based on what I have learned, when I
write Ruby in C, I will stay away from Ruby gc as far as possible (please
also see my response to Guy). Thanks.

Regards,

Bill

···

======================================================================
Christian Boos cboos@bct-technology.com wrote:

my two cents on that issue:

ptr->data2 = Qnil;
ptr->data2 = Data_Make_Struct (cInner...);
....

To me, this doesn’t look funny, this looks just THE right thing to do:
secure already existing data structures reachable from the GC
before possibly triggering the GC.

Maybe some more information about this issue in README.ext wouldn’t hurt,
but I don’t want a restricted, “safer” C API.
For instance, ALLOC is an useful wrapper on top of malloc,
because if needed, it can free memory by triggering a GC.

– Christian

nobu.nokada@softhome.net wrote in message news:200208132245.g7DMjAr07675@sharui.nakada.kanuma.tochigi.jp

···

It sounds strange. Data_Make_Struct() fills the allocated
structure with 0 and rb_gc_mark() ignores 0 which is equal to
Qfalse.


I think this can only happen when we have a struct/class inside
another struct/class. What I observed is, while we are creating the
inner struct/class, gc gets called and try to mark the outer
struct/class. Are you saying that even in this case it is safe
because the outer struct/class should all be filled with zeros?


Also, since GC may be invoked while Data_Wrap_Struct()
allocates new Data object, your conclusion isn’t so meaningful.


Yes, it is not a foolproof way. It just tries to get the programmer’s
attention and prevent error, because, I think, a good C programming
way is always to fill the data right after we do a memory allocation:

ptr = (myType*) malloc (sizeof (myType));
ptr->data_1 = val;
...
ptr->data_N = val;

(here ptr results from another Data_Make_Struct). This programming
discipline usually works, unless, of course, when we initialize a
member with another Data_Make_Struct():

ptr = (myType*) malloc (sizeof (myType));
ptr->data_1 = val;
...
ptr->data_k = Data_Make_Struct (...);

In this case, the C programmer has to realize that that seemingly
innocent function is actually very complicated, which may invoke gc
which results in the calling of all marking functions. Therefore, in
Ruby the C programmer has to add another discipline of
“double-initializing”:

ptr = (myType*) malloc (sizeof (myType));
ptr->data_1 = val;
...
ptr->data_k = Qnil;    /* or Qfalse actually better? */
ptr->data_k = Data_Make_Struct (...);

Based on your response above, are you saying that this
“double-initializing” is actually not needed? (in which case, the
error is actually generated somewhere else?) I just need some
assertion here, because I cannot fix something that is working or
considered working.


Additionaly, you can use ruby_xfree for free.


I am sorry. In my Ruby-C way, I will stay far from Ruby as much as
possible. So instead of having a combination of ALLOC and ruby_xfree,
I just use (at least) malloc() and free(). One problem of using Ruby
memory allocator is that it is tied to a garbage collector. Yes,
using garbage collector makes a programmer/user’s world less
complicated. However, the drawbacks of garbage collector are, at
least:

1) The calling of marking functions is a kind of waste for objects

that have full life (until the process itself is killed) or long
duration. This kind of objects should not be managed by gc; instead
they should be managed by C.

2) It makes it difficult to do real-time programming (I have seen

a discussion on this in this newsgroup.)

In point 2) above, actually I am not saying that Ruby will never be
appropriate for stuff related to real-time programming. If we design
the software careful enough, we still can provide input spec in Ruby,
and then run the kernel (in C) in real-time (by making GC.disable).
Of course, there will be cases where setting GC.disable is impossible
because of some continual new memory consumption. However, if we
design the software careful enough, we increase the number of cases
where GC can indeed be disabled. Furthermore, even where GC has to be
enabled, our software is much more execution-efficient, because the
calling of marking functions has been minimized; this is in regards
with point 1) above. (Currently I have only one marking function,
which is down from about six when I originally started writing the C
extension; this was accomplished by redesigning the data structure. I
still want to get rid of that one marking function…)

Regards,

Bill

Well, I don't like to have "ptr->data2 = NULL" simply because I usually
use NULL for pointers and currently VALUE is unsigned long. Also (see
below), I don't like to make mark functions too complicated, because they
are pure Ruby overhead.

Please be serious, this just add a test

   if (ptr->data2) rb_gc_mark(ptr->data2);

Therefore, the statements above are consistent with the principle of
staying away from Ruby as much as possible. There is the Ruby way, there
is the C way, and there is the Ruby-C way. :slight_smile:

You have just learned that you don't know how to use the ruby API, and
perhaps ruby is not appropriate for your job (I've never understood what
you're trying to do).

Personnaly, when I use GA they are written in C and not in ruby, because
my simulations can run more than a week.

But I'll never say the stupid thing that you've said, because I've tried
*to learn* the ruby API before using it, something that apparently you
have forgotten to do.

Guy Decoux

Hi,


It sounds strange. Data_Make_Struct() fills the allocated
structure with 0 and rb_gc_mark() ignores 0 which is equal to
Qfalse.


I think this can only happen when we have a struct/class inside
another struct/class. What I observed is, while we are creating the
inner struct/class, gc gets called and try to mark the outer
struct/class. Are you saying that even in this case it is safe
because the outer struct/class should all be filled with zeros?

Yes, I guess so.

Based on your response above, are you saying that this
“double-initializing” is actually not needed? (in which case, the
error is actually generated somewhere else?)

It should be in Data_Make_Struct().

I just need some
assertion here, because I cannot fix something that is working or
considered working.

I can’t say anything more unless see your real code.

1) The calling of marking functions is a kind of waste for objects

that have full life (until the process itself is killed) or long
duration. This kind of objects should not be managed by gc; instead
they should be managed by C.

2) It makes it difficult to do real-time programming (I have seen

a discussion on this in this newsgroup.)

I see, it’s interesting but another story.

···

At Wed, 14 Aug 2002 23:13:11 +0900, Bill Tj wrote:


Nobu Nakada

Hi,

After examining the code further, our inconsistency happened because I did
not give exactly the correct example. Instead of just struct/class
inside another struct/class, it is struct/class inside a struct inside a
struct/class.

Therefore, in the outer struct/class there is a pointer. I allocate
memory for this pointer. When I try to initialize this newly allocated
memory, the gc gets invoked. So I guess both of us are actually correct:

  1. In Data_Make_Struct Ruby fills the resulting struct with zeros.
  2. Because Data_Make_Struct (or even Data_Wrap_Struct) may invoke gc, when
    we initialize struct members with Data_Make_Struct, we have to do “double
    initialization”.

Actually the problem is much more general than this. Before we call any
Ruby function that may invoke gc (such as the innocent looking rb_str_new2
()), we have to make sure that all our data are initialized
properly. This was tricky because it occured inside my initialization
function itself. What I mean is, in C usually I do this:

ptr = (my_type*) malloc (sizeof (my_type));
ptr->data1 = val;
ptr->data2 = rb_str_new2 (...);
...

Now, in Ruby, I have to change my habit to

ptr = (type*) malloc (sizeof (type));
/* Fill in with some generic, default values */
ptr->data1 = 0;
ptr->data2 = Qfalse;
...
/* Fill in with the actual data */
ptr->data1 = val;
ptr->data2 = rb_str_new2 (...); /* may invoke gc */
...

Well, I guess so far the codes that I have dealt with are not as
convoluted as the gc concept. On the other hand, this supports my
assertion that ALLOC () is much more dangerous than malloc (). In ALLOC,
we have to make sure that all data are already proper; in malloc () there
is no possibility of jumping around at that point. Finally, because
Data_Make_Struct really calls ALLOC, then Data_Make_Struct is
indeed dangerous.

Regards,

Bill

···

============================================================================
nobu.nokada@softhome.net wrote:


It sounds strange. Data_Make_Struct() fills the allocated
structure with 0 and rb_gc_mark() ignores 0 which is equal to
Qfalse.


I think this can only happen when we have a struct/class inside
another struct/class. What I observed is, while we are creating the
inner struct/class, gc gets called and try to mark the outer
struct/class. Are you saying that even in this case it is safe
because the outer struct/class should all be filled with zeros?

Yes, I guess so.

Based on your response above, are you saying that this
“double-initializing” is actually not needed? (in which case, the
error is actually generated somewhere else?)

It should be in Data_Make_Struct().

Perhaps Ruby is not the tool you should be using.

This sounds like a classic case of putting a square peg into a round
hole.

Paul

···

On Wed, Aug 14, 2002 at 02:10:22AM +0900, William Djaja Tjokroaminata wrote:

Using the Ruby-C way, currently my simulation can run for more then 3 days
with stable memory. Whatever people say about one way or another, I stand
by my numbers. Currently the objective is to achieve close to 1
million events/sec. So far the bottlenecks are the Ruby gc, the mark
functions, and memory allocation/deallocation. So one objective in
the software architecture will be to be able to run the code with
GC.disable.

Pure Ruby                       : 31,000 events/sec
Straight Ruby-to-C translation  : 160,000 events/sec
Some Ruby-C way                 : 222,000 events/sec
(So far) The Furthest Ruby-C way: 312,000 events/sec

I’ve seen you bandy these “events/sec” figures around for a couple of
weeks.

For the unititated (me), could you tell me what an “event” is, and
how you are measuring them?

···

=====

Use your computer to help find a cure for cancer: http://members.ud.com/projects/cancer/

Yahoo IM: michael_s_campbell


Do You Yahoo!?
HotJobs - Search Thousands of New Jobs
http://www.hotjobs.com

is no possibility of jumping around at that point. Finally, because
Data_Make_Struct really calls ALLOC, then Data_Make_Struct is
indeed dangerous.

Data_Make_Struct() is *not* dangerous. The goal of Data_Make_Struct() is
to call ALLOC(), initialize all members of the struct to zero then call
Data_Wrap_Struct()

If you have not understood this : don't use it.

Guy Decoux

Actually, at least for now, Ruby is my ideal tool. So far I have
developed codes in Ruby, they work, and they are wonderful. However,
sometimes I get my result in 1 day. Therefore, when I am able to increase
my code efficiency from 31,000 events/sec to 310,000 events/sec, I should
get my result in 2.4 hours instead.

This is the classic interface/kernel software architecture. I can have
both the interface and the kernel in Ruby, or I can have the interface in
Ruby but the kernel in C. Therefore, system specification and
configuration can be done in Ruby, but the real computations are done in
C. Another (light) example probably is the NumPy class in Python, where
the input is done in Python, but the heavy duty numerical processing is
done in C.

The most valuable lesson that I have learned is that, when transforming my
kernel from Ruby to C, I cannot just do simple translation from Ruby to C,
but instead I have to stay away from Ruby as far as possible, or in other
words, I need to design the kernel very differently, creating Ruby
objects and interfacing with Ruby only as necessary.

I keep both versions of my kernel, so that my input scripts can have

require 'my_kernel_ruby_version'

or

require 'my_kernel_c_version'

In this way, I can always compare the output results for consistency while
seeing a speed up in execution.

Regards,

Bill

···

==========================================================================
Paul Brannan pbrannan@atdesk.com wrote:

Perhaps Ruby is not the tool you should be using.

This sounds like a classic case of putting a square peg into a round
hole.

Paul

Oh, I am writing a discrete event simulator that will be used to simulate
communications network (Ethernet, ATM, TCP, routing, etc.) An event is
equivalent to a (non-preemptible) interrupt in the corresponding
object. I measured how many of these interrupts I can process within the
real time.

For an example of network simulator, see
The Network Simulator - ns-2, where they use Tcl/OTcl as the
interface and C++ as the kernel. However, their model is rather
different; they force the “split-object” model, where for each new object
that you create, you need to create a Tcl class and a C++ class.

I am interested in having the kernel in C++ too, but browsing the
discussions in this newsgroup, I don’t think it is worth it, at least
for now; this Ruby C API gives me a headache already. I need to move on.

Regards,

Bill

···

=========================================================================
Michael Campbell michael_s_campbell@yahoo.com wrote:

Pure Ruby                       : 31,000 events/sec
Straight Ruby-to-C translation  : 160,000 events/sec
Some Ruby-C way                 : 222,000 events/sec
(So far) The Furthest Ruby-C way: 312,000 events/sec

I’ve seen you bandy these “events/sec” figures around for a couple of
weeks.

For the unititated (me), could you tell me what an “event” is, and
how you are measuring them?

=====

Use your computer to help find a cure for cancer: http://members.ud.com/projects/cancer/

Yahoo IM: michael_s_campbell


Do You Yahoo!?
HotJobs - Search Thousands of New Jobs
http://www.hotjobs.com

Surely I understand it now with this whole business of dynamic memory
management by garbage collector.

Understand my logic: ALLOC is “dangerous” because it is not a
straightforward function as malloc. Because Data_Make_Struct also calls
ALLOC, then it is also dangerous. Not only Data_Make_Struct, but all
the seemingly innocent Ruby function that may allocate memory are, in some
sense, dangerous. All C programmers who never dealt with garbage
collector before have to know about this issue of double initialization,
because the bug may be difficult to trace. Therefore:

"If I can use malloc instead of ALLOC to accomplish the same purpose,
why should I ever need to use ALLOC?"

It seems the word “dangerous” has been taken beyond its intended
interpretation. Remember the phrase “Goto Considered Harmful” or “Java
new Considered Harmful”? I have changed it from “harmful” to “dangerous”,
but it seems someone is realy upset about the word “dangerous”. We are
not talking about technicals anymore, but just personal stuff. What a
waste…

Another example: If you responded to the statement “never use ALLOC; use
malloc instead” with “it is not a good idea, because of this, this,
…”, you would have contributed to the body of knowledge; however, when
you just answer “it is stupid” without giving any reason, it is really a
waste… you accomplish nothing: it does not help me or anybody else.

Bill

···

===========================================================================
ts decoux@moulon.inra.fr wrote:

Data_Make_Struct() is not dangerous. The goal of Data_Make_Struct() is
to call ALLOC(), initialize all members of the struct to zero then call
Data_Wrap_Struct()

If you have not understood this : don’t use it.

Guy Decoux

This is a problem, though, because setting a struct to all zeroes
is not the same as initializing all members of the struct. It is
impossible to write a macro in C that can properly initialize all C
structs in a generic manner.

In C++, for example, a class is not initialized until its constructor
has completed. I might find it convenient to use Data_Make_Struct and
then to use placement new to initialize the object, but this would be a
bug. Instead, I should call ALLOC, use placement new on the allocated
structure, then call Data_Wrap_Struct to wrap the object. I have to be
very careful of exception-safety issues if I do this.

The same applies in C to any structure that has non-trivial
initialization. If you have non-trivial initialization, then perhaps
Data_Wrap_Struct is more appropriate for your application. I agree with
ts that THIS DOES NOT make Data_Make_Struct dangerous; it is only
dangerous when used incorrectly.

Paul

···

On Thu, Aug 15, 2002 at 11:44:59PM +0900, ts wrote:

Data_Make_Struct() is not dangerous. The goal of Data_Make_Struct()
is to call ALLOC(), initialize all members of the struct to zero then
call Data_Wrap_Struct()

OOohh! Ok. I’m sorry, I thought this “event” was some generic
metric that could be applied to any arbitrary program, not some
construct specific to your application.

It makes more sense now; thanks.

···

— William Djaja Tjokroaminata billtj@z.glue.umd.edu wrote:

Oh, I am writing a discrete event simulator that will be used to
simulate
communications network (Ethernet, ATM, TCP, routing, etc.) An
event is
equivalent to a (non-preemptible) interrupt in the corresponding
object. I measured how many of these interrupts I can process
within the
real time.

=====

Use your computer to help find a cure for cancer: http://members.ud.com/projects/cancer/

Yahoo IM: michael_s_campbell


Do You Yahoo!?
HotJobs - Search Thousands of New Jobs
http://www.hotjobs.com

Two things:

  1. First, it is not the intention to make a clear-cut judgment whether
    Data_Make_Struct IS dangerous or Data_Make_Struct IS NOT dangerous. As
    written below, “… it is only dangerous when used incorrectly”. By the
    same token, statement such as “Goto Considered Harmful” is then also
    false, Goto is only harmful when used incorrectly or unwisely. The
    intention is to bring up the issue that may occur during this memory
    allocation with respect to garbage collection. As written below, in C++
    we have to use Data_Wrap_Struct instead of Data_Make_Struct.

  2. Second, the “danger” in Data_Make_Struct comes from ALLOC and even from
    the Data_Wrap_Struct itself, which may invoke the garbage
    collector. However, I still do not get the answer why I ever need to use
    ALLOC instead of malloc and its derivatives. For example, ignoring
    run-out-of-memory problem, a C statement such as

    ptr = ALLOC (my_struct);

may create segmentation fault at this point (if some of the memory values
are not proper) while a C statement such as

ptr = (my_struct*) malloc (sizeof (my_struct));

will never create segmentation fault (at this point). Why do we ever need
to use ALLOC (which is tied to a garbage collector) instead of malloc and
its derivatives?

Regards,

Bill

···

=========================================================================
Paul Brannan pbrannan@atdesk.com wrote:

On Thu, Aug 15, 2002 at 11:44:59PM +0900, ts wrote:

Data_Make_Struct() is not dangerous. The goal of Data_Make_Struct()
is to call ALLOC(), initialize all members of the struct to zero then
call Data_Wrap_Struct()

This is a problem, though, because setting a struct to all zeroes
is not the same as initializing all members of the struct. It is
impossible to write a macro in C that can properly initialize all C
structs in a generic manner.

In C++, for example, a class is not initialized until its constructor
has completed. I might find it convenient to use Data_Make_Struct and
then to use placement new to initialize the object, but this would be a
bug. Instead, I should call ALLOC, use placement new on the allocated
structure, then call Data_Wrap_Struct to wrap the object. I have to be
very careful of exception-safety issues if I do this.

The same applies in C to any structure that has non-trivial
initialization. If you have non-trivial initialization, then perhaps
Data_Wrap_Struct is more appropriate for your application. I agree with
ts that THIS DOES NOT make Data_Make_Struct dangerous; it is only
dangerous when used incorrectly.

Paul