Matz, if you're reading, please scan this email

$ time rubytest -F -i 0 -n xml_parser4
./tests/tc_xml_parser4.rb:12: [BUG] Segmentation fault
ruby 1.7.3 (2002-09-27) [i386-freebsd5]
Abort (core dumped)

Not really convinced that this is a bug in ruby.

And it wasn’t either. :) I’ll gladly explain to anyone in detail what
was going on, but the good news is that things are mostly fixed and
will be completely fixed in the next 48hrs or so (going to be pretty
busy for the next few days).

The short and skinny: I wasn’t properly using ruby’s mark/sweep
implementation... in fact, I wasn’t using it at all. In a few places I
was making bogus assumptions about the way that Ruby’s GC worked,
which I shouldn’t have done. I didn’t realize that Data_Wrap_Struct()
registered the wrapped ptr with the GC. Anyway, after matz explained
mark/sweep to me last night, I mostly understood what I had done wrong
and had fixed things in about 30min. :) This morning I had the big
“ah ha!” and now fully grok how I need to use mark/sweep. From what I
can tell, here’s why I was seeing core dumps inside of ruby’s
interpreter:

In a few places I was using xmlCopyNode(). When those copied nodes were
free()'ed by Ruby’s GC, the free routine was also unlinking and freeing
the underlying libxml node. Since the document wasn’t being marked, it
was being cleaned up, leaving the NodePtr with many dangling pointers.
So when ruby came along, allocated a new object that reused the memory
of the free()'ed document, and then GC’ed the node,
xmlFreeNode()/xmlUnlinkNode() was corrupting the parse tree and other
various important bits. After about 30min of work in crudely marking
dependency objects, I was able to run rubytest -i0 -F for 15min.
There’s still a leak somewhere, which I was able to track down but
haven’t fixed yet.
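
To make the failure mode concrete, here is a minimal sketch in plain C of
a toy mark/sweep pass. It is not Ruby's GC and not the libxml binding's
code: the Obj struct, the marks_dep flag, and the doc_survives() helper
are hypothetical names made up for illustration. In a real extension the
equivalent step would be a mark function, passed to Data_Wrap_Struct(),
that calls rb_gc_mark() on the document's VALUE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object standing in for a wrapped C struct (hypothetical). */
typedef struct Obj {
    bool marked;      /* set during the mark phase */
    bool freed;       /* set instead of calling free(), so it can be inspected */
    struct Obj *dep;  /* object this one depends on (node -> document) */
    bool marks_dep;   /* does this object's mark function mark its dependency? */
} Obj;

/* Mark phase: mark an object and, only if it says so, its dependency. */
static void mark(Obj *o) {
    if (o == NULL || o->marked) return;
    o->marked = true;
    if (o->marks_dep) mark(o->dep);
}

/* Sweep phase: anything left unmarked is "freed"; marks are then cleared. */
static void sweep(Obj **heap, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (heap[i]->freed) continue;
        if (!heap[i]->marked) heap[i]->freed = true;
        heap[i]->marked = false;
    }
}

/* One GC cycle in which only the node is reachable from the roots.
 * Returns true if the document is still alive afterwards. */
bool doc_survives(bool node_marks_doc) {
    Obj doc  = { false, false, NULL, false };
    Obj node = { false, false, &doc, node_marks_doc };
    Obj *heap[]  = { &doc, &node };
    Obj *roots[] = { &node };

    for (size_t i = 0; i < 1; i++) mark(roots[i]);
    sweep(heap, 2);

    assert(!node.freed);  /* the node is a root, so it must survive */
    return !doc.freed;
}
```

With node_marks_doc set to false, the document is swept while the node
still holds a pointer to it, which is the dangling-pointer situation
described above; with it set to true, the document stays alive as long
as the node does.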

And that’s all she wrote folks. libxml should be mod_ruby safe in two
or three days time. Big thanks to Matz, Guy, and to Tobias Peters
for their help. -sc

PS Please note that this thread was close to, or under 10msgs long.
::grin::


Sean Chittenden

Tanaka Akira wrote:

Some reports from valgrind are due to Ruby’s conservative GC, which
touches the entire C stack region.

I use the following suppression file to suppress such reports.

<… valgrind suppression file for Ruby snipped …>

Thanks very much for posting this!

which I shouldn't have done. I didn't realize that Data_Wrap_Struct()
                                                   ^^^^^^^^^^^^^^^^^^

A last point: try to forget Data_Wrap_Struct(). You are in a case where
it's best to use Data_Make_Struct().

I can explain the memory leak in your extension simply by the fact that
you use Data_Wrap_Struct() rather than Data_Make_Struct().

Guy Decoux