./dict.rb:316: [BUG] gc_sweep(): unknown data type 48
ruby 1.6.8 (2002-12-24) [i386-linux]
Aborted
I get the error above every once in a while when running, under Ruby 1.6.8,
a script that adds hints (like jisyo.org) to HTML files. This didn’t happen
in 1.6.7 and, AFAIK, doesn’t in 1.7.2.
I don’t know how to reproduce the error predictably; testing under the
same conditions is difficult, as the script is heavily multi-threaded
(it makes concurrent connections to a dictd server to get definitions in
parallel as it processes the document) and I cannot control the precise
timing of the dictd responses. I am not using any custom extensions
(just digest/md5 and strscan which should hopefully be correct).
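For readers who want something concrete, here is a minimal sketch of that kind of structure: one thread per lookup, with a lock around the shared result table. It is not the script discussed in this thread; `lookup`, the word list, and the `definitions` hash are hypothetical stand-ins, and no dictd server is contacted.

```ruby
# Hypothetical stand-in for a dictd query; a real client would talk to a server.
def lookup(word)
  "definition of #{word}"
end

words = %w[haus baum zeit]   # made-up input
definitions = {}
lock = Mutex.new

threads = words.map do |word|
  Thread.new(word) do |w|
    result = lookup(w)
    # Guard the shared hash so concurrent writers cannot interleave.
    lock.synchronize { definitions[w] = result }
  end
end
threads.each(&:join)

definitions.each { |word, text| puts "#{word}: #{text}" }
```

Pure-Ruby mistakes in code like this normally surface as exceptions or wrong results, not as an interpreter abort.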
Just wanted to know if the above error indicates for sure (1) a bug in Ruby
or (2) if I could be causing it by having something wrong in my code (say,
wrt. threading and synchronization issues).
If it is (1) I could try to isolate the error, but it won’t be easy for
the aforementioned reasons. The source code as it stands now wouldn’t be
very helpful as it isn’t really small enough (~900 LOC).
> Just wanted to know if the above error indicates for sure (1) a bug in Ruby
> or (2) if I could be causing it by having something wrong in my code (say,
> wrt. threading and synchronization issues).
I guess it should be (1), a bug around GC.
> If it is (1) I could try to isolate the error, but it won’t be easy for
> the aforementioned reasons. The source code as it stands now wouldn’t be
> very helpful as it isn’t really small enough (~900 LOC).
Do you have a core from the crash?
At Sat, 1 Mar 2003 19:12:40 +0900, Mauricio Fernández wrote:
I’m working on it; my ulimit -c was 0 and I just fixed it, but the
thing just doesn’t want to fail now. It will eventually: I’m setting it
up as a proxy so that I can stress it while browsing.
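In the shell the usual fix is `ulimit -c unlimited`. As a rough sketch, the same limit can be inspected and raised from inside a Ruby process, assuming a Unix-like system and a Ruby that provides `Process.getrlimit` and `Process.setrlimit` (to my knowledge the 1.6 series discussed here does not).

```ruby
# Assumption: Unix-like system, Ruby with Process.getrlimit/setrlimit.
soft, hard = Process.getrlimit(Process::RLIMIT_CORE)
puts "core limit before: soft=#{soft} hard=#{hard}"

# Raise the soft limit as far as the hard limit allows; processes started
# afterwards (for example via system) inherit the new value.
Process.setrlimit(Process::RLIMIT_CORE, hard, hard)

new_soft, _new_hard = Process.getrlimit(Process::RLIMIT_CORE)
puts "core limit after:  soft=#{new_soft}"
```

A soft limit of 0 means no core file is written at all, which matches the situation described above.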
Finally got one. It’s 8MB, but I’m using the following script to try to
make Ruby dump a smaller core:
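loop do
  # Argument lists (no shell) keep the space in the file name intact.
  system("cp", "/tmp/SPIEGEL ONLINE.html", "/tmp/tst1.html")
  system("ruby", "wwwhints.rb", "/tmp/tst1.html")
  if $? != 0
    # have a core, yuppie
    system("mv", "core", "co/core.#{Time.new.to_i}")
  end
end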
[please tell me if this can affect in any undesirable way the generated
core]
I will try to get a core from a binary with debug information.
Here’s some info from gdb (sorry, have no debug info yet, will work on
it):
(gdb) where
#0  0x4012fc01 in kill () from /lib/libc.so.6
#1  0x4012fa12 in raise () from /lib/libc.so.6
#2  0x40130c3e in abort () from /lib/libc.so.6
#3  0x4003389d in rb_bug () from /usr/lib/libruby.so.1.6
#4  0x400810b7 in ruby_posix_signal () from /usr/lib/libruby.so.1.6
#5  0x4012fb88 in sigaction () from /lib/libc.so.6
#6  0x4008f9ba in rb_free_generic_ivar () from /usr/lib/libruby.so.1.6
#7  0x4004e02d in rb_gc_force_recycle () from /usr/lib/libruby.so.1.6
#8  0x4004dc05 in rb_gc_mark () from /usr/lib/libruby.so.1.6
#9  0x4004e1dc in rb_gc () from /usr/lib/libruby.so.1.6
#10 0x4004cc65 in ruby_xmalloc () from /usr/lib/libruby.so.1.6
#11 0x40073a11 in ruby_re_compile_pattern () from /usr/lib/libruby.so.1.6
#12 0x40070761 in rb_reg_mbclen2 () from /usr/lib/libruby.so.1.6
#13 0x400714d5 in rb_reg_match_last () from /usr/lib/libruby.so.1.6
#14 0x4007160c in rb_reg_new () from /usr/lib/libruby.so.1.6
#15 0x4003ac2b in rb_alias () from /usr/lib/libruby.so.1.6
#16 0x4003aa2b in rb_alias () from /usr/lib/libruby.so.1.6
#17 0x400389ab in rb_alias () from /usr/lib/libruby.so.1.6
#18 0x4003865a in rb_alias () from /usr/lib/libruby.so.1.6
#19 0x40038edb in rb_alias () from /usr/lib/libruby.so.1.6
#20 0x4003ec4e in rb_stack_check () from /usr/lib/libruby.so.1.6
#21 0x4003f13c in rb_stack_check () from /usr/lib/libruby.so.1.6
#22 0x4003a012 in rb_alias () from /usr/lib/libruby.so.1.6
#23 0x400399b8 in rb_alias () from /usr/lib/libruby.so.1.6
#24 0x400397b5 in rb_alias () from /usr/lib/libruby.so.1.6
#25 0x4003865a in rb_alias () from /usr/lib/libruby.so.1.6
#26 0x4003cccd in rb_iterator_p () from /usr/lib/libruby.so.1.6
#27 0x40047a13 in rb_thread_scope_shared_p () from /usr/lib/libruby.so.1.6
#28 0x400478f1 in rb_thread_stop_timer () from /usr/lib/libruby.so.1.6
#29 0x40047ac0 in rb_thread_scope_shared_p () from /usr/lib/libruby.so.1.6
#30 0x4003e58d in rb_stack_check () from /usr/lib/libruby.so.1.6
#31 0x4003e929 in rb_stack_check () from /usr/lib/libruby.so.1.6
#32 0x4003f13c in rb_stack_check () from /usr/lib/libruby.so.1.6
#33 0x4003f48d in rb_funcall2 () from /usr/lib/libruby.so.1.6
#34 0x400417e1 in rb_obj_call_init () from /usr/lib/libruby.so.1.6
#35 0x40047a4c in rb_thread_scope_shared_p () from /usr/lib/libruby.so.1.6
#36 0x4003e58d in rb_stack_check () from /usr/lib/libruby.so.1.6
#37 0x4003e929 in rb_stack_check () from /usr/lib/libruby.so.1.6
#38 0x4003f13c in rb_stack_check () from /usr/lib/libruby.so.1.6
#39 0x4003a012 in rb_alias () from /usr/lib/libruby.so.1.6
#40 0x400393cd in rb_alias () from /usr/lib/libruby.so.1.6
* string.c (rb_str_dup): set FL_EXIVAR when copied generic ivar.
(ruby-bugs-ja:PR#400)
Do you set an instance variable on a String object, or mix in a
module that does so, and then add another instance variable
to a String duplicated from it?
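To make the question concrete, the pattern being asked about looks roughly like the sketch below. The variable names and values are invented for illustration, and the snippet only shows the sequence of operations; it is not claimed to trigger the crash.

```ruby
# String.new gives an unfrozen string regardless of frozen-literal settings.
original = String.new("hello")
original.instance_variable_set(:@note, "set on the original")

# dup carries the instance variable over to the copy...
copy = original.dup

# ...and the question is whether a further one is then added to that copy.
copy.instance_variable_set(:@extra, "added to the duplicate")

p original.instance_variables
p copy.instance_variables
```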
You mean adding instance variables to a String? I don’t do that…
And I only use String#dup once in my script (to duplicate $1) so…
Moreover something very strange is happening:
The bug happens about once every 10 runs when using the binary provided
by my distribution (Debian sid => 1.6.8 from Dec. 24 2002).
However, it didn’t happen in ~1500 runs with a binary I compiled
myself (exactly the same version, 1.6.8 Dec. 24) in order to get the
debugging info for you.
AFAIK there are only three differences between my binary and the system’s
one:
* mine is built for i686-linux, Debian’s for i386-linux => perhaps there’s
  a bug in the gcc I’m using (gcc version 2.95.4 20011002 (Debian prerelease))
* mine is installed in $HOME/usr; I built strscan for it and installed it there
* Debian uses --enable-shared and I do not. Could this affect anything?
I cannot try a CVS snapshot until I get a predictable way to reproduce
the errors in the binaries I build…
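One way to compare two builds like these is to dump the configuration each interpreter recorded at build time and diff the output. The sketch below assumes a Ruby where the module is called `RbConfig`; as far as I know, older releases such as 1.6 exposed the same table as `Config::CONFIG`. The choice of keys is mine.

```ruby
require 'rbconfig'

# Keys that map onto the differences listed above; a key the build did not
# record is printed as "(not set)".
keys = %w[arch CC CFLAGS ENABLE_SHARED prefix configure_args]

report = keys.map do |key|
  value = RbConfig::CONFIG[key]
  shown = (value.nil? || value.empty?) ? "(not set)" : value
  format("%-15s %s", key, shown)
end

puts report
```

Running it once with each binary and comparing the two listings shows whether the target architecture, compiler, flags, or shared-library setting differ.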
I finally found a way to reproduce the bug with some predictability
(it only happens when Ruby is compiled with gcc 3.2; could it be that the
compiler is broken?).
If it is a bug, it is a new one.
Converting /tmp/tst1.html
./dict.rb:316: [BUG] gc_sweep(): unknown data type 28
ruby 1.6.8 (2003-02-28) [i386-linux]
This time I got a core dump with all the info:
#0  0x40094c01 in kill () from /lib/libc.so.6
#1  0x40094a12 in raise () from /lib/libc.so.6
#2  0x40095c3e in abort () from /lib/libc.so.6
#3  0x080ad77a in rb_bug (fmt=0x80b1300 "gc_sweep(): unknown data type %d") at error.c:180
#4  0x08066fbf in obj_free (obj=1075318916) at gc.c:949
#5  0x08066d61 in gc_sweep () at gc.c:767
#6  0x080671d2 in rb_gc () at gc.c:1076
#7  0x08065fad in ruby_xmalloc (size=1024) at gc.c:94
#8  0x08088d69 in ruby_re_compile_pattern (pattern=0x84d4168 "^250", size=4, bufp=0x84d4178) at regex.c:2416
#9  0x08085f6a in make_regexp (s=0x84d4168 "^250", len=4, flag=0) at re.c:400
#10 0x08086b74 in rb_reg_initialize (obj=1075579232, s=0x84d4168 "^250", len=4, options=0) at re.c:866
#11 0x08086c94 in rb_reg_new (s=0x84d4168 "^250", len=4, options=0) at re.c:886
…
What more info can I give you?
On Sun, Mar 02, 2003 at 12:55:48AM +0900, Yukihiro Matsumoto wrote:
Hi,
In message “Re: [BUG] gc_sweep(): unknown data type 48”
on 03/03/01, Mauricio Fernández batsman.geo@yahoo.com writes:
> I will try to get a core from a binary with debug information.
> Here’s some info from gdb (sorry, have no debug info yet, will work on
> it):
Thank you. It seems to be a bug that Nobu fixed recently. Try the
latest stable-snapshot (or the ruby_1_6 branch in CVS), and see how it
goes.
>> Do you set an instance variable on a String object, or mix in a
>> module that does so, and then add another instance variable
>> to a String duplicated from it?
> You mean adding instance variables to a String? I don’t do that…
> And I only use String#dup once in my script (to duplicate $1) so…
Yes, that’s what I meant. Hmmm, it might not be that; more information is
required.
> Moreover something very strange is happening:
> the bug happens about once every 10 runs when using the binary provided
> by my distribution (Debian sid => 1.6.8 from Dec. 24 2002).
> However, it didn’t happen in ~1500 runs with a binary I compiled
> myself (exactly the same version, 1.6.8 Dec. 24) in order to get the
> debugging info for you.
> AFAIK there are only three differences between my binary and the system’s
> one:
> mine is built for i686-linux, Debian’s for i386-linux => perhaps there’s
> a bug in the gcc I’m using (gcc version 2.95.4 20011002 (Debian prerelease))
Unless you set CFLAGS when running configure and make, that wouldn’t have
any effect.
> mine is installed in $HOME/usr; I built strscan for it and installed it there
> Debian uses --enable-shared and I do not. Could this affect anything?
These two, and the gcc version, may affect memory locations and GC.
At Sun, 2 Mar 2003 03:47:37 +0900, Mauricio Fernández wrote:
At Mon, 3 Mar 2003 03:33:11 +0900, Mauricio Fernández wrote:
> I finally found a way to reproduce the bug with some predictability
> (it only happens when Ruby is compiled with gcc 3.2; could it be that the
> compiler is broken?).
> If it is a bug, it is a new one.
> Converting /tmp/tst1.html
> ./dict.rb:316: [BUG] gc_sweep(): unknown data type 28
> ruby 1.6.8 (2003-02-28) [i386-linux]
> This time I got a core dump with all the info:
> #0  0x40094c01 in kill () from /lib/libc.so.6
> #1  0x40094a12 in raise () from /lib/libc.so.6
> #2  0x40095c3e in abort () from /lib/libc.so.6
> #3  0x080ad77a in rb_bug (fmt=0x80b1300 "gc_sweep(): unknown data type %d") at error.c:180
> #4  0x08066fbf in obj_free (obj=1075318916) at gc.c:949
What’s this obj’s content? Try `rp obj' with the attached ".gdbinit".
I’m having a problem getting it: if I rebuild the Debian package (so
that everything but the compiler is identical), the bug no longer appears.
I will try to build with the very same compiler the packager used to see
if I can get this to fail again; I’ll then move on to add debugging
information.