[BUG] gc_sweep(): unknown data type 48

I’m getting errors like the following:

./dict.rb:316: [BUG] gc_sweep(): unknown data type 48
ruby 1.6.8 (2002-12-24) [i386-linux]
Aborted

every once in a while when running, under Ruby 1.6.8, a script that adds
hints (like jisyo.org) to HTML files. This didn’t happen in 1.6.7 and,
AFAIK, doesn’t happen in 1.7.2.

I don’t know how to reproduce the error predictably; testing on the
same conditions is difficult, as the script is heavily multi-threaded
(it makes concurrent connections to a dictd server to get definitions in
parallel as it processes the document) and I cannot control the precise
timing of the dictd responses. I am not using any custom extensions
(just digest/md5 and strscan which should hopefully be correct).

Just wanted to know if the above error indicates for sure (1) a bug in Ruby
or (2) if I could be causing it by having something wrong in my code (say,
wrt. threading and synchronization issues).

If it is (1) I could try to isolate the error, but it won’t be easy for
the aforementioned reasons. The source code as it stands now wouldn’t be
very helpful as it isn’t really small enough (~900 LOC).

Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

I’m sorry, Jost, but you are an unbearable troll.
What’s that supposed to mean? You’re insulting the trolls!
– de.comp.os.unix.linux.misc

Hi,

At Sat, 1 Mar 2003 19:12:40 +0900, Mauricio Fernández wrote:
> Just wanted to know if the above error indicates for sure (1) a bug in Ruby
> or (2) if I could be causing it by having something wrong in my code (say,
> wrt. threading and synchronization issues).

I guess it should be (1), a bug around GC.

> If it is (1) I could try to isolate the error, but it won’t be easy for
> the aforementioned reasons. The source code as it stands now wouldn’t be
> very helpful as it isn’t really small enough (~900 LOC).

Do you have a core from the crash?

Nobu Nakada

I’m working on it; my ulimit -c was 0 and I just fixed it, but the
thing just doesn’t want to fail now. It will eventually: I’m setting it
up as a proxy so I can stress it while browsing :-)
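For reference, the core-file size limit mentioned above (`ulimit -c`) can also be inspected and raised from within Ruby. This is a minimal sketch, assuming a current Ruby on a platform where `Process::RLIMIT_CORE` is defined (Linux is one); the helper name `enable_core_dumps` is made up for illustration and is not part of the script under discussion.

```ruby
# Hypothetical helper: raise the soft core-file limit to the hard limit
# so that a crashing process started from here may leave a core file.
def enable_core_dumps
  soft, hard = Process.getrlimit(Process::RLIMIT_CORE)
  # An unprivileged process may raise its soft limit up to the hard limit.
  Process.setrlimit(Process::RLIMIT_CORE, hard, hard) if soft != hard
  # Return the limits now in effect as [soft, hard].
  Process.getrlimit(Process::RLIMIT_CORE)
end

soft, hard = enable_core_dumps
puts "core limit: soft=#{soft} hard=#{hard}"
```

Child processes inherit the limit, so calling this before spawning the interpreter under test has the same effect as running `ulimit -c` in the parent shell. If the hard limit itself is 0, no core will be written either way.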

On Sat, Mar 01, 2003 at 07:26:58PM +0900, nobu.nokada@softhome.net wrote:
> Do you have a core from the crash?

Debian is like Suse with yast turned off, just better. :-)
– Goswin Brederlow

Finally got one. It’s 8MB, but I’m using the following script to try to
make Ruby dump a smaller core:

    loop do
      # backticks run each command in a subshell; $? holds the last status
      `cp "/tmp/SPIEGEL ONLINE.html" /tmp/tst1.html`
      `ruby wwwhints.rb /tmp/tst1.html`
      if $? != 0
        # have a core, yuppie
        `mv core co/core.#{Time.new.to_i}`
      end
    end

[please tell me if this can affect the generated core in any
undesirable way]
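The loop above treats any non-zero status as a crash. A variant that only reacts when the child was killed by a signal could look like the following. It is a minimal sketch, assuming a current Ruby where `$?` is a `Process::Status` (in 1.6 it is a plain integer); `run_until_crash` and its parameters are hypothetical names, and the self-killing child is only a stand-in for a crashing interpreter.

```ruby
require "rbconfig"

# Hypothetical helper: run +cmd+ (an argv array) up to +max_runs+ times and
# return the 1-based number of the first run killed by a signal, or nil.
def run_until_crash(cmd, max_runs)
  1.upto(max_runs) do |n|
    system(*cmd)
    status = $?
    # signaled? is true when the child died from a signal; abort() ends a
    # process that way (SIGABRT), an ordinary failing exit does not.
    if status.signaled?
      warn "run #{n}: signal #{status.termsig}, core dumped: #{status.coredump?}"
      return n
    end
  end
  nil
end

# Stand-in for a crashing program: a child that kills itself immediately.
crasher = [RbConfig.ruby, "-e", "Process.kill(:KILL, Process.pid); sleep 1"]
p run_until_crash(crasher, 3)
```

`Process::Status#coredump?` reports whether the kernel actually wrote a core, which avoids moving a stale `core` file left over from an earlier run.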

I will try to get a core from a binary with debug information.

Here’s some info from gdb (sorry, have no debug info yet, will work on
it):

(gdb) where
#0 0x4012fc01 in kill () from /lib/libc.so.6
#1 0x4012fa12 in raise () from /lib/libc.so.6
#2 0x40130c3e in abort () from /lib/libc.so.6
#3 0x4003389d in rb_bug () from /usr/lib/libruby.so.1.6
#4 0x400810b7 in ruby_posix_signal () from /usr/lib/libruby.so.1.6
#5 0x4012fb88 in sigaction () from /lib/libc.so.6
#6 0x4008f9ba in rb_free_generic_ivar () from /usr/lib/libruby.so.1.6
#7 0x4004e02d in rb_gc_force_recycle () from /usr/lib/libruby.so.1.6
#8 0x4004dc05 in rb_gc_mark () from /usr/lib/libruby.so.1.6
#9 0x4004e1dc in rb_gc () from /usr/lib/libruby.so.1.6
#10 0x4004cc65 in ruby_xmalloc () from /usr/lib/libruby.so.1.6
#11 0x40073a11 in ruby_re_compile_pattern () from
/usr/lib/libruby.so.1.6
#12 0x40070761 in rb_reg_mbclen2 () from /usr/lib/libruby.so.1.6
#13 0x400714d5 in rb_reg_match_last () from /usr/lib/libruby.so.1.6
#14 0x4007160c in rb_reg_new () from /usr/lib/libruby.so.1.6
#15 0x4003ac2b in rb_alias () from /usr/lib/libruby.so.1.6
#16 0x4003aa2b in rb_alias () from /usr/lib/libruby.so.1.6
#17 0x400389ab in rb_alias () from /usr/lib/libruby.so.1.6
#18 0x4003865a in rb_alias () from /usr/lib/libruby.so.1.6
#19 0x40038edb in rb_alias () from /usr/lib/libruby.so.1.6
#20 0x4003ec4e in rb_stack_check () from /usr/lib/libruby.so.1.6
#21 0x4003f13c in rb_stack_check () from /usr/lib/libruby.so.1.6
#22 0x4003a012 in rb_alias () from /usr/lib/libruby.so.1.6
#23 0x400399b8 in rb_alias () from /usr/lib/libruby.so.1.6
#24 0x400397b5 in rb_alias () from /usr/lib/libruby.so.1.6
#25 0x4003865a in rb_alias () from /usr/lib/libruby.so.1.6
#26 0x4003cccd in rb_iterator_p () from /usr/lib/libruby.so.1.6
#27 0x40047a13 in rb_thread_scope_shared_p () from
/usr/lib/libruby.so.1.6
#28 0x400478f1 in rb_thread_stop_timer () from /usr/lib/libruby.so.1.6
#29 0x40047ac0 in rb_thread_scope_shared_p () from
/usr/lib/libruby.so.1.6
#30 0x4003e58d in rb_stack_check () from /usr/lib/libruby.so.1.6
#31 0x4003e929 in rb_stack_check () from /usr/lib/libruby.so.1.6
#32 0x4003f13c in rb_stack_check () from /usr/lib/libruby.so.1.6
#33 0x4003f48d in rb_funcall2 () from /usr/lib/libruby.so.1.6
#34 0x400417e1 in rb_obj_call_init () from /usr/lib/libruby.so.1.6
#35 0x40047a4c in rb_thread_scope_shared_p () from
/usr/lib/libruby.so.1.6
#36 0x4003e58d in rb_stack_check () from /usr/lib/libruby.so.1.6
---Type <return> to continue, or q <return> to quit---
#37 0x4003e929 in rb_stack_check () from /usr/lib/libruby.so.1.6
#38 0x4003f13c in rb_stack_check () from /usr/lib/libruby.so.1.6
#39 0x4003a012 in rb_alias () from /usr/lib/libruby.so.1.6
#40 0x400393cd in rb_alias () from /usr/lib/libruby.so.1.6

On Sat, Mar 01, 2003 at 07:26:58PM +0900, nobu.nokada@softhome.net wrote:
> Do you have a core from the crash?

Not only Guinness - Linux is good for you, too.
– Banzai on IRC

Hi,

At Sat, 1 Mar 2003 22:32:50 +0900, Mauricio Fernández wrote:
>             `mv core co/core.#{Time.new.to_i}`

On Linux, writing 1 to /proc/sys/kernel/core_uses_pid makes the kernel
append the PID to the core file name (core.PID), so each core file gets
a unique name.

> #5 0x4012fb88 in sigaction () from /lib/libc.so.6
> #6 0x4008f9ba in rb_free_generic_ivar () from /usr/lib/libruby.so.1.6

It looks like a recently fixed bug.

Wed Feb 26 00:02:41 2003  Nobuyoshi Nakada  <nobu.nokada@softhome.net>

* string.c (rb_str_dup): set FL_EXIVAR when copied generic ivar.
  (ruby-bugs-ja:PR#400)

Do you set an instance variable on a String object, or mix in a
module that does so, and then add an instance variable again to a
String duplicated from it?

Nobu Nakada
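To make the question concrete, this is the kind of usage being asked about, sketched for a current Ruby, where it is harmless. The `Tagged` module and the variable names are invented for illustration; nothing here is taken from the script under discussion.

```ruby
# A module that stores an instance variable on whatever object it extends.
module Tagged
  def tag=(value)
    @tag = value
  end

  def tag
    @tag
  end
end

original = String.new("hint")   # an unfrozen String
original.extend(Tagged)
original.tag = :first           # instance variable on a String object

copy = original.dup             # dup carries the instance variables over
copy.instance_variable_set(:@tag, :second)  # set one again on the duplicate

p original.tag
p copy.instance_variable_get(:@tag)
```

Note that `dup` copies instance variables but not the singleton `extend`, which is why the copy is accessed through `instance_variable_set` and `instance_variable_get` rather than through the module's methods.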

Hi,

In message “Re: [BUG] gc_sweep(): unknown data type 48” on 03/03/01, Mauricio Fernández <batsman.geo@yahoo.com> writes:

> I will try to get a core from a binary with debug information.
>
> Here’s some info from gdb (sorry, have no debug info yet, will work on
> it):

Thank you. It seems to be a bug that Nobu already fixed recently. Try the
latest stable-snapshot (or the ruby_1_6 branch in CVS), and see how it
goes.

Wed Feb 26 00:02:41 2003  Nobuyoshi Nakada  <nobu.nokada@softhome.net>

* string.c (rb_str_dup): set FL_EXIVAR when copied generic ivar.
  (ruby-bugs-ja:PR#400)

						matz.

You mean adding instance variables to a String? I don’t do that…
And I only use String#dup once in my script (to duplicate $1) so…

Moreover something very strange is happening:

the bug happens about once every 10 runs when using the binary provided
by my distribution (Debian sid => 1.6.8 from Dec. 24 2002).
However it didn’t happen in about ~1500 runs with a binary I compiled
myself (exactly same version, 1.6.8 Dec. 24) in order to get the
debugging info for you.

AFAIK there are only three differences between my binary and the system’s
one:

  • mine is built for i686-linux, Debian for i386-linux => perhaps there’s
    a bug in the gcc I’m using (gcc version 2.95.4 20011002 (Debian prerelease))
  • mine is installed in $HOME/usr; I built strscan for it and installed it there
  • Debian uses --enable-shared and I do not. Could this affect anything?

I cannot try a CVS snapshot until I get a predictable way to reproduce
the errors in the binaries I build…

On Sun, Mar 02, 2003 at 12:35:37AM +0900, nobu.nokada@softhome.net wrote:
> Do you set an instance variable on a String object, or mix in a
> module that does so, and then add an instance variable again to a
> String duplicated from it?

Stupid nick highlighting
Whenever someone starts with “stupid” it highlights the nick. Hmm.
#Debian

I finally found a way to reproduce the bug with some predictability
(it only happens when Ruby is compiled with gcc 3.2; could it be that the
compiler is broken?).

If it is a bug, it is a new one.

Converting /tmp/tst1.html
./dict.rb:316: [BUG] gc_sweep(): unknown data type 28
ruby 1.6.8 (2003-02-28) [i386-linux]

This time I got a core dump with all the info:
#0 0x40094c01 in kill () from /lib/libc.so.6
#1 0x40094a12 in raise () from /lib/libc.so.6
#2 0x40095c3e in abort () from /lib/libc.so.6
#3 0x080ad77a in rb_bug (fmt=0x80b1300 "gc_sweep(): unknown data type %d") at error.c:180
#4 0x08066fbf in obj_free (obj=1075318916) at gc.c:949
#5 0x08066d61 in gc_sweep () at gc.c:767
#6 0x080671d2 in rb_gc () at gc.c:1076
#7 0x08065fad in ruby_xmalloc (size=1024) at gc.c:94
#8 0x08088d69 in ruby_re_compile_pattern (pattern=0x84d4168 "^250", size=4, bufp=0x84d4178) at regex.c:2416
#9 0x08085f6a in make_regexp (s=0x84d4168 "^250", len=4, flag=0) at re.c:400
#10 0x08086b74 in rb_reg_initialize (obj=1075579232, s=0x84d4168 "^250", len=4, options=0) at re.c:866
#11 0x08086c94 in rb_reg_new (s=0x84d4168 "^250", len=4, options=0) at re.c:886

What more info can I give you?

On Sun, Mar 02, 2003 at 12:55:48AM +0900, Yukihiro Matsumoto wrote:
> Thank you. It seems to be a bug that Nobu already fixed recently. Try the
> latest stable-snapshot (or the ruby_1_6 branch in CVS), and see how it
> goes.

We come to bury DOS, not to praise it.
– Paul Vojta, vojta@math.berkeley.edu

Hi,

At Sun, 2 Mar 2003 03:47:37 +0900, Mauricio Fernández wrote:
> > Do you set an instance variable on a String object, or mix in a
> > module that does so, and then add an instance variable again to a
> > String duplicated from it?
>
> You mean adding instance variables to a String? I don’t do that…
> And I only use String#dup once in my script (to duplicate $1) so…

Yes, that is what I meant. Hmm, it might not be that, then; more
information is required.

> Moreover something very strange is happening:
>
> the bug happens about once every 10 runs when using the binary provided
> by my distribution (Debian sid => 1.6.8 from Dec. 24 2002).
> However it didn’t happen in about ~1500 runs with a binary I compiled
> myself (exactly same version, 1.6.8 Dec. 24) in order to get the
> debugging info for you.
>
> AFAIK there are only three differences between my binary and the system’s
> one:
>
>   • mine is built for i686-linux, Debian for i386-linux => perhaps there’s
>     a bug in the gcc I’m using (gcc version 2.95.4 20011002 (Debian prerelease))

Unless you set CFLAGS for configure and make, that wouldn’t have any
effect.

>   • mine is installed in $HOME/usr; I built strscan for it and installed it there
>   • Debian uses --enable-shared and I do not. Could this affect anything?

These two, and the gcc version, may affect memory layout and GC.

Nobu Nakada

Hi,

At Mon, 3 Mar 2003 03:33:11 +0900, Mauricio Fernández wrote:
> I finally found a way to reproduce the bug with some predictability
> (it only happens when Ruby is compiled with gcc 3.2; could it be that the
> compiler is broken?).
>
> If it is a bug, it is a new one.
>
> Converting /tmp/tst1.html
> ./dict.rb:316: [BUG] gc_sweep(): unknown data type 28
> ruby 1.6.8 (2003-02-28) [i386-linux]
>
> This time I got a core dump with all the info:
> #0 0x40094c01 in kill () from /lib/libc.so.6
> #1 0x40094a12 in raise () from /lib/libc.so.6
> #2 0x40095c3e in abort () from /lib/libc.so.6
> #3 0x080ad77a in rb_bug (fmt=0x80b1300 "gc_sweep(): unknown data type %d") at error.c:180
> #4 0x08066fbf in obj_free (obj=1075318916) at gc.c:949

What’s this obj’s content? Try `rp obj' with the attached
“.gdbinit”.

Attachment: .gdbinit (6.82 KB)

I’m having trouble getting it :-( If I rebuild the Debian package (so
that everything but the compiler is identical) the bug no longer appears.

I will try to build with the very same compiler the packager used to see
if I can get this to fail again; I’ll then move on to adding debugging
information.

On Sun, Mar 02, 2003 at 06:59:10PM +0900, nobu.nokada@softhome.net wrote:
> Yes, that is what I meant. Hmm, it might not be that, then; more
> information is required.

“You, sir, are nothing but a pathetically lame salesdroid!
I fart in your general direction!”
– Randseed on #Linux

#3 0x080ad77a in rb_bug (fmt=0x80b1300 "gc_sweep(): unknown data type %d") at error.c:180
180 abort();
(gdb)
#4 0x08066fbf in obj_free (obj=1075318916) at gc.c:949
949 rb_bug("gc_sweep(): unknown data type %d",
(gdb) rp obj
Unknown
(gdb) print *(struct RBasic *)obj
$1 = {flags = 1075318812, klass = 8}
(gdb) print ((struct RBasic *)obj)->flags & 0x3f
$2 = 28

Bug in GCC 3.2?

[sorry for taking so long to reply; the MTA of my site was down]
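The numbers in the gdb session can be checked by hand. Here is a small sketch of that arithmetic in Ruby; the `0x3f` mask is the one used in the `print` expression above, while the constant and method names are made up for this example.

```ruby
TYPE_MASK = 0x3f  # mask taken from the gdb expression above

# Extract the type bits from an RBasic flags word.
def type_bits(flags)
  flags & TYPE_MASK
end

flags = 1075318812   # RBasic.flags as printed by gdb
obj   = 1075318916   # obj argument of obj_free in frame #4

puts format("flags = 0x%08x, type bits = %d", flags, type_bits(flags))
puts format("obj   = 0x%08x, obj - flags = %d", obj, obj - flags)
```

This prints `flags = 0x4018101c, type bits = 28` and `obj   = 0x40181084, obj - flags = 104`. The flags word is numerically only 104 below the object's own address, so it looks more like a pointer into the same memory region than like a valid flags value. That is an observation about the numbers, not a diagnosis.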

On Mon, Mar 03, 2003 at 10:38:59AM +0900, nobu.nokada@softhome.net wrote:
> What’s this obj’s content? Try `rp obj' with the attached
> “.gdbinit”.

…[Linux’s] capacity to talk via any medium except smoke signals.
– Dr. Greg Wettstein, Roger Maris Cancer Center

Hi,

In message “Re: [BUG] gc_sweep(): unknown data type 48” on 03/03/04, Mauricio Fernández <batsman.geo@yahoo.com> writes:

> Bug in GCC 3.2?

I’m afraid so. But too often, when I suspect others, it turns out to
be my (our) fault after all. There’s an old saying: “there’s no bug in
the compiler”.

						matz.

Can I give you any more information?

On Tue, Mar 04, 2003 at 06:06:58PM +0900, Yukihiro Matsumoto wrote:
> I’m afraid so. But too often, when I suspect others, it turns out to
> be my (our) fault after all. There’s an old saying: “there’s no bug in
> the compiler”.

Q: What’s the big deal about rm, I have been deleting stuff for years? And
never lost anything… oops!
A: …
– From the Frequently Unasked Questions

The best thing would be to give a simple example that makes the problem
easy to reproduce; I know that's difficult in your case.

P.S.: personally, I'd say that nobody has ever found a bug in the Ruby GC.

Guy Decoux

> On Tue, Mar 04, 2003 at 06:06:58PM +0900, Yukihiro Matsumoto wrote:
> > I'm afraid so. But too often, when I suspect others, it turns out to
> > be my (our) fault after all. There's an old saying: "there's no bug in
> > the compiler".
>
> Can I give you any more information?