Help with a segv in mod_ruby

I’m getting a segv in mod_ruby:

#0 0x402c06f5 in st_lookup (table=0x81645f0,
key=0xc19 <Address 0xc19 out of bounds>, value=0xbfffd140) at st.c:253
#1 0x4026e895 in search_method (klass=1079869732, id=3097, origin=0xbfffd17c)
at eval.c:250
#2 0x4026e8e1 in rb_get_method_body (klassp=0xbfffd1cc, idp=0xbfffd1bc,
noexp=0xbfffd1c0) at eval.c:268
#3 0x402784b4 in rb_call (klass=1079869732, recv=4, mid=3097, argc=1,
argv=0xbfffd1ec, scope=1) at eval.c:4582
#4 0x40278869 in rb_funcall (recv=4, mid=3097, n=1) at eval.c:4679
#5 0x40274b47 in rb_eval (self=1079664792, n=0x405a5300) at eval.c:3049
#6 0x40271b63 in rb_eval (self=1079664792, n=0x405a5ecc) at eval.c:2027
#7 0x4026f99c in eval_node (self=1079664792, node=0x405a5ecc) at eval.c:1057
#8 0x40279c3d in rb_load (fname=1079664832, wrap=1) at eval.c:5258
#9 0x40279ea2 in rb_f_load (argc=2, argv=0xbfffdf4c) at eval.c:5306

The thing is, I’ve seen this before, and for the life of me I can’t
remember the cause. This same application has been running fine for
weeks, now it segvs every time it starts.

Any ideas?

Cheers

Dave

Dave Thomas wrote:

I’m getting a segv in mod_ruby:

Always helps me to know whereabouts in the Ruby code had it got to, i.e.
taking a look at ruby_sourceline and ruby_sourcefile from gdb. At least if
it’s consistent you know which bit of Ruby code to look at.



Matthew Bloch Bytemark Computer Consulting Limited
http://www.bytemark.co.uk/
tel. +44 (0) 8707 455026

I’m getting a segv in mod_ruby:
Hi Dave,

Have you by any chance installed a new library (or a new version of one)
that gets used here?

Something along the lines of: the library was compiled
for version 13.1, but now gets used with version 13.2,
or vice versa.

Sorry no better idea to offer,
A


Armin.


Armin Roehrl, http://www.approximity.com
Training, Development and Mentoring
OOP, XP, Java, Ruby, Smalltalk, Datamining, Parallel computing, Webservices

Rubybuch: http://approximity.com/rubybuch/

Agile Entwicklerkonferenz: 22. und 23.10 in Nürnberg.
http://www.approximity.com/public/conferences/AgileConf.html

Armin Roehrl armin@xss.de writes:

Have you by any chance installed a new library (or a new version of one)
that gets used here?

Something along the lines of: the library was compiled for version 13.1,
but now gets used with version 13.2, or vice versa.

Well, the problem goes away when I comment out the following three
lines of code:

   def nil.empty?
      true
   end

Could this be an interpreter problem?

Dave

   def nil.empty?
      true
   end

Yes, the problem is in these lines

   #0 0x402c06f5 in st_lookup (table=0x81645f0,
       key=0xc19 <Address 0xc19 out of bounds>, value=0xbfffd140) at st.c:253
   #1 0x4026e895 in search_method (klass=1079869732, id=3097, origin=0xbfffd17c)
       at eval.c:250
   #2 0x4026e8e1 in rb_get_method_body (klassp=0xbfffd1cc, idp=0xbfffd1bc,
       noexp=0xbfffd1c0) at eval.c:268
   #3 0x402784b4 in rb_call (klass=1079869732, recv=4, mid=3097, argc=1,
       argv=0xbfffd1ec, scope=1) at eval.c:4582
   #4 0x40278869 in rb_funcall (recv=4, mid=3097, n=1) at eval.c:4679

recv=4 ===> self is nil
mid=3097 ===> the method is #singleton_method_added

Ruby is searching for the method #singleton_method_added in NilClass
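
The hook that this backtrace ends up in can be seen from plain Ruby. This is a minimal sketch, assuming a current Ruby rather than the 1.6.7 interpreter discussed in this thread; the Probe class and its seen list are made-up names for illustration. It shows that a `def obj.name` definition makes the interpreter call singleton_method_added on the receiver, which matches the rb_funcall(recv=4, mid=3097) frame above with nil as the receiver.

```ruby
# Hypothetical class for illustration: records every singleton method
# that gets defined on one of its instances.
class Probe
  attr_reader :seen

  def initialize
    @seen = []
  end

  # Ruby calls this hook on the receiver after `def receiver.name`.
  def singleton_method_added(name)
    @seen << name
  end
end

probe = Probe.new

# Same shape as `def nil.empty?`, but on an ordinary object.
def probe.empty?
  true
end

p probe.seen
```

So the crash happens while Ruby is looking up the hook, not while running the body of empty? itself.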

Could this be an interpreter problem?

An interpreter or a mod_ruby problem.

Can you give your complete configuration (apparently you run 1.6.7,
right?) and your script, so that I can try to reproduce the segfault?

Guy Decoux

“Dave Thomas” wrote

Well, the problem goes away when I comment out the following three
lines of code:

   def nil.empty?
      true
   end

Have you tried using

   class NilClass
      def empty?
         true
      end
   end

instead?

/Christoph

Dave Thomas Dave@PragmaticProgrammer.com wrote in message news:m2bs82pd9w.fsf_-_@zip.local.thomases.com

Armin Roehrl armin@xss.de writes:

Have you by any chance installed a new library (or a new version of one)
that gets used here?

Something along the lines of: the library was compiled for version 13.1,
but now gets used with version 13.2, or vice versa.

Well, the problem goes away when I comment out the following three
lines of code:

   def nil.empty?
      true
   end

Could this be an interpreter problem?

This sounds like an interpreter problem that we hit while working on
the narf cgi. Our bug caused the interpreter to segfault. I isolated
code that triggers the fault on Cygwin, but that particular sequence
didn't cause segfaults on Linux.

The lines of code in question had to do with nils, too.

Our guess was that it was some type of corrupted memory problem – the
nil lines would cause the corruption, but something else would have to
walk over the problem to cause a segfault.

~ Patrick

ts decoux@moulon.inra.fr writes:

Can you give your complete configuration (apparently you run 1.6.7,
right?) and your script, so that I can try to reproduce the segfault?

Guy:

The following script, if run twice, segv’s the second time.

 req = Apache::request
 req.status = Apache::HTTP_OK
 req.content_type = 'text/html'
 req.send_http_header

 def nil.empty?
   true
 end

 $stderr.puts "In t.rb"
 puts "Hello, world!"

This is with

 ruby 1.6.7 (2002-08-01)
 mod_ruby 0.9.9
 Apache 1.3.26

The full backtrace for this case is below. Is it possible that
st_lookup is thinking it’s using a strhash, rather than a numhash?

 #0  0x40191cd8 in main_arena () from /lib/libc.so.6
 #1  0x402c06fa in st_lookup (table=0x81fdd30, key=0xc19 <Address 0xc19 out of bounds>, 
     value=0xbfffd140) at st.c:253
 #2  0x4026e895 in search_method (klass=1078135192, id=3097, origin=0xbfffd17c) at eval.c:250
 #3  0x4026e8e1 in rb_get_method_body (klassp=0xbfffd1cc, idp=0xbfffd1bc, noexp=0xbfffd1c0)
     at eval.c:268
 #4  0x402784b4 in rb_call (klass=1078135192, recv=4, mid=3097, argc=1, argv=0xbfffd1ec, scope=1)
     at eval.c:4582
 #5  0x40278869 in rb_funcall (recv=4, mid=3097, n=1) at eval.c:4679
 #6  0x40274b47 in rb_eval (self=1078124132, n=0x4042da7c) at eval.c:3049
 #7  0x40271b63 in rb_eval (self=1078124132, n=0x4042dc98) at eval.c:2027
 #8  0x4026f99c in eval_node (self=1078124132, node=0x4042dc98) at eval.c:1057
 #9  0x40279c3d in rb_load (fname=1078124172, wrap=1) at eval.c:5258
 #10 0x40279ea2 in rb_f_load (argc=2, argv=0xbfffdf4c) at eval.c:5306
 #11 0x40277878 in call_cfunc (func=0x40279e60 <rb_f_load>, recv=1078133872, len=-1, argc=2, 
     argv=0xbfffdf4c) at eval.c:4248
 #12 0x40277e02 in rb_call0 (klass=1078201492, recv=1078133872, id=8473, argc=2, argv=0xbfffdf4c, 
     body=0x40433c24, nosuper=1) at eval.c:4385
 #13 0x40278614 in rb_call (klass=1078201492, recv=1078133872, mid=8473, argc=2, argv=0xbfffdf4c, 
     scope=1) at eval.c:4605
 #14 0x4027356c in rb_eval (self=1078133872, n=0x4042f8a4) at eval.c:2546
 #15 0x40271b63 in rb_eval (self=1078133872, n=0x4042ff70) at eval.c:2027
 #16 0x402781f8 in rb_call0 (klass=1078134792, recv=1078133872, id=10881, argc=0, argv=0xbfffec00, 
     body=0x4042ff70, nosuper=0) at eval.c:4512
 #17 0x40278614 in rb_call (klass=1078134792, recv=1078133872, mid=10881, argc=1, argv=0xbfffebfc, 
     scope=1) at eval.c:4605
 #18 0x402788e9 in rb_funcall2 (recv=1078133872, mid=10881, argc=1, argv=0xbfffebfc) at eval.c:4689
 #19 0x4025a78a in protect_funcall0 (arg=3221220404) at mod_ruby.c:240
 #20 0x402770ba in rb_protect (proc=0x4025a760 <protect_funcall0>, data=3221220404, state=0xbfffec90)
     at eval.c:4008
 #21 0x4025a828 in rb_protect_funcall (recv=1078133872, mid=10881, state=0xbfffec90, argc=1)
     at mod_ruby.c:270
 #22 0x4025bbcb in ruby_handler_0 (arg=0xbfffef18) at mod_ruby.c:890
 #23 0x4025b8f5 in run_safely_0 (arg=0xbfffeea4) at mod_ruby.c:798
 #24 0x40280287 in rb_thread_start_0 (fn=0x4025b8a0 <run_safely_0>, arg=0xbfffeea4, th_arg=0x82070a0)
     at eval.c:8376
 #25 0x40280487 in rb_thread_create (fn=0x4025b8a0 <run_safely_0>, arg=0xbfffeea4) at eval.c:8428
 #26 0x4025b968 in run_safely (safe_level=1, timeout=270, func=0x4025bb80 <ruby_handler_0>, 
     arg=0xbfffef18, retval=0xbfffef14) at mod_ruby.c:817
 #27 0x4025bd4f in ruby_handler (r=0x8204a1c, handlers_arr=0x80bab34, mid=10881, run_all=0, flush=1)
     at mod_ruby.c:944
 #28 0x4025be2e in ruby_object_handler (r=0x8204a1c) at mod_ruby.c:964
 #29 0x80550a9 in ap_invoke_handler ()
 #30 0x806b4ff in process_request_internal ()
 #31 0x806b572 in ap_process_request ()
 #32 0x8061c96 in child_main ()
 #33 0x8061f1a in make_child ()
 #34 0x8061fd6 in startup_children ()
 #35 0x806267d in standalone_main ()
 #36 0x8062eec in main ()
 #37 0x400a2baf in __libc_start_main () from /lib/libc.so.6

Thanks for looking at this.

Dave

“Christoph” chr_news@gmx.net writes:

   def nil.empty?
      true
   end

Have you tried using

   class NilClass
      def empty?
         true
      end
   end

Hmm… interesting.

The problem indeed goes away if I extend NilClass. So it would seem to
be a problem adding a singleton to NilClass under mod_ruby… Anyone
any ideas?
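
For comparison, here is a small sketch of how the two forms relate on a current Ruby. This is an assumption-laden aside: it is not the 1.6.7 interpreter from this thread and says nothing about how 1.6.7 behaves internally. On a current Ruby, nil's singleton class is NilClass itself, so `def nil.empty?` and reopening NilClass both put the method in the same place.

```ruby
# Assumes a current Ruby, not the 1.6.7 / mod_ruby setup in this thread.
def nil.empty?
  true
end

# On a current Ruby the singleton class of nil is NilClass itself ...
same_class = nil.singleton_class.equal?(NilClass)

# ... so the method defined above is owned by NilClass.
owner = nil.method(:empty?).owner

puts "singleton class is NilClass: #{same_class}"
puts "empty? is owned by: #{owner}"
```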

Dave

ts decoux@moulon.inra.fr wrote in message news:200208171152.g7HBqCg16388@moulon.inra.fr

Can you give your complete configuration (apparently you run 1.6.7,
right?) and your script, so that I can try to reproduce the segfault?

My config:
Windows 2k with ruby 1.6.5 (2001-09-19) [i386-cygwin]

to reproduce my segfault in cygwin ruby requires 2 files:

good luck,

~ Patrick

bug.rb

   class TestSuite
      def initialize()
         nil == nil
      end
   end

   nil.extend Enumerable

   require 'bug2'

bug2.rb

   TestSuite.new

     ruby 1.6.7 (2002-08-01)
     mod_ruby 0.9.9
     Apache 1.3.26

I've tried with

  Server: Apache/1.3.26 (Unix) Debian GNU/Linux mod_ruby/0.9.9 Ruby/1.6.7

aestivum% ruby -v
ruby 1.6.7 (2002-08-12) [i686-linux]
aestivum%

I couldn't find 'ruby 1.6.7 (2002-08-01)', but there does seem to be a
strange interaction between mod_ruby, Apache and Ruby.

For example, I don't get a segfault but an infinite loop on the second
request

(gdb) p rb_cNilClass
$6 = 1077541192
(gdb) p *(struct RClass *)rb_cNilClass
$7 = {basic = {flags = 2051, klass = 1077607672}, iv_tbl = 0x81087e0,
  m_tbl = 0x81153a0, super = 1077541192}
(gdb)

See that rb_cNilClass->super == rb_cNilClass
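
Why a self-referential super turns method lookup into an endless loop can be sketched with a toy model. Klass, search_method and the max_hops guard are hypothetical names made up for this illustration, not the interpreter's data structures; the sketch only assumes that lookup follows super until it finds the method or runs out of classes.

```ruby
# Toy model of a class: a name, a method table, and a superclass link.
Klass = Struct.new(:name, :m_tbl, :super_klass)

# Walk the superclass chain until the method id is found.
# max_hops is a guard added for this sketch so the loop can be observed.
def search_method(klass, id, max_hops = 100)
  hops = 0
  while klass
    return klass.m_tbl[id] if klass.m_tbl.key?(id)
    klass = klass.super_klass
    hops += 1
    raise "superclass chain loops back on itself" if hops > max_hops
  end
  nil
end

object  = Klass.new("Object", { singleton_method_added: :found }, nil)
healthy = Klass.new("NilClass", {}, object)
broken  = Klass.new("NilClass", {}, nil)
broken.super_klass = broken # same shape as the gdb output: super == self

p search_method(healthy, :singleton_method_added)

begin
  search_method(broken, :singleton_method_added)
rescue RuntimeError => e
  puts "lookup never terminates: #{e.message}"
end
```

Without the guard, the lookup on the broken chain would spin forever, which is consistent with the infinite loop seen on the second request.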

Apparently the only line in the source where rb_cNilClass is modified is
in class.c

#define SPECIAL_SINGLETON(x,c) if (obj == (x)) {\
    if (!FL_TEST(c, FL_SINGLETON)) {\
        c = rb_singleton_class_new(c);\
        rb_singleton_class_attached(c,obj);\
    }\
    return c;\
}

VALUE
rb_singleton_class(obj)
    VALUE obj;
{
    VALUE klass;

    if (FIXNUM_P(obj) || SYMBOL_P(obj)) {
        rb_raise(rb_eTypeError, "can't define singleton");
    }
    if (rb_special_const_p(obj)) {
        SPECIAL_SINGLETON(Qnil, rb_cNilClass);
        SPECIAL_SINGLETON(Qfalse, rb_cFalseClass);
        SPECIAL_SINGLETON(Qtrue, rb_cTrueClass);
        rb_bug("unknown immediate %d", obj);
    }

[...]

but I don't understand how this can have an effect with apache
(multi-thread/process ???)

Guy Decoux

Hi,


At Sun, 18 Aug 2002 10:42:53 +0900, Patrick May wrote:

Can you give your complete configuration (apparently you run 1.6.7,
right?) and your script, so that I can try to reproduce the segfault?

My config:
Windows 2k with ruby 1.6.5 (2001-09-19) [i386-cygwin]

to reproduce my segfault in cygwin ruby requires 2 files:

Perhaps, since the new singleton classes for nil/false/true are
never referenced from those objects, these classes get garbage collected.

This patch may fix the problem, but I guess the solution in 1.7
is much better.

Index: class.c
===================================================================
RCS file: /cvs/ruby/src/ruby/class.c,v
retrieving revision 1.14.2.15
diff -u -2 -p -r1.14.2.15 class.c
--- class.c 11 Jul 2002 08:24:53 -0000 1.14.2.15
+++ class.c 18 Aug 2002 02:02:26 -0000
@@ -613,7 +613,8 @@ rb_undef_method(klass, name)
 }
 
-#define SPECIAL_SINGLETON(x,c) if (obj == (x)) {\
+#define SPECIAL_SINGLETON(x,c,n) if (obj == (x)) {\
     if (!FL_TEST(c, FL_SINGLETON)) {\
         c = rb_singleton_class_new(c);\
+        st_add_direct(rb_class_tbl, rb_intern(n), c);
         rb_singleton_class_attached(c,obj);\
     }\
@@ -631,7 +632,7 @@ rb_singleton_class(obj)
     }
     if (rb_special_const_p(obj)) {
-        SPECIAL_SINGLETON(Qnil, rb_cNilClass);
-        SPECIAL_SINGLETON(Qfalse, rb_cFalseClass);
-        SPECIAL_SINGLETON(Qtrue, rb_cTrueClass);
+        SPECIAL_SINGLETON(Qnil, rb_cNilClass, "NilClass");
+        SPECIAL_SINGLETON(Qfalse, rb_cFalseClass, "FalseClass");
+        SPECIAL_SINGLETON(Qtrue, rb_cTrueClass, "TrueClass");
         rb_bug("unknown immediate %d", obj);
     }


Nobu Nakada

ts decoux@moulon.inra.fr wrote in message news:200208171647.g7HGlnW16951@moulon.inra.fr

but I don’t understand how this can have an effect with apache
(multi-thread/process ???)

If he’s running with mod_ruby, the second time may be run on the same interpreter.
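
A rough way to mimic "same script, same interpreter, second request" outside Apache is to load the file twice with the wrap flag set, since the backtrace shows rb_load being called with wrap=1. This is only a sketch under assumptions: it uses a temporary file in place of t.rb, leaves out everything Apache-specific, and on a current Ruby it is expected to run cleanly rather than reproduce the crash.

```ruby
require "tempfile"

# Stand-in for t.rb, reduced to the three lines that matter.
script = Tempfile.new(["t", ".rb"])
script.write("def nil.empty?\n  true\nend\n")
script.close

# Two "requests" served by one interpreter: load the same file twice,
# wrapped in an anonymous module as with load(file, true).
2.times { load(script.path, true) }

puts nil.empty?
script.unlink
```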

~ Patrick

Hi,


At Sun, 18 Aug 2002 11:06:17 +0900, nobu.nokada@softhome.net wrote:

     if (!FL_TEST(c, FL_SINGLETON)) {\
         c = rb_singleton_class_new(c);\
+        st_add_direct(rb_class_tbl, rb_intern(n), c);

Oops, this line lacks \ at the end.


Nobu Nakada