Very Strange Error(s)

Hi,

I've been encountering strange behaviour with Ruby recently. I'm wondering if anybody has seen anything like this before or if anyone has any ideas.

Here's what's happening:

- something that should be a String turns out to be an Integer
- if I run the same method call again, say by using a begin...rescue in a loop, and I loop a few times it eventually works with identical input -- if I do run it again, and if an error occurs again, it is usually in a different place
- long running Ruby process (actually Rails running webrick in development mode) -- I *think* I've never seen this in a short running process
- the frequency of the error does not seem to increase (so once it happens it does not start happening all the time, the system seems to be behaving properly)
- error happens during a request by the user.
- error normally quite deep in an execution stack
- ruby 1.8.4 (2006-03-28) [powerpc-darwin8.5.0] -- that's OS X
- I've had this happen in XML parsers (REXML and xampl-pp), in rubyful soup, and in some code that simply prints a tree of objects
- I cannot reproduce it reliably

The going away on second or third try is particularly strange.
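For anyone who wants to collect data on this, the begin...rescue loop mentioned above can be sketched roughly as follows. This is only an illustration: `with_retries` is a hypothetical helper name, and the choice to rescue `NoMethodError` and `TypeError` is an assumption based on the kind of errors described in this thread.

```ruby
# Hypothetical helper: run the block up to max_attempts times and record
# every failure, so the failing locations can be compared afterwards.
def with_retries(max_attempts = 3)
  failures = []
  max_attempts.times do |attempt|
    begin
      # On success, return the block's value plus whatever failed before it.
      return [yield, failures]
    rescue NoMethodError, TypeError => e
      failures << {
        :attempt => attempt + 1,
        :error   => e.class,
        :message => e.message,
        :where   => (e.backtrace || []).first
      }
    end
  end
  raise "gave up after #{max_attempts} attempts: #{failures.inspect}"
end
```

Wrapping the suspect call in `with_retries(5) { ... }` returns the result together with the list of recorded failures; comparing the `:where` entries shows whether the error really moves around between attempts.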

Any ideas?

Thanks,
Bob

···

----
Bob Hutchison -- blogs at <http://www.recursive.ca/hutch/>
Recursive Design Inc. -- <http://www.recursive.ca/>
Raconteur -- <http://www.raconteur.info/>
xampl for Ruby -- <http://rubyforge.org/projects/xampl/>

This is happening here as well sporadically, and I can't reproduce it either. Mac OS X 10.4.6, ruby 1.8.4 (2005-12-24) [powerpc-darwin8.3.0] compiled from tarball.

-- fxn

···

On Apr 18, 2006, at 22:10, Bob Hutchison wrote:

I've been encountering strange behaviour with Ruby recently. I'm wondering if anybody has seen anything like this before or if anyone has any ideas.

Here's what's happening:

- something that should be a String turns out to be an Integer

Are you absolutely sure that to_i isn't being called somewhere along
the way (maybe in some odd cases?)

Ryan

···

On 4/18/06, Bob Hutchison <hutch@recursive.ca> wrote:

Hi,

I've been encountering strange behaviour with Ruby recently. I'm
wondering if anybody has seen anything like this before or if anyone
has any ideas.

Here's what's happening:

- something that should be a String turns out to be an Integer

I've been experiencing this same thing on Mac OS X. I haven't been
able to come up with a small piece of code that reproduces it, but it
seems to happen most often in my network-intensive, threaded code (I
have a custom memcached client that seems to become unstable under
heavy load). I should also note that this has been happening for some
time, but only in extremely rare and hard-to-reproduce cases.

I am running on:

ruby 1.8.4 (2006-04-18) [powerpc-darwin8.6.0]

I can't share the code at this point, unfortunately (it is too large,
and the decision is up to the client I am under contract with -- I
will try to come up with something I can share, though). Examples of
traces I get (in my log files):

!! undefined method `wakeup' for -517611318:Fixnum
/usr/local/lib/ruby/1.8/thread.rb:116:in `unlock'
/usr/local/lib/ruby/1.8/thread.rb:137:in `synchronize'
.. long trace ..

!! wrong instance allocation
memcache/memcache.rb:209:in `exception'
memcache/memcache.rb:209:in `incr'
.. long trace ..

One thing I can note at this point is that it seems to always be
either the -517611318 magic number or a "wrong instance allocation"
error. It seems to be limited to Mac OS X (the only other platforms
I've tested are various Linux systems).

I would be happy to run any test code people might have to help narrow
the field down. If I get any further information I will post it to
ruby-core.
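
As a starting point for such test code, here is a minimal sketch of a stress harness that hammers a single Mutex from several threads, since the first trace above comes out of `synchronize` in thread.rb. It is an assumption that this kind of contention is enough to trigger the problem; `stress_mutex` and its parameters are hypothetical names, not taken from the real code.

```ruby
require 'thread'

# Hypothetical stress harness: several threads contend on one Mutex and
# increment a shared counter. Errors raised inside the threads are collected
# instead of killing the thread, so they can be inspected afterwards.
def stress_mutex(thread_count = 8, iterations = 10_000)
  lock = Mutex.new
  counter = 0
  errors = []
  threads = (1..thread_count).map do
    Thread.new do
      iterations.times do
        begin
          lock.synchronize { counter += 1 }
        rescue StandardError => e
          errors << "#{e.class}: #{e.message}"
        end
      end
    end
  end
  threads.each { |t| t.join }
  { :expected => thread_count * iterations, :actual => counter, :errors => errors }
end
```

On a healthy interpreter `:actual` should equal `:expected` and `:errors` should stay empty; anything else would be worth posting along with the `ruby -v` output.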

Brian.

···

On 4/18/06, Bob Hutchison <hutch@recursive.ca> wrote:

Hi,

I've been encountering strange behaviour with Ruby recently. I'm
wondering if anybody has seen anything like this before or if anyone
has any ideas.

Here's what's happening:

- something that should be a String turns out to be an Integer
- if I run the same method call again, say by using a begin...rescue
in a loop, and I loop a few times it eventually works with identical
input -- if I do run it again, and if an error occurs again, it is
usually in a different place
- long running Ruby process (actually Rails running webrick in
development mode) -- I *think* I've never seen this in a short
running process
- the frequency of the error does not seem to increase (so once it
happens it does not start happening all the time, the system seems to
be behaving properly)
- error happens during a request by the user.
- error normally quite deep in an execution stack
- ruby 1.8.4 (2006-03-28) [powerpc-darwin8.5.0] -- that's OS X
- I've had this happen in XML parsers (REXML and xampl-pp), in
rubyful soup, and in some code that simply prints a tree of objects
- I cannot reproduce it reliably

The going away on second or third try is particularly strange.

Any ideas?

Yeah, I've seen this plenty myself. I believe it is this:

http://ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-core/7401

And if I understood the discussion (way over my head), it has been fixed:

http://ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-core/7477

James Edward Gray II

···

On Apr 18, 2006, at 3:10 PM, Bob Hutchison wrote:

Hi,

I've been encountering strange behaviour with Ruby recently. I'm wondering if anybody has seen anything like this before or if anyone has any ideas.

I found this <http://blog.segment7.net/articles/2006/04/07/chasing-undefined-method-for-fixnum> just now. Looks pretty relevant. Seems as though a re-compile with -O0 might be a workaround. Some suggestion that using gcc 3.3 might work too.

···

On Apr 18, 2006, at 4:10 PM, Bob Hutchison wrote:

Hi,

I've been encountering strange behaviour with Ruby recently. I'm wondering if anybody has seen anything like this before or if anyone has any ideas.

Here's what's happening:

- something that should be a String turns out to be an Integer
- if I run the same method call again, say by using a begin...rescue in a loop, and I loop a few times it eventually works with identical input -- if I do run it again, and if an error occurs again, it is usually in a different place
- long running Ruby process (actually Rails running webrick in development mode) -- I *think* I've never seen this in a short running process
- the frequency of the error does not seem to increase (so once it happens it does not start happening all the time, the system seems to be behaving properly)
- error happens during a request by the user.
- error normally quite deep in an execution stack
- ruby 1.8.4 (2006-03-28) [powerpc-darwin8.5.0] -- that's OS X
- I've had this happen in XML parsers (REXML and xampl-pp), in rubyful soup, and in some code that simply prints a tree of objects
- I cannot reproduce it reliably

The going away on second or third try is particularly strange.

Any ideas?

Thanks,
Bob

I've not been having any of these problems myself. I've been running WEBrick for days now.

ruby 1.8.4 (2005-12-24) [powerpc-darwin8.6.0]

- Jake McArthur

Actually, from what I can tell the issue is separate (or possibly an
unsolved factor). You might note that my post above is using a
checkout I made from stable today.

Brian.

···

On 4/18/06, James Edward Gray II <james@grayproductions.net> wrote:

Yeah, I've seen this plenty myself. I believe it is this:

http://ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-core/7401

And if I understood the discussion (way over my head), it has been
fixed:

http://ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-core/7477

Excellent! So just to double-check, is it an issue that has been seen only in OSX?

-- fxn

···

On Apr 18, 2006, at 23:42, Bob Hutchison wrote:

I found this <http://blog.segment7.net/articles/2006/04/07/chasing-undefined-method-for-fixnum> just now. Looks pretty relevant. Seems as though a re-compile with -O0 might be a workaround. Some suggestion that using gcc 3.3 might work too.

Oops, sorry. My bad.

James Edward Gray II

···

On Apr 18, 2006, at 4:09 PM, Brian Mitchell wrote:

On 4/18/06, James Edward Gray II <james@grayproductions.net> wrote:

Yeah, I've seen this plenty myself. I believe it is this:

http://ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-core/7401

And if I understood the discussion (way over my head), it has been
fixed:

http://ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-core/7477

Actually, from what I can tell the issue is separate (or possibly an
unsolved factor). You might note that my post above is using a
checkout I made from stable today.

I found this <http://blog.segment7.net/articles/2006/04/07/chasing-undefined-method-for-fixnum> just now. Looks pretty relevant. Seems as though a re-compile with -O0 might be a workaround. Some suggestion that using gcc 3.3 might work too.

Excellent! So just to double-check, is it an issue that has been seen only in OSX?

It sure looks that way, but I don't know for sure. Someone thought that it might be happening on Windows too, but, to me, it sounded like a different problem. All the references/hints that I've come across involve pretty recent compilations of Ruby, even though the Ruby version might be old. So that's a bit of support for the workaround.

I hope that workaround works. Though it is nice to know I'm not alone :)

Cheers,
Bob

···

On Apr 19, 2006, at 3:11 AM, Xavier Noria wrote:

On Apr 18, 2006, at 23:42, Bob Hutchison wrote:

-- fxn

An update...

I've re-compiled with the compiler's optimiser set to -O0 and have not seen that error again. However, Ruby is *very* slow.
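
For anyone trying the same workaround, a quick way to confirm which optimisation level a given interpreter was actually built with is to look at the recorded build flags. The sketch below assumes the `rbconfig` library shipped with Ruby; `optimisation_flags` is a hypothetical helper name, and on interpreters as old as the 1.8.4 builds in this thread the module may be exposed as `Config` rather than `RbConfig`.

```ruby
require 'rbconfig'

# Hypothetical helper: pick the optimisation flags (-O0, -O2, -Os, ...)
# out of a CFLAGS string. Returns an empty array if there are none.
def optimisation_flags(cflags)
  cflags.to_s.split.select { |flag| flag =~ /\A-O/ }
end

# CFLAGS as recorded when this interpreter was built.
# Assumption: on 1.8.4-era builds this may need to be Config::CONFIG.
cflags = RbConfig::CONFIG['CFLAGS']
puts "CFLAGS: #{cflags.inspect}"
puts "optimisation flags: #{optimisation_flags(cflags).inspect}"
```

If the list shows something other than `-O0`, the running interpreter is not the recompiled one (for example, an older install earlier on the PATH).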

Also I saw this today: <http://blog.segment7.net/articles/2006/04/20/update-on-undefined-method-for-fixnum>

Cheers,
Bob

···

On Apr 19, 2006, at 7:11 AM, Bob Hutchison wrote:

On Apr 19, 2006, at 3:11 AM, Xavier Noria wrote:

On Apr 18, 2006, at 23:42, Bob Hutchison wrote:

I found this <http://blog.segment7.net/articles/2006/04/07/chasing-undefined-method-for-fixnum> just now. Looks pretty relevant. Seems as though a re-compile with -O0 might be a workaround. Some suggestion that using gcc 3.3 might work too.

Excellent! So just to double-check, is it an issue that has been seen only in OSX?

It sure looks that way, but I don't know for sure. Someone thought that it might be happening on Windows too, but, to me, it sounded like a different problem. All the references/hints that I've come across involve pretty recent compilations of Ruby, even though the Ruby version might be old. So that's a bit of support for the workaround.

I hope that workaround works. Though it is nice to know I'm not alone :)

Cheers,
Bob

-- fxn
