[BUG] Frequent Segfault with stable snapshot 2004-08-23

Since upgrading from a stable snapshot dated 2004-05-26 to a recent one dated
2004-08-23, I am getting regular segfaults from a ruby script which has worked
flawlessly for ages.

The output looks like this:

  /home/andrew/rubyx/lib/ruby/site_ruby/rubyx/strfile.rb:54: [BUG] Segmentation fault
  ruby 1.8.2 (2004-08-22) [i686-linux]

  Aborted

The relevant portion of code is this ls() function added to the String object.
The offending line 54 is indicated with *

  class String
    def ls()
      fa=Dir.glob(self+"/{.,}?*")
*     fa.delete(self+"/..")
      return fa
    end
  end

Anything relevant changed recently?

Andrew Walrond

This code works for me in both

$ ruby -v
ruby 1.9.0 (2004-08-25) [i686-linux]
$ ruby-stable-snapshot -v
ruby 1.8.2 (2004-08-24) [i686-linux]

The exact code I am using is:

   class String
     def ls()
       fa=Dir.glob(self+"/{.,}?*")
       fa.delete(self+"/..")
       return fa
     end
   end

   p ".".ls

It may be something data dependent. Could you try this:

* take it out of the class String and do it inline, with self being
  replaced by my_string or something
* replace my_string with its value
* ls the directory on which it crashes to a file.
* make a little sample app in which the array is built (via %w{} say)
  without using Dir
* Do a binary-search-ish reduction to find the smallest array-of-strings
  on which delete("xxxx/..") segfaults

The simplest case in which it fails is often the most enlightening.
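A minimal sketch of the reduction suggested above, assuming a current Ruby. The helper names (survives_delete?, shrink) and the /tmp/example path are hypothetical and not from the thread. Because a segfault kills the interpreter, each candidate array is exercised in a child process, and the array is built with %w{} rather than Dir:

```ruby
require "rbconfig"

# Run Array#delete on the given entries in a child interpreter, so that a
# crash in the child does not take this script down with it.
# Returns true only if the child exits normally.
def survives_delete?(entries, target, runs: 20)
  script = "a = #{entries.inspect}; " \
           "#{runs}.times { a.dup.delete(#{target.inspect}) }"
  system(RbConfig.ruby, "-e", script) == true
end

# Binary-search-style reduction: keep whichever half still makes the
# block return true ("still fails"), and stop when neither half does.
def shrink(entries)
  while entries.size > 1
    half = entries.size / 2
    left = entries[0...half]
    right = entries[half..-1]
    if yield(left)
      entries = left
    elsif yield(right)
      entries = right
    else
      break # failure needs both halves, or is not reproducible
    end
  end
  entries
end

dir = "/tmp/example" # hypothetical directory name
target = "#{dir}/.."
entries = %w{.hidden .. a b c}.map { |name| "#{dir}/#{name}" }

smallest = shrink(entries) { |subset| !survives_delete?(subset, target) }
p smallest
```

On an interpreter that does not crash, neither half fails, so the full array comes back unchanged. For an intermittent failure like the one reported, the runs count would probably need to be much higher.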

Other questions:

* Have you tried it on multiple machines?
* How frequently does it fail? Always on a particular directory?
  Sometimes, on any directory? Occasionally, with no real pattern? Only
  when ruby is compiled with your personal fork of glibc? (Just kidding
  about the last one, I hope).

-- MarkusQ

···

On Fri, 2004-08-27 at 13:32, Andrew Walrond wrote:

CORRECTION:

On further testing I am also seeing the segfaults with the previous (known good)
snapshot.
The only relevant change is that both ruby snapshots were compiled with -O3, rather
than -O2 as before. I will do some further testing to try to confirm that this is
indeed the problem.

FYI I am using gcc 3.4.1

Andrew Walrond

···

On Sat, Aug 28, 2004 at 05:32:58AM +0900, Andrew Walrond wrote:

> > Since upgrading from a stable snapshot dated 2004-05-26 to a recent one dated
> > 2004-08-23, I am getting regular segfaults from a ruby script which has worked
> > flawlessly for ages.
>
> This code works for me in both
>
> $ ruby -v
> ruby 1.9.0 (2004-08-25) [i686-linux]
> $ ruby-stable-snapshot -v
> ruby 1.8.2 (2004-08-24) [i686-linux]
>
> The exact code I am using is:
>
>   class String
>     def ls()
>       fa=Dir.glob(self+"/{.,}?*")
>       fa.delete(self+"/..")
>       return fa
>     end
>   end
>
>   p ".".ls

Yes, it also works for me most of the time. This function is called many hundreds
of times during execution of the main script, and the script as a whole only fails
one in every 5 tries or so.
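Since a single call almost always succeeds, a test closer to the real workload would call ls many hundreds of times in one process. A rough sketch, assuming a current Ruby; stress_ls, the file names, and the iteration count are made up for illustration:

```ruby
require "tmpdir"

# The ls method from the report, with the same behaviour.
class String
  def ls
    fa = Dir.glob(self + "/{.,}?*")
    fa.delete(self + "/..")
    fa
  end
end

# Call ls repeatedly on one directory and collect the distinct results.
# More than one distinct listing (or a crash) would point at a problem.
def stress_ls(dir, iterations = 1000)
  seen = {}
  iterations.times { seen[dir.ls.sort] = true }
  seen.keys
end

Dir.mktmpdir do |dir|
  # A few hypothetical entries, including a dotfile.
  %w{a b .hidden}.each { |name| File.write(File.join(dir, name), "") }
  distinct = stress_ls(dir)
  puts "distinct listings: #{distinct.size}"
  p distinct.first
end
```

Running this script in a loop from the shell, once per build, would show whether the failure rate differs between the two snapshots.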

Andrew

> It may be something data dependent. Could you try this:

I think that is unlikely, simply because the segfault only occurs 1 in every
few attempts, even with exactly the same data.

I will reconfirm this, and also that there are no failures with the previous snapshot,
and try and narrow down the problem.

In the meantime, (and the real purpose of my initial email ;)) - if the array or array
delete or string code has been breathed on recently, please review the changes
carefully ;)

Andrew Walrond

···

On Sat, Aug 28, 2004 at 06:00:57AM +0900, Markus wrote:

> It may be something data dependent. Could you try this:

After significant testing, I can only reproduce the segfault when ruby is compiled
with -O3.

I will try and produce a script which can trigger the segfault easily, but I
would advise anyone against compiling ruby with -O3 until this is nailed down.

I have been using gcc 3.4.1, but I don't know if the bug is due to or confined
to that most recent version of gcc.
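For anyone wanting to check how their own interpreter was built, the compile flags are recorded in rbconfig. A small sketch, assuming a current Ruby where the constant is RbConfig (in the 1.8 series I believe it was reached as Config::CONFIG); the opt_flags helper is only illustrative:

```ruby
require "rbconfig"

# Pick out the -O optimisation switches from a flags string.
def opt_flags(flags)
  flags.scan(/(?<!\S)-O\w*/)
end

# CFLAGS as recorded when this interpreter was configured and built.
cflags = RbConfig::CONFIG.fetch("CFLAGS", "")

puts RUBY_DESCRIPTION
puts "CFLAGS: #{cflags}"
puts "optimisation flags: #{opt_flags(cflags).inspect}"
```

Posting this output alongside a report makes it easier to compare a crashing build with a working one.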

Andrew Walrond

···

On Sat, Aug 28, 2004 at 06:52:32AM +0900, Andrew Walrond wrote:

> On further testing I am also seeing the segfaults with the previous (known good)
> snapshot.
> The only relevant change is that both ruby snapshots were compiled with -O3, rather
> than -O2 as before. I will do some further testing to try to confirm that this is
> indeed the problem.
>
> FYI I am using gcc 3.4.1

> > It may be something data dependent. Could you try this:
>
> I think that is unlikely, simply because the segfault only occurs 1 in every
> few attempts, even with exactly the same data.

Check your logic. It could be (for example) a variable that is
uninitialized only under rather rare circumstances, and even then still
works most of the time since the "uninitialized" value is usually
something non-fatal (see my BIGNUM patch earlier on the list today for
what may be an example of this).

Rare or not rare doesn't have anything to do with data dependence, so
there's no harm trying to isolate causes.

> I will reconfirm this, and also that there are no failures with the previous snapshot,
> and try and narrow down the problem.

Good idea. You may also want to try it on a different machine, etc.

> In the meantime, (and the real purpose of my initial email ;)) - if the array or array
> delete or string code has been breathed on recently, please review the changes
> carefully ;)

I don't know, but the CVS is web accessible.

I'm about to leave for a camping trip, but I'll catch up on the list
Sunday evening. If it's still a problem I'll try to replicate it.

-- MarkusQ

···

On Fri, 2004-08-27 at 14:28, Andrew Walrond wrote:

On Sat, Aug 28, 2004 at 06:00:57AM +0900, Markus wrote: