Bug in Fox or in Ruby?

If I run this and press the “Pow!” button twice, I get a segfault. I’ve
tried whittling the example down, but everything seems to play a role…

Turning off GC prevents it, so it may be that something’s not getting
marked correctly.
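
If the GC theory is right, forcing a collection on every iteration should make the crash show up sooner rather than after some unpredictable number of button presses. A minimal sketch of that idea, in plain Ruby without Fox; `with_forced_gc` is a hypothetical helper name, not something from bug.rb:

```ruby
# Hypothetical helper (not part of the original bug.rb): run a block
# repeatedly and force a full GC pass after every iteration, so that an
# object that is not being marked gets collected as early as possible.
def with_forced_gc(iterations)
  iterations.times do |i|
    yield i
    GC.start   # force a collection instead of waiting for one
  end
  iterations
end

# Usage: generate garbage in the block, the same way onPow does.
completed = with_forced_gc(5) do |i|
  dummy = [i] * 10
end
puts "completed #{completed} iterations without a crash"
```

In the real script the block would call the handler under suspicion instead of just building arrays.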

Versions:

FXRuby 1.0.13
Fox 1.0.11
Ruby 1.6.7
Mandrake-Linux 8.1

bug.rb (1.65 KB)

Absolutely no problems nor segfaults!

ruby 1.7.3 (2002-10-18) [i386-linux-gnu]
fox-1.0.26-1
FXRuby-1.0.14-1
RH7.1, kernel 2.4.19

Joel VanderWerf wrote:

If I run this and press the “Pow!” button twice, I get a segfault. I’ve
tried whittling the example down, but everything seems to play a role…

Turning off GC prevents it, so it may be that something’s not getting
marked correctly.

Versions:

FXRuby 1.0.13
Fox 1.0.11
Ruby 1.6.7
Mandrake-Linux 8.1


#!/usr/bin/env ruby

require "fox"
include Fox

if false
  puts "disabling GC"
  GC.disable ### prevents SEGFAULT
end

class BugWindow < FXMainWindow
  include Responder

  def initialize(app)
    super(app, "Bug", nil, nil, DECOR_ALL, 0, 0, 0, 0)

    splitter = FXSplitter.new(self, (LAYOUT_SIDE_TOP|LAYOUT_FILL_X|
      LAYOUT_FILL_Y| SPLITTER_TRACKING|SPLITTER_VERTICAL|SPLITTER_REVERSED))

    contents = FXHorizontalFrame.new(splitter,
      LAYOUT_SIDE_TOP|FRAME_NONE|LAYOUT_FILL_X|LAYOUT_FILL_Y)

    table_frame = FXHorizontalFrame.new(contents,
      FRAME_SUNKEN|FRAME_THICK|LAYOUT_FILL_X|LAYOUT_FILL_Y)

    @table = FXTable.new(table_frame, 0, 0, nil, 0,
      TABLE_COL_SIZABLE|TABLE_ROW_SIZABLE|LAYOUT_FILL_X|LAYOUT_FILL_Y,
      0,0,0,0, 2,2,2,2)

    @table.disable
    @table.setFont(FXFont.new(getApp(), "courier", 9, FONTWEIGHT_LIGHT))

    FXButton.new(splitter, "&Pow!", nil, self).
      connect(SEL_COMMAND, method(:onPow))

  end

  def onPow(sender, sel, index)
    foo ### this seems to be necessary for SEGFAULT
    @table.setTableSize 0,0

    # make some garbage
    dummy = nil
    100000.times do |i|
      dummy = [i]*10
    end

    @table.setTableSize(1000, 10)

    @table.enable ### this seems to be necessary for SEGFAULT

    return 1

  end

  def foo
    nr = @table.numRows
    nc = @table.numCols

    (1..nc-2).each { |c|
      item = @table.getItem(nr-1, c) ### 'item = ' seems to be necessary
    }

  end

  def create
    position(200, 200, 600, 400)
    super
    show
  end
end

application = FXApp.new("TEST", "TEST")
application.init(ARGV)
window = BugWindow.new(application)
application.create
application.run


Wai-Sun “Squidster” Chia
Consulting & Integration
Linux/Unix/Web Developer Dude

An update on the bug.

Problem 1:

Joel VanderWerf wrote:

If I run this and press the “Pow!” button twice, I get a segfault. I’ve
tried whittling the example down, but everything seems to play a role…

Problem 2:

Maurício wrote:

Hi,

An application I’m creating crashes ruby 1.7.3 (I’m using the windows
distribution from Pragmatic Programmers).
The lines where the crash occurs (at least that’s what the message
says) are:

def float_translator (str)
  str.gsub(/([\d+-.]+) (+ | -) (\d*) (| | $)/ix){$1 + 'E' + $2 +
    $3 + $4}.to_f
end

Now, I can recreate the former and maybe also the latter problem with a
fairly simple script. It uses FXRuby. Versions:

FXRuby-1.0.14
fox-1.0.26
ruby-1.7.3
linux-mandrake 8.1

Some observations:

  • On my box, the segfault happens pretty reliably after 341 iterations.
    The message is:

    ./bug2.rb:66: [BUG] rb_gc_mark() called for broken object
    ruby 1.7.3 (2002-09-27) [i686-linux]
    zsh: abort ./bug2.rb

The line in question is a case statement involving regexes, which is
what suggests to me that this is related to Mauricio’s bug.

My original app was much more complex, and the problem came up much
sooner. So there seems to be some connection with code complexity.

  • GC.disable seems to prevent it from happening (well, I run out of swap
    first, anyway). Looks like a ruby object isn’t getting marked, or
    malloced data is getting freed improperly.

  • I tried to recreate the problem without FXRuby, but couldn’t.

  • I can’t whittle the example down much farther in any dimension. For
    example, the two lines

      old_nr = @table.numRows
      old_nc = @table.numCols
    

define local vars which are never used, but taking them out prevents the
bug (or delays it for more than 1000 iterations).

Another example, there is an unused branch of a case statement which
cannot be removed without losing the bug:

        case entry
        when /\A(.*\.)(.*)\z/
          entry = sprintf("%10.5f", entry.to_f).
            sub(/\.(\d*?)(0*)$/) { ".#{$1}#{$2.gsub("0", " ")}" }
        when /\d+/
          entry << " " * 6
        end

The first ‘when’ is never taken, but somehow parsing it is necessary to
expose the bug at iteration 341.
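
For readers who want to see what the two branches actually do, here is a standalone copy of that case statement that runs without FXRuby. The method name `format_entry` and the sample inputs are assumptions made for illustration; only the body of the `case` comes from the snippet above.

```ruby
# Standalone copy of the case statement quoted above, wrapped in a
# hypothetical method name (format_entry) for illustration only.
def format_entry(entry)
  entry = entry.dup            # avoid mutating the caller's string
  case entry
  when /\A(.*\.)(.*)\z/
    # Anything containing a dot: format as a 10-wide float with five
    # decimals, then blank out the trailing zeros so columns stay aligned.
    entry = sprintf("%10.5f", entry.to_f).
      sub(/\.(\d*?)(0*)$/) { ".#{$1}#{$2.gsub("0", " ")}" }
  when /\d+/
    # Digits but no dot: pad on the right with six spaces.
    entry << " " * 6
  end
  entry
end

p format_entry("3.5")   # first branch
p format_entry("42")    # second branch
```

With these inputs the first call yields "   3.5    " and the second "42      ". A string with neither a dot nor a digit falls through both branches and comes back unchanged.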

*** Can anyone shed any light on this murky situation? ***

bug2.rb (2.05 KB)

Wai-Sun Chia wrote:

Absolutely no problems nor segfaults!

ruby 1.7.3 (2002-10-18) [i386-linux-gnu]
fox-1.0.26-1
FXRuby-1.0.14-1
RH7.1, kernel 2.4.19

Thanks for trying it. I’ll try upgrading fox and FXRuby, first. Then
ruby if I have to. (But I won’t go so far as switching to RH ;)

BTW I got exactly the same problem with a recent PragProg 1.6.7 install
on Win2K.

Joel VanderWerf wrote:

ruby-1.7.3

That was with RUBY_RELEASE_DATE == “2002-09-27”.

But with “2002-10-30”, the bug is apparently gone from my little
example. Same is true on windows with the 1.7.3-6 installer, which also
uses that release of 1.7.3.
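
Since two builds that both call themselves 1.7.3 behave differently here, it helps to report the release date along with the version. A small sketch that prints the relevant constants; `environment_report` is a hypothetical helper name, and the Fox line is only added when FXRuby has already been loaded:

```ruby
# Collect the interpreter details that mattered in this thread.
# environment_report is a hypothetical helper name.
def environment_report
  report = {
    "ruby"         => RUBY_VERSION,
    "release date" => RUBY_RELEASE_DATE,
    "platform"     => RUBY_PLATFORM
  }
  # Fox.fxversion is only available once FXRuby has been loaded.
  # Array(...) copes with either an array or a string return value.
  if defined?(Fox) && Fox.respond_to?(:fxversion)
    report["fox"] = Array(Fox.fxversion).join(".")
  end
  report
end

environment_report.each { |name, value| puts "#{name}: #{value}" }
```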

However, the bug remains in my original application. Since the bug seems
very sensitive to slight changes in source code, anyone who wants to
investigate might want to consider running bug2.rb with the 2002-09-27
version of 1.7.3:

http://mirrors.sunsite.dk/ruby/snapshots/ruby-1.7.3-2002.09.27.tar.bz2

Or I can send you my application source, and the problem will be visible
with 2002-10-30.

FXRuby-1.0.14
fox-1.0.26
ruby-1.7.3

Bad news

pigeon% cat b.rb
#!/usr/bin/env ruby

require "fox"
include Fox

if false
  puts "disabling GC"
  GC.disable ### prevents bug
end

class BugWindow < FXMainWindow

[...]

application = FXApp.new("TEST", "TEST")
application.init(ARGV)
window = BugWindow.new(application)
application.create
window.onPow(nil, nil, nil)
pigeon%

pigeon% /usr/bin/env ruby -v
ruby 1.7.3 (2002-09-27) [i686-linux]
pigeon%

pigeon% /usr/bin/env ruby -rfox -e 'p Fox.fxversion'
[1, 0, 26]
pigeon%

pigeon% b.rb | wc -l
   1000
pigeon%

linux-mandrake 8.1

Guy Decoux

Joel VanderWerf wrote:

ruby if I have to. (But I won’t go so far as switching to RH ;)

No problem. Whatever turns you on. ;)

BTW I got exactly the same problem with a recent PragProg 1.6.7 install
on Win2K.

Sorry don’t do windows…

Wai-Sun “Squidster” Chia
Consulting & Integration
Linux/Unix/Web Developer Dude

Joel VanderWerf wrote:

Wai-Sun Chia wrote:

Absolutely no problems nor segfaults!

ruby 1.7.3 (2002-10-18) [i386-linux-gnu]
fox-1.0.26-1
FXRuby-1.0.14-1
RH7.1, kernel 2.4.19

Thanks for trying it. I’ll try upgrading fox and FXRuby, first. Then
ruby if I have to. (But I won’t go so far as switching to RH ;)

BTW I got exactly the same problem with a recent PragProg 1.6.7 install
on Win2K.

Installing fox 1.0.26 fixed the problem.

Joel VanderWerf wrote:

However, the bug remains in my original application. Since the bug seems
very sensitive to slight changes in source code, anyone who wants to
investigate might want to consider running bug2.rb with the 2002-09-27
version of 1.7.3:

http://mirrors.sunsite.dk/ruby/snapshots/ruby-1.7.3-2002.09.27.tar.bz2

Or I can send you my application source, and the problem will be visible
with 2002-10-30.

I have been trying to decide what my moral obligation is with respect to
this bug since, as you’ve stated, it seems sensitive to changes in the
unstable, development version of Ruby ;)

But I guess I can at least take a crack at it if you want to send me the
(full) application source code. I will try it with the latest versions
of Ruby and FXRuby.

ts wrote:

“J” == Joel VanderWerf vjoel@PATH.Berkeley.EDU writes:

FXRuby-1.0.14
fox-1.0.26
ruby-1.7.3

Bad news

Thanks for trying, Guy. I can’t guess what else would be different about
our systems. I’ve sent Lyle another version which (for me anyway)
exposes the bug with the latest ruby-1.7.3 (2002-10-30).

I’ll send it to you also if you like, or we can wait to hear what he
reports.

Joel VanderWerf wrote:

Installing fox 1.0.26 fixed the problem.

Well, it fixed the problem in my simplified version (“bug.rb”), but in
the real version it just delayed the problem. The messages are always
one of:

[BUG] rb_gc_mark(): unknown data type 0x38(0x82aaac0) non object
ruby 1.6.7 (2002-03-01) [i686-linux]

[BUG] Segmentation fault
ruby 1.6.7 (2002-03-01) [i686-linux]

Unfortunately, that doesn’t tell us whether the cause is ruby or fox.
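
One way to gather more data is to run the script in a child process and record which message, if any, the interpreter prints. The sketch below is only an illustration: `run_once` is a hypothetical helper, and the child command is a stand-in that fakes a "[BUG]" line rather than the real application.

```ruby
require "open3"
require "rbconfig"

# Hypothetical helper: run a command once and report whether the
# interpreter printed a "[BUG]" line, as in the messages above.
def run_once(*command)
  _stdout, stderr, status = Open3.capture3(*command)
  bug_line = stderr.lines.find { |line| line.include?("[BUG]") }
  { :success => status.success?, :bug => bug_line && bug_line.strip }
end

# Stand-in for the real script: a child process that just prints a fake
# "[BUG]" line and exits with an error. Replace it with the interpreter
# and script under test.
fake = [RbConfig.ruby, "-e", 'warn "[BUG] simulated"; exit 1']
p run_once(*fake)
```

Running the same command under several interpreter builds and comparing the recorded lines would at least show whether the failure mode changes with the Ruby version.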

I also tried ruby-1.7.3-2002.09.27 and still get the problem:

[BUG] rb_gc_mark(): unknown data type 0x31(0x81d16a0) non object
ruby 1.7.3 (2002-09-27) [i686-linux]

I’ll try to isolate the problem again, and post some code.

I'll send it to you also if you like, or we can wait to hear what he
reports.

Well, you can send me in private email, I'll try to take a look at it this
week-end.

Guy Decoux

Joel VanderWerf wrote:

I’ll try to isolate the problem again, and post some code.

Please let me know how it goes and if I can be of assistance. I will be
at RubyConf over the next few days and so it may be the middle of next
week before I’d be able to look at anything.

ts wrote:

Well, you can send me in private email, I’ll try to take a look at it this
week-end.

I think I have identified this as an FXRuby bug; I responded to Joel
privately but I probably didn’t Cc the ML. I’ll try to get a bug fix
release out ASAP.