Ruby 3 memory corruption

Hi all,

I cannot log in to bugs.ruby-lang.org because it requires two-factor authentication in a way that is not convenient for me, so I probably will not report Ruby bugs there anymore.

As a last resort, I'd like to inform people here about a bug that I seem to run into constantly. It is very hard to reproduce, though, because the computation behind it is heavy and non-deterministic.

I use Ruby 3.0.2 compiled from source on Ubuntu GNU/Linux 18.04 LTS, and I use Ractors heavily to run computations in parallel. From time to time I get the following error message (fish is the shell I'm calling the script from):

free(): invalid pointer
fish: “./script.rb” terminated by signal SIGABRT (Abort)

Any ideas on how to track it down or get more information?

Thanks,
Andras

Hi All,

I was able to reproduce the segmentation fault. This part of a function seems to cause it:

return obj.to_s[/^[+-]?[0-9]+\.[0-9]+[eE][+-]?[0-9]+$|^[+-]?[0-9]+[eE][+-]?[0-9]+$|^[+-]?[0-9]+\.[0-9]+$|^[+-]?[0-9]+$/]

This checks whether an object is a number (the object may also be text, etc.). See the trace output below:

myscript_common.rb:362: [BUG] Segmentation fault at 0x0000000000000000
ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0007 p:---- s:0091 e:000090 CFUNC :
c:0006 p:0008 s:0086 e:000085 METHOD myscript_common.rb:362
c:0005 p:0082 s:0081 e:000080 BLOCK ./myscript_cluster.rb:2368 [FINISH]
c:0004 p:---- s:0075 e:000074 CFUNC :each
c:0003 p:0336 s:0071 e:000070 METHOD ./myscript_cluster.rb:2365
c:0002 p:0032 s:0009 e:000008 BLOCK ./myscript_cluster.rb:3830 [FINISH]
c:0001 p:---- s:0003 e:000002 (none) [FINISH]

-- Ruby level backtrace information ----------------------------------------
./myscript_cluster.rb:3830:in `block (7 levels) in <main>'
./myscript_cluster.rb:2365:in `cluster3'
./myscript_cluster.rb:2365:in `each'
./myscript_cluster.rb:2368:in `block in cluster3'
myscript_common.rb:362:in `check_if_number'
myscript_common.rb:362:in `'

-- Machine register context ------------------------------------------------
RIP: 0x000055fb741407c8 RBP: 0x00007f30267ec1d0 RSP: 0x00007f30267ea740
RAX: 0x000055fb742ee4dd RBX: 0x00007f30267ea778 RCX: 0x00007f30267ea750
RDX: 0x00007f2f9d705379 RDI: 0x00007f30267ea740 RSI: 0x000055fb745c06c0
R8: 0x00007f2f9d705378 R9: 0x00007f30267ec230 R10: 0x00007f2f9d705378
R11: 0x0000000000000000 R12: 0x0000000000000001 R13: 0x000055fb748134e0
R14: 0x0000000000000000 R15: 0x00007f30267ec230 EFL: 0x0000000000010246

-- C level backtrace information -------------------------------------------
/usr/local/bin/ruby(rb_vm_bugreport+0x53a) [0x55fb742032aa] vm_dump.c:758
/usr/local/bin/ruby(rb_bug_for_fatal_signal+0xe8) [0x55fb742abba8] error.c:786
/usr/local/bin/ruby(sigsegv+0x4b) [0x55fb7415e9bb] signal.c:960
/lib/x86_64-linux-gnu/libpthread.so.0(__restore_rt+0x0) [0x7f30374c0980] ../sysdeps/pthread/funlockfile.c:28
/usr/local/bin/ruby(match_at+0x178) [0x55fb741407c8] regexec.c:1694
/usr/local/bin/ruby(onig_search_gpos+0x200) [0x55fb741494f0] regexec.c:4423
/usr/local/bin/ruby(onig_search+0x16) [0x55fb7414a046] regexec.c:4152
/usr/local/bin/ruby(rb_reg_search0+0xbd) [0x55fb7412a99d] re.c:1579
/usr/local/bin/ruby(rb_reg_search) re.c:1629
...

it continues...
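
To make the check easier to exercise on its own, outside the Ractor code, it can be pulled into a standalone method. This is only a minimal sketch: the method name check_if_number comes from the backtrace above, but the constant name NUMBER_PATTERN and the sample inputs are my own assumptions for illustration, and the real method in myscript_common.rb may contain more than this one line.

```ruby
# Hypothetical standalone version of the check at myscript_common.rb:362.
# NUMBER_PATTERN is the same regexp as in the line quoted above; only the
# constant name is made up for this sketch.
NUMBER_PATTERN = /^[+-]?[0-9]+\.[0-9]+[eE][+-]?[0-9]+$|^[+-]?[0-9]+[eE][+-]?[0-9]+$|^[+-]?[0-9]+\.[0-9]+$|^[+-]?[0-9]+$/.freeze

# Returns the matched numeric text, or nil if obj.to_s does not look like
# a number.
def check_if_number(obj)
  obj.to_s[NUMBER_PATTERN]
end

# Sample inputs (made up): integers, decimals, exponent notation, plain text.
[42, "-3.14", "1.5e3", "abc", "1."].each do |value|
  puts "#{value.inspect} => #{check_if_number(value).inspect}"
end
```

Note that in Ruby ^ and $ match at line boundaries, so a multi-line string in which any single line looks numeric will also match. Running this alone will most likely not crash; it only isolates the call that appears in the backtrace.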

Andras

Could it be because I ran the code with a time prefix to measure memory and CPU usage? This is not the bash shell's built-in time command but /usr/bin/time on Linux. Could that cause a segmentation fault? Nothing else differs from any of my other systems.

Andras
