Compilation fails for Altix (IA64)

During compile of the 08-26-2004 stable snapshot, I get a warning:

  gcc -g -O2 -I. -I. -c eval.c
  eval.c: In function `massign':
  eval.c:4912: warning: cast from pointer to integer of different size

For convenience,

  4907 assign(self, list->nd_head, RARRAY(val)->ptr[i], pcall);
  4908 list = list->nd_next;
  4909 }
  4910 if (pcall && list) goto arg_error;
  4911 if (node->nd_args) {
  4912 if ((int)(node->nd_args) == -1) {
  4913 /* no check for mere `*' */
  4914 }
  4915 else if (!list && i<len) {
  4916 assign(self, node->nd_args, rb_ary_new4(len-i, RARRAY(val)->ptr+i), pcall);

The make eventually fails with,

  gcc -g -O2 -rdynamic main.o dmyext.o libruby-static.a -ldl -lcrypt -lm -o miniruby
  ./lib/ftools.rb:204: [BUG] Segmentation fault
  ruby 1.8.2 (2004-08-26) [ia64-linux]

And

  gdb miniruby
  r mkconfig.rb rbconfig.rb

provides,

  Program received signal SIGSEGV, Segmentation fault.
  rb_yield_0 (val=2305843009218535704, self=2305843009218655784, klass=0, flags=-68136, avalue=0) at eval.c:4726
  4726 if ((state = EXEC_TAG()) == 0) {

  bt
  #0 rb_yield_0 (val=2305843009218535704, self=2305843009218655784, klass=0, flags=-68136, avalue=0) at eval.c:4726
  #1 0x4000000000026620 in rb_yield (val=2305843009218535704) at eval.c:4826
  #2 0x400000000011cda0 in rb_ary_each (ary=2305843009218534304) at array.c:1112
  #3 0x400000000002ae00 in rb_call0 (klass=2305843009218685664, recv=2305843009218534304, id=3825, oid=0, argc=0, argv=0x0, body=0x20000000004c2298,
      nosuper=62488) at eval.c:5404
  #4 0x400000000002ccb0 in rb_call (klass=2305843009218685664, recv=2305843009218534304, mid=3825, argc=0, argv=0x0, scope=0) at eval.c:5756
  #5 0x400000000001be30 in rb_eval (self=2305843009218655784, n=0x60000000000062b8) at eval.c:2988
  #6 0x400000000002c240 in rb_call0 (klass=2305843009218655744, recv=2305843009218655784, id=10273, oid=0, argc=2, argv=0x60000fffffff6500,
      body=0x20000000004a0c60, nosuper=-45416) at eval.c:5663
  #7 0x400000000002ccb0 in rb_call (klass=2305843009218655744, recv=2305843009218655784, mid=10273, argc=2, argv=0x60000fffffff6500, scope=0) at eval.c:5756
  #8 0x400000000001e010 in rb_eval (self=2305843009218737024, n=0x60000000000062b8) at eval.c:3257
  #9 0x4000000000013380 in eval_node (self=2305843009218737024, node=0x20000000004b0160) at eval.c:1287
  #10 0x40000000000140f0 in ruby_exec () at eval.c:1456
  #11 0x40000000000141d0 in ruby_run () at eval.c:1477
  #12 0x400000000000f3d0 in Init_ext () at main.c:50
  #13 0x4000000000025b80 in rb_yield_0 (val=Cannot access memory at address 0x60000fff7fffbfa0) at eval.c:4726
  #14 0x400000000000f200 in _start ()
  #15 0x4000000000025b80 in rb_yield_0 (val=Cannot access memory at address 0x60000fff7fffbef8) at eval.c:4726
  Cannot access memory at address 0x60000fff7fffbf98

  list
  4721 }
  4722 ruby_current_node = node;
  4723
  4724 PUSH_ITER(block->iter);
  4725 PUSH_TAG(lambda ? PROT_NONE : PROT_YIELD);
  4726 if ((state = EXEC_TAG()) == 0) {
  4727 redo:
  4728 if (nd_type(node) == NODE_CFUNC || nd_type(node) == NODE_IFUNC) {
  4729 if (node->nd_state == YIELD_FUNC_AVALUE) {
  4730 if (!avalue) {

Any ideas?

Thanks,

--
Bil, Hampton, Virginia

Bil Kleb wrote:

The make eventually fails with,

./lib/ftools.rb:204: [BUG] Segmentation fault
ruby 1.8.2 (2004-08-26) [ia64-linux]

My apologies: I forgot the environment.

The machine is using a special SGI Linux kernel,

  % uname -r
  2.4.21-sgi240rp04080615_10094

and a fairly old gcc,

  % gcc -v
  Reading specs from /usr/lib/gcc-lib/ia64-redhat-linux/2.96/specs
  gcc version 2.96 20000731 (Red Hat Linux 7.2 2.96-118.7.2)

Regards,

--
Bil, Hampton, Virginia

Hi,

At Sat, 28 Aug 2004 03:43:39 +0900,
Bil Kleb wrote in [ruby-talk:110682]:

During compile of the 08-26-2004 stable snapshot, I get a warning:

  gcc -g -O2 -I. -I. -c eval.c
  eval.c: In function `massign':
  eval.c:4912: warning: cast from pointer to integer of different size

For convenience,

  4907 assign(self, list->nd_head, RARRAY(val)->ptr[i], pcall);
  4908 list = list->nd_next;
  4909 }
  4910 if (pcall && list) goto arg_error;
  4911 if (node->nd_args) {
  4912 if ((int)(node->nd_args) == -1) {
  4913 /* no check for mere `*' */
  4914 }
  4915 else if (!list && i<len) {
  4916 assign(self, node->nd_args, rb_ary_new4(len-i, RARRAY(val)->ptr+i), pcall);

I think that it is fixed in CVS already. This is the backport.

Index: eval.c
===================================================================
RCS file: /cvs/ruby/src/ruby/eval.c,v
retrieving revision 1.616.2.44
diff -U2 -p -d -r1.616.2.44 eval.c
--- eval.c 25 Aug 2004 19:39:14 -0000 1.616.2.44
+++ eval.c 28 Aug 2004 02:22:13 -0000
@@ -4910,5 +4910,5 @@ massign(self, node, val, pcall)
     if (pcall && list) goto arg_error;
     if (node->nd_args) {
- if ((int)(node->nd_args) == -1) {
+ if ((long)(node->nd_args) == -1) {
       /* no check for mere `*' */
   }
@@ -5611,5 +5611,5 @@ rb_call0(klass, recv, id, oid, argc, arg
          argc, i);
         }
- if ((int)node->nd_rest == -1) {
+ if ((long)node->nd_rest == -1) {
       int opt = i;
       NODE *optnode = node->nd_opt;
@@ -5646,5 +5646,5 @@ rb_call0(klass, recv, id, oid, argc, arg
       }
       local_vars = ruby_scope->local_vars;
- if ((int)node->nd_rest >= 0) {
+ if ((long)node->nd_rest >= 0) {
           VALUE v;

--
Nobu Nakada

Thanks for the patch. However, I am still getting a segmentation fault in
exactly the same manner as reported in the original message.

FWIW, I received the following data from the NASA sysadm folks:

  Here is an update on building ruby.

  One of our sysadms [..] and I tried a few
  experiments on building Ruby on our Altix machines.
  We tried different combinations
  of (1) two versions of OS, one with SGI ProPack2.4,
  the other with SGI ProPack3.0, (2) different versions of
  c compilers, gcc.2.96, gcc.3.2.3, gcc.3.3.1 and Intel ecc 7.0.27,
  (3) two versions of Ruby, 1.8.1 and 1.8.2-preview.
  Unfortunately, none of them succeeded.

  (1) ruby-1.8.1, gcc 2.96,
      OS 2.4.21-sgi240rp04080615_10094 #1 SMP
      (SGI ProPack 2.4) make -> segmentation fault

  (2) ruby-1.8.2-preview2, gcc 2.96,
      OS 2.4.21-sgi240rp04080615_10094 #1 SMP
      (SGI ProPack 2.4) make -> segmentation fault

  (3) ruby-1.8.2-preview2, gcc 3.2.3
      OS 2.4.21-sgi240rp04080615_10094 #1 SMP
      (SGI ProPack 2.4) make failed

  (4) ruby-1.8.1, gcc 2.96,
      OS 2.4.21-sgi300rp04081111_10096
      (SGI ProPack 3.0) make stuck

  (5) ruby-1.8.1, gcc 3.2.3
      OS 2.4.21-sgi300rp04081111_10096
      (SGI ProPack 3.0) make stuck

  (6) ruby-1.8.1, gcc 3.3.1
      OS 2.4.21-sgi300rp04081111_10096
      (SGI ProPack 3.0) make with error

  (7) ruby-1.8.1, ecc 7.1.027
      OS 2.4.21-sgi300rp04081111_10096
      (SGI ProPack 3.0) make with error

  (8) ruby-1.8.2, gcc 3.2.3
      OS 2.4.21-sgi300rp04081111_10096
      (SGI ProPack 3.0) make with error

  (9) ruby-1.8.2, gcc 3.3.1
      OS 2.4.21-sgi300rp04081111_10096
      (SGI ProPack 3.0) make with error

  In contrast, I also asked one of my colleagues
  to test building Ruby on his Linux workstation, and it built
  just fine.

  ruby-1.8.2-preview2 built successfully
  using gcc.3.2.3 and the following Linux OS
  Linux [..] 2.4.21-15.0.3.EL #1

Regards,

nobu.nokada@softhome.net wrote:

At Sat, 28 Aug 2004 03:43:39 +0900,
Bil Kleb wrote in [ruby-talk:110682]:

During compile of the 08-26-2004 stable snapshot, I get a warning:

gcc -g -O2 -I. -I. -c eval.c
eval.c: In function `massign':
eval.c:4912: warning: cast from pointer to integer of different size

I think that it is fixed in CVS already. This is the backport.

--
Bil, Hampton, Virginia

Bil Kleb wrote:

FWIW, I received the following data from the NASA sysadm folks:

Here is an update on building ruby.

One of our sysadms [..] and I tried a few
experiments on building Ruby on our Altix machines.
We tried different combinations
of (1) two versions of OS, one with SGI ProPack2.4,
the other with SGI ProPack3.0, (2) different versions of
c compilers, gcc.2.96, gcc.3.2.3, gcc.3.3.1 and Intel ecc 7.0.27,
(3) two versions of Ruby, 1.8.1 and 1.8.2-preview.
Unfortunately, none of them succeeded.

Their latest experiments:

  With the great help from my colleague,
  we are able to build ruby. Here are the steps we
  took on [the Altix] using ruby-1.8.1:

  (1) modify ext/Setup and uncomment the first line
      #option nodynamic

  (2) % limit stacksize unlimited
  (3) setenv CC /opt/intel/comp/7.1.027/compiler70/ia64/bin/ecc
  (4) ./configure
  (5) make

  We tried using gcc, but it did not work.

But our Fortran dependency code fails due to a lack of available
stack levels. The following little bit demonstrates the behavior:

% cat > helloLevel.rb
def helloLevel level
   puts "hello world! "+level.to_s
   helloLevel(level+1)
end

helloLevel 0
[Ctrl-D]

% /path/to/ruby-1.8.1/ruby ./helloLevel.rb
hello world! 0
hello world! 1
[snip...]
hello world! 43
hello world! 44
helloLevel.rb:4:in `helloLevel': stack level too deep (SystemStackError)
         from helloLevel.rb:4:in `helloLevel'
          ... 33 levels...
         from helloLevel.rb:4:in `helloLevel'
         from helloLevel.rb:7

My x86 Linux box reaches a recursion level of 3886
before stopping.

Thanks,

--
Bil Kleb, Hampton, Virginia

  (2) % limit stacksize unlimited
  (3) setenv CC /opt/intel/comp/7.1.027/compiler70/ia64/bin/ecc

#ifdef __ia64__
    /* ruby crashes on IA64 if compiled with optimizer on */
    /* when if STACK_LEVEL_MAX is greater than this magic number */
    /* I know this is a kludge. I suspect optimizer bug */
#define IA64_MAGIC_STACK_LIMIT 49152
    if (STACK_LEVEL_MAX > IA64_MAGIC_STACK_LIMIT)
        STACK_LEVEL_MAX = IA64_MAGIC_STACK_LIMIT;
#endif
#endif

Can you verify that `49152' gives you only 44 levels?

Guy Decoux

ts wrote:

"B" == Bil Kleb <Bil.Kleb@NASA.Gov> writes:

> (2) % limit stacksize unlimited
> (3) setenv CC /opt/intel/comp/7.1.027/compiler70/ia64/bin/ecc

#ifdef __ia64__
    /* ruby crashes on IA64 if compiled with optimizer on */
    /* when if STACK_LEVEL_MAX is greater than this magic number */
    /* I know this is a kludge. I suspect optimizer bug */
#define IA64_MAGIC_STACK_LIMIT 49152
    if (STACK_LEVEL_MAX > IA64_MAGIC_STACK_LIMIT)
        STACK_LEVEL_MAX = IA64_MAGIC_STACK_LIMIT;
#endif

Can you verify that `49152' gives you only 44 levels?

I will gladly do so, but I need guidance: Can you sketch
the steps to do this?

Thanks,

--
Bil Kleb, Hampton, Virginia

I will gladly do so, but I need guidance: Can you sketch
the steps to do this?

Set a breakpoint in stack_check(), or in rb_call0() where it calls stack_check()
(it's probably inlined), and see what value it has.

Guy Decoux

ts wrote:

Set a breakpoint in stack_check(), or in rb_call0() where it calls stack_check()
(it's probably inlined), and see what value it has.

Here's what I did and found, but I am not sure what you mean by "value it has",
i.e., I need more hand-holding.

  gdb /path/to/ruby-1.8.1/ruby

  (gdb) break stack_check
  Breakpoint 1 at 0x4000000000011c31: file eval.c, line 4758.

  (gdb) r helloLevel.rb
  Starting program: /u/.realmounts/staff/schang/ruby-1.8.1/ruby helloLevel.rb
  hello world! 0
  hello world! 1
  hello world! 2
  hello world! 3
  hello world! 4
  hello world! 5
  hello world! 6
  hello world! 7

  Breakpoint 1, stack_check () at eval.c:4758
  4758 if (!overflowing && ruby_stack_check()) {

  (gdb) p STACK_LEVEL_MAX
  $1 = 49152

  (gdb) c
  Continuing.
  hello world! 8
  hello world! 9
  [..]
  hello world! 43
  hello world! 44

  Breakpoint 1, stack_check () at eval.c:4758
  4758 if (!overflowing && ruby_stack_check()) {

  (gdb) p STACK_LEVEL_MAX
  $2 = 49152

  (gdb) list
  4753 static inline void
  4754 stack_check()
  4755 {
  4756 static int overflowing = 0;
  4757
  4758 if (!overflowing && ruby_stack_check()) {
  4759 int state;
  4760 overflowing = 1;
  4761 PUSH_TAG(PROT_NONE);
  4762 if ((state = EXEC_TAG()) == 0) {
  (gdb) list
  4763 rb_exc_raise(sysstack_error);
  4764 }
  4765 POP_TAG();
  4766 overflowing = 0;
  4767 JUMP_TAG(state);
  4768 }
  4769 }

  (gdb) bt
  #0 stack_check () at eval.c:4758
  #1 0x4000000000050320 in rb_call0 (klass=2305843009219151112, recv=89, id=43,
      oid=43, argc=1, argv=0x60000fffffea3d20, body=0x20000000005342d8,
      nosuper=0) at eval.c:5031
  #2 0x4000000000054190 in rb_call (klass=2305843009219151112, recv=89, mid=43,
      argc=1, argv=0x60000fffffea3d20, scope=0) at eval.c:5287
  #3 0x4000000000036850 in $_1$rb_eval$TAG$GLOB () at eval.c:3076
  #4 0x4000000000036f70 in $_1$rb_eval$TAG$GLOB () at eval.c:3086
  #5 0x40000000000528e0 in rb_call0 (klass=2305843009219189312,
      recv=2305843009219179552, id=10193, oid=10193, argc=0,
      argv=0x60000fffffeb0e08, body=0x200000000051c458, nosuper=0) at eval.c:5194
  #6 0x4000000000054190 in rb_call (klass=2305843009219189312,
      recv=2305843009219179552, mid=10193, argc=1, argv=0x60000fffffeb0e00,
      scope=1) at eval.c:5287
  #7 0x4000000000037220 in $_1$rb_eval$TAG$GLOB () at eval.c:3091
  #8 0x40000000000528e0 in rb_call0 (klass=2305843009219189312,
      recv=2305843009219179552, id=10193, oid=10193, argc=0,
      argv=0x60000fffffeb83c8, body=0x200000000051c458, nosuper=0) at eval.c:5194
  #9 0x4000000000054190 in rb_call (klass=2305843009219189312,
      recv=2305843009219179552, mid=10193, argc=1, argv=0x60000fffffeb83c0,
      scope=1) at eval.c:5287
  [..]

--
Bil Kleb, Hampton, Virginia

Here's what I did and found, but I am not sure what you mean by "value it has",
i.e., I need more hand-holding.

Well, you need the value of rb_gc_stack_start and the value of the current
stack pointer, which must be in a register.

Apparently it is the limit `49152' that is the problem.

For example, at moulon:

Breakpoint 2, stack_check () at eval.c:5254
5254 overflowing = 1;
(gdb) p STACK_LEVEL_MAX
$1 = 1835008
(gdb) p rb_gc_stack_start
$2 = (VALUE *) 0xbfffdb90
(gdb) info register esp
esp 0xbf8ed300 0xbf8ed300
(gdb) p (0xbfffdb90-0xbf8ed300)/4
$3 = 1851940
(gdb)

Guy Decoux

ts wrote:

Well, you need the value of rb_gc_stack_start and the value of the current
stack pointer, which must be in a register.

I could only get to rb_gc_stack_start,

  (gdb) p rb_gc_stack_start
  $4 = (VALUE *) 0x60000fffffffb168

because I have no clue which register holds the stack pointer on IA64,

  (gdb) info register esp
  esp: invalid register

Is it not enough to show that Ruby fails after 44 levels and
STACK_LEVEL_MAX = 49152?

Regards,

--
Bil Kleb, Hampton, Virginia

  (gdb) info register esp
  esp: invalid register

info register will give you the value of all registers

Guy Decoux

  (gdb) info register esp
  esp: invalid register

If I'm right, it's r12

Guy Decoux

ts wrote:

"B" == Bil Kleb <Bil.Kleb@NASA.Gov> writes:

info register will give you the value of all registers

I did that and the list went on forever. Do you want to see
that?

--
Bil Kleb, Hampton, Virginia

ts wrote:

If I'm right, it's r12

(gdb) p rb_gc_stack_start
$1 = (VALUE *) 0x60000fffffffb168
(gdb) info register r12
r12 0x60000fffffea17a0 6917546619825690528
(gdb) p (0x60000fffffffb168-0x60000fffffea17a0)/4
$2 = 353906
(gdb) p (0x60000fffffffb168-0x60000fffffea17a0)/8
$3 = 176953
(gdb) p STACK_LEVEL_MAX
$4 = 49152

--
Bil Kleb, Hampton, Virginia

I did that and the list went on forever. Do you want to see
that?

it has only 128 registers :-)

only r12 if you can display it.

Guy Decoux

(gdb) p (0x60000fffffffb168-0x60000fffffea17a0)/8
$3 = 176953
(gdb) p STACK_LEVEL_MAX
$4 = 49152

The problem really is with STACK_LEVEL_MAX. Now, how to solve it?

Guy Decoux

ts wrote:

#ifdef __ia64__
    /* ruby crashes on IA64 if compiled with optimizer on */
    /* when if STACK_LEVEL_MAX is greater than this magic number */
    /* I know this is a kludge. I suspect optimizer bug */
#define IA64_MAGIC_STACK_LIMIT 49152
    if (STACK_LEVEL_MAX > IA64_MAGIC_STACK_LIMIT)
        STACK_LEVEL_MAX = IA64_MAGIC_STACK_LIMIT;
#endif

[gdb experiments]

the problem is really wi[th] STACK_LEVEL_MAX, now how to solve it ???

Where do we go from here?

Regards,

--
Bil Kleb, Hampton, Virginia