Allow *array expansion anywhere in list

Regarding the proposal mentioned in the subject (see also
http://www.rubygarden.org/article.php?sid=258), I decided to grab the ruby
source and see how difficult it would be to implement, and whether anything
in the grammar itself precludes *array in the middle of the list.

The findings of fact (well, OK, my opinion) are these:

*array is only allowed at the end of arrays because it’s implemented as a hack.

For those unfamiliar with the inner workings, usually parse.y creates a
NODE_ARRAY for an array, which is a linked list of nodes with the first node
containing the length of the array. If a *arg value is found, a
NODE_ARGSCAT node is inserted via arg_concat()[1], which has as arguments
the original array and the arg.

[1] parse.y, look for tSTAR arg, e.g. in call_args or aref_args.

In eval.c, which walks the node tree (“AST”) and evaluates it, NODE_ARRAY
walks the list and builds a ruby array (via array.c:rb_ary_new2() and direct
manipulation of the content, inserting the result of evaluating each
element-node); NODE_ARGSCAT evals the array and the arg (which is usually
also an array), and calls rb_ary_concat (which may also call to_ary).
Simple enough, but you can see why this precludes easily allowing *array
mid-list.
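
To make the shape of the problem concrete, here is a toy Ruby model of the two node types described above. It is only a sketch: ArrayNode, ArgsCatNode and evaluate are names invented for illustration, and they merely imitate in Ruby what the post says eval.c does in C.

```ruby
# Toy model only: ArrayNode, ArgsCatNode and evaluate are hypothetical names,
# not part of Ruby's real parser or evaluator.
ArrayNode   = Struct.new(:elements)     # stands in for NODE_ARRAY
ArgsCatNode = Struct.new(:head, :rest)  # stands in for NODE_ARGSCAT

def evaluate(node)
  case node
  when ArrayNode
    # The length is known up front, so the result can be preallocated.
    result = Array.new(node.elements.length)
    node.elements.each_with_index { |e, i| result[i] = evaluate(e) }
    result
  when ArgsCatNode
    # Evaluate the fixed part, then append the expanded part at the end.
    evaluate(node.head).concat(Array(evaluate(node.rest)))
  else
    node # anything else evaluates to itself in this toy model
  end
end

# [1, 2, *[3, 4]] in this model: one fixed head, one expanded tail.
tree = ArgsCatNode.new(ArrayNode.new([1, 2]), [3, 4])
p evaluate(tree) # => [1, 2, 3, 4]
```

Because an ArgsCatNode has exactly one head and one tail, the model has no slot for elements that follow the expanded part.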

Changing the parser to allow *array mid-list (in call args and [] arrays,
and probably other places[2]) is trivial, but deciding what sort of node
structure to build is more complicated. I assume slowing down NODE_ARRAY’s
processing is a definite no-no, since (from looking at ruth parse trees)
it’s used everywhere. However, if tSTAR arg_value instead added a new node,
NODE_XELEMENT (expand element) to the array, e.g.

args ',' tSTAR arg_value { list_append($1, NEW_XELEMENT($4)); }

and the processing of NODE_ARRAY in eval.c worked something like:

case NODE_ARRAY:
{
    VALUE ary;
    long i;

    i = node->nd_alen;
    ary = rb_ary_new2(i);
    for (i = 0; node; node = node->nd_next) {
        if (NODE_XELEMENT == nd_type(node->nd_head)) {
            /* slow path: from the first *element on, let the array grow */
            for (; node; node = node->nd_next) {
                if (NODE_XELEMENT == nd_type(node->nd_head))
                    rb_ary_concat(ary, rb_eval(self, node->nd_head->nd_head));
                else
                    rb_ary_push(ary, rb_eval(self, node->nd_head));
            }
            break;
        } else {
            /* fast path: store directly into the preallocated slots */
            RARRAY(ary)->ptr[i++] = rb_eval(self, node->nd_head);
            RARRAY(ary)->len = i;
        }
    }

    result = ary;
}
break;

(which doesn’t slow down the normal array processing except to add an
equality check which is probably only a couple of operations), then *array
can appear safely anywhere in a list.

However this still doesn’t work, because eval.c’s SETUP_ARGS macro (probably
among others) assumes that the array node’s alen field is accurate and
allocates a chunk of memory based on it. The solution is to add a
NODE_XARRAY (extended array) node that behaves like NODE_ARRAY but also
performs *-interpolation. If anyone was worried about the overhead of the
NODE_XELEMENT test in regular array processing, it’s gone: NODE_ARRAY stays
as it is, and the only change is that as soon as a NODE_XELEMENT is added to
an array, the array’s node type must become NODE_XARRAY instead. This also
allows for ditching NODE_ARGSCAT, and maybe NODE_REST(ARGS|ARY) too.
(Furthermore, SETUP_ARGS will automatically call rb_eval on a NODE_XARRAY,
which can hand it back an array.)
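
As a rough illustration of what evaluating such a NODE_XARRAY would amount to, here is a toy version in Ruby. Splat and eval_xarray are hypothetical names used only for this sketch; the real change would live in parse.y and eval.c.

```ruby
# Toy model only: Splat and eval_xarray are hypothetical names for illustration.
Splat = Struct.new(:value) # stands in for the proposed NODE_XELEMENT

# Stands in for evaluating the proposed NODE_XARRAY: the final length is not
# known in advance, so the result grows as the elements are visited.
def eval_xarray(elements)
  elements.each_with_object([]) do |element, result|
    if element.is_a?(Splat)
      result.concat(Array(element.value)) # expand in place
    else
      result.push(element)
    end
  end
end

p eval_xarray([1, Splat.new([2, 3]), 4, Splat.new([5]), 6])
# => [1, 2, 3, 4, 5, 6]
```

The point of the separate node type is that only lists containing a Splat pay for the growing array; ordinary lists keep their preallocated fast path.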

[2] rhs only of course; lhs is more complicated, but it would be nice for
consistency if (as discussed on OPN #ruby-talk) e.g. a,*b,c = f()
assigned the first element returned to a, the last to c, and the rest
to b, in effect doing tmp = f(); a = tmp.shift; c = tmp.pop; b = tmp,
but that’s a separate (if related) problem.
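
The manual expansion in footnote [2] can be tried directly, using shift to take the first element and pop to take the last. In this sketch f is a hypothetical stand-in for any method that returns an array. For comparison, Ruby 1.9 and later accept the a, *b, c form on the left-hand side as written.

```ruby
# f is a hypothetical stand-in for any method returning an array.
def f
  [1, 2, 3, 4, 5]
end

tmp = f
a = tmp.shift # first element
c = tmp.pop   # last element
b = tmp       # everything in between
p [a, b, c]   # => [1, [2, 3, 4], 5]

# Ruby 1.9 and later accept the destructuring form directly.
x, *y, z = f
p [x, y, z]   # => [1, [2, 3, 4], 5]
```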

While I’m looking at NODE_ARRAY, is there any reason why NEW_ZARRAY couldn’t
have been (node.h) #define NEW_ZARRAY rb_node_newnode(NODE_ARRAY,0,0,0)?
Was it done for speed?

Apologies if this has already been discussed; I couldn’t find it in the
archives.

If this idea looks workable I’d be happy to discuss it further or submit a
patch at some point. (Hm… maybe NODE_ARGSCAT could be used multiple times
and multiple NODE_ARRAYs created instead, but that’s ugly.)

Hmm… something still seems rotten in the state of Denmark about how ‘*’ is
parsed, maybe greater reforms are needed.


Dave
Isa. 40:31

Hi,

In message “Allow *array expansion anywhere in list” on 02/11/19, David Robins dbrobins@davidrobins.net writes:

The findings of fact (well, OK, my opinion) are these:

*array is only allowed at the end of arrays because it’s implemented as a hack.

This is a false assumption. The reason is it’s not encouraged. If
you need array expansion in the middle of arguments, there must be
something wrong, perhaps in method argument design.

						matz.

The findings of fact (well, OK, my opinion) are these:

*array is only allowed at the end of arrays because it’s implemented as a hack.

This is a false assumption. The reason is it’s not encouraged. If
you need array expansion in the middle of arguments, there must be
something wrong, perhaps in method argument design.

Yes, but sometimes you don’t design the method, and expansion is useful to
encapsulate arguments. For example, this is what I tried to do some weeks
ago, and seemed very natural:

==
class Point
  attr_accessor :x, :y

  def initialize x=0,y=0
    @x=x; @y=y
  end

  def to_a
    [@x, @y]
  end
end

p1=Point.new(5,4)
p2=Point.new(10,10)

now I have a ‘line’ method, from an external library, defined as:

def line x1, y1, x2, y2, attrs

line(*p1, *p2, attrs) # parse error

Isn’t this more clear and natural and maintainable and easy than

line(p1.x, p1.y, p2.x, p2.y, attrs)

?

Hi,

In message “Re: Allow *array expansion anywhere in list” on 02/11/23, Carlos angus@quovadis.com.ar writes:

now I have a ‘line’ method, from an external library, defined as:

def line x1, y1, x2, y2, attrs

line(*p1, *p2, attrs) # parse error

Isn’t this more clear and natural and maintainable and easy than

line(p1.x, p1.y, p2.x, p2.y, attrs)

?

I don’t think so. In my opinion, the latter is far easier to read and
maintain, since we don’t have to guess. The only case where *array is
useful is expanding homogeneous values in a list, e.g.
[a,b,c,*d,e,f]. But I’m not sure it is useful enough to
change the syntax and lose symmetry with unary *
in the parameter list.
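
For the homogeneous case, the same list can be built with plain array concatenation, which needs no syntax change. A small sketch with made-up values; the second form is accepted by Ruby 1.9 and later.

```ruby
# Made-up values for illustration.
a, b, c = 1, 2, 3
d = [4, 5]
e, f = 6, 7

# Concatenation builds the list without mid-list expansion.
joined = [a, b, c] + d + [e, f]
p joined # => [1, 2, 3, 4, 5, 6, 7]

# Ruby 1.9 and later also accept the expanded form directly.
expanded = [a, b, c, *d, e, f]
p expanded # => [1, 2, 3, 4, 5, 6, 7]
```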

						matz.

The findings of fact (well, OK, my opinion) are these:
*array is only allowed at the end of arrays because it’s
implemented as a hack.
This is a false assumption. The reason is it’s not encouraged. If
you need array expansion in the middle of arguments, there must
be something wrong, perhaps in method argument design.
Yes, but sometimes you don’t design the method, and expansion is
useful to encapsulate arguments. For example, this is what I tried
to do some weeks ago, and seemed very natural:

line(*p1, *p2, attrs) # parse error
vs.
line(p1.x, p1.y, p2.x, p2.y, attrs)

It’s not as pretty, but:

line(*[p1.to_a, p2.to_a, "boo"].flatten)

Alternately:

class Point
  attr_accessor :x, :y

  def initialize(x = 0, y = 0)
    @x = x
    @y = y
  end

  def to_a
    [@x, @y]
  end

  def self.[](p)
    [p.x, p.y]
  end
end

p1 = Point.new(5, 4)
p2 = Point.new(10, 10)

def line(x1, y1, x2, y2, attrs)
  puts "line(#{x1}, #{y1}, #{x2}, #{y2}, #{attrs})"
end

line(*[p1.to_a, p2.to_a, "boo"].flatten)
line(*[Point[p1], Point[p2], "boo"].flatten)

Ideal? No. But I do think that while line(*p1, *p2, attrs) looks
cleaner, it makes too many assumptions about the interfaces of both
classes to be a valid general case.

-austin
– Austin Ziegler, austin@halostatue.ca on 2002.11.22 at 12.12.28

On Sat, 23 Nov 2002 01:35:38 +0900, Carlos wrote:

If you believe so, why not

def my_line(p1, p2, attrs)
line(p1.x, p1.y, p2.x, p2.y, attrs)
end
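
A runnable version of this wrapper might look as follows. It is only a sketch: Point mirrors the class from earlier in the thread, and line is a stub standing in for the external library's routine, which is assumed here to simply return a description of its arguments.

```ruby
# Point mirrors the class from earlier in the thread.
class Point
  attr_accessor :x, :y

  def initialize(x = 0, y = 0)
    @x = x
    @y = y
  end
end

# Stub for the external library's method (an assumption for this sketch).
def line(x1, y1, x2, y2, attrs)
  "line(#{x1}, #{y1}, #{x2}, #{y2}, #{attrs})"
end

# The wrapper keeps the point-to-coordinate mapping in one place.
def my_line(p1, p2, attrs)
  line(p1.x, p1.y, p2.x, p2.y, attrs)
end

puts my_line(Point.new(5, 4), Point.new(10, 10), "boo")
# prints: line(5, 4, 10, 10, boo)
```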

On Sat, Nov 23, 2002 at 01:35:38AM +0900, Carlos wrote:

now I have a ‘line’ method, from an external library, defined as:

def line x1, y1, x2, y2, attrs

line(*p1, *p2, attrs) # parse error

Isn’t this more clear and natural and maintainable and easy than

line(p1.x, p1.y, p2.x, p2.y, attrs)


Alan Chen
Digikata Computing
http://digikata.com

In article 20021122163518.GA9460@pelikan, Carlos angus@quovadis.com.ar wrote:

The findings of fact (well, OK, my opinion) are these:

*array is only allowed at the end of arrays because it’s implemented
as a hack.

This is a false assumption. The reason is it’s not encouraged. If
you need array expansion in the middle of arguments, there must be
something wrong, perhaps in method argument design.

Yes, but sometimes you don’t design the method, and expansion is useful to
encapsulate arguments. For example, this is what I tried to do some weeks
ago, and seemed very natural:

==
class Point
  attr_accessor :x, :y

  def initialize x=0,y=0
    @x=x; @y=y
  end

  def to_a
    [@x, @y]
  end
end

p1=Point.new(5,4)
p2=Point.new(10,10)

now I have a ‘line’ method, from an external library, defined as:

def line x1, y1, x2, y2, attrs

line(*p1, *p2, attrs) # parse error

Isn’t this more clear and natural and maintainable and easy than

line(p1.x, p1.y, p2.x, p2.y, attrs)

?

Ummmm… I’d have to answer no.

Is the external library written in Ruby? If so, then it would be better
to add another method:

line_from_points(p1,p2,attrs)

Phil