[RCR] New [] Semantics

Currently, the following code

   a = [1, 2, 3, 4, 5]
   a[0, 3]

returns

   [1, 2, 3]

This is somewhat counter-intuitive. Since Ruby has a built-in range
type, [] ought to take advantage of it. I propose that the []
operators be redefined so that this behavior can only be achieved by
explicitly providing a Range, e.g. a[0...3]. The original code would
then work like #values_at and return [1, 4].
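
For reference, a minimal sketch of how the forms discussed here behave in current Ruby. The variable names are only for illustration. Note that #values_at with the arguments 0 and 3 picks the elements at those two positions.

```ruby
a = [1, 2, 3, 4, 5]

by_start_and_length = a[0, 3]           # start at index 0, take 3 elements
by_exclusive_range  = a[0...3]          # indices 0 up to, but not including, 3
by_inclusive_range  = a[0..2]           # indices 0 through 2
by_positions        = a.values_at(0, 3) # the elements at index 0 and index 3

p by_start_and_length  # => [1, 2, 3]
p by_exclusive_range   # => [1, 2, 3]
p by_inclusive_range   # => [1, 2, 3]
p by_positions         # => [1, 4]
```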

Also, I don't know what happened with the earlier mention about the
confusion between .. / ... but I'm a supporter of getting rid of '...'
and just making .. inclusive. Exclusive ranges can be represented
with 0..(n + 1) if necessary. I don't know if this is appropriate for
an RCR.

I think the above [] behavior is more in keeping with POLS and is
slightly more intuitive than the current default.

Obviously this suggestion (and the sub-suggestion about .. and ...)
would break existing code. I don't know if RCR's are allowed to do
that, but I'm just throwing this idea out there.

Bill Atkins

Hi,


In message "Re: [RCR] New [] Semantics" on Tue, 5 Oct 2004 08:24:23 +0900, Bill Atkins <batkins57@gmail.com> writes:

I think the above [] behavior is more in keeping with POLS and is
slightly more intuitive than the current default.

Obviously this suggestion (and the sub-suggestion about .. and ...)
would break existing code. I don't know if RCR's are allowed to do
that, but I'm just throwing this idea out there.

RCR's are allowed to break existing code, but NOT ALLOWED to mention
POLS in them. Being intuitive is a weak reason to break
compatibility.

              matz.

Bill Atkins wrote:

Currently, the following code

   a = [1, 2, 3, 4, 5]
   a[0, 3]

returns

   [1, 2, 3]

This is somewhat counter-intuitive. Since Ruby has a built-in range
type, [] ought to take advantage of it. I propose that the []
operators be redefined so that this behavior can only be achieved by
explicitly providing a Range, e.g. a[0...3]. The original code would
then work like #values_at and return [1, 4].

I also thought about this about a year and a half ago. (And I didn't even realize that I had been using Ruby for that long already. I still feel like I've only barely scratched the surface of what this language is able to do, and there are so many libraries, concepts, and mindsets that I haven't explored properly yet...)

That aside, my solution was to use ary[[0, 1..3, 4]] for this.

I coded up an implementation of it, but please note that this is quite old code. I didn't know about values_at at the time I wrote it and I'm not even sure if the code is correct in all cases. (Had I written this now, I would want to have a handful of test cases, even though I might hesitate to write them -- I really need to make this a habit.)

I have attached the code, because it is a quick way of testing this alternate syntax from irb -- it might be especially interesting to see what happens with other Enumerables in action -- it is hard to think about such border cases without an implementation.

So, what do you think about using this syntax instead? It is slightly more complex than the one you proposed of course, but it is also consistent with ary[1..3] and could in theory be useful in more cases.

Regards,
Florian Gross

array_fetch_array.rb (1.35 KB)

Bill Atkins wrote:

Currently, the following code

   a = [1, 2, 3, 4, 5]
   a[0, 3]

returns

   [1, 2, 3]

This is somewhat counter-intuitive. Since Ruby has a built-in range
type, [] ought to take advantage of it. I propose that the []
operators be redefined so that this behavior can only be achieved by
explicitly providing a Range, e.g. a[0...3]. The original code would
then work like #values_at and return [1, 4].

Yes but [start,length] is capable of expressing ranges that make no sense using [range]. For instance:

a = [:a,:b,:c,:d,:e,:f,:g,:h]
a[-2,2] # => [:g,:h]
a[-2..2] # => []
a[-2..-1] # => [:g, :h]
a[0..-1] # => [:a,:b,:c,:d,:e,:f,:g,:h]

Obviously it isn't too hard to convert between the two formats, but many times it makes far more sense to express it in the [start,length] format as opposed to the range format of start..end.
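
A small sketch of that conversion, assuming a hypothetical helper named start_length_to_range (it is not part of Ruby). It shows the extra normalisation a negative start needs before the same slice can be written as a range.

```ruby
# Hypothetical helper: turn a [start, length] pair into an equivalent
# exclusive range for an array of the given size.
def start_length_to_range(start, length, size)
  first = start < 0 ? size + start : start  # negative starts count back from the end
  first...(first + length)
end

a = [:a, :b, :c, :d, :e, :f, :g, :h]

by_length = a[-2, 2]
by_range  = a[start_length_to_range(-2, 2, a.size)]

p by_length  # => [:g, :h]
p by_range   # => [:g, :h]
```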

Also, I don't know what happened with the earlier mention about the
confusion between .. / ... but I'm a supporter of getting rid of '...'
and just making .. inclusive. Exclusive ranges can be represented
with 0..(n + 1) if necessary. I don't know if this is appropriate for
an RCR.

This makes perfect sense when the range is over numeric values, but to me ("a"..("c".succ)) is ugly and not easily read. Or any other range of objects like that for that matter.

Charles Comstock

Bill Atkins wrote:

Also, I don't know what happened with the earlier mention about the
confusion between .. / ... but I'm a supporter of getting rid of '...'
and just making .. inclusive. Exclusive ranges can be represented
with 0..(n + 1) if necessary. I don't know if this is appropriate for
an RCR.

I find it surprising that several people have expressed so much aversion to '...' that they would actually favor removing it from the language. In all earnestness, why don't you just forget it exists?

While you may find inclusive ranges less confusing than exclusive ranges, for some people (including me and, apparently[1], Ara), exclusive ranges are more useful and intuitive.

Using exclusive ranges eliminates error-prone "n+1" calculations. As a case in point, you said "0..(n + 1)" but I think you actually meant "0..(n - 1)". This may have just been a typo on your part, but I personally make a lot of mistakes of this kind when I have to use inclusive ranges, especially when converting from start/end indexes to offset/length.
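
To illustrate the point about offset/length conversion, a short sketch (the variable names are illustrative only). With an exclusive range the end is simply offset + length, while the inclusive form needs the extra "- 1", which is exactly the kind of adjustment that is easy to get wrong.

```ruby
n = 5
first_n_exclusive = (0...n).to_a
first_n_inclusive = (0..(n - 1)).to_a
off_by_two        = (0..(n + 1)).to_a  # the slip discussed above

p first_n_exclusive  # => [0, 1, 2, 3, 4]
p first_n_inclusive  # => [0, 1, 2, 3, 4]
p off_by_two         # => [0, 1, 2, 3, 4, 5, 6]

items = %w[a b c d e f]
offset, length = 2, 3
p items[offset...(offset + length)]     # => ["c", "d", "e"]
p items[offset..(offset + length - 1)]  # => ["c", "d", "e"]
```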

[1]http://groups-beta.google.com/group/comp.lang.ruby/msg/a6cbf0ee261fc9b4

Also, I don't know what happened with the earlier mention about the
confusion between .. / ... but I'm a supporter of getting rid of '...'

     I think I killed it (admittedly, the thread was on its last legs
in any case) by pointing out an idiom (which everyone had seemed to
approve of) from the cypher-quiz that would be very hard to replicate as
concisely without exclusive ranges:

Here's a simple example of where ... is very nice to have. You want
to cut a deck at card x, so that x is on the top after the cut:

deck = deck.values_at(x..-1,0...x)

*grin* Try doing that as concisely without "..."
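
A runnable sketch of the idiom, with hypothetical method names cut and cut_inclusive added for comparison. It also shows why the inclusive spelling is awkward: when x is 0, the range 0..(x - 1) becomes 0..-1, which selects the whole array.

```ruby
# Cut the deck so that the card at index x ends up on top.
def cut(deck, x)
  deck.values_at(x..-1, 0...x)
end

# The same cut spelled with an inclusive second range, for comparison.
def cut_inclusive(deck, x)
  deck.values_at(x..-1, 0..(x - 1))
end

deck = [:ace, :two, :three, :four, :five]

p cut(deck, 2)            # => [:three, :four, :five, :ace, :two]
p cut(deck, 0)            # => [:ace, :two, :three, :four, :five]
p cut_inclusive(deck, 0)  # the whole deck twice, because 0..-1 selects everything
```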

-- Markus


On Mon, 2004-10-04 at 16:24, Bill Atkins wrote:

that's NOT what my wife says! :wink:

-a


On Tue, 5 Oct 2004, Yukihiro Matsumoto wrote:

Being intuitive is a weak reason to break compatibility.

--

EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
PHONE :: 303.497.6469
A flower falls, even though we love it;
and a weed grows, even though we do not love it. --Dogen

===============================================================================

Good point. I hadn't thought about that. I just think the difference
between .. / ... is sort of confusing.

Bill Atkins


On Tue, 5 Oct 2004 10:09:52 +0900, Charles Comstock <cc1@cec.wustl.edu> wrote:

This makes perfect sense when the range is over numeric values, but to
me ("a"..("c".succ)) is ugly and not easily read. Or any other range of
objects like that for that matter.

Hmm... It still signifies a range. So if there were just a notation for it, that
might be nice. I'm not sure what that would be though.

Really, in looking over Ruby's Range class, it is a bit limited. You can't
exclude the start element, and it doesn't provide a way to specify an
increment, so you can't iterate over floats. A more complete range would have
an initializer something like:

  Range.new(start, end, start_exclude=false, end_exclude=false, inc=1)

Again, I'm not sure of the best way to make a nice neat literal notation for all that.
Although one simple suggestion is to have a '+' method to set the increment.

  (1.0 .. 3.0 + 0.5).to_a #=> [1.0, 1.5, 2.0, 2.5, 3.0]
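
For comparison, current Ruby versions cover the increment case with Range#step, while the '+' notation above remains only a proposal. A minimal sketch, with illustrative variable names:

```ruby
halves = (1.0..3.0).step(0.5).to_a
p halves  # => [1.0, 1.5, 2.0, 2.5, 3.0]

# There is no literal syntax for excluding the start element; dropping
# the first value after the fact is one workaround.
without_start = halves.drop(1)
p without_start  # => [1.5, 2.0, 2.5, 3.0]

# An exclusive end combines with step as usual.
without_end = (1.0...3.0).step(0.5).to_a
p without_end  # => [1.0, 1.5, 2.0, 2.5]
```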

T.


On Monday 04 October 2004 09:09 pm, Charles Comstock wrote:

Yes but [start,length] is capable of expressing ranges that make no
sense using [range]. For instance:

a = [:a,:b,:c,:d,:e,:f,:g,:h]
a[-2,2] # => [:g,:h]
a[-2..2] # => []
a[-2..-1] # => [:g, :h]
a[0..-1] # => [:a,:b,:c,:d,:e,:f,:g,:h]

Obviously it isn't too hard to convert between the two formats, but many
times it makes far more sense to express it in the [start,length] format
as opposed to the range format of start..end.

Hi,


In message "Re: [RCR] New [] Semantics" on Tue, 5 Oct 2004 08:54:53 +0900, Ara.T.Howard@noaa.gov writes:

Being intuitive is a weak reason to break compatibility.

that's NOT what my wife says! :wink:

Don't tell her. She would notice we are weird people.
Maybe it's too late.

              matz.

Hi --

> This makes perfect sense when the range is over numeric values, but to
> me ("a"..("c".succ)) is ugly and not easily read. Or any other range of
> objects like that for that matter.

Good point. I hadn't thought about that. I just think the difference
between .. / ... is sort of confusing.

There are a couple of mnemonics for remembering which is which.

  .. has two letters: in
  ... has three letters: out

:slight_smile:

Or you can think of the three dots as kind of pushing the end value
out of reach, so that the included values of the range can't quite get
there.

David


On Tue, 5 Oct 2004, Bill Atkins wrote:

On Tue, 5 Oct 2004 10:09:52 +0900, Charles Comstock <cc1@cec.wustl.edu> wrote:

--
David A. Black
dblack@wobblini.net

The ideas I'm (slowly) playing with for ranges:

     Extend the Range so that either or both ends can be
         inclusive, exclusive or unbounded (i.e., open, closed, or
         infinite)

     Define construction operators '<..<', '<=..<', '<..<=' and '<=..<='
     Likewise '<.._', '<=.._', '_..<=' and '_..<'
         (the last two being unary prefixes)
     Keep '..' as an alias for '<=..<='
     Keep '...' as an alias for '<=..<'
     Define construction operator '..+' for the start/length-1 case
     Define construction operator '..<+' for the start/length case

     Add Range#by(step)

     Defining a related class for "disordered" ranges like "2..-1" which
          are handy but semantically disjoint for pure ranges. I'm
          thinking something that water would roll off the back of in
          a duck typing world, but that would raise reasonable error
          messages in preference to producing unexpected behaviour.
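
To make the first item concrete, here is a rough sketch of an interval whose ends can each be open, closed, or unbounded. The Interval class and its keyword arguments are hypothetical; nothing like it is part of Ruby, and it models only the membership test, not iteration or the proposed operators.

```ruby
# Hypothetical class, for illustration only.
class Interval
  def initialize(low, high, exclude_low: false, exclude_high: false)
    @low = low
    @high = high
    @exclude_low = exclude_low
    @exclude_high = exclude_high
  end

  # Membership by comparison only; nothing is enumerated.
  def cover?(value)
    above_low  = @exclude_low  ? value > @low  : value >= @low
    below_high = @exclude_high ? value < @high : value <= @high
    above_low && below_high
  end
end

# Roughly what 0 <..< 43 is meant to express.
open_both = Interval.new(0, 43, exclude_low: true, exclude_high: true)
p open_both.cover?(0)   # => false
p open_both.cover?(42)  # => true
p open_both.cover?(43)  # => false

# Closed below, unbounded above (using Float::INFINITY as the bound).
unbounded = Interval.new(0, Float::INFINITY)
p unbounded.cover?(1_000_000)  # => true
```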

Typically, the versions of ruby I produce in these experiments are
killed by angry villagers before they can show their essential
kindheartedness. But I still hope.

    -- Markus

P.S.

···

On Mon, 2004-10-04 at 20:36, trans. (T. Onoma) wrote:

On Monday 04 October 2004 09:09 pm, Charles Comstock wrote:
> Yes but [start,length] is capable of expressing ranges that make no
> sense using [range]. For instance:
>
> a = [:a,:b,:c,:d,:e,:f,:g,:h]
> a[-2,2] # => [:g,:h]
> a[-2..2] # => []
> a[-2..-1] # => [:g, :h]
> a[0..-1] # => [:a,:b,:c,:d,:e,:f,:g,:h]
>
> Obviously it isn't too hard to convert between the two formats, but many
> times it makes far more sense to express it in the [start,length] format
> as opposed to the range format of start..end.

Hmm... It still signifies a range. So if there were just a notation for it, that
might be nice. I'm not sure what that would be though.

Really, in looking over Ruby's Range class, it is a bit limited. You can't
exclude the start element, and it doesn't provide a way to specify an
increment, so you can't iterate over floats. A more complete range would have
an initializer something like:

  Range.new(start, end, start_exclude=false, end_exclude=false, inc=1)

Again, I'm not sure of the best way to make a nice neat literal notation for all that.
Although one simple suggestion is to have a '+' method to set the increment.

  (1.0 .. 3.0 + 0.5).to_a #=> [1.0, 1.5, 2.0, 2.5, 3.0]

Hmm... a bit of a touch up (btw unbound can be represented by its own object,
so no special syntax required):

  r = 0<..<43
  r = 0<..<=42
  r = 0<=..<43
  r = 0<=..<=42

  r = 0<..+<42
  r = 0<..+<=42
  r = 0<=..+<42
  r = 0<=..+<=42

Basically tie-fighter ranges ( and x-wing ranges :wink: F'ugly! :frowning: Although I
like your direction. Alternative is just to use standard-like notation:

  0 < r < 43
  0 < r <= 42
  0 <= r < 43
  0 <= r <= 42

  0 < r +< 43
  0 < r +<= 42
  0 <= r +< 43
  0 <= r +<= 42

Who said assignment always had to be 'r =' ? Of course it would be nice if we
could just do like:

  r = :(0,43)
  r = :(0,42]
  r = :[0,43)
  r = :[0,42]

  r = :(0:43)
  r = :(0:42]
  r = :[0:43)
  r = :[0:42]

T.


On Tuesday 05 October 2004 01:25 am, Markus wrote:

It occurs to me that the angry villagers might be confused. The example of the
never ending

  (0..(10.0/0)).member?(4)

comes to mind. Why would this be an infinite loop? It must be trying to
generate the list before looking to see if 4 is in it (?) Are these ranges
that stupid? Even so, if it used succ to test this then it would take a while
to find out:

time ruby -e '(0..1000000000).member?(999999999)'

real 7m10.971s
user 6m56.150s
sys 0m0.617s

Yuk. But there is nothing one can do about it as long as one depends on #succ.
I suppose it's awfully clever and OOP and all to have any object supporting
<=> and succ work with ranges, but I wonder how much use they get outside of
numbers and occasional character ranges. In other words perhaps succ isn't
the way to go (or perhaps a fallback) and a simple increment/decrement in the
Range itself would be more usable --then the above 7 minutes would be about 7
milliseconds.

T.


On Tuesday 05 October 2004 01:25 am, Markus wrote:

Typically, the versions of ruby I produce in these experiments are
killed by angry villagers before they can show their essential
kindheartedness. But I still hope.

David A. Black wrote:

There are a couple of mnemonics for remembering which is which.

  .. has two letters: in
  ... has three letters: out

:slight_smile:

Or you can think of the three dots as kind of pushing the end value
out of reach, so that the included values of the range can't quite get
there.

And as I mentioned in a previous post, but which probably got lost in the noise, you can think of "..." as what it notates in English, i.e. ellipsis--meaning "something left out". As a previous poster noted, it does not map exactly into the English semantics, but as a mnemonic, I think it works well.

Bob

It occurs to me that the angry villagers might be confused. The example of the
never ending

  (0..(10.0/0)).member?(4)

comes to mind. Why would this be an infinite loop? It must be trying to
generate the list before looking to see if 4 is in it (?)

Yes, see adjacent thread. What it actually does is iterate over all values from
start to end using succ, and set a flag to true when it finds a match (but
it doesn't break out of the loop when a match is found).

Are these ranges
that stupid?

Yes, but what you probably want is 'include?' rather than 'member?'

include? just checks the given value against the start and end values.

Both Range#include? and Range#member? override the methods mixed in from
Enumerable, where those methods are just synonyms for each other.

This is certainly confusing!

I'd say if you want to iterate over the range, then use Enumerable#find or
Enumerable#find_all as appropriate, then get rid of this distinction.
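
A short sketch of the difference, as it behaves in current Ruby versions (the 2004 behaviour of member? described above may differ). Enumerable#find stops iterating at the first match, so it terminates even on a range with an infinite upper bound, while Range#cover? only compares against the endpoints. The variable names are illustrative.

```ruby
infinity = 10.0 / 0  # the same value as Float::INFINITY

# Walks the range with succ, but stops as soon as the block returns true.
found = (0..infinity).find { |i| i == 4 }
p found  # => 4

# Pure boundary comparison; nothing is iterated.
covered_small = (0..infinity).cover?(4)
covered_large = (0..1_000_000_000).cover?(999_999_999)
p covered_small  # => true
p covered_large  # => true
```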

Yuk. But there is nothing one can do about it as long as one depends on #succ.
I suppose it's awfully clever and OOP and all to have any object supporting
<=> and succ work with ranges, but I wonder how much use they get outside of
numbers and occasional character ranges. In other words perhaps succ isn't
the way to go (or perhaps a fallback) and a simple increment/decrement in the
Range itself would be more usable --then the above 7 minutes would be about 7
milliseconds.

If you're going to rely on increment/decrement then I'm pretty sure you can
also rely on the mathematical properties of < and >, i.e. just compare the
boundary values as 'include?' does.

Besides,
  a = a.succ
and
  a = a + 1
take almost identical amounts of time, since even '+ 1' involves a method
dispatch:
  a = a.send(:+,1)

Regards,

Brian.


On Tue, Oct 05, 2004 at 06:19:09PM +0900, trans. (T. Onoma) wrote:

Hmm... a bit of a touch up (btw unbound can be represented by its own object,
so no special syntax required):

     If there is no special syntax, how do you create them? I'll admit
I like the idea of Infinity posted on the next thread over, but that
came up after I started.

  r = 0<..<43
  r = 0<..<=42
  r = 0<=..<43
  r = 0<=..<=42

  r = 0<..+<42
  r = 0<..+<=42
  r = 0<=..+<42
  r = 0<=..+<=42

     I'm not sure what distinction you are making here with the '+'.

Basically tie-fighter ranges ( and x-wing ranges :wink: F'ugly! :frowning: Although I
like your direction.

     The core idea is actually peripheral to this (or vice versa); I'm
modifying parse.y to extend the idea of tOP_ASGN ( +=, -=, etc.) to
include (as user redefinable methods like <=> is presently) _all_
combinations of operator characters.

Alternative is just to use standard-like notation:

  0 < r < 43
  0 < r <= 42
  0 <= r < 43
  0 <= r <= 42

  0 < r +< 43
  0 < r +<= 42
  0 <= r +< 43
  0 <= r +<= 42

Who said assignment always had to be 'r =' ? Of course it would be nice if we
could just do like:

     Yikes! I don't think that would be easy to parse at all,
especially since all three non-terminals could be syntactically complex.
And what if you wanted to pass a range as an actual parameter?

  r = :(0,43)
  r = :(0,42]
  r = :[0,43)
  r = :[0,42]

  r = :(0:43)
  r = :(0:42]
  r = :[0:43)
  r = :[0:42]

     Hmmm. That would be a little harder--or at least, I don't quite
see how to bend the parser to handle it.

-- Markus


On Tue, 2004-10-05 at 01:33, trans. (T. Onoma) wrote:

If I ever get past the parse.y hurdles, I plan to look at this. I
think it's because ranges are actually a conflation of at least four
ideas:

      * ranges of Comparable objects, with the low less than the high,
        in which case <=> gives a quick member test
      * arcs of demi-circular array subscripts, where negative values
        "count back" from the end
      * Sets of discrete values that fall within one of the above
      * Arbitrary "starting" and "ending" tests

but I suspect this is not an exhaustive list.

     -- Markus


On Tue, 2004-10-05 at 02:19, trans. (T. Onoma) wrote:

On Tuesday 05 October 2004 01:25 am, Markus wrote:
> Typically, the versions of ruby I produce in these experiments are
> killed by angry villagers before they can show their essential
> kindheartedness. But I still hope.

It occurs to me that the angry villagers might be confused. The example of the
never ending

  (0..(10.0/0)).member?(4)

comes to mind. Why would this be an infinite loop? It must be trying to
generate the list before looking to see if 4 is in it (?) Are these ranges
that stupid? Even so, if it used succ to test this then it would take a while
to find out:

time ruby -e '(0..1000000000).member?(999999999)'

real 7m10.971s
user 6m56.150s
sys 0m0.617s

Yuk. But there is nothing one can do about it as long as one depends on #succ.
I suppose it's awfully clever and OOP and all to have any object supporting
<=> and succ work with ranges, but I wonder how much use they get outside of
numbers and occasional character ranges. In other words perhaps succ isn't
the way to go (or perhaps a fallback) and a simple increment/decrement in the
Range itself would be more usable --then the above 7 minutes would be about 7
milliseconds.

T.

Hi,

I changed the subject.


In message "Re: [RCR] New [] Semantics" on Tue, 5 Oct 2004 18:19:09 +0900, "trans. (T. Onoma)" <transami@runbox.com> writes:

It occurs to me that the angry villagers might be confused. The example of the
never ending

(0..(10.0/0)).member?(4)

comes to mind. Why would this be an infinite loop?

'member?' should have terminated iteration as soon as it finds the
value. I will fix it.

A Range serves as both a continuous and a discrete interval of values.
'member?' treats it as discrete, whereas 'include?' treats it as
continuous.
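
The two views can be seen side by side in current Ruby, where Range#cover? gives the continuous (comparison-based) answer and enumerating the range gives the discrete one. The exact behaviour of member? and include? has changed since this thread, so this sketch avoids relying on them; the variable names are illustrative.

```ruby
letters = 'a'..'z'

# Continuous view: "bb" sorts between "a" and "z".
continuous = letters.cover?('bb')

# Discrete view: "bb" is not one of the values produced by succ.
discrete = letters.to_a.include?('bb')

p continuous        # => true
p discrete          # => false
p letters.to_a.size # => 26
```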

              matz.

Yes, but what you probably want is 'include?' rather than 'member?'

Certainly helps to know the distinction (which is unintuitive, btw).

Both Range#include? and Range#member? override the methods mixed in from
Enumerable, where those methods are just synonyms for each other.

This is certainly confusing!

Amazing how even the simple things get that way!

I'd say if you want to iterate over the range, then use Enumerable#find or
Enumerable#find_all as appropriate, then get rid of this distinction.

Understandable. #member? should just be an alias for #find then, with
#between? used for the other need. At least, that seems the most consistent.

> Yuk. But there is nothing one can do about it as long as one depends on
> #succ. I suppose it's awfully clever and OOP and all to have any object
> supporting <=> and succ work with ranges, but I wonder how much use they
> get outside of numbers and occasional character ranges. In other words
> perhaps succ isn't the way to go (or perhaps a fallback) and a simple
> increment/decrement in the Range itself would be more usable --then the
> above 7 minutes would be about 7 milliseconds.

If you're going to rely on increment/decrement then I'm pretty sure you can
also rely on the mathematical properties of < and >, i.e. just compare the
boundary values as 'include?' does.

Besides,
  a = a.succ
and
  a = a + 1
take almost identical amounts of time, since even '+ 1' involves a method
dispatch:
  a = a.send(:+,1)

With inc/dec modulo can be used. Something like:

  def member?(e)
    # e must be a whole number of increments away from the start and within bounds
    return ( (((e - self.begin) % @increment) == 0) && e.between?(self.begin, self.end) )
  end
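
A self-contained version of that idea, as a sketch only: SteppedRange is a hypothetical class, not an existing Ruby API, and the modulo test is only reliable for integer endpoints and increments (floats would need a tolerance).

```ruby
# Hypothetical range with a built-in increment.
class SteppedRange
  def initialize(first, last, increment = 1)
    @first = first
    @last = last
    @increment = increment
  end

  # Constant-time test: e must lie within the bounds and sit a whole
  # number of increments away from the start.
  def member?(e)
    e.between?(@first, @last) && ((e - @first) % @increment).zero?
  end
end

big = SteppedRange.new(0, 1_000_000_000)
p big.member?(999_999_999)  # => true, without walking the range

by_threes = SteppedRange.new(1, 20, 3)  # 1, 4, 7, 10, ...
p by_threes.member?(10)  # => true
p by_threes.member?(11)  # => false
p by_threes.member?(22)  # => false (outside the bounds)
```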

T.


On Tuesday 05 October 2004 05:59 am, Brian Candler wrote: