New block notation (was: Re: ruby-dev summary 26468-26661)

Florian,

I always thought that if Ruby supported multiblocks it might look like:

  f(a,b){...}{...}{...}

The Io language puts blocks inside the parens, and I think you're right
about that. It's hard to follow.

T

Florian Groß wrote:

Yukihiro Matsumoto wrote about new lambda syntaxes:

>f = ->(a) { ... }

[...]

>f = { : a,b=1|2 : ... }
>f = { : * : ... } # equivalent to f = lambda { ... }

Unfortunately :+space causes conflicts as well.

I'm probably not in the position to criticize, but I still feel like I have to or I will regret not having done it later.

Is it really in accordance to Ruby's design mentality to introduce new syntax just because parsing the most obvious one is too complex?

This is how I understand the issue. A new syntax is being considered because this will be hard to get working:

l = lambda { |x, y, z = (x | y)| ... }

And because block arguments don't look much like (and aren't handled much like) method arguments.
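For reference, here is a minimal sketch of how the two spellings behave on a current Ruby. It assumes Ruby 1.9 or later, not the 1.8 interpreter under discussion in this thread. Both the block form with a parenthesized default and the arrow form accept a default that refers to earlier parameters. The variable names are illustrative only.

```ruby
# Block-style parameters: the default has to be parenthesized so that the
# inner | is not mistaken for the end of the parameter list.
l = lambda { |x, y, z = (x | y)| [x, y, z] }

# Arrow-style parameters sit inside ordinary parentheses, so the default
# can be written the same way as in a method definition.
f = ->(x, y, z = x | y) { [x, y, z] }

from_block_form = l.call(1, 2)     # z falls back to 1 | 2
from_arrow_form = f.call(1, 2)     # same default, same result
overridden      = f.call(1, 2, 9)  # an explicit third argument wins
```

Whether the parenthesized default reads well is exactly the question raised above; the sketch only shows that both forms can be made to parse.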

If this is not only for the above two reasons, but also because you think that the above sample is hard to parse visually then please consider other choices that are more obvious than arrows pointing into random directions:

f = def(x, y, z = x | y) { ... } # also need to allow def x() { ... }
f = fun(x, y, z = x | y) { ... } # a new keyword

I find this the most unified looking, similar to JavaScript's function() {}.

Personally, I think this whole situation is not as much of a problem as it might appear. Blocks are a special syntax in Ruby and we have all gotten used to it. It is okay for them to also have a special argument syntax. Having default arguments with the current one is not hard. We don't need method call semantics and special block semantics at the same time. We could get rid of Proc.new without losing anything but the danger of confusion.

Sure, the C#, Perl and ECMAScript way of doing this has advantages like being able to assign closures to variables with the same syntax as with method calls. It also allows multiple closures per method and storing them in hashes without a second syntax. Should we switch to it? Should this have been the way for Ruby to do it since the beginning?

As stated above I think switching might be more trouble than it's worth. Sure, if you learn Ruby you will at one point try to do "x = { exit }", but after a small surprise you will have learnt to insert the "lambda".
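A short sketch of that surprise, assuming a current Ruby: a brace literal on the right-hand side of an assignment is read as a Hash, so code has to be wrapped in lambda before it can be stored. The names below are made up for illustration.

```ruby
# A brace literal on the right-hand side of an assignment is a Hash,
# not a block.
empty = {}

# To keep code in a variable, the block has to be wrapped explicitly.
greeter  = lambda { |name| "Hello, #{name}" }
greeting = greeter.call("Ruby")
```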

Now to the more difficult part. Would it have been better if Ruby had merged blocks and lambdas from the beginning? This is hard to answer. When I was learning Ruby I thought that it would be great to be able to supply multiple blocks to one method without needing a different syntax. Has this been a limitation? Nope, but that could be because users of a language automatically work around issues before they encounter them.

So would there have been a downside to Ruby having unified blocks and lambdas since the beginning? Yes, I think that

10.upto(20, { |i| puts i })

is harder to type and read than

10.upto(20) { |i| puts i }

One can now argue that it does not matter for cases where there only is a block:

1.times { } is the same as 1.times({ }) anyway.

Or that we could allow trailing arguments that happen to be blocks to appear outside of the argument list:

Array.new(5) { |i| i * 2 } == Array.new(5, { |i| i * 2 })
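As a point of comparison, here is a small sketch of how a stored closure reaches the block position on current Ruby: it is passed with a leading & instead of as an ordinary trailing argument, which is the split this post is weighing. The name doubler is illustrative.

```ruby
doubler = lambda { |i| i * 2 }

# A literal block after the argument list...
with_literal_block = Array.new(5) { |i| i * 2 }

# ...and the same closure, stored first and handed over with &.
with_stored_lambda = Array.new(5, &doubler)
```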

# It is hard to come up with methods that take multiple blocks which
# isn't very surprising.
ary.context_each(
  first: { |i| puts "First: #{i}" },
  last: { |i| puts "Last: #{i}" },
) { |i| puts i }

# Or:
ary.context_each(
  first: { |i| puts "First: #{i}" },
  last: { |i| puts "Last: #{i}" },
  middle: { |i| puts i }
)
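The first variant above can be approximated on current Ruby with keyword arguments holding lambdas plus one ordinary block. This is only a sketch under assumptions: context_each is the hypothetical name from the example, written here as a free-standing method that takes the collection as its first argument, and the rule for what counts as "first" and "last" is a guess.

```ruby
# Hypothetical helper: the name comes from the post, the behaviour is assumed.
def context_each(items, first:, last:, &middle)
  items.each_with_index do |item, index|
    if index.zero?
      first.call(item)          # handler for the first element
    elsif index == items.size - 1
      last.call(item)           # handler for the last element
    else
      middle.call(item)         # the ordinary block handles the rest
    end
  end
end

log = []
context_each([1, 2, 3, 4],
  first: lambda { |i| log << "First: #{i}" },
  last:  lambda { |i| log << "Last: #{i}" }
) { |i| log << i.to_s }
```

The extra closures still need the lambda keyword here, which is the second syntax the post would like to avoid.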

I also think that you might be able to fix the arrow syntax by moving it. I think something like this is acceptable:

adder = (a, b) -> { a + b }

Could you remove the -> entirely? Then an anonymous function is *exactly
that*. :-)

f = (a, b) { a + b }

And perhaps:

printer = a -> { puts a }

And even more perhaps:

printer = a -> puts a

This is again very hard to parse. Perhaps even for humans. So:

adder = \(a, b) -> { a + b }
printer = \a -> { puts a }

So, basically Haskell? :-) Forgive me for hardly ever posting to Ruby, I just
like to watch.

Don't get me wrong here. It might be rare, but in this case I am not actually trying to talk you into changing the language. I think the best and least risky option is to just keep the current situation. I am usually for sacrificing backwards compatibility to gain more intuitiveness and simplicity, but in this case the problem is not big enough to solve it (yet?).

Mike A.

But if you had a language (and I imagine there is one somewhere :-)
that used unique strings of punctuation for all its keywords, it would
look pretty bad. Obviously that's not an issue here, but it does
suggest that there can be such a thing as too much punctuation, even
if it's unambiguous.

Yes! Brainf*ck your time has come...

http://www.muppetlabs.com/~breadbox/bf/

and for a couple of examples...

http://www.muppetlabs.com/~breadbox/bf/factor.b.txt

http://www.muppetlabs.com/~breadbox/bf/quine.b.txt

Where punctuation reigns supreme!!

Kev (way too much caffeine today)

"David A. Black" <dblack@wobblini.net> writes:

I don't agree. Neither a punctuation symbol nor a sequence of them is
itself a problem. The problems can be caused by context-dependent
interpretation of symbols, I think.

But if you had a language (and I imagine there is one somewhere :-)
that used unique strings of punctuation for all its keywords, it would
look pretty bad. Obviously that's not an issue here, but it does
suggest that there can be such a thing as too much punctuation, even
if it's unambiguous.

I guess you don't like APL and friends. :-)

David

--
Christian Neukirchen <chneukirchen@gmail.com> http://chneukirchen.org

FWIW, this is what Lua allows - a special syntax for the special case where the sole argument to the function is a table (aka Hash) allows you to drop the surrounding parens.

(Not a suggestion that this is sufficient for blocks, just a random peek into another interpreted language's syntax.)

On Aug 4, 2005, at 6:49 PM, Florian Groß wrote:

10.upto(20, { |i| puts i })

is harder to type and read than

10.upto(20) { |i| puts i }

One can now argue that it does not matter for cases where there only is a block:

1.times { } is the same as 1.times({ }) anyway.

I'm just going to throw in that when I first found Ruby I was so excited
about how close to perfect it was that I wanted to perfect it further.

Now, however, I'd be happy to have the syntax stay the same as 1.8 from
here on out and hope that work on future versions would be concentrated
on the backend, improving speed and supporting native threads for example.

The arrow syntax is making me nervous. I say let's don't let the Perfect
be the enemy of the Really Damn Good.

$.02

On Fri, 5 Aug 2005 09:49:28 +0900 Florian Groß <florgro@gmail.com> wrote:

I'm probably not in the position to criticize, but I still feel like I
have to or I will regret not having done it later.

----------------------------------------------------------------------

Jim Hranicky, Senior SysAdmin UF/CISE Department |
E314D CSE Building Phone (352) 392-1499 |
jfh@cise.ufl.edu http://www.cise.ufl.edu/~jfh |

----------------------------------------------------------------------

"Trans" <transfire@gmail.com> wrote in message

Nah, just use the _collection_ brackets for both array and hash. The
parser can tell them apart:

  # array
  [ a, b, c ]

  # hash
  [ a=>x, b=>y, c=>z ]
  [ a: x, b: y, c: z ]

Compared to this other stuff that'd be a breeze to parse.

+1

Hi --

On Fri, 5 Aug 2005, Trans wrote:

That's rather drastic. All it needs is a new symbol for literal hashes.
(I like [| |] myself.)

Nah, just use the _collection_ brackets for both array and hash. The
parser can tell them apart:

# array
[ a, b, c ]

# hash
[ a=>x, b=>y, c=>z ]
[ a: x, b: y, c: z ]

I don't like to have to scan visually for a separator in order to know
what the [ meant. That seems ad hoc and fragile.

David

--
David A. Black
dblack@wobblini.net

# empty array

# empty hash

martin

Trans <transfire@gmail.com> wrote:

> That's rather drastic. All it needs is a new symbol for literal hashes.
> (I like [| |] myself.)

Nah, just use the _collection_ brackets for both array and hash. The
parser can tell them apart:

  # array
  [ a, b, c ]

  # hash
  [ a=>x, b=>y, c=>z ]
  [ a: x, b: y, c: z ]

Compared to this other stuff that'd be a breeze to parse.

Hi,

In message "Re: new block notation" on Fri, 5 Aug 2005 19:56:06 +0900, "Trans" <transfire@gmail.com> writes:

Nah, just use the _collection_ brackets for both array and hash. The
parser can tell them apart:

# array
[ a, b, c ]

# hash
[ a=>x, b=>y, c=>z ]
[ a: x, b: y, c: z ]

Yes, if it isn't empty.

              matz.

"David A. Black" <dblack@wobblini.net> writes:

But if you had a language (and I imagine there is one somewhere :-)
that used unique strings of punctuation for all its keywords, it would
look pretty bad.

Prolog!

The language where all words are part of some argument towards a
goal, so the work of the language itself is restricted to bits of
punctuation between those words.

There are very few people in the world who are smart enough to
understand, and be productive, in Prolog.

--Dave

Nah, just use the _collection_ brackets for both array and hash. The
parser can tell them apart:

nice, dunno if this breaks other things though!

George

--

http://www.navel.gr

matz,

Why not pioneer the first language with its own fontset :-)

T.

Yukihiro Matsumoto wrote:

>Is it really in accordance to Ruby's design mentality to introduce new syntax just because parsing the most obvious one is too complex?

I didn't choose -> syntax just because "the most obvious one is too
complex".

Pardon the trollish oversimplification and thanks for the detailed explanation. It was very informative.

This is probably the wrong place to ask, but are there resources that explain the evolution of Ruby's design in detail? It is something that I find very interesting.

Block parameters are the destination of a multiple assignment. That was
natural, since they were designed to be loop variables. Later on,
closures were introduced. A closure, or function object, has different
requirements for its parameters.

  * loop variables require no strict check; it is OK to ignore a given
    value in the loop body. But a method does a strict check. I think
    closures should as well.

I think that the check is not strictly necessary in blocks. I guess it could be argued about if it is necessary to have it for lambdas.

Or can we just add the check to all blocks and lambdas? This would mean that multiple assignment semantics are totally gone and that [1, 2, 3].each_with_index { |x| p x } would no longer work which might be a good thing, anyway.
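For comparison, current Ruby (1.9 or later is assumed here) ended up keeping both behaviours side by side: a plain proc or block silently drops surplus arguments, while a lambda checks the count the way a method does. The sketch below uses illustrative names.

```ruby
loose  = proc   { |x| x }   # block-style parameter handling
strict = lambda { |x| x }   # method-style parameter handling

# The surplus argument is silently dropped.
loose_result = loose.call(1, 2)

# The same call on the lambda fails the arity check.
strict_error =
  begin
    strict.call(1, 2)
    nil
  rescue ArgumentError => e
    e.class
  end
```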

Ignoring values has been troublesome even with multiple assignment because of the array assignment effect anyway.

  * since block parameters are a multiple assignment, they have some
    weird behavior in corner cases, especially when arrays and values
    are mixed in the left-hand expression.
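A sketch of the kind of corner case meant here, as it appears on current Ruby (an assumption; the details differed in 1.8): a proc treats a single array argument like the right-hand side of a multiple assignment, while a lambda does not. All names are illustrative.

```ruby
pair_proc   = proc   { |a, b| [a, b] }
pair_lambda = lambda { |a, b| [a, b] }

# One array argument is spread across both parameters.
splatted = pair_proc.call([1, 2])

# With a single parameter the array is kept whole.
kept_whole = proc { |a| a }.call([1, 2])

# A missing argument is padded with nil instead of being an error.
padded = pair_proc.call(1)

# The lambda refuses the single array argument outright.
lambda_raised =
  begin
    pair_lambda.call([1, 2])
    false
  rescue ArgumentError
    true
  end
```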

Hmmm, yes.

In short, there has been a big (well, at least for me) semantic
gap in block parameters since closures came into the language. This
is a good chance to fix it.

I see, but currently I don't think I like the fix. I think I would prefer to keep the current semantics instead of x = ->(a) { ... }

Here's another try to come up with an alternative:

[1, 2, 3].inject { |a| a } # => 1
[1, 2, 3].inject { |*a| a } # => [[1, 2], 3]

I think both would eventually not produce warnings. The first could still emit a warning for a short period to get users used to the change.

There are at least two options regarding default and keyword arguments in block argument lists.

Option A:
def x() yield(1, 2, 5); yield(1, 2) end;
def y() yield(c: 5); yield() end
x { |a, b, c = (a | b)| p c } # outputs 5, 3
y { |a: 0, b: 0, c: (a | b)| p c } # outputs 5, 0
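Option A happens to be close to what current Ruby accepts (2.0 or later is assumed for the keyword form). The sketch below restates the example with the results collected into arrays instead of printed; the array names are illustrative.

```ruby
def x
  yield(1, 2, 5)
  yield(1, 2)
end

def y
  yield(c: 5)
  yield
end

# Positional default that refers to earlier block parameters.
positional = []
x { |a, b, c = (a | b)| positional << c }

# Keyword parameters with defaults; c falls back to a | b.
keyword = []
y { |a: 0, b: 0, c: (a | b)| keyword << c }
```

The parentheses around the default are what keep the inner | from ending the parameter list.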

Option B:
No default values and keyword arguments for blocks. (Rarely needed anyway.)

adder = fun(a, b) -> { a + b } or
adder = fun(a, b) { a + b }

And these would have exactly the same semantics as method calls.

The problem with this approach is that it officially makes a split between blocks and anonymous functions, but it easily allows ignoring arguments in iterators.

If ignoring arguments in iterators is not as important as simplicity there is not much of a reason to ignore a new syntax IMHO. In that case this could be done:

[1, 2, 3].inject { |a| a } # ArgumentError at yield
# I think ,* makes a lot of sense for ignoring everything from here on
# and I think it fits visually well as well
[1, 2, 3].inject { |a,*| a } # 1
[1, 2, 3].inject { |*a| a } # [[1, 2], 3]
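For what it is worth, the ,* spelling is accepted by current Ruby, although the strict variant proposed here was not adopted: the first form below does not raise. A small sketch, assuming Ruby 1.9 or later, with illustrative names:

```ruby
# Surplus block arguments are dropped without an error on current Ruby.
first_only = [1, 2, 3].inject { |a| a }

# The explicit "ignore everything from here on" spelling.
explicit = [1, 2, 3].inject { |a, *| a }

# Collecting all block arguments into one array.
everything = [1, 2, 3].inject { |*a| a }
```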

And we would still use lambda:

adder = lambda { |a, b| a + b }
alias :λ :lambda # I wonder if Unicode names will ever be in core Ruby
adder = λ { |a, b| a + b }

Even in this situation it still needs to be decided if keyword and default arguments should be possible for both lambda and blocks.

I feel a bit like I have lost a line of thought, somehow. I hope this still looks sane.

I really don't understand the idea behind x = ->(a, b) { ... } -- it makes as much sense as x = ^^(a, b) { ... } to me, so I would like it if it did not replace the anonymous functions or blocks I use so frequently.

>I also think that you might be able to fix the arrow syntax by moving it. I think something like this is acceptable:
>
>adder = (a, b) -> { a + b }

Again, I don't think yacc would allow this.

I wonder how Perl parses code. I think they are doing a lot of semantic look-ahead. That doesn't help with parsing by humans, though.

>This is again very hard to parse. Perhaps even for humans. So:
>
>adder = \(a, b) -> { a + b }
>printer = \a -> { puts a }

I don't want to use backslashes, because of the yen-sign problem;
besides, this particular idea requires even more punctuation. The
backslash is an unfortunate character (at least in Japan) that looks
totally different depending on the font set. For example, the '\' above
shows up as a backslash in my Emacs but as a yen sign in my browser
whenever the page contains any Japanese characters. That might be my
personal problem, but I consider myself very important in the design
process of the Ruby language.

I don't think it is just a personal problem of yours. It would likely be a huge problem for all Japanese Ruby users (after all, posting code on the web is fairly common), and it would probably be a bad idea to spoil the language for them. :-) -- Had I known about it I would not have suggested using backslashes.

James F. Hranicky wrote:

I'm probably not in the position to criticize, but I still feel like I have to or I will regret not having done it later.

I'm just going to throw in that when I first found Ruby I was so excited about how close to perfect it was that I wanted to perfect it further.

Now, however, I'd be happy to have the syntax stay the same as 1.8 from
here on out and hope that work on future versions would be concentrated
on the backend, improving speed and supporting native threads for example.

The arrow syntax is making me nervous. I say let's don't let the Perfect
be the enemy of the Really Damn Good.

Yes, thank you, I agree.

I'm not opposed to minor syntax changes along the way, but in my opinion
the benefits have to be very significant in relation to the "size" of
the change.

I have two language changes that I favor, a syntactic change I consider
minor and a semantic change that would break no code.

Other than that, I'm all in favor of improvement and expansion -- native
threads, a VM, more libs, and so on.

But Ruby should be changed the way it has for the last twelve years --
conservatively and carefully.

If the changes had been controlled less tightly five or six years ago,
then today we wouldn't have the Ruby we have now. We would have a
bizarre cross between Perl and APL.

Hal

On Fri, 5 Aug 2005 09:49:28 +0900 Florian Groß <florgro@gmail.com> wrote:

I hadn't seen this particular variant posted in this thread yet.
Apologies if it was and I missed it. I make no claims as to its ease
of parsing or whatever, but it looks better than many of the other
options to me.

bar = { x, y=3, z=b|5 ->
  x + y * z
}

I believe this is similar to the approach that Groovy [1] uses for its blocks.

Jason

[1] http://groovy.codehaus.org

Ah, but that's the problem, IMO. JavaScript's "function(){ ... }" is so verbose that I would never *think* of using an anonymous function the way we use blocks in Ruby:

var out = "";
myArray.each( function(el){
   out += el;
} );

On Aug 4, 2005, at 11:01 PM, Mike Austin wrote:

f = def(x, y, z = x | y) { ... } # also need to allow def x() { ... }
f = fun(x, y, z = x | y) { ... } # a new keyword

I find this the most unified looking, similar to JavaScript's function() {}.

it's even worse than that:

   jib:~ > ruby -e 'm = lambda{|x| p x}; m [ 0b101010 ]'
   42

   jib:~ > ruby -e 'def m a; p a.first; end; m [ 0b101010 ]'
   42

   jib:~ > ruby -e 'm = Hash::new{p 42}; m [ 0b101010 ]'
   42

you can't even scan for these - you must know m's type...
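The same point in script form, as a sketch with made-up names and assuming current Ruby: the bracket call reads identically for a lambda, a Hash with a default block and an Array, so the receiver's type decides what it means.

```ruby
callable = lambda { |x| x * 2 }               # [] invokes the closure
table    = Hash.new { |_hash, key| key * 2 }  # [] runs the default block
list     = [42]                               # [] indexes into the array

results = [callable[21], table[21], list[0]]
```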

-a

On Sat, 6 Aug 2005, David A. Black wrote:

Hi --

On Fri, 5 Aug 2005, Trans wrote:

That's rather drastic. All it needs is a new symbol for literal hashes.
(I like [| |] myself.)

Nah, just use the _collection_ brackets for both array and hash. The
parser can tell them apart:

# array
[ a, b, c ]

# hash
[ a=>x, b=>y, c=>z ]
[ a: x, b: y, c: z ]

I don't like to have to scan visually for a separator in order to know
what the [ meant. That seems ad hoc and fragile.

--

email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
My religion is very simple. My religion is kindness.
--Tenzin Gyatso

===============================================================================

David A. Black wrote:

> Nah, just use the _collection_ brackets for both array and hash. The
> parser can tell them apart:
>
> # array
> [ a, b, c ]
>
> # hash
> [ a=>x, b=>y, c=>z ]
> [ a: x, b: y, c: z ]

I don't like to have to scan visually for a separator in order to know
what the [ meant. That seems ad hoc and fragile.

David,

Yes, contrary to your other post, there is already "scanning" to be
done. E.g.:

  def ameth(n=nil,&b)
    # ...
  end

Is it a hash or a block?

  ameth {[1032,12,34,89].collect{|x|x*7+3}.join('->');7+15**2}

  ameth {[1032,12,34,89].collect{|x|x*7+3}.join('->'),7+15**2}

I'm sure others can provide even better examples. But the point really
is that the "=>" (or ": ") is an equally important visual cue that
stands out in the scanning, and is usually just a few characters in.
Personally I have never written a hash without it. Thus using [ ... ]
in place of { ... } would at the least be no more or less difficult.
But it would prove better in one respect -- that the purely comma hash
notation would no longer be valid and a "=>" or ":" has to always be
used. That's a readability improvement in my mind.
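To make the hash-or-block question concrete, here is a sketch of how current Ruby resolves it for a method shaped like ameth. The method body is an assumption added for illustration; only the signature comes from the example above.

```ruby
# Same signature as the example; the body just reports what was received.
def ameth(n = nil, &b)
  [n, b ? :block : :no_block]
end

with_braces = ameth { }      # braces right after the name: parsed as a block
with_parens = ameth({})      # inside parentheses: an empty Hash argument
with_pairs  = ameth(1 => 2)  # the => marks a hash even without braces
```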

T.

P.S. Might it help clear up ambiguity of '(1,2,3)' which is currently a
syntax error? Why can't that be parsed as an array?

Martin DeMello wrote:

# empty array

# empty hash

  [:]

and

  [=>]

Thanks,
T.