Struggling with Blocks

Newbie wrote:

I’m new to Ruby, and struggling to understand the different kinds of
blocks/procs/methods/lambdas etc.

I found a good intro at

Understanding Ruby blocks, Procs and methods - Eli Bendersky's website,

but one thing which it doesn’t cover is begin/end. Why is a new keyword
used, instead of adding rescue/else/ensure to do/end blocks?

To put it in simplest terms, the keyword “do” must be preceded by a data
source, while the keyword “begin” need not be. So “begin” meets a
syntactical requirement that “do” doesn’t.

Maybe this will help:

valid:

1.upto(10) do |x|
puts x
end

not valid:

do |x|
puts x
end

valid:

begin
  # normal process
rescue
  # deal with errors
else
  # no-errors section
ensure
  # always-run section
end

also valid:

x = 0

begin
puts x
x += 1
end while x <= 10

also valid:

x = 0

begin
puts x
x += 1
end until x > 10

even this works:

x = 0

begin
puts x
x += 1
x /= 0 if x == 8 # generate an error
rescue
puts "Error!"
end until x > 10

The last form emits this:

0
1
2
3
4
5
6
7
Error!
8
9
10
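
One detail the "begin ... end while" and "begin ... end until" forms above rely on: the condition is tested after the body, so the body runs at least once. A minimal sketch of that difference follows; the two method names are made up for illustration, only the loop forms themselves are standard Ruby.

```ruby
# Hypothetical helper names; they only wrap the two loop forms.

# begin ... end while: the body runs first, the condition is checked afterwards.
def begin_end_while_runs
  runs = 0
  begin
    runs += 1
  end while false
  runs
end

# Plain modifier form: the condition is checked first, so the body never runs.
def modifier_while_runs
  runs = 0
  runs += 1 while false
  runs
end

puts begin_end_while_runs   # prints 1
puts modifier_while_runs    # prints 0
```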

HTH


Paul Lutus
http://www.arachnoid.com

Newbie wrote:

Perhaps I wasn't clear enough in my question (or misunderstood your
answer). I'm trying to understand a design decision - why isn't do/end
used for begin/end scenarios?

Because do ... end functions differently than begin ... end, and begin ...
end is necessary for goals where do ... end just won't do. And vice versa.
They have distinct purposes.

A begin ... end block can be nested within a do ... end block (and vice
versa), to mix iteration with the rescue feature.
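
A small sketch of that nesting, with begin ... rescue ... end inside a do ... end block so that one bad item does not stop the iteration. The method name safe_reciprocals and the sample data are assumptions made for this example.

```ruby
# safe_reciprocals is a hypothetical helper, not something from the thread.
def safe_reciprocals(numbers)
  results = []
  numbers.each do |n|                  # do ... end: the block handed to each
    begin                              # begin ... end: scope for the rescue
      results << 1.0 / Integer(n)
    rescue ArgumentError, TypeError
      results << nil                   # record the failure and keep iterating
    end
  end
  results
end

p safe_reciprocals([1, "2", "oops", 4])   # prints [1.0, 0.5, nil, 0.25]
```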

--
Paul Lutus
http://www.arachnoid.com

Newbie wrote:

So, from a language design perspective, what would be wrong with this?

do
   stuff
   do
     risky stuff
   rescue
     SOS
   end
   more stuff
end while bored

"do" doesn't work this way -- it's an iterator (loosely speaking). "begin"
is not an iterator. They are different. There are good reasons to make this
distinction.

--
Paul Lutus
http://www.arachnoid.com

What *are* those reasons? That's the whole point of his question.

On Sun, 10 Sep 2006 15:05:10 -0700, Paul Lutus wrote:

Newbie wrote:

So, from a language design perspective, what would be wrong with this?

do
   stuff
   do
     risky stuff
   rescue
     SOS
   end
   more stuff
end while bored

"do" doesn't work this way -- it's an iterator (loosely speaking). "begin"
is not an iterator. They are different. There are good reasons to make this
distinction.

--
Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/

Ken Bloom wrote:

Newbie wrote:

So, from a language design perspective, what would be wrong with this?

do
   stuff
   do
     risky stuff
   rescue
     SOS
   end
   more stuff
end while bored

"do" doesn't work this way -- it's an iterator (loosely speaking).
"begin" is not an iterator. They are different. There are good reasons to
make this distinction.

What *are* those reasons? That's the whole point of his question.

"do" reads from a list of items and provides them to its controlling block:

(implicit "do")

array.each { |item|
   # do something here
}

(explicit "do")

array.each do |item|
   # do something here
end

These two forms are interchangeable. One can argue that "do" is an implicit
"{ ... }" block, or the reverse. But the point is "do" receives items and
operates on them one at a time, then exits when its stream is empty.

"begin" doesn't get fed with items, it has a different purpose. It
demarcates a controlled block, to which a "rescue" clause might apply, or
to which a "while" test might apply, or others. And a "begin" block won't
persist on its own. Without some internal block that does something
repetitive (or a "while" test at the end), the "begin" block will exit in
one pass.

There is a need for both "do" and "begin" blocks. There is a need to
distinguish syntactically between a block that must be fed with items, and
one that must not be fed with items. To combine "do" and "begin" would lead
to syntactical ambiguity ... and surely then someone would ask why the two
purposes of "do" were not more clearly distinguished in the syntax.

If "do" and "begin" were to be merged, it would be a little like the
ambiguous use of "<<" in C++. In one context, it shifts bits:

int x = 1,y;

y = x << 4; // y = 16

In another context, "<<" inserts items into a stream:

iostream x;
bool y;

y = x << 4; // y = true if the operation was successful

See the problem? Without my clear declarations directly above each case, you
would have a hard time distinguishing cases that use the same syntax. This
would make program listings hard to interpret and debug (a fact in C++).

The multiple uses of "<<" in C++ is a well-known example, but the point I am
making is that it's important to avoid ambiguous syntax in language design.
"do ... end" always has a stream, and when the stream is exhausted, the
block exits. "begin ... end" never has a stream. It's easy to remember and
easy to read.

The end result of responding to every request for creative syntax variations
is called ... umm ... "Perl". :)

On Sun, 10 Sep 2006 15:05:10 -0700, Paul Lutus wrote:

--
Paul Lutus
http://www.arachnoid.com

Excellent argument, poor choice of example code *cough*.

a = 1 #=> 1
a << 1 #=> 2
$stdout << 4 # outputs '4'

On Mon, Sep 11, 2006 at 07:45:28AM +0900, Paul Lutus wrote:

There is a need for both "do" and "begin" blocks. There is a need to
distinguish syntactically between a block that must be fed with items, and
one that must not be fed with items. To combine "do" and "begin" would lead
to syntactical ambiguity ... and surely then someone would ask why the two
purposes of "do" were not more clearly distinguished in the syntax.

If "do" and "begin" were to be merged, it would be a little like the
ambiguous use of "<<" in C++. In one context, it shifts bits:

int x = 1,y;

y = x << 4; // y = 16

In another context, "<<" inserts items into a stream:

iostream x;
bool y;

y = x << 4; // y = true if the operation was successful

See the problem? Without my clear declarations directly above each case, you
would have a hard time distinguishing cases that use the same syntax. This
would make program listings hard to interpret and debug (a fact in C++).

The multiple uses of "<<" in C++ is a well-known example, but the point I am
making is that it's important to avoid ambiguous syntax in language design.
"do ... end" always has a stream, and when the stream is exhausted, the
block exits. "begin ... end" never has a stream. It's easy to remember and
easy to read.

The end result of responding to every request for creative syntax variations
is called ... umm ... "Perl". :)

Excellent explanation, Paul. I guess many newbies, and the not so new, are
grateful.

Regards,

Jose L. Hurtado
Web Developer
Toronto, Canada

Paul Lutus wrote:

"do" reads from a list of items and provides them to its controlling block:

These two forms are interchangeable. One can argue that "do" is an implicit
"{ ... }" block, or the reverse. But the point is "do" receives items and
operates on them one at a time, then exits when its stream is empty.

Not strictly true. Remember that the iteration comes from the fact
that the each method is invoking the block multiple times.

Here's a do/end that will only execute once (and by the way
has no parameters):

   Dir.chdir(some_dir) do process_something end

And by the way, the rescue that can appear between def and end
is rather an example of what some people want, IIUC. My impression
is that do/end don't have a rescue basically to avoid unneeded
complexity and because putting rescue inside {} for orthogonality
looks questionable.
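
For reference, a sketch of the def form mentioned here, where rescue and ensure attach directly to the method body without a begin. The method parse_port is a made-up example. As far as I know, Ruby releases from 2.5 onward also accept rescue directly inside a do ... end block; the sketch sticks to the def form, which also works on older versions.

```ruby
# parse_port is a hypothetical method used only for illustration.
def parse_port(text)
  Integer(text)
rescue ArgumentError, TypeError
  nil                                  # malformed input: fall back to nil
ensure
  # Runs whether or not the conversion succeeded; its value is discarded.
  $stderr.puts "parse_port(#{text.inspect}) finished"
end

p parse_port("8080")   # prints 8080
p parse_port("http")   # prints nil
```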

Hal

This just doesn't sound right to me. do/end and {} are part of the syntax of
a method call in ruby. They aren't operators. There is no requirement that
the code in a do/end or {} block be used within some sort of a looping
or iterative context. For example:

  File.open('datafile') do |fd|
    # do something with the open file handle: fd
  end

A do/end block is simply a snippet of code and a set of variable bindings that
are made available to the called method. The called method can ignore the block,
call it one or more times, call it conditionally, pass it along to some other
method, and so on.
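
Each of those options can be sketched in a few lines. All four method names below are hypothetical, chosen to mirror the list in the paragraph above.

```ruby
# Hypothetical methods, one per behaviour described above.

def ignore_block
  :ignored                             # never yields; the block is dropped
end

def call_twice
  [yield(1), yield(2)]                 # invokes the block two times
end

def call_if_given
  block_given? ? yield : :no_block     # invokes the block only if one was passed
end

def pass_along(&block)
  [10, 20].map(&block)                 # hands the same block to another method
end

p(ignore_block { raise "never runs" })   # prints :ignored
p(call_twice { |n| n * 3 })              # prints [3, 6]
p(call_if_given)                         # prints :no_block
p(pass_along { |n| n + 1 })              # prints [11, 21]
```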

Gary Wright

On Sep 10, 2006, at 6:45 PM, Paul Lutus wrote:

"do" reads from a list of items and provides them to its controlling block:

(implicit "do")

array.each { |item|
   # do something here
}

(explicit "do")

array.each do |item|
   # do something here
end

These two forms are interchangeable. One can argue that "do" is an implicit
"{ ... }" block, or the reverse. But the point is "do" receives items and
operates on them one at a time, then exits when its stream is empty.

Let me try a direction in answering this question. I don't know anything
about the internals of Ruby to know whether this is right or not, but
perhaps someone can confirm.

do..end and {...} blocks are lexical closures that remember their
surrounding binding. This requires setup and teardown code for dealing
with bindings that could otherwise go out of scope.
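
What "remember their surrounding binding" means can be seen from the outside, without knowing anything about the interpreter's internals. In this sketch (make_counter is a made-up name) the block keeps using a local variable after the method that created it has returned.

```ruby
# make_counter is a hypothetical helper for illustration.
def make_counter
  count = 0                            # local variable of this method call
  lambda do
    count += 1                         # the block still sees `count` after
  end                                  # make_counter has returned
end

counter = make_counter
counter.call
counter.call
puts counter.call                      # prints 3

other = make_counter                   # a fresh call gets its own binding
puts other.call                        # prints 1
```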

begin...end can't be passed around like do...end, and so it doesn't need
the same kind of setup and teardown code for dealing with bindings.

Different syntax is needed because the ruby interpreter isn't smart enough
to know which situation is which.

Is this close to correct?

On Sun, 10 Sep 2006 15:40:11 -0700, Paul Lutus wrote:

Ken Bloom wrote:

On Sun, 10 Sep 2006 15:05:10 -0700, Paul Lutus wrote:

Newbie wrote:

So, from a language design perspective, what would be wrong with this?

do
   stuff
   do
     risky stuff
   rescue
     SOS
   end
   more stuff
end while bored

"do" doesn't work this way -- it's an iterator (loosely speaking).
"begin" is not an iterator. They are different. There are good reasons to
make this distinction.

What *are* those reasons? That's the whole point of his question.

"do" reads from a list of items and provides them to its controlling block:

(implicit "do")

array.each { |item|
   # do something here
}

(explicit "do")

array.each do |item|
   # do something here
end

These two forms are interchangeable. One can argue that "do" is an implicit
"{ ... }" block, or the reverse. But the point is "do" receives items and
operates on them one at a time, then exits when its stream is empty.

"begin" doesn't get fed with items, it has a different purpose. It
demarcates a controlled block, to which a "rescue" clause might apply, or
to which a "while" test might apply, or others. And a "begin" block won't
persist on its own. Without some internal block that does something
repetitive (or a "while" test at the end), the "begin" block will exit in
one pass.

There is a need for both "do" and "begin" blocks. There is a need to
distinguish syntactically between a block that must be fed with items, and
one that must not be fed with items. To combine "do" and "begin" would lead
to syntactical ambiguity ... and surely then someone would ask why the two
purposes of "do" were not more clearly distinguished in the syntax.

If "do" and "begin" were to be merged, it would be a little like the
ambiguous use of "<<" in C++. In one context, it shifts bits:

int x = 1,y;

y = x << 4; // y = 16

In another context, "<<" inserts items into a stream:

iostream x;
bool y;

y = x << 4; // y = true if the operation was successful

See the problem? Without my clear declarations directly above each case, you
would have a hard time distinguishing cases that use the same syntax. This
would make program listings hard to interpret and debug (a fact in C++).

The multiple uses of "<<" in C++ is a well-known example, but the point I am
making is that it's important to avoid ambiguous syntax in language design.
"do ... end" always has a stream, and when the stream is exhausted, the
block exits. "begin ... end" never has a stream. It's easy to remember and
easy to read.

The end result of responding to every request for creative syntax variations
is called ... umm ... "Perl". :)

--
Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/

Hal Fulton wrote:

Paul Lutus wrote:

"do" reads from a list of items and provides them to its controlling
block:

These two forms are interchangeable. One can argue that "do" is an
implicit "{ ... }" block, or the reverse. But the point is "do" receives
items and operates on them one at a time, then exits when its stream is
empty.

Not strictly true.

Yes, strictly true, IMHO.

Remember that the iteration comes from the fact
that the each method is invoking the block multiple times.

Not if there is only one item.

array = []

array << "one item"

array.each do |item|
   # one item
end

Here's a do/end that will only execute once (and by the way
has no parameters):

   Dir.chdir(some_dir) do process_something end

In this example, "do" acts on its stream until it is empty. The stream
happens to empty after one item, because that is what the stream contains.

And by the way, the rescue that can appear between def and end
is rather an example of what some people want, IIUC. My impression
is that do/end don't have a rescue basically to avoid unneeded
complexity and because putting rescue inside {} for orthogonality
looks questionable.

Yep. In any case, one can get the rescue within the do ... end block by
enclosing a begin .. end block. I don't think this will satisfy anyone who
would like to see "rescue" added to do ... end.

I've believed for a long time that a "good" language has a minimum number of
constructs.

--
Paul Lutus
http://www.arachnoid.com

Logan Capaldo wrote:

/ ...

Excellent argument, poor choice of example code *cough*.

Hmm?

a = 1 #=> 1
a << 1 #=> 2
$stdout << 4 # outputs '4'

Yes, as does the iostream 'x' in my example (y = x << 4), and y is set to
"true" unless the operation fails. Remember, the example is C++, not Ruby,
solely to point out a language ambiguity there.

--
Paul Lutus
http://www.arachnoid.com

Logan Capaldo wrote:

On Mon, Sep 11, 2006 at 07:45:28AM +0900, Paul Lutus wrote:

There is a need for both "do" and "begin" blocks. There is a need to
distinguish syntactically between a block that must be fed with items,
and one that must not be fed with items. To combine "do" and "begin"
would lead to syntactical ambiguity ... and surely then someone would ask
why the two purposes of "do" were not more clearly distinguished in the
syntax.

If "do" and "begin" were to be merged, it would be a little like the
ambiguous use of "<<" in C++. In one context, it shifts bits:

int x = 1,y;

y = x << 4; // y = 16

In another context, "<<" inserts items into a stream:

iostream x;
bool y;

y = x << 4; // y = true if the operation was successful

See the problem? Without my clear declarations directly above each case,
you would have a hard time distinguishing cases that use the same syntax.
This would make program listings hard to interpret and debug (a fact in
C++).

The multiple uses of "<<" in C++ is a well-known example, but the point I
am making is that it's important to avoid ambiguous syntax in language
design. "do ... end" always has a stream, and when the stream is
exhausted, the block exits. "begin ... end" never has a stream. It's easy
to remember and easy to read.

The end result of responding to every request for creative syntax
variations is called ... umm ... "Perl". :)

Excellent argument, poor choice of example code *cough*.

Second reply, after reflection. Okay, I think you are saying the example
isn't very good because Ruby has the same ambiguity. Perhaps Matz didn't
want to leave C++ converts without any familiar landmarks?

--
Paul Lutus
http://www.arachnoid.com

gwtmp01@mac.com wrote:

A do/end block is simply a snippet of code and a set of variable
bindings that
are made available to the called method. The called method can
ignore the block,
call it one or more times, call it conditionally, pass it along to
some other
method, and so on.

That's what I was thinking. do...end / {...} is an implicit closure,
begin...end is a simple grouping construct. Also do...end has delayed
evaluation (like a quoted expression in Lisp), where begin...end is
evaluated immediately. That also means that do...end can be assigned to
a name (which results in similar functionality to a Lisp macro; e.g., a
= lambda { |x,y| puts x[y] }; a.call({:tree => 'cat'}, :tree)), and can
be passed around and so on; but begin...end cannot be.
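
The immediate versus delayed evaluation can be made visible with a log. This is only a sketch; immediate_value, deferred_value and the log array are names invented for the example.

```ruby
# Hypothetical helpers; `log` is any array used to record what has run.

def immediate_value(log)
  value = begin                        # begin ... end runs right here, once
    log << :begin_ran
    42
  end
  value
end

def deferred_value(log)
  lambda do                            # do ... end is only stored, not run yet
    log << :block_ran
    42
  end
end

log = []
immediate_value(log)
stored = deferred_value(log)
p log                                  # prints [:begin_ran]
stored.call
p log                                  # prints [:begin_ran, :block_ran]
```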

This may help the OP understand the difference a little better:
http://www.artima.com/intv/closures.html

Regards,
Jordan

Hal Fulton wrote:

Paul Lutus wrote:

"do" reads from a list of items and provides them to its controlling
block:

These two forms are interchangeable. One can argue that "do" is an
implicit "{ ... }" block, or the reverse. But the point is "do" receives
items and operates on them one at a time, then exits when its stream is
empty.

Not strictly true. Remember that the iteration comes from the fact
that the each method is invoking the block multiple times.

Yes, you are right (and I am quite wrong in the above quotation). The "do ...
end" block is simply a closure with an alternate syntax, and the calling
method provides the data and any repetition that may exist.

---------------------------------------------------

#! /usr/bin/ruby

def my_funct
   1.upto(8) do |x|
      yield x
   end
end

my_funct do |item|
   puts item
end

(or, equivalently)

my_funct { |item|
   puts item
}

So it appears that the syntactic role played by "do" is that it receives and
acts on the data fed to it by the calling method, once (barring any
repetitions internal to the do ... end block). IOW it is a closure block,
nothing more. Any looping is in the hands of the calling method.

So it appears the only difference between "do ... end" and "begin ... end"
is that "begin ... end" doesn't receive data from a caller.

The moment I realized I was wrong was when I fed the do ... end block "nil"
repeatedly, and it processed it anyway. It wasn't going to skip over a
false input, so the entire process is in the hands of the calling method.
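
That experiment is easy to reproduce. In the sketch below, feed_nils is a hypothetical calling method that yields nil a fixed number of times; the block runs on every call, because only the caller decides how often it is invoked.

```ruby
# feed_nils is a hypothetical caller; it alone decides how often the block runs.
def feed_nils(times)
  times.times { yield nil }            # the caller drives the repetition
end

seen = []
feed_nils(3) do |item|
  seen << item.inspect                 # the block runs even though item is nil
end
p seen                                 # prints ["nil", "nil", "nil"]
```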

--
Paul Lutus
http://www.arachnoid.com

gwtmp01@mac.com wrote:

But the point is "do" receives
items and
operates on them one at a time, then exits when its stream is empty.

This just doesn't sound right to me.

I was wrong. I should have thought a bit more deeply about what I was
saying. See my other post on this topic.

--
Paul Lutus
http://www.arachnoid.com

Ken Bloom wrote:

/ ...

Let me try a direction in answering this question. I don't know anything
about the internals of Ruby to know whether this is right or not, but
perhaps someone can confirm.

do..end and {...} blocks are lexical closures that remember their
surrounding binding. This requires setup and teardown code for dealing
with bindings that could otherwise go out of scope.

begin...end can't be passed around like do...end, and so it doesn't need
the same kind of setup and teardown code for dealing with bindings.

Different syntax is needed because the ruby interpreter isn't smart enough
to know which situation is which.

Is this close to correct?

It's reasonable. I can put a "do ... end" block in a Proc object (because
that's what it is) but I can't do this with begin ... end (unless the
begin ... end block is nested in a do ... end block). So they differ that
way. This may not be a particularly persuasive argument for or against
rescue within do ... end, because I can put a rescue clause in a begin ...
end block that is nested in a do ... end block.

Like so:

-------------------------------------------------

#! /usr/bin/ruby

p = Proc.new { |s|
   begin
      puts s
      y = 1/0   # force a ZeroDivisionError
   rescue
      puts "Error!"
   end
}

[ 'a' , 'b', 'c' ].each { |s| p.call(s) }

Output:

a
Error!
b
Error!
c
Error!

--
Paul Lutus
http://www.arachnoid.com

Ken Bloom wrote:

do..end and {...} blocks are lexical closures that remember their
surrounding binding. This requires setup and teardown code for dealing
with bindings that could otherwise go out of scope.

begin...end can't be passed around like do...end, and so it doesn't need
the same kind of setup and teardown code for dealing with bindings.

Different syntax is needed because the ruby interpreter isn't smart enough
to know which situation is which.

Is this close to correct?

Disclaimer: I don't know the internals of Ruby either. This is only
guesswork on my part, and stands a large chance of being wrong.

I believe what you're saying is close to correct.

You're correct in saying that rescue requires some extra setup and
teardown. That seems intuitively obvious to me.

My guess is that "begin" is like an extra hint to the interpreter,
a foreshadowing if you will, that there is probably going to be a
rescue (and thus some setup/teardown).

Note that def/end and begin/end are much rarer in code than do/end
and {}.

So if we start allowing rescue on do/end and {}, we expose the
interpreter to a lot of guessing or tentatively holding onto
information or "looking ahead." Stuff like: Do the setup, but only
if there's a rescue forthcoming. Or do the setup, but throw it away
if we never see a rescue.

Hal

That's a bit of a stretch, don't you think? What exactly is in "the
stream"? I can pretty much guarantee you that Dir.chdir looks like:

def Dir.chdir(directory)
  old_directory = Dir.pwd
  begin
    chdir_without_block(directory)
    yield
  ensure
   chdir_without_block(old_directory)
  end
end

I see state, but no stream. What about instance_eval? That's not a
"stream" is it? Heck, lambda { |x, y| x + y }.call(1, 2), where's the
stream in that?
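
The three cases named here can be run as they stand. This sketch uses the real Dir.chdir block form rather than the guessed implementation above; with_directory_restored is a made-up wrapper, and Dir.tmpdir is used only as a directory that is assumed to exist.

```ruby
require "tmpdir"

# with_directory_restored is a hypothetical wrapper around Dir.chdir.
def with_directory_restored
  before = Dir.pwd
  inside = Dir.chdir(Dir.tmpdir) { Dir.pwd }   # block runs once, takes no items
  [before, inside, Dir.pwd]
end

before, _inside, after = with_directory_restored
puts before == after                           # prints true: directory restored

# instance_eval runs the block once with self changed; nothing is iterated.
puts "ruby".instance_eval { upcase }           # prints RUBY

# A lambda is just called with arguments.
puts lambda { |x, y| x + y }.call(1, 2)        # prints 3
```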

On Mon, Sep 11, 2006 at 11:15:23AM +0900, Paul Lutus wrote:

Hal Fulton wrote:

> Paul Lutus wrote:
>>
>> "do" reads from a list of items and provides them to its controlling
>> block:
>>
>> These two forms are interchangeable. One can argue that "do" is an
>> implicit "{ ... }" block, or the reverse. But the point is "do" receives
>> items and operates on them one at a time, then exits when its stream is
>> empty.
>
> Not strictly true.

Yes, strictly true, IMHO.

> Remember that the iteration comes from the fact
> that the each method is invoking the block multiple times.

Not if there is only one item.

array = []

array << "one item"

array.each do |item|
   # one item
end

>
> Here's a do/end that will only execute once (and by the way
> has no parameters):
>
> Dir.chdir(some_dir) do process_something end

In this example, "do" acts on its stream until it is empty. The stream
happens to empty after one item, because that is what the stream contains.

Entirely my point: the exact same ambiguity exists in Ruby. Why can't we
also have the do ... end ambiguity, since we've already introduced
ambiguities? Ruby is full of ambiguities (some might suggest that's what
makes duck typing such a useful paradigm). My point was that saying
"Ruby shouldn't have an ambiguity, look at this ambiguity in this other
language", when Ruby has the exact same ambiguity, is sort of
counterproductive.
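
The Ruby side of that comparison can be checked directly: what "<<" does depends entirely on the class of the receiver. A short sketch, where shovel_examples is a made-up helper and StringIO stands in for $stdout so the result can be inspected:

```ruby
require "stringio"

# shovel_examples is a hypothetical helper collecting one result per receiver.
def shovel_examples
  out = StringIO.new
  out << 4                               # IO-like object: writes "4"
  {
    integer: 1 << 4,                     # Integer: bit shift
    array:   [1, 2] << 4,                # Array: append an element
    string:  String.new("ab") << "c",    # String: append in place
    io:      out.string                  # what was written to the stream
  }
end

p shovel_examples
```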

On Mon, Sep 11, 2006 at 11:35:28AM +0900, Paul Lutus wrote:

Logan Capaldo wrote:

/ ...

> Excellent argument, poor choice of example code *cough*.

Hmm?

> a = 1 #=> 1
> a << 1 #=> 2
> $stdout << 4 # outputs '4'

Yes, as does the iostream 'x' in my example (y = x << 4), and y is set to
"true" unless the operation fails. Remember, the example is C++, not Ruby,
solely to point out a language ambiguity there.