#collect with block modifying receiver

Hello, all…

I’m wondering what should be the behavior of collect when the block
modifies the receiver.

I don’t recall thinking about this before, but I would have said
that the collect always returns as many items as the array had
at the time of the call.

Has this somehow changed since 1.6.*?

The reason I ask is the email (below) I got from Jonathan Lim.
The randomize methods on page 152/153 of The Ruby Way do not
work. It’s possible they never did, but I find that hard to
believe.

I don’t have an old Ruby around, or I’d test it.

The original line in randomize, of course, read as follows:

arr.collect { arr.slice!(rand arr.length) }

Thanks to you who reply, and thanks again to Jonathan.

Cheers,
Hal

···

-------- Original Message --------
Subject: Randomizing an Array Errata?
Date: Thu, 28 Aug 2003 15:12:45 +0100
From: Jonathan Lim jhgl@dial.pipex.com
To: hal9000@hypermetrics.com

Hi,

On p.153 of the Ruby Way

Should it not read this?

def randomize
  arr = self.dup
  self.collect { arr.slice!(rand arr.length) }
end

def randomize!
  arr = self.dup
  result = self.collect { arr.slice!(rand arr.length) }
  self.replace result
end

Cheers,
Jon

Running the code from The Ruby Way I get

[mark@laptop mark]$ ruby1.6 -v test.rb
ruby 1.6.7 (2002-03-01) [i586-linux-gnu]
[1, 4, 3, 5, 2]
[4, 5, 2, 3, 1]
[mark@laptop mark]$ ruby -v test.rb
ruby 1.8.0 (2003-08-04) [i686-linux]
test.rb:5: warning: parenthesize argument(s) for future version
test.rb:10: warning: parenthesize argument(s) for future version
[2, 5, 1]
[2, 3, 4]

So it does look like there has been a change between 1.6.7 and 1.8.0

After changing

arr.collect { arr.slice!(rand arr.length) }

to

self.collect { arr.slice!(rand arr.length) }

I get

[mark@laptop mark]$ ruby1.6 -v test.rb
ruby 1.6.7 (2002-03-01) [i586-linux-gnu]
[5, 3, 4, 2, 1]
[2, 1, 3, 5, 4]
[mark@laptop mark]$ ruby -v test.rb
ruby 1.8.0 (2003-08-04) [i686-linux]
[5, 3, 1, 2, 4]
[5, 2, 1, 4, 3]

So this does fix the problem.
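
For anyone who wants to try the fix, here is a minimal sketch of the corrected methods, with the argument to rand parenthesized (the 1.8.0 run above warns about the unparenthesized form). The variable name pool and the small check at the end are my own additions for illustration, not text from the book.

```ruby
class Array
  # Return a new array holding the same elements in random order.
  # We iterate over self, which is never modified, and pull the
  # elements out of a private copy instead.
  def randomize
    pool = dup
    collect { pool.slice!(rand(pool.length)) }
  end

  # In-place variant: build the shuffled copy first, then swap it in.
  def randomize!
    replace(randomize)
  end
end

original = [1, 2, 3, 4, 5]
shuffled = original.randomize

puts shuffled.inspect                  # some ordering of 1..5
puts shuffled.sort == original.sort    # same elements, nothing lost
```

Rubies newer than the ones discussed here also ship Array#shuffle and Array#shuffle!, which may make a hand-rolled version unnecessary.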

Best regards

Mark Sparshatt

···

On Thursday 28 Aug 2003 4:48 pm, Hal Fulton wrote:

Hello, all…

I’m wondering what should be the behavior of collect when the block
modifies the receiver.

I don’t recall thinking about this before, but I would have said
that the collect always returns as many items as the array had
at the time of the call.

Has this somehow changed since 1.6.*?

The reason I ask is the email (below) I got from Jonathan Lim.
The randomize methods on page 152/153 of The Ruby Way do not
work. It’s possible they never did, but I find that hard to
believe.

I don’t have an old Ruby around, or I’d test it.

mark wrote:

Running the code from The Ruby Way I get

[mark@laptop mark]$ ruby1.6 -v test.rb
ruby 1.6.7 (2002-03-01) [i586-linux-gnu]
[1, 4, 3, 5, 2]
[4, 5, 2, 3, 1]
[mark@laptop mark]$ ruby -v test.rb
ruby 1.8.0 (2003-08-04) [i686-linux]
test.rb:5: warning: parenthesize argument(s) for future version
test.rb:10: warning: parenthesize argument(s) for future version
[2, 5, 1]
[2, 3, 4]

So it does look like there has been a change between 1.6.7 and 1.8.0

OK, so this really makes me wonder what the “theoretically correct”
behavior is.

My first thought is that this:

 receiver.collect { some_operation_altering_receiver }

should behave the same as this:

 receiver.dup.collect { some_operation_altering_receiver }
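
To make the second form concrete, here is a small sketch that uses shift as a stand-in for "some operation altering the receiver" (my choice, purely so the result is deterministic). Because the iteration runs over the copy, the number of block calls is fixed at the moment of the dup, whatever happens to the receiver afterwards.

```ruby
receiver = [10, 20, 30, 40]

# Iterate over a snapshot; the block is free to shrink the original.
result = receiver.dup.collect { receiver.shift }

puts result.inspect    # one entry per element of the snapshot
puts receiver.inspect  # the original has been emptied by the block
```

The cost is one extra shallow copy per call, which is presumably the kind of overhead a built-in guarantee would impose on everyone.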

Matz, are you listening?? Enlighten us, please…

Hal

Hi,

···

In message “Re: #collect with block modifying receiver” on 03/08/29, Hal Fulton hal9000@hypermetrics.com writes:

OK, so this really makes me wonder what the “theoretically correct”
behavior is.

Matz, are you listening?? Enlighten us, please…

Ah, listen, it’s undefined behavior. :wink:

To be serious, I don’t want to slow down performance by defining any
exact behavior. Don’t modify the receiver while you are iterating
over it.

						matz.

Yukihiro Matsumoto wrote:

Hi,

OK, so this really makes me wonder what the “theoretically correct”
behavior is.

Matz, are you listening?? Enlighten us, please…

Ah, listen, it’s undefined behavior. :wink:

To be serious, I don’t want to slow down performance by defining any
exact behavior. Don’t modify the receiver while you are iterating
over it.

I have to assert my support of this position; modifying a receiver while
in a block would be VERY tedious to anticipate, and would be an endless
source for wonderful new bugs, if the defined behavior were to allow it.

Sean O'Dell
···

In message “Re: #collect with block modifying receiver” on 03/08/29, Hal Fulton hal9000@hypermetrics.com writes:

So it’s undefined behaviour, but “Don’t modify the receiver while you
are iterating over it.” :slight_smile:

Can I suggest the solution that Java uses, which is to fail fast in that
situation (ConcurrentModificationException). You would not believe how
much debugging time that one tiny feature has saved me!
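
To illustrate what fail-fast could look like, here is a rough sketch in Ruby. Everything in it is hypothetical: FailFastList, ConcurrentModificationError and the modification counter are names invented for this example, not anything Ruby provides, and only two mutating methods are covered.

```ruby
# Hypothetical wrapper, not part of Ruby: raises as soon as the list
# is structurally modified while #each is running.
class FailFastList
  class ConcurrentModificationError < StandardError; end

  def initialize(items = [])
    @items = items.dup
    @mod_count = 0    # bumped on every structural change
  end

  def <<(item)
    @mod_count += 1
    @items << item
    self
  end

  def delete_at(index)
    @mod_count += 1
    @items.delete_at(index)
  end

  def each
    expected = @mod_count
    @items.each do |item|
      yield item
      # Check right after the block runs so the failure is immediate.
      if @mod_count != expected
        raise ConcurrentModificationError, "list modified during iteration"
      end
    end
    self
  end
end

list = FailFastList.new([1, 2, 3])
begin
  list.each { |x| list << x * 2 }
rescue FailFastList::ConcurrentModificationError => e
  puts "caught: #{e.message}"
end
```

A real implementation would have to guard every mutating method, and the counter check on each step is exactly the sort of bookkeeping that costs time in the core classes.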

Cheers,
Dan

Dan North

Dan North wrote:

So it’s undefined behaviour, but “Don’t modify the receiver while you
are iterating over it.” :slight_smile:

Can I suggest the solution that Java uses, which is to fail fast in that
situation (ConcurrentModificationException). You would not believe how
much debugging time that one tiny feature has saved me!

I think throwing an exception falls under “exact behavior.” :slight_smile:
And detecting this situation would be non-trivial, and probably
not worthwhile anyway. I’m happy enough with the admonition
“Just Don’t Do It” (opposite of Nike).

Shame on me for allowing this (pp 152-153) to have found its way
into print.

Hal

···

Cheers,
Dan

Dan North
http://www.thoughtworks.com

I wouldn’t like that, personally. I don’t remember exactly where, but I
believe that I have a piece of code taking advantage of the fact that a
receiver can be modified during iteration. I don’t modify during a
#collect, but during an #each. One of the things you can do is:

arr.each { |e| arr << f.flatten if f.kind_of?(Array) }

-austin

···

On Fri, 29 Aug 2003 22:02:01 +0900, Dan North wrote:

So it’s undefined behaviour, but “Don’t modify the receiver while you are
iterating over it.” :slight_smile:

Can I suggest the solution that Java uses, which is to fail fast in that
situation (ConcurrentModificationException). You would not believe how
much debugging time that one tiny feature has saved me!


austin ziegler * austin@halostatue.ca * Toronto, ON, Canada
software designer * pragmatic programmer * 2003.08.29
* 13.04.26

Austin Ziegler wrote:

···

On Fri, 29 Aug 2003 22:02:01 +0900, Dan North wrote:

So it’s undefined behaviour, but “Don’t modify the receiver while you are
iterating over it.” :slight_smile:

Can I suggest the solution that Java uses, which is to fail fast in that
situation (ConcurrentModificationException). You would not believe how
much debugging time that one tiny feature has saved me!

I wouldn’t like that, personally. I don’t remember exactly where, but I
believe that I have a piece of code taking advantage of the fact that a
receiver can be modified during iteration. I don’t modify during a
#collect, but during an #each. One of the things you can do is:

arr.each { |e| arr << f.flatten if f.kind_of?(Array) }

Well, if I understand Matz correctly, this behavior is not guaranteed.

If that’s the case, you should change this to avoid being bitten
the way I was.

Hal

I dunno. I think that this is something “dangerous” that should be
implicitly permitted. In my case, I needed to process a list of items (order
didn’t matter) and process sublists after everything else. I can’t think of
anything quite as elegant as the above that guarantees that each item will
only be processed once. (There was more to it than that, but I can’t find my
use of the technique in any case. It must have been less interesting than I
thought.)
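
For comparison, here is one way the same "process items now, sublists afterwards" idea might be written without appending to the array being iterated. It is only a sketch: process_deferring_sublists is a hypothetical name, and "processing" is reduced to collecting the items in the order they are visited.

```ruby
# Hypothetical helper: visits every non-array item exactly once,
# pushing the contents of any sublist to the back of a private queue.
def process_deferring_sublists(items)
  queue = items.dup      # work on a copy; the caller's array is untouched
  processed = []

  until queue.empty?
    item = queue.shift
    if item.kind_of?(Array)
      queue.concat(item) # defer the sublist's elements until later
    else
      processed << item  # stand-in for the real per-item work
    end
  end

  processed
end

input = [1, [2, [3, 4]], 5]
puts process_deferring_sublists(input).inspect
```

It is less compact than the one-liner, but the loop condition is explicit, so it does not depend on how each treats a receiver that grows underneath it.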

-austin

···

On Sat, 30 Aug 2003 02:11:12 +0900, Hal Fulton wrote:

Austin Ziegler wrote:

On Fri, 29 Aug 2003 22:02:01 +0900, Dan North wrote:

So it’s undefined behaviour, but “Don’t modify the receiver while you
are iterating over it.” :slight_smile:

arr.each { |e| arr << f.flatten if f.kind_of?(Array) }

Well, if I understand Matz correctly, this behavior is not guaranteed.

If that’s the case, you should change this to avoid being bitten the way
I was.


austin ziegler * austin@halostatue.ca * Toronto, ON, Canada
software designer * pragmatic programmer * 2003.08.29
* 13.31.57

Hi –

Austin Ziegler wrote:

So it’s undefined behaviour, but “Don’t modify the receiver while you are
iterating over it.” :slight_smile:

Can I suggest the solution that Java uses, which is to fail fast in that
situation (ConcurrentModificationException). You would not believe how
much debugging time that one tiny feature has saved me!

I wouldn’t like that, personally. I don’t remember exactly where, but I
believe that I have a piece of code taking advantage of the fact that a
receiver can be modified during iteration. I don’t modify during a
#collect, but during an #each. One of the things you can do is:

arr.each { |e| arr << f.flatten if f.kind_of?(Array) }

Well, if I understand Matz correctly, this behavior is not guaranteed.

If that’s the case, you should change this to avoid being bitten
the way I was.

I think (Matz - ?) that Matz was talking specifically about changing
the length of the receiver during an iteration, since that raises the
question of how many iterations there should be, what the next one
should be if something gets sliced out, etc.

Just modifying the objects during an iteration is different, I think.
#map! does it, for example. Or:

names.each {|name| name.upcase!}

I’m pretty sure there’s no danger with in-place things like this
(as opposed to things that alter the meaning of the ‘place’ you’re
‘in’ :slight_smile:)
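
A small sketch of the distinction, assuming a made-up list of names: the block changes each String object, but the array keeps the same length and the same slots throughout, so the question of how many iterations there should be never comes up.

```ruby
# dup each literal so the strings are mutable whatever the
# frozen-string-literal setting happens to be
names = %w[hal matz david].map { |s| s.dup }
length_before = names.length

# Mutates each String in place; the array itself is never resized.
names.each { |name| name.upcase! }

puts names.inspect
puts names.length == length_before
```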

David

···

On Sat, 30 Aug 2003, Hal Fulton wrote:

On Fri, 29 Aug 2003 22:02:01 +0900, Dan North wrote:


David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

dblack@superlink.net wrote:

I wouldn’t like that, personally. I don’t remember exactly where, but I
believe that I have a piece of code taking advantage of the fact that a
receiver can be modified during iteration. I don’t modify during a
#collect, but during an #each. One of the things you can do is:

arr.each { |e| arr << f.flatten if f.kind_of?(Array) }

Well, if I understand Matz correctly, this behavior is not guaranteed.

If that’s the case, you should change this to avoid being bitten
the way I was.

I think (Matz - ?) that Matz was talking specifically about changing
the length of the receiver during an iteration, since that raises the
question of how many iterations there should be, what the next one
should be if something gets sliced out, etc.

Just modifying the objects during an iteration is different, I think.
#map! does it, for example. Or:

names.each {|name| name.upcase!}

I’m pretty sure there’s no danger with in-place things like this
(as opposed to things that alter the meaning of the ‘place’ you’re
‘in’ :slight_smile:)

Yes, I agree. But to avoid confusion for others, let me point out that
Austin’s code does change the length of the receiver.

However, let’s examine this at a slightly finer granularity. It seems
likely to me that it is “safer” to increase the length of a container
(to append) than it is to decrease its length, to add/delete in the
middle, and so on.

The “less safe” behavior was what the randomizing code did. It worked
under 1.6.* and now fails under 1.8.0.

OK, here’s a modified question for Matz. If we are iterating over a
list, and the block appends to the list, is the behavior guaranteed
and the code thus acceptable?

Hal

Hi –

···

On Sat, 30 Aug 2003, Hal Fulton wrote:

dblack@superlink.net wrote:

I wouldn’t like that, personally. I don’t remember exactly where, but I
believe that I have a piece of code taking advantage of the fact that a
receiver can be modified during iteration. I don’t modify during a
#collect, but during an #each. One of the things you can do is:

arr.each { |e| arr << f.flatten if f.kind_of?(Array) }

Well, if I understand Matz correctly, this behavior is not guaranteed.

If that’s the case, you should change this to avoid being bitten
the way I was.

I think (Matz - ?) that Matz was talking specifically about changing
the length of the receiver during an iteration, since that raises the
question of how many iterations there should be, what the next one
should be if something gets sliced out, etc.

Just modifying the objects during an iteration is different, I think.
#map! does it, for example. Or:

names.each {|name| name.upcase!}

I’m pretty sure there’s no danger with in-place things like this
(as opposed to things that alter the meaning of the ‘place’ you’re
‘in’ :slight_smile:)

Yes, I agree. But to avoid confusion for others, let me point out that
Austin’s code does change the length of the receiver.

Whoops – too late to prevent confusion for me. My only excuse is
that I was so focused on wondering what ‘f’ was that I lost track of
what was actually happening :slight_smile:

David


David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav