Ruby-dev summary 18711-18810

Hi all,

This is a summary of recent discussion on the ruby-dev ML.

[ruby-dev:18651] Enumerable#zip (contd.)

Matz has noted some preconditions about this issue:

* The main reason to introduce Enumerable#zip is the parallel
  iteration (to be precise, "finite parallel each").

* We should not use Thread/Continuation.

* We should not use external iterator (Iterator pattern).

* (on zip) We should not raise exceptions when the lengths of the
  components differ.

Under these conditions, the following candidates remain:

# Enumerable#zip
for x,y in a.zip(b) do ... end
a.zip(b).each {|x,y| ...}
    # Problem: If the lengths of a and b differ,
    # which one should we choose?  A default value/block does
    # not resolve this problem, because it is equivalent to the
    # "choose longest" strategy.

# Enumerable#sync_each
a.sync_each(b,c) {|a,b,c| ... }
    # Problem: "sync_each" is not a good name.
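
On the length question raised for Enumerable#zip above, here is a minimal sketch of how Array#zip behaves in present-day Ruby. This describes current Ruby, not necessarily what had been decided at the time of this summary: the receiver's length wins, and missing elements become nil.

```ruby
# Present-day Array#zip: the receiver's length determines the result length.
a     = [1, 2, 3]
short = [10, 20]
long  = [10, 20, 30, 40]

padded    = a.zip(short)  # the shorter argument is padded with nil
truncated = a.zip(long)   # extra elements of the argument are dropped

p padded     # [[1, 10], [2, 20], [3, nil]]
p truncated  # [[1, 10], [2, 20], [3, 30]]
```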

The following candidates have already been rejected:

# Array.zip --- "Array." is too redundant for our aim.
for x,y in Array.zip(a,b) do ... end
Array.zip(a,b).each {|x,y| ...}

# Enumerable#map_with_index --- too verbose.
a.map_with_index {|x,idx| [x, b[idx]] }.each {|x,y| ... }

# Array#zip --- A temporary object (an array in this case)
#               should not be a receiver.
for x,y in [a,b].zip do ... end
[a,b].zip.each {|x,y| ... }

# Kernel#zip --- We already have too many toplevel methods.
zip(a,b).each {|x,y| ... }

[ruby-dev:18711] another implementation of pstore

The current pstore.rb may corrupt a database file when a ruby process
is interrupted at certain bad moments. YANAGAWA Kazuhisa announced
his implementation of pstore.rb, which resolves such problems.

You can get it from:

http://www.dm4lab.to/~kjana/ruby/ps.tar.gz

NOTE: He does NOT intend to replace the current implementation.
This is just a sample for now.

[ruby-dev:18739] change chomp!

Shin-ichiro HARA suggested that String#chomp should cut off
CR and LF at once. Knu pointed out that ruby 1.7 already behaves
this way.
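
As an illustration, this is how String#chomp with the default record separator behaves in present-day Ruby (a sketch of current behaviour, not a test against ruby 1.7 itself): one trailing line terminator is removed, whether it is LF, CR, or CR followed by LF.

```ruby
# With the default record separator, chomp strips a single trailing
# "\n", "\r", or "\r\n" (the CR LF pair is removed as one unit).
lines    = ["unix\n", "dos\r\n", "oldmac\r", "none"]
stripped = lines.map { |s| s.chomp }

p stripped  # ["unix", "dos", "oldmac", "none"]
```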

– Minero Aoki

“Minero Aoki” aamine@loveruby.net wrote in message
news:20021118181718N.aamine@mx.edit.ne.jp…

Hi all,

This is a summary of recent discussion on the ruby-dev ML.

[ruby-dev:18651] Enumerable#zip (contd.)

Applying zip as an instance method of the module Enumerable makes the method
asymmetric. There is the object to the left of the method name, and then its
parameters. But conceptually the operation is symmetrical, except that the
order is significant. Where in the sequence should the object be placed?

Matz has noted some preconditions about this issue:

* The main reason to introduce Enumerable#zip is the parallel
  iteration (to be precise, "finite parallel each").

* We should not use Thread/Continuation.

* We should not use external iterator (Iterator pattern).

* (on zip) We should not raise exceptions when the lengths of the
  components differ.

If the longest sequence is chosen, what value should stand in for the
absent items?
If no exception is raised, how can one know whether the sequences
have different lengths?


Hi –

···

On Mon, 18 Nov 2002, Minero Aoki wrote:

The following candidates have already been rejected:

# Enumerable#map_with_index --- too verbose.
a.map_with_index {|x,idx| [x, b[idx]] }.each {|x,y| ... }

Do you mean that map_with_index has been rejected on its own, or just
as a zip implementation?

David


David Alan Black
home: dblack@candle.superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

Hi –

# Enumerable#zip
for x,y in a.zip(b) do ... end
a.zip(b).each {|x,y| ...}
    # Problem: If the lengths of a and b differ,
    # which one should we choose?  A default value/block does
    # not resolve this problem, because it is equivalent to the
    # "choose longest" strategy.

Since a is the receiver, it would seem natural to me to have the
zip’ping only go the length of a.

Also – separate point, but maybe related – do we need “each”?
What about:

a.zip(b) {|x,y| ... }

i.e., have zip itself iterate (rather than have to iterate through
the returned array).
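
For what it is worth, present-day Ruby's Array#zip accepts a block in just this way. A small sketch, assuming current Ruby semantics rather than anything decided in this thread: with a block, zip yields each tuple and returns nil instead of the zipped array, and it runs for the length of the receiver.

```ruby
a = [1, 2, 3]
b = %w[one two three four]

pairs  = []
# With a block, zip iterates itself; "four" is ignored because a has 3 items.
result = a.zip(b) { |x, y| pairs << "#{x}=#{y}" }

p pairs   # ["1=one", "2=two", "3=three"]
p result  # nil
```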

David

···

On Mon, 18 Nov 2002, Minero Aoki wrote:



[ruby-dev:18651] Enumerable#zip (contd.)

Matz has noted some preconditions about this issue:

* The main reason to introduce Enumerable#zip is the parallel
  iteration (to be precise, "finite parallel each").

* We should not use Thread/Continuation.

* We should not use external iterator (Iterator pattern).

What are the reasons for avoiding external iterators? Using zip for
parallel iteration has a nice clean syntax, but I think I’d avoid it for
large data structures – unless it could be defined to return some sort
of proxy object that responds to each() instead of an Array.
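
A rough sketch of what such a proxy might look like. ZipProxy is a hypothetical name, and this version assumes the inputs are indexable (respond to size and []); it only shows that tuples can be produced on demand instead of being collected up front.

```ruby
# Hypothetical ZipProxy: pairs up indexable sequences lazily.
class ZipProxy
  include Enumerable

  def initialize(*seqs)
    @seqs = seqs
  end

  # Yields one tuple per index of the first sequence; the full list of
  # tuples is never materialised unless the caller asks for it (to_a).
  def each
    return to_enum(:each) unless block_given?
    return self if @seqs.empty?
    @seqs.first.size.times do |i|
      yield @seqs.map { |s| s[i] }
    end
    self
  end
end

proxy = ZipProxy.new([1, 2, 3], %w[a b c])
proxy.each { |x, y| puts "#{x} #{y}" }
```

Because the class includes Enumerable, methods such as map and select work on the proxy without any extra code.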

Under these conditions, the following candidates remain:

# Enumerable#zip
for x,y in a.zip(b) do ... end
a.zip(b).each {|x,y| ...}
    # Problem: If the lengths of a and b differ,
    # which one should we choose?  A default value/block does
    # not resolve this problem, because it is equivalent to the
    # "choose longest" strategy.

# Enumerable#sync_each
a.sync_each(b,c) {|a,b,c| ... }
    # Problem: "sync_each" is not a good name.

How about Enumerable#iterate_with?

[ruby-dev:18739] change chomp!

Shin-ichiro HARA suggested that String#chomp should cut off
CR and LF at once. Knu pointed out that ruby 1.7 already behaves
this way.

Does it do this on all platforms or only on Windows?

Paul

···

On Mon, Nov 18, 2002 at 06:13:12PM +0900, Minero Aoki wrote:

How are you going to implement it?

Regards,
Pit

···

On 18 Nov 2002 at 18:13, Minero Aoki wrote:

[ruby-dev:18651] Enumerable#zip (contd.)

Matz has noted some preconditions about this issue:
(…)
* We should not use Thread/Continuation.
(…)

Hi,

[ruby-dev:18651] Enumerable#zip (contd.)

Applying zip as an instance method of the module Enumerable makes the method
asymmetric. There is the object to the left of the method name, and then its
parameters. But conceptually the operation is symmetrical, except that the
order is significant. Where in the sequence should the object be placed?

If the longest sequence is chosen, what value should stand in for the
absent items?

nil

If no exception is raised, how can one know whether the sequences
have different lengths?

By explicit check, if you really want to know.
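
Such an explicit check could be as small as the following sketch. zip_strict is a hypothetical helper name, not an existing Ruby method; it simply compares sizes before delegating to zip.

```ruby
# Hypothetical helper: refuse to zip sequences of different lengths.
def zip_strict(a, b)
  unless a.size == b.size
    raise ArgumentError, "length mismatch: #{a.size} vs #{b.size}"
  end
  a.zip(b)
end

p zip_strict([1, 2], [3, 4])  # [[1, 3], [2, 4]]
```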

						matz.
···

In message “Re: ruby-dev summary 18711-18810” on 02/11/18, “Aleksei Guzev” aleksei.guzev@bigfoot.com writes:

Hi,

···

In message “Re: ruby-dev summary 18711-18810” on 02/11/18, dblack@candle.superlink.net dblack@candle.superlink.net writes:

Do you mean that map_with_index has been rejected on its own, or just
as a zip implementation?

map_with_index was rejected just because no one proposed its typical
usage. YANGI principle applied.

						matz.

On “zip”:

I’m sure most readers of this thread are aware that the snake-language
introduced about two years ago a “zip” function very much like the method
being proposed here – same name and everything. Although I’m a Ruby fan and
write lots of Ruby, I do a lot of programming in Python since my company’s
product has lots of Python code. And I can’t actually remember ever using
"zip". Parallel iteration is an occasional need (in my world), but not
exactly a frequent need. When parallel iteration is needed, other
techniques like each_with_index work just fine. This is a YAGNI (the acronym
I just learned :-) warning.

One of the reasons I don’t like using Python’s “zip” is that it actually
zips up all of the input sequences into a physical result sequence (similar
to their “range” function which is semantically like Ruby’s “m...n” but
returns a physical sequence) – it’s always bothered me to have to create a
possibly large sequence to do that operation when it isn’t necessary.
Hopefully, if “zip” becomes part of Ruby, the implementation could be as an
iterator which never builds a physical version of the zipped sequence. But
that is an implementation challenge if the input sequences are general
instances of Enumerable, since the input sequences cannot be efficiently
iterated in parallel. (If the input Enumerables were limited to those which
are numerically indexed, such as Arrays, implementation would be easy – but
it would be unfortunate to have to make that restriction.)

Anyway, my personal feeling about “zip” is that I would likely use it only
if it could efficiently iterate the zipped elements of the input sequences
without creating a full physical output sequence. Otherwise, I don’t think
it’s especially useful.
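
For general Enumerables, one way to get this in present-day Ruby is sketched below. It relies on Enumerator#next, which is itself an external iterator and did not exist when this thread was written, so it sidesteps rather than satisfies the preconditions listed in the summary. parallel_each is a hypothetical name.

```ruby
# Hypothetical helper: step through several Enumerables together without
# building the zipped sequence. Stops when the shortest input runs out.
def parallel_each(*seqs)
  enums = seqs.map(&:to_enum)
  loop do
    # Enumerator#next raises StopIteration at the end; Kernel#loop rescues it.
    yield enums.map(&:next)
  end
  nil
end

out = []
parallel_each(1..3, %w[a b c d]) { |n, s| out << [n, s] }
p out  # [[1, "a"], [2, "b"], [3, "c"]]
```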

By the way, I just scanned the Python library code in the most recent
distribution and, outside of test modules, there are only two occurrences of
"zip"!

Bob

Hi,

* We should not use external iterator (Iterator pattern).

What are the reasons for avoiding external iterators? Using zip for
parallel iteration has a nice clean syntax, but I think I’d avoid it for
large data structures – unless it could be defined to return some sort
of proxy object that responds to each() instead of an Array.

Internal iterator is far simpler, and as powerful as external
iterator, except for parallel iteration. I won’t buy complexity for
rarely used parallel iteration.

a.sync_each(b,c) {|a,b,c| ... }
    # Problem: "sync_each" is not a good name.

How about Enumerable#iterate_with?

Hmm. Better than “sync_each”.

[ruby-dev:18739] change chomp!

Shin-ichiro HARA suggested that String#chomp should cut off
CR and LF at once. Knu pointed out that ruby 1.7 already behaves
this way.

Does it do this on all platforms or only on Windows?

On all platforms.

						matz.
···

In message “Re: ruby-dev summary 18711-18810” on 02/11/19, Paul Brannan pbrannan@atdesk.com writes:

“Aleksei Guzev” aleksei.guzev@bigfoot.com wrote in message news:arac16$gmero$1@ID-167200.news.dfncis.de

“Minero Aoki” aamine@loveruby.net wrote in message
news:20021118181718N.aamine@mx.edit.ne.jp…

Hi all,

This is a summary of recent discussion on the ruby-dev ML.

[ruby-dev:18651] Enumerable#zip (contd.)

Applying zip as an instance method of the module Enumerable makes the method
asymmetric. There is the object to the left of the method name, and then its
parameters. But conceptually the operation is symmetrical, except that the
order is significant. Where in the sequence should the object be placed?

OTOH, the reason I don’t like Kernel#zip is that it implies a false
symmetry which breaks in the case len(a) < len(b). What I’d expect in
the two cases would be

a = [1,2,3]
b = [5,6,7,8,9]
a.zip(b) # => [[1,5], [2,6], [3,7]]
zip(a,b) # => [[1,5], [2,6], [3,7], [nil,8], [nil,9]]

I still think a zipf (zip with function) would be useful too

a.zipf {|i| f} # => [[a1, f(a1)], [a2, f(a2)], ...]

Frinstance the Schwartzian transform could then be written

def stsort(&block)
  zipf(&block).sort {|a, b| a[1] <=> b[1]}.map {|a| a[0]}
end
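
A runnable sketch of the idea, for anyone who wants to try it. zipf and stsort are the hypothetical names used in this message, added to Enumerable here purely for illustration; present-day Ruby's built-in sort_by covers the same ground as stsort.

```ruby
module Enumerable
  # Hypothetical zipf: pair each element with the block's value for it.
  def zipf
    map { |x| [x, yield(x)] }
  end

  # Schwartzian transform: decorate with the key, sort on it, undecorate.
  def stsort(&block)
    zipf(&block).sort { |a, b| a[1] <=> b[1] }.map { |pair| pair[0] }
  end
end

words = %w[pear fig banana]
p words.zipf { |w| w.length }    # [["pear", 4], ["fig", 3], ["banana", 6]]
p words.stsort { |w| w.length }  # ["fig", "pear", "banana"]
```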

martin

“Yukihiro Matsumoto” matz@ruby-lang.org wrote in message
news:1037624440.355792.19569.nullmailer@picachu.netlab.jp…

Hi,

[ruby-dev:18651] Enumerable#zip (contd.)

Applying zip as an instance method of the module Enumerable makes the method
asymmetric. There is the object to the left of the method name, and then its
parameters. But conceptually the operation is symmetrical, except that the
order is significant. Where in the sequence should the object be placed?

If the longest sequence is chosen, what value should stand in for the
absent items?

nil

But this will raise a TypeError later if the sequences have incompatible
types or lengths. It would hide the real error, and it goes against “(on zip)
We should not raise exceptions when the lengths of the components differ”.

···

In message “Re: ruby-dev summary 18711-18810” on 02/11/18, “Aleksei Guzev” aleksei.guzev@bigfoot.com writes:

If no exception is raised, how can one know whether the sequences
have different lengths?

By explicit check, if you really want to know.

matz.

Hi,

Do you mean that map_with_index has been rejected on its own, or just
as a zip implementation?

map_with_index was rejected just because no one proposed its typical
usage. YANGI principle applied.

matz.

I don’t want to be the one to ask, but what’s YANGI?

Gavin

···

From: “Yukihiro Matsumoto” matz@ruby-lang.org

Hi –

···

On Mon, 18 Nov 2002, Yukihiro Matsumoto wrote:

Hi,

In message “Re: ruby-dev summary 18711-18810” on 02/11/18, dblack@candle.superlink.net dblack@candle.superlink.net writes:

Do you mean that map_with_index has been rejected on its own, or just
as a zip implementation?

map_with_index was rejected just because no one proposed its typical
usage. YANGI principle applied.

Hmmmm… I’ll try to think of some examples. It certainly seems to
me to be at least as useful as each_with_index, potentially (i.e., if
it existed :-)

David



Hi –

On “zip”:

[very interesting comments snipped]

One of the reasons I don’t like using Python’s “zip” is that it actually
zips up all of the input sequences into a physical result sequence (similar
to their “range” function which is semantically like Ruby’s “m...n” but
returns a physical sequence) – it’s always bothered me to have to create a
possibly large sequence to do that operation when it isn’t necessary.
Hopefully, if “zip” becomes part of Ruby, the implementation could be as an
iterator which never builds a physical version of the zipped sequence. But
that is an implementation challenge if the input sequences are general
instances of Enumerable, since the input sequences cannot be efficiently
iterated in parallel. (If the input Enumerables were limited to those which
are numerically indexed, such as Arrays, implementation would be easy – but
it would be unfortunate to have to make that restriction.)

I’m not sure the numerical indexing is such a bad restriction.
There’s always to_a and/or to_ary… Besides, perhaps very wrongly,
I always think of enumerables as, by definition, ordered sequences,
which can always be numerically indexed. (Which is why I don’t think
that hashes should be Enumerables, but that’s another story :-)

Anyway, my personal feeling about “zip” is that I would likely use it only
if it could efficiently iterate the zipped elements of the input sequences
without creating a full physical output sequence. Otherwise, I don’t think
it’s especially useful.

Just to clarify: does that include a sequence of (I think) pointers?
I’m thinking along the lines of the component objects themselves not
being duplicated… but I don’t know whether that’s relevant to your
point.

David

···

On Tue, 19 Nov 2002, Bob Alexander wrote:



External iterators are useful for more than just parallel iteration; they
are also useful for complex iteration (e.g. moving to the next element only
when a particular event occurs). In C++ I might write:

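#include <iostream>
#include <vector>

int main() {
  int a[] = { 1, 2, 3, 4, 5 };
  std::vector<int> v(a, a + 5);
  std::vector<int>::iterator it = v.begin();
  while(it != v.end()) {
    std::cout << *it << std::endl;
    if(*it == 3) {
      it = v.erase(it);  // erase takes the iterator and returns the next one
    } else {
      ++it;
    }
  }
}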

which produces:

1
2
3
4
5

but my naive solution in Ruby doesn’t work:

a = [1, 2, 3, 4, 5]
a.each_with_index do |x, idx|
  puts x
  if x == 3 then
    a.delete_at(idx)
  end
end

(it skips the 4).

Or is there a better way to write this that I am missing?
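
Two possible answers, sketched with internal iterators only and assuming present-day Ruby behaviour. The first lets delete_if do the removal, so every element is still visited; the second keeps an explicit index and mirrors the C++ loop by advancing only when nothing was deleted.

```ruby
# 1. Let the iterator handle deletion: the block sees every element.
a    = [1, 2, 3, 4, 5]
seen = []
a.delete_if do |x|
  seen << x
  x == 3        # true means "remove this element"
end
p seen  # [1, 2, 3, 4, 5]
p a     # [1, 2, 4, 5]

# 2. Explicit index, advancing only when nothing was removed.
b = [1, 2, 3, 4, 5]
i = 0
while i < b.size
  puts b[i]
  if b[i] == 3
    b.delete_at(i)  # the next element slides into slot i
  else
    i += 1
  end
end
```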

Paul

···

On Tue, Nov 19, 2002 at 08:48:01AM +0900, Yukihiro Matsumoto wrote:

Internal iterator is far simpler, and as powerful as external
iterator, except for parallel iteration. I won’t buy complexity for
rarely used parallel iteration.

“Bob Alexander” bobalex@attbi.com wrote in message
news:007701c28f24$64c657b0$1c72ea0c@C322162A…

That’s right. Again the debate shows how versatile Ruby is: it lets you
implement a “zipping operation” exactly as you need it. By writing the code
yourself you have full control over what the method does.

Do you know implementations like (from Ruby CVS)
rough/lib/generator.rb? Why do you think these are inefficient?

Regards,
Pit

···

On 19 Nov 2002 at 2:00, Bob Alexander wrote:

(…)
Hopefully, if “zip” becomes part of
Ruby, the implementation could be as an iterator which never builds a
physical version of the zipped sequence. But that is an implementation
challenge if the input sequences are general instances of Enumerable,
since the input sequences cannot be efficiently iterated in parallel.

: From: “Yukihiro Matsumoto” matz@ruby-lang.org
:
: > map_with_index was rejected just because no one
: > proposed its typical usage. YANGI principle applied.
:
: I don’t want to be the one to ask, but what’s YANGI?

Perhaps it might be either

    * YNGNI = You're Not Gonna Need It

or

    * YAGNI = You Aren't Gonna Need It
···

On Mon, 19 Nov 2002 00:51 +1100, Gavin Sinclair wrote:


SugHimsi

I’d go so far as to say that map_with_index
does for map exactly what each_with_index
does for each.
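
To make the analogy concrete, here is a sketch. map_with_index is defined below only for illustration (it is the hypothetical method under discussion, not a core method); in present-day Ruby the same result is available by chaining each_with_index and map.

```ruby
module Enumerable
  # Hypothetical map_with_index: like map, but the block also gets the index.
  def map_with_index
    result = []
    each_with_index { |x, i| result << yield(x, i) }
    result
  end
end

words    = %w[a b c]
labelled = words.map_with_index { |w, i| "#{i}:#{w}" }
chained  = words.each_with_index.map { |w, i| "#{i}:#{w}" }

p labelled  # ["0:a", "1:b", "2:c"]
p chained   # ["0:a", "1:b", "2:c"]
```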

I do find it useful.

It’s used in scanf, David, isn’t it? :-)

Hal

···

----- Original Message -----
From: dblack@candle.superlink.net
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Monday, November 18, 2002 9:19 AM
Subject: Re: ruby-dev summary 18711-18810

Hi –

On Mon, 18 Nov 2002, Yukihiro Matsumoto wrote:

Hi,

In message “Re: ruby-dev summary 18711-18810” on 02/11/18, dblack@candle.superlink.net dblack@candle.superlink.net writes:

Do you mean that map_with_index has been rejected on its own, or just
as a zip implementation?

map_with_index was rejected just because no one proposed its typical
usage. YANGI principle applied.

Hmmmm… I’ll try to think of some examples. It certainly seems to
me to be at least as useful as each_with_index, potentially (i.e., if
it existed :-)