[RCR] More enumerator functionality

Hi everybody,

I have finally posted my RCR for more enumerator
functionality.

Find more about it here:
http://rcrchive.net/rcr/RCR/RCR262

Kristof Bastiaensen

Can you give some more examples, where and how enum_if could be useful?

Regards,

  Michael

···

On Fri, Jun 18, 2004 at 12:03:44AM +0900, Kristof Bastiaensen wrote:

Hi everybody,

I have finally posted my RCR for more enumerator
functionality.

Find more about it here:
http://rcrchive.net/rcr/RCR/RCR262

Hi,

···

In message "[RCR] More enumerator functionality" on 04/06/18, Kristof Bastiaensen <kristof@vleeuwen.org> writes:

I have finally posted my RCR for more enumerator
functionality.

Find more about it here:
http://rcrchive.net/rcr/RCR/RCR262

Hmm, how about making Enumerator#select etc. (the methods that return
an array in Enumerable) return a new filtered enumerator? That removes
the need for enum_if, and

  huge_data.to_enum.delete_if {|x| ...}

works like a charm?

              matz.

Well, enum_if is useful in at least two cases (as I wrote in
the analysis). It could serve as a replacement for
Enumerable#select that may be more efficient in some cases,
such as when dealing with a large amount of data.
For example:
  large_dataset.enum_if { |d| sometest(d) }.collect do |d|
    <data manipulation>
  end

It could also be useful to have a specialized enumerator
that will reflect any changes made to the original object:

data = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
ints = data.enum_if { |i| i.is_a? Numeric }
ints.to_a
=> [6, 9, -19, -19]

data.concat(["Bear", 20, 3])  # mutate in place; "data += ..." would bind data to a new array
ints.to_a
=> [6, 9, -19, -19, 20, 3]

(I have added these examples to the RCR)
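
The behaviour described above can be approximated with a plain Enumerator. This is only a minimal sketch under assumptions: the method body below is hypothetical and is not the implementation from the RCR. It shows the filter block running on every traversal, so in-place changes to the source become visible.

```ruby
# Hypothetical sketch of the proposed enum_if, built on Enumerator.new.
# Not the RCR's actual implementation.
module Enumerable
  def enum_if(&test)
    source = self
    Enumerator.new do |yielder|
      # The test runs each time the enumerator is traversed.
      source.each { |item| yielder << item if test.call(item) }
    end
  end
end

data = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
ints = data.enum_if { |i| i.is_a?(Numeric) }
first_pass = ints.to_a

data.concat(["Bear", 20, 3])   # in-place mutation of the same array
second_pass = ints.to_a
```

Only in-place changes (concat, <<, push) are seen by the enumerator. An assignment such as data += [...] builds a new array and leaves the enumerator attached to the old one.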

Cheers,
Kristof

···

On Fri, 18 Jun 2004 01:54:19 +0900, Michael Neumann wrote:

On Fri, Jun 18, 2004 at 12:03:44AM +0900, Kristof Bastiaensen wrote:

Hi everybody,

I have finally posted my RCR for more enumerator
functionality.

Find more about it here:
http://rcrchive.net/rcr/RCR/RCR262

Can you give some more examples, where and how enum_if could be useful?

Regards,

  Michael

Hi,

At Fri, 18 Jun 2004 21:55:21 +0900,
Yukihiro Matsumoto wrote in [ruby-talk:104047]:

Hmm, how about making Enumerator#select etc. (the methods that return
an array in Enumerable) return a new filtered enumerator? That removes
the need for enum_if, and

  huge_data.to_enum.delete_if {|x| ...}

Then Enumerator#select actually is equivalent to enum_if?

I'm afraid that it might cause confusion: it would return an Enumerator,
whereas Enumerable#select returns an Array.
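
For comparison, here is how the return types look in Ruby 2.0 and later, where laziness is opt-in through Enumerable#lazy. This postdates the thread and is shown only to illustrate the Array-versus-enumerator distinction being discussed; it is not what RCR262 proposed.

```ruby
numbers = [1, 2, 3, 4, 5, 6]

eager          = numbers.select(&:even?)          # plain Enumerable#select
via_enumerator = numbers.to_enum.select(&:even?)  # select on an Enumerator
deferred       = numbers.lazy.select(&:even?)     # select on a lazy enumerator

eager_class          = eager.class
via_enumerator_class = via_enumerator.class
deferred_class       = deferred.class

# Nothing is filtered until the lazy enumerator is forced.
forced = deferred.to_a
```

With this arrangement select on an ordinary Enumerator still returns an Array, and only the explicitly lazy object returns another enumerator.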

···

--
Nobu Nakada

>> Hi everybody,
>>
>> I have finally posted my RCR for more enumerator
>> functionality.
>>
>> Find more about it here:
>> http://rcrchive.net/rcr/RCR/RCR262
>
> Can you give some more examples, where and how enum_if could be useful?
>
> Regards,
>
> Michael

Well, enum_if is useful in at least two cases (as I wrote in
the analysis). It could serve as a replacement for
Enumerable#select that may be more efficient in some cases,
such as when dealing with a large amount of data.
For example:
  large_dataset.enum_if { |d| sometest(d) }.collect do |d|
    <data manipulation>
  end

It could also be useful to have a specialized enumerator
that will reflect any changes made to the original object:

Your first enum_if example is a bit confusing:

  (0..4).enum_if { |i| i[0] == 0 }.to_a
  => [0, 2, 4]

It took me a while to recognize that i[0] means the value of the
lowest bit. For clarity, could you either add a comment or
use (i & 0b1) instead of i[0]? Or (i % 2) == 0.
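
To make the equivalence concrete, the spellings mentioned above can be compared with the ordinary Enumerable#select. enum_if itself is only a proposal, so it is not used here, and the variable names are just for illustration.

```ruby
range = (0..4)

by_bit_reference = range.select { |i| i[0] == 0 }        # Integer#[] reads bit 0
by_mask          = range.select { |i| (i & 0b1) == 0 }   # mask off the lowest bit
by_modulo        = range.select { |i| (i % 2) == 0 }     # remainder after dividing by 2
by_predicate     = range.select(&:even?)                 # built-in predicate
```

All four select the even numbers; the last two probably state the intent most directly.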

data = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
ints = data.enum_if { |i| i.is_a? Numeric }
ints.to_a
=> [6, 9, -19, -19]

data.concat(["Bear", 20, 3])  # mutate in place; "data += ..." would bind data to a new array
ints.to_a
=> [6, 9, -19, -19, 20, 3]

(I have added these examples to the RCR)

Aha, then it's something like a "lazy" enumerable, right?
Have a look at my code I wrote some weeks ago:

  require 'generator'
  module Enumerable
    def select_lazy(&block)
      Generator.new {|c| self.each { |elem| c.yield(elem) if block.call(elem) } }
    end

    def collect_lazy(&block)
      Generator.new {|c| self.each { |elem| c.yield(block.call(elem)) } }
    end

    alias map_lazy collect_lazy

    def lazy
      ChainingGenerator.new(self)
    end

    def no_lazy
      to_a
    end
  end

  class ChainingGenerator < Generator
    def select(&block)
      self.class.new {|c| self.each { |elem| c.yield(elem) if block.call(elem) } }
    end

    def collect(&block)
      self.class.new {|c| self.each { |elem| c.yield(block.call(elem)) } }
    end

    alias map collect

    # TODO: implement others
  end

  [1,2,3,4].lazy.map{|i| i + 1}.to_a # => [2,3,4,5]

  ["a", 6, 9, "foo", -19, "fuga", -19, "bach"].lazy.select{|i| i.is_a? Numeric}.to_a

That's a more general form: after the "lazy", none of the Enumerable
operations creates an intermediate array. Of course, it's very slow
compared to the non-lazy methods (Generator uses continuations).
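
As a point of comparison, Ruby 2.0 and later ship a built-in Enumerable#lazy with the same chaining idea and no dependency on the generator library (which, as far as I know, is no longer part of the standard library). The sketch below uses only that built-in. It is not the ChainingGenerator code from this message, and the variable names are made up for illustration.

```ruby
# Same example as above, using the built-in lazy enumerator.
incremented = [1, 2, 3, 4].lazy.map { |i| i + 1 }.to_a

mixed = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
numerics = mixed.lazy.select { |i| i.is_a?(Numeric) }.to_a

# An endless source only works if no intermediate array is built:
# each element flows through select and map one at a time.
squares_of_evens = (1..Float::INFINITY).lazy
                                       .select(&:even?)
                                       .map { |i| i * i }
                                       .first(3)
```

The last chain would never finish with the eager select and map, which is a quick way to check that a chain really is lazy.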

How performant is enum_if?

Could I write for example:

  [1,2,3].enum_if{ cond }.map {|i| i + 1}

Regards,

  Michael

···

On Fri, Jun 18, 2004 at 08:58:33PM +0900, Kristof Bastiaensen wrote:

On Fri, 18 Jun 2004 01:54:19 +0900, Michael Neumann wrote:
> On Fri, Jun 18, 2004 at 12:03:44AM +0900, Kristof Bastiaensen wrote:

Your first enum_if example is a bit confusing:

  (0..4).enum_if { |i| i[0] == 0 }.to_a
  => [0, 2, 4]

It took me a while to recognize that i[0] means the value of the
lowest bit. For clarity, could you either add a comment or
use (i & 0b1) instead of i[0]? Or (i % 2) == 0.

You have a good point. I changed it.

data = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
ints = data.enum_if { |i| i.is_a? Numeric }
ints.to_a
=> [6, 9, -19, -19]

data.concat(["Bear", 20, 3])  # mutate in place; "data += ..." would bind data to a new array
ints.to_a
=> [6, 9, -19, -19, 20, 3]

(I have added these examples to the RCR)

Aha, then it's something like a "lazy" enumerable, right?

Yes, that's right. The given block will only be executed at the time of
yielding the corresponding value.
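
A small way to observe this deferred execution, sketched with a plain Enumerator rather than the proposed enum_if (the counter and variable names are only for illustration):

```ruby
calls = 0
data = [1, 2, 3, 4]

evens = Enumerator.new do |yielder|
  data.each do |i|
    calls += 1                 # count how often the test is evaluated
    yielder << i if i.even?
  end
end

calls_after_creation = calls   # the test has not run yet

first_even = evens.first       # traversal stops at the first match
calls_after_first = calls

all_evens = evens.to_a         # a full traversal runs the test on every element
calls_after_to_a = calls
```

Creating the enumerator costs nothing; the test is evaluated only while something consumes the values, and only as far as the consumer reads.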

Have a look at my code I wrote some weeks ago:

  require 'generator'
  module Enumerable
    def select_lazy(&block)
      Generator.new {|c| self.each { |elem| c.yield(elem) if block.call(elem) } }
    end

    def collect_lazy(&block)
      Generator.new {|c| self.each { |elem| c.yield(block.call(elem)) } }
    end

    alias map_lazy collect_lazy

    def lazy
      ChainingGenerator.new(self)
    end

    def no_lazy
      to_a
    end
  end

  class ChainingGenerator < Generator
    def select(&block)
      self.class.new {|c| self.each { |elem| c.yield(elem) if block.call(elem) } }
    end

    def collect(&block)
      self.class.new {|c| self.each { |elem| c.yield(block.call(elem)) } }
    end

    alias map collect

    # TODO: implement others
  end

  [1,2,3,4].lazy.map{|i| i + 1}.to_a # => [2,3,4,5]

  ["a", 6, 9, "foo", -19, "fuga", -19, "bach"].lazy.select{|i| i.is_a? Numeric}.to_a

That's a more general form: after the "lazy", none of the Enumerable
operations creates an intermediate array. Of course, it's very slow
compared to the non-lazy methods (Generator uses continuations).

That's interesting.
If I am correct, your collect_lazy and lazy.collect behave the same as
enum_for with a block, and your select_lazy and lazy.select behave the
same as enum_if.

How performant is enum_if?

It should be quite fast, since all it does is pass each yield through the
block.
Even more so, since Nobu Nakada was kind enough to provide an
implementation in C. :)

Could I write for example:

  [1,2,3].enum_if{ cond }.map {|i| i + 1}

Yes, exactly. That was also the kind of thing I had in mind.

Cheers,
Kristof

···

On Fri, 18 Jun 2004 21:28:45 +0900, Michael Neumann wrote:

I don't know what enum_for is doing, but select_lazy is the same as
enum_if, only its implementation is very different.

Regards,

  Michael

···

On Fri, Jun 18, 2004 at 10:13:25PM +0900, Kristof Bastiaensen wrote:

On Fri, 18 Jun 2004 21:28:45 +0900, Michael Neumann wrote:

[...]

>
> That's a more general form, as after the "lazy", all Enumerable
> operations do not create intermediate arrays. Of course, it's very slow
> compared to the non-lazy methods (Generator uses continuations).
>
>
That's interesting.
If I am correct, your collect_lazy and lazy.collect behave the same as
enum_for with a block, and your select_lazy and lazy.select behave the
same as enum_if.

Currently enum_for (and its alias to_enum) creates an enumerable
that uses a different method than each. For example:

  require "enumerator"
  str = "xyz"

  enum = str.enum_for(:each_byte)
  a = enum.map {|b| '%02x' % b } #=> ["78", "79", "7a"]

My proposal is that enum_for can take a block, so it can
do a custom transformation on the data:

  data = [2, 3, 6]
  powers = data.enum_for { |i| i * i } #"each" implied
  powers.to_a
  => [4, 9, 36]

  data << 7
  powers.to_a
  => [4, 9, 36, 49]

If I am not mistaken this is the same as your collect_lazy.
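
The proposed block form can be sketched with a plain Enumerator. The helper name enum_map below is hypothetical, chosen so as not to redefine the real enum_for; in current Ruby versions a block given to enum_for or to_enum is used to compute the enumerator's size rather than to transform elements, so this is an illustration of the proposal and not of existing behaviour.

```ruby
# Hypothetical helper illustrating "enum_for with a block" as proposed here.
module Enumerable
  def enum_map(&transform)
    source = self
    Enumerator.new do |yielder|
      # The transformation is applied during traversal, not up front.
      source.each { |item| yielder << transform.call(item) }
    end
  end
end

data = [2, 3, 6]
powers = data.enum_map { |i| i * i }
before_append = powers.to_a

data << 7                      # in-place change to the source
after_append = powers.to_a
```

Because the transformation runs on each traversal, the appended 7 shows up as 49 the second time.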

Regards,
Kristof

···

On Sat, 19 Jun 2004 19:28:28 +0900, Michael Neumann wrote:

On Fri, Jun 18, 2004 at 10:13:25PM +0900, Kristof Bastiaensen wrote:

On Fri, 18 Jun 2004 21:28:45 +0900, Michael Neumann wrote:

[...]

>
> That's a more general form, as after the "lazy", all Enumerable
> operations do not create intermediate arrays. Of course, it's very slow
> compared to the non-lazy methods (Generator uses continuations).
>
>
That's interesting.
If I am correct, your collect_lazy and lazy.collect behave the same as
enum_for with a block, and your select_lazy and lazy.select behave the
same as enum_if.

I don't know what enum_for is doing, but select_lazy is the same as
enum_if, only its implementation is very different.