Enumerator Restrictions

Hi there,

I was wondering whether there is an elegant way to 'restrict' an Enumerator,
for example with a method like 'take' that returns another Enumerator instead of an Array.

The point is to avoid instantiating a big, costly Array.

Example:

enum = Enumerator.new do |yielder|
  n = 0
  loop do
    yielder << n
    n += 1
  end
end

# Take creates a new Array
enum.take(10) #=> [0,1,2,3,4,5,6,7,8,9]

# What it would be like with an enumerator restriction:
new_enum = enum.enum_take(10) #=> New Enumerator that wraps the old one
new_enum.to_a #=> [0,1,2,3,4,5,6,7,8,9]

# Possible implementation for enum_take
class Enumerator
  def enum_take(ceiling)
    Enumerator.new do |yielder|
      index = 0
      loop do
        break if index > ceiling
        yielder << self.next
        index += 1
      end
    end
  end
end

I made a little gist on GitHub if anyone is interested:
https://gist.github.com/Haniyya/f1541e7636e7d25ca2c1928a0b2df710

Thank you,
Paul

I was wondering if there was a way to 'restrict' an Enumerator in an elegant
way.
Like a method like 'take' that returns another Enumerator instead of an
Array.

The Point is not to instantiate a big, costly Array.

So, if I understood you correctly, you want to call .take and
.take_while on an enumerator and get another enumerator as the result,
instead of an array, right?

I tried some examples in IRB using the Numeric#step method which
creates an enumerator. So, you just need to call .lazy on that, which
returns a lazy enumerator.

  >> 0.step
  => #<Enumerator: 0:step>

  >> 0.step.take(5)
  => [0, 1, 2, 3, 4]

  >> 0.step.take_while {|x| x < 5 }
  => [0, 1, 2, 3, 4]

  >> 0.step.lazy.take(5)
  => #<Enumerator::Lazy: #<Enumerator::Lazy: #<Enumerator: 0:step>>:take(5)>

  >> 0.step.lazy.take_while {|x| x < 5 }
  => #<Enumerator::Lazy: #<Enumerator::Lazy: #<Enumerator: 0:step>>:take_while>

  >> 0.step.lazy.take(5).to_a
  => [0, 1, 2, 3, 4]

  >> 0.step.lazy.take_while {|x| x < 5 }.to_a
  => [0, 1, 2, 3, 4]
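
The same approach should carry over to the hand-written enumerator from the original post. A minimal sketch, assuming a Ruby version that has Enumerator::Lazy (2.0 or later); the variable names restricted, first_ten and even_ones are only for illustration:

```ruby
# The infinite enumerator from the original post.
enum = Enumerator.new do |yielder|
  n = 0
  loop do
    yielder << n
    n += 1
  end
end

# lazy.take wraps the enumerator; no Array is built at this point.
restricted = enum.lazy.take(10)
restricted.class #=> Enumerator::Lazy

# Elements are only produced once a result is forced, e.g. with to_a.
first_ten = enum.lazy.take(10).to_a
#=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Further lazy steps can be chained before anything is computed.
even_ones = enum.lazy.take(10).select(&:even?).to_a
#=> [0, 2, 4, 6, 8]
```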

···

On Tue, Sep 20, 2016 at 10:29 AM, Paul Martensen <paul.martensen@gmx.de> wrote:

# Possible implementation for enum_take
class Enumerator
  def enum_take(ceiling)
    Enumerator.new do |yielder|
      index = 0
      loop do
        break if index > ceiling

I think this should be

break if index >= ceiling

Typical off-by-one error. :-)

        yielder << self.next
        index += 1
      end
    end
  end
end

How do you make sure this works if there are fewer than ceiling items?
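
For what it's worth, here is a small sketch of what seems to happen with the next-based approach when the source is shorter than the ceiling. It uses a hypothetical standalone helper, take_via_next, instead of reopening Enumerator, and already contains the >= fix. Kernel#loop rescues the StopIteration raised by next, so the short case ends quietly, but next also moves the external position of the wrapped enumerator, which is shared between calls:

```ruby
# Hypothetical helper mirroring the next-based enum_take (with the >= fix).
def take_via_next(enum, ceiling)
  Enumerator.new do |yielder|
    index = 0
    loop do
      break if index >= ceiling
      yielder << enum.next # raises StopIteration once enum is exhausted
      index += 1
    end
  end
end

short = [1, 2, 3].each

# Fewer items than the ceiling: loop rescues StopIteration and stops.
first_pass = take_via_next(short, 5).to_a  #=> [1, 2, 3]

# The external position of short is still at the end, so nothing is left.
second_pass = take_via_next(short, 5).to_a #=> []

# After an explicit rewind the elements are available again.
short.rewind
third_pass = take_via_next(short, 5).to_a  #=> [1, 2, 3]
```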

I made a little gist on github if anyone is interested:
https://gist.github.com/Haniyya/f1541e7636e7d25ca2c1928a0b2df710
Thank you,

I'd do it differently: the explicit Enumerator is not needed, and the
method would be better placed in Enumerable:

module Enumerable
  def enum_take(ceiling)
    return to_enum(:enum_take, ceiling) unless block_given?

    each_with_index do |x, i|
      return self if i >= ceiling
      yield x
    end

    self
  end
end

Test:

irb(main):039:0> 10.times.to_a.enum_take(2){|x| p x}
0
1
=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
irb(main):040:0> 10.times.to_a.enum_take(2).to_a
=> [0, 1]
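
To check the on-demand behaviour, here is a small sketch that applies the Enumerable#enum_take shown above to an infinite Enumerator and records what the source actually produces. The names produced, source and limited are made up for illustration:

```ruby
# The Enumerable#enum_take suggested above, repeated so the sketch runs alone.
module Enumerable
  def enum_take(ceiling)
    return to_enum(:enum_take, ceiling) unless block_given?

    each_with_index do |x, i|
      return self if i >= ceiling
      yield x
    end

    self
  end
end

produced = [] # records every element the source generates

source = Enumerator.new do |yielder|
  n = 0
  loop do
    produced << n
    yielder << n
    n += 1
  end
end

# Without a block only an Enumerator is created; nothing is generated yet.
limited = source.enum_take(3)
produced_before = produced.dup #=> []

# Forcing the result iterates the source only as far as needed.
result = limited.to_a      #=> [0, 1, 2]
produced_after = produced.dup #=> [0, 1, 2, 3]
```

Note that the source is asked for one element beyond the ceiling, because the index check only runs once the next element has arrived in the block.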

Cheers

robert

···

On Tue, Sep 20, 2016 at 3:29 PM, Paul Martensen <paul.martensen@gmx.de> wrote:

--
[guy, jim, charlie].each {|him| remember.him do |as, often| as.you_can
- without end}
http://blog.rubybestpractices.com/

Hi

Why not return only the sliced part at the end, instead of self?

...
  all = []
  each_with_index do |x, i|
    if i < ceiling
      yield x
      all.push x
    end
  end
  return all.to_enum
...

Can this be optimized (without the (big) all array)?

Cheers Berg

That's a really elegant solution, thank you.

Paul

···

On 21.09.2016 08:12, Robert Klemme wrote:


why not return only the sliced part at the end, instead of self ?

Your suggestion unnecessarily materializes the slice. The whole point
of the suggested #enum_take was to have an iterator that will only
iterate n steps. This can have a dramatic impact if calculating
individual entries takes extraordinary CPU or memory resources.

...
  all = []
  each_with_index do |x, i|
    if i < ceiling
      yield x
      all.push x
    end
  end
  return all.to_enum
...

all.to_enum is totally superfluous since all is an Array, which
already has all the Enumerable methods and also allows indexed access.

Can this be optimized (without ((big)) all)?

Use my suggestion. It will also create an Enumerator if no block is
provided. And it will only iterate elements on demand. You can
actually save two lines, since #each_with_index will return self
already:

module Enumerable
  def enum_take(ceiling)
    return to_enum(:enum_take, ceiling) unless block_given?

    each_with_index do |x, i|
      return self if i >= ceiling
      yield x
    end
  end
end

Cheers

robert

···

On Wed, Sep 21, 2016 at 10:06 AM, A Berger <aberger7890@gmail.com> wrote:


Hi
I should have written:
why not return only the WANTED part, instead of ALL (self)?

The result differs: returning self seems to RETURN all elements (_ignoring_ the
limit/ceiling); that's what I meant.

cheers Berg

Hi
I should have written
why not return only the WANTED part, instead of ALL(self) ?

That is exactly what my enum_take does when called without block.

The result differs - self seems to RETURN all (_ignoring_ the
limit/ceiling); that's what i meant.

Conventionally iteration methods (each, each_with_index etc.) return self.
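
A quick illustration of that convention with core methods only (the variable names are just for the example):

```ruby
list = [10, 20, 30]

# With a block, each and each_with_index hand back the receiver itself,
# not a new collection built from the block's results.
from_each = list.each { |x| x * 2 }
from_each_with_index = list.each_with_index { |x, i| [x, i] }

from_each.equal?(list)            #=> true
from_each_with_index.equal?(list) #=> true
```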

Cheers

robert

···

On Wed, Sep 21, 2016 at 8:14 PM, A Berger <aberger7890@gmail.com> wrote:


Hi
Wouldn't it be more useful (and expected) to also return the
wanted/selected/sliced part when called with a block?

Cheers
Berg

The return value of a method used for iteration is rarely used, so this
is less important. You would create at least one unnecessary instance
on every one of these iterations. I think self is pretty OK since you can
still get the slice. If you need the slice in an Array you can easily
do something.enum_take(3).to_a.

Cheers

robert

···

On Thu, Sep 22, 2016 at 9:30 AM, A Berger <aberger7890@gmail.com> wrote:
