A different Version of Enumerable#inject

Apparently the other post got overlooked - and it had an error in the
implementation. What do others think of this suggestion of a changed
Enumerable#inject that uses a sliding window whose size is determined by
the block’s arity?

With this implementation you can do the usual injection tricks plus
everything that needs more than one argument:

ordered?

enum.inject(true) {|c,a,b|c&&a<b}

differentiation

enum.inject([]) {|m,x,y|m<<(y-x).to_f/2}

integration

enum.inject([]) {|m,x,y|m<<(x+y).to_f/2}

weighted integration

enum.inject([]) {|m,x,y,z|m<<(x+y+y+z).to_f/4}

What do you think? Is this a reasonable replacement for inject or an
additional method?

Cheers

robert

module Enumerable
  def inject(val=nil, &bl)
    raise "Block missing" unless bl

    size = bl.arity
    raise "Need block with at least 1 argument" if size == 0

    # for a fixed-arity block: window size = block arguments after the carry
    size = size < 0 ? 1+size : size-1

    if size == 0
      # 1 arg: only the carry value is yielded
      each do |e|
        val = yield val
      end
    else
      # >1 args: carry value plus a sliding window of elements
      args=[]

      each do |e|
        args.push e

        if args.length == size
          val = yield val, *args
          args.shift
        end
      end
    end

    val
  end
end
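
For readers on a current Ruby, the sliding-window fold can also be sketched without redefining inject. This is only an illustration, not the implementation posted above: `sliding_inject` is a hypothetical helper name, and the sketch assumes Enumerable#each_cons is available and that the block has a fixed arity of at least two (the carry plus one element).

```ruby
# Hypothetical helper, not part of Ruby's standard library.
# The first block argument is the carry; the remaining ones form the window.
def sliding_inject(enum, init, &block)
  raise ArgumentError, "block missing" unless block

  width = block.arity - 1
  raise ArgumentError, "block needs a carry plus at least one element" if width < 1

  # each_cons yields every run of `width` consecutive elements
  enum.each_cons(width).inject(init) { |carry, window| block.call(carry, *window) }
end

sliding_inject([1, 2, 3, 5], true) { |c, a, b| c && a < b }          # => true
sliding_inject([1, 2, 4, 7], []) { |m, x, y| m << (x + y).to_f / 2 } # => [1.5, 3.0, 5.5]
```

Using a separate method name leaves the built-in inject untouched, which is one way to try the idea without affecting other code.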

Apparently the other post got overlooked - and it had an error in the
implementation. What do others think of this suggestion of a changed
Enumerable#inject that uses a sliding window whose size is determined by
the block’s arity.

I tend to dislike #inject cause it’s almost always more complex than
using #each + the needed logic (and slower too), as it’s providing the
wrong sort of abstraction in many cases (gets further away from the
problem domain, not closer, IMHO). To put it another way, #inject
represents a movement orthogonal to the “complexity line of force”.

However, your extended #inject seems OK to me, because of the sliding
window concept. Now that is something useful in some cases, and using
your #inject means I don’t have to propagate values.

I would like it to become common, as it’s probably the only kind of
#inject I would like to use :-)

integration

enum.inject([]) {|m,x,y|m<<(x+y).to_f/2}

Is this really integrating? It won’t give the same results as
m = []
enum.each{ |x| m << (m[-1]||0) + x }

will it?

If I understand what your #inject does, this is only averaging pair-wise.
(too lazy to read your code now :-)
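
To make the difference concrete, here is a small sketch comparing the two computations on the same data. The method names are made up for illustration, and each_cons is assumed to be available (current Ruby).

```ruby
# Running sum: each output is the total of all elements seen so far.
def running_sum(values)
  m = []
  values.each { |x| m << (m[-1] || 0) + x }
  m
end

# Pairwise mean: each output is the average of two adjacent elements.
def pairwise_mean(values)
  values.each_cons(2).map { |x, y| (x + y).to_f / 2 }
end

running_sum([1, 2, 4, 7])   # => [1, 3, 7, 14]
pairwise_mean([1, 2, 4, 7]) # => [1.5, 3.0, 5.5]
```

The results differ in both length and values, so the two-element window is a smoothing step rather than an integral.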

weighted integration

enum.inject([]) {|m,x,y,z|m<<(x+y+y+z).to_f/4}

you mean a convolution with [0.25, 0.5, 0.25], no?

Convolution probably deserves a method on its own, but I don’t think
Enumerable#convolution is going to be really used anyway :-)
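
A dedicated method along those lines might look like the sketch below. `convolve` is a hypothetical name, not an existing Enumerable method, and the sketch only produces outputs where the kernel fully overlaps the data.

```ruby
# Hypothetical helper: sliding dot product of `values` with `kernel`.
# Strictly speaking this is a correlation; for a symmetric kernel such as
# [0.25, 0.5, 0.25] it gives the same result as a convolution.
def convolve(values, kernel)
  values.each_cons(kernel.size).map do |window|
    window.zip(kernel).inject(0.0) { |acc, (v, k)| acc + v * k }
  end
end

convolve([1, 2, 3, 4], [0.25, 0.5, 0.25]) # => [2.0, 3.0]
```

This matches the three-element inject example, since (x+y+y+z)/4 is the same weighting as 0.25, 0.5, 0.25.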


On Wed, Jul 23, 2003 at 06:03:08PM +0900, Robert Klemme wrote:


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

Turn right here. No! NO! The OTHER right!

I think you’ll have problems with this approach if you use it with
Enumerables that yield more than one element in each, such as Hashes.

Regards,
Pit
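
A quick way to see the issue, sketched with each_cons as a stand-in for the sliding window (an assumption; the proposed inject collects its window by hand): Hash#each hands over [key, value] pairs, so every window slot holds a two-element array rather than a single value.

```ruby
h = { a: 1, b: 2, c: 3 }

# Each window element is a [key, value] pair, not a plain value.
windows = h.each_cons(2).to_a
# => [[[:a, 1], [:b, 2]], [[:b, 2], [:c, 3]]]

# A block that expects plain elements has to unpack the pairs itself.
ascending = h.each_cons(2).all? { |(_, v1), (_, v2)| v1 < v2 }
# => true
```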


On 23 Jul 2003 at 18:03, Robert Klemme wrote:

What do others think of this suggestion of a changed
Enumerable#inject that uses a sliding window whose size is determined by
the block’s arity.

“Robert Klemme” bob.news@gmx.net wrote about:
[ inject with variable num args block ]

My first thought was that inject is complex enough as it is and more
features might prevent me from using it. My second thought was: Why
restrict this feature to inject? But I still see some problems, too.

I don’t use inject very often. (Reason: I always have to look up which
argument of the block was the result of last steps and which one was
the next entry of my enumerable. More arguments won’t help me there.
;-)

But I often need to do things between two steps of an iteration
(do:separatedBy: in Smalltalk) or do things with two adjacent entries
(pairsDo: in Smalltalk).

As you see my problem is method naming. I tend to expect differently
named methods for different tasks and not one method or block doing
different jobs depending on the number of arguments.

IMO the first argument of inject is meant to carry the information
from the steps already done (e. g. a computational result) into the
next iteration step. Each iteration step gets this information plus
the next entry of my collection. The carried information could contain
all the needed information about previous steps (e. g. a fifo-list
holding some entries). From this point of view inject does everything
I need. But it would be a lot of code to achieve what the new inject
could do in a single line.
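
As a rough illustration of that trade-off, here is one way the pairwise average could be written with the ordinary two-argument inject by carrying a small FIFO alongside the result. The method name is hypothetical and the sketch is just one of several possible formulations.

```ruby
# Pairwise mean using plain inject: the carry holds both the output array
# and a FIFO of the last two elements seen.
def pairwise_mean_with_inject(values)
  out, _fifo = values.inject([[], []]) do |(acc, fifo), x|
    fifo = (fifo + [x]).last(2)                      # keep at most two entries
    acc << (fifo[0] + fifo[1]).to_f / 2 if fifo.size == 2
    [acc, fifo]                                      # hand both on to the next step
  end
  out
end

pairwise_mean_with_inject([1, 2, 4, 7]) # => [1.5, 3.0, 5.5]
```

The bookkeeping for the FIFO is exactly the part the sliding-window variant would take over.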

My suggestion would be to add methods like eachTupel or injectTupel
instead of changing inject.

Cheers
Sascha

“Mauricio Fernández” batsman.geo@yahoo.com schrieb im Newsbeitrag
news:20030723093107.GA14554@student.ei.uni-stuttgart.de

Apparently the other post got overlooked - and it had an error in the
implementation. What do others think of this suggestion of a changed
Enumerable#inject that uses a sliding window whose size is determined by
the block’s arity.

I tend to dislike #inject cause it’s almost always more complex than
using #each + the needed logic (and slower too),

I could agree with the performance issue, although I’d like to test that.
But I don’t see the difference in complexity between:

m = []
enum.each{ |x| m << (m[-1]||0) + x }

and

m = enum.inject([]) {|m,x| m << (m[-1]||0) + x}

In fact, the formula is pretty much the same. :-)
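
The performance question could be checked with something like the following sketch, which uses Benchmark from the standard library. The method names and the input size are arbitrary choices for illustration, and no timings are claimed here since they depend on the Ruby version and machine.

```ruby
require 'benchmark'

# Running sum written with #each and an external array.
def running_sum_each(data)
  m = []
  data.each { |x| m << (m[-1] || 0) + x }
  m
end

# The same running sum written with #inject.
def running_sum_inject(data)
  data.inject([]) { |m, x| m << (m[-1] || 0) + x }
end

data = (1..100_000).to_a

t_each   = Benchmark.realtime { running_sum_each(data) }
t_inject = Benchmark.realtime { running_sum_inject(data) }

puts format("each: %.4fs  inject: %.4fs", t_each, t_inject)
```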

as it’s providing the
wrong sort of abstraction in many cases (gets further away from the
problem domain, not closer, IMHO).

Hm, that I don’t understand. IMHO this follows the spirit of Ruby since
it allows one to implement a lot of useful things very easily. Could you
elaborate or do you have any sources to back that up?

To put it another way, #inject
represents a movement orthogonal to the “complexity line of force”.

May the force be with you.

However, your extended #inject seems OK to me, because of the sliding
window concept. Now that is something useful in some cases, and using
your #inject means I don’t have to propagate values.

Yep.

I would like it to become common, as it’s probably the only kind of
#inject I would like to use :-)

:-)

integration

enum.inject([]) {|m,x,y|m<<(x+y).to_f/2}

Is this really integrating? It won’t give the same results as
m = []
enum.each{ |x| m << (m[-1]||0) + x }

will it?

You’re right. That was not the mathematically correct formulation. It’s
more like an averaging functionality. This should give the same results
as yours:

enum.inject([0]) {|m,x|m<<m[-1] + x}
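
A small check of that one-liner, as a sketch with made-up method names: seeding with [0] produces the same running sums, but with the seed kept as an extra leading element.

```ruby
# Running sum seeded with [0], as in the one-liner above.
def running_sum_seeded(values)
  values.inject([0]) { |m, x| m << m[-1] + x }
end

# Running sum with #each and the (m[-1]||0) fallback.
def running_sum_plain(values)
  m = []
  values.each { |x| m << (m[-1] || 0) + x }
  m
end

running_sum_seeded([1, 2, 4, 7])         # => [0, 1, 3, 7, 14]
running_sum_plain([1, 2, 4, 7])          # => [1, 3, 7, 14]
running_sum_seeded([1, 2, 4, 7]).drop(1) # => [1, 3, 7, 14]
```

So the two agree once the leading 0 is dropped.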

If I understand what your #inject does, this is only averaging
pair-wise.
(too lazy to read your code now :-)

Yep, true.

weighted integration

enum.inject([]) {|m,x,y,z|m<<(x+y+y+z).to_f/4}

you mean a convolution with [0.25, 0.5, 0.25], no?

If “convolution” is the right term. I’m not so used to English
statistical terms.

Convolution probably deserves a method on its own, but I don’t think
Enumerable#convolution is going to be really used anyway :-)

Possibly. Apart from that, Enumerable#convolution might be implemented
differently.

Thanks for checking

robert

On Wed, Jul 23, 2003 at 06:03:08PM +0900, Robert Klemme wrote:

“Pit Capitain” pit@capitain.de schrieb im Newsbeitrag
news:3F1EF07C.23852.16EFED25@localhost…


On 23 Jul 2003 at 18:03, Robert Klemme wrote:

What do others think of this suggestion of a changed
Enumerable#inject that uses a sliding window whose size is determined by
the block’s arity.

I think you’ll have problems with this approach if you use it with
Enumerables that yield more than one element in each, such as Hashes.

I know, but this is true for some of the other Enumerable methods, too.

Regards

robert

“Sascha Dördelmann” wsdng@onlinehome.de schrieb im Newsbeitrag
news:a39cf4fe.0307240041.dd541f8@posting.google.com

“Robert Klemme” bob.news@gmx.net wrote about:
[ inject with variable num args block ]

My first thought was that inject is complex enough as it is and more
features might prevent me from using it. My second thought was: Why
restrict this feature to inject?

Yeah, I’ve thought of this. In fact, map could “benefit” from this, too.

But I still see some problems, too.

I don’t use inject very often. (Reason: I always have to look up which
argument of the block was the result of last steps and which one was
the next entry of my enumerable. More arguments won’t help me there.
;-)

I think it would help you, because if you know that the number of block
arguments can differ, it’s quite natural that the first one is the one
that carries the value through the iteration while the rest are the enum
elements.

But I often need to do things between two steps of an iteration
(do:separatedBy: in Smalltalk) or do things with two adjacent entries
(pairsDo: in Smalltalk).

The proposed inject serves the latter. do:separatedBy: would need a
different implementation. Something like

module Enumerable
  def doSeparatedBy(blockEach, &blockIntermediate)
    first = true
    last = nil

    each do |e|
      if first
        first = false
      else
        blockIntermediate.call( last, e )
      end

      blockEach.call( e )
      last = e
    end
  end
end

irb(main):031:0> a=%w{a b c d e}
["a", "b", "c", "d", "e"]
irb(main):032:0> str=""
""
irb(main):033:0> a.doSeparatedBy( proc{|e|str<<e} ) {str<<"|"}
["a", "b", "c", "d", "e"]
irb(main):034:0> str
"a|b|c|d|e"
irb(main):035:0>
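
The same do:separatedBy: idea can be sketched as a standalone method that does not reopen Enumerable. `each_separated` is a hypothetical name, and the separator is passed as a callable here, which is just one possible design.

```ruby
# Hypothetical helper: call `between` before every element except the first,
# and yield each element to the block.
def each_separated(enum, between)
  enum.each_with_index do |e, i|
    between.call if i > 0
    yield e
  end
end

str = String.new
each_separated(%w[a b c d e], -> { str << "|" }) { |e| str << e }
str # => "a|b|c|d|e"
```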

As you see my problem is method naming. I tend to expect differently
named methods for different tasks and not one method or block doing
different jobs depending on the number of arguments.

IMHO my proposed inject does not change behavior depending on the number
of arguments. It always yields the carry value + n enumeration elements,
where n>=0. Of course you can change the initial value as well as the
number of block arguments to achieve such different things as counting,
mapping and the others I have shown. But that’s true even for
Enumerable#each. The difference lies in the block and not in the
implementation of inject.

IMO the first argument of inject is meant to carry the information
from the steps already done (e. g. a computational result) into the
next iteration step.

Exactly. That property of the method doesn’t change.

Each iteration step gets this information plus
the next entry of my collection.

… or the next entries.

The carried information could contain
all the needed information about previous steps (e. g. a fifo-list
holding some entries). From this point of view inject does everything
I need. But it would be a lot of code to achieve what the new inject
could do in a single line.

Now, is this pro or con? Sounds more like pro to me.

My suggestion would be to add methods like eachTupel or injectTupel
instead of changing inject.

Yeah, this might be better. But then again, inject as it stands today is
just a special case of the more general one, i.e., the sliding window has
size 1.
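
The special-case relationship can be illustrated with each_cons standing in for the sliding window (an assumption; the proposal builds its window by hand): a window of size 1 folds exactly like the ordinary inject.

```ruby
data = [1, 2, 3, 4]

# Ordinary inject: carry plus one element.
plain = data.inject(0) { |sum, x| sum + x }

# Window of size 1: each window is a one-element array, unpacked as (x).
windowed = data.each_cons(1).inject(0) { |sum, (x)| sum + x }

plain    # => 10
windowed # => 10
```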

Thanks for commenting.

robert

I tend to dislike #inject cause it’s almost always more complex than
using #each + the needed logic (and slower too),

I could agree with the performance issue, although I’d like to test that.
But I don’t see the difference in complexity between:

m = []
enum.each{ |x| m << (m[-1]||0) + x }

and

m = enum.inject([]) {|m,x| m << (m[-1]||0) + x}

The former is more readable for me. Possibly because I don’t use
inject often so it’s not natural for me. And then again, why use the
second if it’s going to be slower and (at least for me) it’s not
clearer?

In fact, the formula is pretty much the same. :-)

as it’s providing the
wrong sort of abstraction in many cases (gets further away from the
problem domain, not closer, IMHO).

Hm, that I don’t understand. IMHO this follows the spirit of Ruby since
it allows one to implement a lot of useful things very easily. Could you
elaborate or do you have any sources to back that up?

#inject does two things:

  1. pass elements sequentially to the block
  2. propagate the state

in many cases, (2) feels strange because I don’t want the state object
to be passed again and again to the block. In the previous example, I
want to obtain the array corresponding to the integral; in my mind, it
should be easy to track what ‘m’ is pointing to. But w/ inject, I have
to make the effort to evaluate what ‘m’ represents in each iteration;
admittedly not a really hard issue here, but even if it’s only a bit
more complex (in matz’ terms: adds a little stress, consumes some
brain power), #inject didn’t buy me anything in that (and several other)
examples.
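
For comparison, current Ruby versions offer Enumerable#each_with_object, which may ease part of this concern: the state object is still handed to the block, but the block's return value is ignored, so the same object is kept throughout. The sketch below is only an illustration of that alternative, not something discussed in this thread.

```ruby
data = [1, 2, 4, 7]

# With inject, the block's return value becomes the next carry,
# so the block must end by returning m (<< happens to do that).
via_inject = data.inject([]) { |m, x| m << (m[-1] || 0) + x }

# With each_with_object, m is always the same array; the block's
# return value does not matter. Note the swapped argument order.
via_each_with_object = data.each_with_object([]) { |x, m| m << (m[-1] || 0) + x }

via_inject           # => [1, 3, 7, 14]
via_each_with_object # => [1, 3, 7, 14]
```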


On Wed, Jul 23, 2003 at 09:04:20PM +0900, Robert Klemme wrote:


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

Yes I have a Machintosh, please don’t scream at me.
– Larry Blumette on linux-kernel

The carried information could contain
all the needed information about previous steps (e. g. a fifo-list
holding some entries). From this point of view inject does everything
I need. But it would be a lot of code to achieve what the new inject
could do in a single line.

Now, is this pro or con? Sounds more like pro to me.

That was thinking out loud, which led to a pro, yes!

Thanks for commenting.

You’re welcome! I found your answers useful, too.

Cheers
Sascha


“Robert Klemme” bob.news@gmx.net wrote: