For or each?

Xavier_Noria · 21 September 2008 15:25

The relative speeds of for (2) and each are similar in JRuby. That 1.9
thing is intriguing.

···

On Sun, Sep 21, 2008 at 2:54 AM, Joe Wölfel <joe@talkhouse.com> wrote:

Ruby 1.8.6 output:
Rehearsal --------------------------------------------------
for loop     2.840000   0.040000   2.880000 (  2.880437)
for loop 2   1.660000   0.000000   1.660000 (  1.661229)
each         1.750000   0.010000   1.760000 (  1.755502)
----------------------------------------- total: 6.300000sec

                 user     system      total        real
for loop     2.300000   0.000000   2.300000 (  2.307566)
for loop 2   1.660000   0.010000   1.670000 (  1.666218)
each         1.760000   0.000000   1.760000 (  1.760356)

Charles_Oliver_Nutte · 21 September 2008 15:40

Phlip wrote:

'for' is arguably more readable. And it's not a performance issue - I suspect the opcodes will be the same. It is very much a technical issue.

For is generally faster because it doesn't instantiate a new scope for each loop. It shares the same scope as its containing block of code, where each creates a new scope each time the block is called. I'm reasonably sure this applies to all Ruby versions too.

- Charlie

Charles_Oliver_Nutte · 21 September 2008 15:44

Joe Wölfel wrote:

You can try my test if you like. I haven't checked it carefully. But it does seem like 'each' is much faster under both ruby 1.86 and ruby 1.9. The code is below.

Odd, now that I try it, each *is* faster than for. I'm going to have to investigate.

JRuby:
               user     system      total        real
for loop   0.334000   0.000000   0.334000 (  0.334033)
each       0.236000   0.000000   0.236000 (  0.235695)

Ruby 1.8.6:
               user     system      total        real
for loop   1.050000   0.010000   1.060000 (  1.079704)
each       0.950000   0.010000   0.960000 (  0.982576)

Ruby 1.9:
               user     system      total        real
for loop   0.910000   0.010000   0.920000 (  0.933510)
each       0.420000   0.000000   0.420000 (  0.431548)

- Charlie
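
Joe's benchmark script is not reproduced in this copy of the thread. A minimal sketch of a comparable test, assuming a simple summing workload (the ITEMS constant and the sum_with_for / sum_with_each helpers are made up for illustration and are not Joe's code), might look like this:

```ruby
require 'benchmark'

# Hypothetical workload: sum the integers in an array.
ITEMS = (1..1_000_000).to_a

def sum_with_for(items)
  total = 0
  for item in items
    total += item
  end
  total
end

def sum_with_each(items)
  total = 0
  items.each { |item| total += item }
  total
end

# bmbm runs a rehearsal pass first, then the measured pass,
# which gives output shaped like the listings in this thread.
Benchmark.bmbm(10) do |bm|
  bm.report('for loop') { sum_with_for(ITEMS) }
  bm.report('each')     { sum_with_each(ITEMS) }
end
```

Timings will differ by interpreter, version, and machine, so the relative order seen here should not be assumed to carry over.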

Sylvain_Joyeux · 22 September 2008 08:26

Nope. each { } needs one new scope for each iteration while "for ... in"
explicitly uses the parent scope... In the end, you create with #each
as many objects as there are in your collection - which can be a huge
performance hit in some cases. That's why I personally use "for ... in" in
performance-critical parts of my code, to avoid unnecessary GC.

Sylvain

···

On Sun, Sep 21, 2008 at 07:12:06AM +0900, Phlip wrote:

tekwiz wrote:

It leaves you closer to a refactor to .map or .inject or .select or
.reject or .delete_if or .each_index or .each_with_index or ...

So, it's a code-readability issue and not a functional or complexity
issue?

'for' is arguably more readable. And it's not a performance issue - I
suspect the opcodes will be the same. It is very much a technical issue.

Brian_Candler · 22 September 2008 10:22

Phlip wrote:

_why wrote:

You folks can argue all you want about the look of the `for` but
you're forgetting the utility of having two nice choices.

Thanks, that's why I said 'for' can be more readable - even though I
personally know nobody who is aware of its existence.

FWIW, the skeleton code which Rails generates uses the 'for' loop.

$ rails wombat
$ cd wombat
$ script/generate scaffold flurble
$ cat app/views/flurbles/index.html.erb
...
<% for flurble in @flurbles %>
<tr>
..
</tr>
<% end %>
...

···

--
Posted via http://www.ruby-forum.com/.

Phlip1 · 20 September 2008 21:57

... or .each_char or .each_line or .each_byte or ...

What?

I illustrated that you augmented the "or" list from my first post.

Xavier_Noria · 20 September 2008 22:57

There's a technical difference in scopes between them, but for indeed
calls #each. A for loop

   for user in users
     ...
   end

assumes that users is any object that responds to #each, calls
that iterator on users, and assigns each yielded value to user.
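
Both points can be checked with a small sketch. The Countdown class and the for_demo method are hypothetical names invented for this illustration; the class defines only #each and counts how often it is called:

```ruby
# Hypothetical collection class: it defines only #each.
class Countdown
  attr_reader :each_calls

  def initialize(from)
    @from = from
    @each_calls = 0
  end

  def each
    @each_calls += 1
    @from.downto(1) { |n| yield n }
  end
end

def for_demo
  countdown = Countdown.new(3)
  seen = []
  for n in countdown      # dispatches to countdown.each
    seen << n
  end
  # The loop variable n is still visible here, holding the last value;
  # a block parameter of #each would not be visible after the block.
  [seen, countdown.each_calls, n]
end

p for_demo   # → [[3, 2, 1], 1, 1]
```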

···

On Sun, Sep 21, 2008 at 12:52 AM, Phlip <phlip2005@gmail.com> wrote:

Joe Wölfel wrote:

You can try my test if you like. I haven't checked it carefully. But it
does seem like 'each' is much faster under both ruby 1.86 and ruby 1.9.
The code is below.

I just experimented with 'for' and found it does not reevaluate its header
each time it runs. That's the only thing that could have explained the time
difference, so maybe Matz & Co. have simply neglected 'for' while optimizing
.each, which everyone uses. (Though it's still technically superior; not
just an /ad populum/ thing...)

Another reason to use .each is your collection-like class might override it
to do something cool...
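
As a rough illustration of that last point, here is a sketch of a collection-like class with its own #each. The Playlist class and its sorting behaviour are assumptions made up for the example; including Enumerable is what supplies map, select and friends on top of #each:

```ruby
# Hypothetical collection-like class; the name and data are illustrative.
class Playlist
  include Enumerable   # builds map, select, inject, ... on top of #each

  def initialize(*titles)
    @titles = titles
  end

  # Custom iteration order: always yield titles alphabetically.
  def each
    @titles.sort.each { |title| yield title }
  end
end

def playlist_demo
  playlist = Playlist.new('Zebra', 'Apple', 'Mango')
  via_for = []
  for title in playlist          # a for loop picks up the custom #each too
    via_for << title
  end
  [via_for, playlist.map(&:upcase), playlist.select { |t| t.start_with?('M') }]
end

p playlist_demo
# → [["Apple", "Mango", "Zebra"], ["APPLE", "MANGO", "ZEBRA"], ["Mango"]]
```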

Joe_Wolfel · 20 September 2008 23:08

Phlip, I hadn't thought about your refactoring arguments. I think I'm swayed by them. I do change an each to a map, etc., on occasion. Also, it seems simpler and less confusing to have one simple grammatical construction that does so many things.
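
A small sketch of the kind of refactoring meant here, with made-up data (PRICES) and hypothetical method names: the block stays almost the same while the method name changes from each to map or inject.

```ruby
PRICES = [5, 12, 8].freeze   # hypothetical data

# Starting point: accumulate by hand inside each.
def doubled_with_each(prices)
  result = []
  prices.each { |price| result << price * 2 }
  result
end

# Swapping the method name turns the same block into a transformation...
def doubled_with_map(prices)
  prices.map { |price| price * 2 }
end

# ...or into a reduction.
def total_with_inject(prices)
  prices.inject(0) { |sum, price| sum + price }
end

p doubled_with_each(PRICES)   # → [10, 24, 16]
p doubled_with_map(PRICES)    # → [10, 24, 16]
p total_with_inject(PRICES)   # → 25
```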

···

On 20 sept. 08, at 18:52, Phlip wrote:

Joe Wölfel wrote:

You can try my test if you like. I haven't checked it carefully. But it does seem like 'each' is much faster under both ruby 1.86 and ruby 1.9. The code is below.

I just experimented with 'for' and found it does not reevaluate its header each time it runs. That's the only thing that could have explained the time difference, so maybe Matz & Co. have simply neglected 'for' while optimizing .each, which everyone uses. (Though it's still technically superior; not just an /ad populum/ thing...)

Another reason to use .each is your collection-like class might override it to do something cool...

Phlip1 · 21 September 2008 03:22

Xavier Noria wrote:

It is so unlikely that an #each becomes an
#inject that...

You have probably never pair-programmed with me during a savage refactoring session.

···

--
Phlip

Joe_Wolfel · 21 September 2008 16:21

It's hard to predict performance in advance. I think it's gotten harder as processors have become more complicated and there are more alternatives for running code. I'm just guessing here, but maybe blocks are so critical to ruby performance that 'each' has benefited from attempts to optimize the performance of blocks in general.

-Joe

···

On 21 sept. 08, at 11:44, Charles Oliver Nutter wrote:

Joe Wölfel wrote:

You can try my test if you like. I haven't checked it carefully. But it does seem like 'each' is much faster under both ruby 1.86 and ruby 1.9. The code is below.

Odd, now that I try it, each *is* faster than for. I'm going to have to investigate.

JRuby:
               user     system      total        real
for loop   0.334000   0.000000   0.334000 (  0.334033)
each       0.236000   0.000000   0.236000 (  0.235695)

Ruby 1.8.6:
               user     system      total        real
for loop   1.050000   0.010000   1.060000 (  1.079704)
each       0.950000   0.010000   0.960000 (  0.982576)

Ruby 1.9:
               user     system      total        real
for loop   0.910000   0.010000   0.920000 (  0.933510)
each       0.420000   0.000000   0.420000 (  0.431548)

- Charlie

Xavier_Noria · 22 September 2008 09:29

In what sense does #each create objects? Assuming a block with just
one parameter, do you mean there's a new reference to existing objects
per iteration? Could that affect GC that much?

···

On Mon, Sep 22, 2008 at 10:26 AM, Sylvain Joyeux <sylvain.joyeux@polytechnique.org> wrote:

Nope. each { } needs one new scope for each iteration while "for ... in"
explicitly uses the parent scope... In the end, you create with #each
as many objects as there are in your collection - which can be a huge
performance hit in some cases.

Brian_Candler · 22 September 2008 10:42

Sylvain Joyeux wrote:

Nope. each { } needs one new scope for each iteration

By 'scope' do you mean 'stack frame'?

Take this trivial example:

  def each1
    yield 1
    yield 2
    yield 3
  end
  each1 { :dummy }

The only 'objects' being created here are stack frames, by the yield
statements calling the block. To make it more explicit,

  def each2(&blk)
    blk.call(1)
    blk.call(2)
    blk.call(3)
  end
  each2 { :dummy }

Now, if you are arguing that the above code creates three objects which
need to be garbage-collected later, then you're also arguing that the
sequence

   foo(1)
   foo(2)
   foo(3)

creates three objects which need to be garbage-collected, and therefore
that the loop

  for i in (1..3)
    foo(i)
  end

also creates three garbage objects.

I don't believe that's the case. I would imagine that the stack runs as,
well, a stack. (It's not quite that simple when you get into creating
closures of course, but if you call a closure 1000 times, you're not
creating 1000 new closures)

Am I missing something?

Finally, I tried some simple measurements.

$ time ruby -e 'a = (1..5_000_000).to_a; a.each { :dummy }'
$ time ruby -e 'a = (1..5_000_000).to_a; for i in a; :dummy; end'

Under ruby 1.8.6p114, I find the first is about 5% faster.

Under ruby 1.8.4 (Ubuntu Dapper), I find the first is about 25% faster.

This is on relatively old Pentium machines though.

Brian_Candler · 22 September 2008 10:52

Sylvain Joyeux wrote:

Nope. each { } needs one new scope for each iteration while "for ... in"
explicitly uses the parent scope... In the end, you create with #each
as many objects as there are in your collection

P.S. Here's a simple experiment, and I can't see any of these
dark-matter objects that you talk about.

  def countobj
    count = 0
    ObjectSpace.each_object(Object) { count += 1 }
    count
  end

  def foo
    :dummy
  end

  puts "#{countobj} objects"
  GC.disable
  (1..1_000_000).each { foo }
  puts "#{countobj} objects"

Xavier_Noria · 20 September 2008 23:27

You cannot rationalize a convention. Conventions happen, it is
difficult to explain why things are the way they are when they are
mostly stylistic.

You use two spaces in Ruby because you want your code to be idiomatic.
Can it be said that two spaces are obviously better than four or
eight? I don't think so, it is just a convention. And when you write
Perl or Java you use four. That's it.

In my opinion you use #each in your Ruby code because that's what
people use. That's what the book you first read uses, that's what
everybody writes. Your code is supposed to use #each, you learn that
when you learn Ruby and probably *force a change in your mind a
priori* if you come from almost any other language. Just to follow the
conventions and write code that resembles what the community has
converged into.

You can argue that the convention has converged because iterators blah
and yielding blah, but

   for user in users
     ...
   end

is crystal clear, readable, has all the benefits of #each because it
uses #each, and what not.

As a counterargument, in ERb templates for-loops are not perceived as
"funny" and they are commonly used.
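
For anyone who wants to try a for loop in a template outside Rails, here is a minimal sketch using the standard erb library. The template text, the flurbles local, and the render_flurbles helper are assumptions for illustration (the name follows the scaffold example elsewhere in this thread):

```ruby
require 'erb'

# Hypothetical template using a for loop, as in the generated Rails views.
TEMPLATE = <<~HTML
  <table>
  <% for flurble in flurbles %>
    <tr><td><%= flurble %></td></tr>
  <% end %>
  </table>
HTML

def render_flurbles(flurbles)
  # result_with_hash exposes the hash keys as local variables in the template.
  ERB.new(TEMPLATE).result_with_hash(flurbles: flurbles)
end

puts render_flurbles(%w[alpha beta])
```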

Sylvain_Joyeux · 22 September 2008 13:42

Thanks but nope...

each_object *specifically* filters out objects that are internal to the
interpreter, therefore you don't see those here. For that, you actually
need a better object-counting setup. See Ruby patches for one that I
submitted. I hope that the "new Ruby GC profiler" that has been
included in 1.9 will provide the same amount of information.

Sylvain
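
On a present-day CRuby (an assumption: this counter does not exist in the 1.8 and 1.9 interpreters discussed in this thread), GC.stat(:total_allocated_objects) offers one way to count allocations without going through each_object. A rough sketch, where allocations_during and ITEMS_TO_WALK are hypothetical names:

```ruby
# Counts objects allocated while the block runs, using CRuby's GC.stat.
def allocations_during
  was_disabled = GC.disable        # keep the collector out of the measurement
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
ensure
  GC.enable unless was_disabled    # restore the previous GC state
end

def foo
  :dummy
end

ITEMS_TO_WALK = (1..100_000).to_a

each_allocs = allocations_during { ITEMS_TO_WALK.each { |i| foo } }
for_allocs  = allocations_during do
  for i in ITEMS_TO_WALK
    foo
  end
end

puts "each: #{each_allocs} objects, for: #{for_allocs} objects"
```

The numbers depend entirely on the interpreter version, so they say nothing definite about what 1.8 did internally.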

···

On Mon, Sep 22, 2008 at 07:52:47PM +0900, Brian Candler wrote:

Sylvain Joyeux wrote:
> Nope. each { } needs one new scope for each iteration while "for ... in"
> explicitly uses the parent scope... In the end, you create with #each
> as many objects as there are in your collection

P.S. Here's a simple experiment, and I can't see any of these
dark-matter objects that you talk about.

  def countobj
    count = 0
    ObjectSpace.each_object(Object) { count += 1 }
    count
  end

  def foo
    :dummy
  end

  puts "#{countobj} objects"
  GC.disable
  (1..1_000_000).each { foo }
  puts "#{countobj} objects"

Sylvain_Joyeux · 22 September 2008 13:49

> Sylvain Joyeux wrote:
> > Nope. each { } needs one new scope for each iteration
>
> By 'scope' do you mean 'stack frame'?
>
> creates three objects which need to be garbage-collected, and therefore
> that the loop
>
>   for i in (1..3)
>     foo(i)
>   end
>
> also creates three garbage objects.

Yes. Except that

  collection.each do |obj|
    foo(obj)
  end

creates twice as many objects as

  for obj in collection
    foo(obj)
  end

which is what we are trying to compare here.

> I don't believe that's the case. I would imagine that the stack runs as,
> well, a stack. (It's not quite that simple when you get into creating
> closures of course, but if you call a closure 1000 times, you're not
> creating 1000 new closures)
>
> Am I missing something?

Yes. I don't know about the current version of the 1.9 VM, but currently
the interpreter does not 'reuse' stack frames, and therefore you *do*
create one object per new scope. To cut down the discussion,
ObjectSpace#each_object does *not* give you those, so you can't count
them with it.

> Finally, I tried some simple measurements.
>
> $ time ruby -e 'a = (1..5_000_000).to_a; a.each { :dummy }'
> $ time ruby -e 'a = (1..5_000_000).to_a; for i in a; :dummy; end'
>
> Under ruby 1.8.6p114, I find the first is about 5% faster.
>
> Under ruby 1.8.4 (Ubuntu Dapper), I find the first is about 25% faster.

Interesting. I have the same results, but (on my machine) the following
is even slower:
$ time ruby -e 'a = (1..5_000_000).to_a; a.each { |i| :dummy }'

···

On Mon, Sep 22, 2008 at 07:42:24PM +0900, Brian Candler wrote:

Brian_Candler · 22 September 2008 14:08

Sylvain Joyeux wrote:

To cut down the discussion,
ObjectSpace#each_object does *not* give you those, so you can't count
them with it.

OK, I see that certain objects are not yielded, including T_SCOPE.

On the other hand, observe that the following program doesn't leak
memory:

  def foo
    :dummy
  end

  GC.disable
  (1..10_000_000).each { |i| foo }
  puts `ps auxwww | grep ruby | grep -v grep`
  puts "Press enter"
  STDIN.gets

On the two machines I tried the RSS is 3MB, regardless of how big I make
the loop. (That's 1.8.4 stock Ubuntu Dapper, and 1.8.6p114 compiled from
source)

So are you sure a scope is created every time round the loop - not just
on the first invocation?

The following program *does* consume memory:

  def foo
    :dummy
  end

  GC.disable
  (1..1_000_000).each { |x|
    (1..1).each { |y| foo }
  }
  puts `ps auxwww | grep ruby | grep -v grep`
  puts "Press enter"
  STDIN.gets

That is: I'm happy to accept that every time the inner loop starts, it
creates a new scope (since the block is a closure with a different value
of x bound to it)

But if I change the inner loop to

(1..10).each { |y| foo }

the memory consumption is the same. So whether a block is invoked once
or 10 times makes no difference to memory usage.
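
The same measurement can be taken for the current process only, instead of grepping the whole ps listing. This is a sketch assuming a Unix-like system whose ps accepts -o rss= and -p; the rss_kilobytes and rss_growth helpers are hypothetical names:

```ruby
# Resident set size of this process in kilobytes, or nil if ps is unavailable.
def rss_kilobytes
  output = `ps -o rss= -p #{Process.pid}`
  output.strip.empty? ? nil : output.to_i
rescue Errno::ENOENT
  nil
end

# Difference in RSS before and after the block, or nil if it cannot be read.
def rss_growth
  before = rss_kilobytes
  yield
  after = rss_kilobytes
  before && after ? after - before : nil
end

def foo
  :dummy
end

growth = rss_growth { (1..1_000_000).each { |i| foo } }
puts "RSS grew by #{growth.inspect} KB"
```

RSS is a coarse signal: the allocator may hold on to freed pages, so a flat number suggests, but does not prove, that nothing is being allocated per iteration.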

namekuseijin · 24 September 2008 20:34

Come on! Is it so hard to realize that foo.each {|i| ...} is used
more simply because it fits the Ruby all-OO mindset better? Like
Foo.new instead of the more common new Foo?

In search of purity, everything is an object and all actions are
method calls. It feels very much like Scheme and its (sweet)
obsession for doing everything out of functions and function
application...

Rubyists are used to postfix notation just as much as Forth users are, or
Lispers to prefix...
