Benchmark obsession?

Hi,

After having read this list for a while, I wonder why some of you put so
much weight on speed optimizations. I'm not talking about big things
that really make sense but small stuff like "Don't use symbols, they
can't be garbage collected", "Don't concatenate strings, use string
interpolation instead", "Don't use Enumerable#inject to build up
objects" etc. etc.

In my opinion, this is like trying to make a classic car faster. It just
makes no sense. Ruby isn't about speed, it's about elegance and clarity.
If you're looking for speed, you've got the wrong language. Use C or
whatever.

I'd always prefer an elegant solution over a fast one. For example, I
love functional style programming with Enumerable#inject. I don't care
if it's some milliseconds slower than assigning the values to a
variable.

···

--
Posted via http://www.ruby-forum.com/.

Speed does sometimes matter, even at the micro-benchmark level.
Maybe you're writing a JSON processor, maybe a parser, maybe a math
library. Of course, most of the time you aren't, and then you shouldn't
care about being a millisecond faster.

And now I can't not comment on the three examples you brought up.

"Don't use symbols, they can't be garbage collected" – this sounds
like something someone who doesn't know what the hell they are doing
would say. If there is *ever* a case where you have created enough
symbols to leave a visible memory footprint, you are doing something
extremely wrong. Using symbols where you would use magic constants or
enums in a language like C, or as "static" keys of a hash (as in, not
dynamically generated, or limited to a certain number – for example
the columns of a database table), is perfectly okay, and it is nearly
impossible for this to cause problems with GC. A symbol's text is
stored internally only *once*, and symbols are special-cased to avoid
memory indirection in C code and are passed around as fake pointers;
the only other class treated like this is Fixnum. In fact, replacing
constant strings with symbols can lead to faster code. (Disclaimer: I
didn't benchmark this.)
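
To make the "stored once" point concrete, here is a minimal sketch. The variable names and the `row` hash are made up for illustration; only core Ruby is used.

```ruby
# Two equal strings built at runtime are still separate objects...
a = "status".dup
b = "status".dup
puts a == b        # true: same contents
puts a.equal?(b)   # false: two different objects

# ...while a symbol with a given name is always the very same object.
puts :status.equal?(:status)            # true
puts "status".to_sym.equal?(:status)    # true

# Typical "static key" usage: a small, fixed set of symbols as hash keys.
row = { :id => 1, :name => "widget" }
puts row[:name]    # widget
```

This says nothing about speed; it only shows the identity property the paragraph above relies on.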

"Don't concatenate strings, use string interpolation instead" – I'd
say that in most cases string interpolation is clearer than
concatenating, especially when a lot of ".to_s" calls would have to be
used.

"Don't use Enumerable#inject to build up objects" – I am morally
opposed to using #inject for anything other than actually folding an
array into a single value, which is what it was intended for.
Injecting with an object, as clever as it is, is not clear code.

-- Matma Rex

After having read this list for a while, I wonder why some of you put so
much weight on speed optimizations.

I can't say I have observed an _obsession_ with things like these.
Maybe it's just the fun or that it is so easy to Benchmark. :-)

I'm not talking about big things
that really make sense but small stuff like "Don't use symbols, they
can't be garbage collected",

That is certainly a stupid general rule because in some situations
this is exactly what you want: identifiers which do not need to be
created and which are not GC'ed.

"Don't concatenate strings, use string
interpolation instead", "Don't use Enumerable#inject to build up
objects" etc. etc.

As I said, I have not made the same observation you apparently have
made. I do not read most of the traffic here but if it really was
obsessive I think I would have noticed. Strange how perceptions can
be so different.

Cheers

robert

···

On Wed, Jun 20, 2012 at 4:43 PM, Jan E. <lists@ruby-forum.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

After having read this list for a while, I wonder why some of you put so
much weight on speed optimizations. I'm not talking about big things
that really make sense but small stuff like "Don't use symbols, they
can't be garbage collected", "Don't concatenate strings, use string
interpolation instead", "Don't use Enumerable#inject to build up
objects" etc. etc.

See http://twitter.com/roflscaletips

In my opinion, this is like trying to make a classic car faster. It just
makes no sense. Ruby isn't about speed, it's about elegance and clarity.
If you're looking for speed, you've got the wrong language. Use C or
whatever.

There are some very egregious things that libraries can do, which they
shouldn't, and which will significantly affect the performance of all running
code, like altering the class hierarchy at runtime and thus invalidating
all method caches at all call sites. This is bad and people should call
it out ("DCI" people, I'm looking at you...)

However, when it comes to microoptimizing your Ruby code like that, you
should probably be using something like perftools to measure. Code has
different performance characteristics in different scenarios, so unless you
have some real-world code you're trying to make faster, it's kind of a
pointless exercise. If you do have said code, you should optimize it in a
data-driven way. The best speedups you get will probably be from using code
with better algorithmic properties and not from microoptimizing minutiae.
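
As a rough illustration of "measure first, and prefer the better algorithm", here is a sketch that times two versions of the same job. It deliberately uses a plain monotonic clock from core Ruby rather than perftools, and the helper names (`seconds_for`, `common_slow`, `common_fast`) and the workload are made up; on real code you would point a real profiler at the real workload.

```ruby
# Hypothetical helper: wall-clock seconds taken by a block (monotonic clock).
def seconds_for
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
end

# Made-up workload: find the numbers that appear in both lists.
def common_slow(xs, ys)
  xs.select { |x| ys.include?(x) }   # scans ys for every x: O(n * m)
end

def common_fast(xs, ys)
  seen = {}
  ys.each { |y| seen[y] = true }     # hash lookup instead of a scan: O(n + m)
  xs.select { |x| seen[x] }
end

xs = (1..2_000).to_a
ys = (1_000..3_000).to_a

slow = seconds_for { common_slow(xs, ys) }
fast = seconds_for { common_fast(xs, ys) }
puts "scan: #{slow}s, hash: #{fast}s"
```

The numbers will differ from machine to machine, which is the point: the decision should come from a measurement on your own data, not from a rule of thumb.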

···

On Wed, Jun 20, 2012 at 7:43 AM, Jan E. <lists@ruby-forum.com> wrote:

--
Tony Arcieri

The real question is: how many microtunes does it take for the advantage
to offset the cost of one additional bug?

Clarity and simplicity is almost always the best approach, I feel.

···

A: a lot.

--
Posted via http://www.ruby-forum.com/.

I kinda feel like I'm being called out as I'm on record (many times) for 2/3rds of your examples so I'll address them specifically:

"Don't concatenate strings, use string interpolation instead"

Using a recent real example from this list where I suggested interpolation:

    m.ClassMethodString + " " + m.ClassMethodString + ": " + m.ClassMethodInteger.to_s

mmmmm java code.

1) slower -- several more method calls
2) wasteful -- creates much more garbage
3) longer/uglier -- I'd argue that it is much less elegant

vs

    "#{m.ClassMethodString} #{m.ClassMethodString}: #{m.ClassMethodInteger}"

1) clarity -- it's just a string.
2) elegance -- it's JUST a string AND you don't need those stupid #to_s calls.
3) efficient -- takes less time, uses less memory, makes less garbage, and even easier to read.
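
For anyone who wants to check the "less garbage" claim rather than take it on faith, a small sketch follows. `Entry` is a made-up stand-in for the `m` object above, and `allocations` is a hypothetical helper built on `GC.stat(:total_allocated_objects)` (the key name used by Ruby 2.2 and later). The exact counts depend on the Ruby version, so none are shown.

```ruby
# Made-up stand-in for the `m` object in the example above.
Entry = Struct.new(:owner, :title, :total)
m = Entry.new("Foo", "Bar", 42)

# Hypothetical helper: number of objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

concatenated = m.owner + " " + m.title + ": " + m.total.to_s
interpolated = "#{m.owner} #{m.title}: #{m.total}"
puts concatenated == interpolated   # true: both build the same string

# Each + produces an intermediate string; interpolation builds one result.
puts allocations { m.owner + " " + m.title + ": " + m.total.to_s }
puts allocations { "#{m.owner} #{m.title}: #{m.total}" }
```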

"Don't use Enumerable#inject to build up objects" etc. etc.

Given #inject's other alias, #reduce, it is obvious that you don't use #inject for building up other objects. Even in a functional style of programming you'd _never_ see it building up anything. You'd see it REDUCING (folding) an object. If #inject is applied in a non-folding manner, it isn't functional, it is just dumb. Don't pretend otherwise (and if you do pretend otherwise, go read more books on Lisp--start with SICP). The second I see a semicolon (or return) in an inject, I immediately suspect that someone is writing clever/stupid code.

I don't have any recent examples from the list, but I'm on record in multiple mediums ranting against people who use #inject improperly. I'll make up one based on examples I've seen time and time again:

    return im_a_lazy_bastard.inject(Hash.new 0) { |h, o| h[o.really_really_lazy] += 1; h }

vs

    counter = Hash.new 0
    thingies.each do |o|
      counter[o.key] += 1
    end
    return counter

1) I use #each because it adds CLARITY. I want to enumerate each element. I'm not folding anything.
2) Yes, it's faster. I don't actually care about that nearly as much as #1.
3) Yes, it is more lines:
   1) but only if you write the inject version that way.
   2) I use the Weirich Method [1][2] of choosing {} vs do/end. INCREASING clarity and intent.
   3) each line is a stand-alone concept that helps increase clarity.
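
The two counting styles above can be run side by side. This is only a sketch: the sample data and the method names `count_with_inject` and `count_with_each` are made up, and items are counted directly instead of via `o.key`.

```ruby
# Made-up sample data; any enumerable would do.
words = %w[apple pear apple fig pear apple]

# inject version: the block has to end with `; h` to hand the hash back.
def count_with_inject(items)
  items.inject(Hash.new(0)) { |h, item| h[item] += 1; h }
end

# each version: one concept per line, nothing to return from the block.
def count_with_each(items)
  counter = Hash.new(0)
  items.each do |item|
    counter[item] += 1
  end
  counter
end

p count_with_each(words)
p count_with_inject(words) == count_with_each(words)   # true
```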

Here is a perfect example of an actual folding application of #inject:

    classname.split(/::/).inject(Object) { |k, n| k.const_get n }

vs:

    k = Object
    classname.split(/::/).each { |n| k = k.const_get n }
    k

As you can see, the #inject version is incredibly clear and concise. The second example takes longer to figure out. That is what the natural fit of a well designed method is supposed to do.
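
A runnable version of the folding example, wrapped in a hypothetical helper named `constant_for` and tried on constants that ship with core Ruby:

```ruby
# Hypothetical helper wrapping the fold from the example above.
def constant_for(classname)
  classname.split(/::/).inject(Object) { |k, n| k.const_get n }
end

p constant_for("File::Stat")        # the File::Stat class
p constant_for("Process::Status")   # the Process::Status class
```

Each step takes the constant found so far and looks up the next name inside it, so the accumulator really is the value being folded.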

···

On Jun 20, 2012, at 07:43 , Jan E. wrote:

---

Come to think of it (!!!) I DO have a real world example of inject that I used in my Ruby Sadism talk:

    if MODELS.keys.inject(true) {|b, klass| b and klass.constantize.columns.map(&:name).include? association.options[:foreign_key]} then
      # ...
    end

Have fun with that... It's probably the most egregious use of inject I've ever found. The original author actually argued that he wrote it that way "for maintainability".
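
For comparison, here is one possible way to ask the same question without inject, sketched on plain data. Everything in it is an assumption made for illustration: `MODEL_COLUMNS` stands in for `MODELS` plus the Rails column lookup, and both method names are invented.

```ruby
# Made-up stand-in for MODELS: model name => list of column names.
MODEL_COLUMNS = {
  "Order"   => %w[id user_id total],
  "Invoice" => %w[id user_id due_on],
}

# Shape of the original: thread a boolean through inject.
def all_have_column_inject?(models, column)
  models.keys.inject(true) { |b, name| b and models[name].include?(column) }
end

# Same question asked directly: does every model have the column?
def all_have_column?(models, column)
  models.values.all? { |columns| columns.include?(column) }
end

puts all_have_column_inject?(MODEL_COLUMNS, "user_id")   # true
puts all_have_column?(MODEL_COLUMNS, "user_id")          # true
puts all_have_column?(MODEL_COLUMNS, "total")            # false
```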

[1]: http://onestepback.org/index.cgi/Tech/Ruby/BraceVsDoEnd.rdoc
[2]: http://talklikeaduck.denhaven2.com/2007/10/02/ruby-blocks-do-or-brace

I'd say more like a "common fallacy" than an obsession. Indeed, it's a common problem across all languages, not just Ruby. I recall advice in Basic that said "don't use new lines unless you need a label." In thirty years of coding and performance evaluation, I can't recall a case where micro-tuning was sufficient to solve a performance issue, yet there are many times I've seen it used.

···

On Jun 20, 2012, at 11:02 AM, Robert Klemme <shortcutter@googlemail.com> wrote:

The real question is: how many microtunes does it take for the advantage
to offset the cost of one additional bug?

A: a lot.

or a micro-optimization that makes something that happens many, many
times faster.

If performance is not acceptable, and profiling indicates a spot that
needs fixing, a micro-optimization could be the right thing to do.

Clarity and simplicity is almost always the best approach, I feel.

Clarity, simplicity, and profiling if/when you run into problems.

···

On Wed, Jun 20, 2012 at 11:04 AM, Dan Connelly <lists@ruby-forum.com> wrote:

--
thanks,
-pate
-------------------------
Don't judge those who choose to sin differently than you do

Ryan Davis wrote in post #1065406:

I kinda feel like I'm being called out as I'm on record (many times) for
2/3rds of your examples so I'll address them specifically:

Well, it seems those examples were a bit ambiguous. I'm *not* arguing
against string interpolation etc. I'm against using them for the sole
purpose of saving some bytes and CPU cycles.

If you use string interpolation for clarity, that's perfect. I fully
agree with you. The same goes for inject (with "building up an object" I
actually meant "building up the aggregate value" -- so there's no
disagreement on that).

My point is that we should focus on readability, clarity, elegance etc.
rather than do everything to make our programs run a bit faster. That's
just not what Ruby is for (at least to my understanding).

···

I'm curious if you consider #each_with_object a reasonable choice for this.

···

On Jun 20, 2012 5:57 PM, "Ryan Davis" <ryand-ruby@zenspider.com> wrote:

   counter = Hash.new 0
   thingies.each do |o|
     counter[o.key] += 1
   end
   return counter

--
Avdi

I come across this quite often, especially in Rails apps.

a = {:list => [1,2,3,4]}
b = {:list => [9,8,7,6,5]}

c = [a,b]

c.inject([]) {|memo, run| memo + run[:list] }

I always cringe when I see it but I haven't found an alternative that is as clear and concise.
collect and flatten looks ugly. I'd love to be able to do...

c.collect {|run| *run[:list]}

Henry

···

On 21/06/2012, at 9:50 AM, Ryan Davis wrote:

Jan E. wrote in post #1065412:

If you use string interpolation for clarity, that's perfect. I fully
agree with you. The same goes for inject (with "building up an object" I
actually meant "building up the aggregate value" -- so there's no
disagreement on that).

If I have a choice between writing:

puts "a = #{a}\n"\
     "b = #{b}"

and

$stdout << "a = " << a << "\nb = " << b << "\n"

which is clearer? C++ fans might prefer the second, while I prefer the
first. In any case, I'm glad to hear the first happens to be faster, as
well :).

···

Is that basically the same thing wrapped in another method so that counter and o are yielded to a block?

def each_with_object(memo)
   return to_enum :each_with_object, memo unless block_given?
   each do |element|
     yield element, memo
   end
   memo
end

Sam

···

On 21/06/12 12:49, Avdi Grimm wrote:

indeed. i have 3 solns for this using 1) tap, 2) inject, 3)
each_with_object (or .each.with_object).

(Hash.new(0)).tap{|h| thingies.each{|i| h[i] += 1} }

thingies.inject(Hash.new(0)){|h,i| h[i] += 1; h}

thingies.each.with_object(Hash.new(0)){|i,h| h[i] += 1}

looking at inject.. hmm, not sure. it does not seem so bad.. unless i
be so dogmatic.. nah, i've multiple religion.. it's more fun :-)
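
A quick check that the three variants agree, using made-up sample data. Note the block argument order: inject yields the hash first, each_with_object yields it last.

```ruby
thingies = %w[a b a c b a]   # made-up sample data

with_tap    = Hash.new(0).tap { |h| thingies.each { |i| h[i] += 1 } }
with_inject = thingies.inject(Hash.new(0)) { |h, i| h[i] += 1; h }
with_object = thingies.each_with_object(Hash.new(0)) { |i, h| h[i] += 1 }

puts with_tap == with_inject && with_inject == with_object   # true
puts with_object["a"]                                        # 3
```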

best regards -botp

···

On Thu, Jun 21, 2012 at 8:49 AM, Avdi Grimm <groups@inbox.avdi.org> wrote:

You could move the array outside:

a = {:list => [1,2,3,4]}
b = {:list => [9,8,7,6,5]}

c = [a,b]

all = []

c.each { |h| all.concat h[:list] }

Saves a little memory, too?
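
On the memory question, a small sketch of the difference in behaviour (it does not measure anything; the `start` and `summed` names are made up): `+` hands back a new array at every step, while `concat` keeps extending the one array it was called on.

```ruby
a = { :list => [1, 2, 3, 4] }
b = { :list => [9, 8, 7, 6, 5] }
c = [a, b]

# inject with +: every step returns a brand-new array.
start  = []
summed = c.inject(start) { |memo, run| memo + run[:list] }
puts summed.equal?(start)   # false: the result is a different object

# concat: the one array is extended in place.
all = []
c.each { |h| all.concat h[:list] }
puts all == summed          # true: same contents either way
```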

-Justin

···

On 06/21/2012 04:30 PM, Henry Maddocks wrote:

I come across this quite often, especially in Rails apps.

a = {:list => [1,2,3,4]}
b = {:list => [9,8,7,6,5]}

c = [a,b]

c.inject([]) {|memo, run| memo + run[:list] }

I always cringe when I see it but I haven't found an alternative that is as clear and concise.
collect and flatten looks ugly. I'd love to be able to do...

c.collect {|run| *run[:list]}

Henry

c.collect {|run| run[:list]} . flatten

or if there are only few elements,

a[:list] + b[:list]

kind regards -botp

···

On Fri, Jun 22, 2012 at 7:30 AM, Henry Maddocks <hmaddocks@me.com> wrote:

I come across this quite often, especially in Rails apps.

a = {:list => [1,2,3,4]}
b = {:list => [9,8,7,6,5]}

c = [a,b]

c.inject([]) {|memo, run| memo + run[:list] }

I always cringe when I see it but I haven't found an alternative that is as
clear and concise.
collect and flatten looks ugly. I'd love to be able to do...

c.collect {|run| *run[:list]}

c.flat_map {|run| run[:list]}
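
A short sketch showing that flat_map gives the same result as the inject version on the data from the question, and where it differs from collect plus a bare flatten (the `nested` example is made up):

```ruby
a = { :list => [1, 2, 3, 4] }
b = { :list => [9, 8, 7, 6, 5] }
c = [a, b]

flat   = c.flat_map { |run| run[:list] }
folded = c.inject([]) { |memo, run| memo + run[:list] }
p flat             # [1, 2, 3, 4, 9, 8, 7, 6, 5]
p flat == folded   # true

# flat_map removes one level only; a bare flatten goes all the way down.
nested = [{ :list => [[1, 2], [3]] }]
p nested.flat_map { |run| run[:list] }          # [[1, 2], [3]]
p nested.collect { |run| run[:list] }.flatten   # [1, 2, 3]
```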

···

On 06/22/2012 01:30 AM, Henry Maddocks wrote:

--
Lars Haugseth

I'll add to the string interpolation issue with an anecdote: I've had
real world examples (in projects about which I'm forbidden to talk for
legal reasons) where a refactoring from:

  foo = "a" + b + "c"

type string assembly to:

  foo = ""; foo << "a" << b << "c"

caused an immense speedup (we're talking tens of minutes here),
reduced the memory footprint dramatically, and generally made our
lives on the floor that little bit easier. Of course for something
like my abc example above I'd definitely use "a#{b}c" because it's
more readable (as well as everything else); but with large document
generation sometimes interpolation is just not feasible.

Ryan mentioned Java; the concatenation optimisation is exactly
analogous to a previous time in the same company I achieved a very
similar improvement by converting Java Strings to StringBuilders.

It's still not interpolation, but it can have a genuine, measurable
effect. Knowing that + creates all those new instances while <<
doesn't can be useful and practical knowledge.
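
The difference between the two operators can be seen with object identity alone. A minimal sketch; the names `left`, `built` and `result` are made up, and `"".dup` is used so the buffer is an ordinary unfrozen string on any Ruby version.

```ruby
b = "b"

# + returns a new String at every step; the operands are left alone.
left  = "a"
built = left + b + "c"
puts built.equal?(left)    # false: a new object came back
puts left                  # a

# << appends to the receiver, so one buffer is reused throughout.
foo    = "".dup
result = foo << "a" << b << "c"
puts result.equal?(foo)    # true: the chain returns the same buffer
puts foo == built          # true: both are "abc"
```

In a loop that assembles a large document, the first form allocates a fresh, ever longer string on each pass, while the second keeps growing a single one.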

Caveat: I'm pretty damned sure Ruby was not the right language to be
using on that project. One makes do with what one is given.

···

--
Matthew Kerwin, B.Sc (CompSci) (Hons)
http://matthew.kerwin.net.au/
ABN: 59-013-727-651

"You'll never find a programming language that frees
you from the burden of clarifying your ideas." - xkcd

Yes, and it's part of Enumerable:

···

On Wed, Jun 20, 2012 at 9:07 PM, Sam Duncan <sduncan@wetafx.co.nz> wrote:

Is that basically the same thing wrapped in another method so that counter
and o are yielded to a block?

--
Avdi

Justin Collins wrote in post #1065607:

Saves a little memory, too?

Quod erat demonstrandum. :-D

···
