Hash#===()


(Han Holl) #1

Hi,

I ran the following benchmark:

require 'benchmark'
include Benchmark

class Hash
  def ===(a)
    has_key?(a)
  end
end

n = 100000
hash = { "abc"=> 1, "def"=> 1, "kli"=> 1, "lop"=> 1, "tli"=> 1, "ppp"=> 1 }
[ "ppp", "abc", "for" ].each do |a|
  bm(7) do |x|
    # 1: list of string literals
    x.report(a){for i in 1...n;
      case a
      when "abc", "def", "kli", "lop", "tli", "ppp"
      end
    end}
    # 2: regexp with a capturing group
    x.report(a){for i in 1...n;
      case a
      when /^(abc|def|kli|lop|tli|ppp)$/
      end
    end}
    # 3: regexp with a non-capturing group
    x.report(a){for i in 1...n;
      case a
      when /^(?:abc|def|kli|lop|tli|ppp)$/
      end
    end}
    # 4: hash lookup through the Hash#=== defined above
    x.report(a){for i in 1...n;
      case a
      when hash
      end
    end}

  end
end

With the following result:
              user     system      total        real
ppp       0.770000   0.000000   0.770000 (  0.765261)
ppp       0.520000   0.000000   0.520000 (  0.518373)
ppp       0.480000   0.000000   0.480000 (  0.487162)
ppp       0.290000   0.000000   0.290000 (  0.286171)
              user     system      total        real
abc       0.210000   0.000000   0.210000 (  0.204143)
abc       0.440000   0.000000   0.440000 (  0.443976)
abc       0.430000   0.000000   0.430000 (  0.425147)
abc       0.290000   0.000000   0.290000 (  0.292743)
              user     system      total        real
for       0.780000   0.000000   0.780000 (  0.776409)
for       0.360000   0.000000   0.360000 (  0.366605)
for       0.340000   0.000000   0.340000 (  0.340457)
for       0.260000   0.000000   0.260000 (  0.258510)

This shows that in a case statement with several alternatives, using a Hash
is almost always the fastest solution.
Wouldn’t the above definition for Hash#=== make sense as the
default?
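
For readers who want to try the idea without changing the built-in class, here is a minimal sketch. It relies on the fact that `case x` / `when pattern` evaluates `pattern === x`. The `KeySet` subclass, the `KEYWORDS` constant and the `classify` method are hypothetical names introduced only for illustration; they are not part of the proposal above.

```ruby
# Hypothetical Hash subclass: only KeySet instances get the new matching
# rule, so the built-in Hash keeps its default behaviour.
class KeySet < Hash
  # `case x; when set` evaluates `set === x`, so this turns a `when`
  # clause into a key lookup.
  def ===(key)
    key?(key)
  end
end

# Hash.[] called on a subclass returns an instance of that subclass.
KEYWORDS = KeySet["abc" => 1, "def" => 1, "kli" => 1]

def classify(word)
  case word
  when KEYWORDS then :known
  else :unknown
  end
end

puts classify("abc")
puts classify("for")
```

A plain Hash used in the same `when` clause still follows the default rule, so code elsewhere that depends on it is unaffected.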

Cheers,

Han Holl


(Han Holl) #2

Han Holl wrote:

Hi,

I ran the following benchmark:

require 'benchmark'
include Benchmark

class Hash
def ===(a)
has_key?(a)
end
end

Of course
class Hash
alias === has_key?
end

is even faster.
ppp 0.180000 0.000000 0.180000 ( 0.178967)

Han


(ts) #3

class Hash
    alias === has_key?
end

I'm not sure I've understood: you want to write this?

   hash = {"aa" => 12}
   case "aa"
   when hash
      puts "ok"
   end

Guy Decoux


(Han Holl) #4

ts wrote:

“H” == Han Holl han@pobox.com writes:

class Hash
alias === has_key?
end

I’m not sure I’ve understood: you want to write this?

hash = {"aa" => 12}
case "aa"
when hash
puts "ok"
end

Guy Decoux

Yes, like a test for set membership. It seems
intuitive to me.
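
As a point of comparison, the standard library's Set already offers this kind of membership test in a `when` clause, assuming Ruby 2.5 or later, where `Set#===` is an alias of `Set#include?`. The sketch below is only an illustration; `WORDS` and `listed?` are hypothetical names.

```ruby
require 'set'

# Assumption: Ruby 2.5 or later, where Set#=== is an alias of Set#include?.
WORDS = Set.new(%w[abc def kli lop tli ppp])

def listed?(word)
  case word
  when WORDS then true   # evaluates WORDS === word, i.e. a membership test
  else false
  end
end

puts listed?("ppp")
puts listed?("for")
```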

Han


(Kent Dahl) #5

Han Holl wrote:

Yes, like a test for set membership. It seems
intuitive to me.

I’d be more inclined to alias it to the Enumerable#include? myself. That
is, unless someone does use Hash#=== for something as is. (Doesn’t it
just go up to Object#=== and degenerate into a == as it is now?)
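
A small check of that default behaviour, run on a Hash that has not been patched: `===` acts like `==`, so a `when hash` clause only matches another equal Hash, never a key. The `matches?` helper is a hypothetical name used just for this sketch.

```ruby
# Stock Hash, no monkey patching: Hash#=== falls back to equality.
hash = { "aa" => 12 }

key_match  = (hash === "aa")             # compares the Hash with a String
same_match = (hash === { "aa" => 12 })   # compares the Hash with an equal Hash

# Hypothetical helper: true when `when pattern` accepts the value.
def matches?(pattern, value)
  case value
  when pattern then true
  else false
  end
end

puts key_match
puts same_match
puts matches?(hash, "aa")
```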

But as for your benchmarks:

  1. hash key lookups are supposed to be fast, with constant lookup
    time. This is not surprising. I feel that you are comparing apples to
    oranges.
  2. your tests aren’t completely fair, as the hash object is
    pregenerated, while the others are literals in the case statements
    themselves. It doesn’t make much difference, but still:
require 'benchmark'
include Benchmark

class Hash
  alias === has_key?
end

array1 = "abc", "def", "kli", "lop", "tli", "ppp"
regexp1 = /^(abc|def|kli|lop|tli|ppp)$/
regexp2 = /^(?:abc|def|kli|lop|tli|ppp)$/

n = 100000
hash = { "abc"=> 1, "def"=> 1, "kli"=> 1, "lop"=> 1, "tli"=> 1, "ppp"=> 1 }
[ "ppp", "abc", "for" ].each do |a|
  bm(7) do |x|
    # first the explicit array version
    x.report(a+' a'){for i in 1...n;
      case a
      when "abc", "def", "kli", "lop", "tli", "ppp"
      end
    end}
    x.report(a){for i in 1...n;
      case a
      when *array1 #"abc", "def", "kli", "lop", "tli", "ppp"
      end
    end}

    x.report(a+' r1'){for i in 1...n;
      case a
      when /^(abc|def|kli|lop|tli|ppp)$/
      end
    end}
    x.report(a){for i in 1...n;
      case a
      when regexp1 #/^(abc|def|kli|lop|tli|ppp)$/
      end
    end}

    x.report(a+' r2'){for i in 1...n;
      case a
      when /^(?:abc|def|kli|lop|tli|ppp)$/
      end
    end}
    x.report(a){for i in 1...n;
      case a
      when regexp2 #/^(?:abc|def|kli|lop|tli|ppp)$/
      end
    end}

    x.report(a+' h'){for i in 1...n;
      case a
      when { "abc"=> 1, "def"=> 1, "kli"=> 1, "lop"=> 1, "tli"=> 1,
             "ppp"=> 1 }
      end
    end}
    x.report(a){for i in 1...n;
      case a
      when hash
      end
    end}

  end
end

              user     system      total        real
ppp a     0.640000   0.000000   0.640000 (  0.645890)
ppp       0.290000   0.000000   0.290000 (  0.299588)
ppp r1    0.420000   0.010000   0.430000 (  0.424303)
ppp       0.420000   0.000000   0.420000 (  0.418180)
ppp r2    0.380000   0.000000   0.380000 (  0.382281)
ppp       0.380000   0.000000   0.380000 (  0.382734)
ppp h     1.800000   0.000000   1.800000 (  1.799063)
ppp       0.130000   0.000000   0.130000 (  0.128566)
              user     system      total        real
abc a     0.170000   0.000000   0.170000 (  0.167413)
abc       0.110000   0.000000   0.110000 (  0.109137)
abc r1    0.340000   0.000000   0.340000 (  0.335180)
abc       0.330000   0.000000   0.330000 (  0.331492)
abc r2    0.310000   0.000000   0.310000 (  0.315915)
abc       0.320000   0.000000   0.320000 (  0.318973)
abc h     1.820000   0.000000   1.820000 (  1.815141)
abc       0.130000   0.000000   0.130000 (  0.128977)
              user     system      total        real
for a     0.660000   0.000000   0.660000 (  0.659464)
for       0.300000   0.000000   0.300000 (  0.296972)
for r1    0.270000   0.000000   0.270000 (  0.276839)
for       0.280000   0.000000   0.280000 (  0.277514)
for r2    0.260000   0.000000   0.260000 (  0.260307)
for       0.260000   0.000000   0.260000 (  0.261449)
for h     1.790000   0.000000   1.790000 (  1.795176)
for       0.120000   0.000000   0.120000 (  0.115688)

Note that pregenerating the regexps doesn’t yield much, but expanding a
pregenerated array makes a large difference, even beating the hash
version for ‘abc’. But that is only because ‘abc’ is the first element
of the array.
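
The effect of element order can be made visible with a small sketch: a `when *list` clause tries the candidates from left to right with `===` and stops at the first match, so an early hit is cheap and a miss costs a full scan. The `comparisons_for` helper and its recording objects are hypothetical, written only to show which candidates get tested.

```ruby
# Hypothetical helper: returns the candidates that a `when *patterns`
# clause actually tests against `value`, in order.
def comparisons_for(value, candidates)
  tested = []
  patterns = candidates.map do |candidate|
    pattern = Object.new
    # Record every === call, then compare like a plain String would.
    pattern.define_singleton_method(:===) do |other|
      tested << candidate
      candidate == other
    end
    pattern
  end

  case value
  when *patterns then nil   # the result is not needed, only the side effect
  end
  tested
end

words = %w[abc def kli lop tli ppp]
p comparisons_for("abc", words)   # first element: one comparison
p comparisons_for("ppp", words)   # last element: six comparisons
p comparisons_for("for", words)   # no match: six comparisons
```

A hash lookup does not depend on position in this way, which fits the nearly identical hash timings for all three test strings.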



([ Kent Dahl ]/)_ ~ [ http://www.stud.ntnu.no/~kentda/ ]/~
))_student
/(( _d L b_/ NTNU - graduate engineering - 4. year )
( __õ|õ// ) )Industrial economics and technological management(
_
/ö____/ (_engineering.discipline=Computer::Technology)


(ts) #6

I'd be more inclined to alias it to the Enumerable#include? myself.

In this case you change Array#===

You can also get a surprise with IO objects.

Guy Decoux


(Han Holl) #7

Kent Dahl wrote:

Han Holl wrote:

Yes, like a test for set membership. It seems
intuitive to me.

I’d be more inclined to alias it to the Enumerable#include? myself. That
is, unless someone does use Hash#=== for something as is. (Doesn’t it
just go up to Object#=== and degenerate into a == as it is now?)

I believe it does. My point is that Hash#=== doesn’t have a useful purpose
now, and I wanted to propose to have the above meaning.
I’m a bit nervous about aliasing operations on builtin classes.
What if some Ruby extension has a brilliant use for Hash#===?
To be safe, I will have to unalias when I’m done.
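
One way to get the same `when` behaviour without aliasing anything on a built-in class is to wrap the lookup in a lambda, assuming a Ruby version where `Proc#===` calls the proc (1.9 and later, so newer than the interpreter this thread was benchmarked on). This is only a sketch; `in_hash` and `lookup` are hypothetical names.

```ruby
hash = { "abc" => 1, "def" => 1, "kli" => 1, "lop" => 1, "tli" => 1, "ppp" => 1 }

# Assumption: Proc#=== invokes the proc with the value under test
# (Ruby 1.9 and later). Hash itself is not modified.
in_hash = ->(key) { hash.key?(key) }

def lookup(word, matcher)
  case word
  when matcher then :known   # evaluates matcher === word
  else :unknown
  end
end

puts lookup("ppp", in_hash)
puts lookup("for", in_hash)
```

Nothing has to be undone afterwards, and other code that relies on the default Hash#=== keeps working.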

Therefore I propose this as a default interpretation. My benchmarks were
just to show that it’s quite reasonable from a performance standpoint as
well.

Re-aliasing Enumerable#include? seems even scarier.

Cheers,

Han Holl


(ts) #8

I'd be more inclined to alias it to the Enumerable#include? myself.

In this case you change Array#===

Forget it, I said something stupid, as usual.

Guy Decoux