I recently tracked down a bug in my code to a line that read:
cards = cards.sort.uniq
This seemed perfectly reasonable - the cards array contained Card
objects which were Comparable. However, the bug was due to the fact that
Array#uniq does not use the object’s == method to do comparisons - even
though Array#sort uses the object’s <=> method. Upon investigation, I found
that the implementation of Array#uniq uses a hash, and since an object
that does not override hash and eql? is keyed by its object ID,
uniqueness ends up being defined as (obj1.id == obj2.id) for such
objects. This was surprising to me.
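A minimal sketch of one way around this, assuming Array#uniq keys its internal hash on the object's hash and eql? methods (true of current Ruby versions; I have not re-verified it against 1.6.8). The Card class and its rank attribute here are hypothetical stand-ins, not my real code:

```ruby
# Hypothetical Card class: ordered by rank, and hashed by rank as well.
class Card
  include Comparable
  attr_reader :rank

  def initialize(rank)
    @rank = rank
  end

  # Used by Array#sort (and by Comparable to derive ==).
  def <=>(other)
    rank <=> other.rank
  end

  # Used together with #hash when a Card is a hash key,
  # which is what Array#uniq relies on under the assumption above.
  def eql?(other)
    other.is_a?(Card) && rank == other.rank
  end

  def hash
    rank.hash
  end
end

cards = [Card.new(3), Card.new(1), Card.new(3)]
unique_ranks = cards.sort.uniq.map { |c| c.rank }
```

With hash and eql? defined consistently with <=>, the sort.uniq chain collapses cards of equal rank.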
Further investigation shows that the following methods of Array ignore
the object’s == method:
Array#&
Array#|
Array#uniq
Array#uniq!
On the other hand, the following methods honor the object's == method:
Array#-
Array#==
Array#===
Array#assoc
Array#delete
Array#include?
Array#index
Array#rassoc
Array#rindex
In addition, the following methods honor the object's <=> method:
Array#<=>
Array#sort
Array#sort!
Not only does this make the definition of equality inconsistent
between “set intersection” (Array#&) and “set difference”
(Array#-), it also makes Numerics (and Strings) behave differently
from other objects:
irb(main):001:0> RUBY_VERSION
=> "1.6.8"
irb(main):002:0> class A
irb(main):003:1> def initialize(i) @i = i end
irb(main):004:1> def ==(j) @i == j end
irb(main):005:1> end
=> nil
irb(main):006:0> [1,1,2].uniq
=> [1, 2]
irb(main):007:0> ['1','1','2'].uniq
=> ["1", "2"]
irb(main):008:0> [A.new(1),A.new(1),A.new(2)].uniq
=> [#<A:0x810190c @i=1>, #<A:0x81018f8 @i=1>, #<A:0x81018e4 @i=2>]
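The same mismatch can be sketched as a standalone script. This variant of A is an assumption on my part: it adds a reader for @i so that == can compare two A objects with each other, which the irb session above does not do. The comments describe what I would expect on a current Ruby:

```ruby
# Variant of the class A above, defining only == (no hash, no eql?).
class A
  attr_reader :i

  def initialize(i)
    @i = i
  end

  def ==(other)
    other.is_a?(A) && i == other.i
  end
end

items = [A.new(1), A.new(1), A.new(2)]

# Array#include? compares elements with ==, so an equal A is found.
found = items.include?(A.new(1))

# Array#uniq does not consult ==, so all three objects survive.
uniq_count = items.uniq.size
```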
It seems to me that Array#&, Array#|, Array#uniq, and Array#uniq!
should determine uniqueness with the object's == method rather than
with the object's ID. This seems more in line with methods like
Array#- and Array#include?.
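In the meantime, a ==-based uniq is easy to write by hand. This is only a sketch: uniq_by_equality and the Tag class are hypothetical names of my own, and the approach is quadratic in the array length because it leans on Array#include? for every element:

```ruby
# Hypothetical value class that defines only ==.
class Tag
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def ==(other)
    other.is_a?(Tag) && name == other.name
  end
end

# Keep the first occurrence of each element, using == (via include?)
# to decide whether an element has already been kept.
def uniq_by_equality(items)
  items.each_with_object([]) do |item, kept|
    kept << item unless kept.include?(item)
  end
end

deduped = uniq_by_equality([Tag.new("a"), Tag.new("a"), Tag.new("b")])
deduped_names = deduped.map { |t| t.name }
```

The hash-based Array#uniq is much faster on large arrays, so this trade-off only makes sense for small collections.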
Any other thoughts?
- Warren Brown