Comparing objects

Let me be clear, People of the Future: implement eql?, ===, and hash
on your own classes as appropriate. Doing so is the proper way to
allow your code to interact with other libraries and coders. Even if
your code lives in isolation, ensuring proper semantics via these
methods prevents a class of tricky bug that your successors may have
to deal with.

···

On Sun, Jun 6, 2010 at 9:00 PM, Rein Henrichs <reinh@reinh.com> wrote:

I hope that other Rubyists that may stumble upon this thread will take
Robert's FUD with a grain of salt and will feel free to determine the
usefulness and any potential dangers of implementing #eql? and #hash --
along with other Ruby idioms like #each (for Enumerable) and #<=> (for
Comparable) -- on their own. An ounce of critical thinking is better than a
pound of dogma.

Not #eql?, #== please.
Cheers
R.

···

On Sat, Jun 5, 2010 at 2:35 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

On 05.06.2010 00:35, Anderson Leite wrote:

Before we can provide solutions we have to know what problem must be solved.

I have a list of objects that came from a database and another list of
objects extracted from an XML file. I need the elements that are in both lists.

So I thought of comparing objects by overriding the == method, as Marcin
Wolski wrote. Is there another solution?

If all objects you are dealing with implement #eql? and #hash in a way to be
suitable for that comparison then using #eql? is the most straightforward
approach.

--
The best way to predict the future is to invent it.
-- Alan Kay

Even if your code lives in isolation, ensuring proper semantics via these
methods prevents a class of tricky bug that your successors may have
to deal with.

Hmm? Would you care to show an example where overloading those methods
(#eql? and #hash) is needed to ensure proper behavior? I am willing to
learn. But I am not willing to accept this statement as such.
Cheers
R.

···

On Thu, Jun 10, 2010 at 3:41 PM, Wilson Bilkovich <wilsonb@gmail.com> wrote:

--
The best way to predict the future is to invent it.
-- Alan Kay

Why? Array#& uses #eql? - as Hash lookup methods do, too.

$ ruby19 -e 'class C;def eql?(x)printf "eq %p\n",x;false;end;def hash;0;end;end;[C.new] & [C.new]'
eq #<C:0x10028284>

Kind regards

  robert

···

On 05.06.2010 16:41, Robert Dober wrote:

On Sat, Jun 5, 2010 at 2:35 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

On 05.06.2010 00:35, Anderson Leite wrote:

Before we can provide solutions we have to know what problem must be solved.

I have a list of objects that came from a database and another list of
objects extracted from an XML file. I need the elements that are in both lists.

So I thought of comparing objects by overriding the == method, as Marcin
Wolski wrote. Is there another solution?

If all objects you are dealing with implement #eql? and #hash in a way to be
suitable for that comparison then using #eql? is the most straightforward
approach.

Not #eql?, #== please.

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

You have been presented with one in this very thread. The OP wants objects of his class to have the correct semantics for Array#& and Hash#, etc. The correct answer is to implement #hash and #eql?, just as implementing <=> provides objects of his class with the correct semantics for Array#sort.
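
A minimal sketch of that advice applied to the OP's two lists. The Person class and its email attribute are hypothetical names chosen for illustration, not anything from the OP's code; the point is only that Array#& finds the shared records once #eql? and #hash agree.

```ruby
# Hypothetical value class; Person and email are illustrative names only.
class Person
  attr_reader :email

  def initialize(email)
    @email = email
  end

  # Two Person objects are equal when their emails match.
  def ==(other)
    other.is_a?(Person) && email == other.email
  end
  alias eql? ==

  # Objects that are eql? must return the same hash value.
  def hash
    email.hash
  end
end

from_db  = [Person.new("a@example.com"), Person.new("b@example.com")]
from_xml = [Person.new("b@example.com"), Person.new("c@example.com")]

common = from_db & from_xml
puts common.map(&:email).inspect   # ["b@example.com"]
```

Deriving #hash from the same field that #eql? compares keeps the two methods consistent with each other.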

···

On 2010-06-10 06:59:40 -0700, Robert Dober said:

On Thu, Jun 10, 2010 at 3:41 PM, Wilson Bilkovich <wilsonb@gmail.com> wrote:
Even if your code lives in isolation, ensuring proper semantics via these
methods prevents a class of tricky bug that your successors may have
to deal with.

Hmm? Would you care to show an example where overloading those methods
(#eql? and #hash) is needed to ensure proper behavior? I am willing to
learn. But I am not willing to accept this statement as such.
Cheers
R.

--
Rein Henrichs

http://reinh.com

Hi,

Just wanting to add my thoughts about this (I made a thread about this a few
months ago).

I searched a bit and concluded this:

Array methods using comparison
- with #hash and #eql?
    &, |, uniq(!), -
- with #==
    include?, (r)assoc, count, delete, (r,find_)index
(please tell me if I forgot one)

I think Array methods should never have to look at #hash and #eql? methods.
I suppose this is done for performance.

I think this should change, because:
- it violates POLS (the principle of least surprise)
- it can cause unexpected behavior once you have defined #hash and #eql? for
objects which should not need them (when you manage objects in an Array, you
do not expect to have to think about Hash keys).
- it is not consistent with the other Array methods
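
The split described in the list above can be observed with a class that defines only #==. This is a sketch under the assumption of a hypothetical Tag class (not from the thread); #eql? and #hash are left at their identity-based defaults.

```ruby
# Hypothetical class that defines only #==, leaving #eql? and #hash
# at their identity-based defaults inherited from Object.
class Tag
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def ==(other)
    other.is_a?(Tag) && name == other.name
  end
end

a = Tag.new("ruby")
b = Tag.new("ruby")

included     = [a].include?(b)   # include? compares with #==
found_index  = [a].index(b)      # index compares with #==
intersection = [a] & [b]         # & does not consult #==
unique       = [a, b].uniq       # uniq does not consult #==

puts included              # true
puts found_index.inspect   # 0
puts intersection.empty?   # true
puts unique.size           # 2
```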

PS: Rein: I saw your implementation of #hash. I think the "add one" is
useless, because #eql? is always used (so even if #hash always returned the
same value, it would work). It could maybe speed things up a bit, but only if
you compare User with instances of User a lot, which is very unlikely.

See also
http://blog.rubybestpractices.com/posts/rklemme/018-Complete_Class.html
http://blog.rubybestpractices.com/posts/rklemme/019-Complete_Numeric_Class.html

Cheers

  robert

···

On 06/10/2010 05:27 PM, Rein Henrichs wrote:

On 2010-06-10 06:59:40 -0700, Robert Dober said:

On Thu, Jun 10, 2010 at 3:41 PM, Wilson Bilkovich <wilsonb@gmail.com> wrote:
Even if your code lives in isolation, ensuring proper semantics via these
methods prevents a class of tricky bug that your successors may have
to deal with.

Hmm? Would you care to show an example where overloading those methods
(#eql? and #hash) is needed to ensure proper behavior? I am willing to
learn. But I am not willing to accept this statement as such.
Cheers
R.

You have been presented with one in this very thread. The OP wants objects of his class to have the correct semantics for Array#& and Hash#, etc. The correct answer is to implement #hash and #eql?, just as implementing <=> provides objects of his class with the correct semantics for Array#sort.

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

I guess you really do not know what I was talking about? Or do you
just repeat the same stuff over and over again in order to convince
me?
Overwriting #hash and #eql? breaks Hash! Why the heck should the OP's
use case justify this?
And it does not answer my question: where would I want Hash to
behave according to the redefined #eql? and #hash? And BTW I asked
Wilson, did I not?
Cheers
Robert

···

On Thu, Jun 10, 2010 at 5:30 PM, Rein Henrichs <reinh@reinh.com> wrote:

On 2010-06-10 06:59:40 -0700, Robert Dober said:
You have been presented with one in this very thread. The OP wants objects
of his class to have the correct semantics for Array#& and Hash#, etc. The
correct answer is to implement #hash and #eql?, just as implementing <=>
provides objects of his class with the correct semantics for Array#sort.

--
The best way to predict the future is to invent it.
-- Alan Kay

I agree that using eql? and hash for some methods is surprising. I would not mind seeing this changed to at least just eql?. == seems too general for the semantics of these methods.

···

On 2010-06-06 15:03:30 -0700, Benoit Daloze said:

PS: Rein: I saw your implementation of #hash. I think the "add one" is
useless, because #eql? is always used (so even if #hash always returned the
same value, it would work). It could maybe speed things up a bit, but only if
you compare User with instances of User a lot, which is very unlikely.

Without the + 1, User.new('').hash == User.hash. I don't like the existence of a bug even if I don't yet have a way to exercise it in my code.

--
Rein Henrichs

http://reinh.com

Benoit Daloze wrote:

I searched a bit and concluded this:

Array methods using comparison
- with #hash and #eql?
    &, |, uniq(!), -
- with #==
    include?, (r)assoc, count, delete, (r,find_)index
(please tell me if I forgot one)

I think Array methods should never have to look at #hash and #eql?
methods.
I suppose this is done for performance.

I think this should change, because:
- it violates POLS (the principle of least surprise)
- it can cause unexpected behavior once you have defined #hash and #eql? for
objects which should not need them (when you manage objects in an Array, you
do not expect to have to think about Hash keys).
- it is not consistent with the other Array methods

For me it doesn't work anyway.
Unsure how to paste code here, you could see an example here:
http://pastie.org/999353
I am still like "WTF?"

$ ruby -v
ruby 1.8.7 (2009-06-12 patchlevel 174) [i686-darwin9.8.0]

···

--
Posted via http://www.ruby-forum.com/.

You define #eql? and #hash for your convenience. So good, so bad. My
question simply was: Show me why *not* redefining #hash and #eql? will
cause problems, because that was Wilson's statement. I am still
waiting :(.

Cheers
R.

···

On Thu, Jun 10, 2010 at 6:10 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

http://blog.rubybestpractices.com/posts/rklemme/018-Complete_Class.html
http://blog.rubybestpractices.com/posts/rklemme/019-Complete_Numeric_Class.html

--
The best way to predict the future is to invent it.
-- Alan Kay

Robert Dober wrote:

···

On Thu, Jun 10, 2010 at 5:30 PM, Rein Henrichs <reinh@reinh.com> wrote:
overwriting #hash and #eql? breaks Hash!

That's not true, I think.
--
Posted via http://www.ruby-forum.com/.

Mark Abramov wrote:

[tl;dr]

Sorry, guys, didn't notice how I used eql instead of eql?
Btw, without #hash it won't work anyways which I consider *weird* at the
very least.

···

--
Posted via http://www.ruby-forum.com/.

Judge for yourself

require "forwardable"

def count klass
  ObjectSpace.each_object( klass ).to_a.size
end
class N
  extend Forwardable
  attr_reader :n
  def_delegators :n, :hash
  def eql? otha
    n == otha.n
  end
  private
  def initialize n
    @n = n
  end
end # class N

h = { N.new( 42 ) => true }
h[ N.new( 42 ) ] = 42
p h
GC.start
p count(N)

Cheers
R.

···

On Thu, Jun 10, 2010 at 6:48 PM, Mark Abramov <markizko@gmail.com> wrote:

Robert Dober wrote:

On Thu, Jun 10, 2010 at 5:30 PM, Rein Henrichs <reinh@reinh.com> wrote:
overwriting #hash and #eql? breaks Hash!

That's not true, I think.

The advice to implement #eql? and #hash really only makes sense if equivalence can reasonably be defined for a class and if instances of that class should be used as Hash keys or in Set. If equivalence cannot at least be defined other than via identity (which is the default), then it is perfectly reasonable not to override either method and to go with the default implementation.
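
Both situations can be sketched with Set. Session and Coordinate are hypothetical names used only for illustration: the first keeps the identity-based defaults, while the second relies on Struct, which generates value-based #==, #eql? and #hash from its members.

```ruby
require "set"

# Default behaviour: Object#eql? and Object#hash are identity-based,
# so two distinct instances are two distinct set members.
class Session; end

sessions = Set.new([Session.new, Session.new])
puts sessions.size   # 2

# Struct derives #==, #eql? and #hash from the member values,
# so equal coordinates collapse into a single set member.
Coordinate = Struct.new(:lat, :lon)

coords = Set.new([Coordinate.new(1, 2), Coordinate.new(1, 2)])
puts coords.size     # 1
```

When a class has no meaningful notion of equivalence beyond "the same object", the first behaviour is usually what you want and nothing needs to be overridden.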

Kind regards

  robert

···

On 10.06.2010 18:27, Robert Dober wrote:

On Thu, Jun 10, 2010 at 6:10 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

http://blog.rubybestpractices.com/posts/rklemme/018-Complete_Class.html
http://blog.rubybestpractices.com/posts/rklemme/019-Complete_Numeric_Class.html

You define #eql? and #hash for your convenience. So good, so bad. My
question simply was: Show me why *not* redefining #hash and #eql? will
cause problems, because that was Wilson's statement. I am still
waiting :(.

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

#hash makes sense for Hash# and etc. #eql? makes more sense for Array#&. I too find it odd that both are necessary.

···

On 2010-06-10 07:20:03 -0700, Mark Abramov said:

Mark Abramov wrote:

[tl;dr]

Sorry, guys, didn't notice how I used eql instead of eql?
Btw, without #hash it won't work anyways which I consider *weird* at the
very least.

--
Rein Henrichs

http://reinh.com

This breaks Hash? Quite the opposite!

This is precisely what is meant by "defining the semantics" of a class for use by hashes and the very behavior you want when you define #eql? and #hash in the first place!

You wouldn't say that defining #<=> breaks Array#sort, so why would you say that this "breaks Hash"? This doesn't break Hash. If anything, it fixes it when using N objects as keys!
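
A sketch of what the N example does to the Hash, rewritten with explicit variables so the stored key can be inspected. The variable names first and second are additions for illustration; the class body follows the code quoted in this thread.

```ruby
require "forwardable"

class N
  extend Forwardable
  attr_reader :n
  # N#hash delegates to the wrapped value's #hash.
  def_delegators :n, :hash

  def initialize(n)
    @n = n
  end

  def eql?(other)
    n == other.n
  end
end

first  = N.new(42)
second = N.new(42)

h = { first => true }
h[second] = 42   # second is eql? to first, so the existing entry is updated

puts h.size                       # 1
puts h[N.new(42)]                 # 42
puts h.keys.first.equal?(first)   # true
```

The Hash ends up with a single entry: the value is overwritten, and the key object that was stored first is the one that stays in the Hash.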

···

On 2010-06-10 22:52:14 -0700, Robert Dober said:

On Thu, Jun 10, 2010 at 6:48 PM, Mark Abramov <markizko@gmail.com> wrote:

Robert Dober wrote:

On Thu, Jun 10, 2010 at 5:30 PM, Rein Henrichs <reinh@reinh.com> wrote:
overwriting #hash and #eql? breaks Hash!

That's not true, I think.

Judge for yourself

require "forwardable"

def count klass
  ObjectSpace.each_object( klass ).to_a.size
end
class N
  extend Forwardable
  attr_reader :n
  def_delegators :n, :hash
  def eql? otha
    n == otha.n
  end
  private
  def initialize n
    @n = n
  end
end # class N

h = { N.new( 42 ) => true }
h[ N.new( 42 ) ] = 42
p h
GC.start
p count(N)

Cheers
R.

--
Rein Henrichs

http://reinh.com

http://blog.rubybestpractices.com/posts/rklemme/018-Complete_Class.html

http://blog.rubybestpractices.com/posts/rklemme/019-Complete_Numeric_Class.html

You define #eql? and #hash for your convenience. So good, so bad. My
question simply was: Show me why *not* redefining #hash and #eql? will
cause problems, because that was Wilson's statement. I am still
waiting :(.

The advice to implement #eql? and #hash really only makes sense if
equivalence can reasonably be defined for a class and if instances of that
class should be used as Hash keys or in Set. If equivalence cannot at least
be defined other than via identity (which is the default), then it is
perfectly reasonable not to override either method and to go with the default
implementation.

But that was *exactly* my point.

OP wanted to use Array#&, and Array#&, for a reason not too clear to
me, uses Object#eql? instead of Object#==. I did discourage the
overloading of Object#eql? and Object#hash for *that purpose*.

If you want to change Hash then it is the right thing to do.
Now I might strongly disagree about whether one should do that, but that is
rather OT and I would never have made such strong statements about
that issue.
However, the technique you suggest is not to be put into non-expert
hands, as I tried to show with the memory leaking code above.

Cheers
Robert

···

On Fri, Jun 11, 2010 at 6:47 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

On 10.06.2010 18:27, Robert Dober wrote:

On Thu, Jun 10, 2010 at 6:10 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

Kind regards

   robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

--
The best way to predict the future is to invent it.
-- Alan Kay

Rein Henrichs wrote:

···

On 2010-06-10 07:20:03 -0700, Mark Abramov said:

Mark Abramov wrote:

[tl;dr]

Sorry, guys, didn't notice how I used eql instead of eql?
Btw, without #hash it won't work anyways which I consider *weird* at the
very least.

#hash makes sense for Hash# and etc. #eql? makes more sense for
Array#&. I too find it odd that both are necessary.

If two objects are eql?, their hash methods must also return
the same value. More details are in the book The Ruby Programming Language.

Thus, when you redefine eql?, the hash method should be redefined as well.
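
The same contract can be seen in Ruby's core classes, without defining anything yourself. A short sketch using only built-in Integer, Float, String and Hash behaviour:

```ruby
# == converts across numeric types; eql? does not.
puts(1 == 1.0)      # true
puts(1.eql?(1.0))   # false

# Equal-content strings are eql? and therefore share a hash value.
a = "key"
b = "key".dup
puts a.eql?(b)          # true
puts(a.hash == b.hash)  # true

# Hash lookup follows eql?/hash, not ==.
table = { 1 => :integer }
puts table[1.0].inspect   # nil
puts table[1].inspect     # :integer
```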
--
Posted via http://www.ruby-forum.com/.

Rein Henrichs:

#hash makes sense for Hash# and etc. #eql? makes more
sense for Array#&. I too find it odd that both are necessary.

Both are necessary because #eql? says whether two objects are surely
the same, while #hash says whether they’re surely different – which,
perhaps counterintuitively, is not the same problem.

The difference is that in many, many cases it’s much faster to check
whether two objects are surely different (via a fast #hash function)
than whether they’re surely the same (#eql? can be quite slow).

The main difference between #eql? and #hash is that #hash can return the
same value for objects that are not #eql? (but if two objects are #eql?
then #hash must return the same value).

An untested and definitely not optimal
(but hopefully simple) example follows. :)

Imagine that you want to implement a new immutable string class, one
which caches the string length (for performance reasons). Imagine also
that the vast majority of such strings you use are of different lengths,
and that you want to use them as Hash keys.

class ImmutableString

  def initialize string
    @string = string.dup.freeze
    @length = string.length
  end

end

Given the above assumptions, it might make sense for #hash to
return the @length, while #eql? makes the ‘proper’ comparison:

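class ImmutableString

  attr_reader :string
  protected :string

  def hash
    @length
  end

  # Without an explicit #==, the alias below would pick up Object#==,
  # which compares identity only.
  def == other
    other.is_a?(ImmutableString) && string == other.string
  end

  alias eql? ==

end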

This way in the vast majority of cases, when your ImmutableStrings are
considered as Hash keys, the check whether a given key exists will
be very quick; only when two objects #hash to the same value (i.e.,
when they’re not surely different) is #eql? called to tell whether
they’re surely the same.
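
To watch this happen, here is an instrumented sketch of the same idea. CountingString and its eql_calls counter are hypothetical additions for illustration, not part of the example above; the counter simply records how often #eql? gets invoked during Hash lookups.

```ruby
# Instrumented variant of the length-as-hash idea; the class name and
# the call counter are illustrative additions.
class CountingString
  class << self
    attr_accessor :eql_calls
  end
  self.eql_calls = 0

  attr_reader :string

  def initialize(string)
    @string = string.dup.freeze
    @length = string.length
  end

  # Cheap check: different lengths mean surely different strings.
  def hash
    @length
  end

  # Expensive check: only needed when the hash values collide.
  def eql?(other)
    self.class.eql_calls += 1
    other.is_a?(CountingString) && string == other.string
  end
end

table = { CountingString.new("abc") => :stored }

table[CountingString.new("hello")]   # length 5 vs 3: hash values differ
calls_after_miss = CountingString.eql_calls

table[CountingString.new("xyz")]     # length 3 vs 3: hash values collide
calls_after_collision = CountingString.eql_calls

puts calls_after_miss
puts calls_after_collision > calls_after_miss
```

The first lookup should not need #eql? at all, because the differing #hash values already rule out a match; only the second lookup, where the lengths collide, has to fall back to the full comparison.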

— Shot

···

--
60.times{|a|puts((0..240).map{|b|x=y=i=0;until(x*x+y*y>4||i==99);
x,y,i=x*x-y*y+b/120.0-1.5,2*x*y+a/30.0-1,i+1;end;i==99?'#':'.'}*'');}
                                                        [David Brady]