Hash method how-to?

Are there any good tutorials out there for writing proper #hash
methods?

in most cases it's the easiest to delegate the hash method to something
else. Example:

class X
  def initialize(name)
    @name = name
  end

  def hash
    @name.hash
  end
end

h = { X.new("foo") => "bar" }
h[X.new("foo")] = "baz" # will overwrite the above

···

On Thu, 2010-12-02 at 03:40 +0900, Intransition wrote:

Are there any good tutorials out there for writing proper #hash
methods?

And so will `h['foo'] = 'baz'`.

···

On Dec 1, 1:10 pm, niklas | brueckenschlaeger <nik...@brueckenschlaeger.de> wrote:

Hash function - Wikipedia

in most cases it's the easiest to delegate the hash method to something
else. Example:

class X
def initialize(name)
@name = name
end

def hash
@name.hash
end
end

h = { X.new("foo") => "bar" }
h[X.new("foo")] = "baz" # will overwrite the above

--
-yossef

> Hash function - Wikipedia
>
> in most cases it's the easiest to delegate the hash method to something
> else. Example:
>
> class X
> def initialize(name)
> @name = name
> end
>
> def hash
> @name.hash
> end
> end
>
> h = { X.new("foo") => "bar" }
> h[X.new("foo")] = "baz" # will overwrite the above

And so will `h['foo'] = 'baz'`.

sure, that is a nasty side effect, so probably not the nicest example :slight_smile:

···

On Thu, 2010-12-02 at 04:27 +0900, Yossef Mendelssohn wrote:

On Dec 1, 1:10 pm, niklas | brueckenschlaeger > <nik...@brueckenschlaeger.de> wrote:

--
-yossef

No neither will:

class X
  attr_reader :name
  def initialize(name)
    @name = name
  end

  def hash
    @name.hash
  end

  def inspect
    "X:#{@name.inspect}"
  end
end

h = { X.new("foo") => :bar}
h[X.new("foo")] = :baz
h["foo"] = :bat

h # => {"foo"=>:bat, X:"foo"=>:baz, X:"foo"=>:bar}

The reason is that the value of #hash is not the only thing needed to
distinguish hash keys. All objects with the same hash value are mapped
to the same hash 'bucket' hash collisions are resolved by using the
eql? method,

X.new("a").eql?(X.new("a")) # => false

class X
  def eql?(other)
    other.class == X && # One of the rare occurrences when checking
the class is a good thing
    name.eql?(other.name)
  end
end

h = { X.new("foo") => :bar}
h[X.new("foo")] = :baz
h["foo"] = :bat

h # => {"foo"=>:bat, X:"foo"=>:baz}

X.new("a").eql?(X.new("a")) # => true

Note that the Hash class depends on a relationship between eql? and hash, that

a.eql?(b) => a.hash == b.hash

where => here is logical implication, that is

a => b is the same as

b ||| !a

···

On Wed, Dec 1, 2010 at 2:27 PM, Yossef Mendelssohn <ymendel@pobox.com> wrote:

On Dec 1, 1:10 pm, niklas | brueckenschlaeger > <nik...@brueckenschlaeger.de> wrote:

Hash function - Wikipedia

in most cases it's the easiest to delegate the hash method to something
else. Example:

class X
def initialize(name)
@name = name
end

def hash
@name.hash
end
end

h = { X.new("foo") => "bar" }
h[X.new("foo")] = "baz" # will overwrite the above

And so will `h['foo'] = 'baz'`.

--
Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Github: http://github.com/rubyredrick
Twitter: @RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale

Good explanation. My response was a knee-jerk reaction without
checking if it actually worked, and it definitely makes sense for the
language to care about more than simply the results of #hash.

Although I still think what I was trying to point out is something to
consider, that simply delegating #hash to an internal element isn't
all that good because on some level it makes them the same.

···

On Dec 1, 6:11 pm, Rick DeNatale <rick.denat...@gmail.com> wrote:

The reason is that the value of #hash is not the only thing needed to
distinguish hash keys. All objects with the same hash value are mapped
to the same hash 'bucket' hash collisions are resolved by using the
eql? method,

--
-yossef

Ultimately all hash values are derived from member hash values. The key point - and I believe this is what you want to stress - is _how_ values are derived. The methodology applied should strive to yield different hash values for things that are considered non equivalent. For example, hash value of an Array or list implementation should somehow consider position of elements in order to not return the same hash for two collections which only differ in ordering. We can see that Array#hash obeys this:

irb(main):001:0> [1,2].hash
=> 11
irb(main):002:0> [2,1].hash
=> 1

On the other hand, since Set conceptually is agnostic of position different ordering of elements should not create different hash codes:

irb(main):001:0> require 'set'
=> true
irb(main):002:0> s1 = [1,2].to_set
=> #<Set: {1, 2}>
irb(main):003:0> s2 = [2,1].to_set
=> #<Set: {2, 1}>
irb(main):004:0> s1.hash
=> 14
irb(main):005:0> s2.hash
=> 14
irb(main):006:0> s1 == s2
=> true

Kind regards

  robert

···

On 12/02/2010 10:46 PM, Yossef Mendelssohn wrote:

Although I still think what I was trying to point out is something to
consider, that simply delegating #hash to an internal element isn't
all that good because on some level it makes them the same.

Note that Set, like hash depends on that implication relationship
between #hash and #eql?

class X
attr_reader :name
def initialize(name)
   @name = name
end

def hash
   @name.hash
end

def inspect
   "X:#{@name.inspect}"
end
end

require 'set'

[X.new("foo"), X.new("foo")].to_set # => #<Set: {X:"foo", X:"foo"}>

class X
def eql?(other)
   other.class == X &&
   name.eql?(other.name )
end
end

[X.new("foo"), X.new("foo")].to_set # => #<Set: {X:"foo"}>

···

On Thu, Dec 2, 2010 at 5:35 PM, Robert Klemme <shortcutter@googlemail.com> wrote:

On 12/02/2010 10:46 PM, Yossef Mendelssohn wrote:

Although I still think what I was trying to point out is something to
consider, that simply delegating #hash to an internal element isn't
all that good because on some level it makes them the same.

Ultimately all hash values are derived from member hash values. The key
point - and I believe this is what you want to stress - is _how_ values are
derived. The methodology applied should strive to yield different hash
values for things that are considered non equivalent. For example, hash
value of an Array or list implementation should somehow consider position of
elements in order to not return the same hash for two collections which only
differ in ordering. We can see that Array#hash obeys this:

irb(main):001:0> [1,2].hash
=> 11
irb(main):002:0> [2,1].hash
=> 1

On the other hand, since Set conceptually is agnostic of position different
ordering of elements should not create different hash codes:

irb(main):001:0> require 'set'
=> true
irb(main):002:0> s1 = [1,2].to_set
=> #<Set: {1, 2}>
irb(main):003:0> s2 = [2,1].to_set
=> #<Set: {2, 1}>
irb(main):004:0> s1.hash
=> 14
irb(main):005:0> s2.hash
=> 14
irb(main):006:0> s1 == s2
=> true

--
Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Github: rubyredrick (Rick DeNatale) · GitHub
Twitter: @RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale

This is a different story: I was talking about how Set calculates its hash code from members. You are talking about how Set determines member equivalence and especially what role the implementation of #hash of the _members_ plays in this.

Kind regards

  robert

···

On 12/03/2010 06:28 PM, Rick DeNatale wrote:

On Thu, Dec 2, 2010 at 5:35 PM, Robert Klemme > <shortcutter@googlemail.com> wrote:

On 12/02/2010 10:46 PM, Yossef Mendelssohn wrote:

Although I still think what I was trying to point out is something to
consider, that simply delegating #hash to an internal element isn't
all that good because on some level it makes them the same.

Ultimately all hash values are derived from member hash values. The key
point - and I believe this is what you want to stress - is _how_ values are
derived. The methodology applied should strive to yield different hash
values for things that are considered non equivalent. For example, hash
value of an Array or list implementation should somehow consider position of
elements in order to not return the same hash for two collections which only
differ in ordering. We can see that Array#hash obeys this:

irb(main):001:0> [1,2].hash
=> 11
irb(main):002:0> [2,1].hash
=> 1

On the other hand, since Set conceptually is agnostic of position different
ordering of elements should not create different hash codes:

irb(main):001:0> require 'set'
=> true
irb(main):002:0> s1 = [1,2].to_set
=> #<Set: {1, 2}>
irb(main):003:0> s2 = [2,1].to_set
=> #<Set: {2, 1}>
irb(main):004:0> s1.hash
=> 14
irb(main):005:0> s2.hash
=> 14
irb(main):006:0> s1 == s2
=> true

Note that Set, like hash depends on that implication relationship
between #hash and #eql?