What’s the standard way of implementing #hash for value objects in Ruby?

But still, I don't see the need. Note also that proper Hash keys
usually should be immutable, because changing them causes all sorts of
trouble if not done carefully.

Hence the use of “value object” in my question.

I can't see that "value object" implies "immutable".

http://c2.com/cgi/wiki?ValueObject
Value object - Wikipedia

Continue reading:

http://c2.com/cgi/wiki?ValueObjectsShouldBeImmutable

Second, let me rephrase my
question and add some additional context and examples:

What algorithm should one employ in the calculation of the hash value
of an arbitrary value object?

There is no single standard (or best) way. The fact that different
languages (Java, Ruby...) have different means to calculate combined
hash values which all seem to work pretty well indicates this IMHO.

Really? Everyone seems to use the XOR method (with good cause).

As I’ve already pointed out, internally, Ruby does something
completely different.

I would claim that the algorithm should take the class of the object
into account as well, both for consistency with #== (which should
check equality of the classes of the objects being compared) and for
added entropy.
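A minimal sketch of what "taking the class into account" could look like. The Point and Vector classes are hypothetical, and mixing the class in through Array#hash is just one convenient way to do it, not necessarily what anyone in this thread had in mind:

```ruby
# Hypothetical value object: the class takes part in both equality and #hash.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
    freeze
  end

  # Loose equality: same class, fields compared with ==.
  def ==(other)
    other.class == self.class && other.x == x && other.y == y
  end

  # Strict equality used by Hash: same class, fields compared with eql?.
  def eql?(other)
    other.class == self.class && [x, y].eql?([other.x, other.y])
  end

  # The class is mixed into the hash value alongside the fields.
  def hash
    [self.class, x, y].hash
  end
end

# Hypothetical second class with the same fields.
class Vector < Point; end

point  = Point.new(1, 2)
twin   = Point.new(1, 2)
vector = Vector.new(1, 2)

same_class_equal = point.eql?(twin) && point.hash == twin.hash
cross_class_equal = point == vector   # false, the classes differ
```

With this shape, objects that are not eql? because of their class are also very unlikely to share a hash value, which keeps #hash in line with #==.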

You pay a price for additional calculation though.

Since the fields are immutable, the result of the calculation can be
cached, so that’s not a valid reason to exclude it.

Internally, Ruby (primarily) uses three C functions for the
calculation of combined hash values, namely rb_hash_start,
rb_hash_uint, and rb_hash_end. As an example, the hash value of a
Struct is calculated (in Ruby with these three functions wrapped in an
imaginary module C) as

    class Struct
      def hash
        C.rb_hash_end(reduce(C.rb_hash_start(self.class.hash)) { |h, v|
          C.rb_hash_uint(h, v.hash)
        })
      end
    end

Might it be useful to have Ruby expose a way to perform this
calculation from the Ruby realm so that other classes may employ this
algorithm?

Not sure whether we would really gain that much. Those calls are
efficient in C but if you provide that mechanism in Ruby land you will
have multiple calls, e.g.

    def hash
      h = Fixnum::HASH_START
      h = h.combine_hash(@a)
      h = h.combine_hash(@b)
      h = h.combine_hash(@c)
    end

I don’t understand what you’re getting at with this example. It
doesn’t seem to add anything to the discussion. My example code,
which shows how Ruby does it internally for Struct, makes multiple
calls. Since these methods would be simple wrappers of the C
functions, the hash calculation would (almost) be as fast as it would
be for Struct.

···

On Sat, Dec 31, 2011 at 13:58, Robert Klemme <shortcutter@googlemail.com> wrote:

On Fri, Dec 30, 2011 at 3:43 PM, Nikolai Weibull <now@bitwi.se> wrote:

>> It would make implementing #hash, which you should/must do
>> if you implement #==, trivial, as there’s then only one way to do so.

> Delegating to the most relevant attribute still seems more trivial.

Trivial – perhaps. Wrong – most definitely. That you don’t seem to
understand this is what tells me that you don’t understand how #hash
should be implemented.

Feel free to explain why it is wrong.

> I suppose one reason I take this view could be that the only viable
> scenarios I can think of for making some arbitrary object into a hash key
> are for sets and Array#uniq. For me, these scenarios are exceedingly rare,
> and have always been trivially replaced with alternative keys.

How about having them as keys in an actual Hash (which is how Sets and
Array#uniq are currently implemented)?

Which is why I mentioned it.
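For readers following along, here is a small sketch of why #== alone is not enough for Array#uniq, Set, or Hash keys. Both classes (LooseTag and Tag) are made up for illustration:

```ruby
require "set"

# Hypothetical class that defines only #==.
class LooseTag
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def ==(other)
    other.is_a?(LooseTag) && other.name == name
  end
end

# Hypothetical class that also defines #eql? and #hash.
class Tag
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def ==(other)
    other.class == self.class && other.name == name
  end

  def eql?(other)
    other.class == self.class && other.name.eql?(name)
  end

  def hash
    [self.class, name].hash
  end
end

loose  = [LooseTag.new("ruby"), LooseTag.new("ruby")]
strict = [Tag.new("ruby"), Tag.new("ruby")]

loose_uniq_size  = loose.uniq.size        # both kept: default #hash/#eql? are identity-based
strict_uniq_size = strict.uniq.size       # duplicates collapse
strict_set_size  = Set.new(strict).size   # Set relies on the same #hash/#eql? pair
```

The LooseTag objects compare equal with ==, yet uniq keeps both, because the hash-based lookup never consults ==.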

···

On Fri, Dec 30, 2011 at 4:09 AM, Nikolai Weibull <now@bitwi.se> wrote:

On Fri, Dec 30, 2011 at 10:44, Josh Cheek <josh.cheek@gmail.com> wrote:
> On Fri, Dec 30, 2011 at 3:09 AM, Nikolai Weibull <now@bitwi.se> wrote:

But still, I don't see the need. Note also that proper Hash keys
usually should be immutable, because changing them causes all sorts of
trouble if not done carefully.

Hence the use of “value object” in my question.

I can't see that "value object" implies "immutable".

http://c2.com/cgi/wiki?ValueObject
Value object - Wikipedia

Continue reading:

http://c2.com/cgi/wiki?ValueObjectsShouldBeImmutable

Ah, OK. Still it's not a "must".
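To make the "trouble" with mutable keys concrete, here is a short sketch using a plain Array as the key. The variable names are illustrative only:

```ruby
# A mutable Array used as a Hash key.
key = [1, 2]
table = { key => :found }

found_before = table[key]           # looked up under the hash value of [1, 2]

key << 3                            # mutating the key changes key.hash
found_after_mutation = table[key]   # typically nil: the entry is still filed under the old hash value

table.rehash                        # re-index every entry under its current hash value
found_after_rehash = table[key]     # the entry is reachable again
```

Freezing the key (or using an immutable value object) avoids the need for Hash#rehash altogether.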

Second, let me rephrase my
question and add some additional context and examples:

What algorithm should one employ in the calculation of the hash value
of an arbitrary value object?

There is no single standard (or best) way. The fact that different
languages (Java, Ruby...) have different means to calculate combined
hash values which all seem to work pretty well indicates this IMHO.

Really? Everyone seems to use the XOR method (with good cause).

No, not the simple "XOR all hash codes" method. Consider
java.util.AbstractList<E>

    public int hashCode() {
        int hashCode = 1;
        Iterator<E> i = iterator();
        while (i.hasNext()) {
            E obj = i.next();
            hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode());
        }
        return hashCode;
    }
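As a rough comparison, the two schemes can be sketched in Ruby. The method names and the 62-bit mask are assumptions made for this illustration; the mask only keeps the value small and is not part of the Java code above:

```ruby
# "XOR all hash codes": order-insensitive, and equal elements cancel out.
def xor_hash(values)
  values.reduce(0) { |h, v| h ^ v.hash }
end

# Port of the AbstractList scheme: h = 31 * h + element_hash.
# The mask is an arbitrary choice to keep the result within 62 bits.
def polynomial_hash(values)
  mask = (1 << 62) - 1
  values.reduce(1) { |h, v| (31 * h + v.hash) & mask }
end

xor_ignores_order = xor_hash([1, 2]) == xor_hash([2, 1])   # true for any inputs
xor_cancels       = xor_hash([7, 7]).zero?                 # true: x ^ x == 0
poly_sees_order   = polynomial_hash([1, 2]) != polynomial_hash([2, 1])
```

The XOR variant gives [1, 2] and [2, 1] the same value by construction, while the multiply-and-add variant distinguishes them unless the element hashes happen to collide.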

As I’ve already pointed out, internally, Ruby does something
completely different.

It's bit manipulation as well, as far as I can see, but more complex
than the simple "XOR all" or the Java version.

I would claim that the algorithm should take the class of the object
into account as well, both for consistency with #== (which should
check equality of the classes of the objects being compared) and for
added entropy.

You pay a price for additional calculation though.

Since the fields are immutable, the result of the calculation can be
cached, so that’s not a valid reason to exclude it.

You can only cache it if the object is frozen. And the question still
remains to be answered whether there is a significant gain by having
the class's hash included.
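A sketch of the caching idea under the "frozen" condition. The Money class is hypothetical; the point is only that the hash value is computed once, before the object freezes itself:

```ruby
# Hypothetical immutable value object that caches its hash value.
class Money
  attr_reader :cents, :currency, :hash

  def initialize(cents, currency)
    @cents = cents
    @currency = currency.dup.freeze                # private frozen copy of the String
    @hash = [self.class, @cents, @currency].hash   # computed once, includes the class
    freeze                                         # no field can change afterwards
  end

  def eql?(other)
    other.class == self.class &&
      other.cents.eql?(cents) &&
      other.currency.eql?(currency)
  end
  alias_method :==, :eql?
end

a = Money.new(100, "EUR")
b = Money.new(100, "EUR")

cached_hashes_agree = a.hash == b.hash
usable_as_key = { a => :price }[b]
```

Because the object and its fields are frozen, the cached value cannot go stale, so the extra cost of mixing in the class is paid once per object.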

Internally, Ruby (primarily) uses three C functions for the
calculation of combined hash values, namely rb_hash_start,
rb_hash_uint, and rb_hash_end. As an example, the hash value of a
Struct is calculated (in Ruby with these three functions wrapped in an
imaginary module C) as

    class Struct
      def hash
        C.rb_hash_end(reduce(C.rb_hash_start(self.class.hash)) { |h, v|
          C.rb_hash_uint(h, v.hash)
        })
      end
    end
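The observable effect of this can be checked from Ruby without touching the C functions. The two Struct classes below are hypothetical and exist only for the comparison:

```ruby
# Two distinct Struct classes with identical members.
PointA = Struct.new(:x, :y)
PointB = Struct.new(:x, :y)

a1 = PointA.new(1, 2)
a2 = PointA.new(1, 2)
b  = PointB.new(1, 2)

same_class_same_hash = a1.hash == a2.hash && a1.eql?(a2)
cross_class_equal    = a1 == b            # false: Struct#== requires the same class

# If the class is mixed into the hash as described above, a1.hash and b.hash
# should differ in practice, although equal hash values are never forbidden.
lookup = { a1 => :hit }[a2]
```

So a value object built on Struct already gets a #hash that agrees with its #== and #eql?.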

Might it be useful to have Ruby expose a way to perform this
calculation from the Ruby realm so that other classes may employ this
algorithm?

Not sure whether we would really gain that much. Those calls are
efficient in C but if you provide that mechanism in Ruby land you will
have multiple calls, e.g.

    def hash
      h = Fixnum::HASH_START
      h = h.combine_hash(@a)
      h = h.combine_hash(@b)
      h = h.combine_hash(@c)
    end

I don’t understand what you’re getting at with this example. It
doesn’t seem to add anything to the discussion.

It demonstrates that whatever would be exposed to Ruby land would need
multiple method calls to avoid object creation overhead. If you are
willing to pay that price you can just use [@a,@b,@c].hash.

My example code,
which shows how Ruby does it internally for Struct, makes multiple
calls. Since these methods would be simple wrappers of the C
functions, the hash calculation would (almost) be as fast as it would
be for Struct.

You still have some overhead. And then it might be more efficient to
use Array's implementation. It's certainly simpler.

I still cannot see why there should be a "standard way of implementing
#hash for value objects". We have Struct's way and Array's way and we
can use other approaches like XOR all. Why would a standard make
things better?

Cheers

robert

···

On Sat, Dec 31, 2011 at 3:16 PM, Nikolai Weibull <now@bitwi.se> wrote:

On Sat, Dec 31, 2011 at 13:58, Robert Klemme <shortcutter@googlemail.com> wrote:

On Fri, Dec 30, 2011 at 3:43 PM, Nikolai Weibull <now@bitwi.se> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/