Using hashes as keys in hashes

Steven_Arnold1 · 23 November 2005 13:52

I've seen several posts related in some way to the subject of using hashes as keys in hashes, but I haven't seen a clear solution to the issue. My problem is I want to generate a set of unique hashes. Since hash keys are unique, I was hoping to just put the hashes in as keys, something like:

myUniqueHashes[aHash] ||= 0
myUniqueHashes[aHash] += 1

This would not only give me a list of unique hashes, but it would also tell me how many times each one was seen.

Unfortunately, this does not work because in a hash, each different hash that is inserted as a key is considered to be different, even if the contents are the same. When used as keys in a hash, two hashes are considered equal if aHash.id = bHash.id, meaning if they are the very same hash located at the same place in memory.

At the micro level, what I need is a special kind of hash that will consider two hash keys to be equal if aHash == bHash. At a higher level, I need an efficient way to collect a unique set of hashes. An array would work, but for a large set it'd be much slower....and storing the number of accesses would be relatively clunky compared to a hash's interface.

What is the Ruby way to solve this problem?

Thanks,
steve

David_A_Black3 · 23 November 2005 14:06

Hi --

I've seen several posts related in some way to the subject of using
hashes as keys in hashes, but I haven't seen a clear solution to the
issue. My problem is I want to generate a set of unique hashes.
Since hash keys are unique, I was hoping to just put the hashes in
as keys, something like:

myUniqueHashes[aHash] ||= 0
myUniqueHashes[aHash] += 1

This would not only give me a list of unique hashes, but it would
also tell me how many times each one was seen.

Unfortunately, this does not work because in a hash, each different
hash that is inserted as a key is considered to be different, even
if the contents are the same. When used as keys in a hash, two
hashes are considered equal if aHash.id = bHash.id, meaning if they
are the very same hash located at the same place in memory.

At the micro level, what I need is a special kind of hash that will
consider two hash keys to be equal if aHash == bHash. At a higher
level, I need an efficient way to collect a unique set of hashes.
An array would work, but for a large set it'd be much slower....and
storing the number of accesses would be relatively clunky compared
to a hash's interface.

What is the Ruby way to solve this problem?

You could define an appropriate default behavior for the hash of
hashes:

   unique_hashes = Hash.new do |hash,key|
     existing = hash.keys.find {|k| k == key }
     if existing
       hash[existing] += 1
     else
       hash[key] = 1
     end
   end

a = {"a","b"}
b = {"a","b"}

   unique_hashes[a]
   unique_hashes[b]
   unique_hashes[{"some","other"}]

p unique_hashes # {{"some"=>"other"}=>1, {"a"=>"b"}=>2}

David

···

On Wed, 23 Nov 2005, Steven Arnold wrote:

--
David A. Black
dblack@wobblini.net

Steven_Arnold1 · 23 November 2005 15:03

This is a cool idea (and taught me about the block approach to creating hashes, thanks!). The one (seeming) flaw is the hash.keys.find, which would do an O(n) search on the hash.keys array, giving O(n) instead of a hash's normal O(ln(n)) behavior. Am I wrong about that?

steve

···

On Nov 23, 2005, at 9:06 AM, David A. Black wrote:

You could define an appropriate default behavior for the hash of
hashes:

  unique_hashes = Hash.new do |hash,key|
    existing = hash.keys.find {|k| k == key }
    if existing
      hash[existing] += 1
    else
      hash[key] = 1
    end
  end

Mauricio_Fernandez · 23 November 2005 15:25

[O(1)]

Yes.

unique_hashes = Hash.new do |h,k|
  k2 = Struct.new(:h) do
         def eql?(o); h == o.h end
   def hash; h.to_a.hash end
       end.new(k) # hoise the Struct def or use singleton methods for better
                    # performance
  h.has_key?(k2) ? h[k2] += 1 : h[k2] = 1
end

unique_hashes[{1 => 2}] # => 1
[{1 => 2}.object_id, {1 => 2}.object_id] # => [-605360772, -605360782]
unique_hashes[{2 => 3}] # => 1
unique_hashes[{1 => 2}] # => 2
unique_hashes[{1 => 2}] # => 3
RUBY_VERSION # => "1.8.3"

You can also redefine Hash#{eql?,hash} globally or using singleton
methods, of course.

···

On Thu, Nov 24, 2005 at 12:03:48AM +0900, Steven D. Arnold wrote:

On Nov 23, 2005, at 9:06 AM, David A. Black wrote:
>You could define an appropriate default behavior for the hash of
>hashes:
>
> unique_hashes = Hash.new do |hash,key|
> existing = hash.keys.find {|k| k == key }
> if existing
> hash[existing] += 1
> else
> hash[key] = 1
> end
> end

This is a cool idea (and taught me about the block approach to
creating hashes, thanks!). The one (seeming) flaw is the
hash.keys.find, which would do an O(n) search on the hash.keys array,
giving O(n) instead of a hash's normal O(ln(n)) behavior.

--
Mauricio Fernandez

Topic		Replies	Views
[?] problem using set with hash objects ruby-talk	2	112	18 September 2007
Odd hash behaviour ruby-talk	7	86	4 October 2005
Creating hashes with duplicate keys ruby-talk	0	133	28 July 2006
[?] problem using set with hash objects ruby-talk	1	120	17 September 2007
Hash with two identical keys? ruby-talk	14	157	26 December 2006

Using hashes as keys in hashes

Related topics