[?] problem using set with hash objects

Your problem is in hash comparison. Set internally uses a Hash as its
implementation (the set's values are keys in that hash). So in order to
get uniqueness of the values, you need to define proper Hash#hash and
Hash#eql? (IIRC). The default ones compare object ids, i.e. two totally
equivalent hashes are considered different.
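
To illustrate the point about uniqueness, here is a minimal sketch of how Set treats elements with and without value-based `hash` and `eql?`. The classes `PlainPoint` and `ValuePoint` are made up for this example and are not from the thread; the sketch uses a custom class rather than Hash itself so that it behaves the same regardless of how a given Ruby version implements Hash#hash.

```ruby
require 'set'

# Hypothetical classes for illustration only; they do not appear in the thread.
class PlainPoint
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

# Same data, but with value-based eql? and hash, which Set relies on.
class ValuePoint < PlainPoint
  def eql?(other)
    other.class == self.class && x == other.x && y == other.y
  end
  alias_method :==, :eql?

  def hash
    [x, y].hash
  end
end

plain_set = Set.new([PlainPoint.new(1, 2), PlainPoint.new(1, 2)])
value_set = Set.new([ValuePoint.new(1, 2), ValuePoint.new(1, 2)])

puts plain_set.size  # 2: default hash/eql? are identity-based
puts value_set.size  # 1: equal values collapse into one element
```

Both methods have to agree: objects that are `eql?` must also return the same `hash`, otherwise the lookup never gets as far as comparing them.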

To make things even worse, hash calls dup when creating a new key to
avoid someone else changing the object. So even if you insert the same
object more times, it will be added each time (resp. its copy).

For more information search the archives for something like "hash key
dup". There were recent (as in last two months) threads that discussed
this.

···

On 9/17/07, Stephen Bannasch <stephen.bannasch@deanbrook.org> wrote:

I'm using set in the Ruby standard library to produce collections of
unique objects from enumerable objects with duplicates, but it
doesn't appear to work with hash objects.

$ ruby --version
ruby 1.8.5 (2006-12-25 patchlevel 12) [i686-darwin8.9.1]
$ irb
irb(main):001:0> require 'set'
=> true
irb(main):002:0> a = [1,1,2,3]
=> [1, 1, 2, 3]
irb(main):003:0> b = [{:a1 => "123"}, {:a1 => "123"}, {:b1 => "123"}]
=> [{:a1=>"123"}, {:a1=>"123"}, {:b1=>"123"}]
irb(main):004:0> seta = a.to_set
=> #<Set: {1, 2, 3}>
irb(main):005:0> setb = b.to_set
=> #<Set: {{:a1=>"123"}, {:a1=>"123"}, {:b1=>"123"}}>
irb(main):006:0> b[0] == b[1]
=> true

Am I doing something wrong?

Your problem is in hash comparison. Set internally uses a Hash as its
implementation (the set's values are keys in that hash). So in order to
get uniqueness of the values, you need to define proper Hash#hash and
Hash#eql? (IIRC). The default ones compare object ids, i.e. two totally
equivalent hashes are considered different.

To make things even worse, hash calls dup when creating a new key to
avoid someone else changing the object. So even if you insert the same
object more times, it will be added each time (resp. its copy).

This is only true for Strings - and even then only if the key is not frozen.
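
A small sketch of that String-only behaviour, in case it helps. The variable names are made up for the example, and `String.new` is used only to make sure the strings start out unfrozen whatever the frozen-string-literal setting is.

```ruby
h = {}

# An unfrozen String key is copied (and the copy frozen) on insertion.
key = String.new("abc")
h[key] = 1
stored = h.keys.first
puts stored.equal?(key)  # false: the hash holds its own copy
puts stored.frozen?      # true

# Mutating the original afterwards does not disturb the stored key.
key << "d"
puts h["abc"]            # 1

# A frozen String key is stored as-is, without a copy.
frozen_key = String.new("xyz").freeze
h[frozen_key] = 2
puts h.keys.any? { |k| k.equal?(frozen_key) }  # true

# Non-String keys are not copied either.
arr = [1, 2]
h[arr] = 3
puts h.keys.any? { |k| k.equal?(arr) }         # true

# Inserting the same object again overwrites; it does not add an entry.
h[arr] = 4
puts h.size                                    # 3
```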

For more information search the archives for something like "hash key
dup". There were recent (as in last two months) threads that discussed
this.

Yeah, and "Hash as Hash key" is probably another helpful search phrase. This topic comes up from time to time.

Kind regards

  robert

···

On 17.09.2007 22:20, Jano Svitok wrote:

On 9/17/07, Stephen Bannasch <stephen.bannasch@deanbrook.org> wrote:

Robert, Stefano, and Jano, thanks for the pointers.

I'm now using ara.t.howard's (ara.t.howard@gmail.com) arrayfields gem to get hash-like access to my data structures stored in arrays. This combines well with set. Here's an example:

$ sudo gem install arrayfields

$ cat a.rb
require 'set'
require 'arrayfields'
abc = Array.struct :a, :b, :c
a = abc.new [1,2,3] # => [1, 2, 3]
b = abc.new [1,2,3] # => [1, 2, 3]
c = abc.new [4,5,6] # => [4, 5, 6]
p a[:a] # => 1
p c[:a] # => 4
p a1 = [a,b,c] # => [[1, 2, 3], [1, 2, 3], [4, 5, 6]]
p b1 = a1.to_set.to_a # => [[1, 2, 3], [4, 5, 6]]
p b1[0][:a] # => 1
p b1[1][:c] # => 6

$ ruby a.rb
1
4
[[1, 2, 3], [1, 2, 3], [4, 5, 6]]
[[1, 2, 3], [4, 5, 6]]
1
6
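
My understanding of why this works, sketched with plain arrays and no arrayfields involved: Array defines `hash` and `eql?` by content, so two distinct array objects with the same elements count as one set member. I'm assuming the arrayfields objects keep that Array behaviour, which the output above suggests.

```ruby
require 'set'

a = [1, 2, 3]
b = [1, 2, 3]
c = [4, 5, 6]

puts a.equal?(b)       # false: two different objects
puts a.hash == b.hash  # true: Array#hash is computed from the elements
puts a.eql?(b)         # true: Array#eql? compares element by element

unique = [a, b, c].to_set.to_a
p unique               # [[1, 2, 3], [4, 5, 6]]
```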