Hash with array as value type

Hello all,

I would like to store arrays in a hash, indexed by a string key. I
would like to have the hash create an empty array as the default value,
when it sees a new key, something like the code involving "hash1" below.
However, this code is giving some strange results -- it claims that the
hash is empty, even though there is an array stored in it, and I can
then retrieve that array.

I am new to ruby, so maybe I am just doing something stupid (... I am
not sure about that "Hash.new( [] )" for example... )

Can anyone explain these results?

Thanks.

CODE:

#!/usr/bin/ruby

hash1 = Hash.new( [] )
hash1["hello"].push(1.0)
hash1["hello"].push(2.0)
$stderr.write "hash1.size() = #{hash1.size()}\n"
$stderr.write "hash1.empty() = #{hash1.empty?()}\n"
$stderr.write "hash1[\"hello\"].size() = #{hash1["hello"].size()}\n"
$stderr.write "hash1[\"hello\"] = #{hash1["hello"]}\n"

hash2 = {"hello" => [1.0,2.0]}
$stderr.write "\nhash2.size() = #{hash2.size()}\n"
$stderr.write "hash2.empty() = #{hash2.empty?()}\n"
$stderr.write "hash2[\"hello\"].size() = #{hash2["hello"].size()}\n"
$stderr.write "hash2[\"hello\"] = #{hash2["hello"]}\n"

OUTPUT:

hash1.size() = 0
hash1.empty() = true
hash1["hello"].size() = 2
hash1["hello"] = 1.02.0

hash2.size() = 1
hash2.empty() = false
hash2["hello"].size() = 2
hash2["hello"] = 1.02.0

VERSION:
ruby 1.8.3 (2005-09-21) [i686-linux]

--
Posted via http://www.ruby-forum.com/.

Michael McGreevy wrote:

> Hello all,
>
> I would like to store arrays in a hash, indexed by a string key. I
> would like to have the hash create an empty array as the default
> value, when it sees a new key, something like the code involving
> "hash1" below. However, this code is giving some strange results --
> it claims that the hash is empty, even though there is an array
> stored in it, and I can then retrieve that array.
>
> I am new to ruby, so maybe I am just doing something stupid (... I am
> not sure about that "Hash.new( [] )" for example... )
>
> Can anyone explain these results?

Yes. You ran into the typical Hash pitfall: the default value is what is
returned when nothing is found for the given key. But looking it up never
changes the hash, and there is just this single instance, shared by every
missing key. Consider this:

h=Hash.new([])

=> {}

h[0]<<1

=> [1]

h[0]

=> [1]

h["foo"]

=> [1]

h["bar"] << 2

=> [1, 2]

h[:x]

=> [1, 2]

h.default

=> [1, 2]

It works for numeric values because there one usually assigns the result:

h=Hash.new(0)

=> {}

h[:foo] += 1

=> 1

h[:bar] += 10

=> 10

h

=> {:bar=>10, :foo=>1}

Note the "+=" contains an assignment and it's equivalent to

h[:foo] = h[:foo] + 1
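To make the difference concrete, here is a small sketch (the method names and keys are made up for illustration; only Hash.new, <<, and += come from the thread). It contrasts += on a numeric default, << on an array default, and += on an array default, which also stores because it contains an assignment.

```ruby
# Hypothetical helper names; this is an illustrative sketch, not code from the thread.

# With a numeric default, += stores a value because it expands to
# counts[key] = counts[key] + n.
def count_with_plus_equals
  counts = Hash.new(0)
  counts[:foo] += 1
  counts[:bar] += 10
  counts
end

# With an array default, << only mutates the shared default object.
# No assignment happens, so nothing is stored in the hash.
def append_with_shovel
  lists = Hash.new([])
  lists[:foo] << 1
  lists
end

# Array#+ builds a new array, and the assignment hidden in += stores it
# under the key, so the shared default stays untouched.
def append_with_plus_equals
  lists = Hash.new([])
  lists[:foo] += [1]
  lists[:bar] += [2]
  lists
end

p count_with_plus_equals     # both keys are stored
p append_with_shovel         # no keys are stored
p append_with_plus_equals    # both keys are stored, each with its own array
```

Note that the += variant allocates a new array on every append, so it is probably only attractive for small lists.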

You on the other hand want the block form because that can do arbitrary
things when a key is not found:

h=Hash.new() {|ha,key| puts "missing #{key}"; ha[key] = []}

=> {}

h[:foo] << "foo"

missing foo
=> ["foo"]

h[:bar] << "foo"

missing bar
=> ["foo"]

h[:foo] << "foo end"

=> ["foo", "foo end"]

h[:foo] << "foo more"

=> ["foo", "foo end", "foo more"]

h

=> {:bar=>["foo"], :foo=>["foo", "foo end", "foo more"]}

Kind regards

    robert

Michael McGreevy wrote:

> Hello all,
>
> I would like to store arrays in a hash, indexed by a string key. I
> would like to have the hash create an empty array as the default value,
> when it sees a new key, something like the code involving "hash1" below.
> However, this code is giving some strange results -- it claims that the
> hash is empty, even though there is an array stored in it, and I can
> then retrieve that array.
>
> I am new to ruby, so maybe I am just doing something stupid (... I am
> not sure about that "Hash.new( [] )" for example... )
>
> Can anyone explain these results?
>
> Thanks.
>
> CODE:
>
> #!/usr/bin/ruby
>
> hash1 = Hash.new( [] )

This makes a new Hash with the default value being the Array reference
you've passed it. This is only executed *once*, so each new key/value
pair points to the *same* Array instance. You want

hash1 = Hash.new { |h,k| Array.new }

That'll make a new Array instance each time it needs a new default value
for a new key.


It is a bit confusing.

Hash.new([]) tucks away the newly created array as the default value:

d = []
puts d.object_id
hash1 = Hash.new(d)
puts hash1.default.object_id

You can see from this that Hash has saved a reference to the default value.
So when you call:

   hash1["hello"].push(1.0)

on the empty hash, a reference to the default value is returned because
the key is not found. The float, 1.0, is pushed onto that default value.
You still haven't entered anything into the Hash itself but you have pushed
a value into the default array:

puts hash1.default

This same default object will be returned each time a lookup fails in the Hash.
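As a rough check of the above, the following sketch (the method name and the extra keys are invented for illustration) uses equal? to test object identity and shows that the hash itself stays empty while the default array fills up.

```ruby
# Illustrative sketch; shared_default_observations is a made-up helper name.
def shared_default_observations
  hash1 = Hash.new([])
  hash1["hello"].push(1.0)
  hash1["world"].push(2.0)
  {
    # equal? compares identity, not contents: both lookups return one object
    :same_object => hash1["hello"].equal?(hash1["world"]),
    # and that object is the default itself
    :is_default  => hash1["anything"].equal?(hash1.default),
    :default     => hash1.default.dup,
    # nothing was ever stored in the hash
    :keys        => hash1.keys,
    :has_hello   => hash1.key?("hello")
  }
end

p shared_default_observations
```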

The solution is to use the block form of the Hash constructor.

   hash1 = Hash.new { |h,k| h[k] = [] }

In this form, every lookup miss causes the block to be called with the hash
and the key as the two arguments. The block allocates a new array and then
stores it back into the hash using the key so that the next lookup finds the
newly allocated array and doesn't call the block.
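One way to convince yourself of this is to count how often the block runs. This is only a sketch: the counter variable, the method name, and the "other" key are assumptions added for the demonstration.

```ruby
# Illustrative sketch; count_block_calls and `calls` are made-up names.
def count_block_calls
  calls = 0
  hash1 = Hash.new do |h, k|
    calls += 1
    h[k] = []                # store the new array so the next lookup is a hit
  end
  hash1["hello"].push(1.0)   # miss: the block runs and stores []
  hash1["hello"].push(2.0)   # hit: the block is not called
  hash1["other"].push(3.0)   # miss on a different key: the block runs again
  [calls, hash1]
end

calls, hash1 = count_block_calls
puts "block calls = #{calls}"
puts "hash1.size = #{hash1.size}"
```

Each key ends up with its own array, which is the behaviour the original question was after.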

Here is your example rewritten with that and adjusted to follow the usual Ruby coding style:

hash1 = Hash.new { |h,k| h[k] = [] }
hash1["hello"].push(1.0)
hash1["hello"].push(2.0)
warn("hash1.size = #{hash1.size}")
warn("hash1.empty? = #{hash1.empty?}")
warn("hash1[\"hello\"].size = #{hash1["hello"].size}")
warn("hash1[\"hello\"] = #{hash1["hello"]}")
warn("")

hash2 = {"hello" => [1.0,2.0]}
warn("hash2.size = #{hash2.size}")
warn("hash2.empty? = #{hash2.empty?}")
warn("hash2[\"hello\"].size = #{hash2["hello"].size}")
warn("hash2[\"hello\"] = #{hash2["hello"]}")

Gary Wright

On Jan 26, 2006, at 6:22 AM, Michael McGreevy wrote:

> I am new to ruby, so maybe I am just doing something stupid (... I am
> not sure about that "Hash.new( [] )" for example... )
> Can anyone explain these results?

Robert Klemme wrote:

> You on the other hand want the block form because that can do arbitrary
> things when a key is not found:
>
> h=Hash.new() {|ha,key| puts "missing #{key}"; ha[key] = []}

Thanks a lot :-). I had actually already tried the block form, but in a
way similar to what Mike Fletcher posted. I now see that simply
returning something from the block does not modify the hash; you have
to explicitly modify the hash within the block.

Thanks to both of you for responding.

Michael.


Mike Fletcher wrote:
[...]

hash1 = Hash.new { |h,k| Array.new }

That'll make a new Array instance each time it needs a new default value
for a new key.

GAH, that should be

hash1 = Hash.new { | h, k | h[k] = Array.new }

MEMO TO SELF: Don't post in the mornings before caffeine has chance to
take effect.
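For anyone comparing the two versions, a minimal side-by-side sketch (the method names are invented for illustration): the first block only returns a fresh array, the second one also stores it under the missing key.

```ruby
# Illustrative sketch; both method names are made up.

# The block returns a new array but never stores it, so the push lands on
# a temporary object that is thrown away afterwards.
def lookup_without_store
  h = Hash.new { |hash, key| Array.new }
  h["hello"] << 1.0
  h
end

# The block stores the new array in the hash before returning it, so the
# push lands on the array that later lookups will find.
def lookup_with_store
  h = Hash.new { |hash, key| hash[key] = Array.new }
  h["hello"] << 1.0
  h
end

p lookup_without_store.size   # nothing was stored
p lookup_with_store.size      # one key was stored
```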


I find it instructive to think of the default as a #key_missing method,
analogous to #method_missing.

martin
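A small sketch of that analogy, with made-up names: the block acts as the "key missing" handler for [] lookups, while fetch and key? do not go through it. Hash#default_proc gives access to the handler itself.

```ruby
# Illustrative sketch; key_missing_demo and `seen` are made-up names.
def key_missing_demo
  seen = []
  h = Hash.new do |hash, key|
    seen << key                      # the "key_missing" handler fires here
    hash[key] = []
  end
  h[:a] << 1                         # miss: the handler runs for :a
  h[:a] << 2                         # hit: the handler does not run
  fetched = h.fetch(:b, :no_entry)   # fetch bypasses the handler
  present = h.key?(:c)               # so does key?
  {
    :seen     => seen,
    :fetched  => fetched,
    :present  => present,
    :hash     => h,
    :has_proc => !h.default_proc.nil?
  }
end

p key_missing_demo
```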

Robert Klemme <bob.news@gmx.net> wrote:

Michael McGreevy wrote:
> Hello all,
>
> I would like to store arrays in a hash, indexed by a string key. I
> would like to have the hash create an empty array as the default
> value, when it sees a new key, something like the code involving
> "hash1" below. However, this code is giving some strange results --
> it claims that the hash is empty, even though there is an array
> stored in it, and I can then retrieve that array.
>
> I am new to ruby, so maybe I am just doing something stupid (... I am
> not sure about that "Hash.new( [] )" for example... )
>
> Can anyone explain these results?

Yes. You ran into the typical Hash pitfall: the default value is the one
returned if something is not found for the given key. But it never
changes the hash and there is just this single instance. Consider this: