Surprise with Hash#new

Today I spent 15 mins baffled by very weird behavior with a Hash, and I
just figured out where I went wrong. (I’m using ruby 1.8.1.)

So, I wanted to create a dictionary of stacks, which in Ruby comes out
as a Hash of Arrays. I wanted to be able to write code like this:

@seen[key].push(word)
val = @seen[key].pop

which, in English, would read something like “push word into the stack
@seen[key]”, and “pop a value from @seen[key] and assign it to val”. I
also wanted the following behavior when the hash is indexed with a key
it doesn’t have:

@seen['foo'] # expected: => []
@seen['foo'].push('bar') # expected: @seen <- {"foo"=>["bar"]}

So, I tried the following code:

WARNING: don’t do this at home

@seen = Hash.new([]) # return [] on failed lookup
@seen['foo'].push('bar')

(Faulty) reasoning: push the string 'bar' on the stack @seen['foo'];
if @seen['foo'] doesn’t exist, it gets initialized to [] beforehand.

Of course, it doesn’t work like that at all:

@seen['baz']
=> ["bar"]

@seen
=> {}

The problem, which I only saw considerably later, is that calling
Hash#new with an empty array is equivalent to the following:

bob = []
@seen = Hash.new(bob)
bob.push('bar') # equivalent to @seen['foo'].push('bar')
bob # equivalent to @seen['baz']

So, the thing that’s being modified by all those pushes is the object
returned as the default value for failed lookups on @seen; and it’s
aliased by the variable bob. This certainly breaks my expectations of
how Hash#new should behave: I would expect that hashes hide the instance
variable that stores the default value, so that the only way to modify
it would be Hash#default=.
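The aliasing can be checked directly. This is only a small sketch; the names shared and seen are made up for the illustration, and a local variable stands in for @seen:

```ruby
# Every failed lookup returns the very same default object, not a copy.
shared = []
seen = Hash.new(shared)

seen['foo'].push('bar')          # mutates the default object; stores no key

puts seen['baz'].equal?(shared)  # true: same object, not a copy
puts seen.default.inspect        # ["bar"]: the default itself was changed
puts seen.size                   # 0: the hash still has no entries
```

Hash#default hands back the same object that was passed to new(), which is why the push shows up for every other missing key.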

Does anybody ever make use of this aliasing? Would it be better to
have Hash copy the argument to new(), and return copies of it as the
default value? Or is using a mutable object as the default just a bad
idea?
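One possible answer, offered as a sketch rather than a recommendation: a shared default is harmless as long as nothing mutates it. Freezing the default (my assumption here, not something from the original code) turns the accidental push into an error, while += builds a new array and stores it under the key:

```ruby
# Assumption: freeze the default so in-place mutation fails loudly.
seen = Hash.new([].freeze)

begin
  seen['foo'].push('bar')      # raises: the default array is frozen
rescue => e
  puts "push rejected: #{e.class}"
end

seen['foo'] += ['bar']         # default + ['bar'] is a new array, stored under 'foo'
seen['foo'] += ['baz']

p seen['foo']                  # ["bar", "baz"]
p seen['other']                # []: the default is still empty
```

The cost is that every += allocates a new array, so this suits small collections better than a busy stack.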

PS I ended up not needing that data structure, but after seeing my
mistake I managed to get the behavior I wanted this way:

class StackDict < Hash
  # Return the stack stored under key, creating an empty one on first access.
  def [](key)
    if has_key?(key)
      super(key)
    else
      self[key] = Array.new
    end
  end
end

@seen = StackDict.new
@seen['foo'].push('bar')
@seen # => {"foo"=>["bar"]}


Luis Casillas
Department of Linguistics
Stanford University
http://www.stanford.edu/~casillas/

“This fence is here for your protection. Do not approach or try to
cross, or you will be shot.”
– Text of sign on US-installed barbed wire fence around village of
Abu Hishma, Iraq. (NY Times, Dec. 7, 2003)

Here is some info about default values for Array/Hash:
http://www.glue.umd.edu/~billtj/ruby.html#default
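For what it's worth, Array.new with a default object behaves the same way as Hash.new does. A small sketch (the variable names are only for illustration):

```ruby
# Array.new(n, obj) fills every slot with the same object.
shared = Array.new(3, [])
shared[0].push('x')
p shared                    # [["x"], ["x"], ["x"]]

# The block form runs once per slot, so each slot gets its own array.
fresh = Array.new(3) { [] }
fresh[0].push('x')
p fresh                     # [["x"], [], []]
```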

Simon Strandgaard

Alternatively, Hash.new can take a block to generate default values:

@seen = Hash.new {|h,k| h[k] = [] }
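A short sketch of how the block form behaves, assuming the block stores a fresh array under the missing key (a local variable seen is used here instead of @seen):

```ruby
# The block runs on each failed lookup and receives the hash and the key.
seen = Hash.new { |hash, key| hash[key] = [] }

seen['foo'].push('bar')
seen['baz'].push('qux')
word = seen['foo'].pop

p word                                # "bar"
p seen.keys                           # ["foo", "baz"]
p seen['foo'].equal?(seen['baz'])     # false: each key has its own array

# A plain lookup creates the key, so probe with key? or fetch instead.
p seen.key?('nope')                   # false
p seen.fetch('nope', nil)             # nil, and 'nope' is still not stored
```

Because the block assigns into the hash, merely reading a missing key with [] adds it; key? and fetch avoid that.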

regards,
andrew

In article dV9ac.47224$li5.33643@pd7tw3no, Andrew Johnson wrote:

On Tue, 30 Mar 2004 07:28:40 +0000 (UTC), Luis D Casillas <casillas@no_spam.Stanford.EDU> wrote:

PS I ended up not needing that data structure, but after seeing my
mistake I managed to get the behavior I wanted this way:

Alternatively, Hash.new can take a block to generate default values:

@seen = Hash.new {|h,k| h[k] = [] }

Ah. This is good.


Luis Casillas