Black magical hash element vivification

Ruby (1.9.3p0 to be precise, installed with RVM) is not behaving as I
expected.

    >> foo = Hash.new( Hash.new )
    => {}
    >> foo[3][2] = true
    => true
    >> foo
    => {}
    >> foo[3]
    => {2=>true}
    >> foo[2]
    => {2=>true}
    >> foo
    => {}
    >>

What I would expect would be something more like this:

    >> foo = Hash.new( Hash.new )
    => {}
    >> foo[3][2] = true
    => true
    >> foo
    => {3=>{2=>true}}
    >> foo[3]
    => {2=>true}
    >> foo[2]
    => {}
    >> foo
    => {3=>{2=>true}}
    >>

Where and why do my assumptions fail me?

   >> foo = Hash.new( Hash.new )
   => {}
   >> foo[3][2] = true
   => true
   >> foo
   => {}

This is odd.

   >> foo[3]
   => {2=>true}
   >> foo[2]
   => {2=>true}

This isn't. Hash.new(Hash.new) != Hash.new { |h, k| h[k] = Hash.new }. The
left-hand side uses the same object for every default access, hence f =
Hash.new([]); f[1] << 1; f[2].include?(1) == true. The right-hand side
behaves as you expect for this part.
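
A minimal sketch of that difference, using arrays as the default value. The variable names `shared` and `fresh` are illustrative only and do not come from the thread.

```ruby
# One shared default object: every missing key returns the same Array.
shared = Hash.new([])
shared[1] << 1
shared[2].include?(1)         # => true
shared[1].equal?(shared[2])   # => true, literally the same object
shared.keys                   # => [], nothing was ever stored

# Block form: a new Array is created and stored for each missing key.
fresh = Hash.new { |h, k| h[k] = [] }
fresh[1] << 1
fresh[2].include?(1)          # => false
fresh.keys                    # => [1, 2]
```

Note that merely reading `fresh[2]` stores a key, which is the price of the block form that assigns.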

   >> foo

   => {}

Again, definitely odd! :)

foo.rehash didn't fix anything. Looks like a bug to me?

···

On Mon, Dec 5, 2011 at 16:28, Chad Perrin <code@apotheon.net> wrote:

Ruby (1.9.3p0 to be precise, installed with RVM) is not behaving as I
expected.

>> foo = Hash.new( Hash.new )
=> {}
>> foo[3][2] = true
=> true
>> foo
=> {}
>> foo[3]
=> {2=>true}
>> foo[2]
=> {2=>true}
>> foo
=> {}
>>

Using that Hash constructor, what you pass is the default value for a
missing key. What this means is that the hash will return that object
to any call in which the key is not found. *But* it won't assign that
default object to the key. You have to do that yourself. The other
effect you are seeing is that the default object is returned for all
missing keys, hence the {2 => true} for foo[2].
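
The explanation above can be checked directly in irb. This is only a sketch of the behaviour described; it uses nothing beyond core `Hash` methods (`key?`, `size`, `default`, `fetch`).

```ruby
foo = Hash.new(Hash.new)
foo[3][2] = true        # mutates the default object; foo itself gets no key

foo.key?(3)             # => false
foo.size                # => 0
foo.default             # => {2=>true}

# Every missing key hands back the very same default object.
foo[3].equal?(foo.default)      # => true
foo[:anything].equal?(foo[3])   # => true

# fetch ignores the default entirely, which makes the missing key visible.
foo.fetch(3, :missing)  # => :missing
```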

What I would expect would be something more like this:

>> foo = Hash.new( Hash.new )
=> {}
>> foo[3][2] = true
=> true
>> foo
=> {3=>{2=>true}}
>> foo[3]
=> {2=>true}
>> foo[2]
=> {}
>> foo
=> {3=>{2=>true}}
>>

Where and why do my assumptions fail me?

If you want that behaviour you have to use the default proc, in which
you define how you handle a missing key:

    foo = Hash.new {|h,k| h[k] = {}}
    foo[3][2] = true
    foo     # => {3 => {2 => true}}
    foo[2]  # => {} (the block creates and stores an empty hash for key 2)
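
A short sketch of the default-proc approach, including one detail worth knowing: because the block assigns `h[k]`, even reading a missing key stores an empty hash. The second half is a commonly used idiom for arbitrary nesting depth, not something proposed in this thread; the name `deep` is illustrative.

```ruby
# Two-level version, as in the message above.
foo = Hash.new { |h, k| h[k] = {} }
foo[3][2] = true
foo[2]       # => {}, and key 2 now exists because the block assigned it
foo.keys     # => [3, 2]

# Arbitrary depth: each new inner hash reuses the outer default proc.
deep = Hash.new { |h, k| h[k] = Hash.new(&h.default_proc) }
deep[:a][:b][:c] = 1
deep         # => {:a=>{:b=>{:c=>1}}}
```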

Hope this helps,

Jesus.

···

On Mon, Dec 5, 2011 at 5:28 PM, Chad Perrin <code@apotheon.net> wrote:

D'oh! Strike that, it's behaving as expected, it's just hidden beneath an
extra layer of oddness.

foo = Hash.new(Hash.new)

=> {}

foo[3][2] = true

=> true

There's no assignment for the top-level hash. So the default hash object
specified in Hash.new(Hash.new) has been modified, but you can't see it
directly because you don't do, e.g., foo[3]=. You just need to poke a hole
through, either before or after the fact. Here's after:

foo[3] = foo["anything"]

=> {2=>true}

foo

=> {3=>{2=>true}}
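
One caveat with poking a hole this way, sketched below: the value stored under key 3 is the shared default object itself, so later writes through any other missing key show up there too. Storing a copy with `dup` is one possible way around that; the `bar` example is an illustration, not something suggested in the thread.

```ruby
foo = Hash.new(Hash.new)
foo[3][2] = true
foo[3] = foo["anything"]     # store the default object under key 3
foo                          # => {3=>{2=>true}}

# Key 3 now holds the shared default, so this write leaks into it.
foo[7][:x] = 1               # 7 is missing, so this mutates the default
foo[3]                       # => {2=>true, :x=>1}
foo[3].equal?(foo.default)   # => true

# Storing a copy decouples the entry from the default.
bar = Hash.new(Hash.new)
bar[3] = bar[3].dup
bar[3][2] = true
bar.default                  # => {}
```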

From the docs:

Returns a new, empty hash. If this hash is subsequently accessed by a
key that doesn’t correspond to a hash entry, the value returned
depends on the style of new used to create the hash. In the first
form, the access returns nil. If obj is specified, this single object
will be used for all default values. If a block is specified, it will
be called with the hash object and the key, and should return the
default value. It is the block’s responsibility to store the value in
the hash if required.

The only thing that the second form does is return the same object for
all missing keys. It doesn't store it in the hash.
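
The forms described in the quoted documentation, side by side. This is a sketch with made-up variable names; the last two lines show what "the block's responsibility to store the value" means in practice.

```ruby
plain   = Hash.new                               # first form: misses give nil
object  = Hash.new(:none)                        # one object for every miss
lookup  = Hash.new { |h, k| k.to_s * 2 }         # block returns, stores nothing
storing = Hash.new { |h, k| h[k] = k.to_s * 2 }  # block stores the value

plain[:a]           # => nil
object[:a]          # => :none

lookup[:ab]         # => "abab"
lookup.key?(:ab)    # => false, the block only returned a value

storing[:ab]        # => "abab"
storing.key?(:ab)   # => true, the block assigned h[k]
```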

Jesus.

···

On Mon, Dec 5, 2011 at 5:34 PM, Adam Prescott <adam@aprescott.com> wrote:

On Mon, Dec 5, 2011 at 16:28, Chad Perrin <code@apotheon.net> wrote:

>> foo = Hash.new( Hash.new )
=> {}
>> foo[3][2] = true
=> true
>> foo
=> {}

This is odd.

> >> foo[3]
> => {2=>true}
> >> foo[2]
> => {2=>true}
>

This isn't. Hash.new(Hash.new) != Hash.new { |h, k| h[k] = Hash.new }. The
left-hand side uses the same object for every default access, hence f =
Hash.new([]); f[1] << 1; f[2].include?(1) == true. The right-hand side
behaves as you expect for this part.

I figured out that Hash.new(Hash.new) used the same object for all of
them, but found that quite surprising. Now that I see it side-by-side
with the block argument form, though, I realize it should not have
surprised me as it did.

Thanks for helping me fix my code.

foo.rehash didn't fix anything. Looks like a bug to me?

That's what I was wondering -- whether I'd found a bug in there
somewhere.

···

On Tue, Dec 06, 2011 at 01:34:39AM +0900, Adam Prescott wrote:

On Mon, Dec 5, 2011 at 16:28, Chad Perrin <code@apotheon.net> wrote:

I find it rather surprising that assignment does not create something new
where the default used to appear, and kind of useless in this context. I
know it's a special case of assignment that causes this to occur, now
that I think about it after reading replies to my original email, but it
seems like a strange way to implement things from a language user
perspective.

Thanks to everybody who responded.

···

On Tue, Dec 06, 2011 at 01:40:17AM +0900, Jesús Gabriel y Galán wrote:

Using that Hash constructor, what you pass is the default value for a
missing key. What this means is that the hash will return that object
to any call in which the key is not found. *But* it won't assign that
default object to the key. You have to do that yourself. The other
effect you are seeing is that the default object is returned for all
missing keys, hence the {2 => true} for foo[2].

Quite. A fairly unacceptable oversight on my part, really. But it got Chad,
too, so I feel okay about it. :)

···

2011/12/5 Jesús Gabriel y Galán <jgabrielygalan@gmail.com>

The only thing that the second form does is return the same object for
all missing keys. It doesn't store it in the hash.

That, for me, is *very* surprising behavior. I would think that it
should treat the default as though it exists, even when not referencing
it directly.

···

On Tue, Dec 06, 2011 at 01:39:54AM +0900, Adam Prescott wrote:

D'oh! Strike that, it's behaving as expected, it's just hidden beneath an
extra layer of oddness.

>> foo = Hash.new(Hash.new)
=> {}
>> foo[3][2] = true
=> true

There's no assignment for the top-level hash. So the default hash object
specified in Hash.new(Hash.new) has been modified, but you can't see it
directly because you don't do, e.g., foo[3]=.

I think the surprise comes from the fact that you are using a mutable
object, or at least an object that only makes sense when modified.
There are some cases where it makes sense to have a default value that
doesn't imply assigning it to the missing key. For example, say you
have a histogram hash, and want to calculate the average of a certain
set of values:

    histogram = Hash.new(0)
    # fill the hash
    possible_keys = [:a, :b, :c, :d] # some of them might not be in the histogram

    average = (possible_keys.inject(0) { |total, element| total + histogram[element] }) * 1.0 / possible_keys.length

Here it totally makes sense to have 0 as the default value, and not
have all possible_keys assigned in the hash when you are calculating
an average for a set of keys that might or might not be in the hash.

I think that allowing this version of the constructor along with the
one where you have full control of what to do on a missing key, gives
you all the flexibility for a lot of use cases.

By the way, for the histogram case above, when you are counting
things, you also take advantage of 0 being the default value like
this:

histogram[key] += 1
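
Putting the counting and averaging together, as a small runnable sketch. The input words are invented for illustration; the point is that `:d` is read during the average but never ends up as a key.

```ruby
histogram = Hash.new(0)
%w[a b a c a b].each { |word| histogram[word.to_sym] += 1 }
histogram   # => {:a=>3, :b=>2, :c=>1}

possible_keys = [:a, :b, :c, :d]   # :d never occurs in the input
total   = possible_keys.inject(0) { |sum, key| sum + histogram[key] }
average = total.to_f / possible_keys.length   # => 1.5

histogram.key?(:d)   # => false, reading the default stored nothing
```

Since `0` is immutable, sharing one default object across all missing keys is harmless here, unlike the shared Hash in the original question.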

Jesus.

···

On Mon, Dec 5, 2011 at 9:05 PM, Chad Perrin <code@apotheon.net> wrote:

On Tue, Dec 06, 2011 at 01:40:17AM +0900, Jesús Gabriel y Galán wrote:

Using that Hash constructor, what you pass is the default value for a
missing key. What this means is that the hash will return that object
to any call in which the key is not found. *But* it won't assign that
default object to the key. You have to do that yourself. The other
effect you are seeing is that the default object is returned for all
missing keys, hence the {2 => true} for foo[2].

I find it rather surprising that assignment does not create something new
where the default used to appear, and kind of useless in this context. I
know it's a special case of assignment that causes this to occur, now
that I think about it after reading replies to my original email, but it
seems like a strange way to implement things from a language user
perspective.

Thanks to everybody who responded.

You can get that behaviour with Hash.new { 0 }, too.

···

-----Original Message-----

···

From: Chad Perrin [mailto:code@apotheon.net]
Sent: Monday, December 5, 2011 21:01
To: ruby-talk ML
Subject: Re: black magical hash element vivification

On Tue, Dec 06, 2011 at 01:39:54AM +0900, Adam Prescott wrote:

D'oh! Strike that, it's behaving as expected, it's just hidden beneath
an extra layer of oddness.

>> foo = Hash.new(Hash.new)
=> {}
>> foo[3][2] = true
=> true

There's no assignment for the top-level hash. So the default hash
object specified in Hash.new(Hash.new) has been modified, but you
can't see it directly because you don't do, e.g., foo[3]=.

That, for me, is *very* surprising behavior. I would think that it should
treat the default as though it exists, even when not referencing it
directly.
