Show me the ruby way

This works, but there must be a more natural way to do
this. Can someone point the way?

Thanks,

···

nord

#!/bin/ruby

class HashAutoVivify < Hash

  # create sub-hashes automatically
  def [](n)
    self[n]=HashAutoZero.new if super(n)==nil
    super(n)
  end

  # "values" in sub-hash automatically have initial
  # value 0
  class HashAutoZero < Hash
    def [](n)
      self[n]=0 if super(n)==nil
      super(n)
    end
  end

  def printSortedBy(sortSpecifier='keys')
    puts "---- sorted by #{sortSpecifier}"
    self.keys.sort.each do | m |
      puts m
      if sortSpecifier == 'keys'
        self[m].keys.sort_by {|a| a }.each do | k |
          puts " #{k.ljust(12)} #{self[m][k]}"
        end
      elsif sortSpecifier == 'vals'
        self[m].keys.sort_by {|a| -self[m][a] }.each do | k |
          puts " #{k.ljust(12)} #{self[m][k]}"
        end
      end
    end
  end

  def printByKeys
    printSortedBy('keys')
  end

  def printByVals
    printSortedBy('vals')
  end
end

metaData = HashAutoVivify.new
metaData['fruit']['apple'] = 13
metaData['fruit']['mango'] = 7
metaData['fruit']['banana'] = 11
metaData['fruit']['cherry'] = 17
metaData['veg']['eggplant'] = 7
metaData['veg']['artichoke'] += 19
metaData['veg']['green bean'] -= 5
metaData['veg']['squash'] += 2
metaData.printByKeys
metaData.printByVals



metaData = HashAutoVivify.new

try this (only with ruby-1.8)

   metaData = Hash.new {|h, k| h[k] = Hash.new(0) }

When Ruby tries to access a key that is not defined, it calls the block with
(hash, key); you store in hash[key] a new hash with `0' as its default value.

Guy Decoux
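
For comparison with the original HashAutoVivify class, here is a minimal sketch of the same kind of report built on the block form of Hash.new suggested above. The helper name print_sorted_by, the symbols :keys and :vals, and the trimmed data set are choices made for this example only; they do not come from the thread.

```ruby
# Nested hash: a missing outer key gets a new inner hash whose default is 0.
meta_data = Hash.new { |h, k| h[k] = Hash.new(0) }

meta_data['fruit']['apple']  = 13
meta_data['fruit']['mango']  = 7
meta_data['veg']['eggplant'] = 7
meta_data['veg']['artichoke'] += 19    # starts from the default 0
meta_data['veg']['green bean'] -= 5

# Hypothetical helper (not from the thread): prints and returns the report
# lines, sorted within each group by key or by descending value.
def print_sorted_by(data, by = :keys)
  lines = ["---- sorted by #{by}"]
  data.keys.sort.each do |group|
    lines << group
    pairs = data[group].sort_by { |k, v| by == :vals ? [-v, k] : [k] }
    pairs.each { |k, v| lines << " #{k.ljust(12)} #{v}" }
  end
  puts lines
  lines
end

by_keys = print_sorted_by(meta_data, :keys)
by_vals = print_sorted_by(meta_data, :vals)
```

Returning the lines as well as printing them is only there to make the sketch easy to check; the two subclasses and the override of [] are no longer needed.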

This works, but there must be a more natural way to do
this. Can someone point the way?

There’s a new feature in 1.8.0 for it:

# create sub-hashes automatically
def [](n)
  self[n]=HashAutoZero.new if super(n)==nil
  super(n)
end

h = Hash.new { |me,k| me[k]={} }

# "values" in sub-hash automatically have initial
# value 0
class HashAutoZero < Hash
  def [](n)
    self[n]=0 if super(n)==nil
    super(n)
  end
end

h2 = Hash.new(0)

although this means that unset values read out as 0, it does not autovivify
the value in the hash. If you need that, then:

h2 = Hash.new { |me,k| me[k]=0 }

Combining these two:

irb(main):020:0> h = Hash.new { |h1,k1| h1[k1] = Hash.new { |h2,k2| h2[k2]=0 } }
=> {}
irb(main):021:0> h['fred']
=> {}
irb(main):022:0> h
=> {"fred"=>{}}
irb(main):023:0> h['foo']['bar']
=> 0
irb(main):024:0> h
=> {"fred"=>{}, "foo"=>{"bar"=>0}}

Regards,

Brian.

···

On Wed, Sep 03, 2003 at 12:13:19AM +0900, nord ehacedod wrote:
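
The difference between Hash.new(0) and the assigning block form can be seen by looking at the keys after a plain read. This is a small sketch; the variable names are made up for the example.

```ruby
# Default object: reading a missing key returns 0 but stores nothing.
counts = Hash.new(0)
counts['a']                      # read only
read_only_keys = counts.keys

# Default block that assigns: the read itself creates the entry.
vivified = Hash.new { |me, k| me[k] = 0 }
vivified['a']
stored_keys = vivified.keys

# += works with either form, because it expands to h[k] = h[k] + 1.
counts['b'] += 1

p read_only_keys   # => []
p stored_keys      # => ["a"]
p counts.keys      # => ["b"]
```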

ts decoux@moulon.inra.fr wrote in message news:200309021525.h82FPkM17085@moulon.inra.fr

try this (only with ruby-1.8)

metaData = Hash.new {|h, k| h[k] = Hash.new(0) }

When Ruby tries to access a key that is not defined, it calls the block with
(hash, key); you store in hash[key] a new hash with `0' as its default value.

On a related note, with ruby 1.8.0 (2003-08-04) [i386-mswin32],
I was wondering about this behavior:

md = Hash.new { |h,k| h[h] = 0; 3 }
p md[1] # => 3
p md[1] # => 0

I understand that the first “p md[1]” is returning the result of the
block; is that intentional?

Cheers,
alan

alan,

I was wondering about this behavior:

md = Hash.new { |h,k| h[h] = 0; 3 }
p md[1] # => 3
p md[1] # => 0

I understand that the first “p md[1]” is returning
the result of the block; is that intentional?

Yes.  A clearer example:

md = Hash.new { 3 }
md[1] = 1
p md[1] # => 1
p md[2] # => 3
p md # => {1=>1}

Remember, the main point of the block is to generate a default value for
nonexistent keys. The autovivification trick is merely an extension of this
functionality.

I hope this helps.

- Warren Brown
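
A short sketch of the point above: a block that only returns a value is run on every lookup of a missing key, and the key never appears in the hash. The counter variable is an addition for this example.

```ruby
calls = 0
md = Hash.new { calls += 1; 3 }   # returns 3, never stores anything

md[1] = 1
first  = md[2]   # block runs
second = md[2]   # block runs again, because nothing was stored

p [first, second, calls]   # => [3, 3, 2]
p md.key?(2)               # => false
p md.fetch(2, :missing)    # => :missing (fetch does not use the default block)
```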

ts decoux@moulon.inra.fr wrote in message news:200309021525.h82FPkM17085@moulon.inra.fr

try this (only with ruby-1.8)

metaData = Hash.new {|h, k| h[k] = Hash.new(0) }

When Ruby tries to access a key that is not defined, it calls the block with
(hash, key); you store in hash[key] a new hash with `0' as its default value.

On a related note, with ruby 1.8.0 (2003-08-04) [i386-mswin32],
I was wondering about this behavior:

md = Hash.new { |h,k| h[h] = 0; 3 }
p md[1] # => 3
p md[1] # => 0

I understand that the first “p md[1]” is returning the result of the
block; is that intentional?

~$ ri Hash.new
This is a test 'ri'. Please report errors and omissions
on http://www.rubygarden.org/ruby?RIOnePointEight

-------------------------------------------------------------- Hash::new
 Hash.new -> aHash
 Hash.new( anObject ) -> aHash
 Hash.new {| aHash, key | block } -> aHash

 Returns a new, empty hash. If this hash is subsequently accessed by
 a key that doesn't correspond to a hash entry, the value returned
 depends on the style of new used to create the hash. In the first
 form, the access returns nil. If anObject is specified, this single
 object will be used for all default values. If a block is
 specified, it will be called with the hash object and the key, and
 should return the default value. It is the block's responsibility
 to store the value in the hash if required.
    h = Hash.new("Go Fish")
    h["a"] = 100
    h["b"] = 200
    h["a"]           #=> 100
    h["c"]           #=> "Go Fish"
    # The following alters the single default object
    h["c"].upcase!   #=> "GO FISH"
    h["d"]           #=> "GO FISH"
    h.keys           #=> ["a", "b"]
    # While this creates a new default object each time
    h = Hash.new { |hash, key| hash[key] = "Go Fish: #{key}" }
    h["c"]           #=> "Go Fish: c"
    h["c"].upcase!   #=> "GO FISH: C"
    h["d"]           #=> "Go Fish: d"
    h.keys           #=> ["c", "d"]

~$

So yes, I think it is intentional.

Jason Creighton

···

On 2 Sep 2003 14:21:46 -0700 aero6dof@yahoo.com (Alan Chen) wrote:

Hmm, I meant h[k] not h[h] as I originally posted below…

md = Hash.new { |h,k| h[k] = 0; 3 }

# access undefined key 1

p md[1] # => 3 # returns the block result
p md[1] # => 0 # now we discover the actual default value

It’s clear to me exactly what is happening, but I guess my thought was
that this behavior is a bug. If the code that runs the autovivification
block can recognize that the value for the undefined key was set, then
perhaps the block return value should be ignored. I guess it would
require an additional st_lookup after the initial lookup fails.

“Warren Brown” wkb@airmail.net wrote in message news:005101c3719d$c091ca50$670b88cf@warrenpc

···

alan,

I was wondering about this behavior:

md = Hash.new { |h,k| h[h] = 0; 3 }
p md[1] # => 3
p md[1] # => 0

I understand that the first “p md[1]” is returning
the result of the block; is that intentional?

Yes.  A clearer example:

md = Hash.new { 3 }
md[1] = 1
p md[1] # => 1
p md[2] # => 3
p md # => {1=>1}

Remember, the main point of the block is to generate a default value for
nonexistent keys. The autovivification trick is merely an extension of this
functionality.

I hope this helps.

- Warren Brown

It’s clear to me exactly what is happening, but I guess my thought was
that this behavior is a bug. If the code that runs the autovivification
block can recognize that the value for the undefined key was set, then
perhaps the block return value should be ignored.

Eek. Consider that in the common case, where the undefined key is being set
as the last statement in the block, the correct value will be automatically
returned. So only in edge cases would the set value and the returned value
differ, and as a programmer I would be most surprised if the set value was
the one that was used while the returned value was the one that was dropped!

Doesn’t really seem to be that useful a test to me. If you want to write
code which says “as a side-effect set the stored value to 0, but the value
to return to the user is 3” then why not :-)

Regards,

Brian.

···

On Wed, Sep 03, 2003 at 10:54:09AM +0900, Alan Chen wrote:

Hmm, I meant h[k] not h[h] as I originally posted below…

md = Hash.new { |h,k| h[k] = 0; 3 }

# access undefined key 1

p md[1] # => 3 # returns the block result
p md[1] # => 0 # now we discover the actual default value

It’s clear to me exactly what is happening, but I guess my thought was
that this behavior is a bug. If the code that runs the autovivification
block can recognize that the value for the undefined key was set, then
perhaps the block return value should be ignored. I guess it would
require an additional st_lookup after the initial lookup fails.

It’s not a bug. The block you pass to Hash.new is not expected to
do anything but return the value that the hash should return for the
given key when no value is yet stored for that key. That’s all.
There’s no requirement or expectation that the block modify the hash,
even though it has the ability to do so.

If the block does modify the hash at all, it’s not unreasonable to assume
that the modification will consist of storing the same value that the block
is returning at the key that was requested. If it does something else,
then it’s the programmer, not Ruby, who’s violating the principle of
least surprise. :-)

In other words, you have it backwards. The return value of the block is
the important thing from the standpoint of the Hash specification; anything
else is just a side effect. Which the autovivification solution presented
earlier in this thread was exploiting.

-Mark

···

On Wed, Sep 03, 2003 at 10:54:09AM +0900, Alan Chen wrote:

Hmm, I meant h[k] not h[h] as I originally posted below…

md = Hash.new { |h,k| h[k] = 0; 3 }

# access undefined key 1

p md[1] # => 3 # returns the block result
p md[1] # => 0 # now we discover the actual default value

It’s clear to me exactly what is happening, but I guess my thought was
that this behavior is a bug. If the code that runs the autovivification
block can recognize that the value for the undefined key was set, then
perhaps the block return value should be ignored. I guess it would
require an additional st_lookup after the initial lookup fails.

“Mark J. Reed” markjreed@mail.com wrote in message news:20030903145103.GC17861@mulan.thereeds.org

In other words, you have it backwards. The return value of the block is
the important thing from the standpoint of the Hash specification; anything
else is just a side effect. Which the autovivification solution presented
earlier in this thread was exploiting.

-Mark

md = Hash.new { |h,k| h[k] = 0; 3 }

# access undefined key 1

p md[1] # => 3 # returns the block result
p md[1] # => 0 # now we discover the actual default value

From my standpoint this code should either return
3 3, or 0 0, but not 3 0. If the return value of the block
is to be the intended semantic, shouldn’t the second line return 3 too?

FWIW, I think an additional st_lookup after running the default
block for an undefined key access would preserve the intended
semantics of both usages of the autovivification block (setting the
value by block return, and explicitly setting the value) while
protecting the programmer from this edge case.

Cheers,

- alan
···

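From my standpoint this code should either return
3 3, or 0 0, but not 3 0. If the return value of the block
is to be the intended semantic, shouldn’t the second line return 3 too?

No. The block is only used for return values WHEN THE KEY DOESN’T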
EXIST IN THE HASH. That is the whole point of the block - you only
supply one when you want the hash to return something other than nil
for missing keys, and specifically when you want dynamic on-the-fly
control of exactly what is returned based on the key value and/or
the current contents of the hash.

The first time md[1] is queried, the key doesn’t exist in the hash,
so the block is executed and its return value (3) returned. But
because the block also actually stores a value in md[1], on
the next attempt to access md[1] the key DOES exist, and therefore
the block is not executed at all. The hash just returns the stored
value, which is 0.

Having the block modify the hash is a side effect. Having the block
modify the hash AND return a different value than the one it
stores is just plain weird, and if you do it you deserve whatever you
get. :-)

-Mark

···

On Wed, Sep 03, 2003 at 11:45:50AM -0700, Alan Chen wrote:
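
The sequence described above can be traced by recording each run of the default block. A minimal sketch; the calls array is an addition for this example.

```ruby
calls = []
md = Hash.new do |h, k|
  calls << k      # record each time the default block runs
  h[k] = 0        # side effect: store 0 under the key
  3               # value of the block, returned for this one lookup
end

first  = md[1]    # key missing: block runs and its value (3) is returned
second = md[1]    # key now present: block is skipped, stored 0 is returned

p [first, second]   # => [3, 0]
p calls             # => [1]
```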

“Mark J. Reed” markjreed@mail.com wrote in message news:20030903145103.GC17861@mulan.thereeds.org

In other words, you have it backwards. The return value of the block is
the important thing from the standpoint of the Hash specification; anything
else is just a side effect. Which the autovivification solution presented
earlier in this thread was exploiting.

-Mark

md = Hash.new { |h,k| h[k] = 0; 3 }

# access undefined key 1

p md[1] # => 3 # returns the block result
p md[1] # => 0 # now we discover the actual default value

From my standpoint this code should either return
3 3, or 0 0, but not 3 0. If the return value of the block
is to be the intended semantic, shouldn’t the second line return 3 too?

“Mark J. Reed” markjreed@mail.com wrote in message news:20030903185333.GA19685@mulan.thereeds.org

… hash logic snipped

Thanks for taking the time to explain the hash logic, but as I mentioned
at the start, I already understand the current implementation. Primarily,
I’m interested in the edge case.

Having the block modify the hash is a side effect. Having the block
modify the hash AND return a different value than the one it
stores is just plain weird, and if you do it you deserve whatever you
get. :-)

Currently that’s correct, but this patch makes the hash default block
logic behave more consistently. I think it preserves all of the
intentional logic of the current hash default block while fixing up
the odd edge case that I’ve been so poorly explaining my interest in. :-)

The cases I’ve accounted for are as follows:

# case 0
h = Hash.new
h[undefinedkey] == nil

# case 1
ho = Hash.new(object)
ho[undefinedkey] == object

# case 2
hb1 = Hash.new { |h,k|
  h[k] = defaultvalue
}
hb1[undefinedkey] == defaultvalue

# case 3
hb2 = Hash.new { |h,k|
  # … any logic but setting h[k]
  defaultvalue
}
hb2[undefinedkey] == defaultvalue

# case 4
hb3 = Hash.new { |h,k|
  h[k] = defaultvalue
  # … side calc
}

# current behavior
hb3[undefinedkey] == side calc result on first access
hb3[nowdefinedkey] == defaultvalue (on accesses after the default block has run)

# behavior with patch
hb3[undefinedkey] == defaultvalue

My question regarding intent was to find out if anybody depended on,
expected, or liked the current behavior in case 4. I obviously dislike
behavior where a hash value can appear to be one value on the initial access
with an undefined key and another value on subsequent accesses after the default
block has defined the value. The price you pay to get case 4 to work with the patch
is the additional st_lookup call in cases 2, 3, and 4.

--- hash.c.orig 2003-09-03 23:03:24.000000000 -0700
+++ hash.c 2003-09-03 23:08:10.000000000 -0700
@@ -329,7 +329,14 @@

     rb_scan_args(argc, argv, "01", &key);
     if (FL_TEST(hash, HASH_PROC_DEFAULT)) {
-    return rb_funcall(RHASH(hash)->ifnone, id_call, 2, hash, key);
+    VALUE val;
+    VALUE procres = rb_funcall(RHASH(hash)->ifnone, id_call, 2, hash, key);
+    if (st_lookup(RHASH(hash)->tbl, key, &val)) {
+      return val;
+    }
+    else {
+      return procres;
+    }
     }
     return RHASH(hash)->ifnone;
 }
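
The semantics proposed by the patch can be approximated in plain Ruby, without touching hash.c, by overriding [] in a subclass. This is only a sketch of the idea; the class name StoredValueWinsHash is hypothetical and the extra key? check stands in for the additional st_lookup.

```ruby
# Hypothetical subclass, used only to illustrate the proposed semantics.
class StoredValueWinsHash < Hash
  def [](key)
    result = super                    # normal lookup; may run the default block
    key?(key) ? fetch(key) : result   # extra lookup, like the patch's st_lookup
  end
end

# Case 4: the block stores one value and returns another.
hb3 = StoredValueWinsHash.new { |h, k| h[k] = 0; 3 }
p hb3[1]        # => 0 (the stored value wins over the block's return value)
p hb3[1]        # => 0

# Case 3: the block stores nothing, so its return value is still used.
hb2 = StoredValueWinsHash.new { |h, k| 3 }
p hb2[1]        # => 3
p hb2.key?(1)   # => false
```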

Hi –

# case 4
hb3 = Hash.new { |h,k|
  h[k] = defaultvalue
  # … side calc
}

# current behavior
hb3[undefinedkey] == side calc result on first access
hb3[nowdefinedkey] == defaultvalue (on accesses after the default block has run)

# behavior with patch
hb3[undefinedkey] == defaultvalue

My question regarding intent was to find out if anybody depended on,
expected, or liked the current behavior in case 4.

I think we all constantly depend on the fact that the return value of
a block is the last expression in the block, and that’s all that’s
going on here. You’re adding an extra layer of complexity to it; it’s
basically just a yield. As in every such case in every Ruby program
(and all analogous cases in other languages), you have to write the
appropriate code in the block to get the return value you want.

David

···

On Thu, 4 Sep 2003, Alan Chen wrote:


David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

It already behaves perfectly consistently. The return value of the
block is what is returned when the key is not present. That’s it,
that’s all, nothing more, nothing less. I completely fail to see the
problem. It’s not an “edge case”. You’re looking for extra magic
that is neither called for nor even desirable.

If you want the block to both set and return the hash value, just do it as
the last statement in the block.

There are a myriad of other cases where you can do weird things to
the receiver inside a block and thereby get what appears to be inconsistent
behavior from the outside. That’s a feature, not a flaw; Ruby doesn’t
try to protect programmers from themselves invasively.

-Mark

···

On Wed, Sep 03, 2003 at 11:38:40PM -0700, Alan Chen wrote:
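
The advice above, to do the assignment as the last statement of the block, can be sketched as follows. An assignment expression evaluates to the assigned value, so the stored value and the returned value are the same. The log array is an addition for this example.

```ruby
log = []
hb = Hash.new do |h, k|
  log << "computing default for #{k.inspect}"   # side calculation first
  h[k] = 0                                       # assignment last: its value is the block's value
end

p hb[:x]     # => 0
p hb[:x]     # => 0
p log.size   # => 1
```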

Having the block modify the hash is a side effect. Having the block
modify the hash AND return a different value than the one it
stores is just plain weird, and if you do it you deserve whatever you
get. :-)

Currently that’s correct, but this patch makes the hash default block
logic behave more consistently.

# case 4
hb3 = Hash.new { |h,k|
  h[k] = defaultvalue
  # … side calc
}

# current behavior
hb3[undefinedkey] == side calc result on first access
hb3[nowdefinedkey] == defaultvalue (on accesses after the default block has run)

# behavior with patch
hb3[undefinedkey] == defaultvalue

My question regarding intent was to find out if anybody depended on,
expected, or liked the current behavior in case 4.

Good use I can think of:

hb3 = Hash.new { |h,k|
  Thread.new {
    some_task_with_immense_processing_that_assigns_to_h(h, k)
  }
  "incomplete"
}

I can see that being used in the app I’m writing – a clone of
Rhythmbox, which is a clone of iTunes – I’d use it in the music-library
classes, where scanning the disk for music can take tens of minutes, but
the GUI is in its own thread…

In writing this, I’m trying to outperform Rhythmbox 0.5 (which is coded
in C) in Ruby, just because an agile language lets you tune the
algorithms more easily.

Ari

Since it’s clear that I’m out of sync with everybody else on this
issue, I’ll stop the torture and drop it :). However, Ari, you will
want to be careful with the construct you originally posted. If your
GUI allows the user to cause an access to the same undefined key before
the thread completes, you’ll kick off redundant threads searching
for the same resource.

Aredridel aredridel@nbtsc.org wrote in message news:1062691741.4672.4.camel@mizar.nbtsc.org

hb3 = Hash.new { |h,k|
  h[k] = "incomplete" #ADDED: define the value for the key
  Thread.new {
    some_task_with_immense_processing_that_assigns_to_h(h, k)
  }
  "incomplete"
}
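
A runnable sketch of the construct above, assuming a stand-in for the slow task. The names slow_scan, workers, started and library are hypothetical and exist only for this example; a real application would likely also need to think about synchronisation between the GUI thread and the workers.

```ruby
# Hypothetical stand-in for the slow disk scan mentioned in the thread.
def slow_scan(name)
  sleep 0.01
  "scanned #{name}"
end

workers = []            # kept so the example can wait for the background work
started = Hash.new(0)   # how many scans were started per key

library = Hash.new do |h, k|
  h[k] = "incomplete"   # define the key first, so repeat lookups skip the block
  started[k] += 1
  workers << Thread.new { h[k] = slow_scan(k) }
  "incomplete"
end

first = library["jazz"]   # starts one background scan
library["jazz"]           # key already defined: no second thread is started
workers.each(&:join)

p first             # => "incomplete"
p started["jazz"]   # => 1
p library["jazz"]   # => "scanned jazz"
```

Without the placeholder assignment at the top of the block, the second lookup would find the key still undefined and start a second thread for the same key.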

Ari,

Aredridel wrote:

Good use I can think of:

hb3 = Hash.new { |h,k|
  Thread.new {
    some_task_with_immense_processing_that_assigns_to_h(h, k)
  }
  "incomplete"
}

I can see that being used in the app I’m writing – a clone of
Rhythmbox, which is a clone of iTunes – I’d use it in the music-library
classes, where scanning the disk for music can take tens of minutes, but
the GUI is in its own thread…

I find this to be very clever. I’d want it to be
commented, though, if I were reading the code. :-)

This snippet should perhaps be captured somewhere.

Hal