Populating a hash from an array using inject

I was looking at this problem on Stack Overflow (this one: "Reorganizing Ruby array into hash").

The question is how to "convert" an array of objects into a hash.

Consider this code:

    require 'pp'
    p RUBY_VERSION

    Product = Struct.new(:name, :category)
    products = [
        ['Apple','Golden Delicious'],
        ['Apple','Granny Smith'],
        ['Orange','Navel']
    ].collect {|cat, name| Product.new(name, cat)}

    foo = products.inject({}) {|h,p| h[p.category] ||= []; h[p.category] << p; h}
    pp foo

    bar = products.inject(Hash.new([])) {|h,p| h[p.category] << p; h}
    pp bar

    baz = products.inject(Hash.new([])) {|h,p| h[p.category] += [p]; h}
    pp baz

It outputs:

    "1.8.7"
    {"Orange"=>[#<struct Product name="Navel", category="Orange">],
     "Apple"=>
      [#<struct Product name="Golden Delicious", category="Apple">,
       #<struct Product name="Granny Smith", category="Apple">]}
    {}
    {"Orange"=>[#<struct Product name="Navel", category="Orange">],
     "Apple"=>
      [#<struct Product name="Golden Delicious", category="Apple">,
       #<struct Product name="Granny Smith", category="Apple">]}

My question: why is bar empty?

--
Glenn Jackman
    Write a wise saying and your name will live forever. -- Anonymous

I don't have an answer, but trying the code in JRuby I can see that, after the inject() call, the default value of bar is simply the array of products. That is one more thing that needs an explanation. You can see it by appending this line to your code:

    pp bar["random string here"]
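
For anyone following along, here is a minimal sketch of that observation. It assumes the same Product struct as the original post; the block variable `prod` is just an illustrative name. Hash#default returns the object that was passed to Hash.new, which shows where the appended products ended up:

```ruby
Product = Struct.new(:name, :category)
products = [
  Product.new('Golden Delicious', 'Apple'),
  Product.new('Granny Smith', 'Apple'),
  Product.new('Navel', 'Orange')
]

# Every h[...] lookup misses, so every << lands on the one default array.
bar = products.inject(Hash.new([])) { |h, prod| h[prod.category] << prod; h }

p bar.size                                        # => 0 (no key was ever stored)
p bar.default.size                                # => 3 (all products sit in the default)
p bar["random string here"].equal?(bar.default)   # => true
```

So the hash itself stays empty, while its default object quietly collects everything.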

Date: Fri, 11 Sep 2009 08:30:07 +0900
From: glennj@ncf.ca
Subject: populating a hash from an array using inject
To: ruby-talk@ruby-lang.org

Glenn Jackman wrote:

I was looking at this problem on Stack Overflow (this one:
Reorganizing Ruby array into hash - Stack Overflow

The question is how to "convert" an array of objects into a hash.

Consider this code:

    require 'pp'
    p RUBY_VERSION

    Product = Struct.new(:name, :category)
    products = [
        ['Apple','Golden Delicious'],
        ['Apple','Granny Smith'],
        ['Orange','Navel']
    ].collect {|cat, name| Product.new(name, cat)}

    foo = products.inject({}) {|h,p| h[p.category] ||= []; h[p.category] << p; h}
    pp foo

    bar = products.inject(Hash.new([])) {|h,p| h[p.category] << p; h}
    pp bar

    baz = products.inject(Hash.new([])) {|h,p| h[p.category] += [p]; h}
    pp baz
    pp baz

It outputs:

    "1.8.7"
    {"Orange"=>[#<struct Product name="Navel", category="Orange">],
     "Apple"=>

Even though it is not very well written, if you read the documentation
on creating hashes with default values:

$ ri Hash.new

-------------------------------------------------------------- Hash::new
     Hash.new => hash
     Hash.new(obj) => aHash
     Hash.new {|hash, key| block } => aHash

------------------------------------------------------------------------
     Returns a new, empty hash. If this hash is subsequently accessed by
     a key that doesn't correspond to a hash entry, the value returned
     depends on the style of +new+ used to create the hash. In the first
     form, the access returns +nil+. If _obj_ is specified, this single
     object will be used for all _default values_. If a block is
     specified, it will be called with the hash object and the key, and
     should return the default value. It is the block's responsibility
     to store the value in the hash if required.

        h = Hash.new("Go Fish")
        h["a"] = 100
        h["b"] = 200
        h["a"] #=> 100
        h["c"] #=> "Go Fish"
        # The following alters the single default object
        h["c"].upcase! #=> "GO FISH"
        h["d"] #=> "GO FISH"
        h.keys #=> ["a", "b"]
-----------------------------------------------------------

The key line is:

***It is the block's responsibility to store the value in the hash if
required.***

Because your block does not assign the array to a hash key, the array is
discarded.
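
A short sketch of the three forms of Hash.new described in the ri output above may help here. The variable names (`plain`, `shared`, `fresh`) are made up for illustration; only Hash.new and the standard Hash methods are used:

```ruby
# Form 1: no default; a missing key gives nil.
plain = Hash.new
p plain[:missing]          # => nil

# Form 2: one default object, returned for every missing key.
shared = Hash.new([])
shared[:a] << 1            # mutates the default object; stores nothing
p shared.keys              # => []
p shared[:b]               # => [1]

# Form 3: a block, run on each miss; this one stores a fresh array.
fresh = Hash.new { |hash, key| hash[key] = [] }
fresh[:a] << 1
p fresh.keys               # => [:a]
p fresh[:b]                # => []
```

Note that the "responsibility to store the value" sentence in the documentation applies to the block form; with the object form nothing is stored on a miss.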

--
Posted via http://www.ruby-forum.com/.

Sorry, I pasted the ri output on top of the first part of my post, which
said something to the effect of:

My question: why is bar empty?

...because h[non_existent_key] sends an array that is unassociated with
any hash to the block. In other words, creating a hash with a default
value does not cause a key/value pair to be created in the hash when a
non-existent key is accessed.

--
Posted via http://www.ruby-forum.com/.

It was probably best that first sentence was erased. Here is what I was
trying to say:

h = Hash.new([])

result = h["A"]
p result

--output:--
[]

p h

--output:--
{}

When you write:

h[p.category] << p

The example above demonstrates that this is equivalent* to:

[] << p

...and that simply appends an object to an empty array, and does nothing
to the hash.

And you are in even worse shape than you realize. If you take the empty
array that is returned as the default value and append an object to the
array, and then manually assign the array to a key in the hash:

h = Hash.new([])

h["A"] = h["A"] << 10 #==>[10]

h["B"] = h["B"] << 20 #==>[20]

Look what happens:

p h

{"A"=>[10, 20], "B"=>[10, 20]}

Yep, ruby hands you a reference to the same array over and over again.
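
The shared reference can be checked directly with equal?, which compares object identity rather than contents. This is only a sketch of the point above; the local names `a` and `b` are illustrative:

```ruby
h = Hash.new([])

# Both lookups miss, so both return the very same default array.
a = h["A"]
b = h["B"]
p a.equal?(b)              # => true

# Storing it under two keys stores two references to that one array.
h["A"] = h["A"] << 10
h["B"] = h["B"] << 20
p h["A"]                   # => [10, 20]
p h["A"].equal?(h["B"])    # => true
p h["A"].equal?(h.default) # => true
```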

--
Posted via http://www.ruby-forum.com/.

The thread has moved on a bit, but.

He actually isn't USING the block form of Hash.new, the only blocks
are arguments to inject.

And Hash.new {|h,k| ... } is what I tend to reach for in a situation like this.

   foo = products.inject(Hash.new {|h,k| h[k] = [] }) {|h,p| h[p.category] << p; h}
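
Here is a runnable sketch of the block form, assuming the Product struct from the original post (`by_category` and `prod` are illustrative names). As an aside that is not part of the suggestion above, Enumerable#group_by (available from Ruby 1.8.7) appears to give the same grouping in one call:

```ruby
Product = Struct.new(:name, :category)
products = [
  Product.new('Golden Delicious', 'Apple'),
  Product.new('Granny Smith', 'Apple'),
  Product.new('Navel', 'Orange')
]

# The block runs on each miss and stores a new array under that key,
# so << always appends to an array that belongs to the hash.
by_category = products.inject(Hash.new { |h, k| h[k] = [] }) do |h, prod|
  h[prod.category] << prod
  h
end

p by_category.keys.sort                           # => ["Apple", "Orange"]
p by_category["Apple"].map { |prod| prod.name }   # => ["Golden Delicious", "Granny Smith"]

# Enumerable#group_by builds the same grouping directly.
grouped = products.group_by { |prod| prod.category }
p grouped == by_category                          # => true
```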

--
Rick DeNatale

Blog: http://talklikeaduck.denhaven2.com/
Twitter: http://twitter.com/RickDeNatale
WWR: http://www.workingwithrails.com/person/9021-rick-denatale
LinkedIn: http://www.linkedin.com/in/rickdenatale

7stud -- wrote:

h = Hash.new([])

h["A"] = h["A"] << 10 #==>[10]

h["B"] = h["B"] << 20 #==>[20]

Look what happens:

p h

{"A"=>[10, 20], "B"=>[10, 20]}

Yep, ruby hands you a reference to the same array over and over
again...when you try to access non-existent keys.

While I thank you for taking the time, I can't say I'm enlightened by
your explanation. What's the real difference between the blocks

bar = products.inject(Hash.new([])) {|h,p| h[p.category] << p; h}
baz = products.inject(Hash.new([])) {|h,p| h[p.category] += [p]; h}

Is it because += explicitly assigns a new object to the hash key?

--
Glenn Jackman
    Write a wise saying and your name will live forever. -- Anonymous

Hi --

Yes. Here's another way to look at it:

   array = []
   hash = Hash.new(array)
   hash[:x] << 1 # equivalent to: array << 1
   hash[:y] << 2 # equivalent to: array << 2
   hash[:z] += [:a,:b,:c] # equivalent to: hash[:z] = array + [:a,:b,:c]

David
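
The same comparison can be made observable with key? and equal?. This is a sketch along the lines of the snippet above, not a replacement for it; it also shows that whatever was appended to the default earlier leaks into the array that += stores:

```ruby
array = []
hash = Hash.new(array)

# << is a method call on whatever hash[:x] returns: the default array.
hash[:x] << 1
p hash.key?(:x)              # => false
p array                      # => [1]

# += expands to hash[:z] = hash[:z] + [...]; Array#+ builds a new array
# and []= stores it under :z.
hash[:z] += [:a, :b]
p hash.key?(:z)              # => true
p hash[:z]                   # => [1, :a, :b]
p hash[:z].equal?(array)     # => false
p array                      # => [1]
```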

--
David A. Black / Ruby Power and Light, LLC / http://www.rubypal.com
Ruby/Rails training, mentoring, consulting, code-review
Latest book: The Well-Grounded Rubyist (http://www.manning.com/black2)

September Ruby training in NJ has been POSTPONED. Details to follow.

Glenn Jackman wrote:

> While I thank you for taking the time, I can't say I'm enlightened by
> your explanation. What's the real difference between the blocks
>
> bar = products.inject(Hash.new([])) {|h,p| h[p.category] << p; h}
> baz = products.inject(Hash.new([])) {|h,p| h[p.category] += [p]; h}
>
> Is it because += explicitly assigns a new object to the hash key?

Yes. When the key doesn't exist the first line is equivalent to:

bar = products.inject(Hash.new([])) {|h,p| [] << p; h}

which does nothing to the hash--all it does is append p to an empty
array, and then the empty array is discarded.

The second line is equivalent to:

baz = products.inject(Hash.new([])) {|h,p| h[p.category] = h[p.category] + [p]; h}

If you access a non-existent key, say h["A"], then that line is
equivalent to

baz = products.inject(Hash.new([])) {|h,p| h["A"] = [] + [p]; h}

or

baz = products.inject(Hash.new([])) {|h,p| h["A"] = [p]; h}

which explicitly sets a key in the hash to the value [p], thereby
altering the hash.
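
The spelled-out assignment can be run as-is. A minimal sketch, again assuming the Product struct from the original post, with `prod` as an illustrative block variable; it also checks the default object afterwards:

```ruby
Product = Struct.new(:name, :category)
products = [
  Product.new('Golden Delicious', 'Apple'),
  Product.new('Granny Smith', 'Apple'),
  Product.new('Navel', 'Orange')
]

baz = products.inject(Hash.new([])) do |h, prod|
  # Spelled-out form of h[prod.category] += [prod]
  h[prod.category] = h[prod.category] + [prod]
  h
end

p baz.keys.sort          # => ["Apple", "Orange"]
p baz["Apple"].size      # => 2
p baz.default            # => [] (Array#+ returns a new array, so the default is never mutated)
```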

--
Posted via http://www.ruby-forum.com/.

As we've seen, it's not discarded: h keeps a reference to it as its
default value:

    products = [[1,2],[3,4],[1,5]]
    foo = products.inject(Hash.new([])) {|h,(a,b)| h[a] << b; h} # => {}
    foo[:unknown] # => [2, 4, 5]

--
Glenn Jackman
    Write a wise saying and your name will live forever. -- Anonymous

Glenn Jackman wrote:

>> bar = products.inject(Hash.new([])) {|h,p| [] << p; h}
>>
>> which does nothing to the hash--all it does is append p to an empty
>> array, and then the empty array is discarded.
>
> As we've seen, it's not discarded: it's kept for h's reference:
>
>     products = [[1,2],[3,4],[1,5]]
>     foo = products.inject(Hash.new([])) {|h,(a,b)| h[a] << b; h} # => {}
>     foo[:unknown] # => [2, 4, 5]

...and this is a lie too:

> If you access a non-existent key, say h["A"], then that line is
> equivalent to
>
> baz = products.inject(Hash.new([])) {|h,p| h["A"] = [] + [p]; h}

You asked for lies. I gave them to you.

--
Posted via http://www.ruby-forum.com/.