Iterating over an Array of Hashes

All,

When iterating over an Array of Hashes repeatedly, the 'each' method
appears to pass a reference to - not a copy of - each element in the
array.

Is there a way to take a *copy* of the data, rather than a reference to
the data, so that any changes to each hash aren't kept?

Explaining it another way - I have code similar to the following:

  a = [ '2011-01-01', '2011-01-02', '2011-01-03' ]
  b = [ { :x => "foo" }, { :x => "bar" }, { :x => "baz" } ]

  a.each do |a_item|

    puts a_item

    b.each do |b_item|
      b_item[:date] = a_item
      # DO SOMETHING WITH B
    end

  end

What I want to do within the b.each loop is work on a *copy* of each
element of b, such that if I mess around with it, the changes are lost
when the loop exits.

What is the magic I am missing?

Peter

I think this might do what you want:

    Marshal.load(Marshal.dump(b)).each do |b_item|
      # do stuff
    end

It's ugly and kludgey, but it should work.

···

On Wed, May 04, 2011 at 07:58:04AM +0900, Peter Hicks wrote:

What I want to do within the b.each loop is work on a *copy* of each
element of b, such that if I mess around with it, the changes are lost
when the loop exits.

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

hi Peter -

  well, this is certainly no magic - but you could just make a temp
array, and check each entry to make sure you like it before committing
it to "b."
  there's probably a better way to do this, but here's what i came up
with:

a = [ '2011-01-01', '2011-01-02', '2011-01-03' ]
b = [ { :x => "foo" }, { :x => "bar" }, { :x => "baz" } ]
temp = []

a.each{|a_item| temp << {:date => a_item}}

temp.each{|entry|
  unless entry.inspect.include?("02") #or something else more relevant
    idx = temp.index(entry)
    b[idx] = entry
  end
}

temp = []

p b

#=> [{:date=>"2011-01-01"}, {:x=>"bar"}, {:date=>"2011-01-03"}]

  - j

···

--
Posted via http://www.ruby-forum.com/.

Peter Hicks wrote in post #996483:

What is the magic I am missing?

clone():

b = [ { :x => "foo" }, { :x => "bar" }, { :x => "baz" } ]
count = 0

b.each do |x|
  copy = x.clone
  copy[:x] = count
  count += 1

  puts copy[:x]
end

b.each do |x|
  p x
end

--output:--
0
1
2
{:x=>"foo"}
{:x=>"bar"}
{:x=>"baz"}

···

--
Posted via http://www.ruby-forum.com/.

`clone` doesn't cut it, since it's creating shallow copies. An
illustration of the problem:

==== begin snippet ====
arr = [{:fruit => {:kind => 'apple'}}, {:fruit => {:kind => 'banana'}}]
# => [{:fruit=>{:kind=>"apple"}}, {:fruit=>{:kind=>"banana"}}]

arr.map(&:clone).each { |h| h[:fruit][:kind] = 'coconut' }
# => [{:fruit=>{:kind=>"coconut"}}, {:fruit=>{:kind=>"coconut"}}]
==== end snippet ====

Notice that `arr` now has coconuts in it, instead of the original
apples and bananas.
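
One way to see what is and isn't shared is to compare object identity
with equal?. This is only a minimal sketch with illustrative variable
names (copies, outer_shared, inner_shared are made up for the example):

```ruby
# Same shape of data as in the snippet above, trimmed to one element.
arr    = [{ :fruit => { :kind => 'apple' } }]
copies = arr.map(&:clone)

# equal? tests object identity, not value equality.
outer_shared = copies[0].equal?(arr[0])                  # outer hash: a new object
inner_shared = copies[0][:fruit].equal?(arr[0][:fruit])  # nested hash: same object

puts "outer hash shared: #{outer_shared}"   # prints "outer hash shared: false"
puts "inner hash shared: #{inner_shared}"   # prints "inner hash shared: true"
```

The outer hashes are distinct, so adding or reassigning top-level keys
is safe; anything reached through :fruit is still the original object.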

~ jf

···

On Tue, May 3, 2011 at 20:34, 7stud -- <bbxx789_05ss@yahoo.com> wrote:

Peter Hicks wrote in post #996483:

What is the magic I am missing?

clone():

--
John Feminella
Principal Consultant, BitsBuilder
LI: http://www.linkedin.com/in/johnxf
SO: User John Feminella - Stack Overflow

John Feminella wrote in post #996498:

···

On Tue, May 3, 2011 at 20:34, 7stud -- <bbxx789_05ss@yahoo.com> wrote:

Peter Hicks wrote in post #996483:

What is the magic I am missing?

clone():

`clone` doesn't cut it, since it's creating shallow copies. An
illustration of the problem:

Really? Look at my output. It cuts it just fine.

--
Posted via http://www.ruby-forum.com/.

Your example doesn't contain nested hashes, while mine does. That's
what I was demonstrating -- it's a shallow copy, not a deep one.

~ jf

···

--
John Feminella
Principal Consultant, BitsBuilder
LI: http://www.linkedin.com/in/johnxf
SO: User John Feminella - Stack Overflow

On Tue, May 3, 2011 at 22:00, 7stud -- <bbxx789_05ss@yahoo.com> wrote:

John Feminella wrote in post #996498:

On Tue, May 3, 2011 at 20:34, 7stud -- <bbxx789_05ss@yahoo.com> wrote:

Peter Hicks wrote in post #996483:

What is the magic I am missing?

clone():

`clone` doesn't cut it, since it's creating shallow copies. An
illustration of the problem:

Really? Look at my output. It cuts it just fine.

--
Posted via http://www.ruby-forum.com/.

Any problem with the Marshal dump and load approach of Chad Perrin?

Look...

arr = [{:fruit => {:kind => 'apple'}}, {:fruit => {:kind => 'banana'}}]

Marshal.load(Marshal.dump arr).each do |h|
  h[:fruit][:kind] = 'coconut'
end

If it's ugly, you can try to beautify it...

class Object
  def deep_copy
    Marshal.load(Marshal.dump self)
  end
end

arr.deep_copy.each do |h|
  h[:fruit][:kind] = 'coconut'
end

Abinoam Jr.

···

On Tue, May 3, 2011 at 10:08 PM, John Feminella <johnf@bitsbuilder.com> wrote:

Your example doesn't contain nested hashes, while mine does. That's
what I was demonstrating -- it's a shallow copy, not a deep one.

~ jf
--
John Feminella
Principal Consultant, BitsBuilder
LI: http://www.linkedin.com/in/johnxf
SO: User John Feminella - Stack Overflow

On Tue, May 3, 2011 at 22:00, 7stud -- <bbxx789_05ss@yahoo.com> wrote:

John Feminella wrote in post #996498:

On Tue, May 3, 2011 at 20:34, 7stud -- <bbxx789_05ss@yahoo.com> wrote:

Peter Hicks wrote in post #996483:

What is the magic I am missing?

clone():

`clone` doesn't cut it, since it's creating shallow copies. An
illustration of the problem:

Really? Look at my output. It cuts it just fine.

--
Posted via http://www.ruby-forum.com/.

Your example doesn't contain nested hashes, while mine does.

Neither did Peter's example and request.

That's what I was demonstrating -- it's a shallow copy, not a deep
one.

Simply using #dup or #clone meets the original request as presented,
and does so for arbitrary Ruby objects in the hashes.

If Peter needs a deep copy, which Peter didn't ask for (just a
copy of the data in each hash in the array -- not nested hashes -- so
that the originals in the array wouldn't be modified on iteration) it's
true that simply using #dup or #clone won't work.

OTOH, if Peter needs to be able to handle arbitrary Ruby objects in
the hashes, using Marshal.load(Marshal.dump(whatever)) won't work,
since there are objects that can't be dumped.

What if the values in the hash are lambdas?
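
A minimal sketch of that failure case, assuming a hash with a single
lambda value (the :callback key and the variable names are made up for
illustration):

```ruby
h = { :callback => lambda { |x| x * 2 } }

# Marshal cannot serialize Proc objects, so the round trip raises.
marshal_error = begin
  Marshal.load(Marshal.dump(h))
  nil
rescue TypeError => e
  e
end

puts marshal_error.class           # prints "TypeError"

# A shallow copy has no such restriction; the lambda is simply shared.
shallow = h.clone
puts shallow[:callback].call(21)   # prints "42"
```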

The load/dump mechanism meets some other need, but doesn't meet the
original request for hashes with arbitrary data (and is an
extraordinarily convoluted mechanism for meeting the original request
even where it works.)

It's possible to craft a generic deep-copying mechanism, if you need
it, but Marshal.load(Marshal.dump(whatever)) isn't it.
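
As a rough sketch only of what such a mechanism might look like:
deep_copy below is a hypothetical helper, not anything from the
standard library, and it makes simplifying assumptions. It recurses
into Hash and Array, dups Strings, and shares every other object
(lambdas included) by reference. It does not handle cyclic structures,
hash default procs, or other container classes.

```ruby
# Hypothetical helper: recursive copy of nested Hash/Array structures.
def deep_copy(obj)
  case obj
  when Hash
    obj.each_with_object({}) { |(k, v), copy| copy[deep_copy(k)] = deep_copy(v) }
  when Array
    obj.map { |item| deep_copy(item) }
  when String
    obj.dup
  else
    obj   # Symbols, numbers, lambdas, etc. are returned as-is (shared)
  end
end

doubler  = lambda { |x| x * 2 }
original = [{ :fruit => { :kind => 'apple'.dup }, :fn => doubler }]

copy = deep_copy(original)
copy[0][:fruit][:kind] << ' pie'       # mutate the copied String in place

puts original[0][:fruit][:kind]        # prints "apple"
puts copy[0][:fruit][:kind]            # prints "apple pie"
puts copy[0][:fn].call(4)              # prints "8"
```

Whether sharing the non-container leaves is acceptable depends on the
use case; that is a design choice of this sketch, not a rule.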

···

On Tue, May 3, 2011 at 7:08 PM, John Feminella <johnf@bitsbuilder.com> wrote:

>
> Your example doesn't contain nested hashes, while mine does.

Neither did Peter's example and request.

His example was an array with nested hashes -- that is, hashes nested in
an array. It was not hashes nested in hashes, but I do not think that
was what John meant anyway.

>
> That's what I was demonstrating -- it's a shallow copy, not a deep
> one.

Simply using #dup or #clone meets the original request as presented,
and does so for arbitrary Ruby objects in the hashes.

If Peter needs a deep copy, which Peter didn't ask for (just a copy
of the data in each hash in the array -- not nested hashes -- so that
the originals in the array wouldn't be modified on iteration) it's true
that simply using #dup or #clone won't work.

I went back and read the original request. While he did not use the
words "deep copy", the implication of his request seemed pretty clearly
to ask for exactly that, in my estimation. He referred to an array of
hashes; he referred to things being "references" rather than "copies"
(the more formal terminology for those with a pedantic bent would be
"reference copies rather than value copies"); and he referred to the
desire to be able to operate on his data structure in a loop, making
changes, and have those changes lost when the loop exits rather than
saved in the data structure that existed before copying.

The implications of these requirements add up to a request for a way to
get a deep copy.

OTOH, if Peter needs to be able to handle arbitrary Ruby objects in the
hashes, using Marshal.load(Marshal.dump(whatever)) won't work, since
there are objects that can't be dumped.

It meets the need of the example data for the requirements indicated
above. I am not aware of any (relatively trivial) solution to this much
broader problem you're suggesting.

What if the values in the hash are lambdas?

Then yeah, the Marshal dump/load approach doesn't work. It *does* work
for the presented example, though, whereas (given the implications of the
requests in the original questions) your dup-or-clone approach does *not*
work.

The load/dump mechanism meets some other need, but doesn't meet the
original request for hashes with arbitrary data (and is an
extraordinarily convoluted mechanism for meeting the original request
even where it works.)

1. It meets the needs of the presented example data.

2. Is there some less-convoluted approach that works for the more complex
needs you presented?

3. How exactly does this make a dup-or-clone approach such as what
you presented work?

It's possible to craft a generic deep-copying mechanism, if you need it,
but Marshal.load(Marshal.dump(whatever)) isn't it.

It is not clear to me "generic deep-copying" is exactly what's needed.

···

On Wed, May 04, 2011 at 12:28:18PM +0900, Christopher Dicely wrote:

On Tue, May 3, 2011 at 7:08 PM, John Feminella <johnf@bitsbuilder.com> wrote:

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

Hi,
I'm running a pc with Windows Vista...
My email flags your emails with 'phishing' warning...
Thought you might want to know...
Good day

···

----- Original Message ----- From: "Chad Perrin" <code@apotheon.net>
To: "ruby-talk ML" <ruby-talk@ruby-lang.org>
Sent: Wednesday, May 04, 2011 11:18 AM
Subject: Re: Iterating over an Array of Hashes

What if the values in the hash are lambdas?

Then yeah, the Marshal dump/load approach doesn't work. It *does* work
for the presented example, though, whereas (given the implications of the
requests in the original questions) your dup-or-clone approach does *not*
work.

The #clone based method presented upthread (which wasn't mine, it was
7stud's) works perfectly for the data and requirements presented. It
doesn't work if, in addition to wanting to lose changes to the passed
hashes, one also wants to lose any changes resulting from manipulation
of the data in the hashes, but that wasn't requested.

That is, since you work on a local copy of each hash, mutating methods
called on the hash won't have any lasting effect, but mutating methods
called on keys or values extracted from the (local copy of the) hash
will affect the objects stored in the original hash. Avoiding this
kind of effect was not requested.
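
To make the distinction concrete, here is a minimal sketch using one
hash from the example data (the variable names are illustrative):

```ruby
original = { :x => "foo".dup }   # .dup keeps the String mutable in any setup

# Case 1: a mutating method called on the (copied) hash itself.
copy = original.clone
copy[:x] = "changed"             # rebinds :x in the copy only
puts original[:x]                # prints "foo"

# Case 2: a mutating method called on a value pulled out of the copy.
copy = original.clone
copy[:x] << "!"                  # String#<< modifies the shared String in place
puts original[:x]                # prints "foo!"
```

In the first case the original is untouched; in the second the change
shows through, because both hashes point at the same String object.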

The load/dump mechanism meets some other need, but doesn't meet the
original request for hashes with arbitrary data (and is an
extraordinarily convoluted mechanism for meeting the original request
even where it works.)

1. It meets the needs of the presented example data.

Sure, but it's way overkill for it. You only need deep copies if you
have requirements that weren't stated (avoiding effects from mutating
methods called not on the hashes being passed but on the keys or
values of the hashes.)

2. Is there some less-convoluted approach that works for the more complex
needs you presented?

I'm not sure what relevance this has. A deep copying solution that
handles arbitrary rather than merely serializable objects at the
fringes is going to be more complex than
Marshal.load(Marshal.dump(...)), but since either approach is more
convoluted than what is needed here, I don't see why that matters here
(and, since Marshal.load(Marshal.dump(...)) doesn't work for the cases
where you'd need the more sophisticated deep copying solution, I don't
see why the fact that the latter would be more complex would matter
there, either -- a simpler approach that doesn't work is still no
solution at all.)

3. How exactly does this make a dup-or-clone approach such as what
you presented work?

"this" doesn't make the dup-or-clone approach work, it is orthogonal
to the fact that the dup-or-clone approach, as presented (in #clone
form) by 7stud upthread _does_ work for the scenario presented.

The #clone based method presented upthread (which wasn't mine, it was
7stud's) works perfectly for the data and requirements presented. It
doesn't work if, in addition to wanting to lose changes to the passed
hashes, one also wants to lose any changes resulting from manipulation
of the data in the hashes, but that wasn't requested.

Are you seriously claiming that the person's statements seemed to you to
indicate a desire for the data to change in the original data structure?

Seriously?

That is, since you work on a local copy of each hash, mutating methods
called on the hash won't have any lasting effect, but mutating methods
called on keys or values extracted from the (local copy of the) hash
will affect the objects stored in the original hash. Avoiding this kind
of effect was not requested.

It really seemed implied from where I was sitting.

Sure, but it's way overkill for it. You only need deep copies if you
have requirements that weren't stated (avoiding effects from mutating
methods called not on the hashes being passed but on the keys or values
of the hashes.)

I'm not sure why you're so pedantically splitting hairs when the actual
intent seemed pretty obvious: no changes.

>
> 2. Is there some less-convoluted approach that works for the more
> complex needs you presented?

I'm not sure what relevance this has.

I think you're playing dumb.

···

On Thu, May 05, 2011 at 04:17:53AM +0900, Christopher Dicely wrote:

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

Thank you.

I'm trying to get this resolved with a service provider. Apparently the
provider's mail server IP address has ended up on a blacklist or two for
some reason completely unrelated to me.

Just in case your phishing warning is unrelated to that, though -- are
you sure it's not related to the fact that my emails are digitally
signed? I've noticed that MS Windows users sometimes have problems with
digital signature attachments being marked as malware or otherwise
misidentified as some kind of threat.

···

On Thu, May 05, 2011 at 01:35:47AM +0900, Patrick Lynch wrote:

I'm running a pc with Windows Vista...
My email flags your emails with 'phishing' warning...
Thought you might want to know...

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

No, I'm claiming that the original requester's statement didn't
indicate a need to protect against changes resulting from calling
mutating methods on the keys or values of the hashes, only protection
from changes resulting from mutations on the hash.

That could be because changes to things held in the hash were supposed
to be propagated, or it (perhaps more likely) could be because the
code in which it was to be used wasn't going to be calling mutating
methods on the keys or values from the hash in any case.

In the former case, a deep copy would be wrong, in the latter case it
would merely be unnecessary.

···

On Wed, May 4, 2011 at 1:50 PM, Chad Perrin <code@apotheon.net> wrote:

On Thu, May 05, 2011 at 04:17:53AM +0900, Christopher Dicely wrote:

The #clone based method presented upthread (which wasn't mine, it was
7stud's) works perfectly for the data and requirements presented. It
doesn't work if, in addition to wanting to lose changes to the passed
hashes, one also wants to lose any changes resulting from manipulation
of the data in the hashes, but that wasn't requested.

Are you seriously claiming that the person's statements seemed to you to
indicate a desire for the data to change in the original data structure?

2 solutions have been presented. I think the original poster's question was
sufficiently vague to make both of the solutions valid.

So, original poster: what were your intentions? Did you need only the
hashes in the array preserved, or did you ALSO need the contents of the
hashes in the array preserved? If the former, then clone/dup is fine. If
the latter, then you need some kind of deep copy. The "simplest" deep copy
I can think of is the Marshal.dump/load one.

Saludos,
Doug

···

>>
>> The #clone based method presented upthread (which wasn't mine, it was
>> 7stud's) works perfectly for the data and requirements presented. It
>> doesn't work if, in addition to wanting to lose changes to the passed
>> hashes, one also wants to lose any changes resulting from manipulation
>> of the data in the hashes, but that wasn't requested.
>
> Are you seriously claiming that the person's statements seemed to you to
> indicate a desire for the data to change in the original data structure?

No, I'm claiming that the original requester's statement didn't
indicate a need to protect against changes resulting from calling
mutating methods on the keys or values of the hashes, only protection
from changes resulting from mutations on the hash.

He didn't specify "only protection from changes resulting from mutations
on the hash." He said he didn't want his actions to change his data
structure, in a very general way. Drawing distinctions between the data
itself and the "containers" in which the data resides seems like a case
of inventing complexity in the original request that was not indicated.

That could be because changes to things held in the hash were supposed
to be propagated, or it (perhaps more likely) could be because the code
in which it was to be used wasn't going to be calling mutating methods
on the keys or values from the hash in any case.

In the former case, a deep copy would be wrong, in the latter case it
would merely be unnecessary.

. . . and in the general "I don't want to permanently change anything"
case, which seems most likely, it would be necessary.

···

On Thu, May 05, 2011 at 06:21:00AM +0900, Christopher Dicely wrote:

On Wed, May 4, 2011 at 1:50 PM, Chad Perrin <code@apotheon.net> wrote:
> On Thu, May 05, 2011 at 04:17:53AM +0900, Christopher Dicely wrote:

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

Does anyone want to make a bet with me? I bet $10 he says he doesn't
want the actual data in the hash to change, if he comes back and says
anything one way or the other. My first clue is where he starts out by
talking about copying the *data* -- using the word "data" literally --
without simply creating a reference to the original data.

···

On Thu, May 05, 2011 at 06:38:57AM +0900, Douglas Seifert wrote:

2 solutions have been presented. I think the original poster's question was
sufficiently vague to make both of the solutions valid.

So, original poster: what were your intentions? Did you need only the
hashes in the array preserved, or did you ALSO need the contents of the
hashes in the array preserved? If the former, then clone/dup is fine. If
the latter, then you need some kind of deep copy. The "simplest" deep copy
I can think of is the Marshal.dump/load one.

--
Chad Perrin [ original content licensed OWL: http://owl.apotheon.org ]

All,

Apologies, I lost track of the thread.

So, original poster: what were your intentions? Did you need only the
hashes in the array preserved, or did you ALSO need the contents of the
hashes in the array preserved? If the former, then clone/dup is fine. If
the latter, then you need some kind of deep copy. The "simplest" deep copy
I can think of is the Marshal.dump/load one.

I just needed the hashes in the array preserved. I've done this
with .clone now, and it works like a treat. I now have other problems
over ActiveRecord's slowness and various other parts of the "correct but
slow" code :)

Thanks for your help - this was one area in which I truly had a surprise
when my unit tests failed!
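
For anyone finding this thread later, a sketch of how .clone might slot
into the loop from the original post. Peter's real code was not posted,
so the variable name working and the final check are assumptions made
for illustration:

```ruby
a = [ '2011-01-01', '2011-01-02', '2011-01-03' ]
b = [ { :x => "foo" }, { :x => "bar" }, { :x => "baz" } ]

a.each do |a_item|
  b.each do |b_item|
    working = b_item.clone      # shallow copy of this one hash
    working[:date] = a_item     # only the copy gains a :date key
    # ... do something with working ...
  end
end

# The hashes in b never received a :date key.
puts b.any? { |h| h.key?(:date) }   # prints "false"
```

This covers the "just the hashes preserved" case only; values stored in
the hashes are still shared with the originals.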

Peter

···

On Thu, 2011-05-05 at 06:38 +0900, Douglas Seifert wrote:

--
Peter Hicks <peter.hicks@poggs.co.uk>

Yes, which is why I used the word "indicated" rather than "specified";
those words have very different meanings.

I certainly think that, as at least one commenter has stated, the original
request was sufficiently ambiguous to support different readings. I
think we've addressed the relative utility of the various options that
have been presented sufficiently that the OP (or others with similar
concerns) can make up their minds about what approach is best for
their use cases, or ask cogent follow-up questions if they need more
information.

Clearly, we disagree on what the best interpretation of the original
poster's requirements is, but I think we are clearly past the point
where further discussion of that disagreement has any value for anyone
reading.

···
