Help traversing and modifying hash key and value inplace

Having a little trouble here. seems that I'm getting some errors trying
to dup certain values (like FIXNUM), or if I try clone it says that I
can't modify a frozen string. Is there a way to achieve this
functionality? Here's the code:

class Hash

   # Returns a new hash created by traversing the hash and its
   # subhashes, executing the given block on each key/value pair.

···

#
   # h = { "A"=>"A", "B"=>"B" }
   # h = h.traverse { |k,v| k.downcase! }
   # h #=> { "a"=>"A", "b"=>"B" }
   #
   def traverse( &yld )
     h = {}
     self.each_pair do |k,v|
       q = k.dup
       f = v.dup
       if f.kind_of?(Hash)
         h[q] = f.traverse( &yld )
       else
         yield(q, f)
         h[q] = f
       end
     end
     return h
   end

end

Thanks,
T.

"Trans" <transfire@gmail.com> schrieb im Newsbeitrag news:1112899357.671275.91150@o13g2000cwo.googlegroups.com...

Having a little trouble here. seems that I'm getting some errors trying
to dup certain values (like FIXNUM), or if I try clone it says that I
can't modify a frozen string. Is there a way to achieve this
functionality? Here's the code:

class Hash

  # Returns a new hash created by traversing the hash and its
  # subhashes, executing the given block on each key/value pair.
  #
  # h = { "A"=>"A", "B"=>"B" }
  # h = h.traverse { |k,v| k.downcase! }
  # h #=> { "a"=>"A", "b"=>"B" }
  #
  def traverse( &yld )
    h = {}
    self.each_pair do |k,v|
      q = k.dup
      f = v.dup
      if f.kind_of?(Hash)
        h[q] = f.traverse( &yld )
      else
        yield(q, f)
        h[q] = f
      end
    end
    return h
  end

end

Unfortunately not all objects can be dupe'd or cloned. I wished that #dup would return self in these cases but Matz decided to do it otherwise (both alternatives have their merits). You need special treatment for that to work. Alternatively use Marshal.

Another problem of your implementation is that it doesn't take recursive structures into account. I also find the copying of values inside the method sub optimal: there might be cases where you don't want or need copies (for example if you add 1 to all keys and values, which creates new instances anyway). So I'd leave the copying to the discretion of the block.

If I wanted to do the conversion you did, I'd probably do this:

h.inject({}){|h,(k,v)| h[k.downcase] = v; h}

(Yes, I know these are not exactly equivalent.)

Just some 0.02EUR...

Kind regards

    robert

Thanks Robert.

I agree about returning self. Since dup can't be used when the object
is immutable (that's the condition isn't it?) then it would seem
reasonable to return self.

I'm not sure what recursive structures you mean. And I don't see how it
could work if the duping is left to the block. How could a new hash be
built up then?

Hmm... perhaps it would work better if a hash parameter were fed into
the block itself and then that could be used?

  h = h.traverse { |n,k,v| n[k.downcase] = v }

Would that a better approach?

BTW, somone sent me another version similar to my original, but using
inject and somehow getting around the dup problem like so:

  def traverse(&b)
    inject({}) do |h,kv|
      k,v = kv.first.dup, kv.last.dup
      b[k,v]
      h[k] = (Hash === v ? v.traverse(&b) : v)
      h
    end
  end

T.

"Trans" <transfire@gmail.com> schrieb im Newsbeitrag news:1113017766.220202.153150@f14g2000cwb.googlegroups.com...

Thanks Robert.

I agree about returning self. Since dup can't be used when the object
is immutable (that's the condition isn't it?) then it would seem
reasonable to return self.

Yeah, but in that case other applications can break because they think they have a new instance while they have not. As I said, both approaches have their merits - a classical dilemma. :slight_smile:

I'm not sure what recursive structures you mean. And I don't see how it
could work if the duping is left to the block. How could a new hash be
built up then?

No, you need to create a new Hash instance. But you don't necessarily need to dup keys and values. That's the dup I suggested to leave to the block.

Hmm... perhaps it would work better if a hash parameter were fed into
the block itself and then that could be used?

h = h.traverse { |n,k,v| n[k.downcase] = v }

Would that a better approach?

I wouldn't do that.

BTW, somone sent me another version similar to my original, but using
inject and somehow getting around the dup problem like so:

def traverse(&b)
   inject({}) do |h,kv|
     k,v = kv.first.dup, kv.last.dup
     b[k,v]
     h[k] = (Hash === v ? v.traverse(&b) : v)
     h
   end
end

That's not a solution for the dup problem, as "k,v = kv.first.dup, kv.last.dup" will throw anyway.

If you really always need copies then the easiest might acutally be to use Marshal and work on the copy. Marshal has solved all the problems of recursion and duping already - so why do the work twice?

Kind regards

    robert

Robert Klemme wrote:

"Trans" <transfire@gmail.com> schrieb im Newsbeitrag
news:1113017766.220202.153150@f14g2000cwb.googlegroups.com...
> Thanks Robert.
>
> I agree about returning self. Since dup can't be used when the

object

> is immutable (that's the condition isn't it?) then it would seem
> reasonable to return self.

Yeah, but in that case other applications can break because they

think they

have a new instance while they have not. As I said, both approaches

have

their merits - a classical dilemma. :slight_smile:

Could Marshal be used to remedy this?

> I'm not sure what recursive structures you mean. And I don't see

how it

> could work if the duping is left to the block. How could a new hash

be

> built up then?

No, you need to create a new Hash instance. But you don't

necessarily need

to dup keys and values. That's the dup I suggested to leave to the

block.

Okay, I'll give this some more thought.

> Hmm... perhaps it would work better if a hash parameter were fed

into

> the block itself and then that could be used?
>
> h = h.traverse { |n,k,v| n[k.downcase] = v }
>
> Would that a better approach?

I wouldn't do that.

No good ey? I was thinking of it would work something like inject. But
maybe this over complexifies the problem.

That's not a solution for the dup problem, as "k,v = kv.first.dup,
kv.last.dup" will throw anyway.

Oops. Your right, still some problems there.

If you really always need copies then the easiest might acutally be

to use

Marshal and work on the copy. Marshal has solved all the problems of

recursion and duping already - so why do the work twice?

Okay, I'll give that a go too. Thanks, robert.

T.

So here's what I've settled on. Good? Or is there still a better way to
do?

# Returns a new hash created by traversing the hash and its subhashes,
  # executing the given block on the key and value. The block should
  # return a 2-element array of the form +[key, value]+.

···

#
  # h = { "A"=>"A", "B"=>"B" }
  # h.traverse { |k,v| [k.downcase, v] }
  # h #=> { "a"=>"A", "b"=>"B" }
  #
  def traverse(&b)
    inject({}) do |h,(k,v)|
      nk, nv = b[k,v]
      h[nk] = (Hash === nv ? nv.traverse(&b) : nv)
      h
    end
  end

"Trans" <transfire@gmail.com> schrieb im Newsbeitrag news:1113350541.397689.163850@l41g2000cwc.googlegroups.com...

So here's what I've settled on. Good? Or is there still a better way to
do?

# Returns a new hash created by traversing the hash and its subhashes,
# executing the given block on the key and value. The block should
# return a 2-element array of the form +[key, value]+.
#
# h = { "A"=>"A", "B"=>"B" }
# h.traverse { |k,v| [k.downcase, v] }
# h #=> { "a"=>"A", "b"=>"B" }
#
def traverse(&b)
   inject({}) do |h,(k,v)|
     nk, nv = b[k,v]
     h[nk] = (Hash === nv ? nv.traverse(&b) : nv)
     h
   end
end

There's a subtle thing: you decide whether you traverse on the other instance *after* the block got it. Dunno whether that is what you want but I think I remember the other version decided this based on the original value. You probably want to leave that out altogether and have the block decide this. In this case traverse would become even simpler.

Ah, and traverse will crash for recursive structures.

h={}

=> {}

h[1]=h

=> {1=>{...}}

h.traverse {|*a| a}

SystemStackError: stack level too deep
        from (irb):5:in `traverse'
        from (irb):3:in `inject'
        from (irb):3:in `each'
        from (irb):3:in `inject'
        from (irb):3:in `traverse'
        from (irb):5:in `traverse'
        from (irb):3:in `inject'
        from (irb):3:in `each'
        from (irb):3:in `inject'
        from (irb):3:in `traverse'
        from (irb):5:in `traverse'
        from (irb):3:in `inject'
        from (irb):3:in `each'
        from (irb):3:in `inject'
        from (irb):3:in `traverse'
        from (irb):5:in `traverse'
.... 11844 levels...
        from (irb):5:in `traverse'
        from (irb):3:in `inject'
        from (irb):3:in `each'
        from (irb):3:in `inject'
        from (irb):3:in `traverse'
        from (irb):5:in `traverse'
        from (irb):3:in `inject'
        from (irb):3:in `each'
        from (irb):3:in `inject'
        from (irb):3:in `traverse'
        from (irb):5:in `traverse'
        from (irb):3:in `inject'
        from (irb):3:in `each'
        from (irb):3:in `inject'
        from (irb):3:in `traverse'
        from (irb):12>>

:slight_smile:

Kind regards

    robert

Robert Klemme wrote:

There's a subtle thing: you decide whether you traverse on the other
instance *after* the block got it. Dunno whether that is what you

want but

I think I remember the other version decided this based on the

original

value.

Good catch. Thank you, so:

  def traverse(&b)
    inject({}) do |h,(k,v)|
      nk, nv = b[k,v]
      h[nk] = (Hash === v ? v.traverse(&b) : nv)
      h
    end
  end

You probably want to leave that out altogether and have the block
decide this. In this case traverse would become even simpler.

Hmmm... not sure what "this" refers to. Assuming you mean, whether to
traverse or not. Something like:

  def traverse(&b)
    inject({}) do |h,(k,v)|
      nk, nv = b[k,v]
      h[nk] = nv
      h
    end
  end

Is that right? In that case I wouldn't call it "traverse" though --more
like a hash verion of collect. Actually I think I already have that in
the libs as Enumerable#build_hash.

Ah, and traverse will crash for recursive structures.

Should I go through the complexity of preventing that? Well, at very
least I will note it in the docs.

Thanks robert,
T.

Sigh, sorry if my last message is garbled --blame Google groups.

BTW, speaking of #build_hash: Anyone have a "smoother" name? I was
thinking possibly #remap.

Which leads me to wonder, actually, why is #map the same as #collect?
Always struck me as strange that those two words would be considered
synonymous. I'm guessing its derivative from another language?

T.