Help me improve Hash#rekey

The note `TODO: Improve Hash#rekey code!!!` has been in my docs for too
long. I could use other's insights and thought others might enjoy the
challenge. So here's the code:

  require 'facets/na'

  class Hash

    # Rekey a hash:

···

#
    # rekey()
    # rekey(from_key => to_key, ...)
    # rekey{|from_key| to_key}
    # rekey{|from_key, value| to_key}
    #
    # If a key map is given, then the first key is changed to the second
key.
    #
    # foo = { :a=>1, :b=>2 }
    # foo.rekey(:a=>'a') #=> { 'a'=>1, :b=>2 }
    # foo.rekey(:b=>:x) #=> { :a =>1, :x=>2 }
    # foo.rekey('foo'=>'bar') #=> { :a =>1, :b=>2 }
    #
    # If a block is given, converts all keys in the Hash accroding to the
    # given block procedure. If the block returns +NA+ for a given key,
    # then that key will be left intact.
    #
    # foo = { :name=>'Gavin', :wife=>:Lisa }
    # foo.rekey{ |k| k.to_s } #=> { "name"=>"Gavin", "wife"=>:Lisa }
    # foo #=> { :name =>"Gavin", :wife=>:Lisa }
    #
    # If no key map or block is given, then all keys are converted
    # to Symbols.
    #
    # Note that if both a +key_map+ and a block are given, the +key_map+ is
    # applied first then the block.
    #
    # CREDIT: Trans, Gavin Kistner

    def rekey(key_map=nil, &block)
      if !(key_map or block)
        block = lambda{|k| k.to_sym}
      end

      key_map ||= {}

      hash = dup.replace({}) # to keep default_proc

      (keys - key_map.keys).each do |key|
        hash[key] = self[key]
      end

      key_map.each do |from, to|
        hash[to] = self[from] if key?(from)
      end

      if block
        hash2 = dup.replace({})
        case block.arity
        when 2 # TODO: is this condition needed?
          hash.each do |k, v|
            nk = block.call(k,v)
            nk = (NA == nk ? k : nk)
            hash2[nk] = v
          end
        else
          hash.each do |k, v|
            nk = block.call(k)
            nk = (NA == nk ? k : nk)
            hash2[nk] = v
          end
        end
      else
        hash2 = hash
      end

      hash2
    end

    # Synonym for Hash#rekey, but modifies the receiver in place (and
returns it).
    #
    # foo = { :name=>'Gavin', :wife=>:Lisa }
    # foo.rekey!{ |k| k.to_s } #=> { "name"=>"Gavin", "wife"=>:Lisa }
    # foo #=> { "name"=>"Gavin", "wife"=>:Lisa }
    #
    # CREDIT: Trans, Gavin Kistner

    def rekey!(key_map=nil, &block)
      replace(rekey(key_map, &block))
    end

  end

Not the use of `facets/na`. That is defined as:

    class << NA = ArgumentError.new
      def inspect ; 'N/A' ; end
      def method_missing(*); self; end
    end

But it is really nothing more than a dummy object used to mean Not
Applicable. So in the case of #rekey, if the block returns NA then the key
goes unchanged. Thinking about it again now, it's probably unnecessary, but
I had wanted a way to say "leave it alone" while also making sure that
`nil` could still be used as a key (even if that's rare). Feel free to
remove the NA business, but if you do please explain why you think its not
needed.

Best solution will get their name put in front of CREDITs for the next
release of Facets.

The note `TODO: Improve Hash#rekey code!!!` has been in my docs for too
long. I could use other's insights and thought others might enjoy the
challenge. So here's the code:

I think the problem is in the API. That's what makes it too complex.
There are too many cases of specific handling and options (see below):

  require 'facets/na'

  class Hash

    # Rekey a hash:
    #
    # rekey()
    # rekey(from_key => to_key, ...)
    # rekey{|from_key| to_key}
    # rekey{|from_key, value| to_key}
    #
    # If a key map is given, then the first key is changed to the second
key.
    #
    # foo = { :a=>1, :b=>2 }
    # foo.rekey(:a=>'a') #=> { 'a'=>1, :b=>2 }
    # foo.rekey(:b=>:x) #=> { :a =>1, :x=>2 }
    # foo.rekey('foo'=>'bar') #=> { :a =>1, :b=>2 }
    #
    # If a block is given, converts all keys in the Hash accroding to the
    # given block procedure. If the block returns +NA+ for a given key,
    # then that key will be left intact.

1. Unnecessary option: if the key is supposed to stay intact the block
should just return the original key.

    #
    # foo = { :name=>'Gavin', :wife=>:Lisa }
    # foo.rekey{ |k| k.to_s } #=> { "name"=>"Gavin", "wife"=>:Lisa }
    # foo #=> { :name =>"Gavin", :wife=>:Lisa }
    #
    # If no key map or block is given, then all keys are converted
    # to Symbols.

2. Why that default? In my mind this is too much implicit logic.
Also this can be easily achieved with

hash.rekey(&:to_sym)

    #
    # Note that if both a +key_map+ and a block are given, the +key_map+ is
    # applied first then the block.

3. I would change that to EITHER block OR map argument, but not both.

Not the use of `facets/na`. That is defined as:

    class << NA = ArgumentError.new
      def inspect ; 'N/A' ; end
      def method_missing(*); self; end
    end

But it is really nothing more than a dummy object used to mean Not
Applicable. So in the case of #rekey, if the block returns NA then the key
goes unchanged. Thinking about it again now, it's probably unnecessary, but
I had wanted a way to say "leave it alone" while also making sure that `nil`
could still be used as a key (even if that's rare). Feel free to remove the
NA business, but if you do please explain why you think its not needed.

If one needs a special key one can use a Symbol for that as
efficiently as nil. nil is the value return if something is absent
and I believe it does not make for a good key in a Hash.

Best solution will get their name put in front of CREDITs for the next
release of Facets.

:slight_smile:

class Hash
  def rekey(mapping = nil, &convert)
    c = convert || mapping
    dup.tap do |h| # preserve type and defaults
      h.clear
      each_pair {|k, v| h[c[k] || k] = v}
    end
  end
end

Kind regards

robert

···

On Sun, Dec 9, 2012 at 7:19 PM, Intransition <transfire@gmail.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

> The note `TODO: Improve Hash#rekey code!!!` has been in my docs for too
> long. I could use other's insights and thought others might enjoy the
> challenge. So here's the code:

I think the problem is in the API. That's what makes it too complex.
There are too many cases of specific handling and options (see below):

1. Unnecessary option: if the key is supposed to stay intact the block
should just return the original key.

I agree. Just wanted someone else to confirm. Out it goes!

> #
> # foo = { :name=>'Gavin', :wife=>:Lisa }
> # foo.rekey{ |k| k.to_s } #=> { "name"=>"Gavin", "wife"=>:Lisa }
> # foo #=> { :name =>"Gavin", :wife=>:Lisa }
> #
> # If no key map or block is given, then all keys are converted
> # to Symbols.

2. Why that default? In my mind this is too much implicit logic.
Also this can be easily achieved with

hash.rekey(&:to_sym)

I can understand that. I used the default b/c the vast majority of the time
I was using it to convert to symbols. Rather then have yet another method
like ActiveSupport's `symbolize_keys` it made more sense to me to just give
#rekey this as the default.

I can't really take that back now. It's been too long part of the API.

> #
> # Note that if both a +key_map+ and a block are given, the +key_map+
is
> # applied first then the block.

3. I would change that to EITHER block OR map argument, but not both.

Yea, I thought about that when I added support for the mapping. That was
actually something that came later that the original block form. At first I
had thought about making an entirely different method, but then thought
that was a waste and that it made more sense as part of #rekey. So when I
first added it I did an "either or", just as you suggest. But then I
thought "why?" it certainly can handle both even if people will almost
never use both.

What do you think? Does it really matters enough to change it now? Like I
said, I doubt anyone has used both, so this is something that could be
change if it really is worth it.

> Not the use of `facets/na`. That is defined as:
>
> class << NA = ArgumentError.new
> def inspect ; 'N/A' ; end
> def method_missing(*); self; end
> end
>
> But it is really nothing more than a dummy object used to mean Not
> Applicable. So in the case of #rekey, if the block returns NA then the
key
> goes unchanged. Thinking about it again now, it's probably unnecessary,
but
> I had wanted a way to say "leave it alone" while also making sure that
`nil`
> could still be used as a key (even if that's rare). Feel free to remove
the
> NA business, but if you do please explain why you think its not needed.

If one needs a special key one can use a Symbol for that as
efficiently as nil. nil is the value return if something is absent
and I believe it does not make for a good key in a Hash.

Wouldn't use symbols b/c then you have a special exception. NA was made
just for such cases. But you probably right that `nil` doesn't make a good
hash key no matter what. Nonetheless, it doesn't really matter b/c as you
said above, they can just return the original key.

> Best solution will get their name put in front of CREDITs for the next
> release of Facets.

:slight_smile:

class Hash
  def rekey(mapping = nil, &convert)
    c = convert || mapping
    dup.tap do |h| # preserve type and defaults
      h.clear
      each_pair {|k, v| h[c[k] || k] = v}
    end
  end
end

Sweet. Much smaller than mine, that's for damn sure!!! Put the default
:to_sym back in and we could have a deal :slight_smile:

I'm need to test and benchmark it first though.

Oh, and nice use of polymorphism using # for both proc and hash retrieval!

···

On Monday, December 10, 2012 7:22:17 AM UTC-5, Robert Klemme wrote:

On Sun, Dec 9, 2012 at 7:19 PM, Intransition <tran...@gmail.com<javascript:>> > wrote:

> The note `TODO: Improve Hash#rekey code!!!` has been in my docs for too
> long. I could use other's insights and thought others might enjoy the
> challenge. So here's the code:

I think the problem is in the API. That's what makes it too complex.
There are too many cases of specific handling and options (see below):

1. Unnecessary option: if the key is supposed to stay intact the block
should just return the original key.

I agree. Just wanted someone else to confirm. Out it goes!

:slight_smile:

> #
> # foo = { :name=>'Gavin', :wife=>:Lisa }
> # foo.rekey{ |k| k.to_s } #=> { "name"=>"Gavin", "wife"=>:Lisa }
> # foo #=> { :name =>"Gavin", :wife=>:Lisa }
> #
> # If no key map or block is given, then all keys are converted
> # to Symbols.

2. Why that default? In my mind this is too much implicit logic.
Also this can be easily achieved with

hash.rekey(&:to_sym)

I can understand that. I used the default b/c the vast majority of the time
I was using it to convert to symbols. Rather then have yet another method
like ActiveSupport's `symbolize_keys` it made more sense to me to just give
#rekey this as the default.

I can't really take that back now. It's been too long part of the API.

Well, then you mustn't change *anything* which affects observable behavior.

> #
> # Note that if both a +key_map+ and a block are given, the +key_map+
> is
> # applied first then the block.

3. I would change that to EITHER block OR map argument, but not both.

Yea, I thought about that when I added support for the mapping. That was
actually something that came later that the original block form. At first I
had thought about making an entirely different method, but then thought that
was a waste and that it made more sense as part of #rekey. So when I first
added it I did an "either or", just as you suggest. But then I thought
"why?" it certainly can handle both even if people will almost never use
both.

What do you think? Does it really matters enough to change it now? Like I
said, I doubt anyone has used both, so this is something that could be
change if it really is worth it.

I would get rid of that behavior. After all, if someone wants do do
Hash lookups as part of conversion they can still do that in the block
without much effort.

I also noticed a slight asymmetry between block and Hash: a block will
return something for every key passed in. But since a Hash is fixed
at the time of method call (modulo default_proc of course) there is
not really a way to handle absent keys ad hoc.

> Not the use of `facets/na`. That is defined as:
>
> class << NA = ArgumentError.new
> def inspect ; 'N/A' ; end
> def method_missing(*); self; end
> end
>
> But it is really nothing more than a dummy object used to mean Not
> Applicable. So in the case of #rekey, if the block returns NA then the
> key
> goes unchanged. Thinking about it again now, it's probably unnecessary,
> but
> I had wanted a way to say "leave it alone" while also making sure that
> `nil`
> could still be used as a key (even if that's rare). Feel free to remove
> the
> NA business, but if you do please explain why you think its not needed.

If one needs a special key one can use a Symbol for that as
efficiently as nil. nil is the value return if something is absent
and I believe it does not make for a good key in a Hash.

Wouldn't use symbols b/c then you have a special exception. NA was made just
for such cases. But you probably right that `nil` doesn't make a good hash
key no matter what. Nonetheless, it doesn't really matter b/c as you said
above, they can just return the original key.

> Best solution will get their name put in front of CREDITs for the next
> release of Facets.

:slight_smile:

class Hash
  def rekey(mapping = nil, &convert)
    c = convert || mapping
    dup.tap do |h| # preserve type and defaults
      h.clear
      each_pair {|k, v| h[c[k] || k] = v}
    end
  end
end

Sweet. Much smaller than mine, that's for damn sure!!!

Yeah, but that's easy when one removes the complexity in behavior.

Put the default
:to_sym back in and we could have a deal :slight_smile:

Well, that's easy, isn't it?

def rekey(mapping = nil, &convert)
  c = convert || mapping || :to_sym.to_proc

I'm need to test and benchmark it first though.

Have fun!

Oh, and nice use of polymorphism using # for both proc and hash retrieval!

Thank you! :slight_smile:

Kind regards

robert

···

On Mon, Dec 10, 2012 at 8:44 PM, Intransition <transfire@gmail.com> wrote:

On Monday, December 10, 2012 7:22:17 AM UTC-5, Robert Klemme wrote:

On Sun, Dec 9, 2012 at 7:19 PM, Intransition <tran...@gmail.com> wrote:

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Well, everything went well except for one thing, which was a bit irritating
actually -- I couldn't find a way to concisely handle a block with either
one argument (just the key) or two arguments (the key and the value). Seems
like that should be doable in one block call using a splat. But I had to
have a condition on arity and two separate each_pair calls, one for each
case. Maybe it's just not possible.

In any case, thanks for your help! The code is looking much better.

···

On Tuesday, December 11, 2012 6:55:32 AM UTC-5, Robert Klemme wrote:

> Oh, and nice use of polymorphism using # for both proc and hash
retrieval!

Thank you! :slight_smile: