[SUMMARY] Hash to OpenStruct (#81)

The solutions for this quiz are bite-sized and filled to bursting with clever
tricks, so let's check out several. First, here's the easiest way I saw to not
solve the problem, by Ilmari Heikkinen:

  class Hash
   def method_missing(mn,*a)
     mn = mn.to_s
     if mn =~ /=$/
       super if a.size > 1
       self[mn[0...-1]] = a[0]
     else
       super unless has_key?(mn) and a.empty?
       self[mn]
     end
   end
  end

Ilmari doesn't bother fiddling with converters here, deciding instead to just
implement an OpenStructish interface directly in Hash. Any method call Hash
doesn't recognize will be sent to this method_missing() callback. Inside the
method, Ilmari checks to see if the call ended with an equals sign and had
exactly one argument. If so, the assignment is made. Otherwise, if a matching
key can be found, it is fetched and returned.
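To watch that dispatch work without touching the global Hash, here is a minimal sketch that puts the same logic in a subclass. DotHash is a hypothetical name, and the sketch keeps the original Symbol in name so that super can raise the usual NoMethodError; treat it as an illustration rather than Ilmari's submission.

```ruby
# DotHash is a hypothetical name used only for this illustration.
class DotHash < Hash
  def method_missing(name, *args)
    key = name.to_s
    if key.end_with?("=")
      # Writer: exactly one argument, stored under the String key.
      return super unless args.size == 1
      self[key.chomp("=")] = args.first
    else
      # Reader: only answers for keys that exist, and takes no arguments.
      return super unless key?(key) && args.empty?
      self[key]
    end
  end

  # Keep respond_to? in step with what method_missing accepts.
  def respond_to_missing?(name, include_private = false)
    key = name.to_s
    key.end_with?("=") || key?(key) || super
  end
end

config = DotHash.new
config.colour = "red"   # stored as config["colour"]
config.colour           # => "red"
# config.shape          # would raise NoMethodError, since there is no "shape" key
```

Note that every writer call succeeds, so a misspelled setter quietly creates a new key instead of failing.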

Of course, this kind of wholesale modification of Hash is quite dangerous. If
some code is counting on the normal failure chain of an unknown method called on
Hash, this code could easily break it.

To get away from that, we're going to need to build the converters, as Jacob
Fugal does here:

  require "yaml"
  require "ostruct"
  
  class Object
   def to_openstruct
     self
   end
  end
  
  class Array
   def to_openstruct
     map{ |el| el.to_openstruct }
   end
  end
  
  class Hash
   def to_openstruct
     mapped = {}
     each{ |key,value| mapped[key] = value.to_openstruct }
     OpenStruct.new(mapped)
   end
  end
  
  module YAML
   def self.load_openstruct(source)
     self.load(source).to_openstruct
   end
  end
  
  p YAML.load_openstruct(File.read("sample.yml"))

This was a popular style of solution. People found ways to shorten it quite a
bit, but it's easy to see the goal with this one. Any normal Object is given a
to_openstruct() method that has no effect. Array's version calls
to_openstruct() on each member and Hash passes the call down to each value.
Finally, a method is added to YAML that does the load(), then starts the chain
reaction at the top of the constructed object tree. This call gets passed down
by Arrays and Hashes as we just saw and converts most Hashes to OpenStructs.
(It doesn't convert Hashes in the instance variables of custom objects that have
been serialized.)
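The same chain reaction can be sketched as a single function, which avoids reopening Object, Array, and Hash. The helper name deep_openstruct and the inline sample data are assumptions made for this illustration; neither comes from a submitted solution.

```ruby
require "yaml"
require "ostruct"

# deep_openstruct is a hypothetical helper name, not taken from any submission.
def deep_openstruct(obj)
  case obj
  when Hash
    # Convert the values first, then wrap the Hash itself.
    OpenStruct.new(obj.transform_values { |value| deep_openstruct(value) })
  when Array
    obj.map { |el| deep_openstruct(el) }
  else
    obj   # Everything else passes through untouched.
  end
end

# Made-up sample data, inlined so the sketch does not need a file.
source = <<~DOC
  title: sample
  owner:
    level: 7
  tags:
    - label: one
    - label: two
DOC

data = deep_openstruct(YAML.load(source))
data.title           # => "sample"
data.owner.level     # => 7
data.tags[1].label   # => "two"
```

Like Jacob's version, this only follows Hashes and Arrays, so Hashes tucked inside other objects are left alone.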

Now MenTaLguY threw a big monkey wrench into solutions like this when he posted
this alternate test data:

  --- &verily
  lemurs:
    unite: *verily
    beneath:
      - patagonian
      - bread
      - products
  thusly: [1, 2, 3, 4]

As you can see, YAML is allowed to have recursive data structures: the &verily
anchor names the top-level mapping and *verily refers back to it from inside.
When I run Jacob's solution on this I get the odd error:

  Illegal instruction
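The failure is easier to picture once the loaded data is seen as a Hash that contains itself. The sketch below builds that shape by hand, rather than parsing the YAML, so it does not depend on a particular YAML library version. The helper cyclic? is a hypothetical name introduced here, not part of any solution.

```ruby
# cyclic? is a hypothetical helper: true when a Hash or Array can reach itself.
def cyclic?(obj, ancestors = [])
  return false unless obj.is_a?(Hash) || obj.is_a?(Array)
  return true if ancestors.any? { |seen| seen.equal?(obj) }

  children = obj.is_a?(Hash) ? obj.values : obj
  children.any? { |child| cyclic?(child, ancestors + [obj]) }
end

# Hand-built mirror of the test data: "unite" points back at the top-level Hash.
root = { "thusly" => [1, 2, 3, 4] }
root["lemurs"] = {
  "unite"   => root,
  "beneath" => %w[patagonian bread products]
}

root["lemurs"]["unite"].equal?(root)           # => true
cyclic?(root)                                  # => true
cyclic?({ "plain" => { "nested" => [1, 2] } }) # => false
```

A converter that recurses into every value follows that back-reference again and again, never reaching a leaf, until the interpreter runs out of stack.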

One of MenTaLguY's own solutions to this was to use some lazy evaluation and
memoization:

  require 'ostruct'
  require 'lazy'
  
  def hashes_to_openstructs( obj, memo={} )
    return obj unless Hash === obj
    memo[obj.object_id] ||= promise {
      OpenStruct.new( Hash[
        *obj.inject( [] ) { |a, (k, v)|
          a.push k, hashes_to_openstructs( v, memo )
        }
      ] )
    }
  end

This is a recursive solution like Jacob's, though trimmed down and less
complete. The trick here is that instead of walking the whole object tree
immediately, the code just promises to do it when needed. A promise() is just a
magic object that springs to life when it is actually used for the first time
(constructing itself by calling the block).
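To make the idea concrete, here is a minimal sketch of a promise-like object in plain Ruby. TinyPromise is a hypothetical stand-in written for this illustration; it is not how lazy.rb is implemented, and it ignores details such as thread safety.

```ruby
# TinyPromise is a hypothetical stand-in; lazy.rb's real implementation may differ.
class TinyPromise < BasicObject
  def initialize(&block)
    @block  = block
    @forced = false
  end

  # Any message sent to the promise forces it, then goes to the real value.
  def method_missing(name, *args, &blk)
    __force__.__send__(name, *args, &blk)
  end

  private

  # Run the block once and cache whatever it returns.
  def __force__
    unless @forced
      @value  = @block.call
      @forced = true
      @block  = nil
    end
    @value
  end
end

runs = 0
answer = TinyPromise.new { runs += 1; 6 * 7 }
runs          # => 0, nothing has been computed yet
answer + 1    # => 43, first use calls the block
answer.to_s   # => "42", served from the cached value
runs          # => 1
```

Inheriting from BasicObject keeps the promise nearly method-free, so almost every call falls through to method_missing and reaches the real value.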

The other trick of this method is the memoization. Using the memo Hash and the
||= operator, the conversion process caches each object by object_id(). Any
future calls for the same object just get the already constructed version
straight from the cache.
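The caching half of the trick also works without promises, as long as the new object is registered before its children are converted. The sketch below is that eager variant, a swapped-in technique rather than MenTaLguY's code; to_ostruct_memo is a hypothetical name, and the data is a hand-built, self-referential Hash mirroring the test data.

```ruby
require "ostruct"

# to_ostruct_memo is a hypothetical name; an eager variant of the memo idea.
def to_ostruct_memo(obj, memo = {})
  case obj
  when Hash
    # Already seen this Hash: hand back the OpenStruct built for it.
    return memo[obj.object_id] if memo.key?(obj.object_id)

    # Register the empty OpenStruct first, so a back-reference finds it.
    struct = memo[obj.object_id] = OpenStruct.new
    obj.each { |key, value| struct[key] = to_ostruct_memo(value, memo) }
    struct
  when Array
    obj.map { |el| to_ostruct_memo(el, memo) }
  else
    obj
  end
end

root = { "thusly" => [1, 2, 3, 4] }
root["lemurs"] = { "unite" => root, "beneath" => %w[patagonian bread products] }

tree = to_ostruct_memo(root)
tree.thusly                      # => [1, 2, 3, 4]
tree.lemurs.unite.equal?(tree)   # => true, the cycle survives the conversion
```

Only Hashes are memoized here, so an Array that contains itself would still recurse without end.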

Then TRANS pointed out an interesting fact: YAML already has to understand all
this recursive data structure stuff, so we really want to let it do all the hard
work and just change what it loads. TRANS poked around in the innards of YAML
and did get a working solution, but why the lucky stiff was drawn into the
challenge and suggested this version:

  require 'yaml'
  require 'ostruct'
  
  class << YAML::DefaultResolver
    alias_method :_node_import, :node_import
    def node_import(node)
      o = _node_import(node)
      o.is_a?(Hash) ? OpenStruct.new(o) : o
    end
  end

This just overrides a piece of YAML behavior to check if the object just loaded
was a Hash. When it is, it is replaced with an OpenStruct. Simple and very
effective.

Now this YAML solution will convert Hashes inside the instance variables of
objects. That's probably a bad thing, since those classes likely weren't
designed with that in mind. You always have to weigh the tradeoffs and choose
a solution that will best meet your current needs.

A big thanks to all who couldn't help but fiddle with Hans's fun little
challenge. I don't believe I've ever seen such variation in the solutions
before.

Tomorrow's quiz is to help Benjohn Barnes get into shape...

This is, incidentally, the most concise explanation of promises I've ever seen. Mind if I borrow it for the lazy.rb rdoc?

-mental

On Thu, 8 Jun 2006 22:20:44 +0900, Ruby Quiz <james@grayproductions.net> wrote:

A promise() is just a magic object that springs to life when it is actually
used for the first time (constructing itself by calling the block).

Not at all.

I'm a huge fan of lazy.rb. :wink:

James Edward Gray II

On Jun 8, 2006, at 12:29 PM, MenTaLguY wrote:

This is, incidentally, the most concise explanation of promises I've ever seen. Mind if I borrow it for the lazy.rb rdoc?