Duck Typing Hash-Like Objects

I often find that when writing initialize (or alternate constructors)
I want to examine the class of the arguments to decide how to
proceed. An example is Array.new, which behaves differently if it
is given an integer argument or an array argument:

    Array.new 2 # [nil,nil]
    Array.new [1,2] # [1,2]

These sorts of tests can be done via Class#=== or Kernel#is_a? or
Kernel#kind_of? but that can lead to artificial constraints. Using
Kernel#respond_to? seems to avoid many of those constraints.
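
As a minimal sketch of the constraint in question (the Lookup class is hypothetical, not something from this thread), a class-based test rejects an object that a respond_to? test would accept:

```ruby
# Hypothetical hash-like class that does not inherit from Hash.
class Lookup
  def initialize
    @data = { "a" => 1 }
  end

  def [](key)
    @data[key]
  end

  def has_key?(key)
    @data.has_key?(key)
  end
end

obj = Lookup.new

class_check = obj.is_a?(Hash)            # class-based test
duck_check  = obj.respond_to?(:has_key?) # behavior-based test
```

Here class_check is false while duck_check is true, even though the object supports the lookups a caller would actually perform.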

My question is: What is the least constraining test to determine
if you've got a hash-like object? Is arg.respond_to?(:has_key?)
reasonable? At first I thought a test for :[] would be great but
that catches strings also. I'm thinking that if someone hands my
method a Hash or a HashWithIndifferentAccess or an OrderedHash or
a tree of some sort, I'd like to be able to accept all of them.

All I really want to know is "Does this object provide key/value
pair lookups via the #[] method?", but I don't want to get strings
and integers along for the ride (for example).
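
A quick way to see the problem, as a sketch using only core classes: a Hash, a String, an Integer and an Array all respond to :[], but of those only the Hash responds to :has_key?.

```ruby
candidates = [{ "k" => "v" }, "string", 42, [1, 2]]

# Every one of these core objects defines [].
responds_to_index = candidates.map { |c| c.respond_to?(:[]) }

# Only the Hash defines has_key?.
responds_to_has_key = candidates.map { |c| c.respond_to?(:has_key?) }
```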

Gary Wright

I would check for to_hash(), then call that method on the argument to get its Hash representation.

James Edward Gray II

···

On Mar 1, 2007, at 4:25 PM, Gary Wright wrote:

I often find that when writing initialize (or alternate constructors)
I want to examine the class of the arguments to decide how to
proceed. An example is Array.new, which behaves differently if it
is given an integer argument or an array argument:

   Array.new 2 # [nil,nil]
   Array.new [1,2] # [1,2]

These sorts of tests can be done via Class#=== or Kernel#is_a? or
Kernel#kind_of? but that can lead to artificial constraints. Using
Kernel#respond_to? seems to avoid many of those constraints.

My question is: What is the least constraining test to determine
if you've got a hash-like object? Is arg.respond_to?(:has_key?)
reasonable?

I'd do it like this:

  def foo(duck)
    # if the duck claims to have keys and indexing, we'll just use it as is
    unless duck.respond_to?(:keys) and duck.respond_to?(:[])
      # otherwise, we'll ask it to turn itself into a hash for us
      if duck.respond_to?(:to_hash)
        duck = duck.to_hash
      else
        # not close enough to a hash...
        raise ArgumentError,
              "want something with keys and indexing, or that supports to_hash"
      end
    end
    ...
  end

This requires the keys method though, which thinking back, I usually
don't provide in my hash-like classes. So I don't know...

Jacob Fugal

···

On 3/1/07, Gary Wright <gwtmp01@mac.com> wrote:

My question is: What is the least constraining test to determine
if you've got a hash-like object? Is arg.respond_to?(:has_key?)
reasonable? At first I thought a test for :[] would be great but
that catches strings also. I'm thinking that if someone hands my
method a Hash or a HashWithIndifferentAccess or an OrderedHash or
a tree of some sort, I'd like to be able to accept all of them.

That might work but what if the object is an interface to some sort of database? You don't
really want to convert the external data structure into a Hash just to access a single item.

Gary Wright

···

On Mar 1, 2007, at 5:30 PM, James Edward Gray II wrote:

I would check for to_hash(), then call that method on the argument to get its Hash representation.

Hi --

My question is: What is the least constraining test to determine
if you've got a hash-like object? Is arg.respond_to?(:has_key?)
reasonable? At first I thought a test for :[] would be great but
that catches strings also. I'm thinking that if someone hands my
method a Hash or a HashWithIndifferentAccess or an OrderedHash or
a tree of some sort, I'd like to be able to accept all of them.

I'd do it like this:

def foo(duck)
  # if the duck claims to have keys and indexing, we'll just use it as is
  unless duck.respond_to?(:keys) and duck.respond_to?(:[])
    # otherwise, we'll ask it to turn itself into a hash for us
    if duck.respond_to?(:to_hash)
      duck = duck.to_hash
    else
      # not close enough to a hash...
      raise ArgumentError,
            "want something with keys and indexing, or that supports to_hash"
    end
  end
  ...
end

Or you could just do:

   duck[whatever]....

and rescue the exception(s), possibly cascading down into a to_hash
operation. You might as well fail without bothering with the
respond_to? calls -- just ask the object to do what it's supposed to,
and handle the error cases.
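
One way this could look, as a rough sketch: the helper name lookup and the choice of which exceptions to rescue are assumptions for illustration, not something specified in the thread.

```ruby
# Hypothetical helper: just index the object, and only if that fails
# fall back to asking it for a Hash via to_hash.
def lookup(duck, key)
  duck[key]
rescue NoMethodError, TypeError
  # No usable [] for this key; try the to_hash conversion instead.
  raise ArgumentError, "need [] or to_hash" unless duck.respond_to?(:to_hash)
  duck.to_hash[key]
end
```

Rescuing NoMethodError this broadly can also hide a bug inside the object's own [] method, so a real implementation might want a narrower rescue.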

David

···

On Fri, 2 Mar 2007, Jacob Fugal wrote:

On 3/1/07, Gary Wright <gwtmp01@mac.com> wrote:

--
Q. What is THE Ruby book for Rails developers?
A. RUBY FOR RAILS by David A. Black (http://www.manning.com/black)
    (See what readers are saying! http://www.rubypal.com/r4rrevs.pdf)
Q. Where can I get Ruby/Rails on-site training, consulting, coaching?
A. Ruby Power and Light, LLC (http://www.rubypal.com)

OK, what about using Hash#fetch and trapping the IndexError for an invalid key?
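
A small sketch of that idea, with a hypothetical helper name (fetch_or_nil). In Ruby 1.9 and later Hash#fetch raises KeyError, which is a subclass of IndexError, so rescuing IndexError still covers both Hash and Array:

```ruby
# Hypothetical helper: use fetch and treat a missing key/index as nil.
def fetch_or_nil(store, key)
  store.fetch(key)
rescue IndexError # also catches KeyError, a subclass in Ruby 1.9+
  nil
end

found   = fetch_or_nil({ a: 1 }, :a)
missing = fetch_or_nil({ a: 1 }, :b)
out_of_range = fetch_or_nil([10], 5)
```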

James Edward Gray II

···

On Mar 1, 2007, at 5:22 PM, Gary Wright wrote:

On Mar 1, 2007, at 5:30 PM, James Edward Gray II wrote:

I would check for to_hash(), then call that method on the argument to get its Hash representation.

That might work but what if the object is an interface to some sort of database? You don't
really want to convert the external data structure into a Hash just to access a single item.

Yes, I think #fetch might be a better choice, but not exactly in the way you suggest.
I'm thinking specifically about the construction of objects such as:

class A
  def initialize(arg, &b)
    case
    when arg.respond_to?(:nonzero?)
      # do construction based on integer-like behavior
    when arg.respond_to?(:fetch)
      # do construction based on hash-like behavior
    when arg.respond_to?(:to_str)
      # do construction based on string-like behavior
    else
      # punt
    end
  end
end

I was going to use :[] for hash-like behavior but that doesn't sift out Integer and Strings so
I started using :has_key?, but that seemed wrong so I posted my question.

Your suggestion to use fetch seems promising, but ActiveRecord, for example, doesn't define
ActiveRecord::Base.fetch. The correct choice would be find for ActiveRecord. Hash#fetch,
and Array#fetch exist, so that does permit some nice duck-typing between those two collections.
RBtree also defines #fetch, which is convenient.

It looks like #fetch might be the best approach.

Gary Wright

···

On Mar 1, 2007, at 7:03 PM, James Edward Gray II wrote:

On Mar 1, 2007, at 5:22 PM, Gary Wright wrote:

On Mar 1, 2007, at 5:30 PM, James Edward Gray II wrote:

I would check for to_hash(), then call that method on the argument to get its Hash representation.

That might work but what if the object is an interface to some sort of database? You don't
really want to convert the external data structure into a Hash just to access a single item.

OK, what about using Hash#fetch and trapping the IndexError for an invalid key?

class A
  def initialize(arg, &b)
  case
  when arg.respond_to?(:nonzero?)
    # do construction based on integer-like behavior

Floats have nonzero?() too. I really think picking arbitrary methods like this to find a type is a big mistake.

You're still type checking, you're just doing it in a more fragile way. If you want to type check, use the class, I say.

If you want it to be an Integer, ask it if it can:

   Integer(...) rescue # nope...

  when arg.respond_to?(:fetch)
    # do construction based on hash-like behavior

Arrays have fetch too.

  when arg.respond_to?(:to_str)
    # do construction based on string-like behavior

String(...) rescue # nope...

  else
    # punt
  end
end
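
The "ask it if it can" idea might be sketched like this. The helper name integer_like is hypothetical; Kernel#Integer raises ArgumentError for strings it cannot parse and TypeError for objects it cannot convert.

```ruby
# Hypothetical helper: let Kernel#Integer decide, and report failure as nil.
def integer_like(arg)
  Integer(arg)
rescue ArgumentError, TypeError
  nil
end

from_string = integer_like("42")
from_junk   = integer_like("abc")
from_nil    = integer_like(nil)
```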

James Edward Gray II

···

On Mar 1, 2007, at 6:31 PM, Gary Wright wrote:

Yet if I test for (Hash === mystery_obj) that would not
allow someone to pass an RBTree object instead, which I think
is a very reasonable thing to allow and works just fine if
I only use #fetch.

A minimum interface to an indexable collection might be:

   has_key?(key)
   fetch(key)
   store(key, val)

In a quick look it seems like only Hash and RBTree implement
those methods though.
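
A sketch of testing for that three-method interface (the constant and method names here are made up for illustration):

```ruby
# Hypothetical check for the proposed minimum "indexable" interface.
INDEXABLE_METHODS = [:has_key?, :fetch, :store]

def indexable?(obj)
  INDEXABLE_METHODS.all? { |m| obj.respond_to?(m) }
end

hash_ok   = indexable?({})
array_ok  = indexable?([])   # Array has fetch, but not has_key? or store
string_ok = indexable?("s")
```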

Gary Wright

···

On Mar 1, 2007, at 7:38 PM, James Edward Gray II wrote:

You're still type checking, you're just doing it in a more fragile way. If you want to type check, use the class, I say.

Is there a good reason why you can't just use different constructors for
different types of objects, then just trust that they duck-type OK?

--Ken

···

On Fri, 02 Mar 2007 10:01:06 +0900, Gary Wright wrote:

On Mar 1, 2007, at 7:38 PM, James Edward Gray II wrote:

You're still type checking, you're just doing it in a more fragile way.
If you want to type check, use the class, I say.

Yet if I test for (Hash === mystery_obj) that would not allow someone to
pass an RBTree object instead, which I think is a very reasonable thing
to allow and works just fine if I only use #fetch.

A minimum interface to an indexable collection might be:

   has_key?(key)
   fetch(key)
   store(key, val)

In a quick look it seems like only Hash and RBTree implement those
methods though.

--
Ken Bloom. PhD candidate. Linguistic Cognition Laboratory.
Department of Computer Science. Illinois Institute of Technology.
http://www.iit.edu/~kbloom1/

Hi --

···

On Fri, 2 Mar 2007, Gary Wright wrote:

On Mar 1, 2007, at 7:38 PM, James Edward Gray II wrote:

You're still type checking, you're just doing it in a more fragile way. If you want to type check, use the class, I say.

Yet if I test for (Hash === mystery_obj) that would not
allow someone to pass an RBTree object instead, which I think
is a very reasonable thing to allow and works just fine if
I only use #fetch.

I had the impression James was talking about the Integer and String
methods, though then again those aren't actually the classes. So I'm
not sure what he meant :-) But I don't think it was just to test
class membership, since that manifestly doesn't help in the kind of
situation you're describing.

David


Sounds like you want C++200x concept checking, but that depends very
heavily on static typing.

Basically, I think you want to know (in a non-mutating way) whether #[]
supports various types of non-integer parameters. I doubt there's any way to
do that in Ruby. You could try indexing it and see if it throws a
TypeError (like an Array will), but when you call #[] on
Hash.new{|h,v| h[v]=0}, #[] is mutating.
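
To illustrate the mutation concern with a short sketch: probing a Hash that has a storing default block via [] adds the probed key, whereas has_key? does not.

```ruby
counts = Hash.new { |h, k| h[k] = 0 }

size_before = counts.size
counts[:missing]              # the lookup itself stores :missing => 0
size_after_index = counts.size

counts.has_key?(:other)       # a non-mutating probe
size_after_probe = counts.size

# An Array rejects a non-integer index with TypeError.
array_error =
  begin
    [1, 2]["x"]
    nil
  rescue TypeError => e
    e.class
  end
```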

--Ken

···

On Fri, 02 Mar 2007 10:01:06 +0900, Gary Wright wrote:

On Mar 1, 2007, at 7:38 PM, James Edward Gray II wrote:

You're still type checking, you're just doing it in a more fragile way.
If you want to type check, use the class, I say.

Yet if I test for (Hash === mystery_obj) that would not allow someone to
pass an RBTree object instead, which I think is a very reasonable thing
to allow and works just fine if I only use #fetch.

A minimum interface to an indexable collection might be:

   has_key?(key)
   fetch(key)
   store(key, val)

In a quick look it seems like only Hash and RBTree implement those
methods though.


Hi --

You're still type checking, you're just doing it in a more fragile way. If you want to type check, use the class, I say.

Yet if I test for (Hash === mystery_obj) that would not
allow someone to pass an RBTree object instead, which I think
is a very reasonable thing to allow and works just fine if
I only use #fetch.

I had the impression James was talking about the Integer and String
methods, though then again those aren't actually the classes. So I'm
not sure what he meant :-)

I was probably just babbling, not making sense. I do that.

But I don't think it was just to test class membership, since that manifestly doesn't help in the kind of situation you're describing.

Yeah, you're right. I was feeling that this is just an attempt to sidestep type checking by inventing a clever new type checking system. It's really just trying to provide a flexible interface though.

Given that, I'm changing my answer.

This is a documentation problem. As long as the documentation tells me your method needs a put_stuff_in() and a pull_stuff_out() to work, tells me what they will be passed, and *doesn't* type check, you support ALL data structures. I can always wrap Hash, RBTree, Integer, JamesCustomDataVoid, or whatever in a trivial class implementing those calls.
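
A sketch of what such a trivial wrapper could look like. The method names put_stuff_in and pull_stuff_out come from the paragraph above; their (key, value) signatures, the HashAdapter class and the remember method are assumptions for illustration.

```ruby
# Hypothetical adapter exposing the documented two-method protocol
# on top of a plain Hash.
class HashAdapter
  def initialize(hash = {})
    @hash = hash
  end

  def put_stuff_in(key, value)
    @hash[key] = value
  end

  def pull_stuff_out(key)
    @hash[key]
  end
end

# A method that relies only on the documented protocol and never
# inspects the class of its argument.
def remember(store, key, value)
  store.put_stuff_in(key, value)
  store.pull_stuff_out(key)
end

result = remember(HashAdapter.new, :a, 1)
```

Any other structure (an RBTree, a database handle) could be supported the same way by writing its own small adapter.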

Am I making sense yet, or do I just need to go to sleep now?

James Edward Gray II

···

On Mar 1, 2007, at 8:26 PM, dblack@wobblini.net wrote:

On Fri, 2 Mar 2007, Gary Wright wrote:

On Mar 1, 2007, at 7:38 PM, James Edward Gray II wrote:

Hi --


You're still type checking, you're just doing it in a more fragile way. If you want to type check, use the class, I say.

Yet if I test for (Hash === mystery_obj) that would not
allow someone to pass an RBTree object instead, which I think
is a very reasonable thing to allow and works just fine if
I only use #fetch.

I had the impression James was talking about the Integer and String
methods, though then again those aren't actually the classes. So I'm
not sure what he meant :-) But I don't think it was just to test
class membership, since that manifestly doesn't help in the kind of
situation you're describing.

Well, I should say: it's a way to deal with some of the practicalities
of a situation where you really only want objects of certain classes,
at the expense of duck typing. But (a) it sounds like you want
something more elastic, and (b) testing class membership doesn't tell
you anything definitive, so it doesn't solve the problem if you're
thinking that rogue objects might be coming in to the method (since if
someone can roguely send it, say, a Proc, which responds to [], they
can presumably send it a hash that responds to [] irresponsibly).

I guess I tend to think in terms of error handling: that is, let
objects call [], but catch the ones that fail, or the ones that hand
back nonsense (in the context) values.

It's funny sometimes how discussions of duck typing come at the same
thing from two directions: protecting systems from supposed gremlins
that are engineering its demise by extending objects with destructive
but well-camouflaged behaviors, and exploring the coolness of the
openness of Ruby objects. Or something.

David

···

On Fri, 2 Mar 2007, dblack@wobblini.net wrote:

On Fri, 2 Mar 2007, Gary Wright wrote:

On Mar 1, 2007, at 7:38 PM, James Edward Gray II wrote:


Hi --

Yeah, you're right. I was feeling that this is just an attempt to sidestep type checking by inventing a clever new type checking system.

Or an attempt to sidestep class-checking by inventing a type-checking
system :-) A few years ago there were some interesting attempts to
come up with a systematic way to determine an object's type, in the
sense of its full profile and interface, at any given point in its
life. The idea was to be able to get some kind of rich response from
the object, well beyond what respond_to? and is_a? provide, in order
to determine whether you'd gotten hold of the type of object you
needed. I seem to recall it turned out to be very challenging,
perhaps impossible, to come up with a complete system for this. I'm
not sure if anyone is still working on it. But it's an interesting
area.

It's really just trying to provide a flexible interface though.

Given that, I'm changing my answer.

This is a documentation problem. As long as the documentation tells me your method needs a put_stuff_in() and a pull_stuff_out() to work, tells me what they will be passed, and *doesn't* type check, you support ALL data structures. I can always wrap Hash, RBTree, Integer, JamesCustomDataVoid, or whatever in a trivial class implementing those calls.

Am I making sense yet, or do I just need to go to sleep now?

Definitely the former, and perhaps the latter too -- 'tis up to you
:-) I'm also very tired, and feeling semi-coherent at best, but
enjoying the thread.

David

···

On Fri, 2 Mar 2007, James Edward Gray II wrote:


Let me make the situation a little more concrete.

I'd like to define a class that accepts the following syntax for
construction:

  A.new
  A.new(1)
  A.new(1,2)
  A.new(3 => 4)
  A.new(1, 3 => 4)
  A.new(1, 2, 3 => 4)

So the arguments to A.new are zero or more objects followed by an
optional hash. I can certainly look for that trailing hash via
(Hash === args.last) but what if I don't want to lock it down to
a Hash?

  tree = RBTree.new
  A.new(1, 2, tree)

I'd like that to work also and I'm sure there are other sorts of
objects that would work just fine (i.e. respond to #fetch/#[], has_key?,
and perhaps is Enumerable). If I use a class based test to discover
if the last argument is an instance of Hash, I'm eliminating those
other possibilities. I also don't want to use args.last[key] and
catch an exception because that is only useful *after* I've
discovered if an optional final hash-like object has been passed.

I could have different constructors:

  A.new(1)
  A.new_with_hash(1, 1=>2)

but it really isn't as nice, IMHO.

At first I thought I could use respond_to?(:[]) on the last argument,
but as I said in the original post integers and strings will create
a false-positive for a hash-like trailing argument using that test.

Perhaps I'm trying to push the duck-typing too far and should just stick
with testing for Hash but it seems like testing for #fetch gives at least
a little more flexibility.

It also seems like it might be nice to encourage a practice of defining
#fetch, #store, and #has_key? for data structures that are 'indexable'.
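
A minimal sketch of such a constructor, assuming the trailing argument counts as hash-like when it responds to both #fetch and #has_key? (the attribute names items and options are hypothetical):

```ruby
class A
  attr_reader :items, :options

  def initialize(*args)
    last = args.last
    # Treat the final argument as the options only if it looks indexable.
    # Testing fetch alone would also match Array, so has_key? is required too.
    if last.respond_to?(:fetch) && last.respond_to?(:has_key?)
      @options = args.pop
    else
      @options = {}
    end
    @items = args
  end
end

plain     = A.new(1, 2)
with_hash = A.new(1, 3 => 4)
with_list = A.new(1, 2, [3])
```

An RBTree or any other object defining both methods would be picked up as the options in the same way, without the constructor naming its class.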

Gary Wright

···

On Mar 1, 2007, at 9:49 PM, dblack@wobblini.net wrote:

I guess I tend to think in terms of error handling: that is, let
objects call [], but catch the ones that fail, or the ones that hand
back nonsense (in the context) values.

Gary Wright wrote:

I guess I tend to think in terms of error handling: that is, let
objects call [], but catch the ones that fail, or the ones that hand
back nonsense (in the context) values.

Let me make the situation a little more concrete.

I'd like to define a class that accepts the following syntax for
construction:

    A.new
    A.new(1)
    A.new(1,2)
    A.new(3 => 4)
    A.new(1, 3 => 4)
    A.new(1, 2, 3 => 4)

So the arguments to A.new are zero or more objects followed by an
optional hash. I can certainly look for that trailing hash via
(Hash === args.last) but what if I don't want to lock it down to
a Hash?

There seems to be still some ambiguity in this description. In this case:

   h = {3 => 4}
   A.new(1, 2, h)

how do you know if _h_ is intended as the third object (in the "zero or more objects" part) or as the optional hash?

Sometimes I have wished that the hash generated by this syntax:

meth(k=>v)

were flagged in some way, so that you could distinguish it from

meth({k=>v})

But I'm not sure that would help in this case anyway.

···

On Mar 1, 2007, at 9:49 PM, dblack@wobblini.net wrote:

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Is it *really* a problem that strings and integers produce values that
your method would make use of? Say someone wants to encode those input
parameters into a string - as long as [] works, they can. Why is this
a problem?

···

On 3/2/07, Gary Wright <gwtmp01@mac.com> wrote:

On Mar 1, 2007, at 9:49 PM, dblack@wobblini.net wrote:
> I guess I tend to think in terms of error handling: that is, let
> objects call [], but catch the ones that fail, or the ones that hand
> back nonsense (in the context) values.

Let me make the situation a little more concrete.

I'd like to define a class that accepts the following syntax for
construction:

        A.new
        A.new(1)
        A.new(1,2)
        A.new(3 => 4)
        A.new(1, 3 => 4)
        A.new(1, 2, 3 => 4)

So the arguments to A.new are zero or more objects followed by an
optional hash. I can certainly look for that trailing hash via
(Hash === args.last) but what if I don't want to lock it down to
a Hash?

        tree = RBTree.new
        A.new(1, 2, tree)

I'd like that to work also and I'm sure there are other sorts of
objects that would work just fine (i.e. respond to #fetch/#[], has_key?,
and perhaps is Enumerable). If I use a class based test to discover
if the last argument is an instance of Hash, I'm eliminating those
other possibilities. I also don't want to use args.last[key] and
catch an exception because that is only useful *after* I've
discovered if an optional final hash-like object has been passed.

I could have different constructors:

        A.new(1)
        A.new_with_hash(1, 1=>2)

but it really isn't as nice, IMHO.

At first I thought I could use respond_to?(:[]) on the last argument,
but as I said in the original post integers and strings will create

All of this is personal perspective of course, but my view of duck
typing is that it's really a question of the type of a variable rather
than on objects. In other words in my view variables have types which
are generated by their usage. Types in this view are like job
requirements.

So rather than talking about say an array type, to me duck typing is
talking about the type of a variable which is used in a certain way,
which might be somewhat idiosyncratic to the user. Some common such
types do exist, like a queue, a stack, or a generalized collection (with
various requirements as to access, ordering, etc.). Given one of these,
or a more idiosyncratic type, several objects might work as the value
of the variable in question.

As an example, let's say I'm looking for something to use to drive a
nail. The obvious 'type' of thing for this job is a hammer, but a
heavy wrench, or a rock can also serve since the usage really just
requires a mass which can be conveniently accelerated so as to impart
inertia to that nail. If I don't have a hammer to hand, I can press
one of these other objects into service.

In this view of duck typing choosing an object is akin to hiring
someone, you make an initial assessment of whether the potential
employee has the requirements, and if s/he passes that sniff test, you
hire him/her and test that assessment over time.

These kinds of types can also require much more than a simple list of
provided interfaces, they also typically rely to one degree or another
on the semantics of those interfaces, often including how the object's
observed behavior is affected by the SEQUENCE of calls. These types
of types are much harder to statically check. And they cause bugs in
either a statically or dynamically typed system. In my experience,
the kind of stupid bugs which are flushed out by a static type system
are a small percentage of the bugs which are caused by these semantic
mismatches.

I know that this viewpoint causes conniption fits in folks who believe
in the religion of static type checking, and I've long ago given up
trying to proselytize those with strong convictions. All I can say is
that I've found such a view of typing combined with a language such as
Ruby which supports it has provided a powerful approach to building
software.

Both static and dynamic typing have benefits and problems. Speaking
solely for myself, I just prefer both the benefits and the problems of
dynamically typed systems over those of statically typed ones.

···

On 3/1/07, dblack@wobblini.net <dblack@wobblini.net> wrote:

A few years ago there were some interesting attempts to
come up with a systematic way to determine an object's type, in the
sense of its full profile and interface, at any given point in its
life. The idea was to be able to get some kind of rich response from
the object, well beyond what respond_to? and is_a? provide, in order
to determine whether you'd gotten hold of the type of object you
needed. I seem to recall it turned out to be very challenging,
perhaps impossible, to come up with a complete system for this. I'm
not sure if anyone is still working on it. But it's an interesting
area.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

You don't. There just has to be clear documentation of
the disambiguation rule. The caller could use:

    A.new(1,2, h, {})

If they wanted to force h to be part of the list of objects
instead of the optional trailing hash.
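
A sketch of how that rule plays out, using a hypothetical helper (split_args) that always takes a trailing hash-like argument as the options:

```ruby
# Hypothetical helper illustrating the documented rule: a trailing
# hash-like argument is always consumed as the options.
def split_args(*args)
  last = args.last
  hash_like = last.respond_to?(:fetch) && last.respond_to?(:has_key?)
  options = hash_like ? args.pop : {}
  [args, options]
end

h = { 3 => 4 }

# h is the last argument, so it is taken as the options.
items_a, options_a = split_args(1, 2, h)

# The explicit empty hash absorbs the options slot, leaving h in the list.
items_b, options_b = split_args(1, 2, h, {})
```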

Gary Wright

···

On Mar 2, 2007, at 12:39 AM, Joel VanderWerf wrote:

There seems to be still some ambiguity in this description. In this case:

  h = {3 => 4}
  A.new(1, 2, h)

how do you know if _h_ is intended as the third object (in the "zero or more objects" part) or as the optional hash?