Duck Typing as Pattern Matching

I am not a type-system expert, but I started thinking about Ruby-based duck
typing some time ago and came up with something that seems promising to me.
I wondered what other Ruby-ists think of the general idea.

Duck typing should specify the minimal behavior required from an object that
gets bound to a variable. Pattern matching (functional languages, Lisp loop
macros, etc.) binds a set of variables (all at once) by stating the minimum
necessary to extract values for the pattern variables from the target
object, ignoring its other, irrelevant bits. Both are about
matching criteria and binding of variables.

Pattern matching is more concise than the corresponding code to bind each
variable separately (think regexes, or multiple-value assignment), and it
also happens to convey type information (it defines a criterion for
matching objects).

Ruby already uses some special-case patterns: multi-value assignment,
*splat, &block.
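
For instance, all of these already bind variables from the shape of the
target in today's Ruby:

    first, second = [10, 20]     # multi-value assignment: first = 10, second = 20
    head, *rest = [1, 2, 3, 4]   # *splat collects the remainder: rest == [2, 3, 4]

    def each_doubled(list, &block)   # &block binds the passed block to a variable
      list.each { |x| block.call(x * 2) }
    end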

So the basic idea is: duck typing = pattern matching
    Types are patterns.
    An object is a member of a type if the type pattern matches that object.

Preliminary examples are below; please don't get hung up on the (very tentative)
syntax:

    ::[x, y] = obj
        # x = obj[0], y = obj[1]; # illustration only, defer *splat for now
    ::[x, {k1: y}] = obj
        # x = obj[0], y = obj[1][:k1]

    def f ( [x, y] ) : []
        # def f ( xy ); x = xy[0]; y = xy[1]..
        # assert returns.respond_to? :[]
        # Notice how revealing the signature has become

    def f ( m(): x )
        # you may prefer var : type style instead... later ..
        # def f (x)
        # assert x.respond_to?(:m)

    def f ( [ *m() ] : x )
        # def f (x)
        # assert x.all?{|y| y.respond_to?(:m)}, or #each equivalent

    M1 = ::(m1()) # things that have #m1()
    M1M2 = M1 && ::(m2()) # things with m1(), m2()
    def f ( M1M2: x )
        # def f (x)
        # assert: x responds to :m1 and :m2

    def f ( (m1(), m2()): x, (m3(): y, m4(): z) )
        # you may prefer var : type style instead... later ..
        # def f ( x, yz )
        # assert x.respond_to?(:m1) && x.respond_to?(:m2)
        # y = yz.m3()
        # z = yz.m4()
        # Notice how revealing the signature has become.
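
For comparison, the last example written out by hand in today's Ruby -- just
an illustration of the intended expansion, with the checks and bindings made
explicit:

    def f(x, yz)
      raise TypeError unless x.respond_to?(:m1) && x.respond_to?(:m2)
      y = yz.m3   # pattern variable y bound from yz
      z = yz.m4   # pattern variable z bound from yz
      # ... body uses x, y, z ...
    end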

I am interested in any initial reactions.

Thanks.

Hi --

My main first reaction was that I find the reference to duck typing
misleading. I think what you're sketching out here is, in some
respects, the opposite of duck typing; that is, rather than simply
asking an object to do something at the moment you want it to do that
thing, you're introducing a wrapper/notation mechanism which, as I
understand it, will prevent you from getting to that point at all
under certain conditions. Thus your mechanism actually causes you to
*avoid* duck typing. That doesn't make it good or bad in itself -- it
would just be clearer, I think, not to label it duck typing.

I know we're supposed to ignore all that punctuation for now... :slight_smile:
but one way or another, *something* would have to encapsulate all that
information, so it would be likely to look at least somewhat like what
you've got, which for me would be a fairly major drawback. I love the
clean, not very punctuation-heavy look of Ruby so much that I'm
willing to give it lots of weight against possible language features
and innovations. I'd rather have some explicit calls to #respond_to?
than a system for abbreviating those calls into punctuation -- even
though it's more words, even though it takes longer to type. Ruby is
already very concise; even doing a couple of explicit assignments (as
in the expansion of your last example) is a lot shorter than it would
be if, say, memory had to be allocated, or variable type declared.

Maybe there's some way to develop the pattern-matching idea but not
necessarily put it all in the method signature. Could incremental
things, like allowing #respond_to? to take an array or multiple
arguments, work in that direction?

David

···

On Mon, 10 Jan 2005, itsme213 wrote:

--
David A. Black
dblack@wobblini.net

rather than simply
asking an object to do something at the moment you want it to do that
thing, you're introducing a wrapper/notation mechanism

Not always true. Below I've dropped many of the ':' as I think they are not
syntactically essential.

    [x,y] = obj
is the same as
    x = obj[0]
    y = obj[1]
So I ask obj to 'do' its [0], [1] right where I want it. Variables are bound
to results.

    (m1(obj1) x, m2(obj2) y) = obj
is the same as
    x = obj.m1(obj1)
    y = obj.m2(obj2)
Again, I ask obj to do its m1(), m2() right where I want it. Variables are
bound to results.

    x = obj
    Object x = obj
are identical.

Now
    T x = obj
would be
    x = obj
    assert x.is_of_type(T)
    # Object#is_of_type would handle named types, type patterns, classes, etc.
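
A rough sketch of what such an Object#is_of_type might look like (the
array-of-symbols convention for a duck pattern is just for illustration
here):

    class Object
      def is_of_type(t)
        case t
        when Module then is_a?(t)                       # classes and modules
        when Array  then t.all? { |m| respond_to?(m) }  # duck pattern: [:m1, :m2]
        else false
        end
      end
    end

    "hello".is_of_type(String)            # => true
    "hello".is_of_type([:upcase, :size])  # => true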

So:
    (m1(), m2()) x = obj
declares that m1() and m2() are needed, but does not invoke them since there
are no variables to bind to obj.m1(), obj.m2(). This is like regex cases
where you want a match but don't care to bind a variable. Hence,
besides binding x to full obj, it also includes an assertion
    x = obj
    assert( x.is_of_type( ::(m1(), m2()) ) )
    # i.e. assert( x.respond_to?(:m1) && x.respond_to?(:m2) )

And such assertions might not be needed if other type info was known (either
simple type propagation, or more sophisticated type inference, to the extent
possible in Ruby):
    def f (T obj)
        x = obj # so x.is_of_type T
        T y = x
        # no type assertion needed for T y = x

...[can] prevent you from getting to that point at all
under certain conditions.

Yes, this pattern matching can fail, and depending on context it might
sometimes raise errors. For example, in places like a case/when or if
statement a match can fail without raising any error; it's just a boolean
check, with the added bonus of variable bindings from a successful match. In
a regular assignment it can fail to match and will raise the same errors as
the equivalent regular Ruby code.
- [x,y]=obj : can raise same errors as x=obj[0], y=obj[1]
- (m1(), m2()) x = obj can raise errors from 'assert x.respond_to...'
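
To illustrate the two behaviors with plain Ruby (deconstruct_pair is just a
stand-in helper, not proposed syntax):

    # Try to bind [x, y] from obj; nil signals "no match".
    def deconstruct_pair(obj)
      return nil unless obj.respond_to?(:[])
      [obj[0], obj[1]]
    end

    some_value = [3, 4]

    # Boolean-style use (as in case/when): a failed match is just falsy.
    if (pair = deconstruct_pair(some_value))
      x, y = pair                       # x == 3, y == 4
    end

    # Assignment-style use: a failure raises, exactly as x = obj[0] would.
    x, y = some_value[0], some_value[1]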

That doesn't make it good or bad in itself -- it
would just be clearer, I think, not to label it duck typing.

I see. Some usages of it are intentionally closer to explicit type
declarations. Maybe 'duck type declarations'?

Maybe there's some way to develop the pattern-matching idea but not
necessarily put it all in the method signature.

Yes. For example, we could put this info right after the def f(x,y) so the
def line is uncluttered. Something like this would be OK:
def f ( x, y )
    (m1(), m2()) x
    # or T x, if T were a named type defined as m1(), m2()
In this case Ruby should associate the type-declared x with the parameter x,
rather than a new local variable. Would you agree?

But this would not allow deconstructing (taking apart and pattern matching)
anonymous args.
    def f [x,y]
would instead become
    def f obj
        obj[0] x # new local var? appears in signature?
        obj[1] y # new local var? appears in signature?
Do you think there should be some way to treat the "obj[0] x" as part of the
signature? I am certain that
    def f [start, end]
communicates much better signature info, with zero overhead, than
    def f start_end

Could incremental
things, like allowing #respond_to? to take an array or multiple
arguments, work in that direction?

I see where you are going. Do you mean:
def f(x)
    x.respond_to? :m1, :m2, :m3

I did want to somehow distinguish patterns used as type declarations from a
regular (and overridable!) method call like respond_to?. There may be other
alternatives to "punctuation", but how would using 'normal' respond_to? work
for this purpose?

Cheers.

···

"David A. Black" <dblack@wobblini.net> wrote

Hi --

rather than simply
asking an object to do something at the moment you want it to do that
thing, you're introducing a wrapper/notation mechanism

Not always true. Below I've dropped many of the ':' as I think they are not
syntactically essential.

   [x,y] = obj
is the same as
   x = obj[0]
   y = obj[1]
So I ask obj to 'do' its [0], [1] right where I want it. Variables are bound
to results.

Hmmm...

   irb(main):004:0> obj = [1,2]
   => [1, 2]
   irb(main):005:0> [x,y] = obj
   SyntaxError: compile error

Or did you mean: x,y = obj ? I don't think I'd characterize it as obj
"doing its [0], [1]". You're not sending messages to obj -- not even
the message #. What happens in this scenario depends on assignment
semantics, not message semantics. Also I think you're playing on the
similarity of the literal array constructor and the method #[],
which I think has to be discounted as something of a coincidence :slight_smile:
(And, as above, that syntax doesn't parse.)

   (m1(obj1) x, m2(obj2) y) = obj
is the same as
   x = obj.m1(obj1)
   y = obj.m2(obj2)
Again, I ask obj to do its m1(), m2() right where I want it. Variables are
bound to results.

You can devise some very obscure notation in which things happen and
the results are saved in variables :slight_smile: I don't think that, in itself,
makes a case one way or the other. I'm feeling very reactionary here,
because I vastly prefer your 'unrolled' ("same as...") version. I
don't really get that other syntax; it seems confusing to put the
messages being sent to obj on the other side of the assignment from
obj itself. What would you gain by that?

   x = obj
   Object x = obj
are identical.

Now
   T x = obj
would be
   x = obj
   assert x.is_of_type(T)
   # Object#is_of_type would handle named types, type patterns, classes,
etc.

It could be an interesting experiment (and it's been attempted), but I
don't know that such a thick description of an object at a given
moment in its life-cycle would have much practical value. There are
mainly two things at stake: whether to send a message to an object,
and what will happen when you do. To make a reasonable judgement
about the former, the most you need to know is whether the object
responds at all to the message. As for the second (what the object
will do), you literally cannot know without sending the message, so no
amount of thick description will help you there.

I wonder whether any full run-time type-capturing facility in Ruby
might therefore be doomed to being overengineered, even if quite
interesting.

...[can] prevent you from getting to that point at all
under certain conditions.

Yes, this pattern matching can fail, and depending on context might
sometimes raise errors e.g. In places like a case-when or if- statement a
match can fail without raising any error; its just a boolean check, with the
added bonus of variable bindings from a successful match. In a regular
assignment it can fail to match and will raise the same errors as the
equivalent regular ruby code.
- [x,y]=obj : can raise same errors as x=obj[0], y=obj[1]
- (m1(), m2()) x = obj can raise errors from 'assert x.respond_to...'

That doesn't make it good or bad in itself -- it
would just be clearer, I think, not to label it duck typing.

I see. Some usages of it are intentionally closer to explicit type
declarations. Maybe 'duck type declarations'?

I believe that duck typing and type declaration are two different
things -- i.e., that the concept of duck typing is at heart an
alternative to the concept of type declaration. In any case, if you
declare/describe/document what a Ruby object can do at a given moment,
you are declaring/etc. its *type*, not its "duck type". In other
words, that's what type *is* in Ruby. If you call this an object's
"duck type", it suggests that there's some other "type" that an object
can have -- which, frequently if not inexorably, leads back to
the type == class thing.

Maybe there's some way to develop the pattern-matching idea but not
necessarily put it all in the method signature.

Yes. For example we could put this info right after the def f(x,y) so the
def line is uncluttered. Something like this would be ok.
def f ( x, y )
   (m1(), m2()) x
   # or T x, if T was named type defined as m1(), m2()
In this case Ruby should associate the type-declared x with the parameter x,
rather than a new local variable. Would you agree?

In context, yes, but I'm not sold on this syntax either. m1() looks
like a method call. If you're querying whether a method *can* be
called, but not calling it, I'd rather use something that doesn't look
like a method call. #respond_to?(sym) comes to mind.... :slight_smile:

Could incremental
things, like allowing #respond_to? to take an array or multiple
arguments, work in that direction?

I see where you are going. Do you mean:
def f(x)
   x.respond_to? :m1, :m2, :m3

I did want to somehow distinguish patterns used as type declarations from a
regular (and overridable!) method call like respond_to. There may be other
alternatives to "punctuation", but how would using 'normal' respond_to would
for this purpose?

I just wonder whether letting respond_to? take multiple arguments
might actually solve, in practice, 99% of the perceived problems in
this realm. Or even wrapped, like so:

   module Kernel
     def can?(*methods)
       methods.all? {|m| respond_to?(m) }
     end
   end

   class C
     def blah(x)
       raise ArgumentError unless x.can?(:a,:b,:c)
     end
   end

which looks strangely familiar.... I think that I or someone else
wrote something very like that at some other point.
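
For instance, with that can? loaded:

   "duck".can?(:upcase, :length)   # => true
   "duck".can?(:quack)             # => false
   C.new.blah("duck")              # raises ArgumentError (String lacks :a, :b, :c)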

David

···

On Tue, 11 Jan 2005, itsme213 wrote:

"David A. Black" <dblack@wobblini.net> wrote

--
David A. Black
dblack@wobblini.net

> [x,y] = obj
> is the same as
> x = obj[0]
> y = obj[1]
> So I ask obj to 'do' its [0], [1] right where I want it. Variables are
> bound to results.

Hmmm...

   irb(main):004:0> obj = [1,2]
   => [1, 2]
   irb(main):005:0> [x,y] = obj
   SyntaxError: compile error

Or did you mean: x,y = obj ?

Nope. I chose syntax that is currently illegal (dropping the extra ::, etc.)
to propose it as type + pattern for 2.0.

I don't think I'd characterize it as obj
"doing its [0], [1]". You're not sending messages to obj -- not even
the message #[].

The proposal is for
    [x,y] = obj
to do exactly x=obj[0]; y=obj[1].

Also I think you're playing on the
similarity of the literal array constructor and the method #[],
which I think has to be discounted as something of a coincidence :slight_smile:

Actually that is exactly what pattern matching does in functional languages.
It uses term constructors (such as []) on the left-hand side of an assignment,
effectively 'deconstructing' the rhs (i.e. using accessors to get into it). And
it is what regexes do, in a roundabout way.

You can devise some very obscure notation in which things happen and
the results are saved in variables :slight_smile:

I'm not happy with the notation for method matching, but the correspondence
with pattern matching is sound.

It could be an interesting experiment (and it's been attempted), but I
don't know that such a thick description of an object at a given
moment in its life-cycle would have much practical value.

Isn't some type declaration facility a distinct 2.0 possibility?

Modulo the usual duck-typing arguments about the hazards of slipping into class-based
checking, signature-based type information is useful both for programmers
and for reflective applications (which typically do not want to examine
method-bodies).

> I see. Some usages of it are intentionally closer to explicit type
> declarations. Maybe 'duck type declarations'?

I believe that duck typing and type declaration are two different
things -- i.e., that the concept of duck typing is at heart an
alternative to the concept of type declaration.

Perhaps. I think duck-typing is about focusing on respond_to and steering
clear of class-based tests (unless they are truly part of what a method
requires, arguments of style aside). Duck-typing, like any strong typing,
can be static (associated with variables and expressions) or dynamic
(associated only with objects), or in-between (dynamically select between,
or even generate, multiple threads of type-specialized code).

If you call this an object's
"duck type", it suggests that there's some other "type" that an object
can have -- which, frequently if not inexorably, leads back to
the type == class thing.

I would not suggest requiring any tie in to class-based checks.

> def f ( x, y )
> (m1(), m2()) x
> # or T x, if T was named type defined as m1(), m2()
> In this case Ruby should associate the type-declared x with the
> parameter x, rather than a new local variable. Would you agree?

In context, yes, but I'm not sold on this syntax either.

Alternatives welcome, but they should allow for selected use to be
unambiguously about type declaration.

Cheers.

···

"David A. Black" <dblack@wobblini.net> wrote

"David A. Black" <dblack@wobblini.net> schrieb im Newsbeitrag
news:Pine.LNX.4.61.0501101703450.10841@wobblini...

Hi --

>
>
>> rather than simply
>> asking an object to do something at the moment you want it to do that
>> thing, you're introducing a wrapper/notation mechanism
>
> Not always true. Below I've dropped many of the ':' as I think they
> are not syntactically essential.
>
> [x,y] = obj
> is the same as
> x = obj[0]
> y = obj[1]
> So I ask obj to 'do' its [0], [1] right where I want it. Variables are
> bound to results.

Hmmm...

   irb(main):004:0> obj = [1,2]
   => [1, 2]
   irb(main):005:0> [x,y] = obj
   SyntaxError: compile error

Or did you mean: x,y = obj ? I don't think I'd characterize it as obj
"doing its [0], [1]". You're not sending messages to obj -- not even
the message #[]. What happens in this scenario depends on assignment
semantics, not message semantics.

I'd say methods are invoked:

class Foo
  include Enumerable
  def initialize(x) @x=x end
  def each(&b) p "EACH"; @x.times(&b) end
  def to_a() p "TO_A"; super end
end

   >> f = Foo.new 5
   => #<Foo:0x1016f950 @x=5>
   >> a,b,c = *f
   "TO_A"
   "EACH"
   => [0, 1, 2, 3, 4]
   >> a
   => 0
   >> b
   => 1
   >> c
   => 2

Now I'm making things more complicated: generic Enumerables and Arrays are
treated differently in this assignment context:

   >> a,b,c = [0,1,2,3,4]
   => [0, 1, 2, 3, 4]
   >> a
   => 0
   >> b
   => 1
   >> c
   => 2
   >> a,b,c = *[0,1,2,3,4]
   => [0, 1, 2, 3, 4]
   >> a
   => 0
   >> b
   => 1
   >> c
   => 2
   >> a,b,c = f
   => [#<Foo:0x1016f950 @x=5>]
   >> a
   => #<Foo:0x1016f950 @x=5>
   >> b
   => nil
   >> c
   => nil
   >> a,b,c = *f
   "TO_A"
   "EACH"
   => [0, 1, 2, 3, 4]
   >> a
   => 0
   >> b
   => 1
   >> c
   => 2

For arrays the star is added implicitly, while for generic enumerables
it's not. That might be a reason to do away with this implicit
behavior - at least for me.

(In Ruby 1.8.1 that is)

Kind regards

    robert

···

On Tue, 11 Jan 2005, itsme213 wrote:
> "David A. Black" <dblack@wobblini.net> wrote

Hi --

   [x,y] = obj
is the same as
   x = obj[0]
   y = obj[1]
So I ask obj to 'do' its [0], [1] right where I want it. Variables are
bound to results.

Hmmm...

   irb(main):004:0> obj = [1,2]
   => [1, 2]
   irb(main):005:0> [x,y] = obj
   SyntaxError: compile error

Or did you mean: x,y = obj ?

Nope. I chose syntax that is currently illegal (dropping the extra ::, etc.)
to propose it as type + pattern for 2.0.

I don't think I'd characterize it as obj
"doing its [0], [1]". You're not sending messages to obj -- not even
the message #[].

The proposal is for
   [x,y] = obj
to do exactly x=obj[0]; y=obj[1].

What's gained by that, though? It gives you a basis for an analogy
for the other stuff, but I don't see the advantage of the whole
direction, and the analogy doesn't really have persuasive weight. It
sounds like you want:

   (join, x) = array

to mean

   x = array.join

which I think would be awfully hard to decipher.

Isn't some type declaration facility a distinct 2.0 possibility?

Yes, though I'm hoping Matz's conservatism about change will rule it
out :slight_smile: I've seen some interesting and thoughtful discussions of this
topic, but the vast majority of times that people have advocated type
checks/assertions/declarations, in my judgement it's been because
they're unconvinced of the soundness of the conditions that Ruby
imposes on programming. My reaction has always been that (a) Ruby
will continue to impose those conditions; you can't do much more than
pretend they're not there, and (b) those conditions are in fact sound
already.

Mainly I root for people to keep exploring and exploring and exploring
the dynamism of Ruby, and I am convinced that *anything* that people
can latch onto that makes them feel they've defeated that dynamism
will, in fact, be latched onto in exactly that spirit. I think that's
what would happen, and I would count it a loss.

I see. Some usages of it are intentionally closer to explicit type
declarations. Maybe 'duck type declarations'?

I believe that duck typing and type declaration are two different
things -- i.e., that the concept of duck typing is at heart an
alternative to the concept of type declaration.

Perhaps. I think duck-typing is about focusing on respond_to and steering
clear of class-based tests (unless they are truly part of what a method
requires, arguments of style aside). Duck-typing, like any strong typing,
can be static (associated with variables and expressions) or dynamic
(associated only with objects), or in-between (dynamically select between,
or even generate, multiple threads of type-specialized code).

I think it's a somewhat simpler and less sweeping (though rather
profound) concept; see http://www.rubygarden.org/ruby?DuckTyping

def f ( x, y )
   (m1(), m2()) x
   # or T x, if T was named type defined as m1(), m2()
In this case Ruby should associate the type-declared x with the
parameter x, rather than a new local variable. Would you agree?

In context, yes, but I'm not sold on this syntax either.

Alternatives welcome, but they should allow for selected use to be
unambiguously about type declaration.

See my #can? method in the last post.

David

···

On Tue, 11 Jan 2005, itsme213 wrote:

"David A. Black" <dblack@wobblini.net> wrote

--
David A. Black
dblack@wobblini.net

Hi --

   [x,y] = obj
is the same as
   x = obj[0]
   y = obj[1]
So I ask obj to 'do' its [0], [1] right where I want it. Variables are
bound to results.

Hmmm...

   irb(main):004:0> obj = [1,2]
   => [1, 2]
   irb(main):005:0> [x,y] = obj
   SyntaxError: compile error

Or did you mean: x,y = obj ? I don't think I'd characterize it as obj
"doing its [0], [1]". You're not sending messages to obj -- not even
the message #[]. What happens in this scenario depends on assignment
semantics, not message semantics.

I'd say methods are invoked:

Definitely, but I still wouldn't call it message semantics.

Now I'm making things more complicated: generic Enumerables and Arrays are
treated differently in this assignment context:

a,b,c = [0,1,2,3,4]

...

a,b,c = *[0,1,2,3,4]

...

a,b,c = f

...

a,b,c = *f

For arrays the star is added implicitly, while for generic enumerables
it's not. That might be a reason to do away with this implicit
behavior - at least for me.

I think it's inevitable that arrays are "special" in a lot of these
situations -- as wrappers for multiple arguments and return values, as
the "normalized" format for results from Enumerable operations like
select and map, and so on. Unless that's somehow all completely
redesigned, I think it would be better to keep the * behavior
array-bound in that sense.

David

···

On Tue, 11 Jan 2005, Robert Klemme wrote:

"David A. Black" <dblack@wobblini.net> wrote

On Tue, 11 Jan 2005, itsme213 wrote:

--
David A. Black
dblack@wobblini.net

> The proposal is for
> [x,y] = obj
> to do exactly x=obj[0]; y=obj[1].

What's gained by that, though?

In a method signature
    def m [start,end]
is more informative than
    def m obj
or
    def m start_end

The first tells me I have to pass in something from which start can be
extracted with [0], and end with [1]. And if available in method signatures,
it should be available in other places where variables get bound, like
assignment, for-loops, ... (A separate side comment: for the same reasons I
think the rules for * and ',' should be changed to be the same for assignment
as for method calls.)

It sounds like you want:

   (join, x) = array

I meant (join() x) = array, but that's a minor nit. I agree it is hard to
read. #instance_eval kind of comes close, but
    array.instance_eval { x = join() }
would bind a quite inaccessible 'x'. Maybe pattern matching is too hard on
the syntax.
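
A quick check of that inaccessibility in today's Ruby:

    array = [3, 1, 2]
    array.instance_eval { x = join("-") }   # x exists only inside the block
    defined?(x)                             # => nil -- x is not visible out here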

checks/assertions/declarations, in my judgement it's been because
they're unconvinced of the soundness of the conditions that Ruby
imposes on programming.

I know that Ruby's just_in_time duck typing safety check is sound. But I
also believe access to duck-typing information can be used for several
purposes besides just checking on every call.

If you wrote an app that hooked together other objects, doing some checking
on compatibility before doing so, might it be useful to have access to some
non-class-based signature information? Would you prefer this to be done
separately from the method definitions? And to independently invent
conventions at each end?

And I believe duck-typing information should allow duck-type expressions of
the form (borrowing your 'can')
    can?(:a, :b) || (can?(:x) && can?([:k]))
A duck, d, that can either do:
    d.a, d.b, d.c
or
    d.x, d[:k]
Type inference (if feasible) might build such expressions as it traverses
calls, assignments, branches, etc. What do you think? Does this make me more
or less of a quack? :wink:

> Perhaps. I think duck-typing is about focusing on respond_to and
> steering clear of class-based tests (unless they are truly part of what a method
> requires, arguments of style aside). Duck-typing, like any strong
> typing, can be static (associated with variables and expressions) or dynamic
> (associated only with objects), or in-between (dynamically select
> between, or even generate, multiple threads of type-specialized code).

I think it's a somewhat simpler and less sweeping (though rather
profound) concept; see http://www.rubygarden.org/ruby?DuckTyping

Hmm. I see the warning about class-based type info, but nothing against
respond_to?-style type info. Did I miss something?

case...when allows
    Class === x for case matching. Not very ducky.
But a hypothetical
    duck_expression === x
would be quite ducky, imo.
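
Something like this hypothetical sketch, for instance (RespondsTo is a
made-up name):

    class RespondsTo
      def initialize(*methods) @methods = methods end
      def ===(obj) @methods.all? { |m| obj.respond_to?(m) } end
    end

    Quacker = RespondsTo.new(:quack, :waddle)

    class Mallard
      def quack() "quack!" end
      def waddle() "waddling" end
    end

    case Mallard.new
    when Quacker then puts "matched by behavior, not by class"
    else puts "not a duck"
    end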

> Alternatives welcome, but they should allow for selected use to be
> unambiguously about type declaration.

See my #can? method in the last post.

   module Kernel
     def can?(*methods)
       methods.all? {|m| respond_to?(m) }
     end
   end

   class C
     def blah(x)
       raise ArgumentError unless x.can?(:a,:b,:c)
     end
   end

Sure, but:
- What tells any runtime reflective access, or a compiler, to treat this as
part of the signature of #blah?
- I'd like to name and compose can?-based types in a way that is, again,
available to runtime reflective access or a compiler.

Cheers.

···

"David A. Black" <dblack@wobblini.net> wrote

"David A. Black" <dblack@wobblini.net> schrieb im Newsbeitrag
news:Pine.LNX.4.61.0501120420420.1017@wobblini...

Hi --

>>>
>>>
>>> [x,y] = obj
>>> is the same as
>>> x = obj[0]
>>> y = obj[1]
>>> So I ask obj to 'do' its [0], [1] right where I want it. Variables
>>> are bound to results.
>>
>> Hmmm...
>>
>> irb(main):004:0> obj = [1,2]
>> => [1, 2]
>> irb(main):005:0> [x,y] = obj
>> SyntaxError: compile error
>>
>> Or did you mean: x,y = obj ? I don't think I'd characterize it as
>> obj "doing its [0], [1]". You're not sending messages to obj -- not even
>> the message #[]. What happens in this scenario depends on assignment
>> semantics, not message semantics.
>
> I'd say methods are invoked:

Definitely, but I still wouldn't call it message semantics.

Ah, ok. I see (at least I think I do). :slight_smile:

> Now I'm making things more complicated: generic Enumerables and Arrays
> are treated differently in this assignment context:
>
>>> a,b,c = [0,1,2,3,4]
..
>>> a,b,c = *[0,1,2,3,4]
..
>>> a,b,c = f
..
>>> a,b,c = *f
>
> For arrays the star is added implicitely while for generic enumerables
> it's not. That's might be a reason to do away with this implicit
> behavior - at least for me.

I think it's inevitable that arrays are "special" in a lot of these
situations -- as wrappers for multiple arguments and return values, as
the "normalized" format for results from Enumerable operations like
select and map, and so on. Unless that's somehow all completely
redesigned, I think it would be better to keep the * behavior
array-bound in that sense.

Although I completely agree that Array is special in a lot of respects,
I'm still not fully convinced that it's good to have the different
behavior at this place. OTOH, it's not too big an issue (at least for me)
and conservatism is always a good option as it's least likely to break
existing code. :slight_smile:

Thanks for clarifying!

Kind regards

    robert

···

On Tue, 11 Jan 2005, Robert Klemme wrote:
>>> "David A. Black" <dblack@wobblini.net> wrote
>> On Tue, 11 Jan 2005, itsme213 wrote:

Hi --

The proposal is for
   [x,y] = obj
to do exactly x=obj[0]; y=obj[1].

What's gained by that, though?

In a method signature
   def m [start,end]
is more informative than
   def m obj
or
   def m start_end

The first tells me I have to pass in something from which start can be
extracted with [0], and end with [1].

I continue not to believe that this can be determined at run time,
without essentially doing everything that you'd have to do anyway.
For example:

    class C
      def initialize
        @proxy = {}
      end

      def method_missing(*args)
        handle(*args)
      end

      private
      def handle(*args)
        @proxy.send(*args)
      end
    end

From the duck-typing perspective, a C object is 100% capable of
responding to []= and []. But from any other perspective -- i.e.,
other than simply asking the object to do it -- it isn't. And this is
just one small example.... This is the kind of thing that leads me to
believe that type assertions of pretty much any kind are fundamentally
out of place in the Ruby landscape.

And if available in method signatures,
it should be available other places where variables get bound, like
assignment, for-loops, ...

Maybe, though I don't think it's self-evident that all
variables-getting-bound semantics have to be the same as each other.
There might be other things to factor in; for example, in method
signatures there is the special case of &block.

I know that Ruby's just_in_time duck typing safety check is sound. But I
also believe access to duck-typing information can be used for many several
purposes besides just to check on every call.

Again, I would describe this as type information, not duck-type
information. The thing you're discussing is type; "duck" is
superfluous, and actually somewhat misleading. See my "class C"
example above: it's a perfect setup for duck typing, but there's no
associated "information".

Perhaps. I think duck-typing is about focusing on respond_to and
steering clear of class-based tests (unless they are truly part of what a method
requires, arguments of style aside). Duck-typing, like any strong
typing, can be static (associated with variables and expressions) or dynamic
(associated only with objects), or in-between (dynamically select
between, or even generate, multiple threads of type-specialized code).

I think it's a somewhat simpler and less sweeping (though rather
profound) concept; see http://www.rubygarden.org/ruby?DuckTyping

Hmm. I see the warning of class-based type info, but nothing against
respond_to?-style type info. Did I miss something?

It's not that it's for or against; it's just not based on "focusing on
respond_to?". Again, see my "class C" example; respond_to? is
completely irrelevant.

See my #can? method in the last post.

   module Kernel
     def can?(*methods)
       methods.all? {|m| respond_to?(m) }
     end
   end

   class C
     def blah(x)
       raise ArgumentError unless x.can?(:a,:b,:c)
     end
   end

Sure, but:
- What tells any runtime reflective access, or a compiler, to treat this as
part of the signature of #blah?
- I'd like to name and compose can?-based types in a way that is, again,
available to runtime reflective access or a compiler.

#can? *is* runtime reflection. As for compilers, I don't see a way to
express type that's both Ruby-friendly (i.e., applicable to more than
a subset of perfectly plausible Ruby scenarios) and compiler-friendly.
But I'm not a compiler expert :slight_smile:

David

···

On Wed, 12 Jan 2005, itsme213 wrote:

"David A. Black" <dblack@wobblini.net> wrote

--
David A. Black
dblack@wobblini.net

> The first tells me I have to pass in something from which start can be
> extracted with [0], and end with [1].

I continue not to believe that this can be determined at run time,
without essentially doing everything that you'd have to do anyway.

Curiously, that is my point too. If you can some of the things that "you'd
have to do anyway" in a way that makes more transparent a signature (without
having to repeat it outside that signature), then that's A Good Thing.
Patterns like [x,y] are an example of this.

Some other type declarations may not be, unless they do "things you'd have
to do anyway". Hence my attempts to include variable bindings (which you'd
have to do anyway) into the type declarations.

From the duck-typing perspective, a C object is 100% capable of
responding to []= and []. But from any other perspective -- i.e.,
other than simply asking the object to do it -- it isn't. And this is
just one small example.... This is the kind of thing that leads me to
believe that type assertions of pretty much any kind are fundamentally
out of place in the Ruby landscape.

I don't understand your point. I assume that (basic) duck-typing is about
"does x respond to y", and does not include any further semantics of the
behavior of "y".

Maybe, though I don't think it's self-evident that all
variables-getting-bound semantics have to be the same as each other.

Now that you have stated it, it is. The binding is done by Ruby, not by
user-methods. e.g. just as it is illegal today to have two arguments with
the same name in a method def, it could be illegal to have the same variable
appear twice within a pattern on the "lhs" of a binding.

There might be other things to factor in; for example, in method
signatures there the special case of &block.

Which I believe is a special case of pattern-based binding. As is *splat,
and **keywords.

Again, I would describe this as type information, not duck-type
information.

Unless the "type information" does the things that you'd "have to do
anyway", correct?

It's not that it's for or against; it's just not based on "focusing on
respond_to?". Again, see my "class C" example; respond_to? is
completely irrelevant.

I think we are talking past each other on something here. Your class C seems
to check respond_to? via the can? wrapper.

I'm happy with something like your 'can?' if:
- it becomes the standard Ruby way to do it
- it provides a basic set of compositions e.g. t1 && t2 || t3
- it allows for things to be named (constants would probably suffice)

I'd like it to include variable bindings as well, if possible.
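
Roughly this kind of thing, as a made-up sketch layered on respond_to?:

    class DuckType
      def initialize(&test) @test = test end
      def matches?(obj) @test.call(obj) end
      def &(other) DuckType.new { |o| matches?(o) && other.matches?(o) } end
      def |(other) DuckType.new { |o| matches?(o) || other.matches?(o) } end
      def self.can(*ms) new { |o| ms.all? { |m| o.respond_to?(m) } } end
    end

    Readable = DuckType.can(:read)
    Seekable = DuckType.can(:seek, :rewind)
    RandomIO = Readable & Seekable        # named composition

    RandomIO.matches?($stdin)    # => true  (IO defines read, seek and rewind)
    RandomIO.matches?("hello")   # => false (String has none of them)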

#can? *is* runtime reflection. As for compilers, I don't see a way to
express type that's both Ruby-friendly (i.e., applicable to more than
a subset of perfectly plausible Ruby scenarios) and compiler-friendly.
But I'm not a compiler expert :slight_smile:

Nor am I. Further, I'm not a Ruby expert either :slight_smile:
(But I am an active Ruby programmer and do know a bit about objects).

···

"David A. Black" <dblack@wobblini.net> wrote

Hi --

The first tells me I have to pass in something from which start can be
extracted with [0], and end with [1].

I continue not to believe that this can be determined at run time,
without essentially doing everything that you'd have to do anyway.

Curiously, that is my point too. If you can some of the things that "you'd
have to do anyway" in a way that makes more transparent a signature (without
having to repeat it outside that signature), then that's A Good Thing.
Patterns like [x,y] are an example of this.

There's a verb missing between "can" and "some" :slight_smile: But in any case,
my point was that there are some things an object can do that you
really cannot discover except by sending a message to the object.
That's what I mean by the things "you'd have to do anyway"; in the
course of establishing by reflection what the object can do, you'd
have to call the method you're looking for -- so why not just skip
that step and call it when you need it?

That was what my example was about. I think there was some confusion
about which example I was referring to, so let me repeat it:

    class C
      def initialize
        @proxy = {}
      end

      def method_missing(*args)
        handle(*args)
      end

      private
      def handle(*args)
        @proxy.send(*args)
      end
    end

Now do:

    c = C.new
    puts c.respond_to?(:[]=) # false
    c[1] = 2
    puts c[1] # 2

In spite of the negative response to respond_to?, the object *does*
know what to do when sent the message []= (and also []).

The point is that in the end, there are cases where the *only* way to
know what an object will do (or not) when you send it a message is to
send it the message. At that moment, it really doesn't matter how
much build-up there's been, or how many times you've called
#respond_to?, or what the method signature looks like. There's just
this flashpoint, and its consequences, one way or the other.

Some other type declarations may not be, unless they do "things you'd have
to do anyway". Hence my attempts to include variable bindings (which you'd
have to do anyway) into the type declarations.

I'm just not sold on the idea of a method signature as a place to put
all this information. Actually it's partly because it isn't "all" --
if you have: def f [x,y] or whatever, you're giving a clue about one
method that x and y have, but there might be several (or a hundred)
other methods. It doesn't scale.

But I think the more fundamental problem is qualitative rather than
quantitative: it's not just that it's hard to insert all the
information, but that the "information" available is not, in any case,
really a full profile of the object. That's why I'd rather just take
objects as they are, and directly query their capabilities
(information relating to their type) only when there's a specific need
to do so.

Again, I would describe this as type information, not duck-type
information.

Unless the "type information" does the things that you'd "have to do
anyway", correct?

I personally don't find it useful to call an object's type information
its "duck type information". At best, it's like adding "statically-
typed" to every reference to a C variable (this method takes a
statically-typed int and a statically-typed pointer to char, and
returns a statically-typed int) -- i.e., redundant. I also think it
dilutes the meaning, and therefore the power, of the notion of duck
typing.

But honestly, I'm not the duck police :slight_smile: You'll have to use it
however you want.

David

···

On Thu, 13 Jan 2005, itsme213 wrote:

"David A. Black" <dblack@wobblini.net> wrote

--
David A. Black
dblack@wobblini.net

> Curiously, that is my point too. If you can some of the things that "you'd

Darn! "If you can DO some of the things ..."

There's a verb missing between "can" and "some" :slight_smile: But in any case,
my point was that there are some things an object can do that you
really cannot discover except by sending a message to the object.
That's what I mean by the things "you'd have to do anyway"; in the
course of establishing by reflection what the object can do, you'd
have to call the method you're looking for -- so why not just skip
that step and call it when you need it?

Simple. To make the signatures more revealing (to human eyes or code),
provided that:

- you don't have to re-do the things (e.g. establish bindings) in the
signature, or

- there are performance advantages to the info (e.g. type-optimized code
paths), or

- there are other advantages to the info (e.g. certain reflective
facilities)

    class C
      def initialize
        @proxy = {}
      end

      def method_missing(*args)
        handle(*args)
      end

      private
      def handle(*args)
        @proxy.send(*args)
      end
    end

Now do:

    c = C.new
    puts c.respond_to?(:[]=) # false
    c[1] = 2
    puts c[1] # 2

In spite of the negative response to respond_to?, the object *does*
know what to do when sent the message []= (and also []).

Fine. So you have implemented method_missing and respond_to? in mutually
inconsistent ways. But then this is possible in any number of other places
in Ruby where sets of methods are expected to be mutually consistent, but
that is up to the programmer.

You should have done:
class C
    ... your other stuff ...
    def respond_to?(x)
        x == :[]= ? true : super
    end
end

But I think the more fundamental problem is qualitative rather than
quantitative: it's not just that it's hard to insert all the
information, but that the "information" available is not, in any case,
really a full profile of the object. That's why I'd rather just take
objects as they are, and directly query their capabilities
(information relating to their type) only when there's a specific need
to do so.

Pushing that to (an admittedly absurd) extreme, every method would have a
single *args argument. No, seriously, I do understand and respect your
viewpoint. But some of your examples justifying it are not quite correct
(like your class C).

Cheers.

···

"David A. Black" <dblack@wobblini.net> wrote

Hi --

    class C
      def initialize
        @proxy = {}
      end

      def method_missing(*args)
        handle(*args)
      end

      private
      def handle(*args)
        @proxy.send(*args)
      end
    end

Now do:

    c = C.new
    puts c.respond_to?(:[]=) # false
    c[1] = 2
    puts c[1] # 2

In spite of the negative response to respond_to?, the object *does*
know what to do when sent the message []= (and also []).

Fine. So you have implemented method_missing and respond_to? in mutually
inconsistent ways.

I haven't touched respond_to? I'm just showing you a constraint under
which it operates.

But then this is possible in any number of other places
in Ruby where sets of methods are expected to be mutually consistent, but
that is up to the programmer.

You should have done:
class C
   ... your other stuff ...
   def respond_to?(x)
       x == :[]= ? true : super
   end
end

There's no "should have" here, certainly not based on convention: I
don't think I've ever seen a case of adding things to respond_to? like
that, even in code which makes all sorts of use of method_missing.
The whole point of method_missing is that it catches missing methods.
You're not expected to enumerate them by name -- in fact, it would be
impossible (not to mention a maintenance nightmare!) -- and if the
object thinks it responds to them, then they're not missing.

What you *can* do is document the code. It is no coincidence, I
think, that the person who coined the term "duck typing" and the
author of RDoc are one and the same :slight_smile: The only thing that should
matter is the knowledge that you can do:

   c[1] = 2

and maybe a somewhat thicker description of what that means in the
case of this object.

But I think the more fundamental problem is qualitative rather than
quantitative: it's not just that it's hard to insert all the
information, but that the "information" available is not, in any case,
really a full profile of the object. That's why I'd rather just take
objects as they are, and directly query their capabilities
(information relating to their type) only when there's a specific need
to do so.

Pushing that to (an admittedly absurd) extreme, every method would have a
single *args argument.

That's the extreme of a different case actually :slight_smile: I'm talking about
objects and their responses, nothing about which would lead me to the
single *args.

No, seriously, I do understand and respect your
viewpoint. But some of your examples justifying it are not quite correct
(like your class C).

It's perfectly correct -- a little artificial, perhaps, but not
incorrect. It's actually only the tip of the iceberg when it comes to
method_missing; all sorts of such things can happen.

I think you're reasoning in a bit of a circle: full run-time type
description must be possible in Ruby, because any technique that
interferes with it is bad programming. I disagree; I think there's a
lot of things that dynamic objects can do that escape that kind of
description, and there's no reason to dismiss those things out of
hand. (Or maybe you just didn't like my code :slight_smile:

David

···

On Thu, 13 Jan 2005, itsme213 wrote:

--
David A. Black
dblack@wobblini.net

"David A. Black" <dblack@wobblini.net> wrote in message

> You should have done:
> class C
> ... your other stuff ...
> def respond_to?(x)
> x == :[]= ? true : super
> end
> end

There's no "should have" here, certainly not based on convention: I
don't think I've ever seen a case of adding things to respond_to? like
that, even in code which makes all sorts of use of method_missing.

Let's see, .... Rails generates lots of "methods" from macros by hooking
into method_missing. Here is the tip of Rails' respond_to? chain:

      def respond_to?(method)
        self.class.column_methods_hash[method.to_sym] ||
          respond_to_without_attributes?(method)
      end

Not a valid example?

The whole point of method_missing is that it catches missing methods.

Sure. And if, after catching it, you do something which the (original)
caller would consider a valid handling of the (original) request, then you
should indicate the fact in your 'respond_to?'. Otoh, if your
method_missing simply interposes some exception stuff, and perhaps then
calls super, then you have not handled the client request. In that case you
did not respond_to the client request.

You're not expected to enumerate them by name -- in fact, it would be
impossible (not to mention a maintenance nightmare!) -- and if the
object thinks it responds to them, then they're not missing.

Imo, this is mixing two very different things.
(a) The Ruby machine rule that if the called method is not found on the
object's [singleton] class, then method_missing is invoked; and
(b) The view that your object's client has of whether or not a message will
be handled (via a respond_to? probe), and whether it was handled (via the
outcome of an x.do_it call).

The distinction is important. I am talking about using (a) while offering a
consistent version of (b). Like Rails does, after not covering that
respond_to? ground in the first few releases.

> Pushing that to (an admittedly absurd) extreme, every method would have
> a single *args argument.

That's the extreme of a different case actually :slight_smile: I'm talking about
objects and their responses, nothing about which would lead me to the
single *args.

My point was that we want our signatures to reveal usage. It's a matter of
degree.
def (x, y, z) # N args
    ... use x, y, z
def [x, y, z] # 1 arg,
    ...use x, y, z
def *args # 1 splat arg, taken 'as it is', queried internally
    x = args[0]
    y = args[1]
    z = args[2]
     ...use x, y, z

there's no reason to dismiss those things out of
hand.

No worries, I am not about to dismiss any Ruby practices out of hand,
least of all those of clear Ruby experts.

(Or maybe you just didn't like my code :slight_smile:

Even without ever seeing it I am sure it is far better than mine :slight_smile:

It's perfectly correct -- a little artificial, perhaps, but not
incorrect. It's actually only the tip of the iceberg when it comes to
method_missing; all sorts of such things can happen.

Why do you think that it's artificial?

svg% cat b.rb
#!/usr/local/bin/ruby
$LOAD_PATH.unshift("../src")
require 'bdbxml'

txn = BDB::Env.new("tmp", BDB::INIT_TRANSACTION).manager.transaction
p txn.respond_to?(:create_modify)
p txn.create_modify
svg%

svg% b.rb
false
#<BDB::XML::Modify:0x40093080>
svg%

:slight_smile:

Guy Decoux

Hi --

"David A. Black" <dblack@wobblini.net> wrote in message

You should have done:
class C
   ... your other stuff ...
   def respond_to?(x)
       x == :[]= ? true : super
   end
end

There's no "should have" here, certainly not based on convention: I
don't think I've ever seen a case of adding things to respond_to? like
that, even in code which makes all sorts of use of method_missing.

Let's see, .... Rails generates lots of "methods" from macros by hooking
into method_missing. Here is the tip of Rails' respond_to? chain:

     def respond_to?(method)
       self.class.column_methods_hash[method.to_sym] ||
         respond_to_without_attributes?(method)
     end

Not a valid example?

Quite interesting example -- I hadn't seen it.

The whole point of method_missing is that it catches missing methods.

Sure. And if, after catching it, you do something which the (original)
caller would consider a valid handling of the (original) request, then you
should indicate the fact in your 'respond_to?'. Otoh, if your
method_missing simply interposes some exception stuff, and perhaps then
calls super, then you have not handled the client request. In that case you
did not respond_to the client request.

I don't agree that that's a necessary formula, or that nothing that
doesn't fit into it should exist. A good example is OpenStruct, which
takes advantage of the open-endedness of method_missing.

You're not expected to enumerate them by name -- in fact, it would be
impossible (not to mention a maintenance nightmare!) -- and if the
object thinks it responds to them, then they're not missing.

Imo, this is mixing two very different things.
(a) The Ruby machine rule that if the called method is not found on the
object's [singleton] class, then method_missing is invoked; and
(b) The view that your object's client has of whether or not a message will
be handled (via a respond_to probe), and whether it was handled (via the
outcome of a x.do_it) call.

The distinction is important. I am talking about using (a) while offering a
consistent version of (b). Like Rails does. After not covering that
respond_to ground the first few releases.

I actually see (a) and (b) as even more distinct than that; or, to put
it another way, I think it's fine to stock respond_to? in cases where
it's possible, but it's not always possible -- precisely because these
two mechanisms are different from each other (not only separate but
different).

(Of course you could just do:

   def respond_to?(m)
     true
   end

to mirror the flexibility of method_missing :slight_smile:

(Or maybe you just didn't like my code :slight_smile:

Even without ever seeing it I am sure it is far better than mine :slight_smile:

You did see it -- I meant specifically the "class C" example -- as in,
maybe my lengthy claim that it was not incorrect really isn't to the
point if you just happened to think it was badly written :slight_smile:

I think we should probably wind the thread down. It's been very
interesting indeed, but I think we're starting to circle a bit.

David

···

On Thu, 13 Jan 2005, itsme213 wrote:

--
David A. Black
dblack@wobblini.net

"ts" <decoux@moulon.inra.fr> wrote in message

p txn.respond_to?(:create_modify)
p txn.create_modify

Oh oh, even more Ruby experts join the chorus ? :slight_smile:

My claim is this: If you redefine #method_missing so your object handles
x.foo as the caller would expect, then you _should_ define
x.respond_to?(:foo) to return true. I do not claim that Ruby should prohibit
code that does not do this.

I am interested in your thoughts on this claim.
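
For instance (a toy delegator, just to illustrate the claim):

    class ListProxy
      def initialize(items) @items = items end

      # Forward anything the wrapped list knows how to do...
      def method_missing(name, *args, &block)
        if @items.respond_to?(name)
          @items.send(name, *args, &block)
        else
          super
        end
      end

      # ...and say so when asked.
      def respond_to?(name, include_private = false)
        @items.respond_to?(name) || super
      end
    end

    l = ListProxy.new([1, 2, 3])
    l.respond_to?(:size)   # => true, matching what method_missing will handle
    l.size                 # => 3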

p.s. I know it is possible to redefine #send so that x.send :foo works, but
x.foo does not. I think this is a practice to be actively discouraged. But
let's leave that out for now.

"David A. Black" <dblack@wobblini.net> wrote in message

I think we should probably wind the thread down. It's been very
interesting indeed, but I think we're starting to circle a bit.

I agree on both counts. Cheers!