Documenting a class interface when there are no types in the method signature

I've recently started using Ruby, coming from a C++/Java background.

And while there were some idiosyncrasies, by and large I was able to be
very productive very quickly.

The application has evolved from a simple script (in the sense that
everything "looks global") to something a bit more complex, with
refactorings towards using objects, a "Test-First-After-The-Fact" approach
:-), and to factoring out library-like elements.

Now, I'm aware that there's always a discussion about strong vs. weak,
static vs. dynamic. The last time I had an opinion about it I had no good
answer to the claim that unit-tests lessen the need for static type
checking.

Now that I have gulped down the basics of Ruby, I'm starting to write unit
tests, from the point of view of "what would a user of this class expect".
And I'm running into what seems to be a fundamental question, and my
googling so far hasn't turned up any satisfactory answer:
When I define a method in a class, let's say initialize(categories, data)
for the sake of argument: In Java etc. I can see from the method definition
that categories is an ordered set, and data is a Map (and if that info is
not sufficient, javadoc can be used to explain in more detail what is
expected, but I'll leave that out for the moment, it's not really key to the
question).
Of course the first step would be to find a good name (at the very least
"data" is a bit too generic). But what next? How does someone who writes an
API communicate that it makes no sense to send "1" as the categories?
Apparently the unit tests and/or dbc encompass the specification, but does
rdoc or some tool extract any information from them?

Am I the first person to run into this problem? If not, what are other
people's solutions? Do you just look at the source code of any library you
use?

I'm asking this from both points of view:

Many times I have run into some library where I was simply asking myself
"what exactly do I have to put into these parameters to get the method to do
what I want?" More specifically, I had this problem in parts of the REXML
library.

And from the other perspective, what is the Ruby Way (tm) to document
methods to users of your API/framework (or simply your class)?

Nicolai Czempin wrote:

When I define a method in a class, let's say initialize(categories, data)
for the sake of argument: In Java etc. I can see from the method definition
that categories is an ordered set, and data is a Map (and if that info is
not sufficient, javadoc can be used to explain in more detail what is
expected, but I'll leave that out for the moment, it's not really key to the
question).
Of course the first step would be to find a good name (at the very least
"data" is a bit too generic). But what next? How does someone who writes an
API communicate that it makes no sense to send "1" as the categories?
Apparently the unit tests and/or dbc encompass the specification, but does
rdoc or some tool extract any information from them?

No, RDoc itself can't extract information from unit tests. These days I usually embed some simple unit tests in my documentation, where they also act as handy sample code. Like in this bad, made-up example:

# Returns the first of the two halves of an Object.
# This works with all Objects that respond to the #[] and #size methods.
#
# Example code:
# halve("hello world") # => "hello"
# halve([1, 2, 3, 4]) # => [1, 2]
# halve(Object.new) # raises NoMethodError
def halve(obj)
  obj[0, obj.size / 2]
end

I have written a small tool that extracts the embedded sample code and runs it as unit tests. It's still not perfect, but I've attached it to this mail (extract.rb); maybe it is of some use to you.
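
For readers without the attachment, here is a rough sketch of how such a tool might work. It is not the attached script; the names extract_examples, check_examples and sample_source are made up for illustration. It assumes every example sits on a single comment line of the form "# code # => expected", and it uses eval, so it should only be pointed at source you trust.

```ruby
# Hypothetical helper: collects [code, expected] pairs from comment lines
# that look like "# code # => expected". Other lines are ignored.
def extract_examples(source)
  examples = []
  source.each_line do |line|
    match = line.match(/\A\s*#\s*(.+?)\s*#\s*=>\s*(.+?)\s*\z/)
    examples << [match[1], match[2]] if match
  end
  examples
end

# Hypothetical runner: loads the source into a throwaway object, then
# evaluates each example and compares it with the documented result.
# Returns [code, passed?] pairs.
def check_examples(source)
  sandbox = Object.new
  sandbox.instance_eval(source) # defines the documented methods on the sandbox
  extract_examples(source).map do |code, expected|
    actual = sandbox.instance_eval(code)
    wanted = sandbox.instance_eval(expected)
    [code, actual == wanted]
  end
end

# The documented method from the example above, as a string to check.
def sample_source
  <<~'RUBY'
    # Returns the first of the two halves of an Object.
    #
    # Example code:
    # halve("hello world") # => "hello"
    # halve([1, 2, 3, 4]) # => [1, 2]
    def halve(obj)
      obj[0, obj.size / 2]
    end
  RUBY
end

check_examples(sample_source).each do |code, passed|
  puts "#{passed ? 'ok  ' : 'FAIL'} #{code}"
end
```

Examples that document an exception (such as the "raises NoMethodError" line) are simply skipped by this sketch, because they have no "# =>" marker.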

Here's a real-world example (from the evil-ruby project):

  # Unfreeze a frozen Object. You will be able to make
  # changes to the object again.
  #
  # obj = "Hello World".freeze
  # obj.frozen? # => true
  # obj.unfreeze
  # obj.frozen? # => false
  # obj.sub!("World", "You!")
  # obj # => "Hello You!"
  def unfreeze
    if $SAFE > 0
      raise(SecurityError, "Insecure operation `unfreeze' at level #{$SAFE}")
    end

    return self if direct_value?

    self.internal.flags &= ~RubyInternal::FL_FREEZE
    return self
  end

So my answer to the question "How does a library user know what protocol objects need to comply with when they are used as arguments for method X?" is "It's specified in the documentation of method X. There's also sample code in there which will yield failing unit tests when the protocol changes, so the documentation won't become outdated."

I hope this answers at least some of your questions.

Regards,
Florian Gross

Nicolai Czempin wrote:

I've recently started using Ruby, coming from a C++/Java background.
...
Am I the first person to run into this problem? If not, what are other
people's solutions? Do you just look at the source code of any library you
use?

Sometimes, though I'd prefer not to. Too lazy.

It helps when the arguments have meaningful names.

I'm asking this from both points of view:

Many times I have run into some library where I was simply asking myself
"what exactly do I have to put into these parameters to get the method to do
what I want?" More specifically, I had this problem in parts of the REXML
library.

And from the other perspective, what is the Ruby Way (tm) to document
methods to users of your API/framework (or simply your class)?

I don't know about The Ruby Way, but when I have a method that will accept different sorts of objects for the same parameter, I simply note that in the rdoc.

There are (at least) two types of object substitution. In one, the method is basically expecting, say, a String, but really only cares that it responds to one or two String methods.

In the other case, the method knows that it might receive objects of a limited set. For example, a method for XML processing that knows what to do if given either a String or a REXML Document.

In the first case, the parameter name might be enough to indicate what to pass in. In the second case, some explicit docs are needed to make clear that one can pass either something that acts like a String, or something that acts like a REXML Document. (Well, you could use a parameter named 'xml_as_either_string_or_REXML_doc'.)
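
A small sketch of how the two cases might look with rdoc-style comments. The methods word_count and line_count are invented for this illustration, and to keep it free of dependencies the second case accepts "a String or an IO-like object" rather than "a String or a REXML Document"; the documentation pattern is the same.

```ruby
require "stringio"

# Case 1: the method basically expects a String, but only cares about #split.
#
# Counts the words in +text+.
# +text+ can be anything that responds to #split the way a String does.
def word_count(text)
  text.split.size
end

# Case 2: the method knows it may receive one of a limited set of objects.
#
# Counts the lines in +source+.
# +source+ is either String-like (responds to #to_str) or IO-like
# (responds to #read). Anything else raises ArgumentError.
def line_count(source)
  content =
    if source.respond_to?(:read)
      source.read
    elsif source.respond_to?(:to_str)
      source.to_str
    else
      raise ArgumentError,
            "expected a String or an object responding to #read, got #{source.class}"
    end
  content.lines.size
end

puts word_count("one two three")               # String-like argument
puts line_count("a\nb\n")                      # String
puts line_count(StringIO.new("a\nb\nc\n"))     # IO-like object
```

In the first method the parameter name and one sentence are enough. In the second, the comment has to list the accepted kinds of object explicitly, since nothing in the signature hints at them.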

I'm thinking, though, that this is not so much a dynamic typing issue as a problem of API design and documentation in general. For example, simply knowing that some Java method requires a Collection object may not be enough. A Collection of what? The method name, and the names of the parameters, need to express the reason the method exists and why you might be interested in using it.

Even when you know the exact type for a parameter, you still have to know the range of acceptable or meaningful values. Sometimes the best way for a developer to indicate that is to simply write it down.
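
As an illustration of "writing it down", here is one way the initialize(categories, data) method from the original question could state its expectations. The class name Report, the rows method and the specific rules (non-empty categories, an entry in data for every category) are assumptions made up for this sketch, not something from the original post.

```ruby
# Hypothetical class, for illustration only.
class Report
  # Creates a report.
  #
  # categories:: ordered list of category names; anything that responds to
  #              #each (for example an Array of Strings). Must not be empty.
  # data::       maps each category name to its value; anything that responds
  #              to #key? and #fetch (for example a Hash). Needs an entry for
  #              every category.
  #
  # Raises ArgumentError when the arguments do not meet these expectations.
  def initialize(categories, data)
    unless categories.respond_to?(:each)
      raise ArgumentError, "categories must respond to #each, got #{categories.inspect}"
    end
    unless data.respond_to?(:key?) && data.respond_to?(:fetch)
      raise ArgumentError, "data must respond to #key? and #fetch, got #{data.inspect}"
    end

    names = []
    categories.each { |name| names << name }
    raise ArgumentError, "categories must not be empty" if names.empty?

    missing = names.reject { |name| data.key?(name) }
    raise ArgumentError, "data has no entry for #{missing.inspect}" unless missing.empty?

    @names = names
    @data = data
  end

  # Returns [category, value] pairs in category order.
  def rows
    @names.map { |name| [name, @data.fetch(name)] }
  end
end

report = Report.new(%w[fruit veg], { "fruit" => 3, "veg" => 5 })
p report.rows
```

The runtime checks are optional; the comment is what tells a caller that passing 1 as categories makes no sense. The checks just turn a confusing NoMethodError deep inside the class into an error message that repeats the documented expectation.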

James