Module Definition Idioms

Looking at HTree, I see a few more things
that seem weird to me, but which are
probably perfectly normal Ruby:

file: parse.rb

--------------
module HTree
   def HTree.parse
     ...
   end

   def HTree.parse_xml
     ...
   end

   ...

   def Text.parse_cdata_section(raw_string)
     ...
   end

   def Comment.parse(raw_string)
     ...
   end

end # module
---------------

This is all very strange and new to me.
There are a dozen files, each of which adds
to the "HTree" module, and in each module,
methods are added to several classes.

What makes that a desirable way to structure
things? In the Java world, each class is
in its own file. What are the advantages of
doing things this way, instead?

Probably to define the module HTree as the top level namespace.
"Text" and "Comment" are pretty common names for classes/modules, so
putting them inside an HTree module puts them in their own namespace
and keeps everything organized.
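
A minimal sketch of the namespace point. The Mailer module and the describe methods are made up for illustration; they are not taken from HTree's source:

```ruby
# Two unrelated libraries can each define a class named Text,
# because each definition lives inside its own module.
module HTree
  class Text
    def self.describe
      "HTree::Text"
    end
  end
end

# Hypothetical second library with its own Text class.
module Mailer
  class Text
    def self.describe
      "Mailer::Text"
    end
  end
end

# The fully qualified names keep the two classes apart.
puts HTree::Text.describe
puts Mailer::Text.describe
```

Without the enclosing modules, the second "class Text" would reopen the first one instead of defining a separate class.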

- Rob

···

On 7/13/06, Eric Armstrong <Eric.Armstrong@sun.com> wrote:


Rob Sanheim wrote:

···

On 7/13/06, Eric Armstrong <Eric.Armstrong@sun.com> wrote:

file: parse.rb
--------------
module HTree
   def HTree.parse
     ...
   end

   def Comment.parse(raw_string)
     ...
   end

end # module
---------------

What makes that a desirable way to structure
things?

Probably to define the module HTree as the top level namespace.
"Text" and "Comment" are pretty common names for classes/modules, so
putting them inside an HTree module puts them in their own namespace
and keeps everything organized.

Aha. Module as namespace. That rings a bell.
Thanks.

Ok. I get that using a module lets you establish
a namespace. Now for the other part of the question:

   Why is the HTree module split up among half
   a dozen files: parse.rb, and others??

And come to think of it, is it possible that
the Comment and HTree classes are intuited by
Ruby, since they're not explicitly defined,
in this file at least?

Rob Sanheim wrote:

···

On 7/13/06, Eric Armstrong <Eric.Armstrong@sun.com> wrote:

file: parse.rb
--------------
module HTree
   def HTree.parse
     ...
   end
   ...
   def Comment.parse(raw_string)
     ...
   end

end # module
---------------

What makes that a desirable way to structure
things?

Probably to define the module HTree as the top level namespace.

Ok. I get that using a module lets you establish
a namespace. Now for the other part of the question:

  Why is the HTree module split up among half
  a dozen files: parse.rb, and others??

This question sounds to me an awful lot like "why organise your code at all? why not just use one big file?" For large projects (or even small-to-medium projects that you expect other people to use), some sort of functional organisation is necessary.

And come to think of it, is it possible that
the Comment and HTree classes are intuited by
Ruby, since they're not explicitly defined,
in this file at least?

I don't believe so. Class names are constants, and as such have to be initialized somewhere before they're used. Try this example:

module Foo
   def Bar.bar
     puts "hello world"
   end
end

-:2: uninitialized constant Foo::Bar (NameError)

HTree::Comment and HTree::Text will be defined somewhere in one of the files 'require'd into parse.rb (this includes the whole chain of a-requires-b-requires-c-etc).
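
As a rough sketch of that, here is the same Foo example with Bar defined first. The empty class body is a stand-in for whatever a 'require'd file would really contain:

```ruby
# Stand-in for a file loaded earlier via require: it creates
# the constant Foo::Bar.
module Foo
  class Bar
  end
end

# Reopening the module, as a second file might. Bar now resolves
# to Foo::Bar, so a singleton method can be attached to it.
module Foo
  def Bar.bar
    "hello world"
  end
end

puts Foo::Bar.bar
```

The second "module Foo" does not replace the first; it adds to it, which is why the order in which the files are loaded matters.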

I'll throw in my bit on the original question as well:

Rob Sanheim wrote:

[HTree parse.rb example]
What makes that a desirable way to structure
things?

Probably to define the module HTree as the top level namespace.

In this case it also seems to encapsulate similar behaviour within a single file: all the parsing methods for the various classes are defined in a single location. More generally, if Java is "one class, one file," this approach would be "one behaviour, one file."

In the Java world, each class is
in its own file. What are the advantages of
doing things this way, instead?

Quite similar to the Java model, I'd say, just substitute 'behaviour' for 'class' in all the points. For example, if, in Java, you'd say that an error would be easy to localise because you know where the class is defined, here you might say it's easy to localise because you know where that behaviour is defined.

Which one is more appropriate likely depends both on the code and the coder. I imagine that for certain class structures where there are many similarities in behaviour across the classes (e.g., there's an A#a, B#a, C#a, D#a, etc. that all do similar things), but little interaction within classes (e.g., A#a and A#b don't particularly care about each other), grouping by behaviour makes a lot of sense, at least to me.
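
A sketch of what "one behaviour, one file" could look like, collapsed into a single script so it runs on its own. The file name nodes.rb, the content attribute, and the method bodies are assumptions for illustration only; HTree's real parsing code is not shown in this thread:

```ruby
# --- would live in a file such as nodes.rb (hypothetical name) ---
# The classes themselves: just the data they carry.
module HTree
  class Text
    attr_reader :content

    def initialize(content)
      @content = content
    end
  end

  class Comment
    attr_reader :content

    def initialize(content)
      @content = content
    end
  end
end

# --- would live in parse.rb: parsing behaviour for every class ---
module HTree
  # Simplified: strip the CDATA wrapper and keep the inner text.
  def Text.parse_cdata_section(raw_string)
    new(raw_string.sub(/\A<!\[CDATA\[/, "").sub(/\]\]>\z/, ""))
  end

  # Simplified: strip the comment delimiters.
  def Comment.parse(raw_string)
    new(raw_string.sub(/\A<!--/, "").sub(/-->\z/, ""))
  end
end

puts HTree::Text.parse_cdata_section("<![CDATA[a < b]]>").content
puts HTree::Comment.parse("<!-- note -->").content
```

A change to how parsing works then touches one file, even though it affects several classes.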

matthew smillie.

···

On Jul 14, 2006, at 7:57, Eric Armstrong wrote:

On 7/13/06, Eric Armstrong <Eric.Armstrong@sun.com> wrote:

Matthew Smillie wrote:

Ok. I get that using a module lets you establish
a namespace. Now for the other part of the question:

  Why is the HTree module split up among half
  a dozen files: parse.rb, and others??

This question sounds to me an awful lot like "why organise your code at all? why not just use one big file?" For large projects (or even small-to-medium projects that you expect other people to use), some sort of functional organisation is necessary.

You are kidding, right? I am just not grokking this
particular style of organization. Everything I've
seen about modules explains the syntax. But the
syntax is being used in ways I never expected.

So far, I've seen:
   * classes inside of modules (expected)
   * modules inside of classes (why?)
   * big files with multiple modules in them (why?)
   * modules split up among multiple files,
     adding methods to different classes
     (explained below. Thanks!)

Understanding the syntax of "module" and "class"
doesn't begin to help understand the philosophy
behind those organizational styles, or help me
understand the mental model that produced them.

So I'm here, hoping to expand my comprehension.

Class names are constants, and as such have to be initialized somewhere before they're used.

Try this example:

module Foo
  def Bar.bar
    puts "hello world"
  end
end

-:2: uninitialized constant Foo::Bar (NameError)

Right. I should have tried that experiment.

HTree::Comment and HTree::Text will be defined somewhere in one of the files 'require'd into parse.rb (this includes the whole chain of a-requires-b-requires-c-etc).

Makes sense.

I'll throw in my bit on the original question as well:

Rob Sanheim wrote:

[HTree parse.rb example]
What makes that a desirable way to structure
things?

Probably to define the module HTree as the top level namespace.

In this case it also seems to encapsulate similar behaviour within a single file: all the parsing methods for the various classes are defined in a single location. More generally, if Java is "one class, one file," this approach would be "one behaviour, one file."

AHA! That is a /really/ interesting observation.
That seems useful, somehow. I like it immediately.

In the Java world, each class is
in its own file. What are the advantages of
doing things this way, instead?

Quite similar to the Java model, I'd say, just substitute 'behaviour' for 'class' in all the points. For example, if, in Java, you'd say that an error would be easy to localise because you know where the class is defined, here you might say it's easy to localise because you know where that behaviour is defined.

Which one is more appropriate likely depends both on the code and the coder. I imagine that for certain class structures where there are many similarities in behaviour across the classes (e.g., there's an A#a, B#a, C#a, D#a, etc. that all do similar things), but little interaction within classes (e.g., A#a and A#b don't particularly care about each other), grouping by behaviour makes a lot of sense, at least to me.

Thanks much. Grouping by behavior is a very interesting
concept. When I learn something that changes one bit of
parsing code, there is a good chance that I will need to
modify other bits, as well. That organization could make
it easier to replicate changes.

So that covers
