XMLParser, NQXML, REXML,

Armin_Roehrl · 20 October 2002 21:12

Hi XML-freaks,

Who has used XMLParser, NQXML, REXML, ...?

I am currently looking at the faqtotum
http://www.rubygarden.org/iowa/faqtotum
and trying to work on pending questions.

Feedback/Experience would be welcome.
Why did you individually choose parser X and not
parser Y?

Thanks,
-A.

···

Armin Roehrl, http://www.approximity.com
Training, Development and Mentoring
OOP, XP, Java, Ruby, Smalltalk, .Net, Datamining, Parallel computing,
Webservices

I can’t understand why people are frightened of new ideas.
I’m frightened of the old ones.
– John Cage

James6 · 20 October 2002 21:38

Hi XML-freaks,

Who has used XMLParser, NQXML, REXML, …?

I’ve used some of the available XML parsers for Ruby, ultimately settling on
REXML for pretty much all my XML needs.

I am currently looking at the faqtotum
http://www.rubygarden.org/iowa/faqtotum
and trying to work on pending questions.

Feedback/Experience would be welcome.
Why did you individually choose parser X and not
parser Y?

The main considerations were ease of installation under both Linux and Windows,
and the API.
If I had to compile native C code, that was a problem. If it had to link with
some other native code library (e.g., expat), that too was a problem. I was
looking for something I could install with minimum user privileges, so a
pure-Ruby solution was the goal, and REXML does all I need it to do.

The REXML API is quite intuitive, and the option to use either an object model or
a stream parser is very handy.
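
As a rough illustration of those two styles, here is a minimal sketch that reads the same document both ways. The sample XML and the NameCollector class are made up for this example; REXML::Document, REXML::StreamListener and REXML::Document.parse_stream are the parts that come from REXML.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Illustrative sample data, not taken from the thread.
SAMPLE = <<~DOC
  <parsers>
    <parser name="REXML"/>
    <parser name="XMLParser"/>
  </parsers>
DOC

# Object model: parse the whole document into a tree, then navigate it.
doc = REXML::Document.new(SAMPLE)
tree_names = []
doc.elements.each('parsers/parser') { |e| tree_names << e.attributes['name'] }

# Stream parser: no tree is built; the listener gets a callback per event.
class NameCollector
  include REXML::StreamListener
  attr_reader :names

  def initialize
    @names = []
  end

  # Called for every start tag; attrs is a Hash of attribute name => value.
  def tag_start(name, attrs)
    @names << attrs['name'] if name == 'parser'
  end
end

listener = NameCollector.new
REXML::Document.parse_stream(SAMPLE, listener)

p tree_names      # => ["REXML", "XMLParser"]
p listener.names  # => ["REXML", "XMLParser"]
```

The tree form is the more convenient one when you need to move around the document; the stream form keeps memory use flat on large inputs.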

Incidentally, I’m working on a chart of Ruby XML libraries that, hopefully, will
make it easier to see what does what.

In the meantime, you can still dig up quite a bit of information here:

James

···

Thanks,
-A.

Sean_Chittenden2 · 23 October 2002 04:01

Hi XML-freaks,

Who has used XMLParser, NQXML, REXML, …?

I am currently looking at the faqtotum
http://www.rubygarden.org/iowa/faqtotum
and trying to work on pending questions.

Feedback/Experience would be welcome.
Why did you individually choose parser X and not
parser Y?

http://www.rubynet.org/modules/xml/libxml/

I’m writing libxml for Ruby. It’s actually pretty stable at this
point and very fast compared to the other XML parsers out there.
Right now there is full XPath support, it does do validation, can
write XML documents using the DOM interface, and can read/write
gzipped XML documents. A SAX interface will likely come sometime in
mid-November, however the xslt libraries are getting the majority of
my time at the moment. XSLT variables are coming shortly, maybe even
tonight sometime.

Markus and I are working on rubydoc which is now able to automatically
document the API and interface for libxml according to the rubydoc
DTD. As soon as libxml2 has support for schemas, I will move all of
our DTDs to schemas. In the meantime, the spec for rubydoc can be
found here:

http://cvs.rubynet.org/index.cgi/projects/rubydoc/src/doc/

With libxslt, I’m working on the rubydoc CLI that will merge in hand
written documentation (from an inline source or from an external
rubydoc XML file) with the auto-generated document to form a complete
map of the source, regardless of whether or not it is a C module or
written in Ruby.

I like libxml because it’s very fast, you can do xpath queries on any
node in the document, and the XML parser is based on libxml2 which is
spec complete in most cases and released under the MIT license. If
you have any problems with code, please let me know.

-sc

···

–
Sean Chittenden

Austin_Ziegler2 · 23 October 2002 04:46

Markus and I are working on rubydoc which is now able to
automatically document the API and interface for libxml according
to the rubydoc DTD. …
[…]
With libxslt, I’m working on the rubydoc CLI that will merge in
hand written documentation (from an inline source or from an
external rubydoc XML file) with the auto-generated document to
form a complete map of the source, regardless of whether or not
it is C module or written in Ruby.

Okay. This hits on perhaps my biggest frustration with Ruby right
now. I love this language. It makes sense in a way that I haven’t
really found any other language to make sense … BUT!

Perl has ONE documentation format, POD. It’s old, it’s ugly, and
it’s a sumbitch to do anything “nice” with, but bedamned it WORKS.

Ruby, on the other hand, has two going on three different formats,
if I’m understanding what Sean is saying correctly. None of which,
by the way, appear to be compatible. Right now, if I want
documentation for a package, I have to keep rd2 and rdoc handy. I’d
probably have to keep rubydoc’s CLI handy, too, but it’s currently
vapour. I settled on rdoc for my personal documentation efforts
because it Just Works.

Similarly, there are no fewer than four packaging/installation
systems available or soon to be available: RubyGems, setup.rb, rpkg,
and (again, currently vapourous) rubynet. I’m sure I’ve missed one
or two. What I’ve settled on is a variant of what Dave includes with
rdoc, that might be one of the two variants included with setup.rb
– but I’m not quite sure.

As to multiples of other functionality providers, I don’t have as
much of an issue, except for the database libraries.[1] But
packaging/installation systems and documentation systems are
fundamentals. rd and rd2 are too limited? Fine. The same cannot be
said of rdoc. rdoc even understands other languages (including
Fortran95?!?!) so that documentation can be embedded in those
languages and produced cleanly. It supports – at least
experimentally and I haven’t been able to get it to work yet –
automatic diagram creation. Dave is quite responsive to suggestions,
and rdoc comments are very readable in and of themselves without
being formatted as HTML or anything else. Want to create PDFs
instead of HTML files? Write the appropriate generator and/or
template.

I’m still baffled as to why I should choose any particular
package/installation getup. Perl has, for the most part,
standardized on ‘perl Makefile.PL’, which creates a
makefile which may or may not use autoconf as necessary. Of course,
this part gets into another issue that may be unique only to the
Windows port because there’s such a variety of compilers, but it
would be nice if the setup tools for extensions requiring
compile-time efforts could be made to recognise that the local
compiler may not be the same as the package provider’s compiler –
the same with the build directory (Andy’s build directory begins on
drive t:, did anyone notice? (:

Yes, something better than the current RAA with its link-only form
is needed. Yes, rubynet might be a good choice for defining such a
concept. I also think that Simon Cozens might be right that we might
also be best off starting from CPAN and search.cpan and slowly but
surely rubyfying so that we have a proper CRAN/rubynet/whatEVER!

BTW, with respect to rubynet itself, I don’t like the dot files. Why
not use an XML file format or YAML, if you want something
lighter-weight (I’m partial to YAML’s simplicity, even though I’m
happily using REXML for an application that I’ve written locally).
IMO, there should be ONE file that describes a given project – and
it’s up to the rubynet server(s) to parse that file into the
appropriate meta-data. (Think PAD for Windows shareware or DIZ or
even RSS for that matter.)

-austin
[1] Perl standardized on DBI a while back, and I think that all of
the various database libraries should merge in with Ruby’s DBI
effort so that there’s a single database interface. It’s too
messy to make portable code, otherwise.
– Austin Ziegler, austin@halostatue.ca on 2002.10.23 at 00.45.43

···

On Wed, 23 Oct 2002 13:01:59 +0900, Sean Chittenden wrote:

Sean_Chittenden2 · 23 October 2002 05:07

Markus and I are working on rubydoc which is now able to
automatically document the API and interface for libxml according
to the rubydoc DTD. …
[…]
With libxslt, I’m working on the rubydoc CLI that will merge in
hand written documentation (from an inline source or from an
external rubydoc XML file) with the auto-generated document to
form a complete map of the source, regardless of whether or not
it is C module or written in Ruby.

Perl has ONE documentation format, POD. It’s old, it’s ugly, and
it’s a sumbitch to do anything “nice” with, but bedamned it WORKS.

Works, sure, but it’s ugly… and on the scale of ugly as in it’s
fugly ugly. pukes and puts bag over POD’s head

Ruby, on the other hand, has two going on three different formats,
if I’m understanding what Sean is saying correctly.

Sadly yes… however there is a saving grace with rubydoc, its
standard is XML and the other utilities export XML… which means that
anyone can write a stylesheet that’ll convert rdoc->rubydoc or
rd->rubydoc. These stylesheets will be included in the base rubydoc
installation.

None of which, by the way, appear to be compatible. Right now, if I
want documentation for a package, I have to keep rd2 and rdoc
handy. I’d probably have to keep rubydoc’s CLI handy, too, but it’s
currently vapour. I settled on rdoc for my personal documentation
efforts because it Just Works.

rdoc’s great and I actually plan on using it for embedded
documentation. I have no interest in recreating the work of Dave,
only extending it and making things more generic and rubynet friendly
(wherein users can submit comments/patches, code, etc on either
rubynet.org or rubydoc.org and have the latest doc included on
download or available via a ruby(doc|net) --update [module] command).

Similarly, there are no fewer than four packaging/installation
systems available or soon to be available: RubyGems, setup.rb, rpkg,
and (again, currently vapourous) rubynet.

:-/ Working on stuff by yourself and being in a perpetual state of
over commitment and everywhere-all-at-once kinda bites when you’re
trying to design something “right” and aren’t hacking it together.
Help wanted/appreciated.

It supports – at least experimentally and I haven’t been able to
get it to work yet – automatic diagram creation.

Speaking of, does ruby have a DOT interface? I’ve wanted to use this
for class diagramming on rubynet but haven’t looked into it.

else. Want to create PDFs instead of HTML files? Write the
appropriate generator and/or template.

The joy of XSLT and flow objects.

BTW, with respect to rubynet itself, I don’t like the dot files. Why
not use an XML file format or YAML, if you want something
lighter-weight (I’m partial to YAML’s simplicity, even though I’m
happily using REXML for an application that I’ve written locally).

I’m not a YAML lover, personally. Culturally I think YAML exists as
a counter movement to Java/XML and XML’s tendency to get bundled with
Java. I can’t say as I disagree with the dislike of the Java/M$
developer sentiment, Sun hasn’t done much in the way of innovative
computing in a while and I wish it would just curl up and flop. Tandem
makes better hardware anyway.

Here’s the dilly with supporting multiple files and formats: I don’t
know what your preference is. Contrary to my sentiment about YAML,
I’ll likely support a YAML interface for configuring packages just
because that’s a format that some developers prefer. I personally
favor having simple and small files each with a specific format.
Makes it easier to manipulate/create the files with sed(1) and
find(1). I’m a die hard UNIX guy at heart, what can I say. It
showing?

IMO, there should be ONE file that describes a given project – and
it’s up to the rubynet server(s) to parse that file into the
appropriate meta-data. (Think PAD for Windows shareware or DIZ or
even RSS for that matter.)

By the time the data hits the rubynet server, the data will be
serialized into an XML file. The dot files are used only by an author
for describing their package, not for use in the rubynet system. Once
things hit the rubynet system, it’s XML. Period. For those that are
curious, binary data is base64 (MIME) encoded in an element in the rubynet
file.
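
To make the idea of binary data inside an XML element concrete, here is a small round-trip sketch. It is only an illustration under stated assumptions: the element and attribute names (package, file, encoding) are hypothetical and are not the rubynet file format, and REXML is used simply because it is pure Ruby.

```ruby
require 'rexml/document'

# Arbitrary binary payload, including bytes that are not legal in XML text.
payload = "\x00\x01\xFFbinary".b

# Build a tiny document; the element names here are hypothetical.
doc = REXML::Document.new
pkg = doc.add_element('package', 'name' => 'example')
file = pkg.add_element('file', 'encoding' => 'base64')
file.text = [payload].pack('m0')   # 'm0' = base64 without line breaks

xml = doc.to_s
puts xml

# Reading it back: parse, pull the element text, decode.
decoded = REXML::Document.new(xml).elements['package/file'].text.unpack1('m0')
puts decoded == payload
```

The encoding step is what makes arbitrary bytes safe to carry as element text; the cost is the size increase discussed later in the thread.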

Think of rubynet as FreeBSD’s ports + tar + CPAN.

http://lists.rubynet.org/lists/listinfo/rubynet-devel

-sc

···

–
Sean Chittenden

Ptkwt1 · 23 October 2002 07:19

In article 20021023044635.EPYB16517.tomts23-srv.bellnexxia.net@hogwarts,

Okay. This hits on perhaps my biggest frustration with Ruby right
now. I love this language. It makes sense in a way that I haven’t
really found any other language to make sense … BUT!

Perl has ONE documentation format, POD. It’s old, it’s ugly, and
it’s a sumbitch to do anything “nice” with, but bedamned it WORKS.

Ruby, on the other hand, has two going on three different formats,
if I’m understanding what Sean is saying correctly. None of which,
by the way, appear to be compatible. Right now, if I want
documentation for a package, I have to keep rd2 and rdoc handy. I’d
probably have to keep rubydoc’s CLI handy, too, but it’s currently
vapour. I settled on rdoc for my personal documentation efforts
because it Just Works.

Similarly, there are no fewer than four packaging/installation
systems available or soon to be available: RubyGems, setup.rb, rpkg,
and (again, currently vapourous) rubynet. I’m sure I’ve missed one
or two. What I’ve settled on is a variant of what Dave includes with
rdoc, that might be one of the two variants included with setup.rb
– but I’m not quite sure.

As to multiples of other functionality providers, I don’t have as
much of an issue, except for the database libraries.[1] But
packaging/installation systems and documentation systems are
fundamentals.

I have to strongly agree with your sentiments here. I generally tend to
dislike centralized planning, but this is an instance where I think we’re
gonna need some. I really think that whatever packaging package we go
with and whatever documentation system we go with should be included in
the next major release (1.8) (and only one of each should be included).
Yes, this effectively kills the competing packages, but in these two cases
I think that needs to happen.

I suspect that the way to go about it is to tell each of the packaging
camps, for example, that we want to settle on a packaging system by some
date in the not-so-distant future, and that each camp should submit its
best entry early enough for folks to play with the entries a fair amount
before we stage a vote on RubyGarden. The vote
probably would not be the final say, but it could serve as a guide to Matz
(or someone he delegates the task to) who would make the final decision.

Phil

···

Austin Ziegler austin@halostatue.ca wrote:

Austin_Ziegler2 · 23 October 2002 06:32

Perl has ONE documentation format, POD. It’s old, it’s ugly, and
it’s a sumbitch to do anything “nice” with, but bedamned it
WORKS.
Works, sure, but it’s ugly… and on the scale of ugly as in it’s
fugly ugly. pukes and puts bag over POD’s head

Yeah, well, I said that. Just not as … eloquently. It does work
and there’s only one standard to deal with.

Ruby, on the other hand, has two going on three different
formats, if I’m understanding what Sean is saying correctly.
Sadly yes… however there is a saving grace with rubydoc, its
standard is XML and the other utilities export XML… which means
that anyone can write a stylesheet that’ll convert rdoc-> rubydoc
or rd-> rubydoc. These stylesheets will be included in the base
rubydoc installation.

In some ways, though, I still think that this is probably the wrong
approach. IMO, either rdoc or rd needs to die. Ideally, rdoc will
pick up the ability to parse rd comments cleanly (perhaps spitting
out warnings) so that there only needs to be one primary
documentation tool.

None of which, by the way, appear to be compatible. Right now, if
I want documentation for a package, I have to keep rd2 and rdoc
handy. I’d probably have to keep rubydoc’s CLI handy, too, but
it’s currently vapour. I settled on rdoc for my personal
documentation efforts because it Just Works.
rdoc’s great and I actually plan on using it for embedded
documentation. I have no interest in recreating the work of Dave,
only extending it and making things more generic and rubynet
friendly (wherein users can submit comments/patches, code, etc on
either rubynet.org or rubydoc.org and have the latest doc included
on download or available via a ruby(doc|net) --update [module]
command).

Okay … so … what’s the point of rubydoc? IMO, rdoc is quite
sufficient to produce the necessary documentation for the code. Is
rubydoc intended to massage that documentation into a nicer format
that could, theoretically, look good in PDF?

Don’t get me wrong – I’m not overly enamoured of the default rdoc
template, but at the moment I’m just too lazy to change it for my
own purposes and (later) offer it as a replacement template.

I have to be honest and say that I don’t get why rubydoc as a
tool is necessary. If you’re going to make it an aggregator so that
the user can have the equivalent of a single interface like users of
the ActiveState Perl package do … that’s cool. If you’re going to
make it a super-duper version of ri with Rimport, even better! (See?
I forgot another Ruby documentation tool.) But if it’s just going to
be Yet Another Documentation Tool, I’ve got to say that I don’t see
the point.

IMO, rubydoc should be a shell and transformation agent on top of
rdoc, not a whole new documentation system. But that’s just IMO.

Similarly, there are no fewer than four packaging/installation
systems available or soon to be available: RubyGems, setup.rb,
rpkg, and (again, currently vapourous) rubynet.
:-/ Working on stuff by yourself and being in a perpetual state of
over commitment and everywhere-all-at-once kinda bites when you’re
trying to design something “right” and aren’t hacking it together.
Help wanted/appreciated.

That’s not a criticism, Sean. However, I can’t at the moment really
provide any assistance with the code side – job searches tend to
take a lot of time, and I have my own coding projects that I’m
attacking (including a bit of boring I18N work that I’m WAAAAY
behind on, and a Palm OS project that I need to finish so I can
start selling it, but I also need to read more in Applied
Cryptography so I can build a decent software key system that’s not
too nasty to deal with). I have joined rubynet-devel, though, so I
can possibly provide some design commentary.

It supports – at least experimentally and I haven’t been able to
get it to work yet – automatic diagram creation.
Speaking of, does ruby have a DOT interface? I’ve wanted to use
this for class diagramming on rubynet but haven’t looked into it.

Look at rdoc; there’s a dot/ directory in the distribution. I’m not
exactly sure the best way to use it – part of the problem could be
that I use Ruby from Windows, and I don’t have dot itself installed
(: As I said above, rdoc should be the documentation system for
Ruby; rubydoc should be a way of transforming rdoc output into ri
database information (a la Rimport), creating a unified API
reference for Ruby and all the modules installed on the user’s
system, etc.
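
On the DOT question: DOT is a plain-text format, so a class diagram can be sketched without any binding at all. The helper below is a hypothetical illustration (class_graph_dot is a made-up name, and this is not taken from rdoc’s dot/ support); it only walks superclass chains and prints edges that the dot tool could render.

```ruby
# Emit a Graphviz DOT description of the superclass chains of the given classes.
def class_graph_dot(*classes)
  edges = classes.flat_map do |klass|
    chain = klass.ancestors.grep(Class)   # classes only, skip mixed-in modules
    chain.each_cons(2).map { |sub, sup| %("#{sub}" -> "#{sup}";) }
  end.uniq                                # shared ancestors appear once

  (['digraph classes {'] + edges.map { |e| "  #{e}" } + ['}']).join("\n")
end

dot = class_graph_dot(Integer, Float)
puts dot
```

Saving the output to a file and running it through dot (if it is installed) should give a small inheritance diagram.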

else. Want to create PDFs instead of HTML files? Write the
appropriate generator and/or template.
The joy of XSLT and flow objects.

Certainly – and this is probably what rubydoc should do. (Of
course, I’ve now tried rdoc’s CHM – Windows HTMLHelp – output and
have had mixed results with it. It seems to ignore #:nodoc:
directives; I may look at that to provide a patch for Dave when he
gets back, like another patch that I’ve made to a patch that he made
in response to a patch that I gave him.)

BTW, with respect to rubynet itself, I don’t like the dot files.
Why not use an XML file format or YAML, if you want something
lighter-weight (I’m partial to YAML’s simplicity, even though I’m
happily using REXML for an application that I’ve written
locally).
I’m not a yaml lover, personally. Culturally I think YAML exists
as a counter movement to Java/XML and XMLs tendency to get bundled
with Java. I can’t say as I disagree with the dislike of the
Java/M$ developer sentiment, Sun hasn’t done much in the way of
innovative computing in a while and I wish would just curl up and
flop. Tandem makes better hardware anyway.

I think that your cultural analysis is correct. I still think that
YAML is a useful lightweight format.

Here’s the dilly with supporting multiple files and formats: I
don’t know what your preference is.

One format, one file. Period. Don’t give me the option of multiple
formats and files to describe a single package. I picked YAML
because it appears to be close to the .rubynet_ format that you
specified. Perhaps as follows (note that I use ‘…’ when there
could be other attributes or I’m eliding):

--- !rubynet.com/rubynet
- project:
    - name:
    - status:
    - comment:
    - copyright:
    - description:
    - download_name:
    - location:
    - download_urls:
        - ...
    - homepage:
    - licenses:
        - licence: url
    - version:
        - major:
        - minor:
        - micro:
- authors:
    - author:
        - name: 
        - email:
        - ...:
- build:
    - append_files:
    - ...:
- categories:
    - primary:
    - secondary:
        - cat1
        - cat2
- dependencies:
    - ruby:
        - version:
    - libs:
        - required:
            - ...
        - optional:
            - ...
    - dependency:
        - name:
        - version:
        - version_rule:
        - ...
- files:
    - doc:
        - package.html
    - site_lib:
        - MyPackage:
            - test.rb

IMO, in a single file, this provides everything that you would have
in the multiple file form – and it’s easier to edit. XML would be
appropriate, too. I’ll re-present this on rubynet-devel with further
commentary in a couple of days, but I think that there are mistakes
being made in the design as expressed by the package information
files.
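
For a sense of how little code a single-file description needs on the reading side, here is a minimal sketch. It makes several assumptions: the field names follow the outline above, but the values are invented, the structure uses plain mappings instead of lists of one-key entries, and the !rubynet.com/rubynet tag is dropped so that YAML.safe_load accepts it.

```ruby
require 'yaml'

# Hypothetical, trimmed-down package description; values are made up.
PACKAGE_FILE = <<~EOS
  project:
    name: MyPackage
    status: beta
    version:
      major: 0
      minor: 1
      micro: 2
  dependencies:
    ruby:
      version: "1.6.7"
  files:
    site_lib:
      MyPackage:
        - test.rb
EOS

package = YAML.safe_load(PACKAGE_FILE)

# Everything lives in one nested Hash, so lookups are plain key access.
project = package['project']
version = project['version'].values_at('major', 'minor', 'micro').join('.')
summary = "#{project['name']} #{version}"

puts summary                                   # => MyPackage 0.1.2
p package['files']['site_lib']['MyPackage']    # => ["test.rb"]
```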

Contrary to my sentiment about YAML, I’ll likely support a YAML
interface for configuring packages just because that’s a format
that some developers prefer. I personally favor having simple and
small files each with a specific format. Makes it easier to
manipulate/create the files with sed(1) and find(1). I’m a die
hard UNIX guy at heart, what can I say. It showing?

As I said before, one format and one file. Too many options or files
will make the project unmaintainable. If you keep the existing
dot-rubynet files, I can guarantee that I won’t be packaging things
that way – it’s too much work for me, the maintainer. (Ideally,
even, you would have a cross-platform GUI- or TUI-based interface
for building and maintaining the file.)

IMO, there should be ONE file that describes a given project –
and it’s up to the rubynet server(s) to parse that file into the
appropriate meta-data. (Think PAD for Windows shareware or DIZ or
even RSS for that matter.)
By the time the data hits the rubynet server, the data will be
serialized into an XML file. The dot files are used only by an
author for describing their package, not for use in the rubynet
system. Once things hit the rubynet system, it’s XML. Period. For
those that are curious, binary data is MIME64 encoded in an
element in the rubynet file.

Blechhhh. base64 encoding is evil unless it’s unavoidable. It adds
an unnecessary 30% or more to the size of the file. It also seems
that here, you’re planning on mixing metadata and content – which I
consider a major no-no when it comes to data modeling. Frankly, I
think that this is a task for which XML is uniquely UNSUITED. Sure,
XPath helps here, but XML will never match the power of a properly
coded relational database for this sort of problem space.
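
The size figure is easy to check. Base64 turns every 3 input bytes into 4 output characters, so the minimum growth is one third; the sketch below measures it on arbitrary data (the 3000-byte size and the seed are chosen only for the example).

```ruby
# 3000 bytes of arbitrary binary data (a size that divides evenly by 3).
raw = Random.new(42).bytes(3000)

# 'm0' is base64 without line breaks: 3 bytes in, 4 characters out.
encoded = [raw].pack('m0')

overhead = (encoded.bytesize - raw.bytesize) * 100.0 / raw.bytesize
puts "raw: #{raw.bytesize} bytes, base64: #{encoded.bytesize} bytes"
puts format('overhead: %.1f%%', overhead)   # => overhead: 33.3%
```

Line-wrapped base64, as used in mail, adds a newline per line on top of that, which is where the “or more” comes from.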

-austin
– Austin Ziegler, austin@halostatue.ca on 2002.10.23 at 02.32.37

···

On Wed, 23 Oct 2002 14:07:46 +0900, Sean Chittenden wrote:

Simon_Cozens · 23 October 2002 09:59

Sean Chittenden sean@chittenden.org writes:

Works, sure, but it’s ugly… and on the scale of ugly as in it’s
fugly ugly. pukes and puts bag over POD’s head

Sadly yes… however there is a saving grace with rubydoc, its
standard is XML and the other utilities export XML…

POD was created to be easy for programmers to write; XML was created to
be easy for computers to read. As a programmer rather than a computer, I
prefer writing documentation in POD.

···

–
It is now pitch dark. If you proceed, you will likely fall into a pit.

James6 · 23 October 2002 06:54

Ruby, on the other hand, has two going on three different
formats, if I’m understanding what Sean is saying correctly.
Sadly yes… however there is a saving grace with rubydoc, its
standard is XML and the other utilities export XML… which means
that anyone can write a stylesheet that’ll convert rdoc-> rubydoc
or rd-> rubydoc. These stylesheets will be included in the base
rubydoc installation.

In some ways, though, I still think that this is probably the wrong
approach. IMO, either rdoc or rd needs to die. Ideally, rdoc will
pick up the ability to parse rd comments cleanly (perhaps spitting
out warnings) so that there only needs to be one primary
documentation tool.

I tend to agree. Getting people to write documentation, and making that
documentation readily available, might be a lot easier with a single standard.

I have to be honest and say that I don’t get why rubydoc as a
tool is necessary. If you’re going to make it an aggregator so that
the user can have the equivalent of a single interface like users of
the ActiveState Perl package do … that’s cool. If you’re going to
make it a super-duper version of ri with Rimport, even better! (See?
I forgot another Ruby documentation tool.) But if it’s just going to
be Yet Another Documentation Tool, I’ve got to say that I don’t see
the point.

My goal with Rimport was to avoid inventing anything new, but rather to exploit
what was already familiar and available. It’s a bridge between RDoc and ri.
Theoretically, it could be rewritten as a loadable output formatter for RDoc or
as an XSLT transformation. Proof is left as an exercise for the reader.

James

···

-austin
– Austin Ziegler, austin@halostatue.ca on 2002.10.23 at 02.32.37

Alan_Chen2 · 23 October 2002 07:27

So far, none of ri, rdoc, or rd has all the features in place to
make it a clear winner. One viable approach is to write glue
code until we have a system that fills all our documentation needs -
but it might be quite an inconsistent patchwork by then. Just from the
initial description, rubydoc sounds like a decent core to build a
cleaner documentation system upon, but who knows for sure.

Regardless of the technical merits of rubydoc vs rdoc vs whatever, I
think it’s healthy to explore different options to meet the needs of a
documentation system. Just as genetic recombination allows nature to
explore different survivability traits, I think we should encourage
experimentation for ruby projects – especially if somebody is willing
to donate their time and effort to do so.

···

On Wed, Oct 23, 2002 at 03:32:56PM +0900, Austin Ziegler wrote:

I have to be honest and say that I don’t get why rubydoc as a
tool is necessary. If you’re going to make it an aggregator so that
the user can have the equivalent of a single interface like users of
the ActiveState Perl package do … that’s cool. If you’re going to
make it a super-duper version of ri with Rimport, even better! (See?
I forgot another Ruby documentation tool.) But if it’s just going to
be Yet Another Documentation Tool, I’ve got to say that I don’t see
the point.

–
Alan Chen
Digikata LLC
http://digikata.com

Sean_Chittenden2 · 23 October 2002 09:18

Ruby, on the other hand, has two going on three different
formats, if I’m understanding what Sean is saying correctly.
Sadly yes… however there is a saving grace with rubydoc, its
standard is XML and the other utilities export XML… which means
that anyone can write a stylesheet that’ll convert rdoc-> rubydoc
or rd-> rubydoc. These stylesheets will be included in the base
rubydoc installation.

In some ways, though, I still think that this is probably the wrong
approach. IMO, either rdoc or rd needs to die. Ideally, rdoc will
pick up the ability to parse rd comments cleanly (perhaps spitting
out warnings) so that there only needs to be one primary
documentation tool.

Agreed, but I don’t have any interest in what happens in the embedded
documentation space. If something new comes along, great, I’ll use
that too. After rubynet and rubydoc have reached critical mass and I
have free time to work on something else, I may spend some time
fleshing out an inline documentation format of my own that fits in
nicely with rubydoc that isn’t XML. XML’s great for machines, but
when it comes to writing stuff in emacs or any text editor, writing
XML from scratch is the perfect way to develop carpal tunnel. See the
ruby-doc@ mailing list for details about what I’ve quasi envisioned,
however I need to support some kind of linking of variables, methods,
and classes. After spending so much time in the libxml source, I
actually kinda like gnome’s doc format… granted it’s a rip off of
javadoc and wherever the inventive folks at Sun ripped it off from.

Okay … so … what’s the point of rubydoc? IMO, rdoc is quite
sufficient to produce the necessary documentation for the code. Is
rubydoc intended to massage that documentation into a nicer format
that could, theoretically, look good in PDF?

To encapsulate and provide documentation that is transformable, publishable (www, TeX,
PDF, text, nroff), includable (I REALLY want to be able to offer
documentation sets/examples that are dynamically driven from the
rubynet.org and rubydoc.org sites and also have static content. Think
php.net + PostgreSQL’s idocs + docbook + perldoc) and indexable (from
the CLI: rubydoc [term/class]). Nothing anywhere in the programming
world is coming close to offering that. I as a developer want that.
Others likely do too.

Don’t get me wrong – I’m not overly enamoured of the default rdoc
template, but at the moment I’m just too lazy to change it for my
own purposes and (later) offer it as a replacement template.

Good point, I’ll make a note of that request and let you specify a
different txt/nroff stylesheet in your ~/.rubynet/rubydoc.cfg.

I have to be honest and say that I don’t get why rubydoc as a
tool is necessary. If you’re going to make it an aggregator so that
the user can have the equivalent of a single interface like users of
the ActiveState Perl package do … that’s cool.

Basically, yeah.

If you’re going to make it a super-duper version of ri with Rimport,
even better! (See? I forgot another Ruby documentation tool.) But
if it’s just going to be Yet Another Documentation Tool, I’ve got to
say that I don’t see the point.

Naw, that’s why I’m insistent on standardizing around XML. Any of the
Yet-Another-Ruby-Doc formats can be massaged into rubydoc’s XML spec
with a stylesheet. From there, any number of documents can be
made/indexed.

IMO, rubydoc should be a shell and transformation agent on top of
rdoc, not a whole new documentation system. But that’s just IMO.

Bingo, that’s exactly what it’s going to do.

:-/ Working on stuff by yourself and being in a perpetual state of
over commitment and everywhere-all-at-once kinda bites when you’re
trying to design something “right” and aren’t hacking it together.
Help wanted/appreciated.

That’s not a criticism, Sean. However, I can’t at the moment really
provide any assistance with the code side – job searches tend to
take a lot of time, and I have my own coding projects that I’m
attacking[snip].

It’s alright, I know how that goes. I have to take the blame for it moving
slowly since I’m doing this work outside of the limelight of -talk.
I’ve actually completely disengaged from the -talk community and only
read through this mailbox when I get a nudge from someone on IRC
hinting that there’s something worth checking out.

I have joined rubynet-devel, though, so I can possibly provide some
design commentary.

I welcome the addition, there’s a good crew of some 30+ lurkers now.
Put enough of them on a list and someone’s bound to chirp up with a
patch every now and then.

It supports – at least experimentally and I haven’t been able to
get it to work yet – automatic diagram creation.
Speaking of, does ruby have a DOT interface? I’ve wanted to use
this for class diagramming on rubynet but haven’t looked into it.

Look at rdoc; there’s a dot/ directory in the distribution. I’m not
exactly sure the best way to use it – part of the problem could be
that I use Ruby from Windows, and I don’t have dot itself installed
(: As I said above, rdoc should be the documentation system for
Ruby; rubydoc should be a way of transforming rdoc output into ri
database information (a la Rimport), creating a unified API
reference for Ruby and all the modules installed on the user’s
system, etc.

That’s the short term plan. Glad to know my ideas aren’t existing in
a total void.
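For the class-diagramming idea, a minimal sketch of producing DOT text straight from Ruby's own reflection is shown here. It does not use anything from rdoc's dot/ directory; the method name `class_graph_dot` is invented for illustration, and rendering the result still needs Graphviz installed (for example `dot -Tpng classes.dot -o classes.png`).

```ruby
# Emit a Graphviz DOT description of the superclass chains of the given classes.
# class_graph_dot is a hypothetical helper, not part of rdoc or rubydoc.
def class_graph_dot(*classes)
  edges = classes.flat_map do |klass|
    chain = klass.ancestors.grep(Class) # keep classes, drop mixed-in modules
    chain.each_cons(2).map { |sub, sup| %(  "#{sub}" -> "#{sup}";) }
  end
  # Shared parts of the hierarchy would otherwise appear once per class.
  (['digraph classes {'] + edges.uniq + ['}']).join("\n")
end

dot_text = class_graph_dot(ArgumentError, IOError)
puts dot_text
```

Each edge points from a subclass to its superclass, so the two exception classes converge on StandardError in the rendered graph.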

else. Want to create PDFs instead of HTML files? Write the
appropriate generator and/or template.
The joy of XSLT and flow objects.

Certainly – and this is probably what rubydoc should do. (Of
course, I’ve now tried rdoc’s CHM – Windows HTMLHelp – output and
have had mixed results with it. It seems to ignore #:nodoc:
directives; I may look at that to provide a patch for Dave when he
gets back, like another patch that I’ve made to a patch that he made
in response to a patch that I gave him.)

So long as rdoc does the right thing when it comes to generating XML,
then the stylesheet should handle this correctly… though I’m not a
Win32 guy atm, but I’m doing an increasing amount of Win32 GUI clients
in FOX so who knows, I’ll likely transgress into the depths of MS hell
again at some point in the not too distant future.

I’m not a yaml lover, personally. Culturally I think YAML exists
as a counter movement to Java/XML and XMLs tendency to get bundled
with Java. I can’t say as I disagree with the dislike of the
Java/M$ developer sentiment, Sun hasn’t done much in the way of
innovative computing in a while and I wish would just curl up and
flop. Tandem makes better hardware anyway.

I think that your cultural analysis is correct. I still think that
YAML is a useful lightweight format.

No disagreements there, but I just hacked out libxml which seems to do
everything that I need it to at the moment… short of generating DTDs
on the fly, but that’ll come here sometime this week.

Here’s the dilly with supporting multiple files and formats: I
don’t know what your preference is.

One format, one file. Period. Don’t give me the option of multiple
formats and files to describe a single package. I picked YAML
because it appears to be close to the .rubynet_ format that you
specified. Perhaps as follows (note that I use ‘…’ when there
could be other attributes or I’m eliding):

Ooooh!

[snipped the wonderful yaml example]

IMO, in a single file, this provides everything that you would have
in the multiple file form – and it’s easier to edit. XML would be
appropriate, too.

There is no perfect format, IMHO. XML’s good for machines and for
transferring between automated processes. Yaml’s good if you need to
have a human touch down and edit the file and don’t want to barrage
him with a zillion characters that the eye has to parse through.
Multiple plain text files, however, are shell friendly. I honestly
see room for having all three as valid formats for describing a
package. I actually wonder if I couldn’t generate the formats for the
plain text files and the YAML from the XML spec… hrm, that’d be an
interesting exercise in code generation… any XSLT/YAML buffs out
there that’d want to take that on?
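As a rough illustration of generating the other two forms from an XML description, here is a small Ruby sketch using REXML and the bundled yaml library rather than XSLT. The element names and the `.txt` file names are assumptions made up for the example; they are not the rubynet spec or its real dot-file names.

```ruby
require 'rexml/document'
require 'yaml'

# Hypothetical flat package description; not the actual rubynet schema.
xml = <<~XML
  <package>
    <name>example</name>
    <version>0.1.0</version>
    <summary>An example module</summary>
  </package>
XML

doc = REXML::Document.new(xml)

# Collect each child element as a key/value pair.
fields = doc.root.elements.to_a.to_h { |element| [element.name, element.text] }

# One YAML document holding every field.
yaml_text = fields.to_yaml

# One small plain-text file per field (file names are invented here).
text_files = fields.to_h { |key, value| ["#{key}.txt", "#{value}\n"] }

puts yaml_text
text_files.each { |name, body| puts "#{name}: #{body}" }
```

This only covers flat key/value data; nested elements or attributes would need a mapping convention that the thread does not define.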

I’ll re-present this on rubynet-devel with further commentary in a
couple of days, but I think that there are mistakes being made in
the design as expressed by the package information files.

Excellent! I look forward to the discussion.

Contrary to my sentiment about YAML, I’ll likely support a YAML
interface for configuring packages just because that’s a format
that some developers prefer. I personally favor having simple and
small files each with a specific format. Makes it easier to
manipulate/create the files with sed(1) and find(1). I’m a die
hard UNIX guy at heart, what can I say. It showing?

As I said before, one format and one file. Too many options or files
will make the project unmaintainable. If you keep the existing
dot-rubynet files, I can guarantee that I won’t be packaging things
that way – it’s too much work for me, the maintainer. (Ideally,
even, you would have a cross-platform GUI- or TUI-based interface
for building and maintaining the file.)

The dot files will get compiled into XML. From there, a GUI can
operate on the XML. The dot files are being generated at the moment
with some crude guesses so for me, it’s actually really nice.

$ rubynet --generate

[edit dot files and delete false positives/add missing entries]

$ rubynet --compile-module

By the time the data hits the rubynet server, the data will be
serialized into an XML file. The dot files are used only by an
author for describing their package, not for use in the rubynet
system. Once things hit the rubynet system, it’s XML. Period. For
those that are curious, binary data is base64 (MIME) encoded in an
element in the rubynet file.

Blechhhh. base64 encoding is evil unless it’s unavoidable. It adds
an unnecessary 30% or more to the size of the file.

No arguments from me here, but do you know of another way to encode
binary data in an XML file? I’d love to use something more efficient,
but don’t know of anything. Fortunately with the compression set at 9
the size drops quite dramatically as it seems there are enough
patterns in the mime encoded file. I’ll add support to compress files
individually before they get encoded in the event that this pattern
doesn’t hold true.
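To put rough numbers on the overhead under discussion, here is a minimal Ruby sketch. It is not rubynet's code, and the sample payload is made up; it base64-encodes a stand-in binary blob and compares that with deflating at level 9 before encoding.

```ruby
require 'zlib'

# Stand-in for a binary file inside a package (invented, highly repetitive data).
binary = ("\x00\x01\x02\x03".b * 2_000) + 'some trailing text'.b

# Plain base64: every 3 input bytes become 4 output characters.
encoded = [binary].pack('m0')

# Compress each file first, then encode the compressed bytes.
deflated = Zlib::Deflate.deflate(binary, Zlib::BEST_COMPRESSION)
encoded_deflated = [deflated].pack('m0')

# Decoding reverses the two steps.
restored = Zlib::Inflate.inflate(encoded_deflated.unpack1('m0'))

puts "raw bytes:              #{binary.bytesize}"
puts "base64 only:            #{encoded.bytesize}"
puts "deflate, then base64:   #{encoded_deflated.bytesize}"
puts "round trip intact:      #{restored == binary}"
```

How much the compressed form saves depends entirely on the data; already-compressed files such as images will gain little and still pay the base64 cost.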

It also seems that here, you’re planning on mixing metadata and
content

Correctomundo, that’s exactly what I’m doing… and I’m being
quasi-clever about it for several reasons. I’m trying to create a
format that fulfills three purposes:

1. Is indexable. XML+XPath takes care of this.
2. Contains metadata for the module, so that a skeleton module can be
   distributed instead of the full-blown tarball. Think FreeBSD ports
   here. The reason for doing this is for ports conformity, and
   because I’d like to make this system attractive to commercial
   vendors who need to distribute modules and have restricted downloads.
3. Can be used as a ubiquitous format that stores all of the packaging
   and file content. I’ve got an idea in the back of my head for how
   to convert a tarball into a rubynet package, for example. As a
   user, I only want to have to download one thing. Think of this as a
   JAR file on steroids, if you will. While I agree with your next
   point that it’s a no-no, by and large, I do think that this is
   miles better than what Perl has and certainly better than the
   simple zip format that JAR files employ.
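A minimal sketch of what "indexable with XML+XPath" and "metadata plus content in one file" could look like, using REXML. Every element and attribute name below is invented for illustration; the real rubynet schema is not shown in this thread.

```ruby
require 'rexml/document'

# Base64 payload standing in for a packaged source file.
payload = ['puts "hello"'].pack('m0')

# Hypothetical package file mixing metadata and encoded content.
xml = <<~XML
  <package name="example" version="0.1.0">
    <meta>
      <author>A. Hacker</author>
      <requires name="rexml"/>
      <requires name="yaml"/>
    </meta>
    <files>
      <file path="lib/example.rb" encoding="base64">#{payload}</file>
    </files>
  </package>
XML

doc = REXML::Document.new(xml)

# Metadata lookups: an indexer would only ever need queries like these.
version  = doc.root.attributes['version']
requires = REXML::XPath.match(doc, '/package/meta/requires')
                       .map { |element| element.attributes['name'] }

# Content lookup: find one file by path and decode its body.
file   = REXML::XPath.first(doc, "//file[@path='lib/example.rb']")
source = file.text.unpack1('m')

puts "version:  #{version}"
puts "requires: #{requires.join(', ')}"
puts "source:   #{source}"
```

A skeleton package, in these terms, would simply be the same document with the files section left out.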

– which I consider a major no-no when it comes to data
modeling.

See #3 above.

Frankly, I think that this is a task for which XML is uniquely
UNSUITED. Sure, XPath helps here, but XML will never match the
power of a properly coded relational database for this sort of
problem space.

Agreed… there’s an element of usability though that can’t be matched
if you split things into two files. Hrm… maybe I should just
concatenate a tarball with the rubynet skeleton file with a small
contents that gives the version of rubynet needed and the sizes of the
XML and package… hrm… to be continued on devel@rubynet.org.
-sc
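The concatenation idea floated above can be sketched as a fixed-size header followed by the XML and the tarball. This is only one possible layout under stated assumptions: the magic string `RNET`, the field widths, and the method names are all hypothetical, not a format rubynet defines.

```ruby
# Hypothetical layout: 4-byte magic, 16-bit format version,
# 32-bit XML size, 32-bit tarball size, then the two payloads.
MAGIC = 'RNET'
HEADER_FORMAT = 'a4nNN'
HEADER_SIZE = 14 # 4 + 2 + 4 + 4 bytes

def pack_bundle(format_version, xml, tarball)
  header = [MAGIC, format_version, xml.bytesize, tarball.bytesize].pack(HEADER_FORMAT)
  header + xml.b + tarball.b
end

def unpack_bundle(bytes)
  magic, format_version, xml_size, tar_size = bytes.unpack(HEADER_FORMAT)
  raise ArgumentError, 'not a bundle' unless magic == MAGIC

  # The sizes in the header say where each part starts and ends.
  xml     = bytes.byteslice(HEADER_SIZE, xml_size)
  tarball = bytes.byteslice(HEADER_SIZE + xml_size, tar_size)
  [format_version, xml, tarball]
end

skeleton = '<package name="example"/>'
tar_bytes = "\x1f\x8b\x08\x00fake-tarball-bytes".b # stand-in, not a real archive

bundle = pack_bundle(1, skeleton, tar_bytes)
version, xml_out, tar_out = unpack_bundle(bundle)
puts "format version #{version}, #{xml_out.bytesize} bytes of XML, #{tar_out.bytesize} bytes of tarball"
```

With this layout an indexer can read the header and the XML and skip the tarball entirely, which avoids base64 while still giving the user a single download.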

···

–
Sean Chittenden

Jim_Freeze2 · 23 October 2002 17:04

I have discussed this issue with the rdtool creators and they are
on board with the Ruby-Doc effort. All RAA apps will be converted
to the new rubydoc standard when it is available.

[much snippage]

Man, you guys are long winded… ;)

···

On Wednesday, 23 October 2002 at 15:32:56 +0900, Austin Ziegler wrote:

On Wed, 23 Oct 2002 14:07:46 +0900, Sean Chittenden wrote:

Ruby, on the other hand, has two going on three different
formats, if I’m understanding what Sean is saying correctly.
Sadly yes… however there is a saving grace with rubydoc, its
standard is XML and the other utilities export XML… which means
that anyone can write a stylesheet that’ll convert rdoc-> rubydoc
or rd-> rubydoc. These stylesheets will be included in the base
rubydoc installation.

In some ways, though, I still think that this is probably the wrong
approach. IMO, either rdoc or rd needs to die. Ideally, rdoc will
pick up the ability to parse rd comments cleanly (perhaps spitting
out warnings) so that there only needs to be one primary
documentation tool.

–
Jim Freeze

Which is worse: ignorance or apathy? Who knows? Who cares?

Gavin_Sinclair · 23 October 2002 07:59

I have to be honest and say that I don’t get why rubydoc as a
tool is necessary. If you’re going to make it an aggregator so that
the user can have the equivalent of a single interface like users of
the ActiveState Perl package do … that’s cool. If you’re going to
make it a super-duper version of ri with Rimport, even better! (See?
I forgot another Ruby documentation tool.) But if it’s just going to
be Yet Another Documentation Tool, I’ve got to say that I don’t see
the point.

So far, none of ri, rdoc, or rd has all the features in place to
make any of them a clear winner. One viable approach is to write glue
code until we have a system that fills all our documentation needs -
but it might be quite an inconsistent patchwork by then. Just from the
initial description, rubydoc sounds like a decent core to build a
cleaner documentation system upon, but who knows for sure.

rd and rdoc are in competition, and I’d like to see rdoc swallow rd whole: that
is, understand its markup and include rd files in the overall presentation.

But ri is different. It’s a viewer. At the moment, its data (from whence its
output springs) is in a private format. And the only thing it documents is
built-in classes. However, it’s a wonderful little program, and remember, at
its core it is only a viewer.

The Ruby Documentation Project (on a separate mailing list) is attempting to
unite ri with other documentation programs, among other things. That’s what
RImport is about: allow RDoc documentation to be viewed piecemeal through ri.

Gavin

···

From: “Alan Chen” alan@digikata.com

On Wed, Oct 23, 2002 at 03:32:56PM +0900, Austin Ziegler wrote:

Austin_Ziegler2 · 23 October 2002 14:51

Oh, don’t get me wrong – I haven’t downloaded it yet, but I intend
to. The concept is perfect. The tool I forgot was ri, not Rimport (:

-austin
– Austin Ziegler, austin@halostatue.ca on 2002.10.23 at 10.48.56

···

On Wed, 23 Oct 2002 15:54:08 +0900, JamesBritt wrote:

I have to be honest and say that I don’t get why rubydoc as a
tool is necessary. If you’re going to make it an aggregator so
that the user can have the equivalent of a single interface like
users of the ActiveState Perl package do … that’s cool. If
you’re going to make it a super-duper version of ri with Rimport,
even better! (See? I forgot another Ruby documentation tool.) But
if it’s just going to be Yet Another Documentation Tool, I’ve got
to say that I don’t see the point.
My goal with Rimport was to avoid inventing anything new, but
rather to exploit what was already familiar and available. It’s a
bridge between RDoc and ri. Theoretically, it could be rewritten
as a loadable output formatter for RDoc or as an XSLT
transformation. Proof is left as an exercise for the reader.

Austin_Ziegler2 · 23 October 2002 17:12

Ruby, on the other hand, has two going on three different
formats, if I’m understanding what Sean is saying correctly.
Sadly yes… however there is a saving grace with rubydoc, its
standard is XML and the other utilities export XML… which
means that anyone can write a stylesheet that’ll convert rdoc->
rubydoc or rd-> rubydoc. These stylesheets will be included in
the base rubydoc installation.
In some ways, though, I still think that this is probably the
wrong approach. IMO, either rdoc or rd needs to die. Ideally,
rdoc will pick up the ability to parse rd comments cleanly
(perhaps spitting out warnings) so that there only needs to be
one primary documentation tool.
I have discussed this issue with the rdtool creators and they are
on board with the Ruby-Doc effort. All RAA apps will be converted
to the new rubydoc standard when it is available.

I still think that rdoc is a better choice for inline documentation,
and that rubydoc should be built on top of rdoc (as a consumer or
formatter). rdoc has the advantage of being relatively easy to read
in the code itself, whereas I find the rd that I’ve seen in modules
is … hard.

rdoc, by the way, does SORT OF support rd format, but it’s not
clean. (It doesn’t ignore certain rd directives.)

Man, you guys are long winded… ;)

This rant has been coming on for a while. (:

– Austin Ziegler, austin@halostatue.ca on 2002.10.23 at 13.09.39

···

On Thu, 24 Oct 2002 02:04:34 +0900, Jim Freeze wrote:

On Wednesday, 23 October 2002 at 15:32:56 +0900, Austin Ziegler wrote:

On Wed, 23 Oct 2002 14:07:46 +0900, Sean Chittenden wrote:

why_the_lucky_stiff1 · 28 October 2002 06:54

There is no perfect format, IMHO. XML’s good for machines and for
transferring between automated processes. Yaml’s good if you need to
have a human touch down and edit the file and don’t want to barrage
him with a zillion characters that the eye has to parse through.

I’m trying to understand why XML is good for machines. XML parsers
will likely always be more sizeable and unwieldy to a machine than a
YAML parser. It seems that XML is good for machines in the same
way that hefting large bales of hay is good for people. I dunno.

Multiple plain text files, however, are shell friendly. I honestly
see room for having all three as valid formats for describing a
package. I actually wonder if I couldn’t generate the formats for the
plain text files and the YAML from the XML spec… hrm, that’d be an
interesting exercise in code generation… any XSLT/YAML buffs out
there that’d want to take that on?

I would love to help on Rubynet, if you decide to integrate YAML.
When I looked at your dot-files description, I couldn’t help but think
of how yaml.rb’s api could cover all of your parsing needs. Software
authors who already know YAML will have less of a barrier to writing
rubynet dot-files. In addition, YAML is extensible (through its flexible
typing system), which could prove quite beneficial as your dot-file
schema undergoes future revisions.
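As a minimal sketch of the point about parsing, here is a single-file package description loaded with the yaml library that ships with current Ruby (not necessarily the yaml.rb API of the time). The field names are invented for the example and are not the rubynet dot-file schema.

```ruby
require 'yaml'

# Hypothetical package description; keys are illustrative only.
text = <<~YAML
  name: example
  version: 0.1.0
  authors:
    - A. Hacker
  requires:
    - name: rexml
      version: ">= 2.4"
YAML

# safe_load returns plain hashes, arrays, and strings.
package = YAML.safe_load(text)

puts package['name']
puts package['version']
package['requires'].each do |dependency|
  puts "needs #{dependency['name']} #{dependency['version']}"
end
```

No parsing code is written by hand here; the nested list of requirements arrives as an array of hashes.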

As a standing offer to any Ruby project: if you want to trade out XML
configuration files or homemade file formats for YAML, please let me know.
I’d be glad to spend some time working out how YAML can work in projects
throughout the community.

_why

James6 · 28 October 2002 07:32

As a standing offer to any Ruby project: if you want to trade out XML
configuration files or homemade file formats for YAML, please let me know.
I’d be glad to spend some time working out how YAML can work in projects
throughout the community.

I read the article at http://www.xml.com/lpt/a/2002/07/24/yaml.html , but it was
fairly sparse. Seems like yet another markup language.

I’m not looking to start (or perpetuate) any YAML vs. XML religious wars, but is
there a web site that does a pro/con comparison between the two? There’s some
info at YAML™ Specification Index, but not too much.

Your comments suggest YAML can easily step in for XML, though that previous YAML
spec link suggests potentially significant structural/conceptual differences
between the two. Plus, there is a growing (albeit increasingly complex)
constellation of XML-derived specs and tools. Does YAML, for example, have
anything similar to XSLT or a schema language/validation process?

I see that the FAQ at YAML Ain't Markup Language has but one question:

What happens to YAML when people edit with different tab presets?

If this means that YAML treats white space as significant, then I might as well
switch to Python while I’m at it.

James

···

_why

Gavin_Sinclair · 28 October 2002 07:39

There is no perfect format, IMHO. XML’s good for machines and for
transferring between automated processes. Yaml’s good if you need to
have a human touch down and edit the file and don’t want to barrage
him with a zillion characters that the eye has to parse through.

I’m trying to understand why XML is good for machines. XML parsers
will likely always be more sizeable and unwieldy to a machine than a
YAML parser. It seems that XML is good for machines in the same
way that hefting large bales of hay is good for people. I dunno.

[…]

_why

It’s a trade-off: easy enough for man and machine to process. The balance
could be struck in a different way, sure, but someone has drawn a line in the
sand and produced a document representation format that lots of people can use
and benefit from. It’s popular. And it’s buzzword-compliant (which seems to
be one reason why a minority dislike it - not accusing anyone here). And it’s
a bitch to edit manually.

As for being processor intensive, who cares? That’s what processors are for.

Oh, and what’s your favourite validating parser for YAML?

Gavin

···

From: “why the lucky stiff” ruby-talk@whytheluckystiff.net

Simon_Cozens · 28 October 2002 09:49

" JamesBritt" james@jamesbritt.com writes:

I read the article at http://www.xml.com/lpt/a/2002/07/24/yaml.html
, but it was fairly sparse. Seems like yet another markup language.

Surprising, that.

···

–

Progress (n.): The process through which Usenet has evolved from
smart people in front of dumb terminals to dumb people in front of
smart terminals. – obs@burnout.demon.co.uk (obscurity)

why_the_lucky_stiff1 · 28 October 2002 15:58

Your comments suggest YAML can easily step in for XML, though that previous
YAML spec link suggests potentially significant structural/conceptual
differences between the two.

YAML cannot step in for XML. The two are completely different. When it comes to
interleaved content and markup, YAML cannot tread water. Certainly use XML
in cases requiring such.
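"Interleaved content and markup" means text and elements alternating inside one parent. A small REXML sketch shows how an XML parser exposes that as an ordered sequence of text and element nodes; the sample sentence is made up.

```ruby
require 'rexml/document'

# One paragraph whose children alternate between text and elements.
doc = REXML::Document.new('<p>Use <code>REXML</code> for <em>mixed</em> content.</p>')

# Walk the children in document order, tagging each by kind.
parts = doc.root.children.map do |node|
  case node
  when REXML::Text    then [:text, node.value]
  when REXML::Element then [node.name.to_sym, node.text]
  end
end

parts.each { |kind, value| puts "#{kind}: #{value.inspect}" }
```

The order of the nodes carries meaning here, which is the case mappings and sequences of scalars do not model directly.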

Let’s ask the inverse question: Can XML easily step in for YAML? XML is squeezed
into many cases where I believe it doesn’t fit as well: configuration files,
messaging, data serialization. YAML is engineered for these cases.

Plus, there is a growing (albeit increasingly
complex) constellation of XML-derived specs and tools. Does YAML, for
example, have anything similar to XSLT or a schema language/validation
process?

Well, it’s all in progress. Sure, YAML is quite young. Here’s what I can
tell you:

CYATL is YAML’s transformation language. Brian Ingerson (Inline.pm) and
Steve Howell (PyYaml) are developing this. I imagine that it will be a year
or so before there is a stable product (depending on demand, of course).
We are working on a schema language. We’re looking at several possible
angles:
- A reworking of RELAX-NG:
  - http://wiki.yaml.org/yamlwiki/YamlRelaxExampleOne
- A custom schema:
  - http://wiki.yaml.org/yamlwiki/OkayTypeSchema
  - http://wiki.yaml.org/yamlwiki/YaCleanSchemeProposal

The !okay/schema is currently functional. It comes with YAML.rb.

_why

···

On Monday 28 October 2002 12:32 am, JamesBritt wrote:
