Greetings,
This is to announce the release of REXML 2.7.0, AKA 3.0-beta. This is a
significant release, and includes architectural changes and rewrites of
large portions of the REXML code. This release is mostly backwards
compatible with REXML <= 2.5.x, the only behavioral differences being
in the handling of text nodes – more specifically, how :raw is handled.
I’m rushing this release because I want to start getting feedback and
testing on it ASAP. The main reason is that REXML is now
part of the standard Ruby distribution. That is, from Ruby 1.8 on,
REXML will be included as a standard library. Since we’re expecting
1.8 to be released soon, I’d like the opportunity to slip in as many
optimizations and bug fixes as possible before 1.8 goes out. As a result,
you’ll notice that the documentation is a bit out-of-whack; the change
log, in particular, has not been updated. This email is meant
to fill that gap until I get a chance to rewrite the documentation.
Architectural changes
REXML is now divided into three sections by purpose: the parsers, the
user APIs, and the utilities.
The utilities use the objects in the user APIs, and include things like
XPath and validation. The user APIs use the parsers, and are things like
the SAX2 API, the Pull API, and the (not-yet-existent) DOM API. By
segregating the various parts, fixing bugs is easier, speed improvements
have broader impact, and extending REXML is vastly easier.
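As a rough illustration of the layering described above, here is a
minimal sketch that reads the same document through a parser-backed user
API (the Pull API) and through a utility (XPath) running over the tree
API. It assumes the class and method names found in the REXML that ships
with current Ruby (REXML::Parsers::PullParser, REXML::XPath.match); the
2.7.0 beta may differ in detail. The sample XML and the two helper
method names are made up for the example.

```ruby
require 'rexml/document'
require 'rexml/parsers/pullparser'

# Hypothetical sample document, used only for this sketch.
XML = '<inventory><item id="1">apple</item><item id="2">pear</item></inventory>'

# User API on top of the parser layer: the Pull API hands back one
# parse event at a time, so no tree is built.
def item_ids_via_pull(xml)
  parser = REXML::Parsers::PullParser.new(xml)
  ids = []
  while parser.has_next?
    event = parser.pull
    # For a start-element event, event[0] is the tag name and
    # event[1] is the attribute hash.
    ids << event[1]['id'] if event.start_element? && event[0] == 'item'
  end
  ids
end

# Utility on top of a user API: XPath evaluated against the tree.
def item_names_via_xpath(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//item').map(&:text)
end

p item_ids_via_pull(XML)
p item_names_via_xpath(XML)
```

Both helpers go through the same underlying parser, which is the point
of the split: a fix or speedup in the parser layer reaches every API
built on it.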
Behavioral changes
If you use the original REXML tree API, your applications will be a little
slower parsing documents; this is temporary. I haven’t applied any speed
optimizations to the new parser yet. However, if you need speed,
there is a new, lighter API that is faster than the old REXML parser
(even without optimizations) and also consumes only half the memory of the old
tree. XPath is faster on the old tree, and much faster on the new light
tree. Most of these speed improvements can be carried over to the old REXML
API, and will be in the future.
As I mentioned, the old API hasn’t changed. I’ve left the packages, classes,
and methods alone, so applications that use REXML should require minimal
changes (if any) to work with the new API. Your only concern is if you use
the :raw property of text nodes; in this case, you definitely want to check
the Text class API for behavioral changes. Text node behavior is much
clearer now.
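For anyone unsure whether they depend on :raw, here is a small sketch of
how the flag behaves in the REXML bundled with current Ruby. This is an
assumption about later releases, not a statement of the 2.7.0 behavior,
so check the Text class API for the version you run. The helper name
text_forms is made up for the example.

```ruby
require 'rexml/document'

# Returns the escaped (serialized) form and the unescaped value of a node.
def text_forms(text_node)
  [text_node.to_s, text_node.value]
end

# Default: the string is treated as unescaped content and is escaped
# when the node is written out.
cooked = REXML::Text.new('a & b')

# raw = true (fourth positional argument): the string is treated as
# already escaped and is written out untouched.
raw = REXML::Text.new('a &amp; b', false, nil, true)

p text_forms(cooked)
p text_forms(raw)
```

In this version both nodes serialize to the same escaped text and report
the same unescaped value; what differs is whether REXML or the caller is
responsible for the escaping.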
The new REXML passes all of the old unit tests, as well as the OASIS
XML test suites.
Future direction
All of these changes were made for one purpose: to make REXML easier to
maintain. This translates directly into benefits for the users – none of
the features I mention below are present in REXML yet, but they should
appear quickly in future versions.
It is now easy to tack on a DOM API to the front of REXML without incurring
significant overhead. It is also possible to use XPath with a wider variety
of APIs, such as the SAX2 API (with caveats). Bug fixes in the parser code
will now directly affect all of the APIs, so no API will be left behind.
There is greater opportunity for speed and memory use optimizations. Finally,
adding plug-in modules to REXML should be significantly easier.
Thanks for your attention,
Sean