Ruby/REXML vs XSLT

I'm just looking at 5000 lines of the gnarliest XSLT that generates out of XML some C to pack and unpack a serial protocol.

Ooo, it's ugly, ugly, ugly.

Anyone ever tried to do something in both Ruby REXML and the same thing in XSLT?

Was it prettier in Ruby?

Was it easier? Fewer lines of Code? (by what ratio)

How about speed? The XSLT is chewing on 14500 lines of XML in (almost) too long a time.

I itch to rewrite it.

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : john.carter@tait.co.nz
New Zealand

Carter's Clarification of Murphy's Law.

"Things only ever go right so that they may go more spectacularly wrong later."

From this principle, all of life and physics may be deduced.

John Carter wrote:

I'm just looking at 5000 lines of the gnarliest XSLT that generates out of XML some C to pack and unpack a serial protocol.

Ooo, it's ugly, ugly, ugly.

Anyone ever tried to do something in both Ruby REXML and the same thing in XSLT?

Um, not exactly. I've tried doing some XSLT stuff in Ruby, ran into various issues (lack of a spec-complete Ruby XSLT engine being one of them), and found that using the REXML stream parser made life easier overall.

Was it prettier in Ruby?

By far.

Was it easier? Fewer lines of Code? (by what ratio)

Easier in most ways, and easier overall. The logic is different, so if you're used to doing a transformation in XSLT (an ostensibly functional language) and then try to do the same thing in REXML, you need to shift your perspective.

How about speed? The XSLT is chewing on 14500 lines of XML in (almost) too long a time.

What XSLT engine are you using? I've found that a big bottleneck can be having to load a large document into memory for processing. Using a stream or pull parser alleviates much of that, but that may not be an option (depends on the sort of document and the nature of the transformation).

But if your source XML can be broken into small subsets and transformed in segments, stream or pull transformation may be a good choice.
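
For what it's worth, here is a minimal sketch of the stream approach with REXML. The protocol/message/field layout and the FieldCollector class are made up for illustration (the real schema isn't shown in this thread); REXML::StreamListener and REXML::Document.parse_stream are the actual REXML pieces. The listener gets a callback as each tag goes by, so no DOM tree is built.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical sample input; the real protocol description will differ.
STREAM_SAMPLE_XML = <<~XML
  <protocol>
    <message name="Ping">
      <field name="seq" type="uint8"/>
      <field name="flags" type="uint8"/>
    </message>
  </protocol>
XML

# Collects the name attribute of every <field> start tag as it streams past.
class FieldCollector
  include REXML::StreamListener

  attr_reader :names

  def initialize
    @names = []
  end

  # Called by the stream parser for each start tag; attrs is a Hash.
  def tag_start(name, attrs)
    @names << attrs['name'] if name == 'field'
  end
end

# Runs the stream parser over the XML and returns the collected field names.
def stream_field_names(xml)
  listener = FieldCollector.new
  REXML::Document.parse_stream(xml, listener)
  listener.names
end

p stream_field_names(STREAM_SAMPLE_XML)
```

Whether this helps depends on the transformation: it only works if each piece of output can be produced from what has been seen so far.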

I itch to rewrite it.

Go for it.

--

http://www.ruby-doc.org - The Ruby Documentation Site
http://www.rubyxml.com - News, Articles, and Listings for Ruby & XML
http://www.rubystuff.com - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com - Playing with Better Toys

John Carter schrieb:

Anyone ever tried to do something in both Ruby REXML and the same thing in XSLT?

Look at Moving Away From Xslt

Regards,
Pit

Slightly related to this - I think Rich and Chad did something along
those lines with Ruby-JDWP:

http://rubyforge.org/projects/rubyjdwp/

Rich wrote Ruby code to parse the protocol spec and generate Ruby
classes that could read the data stream:

http://rubyforge.org/cgi-bin/viewcvs.cgi/rubyjdwp/lib/jdwp/spec/?cvsroot=rubyjdwp

Yours,

tom

···

On Tue, 2005-06-21 at 10:36 +0900, John Carter wrote:

I'm just looking at 5000 lines of the gnarliest XSLT that generates
out of XML some C to pack and unpack a serial protocol.

Was it easier? Fewer lines of Code? (by what ratio)

Easier in most ways, and easier overall. The logic is different, so if you're used to doing a transformation in XSLT (an ostensibly functional language) and then try to do the same thing in REXML, you need to shift your perspective.

My impression is XSLT is excellent at doing small pattern match and templating tasks, and just plain lousy at doing substantial logic.

This task seems to be substantially string manipulation.
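
To give a feel for the string-manipulation side, here is a rough sketch of emitting a C pack function from Ruby. Everything in it is hypothetical: the field list, the struct name and the shape of the generated function are invented for illustration, not taken from the actual protocol. The generated C simply memcpy's each member in host byte order.

```ruby
# Hypothetical message description; in practice this would come from the XML.
PING_FIELDS = [
  { name: 'seq',    type: 'uint8_t'  },
  { name: 'length', type: 'uint16_t' }
].freeze

# Builds the text of a C function that copies each struct member into a
# byte buffer and returns the number of bytes written.
def c_pack_function(message, fields)
  lines = []
  lines << "size_t pack_#{message}(const struct #{message} *msg, unsigned char *buf)"
  lines << '{'
  lines << '    size_t off = 0;'
  fields.each do |f|
    lines << "    memcpy(buf + off, &msg->#{f[:name]}, sizeof(#{f[:type]}));"
    lines << "    off += sizeof(#{f[:type]});"
  end
  lines << '    return off;'
  lines << '}'
  lines.join("\n") + "\n"
end

puts c_pack_function('ping', PING_FIELDS)
```

The loop over fields is an ordinary Ruby each with interpolation, which is the part that tends to get verbose in a template language.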

How about speed? The XSLT is chewing on 14500 lines of XML in (almost) too long a time.

What XSLT engine are you using?

xalan. It's a Java implementation.

I've found that a big bottleneck can be having to load a large document into memory for processing. Using a stream or pull parser alleviates much of that, but that may not be an option (depends on the sort of document and the nature of the transformation).

I have used REXML before, but I can't remember whether it had a pull parser or not.

What I need is the ability to rapidly pull the XML document into native Array and Hash objects which would be _much_ smaller than the corresponding XML.
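
A minimal sketch of that idea, assuming a made-up protocol/message/field layout (the real schema is not shown here): read the document once with the REXML DOM, copy what matters into a Hash of Arrays, and work from those afterwards. The load_messages helper is hypothetical.

```ruby
require 'rexml/document'

# Hypothetical sample input; element and attribute names are assumptions.
NATIVE_SAMPLE_XML = <<~XML
  <protocol>
    <message name="Ping">
      <field name="seq" type="uint8"/>
    </message>
    <message name="Status">
      <field name="code" type="uint16"/>
      <field name="flags" type="uint8"/>
    </message>
  </protocol>
XML

# Returns { message name => [ { name:, type: }, ... ] } built from the XML.
def load_messages(xml)
  doc = REXML::Document.new(xml)
  messages = {}
  doc.elements.each('protocol/message') do |msg|
    fields = msg.elements.to_a('field').map do |f|
      { name: f.attributes['name'], type: f.attributes['type'] }
    end
    messages[msg.attributes['name']] = fields
  end
  messages
end

p load_messages(NATIVE_SAMPLE_XML)
```

Note that this version still builds the full DOM once before copying; only the later processing benefits from the smaller native structures.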

···

On Tue, 21 Jun 2005, James Britt wrote:

Quoting Pit Capitain <pit@capitain.de>:

John Carter schrieb:
> Anyone ever tried to do something in both Ruby REXML and the same thing
> in XSLT?

Look at Moving Away From Xslt

In my opinion, the best thing about XSLT is XPath. XQuery also supports the use
of XPath, but doesn't use XML syntax. This makes it much more compact than
XSLT. I think XQuery solves the issues raised in Martin Fowler's article.

A good XQuery implementation is Saxon. This also supports XSLT.

···

--
R. Mark Volkmann
Partner, Object Computing, Inc.

John Carter wrote:

Was it easier? Fewer lines of Code? (by what ratio)

Easier in most ways, and easier overall. The logic is different, so if you're used to doing a transformation in XSLT (an ostensibly functional language) and then try to do the same thing in REXML, you need to shift your perspective.

My impression is XSLT is excellent at doing small pattern match and templating tasks, and just plain lousy at doing substantial logic.

Logic in XSLT requires an appreciation of functional programming. XSLT is really quite good at complex matching and templating, especially if you need to grab and match stuff from all over the document, or when you are not quite sure where something will be.

But for highly regular data sources it can be overkill.

This task seems to be substantially string manipulation.

How about speed? The XSLT is chewing on 14500 lines of XML in (almost) too long a time.

What XSLT engine are you using?

xalan. It's a Java implementation.

Oh, sorry, I thought you had tried this in Ruby + XSLT.

I've found that a big bottleneck can be having to load a large document into memory for processing. Using a stream or pull parser alleviates much of that, but that may not be an option (depends on the sort of document and the nature of the transformation).

I have used REXML before, but I can't remember whether it had a pull parser or not.

Yes, it does.

What I need is the ability to rapidly pull the XML document into native Array and Hash objects which would be _much_ smaller than the corresponding XML.

I've written a magazine article describing how to do XML transformations with REXML's pull parser, but it is currently in editorial limbo.

If the source data has readily identifiable demarcation points (e.g., a particular element or attribute), one can use the pull parser to keep yanking content off the input stream, stashing it in a buffer. When the demarcation point is encountered, the buffer can be processed using the REXML DOM and XPath, saved off someplace, and cleared for the next round.
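
A rough sketch of the buffered approach, not the code from the article. It assumes a made-up layout in which each message element is the demarcation point; each_message and message_summaries are hypothetical helpers. The pull parser events are turned into a small REXML::Element subtree, which is handed over for XPath processing when the message closes and then dropped.

```ruby
require 'rexml/document'
require 'rexml/parsers/pullparser'

# Hypothetical sample input; element and attribute names are assumptions.
BUFFERED_SAMPLE_XML = <<~XML
  <protocol>
    <message name="Ping">
      <field name="seq" type="uint8"/>
    </message>
    <message name="Status">
      <field name="code" type="uint16"/>
      <field name="flags" type="uint8"/>
    </message>
  </protocol>
XML

# Yields one small DOM subtree per <message>, built from pull events.
# Text events are ignored here because the sample carries data in attributes.
def each_message(xml)
  parser = REXML::Parsers::PullParser.new(xml)
  stack = [] # elements currently open inside a <message>
  while parser.has_next?
    event = parser.pull
    if event.start_element?
      # Skip anything outside a <message>, such as the <protocol> wrapper.
      next if stack.empty? && event[0] != 'message'

      element = REXML::Element.new(event[0])
      event[1].each { |key, value| element.add_attribute(key, value) }
      stack.last.add_element(element) unless stack.empty?
      stack << element
    elsif event.end_element? && !stack.empty?
      finished = stack.pop
      yield finished if stack.empty? # demarcation point reached
    end
  end
end

# Processes each buffered subtree with XPath and keeps only a summary.
def message_summaries(xml)
  summaries = []
  each_message(xml) do |msg|
    names = REXML::XPath.match(msg, 'field').map { |f| f.attributes['name'] }
    summaries << [msg.attributes['name'], names]
  end
  summaries
end

p message_summaries(BUFFERED_SAMPLE_XML)
```

Only one message's worth of DOM exists at a time, which is the point of the exercise.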

Better, if you can do this, is to keep pulling content and processing it right away, based on the current element/attribute values, avoiding the intermediate DOM objects. This typically requires your code to track more state in order to know what to do at any given point in the process.
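
As a sketch of the direct style, again with a made-up message/field layout and a hypothetical field_lines helper: the only state tracked is the name of the message currently open, and each field is turned into output as soon as its start tag is pulled.

```ruby
require 'rexml/parsers/pullparser'

# Hypothetical sample input; element and attribute names are assumptions.
DIRECT_SAMPLE_XML = <<~XML
  <protocol>
    <message name="Ping">
      <field name="seq" type="uint8"/>
    </message>
    <message name="Status">
      <field name="code" type="uint16"/>
    </message>
  </protocol>
XML

# Produces one output line per <field> without building any DOM objects.
def field_lines(xml)
  parser = REXML::Parsers::PullParser.new(xml)
  current_message = nil # state: which <message> we are inside, if any
  lines = []
  while parser.has_next?
    event = parser.pull
    if event.start_element?
      attrs = event[1]
      case event[0]
      when 'message'
        current_message = attrs['name']
      when 'field'
        lines << "#{current_message}.#{attrs['name']}: #{attrs['type']}"
      end
    elsif event.end_element? && event[0] == 'message'
      current_message = nil
    end
  end
  lines
end

puts field_lines(DIRECT_SAMPLE_XML)
```

With deeper nesting the single variable would grow into a stack or a small state machine, which is the extra bookkeeping mentioned above.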

James

···

I have tried the same problem in both, actually.

XSLT is a conceptually better way of looking at the problem, but Ruby
is of course a far superior language. My conclusion - either way is
going to be hard to grok for any problem with significant depth.

XPath is the real workhorse of XSLT... keeping it and throwing the
rest away would be a good start. I haven't looked at XQuery in any
depth.
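
For anyone who wants XPath without the rest of XSLT, REXML exposes it directly. A small sketch, assuming a made-up protocol/message/field document; the fields_of helper is hypothetical, and REXML::XPath.match is the real call.

```ruby
require 'rexml/document'

# Hypothetical sample input; element and attribute names are assumptions.
XPATH_SAMPLE_XML = <<~XML
  <protocol>
    <message name="Ping">
      <field name="seq" type="uint8"/>
    </message>
    <message name="Status">
      <field name="code" type="uint16"/>
      <field name="flags" type="uint8"/>
    </message>
  </protocol>
XML

# Returns the field names of one message, selected with an XPath predicate.
# The message name is interpolated unescaped, so it must not contain quotes.
def fields_of(doc, message_name)
  path = "//message[@name='#{message_name}']/field"
  REXML::XPath.match(doc, path).map { |field| field.attributes['name'] }
end

xpath_doc = REXML::Document.new(XPATH_SAMPLE_XML)
p fields_of(xpath_doc, 'Status')
```

The selection is done by XPath and everything after it is ordinary Ruby, so the logic doesn't have to be expressed as templates.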

···

On 6/22/05, R. Mark Volkmann <mark@ociweb.com> wrote:

Quoting Pit Capitain <pit@capitain.de>:

> John Carter schrieb:
> > Anyone ever tried to do something in both Ruby REXML and the same thing
> > in XSLT?
>
> Look at Moving Away From Xslt

In my opinion, the best thing about XSLT is XPath. XQuery also supports the use
of XPath, but doesn't use XML syntax. This makes it much more compact than
XSLT. I think XQuery solves the issues raised in Martin Fowler's article.

A good XQuery implementation is Saxon. This also supports XSLT.

--
R. Mark Volkmann
Partner, Object Computing, Inc.

--
spooq