Ruby in XML

John_Carter · 8 July 2005 00:22

I have just stuck this on..
http://www.rubygarden.org/ruby?RubyInXML

I like XML.

There is a firm standard, there is a rich toolset to work on it.

I like HTML. It is simply the fastest way to deliver good looking documents to the widest audience.

No surprise, then, that I like XHTML: it _is_ HTML in XML. I can validate my XHTML documents and know that they conform exactly to the standard, and hence will render properly on a wide set of browsers.

I love ruby. It is quite the easiest way to program. It has a lovely XML API called REXML.

I sometimes need to do spreadsheet sort of things. Basically a document that describes my reasoning and findings, supported by numbers.

A long time ago, when I still did Perl, I found by actual trials that I was about as fast in Perl as the average guy is using a spreadsheet. Sometimes faster, sometimes slower. But for the next hundred data sets, my Perl scripts were a thousand times faster.

So I don't do spreadsheets these days, I write ruby scripts.

So I have taken to combining Ruby & HTML. Sometimes via cgi. It works for me.

But sometimes I have documents that are more HTML than ruby. So it makes sense to write them in HTML, with a bit of Ruby embedded. That's where erb and eruby live.

But I don't like erb and eruby's tags. I can't validate my XHTML.

So add REXML and I present a very small script I call rubyexml. Ruby Embedded in XML.

#!/usr/bin/ruby -w

require 'rexml/document'
require 'rexml/streamlistener'
require 'pp'

# All eval's are evaluated in the context of an instance of this class.
# Extend this, or add this method to a class of your own.
class Context

   def eval_value( value)
     value.gsub( %r{ \#\{ ( [^\}]+ ) \} }x) do | match|
       instance_eval( $1).to_s
     end
   end
end

# This does the work.
class Listener
include REXML::StreamListener

   def initialize( context)
     @context = context
   end

   def comment( text)
     print @context.instance_eval( text)
   rescue SyntaxError => details
     pp @context
     pp text
     raise "Failed to compile '#{text}' in context : #{details}"
   end

   def tag_start(name,attrs)
     print "<",name
     attrs.each_pair do |key, value|
       print " #{key}=\"#{@context.eval_value( value)}\""
     end
     print ">"
   end

   def tag_end( name)
     print "</", name, ">"
   end

   def text( text)
     print @context.eval_value(text)
   end

   def cdata( ctext)
     text( ctext)
   end
end

# This comes for free from REXML. Stream parse an XML document.
REXML::Document::parse_stream( REXML::SourceFactory::create_from( STDIN), Listener::new( Context.new))
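To see the substitution step on its own, here is a minimal sketch of `eval_value` outside the parser. The `initialize` method and its values are assumptions added for illustration; the Context class above has no initializer.

```ruby
# Same substitution rule as the script's Context#eval_value.
class Context
  def initialize
    # Illustrative values only; not part of the original class.
    @answer = 42
    @file_name = 'pretty_picture.jpg'
  end

  # Replace every #{...} with the result of evaluating its contents
  # against this object.
  def eval_value(value)
    value.gsub(/\#\{([^}]+)\}/) { instance_eval(Regexp.last_match(1)).to_s }
  end
end

ctx = Context.new
puts ctx.eval_value('The answer is #{@answer}')
puts ctx.eval_value('alt="#{@file_name.sub(/\.jpg/, "")}"')
```

Because the pattern stops at the first `}`, an expression that itself contains a closing brace (a block or a hash literal, say) will not be picked up whole.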

So take a chunk of XHTML...

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"xhtml11.dtd" >
<html xmlns="HTTP://www.w3.org/TR/xhtml"
    xmlns:xlink="HTTP://www.w3.org/XML/XLink/0.9"
    xml:lang="en" >
   <head>
      <title>

      </title>
   </head>

   <body>
      <h1>
         The answer to life, the universe and everything is 
      </h1>

      <p>
         The following image is 

         <img src="#{@file_name}" alt = "#{@file_name.sub(/\.jpg/,'')}"/>
      </p>
   </body>
</html>
It validates as correct xml against the XHTML DTD.

Feed it through rubyexml and get...

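<html xmlns:xlink="HTTP://www.w3.org/XML/XLink/0.9" xml:lang="en" xmlns="HTTP://www.w3.org/TR/xhtml">
   <head>
      <title>

      </title>
   </head>

   <body>
      <h1>
         The answer to life, the universe and everything is 42
      </h1>

      <p>
         The following image is pretty_picture.jpg

         <img src="pretty_picture.jpg" alt="pretty_picture"></img>
      </p>
   </body>
</html>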

Just so blooming simple.
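For a round trip you can run without STDIN, here is a hedged sketch of the same listener that collects its output in a string. The comment contents in the template (`@answer = 6 * 7; ...; nil` and `@answer`) are assumptions made up for this example, and `CapturingListener` and `render` are names invented here, not part of rubyexml.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

class Context
  def eval_value(value)
    value.gsub(/\#\{([^}]+)\}/) { instance_eval(Regexp.last_match(1)).to_s }
  end
end

# Like the Listener above, but appends to a string instead of printing.
class CapturingListener
  include REXML::StreamListener

  attr_reader :out

  def initialize(context)
    @context = context
    @out = +''
  end

  # The comment body is Ruby; its value is written to the output.
  def comment(text)
    @out << @context.instance_eval(text).to_s
  end

  def tag_start(name, attrs)
    @out << "<#{name}"
    attrs.each_pair { |key, value| @out << %( #{key}="#{@context.eval_value(value)}") }
    @out << '>'
  end

  def tag_end(name)
    @out << "</#{name}>"
  end

  def text(text)
    @out << @context.eval_value(text)
  end
end

def render(xml, context)
  listener = CapturingListener.new(context)
  REXML::Document.parse_stream(xml, listener)
  listener.out
end

template = '<p><!-- @answer = 6 * 7; @file_name = "pretty_picture.jpg"; nil -->' \
           'The answer is <!-- @answer --><img src="#{@file_name}"/></p>'
puts render(template, Context.new)
```

The first comment ends in `nil` so that the setup code contributes nothing to the output; the second comment evaluates to the value that gets written.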

And if you have a big hairy object that knows all the deeper secrets of life, just change rubyexml to...

REXML::Document::parse_stream( REXML::SourceFactory::create_from( STDIN), Listener::new( BigHairyObjectThatKnowsTheDeeperSecretsOfLife.new))

And you can refer to all its instance variables and methods.
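As a rough sketch of "add this method to a class of your own": the module name `EvalValue` and the `SalesReport` class below are hypothetical, invented only to show an existing object serving as the evaluation context.

```ruby
# The substitution method, packaged so any class can mix it in.
module EvalValue
  def eval_value(value)
    value.gsub(/\#\{([^}]+)\}/) { instance_eval(Regexp.last_match(1)).to_s }
  end
end

# A stand-in for the big hairy object; purely illustrative.
class SalesReport
  include EvalValue

  def initialize(rows)
    @rows = rows
  end

  def total
    @rows.sum
  end
end

report = SalesReport.new([10, 20, 12])
puts report.eval_value('Total: #{total} over #{@rows.size} rows')
```

Both the `total` method and the `@rows` instance variable are reachable from the template text because the expressions are evaluated with `instance_eval` on the report itself.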

It's all so blooming simple!

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : john.carter@tait.co.nz
New Zealand

"At first I hoped that such a technically unsound project would
  collapse but I soon realized it was doomed to success. Almost
  anything in software can be implemented, sold, and even used given
  enough determination. There is nothing a mere scientist can say that
  will stand against the flood of a hundred million dollars. But there
  is one quality that cannot be purchased in this way---and that is
  reliability. The price of reliability is the pursuit of the utmost
  simplicity. It is a price which the very rich find most hard to
  pay." -- C.A.R. Hoare in The Emperor's Old Clothes,
           Turing Award Lecture (27 October 1980)

David_Mitchell · 8 July 2005 00:45

How does this work for loops? For example this won't work:

<img src="#{@i}.jpg"/>

How would you suggest I achieve this? Perhaps:

I can quickly see that becoming an escaping nightmare.

I really like the idea but I would want it to be this flexible.

David

John Carter wrote:

···

So take a chunk of XHTML...

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"xhtml11.dtd" >
<html xmlns="HTTP://www.w3.org/TR/xhtml"
   xmlns:xlink="HTTP://www.w3.org/XML/XLink/0.9"
   xml:lang="en" >
   <head>
      <title>

    </title>
   </head>

   <body>
      <h1>
         The answer to life, the universe and everything is 
      </h1>

      <p>
         The following image is 

         <img src="#{@file_name}" alt = "#{@file_name.sub(/\.jpg/,'')}"/>
      </p>
   </body>
</html>

It validates as correct xml against the XHTML DTD.

Feed it through rubyexml and get...

<html xmlns:xlink="HTTP://www.w3.org/XML/XLink/0.9" xml:lang="en" xmlns="HTTP://www.w3.org/TR/xhtml">
   <head>
      <title>

    </title>
   </head>

   <body>
      <h1>
         The answer to life, the universe and everything is 42
      </h1>

      <p>
         The following image is pretty_picture.jpg

         <img src="pretty_picture.jpg" alt="pretty_picture"></img>
      </p>
   </body>
</html>


--
David Mitchell
Software Engineer
Telogis

James_Britt4 · 8 July 2005 01:26

John Carter wrote:
...

So add REXML and I present a very small script I call rubyexml. Ruby

...

So take a chunk of XHTML...

<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"xhtml11.dtd" >

...

<body>
      <h1>
         The answer to life, the universe and everything is 
      </h1>

Question: If your document has instructions for processing, why not use processing instructions? Why munge the semantics of the comments syntax?
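A minimal sketch of what that could look like with REXML's stream API, which hands processing instructions to the listener's `instruction` callback. The target name `ruby`, the `PIListener` class, and the `render_pi` helper are assumptions chosen for this example.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Evaluates <?ruby ... ?> processing instructions and leaves comments alone.
class PIListener
  include REXML::StreamListener

  attr_reader :out

  def initialize(context)
    @context = context
    @out = +''
  end

  # REXML calls this for each processing instruction it meets.
  def instruction(name, content)
    return unless name == 'ruby' # other targets are dropped in this sketch

    @out << @context.instance_eval(content.to_s).to_s
  end

  # Comments are ordinary comments again, copied through unchanged.
  def comment(text)
    @out << "<!--#{text}-->"
  end

  def tag_start(name, attrs)
    @out << "<#{name}"
    attrs.each_pair { |key, value| @out << %( #{key}="#{value}") }
    @out << '>'
  end

  def tag_end(name)
    @out << "</#{name}>"
  end

  def text(text)
    @out << text
  end
end

def render_pi(xml, context)
  listener = PIListener.new(context)
  REXML::Document.parse_stream(xml, listener)
  listener.out
end

puts render_pi('<p><!-- a real comment --><?ruby @answer = 6 * 7; nil ?>' \
               'The answer is <?ruby @answer ?></p>', Object.new)
```

With this split, a validator still sees well-formed XML, and a comment can go back to being just a comment.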

Maybe take a look at how Nitro does this.

···

--

http://www.ruby-doc.org - The Ruby Documentation Site
http://www.rubyxml.com - News, Articles, and Listings for Ruby & XML
http://www.rubystuff.com - The Ruby Store for Ruby Stuff
http://www.jamesbritt.com - Playing with Better Toys

Daniel_Brockman · 9 July 2005 01:04

John Carter <john.carter@tait.co.nz> writes:

I can validate my XHTML documents and know that they conform exactly to
the standard, and hence will render properly on a wide set of browsers.

If only it were that simple...

···

--
Daniel Brockman <daniel@brockman.se>

John_Carter · 8 July 2005 01:04

How does this work for loops? For example this won't work:

Yup. Thought about it. Didn't come up with any bright thunks..

Not bright, but will work.

I really like the idea but I would want it to be this flexible.

Given flexible or simple, I chose simple.

Possibly this is merely a lack of imagination on my part.

Perhaps flexible and simple is possible. I wanted it to be able to validate as vanilla XHTML.

But that is what wikis are for. If I missed something, click on "edit this page".

Carter's Compass...

I know I'm on the right track when by deleting code I'm adding
functionality.

···

On Fri, 8 Jul 2005, David Mitchell wrote:

John_Carter · 8 July 2005 04:06

Question: If your document has instructions for processing, why not use processing instructions? Why munge the semantics of the comments syntax?

The short answer is I had read the XML standard so long ago I forgot about them....

I knew they existed, but I feared they had some deep meaning I didn't want to clash with.

I will change to using them.

Thank you,

Maybe take a look at how Nitro does this.

Will do.

Carter's Clarification of Murphy's Law.

"Things only ever go right so that they may go more spectacularly wrong later."

From this principle, all of life and physics may be deduced.

···

On Fri, 8 Jul 2005, James Britt wrote:

David_Mitchell · 8 July 2005 02:32

Hey,

John Carter wrote:

I really like the idea but I would want it to be this flexible.

Given flexible or simple, I chose simple.
..
Perhaps flexible and simple is possible. I wanted it to be able to validate as vanilla XHTML.

Ok, I don't think the example I posted falls outside the bounds of simple. Maybe it makes your script a little more complicated but not the syntax that the end user must use. I suspect if you hold off the actual evaluation of the comments until the page is output then you might find this task simpler. That is, build a single ruby script in memory that contains the logic for outputting the page, then just eval it at the end.
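Here is one hedged sketch of that deferred approach. It assumes comments hold Ruby statements (so a loop can open in one comment and close in another) rather than expressions whose value is printed, which is a change from the posted script. `ScriptBuilder`, `Renderer`, and `render_deferred` are names invented for this example.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Turns the document into Ruby source instead of printing as it goes.
class ScriptBuilder
  include REXML::StreamListener

  attr_reader :script

  def initialize
    @script = +"out = +''\n"
  end

  # Comment bodies are copied into the script as statements.
  def comment(text)
    @script << text.strip << "\n"
  end

  def tag_start(name, attrs)
    emit("<#{name}" + attrs.map { |key, value| %( #{key}="#{value}") }.join + '>')
  end

  def tag_end(name)
    emit("</#{name}>")
  end

  def text(text)
    emit(text)
  end

  private

  # String#dump escapes "#{" so the marker survives until run time.
  def emit(markup)
    @script << "out << subst(#{markup.dump}, binding)\n"
  end
end

class Renderer
  # Expand #{...} using the local variables visible in +scope+.
  def subst(template, scope)
    template.gsub(/\#\{([^}]+)\}/) { scope.eval(Regexp.last_match(1)).to_s }
  end

  def run(script)
    instance_eval(script)
  end
end

def render_deferred(xml)
  builder = ScriptBuilder.new
  REXML::Document.parse_stream(xml, builder)
  Renderer.new.run(builder.script + "out\n")
end

puts render_deferred('<ul><!-- 1.upto(3) do |i| --><li>#{i * 14}</li><!-- end --></ul>')
```

The listener stays a plain stream listener with no loop state of its own; Ruby's own parser handles the nesting when the generated script is evaluated at the end.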

···

--
David Mitchell
Software Engineer
Telogis

George_Moschovitis · 9 July 2005 05:35

Hello,

Nitro already implements this. Have a look at www.nitrohq.com.

The normal way to do this is:

<ul>
<?r for item in items ?>
<li>#{item.title}</li>
<?r end ?>
</ul>

If you include the morphing shader you can also write it as:

<ul>
<li each="item in items">#{item.title}</li>
</ul>

And since Nitro always gives you one more option (sic), you can do:

<ul>
  <% for item in items %>
  <li>#{item.title}</li>
  <% end %>
</ul>

Suit yourself

Of course Nitro can do so much more. You can use XSLT on top of your
xhtml page, or the new cool Elements system (similar to JSP tag
libraries).

regards,
George.

John_Carter · 8 July 2005 03:51

I guess where I started was I wanted to loop on the rows of a table.

<table>

    <tr>
      <td>
         #{i} - 
      </td>
      <td>
         <img src="#{i}.jpg" alt="#{i}"/>
      </td>
    </tr>

</table>

But then I'm holding all kinds of state, and what happens if I want to iterate over the columns of the table as well? (Nested loops.)

<table>

    <tr>
      <td>
         #{i} - 
      </td>

        <td>
           #{i*10 + j}
        </td>

    </tr>

</table>

I'm then operating a fairly hairy state machine in my Listener class or I'm no longer using the one (longish) line StreamParser.

However, if anyone gets the urge to embellish what I have done, as I say, that's what wikis are for.

···

On Fri, 8 Jul 2005, David Mitchell wrote:

Ok, I don't think the example I posted falls outside the bounds of simple. Maybe it makes your script a little more complicated but not the syntax that the end user must use. I suspect if you hold off the actual evaluation of the comments until the page is output then you might find this task simpler.

Christian_Neukirche1 · 9 July 2005 12:33

"George Moschovitis" <george.moschovitis@gmail.com> writes:

And since Nitro, always gives you one more option (sic), you can do:

<ul>
  <% for item in items %>
  <li>#{item.title}</li>
  <% end %>
</ul>

Except that isn't valid XML anymore; maybe a templating library
along the lines of Amrita2 or XTemplate would be more appropriate?

One could imagine a processing instruction at the beginning to setup
the contents, and then let Amrita do all the stuff.

···

George.

--
Christian Neukirchen <chneukirchen@gmail.com> http://chneukirchen.org

David_Mitchell · 8 July 2005 05:13

I hate to break this to you John, but the link you posted is to a blank wiki page. Perhaps you got the wrong link? The page http://www.rubygarden.org/ruby?RubyInXML doesn't exist.

Yes, you end up with a hairy state machine, but it doesn't mean you lose the one-line StreamParser.

Cheers

David

John Carter wrote:

···

On Fri, 8 Jul 2005, David Mitchell wrote:

Ok, I don't think the example I posted falls outside the bounds of simple. Maybe it makes your script a little more complicated but not the syntax that the end user must use. I suspect if you hold off the actual evaluation of the comments until the page is output then you might find this task simpler.

I guess where I started was I wanted to loop on the rows of a table.

<table>
  
    <tr>
      <td>
         #{i} - 
      </td>
      <td>
         <img src="#{i}.jpg" alt="#{i}"/>
      </td>
    </tr>
  
</table>

But then I'm holding all kind of state and what happens if I want to iterate over the columns of the table as well? (Nested loops.)

<table>
  
    <tr>
      <td>
         #{i} - 
      </td>
      
        <td>
           #{i*10 + j}
        </td>
      
    </tr>
  
</table>

I'm then operating a fairly hairy state machine in my Listener class or I'm no longer using the one (longish) line StreamParser.

However, if anyone gets the urge to embellish what I have done, as I say, that's what Wiki's are for.

--
David Mitchell
Software Engineer
Telogis

George_Moschovitis · 11 July 2005 09:50

Except that isn't valid XML anymore; maybe a templating library
along Amrita2 or XTemplate would be more appropriate?

Yep, it isn't valid XHTML, and this is discouraged. I find it useful
when defining, for example, email templates, or if you would like to
reuse some Rails code without many changes...

John_Carter · 8 July 2005 05:39

I hate to break this to you John, but the link you posted is to a blank wiki page. Perhaps you got the wrong link? The page http://www.rubygarden.org/ruby?RubyInXML doesn't exist.

Nope, seems to be all there when I look. From two separate machines.

Yes, you end up with a hairy state machine, but it doesn't mean you lose the one-line StreamParser.

True. I meant the choice was a hairy state machine xor use the "eat whole doc" REXML parser instead of stream parser.
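For comparison, a rough sketch of the "eat whole doc" route, where recursion over the parsed tree takes the place of the state machine. The `each="var in expr"` attribute borrows the shape George showed for Nitro but is only an assumption here, not Nitro's implementation, and it is not part of XHTML, so a template using it would no longer validate against the XHTML DTD. The helper names are invented for this example.

```ruby
require 'rexml/document'

# Replace each #{...} in +template+ with the expression's value in +scope+.
def subst(template, scope)
  template.gsub(/\#\{([^}]+)\}/) { scope.eval(Regexp.last_match(1)).to_s }
end

def render(node, scope)
  case node
  when REXML::Element then render_element(node, scope)
  when REXML::Text    then subst(node.value, scope)
  else '' # comments and other node types are dropped in this sketch
  end
end

# Repeat the element once per item when it carries each="var in expr".
def render_element(element, scope)
  loop_spec = element.attributes['each']
  return render_once(element, scope) unless loop_spec

  var, expr = loop_spec.split(/\s+in\s+/, 2)
  scope.eval(expr).map do |item|
    scope.local_variable_set(var.to_sym, item)
    render_once(element, scope)
  end.join
end

def render_once(element, scope)
  attrs = +''
  element.attributes.each do |name, value|
    attrs << %( #{name}="#{subst(value, scope)}") unless name == 'each'
  end
  body = element.children.map { |child| render(child, scope) }.join
  "<#{element.name}#{attrs}>#{body}</#{element.name}>"
end

doc = REXML::Document.new(
  '<table><tr each="i in 1..2"><td each="j in 1..2">#{i * 10 + j}</td></tr></table>'
)
puts render(doc.root, binding)
```

Nested loops fall out of the recursion: the inner `td` is rendered once per `j` for every `i`, with both variables held in the same binding.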

···

On Fri, 8 Jul 2005, David Mitchell wrote:

Dominik_Bathon · 8 July 2005 11:11

You have probably been caught by the RubyGarden tarpit:

http://www.ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-talk/137405

···

On Fri, 08 Jul 2005 07:13:17 +0200, David Mitchell <david.mitchell@telogis.com> wrote:

I hate to break this to you John, but the link you posted is to a blank wiki page. Perhaps you got the wrong link? The page http://www.rubygarden.org/ruby?RubyInXML doesn't exist.

John_Carter · 10 July 2005 21:05

Hmm. Unfortunately I live behind a large corporate firewall. So reverse DNS is never going to work right.

I tried putting in other wiki links to that page and found that I had to wade through the entire existing page and find every existing http: and convert it to HTTP:.

I gave up on that pretty fast.

···

On Fri, 8 Jul 2005, Dominik Bathon wrote:

On Fri, 08 Jul 2005 07:13:17 +0200, David Mitchell <david.mitchell@telogis.com> wrote:

I hate to break this to you John, but the link you posted is to a blank wiki page. Perhaps you got the wrong link? The page http://www.rubygarden.org/ruby?RubyInXML doesn't exist.

You have probably been caught by the RubyGarden tarpit:

http://www.ruby-talk.org/cgi-bin/scat.rb/ruby/ruby-talk/137405

Jim_Weirich1 · 11 July 2005 01:02

Hmm. Unfortunately I live behind a large corporate firewall. So reverse
DNS is never going to work right.

As long as it resolves back to your corporate firewall, it should be OK. I
live behind an extremely unfriendly firewall, and it causes no problems for
the wiki.

And if it does cause problems, just define a preferences
setting (which sets a cookie in your browser). That will make you avoid the
tarpit as well.

I tried putting in other wiki links to that page and found that I
had to wade through the entire existing page and find every existing http:
and convert it to HTTP:.

Yea, that is pretty annoying. I disabled that feature tonight. Although
helpful in the short run, it did nothing for long term spam avoidance. You
should be able to post links using http: again.

···

On Sunday 10 July 2005 05:05 pm, John Carter wrote:

--
-- Jim Weirich jim@weirichhouse.org http://onestepback.org
-----------------------------------------------------------------
"Beware of bugs in the above code; I have only proved it correct,
not tried it." -- Donald Knuth (in a memo to Peter van Emde Boas)
