I bit the bullet and went to ruby cvs last week from 1.6.8.
The new libraries are great! Today I am playing with rexml and
open-uri. Something is b0rked, though.
I tried parsing my homepage since it’s xhtml, and I don’t have
loads of xml files knocking about.
open-uri loads the doc fine (nice lib, incidentally), but rexml has
some problems with it. If I remove this line:
from , it all works ok. Anything I’m missing?
Here’s the program, both inputs and the output (I cut out all the
bits that didn’t seem relevant):
1rasputin@lb:xml$ cat wtf.xml
oh dear
hmm
1rasputin@lb:xml$ cat poc.rb
#!/data/ruby/bin/ruby -w
require "rexml/document"
require "open-uri"
xml = open(ARGV[0])
doc = REXML::Document.new xml
1rasputin@lb:xml$ ./poc.rb wtf.xml
/data/ruby/lib/ruby/1.9/rexml/parsers/baseparser.rb:291:in `pull': Missing end tag for 'head' (got "html") (REXML::ParseException)
Line: 9
Position: 280
Last 80 unconsumed characters:
from /data/ruby/lib/ruby/1.9/rexml/document.rb:180:in `build'
from /data/ruby/lib/ruby/1.9/rexml/document.rb:44:in `initialize'
from ./poc.rb:6:in `new'
from ./poc.rb:6
1rasputin@lb:xml$ ./poc.rb ok.xml
1rasputin@lb:xml$ diff ok.xml wtf.xml
5a6
1rasputin@lb:xml$
--
Serenity through viciousness.
Rasputin :: Jack of All Trades - Master of Nuns
I think you need quotation marks around the attribute value,
i.e. 'rel=stylesheet' should be 'rel="stylesheet"'.
XML does not allow unquoted attribute values, even though
HTML tolerates them.
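For anyone who wants to reproduce this without the original files, here is a minimal sketch. The two documents are made-up stand-ins (the real wtf.xml and ok.xml are not shown in this thread), and `well_formed?` is a hypothetical helper, not part of REXML. The only difference between the inputs is the quoting of the rel value.

```ruby
require "rexml/document"

# Hypothetical stand-ins for wtf.xml and ok.xml.
BAD_XML  = '<html><head><link rel=stylesheet href="style.css"/></head><body/></html>'
GOOD_XML = '<html><head><link rel="stylesheet" href="style.css"/></head><body/></html>'

# Hypothetical helper: true if REXML can parse the source, false if it
# raises a parse error.
def well_formed?(source)
  REXML::Document.new(source)
  true
rescue REXML::ParseException
  false
end

puts well_formed?(BAD_XML)
puts well_formed?(GOOD_XML)
```

The exact error text depends on the REXML version, so the sketch only checks whether parsing succeeds.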
Thanks, I thought I’d run tidy on it, but obviously without the
option to enforce xml compliance.
Sorry for the noise, I’ll get back to playing…
Grrr. I’ve got to sit down one of these days and improve REXML’s
error reporting. It isn’t quite a bug, but it is certainly frustrating
that REXML didn’t tell you that the problem was that quotes were
missing.
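Until the parser's own messages improve, a caller can add a hint of its own. The following is only a sketch of one way to do that: `unquoted_attributes` and `explain_parse_failure` are hypothetical helpers, and the regular expressions are a rough heuristic (they are not how REXML works internally and can be fooled by unusual markup).

```ruby
require "rexml/document"

# Rough heuristic, not part of REXML: collect attribute names whose
# values are not wrapped in quotes.
def unquoted_attributes(source)
  source.scan(/<[^<>!?][^<>]*>/).flat_map do |tag|
    # Blank out quoted values first so their contents cannot match.
    stripped = tag.gsub(/"[^"]*"|'[^']*'/, '""')
    stripped.scan(/\s([\w:.-]+)\s*=(?!\s*["'])/).flatten
  end
end

# Returns nil if the source parses, otherwise the first line of the
# parser's message plus a hint about any unquoted attribute values.
def explain_parse_failure(source)
  REXML::Document.new(source)
  nil
rescue REXML::ParseException => e
  names = unquoted_attributes(source)
  hint = names.empty? ? "" : " (unquoted attribute value: #{names.join(', ')})"
  e.message.lines.first.to_s.strip + hint
end

# Made-up input for illustration.
puts explain_parse_failure('<head><link rel=stylesheet href="s.css"/></head>')
```

The pre-scan runs only after a failure, so well-formed documents pay no extra cost.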