>> Hi,
>> I'm trying to build an HTML page indexer in Ruby and I'd like to be able
>> to use DOM and/or XPath on a document. The application is currently
>> using REXML
> Yes, REXML can be awkward if you're used to using the DOM. IMHO.
Why do you say that? REXML provides an XML DOM in much the same way as
other XML libs. You can even use XPath queries.
Not sure what you mean by similar. It's similar in that there is a tree
of elements that can be manipulated, but it isn't similar to anything
called DOM.
In REXML, an Element is a REXML::Element, which is a REXML::Parent,
which is a REXML::Child (huh?), which includes REXML::Node.
There is no NodeList, createTextNode(), getElementById(), etc...
To get an element by its ID, I'd have to say something like:
my_document.root.elements.each("//*[@id='crap']") do |crap|
  # do something with crap
end
I would have liked to be able to use the DOM when using REXML;
unfortunately, REXML doesn't really support it.
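
For illustration, here is a minimal sketch of a DOM-style ID lookup built on
REXML's XPath support. The helper name get_element_by_id and the sample
markup are hypothetical, not part of REXML. The sketch assumes well-formed
input and an ID that contains no single quotes.

```ruby
require 'rexml/document'

# Hypothetical helper (not part of REXML): approximates DOM's
# getElementById by asking XPath for the first element, at any depth,
# whose id attribute equals the given value. Returns nil if none matches.
# Assumes the id contains no single quotes, since it is interpolated
# directly into the XPath string.
def get_element_by_id(doc, id)
  REXML::XPath.first(doc, "//*[@id='#{id}']")
end

# Sample well-formed markup, made up for this example.
doc = REXML::Document.new(<<~XML)
  <html><body><div id="crap">hello</div><p id="other">bye</p></body></html>
XML

el = get_element_by_id(doc, 'crap')
puts el.name
puts el.text
```

This only works when the input parses as XML, so it does not help with
arbitrary real-world HTML.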
>> Is there a way to make REXML more permissive or is there another library
> There are libxml bindings for Ruby, but I recall that library missing
> getElementsByTagName and getElementById. Though it does have a method
> to query the DOM via XPath.
libxml won't help as Victor is not processing XML.
That should be fine.
> Have you tried using REXML's SAX2 parser? I think it would be better
> suited for your problem.
No, his problem is that he used an XML tool to process HTML.
You're right. He should never have been using REXML.
-Skye
On Jun 23, 1:48 pm, Robert Klemme <shortcut...@googlemail.com> wrote:
On 23.06.2009 19:24, SkyeShaw!@#$ wrote:
> On Jun 23, 8:55 am, Victor Tanvuia <victor.tanv...@tantanprod.com> wrote: