From a URL to XPath 2.0

Evan_Senter · 20 February 2008 21:31

Hi,

I am trying to write a small script that lets me scrape HTML using
XPath 2.0. As much as I enjoyed using Hpricot, its lack of support for
indexed paths has forced me to look for a different tool (I've heard
REXML has the best XPath support). To use REXML, however, I first need
to convert the HTML to XML, and I have yet to find a good gem or plugin
to do that.

As I mentioned, however, my main interest is index support for XPath
queries against an arbitrary HTML page pulled from a generated URL.
Does anyone know of a good approach to handle this?

Thank you,

Ruby.new(user)

--
Posted via http://www.ruby-forum.com/.

Guillaume_Carbonneau · 21 February 2008 05:03

Hi, you might want to try HTML Tidy to convert the HTML to XML first.

Project: http://tidy.sourceforge.net/
try it online (output XML): HTML Tidy Online
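
Once the page has been tidied into well-formed XML, an indexed XPath query with REXML might look roughly like the sketch below. The sample markup, the constant `XHTML`, and the helper `nth_item_text` are made up for illustration, and the sample deliberately carries no XML namespace; real Tidy output usually declares the XHTML namespace, which the query would then have to account for. Note also that REXML implements XPath 1.0 rather than 2.0, though positional predicates such as `li[2]` are part of 1.0.

```ruby
require 'rexml/document'

# Hypothetical stand-in for a page that has already been converted
# to well-formed XML (for example by HTML Tidy). No namespace is
# declared here, which keeps the XPath expression simple.
XHTML = <<~XML
  <html>
    <body>
      <ul>
        <li>first</li>
        <li>second</li>
        <li>third</li>
      </ul>
    </body>
  </html>
XML

# Returns the text of the index-th <li> (XPath positions are 1-based),
# or nil when no such element exists.
def nth_item_text(xml, index)
  doc = REXML::Document.new(xml)
  node = REXML::XPath.first(doc, "/html/body/ul/li[#{Integer(index)}]")
  node && node.text
end

puts nth_item_text(XHTML, 2)
```

In a real script the XML string would come from fetching the URL and running the response through Tidy; that step is left out here so the example runs without network access.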

--
View this message in context: http://www.nabble.com/From-a-URL-to-XPath-2.0-tp15599428p15604926.html
Sent from the ruby-talk mailing list archive at Nabble.com.
