Problems with scRUBYt

Cs_Webgrl · 30 December 2008 23:33

Hi.

I am currently scraping a page with scRUBYt and am not getting the
results I expected.

Instead of a correctly formatted XML document, I'm getting the
following:

<record>
  <1>book a</1>
  <1>book b</1>
  <1>book c</1>
  <2>chapter aa</2>
  <2>chapter bb</2>
  <2>chapter cc</2>
  <3>verse aaa</3>
  <3>verse bbb</3>
  <3>verse ccc</3>
</record>

My code looks like this:

listing "//a[@id*='volume'>" do
  book "//a[@class='1']"
  chapter "//span[@class='2']"
  verse "//a[@id*='3']"
end

Any ideas?

Sorry for the sample data, but hopefully someone has seen this before
and can help.


--
Posted via http://www.ruby-forum.com/.

Aaron_Patterson1 · 31 December 2008 00:37

> Hi.
>
> I am currently scraping a page with scRUBYt and am not getting the
> results I expected.
>
> Instead of a correctly formatted XML document, I'm getting the
> following:
>
> <record>
>   <1>book a</1>
>   <1>book b</1>
>   <1>book c</1>
>   <2>chapter aa</2>
>   <2>chapter bb</2>
>   <2>chapter cc</2>
>   <3>verse aaa</3>
>   <3>verse bbb</3>
>   <3>verse ccc</3>
> </record>

This is a correctly formatted XML document. You just have numbers for
tag names.

> My code looks like this:
>
> listing "//a[@id*='volume'>" do
>   book "//a[@class='1']"
>   chapter "//span[@class='2']"
>   verse "//a[@id*='3']"
> end
>
> Any ideas?

Have you tried something like this:

book "//2[@id='whatevs']"

That should get you access to the tags.

Hope that helps!


On Wed, Dec 31, 2008 at 08:33:26AM +0900, Cs Webgrl wrote:

--
Aaron Patterson
http://tenderlovemaking.com/

Cs_Webgrl · 31 December 2008 01:02

Aaron Patterson wrote:

> Have you tried something like this:
>
> book "//2[@id='whatevs']"
>
> That should get you access to the tags.

This gives me a ton of data, but now I have lost the specific pieces of
information that I'm looking for. Instead, the output looks like all of
the source code of that page. Should I change something else in the
code to get the specific data that I need?
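
A sketch of one way to get one record per listing, written outside
scRUBYt with REXML from Ruby's standard distribution. It rests on
assumptions the thread does not confirm: that the flat output comes from
every child path starting with `//`, which in XPath selects from the
whole document rather than from the current listing, and that each book,
chapter, and verse is nested inside an element whose id contains
"volume". The markup, ids, and class names below are hypothetical. Note
also that `*=` is a CSS attribute-selector operator; XPath 1.0 has no
such operator, and its closest equivalent is `contains(@id, 'volume')`.

```ruby
require "rexml/document"

# Hypothetical page structure, invented for illustration only.
PAGE = <<~XML
  <html>
    <body>
      <div id="volume-1">
        <a class="book">book a</a>
        <span class="chapter">chapter aa</span>
        <a class="verse">verse aaa</a>
      </div>
      <div id="volume-2">
        <a class="book">book b</a>
        <span class="chapter">chapter bb</span>
        <a class="verse">verse bbb</a>
      </div>
      <div id="volume-3">
        <a class="book">book c</a>
        <span class="chapter">chapter cc</span>
        <a class="verse">verse ccc</a>
      </div>
    </body>
  </html>
XML

doc = REXML::Document.new(PAGE)

# Absolute path: collects every book on the page in one flat list,
# with no link back to the listing each one belongs to.
flat_books = REXML::XPath.match(doc, "//a[@class='book']").map(&:text)

# Relative paths: select each listing first, then search only inside it.
# The leading "." anchors the query at the current listing element.
records = REXML::XPath.match(doc, "//div[contains(@id, 'volume')]").map do |listing|
  {
    book:    REXML::XPath.first(listing, ".//a[@class='book']").text,
    chapter: REXML::XPath.first(listing, ".//span[@class='chapter']").text,
    verse:   REXML::XPath.first(listing, ".//a[@class='verse']").text
  }
end

records.each { |record| puts record.inspect }
```

If the same reasoning applies inside scRUBYt, the child patterns would
need to be relative to the listing in the same way. That is a guess to
test, not documented behavior.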


--
Posted via http://www.ruby-forum.com/.
