Cs_Webgrl
(Cs Webgrl)
30 December 2008 23:33
1
Hi.
I am currently scraping a page with scRUBYt and am not getting the
results I expected.
Instead of a correctly formatted XML document, I'm getting the
following:
<record>
  <1>book a</1>
  <1>book b</1>
  <1>book c</1>
  <2>chapter aa</2>
  <2>chapter bb</2>
  <2>chapter cc</2>
  <3>verse aaa</3>
  <3>verse bbb</3>
  <3>verse ccc</3>
</record>
My code looks like this:
listing "//a[@id *='volume']" do
  book "//a[@class='1']"
  chapter "//span[@class='2']"
  verse "//a[@id *='3']"
end
Any ideas?
Sorry for the sample data, but hopefully someone has seen this before
and can help.
--
Posted via http://www.ruby-forum.com/ .
Hi.
I am currently scraping a page with scRUBYt and am not getting the
results I expected.
Instead of a correctly formatted XML document, I'm getting the
following:
<record>
  <1>book a</1>
  <1>book b</1>
  <1>book c</1>
  <2>chapter aa</2>
  <2>chapter bb</2>
  <2>chapter cc</2>
  <3>verse aaa</3>
  <3>verse bbb</3>
  <3>verse ccc</3>
</record>
This is a correctly formatted XML document. You just have numbers for
tag names.
My code looks like this:
listing "//a[@id *='volume']" do
  book "//a[@class='1']"
  chapter "//span[@class='2']"
  verse "//a[@id *='3']"
end
Any ideas?
Have you tried something like this:
book "//2[@id='whatevs']"
That should get you access to the tags.
Hope that helps!
On Wed, Dec 31, 2008 at 08:33:26AM +0900, Cs Webgrl wrote:
--
Aaron Patterson
http://tenderlovemaking.com/
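A side note on the numeric tag names discussed in the reply above: under the XML 1.0 naming rules an element name cannot begin with a digit, so a strict parser may reject tags such as <1> even though the nesting is fine. The sketch below is a rough, ASCII-only approximation of that rule in plain Ruby. The constant and the helper name are made up for illustration and are not part of scRUBYt or any XML library.

```ruby
# Simplified, ASCII-only approximation of the XML 1.0 Name rule:
# the first character is a letter, underscore, or colon; later
# characters may also be digits, hyphens, or periods.
XML_NAME_ASCII = /\A[A-Za-z_:][A-Za-z0-9_:.\-]*\z/

# Hypothetical helper, not part of scRUBYt or any XML library.
def plausible_xml_name?(name)
  XML_NAME_ASCII.match?(name)
end

# Compare the placeholder names from the sample with descriptive ones.
%w[1 2 3 book chapter verse].each do |name|
  puts "#{name}: #{plausible_xml_name?(name)}"
end
```

If the real page uses descriptive pattern names such as book, chapter, and verse, this caveat does not apply.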
Cs_Webgrl
(Cs Webgrl)
31 December 2008 01:02
3
Aaron Patterson wrote:
Have you tried something like this:
book "//2[@id='whatevs']"
That should get you access to the tags.
This gives me a ton of data, but now I have lost the specific pieces of
information that I'm looking for. Instead, the output looks like all of
the source code on that page. Should I have changed something else in
the code to get the specific data that I need?
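For anyone reading along, the thread never shows the expected output, so the following rests on a guess: that the goal is one record per book, with its chapter and verse grouped together. Under that assumption, and assuming the three flat lists line up one-to-one by position, the regrouping could be sketched in plain Ruby as below. The arrays are copied from the sample output, and the helper names are hypothetical, not scRUBYt API.

```ruby
# Flat results copied from the sample output in this thread.
books    = ["book a", "book b", "book c"]
chapters = ["chapter aa", "chapter bb", "chapter cc"]
verses   = ["verse aaa", "verse bbb", "verse ccc"]

# Escape the three characters that are unsafe in XML text nodes.
def escape_xml(text)
  text.gsub(/[&<>]/, "&" => "&amp;", "<" => "&lt;", ">" => "&gt;")
end

# Assumption: the three lists line up one-to-one by position.
def group_records(books, chapters, verses)
  books.zip(chapters, verses).map do |book, chapter, verse|
    { book: book, chapter: chapter, verse: verse }
  end
end

# Render each group as its own <record> element.
def records_to_xml(records)
  records.map do |record|
    fields = record.map { |tag, value| "  <#{tag}>#{escape_xml(value)}</#{tag}>" }
    (["<record>"] + fields + ["</record>"]).join("\n")
  end.join("\n")
end

records = group_records(books, chapters, verses)
puts records_to_xml(records)
```

This only post-processes flat results. If the page nests each book's chapters and verses inside a common parent element, selecting relative to that parent in the scraper itself would likely be the cleaner route.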