Hpricot and path of an element

Hi all,

I use Hpricot to load a page. Then I try to find the path for a
"font" element (<font face="courier" color="black">) in the page. Here is
the example from the tutorial:

#=> "//div[@id='header']"

here is my code:
puts doc.at("#font").xpath

When I run the code, Ruby complains that xpath is an undefined method. I
wonder if I am misunderstanding the tutorial.




Posted via http://www.ruby-forum.com/.

I use hpricot to load a page. Then I try to find the path for an
element "font"(<font face="courier" color="black">) in the page.

So, you probably want:

(doc / 'font')

#=> "//div[@id='header']"

Right, that's searching for a tag that looks like this: <div id="header">

here is my code:
puts doc.at("#font").xpath

And that's searching for a tag that looks like this: <div id="font">

If you're following that example, you probably want:

puts doc.at('font').xpath
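
To make the difference between the two selectors concrete, here is a minimal sketch. It uses REXML (shipped with Ruby) as a stand-in for Hpricot so that it runs without extra gems; that swap, the sample markup, and the variable names are all assumptions for illustration. The CSS selector "#font" corresponds to the XPath //*[@id='font'], and "font" corresponds to //font. When nothing matches, the lookup returns nil, which is most likely why calling xpath on the result fails.

```ruby
require "rexml/document"

# Sample markup (an assumption; the real page is not shown in the thread).
html = <<~HTML
  <html>
    <body>
      <div id="header">Title</div>
      <font face="courier" color="black">some text</font>
    </body>
  </html>
HTML

doc = REXML::Document.new(html)

# Equivalent of the CSS selector "#font": any element whose id is "font".
by_id = REXML::XPath.first(doc, "//*[@id='font']")

# Equivalent of the CSS selector "font": any element whose tag name is font.
by_tag = REXML::XPath.first(doc, "//font")

# No element has id="font", so this lookup yields nil.
puts by_id.inspect

# The tag lookup finds the element, so asking for its path works.
puts by_tag.xpath
```

With Hpricot itself, guarding against nil before calling xpath (for example, checking the result of doc.at first) avoids the error in the same way.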

Now, first question: Why do you need the xpath? Usually, the idea is to try to
find that element, and then do something with it. So, for example:

# To return all text:
(doc / 'font').text

# To loop over each font element:
(doc / 'font').each { |tag|
  puts tag.inner_text
}

Second question: Why is there a font tag on this page? If you had any hand in
creating the page, shame on you -- go learn some CSS.

In fact, go learn some CSS anyway. Hpricot supports both CSS selectors and
XPath, and it's usually much easier to use the selectors. Years later, I
still remember, roughly, how selectors work -- but only a few months later,
I've almost completely forgotten XPath.

There are things XPath can do that selectors can't. But until you encounter
them, XPath is overkill.


On Sunday 10 August 2008 13:36:42 Li Chen wrote:

David Masover wrote:

Now, first question: Why do you need the xpath? Usually, the idea is to
try to
find that element, and then do something with it. So, for example:

# To return all text:
(doc / 'font').text

# To loop over each font element:
(doc / 'font').each { |tag|
  puts tag.inner_text
}

I need to extract the text within this tag. I followed your code and found:
1) (doc/'font').text and (doc/'font').html return the same results
2) when I run (doc / 'font').each { |tag| puts tag.inner_text }
Ruby complains:
undefined method `inner_text' for #<Hpricot::Elem:0x2e9f9c4>

So I changed it to tag.inner_html and it works. I checked the Hpricot
documentation and the method #inner_text is there, but I cannot
figure out why Ruby complains about it.
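
One possible explanation is that the installed Hpricot version differs from the one the documentation describes. A rough workaround, sketched below, is to use inner_text when the object responds to it and otherwise strip the tags out of inner_html. The helper name text_of and the FakeElem stand-in are hypothetical and exist only so the sketch runs without Hpricot; the regular expression is a crude tag stripper and does not decode entities.

```ruby
# Hypothetical helper: return the text of an element-like object,
# falling back to tag-stripped inner_html when inner_text is missing.
def text_of(tag)
  return tag.inner_text if tag.respond_to?(:inner_text)

  # Crude fallback: drop anything that looks like a tag.
  tag.inner_html.gsub(/<[^>]*>/, "")
end

# Stand-in for an element that only offers inner_html
# (an assumption for illustration, not a real Hpricot class).
FakeElem = Struct.new(:inner_html)

tag = FakeElem.new("some <b>bold</b> text")
puts text_of(tag)
```

If the font element contains no nested markup, inner_html and the plain text are identical, which may be why both calls in 1) printed the same thing.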

Second question: Why is there a font tag on this page? If you had any
hand in
creating the page, shame on you -- go learn some CSS.

I am a newbie at HTML and website development. If you want to know why
there is a font tag in the page, please check this out:

What I am trying to do is extract some information I am interested in from
this page. I have no idea why they put this tag and that tag there. I don't
think it is my priority to know so many whys right now. I am more concerned
about getting the job done.

Anyway, thanks very much for the tips.


