[QUIZ] Posix Pangrams (#97)

> (There's no joy in trying to parse/string HTML.)
Sure there is.

require 'rubygems'; require 'hpricot'; require 'open-uri'
doc = Hpricot(open('http://www.unix.org/version3/apis/cu.html'))
puts (doc/"p.tent/i").map{|i|i.inner_html}

Hpricot makes HTML scraping fun again (no, really!).

:)

I copy/pasted the rows from the table and did it like this:
puts <<ENDS.scan( /<i>([^<]+)/ ).flatten.join( ' ' )
#HTML HERE
ENDS
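
To make the scan approach concrete, here is a minimal sketch using the same regex. The rows in the heredoc are made-up stand-ins for the pasted table markup (the assumed shape is one command name per `<i>` element), and `italic_texts` is a hypothetical helper name, not something from the original post.

```ruby
# Collect the text that follows every <i> opening tag (same regex as above).
# String#scan with one capture group returns an array of one-element arrays,
# so flatten is needed to get a plain list of strings.
def italic_texts(html)
  html.scan(/<i>([^<]+)/).flatten
end

# Hypothetical sample rows; the real table markup may differ.
rows = <<ENDS
<tr><td><p class="tent"><i>admin</i></p></td><td><p class="tent"><i>alias</i></p></td></tr>
<tr><td><p class="tent"><i>ar</i></p></td><td><p class="tent"><i>asa</i></p></td></tr>
ENDS

puts italic_texts(rows).join(' ')  # prints "admin alias ar asa"
```

Because only the table rows are pasted into the heredoc, nothing outside the table can match, which is what keeps this version from picking up unrelated `<i>` tags.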

Actually...I can't get Hpricot to install to test your code (gem server
seems to be down) but doesn't that grab way more information than you
wanted from the table? The table headers and all columns, too?

···

From: Jamie Macey [mailto:jamie.macey@gmail.com]

I'm assuming the problems with the gem server are related to the
problems with RubyForge. Hopefully that will be resolved soon, I know
it's being worked on.

Regarding your question: note the "/i" after "p.tent".

Jacob Fugal

···

On 10/6/06, Gavin Kistner <gavin.kistner@anark.com> wrote:

From: Jamie Macey [mailto:jamie.macey@gmail.com]
> require 'rubygems'; require 'hpricot'; require 'open-uri'
> doc = Hpricot(open('http://www.unix.org/version3/apis/cu.html'))
> puts (doc/"p.tent/i").map{|i|i.inner_html}

Actually...I can't get Hpricot to install to test your code (gem server
seems to be down) but doesn't that grab way more information than you
wanted from the table? The table headers and all columns, too?

I am pulling the entire html file down, but by dividing the Hpricot
instance I'm essentially asking it to give me all the <i> tags that
are inside a <p> with the tent class. Given the content of the file I
could probably just do doc/"i" but it would also grab the 'opt' from
the definition list up top.
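
For anyone who can't install Hpricot right now, the scoping idea can be roughly illustrated with plain regular expressions. This is only a stand-in for the selector, not how Hpricot works, and the sample markup and the helper names `all_italics` and `tent_italics` are assumptions made up for the sketch.

```ruby
# Hypothetical excerpt: one <i> in a definition list, two inside <p class="tent">.
SAMPLE = <<HTML
<dl><dt>[<i>opt</i>]</dt><dd>Optional utility</dd></dl>
<table>
<tr><td><p class="tent"><i>admin</i></p></td></tr>
<tr><td><p class="tent"><i>alias</i></p></td></tr>
</table>
HTML

# Every <i> element in the document, wherever it sits.
def all_italics(html)
  html.scan(/<i>([^<]+)<\/i>/).flatten
end

# Only the <i> elements found inside a <p class="tent">...</p> block.
def tent_italics(html)
  html.scan(/<p class="tent">(.*?)<\/p>/m).flatten.flat_map { |inner| all_italics(inner) }
end

p all_italics(SAMPLE)   # => ["opt", "admin", "alias"]
p tent_italics(SAMPLE)  # => ["admin", "alias"]
```

The unscoped version picks up the stray 'opt', while restricting the search to the tent paragraphs leaves only the command names.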

My code does give the same output as yours, excepting that since I'm
putsing the array rather than joining I get one command per line.
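
A quick illustration of that difference, with a made-up list of commands:

```ruby
commands = %w[admin alias ar]  # hypothetical sample, not the real list

puts commands            # puts on an array writes each element on its own line
puts commands.join(' ')  # joining first gives a single space-separated line
```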

- Jamie
