Is there a Ruby library that does HTML entity parsing?

Thanks for the pointer, Ilmari! Just checked and it works fine under Windows.

-John

________________________________

From: Ilmari Heikkinen [mailto:kig@misfiring.net]
Sent: Thu 4/28/2005 11:40 AM
To: ruby-talk ML
Subject: Re: Is there a Ruby library that does HTML entity parsing?

On 28.4.2005, at 18:05, John Lam wrote:

> Steve, I've been using the URI lib quite a bit, but it doesn't parse
> entities. One feature that would be nice to add is one that calculates
> the URL for the directory that contains a document, given its complete
> URL.
>
> For example, consider:
>
> http://www.foo.com/bar/something.htm
>
> This document clearly lives in
>
> http://www.foo.com/bar/

This can also be done using File.split / dirname / basename:

File.dirname "http://foo.com/bar/stuff.html"
#=> "http://foo.com/bar"

File.basename "http://foo.com/bar/stuff.html"
#=> "stuff.html"

File.split "http://foo.com/bar/stuff.html"
#=> ["http://foo.com/bar", "stuff.html"]

File.join( File.dirname("http://foo.com/bar/doc.html"),
           "relative_link.html" )
# => "http://foo.com/bar/relative_link.html"

Though that probably breaks on Windows, since Windows uses backslashes
as directory separators.
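
A possible alternative that stays within the URI lib, as a minimal
sketch: resolving the relative reference "." against a document URL
should yield the URL of its containing directory, and URI resolution
works on forward slashes regardless of platform. The helper names
containing_directory and resolve_link are made up for illustration and
are not part of any library.

```ruby
require "uri"

# Hypothetical helper: URL of the directory that contains a document.
# Resolving "." against the document URL drops the last path segment.
def containing_directory(url)
  URI.join(url, ".").to_s
end

# Hypothetical helper: resolve a relative link against a document URL.
def resolve_link(document_url, relative)
  URI.join(document_url, relative).to_s
end

puts containing_directory("http://www.foo.com/bar/something.htm")
puts resolve_link("http://foo.com/bar/doc.html", "relative_link.html")
# second line printed: http://foo.com/bar/relative_link.html
```

Unlike File.dirname, this keeps the trailing slash on the directory
URL, which matches the http://www.foo.com/bar/ form asked for above.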

Cheers,
Ilmari Heikkinen