Hi,
I have a few questions about parsing HTML:
1) The default RDoc documentation for Ruby's HTMLParser (the one that
ships with the Win32 binary distribution) is very sparse. Where can I
find good documentation for the module or, better yet, a tutorial or
examples?
2) Is HTMLParser modeled on Perl's HTML::Parser?
3) Can someone suggest the best parser for tokenizing an HTML document
and building a tree from it? Hpricot looks like a nice parser and is
well documented, but I'm not sure it's suitable.
Thanks in advance.