I don't think I've mentioned my FeedTools library anywhere except my
blog, so I guess maybe I should get around to announcing it more
visibly since I'm sure there are plenty of people who could find a use
for it.

I just released version 0.2.3 yesterday, and work on it is coming
along rather quickly. Basically, FeedTools is a Rails-friendly
implementation of a caching XML feed parser, quite similar to Mark
Pilgrim's Python feedparser library, except with a few extra features
and not nearly as many unit tests (I'm still working on this, and
trying to get the library to pass a port of the same test suite that
he uses). It can parse any version of RSS or CDF, and I believe
it parses Atom 1.0 just fine as well, although I haven't had the time
to write enough unit tests to be certain.

The parser automatically handles HTML sanitization and optionally
offers support for the HTML Tidy library. If libtidy is not present,
it seamlessly falls back to untidied HTML output. There's also fairly
extensive support for enclosures and the iTunes and Yahoo media
modules (though there's probably still quite a bit of work to be done
in this area).

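To give a rough idea of what "falls back if libtidy is not present"
can look like in Ruby, here is a minimal sketch. It is not FeedTools
code: the OptionalTidy module, its method names, and the library name
passed to require are all made up for illustration. The caller
supplies the actual tidying step as a block.

```ruby
# Hypothetical helper, NOT part of FeedTools: run a tidying step only
# when an optional library can be loaded, otherwise return the input
# unchanged.
module OptionalTidy
  # Returns true if the named library loads, false if it is missing.
  def self.available?(lib = 'tidy')
    require lib
    true
  rescue LoadError
    false
  end

  # Yields the HTML to the caller's tidying block when the library is
  # available; otherwise hands the HTML back untouched.
  def self.clean(html, lib: 'tidy')
    return html unless block_given? && available?(lib)
    yield html
  end
end

untidy = '<p>unclosed paragraph'

# The library name below is deliberately bogus, so the block is skipped
# and the original string comes back as-is.
result = OptionalTidy.clean(untidy, lib: 'no_such_tidy_library') do |html|
  html + '</p>'
end
puts result
```

The point of the design is that callers never need to check for the
library themselves; a missing dependency only changes how clean the
output is, not whether parsing works.
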
The caching system is modular, and can use any caching mechanism that
conforms to the caching interface. By default, it uses a database
feed cache based on ActiveRecord (which, when using Rails, will
connect you automatically to the same database that Rails is using).
Alternatively, it can run without any caching mechanism at all.

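As a sketch of the general idea of a pluggable cache (and only the
idea; every class and method name here is hypothetical and this is
not the actual FeedTools caching interface), any object that responds
to read and write can serve as the backend, and passing nil turns
caching off entirely:

```ruby
# Hypothetical names throughout; this is NOT the FeedTools caching API.

# Simplest possible backend: an in-memory hash keyed by feed URL.
# A database-backed class exposing the same two methods could be
# swapped in without touching the fetcher.
class MemoryFeedCache
  def initialize
    @store = {}
  end

  def read(url)
    @store[url]
  end

  def write(url, body)
    @store[url] = body
  end
end

class CachingFetcher
  attr_reader :fetch_count

  # cache: any object with read(url) and write(url, body), or nil for
  # no caching. The block stands in for the real network fetch.
  def initialize(cache: MemoryFeedCache.new, &fetch)
    @cache = cache
    @fetch = fetch
    @fetch_count = 0
  end

  def get(url)
    cached = @cache && @cache.read(url)
    return cached if cached

    @fetch_count += 1
    body = @fetch.call(url)
    @cache.write(url, body) if @cache
    body
  end
end

fetcher = CachingFetcher.new { |url| "<rss><!-- fetched #{url} --></rss>" }
2.times { fetcher.get('http://example.com/index.rss') }
puts fetcher.fetch_count   # 1, the second call was served from the cache

uncached = CachingFetcher.new(cache: nil) { |_url| '<rss/>' }
2.times { uncached.get('http://example.com/index.rss') }
puts uncached.fetch_count  # 2, no cache, so every call fetches
```
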
Also, the FeedTools library has methods for generating XML for RSS
1.0, RSS 2.0 and Atom 0.3 using the Builder library. This means that
feed generation in Rails' .rxml files is very easy to accomplish (most
of the time the view code is a single line).

There's still a lot more work to be done, but so far, it appears to be
fairly stable and tends to be quite liberal in what it accepts as
input (although I don't think it's quite up to the level of Mark
Pilgrim's parser... yet). If anyone spots any bugs, problems,
concerns, or design issues, please let me know.

Documentation is here: http://sporkmonger.com/projects/feedtools/api/

slashdot_feed = FeedTools::Feed.open('http://www.slashdot.org/index.rss')
=> "News for nerds, stuff that matters"