Hello,
I've put together a new rsssf gem [1] that lets you work with RSSSF
archive pages.
What's the RSSSF?
RSSSF stands for Rec.Sport.Soccer Statistics Foundation [2].
It has been collecting football (soccer) league tables, match results
and more from all over the world, online and in plain text, for over fifteen years.
Today the RSSSF archive is the world's largest football data archive.
(Note: I'm not associated with the RSSSF).
Anyway, want to get any of the plain text datasets into an SQL
database (with tables such as leagues, teams, matches, etc.)?
The new rsssf scripts and repos get you started. Example:
page = RsssfPage.from_url( 'http://www.rsssf.com/tablese/eng2015.html' )
schedule = page.find_schedule( header: 'Premier League')
schedule.save( './1-premierleague.txt' )
schedule = page.find_schedule( header: 'FA Cup', cup: true )
schedule.save( './facup.txt' )
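
To give a rough idea of the next step towards an SQL matches table, here is a
minimal sketch in plain Ruby that turns one match line into a hash. Note that
this is not part of the rsssf gem: the line format ("Team 2-1 Team"), the
MATCH_LINE pattern and the parse_match_line helper are all assumptions made up
for illustration, and the saved text files may well look different.

```ruby
# Hypothetical line format, e.g. "Arsenal 2-1 Chelsea" (an assumption;
# the real saved files may use a different layout).
MATCH_LINE = /\A\s*(?<team1>.+?)\s+(?<score1>\d+)-(?<score2>\d+)\s+(?<team2>.+?)\s*\z/

# Returns a hash that could map onto a row in a matches table,
# or nil if the line does not look like a match (heading, blank line, etc.).
def parse_match_line( line )
  m = MATCH_LINE.match( line )
  return nil unless m

  { team1:  m[:team1],
    score1: m[:score1].to_i,
    score2: m[:score2].to_i,
    team2:  m[:team2] }
end

match = parse_match_line( 'Arsenal 2-1 Chelsea' )
puts match.inspect if match
```

Lines that don't match the pattern come back as nil, so headings and blank
lines can simply be skipped before inserting rows.
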
Or working in batch (many seasons) using a repo:
repo = RsssfRepo.new( './eng-england', title: 'England (and Wales)' )
repo.fetch_pages
repo.make_pages_report
and so on. Questions? Comments? Always welcome.
All code public domain. Cheers.
[1] https://github.com/sportdb/rsssf
[2] http://www.rsssf.com