IMC Global News, and Forking XML::RSS
About 10 months ago, Indymedia started generating RSS feeds for its features column, a collection of stories that (hopefully) served as a highlight of the overwhelming, and often spotty, open publishing newswire.
Six months ago, we started aggregating those feeds into the IMC Global Newswire, which you can see on the front page, bottom right, of Indymedia.org.
Yesterday, I finally made available an RSS feed for the Indymedia Global Newswire: the cream, if you will, of the 70+ sites publishing all over the world.
Forking XML::RSS
Why did it take so long? Well, honestly, mostly life got in the way. But for the last few weeks, what has been stopping me is a bug in XML::RSS. XML::RSS is a great module, and Perl is blessed to have such a good tool for working with RSS feeds. It not only parses all 3 versions of RSS into a simple, easy-to-use data structure (a very idiomatically Perl solution), but is also capable of outputting RSS in all 3 versions (rare is the library that does this). (Indymedia makes its feeds available as RSS 0.9, RSS 0.91, and RSS 1.0.)
A Failure to Encode
However, there is a fatal flaw with this facility. XML::RSS automatically decodes HTML entities where it encounters them (thanks, I think, to its underlying use of XML::Parser), but it does not encode entities on output! Meaning, if you take your parsed XML::RSS object and try to create new RSS from it, and one URL contains an &, then you’ve got invalid XML. So I’ve made a version of XML::RSS that does intelligent encoding. Right now you can’t turn it off, though I’m not sure why you would want an option that produces invalid XML. It uses a simple module I called XML::Encode, which is really just a wrapper for some code from Matt Sergeant’s script rssmirror.pl.
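To make the failure concrete, here is a minimal sketch of the idea in Python. It is not the Perl code from XML::Encode or rssmirror.pl, which is not reproduced here. The names `encode_entities` and `is_well_formed` and the example URL are hypothetical, and reading "intelligent" as "don't double-encode entities that are already encoded" is my assumption about what such a routine needs to do.

```python
import re
from xml.dom import minidom

# Match an "&" that does NOT already start one of XML's five predefined
# entities or a numeric character reference. Leaving those alone avoids
# double-encoding (assumed meaning of "intelligent" encoding).
BARE_AMPERSAND = re.compile(
    r"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)

def encode_entities(text):
    """Escape bare ampersands and angle brackets for use as XML text."""
    text = BARE_AMPERSAND.sub("&amp;", text)
    return text.replace("<", "&lt;").replace(">", "&gt;")

def is_well_formed(xml_text):
    """Return True if the string parses as XML, False otherwise."""
    try:
        minidom.parseString(xml_text)
        return True
    except Exception:
        return False

# Hypothetical item link with a query string, as a parser would hand it
# back after decoding entities.
url = "http://example.org/story?id=1&lang=en"

raw_item = "<item><link>" + url + "</link></item>"
safe_item = "<item><link>" + encode_entities(url) + "</link></item>"

print(is_well_formed(raw_item))   # the bare & breaks the parse
print(is_well_formed(safe_item))  # the encoded version parses
```

The lookahead is the only subtle part: without it, text that still carries an `&amp;` would come out as `&amp;amp;`.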
I don’t think XML::Encode is really CPAN-worthy, and I’m not sure what the next step is for XML::RSS, as the maintainers haven’t shown interest. But I’ll make both available here, and feel free to contact me if you’re interested in getting this code.