Weather, RSS, and Thunderstorms

June 22nd, 2003

Tim Bray touched lightly on an idea for a business model I had a while back: leveraging RSS’s popularity as a format beyond web syndication, for information like bank services, sales tracking, traffic alerts, and weather.

It’s an idea that occurred to me when I first started playing around with delivering events via RSS and realized that RSS wasn’t an XML file for headlines, but a webservice pipeline I could shove all types of data through.
Sourceforge figured out the same thing, and Technorati has a nice little service (I don’t know how “successful” a business it’s been) selling a personalized RSS feed that can watch Google queries, or keep track of who is linking to you.

Tim mentions a few of the same types of feeds I was thinking of, misses a couple, and mentions one I never would have thought of: traffic reports (being a non-driver and all). Unfortunately/fortunately most of my entrepreneurial flair was burned out of me during the brief years I was running my own little dotcom, and so, like a handful of other business models, it sits collecting dust.

Weather

Or at least sort of. I did have a brief go at producing an RSS feed of the weather, and last night, as lightning struck all around and great thunderclaps pealed and rumbled on and on like bombs detonating, their sound waves rattling my windows and setting off all the car alarms at the apartment building next door, I revived the project.

(long rambling thoughts on RSS weather service [see: insomnia-induced]. includes running code!)

A Quick Spec

A spec is easy to produce; weather reporting is well understood.
Detailed current info, a forecast, alerts for major changes, storm tracking, maybe some doppler images for when large bodies of precipitation move through your area. Aren’t we lucky that this is essentially the form the information’s in? (Quick tangent. A trick to providing this service, and something you would put some time into if you were providing a large number of different information feeds, is deciding how and when a new item gets generated. What constitutes new info? What threshold triggers a new notification? This is not as critical as it might be in, say, an SMS paging service, as an RSS item is a less aggressive form of notification.)
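One way to picture the "what constitutes new info?" question is as a comparison between the last snapshot that produced an item and the latest one. The sketch below is in Python rather than the Perl the package uses, and everything in it is an assumption for illustration: the `Conditions` fields, the 5-degree threshold, and the rule that any change in alert or summary counts as news.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed threshold: a swing of this many degrees is worth a new item.
TEMP_THRESHOLD_F = 5.0


@dataclass(frozen=True)
class Conditions:
    """One snapshot of current weather (field names are illustrative)."""
    temperature_f: float
    summary: str            # e.g. "partly cloudy"
    alert: Optional[str]    # text of an active warning, if any


def needs_new_item(previous: Optional[Conditions], current: Conditions) -> bool:
    """Decide whether `current` differs enough from `previous` to publish."""
    if previous is None:
        return True                      # nothing published yet
    if current.alert != previous.alert:
        return True                      # a warning appeared or cleared
    if current.summary != previous.summary:
        return True                      # e.g. "clear" became "rain"
    # Otherwise only a large temperature swing counts as news.
    return abs(current.temperature_f - previous.temperature_f) >= TEMP_THRESHOLD_F
```

Because an RSS item is a gentle notification, erring toward a low threshold costs little; an SMS service would presumably want a stricter one.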

Current State of Things

Governments around the world do a pretty good job of collecting weather statistics. Then a number of commercial companies are willing to re-package that info for you, and re-sell it at exorbitant prices. (The aforementioned dotcom was doing a local pitch, and weather is part of that sell.) The US National Weather Service also provides several different forms of raw data: the NOAA’s forecasts and the aviation service METAR, for detailed current conditions at airports. Dozens, perhaps hundreds, of tools exist to parse these formats, as do tools for scraping the big weather sites, Wunderground and Weather.com (which are really just super sophisticated versions of the smaller tools). CPAN turns up a dozen or so of varying approach and quality. However, a quick survey (and some historical knowledge of the options) turns up none that are quite flexible enough to do what I want.

Notes On My Approach

I’m doing this as a cure for insomnia, and because weather has rather aggressively shoved itself into my consciousness, something that rarely happened as a child growing up in eternally “partly cloudy” Santa Cruz. (Truth be told, I’ve been thinking about weather off and on for a while.) I’m not building a business, and I’ve never really been one of those Weather Channel radar imagery addicts. And while it’s nice of the government to provide a text format, frankly it’s easier to scrape HTML than parse their terse, circa 1970s format. (HTML::TokeParser is your friend!) Also, limiting one’s service to US cities, besides being boring, is a dead end, and makes you seem provincial. (The real reason I18N is taking off: so your European counterparts can’t laugh at you.)

For national weather, given a city and state, scrape the NWS website. This gives you anywhere from 3-6 days of forecast. Formats vary based on locality (there is a certain amount of free form data entry going on at the sources) but that’s okay; I don’t need to understand the data, just classify it and re-display it.
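The "classify, don't understand" idea can be sketched as a token-style walk over the page that pairs each period label with whatever free-form text follows it. This is a Python stand-in for the HTML::TokeParser approach, not the package's code, and the markup it expects (a bold label such as `<b>Tonight: </b>` followed by prose) is an assumption about the page, not a documented format.

```python
from html.parser import HTMLParser


class ForecastScraper(HTMLParser):
    """Collect (label, text) pairs, assuming labels are wrapped in <b> tags."""

    def __init__(self):
        super().__init__()
        self.periods = []        # finished (label, text) pairs, in page order
        self._in_label = False
        self._label = None       # label of the period being collected
        self._text = []          # text chunks seen since that label

    def _flush(self):
        # Store the period collected so far, with whitespace normalized.
        if self._label is not None:
            text = " ".join("".join(self._text).split())
            self.periods.append((self._label, text))
        self._label, self._text = None, []

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._flush()        # a new label ends the previous period
            self._in_label = True
            self._label = ""

    def handle_endtag(self, tag):
        if tag == "b" and self._in_label:
            self._in_label = False
            self._label = self._label.strip().rstrip(":")

    def handle_data(self, data):
        if self._in_label:
            self._label += data
        elif self._label is not None:
            self._text.append(data)   # kept verbatim, never interpreted

    def close(self):
        super().close()
        self._flush()            # store the final period


def scrape_periods(html):
    scraper = ForecastScraper()
    scraper.feed(html)
    scraper.close()
    return scraper.periods
```

The forecast text is never parsed for meaning, so a forecaster's free-form phrasing passes straight through to the feed.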

For international weather, given a city and country, scrape Wunderground. (Wunderground’s international info is nice and predictable, while their national coverage tends to be chatty, and harder to interpret)

Interpretation

Weather forecasts tend to get broken into morning, afternoon, night, and overnight periods. Ideally a service would collapse this information into a more compact form as the forecast gets farther into the future. Similarly, you sometimes see 4 or 5 days of identical predictions when it’s clear that what is really going on is a lack of good data; it would be nice to run a similar transformation on those. Maybe in version 2. (They also compose their forecasts from a limited vocabulary like: clear, fair, partly cloudy, etc., which is not useful to me currently, but might be in the future.) Actually, the Wunderground based feed has a ways to go before it looks quite right.
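Folding runs of identical predictions is essentially run-length grouping. Here is a minimal Python sketch of that "version 2" idea, assuming the forecast has already been reduced to ordered `(label, text)` pairs; the function name and the "through" wording are made up for illustration.

```python
from itertools import groupby


def fold_forecast(days):
    """Collapse consecutive days whose forecast text is identical.

    `days` is an ordered list of (label, text) pairs.
    """
    folded = []
    for text, run in groupby(days, key=lambda day: day[1]):
        labels = [label for label, _ in run]
        if len(labels) == 1:
            folded.append((labels[0], text))
        else:
            # Several identical days in a row become one ranged entry.
            folded.append((f"{labels[0]} through {labels[-1]}", text))
    return folded
```

Only adjacent duplicates are merged, so a sunny Monday and a sunny Friday with rain in between stay separate entries.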

RSS Design Decisions

In this version there are no special namespaces used; all the info is packed into your basic RSS elements. Warnings only provide a link to further details; the warning isn’t inlined. Forecast items’ rdf:about attribute points to an arbitrary destination, and <link> points to the page where the info was scraped from. (Note this requires a recent XML::RSS CVS checkout, or XML::RSS 1.03 when it’s released.) Each item gets the date of when the scrape took place, and the recommended ordering is that in the RSS feed. Future versions of the RSS would probably include a weather module, and images.
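To make the item layout concrete, here is a rough Python sketch that builds an RSS 1.0 document with the standard library instead of XML::RSS. It is not the package's output: the `build_feed` name, the dict keys, and the use of `dc:date` to carry the scrape time are all assumptions.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"
DC = "http://purl.org/dc/elements/1.1/"

ET.register_namespace("rdf", RDF)
ET.register_namespace("", RSS)      # RSS 1.0 as the default namespace
ET.register_namespace("dc", DC)


def q(ns, tag):
    """Build ElementTree's '{namespace}tag' form of a qualified name."""
    return f"{{{ns}}}{tag}"


def build_feed(channel_url, title, items):
    """Serialize forecast items (dicts, hypothetical keys) as RSS 1.0."""
    root = ET.Element(q(RDF, "RDF"))
    channel = ET.SubElement(root, q(RSS, "channel"), {q(RDF, "about"): channel_url})
    ET.SubElement(channel, q(RSS, "title")).text = title
    ET.SubElement(channel, q(RSS, "link")).text = channel_url
    ET.SubElement(channel, q(RSS, "description")).text = title
    # The rdf:Seq lists the items in the recommended reading order.
    seq = ET.SubElement(ET.SubElement(channel, q(RSS, "items")), q(RDF, "Seq"))
    for item in items:
        ET.SubElement(seq, q(RDF, "li"), {q(RDF, "resource"): item["about"]})
        node = ET.SubElement(root, q(RSS, "item"), {q(RDF, "about"): item["about"]})
        ET.SubElement(node, q(RSS, "title")).text = item["title"]
        ET.SubElement(node, q(RSS, "link")).text = item["link"]      # scraped page
        ET.SubElement(node, q(RSS, "description")).text = item["description"]
        ET.SubElement(node, q(DC, "date")).text = item["date"]       # scrape time
    return ET.tostring(root, encoding="unicode")
```

Keeping rdf:about distinct from <link> lets each forecast period have its own identity even when every item was scraped from the same page.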

Code?

The package contains 2 pieces, Weather::RSS, and Weather::RSS::Source.

Weather::RSS::Source is an interface that determines if you’re asking for US weather, or weather from somewhere else, and then hands off the heavy lifting (filling out forms, handling redirects, HTML scraping) to Weather::RSS::Source::WeatherGov and Weather::RSS::Source::Wunderground respectively, which return detailed current info, the extended forecast, and any weather alerts or hazard warnings they happen to find. (This piece should probably be re-packaged and released to CPAN, as I like it better than any of the existing Weather:: packages, maybe as Weather::Source? Weather::Forecast? Weather::RainGods?)
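The hand-off can be pictured as a small factory. The Python below only mirrors the shape of that design: the class names echo the Perl modules, but `Report`, `fetch`, `source_for`, and the state/country rule are hypothetical, and the scraping itself is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Report:
    """What either backend hands back (field names are illustrative)."""
    current: str = ""
    forecast: list = field(default_factory=list)
    alerts: list = field(default_factory=list)


class WeatherGovSource:
    """Stand-in for the US backend: city + state."""

    def __init__(self, city, state):
        self.city, self.state = city, state

    def fetch(self) -> Report:
        raise NotImplementedError("form filling and scraping omitted in this sketch")


class WundergroundSource:
    """Stand-in for the international backend: city + country."""

    def __init__(self, city, country):
        self.city, self.country = city, country

    def fetch(self) -> Report:
        raise NotImplementedError("form filling and scraping omitted in this sketch")


def source_for(city, state=None, country=None):
    """Pick a backend: a state means US weather, otherwise use the country."""
    if state and (country is None or country.upper() in ("US", "USA")):
        return WeatherGovSource(city, state)
    if country:
        # Also the fallback for a US request that names no state.
        return WundergroundSource(city, country)
    raise ValueError("need a state (for US weather) or a country")
```

Since both backends return the same kind of report, the feed-writing side never needs to know which site was scraped.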

Weather::RSS on the other hand is merely responsible for taking the resultant weather source object, and spitting out an RSS feed.

Also included is a little CGI that steals the logic from Aaron’s Acme::Test::Weather and attempts to guess at a visitor’s location, and provide the proper RSS feed.

Next Steps

Well the code is all very alpha quality. With screen scraping, and particularly with weather, you never know you’ve got your parser just right, you just know it hasn’t broken yet. So there are probably a few bugs still lurking.

Improvements could come from being more intelligent about when to produce new RSS items, and folding/condensing the forecasts as mentioned above. Also including doppler, doing a better job of tracking down extreme weather info, and maybe cute little weather icons like everyone seems to have.

Also, the current version (if it’s still the current version) will probably break sometime next winter, as winter weather reports just look different.

Lastly more and better datasources. If you know of a better source of info, let me know. Thanks.

Providing a Service

Right now it’s just some Perl modules and a few scripts; you’re welcome to download it. Like I said, I kind of burned out my entrepreneurial side, and it hasn’t really re-grown. But IF you really did want to pay me to provide an RSS weather service, for something like $8-$12/year I might be interested. I wonder what the legality of re-selling other people’s data is? There are a number of desktop clients which do it, but they’re selling clients, not the data. (I guess I’m just selling a webservice client at that.)

In the process of jackhammering my thoughts out of Perl and into English this morning (Trader Joe’s Moka Java) I think it hit me for the first time what confuses and baffles people about open source. At commercial development prices this is a $1500-$2000 investment before infrastructure; as open source it’s something to do until the rain stops. (Hey, or you could hire me, and I’ll throw in setting up a customized RSS feed of just about anything as a bonus.)

In the meantime, expect at least a minimal project page, à la cvs2rss, in the near future. (And try to resist making it into a wildly successful commercial venture; that would be depressing.)

ps. this post almost was titled, “It was a Dark and Stormy Night” (after all it was!), this is why people are bad at meta-data.
