Atom 1.0: Why Oh Why?

Spent a little time with the Atom 1.0 spec last night on the plane, and I’m coming to the conclusion that it was invented for the sole purpose of making my life difficult (or, perhaps less egocentrically, of making feed parsers more challenging to write).

I don’t yet have the expertise that comes from actually upgrading a parser to support this entirely new format, one so utterly backwards incompatible that it makes the so-called 7 incompatible versions of RSS look like a fond memory, but a handful of issues that are going to make my life unpleasant jumped out at me. (Maybe I’ll update this list over time.)

Arbitrary Renaming

Is “published” really that much better than “issued”? Sure, it’s a small change, but there are literally dozens of seemingly arbitrary bits of wordsmithing that make maintaining both an Atom 0.3 and an Atom 1.0 parser in parallel annoyingly cluttered.
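To make the clutter concrete, here is a minimal sketch of the kind of rename table a dual-version parser ends up carrying. The table is deliberately partial (only renames I’m sure of), and the helper names `normalized_name` and `entry_fields` are made up for illustration rather than taken from any real toolkit.

```python
import xml.etree.ElementTree as ET

ATOM_03_NS = "http://purl.org/atom/ns#"
ATOM_10_NS = "http://www.w3.org/2005/Atom"

# Partial, illustrative rename table: Atom 0.3 local name -> Atom 1.0 local name.
RENAMED_03_TO_10 = {
    "issued": "published",
    "modified": "updated",
    "tagline": "subtitle",
    "copyright": "rights",
}


def normalized_name(tag):
    """Return the Atom 1.0 local name for a namespaced tag, or None if it is not Atom."""
    if not tag.startswith("{"):
        return None
    namespace, local = tag[1:].split("}", 1)
    if namespace == ATOM_10_NS:
        return local
    if namespace == ATOM_03_NS:
        return RENAMED_03_TO_10.get(local, local)
    return None


def entry_fields(entry):
    """Collect the text of an entry's direct Atom children, keyed by 1.0 names."""
    fields = {}
    for child in entry:
        name = normalized_name(child.tag)
        if name is not None:
            fields[name] = (child.text or "").strip()
    return fields
```

With something like this, an Atom 0.3 entry carrying “issued” and “modified” and an Atom 1.0 entry carrying “published” and “updated” come out as the same dictionary. It only papers over the names, though, not the differences in meaning.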

updated (née modified) now underdefined

In Atom 0.3 the “modified” element was strongly defined, in that it was required to be updated with each edit. This is no longer true. And while we’ve all spent the last 5 years developing expertise in using weird hashing/diffing techniques to guess whether an element has been updated, it would have been nice to start forgetting about such stuff.
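For anyone who hasn’t had the pleasure, the hashing trick looks roughly like the sketch below. It assumes Atom 1.0 entries, and the choice of fields to hash (title, summary, content) is my own guess at what counts as a meaningful edit; `entry_fingerprint` and `has_changed` are hypothetical names.

```python
import hashlib
import xml.etree.ElementTree as ET

ATOM_10_NS = "http://www.w3.org/2005/Atom"


def entry_fingerprint(entry):
    """Hash title, summary and content so edits show up even if 'updated' does not move."""
    digest = hashlib.sha256()
    for local in ("title", "summary", "content"):
        element = entry.find("{%s}%s" % (ATOM_10_NS, local))
        if element is not None:
            # Serializing the element covers attributes and nested markup too.
            digest.update(ET.tostring(element, encoding="utf-8"))
        # Separator so that a missing field still shifts the hash input.
        digest.update(b"\x00")
    return digest.hexdigest()


def has_changed(entry, seen):
    """Return True if the entry is new or differs from the fingerprint stored in 'seen'."""
    entry_id = entry.findtext("{%s}id" % ATOM_10_NS)
    fingerprint = entry_fingerprint(entry)
    changed = seen.get(entry_id) != fingerprint
    seen[entry_id] = fingerprint
    return changed
```

Here `seen` is just a dictionary mapping entry ids to the last fingerprint; a real aggregator would persist it somewhere between polls.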

I Miss version

Sure, a namespace is the “proper” way to do things, but I miss the pragmatism of the version attribute. (What little pragmatism survived to find its way into Atom 0.3 seems to have been beaten out of Atom 1.0.)
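In parser terms, version sniffing now means comparing namespace URIs instead of reading an attribute. A rough sketch, assuming the root element has already been parsed with ElementTree; `atom_version` is a made-up helper name, and returning "unknown" for a 0.3-namespace feed with no version attribute is my own choice.

```python
import xml.etree.ElementTree as ET

ATOM_03_NS = "http://purl.org/atom/ns#"
ATOM_10_NS = "http://www.w3.org/2005/Atom"


def atom_version(root):
    """Guess the Atom version of a parsed feed root, or return None if it is not Atom."""
    if root.tag == "{%s}feed" % ATOM_10_NS:
        # Atom 1.0 carries no version attribute; the namespace is the only signal.
        return "1.0"
    if root.tag == "{%s}feed" % ATOM_03_NS:
        # Atom 0.3 feeds state their version directly on the feed element.
        return root.get("version", "unknown")
    return None
```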

dc:subject

Dublin Core is a noble piece of work. At its heart it is trying to formally define some basic building blocks which can be universal truths in a digital age. It’s like a running-code version of the Semantic Web. So why is the use of dc:subject deprecated in favor of an overly complicated “category” element which, while similar to the RSS 2.0 category element, has a different-for-the-sake-of-being-different feel to it?

The Atom “category” element

<category term='MSFT' />
<category term='MSFT' scheme='http://www.fool.com/cusips' />

vs. RSS 2.0

<category>MSFT</category>
<category domain="http://www.fool.com/cusips">MSFT</category>

Never thought I’d call an RSS 2.0 design decision clean and well thought out, but it’s all relative.
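The practical upshot for a parser is one more branch. A minimal sketch that flattens both forms above into the same (term, scheme) pairs, assuming Atom 1.0 entries and un-namespaced RSS 2.0 items; `categories` is a hypothetical helper, and it ignores Atom’s optional “label” attribute.

```python
import xml.etree.ElementTree as ET

ATOM_10_NS = "http://www.w3.org/2005/Atom"


def categories(item):
    """Return (term, scheme) pairs from an Atom 1.0 entry or an RSS 2.0 item."""
    found = []
    for child in item:
        if child.tag == "{%s}category" % ATOM_10_NS:
            # Atom: the value lives in attributes and the element is empty.
            found.append((child.get("term"), child.get("scheme")))
        elif child.tag == "category":
            # RSS 2.0: the value is the element text, with an optional domain attribute.
            found.append(((child.text or "").strip(), child.get("domain")))
    return found
```

Fed the two snippets above (wrapped in an entry and an item respectively), both come back as the same list: `("MSFT", None)` followed by `("MSFT", "http://www.fool.com/cusips")`.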

Content by Reference

One of Atom’s innovations is introducing a standard “link” element, allowing it to leverage and become a first class citizen of this Web thing which seems so popular. (I’ve written about this before, and mourned Kevin Burton’s mod_link)

So why does the Atom “content” element now allow for “content by reference” with the “src” attribute? Honestly, this seems like a clear case of confusing duplication. TIMTOWTDI is one of the key factors people point to when claiming that Perl is an ugly, under-designed, confusing mess.

For example, are there now two, valid but disparate ways of encoding podcasts in Atom?
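The two encodings I have in mind are a “link” element with rel="enclosure" and a “content” element with a “src” attribute. If both turn out to be fair game, a podcast client ends up checking both places, roughly as in this sketch; `podcast_urls` is a made-up name.

```python
import xml.etree.ElementTree as ET

ATOM_10_NS = "http://www.w3.org/2005/Atom"


def podcast_urls(entry):
    """Collect candidate media URLs from both places an Atom 1.0 entry might put them."""
    urls = []
    # Way one: a link element marked as an enclosure.
    for link in entry.findall("{%s}link" % ATOM_10_NS):
        if link.get("rel") == "enclosure" and link.get("href"):
            urls.append(("link", link.get("href")))
    # Way two: content by reference via the src attribute.
    content = entry.find("{%s}content" % ATOM_10_NS)
    if content is not None and content.get("src"):
        urls.append(("content", content.get("src")))
    return urls
```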

And should a toolkit parsing Atom 1.0 automagically dereference the URL defined in the “src” attribute? I mean, if I’m parsing a feed I expect to be able to display the contents of the content element. And if so, are we back to those wonderful bygone days (cough “rss2:enclosure”) of baking dangerous required behaviour into our processing model? (Speaking of processing models, shouldn’t the Atom 1.0 processing model address this?)
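My instinct is for the toolkit to report the reference and leave the fetching to the caller. A minimal sketch of that conservative approach, assuming Atom 1.0; `read_content` and the dictionary shape it returns are hypothetical, and the inline branch crudely flattens any nested markup to text.

```python
import xml.etree.ElementTree as ET

ATOM_10_NS = "http://www.w3.org/2005/Atom"


def read_content(entry):
    """Describe an entry's content without ever fetching a remote 'src' URL."""
    content = entry.find("{%s}content" % ATOM_10_NS)
    if content is None:
        return None
    src = content.get("src")
    if src is not None:
        # Content by reference: hand back the URL and let the caller decide whether to fetch.
        return {"kind": "reference", "src": src, "type": content.get("type")}
    # Inline content: the type defaults to "text" when the attribute is absent.
    return {
        "kind": "inline",
        "type": content.get("type", "text"),
        "text": "".join(content.itertext()),
    }
```

The caller can then apply its own policy (size limits, allowed schemes, user confirmation) before touching the network, instead of having that decision baked into the parser.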