Blog posts tagged "atom"

Yelp, User Contributed Content and Feed Design

December 23rd, 2005

Yelp gets feeds just right. I’m not sure I’ve ever said that before about anybody.

They’ve got:

  • a feed of my reviews, which allows me to re-purpose the content I create (No RSS, No Content Creation),
  • a feed of my (as yet non-existent) network’s reviews
  • feeds contain the full content of my review (again my content, I created it, give it to me)
  • rich feeds; they use the geo namespace to embed lat/long
  • Both RSS and Atom 1.0 feeds

Very nice.

A minor nit: these are feeds of reviews, not feeds of places, so it makes sense that the rating is included in the title of the entry, but I’d still like to see the rating and the location’s name presented in separate structured elements as well (say, for example, I want to syndicate my reviews locally and display a graphic of the stars).
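To make the nit concrete, here is a rough Python sketch of what a consumer could do if the rating and place name were separate elements. It assumes the "geo namespace" is the W3C Basic Geo vocabulary, and the `rev:` namespace, its `rating` and `place` elements, and the sample values are all invented for illustration; none of it is taken from Yelp's actual feeds.

```python
import xml.etree.ElementTree as ET

# "geo" is the W3C Basic Geo vocabulary (assumed to be the one meant above).
# "rev" is a made-up namespace used only for this illustration.
NS = {
    "geo": "http://www.w3.org/2003/01/geo/wgs84_pos#",
    "rev": "http://example.org/review#",
}

ITEM_XML = """
<item xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#"
      xmlns:rev="http://example.org/review#">
  <title>Example Cafe (4 stars)</title>
  <geo:lat>37.7750</geo:lat>
  <geo:long>-122.4183</geo:long>
  <rev:rating>4</rev:rating>
  <rev:place>Example Cafe</rev:place>
</item>
"""

def parse_review(item_xml):
    """Pull the structured fields out of one review item."""
    item = ET.fromstring(item_xml)
    return {
        "title": item.findtext("title"),
        "lat": float(item.findtext("geo:lat", namespaces=NS)),
        "long": float(item.findtext("geo:long", namespaces=NS)),
        "rating": int(item.findtext("rev:rating", namespaces=NS)),
        "place": item.findtext("rev:place", namespaces=NS),
    }

review = parse_review(ITEM_XML)
# With a numeric rating there is no need to scrape the title for stars.
stars = "*" * review["rating"]
print(review["place"], stars, review["lat"], review["long"])
```

With the rating as its own element, drawing the stars is a one-liner instead of a regular expression over the title.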

Also, per-category feeds might be useful.

Atom and Wiki Driven Testing

December 20th, 2005

It’s been a long-standing todo to port Mark’s FeedParser tests to work against Magpie, possibly with an intermediate representation to allow cross-language testing. (Has any work been done on capturing unit tests/acceptance tests in XML?) Sam’s approach highlights Ruby-the-language’s awesome flexibility (I’d been playing with something similar for the parser we wrote for Odeo), but doesn’t map to PHP/Magpie very well.

Phil kicked off a new round of testing for Atom 1.0, the results of which are now captured in the Atom wiki. (not to mention a few gentle nudges on Magpie’s lack of 1.0 compliance.)

All of which got me thinking, it would be exceptionally cool if someone made the FeedParser’s tests available on the Atom wiki using Ward’s FIT concept in a documented, reportable fashion.
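For a sense of what table-driven feed tests might look like, here is a small Python sketch. The rows, the sample feed, and the `run_table` helper are all hypothetical; this is neither the FeedParser suite nor a real FIT fixture, just the shape of the idea: one row per expectation, and a report that could be pasted into a wiki table.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

FEED = """
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Feed</title>
  <entry>
    <title>First Post</title>
    <published>2005-12-20T12:00:00Z</published>
  </entry>
</feed>
"""

# One row per check: (description, path, expected text).
# Made-up rows, not copied from any existing test suite.
TABLE = [
    ("feed title", "title", "Example Feed"),
    ("entry title", "entry/title", "First Post"),
    ("entry published", "entry/published", "2005-12-20T12:00:00Z"),
]

def run_table(document, table):
    """Evaluate every row against the document and collect a report."""
    root = ET.fromstring(document)
    report = []
    for description, path, expected in table:
        # Qualify each path step with the Atom 1.0 namespace.
        qualified = "/".join(ATOM + step for step in path.split("/"))
        actual = root.findtext(qualified)
        report.append((description, expected, actual, actual == expected))
    return report

for description, expected, actual, passed in run_table(FEED, TABLE):
    status = "pass" if passed else "FAIL"
    print("%-16s %-22s %-22s %s" % (description, expected, actual, status))
```

Because the table is plain data, the same rows could in principle be run against parsers in other languages, which is the cross-language angle mentioned above.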

Any takers?

Tagged: Uncategorized

Towards MagpieRSS 0.8: repeating elements, attributes, and Atom 1.0

November 5th, 2005

Now that we’ve got the security release out of the way, it’s time to move on to something a little more interesting. I finally got a chance to add a huge patch from RadGeek (of FeedWordPress fame) that adds:

  • uniform access to attributes
  • support for repeating elements
  • Atom 1.0 support

I’ve struggled forever to figure out how to provide simple, uniform access to the increasingly rich data that people are syndicating (finally!). Eventually I gave up, and decided that the only solution was to rewrite the parser and make it simple to add per-field custom logic (kind of like the way the enclosure patch adds custom logic). I never got past the initial sketches.

So I’m thrilled that RadGeek has come up with a syntax (and code!) to extend rss_parser.inc to add support, while staying transparently backwards compatible. I’ve put out a dev build for people to play with, and written up some of the new features.

Access Repeating Elements

Let’s assume we have a basic RSS item, with several dc:subjects:

<item>
<title>Some Exciting Title</title>
<dc:subject>exciting</dc:subject>
<dc:subject>example</dc:subject>
<dc:subject>whoohoo</dc:subject>
</item>

echo $item['title'] 
=> "Some Exciting Title"

echo $item['dc']['subject']
=> "exciting"

So far, so normal. Now we get special.

echo $item['dc']['subject#2']
=> "example"

echo $item['dc']['subject#3']
=> "whoohoo"

So how many dc:subjects do we have?

echo $item['dc']['subject#']
=> 3

And this isn’t just for dc:subject, it works with whatever elements you like to repeat. (though we’ll need to decide what to do with the Atom link reltype munging hack, I’ll touch on that in a future post)
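For readers who want to see the key scheme spelled out, here is a minimal Python sketch that builds the same kind of keys (`subject`, `subject#2`, `subject#`) from the item above. It is not RadGeek's patch or Magpie's code; `flatten_item` and `PREFIXES` are names invented for this illustration, and details such as always storing a count (even for `title`) are assumptions.

```python
import xml.etree.ElementTree as ET

ITEM_XML = """
<item xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>Some Exciting Title</title>
  <dc:subject>exciting</dc:subject>
  <dc:subject>example</dc:subject>
  <dc:subject>whoohoo</dc:subject>
</item>
"""

# Namespace URI -> the short prefix used as a key, as in $item['dc'].
PREFIXES = {"http://purl.org/dc/elements/1.1/": "dc"}

def flatten_item(item_xml):
    """Flatten child elements into 'name', 'name#2', ... plus a 'name#' count."""
    item = {}
    for child in ET.fromstring(item_xml):
        if child.tag.startswith("{"):
            # ElementTree spells namespaced tags as "{uri}local".
            uri, local = child.tag[1:].split("}", 1)
            bucket = item.setdefault(PREFIXES.get(uri, uri), {})
        else:
            local, bucket = child.tag, item
        count = bucket.get(local + "#", 0) + 1
        bucket[local + "#"] = count
        # The first occurrence keeps the plain name, so old code still works.
        key = local if count == 1 else "%s#%d" % (local, count)
        bucket[key] = (child.text or "").strip()
    return item

item = flatten_item(ITEM_XML)
print(item["title"])
print(item["dc"]["subject"], item["dc"]["subject#2"], item["dc"]["subject#"])
```

The point of the design is visible in the last comment: the first occurrence is stored under the unadorned name, which is what keeps the scheme backwards compatible.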

Access Attributes

Let’s assume we now have an Atom item with a category like:

<category term='atom' scheme='http://del.cio.us/tags' label='Atom' />

echo $item['category@term']
=> "atom"

echo $item['category@scheme']
=> "http://del.cio.us/tags"

echo $item['category@label']
=> "Atom"

echo $item['category@']
=> "term,scheme,label"   // that might want to change to an array

And if we had a second category for that item?

<category term='calendar' />

echo $item['category#2@term']
=> 'calendar'

Or maybe an RSS 2.0 guid element:

<guid isPermaLink="true">http://inessential.com/2002/09/01.php#a2</guid>

echo $item['guid']
=> "http://inessential.com/2002/09/01.php#a2"

echo $item['guid@ispermalink']
=> "true"
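The attribute keys can be sketched the same way. The Python below is an illustration only, not the patch itself: `flatten_attributes` is a made-up name, the three elements are lumped into one `<item>` purely for convenience, and lower-casing attribute names is inferred from the `guid@ispermalink` example above.

```python
import xml.etree.ElementTree as ET

ITEM_XML = """
<item>
  <category term='atom' scheme='http://del.cio.us/tags' label='Atom' />
  <category term='calendar' />
  <guid isPermaLink="true">http://inessential.com/2002/09/01.php#a2</guid>
</item>
"""

def flatten_attributes(item_xml):
    """Build 'name', 'name@attr', 'name#2@attr' and 'name@' style keys."""
    item = {}
    counts = {}
    for child in ET.fromstring(item_xml):
        counts[child.tag] = counts.get(child.tag, 0) + 1
        n = counts[child.tag]
        base = child.tag if n == 1 else "%s#%d" % (child.tag, n)
        item[base] = (child.text or "").strip()
        names = []
        for name, value in child.attrib.items():
            # Attribute names are lower-cased (assumption, see lead-in).
            names.append(name.lower())
            item["%s@%s" % (base, name.lower())] = value
        # 'name@' lists the attribute names as a comma-separated string.
        item[base + "@"] = ",".join(names)
    return item

item = flatten_attributes(ITEM_XML)
print(item["category@term"], item["category#2@term"], item["guid@ispermalink"])
```

Returning the attribute list as a string mirrors the current behaviour described above; swapping `",".join(names)` for `names` would give the array variant.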

Where from here

So give the dev build a spin, kick the tires etc. This is the largest new feature in a while, and it would be good to give it a workout. (P.S. I’m afraid the normalization methods are throwing notices currently.)

Also, let me know about any show stoppers you see with this new syntax.

Once we’ve got this working smoothly, there are several other new features looking for inclusion in the next release.

And finally I’m debating whether to break backward compatibility with how we currently do Atom link munging, in order to get more consistency with this new syntax, and I’d like to get some feedback on that.

Tagged: Uncategorized

Atom 1.0: Why Oh Why?

August 16th, 2005

Spent a little time with the Atom 1.0 spec last night on the plane, and I’m coming to the conclusion that it was invented for the sole purpose of making my life difficult. (or perhaps less ego-centrically, making writing feed parsers more challenging)

I don’t really have the expertise of having actually upgraded a parser to add support for this entirely new format (one that is backwards incompatible to a degree that makes the so-called 7 incompatible versions of RSS look like a fond memory), but a handful of issues that are going to make my life unpleasant jumped out at me. (Maybe I’ll update this list over time.)

Arbitrary Renaming

Is “published” really that much better than “issued”? Sure, it’s a small change, but there are literally dozens of seemingly arbitrary bits of wordsmithing that make maintaining both Atom 0.3 and Atom 1.0 parsers in parallel annoyingly cluttered.
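In practice the clutter tends to end up as a lookup table. A minimal Python sketch, assuming the parser has already produced a flat dict of fields; the `RENAMED` table lists only a few of the renames, and treating the old and new names as exact equivalents is a simplification.

```python
# A few Atom 0.3 element names and their Atom 1.0 replacements.
RENAMED = {
    "issued": "published",
    "modified": "updated",
    "tagline": "subtitle",
    "copyright": "rights",
}

def normalize(fields):
    """Return a copy of a parsed feed/entry dict using Atom 1.0 names."""
    return {RENAMED.get(name, name): value for name, value in fields.items()}

old_entry = {
    "title": "Hello",
    "issued": "2005-08-16T09:00:00Z",
    "modified": "2005-08-16T10:30:00Z",
}
print(normalize(old_entry))
```

Downstream code can then ask for `published` and `updated` without caring which version of the format the feed was written in.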

updated (née modified) now underdefined

In Atom 0.3 the “modified” element was strongly defined, in that it was required to be updated with each edit. This is no longer true. And while we’ve all spent the last 5 years developing expertise in guessing whether an element has been updated, using weird hashing/diffing techniques, it would have been nice to start forgetting about such stuff.
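For anyone who hasn't had the pleasure, the hashing trick looks roughly like this Python sketch. The choice of fields to hash, the separator, and the function names are assumptions made for the example; real aggregators differ on all three.

```python
import hashlib

def fingerprint(entry):
    """Hash the fields a reader would notice changing."""
    parts = [entry.get("title", ""), entry.get("summary", ""), entry.get("content", "")]
    # A separator keeps ("ab", "c") from colliding with ("a", "bc").
    joined = "\x1f".join(parts)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()

def changed_entries(seen, entries):
    """Return ids that are new or edited; `seen` (id -> hash) is updated in place."""
    changed = []
    for entry in entries:
        digest = fingerprint(entry)
        if seen.get(entry["id"]) != digest:
            changed.append(entry["id"])
            seen[entry["id"]] = digest
    return changed

seen = {}
entries = [{"id": "tag:example.org,2005:1", "title": "Hello", "content": "First draft"}]
first_pass = changed_entries(seen, entries)    # the entry is new
second_pass = changed_entries(seen, entries)   # nothing changed
entries[0]["content"] = "Second draft"
third_pass = changed_entries(seen, entries)    # the edit is detected
print(first_pass, second_pass, third_pass)
```

Note that the timestamp is deliberately left out of the hash: the whole problem is that it can no longer be trusted to move when the content does.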

I Miss version

Sure, a namespace is the “proper” way to do things, but I miss the pragmatism of the version attribute. (What little pragmatism survived to find its way into Atom 0.3 seems to have been beaten out of Atom 1.0.)
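Here is what version sniffing ends up looking like, as a Python sketch. The two namespace URIs are the ones used by Atom 0.3 and Atom 1.0; the `atom_version` helper is made up for this example and only looks at feed documents, not standalone entries.

```python
import xml.etree.ElementTree as ET

ATOM_03_NS = "http://purl.org/atom/ns#"
ATOM_10_NS = "http://www.w3.org/2005/Atom"

def atom_version(document):
    """Guess the Atom version of a feed document, or None if it isn't Atom."""
    root = ET.fromstring(document)
    if root.tag == "{%s}feed" % ATOM_10_NS:
        # Atom 1.0 has no version attribute; the namespace is the version.
        return "1.0"
    if root.tag == "{%s}feed" % ATOM_03_NS:
        # Pre-1.0 feeds carry the version attribute directly.
        return root.get("version")
    return None

old_feed = '<feed version="0.3" xmlns="http://purl.org/atom/ns#"><title>Old</title></feed>'
new_feed = '<feed xmlns="http://www.w3.org/2005/Atom"><title>New</title></feed>'
rss_feed = '<rss version="2.0"><channel><title>Not Atom</title></channel></rss>'
print(atom_version(old_feed), atom_version(new_feed), atom_version(rss_feed))
```

With the attribute, the answer could be read straight off the root element; without it, the parser has to know the namespace URIs ahead of time.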

dc:subject

Dublin Core is a noble piece of work. At its heart it is trying to formally define some basic building blocks which can be universal truths in a digital age. It’s like a running-code version of the Semantic Web. So why is the use of dc:subject deprecated in favor of an overly complicated “category” element which, while similar to the RSS 2.0 category element, has a different-for-the-sake-of-being-different feel to it?

The Atom “category” element

<category term='MSFT' />
<category term='MSFT' scheme='http://www.fool.com/cusips' />

vs. RSS 2.0

<category>MSFT</category>
<category domain="http://www.fool.com/cusips">MSFT</category>

Never thought I’d call an RSS 2.0 design decision clean and well thought out, but it’s all relative.
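From a parser's point of view the two spellings carry the same information, which a short Python sketch makes plain. The `category_pair` helper is a name invented here, and it assumes the Atom element is in the Atom 1.0 namespace while the RSS 2.0 element is unqualified.

```python
import xml.etree.ElementTree as ET

ATOM_CATEGORY = "{http://www.w3.org/2005/Atom}category"

def category_pair(element):
    """Return (term, scheme) for an Atom 1.0 or RSS 2.0 category element."""
    if element.tag == ATOM_CATEGORY:
        # Atom: both values live in attributes.
        return (element.get("term"), element.get("scheme"))
    # RSS 2.0: the term is the element text, the scheme is "domain".
    return ((element.text or "").strip(), element.get("domain"))

atom_cat = ET.fromstring(
    "<category xmlns='http://www.w3.org/2005/Atom' "
    "term='MSFT' scheme='http://www.fool.com/cusips' />"
)
rss_cat = ET.fromstring('<category domain="http://www.fool.com/cusips">MSFT</category>')
print(category_pair(atom_cat), category_pair(rss_cat))
```

Both calls produce the same pair, so the difference is purely one of syntax that every parser now has to handle twice.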

Content by Reference

One of Atom’s innovations is introducing a standard “link” element, allowing it to leverage and become a first class citizen of this Web thing which seems so popular. (I’ve written about this before, and mourned Kevin Burton’s mod_link)

So why does the Atom “content” element now allow for “content by reference”, with the “src” attribute? Honestly, this seems like a clear case of confusing duplication. TIMTOWTDI is one of the key factors people point to when claiming that Perl is an ugly, under-designed, confusing mess.

For example, are there now two, valid but disparate ways of encoding podcasts in Atom?

And should a toolkit parsing Atom 1.0 automagically dereference the URL defined in the “src” attribute? I mean, if I’m parsing a feed I expect to be able to display the contents of the content element. And if so, are we back to those wonderful bygone days (cough “rss2:enclosure”) of baking dangerous required behaviour into our processing model? (Speaking of processing models, shouldn’t the Atom 1.0 processing model address this?)
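One cautious answer, sketched in Python: report the reference and let the caller decide whether to fetch it. This is only one possible policy, not something the spec mandates; `read_content` and its return shape are invented for the example, and nothing here touches the network.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def read_content(entry_xml):
    """Describe an entry's content without dereferencing anything."""
    entry = ET.fromstring(entry_xml)
    content = entry.find(ATOM + "content")
    if content is None:
        return {"kind": "missing"}
    src = content.get("src")
    if src is not None:
        # Out-of-line content: hand back the URL, do not download it.
        return {"kind": "reference", "src": src, "type": content.get("type")}
    return {"kind": "inline", "text": content.text or "", "type": content.get("type", "text")}

inline_entry = """
<entry xmlns="http://www.w3.org/2005/Atom">
  <content type="text">Hello, world</content>
</entry>
"""
by_reference = """
<entry xmlns="http://www.w3.org/2005/Atom">
  <content type="audio/mpeg" src="http://example.org/show.mp3" />
</entry>
"""
print(read_content(inline_entry))
print(read_content(by_reference))
```

The cost of this policy is exactly the complaint above: a caller that just wants to display the content now has two code paths to deal with.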

Tagged: Uncategorized

Private Feeds, and Atom as Open Pipe

January 25th, 2005

Tim Bray has a short entry on Private Syndication this morning, which I by and large agree with. Personal feeds make sense; they make sense from the perspective of business workflow, content model, and scalability.

In order to make it happen we really need an updated list of which aggregators support which key features. HTTP Auth (at least Basic, if not Digest) and SSL are the fundamental building blocks of private feeds, with the addition that the major aggregator services need to be aware that content could be negotiated at auth time. The only list I know of is from July 2003.
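On the client side the building blocks are small. A Python sketch, using only the standard library, of preparing a request for a private feed over HTTPS with Basic auth; the URL, credentials, and the `private_feed_request` helper are placeholders invented for the example, and the request is built but never sent.

```python
import base64
import urllib.request

def private_feed_request(url, username, password):
    """Build (but do not send) a Basic-auth request for a private feed."""
    if not url.startswith("https://"):
        # Basic auth is only base64, so it needs SSL underneath it.
        raise ValueError("refusing to send Basic credentials without HTTPS")
    raw = ("%s:%s" % (username, password)).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    request = urllib.request.Request(url)
    request.add_header("Authorization", "Basic " + token)
    return request

request = private_feed_request("https://example.org/feeds/alice.atom", "alice", "secret")
print(request.get_header("Authorization"))
```

Passing the result to `urllib.request.urlopen` would perform the fetch. Because the server knows who is asking, it can return content tailored to that user, which is the "negotiated at auth time" point above.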

I was puzzled and pleased to see his closing line:

One detail: I think that for this kind of content-critical, all-business feed, Atom is a more attractive choice than any of the RSS flavors.

Which is odd, because all of the time I’ve spent with the Atom community (which was admittedly still called Pie/Echo at the time) was focused on blogs to the exclusion of all else, and all arguments I made about the potential of pushing other forms of data over this new format were ignored/squelched.

For example, an Atom feed requires every entry to have an author element, which is defined as a Person construct. Who is the “person” in an Atom feed generated by your “bank account, credit card, or stock portfolio”?

Additionally perhaps the language of the spec needs to be updated with some namespace best practices, and some non-blog examples?

Tagged: Uncategorized

RSS 1.0 and Links

January 20th, 2005

I swear I get closer to walking away from RSS 1.0 every day and not looking back. If Userland RSS wasn’t such a freak show I’d be gone already. One of my greatest frustrations with RSS 1.0 has been the refusal for years to add an equivalent to the Atom link construct, on the grounds that RDF already provides this. You just can’t add it to your feeds today, or even tomorrow.

The thread recently re-emerged as “mod_link deprecated”, and Ken’s postscript today really sums up, for me, the rss-dev working group’s total disinterest in non-RDF solutions and lack of commitment to running code.

P.S. I understand that a follow-on discussion could happen about what constitutes a “generic link”, but the protracted discussion on atom-syntax resolved that there’s no such thing as a “generic link”, it’s just that some people feel more comfortable “extending” a syntax at a bottleneck point where they believe they don’t have to create a namespace or URI and new XML elements. RDF says, “get past it, it’s simpler when you do”.

It isn’t easier. It is harder to understand, harder to explain, provides less guidance to new developers, and makes it that much more difficult to write a parser. Obviously I’m in favor of namespaces, but not allowing a simple link syntax is why all the innovation is currently happening in RSS 2.0 space. (despite any misgivings regarding “podcasting” it embraces my understanding of RSS as generic webservice stream syntax, an insight that really opens up possibilities)

Tagged: Uncategorized