Flickr, Twitter, OAuth: A Secret History

July 1st, 2009

I remember it as a dark and stormy night, that seems unlikely, but I’m sure it was late and chilly and damp.

I remember being tired from a long day in the salt mines; that was during a period when I was always tired after work.

I remember there being whiskey, and knowing @maureen, that seems likely.

I’d just won some internal battles regarding delegated auth, and implemented Google AuthSub for the new Blogger Beta, as well as Amazon auth for a side project. So when I wanted to share photos from Flickr to Twitter, I knew it wasn’t going to be over HTTP Basic Auth.

A few weeks earlier @blaine and @factoryjoe had pulled me a into a project called OpenAuth that they’d been talking about for a couple of months — an alternative to yet another auth standard, and a solution for authenticating sites using OpenID.

So one late, damp night along Laguna St. with whiskey, we did a pattern extraction, identifying the minimal possible set of features to offer compatibility against existing best practice API authorization protocols. And wrote down the half pager that became the very first draft of the OAuth spec.

That spec wasn’t the final draft. That came later, after an open community standardization process allowing experts from the security, web, and usability community to weigh in and iterate on the design. But many of those decisions (and some of the mistakes) from that night made it into the final version.

Yesterday, a little over two years later, we finally shipped Flickr2Twitter.

So it was nice yesterday when people commented on the integration:

“Uses OAuth!” “Doesn’t ask for your Twitter password” “Great use of OAuth”.

And I thought to myself, “It better be, this is what OAuth was invented for — literally”.

George Oates: A new leaf.

April 27th, 2009

Today is George’s first day heading up the Open Library project, a project with a simple goal:

“a page on the web for every book ever published”

Woot! Something I’ve wanted for forever.

And George is a great person to head it up. Congratulations all around

Tagged:
  • April 17, 2009

    #2 Every Building with a Shoebox in it’s Basement.

    “Buildings could offer WiFi photo uploading service, in return for keeping the photos taken of them….… what if Cloudgate were built with servers and wireless inside, right from the start, offering to consume the photos taken of it. You take a shot with a wireless enabled camera and it could store a copy for you. It’s building up a library of itself, in all seasons, in all weather. Meanwhile you, have a backup, findable by time and browsing, stored safely in the Cloud!”

    + 0. (Aside , , , , )

Hope vs Optimism

April 14th, 2009

One of the most interesting workshops at the NYC Anarchist Book Fair this year was Matt Hern’s talk “Against Tolerance”. And one of the most useful ideas I took away was an off hand comment on hope vs. optimism, which he attributed to Cornel West, who attributes it to Havel.

You have to draw a distinction between hope and optimism. Vaclav Havel put it well when he said “optimism” is the belief that things are going to turn out as you would like, as opposed to “hope,” which is when you are thoroughly convinced something is moral and right and just and therefore you fight regardless of the consequences. In that sense, I’m full of hope but in no way optimistic. – Cornel West
Hope is definitely not the same thing as optimism. It is not the conviction that something will turn out well, but the certainty that something makes sense, regardless of how it turns out. – Václav Havel

I’m finding it useful.

Tagged: , ,

RevCanonial blog turns 10!

April 12th, 2009

Last Saturday morning’s hack seems to have fulfilled a burning need folks had, and the pace of uptake (and the occasional impassioned rebuttals) have been kind of stunning.

But even more stunning has been the outflow of awesome writing and code from some of the smartest folks around. Which I’ve been woefully failing to keep up with over at the RevCanonical blog. The blog just hit 10 posts, which is about as many as I expected to post ever.

Thank you, you’re all awesome.

Tagged:

URL Shortening Hinting

April 3rd, 2009

Chmurka & Bu?eczka (+ MESSAGE TO THE WORLD)

We’ve been playing with coding up a URL shortener at Flickr this week. A URL shortener that only shortens Flickr URLs. (and right now only shortens URLs to photo pages)

Amazingly enough, this stodgiest of all conversations is a hot topic this week, with joshua’s url shorteners considered harmful post acting as something of a catalyst. Though the slick looking working on DiggBar has to be on folks mind.

Meanwhile I was curious if there a proposed rel=”alternate” syntax floating around for sites that run their own URL shortener?. I thought Dave posted one as part of his call for sites to run their owner shorteners, but I can’t find any evidence that this is correct memory.

Kevin Marks suggested rel=”canonical” and when I said that was the opposite of what I wanted replied rev=”canonical” is by definition the opposite of rel=”canonical”, but in practice people don’t grok rev. I had never heard of “link rev”, but lose the idea of marking up pages with their canonical opposites. But is that correct?

Les Orchard suggested the probably more practical rel=”shorter-alternative”, with Jon Williams refining to “rel=”shorter alternative” because IIRC rel values are like CSS classes“.

All of this implicitly based on the Mark’s work on RSS auto-discovery.

I realize this probably reads like a blow by blow description of watching paint dry.

But I’d like to pretend we’ve got some momentum here, maybe we could solve all the world’s problems my dinner time. (or at least auto-discovering url shortening problems)

update 2009/4/6: And I’ve published my rev=”canonical” URL shortening service

Photo by chmurka, just because.

No, really

March 19th, 2009

Uploaded by straup on 12 Mar 09, 1.49PM EDT.

Tagged: ,

Streams, affordances, Facebook, and rounding errors

March 18th, 2009

I’m not really a Facebook user, but it is impossible to be a serious practitioner of the rough craft of building social software without being at least somewhat a Facebook watcher. So indulge me a bit, as I add my own thoughts to the cacophony of folks writing about the Facebook re-design.

I’ve always thought their status updates design was brilliant. Not because it was usable or attractive, I’ve always thought it was terrible. But because their design didn’t make promises they couldn’t keep.

Think briefly about the platonic ideal of an activity stream, the increasingly common social pattern that makes your traditional CRUD fronted MySQL install cry at anything remotely resembling scale. All the updates from your social network, quickly listed for your viewing pleasure, in reverse date ordering. No two users of your service will share an activity stream view (unless your service tends towards 100% social graph overlap, in which case why bother?), writes are high volume and need to be committed quickly to preserve ordering, and shared caching is right out.

So you go queue-ish, you de-normalize. And now you’re pushing messages around between services, transactional commit are gone, and you’re dealing with the inevitable skew of distributed systems. But even in queue systems, 100% guaranteed, in order delivery is more fantasy then reality (though you can get close).

But Facebook was smarter then that. They specifically designed a page that was lossy. They said, “You don’t want to see everything, here is a subset of things your friends do we think you’ll be interested in.” And so you knew that you weren’t seeing everything, it wasn’t that they were failing their contract with you, but that they had decided not to show you something for editorial reasons. And you knew that if you wanted to see everything you had to dig, because that was the contract. And that digging was scoped to a user, your wall or your friends wall, data scoped by data owner — super cheap look up.

Contrast this to Twitter.

Twitter is infamous for its bad period of down time as growth went asymptotic. But less well remembered is the teeth gnashing and hair pulling of the bad period right before, where update loss, delivery failure, and out of order delivery where the bugaboos of the day. Twitter promised you would see all your friends updates, always, neatly collated. The promise is implicit in the design, the language, the APIs, the very DNA of the service. (in fact Twitter used to make more audacious claims, I still mourn the death of the “With Friends” feature, that allowed you to see anyone’s public updates in the context of their friend network, not just your own).

One of the best, unattributable quotes from Social Foo last year was the data point that Facebook was at one point losing up to 80% of messages across their update bus. As someone whose expectations are shaped by the five nines style promises of Twitter, its a loss at scale which I can’t possibly fathom. And it wasn’t even an issue in the Facebook community. And when they expire updates out of hot storage to less accessible stores, you don’t notice, because they never offered you the option to page back forever. Contrast again to Twitter whose design (if not content) encourages you to page back forever until you smack up against an arbitrary and surprising limit. (whose exact location has changed over the years)

That is designing with affordances. Don’t let your design make promises you can’t keep.

A much smaller and possibly less well known example is the Flickr activity page, where you can monitor activity on your photos, or photos you’ve expressed interest in. For years this page was framed in the language of “which of these limited time periods are you interested in seeing events in?”, that was the question the page tried to answer. Not, “what has ever happened on my stuff?”. Because that was a much harder, and more expensive question to answer. As part of the Toto launch (new homepage) on Flickr last Fall we explicitly changed our contract with our users. Great photography has a 150 year tradition, and we felt that we could at least try to expose 5 years worth of conversations. (and Flickr usage by our members evolves and changes as their lifes evolve and change, something all good social software should design for, rather then living in the ever present now) Our activity streams go all the way back to the beginning now, but it wasn’t a change undertaken without a lot of thinking, architecture, and engineering.

Simon Willison asked this week about best practice for architecting activity streams. And the answer is, “It depends.” Depends on the scope, scale, access patterns, and affordances you’re building — your contract with your users.

Which is a long way of saying think hard about the promises you make to your users, implicitly or explicitly.

And, Facebook, my friend, what the HELL are you thinking? You managed to negotiate the best deal in the business, talk about a racket, and you threw it away for a piece of Twitter’s pain? Are you stupid? Well, best of luck with that.

Flickr wins Classic award at SxSW

March 16th, 2009

by Kent Brewster

Tagged:

Las Manitas

March 14th, 2009

Las Manitas was one of my favorite things about Austin. I miss it. I took a photo of the empty lot that used to be my favorite breakfast spot in the whole world.

The Flickr nearby page is one of the best memorial to it I’ve seen.

Picture 31

Is a Firehose of Snowflakes a Nor’easter?

March 4th, 2009

I tried explaining the title of this blog post to Jasmine this morning. Suffice to say my explanation needed a bit of practice. And more than 140 characters. Or it might just be I’m a bit stir crazy from Winter returning with a vengeance in these here parts. But I wanted to call out a couple of points that might have gotten overshadowed in the good Reverend’s recent post on the Flickr Panda APIs.

NewsWire API

Picture 21

The NY Times at their great Times Open event announced their Newswire API, which is a real time stream of their content. Stories, and blog posts, and what not. More interestingly was their discussion about how they’ve built a backend “pinging service” that makes it easy for them to add new types of data to their stream. I’m a dork enough that a Grey Lady firehose sounds pretty awesome.

But they got some flack for it being a snowflake API. From where I sit snowflake APIs look like opening up your data as fast as possible, along any means necessary, and trying not to pre-judge how people will use it, but I’m thankful for the metaphor, as it allowed me to spend the morning envisioning fire hoses of snowflakes.

Still I spent 2007, and 2008 talking about how XMPP was going to be a key piece of building firehoses standardizing and enabling the real time Web, so its a criticism I’m sensitive to. (and I’ve already been skipping conferences in 2009 in the hopes of actually having some time to build it, though thankfully minor details like time haven’t stopped my colleagues at Fire Eagle from launching theirs)

Pandas

Flickr Panda!

Which is all apropos of saying, we launched our own “snowflake” realtime API yesterday. (though actually its just a slight modification of our standard photo response format). And its Panda-shaped. And it is awesome.

Near Real-Time, Every Minute, up to 120 Events

But because the documentation is quirky, I think people missed the significance. These are Flickr real time data APIs.

We’re building streams of photos in real time. Examining the huge stream of data events that happen on Flickr, the social activity, the searching, the meta-data creation, and fishing from that stream to build 3 real time streams. We’re then exposing those streams via a near real time polling based API.

The API pattern is specifically structured around making it easy to call from client side scripting, and the data streams are structured around discovery rather then guided search, but we’re pushing up to 120 discovered photos down these streams each minutes, every minute. Two streams of real-time interestingness, and 1 of lightly interestingnessed geotagged photos.

And they’re named after famous pandas. Really what more do you want?

Whither XMPP

So what’s up with the blossoming real time data APIs? And where is our promised standardization? They’re coming. There has always been a tricky chicken and egg problem. There is so little data out there that is appropriate to expose in a real time fashion, that there is little demand to consume it, so the tools fail to evolve. But I’m seeing tons of work, great toolkits from like Fire Hydrant from FireEagle and Babylon from notifixio.us, and Google’s decision to make XMPP a standard part of their AppEngine toolkit are just I’ve been most excited about recently.

Flickr Trends

March 3rd, 2009

snow vs flowers

Based on Derek’s NYT Trender and some APIs we haven’t gotten around to releasing, I spent 20 minutes whipping up Flickr Trends yesterday morning.

App Engine is awesome for this kind of stuff.

Favorite’s I’ve found so far:

First iPhone app

February 13th, 2009

2008/2/13 6:31:30 EST

Tagged:

Happiness

February 10th, 2009

Tagged:

Fire at Beijing CCTV tower complex

February 9th, 2009
asc: it seems to me that we need to set up a magic email address for “things on fire”

Fire at 
Beijing CCTV tower complex

From an amazing set by Ai de ke.

Tagged: , , ,

NYTimes Article Search API

February 7th, 2009

NYTimes: Sex & Scandal since 1981

I don’t have much to add that the New York Times hasn’t already said about their Article Search API. Its an amazing corpus to be searchable, both in breadth, and scope, and for sheer richness of the classification. I can’t think of an remotely comparable dataset with such a rich API.

Couple of things I noticed that I wanted to call out.

Get info about an article/Search by URL

Positioned as a search API, it also doubles as a “getInfo”-style API, as article URL is one of the searchable fields.

?query=url:$article_url

Just make sure to remove the various query string bits that the Times appends, as these aren’t indexed. Should make a “find the history of this topic being discussed” Greasemonkey script a snap.

Expert’s attention information

One of my less comprehensible requests to the NYTimes developer team at OSCON last year was to make sure their APIs exposed the “attention information of [their] editors.” Age of amateur, citizen journalism, and radical decentralization are all awesome, but the NYTimes’ editors job is to think about what is important and interesting full time; and that’s information worth mining.

And they did!

The page_facet, and nytd_section_facet both allow you to gauge some degree of relative weight given to a story. (section_page_facet seems like it ought to do the same thing, but I couldn’t get it to work)

?query=flickr nytd_section_facet:[Front Page]

Gives you articles mentioning “flickr” featured on the NYTimes front page. (of which it only finds 3, alas)

API Design

Good stuff:

  • Clean hackable URLs, you can play with it in your browser and see what you’re going to get.
  • The getList + extras (called fields in the NYTimes API) is the house wisdom at Flickr, and I’m glad to see it elsewhere
  • The parsed tokens block is neat, and I can see it being incredibly useful for working with such a large, varied corpus
  • The sure amount of searchable/indexable metadata and the granularity is really unprecedented, great to see them go out with such a rich, “here’s the data do something great” approach.

Visualizations

The graphic at the top of this blog post is a “visualization of the frequency of occurrence of the words ’sex’ and ’scandal’ in the New York Times, since 1981.”, part of a set of visualizations by blprnt_van built with the article search API, and Processing.

Tagged: , , ,

Mixed Messages

January 17th, 2009

Clearly the economy is in a bad way.

What I can’t figure out is why I’m seeing a huge spike in employment inquiries (most totally ridiculous head hunter crap, and some really awesome “work on stuff that matters” style projects), the Web is full of the most innovative stuff I’ve seen in years, and everyoneiknowisdoingawesomeshit.

Some days its hard to stay pessimistic.