Blog posts tagged "programming"

  • April 25, 2010

    Code Kata: oldie, but goodie..

    Back in 2007 the Pragmatic Programmers posted a series of code kata in the style of Are you one of the 10%. Probably worth re-visiting if you’ve never done them.

  • February 16, 2010

    Wikimedia has a users RPE of 30mil.

    “Wikimedia Foundation currently employs 14 technical people (not all of whom are developers). At 400 million readers/month (as of February 2010), that’s about 1 developer per 30 million users. Accounting for open source developers probably doesn’t change that ratio by an order of magnitude.” (not exactly apples to apples, but still interesting)

  • February 9, 2010

    Why I love everything you hate about Java « Magic Scaling Sprinkles.

    NK: “All that boilerplate is really important when you work at massive scale and where efficiency really matters.” More meditations on the coding for scale question and the role of cleverness and abstractions. That said I think the factories actually hide the hard learning that’s gone into those pool size choices.


Try Coding Dear Boy

September 29th, 2009

Several times a week I get emails like, “Can you explain how Flickr does XYZ? I’m hoping there is a nice packaged solution for this.”

XYZ can be anything from “draw tag maps”, to “click in place editing”, to “scale MySQL”.

And I’m always a little baffled as to what to respond.

Until I realized what was going on.

Laziness Impatience Hubris

This is the dark side of the geek virtue of laziness.

The belief that if one just thinks hard enough, or cleverly enough, problems will have an “elegant solution”. And by “elegant” we mean a solution that doesn’t involve much code. (Elegant, such a tricky word; it can also mean writing tons of code for problems that will likely never manifest.) And as for “think hard and clever”, a good shortcut is probably just to ask someone.

So I’ve come up with a response that looks something like:

“We generally try to do the dumbest thing that will work first. And that’s usually as far as we get. Almost everything we do is pretty straightforward, and as such is well documented around the Web, sometimes by us, generally by others. And when we do get fiendishly clever, as we do now and again, it’s usually a highly tuned (read idiosyncratic) solution for the problems we’re trying to solve.”

Method Acting

But what I want to say, in the spirit of the great Laurence Olivier on giving advice to Dustin Hoffman, “My dear boy, why don’t you try coding?”.

Duct Tape

Which is not to say I mind getting these questions, I love talking about this stuff, it’s why I do it. And it’s often amusing to give the, “This is how we do it at Flickr, no really” talk.

But at the end of the day it’s 0.1% compsci, 0.9% clever ideas, and 99% duct tape. (btw Joel’s The Duct Tape Programmer is a meditation in a similar vein)

A Couple of Caveats on Queuing

July 7th, 2008

Les’ “Delight Everyone” post is the latest, greatest addition to the “17th letter of the alphabet for savior” conversation.

And believe me I’m a huge fan, and am busy carving out a night sometime this week to play with the RabbitMQ/XMPP bridge (/waves hi Alexis).

But …. there are a couple of caveats:

1) Some writes need to be real time.

Les notes this as well, but I just wanted to emphasize because really, they do.

If you can’t see your changes take effect in a system, your understanding of cause and effect breaks down. It doesn’t matter that your understanding is wrong; you still need one to function. Ideally a physical analogy too. There are no real world effects that get queued for later application. Violate the principle of (falsely) seeming to respect real world cause and effect and your users will remain forever confused. Showing you the wrong state when you use the inline editing tool, and Flickr taking a handful of seconds to index a newly tagged photo, are both good examples of subtly broken interfaces that can really throw people.

My data, now real time. Everyone else can wait (how long depends on how social your users are).

2) You’ve got to process that queue eventually.

Ideally you can add processing boxes in parallel forever, but if your dequeuing rate falls below your queuing rate you are, in technical terms, screwed.

Think about it: if you’re falling behind by 1 event per second (processing 1,000,000 events a second but adding 1,000,001, for example), at the end of the day you’re 86,400 events in debt and counting. It’s like losing money on individual sales, but trying to make it up in volume.
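To make that arithmetic concrete, here is a minimal sketch in Python. It assumes constant enqueue and dequeue rates, which real traffic never has, and the function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def backlog_after(seconds, enqueue_rate, dequeue_rate, start=0):
    """Events still waiting after `seconds` at constant rates (never below zero)."""
    return max(0, start + (enqueue_rate - dequeue_rate) * seconds)


def seconds_to_drain(backlog, enqueue_rate, dequeue_rate):
    """Seconds needed to clear `backlog`, or None if consumers never catch up."""
    surplus = dequeue_rate - enqueue_rate
    if surplus <= 0:
        return None
    return backlog / surplus


# The example from the text: add 1,000,001 events a second, process 1,000,000.
debt = backlog_after(SECONDS_PER_DAY, 1_000_001, 1_000_000)
print(debt)  # 86400
```

The same numbers show why the quiet times matter: the debt only shrinks while the dequeue rate is above the enqueue rate, and `seconds_to_drain` returns None for as long as it isn't.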

Good news: Traffic is spiky and most sites see daily cycles with quiet times.

Bad news: Many highly tuned systems exhibit slow down properties as their backlogs increase. Like a credit card, processing debt can get exponentially unmanageable.

In practice this means that most of the time your queue consumers should be sitting around bored. (see Allspaw’s Capacity Planning slides for more on that theme.)

If you can’t guarantee those real time writes for thems that cares, and mostly bored queue consumers the rest of the time then your queues might not delight you after all.

See also: Twitter, or Architecture Will Not Save You

Working Notes on Consistent Hashing

March 19th, 2008

Nice to see consistent hashing go from obscure to blindingly obvious in a few short whitepapers.

Dynamo is certainly the sexiest discussion of distributed hash tables (DHTs), while Programmer’s Toolbox Part 3: Consistent Hashing is the most straightforward. libketama is an open and easy-to-use implementation of the 64-bit space mapped to a circle style consistent hash, discussed above and originally “popularized” by Chord (and proposed over a decade ago).

And best quote:

“…and if anyone tells you that you shouldn’t use MD5 for this because it isn’t secure, just nod and back away slowly. You have identified someone not worth arguing with.”
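For the curious, the circle idea fits in a few dozen lines. What follows is a rough sketch in Python, not libketama’s API or its actual point layout: the `HashRing` class, the `replicas` count, and taking the first 8 bytes of the MD5 digest as a 64-bit position are all assumptions made for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent hash: nodes and keys share one 64-bit circle."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # points placed on the circle per node
        self._points = []         # sorted positions on the circle
        self._owner = {}          # position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _position(key):
        # MD5 is used only to spread keys evenly, not for security.
        digest = hashlib.md5(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add(self, node):
        for i in range(self.replicas):
            point = self._position(f"{node}#{i}")
            if point in self._owner:
                continue  # vanishingly rare collision: keep the existing owner
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # Walk clockwise to the first point past the key, wrapping around.
        i = bisect.bisect_right(self._points, self._position(key))
        return self._owner[self._points[i % len(self._points)]]


ring = HashRing(["cache1", "cache2", "cache3"])
print(ring.lookup("photo:42"))
```

The payoff is in what happens when the node list changes: removing a node only remaps the keys that node owned, and adding one only steals keys for the newcomer. Everything else stays put, which a plain `hash(key) % n` scheme can’t promise.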

OAuth in PHP (for Twitter)

October 16th, 2007

Mike released HTTP_Request_OAuth today, so I spent a little while this evening coding up Service_Twitter as a helper class for making OAuth authorized requests against the Twitter API.

Both are early enough in the dev cycle to be called proofs of concept.

Mostly I wrote it because I had always envisioned there being wrapper libraries around the low level OAuth implementations that wrapped the calls and constants, and as Mike graciously went out and wrote a low level library, I felt compelled to write a wrapper.

Also twittclient, an interactive client for getting an authed access token, essential to bootstrapping development.

And nota bene, HRO currently only supports the MD5 signing algorithm, which is undefined in the core spec, and subject to change. (Just in case you didn’t believe me about the early state of things.)

update 2008/4/18

This code no longer works because Twitter has taken down their (slightly non-compliant) OAuth endpoint. When they add OAuth support back in, I’ll link to it.


February 13th, 2007

“I can’t tell you how many times I’ve heard people say they wouldn’t use Ruby because it lacks automated refactoring tools. Ruby doesn’t actually need them in the way Java does; it’s like refusing to switch to an electric car because there’s no place to put the gasoline.” – Steve Yegge on the NBL