Blog posts tagged "ha"

Facebook on “Scaling Out”

August 21st, 2008

Jason Sobel has an interesting post, “Scaling Out”, on Facebook’s BCP work and the move to being multi-colo.

Interesting to me was noting that:

  • they just got around to this 8 months ago, and they’re fscking Facebook (which means you can wait)
  • they’re still doing all writes to a single datacenter
  • they’re hacking an object-level mark/sweep into the MySQL replication stream, suggesting a certain parable of a hammer and nails.

via PaulH

XMPP in TiVo

January 11th, 2008

“Today each TiVo polls TiVo’s servers roughly every 15 minutes to check for new scheduled recordings, TiVoCast downloads, Unbox downloads, etc. That’s highly inefficient – nearly all of those polling calls are for nothing. There is nothing waiting to be done. And it introduces a lag when you want to start a download – up to 15 minutes. And it doesn’t scale well as TiVo’s user base keeps growing.

So what’s changed? The polling system is gone. TiVo is using XMPP now instead. What is XMPP? The Extensible Messaging and Presence Protocol – better known as the instant messaging protocol that powers Jabber, Google Talk, and other IM systems.” – Peter St. Andre noticed an interesting announcement coming out of CES. (via aaron)
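To put rough numbers on the polling cost described in the quote, here is a back-of-the-envelope sketch. Only the 15-minute interval comes from the quote; the fleet size is a made-up illustration value, and the average-lag figure assumes requests arrive uniformly at random within a polling interval.

```python
# Back-of-the-envelope cost of fixed-interval polling.
# Only POLL_INTERVAL_MIN comes from the quoted passage; everything else
# (function names, fleet size) is hypothetical and for illustration only.

POLL_INTERVAL_MIN = 15
MINUTES_PER_DAY = 24 * 60


def polls_per_day(devices: int, interval_min: int = POLL_INTERVAL_MIN) -> int:
    """Total polling requests per day across all devices."""
    return devices * (MINUTES_PER_DAY // interval_min)


def average_lag_min(interval_min: int = POLL_INTERVAL_MIN) -> float:
    """Mean wait before a device notices new work, assuming the work
    shows up at a uniformly random point within a polling interval."""
    return interval_min / 2


if __name__ == "__main__":
    devices = 1_000_000  # hypothetical fleet size, not a TiVo figure
    print("polls per day:", polls_per_day(devices))
    print("average lag (minutes):", average_lag_min())
```

Each box makes 96 requests a day whether or not anything is waiting, which is the overhead a push channel such as XMPP avoids.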

Google Talk Architecture, and High Availability (HA)

July 29th, 2007

Via the HA blog (an obviously unserved niche in retrospect), a very interesting 30-minute presentation on the Google Talk architecture.

ConnectedUsers * BuddyListSize * OnlineStateChanges

Interestingly, people keep independently re-discovering that maintaining presence is the hard part of scaling these systems.

It’s something that really came home hard in my conversations with Twitter while helping with their scaling challenges (so much so that we took a slide out of our “Social Software for Robots” talk to talk about it, and Blaine mentioned it again in his “Scaling Twitter” talk).

So by way of a PSA:

Presence isn’t easy.

Growth in social systems is non-linear. Ignore the network effect at your peril.
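A minimal sketch of why that is, reading the ConnectedUsers * BuddyListSize * OnlineStateChanges product as an estimate of presence notifications delivered. That reading, the function name, and all the numbers below are assumptions for illustration, not figures from the presentation.

```python
# Presence fan-out as a simple product of three factors.
# The interpretation (notifications delivered per time window) and every
# number used here are hypothetical.

def presence_messages(connected_users: int,
                      buddy_list_size: int,
                      state_changes_per_user: int) -> int:
    """Each state change by a user is fanned out to everyone on that
    user's buddy list, so the three factors multiply."""
    return connected_users * buddy_list_size * state_changes_per_user


if __name__ == "__main__":
    small = presence_messages(1_000, 10, 4)
    # Network effect: as more people join, buddy lists tend to grow too.
    # Doubling both users and buddy-list size quadruples the traffic.
    large = presence_messages(2_000, 20, 4)
    print(small, large, large // small)
```

If buddy-list size tracks the user count, the product grows roughly with the square of the user base, which is the non-linear growth the PSA warns about.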

Kick the Tires

Also interesting was “Real Life Load Tests”. The GTalk team deployed to Orkut and GMail weeks before actually turning on the UI for the features, so that they could monitor the load first. These are the practices that make Bill’s recent observation on HA systems possible:

An interesting takeaway is that it’s clearly possible to re-architect data storage on super-busy production systems seemingly no matter where you start from.

For the rest of the bullets, see the HA blog post.