- don’t write bugs
- once you write a piece of code you own it forever and your top priority is fixing the bugs in it
- assign teams to own every section of the site, allow them to make sure no unapproved changes happen to those areas
- have a bug fixing team
- delete all code over a year old, switch to a new language, rewrite it
Paul Mison’s Maximum Viable Product feels like a more clever (if more cheeky) way to get at what I was trying to say with my notes on Flourishes and Minimal Competence (and Competence found), and threads the needle of recent 4up from Aaron and The Flickr We Lost from Dan.
I don’t have a conclusion so much as a sense that there’s a thing there that I’m still trying to get my hands around.
The other piece I wonder at about this is the role of sophisticated testing and measurements and being the user.
I hear that engagement is way up in the wake of Flickr’s changes, and I know I’ve been gone a long time and things have probably changed, but I can’t help but be struck by how unsophisticated we were at measuring things at Flickr, in part because we were a tiny team, and in part because we relied on our instincts. It makes me wonder what’s guiding the current work.
This started out as a short note explaining the unifying theory behind the Etsy development practices. Then it got out of hand (see also, Mark Twain’s “If I had more time…”). As such I’ve made it “Part 1 of N”, where “Parts 2 .. N” will cover the actual practices and how they relate to the philosophy.
Why Write This?
We believe in being as open as possible about how we develop and run Etsy — our current best theories, learnings, practices, and tools. Given that openness I often get questions about the hows and whys of given subsets of our engineering practice, e.g. “How do you do testing? How do you know you have enough testing?”. Or monitoring or deployment or what not.
At times, it can be tricky to do the questions and answers justice when taking them piecemeal, because underlying them all is a single shared philosophical premise, that isn’t necessarily obvious. And while I tend to be a pragmatist, favoring the rough and ready over the theoretical, without understanding the theory you can’t reason correctly about trade offs. This post is an attempt to surface that underlying philosophy and the practices it informs.
The theory is a theory of change, and the philosophy is about finding paths to move from risk to confidence.
A Theory of Change
Etsy is in the change business. As are, definitionally, all startups, nearly all businesses, and most human projects. We’re attempting to add new capacities to the world, and influence behavior around them. And we’re attempting to do it in an uncertain and complex environment; we neither know the exact recipe for success, nor do we expect that recipe to stay the same over time. In fact, as a startup, we believe that our ability to respond to a changing environment is the key success factor for our engineering organization. It’s natural to read that sentence and think of change in terms of product changes, but more prosaic examples might include the ability to add new server capacity when the site slows down, or to replace a hard drive when one fails.
But change is risky. This is something most of us believe intuitively, and it’s worth examining the sources of risk in change.
Why is change risky?
As humans and practitioners why do we associate change with risk? Doing new things inherently contains the risk of doing the wrong thing.
We may, for example, have reached a steady state in our project. Through a combination of good luck and planning we find ourselves running a system that we understand sufficiently to keep running indefinitely, while a change would implicitly contain the risk of moving from a state of working to a state of not working. Steady state systems are so rare and so often illusory, it’s almost not worth mentioning except we’re fervently entranced with the possibility. Generally the illusion of steady state simply means the needed changes are non-linear, and often the cost of ignoring them will be high.
More practically, very few of us are employed to maintain systems in a stable environment. Even if we hold the pieces we control constant it’s unlikely that our systems will remain stable forever, at which point action is required. Still change is often associated with surprise in a system that hadn’t previously surprised us, and surprise is definitely risky.
The second reason change is risky has to do with how we think about causality, intention and culpability. While we can agree that the ability to choose not to make a change is an illusion, often fear leads people to approximate avoiding change, by avoiding making choices. If I personally avoid making changes to a system and instead wait for outside pressure to force change, or if I simply play the odds and hope that disastrous failure happens rarely enough that it won’t happen on my watch then I can avoid the personal risk of being labelled the root cause of failure. Forced choices avoid the necessity of stating a hypothesis before acting, thereby reducing significantly the opportunity to be personally wrong.
Software development is a complex system existing as it does at the intersection of people, systems, good intentions, confused and changing goals, and overly literal state machines. Past behavior isn’t always an indication of future behavior, and humans are terrible at reasoning about complex systems. As such we’re unlikely to know or have good visibility into whether we’ve reached a steady state and our hypotheses are likely to be wrong. In this uncertain and complex environment we initiate change only when the cost of not making a change overcomes the fear of making it. (e.g. “The server is down” or “You’ll be fired if this feature isn’t done by April 1st”)
As an industry this means though we’re in the change business, often we aren’t very good at it, and we avoid it out of fear.
Different groups attempt to address this tension by:
- raising the cost of not making change (“you’ll be fired”)
- distributing those costs broadly (this is one of the key functions bureaucracy and process serve)
- gaining confidence by addressing the fear
We see Etsy’s engineering practices as a spectrum of tools for increasing our confidence in our ability to make change.
Going back to the opening idea of this post, the attempt to answer a question like, “How much testing do you do?”, the answer becomes, “Enough to gain confidence. But testing is just one of the tools we use to gain confidence, so less than a strong testing shop might.” Similarly if someone asks, “How much monitoring is enough?”, the answer is, “We add monitoring until we feel like it gives us confidence, and we’re comfortable striking an 80/20 balance, particularly upfront, because we’re confident if we don’t have the balance right we have other ways of finding out.” In fact how many and how much confidence boosting techniques you need is situational, and depends on how risky your change is. Which speaks to another fundamental piece of our process, small and iterative changes.
Hopefully that starts to explain why, while I think our testing infrastructure (with its try-servers, “Bobs”, static analysis, integration tests, and quality metrics) is awesome, just telling you how we do testing isn’t necessarily going to be useful. Or perhaps it just speaks to my personal penchant for holistic post-modern explanations.
So given a theory(-ish) of risk, change and confidence, what’s the philosophical premise we derive to underlie our development practices?
To be able to consistently deliver the kind of resilient and ongoing change the business requires, we deploy a spectrum of confidence gaining techniques.
Or jokingly what we call, “Making failure cheap and easy.”
Before moving on it’s worth calling out that the goal is NOT to be careful. The goal is to be confident. Careful would imply we’re trying to avoid the risk which is fundamental to the change we’re trying to make. Attempting to avoid risk often leads to paralysis, favoring the short term risk avoidance while compromising long term goals. Instead confidence implies you believe to the best of your ability that you understand and have mitigated the risk involved in your change, and are now going to act.
Now, with a little shared theory and philosophy, what does that spectrum of confidence gaining techniques look like?
Our Paths to Confidence:
- Small and Frequent (and Iterative)
- Ramp Ups
- Default Access to Open
- Monitoring, Metrics and Anomaly Detection
- People / Culture / Brains
Each of which I’ll talk about in subsequent posts.
1. A complex system, defined as something that has many diverse, interdependent, adaptive and connected parts, points to the uncertainty. Small perturbations can produce large results, and those results could be failures or successes, but in either case the potential for surprise is high.
I’ve got a short list of things I tell people they need to do to survive being senior management. This list has come up a bunch in the last week talking to different folks. So I’m writing it down so I don’t actually have to remember it. That’s sort of unfortunate because there are some alternate versions that exist in a super-positional state, but I think having it written down outweighs the flexibility.
I’m sure the list isn’t especially unique to being senior management but there are a few things that are unique to being senior management, that make it particularly relevant:
- It’s a job where your ability to cope with your demons is critical to the success of everyone who works for you. See also Ben Horowitz’s, “What’s The Most Difficult CEO Skill? Managing Your Own Psychology”
- Most people doing the work (at least in tech) are transitioning from maker to manager, and while a few special people are both good at management and enjoy it (and also a few sociopaths), most of us find it a really difficult transition to feel good about consistently and on an ongoing basis.
It’s a simple list. It shouldn’t surprise. This is the minimum. This is my list from having done the job, managed folks doing the job, hired, promoted, and fired folks doing the job, and perhaps most importantly drank with folks doing the job. Your mileage may vary. (But I’d be kind of surprised.)
1. Get some exercise
The ways most of us cope with stress are toxic. They lead to sickness, injury, and reduced cognitive clarity and elasticity. Small amounts of regular exercise help. This is not about getting in shape, this is not about living longer, that’s between you and your work-life balance. But to be an effective manager you need to be healthy, functioning, and present, and exercise will help with that.
2. Have someone to talk to
There are two variations of “have someone to talk to” on this list. That’s how important it is. Management brings shit up. It’s a psychological job. Your relationship with your parents is unfortunately relevant, as is just about every other aspect of your personality. Knowing what triggers you, and why, and having someone you trust to talk through it is the only way to do the job well.
It can be a coach, a therapist, a good friend, potentially a very patient and saintly spouse (not recommended). Ongoing, trusted and good at listening are the characteristics you’re looking for.
3. Talk with peers
As distinct from #2, find some folks in your industry, with similar job scope. Get together regularly. Talk shop. But the real shop. The stuff you don’t talk about when the people you work for or the people who work for you are around. This should be off the record. This isn’t a meet up. Start with a small group. Intimacy is the name of the game. Alcohol can help. Ask people, they’ll say yes, everyone needs to talk.
What you’ll find out is everything is fucked up everywhere. And you’ll feel better about your own job. Your problems suck, but boy are you super glad you don’t have their problems. And they’ll feel the same way about you, and your problems.
Perspective is the thin line between a challenging but manageable problem, and chittering balled up in the corner.
4. Have a personal mastery project
Maybe you used to be a coder. Now you’re management overhead. But you really loved coding (and probably because you loved it so much, you spent a lot of time analyzing how folks could do it better, and that’s how you ended up in this mess). You’ve admitted to yourself that you can’t really spend your time writing much code anymore, but you like to keep your hand in the game, carve off a small project here or there for yourself, something that you can look back on after a day and say, “Hey, I actually accomplished something today, not just gone to meetings.”
You’re almost certainly doing it for the wrong reason. Cut it the fuck out.
There are lots of good reasons to stay close to the day to day work (including, but not limited to, you’re an early stage startup, and everybody has to pitch in), and even more failure modes in that direction. But none of those good reasons are about you feeling better, or more in control, or like you “did something real”.
But that doesn’t mean the need you felt to learn, grow, acquire new skills, and generally stretch yourself (hopefully key traits in getting you this far) just goes away, or that the sometimes vanishingly abstract accomplishments of your team can be swapped in for that personal satisfaction. For that you need a personal mastery project. Something, quite probably not related to work, where you can prove to yourself that you aren’t actually getting dumber every day (just older), but can still think, reason, and learn.
A side coding project might be it, learning a language, taking a class, practicing classical piano. Something. Something that stretches you, and you can master. By yourself.
5. Put your own oxygen mask on first, before assisting others
Time is tight, and your schedule is the buckshot mess of manager time. Maybe you’ve mastered time management (block off time, use Google’s auto-reject I’m-busy feature), maybe you haven’t. You’re certainly too busy with work + life to add anything new to your schedule. After all the company depends on you (as does your family). Get over yourself.
You aren’t useful to anyone if you aren’t taking care of yourself. There is an unbounded set of things you could be doing better in order to help ensure the success of your team that will constantly expand to fill up all the time, but the most important thing you need to be doing is making sure your own oxygen mask is on first.
(hopefully first in a series of posts turning the emails, and chats I have with folks about this work into something public. Keep an eye out for, “Help, I’m a CTO now what do I do?” and “Confidence in the face of risk”)
So, here’s the insight I’m currently tossing around in my head: The problem is that software isn’t built; it’s written. The final product is not like the Bay Bridge. It’s like a novel. – Don’t build. Compose. – Kurt Leafstrand
No, that’s not right.
As appealing as it is. You see novels, modern novels, are a particularly peculiar form of creativity primarily characterized by being the point of view and output of a single individual. Most software, and most bridges, aren’t like that.
Maybe you’re writing novels, but most of us are doing improv theater.
I met Mr. Chaturong this morning. It was 6:30am and he was waiting with his taxi and his 7yr old son to drive us to our dive spot. It was a peaceful morning, up a small mountain, some dew still on the broad green leaves, away from much of the bustle of Thailand (if Koh Tao can be said to have any bustle at all). He might have been about 40 years old, I didn’t ask. I did ask if he was from Koh Tao, I hadn’t yet met someone from Koh Tao in the 4 days we’d been here, but I kept asking, here is what he answered.
“I’m from Koh Tao, people come from lots of places, all over, to find work with the tourists, but it was different when I was young. When I was young there were only 700 people on the island, it was quiet, no trucks, no motor bikes, we walked everywhere, it was nice, my dad was a fisherman, sometimes made coconuts, everyone was a fisherman or made coconuts. It was quiet.”
If you need a taxi in Koh Tao his number is 089 0049117.
I gave a talk last October at the First Round CTO Summit on what we’d learned at Etsy about hiring great engineers and in particular great women engineers for our team, and a little bit about the promising results we’d seen from the Etsy + Hacker School program.
The video went up last week, it’s 18 minutes long, it was given in a venue that was originally off the record and aimed at CTOs, and talks primarily about the work that Marc led. Folks seem to find it interesting, which is deeply gratifying.
NB: also when I say “81 of them were women”, obviously I meant “81 of them were men”. This is the problem with giving talks on back to back days on different topics in Tokyo and San Francisco, you aren’t at your most polished.
The slides are on slideshare.
The Atlantic wrote up their interpretation of the talk.
When I first started sketching out the “Minimal competence” blog post in my head, I imagined it as part of a series. The series had three real functions:
- To lay out a theory about our obligations as builders of services and consumers of those services
- To selfishly highlight the hard problems that Flickr and Etsy had tackled and others had shied away from
- To contextualize my frustration that Twitter’s search lacked two features that, while hard, were core to the way I wanted to use it, and the retreat from which felt like cowardice.
On point #3 those two missing Twitter features were:
- The ability to search over all time, not just a fluctuating window of 12 – 72 hours.
- To scope those searches to people you follow
Now I’ve built both of those features for much smaller, slower moving corpuses (after all Flickr is only up to photo id 8 billion and something, and most of Etsy’s items expire after 4 months), so I both knew it was possible, and also could only speculate on how you’d do it on top of a corpus like Twitter.
Good fucking work Twitter! (here’s hoping others follow)
I was talking with Aaron and Blackman a week or so ago about the state of reverse geocoding. This is the business of turning a lat/long into a named place. Besides a neat party trick it turns out that named places have a few benefits over floating point pairs.
- While technically the space of names is infinite and the lat/long space is finite, in practice the names we use to call places converge rapidly to a very small set (in any given region), and for whatever reason (natural or historical) seem to have an affinity for being hierarchical. Both good properties for clustering, compression, and discovery.
- Humans don’t tend to think in floating point pairs.
At Flickr we spent a while working on turning the point where a photo was set on a map (or whose GPS coordinates were shoved into the EXIF) into a place. The work of reverse geocoding is about taking a point, and finding out which polygon it’s in. This is a well solved problem. With two caveats:
- Places don’t have neat boundaries, but overlap all over each other. And people disagree about the overlaps.
- Even if places had neat boundaries, and people agreed on them, availability of information about those boundaries is variable at best.
Also it turned out not many people cared, and the ones who did tended to care extraordinarily when we got it wrong. And they cared most passionately about the things that were hardest to get right: points near borders, contested neighborhoods, etc.
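For the curious, the “which polygon is it in” step can be sketched in a few lines. This is a minimal illustration using the standard ray-casting test, not what Flickr actually ran. The place names and square boundaries are made up, and the lookup deliberately returns every matching place, since places overlap.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude)


def point_in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: cast a ray to the right of the point and count
    how many polygon edges it crosses. An odd count means inside."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > y) != (y2 > y):
            # x coordinate where this edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def reverse_geocode(point: Point, places: Dict[str, List[Point]]) -> List[str]:
    """Return the names of every place whose polygon contains the point.
    Places overlap, so the answer is a list rather than a single name."""
    return sorted(name for name, poly in places.items()
                  if point_in_polygon(point, poly))


# Hypothetical, overlapping square "neighborhoods" for illustration only.
HYPOTHETICAL_PLACES = {
    "neighborhood_a": [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
    "neighborhood_b": [(1.0, 1.0), (3.0, 1.0), (3.0, 3.0), (1.0, 3.0)],
}
```

Calling `reverse_geocode((1.5, 1.5), HYPOTHETICAL_PLACES)` returns both names, which is exactly the contested-overlap case. The geometry is the easy part. Getting boundaries people agree on is the hard part.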
Most folks dealing with geo presently bypass this whole problem, and instead go straight to human places, named things, a bar, a restaurant, a rest stop on the side of a highway. And Foursquare has done an amazing job of aggregating and sorting out human attention (and intention) around these places.
But that doesn’t mean the problem is solved, merely deferred. Not all of life’s interesting moments take place in bars (just most), and when aggregating information across multiple streams, collapsing onto human labelled places is challenging.
Plus, it’s just dumb that 100 million+ people carry a GPS device in their pockets and we have to buy expensive proprietary data to find out about the shape of where we live.
Anyway, here’s the pitch
Are you shipping a location based mobile app? Would you like to increase engagement? Consider adding a quiz to your app asking people whether they’re in Williamsburg or Northside. Are they in the Bay Area or the East Bay? Is this Burlingame or the Peninsula? And then publish that aggregated data. People love quizzes, Flickr released a piece of software for turning those answers into shapefiles, the answers change over time, and paying someone for this info is silly. But seriously, if you’re presenting people geographic data, ask them sometime if you’re getting it right.
“People think entrepreneurship is about sitting in a dreamy studio gazing out at nature whilst sipping a latte. In reality, it is a place like your studio, or sitting in your dining room in your pajamas at 10 am looking for something better than a crayon to write shipping weights on your packages with. ” – Caroline of Pease Blossom Studio regarding No Sleep Till Christmas: Etsy Sellers’ Haggard Holidays
“I’d like to find a site that invites you to populate your own archive – a bit like Facebook does for you on its timeline, but in a more customisable way, and in a way that allows you to share with people you choose. Anyone know of such a thing?” – from the comments
We got married two weeks ago today. It was the wedding we always wanted. After 13 years you don’t expect it to change much about your daily lives. And it doesn’t change much, but there are subtle (awesome) changes. For one, I’m wearing a ring.
I’ve only just adapted to the point where I’m no longer aware of the sensation of wearing the ring. I don’t feel it on my finger unless I focus on it. This is an odd reminder of how much of what we take for objective reality is the subjective pastiche of our brains’ survival-oriented signal prioritization routines. Once or twice a day I flash panic that I’ve lost my ring when I realize I can’t feel it on my finger, milliseconds of actively querying my finger for touch sensations followed by a quick glance.
I wonder when I’ll internalize the odds sufficiently to realize that even though I can’t feel it, the ring is there. I wonder what other things I’ve come to assume exist without any sensory confirmation. I all of a sudden realize how people lose their rings. There seems to be a metaphor in there for relationships that I’ll leave unexplored today.
Do other people’s wedding rings have magic powers? Am I suggestible to the idea that a ring should be magic because the literature is so full of magic rings, or is the literature so full of magic rings, because rings are inherently suggestive of being imbued with magic?
Mine stores memories. A small handful of scenes, glimpses really, mostly from our wedding day. Glimpses that come back to me in full immersion when I focus on my ring. Is that normal?
Is it normal that as I say these words to myself I’m wondering what the storage density of something the size and shape of a ring would be? You need to make sure to leave space for a Bluetooth antenna because putting a jack of some sort on a ring is going to be awkward and uncomfortable. How far out is the system that allows me to access memories stored in metal (high information density already) visually? (without the benefit of magic)
I could get used to this. But hopefully not too used to it.
It’s conventional to sneer about “people living to take photos” rather than living to live. Going to a party/concert/art/explosion to photograph it rather than experience it.
I’ve never really bought that. (see also: working at Flickr) I tend to be skeptical by default of activities that arise (rather than are marketed) on the edges and are denigrated by folks with cultural weight.
But it struck me tonight that I think it might be part of a larger shift around identity, fluidity, and information flow.
Here in the early 21st century we’re ravenous information omnivores. We also live in uncertain times, where normalcy is rapidly overthrown and little is predictable. You can (almost) imagine a world in which you could predict your life rolling out before you: I’ll live here, I’ll do that, I’ll know these people the rest of my life, we’ll have the same shared set of stories. Given the retreat from that world, the documented individual experiences aren’t bragging so much as tentacles we extrude to catch moments of shared sameness.
But what about all those photos we shoot that we don’t even share, what’s happening there?
I’m not entirely sure, but reading 24 Hour Bookstore tonight, and having to flip back a few pages to highlight a passage that I was still thinking about, I realize that this bookmarking behaviour is probably a form of wayfinding and partially a form of commons. I think as a generation we’re actually pretty good at imagining the idea that small contributions of order actually improve the world, and that also small bits of bookmarking will help us find our way back through the river of information we’re deluged with.
Or at least something is making it deeply compelling, and bragging is an insufficient answer.
“Let me tell you about the kitchen, it’s small, everything has to be made fresh, that guacamole, we make it when you order the sandwich, it hasn’t been sitting around. The sauces, the sauces are all less than two days old.” … “I work here, I should know everything about this place. I should know where we buy our produce, and where the wood of that table came from.” … “It came from a single tree in upstate New York, all the wood in this bar came from that tree.” … “What type of tree is it? You see that tall skinny guy at the bar, with his mother? He’s the owner, I’m going to go ask him, I’ll be right back.” – The Randolph
Ads are an ugly business. You barter away functionality, aesthetics, privacy, and performance for a marginal money maker predicated on using manipulation to get people to spend money they don’t have on things they don’t want. If you’ve ever experienced an old favorite website slowly descending into monetization (my canonical example is Alta Vista), you’ve experienced this viscerally, an old favorite slowly selling off bits of itself for a few more hits of cash.
Then Google came along, and they went deep, they created a narrative of transcendental advertising. Advertising so good you wanted to see it. Advertising that was net positive. Advertising that would cause you to turn off your ad blocker. And if you’re in an advertising supported business you probably even believe the narrative at some level. Ignore the data about who clicks on ads and why, or the insane degradation of the most revolutionary communication medium since the printed word into SEO/SEM spam farms. Transcendental advertising, advertising as liberator, advertising for advertising’s sake, advertising as a higher calling. This is what I call “business transcendental”. A philosophy that is tied to your paycheck.
Watching folks’ responses to the iPhone 5 “Lightning” connector got me thinking about this. Apple has beautiful, breathtaking reasons for launching a new connector. It’s innovative, it opens up previously unexplored options that most of us can’t even imagine yet. It’s the product of R&D by some of the best and brightest in the business, like the touch sensing pixel screen or the new thinner aluminum case. But it’s also planned obsolescence. Planned obsolescence is an ugly business. Uglier than advertising. I think, unlike advertising, most of us still recoil in disgust at gratuitous examples of planned obsolescence. Which is why transcendental planned obsolescence is so gut wrenching. Planned obsolescence as innovation, planned obsolescence as the pursuit of perfection, planned obsolescence as identity politics. Google is in the business of biz-transcendental advertising, Apple is in the business of biz-transcendental planned obsolescence. But the underlying business is as optional, and ugly as it ever was, and the transcendence is an illusion.
I backed App.net. The current iteration isn’t really compelling enough for me to use it, but I recently was deeply frustrated by the lack of edit on Twitter and flipped over to App.net hoping that maybe it had added this most humanizing of features, but I was disappointed.
The Twitter that exists today is only one of the various Twitters that were posited over the years. It would be a shame if App.net cargo cults this Twitter rather than exploring the space of Twitters.
Features from a handful of alternate Twitters:
- the ability to edit a tweet. There are several patterns in community software for handling the “I responded and then you changed what you said” pattern. One of them is versioning. The other is a short window of edits. It’s a question of balancing how much you prefer the conversational integrity vs the benefits of a little hypocrisy to a person’s self expression.
- archives. is this the record of your life and of culture, or a system for transmitting unrelenting now-ness? Dated archives are key toward setting the expectation that these items you share won’t disappear behind an event horizon of fuzzy human memory, available only in the vast archives of folks trying to sell you things. also, lose the relative dates after a little time has passed, say like 2 days. Twitter uses relative dates because it was the preferred method for displaying dates in the Rails community, and it seemed pretty slick in 2006. (for point of reference in 2006 you were still using Myspace)
- personal context/useful search. is it a platform for brands/celebrities/robots/blowhards to broadcast or is it a place to connect with a more intimate group? Twitter’s brilliance is in mixing these, but their bread is very much buttered by the brands. This is surfaced both in their disinterest in giving you ways to organize your view (lists) and also in not providing search context (within folks I follow, my tweets, my favorites, etc). of course Twitter did roll out search from people you follow. I love it. So thank you whoever got that out.
- privacy. I was an early thorn in Twitter’s side about supporting the privacy settings. But honestly it was just always too much work to respond manually to follow requests or to maintain two separate accounts. Per status privacy and per status geo-privacy would go a long way towards changing the nature of what people share on Twitter away from re-publishing Mashable headlines.
- annotations. the ghost feature that lived its too brief days in the Sun. Twitter works alright as a “magic word distribution system” (to steal Aaron’s description), but briefly there was the promise of it working exceptionally well. That was annotations. Structured data that could flow alongside a tweet indicating that this was me discussing what I was listening to, what I was eating, two robots discussing the weather, part of a larger narrative arc, etc. Metadata that could be displayed or ignored by clients as needed, consumed by listeners as they desired. It makes status casting into something with the potential to surprise you with its uses rather than to re-tread. To the extent that Twitter extracts entities like URLs, identities and hashtags already you get a sense of how powerful this could be. It both upends and plays well with the OpenGraph inspired resurgence of microformat based structured data sharing.
- federation. just going to leave this one here.
- heterogeneous sources of statuses. a tweet is in some ways the irreducible atom of web content. if in the era of monied interests (everyone since Postel) we can get interop on anything ever, it should be on tweets.
Those were a few I’d had my heart set on at various teams over the last 6 years. I’m sure there were others. Re-building Twitter without exploring any of the explored alternatives seems like a waste.
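To make two of those concrete, here is a rough sketch of the edit and archive items: a short edit window that keeps prior versions, and dates that stop being relative once a status is a couple of days old. It isn’t how Twitter or App.net work. The `Status` class, the five minute `EDIT_WINDOW`, and the display formats are all assumptions for illustration. Only the two day cutoff comes from the list above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List

EDIT_WINDOW = timedelta(minutes=5)   # assumed value, pick your own balance
RELATIVE_WINDOW = timedelta(days=2)  # "say like 2 days"


@dataclass
class Status:
    text: str
    posted_at: datetime
    versions: List[str] = field(default_factory=list)  # earlier texts, oldest first

    def edit(self, new_text: str, now: datetime) -> bool:
        """Allow an edit only inside the window, and keep the old version."""
        if now - self.posted_at > EDIT_WINDOW:
            return False
        self.versions.append(self.text)
        self.text = new_text
        return True


def display_date(posted_at: datetime, now: datetime) -> str:
    """Relative dates while a status is fresh, absolute dates after that."""
    age = now - posted_at
    if age >= RELATIVE_WINDOW:
        return posted_at.strftime("%Y-%m-%d %H:%M")
    seconds = int(age.total_seconds())
    if seconds < 60:
        return "just now"
    if seconds < 3600:
        return f"{seconds // 60}m ago"
    if seconds < 86400:
        return f"{seconds // 3600}h ago"
    return f"{seconds // 86400}d ago"
```

Keeping `versions` around is the “versioning” pattern, and the window check is the “short window of edits” pattern. Here they’re combined, but either one works alone.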
Openness has always been my favorite trend. At Etsy we talk about it as part of our “generosity of spirit” value. Just wanted to call out how much I’m loving seeing this trend on the blogs and engineering blogs across the industry.
Our What Hardware Powers Etsy.com? was in part inspired by 37Signals’ Behind the Scenes: The Hardware that Powers Basecamp, Campfire, and Highrise. I’d love to see more teams posting in this series.
And our recent post taking a non-technical approach to talking about a series of recent outages felt a little less lonely, and less risky, knowing it was out there with Soundcloud’s Shoot yourself in the foot with iptables and kmod auto-loading, and Simple’s Transparency.
This is how we get better as an industry.
- Though sustainability is a close second. Sometimes they flip flop. But that wouldn’t have been relevant to this blog post.
I backed Postcards from Erik the Red’s House a while back. I like the small projects on Kickstarter, they make me happy, as does Iceland, as does someone writing their first novel.
Today my postcard arrived. It was a sunset in Vik. In particular it was of Reynisdrangar, 3 black basalt stacks just west of Vik. I’ve seen Reynisdrangar. In fact I woke up on a black sand beach, surrounded by sheep, looking out at Reynisdrangar.
We pulled into Vik late in the day. We’d been on the road for 5 or 6 days or so at this point in our circumnavigation of Iceland. We’d planned to drive on further, but as the road climbed up out of Vik we were exhausted. Also Jasmine had bought a funny hat at Vik wool. I’m not sure if that’s relevant to the story. We turned off on a side road at the top of the hill looking back over Vik, and drove until the road turned to sand, and we could see waves in the last of the day light. We pitched the tent by head lights, and went to sleep.
And woke up the next morning, unzipped the tent and found a flock of sheep, and Reynisdrangar.
Here’s the postcard
Camping on that beach is one of my favorite memories. Nice to be reminded of it. This is the network, making the world demonstrably better. That is all.
oldtweets is a search engine for the first year of Twitter.
A bunch of folks asked about the how. The Twitter API provides a method for fetching a tweet by ID. So to build an index of the first year of Twitter you need to call the API for each ID in the range of IDs 1-20,000,000. That is 20 million API calls at the rate of 150 calls per hour, or roughly 15 years of elapsed API time to index year one.
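For anyone who wants to check the arithmetic, here is a minimal sketch. The function name is made up for illustration; the only inputs are the ID count and rate limit quoted above.

```python
HOURS_PER_YEAR = 24 * 365


def years_to_crawl(id_count: int, calls_per_hour: int) -> float:
    """Elapsed years needed to make one API call per ID at a fixed hourly rate."""
    hours = id_count / calls_per_hour
    return hours / HOURS_PER_YEAR


# 20 million IDs at 150 calls per hour
print(f"{years_to_crawl(20_000_000, 150):.1f} years")  # prints "15.2 years"
```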
It also helps to know that Twitter is, and has always been, a MySQL shop, and that in the early days there was a theory about scaling databases by using large auto-increment offsets. (I don’t remember what the logic of that was.) That started about 6 months in, was turned off for a while, and periodically drifted. So, good news: the 20 million ID space is very sparse, which significantly cuts down on the elapsed API time. You just need to send tracers into the space to map it.
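One way the tracer idea could look in code is below. This is a rough sketch, not the code oldtweets actually runs: `map_id_space` and the `exists` callback are hypothetical, and `exists` stands in for whatever "does this ID return a tweet?" lookup you have. The ID range is split into blocks, a few random IDs are probed in each, and only blocks where a probe hit are kept for the full crawl.

```python
import random
from typing import Callable, List, Tuple


def map_id_space(
    exists: Callable[[int], bool],  # hypothetical lookup: True if the ID is a tweet
    lo: int,
    hi: int,
    block_size: int,
    probes_per_block: int,
    rng: random.Random,
) -> List[Tuple[int, int]]:
    """Return (start, end) blocks in which at least one random probe found a tweet."""
    live = []
    for start in range(lo, hi + 1, block_size):
        end = min(start + block_size - 1, hi)
        k = min(probes_per_block, end - start + 1)
        probes = rng.sample(range(start, end + 1), k)
        if any(exists(i) for i in probes):
            live.append((start, end))
    return live


# Toy ID space standing in for the real API: two fully populated runs, gaps between.
def fake_exists(tweet_id: int) -> bool:
    return 1 <= tweet_id <= 1000 or 3001 <= tweet_id <= 4000


blocks = map_id_space(fake_exists, 1, 5000, 1000, 5, random.Random(0))
print(blocks)  # prints [(1, 1000), (3001, 4000)]
```

The trade-off is that a thinly populated block can be missed if every probe lands on a gap, so more probes per block buys accuracy at the cost of API calls.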
From there it’s just a question of patience.
The whole thing runs on a very small EC2 instance, and it’s on this week’s todo list to get the index running under Upstart, but it hasn’t happened yet. So if it goes away….
I think our history is what makes us human, and the push to ephemerality and disposability “as a feature” is misguided. And a key piece of our personal histories is becoming “the story we want to remember”, aka what we’ve shared. I just wanted my old tweets, as a side effect I got all of them.
Providing an interface to the whole corpus was motivated by the desire that folks would investigate where the social norms arose, exactly like Rabble’s @-reply investigation.
I thought year one was a meaningful symbol. It maps to the time when we were figuring out how to use Twitter, and maps to the time when I felt like the service was working best for me and mine as an “ambient intimacy” service.
Additionally after SxSW 2007 the rate of tweeting increased significantly, making the brute force approach even slower.