The new book: Web Operations

At the Velocity Conference last year, I was talking to Mike Loukides from O’Reilly about the topics being presented and how it was so great to see such successful veterans of the field come out from behind the curtain and share their experiences. Mike said that there was interest in doing a book on the […]


How Complex Systems Fail: A WebOps Perspective

I guess I’m late on getting to this, but How Complex Systems Fail by Richard Cook is excellent. Let me start with this: I don’t think I can overstate how right-on this paper is, with respect to the challenges, solutions, observations, and concerns involved with operating a medium to large web infrastructure. I found this […]


Meanwhile: More Meta-Metrics

Like all sane web organizations, we gather metrics about our infrastructure and applications. As many metrics as we can, as often as we can. These metrics, given the right context, help us figure out all sorts of things about our application, infrastructure, processes, and business. Things such as… What: …did we do before (historical trending, […]


Slides for Velocity Talk 2009

UPDATE: has the video of the talk as well, below. Jeez I have some major bed-head. That was a blast! I had never done a ‘duet’ talk before. Here are the slides: 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr …and the video of it is here:

Capacity Planning

Some Things We Did Today

Moving one of our eight photoserving farms from hardware Layer7 URL hash balancing (expensive, has limits) to L4 DSR balancing with CARP (cheap and simple) and figuring out how to juggle 18,000 requests/second while we do it. Built yet more automated query analysis reporting (with some yummy MySQLProxy). Added yet another aggregated graph of […]