WebOps

Meanwhile: More Meta-Metrics

October 5, 2009

Like all sane web organizations, we gather metrics about our infrastructure and applications. As many metrics as we can, as often as we can. These metrics, given the right context, help us figure out all sorts of things about our application, infrastructure, processes, and business. Things such as… What: …did we do before (historical trending, [...]

Slides for Velocity Talk 2009

June 23, 2009

UPDATE: blip.tv has the video of the talk as well, below. Jeez I have some major bed-head. That was a blast! I had never done a ‘duet’ talk before. Here are the slides: 10+ Deploys Per Day: Dev and Ops Cooperation at Flickr …and the video of it is here:

Slides from Web2.0 Expo 2009. (and somethin else interestin’)

April 3, 2009

That was a pretty good time. Saw lots of good and wicked smaht people, and I got a lot of great questions after my talk. The slides are up on slideshare, and here are the PDF slides. Operational Efficiency Hacks Web20 Expo2009 View more presentations from John Allspaw. UPDATE: Gil Raphaelli has posted his Python [...]

Some Things We Did Today

March 5, 2009

Moving one of our eight photoserving farms from hardware Layer7 URL hash balancing (expensive, has limits) to L4 DSR balancing with CARP (cheap and simple), and figuring out how to juggle 18,000 requests/second while we do it. Built yet more automated query analysis reporting (with some yummy MySQLProxy). Added yet another aggregated graph of [...]

2009 Velocity Conference submissions are open!

November 20, 2008

The CFP for next year’s Velocity Conference is up now, so all you ops and performance ninjas submit your ideas for talks. I’m lucky enough to be on the program committee this year, and I think the conference is a huge opportunity to spread the ops love on all kinds of topics. There’s a list [...]

Code Swarm for Config Management

October 21, 2008

Gil Raphaelli, one of the guys on our Flickr Ops team, put together a Code Swarm animation for the configuration/deployment management tool we use at Flickr to manage our infrastructure. Myles Grant did this for our bug reporting system as well. Check it out: Our automated config management system is called Gemstone, but conceptually you [...]

More back-of-envelope-math…

September 18, 2008

Via kottke: some good examples of doing rough math in your head, causing you to guess about assumptions all along the way. IMHO, being able to do this is one of the things that makes a good web ops person. The examples might be “useless”, but the process is invaluable.

Internet-Scale Efficiency

September 16, 2008

James Hamilton’s excellent LADIS 2008 presentation has lots of great stuff in it about internet-scale bits. Cool stats.

Slides from Velocity

June 25, 2008

Here are the slides from my talk at the Velocity Conference.

Squid patch for making “time” stats more meaningful.

May 22, 2008

Thanks to Mark, squid’s got a patch I’ve been wanting for a gazillion years: time-to-serve statistics that don’t include the client’s location (http://www.squid-cache.org/bugs/show_bug.cgi?id=2345). Normally, squid’s kept statistics that included the “time” to serve an object, whether it be a HIT, MISS, NEAR HIT, etc. The clock starts for this time when the first headers are [...]
