
Posts by author:

allspaw

Context and Operational Metrics

May 10, 2009

I really don’t think it can be overstated how important context is when it comes to troubleshooting or evaluating the health of an infrastructure. When starting to troubleshoot a complex problem, web ops 101 “best practices” usually start with asking at least these questions:

When did this problem start?
What changes, if any, (software, hardware, usage, [...]

Read the full article →

Mechanical Analogies To Web Stuff, Part 2.

May 6, 2009

This is a ramble continued from before, which means it’s mostly a blog post for me, but maybe others might find it interesting.

The last time I made an analogy between back-end web architectures and mechanical structures, I blathered on about what are basically structural limitations of individual components in a physical device, and how [...]

Read the full article →

Slides from Web2.0 Expo 2009. (and somethin else interestin’)

April 3, 2009

That was a pretty good time. Saw lots of good and wicked smaht people, and I got a lot of great questions after my talk. The slides are up on slideshare, and here are the PDF slides.
Operational Efficiency Hacks Web20 Expo2009

UPDATE: Gil Raphaelli has posted his Python bindings he [...]

Read the full article →

Why I didn’t include queueing math in my book.

March 25, 2009

Some people have wondered why I chose not to include any real amount of material in my book about the mathematical topics related to capacity planning, like queueing theory.
There are already many other excellent books that dig into the math behind Little’s Law, M/M/1 queues, and Poisson arrival processes. These concepts do indeed detail the [...]

Read the full article →

Some Things We Did Today

March 5, 2009

Moving one of our eight photoserving farms from hardware Layer7 URL hash balancing (expensive, has limits) to L4 DSR balancing with CARP (cheap and simple) and figuring out how to juggle 18,000 requests/second while we do it.
Built yet more automated query analysis reporting (with some yummy MySQL Proxy)
Added yet another aggregated graph of database queries, [...]

Read the full article →

Speaking at Web2.0 Expo 2009

February 19, 2009

Looks like I’m gonna talk about even more nerdy things at the Web2.0 Expo in April.

You don’t have to wait for a recession to tighten up your operations. Squeezing more oomph out of your servers (or instances!) is always a good thing, and streamlining how you handle site issues is too. We’ll talk about [...]

Read the full article →

Mechanical Analogies To Web Stuff, Part 1.

January 14, 2009

I don’t blog much, and when I do, the posts are pretty short and to the point. This post is different: feel free to put it into the “ramble” category.
I’m really just posting it here for myself as a thought exercise.
Some years ago, while drawing a network map for the site I was working on at the time, [...]

Read the full article →

Web Ops Visualizations Group on Flickr

December 16, 2008

Like lots of operations people, we’re quite addicted to data pr0n here at Flickr. We’ve got graphs for pretty much everything, and we add more all the time. We’ve blogged about some of how and why we do it.
One thing we’re in the habit of is screenshotting these graphs when things go wrong, right, or [...]

Read the full article →

2009 Velocity Conference submissions are open!

November 20, 2008

The CFP for next year’s Velocity Conference is up now, so all you ops and performance ninjas submit your ideas for talks.
I’m lucky enough to be on the program committee this year, and I think the conference is a huge opportunity to spread the ops love on all kinds of topics. There’s a list on [...]

Read the full article →

Code Swarm for Config Management

October 21, 2008

Gil Raphaelli, one of the guys on our Flickr Ops team, put together a Code Swarm animation for the configuration/deployment management tool we use at Flickr to manage our infrastructure. Myles Grant did this for our bug reporting system as well. Check it out:

Our automated config management system is called Gemstone, but conceptually you can [...]

Read the full article →