Convincing management that cooperation and collaboration was worth it

05. Jan
/
Culture, Flickr, Random
/
13 Comments

While searching around for something else, I came across this note I sent in late 2009 to the executive leadership of Yahoo’s Engineering organization. This was when I was leaving Flickr to work at Etsy. My intent on sending it was to be open to the rest of Yahoo about what how things worked at...
Continue reading...

MTTR is more important than MTBF (for most types of F)

07. Nov
/
Culture, Etsy, Flickr, Slides, Talks, WebOps
/
26 Comments

UPDATE, 10/17/2017: This post hasn’t aged well, and needs some patching. The title should be “TTR is more important than TBF (for most types of F)” Why? Because taking the statistical mean of TTR or TBF makes absolutely no sense, whatsoever. Incidents and events simply are not comparable in that way, and even if they were, the time...
Continue reading...

Slides from Web2.0 Expo 2009. (and somethin else interestin’)

That was a pretty good time. Saw lots of good and wicked smaht people, and I got a lot of great questions after my talk. The slides are up on slideshare, and here are the PDF slides. Operational Efficiency Hacks Web20 Expo2009 View more presentations from John Allspaw. UPDATE: Gil Raphaelli has posted his python...
Continue reading...

Web Ops Visualizations Group on Flickr

16. Dec
/
Flickr, Tools
/
1 Comment

Like lots of operations people, we’re quite addicted to data pr0n here at Flickr. We’ve got graphs for pretty much everything, and add graphs all of the time. We’ve blogged about some of how and why we do it. One thing we’re in the habit of is screenshotting these graphs when things go wrong, right,...
Continue reading...

Slides from Velocity

25. Jun
/
Flickr, Slides, Talks, WebOps
/
8 Comments

Here are the slides from my talk at the Velocity Conference....
Continue reading...

Squid patch for making “time” stats more meaningful.

22. May
/
Caching, Flickr, WebOps
/
No Comments

Thanks to Mark, squid’s got a patch I’ve been wanting for a gazillion years: time-to-serve statistics that don’t include the client’s location http://www.squid-cache.org/bugs/show_bug.cgi?id=2345 Normally, squid’s kept statistics that included the “time” to serve an object, whether it be a HIT, MISS, NEAR HIT, etc. The clock starts for this time when the first headers are...
Continue reading...

Flickr’s hiring a dba.

30. Jan
/
Flickr, WebOps
/
4 Comments

(Only hardworking supernerds should apply) We’re looking for an experienced and motivated MySQL DBA to help make things go at Flickr. Stuff you’ll do: – Work with engineers on performance tuning, query optimization, index tuning. – Monitor databases for problems and to diagnose where those problems are. – Work with developers and operations to maintain...
Continue reading...

Making a site faster by removing machines

(well, not really) A little while ago, in one of our clusters we replaced a boatload of PowerEdge 1425 webserver-class boxes with a much smaller number of HP DL145 G3 quad-core boxes, getting the same amount of oomph from 1/3 the boxes. Not too bad....
Continue reading...

Varnish and the state of web caching

16. Dec
/
Caching, Flickr
/
2 Comments

So there’s lots of excitement around Varnish, which is a caching proxy that is built to be first and foremost a reverse-proxy, as opposed to squid, which does both forward and reverse. Acceleration (reverse-proxying) is obviously important to us at Flickr, as we use squid extensively....
Continue reading...

Hats and beards

12. Dec
/
Flickr
/
No Comments

http://flickr.com/photos/allspaw/311471361/...
Continue reading...

Kitchen SoapThoughts on systems safety, software operations, and sociotechnical systems.

Kitchen Soap