MTTR is more important than MTBF (for most types of F)

26 Comments. Posted in Culture, Etsy, Flickr, Slides, Talks, WebOps

UPDATE, 10/17/2017: This post hasn’t aged well, and needs some patching. The title should be “TTR is more important than TBF (for most types of F).” Why? Because taking the statistical mean of TTR or TBF makes absolutely no sense whatsoever. Incidents and events simply are not comparable in that way, and even if they were, the time […]

Web Ops Visualizations Group on Flickr

1 Comment. Posted in Flickr, Tools

Like lots of operations people, we’re quite addicted to data pr0n here at Flickr. We’ve got graphs for pretty much everything, and we add graphs all the time. We’ve blogged about some of how and why we do it. One thing we’re in the habit of is screenshotting these graphs when things go wrong, right, […]

Squid patch for making “time” stats more meaningful.

Posted in Caching, Flickr, WebOps

Thanks to Mark, squid’s got a patch I’ve been wanting for a gazillion years: time-to-serve statistics that don’t include the client’s location (http://www.squid-cache.org/bugs/show_bug.cgi?id=2345). Normally, squid’s kept statistics that included the “time” to serve an object, whether it be a HIT, MISS, NEAR HIT, etc. The clock starts for this time when the first headers are […]

Flickr’s hiring a dba.

4 Comments. Posted in Flickr, WebOps

(Only hardworking supernerds should apply.) We’re looking for an experienced and motivated MySQL DBA to help make things go at Flickr. Stuff you’ll do:
• Work with engineers on performance tuning, query optimization, index tuning.
• Monitor databases for problems and diagnose where those problems are.
• Work with developers and operations to maintain […]