An Open Letter To Monitoring/Metrics/Alerting Companies

15 Comments. Posted in Cognitive Systems Engineering, Tools, WebOps

I’d like to open up a dialogue with companies who are selling X-As-A-Service products that are focused on assisting operations and development teams in tracking the health and performance of their software systems. Note: It’s likely my suggestions below are understood and embraced by many companies already. I know a number of them who are […]

A Mature Role for Automation: Part I

30 Comments. Posted in Cognitive Systems Engineering, Complex Systems, Human Factors, Tools, WebOps

(Part 1 of 2 posts) I’ve been percolating on this post for a long time. Thanks very much to Mark Burgess for reviewing early drafts of it. One of the ideas that permeates our field of web operations is that we can’t have enough automation. You’ll see experience with “building automation” on almost every job […]

Meanwhile: More Meta-Metrics

6 Comments. Posted in Tools, WebOps

Like all sane web organizations, we gather metrics about our infrastructure and applications. As many metrics as we can, as often as we can. These metrics, given the right context, help us figure out all sorts of things about our application, infrastructure, processes, and business. Things such as… What: …did we do before (historical trending, […]

Web Ops Visualizations Group on Flickr

1 Comment. Posted in Flickr, Tools

Like lots of operations people, we’re quite addicted to data pr0n here at Flickr. We’ve got graphs for pretty much everything, and add graphs all of the time. We’ve blogged about some of how and why we do it. One thing we’re in the habit of is screenshotting these graphs when things go wrong, right, […]

Code Swarm for Config Management

3 Comments. Posted in Tools, WebOps

Gil Raphaelli, one of the guys on our Flickr Ops team, put together a Code Swarm animation for the configuration/deployment management tool we use at Flickr to manage our infrastructure. Myles Grant did this for our bug reporting system as well. Check it out: Our automated config management system is called Gemstone, but conceptually you […]