Posts by author:

allspaw

How Complex Systems Fail: A WebOps Perspective

November 12, 2009

I guess I’m late on getting to this, but How Complex Systems Fail by Richard Cook is excellent.
Let me start with this: I don’t think I can overstate how right-on this paper is, with respect to the challenges, solutions, observations, and concerns involved with operating a medium to large web infrastructure. I found this via [...]

Read the full article →

When you deploy: your internal monologue

October 7, 2009

The minimum cycle of questions you should be asking yourself. As brought up by @debuggist and @benjaminblack.

Read the full article →

Meanwhile: More Meta-Metrics

October 5, 2009

Like all sane web organizations, we gather metrics about our infrastructure and applications. As many metrics as we can, as often as we can. These metrics, given the right context, help us figure out all sorts of things about our application, infrastructure, processes, and business. Things such as…
What:
…did we do before (historical trending, etc)
…is going [...]

Read the full article →

WebOps: Good prep for becoming a new parent?

September 29, 2009

I think I’ve said before somewhere that working in the field of web operations prepared me somewhat for being a parent. I thought the other day that I should write down some of this reasoning, because it’s pretty often that I’m reminded of similarities:
High availability
Having redundant infrastructure is WebOps 101. For my kids’ most prized [...]

Read the full article →

Automated Control paper by the RAD Lab folks

August 1, 2009

Wow, how did I miss this until now? In June, some smart people gathered in Barcelona for the First Workshop on Automated Control for Datacenters and Clouds (ACDC09) and jeez it looked like it was a good time, from a glance at the program.
One of the cooler papers is “Automatic exploration of datacenter performance regimes” in [...]

Read the full article →

Extreme Automated Infrastructure

July 18, 2009

I’ve said before that I’ve always been a huge fan of SystemImager for super simple imaging. It has some shortcomings for config management, but those are solved with things like Chef or Puppet.
With all of the great things being talked about surrounding ‘Automated Infrastructure’, I’ll point to something insanely cool: 1,190 nodes installed from [...]

Read the full article →

SLAs, clouds, and whatnot

July 16, 2009

Excellent. Good work, Ben:
ah, the mighty service level agreement! the tooth and claw by which the wily customer brings the vendor to heel. get the SLA right and you, the customer, can sit back and relax, safe in the knowledge that should there be an outage, you are covered. your business is protected from harm [...]

Read the full article →

Uncaching bits in filesystem cache

July 9, 2009

Domas makes something more useful than I bet most would think: http://mituzas.lt/2009/06/26/uncache/

Read the full article →

Slides for Velocity Talk 2009

June 23, 2009

UPDATE: blip.tv has the video of the talk as well, below. Jeez I have some major bed-head.
That was a blast! I had never done a ‘duet’ talk before. Here are the slides:
10+ Deploys Per Day: Dev and Ops Cooperation at Flickr
…and the video of it is here:

Read the full article →

Annoying To Me.

May 22, 2009

I can’t tell you how ripped I get when people say things like this:
“cloud computing means getting rid of ops”
If by “ops” you mean “people in data centers racking servers, installing OSes, running cables, replacing broken hardware, etc.” then sure, cloud computing aims to relieve you of those burdens. If you really think ‘ops’ is [...]

Read the full article →