From the category archives:

Uncategorized

Automated Control paper by the RAD Lab folks

August 1, 2009

Wow, how did I miss this until now? In June, some smart people gathered in Barcelona for the First Workshop on Automated Control for Datacenters and Clouds (ACDC09) and jeez it looked like it was a good time, from a glance at the program.
One of the cooler papers is “Automatic exploration of datacenter performance regimes” in [...]


Extreme Automated Infrastructure

July 18, 2009

I’ve said before that I’ve always been a huge fan of SystemImager for super simple imaging. It has some shortcomings for config management, but those are solved with things like Chef or Puppet.
With all of the great things being talked about surrounding ‘Automated Infrastructure’, I’ll point to something insanely cool: 1,190 nodes installed from [...]


SLAs, clouds, and whatnot

July 16, 2009

Excellent. Good work, Ben:
ah, the mighty service level agreement! the tooth and claw by which the wily customer brings the vendor to heel. get the SLA right and you, the customer, can sit back and relax, safe in the knowledge that should there be an outage, you are covered. your business is protected from harm [...]


Annoying To Me.

May 22, 2009

I can’t tell you how ripped I get when people say things like this:
“cloud computing means getting rid of ops”
If by “ops” you mean “people in data centers racking servers, installing OSes, running cables, replacing broken hardware, etc.” then sure, cloud computing aims to relieve you of those burdens. If you really think ‘ops’ is [...]


Context and Operational Metrics

May 10, 2009

I really don’t think it can be overestimated how important context can be when it comes to troubleshooting or evaluating the health of an infrastructure. When starting to troubleshoot a complex problem, web ops 101 “best practices” usually start with asking at least these questions:

When did this problem start?
What changes, if any, (software, hardware, usage, [...]


Mechanical Analogies To Web Stuff, Part 2.

May 6, 2009

This is a ramble continued from before, which means it’s mostly a blog post for me, but others might find it interesting.

The last time I made an analogy between back-end web architectures and mechanical structures, I blathered on about what are basically structural limitations of individual components in a physical device, and how [...]


Why I didn’t include queueing math in my book.

March 25, 2009

Some people have wondered why I chose not to include any real amount of material in my book on the mathematical topics related to capacity planning, like queueing theory.
There are already many other excellent books that dig into the math behind Little’s Law, M/M/1 queues, and Poisson arrival processes. These concepts do indeed detail the [...]


Mechanical Analogies To Web Stuff, Part 1.

January 14, 2009

I don’t blog much, and when I do, my posts are pretty short and to the point. This post is different: feel free to put it into the “ramble” category.
I’m really just posting it here for myself as a thought exercise.
Some years ago, while drawing a network map for the site I was working on at the time, [...]


Everything isn’t about the Knuth quote

August 24, 2008

It’s hard to describe how tiring it is to hear someone quote Donald Knuth (or Tony Hoare) in the wrong context. I’m not the only one annoyed by this. In “Structured Programming with go to Statements”, Knuth says:
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of [...]


Untitled Metric #1202345227

July 17, 2008

Untitled Metric #1202345227, originally uploaded by straup.

Our philosophy in Flickr Operations Engineering.
