<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Kitchen Soap &#187; Random</title>
	<atom:link href="http://www.kitchensoap.com/category/random/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.kitchensoap.com</link>
	<description>Thoughts on capacity planning and web operations.</description>
	<lastBuildDate>Tue, 17 Jan 2012 17:57:33 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Convincing management that cooperation and collaboration was worth it</title>
		<link>http://www.kitchensoap.com/2012/01/05/convincing-management-that-cooperation-and-collaboration-was-worth-it/</link>
		<comments>http://www.kitchensoap.com/2012/01/05/convincing-management-that-cooperation-and-collaboration-was-worth-it/#comments</comments>
		<pubDate>Thu, 05 Jan 2012 15:35:10 +0000</pubDate>
		<dc:creator>allspaw</dc:creator>
				<category><![CDATA[Culture]]></category>
		<category><![CDATA[Flickr]]></category>
		<category><![CDATA[Random]]></category>

		<guid isPermaLink="false">http://www.kitchensoap.com/?p=8760</guid>
		<description><![CDATA[While searching around for something else, I came across this note I sent in late 2009 to the executive leadership of Yahoo&#8217;s Engineering organization. This was when I was leaving Flickr to work at Etsy. My intent in sending it was to be open with the rest of Yahoo about how things worked at [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>While searching around for something else, I came across this note I sent in late 2009 to the executive leadership of Yahoo&#8217;s Engineering organization. This was when I was leaving Flickr to work at Etsy. My intent in sending it was to be open with the rest of Yahoo about how things worked at Flickr, and why. I did this in the hope that other Yahoo properties could learn from that team&#8217;s process and culture, which we worked really hard at building and keeping.</p>
<p>The idea that Development and Operations could:</p>
<ul>
<li>Share responsibility/accountability for availability and performance</li>
<li>Have an equal seat at the table when it came to application and infrastructure design, architecture, and emergency response</li>
<li>Build and maintain a deferential culture to each other when it came to domain expertise</li>
<li>Cultivate equanimity when it came to emergency response and post-mortem meetings</li>
</ul>
<div>
<p>&#8230;wasn&#8217;t evenly distributed across other Yahoo properties, from my limited perspective.</p>
<p>But I knew (still know) lots of incredible engineers at Yahoo who weren&#8217;t being supported as well as they could be by their upper management. So sending this letter was driven by wanting to help their situation. Don&#8217;t get me wrong, not everything was rainbows and flowers at Flickr, but we certainly had a lot more of them than other Yahoo groups.</p>
<p>When I re-read this, I&#8217;m reminded that when I came to Etsy, I wasn&#8217;t entirely sure that any of these approaches would work in the Etsy Engineering environment. The engineering staff at Etsy was a lot larger than Flickr&#8217;s and continuous deployment was in its infancy when I got there. I can now happily report that 2 years later, these concepts not only solidified at Etsy, they evolved to accommodate a <em><strong>lot</strong></em> more than what challenged us at Flickr. I couldn&#8217;t be happier about how it&#8217;s turned out.</p>
<p>I&#8217;ll note that there&#8217;s nothing groundbreaking in this note I sent, and nothing that I hadn&#8217;t said publicly in a presentation or two around the same time.</p>
<p>This is the note I sent to the three layers of management above me in my org at Yahoo:</p>
<blockquote>
<h3>Subject: Why Flickr went from 73rd most popular Y! property in 2005 to the 6th, 5 years later.</h3>
<p>Below are my thoughts about some of the reasons why Flickr has had success, from an Operations Engineering manager&#8217;s point of view.</p>
<p>When I say <em>everyone</em> below, I mean all of the groups and sub-groups within the Flickr property: <strong>Product</strong>, <strong>Customer Care</strong>, <strong>Development</strong>, <strong>Service Engineering</strong>, <strong>Abuse and Advocacy</strong>, <strong>Design</strong>, and <strong>Community Management</strong>.</p>
<h3>Here are at least some of the reasons we had success:</h3>
<ul>
<ul>
<li>Product included and respected everyone&#8217;s thoughts, in almost every feature and choice.</li>
<li><em>Everyone</em> owned availability of the site, not just Ops.</li>
<li>Community management and customer service were involved <strong>early</strong> and <strong>often</strong>. In <em>everything</em>. If they weren&#8217;t, it was an oversight taken seriously, and would be fixed.</li>
<li>Development and Operations had <strong>zero</strong> divide when it came to availability and performance. No, really. They worked in concert, involving each other in their own affairs when it mattered, and trusting each other every step of the way. This culture was taught, not born.</li>
<li>I have <em>never</em> viewed Flickr Operations as <strong><em>firefighters</em></strong>, and have never considered Flickr Dev Engineering to be <strong><em>arsonists</em></strong>. (I have heard this analogy elsewhere in Yahoo.) The two teams are 100% equal partners, with absolute transparency. If anything, we had a problem with too much deference given between the two teams.</li>
<li>The site was able to evolve, change, and grow as fast as it needed to, as long as it was made safe to do so. To be specific: code and config deploys. When it wasn&#8217;t safe, we slowed, and everyone was fine with that happening, knowing that the goal was to return to <em>fast-as-we-need-to-be</em>. See above about everyone owning availability.</li>
<li>Developers were able to see their work almost instantly in production. Institutionalized fear of degradation and outage ensured that changes were as safe as they needed to be. Developers and Ops engineers knew intuitively that the safety net you have is the one that you have built for yourself. When changes are small and frequent, the causes of degradation or outage due to code deploys are exceptionally transparent to all involved. (Re-read above about everyone owning availability.)</li>
<li>We never deployed &#8220;early and often&#8221; because:
<ul>
<li>it was a trend,</li>
<li>we wanted to brag,</li>
<li>or we thought we were better than anyone. (We did it because it was right for Flickr to do so.)</li>
</ul>
</li>
<li>Everyone was made aware of any launches that had risks associated with them, and we worked on lists of things that could possibly go wrong, and what we would do in the event they did go wrong. Sometimes we missed things, and we had to think quickly, but those times were rare with new feature launches.</li>
<li>Flickr Ops had <em>always</em> had the &#8220;go or no-go&#8221; decision, as did other groups who could vote with respect to their preparedness. A significant part of my job was working towards saying &#8220;go&#8221;, not &#8220;no-go&#8221;. In fact, almost all of it.</li>
</ul>
</ul>
<h4>Examples: the most boring (anti-climactic, from an operational perspective) launches ever</h4>
<ul>
<ul>
<li><strong>Flickr Video</strong>: I actually held the launch back by some hours until we could rectify a networking issue that I thought posed a risk to post-launch traffic. Other than that, it was a switch in the application that was turned from off to on. The feature&#8217;s code had been on prod servers for months in beta. See &#8216;dark launch&#8217;.</li>
<li><strong>Homepage redesign</strong>: Unprecedented amount of activity data being pulled onto the logged-in homepage, order of magnitude increase in the number of calls to backend databases. Why was it boring? Because it was dark launched 10 days earlier. The actual launch was a flip of the &#8216;on&#8217; switch.</li>
<li><strong>People In Photos (aka, &#8216;people tagging&#8217;)</strong>: Because the feature required data that we didn&#8217;t actually have yet, we couldn&#8217;t exactly dark launch it. It was a feature that had to be turned on, or off. Because of this, Flickr&#8217;s Architect wrote out a list of all of the parts of the feature that could cause load-related issues, what the likelihood of each was, how to turn those parts of the feature off, what customer care effect it might have, and what contingencies would probably require some community management involvement.</li>
</ul>
</ul>
<h4>Dark Launches</h4>
<p>When we already had the data on the backend needed to display a new feature, we would &#8216;dark launch&#8217; it, meaning that the code would make all of the back-end calls (i.e. the calls that bring load-related risk to the deploy) and simply throw the data away, not showing it to the user. We could then increase or decrease the percentage of traffic that made those calls in safety, since we never risked the user experience by showing them a new feature and then having to take it away because of load issues.</p>
<p>This increases <em>everyone&#8217;s</em> confidence almost to the point of apathy, as far as fear of load-related issues is concerned. I have no idea how many code deploys were made to production on any given day in the past 5 years (although I could find it on a graph easily), because for the most part I don&#8217;t care: those changes made in production have such a low chance of causing issues. When they have caused issues, everyone on the Flickr staff can find on a webpage <strong><em>when</em></strong> the change was made, <strong><em>who</em></strong> made the change, and exactly (line-by-line) <strong><em>what</em></strong> the change was.</p>
<p>In the case where we had confidence in the resource consumption of a feature, but not 100% confidence in functionality, the feature was turned on for staff only. I&#8217;d say that about 95% of the features we launched in those 5 years were turned on for staff long before they were turned on for the entire Flickr population. When we still didn&#8217;t feel 100% confident, we ramped up the percentage of Flickr members who could see and use the new feature slowly.</p>
<h4>Config Flags</h4>
<p>We have many pieces of Flickr that are encapsulated as &#8216;feature&#8217; flags, which look as simple as <code>$cfg[disable_feature_video] = 0;</code>. This allows the site to be much more resilient to specific failures. If we have any degradation within a certain feature, we can simply turn that feature off in many cases, instead of taking the entire site down. These &#8216;flags&#8217; have, in the past, been prioritized in conversations with Product, so there is an easy choice to make if something goes wrong and site uptime becomes opposed to feature uptime.</p>
<p>This is an extremely important point: Dark Launches and Config Flags were concepts and tools created by Flickr Development, not Flickr Operations, even though the end-result of each points toward a typical Operations goal: stability and availability. This is a key distinction. These are initiatives made by Engineering leadership because devs feel protective of the availability of the site and respectful of Operations responsibilities, and because they are just plain good engineering.</p>
<p>If Flickr Operations had built these tools and approaches to keeping the site stable, I do not believe we would have the same amount of success.</p>
<p>There is more on this topic here: <a href="http://code.flickr.com/blog/2009/12/02/flipping-out/" target="_blank">http://code.flickr.com/blog/2009/12/02/flipping-out/</a></p>
<h4>Summary</h4>
<p>Flickr Operations is in an enviable position in that they don&#8217;t have to convince anyone in the Flickr property that:</p>
<ul>
<ul>
<ol>
<li>Operations has &#8216;go or no-go&#8217; decision-making power, along with every other subgroup.</li>
<li>Spending time, effort, and money to ensure stable feature launches <em>before they launch</em> is the rule, not the exception.</li>
<li>Continuous Deployment is better for the availability of the site.</li>
<li>Flickr Operations should be involved as early as possible in the development phase of any project.</li>
</ol>
</ul>
</ul>
<p>These things are taken for granted. Any other way would simply feel weird.</p></blockquote>
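<p>For readers who have not seen these ideas in code, here is a minimal sketch of how config flags, a dark launch with a traffic percentage, and a staff-only feature might fit together. This is an illustration only and not Flickr&#8217;s actual code (the letter&#8217;s one real snippet is PHP; this sketch uses Python). Every name in it (<code>CFG</code>, <code>feature_enabled</code>, <code>in_rollout</code>, <code>render_homepage</code>, and the flag keys other than <code>disable_feature_video</code>) is hypothetical.</p>

```python
import zlib

# Hypothetical flag table. Only "disable_feature_video" echoes the letter;
# the other keys are invented for this sketch.
CFG = {
    "disable_feature_video": 0,       # set to 1 to turn video off without a deploy
    "dark_launch_activity_pct": 25,   # percent of users whose requests make the new backend calls
    "staff_only_people_tagging": 1,   # feature visible to staff accounts only
}


def feature_enabled(name, cfg=CFG):
    """A feature is on unless its disable flag is set."""
    return not cfg.get("disable_feature_" + name, 0)


def in_rollout(user_id, percent):
    """Place a user in a stable bucket from 0 to 99 and compare it to percent."""
    bucket = zlib.crc32(str(user_id).encode("utf-8")) % 100
    return percent > bucket


def render_homepage(user_id, is_staff, fetch_activity, cfg=CFG):
    """Return the names of the page sections this user would see."""
    sections = ["photos"]
    if in_rollout(user_id, cfg["dark_launch_activity_pct"]):
        # Dark launch: make the backend call so the load is real,
        # then throw the result away instead of showing it.
        fetch_activity(user_id)
    if feature_enabled("video", cfg):
        sections.append("video")
    if is_staff and cfg.get("staff_only_people_tagging"):
        sections.append("people_tagging")
    return sections
```

<p>Hashing the user id (rather than rolling a random number per request) is one possible design choice: it keeps each user in the same bucket as the percentage is ramped up or down, so raising <code>dark_launch_activity_pct</code> only ever adds users to the rollout.</p>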
<p>I have no idea if posting this letter helps anyone other than myself, but there you go.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://www.kitchensoap.com/2012/01/05/convincing-management-that-cooperation-and-collaboration-was-worth-it/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		</item>
		<item>
		<title>Systems Engineering: A great definition.</title>
		<link>http://www.kitchensoap.com/2011/07/18/systems-engineering-great-definition/</link>
		<comments>http://www.kitchensoap.com/2011/07/18/systems-engineering-great-definition/#comments</comments>
		<pubDate>Mon, 18 Jul 2011 11:46:57 +0000</pubDate>
		<dc:creator>allspaw</dc:creator>
				<category><![CDATA[Culture]]></category>
		<category><![CDATA[Random]]></category>
		<category><![CDATA[WebOps]]></category>

		<guid isPermaLink="false">http://www.kitchensoap.com/?p=6175</guid>
		<description><![CDATA[Ben Rockwood said something last December about the re-emergence of the Systems Engineer and I agree with him, 100%. To add to that, I&#8217;d like to quote the excellent NASA Systems Engineering handbook&#8217;s introduction. The emphasis is mine: Systems engineering is a methodical, disciplined approach for the design, realization, technical management, operations, and retirement of [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>Ben Rockwood said <a href="http://cuddletech.com/blog/?p=150" target="_blank">something last December</a> about the re-emergence of the Systems Engineer and I agree with him, 100%.</p>
<div id="attachment_6366" class="wp-caption alignright" style="width: 231px">
	<a href="http://education.ksc.nasa.gov/esmdspacegrant/Documents/NASA%20SP-2007-6105%20Rev%201%20Final%2031Dec2007.pdf"><img class="size-medium wp-image-6366" title="NASA Systems Engineering Handbook" src="http://www.kitchensoap.com/wp-content/uploads/2011/07/Screen-shot-2011-07-18-at-7.36.22-AM-231x300.png" alt="NASA Systems Engineering Handbook" width="231" height="300" /></a>
	<p class="wp-caption-text">NASA Systems Engineering Handbook, 2007</p>
</div>
<p>To add to that, I&#8217;d like to quote the excellent NASA Systems Engineering handbook&#8217;s introduction. The emphasis is mine:</p>
<blockquote><p>Systems engineering is a methodical, disciplined approach for the design, realization, technical management, operations, and retirement of a system. A “system” is a construct or collection of different elements that together produce results not obtainable by the elements alone. The elements, or parts, can include <strong>people, hardware, software, facilities, policies, and documents; that is, all things required to produce system-level results.</strong> The results include system-level qualities, properties, characteristics, functions, behavior, and performance. The value added by the system as a whole, beyond that contributed independently by the parts, is primarily created by the relationship among the parts; that is, how they are interconnected. It is a way of looking at the “big picture” when making technical decisions. It is a way of achieving stakeholder functional, physical, and operational performance requirements in the intended use environment over the planned life of the systems. <strong><em>In other words, systems engineering is a logical way of thinking.</em></strong></p>
<p>Systems engineering is the art and science of developing an operable system capable of meeting requirements within often opposed constraints. <strong><em>Systems engineering is a holistic, integrative discipline, wherein the contributions of structural engineers, electrical engineers, mechanism designers, power engineers, human factors engineers, and many more disciplines are evaluated and balanced, one against another, to produce a coherent whole that is not dominated by the perspective of a single discipline.</em></strong></p>
<p>Systems engineering seeks a safe and balanced design in the face of opposing interests and multiple, sometimes conflicting constraints. The systems engineer must develop the skill and instinct for identifying and focusing efforts on assessments to optimize the overall design and not favor one system/subsystem at the expense of another. The art is in knowing when and where to probe. Personnel with these skills are usually tagged as “systems engineers.” They may have other titles—lead systems engineer, technical manager, chief engineer— but for this document, we will use the term <strong><em>systems engineer</em></strong>.</p>
<p>The exact role and responsibility of the systems engineer may change from project to project depending on the size and complexity of the project and from phase to phase of the life cycle. For large projects, there may be one or more systems engineers. For small projects, sometimes the project manager may perform these practices. But, whoever assumes those responsibilities, the systems engineering functions must be performed. The actual assignment of the roles and responsibilities of the named systems engineer may also therefore vary. The lead systems engineer ensures that the system technically fulfills the defined needs and requirements and that a proper systems engineering approach is being followed. The systems engineer oversees the project’s systems engineering activities as performed by the technical team and directs, communicates, monitors, and coordinates tasks. The systems engineer reviews and evaluates the technical aspects of the project to ensure that the systems/subsystems engineering processes are functioning properly and evolves the system from concept to product. <strong><em>The entire technical team is involved in the systems engineering process.</em></strong></p></blockquote>
<p>I would imagine that any successful organization understands this concept of systems engineering, but I don&#8217;t think I&#8217;ve ever seen it put so well.</p>
<p>NASA&#8217;s engineers have both common and conflicting goals, just like we do in web operations. They weigh trade-offs in efficiency and thoroughness, and wade into the constraints of better, cheaper, faster, and hopefully: more <a title="Resilience Engineering Part I" href="http://www.kitchensoap.com/2011/04/07/resilience-engineering-part-i/" target="_blank">resilient</a>.</p>
<p>This re-emergence of the systems engineering (or &#8220;full-stack&#8221; engineering) notion is excellent and exciting to me, and I&#8217;m hoping that when people in our field say &#8220;DevOps&#8221; (or, as Theo puts it, <a href="http://www.youtube.com/watch?v=y0mHo7SMCQk" target="_blank">*Ops</a>), what they mean is taking a <em><strong>systems engineering</strong></em> view.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kitchensoap.com/2011/07/18/systems-engineering-great-definition/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>The epicenter of the web, and NYC</title>
		<link>http://www.kitchensoap.com/2009/12/03/360/</link>
		<comments>http://www.kitchensoap.com/2009/12/03/360/#comments</comments>
		<pubDate>Thu, 03 Dec 2009 23:47:42 +0000</pubDate>
		<dc:creator>allspaw</dc:creator>
				<category><![CDATA[Random]]></category>

		<guid isPermaLink="false">http://www.kitchensoap.com/?p=360</guid>
		<description><![CDATA[One of my apprehensions in moving to New York from San Francisco was a common concern: why would I move from the &#8216;epicenter&#8217; of the web to a place where it&#8217;s not? There&#8217;s been lots written about startup hub cities, and innovative web metro areas, but the fact of the matter is that New York [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>One of my apprehensions in moving to New York from San Francisco was a common concern: why would I move from the &#8216;epicenter&#8217; of the web to a place where it&#8217;s not? There&#8217;s been lots <a href="http://www.avc.com/a_vc/2006/05/replicating_sil.html" target="_blank">written</a> about startup hub cities, and innovative web metro areas, but the fact of the matter is that New York hasn&#8217;t historically been a hotbed of web growth and innovation. Not compared to the Bay Area or Seattle, anyway.</p>
<p>I do, of course, think this has been changing recently. The punch line is that I obviously did <a href="http://www.kitchensoap.com/2009/11/18/from-one-door-to-another/" target="_blank">take the job</a>, despite my misgivings about not being surrounded by people who are constantly thinking about my industry. One of the reasons I got over not being in the &#8216;epicenter&#8217; is that <a href="http://www.avc.com" target="_blank">Fred Wilson</a> and <a href="http://continuations.com/" target="_blank">Albert Wenger</a> did an insanely good job of convincing me it was a good idea. <img src='http://www.kitchensoap.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  Another reason is that I think Etsy is basically a Bay Area company that just happens to be in Brooklyn. I mean that as a compliment.</p>
<p>So while I always had some inkling of what &#8216;epicenter of the web&#8217; means, I was never really sure how that could be measured. Indeed.com has indirectly measured it by the <a href="http://www.indeed.com/jobtrends/information-technology-industry" target="_blank"># of job listings</a>.  O&#8217;Reilly did something similar for the <a href="http://radar.oreilly.com/2006/06/startup-centers.html" target="_blank"># of startup jobs in 2006.</a></p>
<p>Number of jobs is interesting, but I thought it might be fun to measure it by locations of headquarters as seen through the lens of monthly unique users. So, I took the <a href="http://www.quantcast.com/top-sites-1" target="_blank">Quantcast &#8220;Top 100&#8221;</a> sites, found the latitude and longitude of the headquarters of each site via <a href="http://www.crunchbase.com/help/api" target="_blank">Crunchbase&#8217;s API</a>, as well as other bits around the web, and <a href="http://www.aaronland.info/weblog/" target="_blank">Aaron</a> helped out with the excellent <a href="http://modestmaps.com/" target="_blank">Modest Maps</a> to make this:</p>
<div class="wp-caption alignnone" style="width: 500px">
	<a href="http://www.flickr.com/photos/straup/4155793319/in/set-72157622926803950/"><img title="North America" src="http://farm3.static.flickr.com/2568/4155793319_e5e2c6bb7b.jpg" alt="Quantcast Top 100 plotted on U.S. Map, radius = monthly uniques" width="500" height="313" /></a>
	<p class="wp-caption-text">Quantcast Top 100 plotted on U.S. Map, radius = monthly uniques</p>
</div>
<p>Like I said, this doesn&#8217;t change my thoughts about the new job, or what I think &#8216;epicenter of the web&#8217; means. But, still interesting, dontcha think?</p>
<p><strong>UPDATE</strong>: Here&#8217;s a link to the raw data: <a href="http://spreadsheets.google.com/pub?key=tLwD1C5mghn9U3XJj_yqyjw&amp;output=html" target="_blank">http://spreadsheets.google.com/pub?key=tLwD1C5mghn9U3XJj_yqyjw&amp;output=html</a></p>
<p>If there&#8217;s anything wrong, lemme know. <img src='http://www.kitchensoap.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.kitchensoap.com/2009/12/03/360/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
		</item>
		<item>
		<title>How Complex Systems Fail: A WebOps Perspective</title>
		<link>http://www.kitchensoap.com/2009/11/12/how-complex-systems-fail-a-webops-perspective/</link>
		<comments>http://www.kitchensoap.com/2009/11/12/how-complex-systems-fail-a-webops-perspective/#comments</comments>
		<pubDate>Thu, 12 Nov 2009 22:39:05 +0000</pubDate>
		<dc:creator>allspaw</dc:creator>
				<category><![CDATA[Random]]></category>
		<category><![CDATA[WebOps]]></category>

		<guid isPermaLink="false">http://www.kitchensoap.com/?p=326</guid>
		<description><![CDATA[I guess I&#8217;m late on getting to this, but How Complex Systems Fail by Richard Cook is excellent. Let me start with this: I don&#8217;t think I can overstate how right-on this paper is, with respect to the challenges, solutions, observations, and concerns involved with operating a medium to large web infrastructure. I found this [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>I guess I&#8217;m late on getting to this, but <a href="http://www.ctlab.org/documents/How%20Complex%20Systems%20Fail.pdf">How Complex Systems Fail</a> by <a href="http://www.ctlab.org/Cook.cfm" target="_blank">Richard Cook</a> is excellent.</p>
<p>Let me start with this: I don&#8217;t think I can overstate how right-on this paper is, with respect to the challenges, solutions, observations, and concerns involved with operating a medium to large web infrastructure. I found this via @<a href="http://twitter.com/benjaminblack" target="_blank">benjaminblack</a>, and I agree with him 100%: this should be considered <em><strong>required reading</strong></em> for anyone in our industry. I&#8217;m not sure if Cook ever thought that his paper would apply to web infrastructure, but I think it can and does. Please take 30 minutes right now and read it. <img src='http://www.kitchensoap.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>There are a number of salient points in the paper that I&#8217;d like to comment on. Again, this is through the lens of failures of complex systems as it pertains to web operations:</p>
<blockquote><p><strong>7) Post-accident attribution accident to a ‘root cause’ is fundamentally wrong.</strong></p></blockquote>
<p>I&#8217;m going to guess that this portion may be viewed as controversial in the prevailing webops wisdom, where post-mortems are for sure necessary, but whose content may or may not be effective in preventing similar types of failure. I <em>do</em> value the process of a post-mortem, because I think the human element of understanding complex failures is important and doing whatever you can to put in place safety is good, modulo what is said in section #16 of the paper. I believe that even a rudimentary process of &#8220;<a href="http://www.startuplessonslearned.com/2009/07/how-to-conduct-five-whys-root-cause.html" target="_blank">5 Whys</a>&#8221; has value. But at the same time, I also think that there is something in the spirit of this paragraph, which is that there is a danger in standing behind a single underlying cause when there are systemic failures involved. Doing this can lead to the false belief that you&#8217;ve got this mode covered, you&#8217;ve found the silver bullet that made the whole mountain crumble, and jeez what a relief because <em><strong>that</strong></em> will never bite us again.</p>
<blockquote><p><strong>14) Change introduces new forms of failure.</strong></p></blockquote>
<p>I totally agree with this point. However, I often see this as a rallying point for operations teams to say &#8220;No!&#8221; to change, when instead they should be working alongside development (and product owners) with a goal of <em>reducing</em> the risk of failure associated with each change. I do not believe that &#8216;release early, release often&#8217; in and of itself can reduce that risk. I believe that the real (and only) way to do this is both technical <em>and</em> cultural. But I&#8217;ve <a href="http://velocityconference.blip.tv/file/2284377/" target="_blank">spoken about this before</a>.</p>
<blockquote><p><strong>16) Safety is a characteristic of systems and not of their components</strong></p></blockquote>
<p>Emphasis on <em>&#8220;Safety cannot be purchased or manufactured; it is not a feature that is separate from the other components of the system.&#8221; </em>Real safety comes from smart people doing smart things to the entire shebang, not the individual guts.</p>
<p>and I think the point I love the most, with all of my heart:</p>
<blockquote><p><strong>18) Failure free operations require experience with failure.</strong></p></blockquote>
<p>Fear is a strong emotion. I believe it can be used as a strong motivator for ensuring safety in the face of constant change, instead of a reason to push back on the very idea of change. Embrace fear of outages and degradation. Use it to guide your architecture, your code, your infrastructure. So <em>lean into it.</em></p>
<p>There are a lot of great points in the paper, and I could go on, but you get the idea.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kitchensoap.com/2009/11/12/how-complex-systems-fail-a-webops-perspective/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Uncaching bits in filesystem cache</title>
		<link>http://www.kitchensoap.com/2009/07/09/uncaching-bits-in-filesystem-cache/</link>
		<comments>http://www.kitchensoap.com/2009/07/09/uncaching-bits-in-filesystem-cache/#comments</comments>
		<pubDate>Thu, 09 Jul 2009 18:17:26 +0000</pubDate>
		<dc:creator>allspaw</dc:creator>
				<category><![CDATA[Random]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://www.kitchensoap.com/?p=263</guid>
		<description><![CDATA[Domas makes something more useful than I bet most would think: http://mituzas.lt/2009/06/26/uncache/]]></description>
			<content:encoded><![CDATA[<p></p><p>Domas makes something more useful than I bet most would think: <a href="http://mituzas.lt/2009/06/26/uncache/" target="_blank">http://mituzas.lt/2009/06/26/uncache/</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.kitchensoap.com/2009/07/09/uncaching-bits-in-filesystem-cache/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>More back-of-envelope-math&#8230;</title>
		<link>http://www.kitchensoap.com/2008/09/18/more-back-of-envelope-math/</link>
		<comments>http://www.kitchensoap.com/2008/09/18/more-back-of-envelope-math/#comments</comments>
		<pubDate>Thu, 18 Sep 2008 16:21:45 +0000</pubDate>
		<dc:creator>allspaw</dc:creator>
				<category><![CDATA[Capacity Planning]]></category>
		<category><![CDATA[Random]]></category>
		<category><![CDATA[WebOps]]></category>

		<guid isPermaLink="false">http://www.kitchensoap.com/?p=55</guid>
		<description><![CDATA[Via kottke: some good examples of doing rough math in your head, causing you to guess about assumptions all along the way. IMHO, being able to do this is one of the things that makes a good web ops person. The examples might be &#8220;useless&#8221;, but the process is invaluable.]]></description>
			<content:encoded><![CDATA[<p></p><p><a href="http://www.kottke.org/08/09/guesstimations" target="_blank">Via kottke</a>: some good examples of doing rough math in your head, causing you to guess about assumptions all along the way.</p>
<p>IMHO, being able to do this is one of the things that makes a good web ops person. The <a href="http://3quarksdaily.blogs.com/3quarksdaily/2008/09/useless-calcula.html" target="_blank">examples</a> might be &#8220;useless&#8221;, but the process is invaluable.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kitchensoap.com/2008/09/18/more-back-of-envelope-math/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Four chapters of the new book on RoughCuts&#8230;</title>
		<link>http://www.kitchensoap.com/2008/06/13/four-chapters-of-the-new-book-on-roughcuts/</link>
		<comments>http://www.kitchensoap.com/2008/06/13/four-chapters-of-the-new-book-on-roughcuts/#comments</comments>
		<pubDate>Fri, 13 Jun 2008 22:04:31 +0000</pubDate>
		<dc:creator>allspaw</dc:creator>
				<category><![CDATA[Books]]></category>
		<category><![CDATA[Capacity Planning]]></category>
		<category><![CDATA[Random]]></category>
		<category><![CDATA[Web Ops]]></category>

		<guid isPermaLink="false">http://www.kitchensoap.com/?p=45</guid>
		<description><![CDATA[So now there are chapters 1-4 on Safari RoughCuts. Which means if you don&#8217;t mind shelling out the dough, you can take a look at what I&#8217;ve been getting up early for every day for the past few months. The working title is &#8220;The Art of Capacity Planning&#8221; and it&#8217;s meant to be a no-nonsense description [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>So now there are chapters 1-4 on <a title="The Art of Capacity Planning" href="http://safari.oreilly.com/9780596518578/" target="_blank">Safari RoughCuts</a>. Which means if you don&#8217;t mind shelling out the dough, you can take a look at what I&#8217;ve been getting up early for every day for the past few months. The working title is &#8220;The Art of Capacity Planning&#8221; and it&#8217;s meant to be a no-nonsense description of the capacity planning process and considerations for web operations.</p>
<p>I still have two chapters to go before it&#8217;s all finished, but if you&#8217;re nice enough to take a look at what I&#8217;ve got thus far, I&#8217;d appreciate any feedback. I&#8217;m sure there could be typos and some graphs misaligned, but such is life with &#8220;drafts&#8221;. <img src='http://www.kitchensoap.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://www.kitchensoap.com/2008/06/13/four-chapters-of-the-new-book-on-roughcuts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Tool update: WTF is inside filesystem cache?</title>
		<link>http://www.kitchensoap.com/2008/03/27/tool-update-wtf-is-inside-filesystem-cache/</link>
		<comments>http://www.kitchensoap.com/2008/03/27/tool-update-wtf-is-inside-filesystem-cache/#comments</comments>
		<pubDate>Thu, 27 Mar 2008 13:04:14 +0000</pubDate>
		<dc:creator>allspaw</dc:creator>
				<category><![CDATA[Caching]]></category>
		<category><![CDATA[Random]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://www.kitchensoap.com/2008/03/27/tool-update-wtf-is-inside-filesystem-cache/</guid>
		<description><![CDATA[A while back, I said I&#8217;d love to have a tool that would allow me to peek inside filesystem cache and tell me what files (or pages of files) are inside. Well, Peter Zaitsev points to the fincore tool, which comes pretty damn close: you give it a file, and it will tell you which pages [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>A while <a href="http://www.kitchensoap.com/2007/01/26/two-tools-that-i-would-love-more-than-anything/" target="_blank">back</a>, I said I&#8217;d love to have a tool that would allow me to peek inside filesystem cache and tell me what files (or pages of files) are inside. Well, Peter Zaitsev <a href="http://www.mysqlperformanceblog.com/2008/03/18/the-tool-ive-been-waiting-for-years/" target="_blank">points</a> to the <a href="http://net.doit.wisc.edu/~plonka/fincore/" target="_blank">fincore</a> tool, which comes pretty damn close: you give it a file, and it will tell you which pages of that file are in core memory.</p>
<p>Rock. Thanks, David Plonka.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kitchensoap.com/2008/03/27/tool-update-wtf-is-inside-filesystem-cache/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Datacenter Operating Systems</title>
		<link>http://www.kitchensoap.com/2008/02/20/datacenter-operating-systems/</link>
		<comments>http://www.kitchensoap.com/2008/02/20/datacenter-operating-systems/#comments</comments>
		<pubDate>Wed, 20 Feb 2008 16:16:07 +0000</pubDate>
		<dc:creator>allspaw</dc:creator>
				<category><![CDATA[Capacity Planning]]></category>
		<category><![CDATA[Random]]></category>

		<guid isPermaLink="false">http://www.kitchensoap.com/2008/02/20/datacenter-operating-systems/</guid>
		<description><![CDATA[I&#8217;m probably late in getting to this, but seeing the article in the WSJ about the RAD project made me stop to take a look. It appears to be a collection of different projects, all relating to infrastructure deployment/management and various research topics surrounding it. Looks cool so far.]]></description>
			<content:encoded><![CDATA[<p></p><p>I&#8217;m probably late in getting to this, but seeing the <a href="http://online.wsj.com/article/SB120346246517678289.html" target="_blank">article</a> in the WSJ about the <a href="http://radlab.cs.berkeley.edu/wiki/RAD_Lab" title="RAD project" target="_blank">RAD project</a> made me stop to take a look. It appears to be a collection of different projects, all relating to infrastructure deployment/management and various research topics surrounding it. Looks cool so far.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kitchensoap.com/2008/02/20/datacenter-operating-systems/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>5 Little Known Things</title>
		<link>http://www.kitchensoap.com/2007/01/18/5-little-known-things/</link>
		<comments>http://www.kitchensoap.com/2007/01/18/5-little-known-things/#comments</comments>
		<pubDate>Thu, 18 Jan 2007 21:53:12 +0000</pubDate>
		<dc:creator>allspaw</dc:creator>
				<category><![CDATA[Random]]></category>

		<guid isPermaLink="false">http://www.kitchensoap.com/2007/01/18/5-little-known-things/</guid>
		<description><![CDATA[Ok, Geva Perry tagged me so here goes: 1. After graduating from UMass Amherst with a Mechanical Engineering degree, I worked at the Volpe National Transportation Systems Center, doing side-impact crashworthiness research for the NHTSA. We ran transient dynamic finite-element analysis simulations using LS-DYNA and rigid-body simulations with MADYMO3D. This is where I learned UNIX, because [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>Ok, <a href="http://gevaperry.typepad.com/main/2006/12/blogtag.html">Geva Perry</a> tagged me so here goes:</p>
<p>1. After graduating from UMass Amherst with a Mechanical Engineering degree, I worked at the <a href="http://www.volpe.dot.gov/">Volpe National Transportation Systems Center</a>, doing side-impact crashworthiness research for the <a href="http://www.nhtsa.dot.gov/">NHTSA</a>. We ran transient dynamic finite-element analysis simulations using <a href="http://lsdyna.com/">LS-DYNA</a> and rigid-body simulations with <a href="http://madymo.com/cms/index.php?pageid=131">MADYMO3D</a>. This is where I learned UNIX, because we ran the simulations on big SGI Origin compute servers, T3 Cray machines, Intel Paragons, and even AIX RS/6000s.</p>
<p>2. I&#8217;ve been playing guitar since I was about 13 years old, and after playing many bluegrass, metal, and jamband covers and pop songs live, I still can&#8217;t seem to do anything with jazz improvisation except fake it.</p>
<p>3. In high school I was a major gearhead, and tricked out all three of my 1980s VW GTIs for extra horsepower. My introduction to automotive repairs was replacing the front left transaxle on my 1977 VW Scirocco in my driveway. This enthusiasm for cars also led me to be one &#8216;bad driving&#8217; point away from the Massachusetts Registry of Motor Vehicles seizing my license. (I drive slower now.)</p>
<p>4.  In 1999, I moved to San Francisco and after a short stint at <a href="http://gene.com">Genentech</a>, I went to work at <a href="http://salon.com">Salon.com</a> for <a href="http://chaddickerson.com/blog/" target="_blank">Chad Dickerson</a>, who continues to be my all-time favorite manager. I learned a lot from Chad, and am lucky enough to work with him at Yahoo!.</p>
<p>5. I was born in <a href="http://en.wikipedia.org/wiki/Ozarks" target="_blank">The Ozarks</a>, then moved to <a href="http://en.wikipedia.org/wiki/Everett%2C_Massachusetts" target="_blank">north suburban Boston</a>. Depending on how much I&#8217;ve had to drink, I either have a Boston accent, or a slight southern-ish accent.  <img src='http://www.kitchensoap.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>And now, as Norby points out&#8230;I tag <a href="http://saraewood.com/" target="_blank">Sara Wood</a>, <a href="http://www.aaronland.info/weblog/" target="_blank">Aaron Cope</a>, <a href="http://mysqldba.blogspot.com/" target="_blank">Dathan</a>, <a href="http://laughingmeme.org/" target="_blank">Kellan</a>, and <a href="http://george08.blogspot.com/" target="_blank">George</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.kitchensoap.com/2007/01/18/5-little-known-things/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
	</channel>
</rss>

