Blame Murphy's Law, Excessive Hubris For Blackouts In SF

from the or-blame-the-PR-people,-your-choice dept

Lots of San Francisco-based web sites (Craigslist, Six Apart, Yelp, and Technorati, among others) have been experiencing some problems today, after power outages in the city took down 365 Main, a major hosting facility there. While the power company says it doesn't yet know what's caused the outages, we have a pretty good idea: a 365 Main press release that went out this morning, bragging about the two years of continuous uptime one of its customers has had since moving to the data center. And when these guys invoke Murphy's Law, they don't do it by half, either. The release also brags about the center's "unique billing system in which 365 Main only charges customers for the exact amount of power that is used" -- so presumably today will be free. But what really sealed their fate was this paragraph:
"To ensure uptime for key tenants such as RedEnvelope, 365 Main provides modern power and cooling infrastructure. The company's San Francisco facility includes two complete back-up systems for electrical power to protect against a power loss. In the unlikely event of a cut to a primary power feed, the state-of-the-art electrical system instantly switches to live back-up generators, avoiding costly downtime for tenants and keeping the data center continuously running."
Good to see those backup systems are working!


Reader Comments

  1.  
    Michael Witt, Jul 24th, 2007 @ 4:15pm

    Drunk and disorderly... ?

    It hit digg a little while ago:

    http://valleywag.com/tech/breakdowns/a-drunk-employee-kills-all-of-the-websites-you-care-about-282021.php

    In that article it says that a "shitfaced drunk" employee did the damage to 40+ racks of equipment.

    I dunno, but I think I'll take that article with a grain of salt for the time being.

     

  2.  
    dorpus, Jul 24th, 2007 @ 4:26pm

    Rats

    I once worked for Motorola's Iridium project, which had spent millions of dollars on redundant power backup systems at their controlling facility. However, it was still not immune to the stray rat who bit into a power cable -- that would have taken the facility down for days. The facility was near the Potomac River, so there were lots of rats.

    Russia's Baikonur Cosmodrome also had a serious problem with rats chewing cables at the time, and they dealt with the problem by keeping cats.

     

  3.  
    sehlat, Jul 24th, 2007 @ 4:28pm

    Murphy's Law - a correction

    Murphy's Law is commonly stated: "Whatever can go wrong will go wrong." That is incorrect. It should be: "Whatever can go wrong may go wrong." Of course, if you persist in walking through mine fields...

    Quoted from The Signature of God by John Dalmas

     

  4.  
    GoblinJuice, Jul 24th, 2007 @ 4:44pm

    Could be related. =)

    "30K Without Power In SF
    Electricity goes out downtown after explosion beneath Mission St. manhole cover."

     

  5.  
    BMR777, Jul 24th, 2007 @ 4:58pm

    Bragging like that is almost as stupid as...

    Bragging like that is almost as stupid as saying that a ship is "unsinkable". I can think of one such case where such a statement blew up in the bragger's face and royally pwned them.

    Moral: God doesn't like braggers.

    BMR777

     

  6.  
    Anonymous Coward, Jul 24th, 2007 @ 5:01pm

    "Excessive Hubris"... seems a bit redundant I'd say.

     

  7.  
    Anonymous Coward, Jul 24th, 2007 @ 6:04pm

    Hrmm, that's why I can't get on Craigslist. It still seems to be running slow and timing out now.

     

  8.  
    Anonymous Coward, Jul 24th, 2007 @ 6:15pm

    Craigslist is running slow and timing out because now the server is Craig's Mac that was in his closet.

     

  9.  
    John, Jul 24th, 2007 @ 6:27pm

    Re:

    Thank God it's a Mac; if it were IIS it would just be timing out.

     

  10.  
    Anonymous Coward, Jul 24th, 2007 @ 7:07pm

    Duh...

    So the servers were running on backup power. Big whoop. All of the switching and other important nodes were down all around the servers. No data IO at all. Doesn't really help to have backups unless the entire network has backups.

     

  11.  
    Anonymous Coward, Jul 24th, 2007 @ 9:37pm

    That wasn't Murphy's Law...

    It was The Un-Speakable Law !!!

     

  12.  
    Overcast, Jul 24th, 2007 @ 10:11pm

    Imagine all the damage done to the environment due to the do-gooders in San Fran Sicko using their A/C.

    Excessive BS is more like it

     

  13.  
    MadJo, Jul 24th, 2007 @ 11:54pm

    well duh!

    ...the state-of-the-art electrical system instantly switches to live back-up generators...

    (emphasis mine)

    The power goes out and they rely on electrical systems to switch to backup generators?
    No wonder it went wrong. :-)

     

  14.  
    Mike F.M, Jul 25th, 2007 @ 12:41am

    Re: well duh!

    I'm pretty sure that something powered by the backup would have been monitoring, not something from live

     

  15.  
    Enrico Suarve, Jul 25th, 2007 @ 4:05am

    Re: well duh!

    You power them off UPS and use this to manage the switchover (humans just aren't fast or reliable enough)

    Something clearly went wrong, however. I had a client once who had a similar failure: the batteries switched over to the generator, which started just fine, and all was OK, but a few minutes later the generator stopped.

    Cause: whichever plank had installed the generator had wired up its electric diesel pump to the mains only, instead of to the generator's own electrical output... it drank itself dry ;0)

     

  16.  
    R3d Jack, Jul 25th, 2007 @ 4:29am

    Murphy was an optimist

    Whatever can go wrong, won't. Whatever can't go wrong, will.

     

  17.  
    Ferin, Jul 25th, 2007 @ 4:58am

    Asking for it.

    Dude, this was like the ultimate Murphy call. Couldn't their marketing guys have just lubed the company up, presented its backside, and shouted out "**** me Murphy! **** me long and hard!"

     

  18.  
    Jonny Tewbad, Jul 25th, 2007 @ 6:22am

    No mirroring?

    If these major web based companies are not mirroring and utilizing simple disaster recovery by having dual facilities in major metro areas, they should be considering it now. They should be ashamed and now they realize the cost of this ignorance is a mere pittance in comparison to the embarrassment.

     

  19.  
    Anonymous Coward, Jul 25th, 2007 @ 8:13am

    well, the backup can work all it wants to... the rest of the surrounding infrastructure has to work too, and if the rest of the city block is dead, they're still dead and aren't going anywhere.

     

  20.  
    Jason, Jul 25th, 2007 @ 8:14am

    backups?

    It happens to the best of us. Why didn't 365 Main plan for this kind of problem? Google, Microsoft, Yahoo, all have enough redundancy in their systems to prevent this kind of thing happening. What if their drives got toasted? What would have happened to their data? They'd be spending a lot of money for data recovery, that's what. J

     

  21.  
    Charles Griswold, Jul 25th, 2007 @ 10:09am

    Re: Excessive Hubris

    "Excessive Hubris"... seems a bit redundant I'd say.
    A moderate amount of hubris can be seen as a virtue.

     

  22.  
    Anonymous Coward, Jul 26th, 2007 @ 10:06am

    uhh.. they probably run huge diesel gensets that are capable of putting out Megawatts of power. There is no need for offsite mirrors. The diesels can be at peak load in under 1 minute. UPS can hold you over for the time in between. Also, most facilities have automagical transfer switches. Proper maintenance of the gensets (that means running them under loadbanks, and having qualified people come in to do the oil and other fluids) will almost guarantee that your facility will be online in minutes rather than hours.

     

  23.  
    Anonymous Coward Also, Jul 27th, 2007 @ 12:04pm

    you think you're so smart...

    My DC has redundant feeds from the street, redundant pipes, racks and racks of backup batteries. Didn't do a bit of good when a huge spike came in off the street and *vaporized* the emergency switching gear. I'm not kidding.

    No batteries, no redundant grid, no amount of testing will guarantee a no-impact failover. I've seen multiple outages at multiple sites over 20 years. The answer is IT DEPENDS. With huge power feeds it gets complicated fast.

    It is incredibly ironic that the PR folks put out a release like that the same day... but really, most hosting sites say the same kinds of things. You can take cheap shots if you want, doesn't mean you know anything. I often enjoy the hubris of the media!

     

  24.  
    san francisco chiropractor, Dec 23rd, 2010 @ 6:13am

    reply

    I totally agree: without proper maintenance, it's nothing.

     
