Skype's Explanation For Downtime Not Ringing True

from the something's-missing-here dept

It took quite a while, but Skype was finally able to get its service back up and running after extended downtime last week. Monday morning, Skype posted an explanation of what happened, but it has many of us scratching our heads -- and many professionals questioning Skype and trying to match up the official explanation to reality. Skype officially blamed the mass of Windows computers rebooting after Microsoft's latest security patch. It must be nice to blame Microsoft, but it's hard to understand why that should be the problem. First of all, Microsoft security patch updates happen pretty regularly (normally once a month), often requiring the same reboot process. Why would this time suddenly be different from every time in the past? Skype doesn't explain that. Second, and much more importantly, the service crashed at 3am PT on Thursday morning. That's about 24 hours after most computers would have rebooted. Microsoft comes out with its patches on Tuesday, and most computers then do the reboot in the early hours of Wednesday morning, not Thursday morning.
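The timing argument can be checked with a little date arithmetic. This is only a sketch: the year and month (August 2007) are inferred from the comment dates on this post, and the 3am reboot time is an assumption standing in for "the early hours of Wednesday morning". The helper name `second_tuesday` is made up for illustration.

```python
from datetime import date, datetime, time, timedelta


def second_tuesday(year: int, month: int) -> date:
    """Return the second Tuesday of a month (the usual patch release day)."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, so Tuesday is 1.
    days_to_first_tuesday = (1 - first.weekday()) % 7
    return first + timedelta(days=days_to_first_tuesday + 7)


# Assumption: the outage followed the August 2007 patch release.
patch_day = second_tuesday(2007, 8)

# Assumption: most machines reboot at 3am the morning after the release.
expected_reboot = datetime.combine(patch_day + timedelta(days=1), time(3))

# From the post: the service crashed at 3am PT on Thursday.
crash = datetime.combine(patch_day + timedelta(days=2), time(3))

gap_hours = (crash - expected_reboot).total_seconds() / 3600
print(patch_day, gap_hours)
```

Under those assumptions, the gap between the expected wave of reboots and the crash is a full day, which is the mismatch described above.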


Reader Comments

  •  
    Mike Wyman, Aug 20th, 2007 @ 5:54pm

    Microsoft Not to Blame

    Bingo! You are dead-on. Skype obviously made a software change which was aggravated by the MS patch release.

     


    •  
      Anonymous Coward, Aug 20th, 2007 @ 7:22pm

      Re: Microsoft Not to Blame

      Let's also not mention the fact that all good network admins TEST the security patches before rebooting the real machines to make sure nothing like that would happen.

       


  •  
    Chris Woods, Aug 20th, 2007 @ 6:09pm

    Skype doesn't blame Microsoft

    They made it very clear in their explanation that it was a bug in Skype's "self-healing algorithm" that was tickled by the Patch Tuesday reboots, so they don't actually blame Microsoft. They're taking full responsibility and admitting it was a Skype bug. Please tone down the hype. It makes you look silly.

     


    •  
      Mike (profile), Aug 20th, 2007 @ 6:14pm

      Re: Skype doesn't blame Microsoft

      They made it very clear in their explanation that it was a bug in Skype's "self-healing algorithm" that was tickled by the Patch Tuesday reboots, so they don't actually blame Microsoft. They're taking full responsibility and admitting it was a Skype bug. Please tone down the hype.

      Er... while it's true that they officially take full responsibility, they do basically say that it was the MS patch that kicked off the problem (and nearly all of the press reports are focusing on this). That's convenient for Skype, because they knew the press would focus in on that, no matter what they said about their own algorithm.

      The point, however, is not who is to blame, but that this explanation doesn't seem to make much sense.

       


  •  
    awwtbone, Aug 20th, 2007 @ 6:23pm

    1+1=5

    I don't download/install patches... FLAW. On the other hand, I had 90+% uptime with my Skype.

     


  •  
    John, Aug 20th, 2007 @ 6:51pm

    Kind of hard to believe that a company like Skype would allow an auto reboot to occur on a production server. Logic would demand a staging server test before rolling to production... lame excuse... unless an intern was doing the updates... in that case, who's running Skype?

     


    •  
      Quasse, Aug 20th, 2007 @ 7:25pm

      Re: Auto Reboot

      If you had read their press release... It clearly states that the problem was not a server rebooting. They said it was the flood of login requests that came after all the users rebooted.

      They didn't blame this in any way on Microsoft either. They clearly state that the problem was theirs and that the Microsoft patch was perfectly routine.

       


  •  
    GoblinJuice, Aug 20th, 2007 @ 7:14pm

    I looked into Skype a looong time ago. It mostly seemed to be used by teens and looosers. (The userbase may have changed.) Assuming that's still the case, who gives a fuck?

    So Bobby can't talk Suzy out of her panties for a little while. Life will go on.

    Bobby will find pr0n. Suzy will get slammed by the jock / emo kid / wigger / fillInTheBlank down the block.

     


    •  
      Tin Ear, Aug 20th, 2007 @ 8:34pm

      Re: Porn and looosers?

      Believe it or not, Skype is used by many contemporary businesses in this day and age. You may have looked into the program a looong time ago (What's with the 'o's, anyway?) but I've been using the program almost since its start. It has totally replaced my home phone service, and I have found it to be completely reliable. Well, outside of a very few instances.

      I don't know where your head is at, but from your post I get the impression that you may think the internet is all about getting 'Suzy' out of her panties for you (dream on) and porn. Then again, that may be all you've seen of it. I can't say.

      The userbase has definitely changed since you looked into it. My impression is that if you were to contact me on my Skype, I would be forced to block you. For that, I apologize in advance.

       


      •  
        GoblinJuice, Aug 21st, 2007 @ 4:50am

        Re: Re: Porn and looosers?

        >Believe it or not, Skype is used by many contemporary businesses in this day and age.

        It's matured. Good to know.

        >You may have looked into the program a looong time ago

        Yeah, like I said. =) It was a long time ago.

        >(What's with the 'o's, anyway?)

        Emphasis, baby! =D

        >It has totally replaced my home phone service, and I have found it to be completely reliable. Well, outside of a very few instances.

        Oh, like the major f'in crash? lol.

        >I don't know where your head is at,

        Dude. Come on. LOL.

        >but from your post I get the impression that you may think the internet is all about getting 'Suzy' out of her panties for you (dream on) and porn.

        That's what 99% of life is about.

        >The userbase has definitely changed since you looked into it.

        Good to know.

        >My impression is that if you were to contact me on my Skype, I would be forced to block you. For that, I apologize in advance.

        Don't feel too bad. I'm used to it. =D

        *mumbles something about restraining orders*

         


    •  
      PaulT, Aug 21st, 2007 @ 1:56am

      Re:

      GoblinJuice... wow. I'll bite the troll bait however..
      If you're using Skype to look for new 'friends', then teen losers are probably all you'll get, and you deserve it - it's not really a social networking service for finding new friends, it's a communication medium.

      I moved to Spain a couple of years ago and I use Skype to keep in touch with my family and friends back in the UK. I do this because it's over 20 times cheaper to do so if I'm calling a landline, and free if I'm calling a friend on Skype.

      The last company I worked for here also used Skype for people in 3 different offices to easily talk and/or message each other.

      And so on...

       


      •  
        GoblinJuice, Aug 21st, 2007 @ 4:52am

        Re: Re:

        >GoblinJuice... wow. I'll bite the troll bait however..

        Not trollbait. Just an asshole that didn't understand what the bfd about Skype having a hiccup was about. Big difference. =D

         


    •  
      Solarcanine, Aug 21st, 2007 @ 6:44am

      Re:

      Actually, we use Skype quite a bit for business reasons - it's far more convenient to actually talk to the people in our Taiwan office over the 'net rather than over the phone. The userbase has, indeed, changed quite a bit.

       


  •  
    Sanguine Dream, Aug 20th, 2007 @ 7:46pm

    Well...

    It doesn't seem that they are pointing the finger at MS but just saying that it was a matter of horrible timing with MS updates.

     


  •  
    linuxamp, Aug 20th, 2007 @ 9:16pm

    Not quite right

    From John:
    Kind of hard to believe that a company like Skype would allow an auto reboot to occur on a Production server.

    John, you should read a bit more carefully. The reboot that they mention is not the reboot of their servers but the reboot of PCs around the world with Skype set to startup with Windows. I'm certain they would not let their production servers reboot without a failover in place.

    In the Skype architecture, each user's computer attempts to act as a node in the Skype network which assists in scalability and apparently fault tolerance to some extent.

    The fewer nodes, the more work their root servers must perform. In this case, since so many nodes were offline and came online at once (due to the MS update), a flood of login attempts hit their servers. They basically DoS attacked themselves.

    Still a poor excuse though. As many have pointed out, why hasn't this happened before when MS released updates which require a reboot?
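To make the "DoS attacked themselves" point concrete, here is a rough sketch comparing the peak login load when every client reconnects in the same minute against the same clients reconnecting at random times over a day. The client count, the one-day window, and the function name `peak_logins_per_minute` are all made-up illustration values, not figures from Skype.

```python
import random
from collections import Counter


def peak_logins_per_minute(num_clients: int, window_minutes: int, seed: int = 0) -> int:
    """Peak login attempts in any single minute when each client reconnects
    at a uniformly random minute inside the window."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    arrivals = Counter(rng.randrange(window_minutes) for _ in range(num_clients))
    return max(arrivals.values())


clients = 100_000  # hypothetical number of clients that rebooted

# Everyone comes back within the same minute.
same_time = peak_logins_per_minute(clients, window_minutes=1)

# The same clients come back spread across 24 hours.
spread_out = peak_logins_per_minute(clients, window_minutes=24 * 60)

print(same_time, spread_out)
```

The login servers only have to survive the peak, so the same number of reboots is far easier to absorb when they are spread out than when they land together.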

     


    •  
      Anonymous Coward, Aug 21st, 2007 @ 12:17am

      Re: Not quite right

      Another issue to consider is that Windows updates occur within 20% of the time specified. This is to directly avoid a flood of machines rebooting at the same time.

       


  •  
    Richard Burman, Aug 20th, 2007 @ 9:27pm

    Man... I like Suzie..

    "I don't know where your head is at, but from your post I get the impression that you may think the internet is all about getting 'Suzy' out of her panties for you (dream on) and porn. Then again, that may be all you've seen of it. I can't say."

    I REALLY like Suzy....

     


  •  
    Logan Durand, Aug 20th, 2007 @ 9:57pm

    A Few Missing Variables...

    You need to consider that there are several other factors that can greatly affect how many restarts occur within a specified timeframe. If the time needed to download, extract, and install an update (as determined by an individual's connection speed and processing power) were to vary by, say, 50 percent between the respective users, a large update or service pack could exhibit restarts spread apart over several minutes or, in extreme cases, even hours, while a small update of only a few hundred kilobytes would have users restarting within only a few minutes of each other. The restart time remains proportionately the same between all users, but only becomes noticeably different with large updates. In addition, Windows updates are not all flagged "critical" and therefore may not have been installed automatically, which would lower the chances of this sort of DDoS attack slightly.

    Also, to answer why this happened at a time other than the scheduled update time, I can think of two explanations. First, it is possible that the problem in question did not cause errors until after the initial flood of login requests. Anyone who has used a computer long enough knows that a corrupted disk sector or bad bit of RAM can go unnoticed for days, because only when the system tries to access the bad sector will a problem arise. Another, less likely scenario is that while a problem occurred, it didn't cause the servers to crash, but administrators decided to take them offline preemptively so as to prevent a more serious error from occurring.
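The point about update size can be put in rough numbers. This is a sketch under made-up assumptions: every machine starts the update at the same moment, the slowest machine takes 50 percent longer than the fastest, and the throughput figures and the name `restart_spread_minutes` are purely illustrative.

```python
def restart_spread_minutes(update_mb: float,
                           fastest_mb_per_min: float,
                           slowest_mb_per_min: float) -> float:
    """Gap in minutes between the first and last restart, assuming all
    machines begin downloading and installing at the same moment."""
    fastest_done = update_mb / fastest_mb_per_min
    slowest_done = update_mb / slowest_mb_per_min
    return slowest_done - fastest_done


# Hypothetical throughput: 3 MB/min for the fastest machine and 2 MB/min
# for the slowest, so the slowest takes 50 percent longer.
small = restart_spread_minutes(0.3, fastest_mb_per_min=3.0, slowest_mb_per_min=2.0)
large = restart_spread_minutes(300, fastest_mb_per_min=3.0, slowest_mb_per_min=2.0)

print(f"small update: restarts within {small * 60:.0f} seconds of each other")
print(f"large update: restarts spread over {large:.0f} minutes")
```

The relative difference between users is the same in both cases. Only the absolute spread grows with the size of the update, so a small patch leaves the restarts bunched together.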

     


  •  
    Joel Coehoorn, Aug 21st, 2007 @ 6:24am

    Apparently I found this post a bit late, judging by all the other comments. But I want to point out that the timing is actually pretty consistent with a Windows Update problem. Updates are usually released early in the morning on the 2nd Tuesday of the month, also known as Patch Tuesday. But because of the incredible volume of patches to download, not everyone gets them at the same time. Microsoft has set it up so that businesses have the best shot at getting them on Tuesday, and I know that on my machine I don't usually get it until Wednesday or Thursday.

    With that, I don't think all those machines re-starting and reconnecting at once would do it alone. But the whole point of Windows Updates is to fix things, and that means changing something. They could have fixed a flaw somewhere in the network code that results in Skype's authentication system having to work harder or simply not handling requests correctly.

    However, even if this is the case, the blame still rests squarely with Skype. It was their bad code all along that actually caused the outage. The Windows Update only triggered it.

     


