Why Microsoft fell offline

Wired News has a story online about Microsoft's DNS troubles last week, which left most of its web sites and services inaccessible for large chunks of time. According to Wired, the problem was a router misconfiguration, but the story doesn't offer any clues about exactly what went wrong with the routers. Apparently the problem was fairly nasty, since it took a number of hours to sort out. (I'll betcha it was an HSRP/layer 2 kinda snafu. Those are kinda tricky.) Then there's this part of the story:
Technical experts blame Microsoft's design decisions for exacerbating its woes. All the affected Microsoft sites rely on just four Windows servers, located in the company's Canyon Park data center, to forward users to the right destination via the Domain Name System (DNS).

Because all four DNS servers -- which translate names like microsoft.com into its numeric address -- share the same routers, all are vulnerable to hardware glitches or a technician's error.
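
To make the quoted point concrete, here is a rough sketch of a sanity check for this kind of setup. It groups a list of name server addresses by network prefix and flags the case where they all land in one network. The addresses are hypothetical (taken from the reserved documentation ranges, not Microsoft's real servers), the function names are made up for illustration, and sharing a /24 is only a crude stand-in for "sits behind the same routers."

```python
import ipaddress
from collections import defaultdict


def group_by_prefix(addresses, prefix_len=24):
    """Group IPv4 addresses by the network they fall into at prefix_len."""
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False lets us pass a host address and get its enclosing network
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[net].append(addr)
    return dict(groups)


def shares_single_network(addresses, prefix_len=24):
    """True if every address falls in the same network (a likely single point of failure)."""
    return len(group_by_prefix(addresses, prefix_len)) == 1


# Hypothetical name server addresses, documentation ranges only.
clustered = ["192.0.2.10", "192.0.2.11", "192.0.2.12", "192.0.2.13"]
spread = ["192.0.2.10", "198.51.100.10", "203.0.113.10", "192.0.2.11"]

for label, servers in (("clustered", clustered), ("spread", spread)):
    networks = group_by_prefix(servers)
    warning = "all eggs in one basket" if shares_single_network(servers) else "ok"
    print(f"{label}: {len(networks)} network(s), {warning}")
```

A check like this says nothing about the physical path, so passing it doesn't prove the servers are independent. Failing it is a pretty good hint that they aren't.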

I won't launch into a lecture about redundancy, DNS, and how a mega-corporation like Microsoft ought to know better. But I'd really like to. I just want to spare you all; you know the drill.

But jeez.
