/// Amazon Details Last Week’s Cloud Failure, and Apologizes [NewEnterprise]

April 29, 2011  |  All Things Digital

Amazon has released a detailed account of its terrible, horrible, no good very bad week , during which portions of its Amazon Web Services crashed in the US, and brought the operations of numerous other companies down with it. It’s a rather lengthy read , so I thought I’d pull out some highlights. It at all started started 12:47 AM PDT on April 21 in Amazon’s Elastic Block Storage operation, which is essentially the storage used by Amazon’s EC2 cloud compute service, so EBS and EC2 go hand-in-hand. During normal scaling activities, a network change was underway. It wasn performed incorrectly. Not by a machine, but by a human. As Amazon puts it: The configuration change was to upgrade the capacity of the primary network. During the change, one of the standard steps is to shift traffic off of one of the redundant routers in the primary EBS network to allow the upgrade to happen. The traffic shift was executed incorrectly and rather than routing the traffic to the other router on the primary network, the traffic was routed onto the lower capacity redundant EBS network

Go here to see the original:
Amazon Details Last Week’s Cloud Failure, and Apologizes [NewEnterprise]

Leave a Reply

You must be logged in to post a comment.