More Than DNS: Learnings from the 14-hour AWS outage

Picture of the Modal team working on the outage. I’m on the right, biting a nail nervously. We’re in an Italian hotel because this happened on day 1 of our offsite.

On Monday, the AWS us-east-1 region had its worst outage in over 10 years. The whole thing lasted more than 16 hours and affected 140 AWS services, including, critically, EC2. SLAs were blown, and an eight-figure revenue reduction will follow. Before Monday, I’d spent around 7 years in industry and never personally had production nuked by a public cloud outage. I had generally regarded AWS’s reliability as excellent, even industry-leading.

What the hell happened?

A number of smart engineers have come to this major bust-up and covered it with the blanket of a simple explanation…
