Update: Following the subsequent official announcements from Amazon and Microsoft, I updated the post with more information on the outages and relevant links.
Here it is again. A major outage in Amazon’s AWS data center in Northern Virginia takes down the cloud service in Amazon’s biggest region and, with it, a multitude of cloud-based services such as Netflix, Tinder, AirBnB and Wink. This is not the first time it has happened, and not even the worst. At least this time it didn’t last for days. This time it was their DynamoDB that went down and took a host of other services with it, as Amazon describes in a lengthy blog post.
And Amazon is not alone in that. Microsoft also suffered a major outage today in its Skype service, which rendered the popular VoIP service unusable. In its update, Skype reported that the root cause was a bad configuration change:
We released a larger-than-usual configuration change, which some versions of Skype were unable to process correctly therefore disconnecting users from the network. When these users tried to reconnect, heavy traffic was created and some of you were unable to use Skype’s free services …
This time it was Microsoft’s Skype service, but we have already seen how Microsoft’s Azure cloud can also suffer a major outage, all on account of a configuration update.
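The reconnect storm described in Skype's update is a common failure pattern: many disconnected clients retry at the same moment and overload the service again. One widely used mitigation is exponential backoff with random jitter. The following is only a minimal sketch of that idea, not Skype's actual client logic; `connect`, `backoff_delays` and `reconnect` are hypothetical names introduced for illustration.

```python
import random
import time


def backoff_delays(base=1.0, cap=60.0, attempts=6, rng=random.random):
    """Yield one randomized delay (in seconds) per retry attempt.

    "Full jitter": each delay is rng() times min(cap, base * 2**attempt),
    so the upper bound doubles per attempt until it reaches the cap.
    """
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng() * ceiling


def reconnect(connect, sleep=time.sleep, **backoff_kwargs):
    """Call the hypothetical `connect` until it succeeds or retries run out.

    Returns True on success, False if every attempt raised ConnectionError.
    """
    for delay in backoff_delays(**backoff_kwargs):
        try:
            connect()
            return True
        except ConnectionError:
            # Wait a random interval so clients do not all retry in lockstep.
            sleep(delay)
    return False
```

Because every client draws its own random delay, the retries spread out over time instead of arriving as a single wave of traffic.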
This recent outage exposed one interesting effect worth noting. Until now, the impact was limited to online cloud services such as our movie or dating services. But now, with the penetration of the Internet of Things (IoT) into our homes, the effects of such a cloud outage reach far beyond, into our own homes and daily utilities. This is nicely narrated in David Gewirtz’s piece on ZDNet: he tried voice-commanding his Amazon Echo (nicknamed “Alexa”) to turn on the lights and perform other home tasks, and was left unanswered during the outage. The loss of faith in the Alexas (they have two of them) that David described goes beyond the technology realm and into psychological effects, which extend beyond my field of expertise.
One conclusion could be that cloud computing is bad and should not be used. That would of course be the wrong conclusion, certainly when compared with outages in conventional data centers. As I highlighted in the past, following simple guidelines can significantly reduce the impact of such infrastructure outages on your cloud service. If you are running a mission-critical system, you may find that relying on a single cloud provider is not enough, and may wish to use a multi-cloud strategy to spread the risk, with disaster recovery policies between the providers. This will become increasingly important as the Internet of Things becomes ubiquitous in our homes and businesses, as heavily promoted by Amazon, Google, Samsung and the like, which combine IoT with their own cloud services.
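To make the multi-cloud idea a bit more concrete, here is a minimal sketch of ordered failover between providers. It assumes each provider is wrapped in a plain callable; `call_with_failover`, `AllProvidersFailed` and the provider names in the usage note are hypothetical and do not correspond to any real cloud SDK.

```python
from typing import Callable, Sequence, Tuple, TypeVar

T = TypeVar("T")


class AllProvidersFailed(Exception):
    """Raised when no provider in the list could serve the request."""


def call_with_failover(
    providers: Sequence[Tuple[str, Callable[[], T]]],
) -> Tuple[str, T]:
    """Try each (name, operation) pair in order and return the first success.

    Returns the name of the provider that answered together with its result.
    Raises AllProvidersFailed, listing every error, if all of them fail.
    """
    errors = []
    for name, operation in providers:
        try:
            return name, operation()
        except Exception as exc:  # a real system would catch narrower error types
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))
```

For example, given a `primary` callable that raises `TimeoutError` and a `secondary` callable that returns a value, `call_with_failover([("primary-cloud", primary), ("secondary-cloud", secondary)])` returns the secondary provider's name and result. The last entry in the list could just as well be a local, non-cloud handler. A production setup would also need health checks, data replication and a tested recovery plan, which this sketch leaves out.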
One thing is for sure: if you connect your door locks to a cloud-based service, make sure you keep a copy of the good old hard-copy key.
Follow Dotan on Twitter!