Welcome to our ongoing series covering downtime blunders around the world (wide web). We first covered a series of downtime incidents in 2017 here. And now we're making it a regular occurrence here on the Downtime Prevention Blog at Blue Matador. Until we help rid the world of the evils of service interruptions, we'll report on some of the more notable or interesting incidents here.
Amazon Alexa - March 3, 2018
When Amazon ran its clever Super Bowl ad in February, joking that Alexa had gotten "sick" and needed a replacement voice, the company didn't know it was foreshadowing a real event in which Alexa lost her voice. Because Alexa runs on AWS (why wouldn't it?), when AWS hiccups, so does, um, Alexa. For Echo users around the world, Alexa remained silent when queried. Whether they asked for the temperature or wanted to know the 12th digit of pi, Alexa gave everyone the cold shoulder for several hours. And when systems came back online later in the day, Alexa seemed a little slower than usual, taking up to a minute to answer a simple question like "what's one divided by zero?" They really could've used some AWS memory monitoring.
Redbox - Intermittently in March 2018
Redbox Instant (which you've probably never heard of) was Redbox's streaming service, replaced by Redbox on Demand in December 2017. With streaming now part of the business, when Redbox.com goes down, even for a few minutes, the DVD-rental company stands to lose a lot more than just a few minutes of uptime. Frustrated, impatient customers who want to watch a movie from the comfort of their homes go elsewhere, like one of the two dozen other streaming services out there.
Redbox.com availability graph from CurrentlyDown.com
Snapchat - March 17, 2018
When St. Patrick's Day comes around, people really like to go all out, like dyeing an entire urban river green. And social media is abuzz with people celebrating. Enter Snapchat, the "fastest way to share a moment," except when it isn't. For a company that can lose $1.3 billion in market value when an unhappy influencer tweets about its controversial UI redesign, even a minute of downtime is liable to send the masses packing to alternative messaging platforms.
So when the social network experienced around 8 hours of intermittent downtime on St. Patrick's Day, with reported problems including sending snaps (36% of reports), the app not connecting to the server (33%), and the app not starting or crashing (30%), you could say the company wasn't very lucky.
Downtime doesn't have to happen. Ready for your organization to receive predictive recommendations that prevent service outages from starting in the first place? Launching at the end of Q1 2018, Blue Matador's recommendation engine for preventing downtime predicts future service issues and tells DevOps engineers what to do about them.