Blog Post

AWS Outage Ahead of Black Friday

Enterprises running applications or services on AWS US East 1 were impacted by the outage, as were some of AWS's own products.

Catchpoint detected an AWS outage earlier today, 25th November 2020. Enterprises running applications or services on AWS US East 1 were impacted, and so were some of AWS's own products.

We noticed Amazon services such as Athena slowing down at 5 AM PST, followed by intermittent HTTP 500 errors that started around 5:15 AM PST.

By 5:30 AM PST, the fallout from the outage was evident. We noticed a drastic performance impact, and a major outage started at around 7 AM PST.

Amazon updated its status page, highlighting all the Amazon services that were impacted by this outage.


Services relying on Amazon services could only wait for Amazon to resolve the issue.

The outage was not limited to Amazon services alone; it also impacted enterprises using AWS US East 1 as their data center. However, they were able to mitigate the issue with minimal damage to their end-user experience.

For example, the enterprise below, whose origin is hosted on AWS, was also impacted by the outage. The impact on its production environment was mitigated because end users were hitting a CDN service that rerouted traffic around the problem. Changing the origin mapping so that content was fetched from an alternate origin helped maintain a better end-user experience.

This ensured the service remained available for end users. However, they did notice slower page load times.
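
The origin remapping described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not how any particular CDN implements it: the function `fetch_with_failover`, the `Fetcher` type, and the example hostnames are all hypothetical. The fetcher is passed in as a parameter so the logic can be exercised without real network calls.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical fetcher signature: (origin base URL, path) -> (HTTP status, body).
Fetcher = Callable[[str, str], Tuple[int, bytes]]


def fetch_with_failover(
    origins: Sequence[str], path: str, fetch: Fetcher
) -> Tuple[str, int, bytes]:
    """Try each origin in order and return the first non-5xx response.

    Returns (origin_used, status, body). Raises RuntimeError if every
    origin either raises a connection error or answers with a 5xx status.
    """
    last_error: Optional[Exception] = None
    for origin in origins:
        try:
            status, body = fetch(origin, path)
        except OSError as exc:  # covers connection failures and timeouts
            last_error = exc
            continue
        if status < 500:
            return origin, status, body
        last_error = RuntimeError(f"{origin} returned HTTP {status}")
    raise RuntimeError(f"all origins failed: {last_error}")


def simulated_fetch(origin: str, path: str) -> Tuple[int, bytes]:
    """Stand-in fetcher (assumption): the us-east-1 origin is failing."""
    if "us-east-1" in origin:
        return 500, b""
    return 200, b"<html>ok</html>"


# Hypothetical origin list: primary first, alternate second.
ORIGINS = [
    "https://origin-us-east-1.example.com",
    "https://origin-us-west-2.example.com",
]
```

Calling `fetch_with_failover(ORIGINS, "/", simulated_fetch)` skips the failing primary and serves the page from the alternate origin. The extra attempt against the failing origin is one reason users can still see slower page loads even when the failover works.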

Several SaaS solutions that were impacted by this outage notified their customers with the updates they got from AWS. For instance, GONG, a revenue intelligence platform used by the Catchpoint Customer Success team, was also impacted by this incident and sent us this communication:

The AWS outage also impacted several ecommerce websites. For example, GameStop’s page load performance dropped because of third-party content that was being served from AWS. The scatterplot graph below illustrates the spike in load time during the outage.

The outage serves as a timely reminder for organizations to evaluate and verify their infrastructure setup, including monitoring and failover strategies, before the upcoming Black Friday.

  1. Do you have monitoring set up to detect such outages?
  2. What is your backup plan in a similar scenario?
  3. Are you relying on third parties that may become a single point of failure?
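
As a starting point for the first question, here is a minimal sketch of how a window of synthetic check results could be classified to catch the pattern seen in this outage: a slowdown first, then intermittent HTTP 500 errors. The `Probe` type, the `assess` function, and the thresholds (2x baseline, 10% error rate) are illustrative assumptions, not Catchpoint's alerting logic.

```python
from dataclasses import dataclass
from statistics import median
from typing import Sequence


@dataclass(frozen=True)
class Probe:
    """One synthetic check result (hypothetical shape)."""
    status: int        # HTTP status code returned to the check
    elapsed_ms: float  # total response time in milliseconds


def assess(
    window: Sequence[Probe],
    baseline_ms: float,
    slow_factor: float = 2.0,     # assumed threshold: 2x the normal response time
    max_error_rate: float = 0.1,  # assumed threshold: more than 10% 5xx responses
) -> str:
    """Classify a window of probes as 'no-data', 'failing', 'slow' or 'healthy'."""
    if not window:
        return "no-data"
    # Share of probes in the window that came back with a server error.
    error_rate = sum(p.status >= 500 for p in window) / len(window)
    if error_rate > max_error_rate:
        return "failing"
    # Median is used so a single slow probe does not trigger an alert.
    if median(p.elapsed_ms for p in window) > slow_factor * baseline_ms:
        return "slow"
    return "healthy"
```

Running a check like this from several locations, and against each third-party host a page depends on, also helps answer the third question.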