Preparing for the unexpected: Lessons from the AJIO and Jio Outage

Explore the lessons from the AJIO and Jio outage, highlighting the need to prepare for the unexpected with Internet Performance Monitoring.

Just in the past few months, we've seen high-profile outages happen for all sorts of reasons—server glitches, network issues, and even that now-infamous configuration update that nearly broke the Internet. These incidents remind us how fragile our digital world can be. But it's not just software-related issues causing these disruptions. In the last week alone, two unfortunate incidents involving fires have taken down major websites in the Asia-Pacific region.

On September 10, 2024, a fire at a data center in Singapore caused a significant outage for Alibaba Cloud, affecting services for major companies like TikTok, ByteDance, and Lazada. This time, it's the Reliance-owned AJIO and Jio websites that have been hit, leaving millions of users unable to shop, pay bills, or access services. Let’s get into what happened.

What happened?

On September 17, 2024, Reliance Jio encountered a major network outage affecting customers in multiple regions of India and around the globe. The outage was first noticed when users began encountering connection timeouts while attempting to access both the AJIO and Jio websites. It was resolved around 05:42 EDT.

Traceroutes conducted from various locations indicated multiple hop failures, suggesting possible network issues along the route.  
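As a rough illustration of how hop failures show up in traceroute data, here is a minimal Python sketch that classifies each hop in a block of traceroute output. The sample output, addresses, and the `summarize_hops` helper are hypothetical and exist only for this example; they are not taken from the measurements discussed in this post.

```python
import re

# Hypothetical traceroute output; hosts and addresses are made up for illustration.
SAMPLE = """\
traceroute to example.com (203.0.113.10), 30 hops max, 60 byte packets
 1  192.0.2.1 (192.0.2.1)  1.204 ms  1.101 ms  1.087 ms
 2  198.51.100.7 (198.51.100.7)  9.512 ms  9.433 ms  9.610 ms
 3  * * *
 4  * * *
 5  203.0.113.10 (203.0.113.10)  48.220 ms  47.901 ms  *
"""

# A hop line starts with the hop number followed by the probe results.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")


def summarize_hops(output):
    """Return (hop_number, status) pairs; status is 'ok', 'partial' or 'timeout'."""
    hops = []
    for line in output.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # skip the header line
        hop, rest = int(match.group(1)), match.group(2)
        lost = rest.split().count("*")  # probes that got no reply
        answered = len(re.findall(r"[\d.]+\s+ms", rest))  # probes with an RTT
        if answered == 0:
            status = "timeout"
        elif lost > 0:
            status = "partial"
        else:
            status = "ok"
        hops.append((hop, status))
    return hops


for hop, status in summarize_hops(SAMPLE):
    print(hop, status)
```

Several consecutive hops with no replies are only a hint, since some routers simply do not answer traceroute probes. Comparing runs from multiple vantage points helps separate real failures from silent hops.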

These two sites were not the only ones affected. The Reliance Digital website also went down, displaying an error message served through Akamai Edgesuite:

“An error occurred while processing your request.”

Broader impact

This incident did not affect only Jio's network. Our data reveals this outage also had a wider impact on other ISPs.  

[Figure: scatter plot of Ping Round Trip Time (RTT) by ISP]

The scatter plot above shows the widespread impact of the outage across multiple ISPs. The issue extended beyond Reliance Jio, affecting others like Airtel, Vodafone, and BSNL. The spikes in Ping Round Trip Time (RTT) indicate network delays, which suggests that the outage caused a ripple effect, leading to connectivity issues and delays across various networks.
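To make the idea of an RTT spike concrete, here is a minimal Python sketch that flags samples sitting far above a series' median, using the median absolute deviation (MAD) as the yardstick. The ISP labels, RTT values, the `rtt_spikes` helper, and the threshold of 3 are all assumptions chosen for illustration, not data or logic from our platform.

```python
from statistics import median


def rtt_spikes(samples, threshold=3.0):
    """Return RTT samples that sit more than `threshold` MADs above the median."""
    center = median(samples)
    mad = median(abs(x - center) for x in samples)
    if mad == 0:
        return []  # no spread in the series, so nothing stands out
    return [x for x in samples if (x - center) / mad > threshold]


# Hypothetical ping RTTs in milliseconds, grouped by a made-up ISP label.
rtt_by_isp = {
    "isp-a": [40, 42, 41, 43, 39, 180, 210],
    "isp-b": [55, 54, 56, 57, 55, 53, 56],
}

for isp, samples in rtt_by_isp.items():
    print(isp, rtt_spikes(samples))
```

A median-based baseline is used here because a few extreme samples would drag a plain mean and standard deviation upward and hide the very spikes being looked for.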

[Figure: Sankey diagram of network paths across ISPs to the endpoint]

This Sankey diagram shows how traffic from multiple ISPs was affected on its way to an endpoint. It highlights how the outage disrupted network flow between regions.

Root cause

Reuters reported that a fire at Reliance’s data center caused the nationwide outage. A Reliance Jio spokesperson confirmed the outage and claimed the issue had been fully resolved.  

Impact and key learnings

With Jio leading India's telecom space with nearly 489 million subscribers, the impact of this outage was massive. Millions of users could not shop on AJIO, pay bills, or even access essential services. Frustration quickly spread across social media—imagine the outrage on X, the wave of memes, and the flood of negative comments from users demanding answers and solutions.  

This situation serves as a stark reminder for businesses that large-scale disruptions can happen anytime, even due to unexpected events like a fire. Preparing for such unforeseen incidents requires more than just reactive solutions. To mitigate the impact of similar disruptions in the future, companies need to focus on two key areas:

  • Gain full visibility across networks: The outage's impact on other ISPs demonstrates the importance of having visibility into the entire Internet Stack, which includes all network components, not just internal systems. Understanding the performance and health of external dependencies—such as CDN, DNS, and ISP networks—is crucial. This broader visibility helps companies quickly identify where issues are occurring and whether they originate within their own network or from external partners.
  • Proactive monitoring is key: This incident highlights the need for proactive monitoring across the Internet Stack. Early detection of issues like packet loss, latency spikes, and network congestion can allow companies to address potential problems before they escalate into full-blown outages.
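The question of whether an issue originates in one network or somewhere shared can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the `localize_fault` helper, the ISP labels, and the 50% failure ratio are hypothetical, and a real monitoring setup would weigh far more signals (DNS, CDN, BGP, and so on).

```python
from collections import defaultdict


def localize_fault(probes, failure_ratio=0.5):
    """Classify probe results for one endpoint.

    probes: iterable of (isp, succeeded) pairs from synthetic checks.
    Returns a (verdict, degraded_isps) tuple.
    """
    totals = defaultdict(int)
    failures = defaultdict(int)
    for isp, succeeded in probes:
        totals[isp] += 1
        if not succeeded:
            failures[isp] += 1

    # An ISP counts as degraded when enough of its probes failed.
    degraded = sorted(
        isp for isp in totals if failures[isp] / totals[isp] >= failure_ratio
    )
    if not degraded:
        return "healthy", degraded
    if len(degraded) == len(totals):
        # Every vantage point is failing: suspect the endpoint or a shared dependency.
        return "endpoint or shared dependency", degraded
    return "isp-specific", degraded


# Hypothetical probe results: (ISP label, did the check succeed?)
probes = [
    ("isp-a", False), ("isp-a", False),
    ("isp-b", True), ("isp-b", True),
]
print(localize_fault(probes))
```

When every vantage point fails at once, the endpoint or a dependency shared by all paths is the more likely culprit. Failures confined to one provider point toward that provider's network instead.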

Gain full visibility into your Internet Stack with Catchpoint

We’ve built our Internet Performance Monitoring (IPM) platform from the ground up to deliver deep and wide visibility into the Internet Stack, enabling you to find and fix disruptions before your business is affected. Our cloud-native platform ensures Internet Resilience across your organization with the following industry-leading features and capabilities:

  • Unparalleled worldwide and regional visibility through our Global Observability Network with over 2,700 nodes from more than 360 providers in 101 countries – with more being added all the time.
  • Proactive incident management so IT teams can identify issues across public and private networks and application layers, find the root cause, and triage fast.
  • AI-powered tools, including:
    • Internet Sonar so you can answer the question, “Is it me or something else?” quickly.
    • Internet Stack Map for instant awareness of critical service or application issues.

Learn more about preventing outages from our guide, or test drive Catchpoint for yourself in our guided product tour.  
