Blog Post

The SRE Report 2025's Call to Action

Published
January 13, 2025

The SRE Report is now seven years old. I’ve had the honor and privilege of authoring it for the last five years. This 2025 version included working with some amazing individuals like Kurt Andersen and Denton Chikura. My heartfelt thanks go to them for shouldering the weight of what is both a labor of love and an often daunting, procrastination-inducing marathon of analysis.

I’d also like to thank all the contributors to the View from the field sections of the report: Martin Barry, Laura de Vesine, Dave O’Connor, Heinrich Hartmann, Robert Barron, and Sergey Katsev. Their perspectives ground the data in real-life contexts, bridging the gap between survey data and the day-to-day realities of Site Reliability Engineering.

Check out this short intro video from Kurt, Sergey, and me, where we share what the report is all about and what you can expect.

Of course, the true heroes of the SRE Report are the SREs and reliability practitioners who take the time to respond to our survey. The report lives or dies by the truth of their responses. This year’s edition is no exception, reflecting input from professionals around the world, spanning a diverse range of company sizes, roles, and managerial responsibilities. Thank you for helping make the SRE Report the longest-running, most authentic, and accurate pulse of the reliability engineering community.

An honorable mention must go to the DORA Report. The SRE Report doesn't exist in a vacuum. It draws on trusted industry research, including insights from the 2024 DORA Report, to present a holistic view of key challenges, such as AI’s impact on toil (more on that later).

My most surprising discoveries from The SRE Report 2025

When we release the report, we are always asked “What was most surprising or controversial to you?” In no particular order, here are my most striking takeaways from this year’s edition.

Insight I: It’s official, slow is the new down

Each year, we highlight an emerging trend in the field. In past reports, we’ve explored topics like Platform Operations, the concept of Total Experience, and the multi-party dilemma. This year, we kicked off the report with something a little softer: investigating whether popular hallway expressions hold true in practice.

They say that if you repeat something loudly and often enough, it won’t be long before it’s widely accepted as true.  

When it comes to the phrase ‘Slow is the new down,’ however, a majority of organizations accept its validity: 53% agree with the expression, even though only 21% say they have heard it before.

The implication for the wider industry is clear: poor performance is now seen as just as harmful as outright downtime. Reliability is no longer about uptime alone; it’s about delivering consistent, fast experiences.
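
One way to make this concrete is to fold latency into an availability measurement, so that a request that succeeds too slowly counts against reliability just like one that fails. The sketch below is illustrative only: the `Request` type, the `good_ratio` function, the sample data, and the 1,000 ms threshold are all assumptions made for the example, not something taken from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    ok: bool            # did the request return a successful response?
    latency_ms: float   # how long the response took


def good_ratio(requests, slow_threshold_ms=1000.0):
    """Fraction of requests that both succeeded and were fast enough.

    The threshold is a hypothetical value; pick one that matches
    what your users actually experience as "slow".
    """
    if not requests:
        raise ValueError("no requests to evaluate")
    good = sum(1 for r in requests if r.ok and r.latency_ms <= slow_threshold_ms)
    return good / len(requests)


# Made-up sample: three successes (one of them slow) and one failure.
sample = [
    Request(ok=True, latency_ms=120),
    Request(ok=True, latency_ms=4500),   # succeeded, but too slow to count as good
    Request(ok=False, latency_ms=80),    # failed outright
    Request(ok=True, latency_ms=300),
]

uptime_only = sum(r.ok for r in sample) / len(sample)
print(f"uptime-only view: {uptime_only:.0%}")                   # 75%
print(f"slow-counts-as-down view: {good_ratio(sample):.0%}")    # 50%
```

On this made-up sample, the same traffic looks 75% healthy when only failures count and 50% healthy once slow responses are treated as failures, which is the gap the expression points at.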

Insight II: Toil Levels Rise for First Time in Five Years (So Much for AI)

For seven years, the survey underpinning the SRE Report has relied on consistent methodology, allowing us to track trends over time. Among the survey’s questions, the "time spent" ones act as benchmarks for organizations. So when this year’s data revealed that toil levels had risen for the first time in five years, it was a wake-up call.

What makes this finding even more striking is that time spent on engineering and on-call activities remains steady. The obvious question is: why is toil rising?

Could it be AI?

Many had hoped AI would reduce toil, but reality, as always, is more complicated. The 2024 DORA Report suggests AI accelerates value realization, potentially leading us to fill newfound capacity with additional operational tasks.

Regardless, it’s a red flag. Toil above 50%—Google’s recommended ceiling—restricts organizations’ ability to focus on proactive development. Yet the data shows operational workloads creeping higher. If toil is rising without corresponding gains in system resilience or optimization, are we investing in the right areas? Or are we stuck firefighting, unable to break free to drive long-term value?
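
For teams that want to check themselves against that ceiling, the arithmetic is simple once time is tracked by category. Here is a minimal sketch, assuming hypothetical category names and made-up hours; the only figure taken from the text above is the 50% ceiling, and which categories count as toil is a judgment each team has to make for itself.

```python
TOIL_CEILING = 0.50  # the 50% ceiling referenced above


def toil_share(hours_by_category, toil_categories):
    """Return toil hours as a fraction of all tracked hours."""
    total = sum(hours_by_category.values())
    if total <= 0:
        raise ValueError("no tracked hours")
    toil = sum(h for cat, h in hours_by_category.items() if cat in toil_categories)
    return toil / total


# Made-up week for one engineer; the category names are illustrative.
week = {
    "manual_deploys": 9.0,
    "ticket_triage": 8.0,
    "alert_handling": 5.0,
    "engineering_projects": 14.0,
    "on_call_followup": 4.0,
}
toil = {"manual_deploys", "ticket_triage", "alert_handling"}

share = toil_share(week, toil)
print(f"toil share: {share:.0%}")
print("over ceiling" if share > TOIL_CEILING else "within ceiling")
```

Tracking the same number week over week matters more than any single reading, since the concern raised here is the direction of the trend.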

Insight VII: Acknowledge the Gap to Fix the Gap

If there’s one takeaway from this year’s SRE Report that challenges us all, it’s the stark gap in how reliability practices are perceived and implemented across organizational ranks. It is, to me, the most provocative takeaway from The SRE Report 2025. More than merely an insight, the need to acknowledge the gap to fix the gap is a call to action for the entire reliability engineering community.

One standout example of the misalignment is the question of testing reliability incident preparedness through chaos engineering exercises like simulated disruptions, failovers, and tabletop scenarios. While only 37% of respondents overall agreed that their teams regularly engage in these practices (the lowest agree score of this section), the breakdown by rank paints an even more striking picture.

The divide between individual contributors and higher management is sharp. Individual contributors and team leads report significantly lower levels of preparedness testing compared to senior management. This divergence suggests that while leadership may believe these practices are happening—or are a priority—the reality on the ground is very different.

And that’s just one example.  

A challenge to the community

The consequences of such a disconnect are far-reaching. The report states, “Without a clear, shared understanding of the current situation, it becomes difficult to set common goals and agree on the steps needed to achieve them. This misalignment can result in wasted resources, duplicated efforts, and missed opportunities.”

But this isn’t just a problem to lament—it’s an opportunity. The SRE Report 2025 is more than a collection of data points—it’s a reflection of where we are as a community and a roadmap for where we need to go. The gaps highlighted in the report provide a starting point for organizations to open up a conversation between leadership and practitioners, fostering a shared understanding of what’s really happening and what needs to change.  

It’s a challenge to all of us—individual contributors, managers, and leaders alike—to acknowledge these gaps, have the necessary conversations, and take action. Whether it’s addressing rising toil, rethinking how we define reliability, or addressing the disconnect between ranks, bridging the gap starts here.

Download The SRE Report 2025 or read it online (no registration required).
