customer story

Outbrain

Outbrain Uses Catchpoint Active Observability to Validate and Ensure World-class Uptime and Performance

Outbrain powers the discovery feeds on high-profile media channels around the world. Advertisers use Outbrain’s world-class Amplify platform to effectively reach targeted audiences on these third-party websites. By delivering on average more than 10 billion recommendations daily, the platform helps advertisers connect with consumers on the open web and helps site visitors discover relevant content. Meanwhile, Outbrain’s publishers, which include CNN, MSN, Sky News, and others, can use Outbrain’s Engage solution to increase engagement. As part of its goal to deliver a seamless experience of its discovery feeds on many of the world’s best-known media channels, Outbrain partners with Catchpoint to:

  • Obtain a global view of how its widget is performing beyond its own data centers.
  • Gain insight into the application layer.
  • Confirm its services are running as expected.
  • Ensure its APIs are connected.
  • Pinpoint specific performance problems and quickly get to the root cause of issues.
  • Understand historical trends in relation to availability and reliability.

Employees: 863
Revenue: $767 million
Headquarters: New York, NY
Industry: SaaS Application Monitoring

Our business depends on the trust between us and our publisher and advertiser partners. We want to be proactive and fast about resolving issues, and monitoring with Catchpoint is one of the key ways we can do that.

Guy Kobrinsky, VP Cloud Platform

Problem

Outbrain’s platform calls upon a number of technological capabilities to display links to paid and organic content on pages within third-party websites. The company designed its plug-in to load asynchronously on publisher sites, meaning it loads independently of the page content. In this way, Outbrain won’t slow page performance. That said, a number of factors can influence how quickly the widget loads.

Moreover, Outbrain’s products are growing increasingly sophisticated. Rather than simply display static content, the widget now enables a feed-like experience that can feature a carousel and video. Regardless of the geography, time of day, number of site visitors, and type of experience Outbrain is delivering, it needs to do so quickly and in a way that seamlessly blends into the page experience, without fail.

As Guy Kobrinsky, VP Cloud Platform for Outbrain, explains, “Our monitoring with Pingdom focuses on what happens in our data centers but we needed a global view of how our widget performs beyond our data centers.”

Solution

Outbrain previously used a third-party vendor for performance monitoring, but it wasn’t sufficient. As Nir Kriss, head of production engineering at Outbrain, says, “The vendor wasn’t evolving to support new browsers and provide more sophisticated functionality.” After running Catchpoint in parallel with other service monitoring frameworks for a while, Outbrain completely migrated to Catchpoint. Outbrain leverages Catchpoint in a number of ways, including the following:

Validate platform performance. Catchpoint complements Outbrain’s use of Pingdom. The company uses Pingdom to run basic tests at the network level, and Catchpoint to monitor at the application layer. “With Catchpoint, we can confirm our services are up and running and APIs are connected. It also gives us a history of our uptime,” Kobrinsky says.

While existing partners know that Outbrain is resilient, the company can share its uptime history when striking new partnerships. “We can confidently say our service will never be totally down but specific components might experience a performance degradation. We can use Catchpoint to verify that even if something is not working as expected, our platform still performs as needed,” Kobrinsky says.

Empower developers. Outbrain’s engineers use Catchpoint to pinpoint problems, making it easier to figure out the root cause of issues. To help understand whether a problem is associated with a domain name system (DNS) server, content delivery network (CDN) provider, agent proxy, an internal router in the Kubernetes cluster, or an application related to R&D, Outbrain created four Catchpoint tests. These allow it to monitor performance from its data centers to site visitors. A graph for each test is displayed on a wiki page that Outbrain created for its on-call engineers. This makes it easy for them to see issues at a glance and understand next steps to take, such as escalating to the relevant development team. “The on-call team includes people from different teams, not all of whom are familiar with our infrastructure. This makes it easy even for them to zero in on issues within a minute,” says Kriss. Going forward, Outbrain will map dependencies between different test results to automatically route issues to the correct team based on thresholds, without using the wiki page.
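The threshold-based routing described above could be sketched roughly as follows. This is an illustrative sketch only, not Outbrain’s or Catchpoint’s implementation: the test names, team names, and millisecond thresholds are all hypothetical, and the code assumes each test reports a single response time. It walks the layers in dependency order and assigns the issue to the first layer whose test breaches its threshold.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class LayerCheck:
    """One monitored layer; all names and limits here are hypothetical."""
    test_name: str          # identifier of the synthetic test
    owner_team: str         # team to notify when this layer breaches
    max_response_ms: float  # threshold above which the layer is unhealthy


# Ordered from the outermost dependency inward, so an upstream failure
# is attributed to its own layer rather than to everything behind it.
CHECKS = [
    LayerCheck("dns", "network-team", 200.0),
    LayerCheck("cdn", "edge-team", 500.0),
    LayerCheck("k8s-router", "platform-team", 800.0),
    LayerCheck("application", "rnd-team", 1500.0),
]


def route_issue(results: Dict[str, float]) -> Optional[str]:
    """Return the team owning the first breached layer, or None if healthy.

    `results` maps a test name to its latest observed response time in ms.
    Tests with no result are skipped rather than treated as failures.
    """
    for check in CHECKS:
        observed = results.get(check.test_name)
        if observed is None:
            continue
        if observed > check.max_response_ms:
            return check.owner_team
    return None
```

In this sketch, a slow CDN test alongside a slow application test routes to the CDN owner, on the assumption that the downstream slowness is a symptom of the upstream one.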

Monitor APIs. Outbrain’s APIs behave differently depending on the business rules of the publisher’s site. The APIs might power a layout that looks like a carousel on one site or page and a feed or text recommendations with thumbnails on another. As Outbrain grows, its APIs will become more complex, moving beyond a classic request-response model to become increasingly feed-like. Even then, it wants to make sure it can holistically monitor its services and application layer. “We want to make sure that we can monitor our variety of APIs and can do that with Catchpoint,” says Kobrinsky.

Integrate with key tools. Catchpoint has become an essential tool in Outbrain’s digital experience monitoring efforts, empowering the company to ensure its platform works as intended. In addition, Outbrain sees more value from other monitoring tools by integrating them with Catchpoint. With Catchpoint integrated with Slack, Outbrain’s on-call teams are notified of pressing issues requiring their attention.

Meanwhile, seamless integration between PagerDuty and Catchpoint means Outbrain’s developer teams can easily pinpoint potential problems and confidently escalate to the appropriate team. “I appreciate the context from Catchpoint. With a better understanding of the reason for a problem, the on-call person doesn’t need to wake someone up in the middle of the night unnecessarily to try to troubleshoot,” Kriss explains.

Results

According to Kobrinsky, uptime is expected in Outbrain’s industry. The latest data from an industry DevOps site reliability engineering (SRE) survey found that uptime/availability and response time/performance were tracked by 85 percent and 77 percent of respondents, respectively. Since the company is measured by strict KPIs and needs to deliver revenues and a good user experience, 100 percent uptime is imperative. Fortunately, Outbrain is quite good at pinpointing problems quickly. “We almost take this capability for granted because of Catchpoint,” he says.

To that end, Outbrain communicates potential and known problems to relevant stakeholders as fast as possible. “Our business depends on the trust between us and our publisher and advertiser partners. We want to be proactive and fast about resolving issues, and monitoring with Catchpoint is one of the key ways we can do that,” continues Kobrinsky.

In fact, Outbrain appreciates its strong working relationship with Catchpoint. “Catchpoint’s support team is incredibly responsive. In the five-plus years we’ve been a Catchpoint customer, the support team has never failed us and that’s important in a partnership like this,” concludes Kriss.