
Website Speed Optimization Tools

How do you choose the best website speed optimization tools for your organization? You need to select one that provides visibility into every factor that impacts website performance.

Multiple components work together when a webpage loads in the user’s browser. Here's a typical flow of events after a user enters a URL in a browser:

  1. The browser resolves the domain through the DNS server. 
  2. Static assets like HTML, JavaScript, and CSS are retrieved from the CDN or web server and traverse multiple Internet service providers to reach the end-user. 
  3. The web app loads in the browser. 
  4. The app then requests data from the backend API, which queries the database, returns the data, and updates the UI.
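The steps above map onto the phases the browser's Navigation Timing API reports for a page load. As a rough sketch, here is how those phases could be broken into per-stage durations; the field names follow the W3C Navigation Timing spec, and the sample values are hypothetical:

```javascript
// Break a Navigation Timing-style entry into per-phase durations (ms).
function phaseDurations(t) {
  return {
    dns: t.domainLookupEnd - t.domainLookupStart,   // step 1: DNS resolution
    connect: t.connectEnd - t.connectStart,         // TCP (and TLS) handshake
    ttfb: t.responseStart - t.requestStart,         // server think time
    download: t.responseEnd - t.responseStart,      // step 2: asset transfer
    domProcessing: t.domComplete - t.responseEnd,   // steps 3-4: app loads and renders
  };
}

// Hypothetical timing entry (all values in ms since navigation start).
const sample = {
  domainLookupStart: 5, domainLookupEnd: 40,
  connectStart: 40, connectEnd: 120,
  requestStart: 121, responseStart: 300,
  responseEnd: 450, domComplete: 1400,
};

console.log(phaseDurations(sample));
// { dns: 35, connect: 80, ttfb: 179, download: 150, domProcessing: 950 }
```

In a browser you would feed `performance.getEntriesByType('navigation')[0]` into a function like this; the point is that each step of the flow is individually measurable.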

It is also important to note that your website’s performance depends on numerous interconnected components:

  • Multiple service provider networks
  • Various Internet protocols (HTTP, DNS, BGP, etc.)
  • Third-party services and APIs
  • Content delivery networks (CDNs)
  • Cloud services and data centers

Given the workflow complexity, optimizing web performance is more than just refining website code. Every aspect, from DNS resolution to CDN delivery, web app rendering, and API response, impacts speed. Therefore, when selecting website speed optimization tools, we must choose one that provides visibility into every step to enhance performance across the entire chain.

Single sign-on sequence diagram as a proxy for the transaction path

This article explores the different features of website performance optimization tools and discusses their significance in SEO and user engagement.

Summary of key measurement features of website speed optimization tools

Here are some measurements you can expect in advanced website speed optimization tools along the transaction path between the end users and the web servers.

| Desired measurement collection feature | Measurement description |
| --- | --- |
| Core Web Vitals (CWV) | Helps measure Google's standard metrics for page experience, including interactivity (Interaction to Next Paint, which replaced First Input Delay in March 2024), speed (Largest Contentful Paint), and visual stability (Cumulative Layout Shift), to assess and optimize user experience. |
| DNS performance | Monitors the complete DNS resolution chain by tracking query response times, nameserver health, record accuracy, and propagation times. Provides visibility into DNS-specific metrics like DNSSEC validation, zone transfer success rates, and the effectiveness of DNS-based routing decisions. |
| Network performance | Evaluates the performance of network infrastructure and third-party services, including CDN performance, network transit paths, peering connections, and third-party API availability. Helps monitor metrics like latency, packet loss, and throughput across various service provider networks. |
| Emulated transactions | Simulates multi-step transactions from different points across various service provider networks to gain a global perspective on end-to-end user experience. |
| Real user transactions | Continuously monitors how real users interact with the website using live browser data. Also captures detailed information about user sessions, including page load times and errors. |
| Internet Performance Monitoring (IPM) | Offers comprehensive visibility into all aspects of website performance by combining CWV, synthetic monitoring, RUM, API monitoring, and network monitoring into a unified view. Helps identify and address performance bottlenecks across the entire Internet Stack. |

Critical layers of the Internet Stack

The Internet Stack encompasses all the elements required to deliver a digital experience to end users. It includes infrastructure basics such as routers, firewalls, and services operated by Internet Service Providers (ISPs), along with core services like BGP, DNS (Domain Name System), CDNs (content delivery networks), and SASE (Secure Access Service Edge). On top of these sit individual vendors providing payment processing, analytics engines, video hosting platforms, third-party libraries, and more, all accessed via application programming interfaces (APIs). Together, they form a fragile web of interconnected services we take for granted.

Layers in the Internet Stack

To understand this better, consider a website that relies on more than just its server to load and display content. To serve requests, it interacts with multiple third-party services and servers. 

For context, the following image illustrates a node map of requests made when accessing NYTimes.com. The map shows the various third parties and their communication points, how data flows between them, and how quickly the different domains respond.

A node map of requests made when accessing NYTimes.com

Internet Performance Monitoring (IPM)

Many website speed optimization tools focus on website design and architecture while neglecting the Internet and services provided by third parties. Similarly, traditional application performance monitoring (APM) tools primarily track server-side metrics. 

IPM overcomes these limitations by providing a comprehensive approach to understanding and optimizing digital service delivery across the Internet Stack. It covers everything from server performance and network paths to third-party services and the final rendering in users' browsers.

A mature IPM practice involves layering different monitoring approaches. Each layer adds deeper insights, helping you move from basic performance tracking to comprehensive observability across a website’s entire digital infrastructure. 

The following section discusses the measurements in detail and how the right tool can help you optimize a website’s speed. 

{{banner-42="/design/banners"}}

Core Web Vitals

Web Vitals is a Google initiative to quantify end-user experience on web pages. Core Web Vitals are the subset of metrics that measure real-world user experience in terms of a web page's loading speed, interactivity, and visual stability. For example:

  • Largest Contentful Paint (LCP) measures when page navigation appears complete to the site visitor. Google recommends an LCP of 2.5 seconds or less.
  • Interaction to Next Paint (INP) is the responsiveness metric that replaced First Input Delay (FID) in March 2024. INP assesses website responsiveness to user interactions throughout the entire page lifecycle. Google recommends an INP of 200 milliseconds or less. 
  • Cumulative Layout Shift (CLS) measures a page's visual instability as the customer sees it. A good CLS score is 0.1 or less.
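To make these thresholds concrete, here is a small sketch that rates a measured value against Google's published boundaries (the "needs improvement" upper bounds are 4 seconds for LCP, 500 ms for INP, and 0.25 for CLS):

```javascript
// Classify Core Web Vitals values against Google's published thresholds.
// Each entry is [good upper bound, needs-improvement upper bound].
const THRESHOLDS = {
  LCP: [2500, 4000],   // ms
  INP: [200, 500],     // ms
  CLS: [0.1, 0.25],    // unitless layout-shift score
};

function rate(metric, value) {
  const [good, poor] = THRESHOLDS[metric];
  if (value <= good) return 'good';
  if (value <= poor) return 'needs improvement';
  return 'poor';
}

console.log(rate('LCP', 1800)); // good
console.log(rate('INP', 350));  // needs improvement
console.log(rate('CLS', 0.31)); // poor
```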

Core Web Vitals key metrics

Sites with optimized core web vitals have lower bounce rates and improved search rankings. Google conducted case studies that demonstrated how improving core web vitals led to an increase in website adoption. For example, Vodafone recorded an 8% sales increase by improving LCP by 31%.

Consider choosing website speed optimization tools that measure Core Web Vitals. Here are some example tools and the different ways they measure CWV.

Lighthouse - Chrome Dev Tools

Lighthouse (LH) is integrated into Chrome DevTools and can be accessed using Inspect in Chrome. In the "Lighthouse" tab, you can configure the audit settings and choose between a mobile or desktop audit, depending on your target audience.

To understand its features, let us take an example of inspecting performance metrics for a Wikipedia article through Lighthouse.

Chrome Web Vitals measured for wikipedia.org

While Lighthouse scores are a decent starting point for gauging a site’s performance, they lack deeper context. 

The report helps identify areas of the website that need improvement based on Google's best practices. However, Lighthouse only provides a snapshot of the site’s performance at a single moment in time, which means it cannot report trends or provide the deeper analytics needed to reveal how a site’s performance is evolving. 

Lighthouse also measures a site’s performance under idealized conditions. It typically runs locally or in a simulated environment, so it does not capture the full nuances of real-world user experience. Because page performance is measured in a controlled environment, key factors like network variability, device differences, and user locations may not be accounted for. 

Related read: Learn how you can use Catchpoint to run Lighthouse reports as a custom monitor. The solution reports against multiple URLs, captures the metrics that are important to you, and trends those metrics over time. 

{{banner-34="/design/banners"}}

Web Vitals API Library

Google's web-vitals library helps measure Web Vitals data from real users. The example snippet below batches Core Web Vitals metrics and reports them when the page is backgrounded or unloaded. You can integrate it into your website to measure performance metrics from real users and analyze the results via your own analytics API.

import {onCLS, onINP, onLCP} from 'web-vitals';

const queue = new Set();
function addToQueue(metric) {
  queue.add(metric);
}

function flushQueue() {
  if (queue.size > 0) {
    // Replace with whatever serialization method you prefer.
    // Note: JSON.stringify will likely include more data than you need.
    const body = JSON.stringify([...queue]);

    // Use `navigator.sendBeacon()` if available, falling back to `fetch()`.
    (navigator.sendBeacon && navigator.sendBeacon('/analytics', body)) ||
      fetch('/analytics', {body, method: 'POST', keepalive: true});

    queue.clear();
  }
}

onCLS(addToQueue);
onINP(addToQueue);
onLCP(addToQueue);

// Report all available metrics whenever the page is backgrounded or unloaded.
addEventListener('visibilitychange', () => {
  if (document.visibilityState === 'hidden') {
    flushQueue();
  }
});

You can collect data from real users but need your own analytics service and storage solution for analysis.

WebPageTest

WebPageTest is a Catchpoint product that gives a detailed breakdown of page elements so you can identify performance issues and improve the user experience.

Measuring page performance metrics using WebPageTest

With the help of a filmstrip reel view, you can visualize the progression of the page load over time. You can identify when the LCP occurs and how the layout shifts.

Filmstrip WebPageTest

The Waterfall chart provides a detailed breakdown of how different resources (e.g., scripts, images) impact the page load performance so you can diagnose what might be affecting LCP or other metrics.

Waterfall chart created using WebPageTest

WebPageTest offers a free-forever service and a paid tier with more advanced features that you can learn about here. Using the free service is as simple as pasting a URL in a field on the website.

Network transit monitoring

Network latency measures the time it takes data to travel between a user and a website. Portions of the traffic traverse local area networks in data centers or public clouds, which are more directly under the website operator's control. The portions outside that control, such as segments traversing the public Internet, are harder to monitor but can affect service quality just as much.

Factors like routing efficiency, congestion, and physical distance affect this latency. Measuring network transit latency involves analyzing packet loss, jitter, and round-trip times. A typical approach is to gather telemetry data and gradually improve network KPIs and capacity planning. Even if you don't fully understand the technical details, there are many tools that can reveal if network issues are impacting your web app. Once you know there's a problem, you can work with network experts to diagnose and fix it. 
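As an illustration of the measurements just mentioned, the sketch below summarizes a batch of hypothetical round-trip-time probes into average RTT, jitter (here simplified to the mean absolute difference between consecutive samples), and packet loss:

```javascript
// Summarize round-trip-time probes: average RTT, jitter, and packet loss.
// `null` marks a probe that never came back (a lost packet).
function summarizeProbes(rtts) {
  const ok = rtts.filter((r) => r !== null);
  const lossPct = ((rtts.length - ok.length) / rtts.length) * 100;
  const avgRtt = ok.reduce((a, b) => a + b, 0) / ok.length;

  // Jitter as mean absolute difference between consecutive successful samples.
  let jitter = 0;
  for (let i = 1; i < ok.length; i++) jitter += Math.abs(ok[i] - ok[i - 1]);
  jitter /= Math.max(ok.length - 1, 1);

  return { avgRtt, jitter, lossPct };
}

// Ten hypothetical probes (ms); two were lost in transit, one spiked.
const probes = [42, 44, null, 41, 95, 43, null, 45, 44, 42];
console.log(summarizeProbes(probes));
// { avgRtt: 49.5, jitter: ~16.6, lossPct: 20 }
```

Real monitoring tools compute these continuously across many vantage points; the arithmetic, however, is this simple.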

{{banner-39="/design/banners"}}

Core Internet services layer monitoring  

A monitoring tool that covers the core Internet services should include the capabilities listed below. Note that this list isn’t exhaustive but highlights the most common services that could lead end-users to experience sub-optimal website performance.

| Service | Importance | Example failure | Testing parameters |
| --- | --- | --- | --- |
| CDN (Critical) | Vital for fast website loading, especially for geographically distant users. Caches content closer to users. | CDN fails to optimize images; slow loading times due to inefficient caching/compression of JavaScript files. | Cache hit ratios, geographic distribution analysis, content type performance (images, videos, dynamic/static content). |
| DNS (Critical) | The Internet's address book, translating domain names to IP addresses for website connections. | Slow loading times, website downtime, or users misdirected to potentially malicious sites due to incorrect records. | Response times, record accuracy, server health. |
| FTP (Essential) | Enables file transfers, often used for backend synchronization or user uploads/downloads. | Failed file transfers, data corruption, security vulnerabilities, outdated content, or user errors when transferring files. | Connection speed, file integrity, secure authentication/authorization. |
| NTP (Essential) | Ensures accurate time synchronization across servers, crucial for security and data consistency. | Disrupted server communication, application errors, and security vulnerabilities due to clock drift. | Time accuracy, server connectivity, stratum levels. |
| WebSocket (Essential) | Supports real-time communication features like chat, online games, and collaborative tools. | Dropped connections, delayed messages, and poor user experience in real-time applications. | Connection establishment, message delivery, error handling. |
| SSL (Essential) | Secures website connections with encryption, protecting user data and building trust. | Browser warnings, deterred users, damaged site reputation due to expired/invalid certificates. | Certificate validity, protocol support (e.g., TLS 1.3), cipher strength. |
| SMTP (Essential) | Enables email communication for user registration, password resets, and notifications. | Failed notifications, password reset problems, and disruptions in user registration due to email sending failures. | Email delivery, authentication, error handling. |

API monitoring

API monitoring is essential for data-driven web applications, as API issues can negatively impact user experience. High API latency also degrades Core Web Vitals and other performance metrics. 

When selecting a tool, you should ensure it can measure latency, as typically done in API testing, and track key metrics such as response time, error rate, consistency, availability, and throughput in near real-time. Additionally, the tool should allow the creation of custom dashboards tailored to specific use cases to monitor these API metrics.

The parameters above can serve as key performance metrics for establishing Service Level Objectives (SLOs). For instance, when evaluating latency, measuring data as percentiles is useful. A percentile indicates the response time below which a given percentage of API requests fall, helping you identify outliers and gain insight into worst-case scenarios.

| Percentile | Description |
| --- | --- |
| P99 | The 99th percentile: 99% of API response times fall at or below this value, making it an excellent measure of your service's peak response time. |
| P95 | The 95th percentile: here the data is more uniform, with the most extreme outliers excluded. |
| P75 | The 75th percentile is useful for front-end applications because it represents a wide variety of users. |
| P50 | The median, showing the typical performance of the API. |
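As a sketch of how these values are derived, the snippet below computes percentiles from raw response-time samples using the simple nearest-rank method (monitoring tools may use interpolation or streaming estimates instead); the latency samples are hypothetical:

```javascript
// Nearest-rank percentile: the value at rank ceil(p/100 * n) in sorted order.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank - 1, 0)];
}

// 100 hypothetical API response times (ms): mostly fast, a few slow outliers.
const latencies = Array.from({ length: 97 }, (_, i) => 100 + i); // 100..196 ms
latencies.push(850, 1200, 2400); // tail-latency outliers

for (const p of [50, 75, 95, 99]) {
  console.log(`P${p}: ${percentile(latencies, p)} ms`);
}
```

Note how P50 through P95 look healthy while P99 exposes the outliers, which is exactly why tail percentiles belong in an SLO.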

Similarly, error rate should be categorized into 3xx, 4xx, and 5xx ranges. While 4xx errors typically indicate client-side validation issues, a high volume suggests that the user interface may lack sufficient clarity, leading to difficulties with form input and user interactions.
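A minimal sketch of that categorization, bucketing a hypothetical sample of response codes by class and computing an overall error rate:

```javascript
// Bucket HTTP status codes into classes and compute an error rate.
function errorBreakdown(statusCodes) {
  const buckets = { '2xx': 0, '3xx': 0, '4xx': 0, '5xx': 0 };
  for (const code of statusCodes) {
    const cls = `${Math.floor(code / 100)}xx`;
    if (cls in buckets) buckets[cls]++;
  }
  // 4xx and 5xx responses count as errors; 3xx redirects do not.
  const errors = buckets['4xx'] + buckets['5xx'];
  return { buckets, errorRatePct: (errors / statusCodes.length) * 100 };
}

// Hypothetical sample of responses collected by an API monitor.
const codes = [200, 200, 301, 404, 200, 500, 200, 429, 200, 200];
console.log(errorBreakdown(codes));
// { buckets: { '2xx': 6, '3xx': 1, '4xx': 2, '5xx': 1 }, errorRatePct: 30 }
```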

API monitoring & alerting dashboard (Source)

We recommend reading the article API monitoring best practices to learn more.

| Best practice | Benefit |
| --- | --- |
| Set appropriate SLOs | Clearly define KPIs and set Service Level Objectives (SLOs) to establish a clear target for an API’s performance. |
| Continuous real-time monitoring | Monitor APIs 24/7 to detect issues early during off-hours and catch problems before peak times. |
| Set up alerts and reporting | Configure alerts and escalation procedures. |
| Integrate monitoring with the CI/CD pipeline | Don’t let monitoring be an afterthought; include it as part of the code release process. |
| Understand the user's perspective | Don’t measure API latencies in isolation but in the context of the end-user experience. |

Synthetic monitoring

Your website speed optimization tool should simulate user interactions to assess performance across different services and protocols. This way, you can test your website under different traffic scenarios and discover problems before a feature is released. 

Some synthetic monitoring capabilities to look out for include:

  • User journey transaction testing—Record or script sequential steps to capture a complete web-based business transaction, including login and SSO.
  • HTTP monitoring—Test the base page only for availability using HTTP GET or emulate a page with and without executing JavaScript.
  • Mobile simulation—Simulate mobile users by platform (Android, iOS), browser (Safari, Chrome), or wireless speed (3G, 4G).
  • CDN monitoring—Assess CDN server performance from various geographic locations to distinguish localized from global incidents.
  • Browser testing—Replicate the end-user experience across various devices, operating systems, and browser versions to surface UI issues as real users would experience them. 

{{banner-36="/design/banners"}}

Global reachability

Effective synthetic monitoring ensures your digital services are accessible from various geographic locations and network conditions. For example, a website may be accessible from Asia but have high latency when accessed from Europe. To uncover problems in a specific geography or from a specific service provider’s network, a tool with thousands of nodes as vantage points can help test and mitigate the problem.

Depending on your service, you may need to check whether the vendor offers points of presence (locations from which quality of service is measured), including public service nodes, Internet backbone nodes, and international nodes. Without points of presence across the transaction path from multiple wireline and wireless service providers, you can’t triangulate and pinpoint the segment of the transaction path that’s causing the performance problems. 

Real user monitoring (RUM) 

Real user monitoring (RUM) tracks how users interact with a website or application by embedding JavaScript code or similar scripts into web pages. While synthetic monitoring uses simulated transactions to test performance, RUM collects live data from the browsers of actual website visitors: performance metrics (like page load times), user interactions (such as clicks and navigation paths), and errors (like JavaScript issues). The data is then sent to a monitoring server for analysis.

With RUM data, you can:

  • Identify slow pages and bottlenecks to improve speed and responsiveness.
  • Discover navigation patterns and drop-off points to enhance design and functionality.
  • Quickly address bugs and performance issues based on accurate reports from real user sessions.
  • Evaluate changes and A/B test results to refine user experience.
  • Set up notifications for performance issues and generate insights for strategic decisions.

What RUM metrics do you want your website speed optimization tools to capture? Here are a few of the most important ones.

Time to First Paint/Render

Time to First Paint (TTP) marks when the browser first renders any page content. It measures page speed as perceived by users. It depends on the Critical Rendering Path—the steps the browser follows to convert your HTML, JavaScript, and CSS into pixels on the screen.

Steps to be completed for the user to see content on the screen (source)

Document Complete

Document Complete is the point in time when all static content on a page has finished loading. This is a key milestone in page load performance, indicating that the main content is ready for user interaction. 

The Navigation Timing API provides detailed timing information about a document's loading. It allows you to measure various stages of the page load process and helps identify potential bottlenecks.

Chart showing Document Complete metric data

Visually Complete

Visually Complete represents the point at which the page appears fully rendered to the user. It’s crucial because it determines when users perceive the page as ready, even if some background tasks are still ongoing.

The Resource Timing API allows developers to track the load times of individual resources, such as images, CSS, and JavaScript files. By monitoring when all critical resources have loaded, you can determine when the page is visually complete.
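A rough sketch of that idea: given Resource Timing-style entries (the field names follow the spec, but the sample values and the choice of "critical" initiator types are hypothetical), take the latest responseEnd among render-critical resources as an approximation of Visually Complete:

```javascript
// Approximate a visually-complete timestamp from Resource Timing-style
// entries: the latest responseEnd among render-critical resources.
function visuallyCompleteMs(entries) {
  const critical = entries.filter((e) =>
    ['img', 'css', 'script'].includes(e.initiatorType));
  return Math.max(...critical.map((e) => e.responseEnd));
}

// Hypothetical entries (ms since navigation start).
const entries = [
  { name: '/hero.jpg', initiatorType: 'img', responseEnd: 900 },
  { name: '/app.css', initiatorType: 'css', responseEnd: 310 },
  { name: '/app.js', initiatorType: 'script', responseEnd: 640 },
  { name: '/analytics.js', initiatorType: 'beacon', responseEnd: 2100 }, // non-critical
];
console.log(visuallyCompleteMs(entries)); // 900
```

In a browser, the entries would come from `performance.getEntriesByType('resource')`; dedicated tools refine this further by checking which resources actually paint above the fold.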

Time to Interactive

Time to Interactive (TTI) quantifies how long it takes for a page to become fully usable, where all major content is visible, and the page can handle user interactions seamlessly. A high TTI often indicates that the page is busy processing background tasks, leading to delays in user interactions. This creates user friction, where the user experience is negatively impacted due to slow responsiveness. 

TTI can be assessed by monitoring how long after navigation it takes for the page to start responding promptly to user inputs.
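As a simplified sketch of that assessment, loosely modeled on Lighthouse's definition (which looks for a five-second window free of long tasks after first contentful paint), the hypothetical task timestamps below yield a TTI estimate:

```javascript
// Simplified TTI estimate: after first contentful paint (FCP), scan long
// tasks (>50 ms) in order; TTI is the end of the last long task before the
// first 5-second quiet gap, or FCP itself if the page is quiet from the start.
function estimateTTI(fcpMs, longTasks) {
  const tasks = longTasks.filter((t) => t.end > fcpMs);
  let candidate = fcpMs;
  for (const task of tasks) {
    if (task.start - candidate >= 5000) return candidate; // quiet window found
    candidate = task.end;
  }
  return candidate; // assume the trace is quiet after the last long task
}

// Hypothetical long tasks (ms since navigation start), sorted by start time.
const longTasks = [
  { start: 1200, end: 1300 },
  { start: 2000, end: 2150 },
  { start: 9000, end: 9100 }, // late task, but the gap before it exceeds 5 s
];
console.log(estimateTTI(1000, longTasks)); // 2150
```

The page stops doing heavy main-thread work at 2,150 ms and stays quiet for more than five seconds, so that is when it becomes reliably interactive, even though a later task runs at 9 seconds.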

{{banner-37="/design/banners"}}

Example of using RUM to detect performance issues

An EdTech team notices a concerning drop in conversion rates for their free trial offer. Despite steady traffic, users abandon the site before using the free trial button. Using website speed optimization tools with RUM capabilities, such as Catchpoint, they discover two critical issues: 

  1. The free trial button was placed at the bottom of the homepage, leading to poor visibility.
  2. The page load time was slow, particularly on mobile devices. 

This combination resulted in a high bounce rate and a 25% drop-off rate before users encountered the button. For instance, out of 1,000 visitors, 250 exited the site before reaching the button, resulting in a substantial loss of potential revenue.

The team moved the free trial button to more prominent locations—at the top, mid-page, and in the footer of the homepage. They also optimized the page load time by compressing images and improving server response. These improvements led to a 15% increase in conversion rates, with 30 users signing up out of every 1,000 visitors (up from 20), translating to an additional $5,000 in potential revenue. User engagement with the button increased by 20%, and the drop-off rate decreased by 25%. Overall, these changes reduced bounce rates and enhanced the user experience.

This demonstrates the effectiveness of RUM in optimizing user interactions. The team achieved significant gains in user engagement and conversion rates, resulting in increased revenue and a more effective website.

Example: How IPM can help resolve Internet Stack performance issues

Consider a fictitious company called LearnSphere, which offers an online learning platform students use worldwide. One morning, students in Europe and Asia report that they can't access the platform at all, while those in North America experience intermittent connectivity. 

The IT team at LearnSphere uses an IPM tool to diagnose the issue. The tool shows normal BGP routing and CDN performance, but synthetic tests detect significant increases in DNS query response times for a critical third-party service responsible for interactive exercises and real-time student feedback. 

Drilling down further, the IPM system's correlation engine identifies that a failed DNS zone transfer has created a critical service degradation. One of the authoritative DNS servers serving the European and Asian regions contains outdated DNS records, causing widespread resolution failures. When the platform attempts to connect to the third-party service, it receives incorrect IP addresses, resulting in connection timeouts and service failures.

The above example highlights the importance of monitoring the entire Internet Stack, including dependencies, and how an IPM tool can help.

{{banner-35="/design/banners"}}

Conclusion

While Application Performance Management (APM) tools are necessary for monitoring the application code and backend services, a comprehensive approach involves assessing the Internet Stack through Internet Performance Monitoring (IPM). The most effective website speed optimization tools let you evaluate performance bottlenecks across the end-to-end user transaction path.

Catchpoint’s IPM solution uses the world’s largest Global Observability Network to provide real-time insights into external services affecting your application’s performance. You get a unified view of web performance, combining data from all aspects of your Internet Stack, including RUM and synthetic test sources. You can optimize website speed with confidence by choosing to fix those performance issues that are in your control and deliver maximum value to your site visitors. 

What's Next?