You're reading the Ruby/Rails performance newsletter by Speedshop.

Google says pages should load in 2.5 seconds or less or be penalized in search rankings. What does that mean for your Rails app speed?

How fast do your server responses need to be? What's a good server response time?

In order to answer this question for any given application, we need to back in to our required server response time by first looking at the high-level overview of the "transaction" that's taking place.

It's important to realize that (almost) all latency requirements come from human beings. Your customers click or otherwise trigger an input which causes a page to load or navigate. That leads to a server response, which then leads to some "browser frontend stuff": loading JS, CSS, executing JS.

So, any web application interaction has several steps, of which your server response time is only a small part. It's important to realize that your user has no idea that any of this is occurring. Your user only understands the beginning (the input that triggers the interaction) and the end (the end state they wanted to get to, say, the page is loaded or the form is submitted and response recorded). Everything in between is a black box. Your customers don't care if an interaction spends most of its time executing JavaScript or waiting on the server. They care that the interaction is fast.

So that leaves the question: what is a fast interaction? Google has set an interesting new line in the sand with the introduction of Core Web Vitals: 2.5 seconds to Largest Contentful Paint. Based on probably billions of Chrome telemetry points, Google has decided that, with a cold cache, 2.5 seconds or less between navigation trigger and the Largest Contentful Paint event is good. Less than 4 seconds is okay, and more than that is bad. This goalpost is the same on mobile and desktop devices.
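As a rough illustration, those thresholds can be written as a small Ruby method. The method name, constants, and returned symbols are made up for this sketch; they aren't part of any Google or Rails API.

```ruby
# Thresholds from the paragraph above, in milliseconds.
LCP_GOOD_MS = 2_500
LCP_OKAY_MS = 4_000

# Hypothetical helper: bucket a Largest Contentful Paint time.
# Assumption: exactly 4 seconds is treated as "okay" rather than "bad".
def lcp_rating(lcp_ms)
  if lcp_ms <= LCP_GOOD_MS
    :good
  elsif lcp_ms <= LCP_OKAY_MS
    :okay
  else
    :bad
  end
end
```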

So, how do we "back in" to a server response time target from here? We're going to walk backwards, along the critical path of what it takes to build a webpage.

Usually the last thing that happens before Largest Contentful Paint is layout and paint. For pages with reasonable size and complexity, this takes about 50ms on a desktop device, maybe 200ms on a mobile device.

What happens before that is highly application dependent. Most applications will be waiting on JavaScript to download and execute. Other sites that are better architected will be waiting on CSS to download. In either case, usually the download and execution of critical JS and CSS resources is on our path. This can take anywhere from 500 milliseconds to 1.5 seconds. We'll use the upper bound for our budget.

Finally, the last stop on our backwards walk is the server response (the HTML document). This is what lists all of those critical resources, via script and link tags, so the browser can start downloading them. We often call this part of the critical path the "time to first byte", because it's the time between user input and receiving the first byte of the response (there is also the time to receive _all the other bytes_, but this is usually less important because HTML documents are small and the parser does not need to download the entire document to start parsing it).

Time to first byte is composed of three parts: the server response time (as exposed by the X-Runtime header in Rails), request queue time (how long the request waited for a free Puma/Unicorn/etc process), and network round trip (ping). Request queue time, if you've scaled properly, will be 10 milliseconds or less. Network round trip varies depending on whether your customer is on a mobile network or broadband. Nowadays, 100 milliseconds is a reasonable estimate for a domestic customer who doesn't have to cross any oceans to get to your server.
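Put as arithmetic, time to first byte is just the sum of those three parts. A minimal sketch, using the queue time and round trip estimates above as defaults; the method name is hypothetical, and the 150 millisecond server response time is an invented example value.

```ruby
# Hypothetical helper: sum the three parts of time to first byte.
# Defaults are the rough estimates from the paragraph above.
def time_to_first_byte_ms(server_response_ms:, queue_ms: 10, round_trip_ms: 100)
  server_response_ms + queue_ms + round_trip_ms
end

time_to_first_byte_ms(server_response_ms: 150) # => 260
```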

So, if we subtract all of the parts except server response time from 2.5 seconds, we're left with something like 600 to 700 milliseconds. As you can tell, it depends greatly on how much time is "left over" by the frontend. A lot happens between downloading the HTML document and actually painting the page. This is why I've always spent so much time on frontend perf in all of my books and products.
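The subtraction can be written out explicitly. This sketch uses the upper-bound figures quoted above (the mobile layout and paint estimate, and 1.5 seconds for critical JS and CSS); the constant and method names are invented for illustration, and you'd swap in your own measurements.

```ruby
LCP_TARGET_MS = 2_500

# Everything on the critical path except the server response, in milliseconds.
NON_SERVER_COSTS_MS = {
  layout_and_paint: 200,      # mobile estimate; about 50 on desktop
  critical_js_and_css: 1_500, # upper bound of the 500 to 1,500 ms range
  request_queue: 10,
  network_round_trip: 100
}.freeze

# Hypothetical helper: what's left for the server after everything else.
def server_response_budget_ms(target_ms = LCP_TARGET_MS, costs = NON_SERVER_COSTS_MS)
  target_ms - costs.values.sum
end

server_response_budget_ms # => 690
```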

700 milliseconds is actually quite a lot of time. However, it's not the target for your average response time, it's the target for your 95th percentile response time. You want 95 percent of your customers or more to experience 2.5 second LCP times, not 50% or less! Hint: 700 milliseconds is a good value to set your Apdex thresholds to!
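For reference, an Apdex score with threshold T counts requests at or under T as satisfied, requests between T and 4T as tolerating (worth half), and the rest as frustrated. Here's a minimal sketch of that formula with a 700 millisecond threshold; the method name is hypothetical and the sample timings are made up.

```ruby
# Hypothetical helper: Apdex score for a list of response times in milliseconds.
# Returns a value between 0.0 and 1.0, or nil when there are no samples.
def apdex(response_times_ms, threshold_ms: 700)
  return nil if response_times_ms.empty?

  satisfied = response_times_ms.count { |t| t <= threshold_ms }
  tolerating = response_times_ms.count { |t| t > threshold_ms && t <= 4 * threshold_ms }
  (satisfied + tolerating / 2.0) / response_times_ms.size
end

# Three satisfied, one tolerating, one frustrated: (3 + 0.5) / 5
apdex([120, 300, 650, 900, 3_500]) # => 0.7
```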

For a quick and dirty approximation, you can usually divide a 95th percentile by 4 to get the average. That means, in order to get a Largest Contentful Paint time of 2.5 seconds, our average response times should be approximately 175 milliseconds.
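That rule of thumb is a single division, but it's handy to pair it with a check against real numbers. A rough sketch, assuming you have a list of response times in milliseconds (exported from your monitoring tool, say); the method names are hypothetical, and the percentile uses the simple nearest-rank definition.

```ruby
# Rule of thumb from above: the average is roughly the 95th percentile / 4.
def average_target_ms(p95_target_ms)
  p95_target_ms / 4.0
end

# Nearest-rank percentile: the smallest sample with at least pct percent
# of all samples at or below it. pct is a whole number such as 95.
def percentile(samples, pct)
  sorted = samples.sort
  rank = (pct * sorted.size / 100.0).ceil
  sorted[[rank, 1].max - 1]
end

average_target_ms(700) # => 175.0
```

Comparing `percentile(your_samples, 95)` against 700 and the plain average against 175 gives a quick read on whether an endpoint fits the budget.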

You can use this process to "back in" to requirements for other types of interactions too: smaller JSON-driven interactions in a React app, for example.

I hope you've found this discussion interesting and the resulting requirement useful - see you next week.

-Nate

You can share this email with this permalink: https://mailchi.mp/railsspeed/how-fast-should-your-server-response-times-be?e=[UNIQID]

Copyright © 2021 Nate Berkopec, All rights reserved.
