You're reading the Ruby/Rails performance newsletter by Speedshop.

Performance metrics need to flow from the user experience. Start from the highest possible level, and then work your way down.

One thing I get involved in a lot at clients is improving monitoring and observability of performance issues. Teams want to understand where they stand on perf before they decide on work, priorities, and user stories. That's great, and it all makes a lot of sense - to do otherwise would be to work on gut instinct and guesses, which, as we know, is premature optimization.

However, there is one common mistake I see at almost every one of my clients when it comes to performance monitoring: starting with low-level metrics.

What are some low-level metrics I see prioritized too soon? They may even sound high-level to you:

1. Backend response times
2. Single-page-app route change times
3. Response times in a particular controller, endpoint or path
4. Background job service times (time to process a single job once it's popped from the queue)
5. Tests-per-second

The common thread between each one of these metrics is that a higher-level metric exists which could contain information that makes the lower-level metric useless. While that's not always the case (sometimes the higher-level metric says "Hey, that lower-level metric really matters a lot"), you can't know for sure without the higher-level metric in front of you. It's a critical piece of information that you're missing.

Higher level metrics are ones which reflect the user's experience of latency. They reflect the user's perspective that your application is a black box, which makes pixels move on a screen in response to a keypress or a click. In different contexts and with different users, these high-level metrics can change. Consider the user experience of background job performance when the background job in question is a password reset email sender. The user clicks a button on a website, then waits in their email inbox for something to arrive. What are the lower-level components which, when combined, make up that experience?

Without a metric which reflects total end-to-end wait time from input to desired output, your metrics and dashboards are making assumptions about what is important. This is often true at companies which use backend performance monitoring without frontend performance monitoring (sometimes called Real User Monitoring). If someone decides the app needs to be faster, the only number in front of them to optimize is backend response time. But what if the user is spending 2-3 seconds waiting for JavaScript to compile and execute, or is stuck waiting because the frontend is inefficiently grouping its AJAX requests? By working on backend response times, you are assuming that improving those times will make a difference to the end user's experience. But without that high-level metric of time from user input to the user's desired output (page ready), you are just guessing.
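
To put rough numbers on that, here is a minimal sketch of the arithmetic. Every timing in it is made up for illustration (the key names are hypothetical too, not from any monitoring tool); the point is only how little a big backend win can move the number the user actually feels.

```ruby
# Hypothetical timings (in milliseconds) for one page load, as the user sees it.
# These numbers are invented for illustration, not measured from a real app.
timings_ms = {
  backend_response: 300,
  network: 200,
  javascript_compile_and_execute: 2500
}

end_to_end_ms = timings_ms.values.sum

# Suppose a lot of work goes into making the backend twice as fast.
improved = timings_ms.merge(backend_response: timings_ms[:backend_response] / 2)
improved_end_to_end_ms = improved.values.sum

# What fraction of the user's wait was the backend to begin with?
backend_share = timings_ms[:backend_response].fdiv(end_to_end_ms)

# How much of the user's total wait did the optimization remove?
user_visible_gain = (end_to_end_ms - improved_end_to_end_ms).fdiv(end_to_end_ms)

puts format("Backend share of total wait: %.1f%%", backend_share * 100)
puts format("User-visible improvement:    %.1f%%", user_visible_gain * 100)
```

With these made-up inputs, halving the backend response time shaves only a few percent off the end-to-end wait. You can only see that if the end-to-end number is on the dashboard in the first place.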

This is not to say lower-level metrics are useless - of course not! But they must exist as part of a chain:
  1. The highest-level metric: end-to-end latency from user input to completion/rendering of the final state.
  2. 2nd-order metrics - these latency numbers, when combined, equal the 1st-order metric. For example, sending a password reset email is composed of the time required to enqueue a background job, time spent waiting in a queue, time spent executing the job, and time spent by the email provider sending the email. Add all of these numbers up, and you should get something roughly equal to the high-level metric.
  3. nth-order metrics. What are the many components of enqueuing the background job? As for most web responses, we have the time spent queueing the request, then executing Rack middleware, then executing the controller. You can keep building this chain down and down and down the layers of abstraction.
Don't put the cart before the horse. All latency metrics should "flow" from the highest possible order metric: the user's experience of input to output.
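
If it helps to see the chain as data, here is a rough sketch using the password reset example. The `Metric` struct, the `report` helper, and all of the timings are hypothetical and exist only for this illustration. Each metric carries the components beneath it, and `unexplained_millis` shows how much of a level's time its components fail to account for.

```ruby
# A latency metric, plus the lower-order metrics that should add up to it.
# Metric and its fields are illustrative names, not part of any library.
Metric = Struct.new(:name, :millis, :components) do
  # Time at this level that the components below it do not explain.
  def unexplained_millis
    return 0 if components.empty?

    millis - components.sum(&:millis)
  end
end

# nth-order: what makes up the web request that enqueues the job?
enqueue = Metric.new("enqueue job (web request)", 150, [
  Metric.new("request queueing", 20, []),
  Metric.new("Rack middleware", 10, []),
  Metric.new("controller", 120, [])
])

# 1st-order metric, with its 2nd-order components. All numbers are made up.
password_reset = Metric.new("click to email in inbox", 9_000, [
  enqueue,
  Metric.new("job waiting in queue", 6_000, []),
  Metric.new("job execution", 350, []),
  Metric.new("email provider delivery", 2_000, [])
])

# Build an indented report, one line per metric, walking down the chain.
def report(metric, depth = 0)
  lines = ["#{'  ' * depth}#{metric.name}: #{metric.millis} ms"]
  metric.components.each { |c| lines.concat(report(c, depth + 1)) }
  lines
end

puts report(password_reset)
puts "Unexplained at the top level: #{password_reset.unexplained_millis} ms"
```

In this invented example the job's execution time is a small slice of the total, and most of the wait is spent sitting in the queue. A large unexplained gap at any level would be a hint that a link in the chain isn't being measured yet.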

Until next time,

Nate
You can share this email with this permalink: https://mailchi.mp/railsspeed/the-performance-chain?e=[UNIQID]

Copyright © 2021 Nate Berkopec, All rights reserved.

