You're reading the Ruby/Rails performance newsletter by Speedshop.

Do you like money? If so, don't use Perf-M dynos on Heroku.

When you're on a cloud hosting provider, there are often several options provided for the size and type of server you can rent. For example, AWS has the m and c series instance types, Heroku has 4 or 5 different dyno types and sizes, and Google Cloud has a few different configurations as well.

When making this decision, there are a few parameters to keep in mind from a scaling cost effectiveness perspective:
  • CPU benchmarking
  • RAM per dollar
  • CPU cores per dollar
You can also, of course, compare these metrics across providers, but that's a more complicated decision, of which "cost effective scaling" is just one part. However, making the right choice between the various options your provider has can make a difference (sometimes a big one) in cost.

Why are these metrics important? They govern how much parallelism we are able to buy on a per-dollar basis.

When scaling a web service like a Rails app, we essentially want as many Ruby processes running our app as possible at a given level of spending. RAM, CPU frequency/performance, and CPU count all govern this.

RAM is the constraint that Ruby apps hit most often. You can't add another Puma process to your server, for example, because each process takes up about 500 MB of memory, you have 2 GB, and you're already running 4 processes.
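As a sketch, here's how that RAM arithmetic looks in Ruby, using the hypothetical numbers above (2 GB of RAM, roughly 500 MB per Puma process):

```ruby
# Sketch: how many Puma processes fit in the RAM we're paying for?
# Numbers are the hypothetical ones from above, not real measurements.
total_ram_mb = 2048       # 2 GB server
ram_per_process_mb = 500  # approximate per-process footprint

max_processes = total_ram_mb / ram_per_process_mb
puts max_processes # 4 -- a fifth process would exceed available RAM
```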

CPU core count is probably the most important determinant of available CPU resources. Core count is roughly equal to "how many requests we can actively process in parallel". If we have more requests that need to run Ruby than we have CPU cores, we're in trouble.

In modern CPUs, there are physical and logical cores. Physical cores are, well, the actual hardware cores etched into the CPU die. Depending on the CPU architecture, we usually get two logical cores per physical core. Logical cores are the result of hyperthreading and related technologies, which let the CPU schedule work cleverly enough that two threads can share a core at roughly the same time. In modern cloud parlance, we call hyperthreads/logical cores vCPUs. When a cloud provider quotes a CPU core count, it's usually the logical core count. More cores, more Ruby processes.
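You can check the logical core (vCPU) count your Ruby process sees with the standard library's Etc module:

```ruby
require "etc"

# Etc.nprocessors reports the number of logical cores (vCPUs)
# visible to the process. On a hyperthreaded machine, this is
# typically 2x the physical core count.
puts Etc.nprocessors
```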

CPU speed is also important here. Recall Little's Law: the number of items in a queueing system equals the average time an item spends in the system multiplied by the arrival rate of items. On a CPU, tasks take a certain number of clock cycles to execute. If we can execute more instructions per second, tasks finish faster, which reduces the number of tasks active at any given moment. This means a machine with 2 CPU cores clocked at 1 GHz has roughly the same CPU capacity as a machine with 1 core clocked at 2 GHz. Clock frequency alone is no longer a great indicator of CPU speed, though, so I'd rely on published benchmarks of cloud providers' instances to judge CPU power and speed.
 
With that in mind, let's take a look at Heroku's main dyno offerings: the 1x, 2x, Perf-M and Perf-L dynos:

Type     RAM     vCPUs  $/mo
1x       512 MB  1?     25
2x       1 GB    1?     50
Perf-M   2.5 GB  2      250
Perf-L   14 GB   8      500


The complicated thing about 1x and 2x dynos is that they're using a more complex CPU sharing scheme than Perf-M and Perf-L dynos. On performance dynos, you get unrestricted access to the vCPUs you're paying for. On Standard dynos, you're sharing CPU time with anyone else on the physical machine. It's unclear to me whether or not 1x or 2x dynos can reliably access more than one vCPU for any period of time, so that's why I've put a question mark next to their vCPU count.

Compare for a moment the cost-efficiency of Perf-M and Perf-L. For twice the price per month, Perf-L provides 5.6x as much RAM and 4x the number of vCPUs. That means the Perf-L dyno provides at least 2x the concurrency and scale per dollar of a Perf-M dyno.

Consider the following hypothetical scenario:

Our app processes 50 requests per second. Each request takes 0.3 seconds to process. We want 50% utilization of our available capacity, which will keep queue times reasonably low.

Following Little's Law, 50 requests per second multiplied by 0.3 seconds per request = 15. This means that in the long run, we'll be processing 15 requests in parallel at any given moment. So, we'll need at least 15 Ruby web processes. To get 50% utilization, we'll need double that number, so 30 processes.
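That capacity-planning arithmetic can be sketched in a few lines of Ruby (the numbers are the hypothetical ones above):

```ruby
# Little's Law: concurrent requests = arrival rate * time in system
arrival_rate = 50.0        # requests per second
service_time = 0.3         # seconds per request
target_utilization = 0.5   # run at 50% of capacity to keep queues short

in_flight = arrival_rate * service_time  # ~15 requests in parallel
processes_needed = (in_flight / target_utilization).ceil

puts processes_needed # 30
```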

Let's say each process uses 1GB of memory. The Perf-M dyno, therefore, can fit about 2 processes. The Perf-L could easily fit 8, maybe more depending on CPU usage, but for now, let's just say 8.

That means our Perf-M dynos are 2 processes each for $250 each, or $125 per process. Perf-L dynos are $500 each, or $500/8 = $62.50 per process.

Deploying 30 processes across 15 Perf-M dynos at 2 processes each will cost you $3,750 per month. Deploying 32 processes across 4 Perf-L dynos will cost you just $2,000 per month.
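Here's the per-process cost comparison as a quick Ruby sketch, using the hypothetical 1 GB-per-process figure and capping Perf-L at one process per vCPU as in the example above:

```ruby
# Sketch: cost per Ruby process on each performance dyno type.
# Assumes ~1 GB per process, per the hypothetical scenario above.
ram_per_process_gb = 1.0

dynos = {
  "Perf-M" => { ram_gb: 2.5,  vcpus: 2, price: 250 },
  "Perf-L" => { ram_gb: 14.0, vcpus: 8, price: 500 },
}

dynos.each do |name, spec|
  by_ram = (spec[:ram_gb] / ram_per_process_gb).floor
  # Cap at one process per vCPU, as the example above does for Perf-L.
  processes = [by_ram, spec[:vcpus]].min
  puts "#{name}: #{processes} processes, $#{spec[:price].to_f / processes} per process"
end
# Perf-M: 2 processes, $125.0 per process
# Perf-L: 8 processes, $62.5 per process
```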

If you're ever going to run more than a single Perf-M dyno, the math always works out in favor of the Perf-L.

You should almost never use Perf-M dynos. Always upgrade to a Perf-L.

This is a problem more or less unique to Heroku, because almost every other cloud provider uses linear scaling: If you pay 2x more, you get 2x the RAM and 2x the CPU. Only Heroku, as far as I know, uses this strange pricing. If it scaled linearly, Perf-M dynos would cost something like $125/month instead.

So, until Heroku changes their pricing to be more reasonable: avoid Perf-M.
You can share this email with this permalink: https://mailchi.mp/railsspeed/performance-m-dynos-considered-harmful-to-your-wallet?e=[UNIQID]

Copyright © 2020 Nate Berkopec, All rights reserved.

