You're reading the Rails Performance newsletter, written by Nate Berkopec of Speedshop.

On a personal note, I wrote on my blog last week about how the experience of appearing on American reality television 10 years ago affected me personally. Give it a read if you're struggling with a tough failure in your own life.

Why worker killers are more dangerous than you might think

Often, when working with clients, I run into people using Puma Worker Killer or a similar plugin for Unicorn/Passenger.

These tools kill your application server's child processes when they exceed a certain memory threshold. Generally, they're installed either because someone believes using these tools or settings is a "best practice", or because the app is exceeding the memory allowed by its hosting provider or box.

Application servers are tools like Puma, Unicorn or Passenger. Their job is to take your Rack-compatible application, like a Rails app, and serve it over HTTP. All of these servers I just mentioned use a design where they start your application in one process, and then call fork to create "child" processes, sometimes called "workers". These child processes are the things that actually listen on the socket and serve your requests.
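For context, a minimal Puma configuration that turns on this forking behavior looks roughly like the sketch below. The worker and thread counts are placeholders only; tune them for your own app and server size.

```ruby
# config/puma.rb -- minimal sketch of a preforking setup (placeholder numbers)
workers Integer(ENV.fetch("WEB_CONCURRENCY", 2)) # forked child processes
threads 5, 5                                     # threads per child process

preload_app! # boot the app in the master process before forking,
             # so children share memory via copy-on-write

port ENV.fetch("PORT", 3000)
```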

So, the idea is that these tools kill any workers that exceed a certain memory threshold. It seems pretty harmless. You might even be wondering why Puma, for example, doesn't include a setting that does this for you, like Passenger does.
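For reference, a typical puma_worker_killer setup is wired up roughly like this; the numbers here are purely illustrative, and you should check the gem's README for the current options:

```ruby
# config/puma.rb -- rough sketch of how puma_worker_killer is usually installed
before_fork do
  require 'puma_worker_killer'

  PumaWorkerKiller.config do |config|
    config.ram           = 1024 # total MB available to all workers (illustrative)
    config.frequency     = 20   # seconds between memory checks (illustrative)
    config.percent_usage = 0.98 # kill the largest worker past this usage
  end
  PumaWorkerKiller.start
end
```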

The problem with these tools is that they can go completely haywire, increasing your app's response times by 10x, and you may not even realize what's happening. These tools tend to "fail" silently, which is sort of by design, because they're turning what should be an error or exceptional condition (out of memory) into something normal (hit restart, everything's fine again!).

Using an automatic worker killer is a bit like writing `rescue nil` at the end of a line of Ruby.

New processes (ones that haven't served many requests yet) are always slower than old processes (ones that have served many). In an old process, internal caches, such as memoized instance variables or Rails' statement cache, are warmed up and full, and the routing table has already been built (Rails builds it on the first request). The effect is most pronounced between the first and second request (the second request is often 5x faster than the first), but Rails processes tend to keep speeding up over their first 1,000 requests or so.
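To make the "warm cache" point concrete, here's a contrived sketch of the kind of memoization that makes later requests cheaper than the first. A freshly forked or freshly restarted process has to pay the expensive first call all over again:

```ruby
# Contrived example: the first call does the expensive work,
# later calls in the same process hit the memoized value.
class CountryCodes
  def all
    @all ||= begin
      sleep 0.5 # stand-in for loading/parsing something expensive
      %w[US GB DE JP BR]
    end
  end
end

codes = CountryCodes.new
codes.all # slow: does the work
codes.all # fast: returns the memoized @all

# Kill the process and fork a new one, and @all is gone --
# every worker restart throws this warmed-up state away.
```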

The worst-case scenario, which I've seen at several clients, is a worker killer killing and restarting processes so frequently that each process serves only a handful of requests before it dies. In the most extreme case I've seen, processes served just one or two requests before being killed. Removing puma_worker_killer in that case improved the performance of the application almost 10x.
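If you want to quantify this for your own app, one rough way (a sketch, not a drop-in) is a tiny Rack middleware that counts the requests served by the current process and logs the total when the process exits. If that number is very low, your workers are being killed before they ever warm up:

```ruby
# Rough sketch: count requests served by this process, log the total on exit.
# Not thread-safe; good enough for a ballpark number.
class RequestCounter
  def initialize(app)
    @app = app
    @count = 0
    at_exit { warn "pid #{Process.pid} served #{@count} requests before exiting" }
  end

  def call(env)
    @count += 1
    @app.call(env)
  end
end

# In config/application.rb or an initializer:
# config.middleware.use RequestCounter
```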

To understand whether this is happening to you, check your logs or, if you're using New Relic, look at the number of instance restarts occurring. In general, if you're restarting application processes more than once per hour, you could improve overall response times by removing the worker killer, by changing its settings so you restart less often, or by reducing memory usage enough that you don't need the worker killer anymore.
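If you're not on New Relic, you can get a similar signal from Puma itself. A small sketch like the one below logs every worker boot, so you can grep your logs and see how often restarts are actually happening:

```ruby
# config/puma.rb -- log each worker boot so restart frequency shows up in your logs.
# Sketch only; adapt the output to your own logging setup.
on_worker_boot do
  puts "[puma] worker #{Process.pid} booted at #{Time.now.utc}"
end
```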

Next week, I'd like to talk about how to actually configure these tools in a sane way so that you're not shooting yourself in the foot, and then how to do the hard work of fixing memory issues so that you don't need them in the first place.

-Nate, The Speedshop, a Rails/Ruby performance consultancy
You can share this email with this permalink: https://mailchi.mp/railsspeed/worker-killers-puma_worker_killer-etc-the-silent-footgun?e=[UNIQID]

Copyright © 2019 Nate Berkopec, All rights reserved.

