Working Around Memory Leaks in Your Django Application

Leaky pipes

Several large Django applications that I’ve worked on ended up with memory leaks at some point. The Python processes slowly increased their memory consumption until crashing. Not fun. Even with automatic restart of the process, there was still some downtime.

Memory leaks in Python typically happen in module-level variables that grow unbounded. This might be an lru_cache with an infinite maxsize, or a simple list accidentally declared in the wrong scope.
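For illustration, both patterns can look as innocent as this (a minimal sketch with hypothetical names):

from functools import lru_cache

# A module-level list lives for the lifetime of the process and is never trimmed.
seen_ids = []


def track(obj_id):
    seen_ids.append(obj_id)  # grows without bound as requests come in


# maxsize=None keeps an entry for every distinct key ever passed in.
@lru_cache(maxsize=None)
def expensive_lookup(key):
    ...  # placeholder for real work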

Leaks don’t need to happen in your own code to affect you either. For example, see this excellent write-up by Peter Karp at BuzzFeed where he found one inside Python’s standard library (since fixed!).

Workarounds

The workarounds below all restart worker processes after they have handled a certain number of requests or jobs. This is a simple way to clear out any Python objects that would otherwise accumulate forever. If your web server, queue worker, or similar has this ability but isn’t featured here, let me know and I’ll add it!

Even if you don’t see any memory leaks right now, adding these will increase your application’s resilience.

Gunicorn

If you’re using Gunicorn as your Python web server, you can use the --max-requests setting to periodically restart workers. Pair it with its sibling --max-requests-jitter to prevent all your workers restarting at the same time, which spreads out the extra startup load.

For example, on a recent project I configured Gunicorn to start with:

gunicorn --max-requests 1000 --max-requests-jitter 50 ... app.wsgi

For the project’s level of traffic, number of workers, and number of servers, this would restart workers about every 1.5 hours. The jitter of 5% was enough to de-correlate the restart load.
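As a rough way to estimate the restart interval for your own setup, divide max-requests by the per-worker request rate. A quick sketch with hypothetical traffic figures (not the actual numbers from that project):

# Hypothetical figures -- substitute your own traffic and topology.
requests_per_second = 3.3     # across the whole service
servers = 2
workers_per_server = 9
max_requests = 1000

per_worker_rate = requests_per_second / (servers * workers_per_server)
restart_interval_hours = max_requests / per_worker_rate / 3600
print(f"Each worker restarts roughly every {restart_interval_hours:.1f} hours")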

uWSGI

If you’re using uWSGI, you can use its similar max-requests setting. This also restarts workers after so many requests.

For example, on a previous project I set it in the uwsgi.ini file like so:

[uwsgi]
master = true
module = app.wsgi
...
max-requests = 500

uWSGI also provides the max-requests-delta setting for adding some jitter. But since it’s an absolute number, it’s more annoying to configure than Gunicorn’s. If you change the number of workers or the value of max-requests, you’ll need to recalculate max-requests-delta to keep your jitter at a given percentage.
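For instance, keeping the jitter near 5% of max-requests means picking the absolute value yourself and updating it whenever max-requests (or your worker count) changes. A hypothetical snippet:

[uwsgi]
...
max-requests = 500
max-requests-delta = 25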

Update (2020-09-20): Added this section on the uWSGI Spooler, thanks to Jamesie Pic for pointing it out.

If you’re using the uWSGI Spooler for background tasks, you’ll also want to set spooler-max-tasks. This restarts a spooler process after it has processed a given number of tasks. It also goes in uwsgi.ini:

[uwsgi]
...
spooler-max-tasks = 500

Celery

Celery provides a couple of settings that can help with memory leaks.

First, there’s the worker_max_tasks_per_child setting. This restarts worker child processes after they have processed so many tasks. There’s no option for jitter, but Celery tasks tend to have a wide range of run times so there will be some natural jitter.

For example:

app = Celery("myapp")
app.conf.worker_max_tasks_per_child = 100

Or if you’re using Django settings:

CELERY_WORKER_MAX_TASKS_PER_CHILD = 100

100 tasks is a smaller limit than the one I suggested above for web requests. In the past I’ve ended up using smaller values for Celery because I saw more memory consumption in background tasks. (I think I also came upon a memory leak in Celery itself.)

The other setting you could use is worker_max_memory_per_child. This specifies the maximum kilobytes of memory a child process can use before the parent replaces it. It’s a bit more complicated, so I’ve not used it.

If you do use worker_max_memory_per_child, you should probably calculate it as a percentage of your total memory, divided per child process. This way if you change the number of child processes, or your servers’ available memory, it automatically scales. For example (untested):

import psutil

# Give Celery up to 75% of total system memory, split evenly between
# its child processes.
celery_max_mem_kilobytes = (psutil.virtual_memory().total * 0.75) / 1024
app.conf.worker_max_memory_per_child = int(
    celery_max_mem_kilobytes / app.conf.worker_concurrency
)

This uses psutil to find the total system memory. It allocates up to 75% (0.75) to Celery, which you’d only want if it’s a dedicated Celery server.

Tracking Down Leaks

Debugging memory leaks in Python isn’t the easiest, since any function could allocate a global object in any module. They might also occur in extension code integrated with the C API.
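For instance, the standard library’s tracemalloc module can record which source lines allocated the memory that’s still alive. A minimal sketch of comparing two snapshots around a suspect code path:

import tracemalloc

tracemalloc.start()  # begin tracking allocations, ideally at process startup

baseline = tracemalloc.take_snapshot()

# ... exercise the suspected leaky code, e.g. replay a batch of requests ...

current = tracemalloc.take_snapshot()
for stat in current.compare_to(baseline, "lineno")[:10]:
    print(stat)  # source lines whose allocations grew the most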

Some tools I have used:

Update (2019-09-19): Riccardo Magliocchetti mentioned on Twitter the pyuwsgimemhog project that can parse uwsgi log files to tell you which paths are leaking memory. Neat!

Some other useful blog posts:

Fin

May you leak less,

—Adam

