Security scanners for Python and Docker: from code to dependencies

by Itamar Turner-Trauring
Last updated 02 Feb 2022, originally created 20 May 2020

You don’t want to deploy insecure code to production—but it’s easy for mistakes and vulnerabilities to slip through. So you want some way to catch security issues automatically, without having to think about it.

This is where security scanners come in. They won’t solve all your problems; for example, you should still be using services that proactively point out insecure dependencies. But it’s good to have some automated checks in your build or CI system to help catch problems.

For a Python application packaged in Docker, vulnerabilities can occur, among other places, in:

Your code.
Your code’s Python dependencies.
The system packages (Debian/RedHat/Ubuntu/etc.) included in the Docker image.

Let’s see how you can scan for vulnerabilities in each.

Scanning your Python code

The first place to catch security problems is in the code you’re writing. A useful tool for doing that is Bandit.

Consider the following deliberately insecure code:

import pickle
import sys
from urllib.request import urlopen

obj = pickle.loads(urlopen(sys.argv[1]).read())
print(obj)

If I run bandit against it, it catches a number of problems:

$ bandit example.py
...
>> Issue: [B403:blacklist] Consider possible security implications associated with pickle module.
...
>> Issue: [B301:blacklist] Pickle and modules that wrap it can be unsafe when used to deserialize untrusted data, possible security issue.
...
>> Issue: [B310:blacklist] Audit url open for permitted schemes. Allowing use of file:/ or custom schemes is often unexpected.
...

In the full output each of those warnings also points at the specific line of code where the warning applies. And as one would hope for a tool that needs to run in CI, finding security issues results in exit code 1, which will usually make your build runner fail the build.

When I wrote that example I was only thinking about the fact that pickle is insecure; I wasn’t expecting the URL scheme warning, but it’s a fair point—imagine accepting a URL in a web page form and then opening it. Automated security scanners are handy!
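
One way to address those three warnings, as a minimal sketch: swap pickle for JSON and check the URL scheme before opening anything. This assumes the remote data can be represented as JSON, which the original example doesn’t say. The names ALLOWED_SCHEMES, check_url, and load_remote_json are made up for this illustration.

```python
import json
from urllib.parse import urlparse
from urllib.request import urlopen

# Assumption: only plain web URLs are acceptable input.
ALLOWED_SCHEMES = {"http", "https"}


def check_url(url):
    """Return the URL unchanged, or raise if its scheme isn't allowed."""
    scheme = urlparse(url).scheme
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"Refusing to open URL with scheme {scheme!r}")
    return url


def load_remote_json(url):
    """Fetch a URL and parse the body as JSON instead of unpickling it."""
    with urlopen(check_url(url)) as response:
        return json.loads(response.read())
```

A scanner may still flag the urlopen() call, since it can’t necessarily tell that the scheme was checked first. At that point the warning has done its job: you review the call and decide whether it’s acceptable.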

Another tool to look at is pysa, which is included in the Pyre type checker. It can trace values as they flow through your code to see if unsafe inputs are reaching particular functions.
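
To make “unsafe inputs reaching particular functions” concrete, here is a hypothetical sketch of the sort of flow a taint-tracking tool is designed to follow: an untrusted value passes through a helper and ends up in a shell command. All the function names are invented for illustration, and whether pysa reports this particular flow depends on how its sources and sinks are configured, which isn’t covered here.

```python
import subprocess
import sys


def build_shell_command(user_value):
    """Unsafe pattern: untrusted text is pasted into a shell command string."""
    return f"echo {user_value}"


def run_unsafe(user_value):
    # The untrusted value reaches shell=True indirectly, via
    # build_shell_command(); that indirection is easy to miss in review.
    subprocess.run(build_shell_command(user_value), shell=True, check=True)


def run_safer(user_value):
    # The value is passed as a single argument and never parsed by a shell.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With run_safer(), input like "hello; echo INJECTED" comes back as literal text rather than being run as a second command.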

Scanning your Python dependencies

Your Python application likely depends on many Python libraries; occasionally one of them will have a security vulnerability, and you’ll want to make sure you’re using the fixed version.

There are a number of services that will preemptively scan your code and open PRs with suggested updates; GitHub acquired and is integrating Dependabot, for example, so you can use it for free. But you may also want to do some scans in your own CI, as part of your build.

One tool to do that is Trivy, which can scan requirements.txt, Pipenv lock files, and Poetry lock files. If you want a non-zero exit code when vulnerabilities are found (and you do!), you should use --exit-code 1.

$ cat requirements.txt
$ trivy fs --exit-code 1 ./
INFO    Number of language-specific files: 1
INFO    Detecting pip vulnerabilities...

requirements.txt (pip)
======================
Total: 25 (UNKNOWN: 0, LOW: 1, MEDIUM: 11, HIGH: 10, CRITICAL: 3)
...
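
If failing on every finding is too strict for your project, you could apply your own policy to the summary line. The following is only a sketch: parse_summary and should_fail are hypothetical helpers, the blocking severities are an assumption, and scraping human-readable output is fragile, so treat it as an illustration of the idea rather than something to deploy as-is.

```python
import re

# Matches a summary line in the format shown in the output above.
SUMMARY_RE = re.compile(r"^Total: (\d+) \((.*)\)$")


def parse_summary(line):
    """Turn a 'Total: N (LEVEL: n, ...)' line into a {level: count} dict."""
    match = SUMMARY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"Not a summary line: {line!r}")
    counts = {}
    for part in match.group(2).split(", "):
        level, value = part.split(": ")
        counts[level] = int(value)
    # Sanity check: the per-severity counts should add up to the total.
    if sum(counts.values()) != int(match.group(1)):
        raise ValueError("Severity counts do not add up to the total")
    return counts


def should_fail(counts, blocking=("HIGH", "CRITICAL")):
    """Assumed policy: fail the build only on the listed severities."""
    return any(counts.get(level, 0) > 0 for level in blocking)


counts = parse_summary(
    "Total: 25 (UNKNOWN: 0, LOW: 1, MEDIUM: 11, HIGH: 10, CRITICAL: 3)"
)
print(should_fail(counts))  # True: there are HIGH and CRITICAL findings
```

Before writing something like this for real, check whether your Trivy version’s own severity filtering and machine-readable output options already cover what you need.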

Scanning your Docker image

Your Docker image includes not only your Python code and its Python dependencies, but also system packages. For example, if you’re building on my recommended base image, the official python image, your application’s Docker image is based on Debian. Over time, Debian will ship security updates for various included packages, and you want to make sure your image includes those fixes.

Again, Trivy is a good tool to use here, since it can scan a Docker image for many kinds of security vulnerabilities, both system packages and programming language-specific packages.

The python:3.8.1-slim-buster image is obsolete, so it’s no longer getting security updates, which makes it a good test case. Let’s run trivy against it using the less-verbose --light option. If you omit --light you’ll also get a summary of the vulnerability details.

$ trivy image --light --exit-code 1 python:3.8.1-slim-buster
...
Total: 199 (UNKNOWN: 8, LOW: 83, MEDIUM: 38, HIGH: 51, CRITICAL: 19)

+----------------|---------------------|---------------+
|    LIBRARY     |  VULNERABILITY ID   | FIXED VERSION |
+----------------|---------------------|---------------+
| apt            | CVE-2020-3810       | 1.8.2.1       |
+                +---------------------|---------------+
|                | CVE-2011-3374       |               |
+----------------|---------------------|---------------+
| bash           | CVE-2019-18276      |               |
+                +---------------------|---------------+
|                | TEMP-0841856-B18BAF |               |
+----------------|---------------------|---------------+
| coreutils      | CVE-2016-2781       |               |
+                +---------------------|---------------+
|                | CVE-2017-18018      |               |

... many more vulnerabilities here ...

199 vulnerabilities is quite a lot, but it’s something of an exaggeration. In particular, there are some security issues you can’t necessarily do anything about:

Some of the CVEs in the output above are quite old—one of them is from 2011. A maintainer may choose not to fix certain unimportant security problems.
Sometimes you will have security problems that have been reported, but not yet fixed in a released package.

If you want to filter both these categories out, you can use the --ignore-unfixed option; you probably want to use that by default.

$ trivy image --light --exit-code 1 --ignore-unfixed python:3.8.1-slim-buster
...
Total: 58 (UNKNOWN: 0, LOW: 0, MEDIUM: 27, HIGH: 25, CRITICAL: 6)
...

All the vulnerabilities in this shorter list can be fixed by correctly updating the system packages.
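
To see how much noise --ignore-unfixed removed, you can subtract the second summary from the first. This is just arithmetic on the two Total lines above, and it assumes both scans used the same vulnerability database.

```python
# Counts copied from the two "Total:" lines in the scans above.
all_findings = {"UNKNOWN": 8, "LOW": 83, "MEDIUM": 38, "HIGH": 51, "CRITICAL": 19}
fixable = {"UNKNOWN": 0, "LOW": 0, "MEDIUM": 27, "HIGH": 25, "CRITICAL": 6}

# Whatever is left after removing the fixable ones has no fixed version yet.
unfixed = {level: all_findings[level] - fixable[level] for level in all_findings}

print(unfixed)
# {'UNKNOWN': 8, 'LOW': 83, 'MEDIUM': 11, 'HIGH': 26, 'CRITICAL': 13}
print(sum(unfixed.values()))  # 141, i.e. 199 - 58
```

So 141 of the 199 reported issues had no fix available, including every LOW and UNKNOWN one.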

Other notes about trivy:

As mentioned above, it also supports scanning Python packages, via requirements.txt, Pipenv, or Poetry lock files.
Make sure to use v0.23 or later; older versions had some data sources restricted to non-commercial usage.

As an alternative to trivy you can use Anchore Engine or Clair + Klar, but unfortunately they’re less easy to set up. In most cases your image registry will also do security scans for you, so you can check those scans after pushing the image.

Set up a scanner today

Setting these scanners up is as easy as adding a few extra lines to your build or CI configuration. Do that now, and you’ll hopefully get alerted in advance about at least some of the security vulnerabilities in your packaged application.
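
If your CI system can run a Python script, those few extra lines might look something like the sketch below. The scanner commands are the ones used in this article; the image name myapp:latest and the run_checks helper are placeholders I made up, so adjust them to your project.

```python
import subprocess
import sys

# Commands from this article; "myapp:latest" is a placeholder image name.
SCANNER_COMMANDS = [
    ["bandit", "example.py"],
    ["trivy", "fs", "--exit-code", "1", "./"],
    ["trivy", "image", "--exit-code", "1", "--ignore-unfixed", "myapp:latest"],
]


def run_checks(commands):
    """Run every command; return 1 if any failed or is missing, else 0."""
    failed = False
    for command in commands:
        print("Running:", " ".join(command), flush=True)
        try:
            returncode = subprocess.run(command).returncode
        except FileNotFoundError:
            # Treat a scanner that isn't installed as a failure, not a pass.
            print("Not installed:", command[0])
            returncode = 1
        if returncode != 0:
            failed = True
    return 1 if failed else 0


# In a CI step, end the script with:
#     sys.exit(run_checks(SCANNER_COMMANDS))
```

Running all the scanners before failing, rather than stopping at the first non-zero exit code, means a single build shows you every problem at once.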
