Keeping up with dependencies like a boss - Pieter Vincken

Anything worth doing twice is worth automation.

What’s the problem?
What is Mend Renovate?
Behind the curtains
How to setup
Tips and tricks
Why should I use it?
Links

What’s the problem?

Lack of features

Let’s imagine you need to implement support for a new feature. Let’s imagine that that feature is super easy to implement due to almost native support for the functionality in a library you’re already using. That sounds like a great day, right? You add the code to glue together the API and the library, perform some tests and call it a day. There is just one problem, you didn’t check which version of the dependency you were using and the feature you need is only supported in versions 8.x.x and beyond. You check your pom.xml, only to figure out that you’re on version 6.8.21. Now you have two options, refactor 25% of your codebase to be able to use the new library or 5x your effort for implementing the feature without the support of the library. That doesn’t sound like a great day at the (home-)office, now does it?

Wouldn’t be great to have been on version 8.1.2 already?

Dependent systems

Let’s imagine another scenario. You’re using a library to connect to an Elasticsearch cluster that’s provided by another team. Your application is running nicely in production and you’re steadily adding features to the codebase. The end of the year approaches and with that, the pressure to deliver the final set of promised features increases. All of a sudden, your application starts throwing all kinds of errors in your development environment. After some stressful debugging, you figure out that the Elasticsearch cluster has been upgraded to the latest version by the other team. Furiously, you open up your mailbox and look for an email about the upgrade. Of course, you find the email dating back four weeks where the team announced that they will start upgrading this week for development and do production in two weeks.

Since it’s a shared system, you can’t fault the other team, nor can you halt their upgrade path. So the only way forward is to upgrade the library. You open up your pom.xml file only to discover that you’re already 2 major versions behind. Upgrading will take ages since your code depends on previously deprecated and by now removed API support.

That sounds like a lot of overtime, stress and/or missed deadlines for the end of the year, doesn’t it…

Security Vulnerability

Now we get to a scenario that maybe 50% of the Java community experienced at the end of 2021. A severe security vulnerability was discovered in a very popular logging library: Log4J. Normally, vulnerabilities aren’t this severe and there aren’t part of the nine o’clock news. Now imagine that you are using a vulnerable software component and you aren’t informed by the news that you need to urgently update, how would you know about the vulnerability? Maybe you have some scanning software that checks for know CVEs? Maybe you have really good developers that are subscribed to the mailing lists of all dependencies they’re using? Or more likely than not, you just don’t know you’re vulnerable. OWASP identified Vulnerable and outdated components as number six on their Top 10 Web Application Security Risks of 2021.

Wouldn’t it have been nice to have a PR on every repository that had the Log4J dependency with the new version updated? So that you only had to merge that PR to mitigate the vulnerability?

What is Mend Renovate?

Mend Renovate (formerly known as WhiteSource Renovate) is a tool that detects dependencies in your code and informs you about new versions of your dependencies. It’s a free (at the time of writing) tool that can be used as a GitHub App or as a self-hosted tool. Renovate scans a repository and detects the used dependencies, relying on dependency managers. As of time of writing, Renovate supports about 80 different dependency management systems out of the box. Next, it can integrate with your code repositories (e.g. Bitbucket, Github, Gitlab, …) and automatically update the dependency in your code. It can create pull requests (PR) for every update it detects and even adds changelog (if available) information to the PR. This is especially helpful if there are (manual) changes required to use the new version of the dependency. So Renovate can detect your dependencies, inform you about the updates and even tell you how to update.

Behind the curtains

Now, how does this magic work?

Renovate starts by scanning your source code for dependencies. It does this by looking for dependency files in your repository, let’s take the file structure below as an example.

repo/ 
├─ src/ 
├─ Dockerfile 
├─ pom.xml

Renovate will detect the Dockerfile and the Maven dependency file (pom.xml). It will pass these files to the dependency managers internally and detect which dependencies are being used. Let’s imagine that the Dockerfile has the following FROM line at the top: FROM amazoncorretto:18.0.0 The dependency manager for docker will detect that this is the repository amazoncorretto with tag 18.0.0 on Docker Hub. Next, it will look in the pom.xml.

Excerpt of the pom.xml:

<dependency>
    <groupId>org.apache.logging.log4j</groupId>
    <artifactId>log4j-core</artifactId>
    <version>2.12.2</version>
</dependency>

Note: Never use this dependency! This is a vulnerable version of log4j-core.

Renovate will detect the following Maven dependency: org.apache.logging.log4j:log4j-core:2.12.2

Next, Renovate will check for new versions of those dependencies. By default, Renovate will use the central Apache Maven Repository to check for newer versions of Maven dependencies. For Docker images, Docker Hub is used by default. Like almost everything in Renovate, you can configure it to use different data sources, like your own Nexus, Artifactory or AWS Elastic Container Registry for example. This way you can limit what sources are considered and you can also use Renovate to update internal dependencies instead of just public ones.

Now let’s imagine that the most recent version of the amazoncorretto image is 18.0.2 (This is latest at the time of writing). Renovate will create a branch from your main branch, update the FROM entry in your Dockerfile and push the changes into your repository as a new branch. Now your regular CI/CD process can be followed to build, test, merge and deploy your component with the updated dependency.

The same scenario, with a second branch, will be performed to update the log4j dependency.

As an additional bonus, if the dependency exposes it, Renovate will add the changelogs in the pull/merge request it creates in your repository. This makes it visible if any breaking changes were introduced and allows you to easily check, together with your tests, if you need to make changes to your code to update the dependency.

Example of changelogs embedded in PR.

PR with changelogs

How to setup

How to use Renovate depends a bit on where your code lives.

Renovate GitHub App

If you’re using GitHub, the setup is as easy as enabling the GitHub Renovate app for your repository. Next, Renovate will create an onboarding pull request where it will suggest a default Renovate configuration, show a preview of what dependencies it detected and an example of what updates it has found. The only thing left is to merge the pull request and renovate will automatically start scanning your repository and start creating branches and pull requests with dependency updates.

Self-hosted Renovate

The second option is to host Renovate yourself. This option also comes with a lot more configuration capabilities, especially w.r.t. private data sources. You can run Renovate as (part of) a pipeline, as a Docker container or even as a CronJob on Kubernetes. Since you only need NodeJS/NPM to be available, you can run it almost anywhere you want. You can find an example of how to run Renovate as a Cronjob on a Kubernetes cluster in this repository.

With the self-hosted deployment method, you need to either explicitly tell Renovate which repositories it needs to consider or configure it to auto-detect repositories based on the access of its user or the integration with your code repositories. For Bitbucket Server you can for example configure Renovate to automatically discover all repositories in a specific project. This prevents you from having to make changes to the Renovate configuration every time a new repository gets added.

In this scenario, you’ll have to provide Renovate an identity to interact with the code repositories as well, as it needs to be able to create branches and push code changes.

Tips and tricks

Aka things we discovered and/or went wrong while we started using Renovate.

Limit concurrent branches / PRs

An important tip, especially if your repository contains quite some outdated components, is to limit the number of concurrent branches and pull requests that Renovate is allowed to create. By default, Renovate is not limited to a specific number which might result in literally 10s if not over 100 pull requests being created. To prevent you and/or your developers from becoming overwhelmed by this, limiting the concurrent pull requests is a must!

Setup auto-merging

Auto-merging is a very powerful feature in Renovate, especially in combination with very good automated tests and continuous integration practices. Don’t enable this when starting with Renovate. Over time, you’ll be able to identify repositories and data sources for which updates are becoming as easy as just accepting the PRs. For these combinations of repositories and data sources, you can enable auto-merging in Renovate. This means that Renovate will perform the update to the code repository and if the result of the pipeline for that change is green, it will attempt to merge that PR. If you have a well-automated CI/CD process, this can allow Renovate to automatically update your dependencies and make sure that your software automatically has the latest dependencies.

Use it in deployment repositories

The term dependency is interpreted quite broadly in Renovate. Not only classical libraries and Docker base images can be detected. (aka build phase dependencies) It can also detect dependencies in your deployment setups. This means that it can detect updates in Ansible playbooks, update Helm Chart references, update Kubernetes manifests (including the API versions as shown below!) and even Kustomize and Terraform setups.

Enabling automated dependency management for those deployment repositories can greatly reduce the amount of effort that is required to update a newer external dependency. It also helps to make sure that an update is rolled out consistently across many different environments. Especially in a larger corporate context, it can be hard to determine what the latest version of a specific service is and on which environments it’s running or not. By enabling Renovate to detect the dependencies, it can easily inform the different teams/users of newer versions.

Example of automated Kubernetes API updates.

Kubernetes API updates

Start slow!

The last tip might sound a bit contradictory, but starting small with Renovate is key. You want to build confidence in the data sources you’re using and prevent your teams from being overwhelmed by Renovate PRs. A good way to start with Renovate is to enable it on just one or maybe a handful of low-impact repositories. This way, you can see what effect it has on your workflow and how teams react to having these PRs pop up in their change feed.

Why should I use it?

At the start of this blog post, we’ve discussed why updating dependencies is important. As discussed, bulk updating dependencies is an option, but it might be time-consuming and therefore might be bumped to the bottom of the priority list. Not to mention that you might miss important security updates if you need to “look” for the changes manually. Automating the update process and allowing a tool to automatically discover the updates, makes the update process a lot simpler and faster. If you already have a good CI/CD process that allows you to build, test, merge and deploy small changes easily, Renovate will save you tons of time and prevent you from running unsafe, outdated software.

And as one of my favorite sayings goes: anything worth doing twice is worth automating

Links

This blog post is a companion to a talk at the JOIN 2022 conference.

Feel free to reach out to me if you want to look into this solution.

Special thanks to the Unicorn team for helping with the blog post and automating the dependency updates!