Make the computer do it!

Justin Wernick
Curly Tail, Curly Braces
2023-10-25

Abstract

As software developers, we have all of the skills that we need to make the boring parts of our work the computer's problem! In this article, I explore how I think about if something is worth automating, and some of the approaches I use to save time in my software development process.

Does your work have slow and tedious tasks? Things that you want done, but don't want to do yourself? I'm talking about things like setting up a server to host the new version of your software, typing a ticket number into a Git commit message, or even making backups. While these tasks are important, I would rather be doing the more novel, interesting parts of my job.

Luckily, as software developers, we have all of the skills that we need to make the boring parts of our work the computer's problem! Usually, we apply those skills to automate business processes, like paperwork and calculating statistics. Why not use them to automate our own time sinks away? Not only does automating these jobs save time, but it also tends to make them less error prone. After all, bored humans make a lot of mistakes.

In a previous job, I worked as part of a "developer productivity" team. It was part of my role to find the common parts of our development process which could be improved with better tools, and then to build those tools. Let me share some tips with you on how I figured out what parts of my job were worth automating, and how I then went about building tools that improve my own software development process.

Is it worth the time it takes to automate?

The first question to ask yourself is:

Can I save time by automating this, or is it better to just do it manually?

You're trying to weigh up the total time that it takes you to do the task against the time it takes to automate it. Designing, implementing and maintaining a system to take over a repetitive task also takes time! If you only need to do something once, then you probably shouldn't bother automating it.

But time measurements can be deceptive! Sometimes if you're only doing something complex once a year, you'll forget how to do it. It will take much more time, and be much more error prone, to figure out how to do it again a year later, but running the script that you wrote last year could be fast.

Another way that time measurements are deceptive is if you work with others. It might be true that you only need to do something once, but everyone else in the company also needs to do it once.

Slow tasks that need to be done regularly, or tasks that you want to do more reliably, are the best candidates for automation. Saving five minutes on a task that you need to do every day can add up to a lot over time. And, since software isn't as corporeal as other types of tools, it's easy to share with your team! That adds up quickly to even more time that your team could be using somewhere else!

When figuring out whether a task is worth automating, the factors to consider are:

  1. How much complexity could you hide by automating the job?

  2. How often do you do the job?

  3. How long does the job take to do manually?

  4. How much faster could the job be done with automation?

  5. Who else on your team currently does this job?

  6. How long would it take to automate?

Hopefully, weighing these up reveals a worthwhile saving between the time it takes to do something manually and the time it takes with better tooling.
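To make that comparison concrete, you can reduce it to back-of-the-envelope arithmetic. This is only a sketch: the function name is illustrative, and the round figure of 240 working days in a year is an assumption.

```shell
#!/bin/sh
# Rough break-even arithmetic: minutes saved per run, times runs per day,
# times the number of people doing the job, times working days in a year.
# The 240 working days is an assumed round number.
minutes_saved_per_year() {
    minutes_per_run=$1
    runs_per_day=$2
    team_size=$3
    echo $(( minutes_per_run * runs_per_day * team_size * 240 ))
}

# Saving five minutes, once a day, for a team of 50 developers.
minutes_saved_per_year 5 1 50
```

Five minutes a day across a team of 50 comes to 60,000 minutes a year, a thousand hours, which is a generous budget for building and maintaining a tool.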

As a software engineer, there are a few areas of the software development process which have become industry standard to automate, such as compiling code with build tools, running test suites on a continuous integration server, and shipping new versions with a deployment pipeline.

Individuals and interactions over processes and tools

Something important to remember when deciding how to automate a task is the agile principle of prioritising individuals and interactions over processes and tools.

When you're specifically introducing tools to help make your team more efficient, it's important to remember that the people don't serve the tools, the tools should serve the people.

When you're designing a tool or a process that you want other people to use, set up the process so that people can opt into it. If people choose to pull a tool into their environment, rather than having it forced on them, they're much more likely to respond positively and actually get a benefit from it.

Another point to keep in mind is that, even though I'm talking mostly about software improvements here, sometimes you can get more benefits by coming up with a solution in the physical world. Don't discount pens and sticky notes stuck up on a whiteboard! I personally find that I do better proofreading of articles, like this one, if I print them out on paper and write on them. The higher quality of the proofreading makes up for needing to go back and recapture my edits into the computer.

Look to existing solutions

The fastest program to write is one that someone else has already written.

This is the most common way of automating processes with software: you buy something, or download it from the internet, and just start using it.

Even if you're convinced that your problem is unique, you should spend some time researching it. Research is a really important first step in any problem solving situation, because you may find that a solution to your problem already exists, or you may find solutions to similar problems. In short, it will allow your designs to be informed by solutions that came before.

For example, a while back I decided that I wanted to set up my own Dropbox-alternative for syncing files between computers, by using Git to keep the version history of the files. I didn't follow the advice to do some background reading first, and jumped straight into trying to write my own script to automatically pull, commit, and push changes. It turns out that there are many edge cases to consider, like if someone is currently in the middle of resolving a merge conflict when your script runs. Luckily, I did eventually get around to doing some reading on the problem, and I found a really nice open source Git syncing script that already did exactly what I was trying to do.

The drawback of off-the-shelf solutions is that they often aren't perfectly customised to your exact situation. If you're only finding tools that almost fit your requirements, there are two approaches that you can take to customise your tools to your environment:

  1. Choose tools that are easy to add features to, so that you can customise them until they perfectly match your environment.

  2. Choose tools that do one thing, do it well, and are easy to compose together with other tools.

Let's quickly explore these two approaches.

Approach 1: Choose tools that are easy to extend

If you want to customise a tool to fit your situation, you need to have some way of adding your own features to it. In other words, the tool needs to be easy to extend.

You may ask, "How will I know if a tool I'm evaluating is easy to extend?" Typically, you will know because it will tell you how to extend it somewhere in its documentation. Different programs handle extensions in different ways, but one common pattern is to give you a way to call out to your own programs and scripts when certain things happen. For example, Git has Git hooks, which can call your program whenever somebody is doing something with Git, like creating a new commit.
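As a sketch of what a Git hook can look like, here is a minimal prepare-commit-msg hook that copies a ticket reference from the branch name into the commit message. The branch naming convention assumed here, a ticket reference like ABC-123 at the start of the branch name, is purely for illustration; adjust the pattern to match your own convention.

```shell
#!/bin/sh
# Sketch of a prepare-commit-msg hook. Save it as
# .git/hooks/prepare-commit-msg and make it executable. Git passes the path
# to the commit message file as the hook's first argument.

# Extract a leading ticket reference like "ABC-123" from a branch name.
# expr prints the matched group, or nothing when there is no match.
ticket_from_branch() {
    expr "$1" : '\([A-Z][A-Z]*-[0-9][0-9]*\)' || true
}

msg_file=${1:-}
if [ -n "$msg_file" ]; then
    branch=$(git symbolic-ref --short HEAD 2>/dev/null)
    ticket=$(ticket_from_branch "$branch")
    if [ -n "$ticket" ]; then
        # Prepend the ticket reference to the message Git has prepared.
        printf '%s %s' "$ticket" "$(cat "$msg_file")" > "$msg_file"
    fi
fi
```

With this in place, committing on a branch named ABC-123-fix-login would produce a message starting with "ABC-123".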

As a rule of thumb, I've found that open source software is more likely to be easy to extend. I think it comes along with the nature of open source: you have the source code and you can do whatever you want with it.

One example of this is that, if you choose to use Linux, it's easier to automate the setup of your computer. One script can take you from the point of having a newly installed machine, to the point where you can start working on your project.

Everything in Linux can be driven from the command line, so it's much easier than on Windows to write scripts that install programs and change system settings. This might seem like it wouldn't be worth the time, but in my current job we maintain a "dev setup" project which sets up your machine to work on our codebase. If we need everyone to change something about their development environment, all we do is add it to the scripts. This means that everyone can rerun the scripts to get the new changes, without needing to stop what they're doing to figure out what exactly needs to change on their machine. It also dramatically reduces the time it takes for a person joining the company to get their development environment to the point where they can work on our project.
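The key property of such a script is that it's safe to rerun. A minimal sketch of that style, using a hypothetical MYPROJECT_ENV variable and a demonstration file in the current directory (a real setup script would target something like ~/.profile):

```shell
#!/bin/sh
# ensure_line appends a line to a file only if it isn't already there, so the
# script can be rerun any number of times without duplicating entries.
ensure_line() {
    line=$1
    file=$2
    touch "$file"
    grep -qxF "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# When everyone needs a new environment variable, add one line here and ask
# the team to rerun the script. (The variable and file are illustrative.)
ensure_line 'export MYPROJECT_ENV=dev' profile.example
```

Because every step checks before it acts, adding a new step to the script is all it takes to roll a change out to the whole team.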

Approach 2: Choose tools that do one thing and do it well

You might recognise this as a quote from the Unix Philosophy for designing programs. The benefit of choosing tools like this is that the tools are easy to combine with other tools that also do one thing. You can do surprisingly many things with a combination of find, grep, and a bit of shell scripting.
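For instance, here is a small hypothetical helper that combines find and grep to report every TODO comment in a tree of shell scripts:

```shell
#!/bin/sh
# find selects the files, grep searches inside them: each tool does one job.
# list_todos prints every TODO comment under a directory, with the file name
# and line number.
list_todos() {
    # grep exits non-zero when there are no matches, which isn't an error here.
    find "$1" -name '*.sh' -type f -exec grep -Hn 'TODO' {} + || true
}
```

Running `list_todos src` would print lines like `src/deploy.sh:12:# TODO: remove this workaround` (the path is illustrative).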

I used this approach to build a script to convert my music collection to use Opus. The script uses a combination of:

  1. Bash: For scripting the process, including some clever built-in path manipulation.

  2. FFmpeg: For doing the actual media conversion.

  3. Find: For doing the same action on all music files.

  4. Git: For syncing my collection between my devices.

The authors of these tools didn't specifically design them to handle converting my music collection in the way I wanted it converted. But they did design tools that are good at something, and can be composed together with other programs easily. I didn't need to write everything from scratch, I just needed to tie these programs together, and I could get the benefit of having my music collection automatically converted to use the codec I wanted, and synced between my various computers.
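A simplified sketch of that kind of conversion loop is below. The directory layout and the FFmpeg encoder settings are assumptions for illustration; check FFmpeg's documentation for the options you actually want.

```shell
#!/bin/sh
# Convert every FLAC file under ./music to Opus, skipping files that were
# already converted on a previous run.

# The shell's built-in path manipulation: swap the .flac suffix for .opus.
opus_name() {
    printf '%s\n' "${1%.flac}.opus"
}

# Note: a newline in a file name would break this loop; fine for a sketch.
find music -name '*.flac' -type f 2>/dev/null | while IFS= read -r src; do
    dst=$(opus_name "$src")
    [ -f "$dst" ] && continue
    ffmpeg -i "$src" -c:a libopus -b:a 128k "$dst"
done
```

Because already-converted files are skipped, the script can run on a schedule and only do work when new music appears.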

If you're on a Linux system, then I'd recommend that you take a look at the GNU Coreutils. It's a fairly large collection of programs that follow the Unix philosophy and are almost always installed on a Linux system. You've probably encountered many of them before without thinking about where they came from.

Work on the performance of your existing tools

Another area where you can save a good chunk of time is in optimising the performance of the tools that you already use.

Benchmark, improve, repeat

Back when I was on a developer productivity team, we had a fairly large codebase written mostly in Scala. Scala is unfortunately notorious for taking a long time to compile, and as our codebase grew, our compilation times became progressively worse.

The trick to improving compilation performance is the same as improving the performance of any application:

  1. Gather performance metrics to see where you're waiting for your tools.

  2. Find a way to improve the worst part.

  3. Repeat.

Many modern build tools even have profiling built in, so gathering performance metrics might be as easy as just asking for them.

Improvement here doesn't necessarily even mean significant work. I found that our Sass stylesheets were taking a long time to compile. Digging a bit deeper, I discovered that the problem was that we were running an outdated version of the Sass compiler. I updated our Sass compiler to the latest version and the performance of the build immediately improved.

Saving a few minutes in our compilation time added up to a huge saving when you consider the 50 developers we had at the time, each waiting for that compilation to finish, many times a day, before they could continue working.

Buy better hardware

As a software developer, you do a lot of things in front of the computer, and you can waste a lot of time if you have to wait every time you ask your computer to do something. An easy win with programs that take a long time to run on developers' machines is to buy better hardware.

Generally, the cost of paying a developer's salary is much higher than the price of high-end computers. Even if you just take it from a business expenses point of view, it makes sense to invest in better hardware that saves time for developers.

The computer hardware market moves quickly, so any recommendations on exactly what hardware to buy now are going to become dated very quickly. Whenever I'm upgrading my computer, I take a look at the reviews and buying guides on Tom's Hardware and aim to buy somewhere in the middle of the high-end range. My experience is that spending more on hardware up front means that it lasts longer before you need the next upgrade. Just be aware that the prices of the latest and greatest technology are usually a bit silly. If you're like me, you'll be looking at the stuff that was the latest and greatest a year or two ago and has become a bit cheaper.

There are two components that I feel every software development machine needs:

  1. A Solid State Drive (SSD): Your software development tools will need to read the many files that make up your project. SSDs typically have a smaller total capacity than spinning hard drives, and cost more, but they can read and write files significantly faster. Luckily, I think this has become the default for modern computers, but make sure it's big enough for your work.

  2. Plenty of RAM: A good threshold is currently more than 16GB. If your computer can't fit everything that you're currently running into RAM, including your RAM-intensive dev-tools, then it will be forced to use some of your hard drive space as 'virtual RAM', which is much slower. The technical term for when this gets really bad is thrashing. You have enough RAM when this never happens on a normal working day.

Even if you're currently working on a laptop that your employer gave you, you can generally upgrade your performance with those two things without replacing the whole machine.
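On Linux, you can check whether RAM is the bottleneck by reading /proc/meminfo; if SwapFree drops well below SwapTotal while you work, the machine is paging and more RAM would likely help. This is a Linux-specific sketch, since /proc/meminfo doesn't exist elsewhere.

```shell
#!/bin/sh
# Print the memory and swap figures that indicate RAM pressure on Linux.
if [ -r /proc/meminfo ]; then
    grep -E '^(MemTotal|MemAvailable|SwapTotal|SwapFree):' /proc/meminfo
else
    echo "/proc/meminfo is only available on Linux" >&2
fi
```

Run it during a heavy compile or with your usual tools open to see the numbers that matter on a normal working day.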

Making tools to last

So you have some tasks that you've automated, and you want to keep them working for a long time. Here are some of the things you need to consider up front:

Don't tie yourself to a single operating system

Over the last five years, I've worked on Windows, Mac OS and Linux at various points. Each company, person, and project has had different ideas on what operating system should be involved. When I automate a repetitive task in my work day, I'd like it to function wherever I am currently working.

The trick to writing code that works on many different operating systems is to introduce an extra level of indirection. In other words, write programs using libraries and languages that already support multiple operating systems. If you're choosing a compiled programming language, make sure that you can compile it on many different systems. If it's an interpreted language, the language's runtime environment must support many different systems. If you're writing shell scripts, targeting standards like POSIX can make your scripts more portable.
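One way to apply that indirection in a shell script is to push the operating-system difference into a single function, and keep the rest of the script identical everywhere. This sketch assumes a hypothetical tool called mytool:

```shell
#!/bin/sh
# Hide the per-OS difference behind one function; everything else in the
# script can call config_dir without caring which system it's running on.
config_dir() {
    case "$(uname -s)" in
        Darwin) # macOS keeps per-application data under ~/Library.
            printf '%s\n' "$HOME/Library/Application Support/mytool" ;;
        *)      # Linux and most other Unix-likes follow the XDG convention.
            printf '%s\n' "${XDG_CONFIG_HOME:-$HOME/.config}/mytool" ;;
    esac
}
```

If a new operating system needs supporting later, there is exactly one place to change.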

That should be all you need to do to get support for multiple operating systems. Unfortunately, you'll sometimes find that the differences in operating systems thwart even the best tools. The only way that you'll find the places that break is to test your programs on all of the operating systems that you want to support.

Log your errors

If something that is usually automatic is broken, you will want to know that it's broken, and what part of it is broken.

In other words, you want to have clear error messages, and you want to have some way of getting them back to you. If you don't have this information, you'll have no way of fixing the problem and getting the automation working again. Worse, you might assume that it's still running when it isn't!

The nature of how the tool works will typically dictate how best to report errors. For programs running in the background on my personal computer, I've found that an effective way is to email myself any error messages. Sending an email to yourself every time there's an error might not be a good idea if you're encouraging everyone on your team to use your new tool, or if you're releasing it publicly (unless you really love emails and want to collect oodles of them).
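A sketch of that email-on-failure pattern, assuming a working local mail command (for example, from a configured mailutils or msmtp setup); the address and the backup script are placeholders:

```shell
#!/bin/sh
# Run a command, and email its output to me only when it fails, so a quiet
# inbox means everything is still working.
run_and_report() {
    log=$(mktemp)
    if "$@" > "$log" 2>&1; then
        status=0
    else
        status=$?
        mail -s "automated job failed: $1" me@example.com < "$log"
    fi
    rm -f "$log"
    return "$status"
}

# Example: wrap a nightly backup (the script path is hypothetical).
# run_and_report /usr/local/bin/nightly-backup.sh
```

Wrapping each scheduled job this way means failures announce themselves, instead of you discovering months later that the backups stopped.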

Error handling is another place to remember that the tools serve the people, not the other way around. Make your error messages clear and specific enough that the person using the tool will be able to figure out what's going on, and how to fix it, without your help.

One example of a helpful error message is the one I get when I misuse the mkdir program in the GNU Coreutils. In this case, I've tried to create a directory that already exists.

mkdir: cannot create directory ‘example’: File exists

The error message tells me what the program failed to do, and why it failed. It's enough information that I can find and fix the problem that led to the error.
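The same pattern works in your own tools: say what you were trying to do, and why it failed. A small sketch, where the "deploy" prefix and the config file are hypothetical:

```shell
#!/bin/sh
# Fail with an error message that names the operation, the file, and the
# reason, in the same spirit as the mkdir message above.
require_file() {
    if [ ! -f "$1" ]; then
        echo "deploy: cannot read config file '$1': no such file" >&2
        return 1
    fi
}
```

A message like this points whoever hits it straight at the missing file, without them needing to read the script.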

Test your code, and make use of types

Every now and then, you're going to make a mistake, and your shiny new tool won't quite work the way you expected. It's inevitable. But, if it breaks spectacularly while someone is using it, and they're stopped from getting something done, they're going to be upset.

You can save yourself from this pain by discovering your mistakes while you're still creating the tool. The technique that I find valuable for this is the same as with any software engineering endeavour: Test your code!

Let me tell you the story of a poorly tested tool that I rushed together. I was improving my development environment by writing a Git hook that includes my current branch name in commit messages. My branch names usually include some reference back to a task tracking system, so it's useful information for anyone looking at the commit message later. My first stab at this Git hook was a hastily hacked together shell script. A few weeks later, it broke catastrophically, from a problem that could have been prevented. If the branch name contained certain characters, the script would break, because it was confusing the branch name being inserted with the logic for inserting it. As a general description of my problem, you could say that it was vulnerable to a code injection attack. If I'd written tests for that code, I would have discovered the problem before it suddenly stopped working.
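To illustrate the class of bug (the real hook was more involved): if the branch name is spliced into a sed program, characters that mean something to sed, such as '/', are read as sed syntax, whereas keeping the branch name as plain data avoids the problem entirely.

```shell
#!/bin/sh
# Unsafe: the branch name becomes part of the sed program, so a '/' in the
# branch name is interpreted as sed syntax and the command falls over.
unsafe_prefix() {
    printf '%s\n' "$2" | sed "s/^/$1 /"
}

# Safe: the branch name stays data; printf does the splicing.
safe_prefix() {
    printf '%s %s\n' "$1" "$2"
}
```

safe_prefix 'feature/ABC-1' 'fix login' prints the combined message, while the unsafe version fails outright on that input, and other sed metacharacters in a branch name could corrupt the message instead of failing loudly.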

This clearly wasn't a good position, so I later rewrote the same Git hook using Rust. A big difference in the approaches, between using a shell script and using Rust, is that Rust lets you be much stricter about what types things are. You can structure your program such that when you compile, it can verify that the program can't muddle up a branch name and your programming logic. This is sometimes referred to as making impossible states impossible.

I could have achieved the same effect by unit testing my shell script more thoroughly, or by adding extra linting tools. Unit tests are a useful tool to make your code more robust. I find that using a programming language with a strong type system, and using that type system effectively, complements unit tests. The type system can verify during compilation that the rest of the program is calling your function appropriately. Unit tests can verify that your function is implementing the correct logic. This eliminates a class of tests that are boring to write, by moving the testing of those properties of your code into the compiler.

A strong type system is another layer of testing that your code does what you meant it to.

I've been using the Rust version of this hook for more than a year now, with no incidents of it breaking down from an unfortunately named branch.

What works for you?

These are the tips that I've found work well for me. The advantage of writing your own tools is that you can mould the approach to be something that works perfectly for you.

Generally speaking, if you have something that you need to sit and wait for, you should start thinking about how you can improve the situation. You can swap it out for a faster tool, improve the performance of the existing tool, or figure out a way of working that avoids the things you're waiting for entirely.

A final question for you: What part of your work day can you make a bit more efficient?

  1. Evaluate if it's actually worth automating this thing. Not everything is worth the time and effort, especially if it's a once off.

  2. Research existing solutions.

  3. Try prototyping a solution to your problem.

It might not work perfectly the first time, but if you keep trying you will get better at it. And most of all, don't forget to have fun while you're doing it!
