One night in Bangkok (Image by the author)

How I used Mutation Testing to Improve my Code

And make my tests more honest

Dave Sag
Published in ITNEXT
7 min read · Apr 28, 2019

A while ago I wrote a small utility called amqp-delegate that uses the standard amqplib library to make it easy to create and invoke remote workers over an AMQP message bus such as RabbitMQ.

I wrote an article about this called ‘Delegating Work using NodeJS and AMQP’.

I was at the beach when I wrote it and was feeling pretty lazy. In order to get a nice green 100% coverage badge on my repo I cheated and used /* istanbul ignore next */ to completely ignore my work delegator’s invoke function.

In my tests I added a little TODO note and then skipped the test.

I figured I’d tested this code with my integration tests, so obsessing over unit test code coverage was just a waste of time. My code worked; therefore it was good code.

Then I read about a mutation testing library called Stryker Mutator and figured I’d add it to my library just to see what it might do for me.

What is mutation testing?

Mutation testing is a way of testing your tests. It’s easy, as outlined above, to cheat your test coverage reporting by skipping bits of code, but it’s also sometimes not obvious that your unit tests are not really doing their job.

Mutation testing breaks your code in clever ways, changing false to true, changing the values of strings and numbers, changing plus to minus, that sort of thing, and then runs your tests again and again for each mutation of your code. If the tests still pass despite the changes made to your code then your tests are considered to be broken.

In an ideal world no mutation survives your tests: every change made to your code causes at least one test to fail.
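To make the idea concrete, here is a minimal hand-rolled sketch of a single mutation. The `isAdult` function and both "suites" are hypothetical and exist only for illustration; a real mutation testing tool generates and runs mutants like this automatically.

```javascript
// Original code under test (hypothetical example).
const isAdult = age => age >= 18

// A mutant of the kind a mutation tool might generate: `>=` becomes `>`.
const isAdultMutant = age => age > 18

// A weak suite: it never checks the boundary value 18.
const weakSuite = fn => fn(30) === true && fn(5) === false

// A stronger suite that also checks the boundary.
const strongSuite = fn => weakSuite(fn) && fn(18) === true

console.log(weakSuite(isAdult)) // true
console.log(weakSuite(isAdultMutant)) // true: the mutant survives
console.log(strongSuite(isAdultMutant)) // false: the mutant is killed
```

Both suites would report 100% coverage of `isAdult`, but only the second one notices when the comparison operator changes.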

Trying it out

Running my mutation tests showed huge swathes of red in my terminal as all the code that I’d told Istanbul to ignore got flagged, along with a bunch of other code that I’d assumed was well tested.

To give a more detailed example, here’s the code from the makeDelegator function I mentioned above.

Specifically here’s what the invoke function looked like. You can see why this is hard to unit-test.

The function works by registering a message consumer, then sending the data to the message queue. The message consumer waits until it gets the response with the correct correlationId and only then does it resolve or reject the promise.
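The shape described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the library's actual code: `makeInvoke`, the use of `crypto.randomUUID` for the correlationId, and the choice to reject on unparsable content are all assumptions. The `channel` is assumed to expose amqplib-style `consume` and `sendToQueue` methods.

```javascript
const { randomUUID } = require('crypto')

// Hypothetical sketch of the original, hard-to-test shape.
const makeInvoke = (channel, replyQueue) => (name, params) =>
  new Promise((resolve, reject) => {
    const correlationId = randomUUID()

    // The decision to resolve or reject lives inside this nested handler,
    // which is only reachable by driving a whole request/response cycle.
    channel.consume(
      replyQueue,
      message => {
        if (message.properties.correlationId !== correlationId) return
        try {
          resolve(JSON.parse(message.content.toString()))
        } catch (err) {
          reject(err)
        }
      },
      { noAck: true }
    )

    // Send the request, telling the worker where to reply.
    channel.sendToQueue(name, Buffer.from(JSON.stringify(params)), {
      correlationId,
      replyTo: replyQueue
    })
  })
```

Because the correlationId is generated inside the function and the handler is an anonymous closure, a unit test has to capture both from a fake channel before it can exercise any of the branching logic.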

The call to resolve or reject is buried deep in the response-message handler, making it very hard to unit test. Being something of a completist, however, I just had to give it a go.

Refactoring

The first step was to extract the response-message handler and test that in isolation.

The handler function needs access to the overarching promise’s resolve and reject functions as well as the correlationId to compare against the message’s own correlationId. I created the following curried utility function:

src/utils/messageCorrelator.js

This is easy to test. Just pass in stubs for the resolve and reject functions, and set up scenarios for when the correlationIds do or don’t match. Also, for completeness, throw in a test for when the response message’s content is not parsable as JSON.
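A minimal sketch of such a curried correlator and the three scenarios might look like this. The argument order, the decision to silently ignore non-matching messages, and rejecting on a JSON parse failure are assumptions. The `stub` and `message` helpers are hypothetical stand-ins for whatever stubbing library the real tests use.

```javascript
// Assumed shape: bind the correlationId and promise callbacks once,
// leaving a plain message handler that is trivial to call directly.
const messageCorrelator = (correlationId, resolve, reject) => message => {
  if (message.properties.correlationId !== correlationId) return
  try {
    resolve(JSON.parse(message.content.toString()))
  } catch (err) {
    reject(err)
  }
}

// Hand-rolled stub that records the arguments of each call.
const stub = () => {
  const fn = (...args) => {
    fn.calls.push(args)
  }
  fn.calls = []
  return fn
}

// Builds a fake response message in the shape amqplib delivers.
const message = (correlationId, body) => ({
  properties: { correlationId },
  content: Buffer.from(body)
})

const resolve = stub()
const reject = stub()
const handler = messageCorrelator('abc', resolve, reject)

handler(message('xyz', '{"ok":true}')) // ids differ: ignored
handler(message('abc', '{"ok":true}')) // ids match: resolve is called
handler(message('abc', 'not json')) // ids match, bad JSON: reject is called

console.log(resolve.calls.length, reject.calls.length) // 1 1
```

Each scenario is a single synchronous call with no channel or promise involved, which is what makes mutants in this function easy to kill.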

Now I had this utility I could pull out the invocation logic from the invoke function by making a simple curried invoker function as follows:

This uses the messageCorrelator and is very easy to test. There’s no branching logic in this function at all. The function doesn’t care if the resulting promise is resolved or rejected, so the tests are very simple.
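As a rough illustration, an invoker of this kind could be sketched as below. The names and argument order are assumptions. To keep the sketch self-contained, the correlator factory is passed in as a parameter, whereas the article's tests swap it out with proxyquire instead.

```javascript
// Hypothetical curried invoker: no branching, just two channel calls.
const makeInvoker = makeCorrelator => (correlationId, channel, replyQueue) => (name, params) =>
  new Promise((resolve, reject) => {
    // Listen for the reply; the correlator decides when to settle the promise.
    channel.consume(replyQueue, makeCorrelator(correlationId, resolve, reject), {
      noAck: true
    })
    // Send the request to the named worker queue.
    channel.sendToQueue(name, Buffer.from(JSON.stringify(params)), {
      correlationId,
      replyTo: replyQueue
    })
  })

// Usage with a recording fake channel and a stubbed correlator.
const calls = []
const channel = {
  consume: (...args) => calls.push(['consume', ...args]),
  sendToQueue: (...args) => calls.push(['sendToQueue', ...args])
}
const stubbedHandler = () => {}
const invoke = makeInvoker(() => stubbedHandler)('id-1', channel, 'replies')

invoke('double', [2])

console.log(calls.map(([method]) => method)) // [ 'consume', 'sendToQueue' ]
```

The test only needs to assert that `consume` and `sendToQueue` were called with the expected arguments; whether the promise ever settles is the correlator's concern.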

In a number of tests I use a test utility to create a fake channel that I can pass in instead of a real amqp channel.
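A fake channel helper of this sort can be very small. The sketch below is hypothetical: it covers only the two amqplib channel methods discussed in this article and records calls by hand rather than using a stubbing library.

```javascript
// Hypothetical test helper: mimics the parts of an amqplib channel used here,
// recording the arguments of every call for later assertions.
const fakeChannel = () => {
  const calls = { consume: [], sendToQueue: [] }
  return {
    calls,
    // The real consume resolves with an object containing a consumerTag.
    consume: async (queue, handler, options) => {
      calls.consume.push({ queue, handler, options })
      return { consumerTag: 'fake-consumer' }
    },
    // The real sendToQueue returns a boolean.
    sendToQueue: (queue, content, options) => {
      calls.sendToQueue.push({ queue, content, options })
      return true
    }
  }
}

// Usage: pass the fake wherever a real channel is expected.
const channel = fakeChannel()
channel.sendToQueue('double', Buffer.from('[2]'), { correlationId: 'id-1' })

console.log(channel.calls.sendToQueue[0].queue) // double
```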

I use proxyquire to stub out the messageCorrelator as I’ve already tested that separately, so the test looks like this:

Then I could refactor the original invoke function to be much simpler:

This is much easier to test.

I then applied the same sorts of decomposition to the other parts of the code that were previously very hard to test.

Conclusion

By adding mutation testing I was forced to go back and make my code more amenable to being tested, and as a consequence I’ve made it much more modular and much easier to reason about. The result is without a doubt better code, even though it actually works no better than the code I had before I added mutation testing.

The benefits are such that over the last couple of weekends I went back over all of the open-source code bases I maintain and added mutation testing to all of them, and fixed up all of the issues that the mutation testing revealed.
