To Mock, or Not to Mock, That is the Question

In this post I’m going to provide a definitive and final answer to the question: “are mocks in my unit tests good or bad?”. Given the years-long debates about the utility of mocks and the full-fledged flame wars over this very question, this goal is truly ambitious. But I shall not be afraid, and I shall state the ultimate truth for the generations to come :).

The way I’m going to structure this post will be a bit unusual, though. See, I’m going to review and discuss several opinions on mocking stated by other developers. Hopefully, this way I’ll be able to cover this subject in the most comprehensive way. That said, this also means that you’re about to read a long article.

So, without further ado, let’s start.

Mocking Frameworks are Bad

The first resource I’m going to review is an article titled Testing Kotlin Lambda Invocations Without Mocking. The author of this article states that “we generally want to avoid using mocks in unit tests” due to the following reasons:

Therefore, instead of this unit test that uses Mockito-Kotlin:

private val dataLoader = DataLoader(...)

private val testData = Data("result")

@Test
fun `fetcher is not executed when data associated with the query exists in cache`() {
    val mockFetcher = mock<(String) -> Data> {
        on { invoke(any()) }.doReturn(testData)
    }

    dataLoader.load("query", mockFetcher)

    // fetcher should be invoked the first time
    verify(mockFetcher, times(1)).invoke(any())

    clearInvocations(mockFetcher)

    dataLoader.load("query", mockFetcher)

    // fetcher should NOT be invoked for the same query the second time
    verify(mockFetcher, times(0)).invoke(any())
}

the author proposes to write the following test, which uses a manually defined “fake”:

private val dataLoader = DataLoader(...)

private val testData = Data("result")

@Test
fun `fetcher is not executed when data associated with the query exists in cache`() {
    var invokeCount = 0
    val testFetcher = { _: String ->
        invokeCount++
        testData
    }

    dataLoader.load("query", testFetcher)

    assertThat(invokeCount).isEqualTo(1)

    dataLoader.load("query", testFetcher)

    // invokeCount should NOT increment
    assertThat(invokeCount).isEqualTo(1)
}

As you can see, in this case, the argument against mocking is basically an argument against mocking frameworks. The author claims that mocking frameworks have several drawbacks, which I’ll address shortly. Then they correctly point out that you can write all “fakes” yourself and you don’t strictly need mocking frameworks.

Robert Martin, also known as Uncle Bob, expressed a similar sentiment in his post When to Mock. However, Uncle Bob acknowledges that dislike of frameworks is mostly his personal preference. In no place did Uncle Bob say that mocking frameworks “couple test code and production code”, which is the most serious “accusation” made by the author of this article.

So, are mocking frameworks bad? In my opinion, just like any other tool, they can be abused. When this happens, they’re bad. However, I find them very handy in many situations.

For example, consider the following simple implementation of an event bus:

public class EventBus {

    public interface Subscriber {
        void onEvent(Object event);
    }

    private final List<Subscriber> mSubscribers = new ArrayList<>();

    public void publishEvent(Object event) {
        for (Subscriber subscriber : mSubscribers) {
            subscriber.onEvent(event);
        }
    }

    public void subscribe(Subscriber subscriber) {
        if (!mSubscribers.contains(subscriber)) {
            mSubscribers.add(subscriber);
        }
    }
}

and the associated unit tests:

@RunWith(MockitoJUnitRunner.class)
public class EventBusTest {

    private static final Object TEST_EVENT = new Object();

    @Mock EventBus.Subscriber mSubscriber1;
    @Mock EventBus.Subscriber mSubscriber2;

    EventBus SUT = new EventBus();

    @Test
    public void publishEvent_subscribersNotifiedOnceInOrder() throws Exception {
        // Arrange
        SUT.subscribe(mSubscriber2);
        SUT.subscribe(mSubscriber1);
        // Act
        SUT.publishEvent(TEST_EVENT);
        // Assert
        InOrder inOrder = Mockito.inOrder(mSubscriber1, mSubscriber2);
        inOrder.verify(mSubscriber2).onEvent(TEST_EVENT);
        inOrder.verify(mSubscriber1).onEvent(TEST_EVENT);
    }

    @Test
    public void publishEvent_multipleSubscriberRegistration_subscriberNotifiedOnlyOnce() throws Exception {
        // Arrange
        SUT.subscribe(mSubscriber1);
        SUT.subscribe(mSubscriber1);
        // Act
        SUT.publishEvent(TEST_EVENT);
        // Assert
        verify(mSubscriber1).onEvent(TEST_EVENT);
    }
}

Here I use Mockito to mock the event bus’s subscribers during the test. Does this couple the test to the production code any more than if I wrote these mocks by hand? Absolutely not. Furthermore, rewriting these tests to use manual “fakes” wouldn’t make them more readable. On the contrary, the ordering verification would probably end up much uglier than what I have right now.
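For comparison, here is a rough sketch of how the ordering check might look with hand-written test doubles. The RecordingSubscriber class, the shared log and the ManualOrderingSketch wrapper are hypothetical names introduced only for this illustration, and the EventBus is a minimal copy of the listing above so that the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

class ManualOrderingSketch {

    // Minimal copy of the EventBus shown above, so that the sketch is self-contained.
    static class EventBus {
        interface Subscriber {
            void onEvent(Object event);
        }

        private final List<Subscriber> mSubscribers = new ArrayList<>();

        void publishEvent(Object event) {
            for (Subscriber subscriber : mSubscribers) {
                subscriber.onEvent(event);
            }
        }

        void subscribe(Subscriber subscriber) {
            if (!mSubscribers.contains(subscriber)) {
                mSubscribers.add(subscriber);
            }
        }
    }

    // Hypothetical hand-written double: appends its own name to a shared log on each event.
    static class RecordingSubscriber implements EventBus.Subscriber {
        private final String mName;
        private final List<String> mLog;

        RecordingSubscriber(String name, List<String> log) {
            mName = name;
            mLog = log;
        }

        @Override
        public void onEvent(Object event) {
            mLog.add(mName);
        }
    }

    // Subscribes "subscriber2" before "subscriber1" and returns the notification order.
    static List<String> notificationOrder() {
        List<String> log = new ArrayList<>();
        EventBus sut = new EventBus();
        sut.subscribe(new RecordingSubscriber("subscriber2", log));
        sut.subscribe(new RecordingSubscriber("subscriber1", log));
        sut.publishEvent(new Object());
        return log;
    }

    public static void main(String[] args) {
        System.out.println(notificationOrder());
    }
}
```

The shared log is what makes cross-subscriber ordering observable here. It is extra plumbing that a test author has to write and maintain by hand.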

So, in this case I claim that Mockito is the simplest and the most readable way to unit test this component. Now I want to show you an example of Mockito usage which I wouldn’t write in production.

Take a look at this test class (it’s part of my unit testing course):

@RunWith(MockitoJUnitRunner.class)
public class FetchCartItemsUseCaseTest {

    // region constants ----------------------------------------------------------------------------
    public static final int LIMIT = 10;
    public static final int PRICE = 5;
    public static final String DESCRIPTION = "description";
    public static final String TITLE = "title";
    public static final String ID = "id";
    // endregion constants -------------------------------------------------------------------------

    // region helper fields ------------------------------------------------------------------------
    @Mock GetCartItemsHttpEndpoint mGetCartItemsHttpEndpointMock;
    @Mock FetchCartItemsUseCase.Listener mListenerMock1;
    @Mock FetchCartItemsUseCase.Listener mListenerMock2;

    @Captor ArgumentCaptor<List<CartItem>> mAcListCartItem;
    // endregion helper fields ---------------------------------------------------------------------

    FetchCartItemsUseCase SUT;

    @Before
    public void setup() throws Exception {
        SUT = new FetchCartItemsUseCase(mGetCartItemsHttpEndpointMock);
        success();
    }

    private List<CartItemSchema> getCartItemSchemes() {
        List<CartItemSchema> schemas = new ArrayList<>();
        schemas.add(new CartItemSchema(ID, TITLE, DESCRIPTION, PRICE));
        return schemas;
    }

    @Test
    public void fetchCartItems_correctLimitPassedToEndpoint() throws Exception {
        // Arrange
        ArgumentCaptor<Integer> acInt = ArgumentCaptor.forClass(Integer.class);
        // Act
        SUT.fetchCartItemsAndNotify(LIMIT);
        // Assert
        verify(mGetCartItemsHttpEndpointMock).getCartItems(acInt.capture(), any(Callback.class));
        assertThat(acInt.getValue(), is(LIMIT));
    }

    @Test
    public void fetchCartItems_success_observersNotifiedWithCorrectData() throws Exception {
        // Arrange
        // Act
        SUT.registerListener(mListenerMock1);
        SUT.registerListener(mListenerMock2);
        SUT.fetchCartItemsAndNotify(LIMIT);
        // Assert
        verify(mListenerMock1).onCartItemsFetched(mAcListCartItem.capture());
        verify(mListenerMock2).onCartItemsFetched(mAcListCartItem.capture());
        List<List<CartItem>> captures = mAcListCartItem.getAllValues();
        List<CartItem> capture1 = captures.get(0);
        List<CartItem> capture2 = captures.get(1);
        assertThat(capture1, is(getCartItems()));
        assertThat(capture2, is(getCartItems()));
    }

    @Test
    public void fetchCartItems_success_unsubscribedObserversNotNotified() throws Exception {
        // Arrange
        // Act
        SUT.registerListener(mListenerMock1);
        SUT.registerListener(mListenerMock2);
        SUT.unregisterListener(mListenerMock2);
        SUT.fetchCartItemsAndNotify(LIMIT);
        // Assert
        verify(mListenerMock1).onCartItemsFetched(any(List.class));
        verifyNoMoreInteractions(mListenerMock2);
    }

    @Test
    public void fetchCartItems_generalError_observersNotifiedOfFailure() throws Exception {
        // Arrange
        generaError();
        // Act
        SUT.registerListener(mListenerMock1);
        SUT.registerListener(mListenerMock2);
        SUT.fetchCartItemsAndNotify(LIMIT);
        // Assert
        verify(mListenerMock1).onFetchCartItemsFailed();
        verify(mListenerMock2).onFetchCartItemsFailed();
    }

    @Test
    public void fetchCartItems_networkError_observersNotifiedOfFailure() throws Exception {
        // Arrange
        networkError();
        // Act
        SUT.registerListener(mListenerMock1);
        SUT.registerListener(mListenerMock2);
        SUT.fetchCartItemsAndNotify(LIMIT);
        // Assert
        verify(mListenerMock1).onFetchCartItemsFailed();
        verify(mListenerMock2).onFetchCartItemsFailed();
    }

    // region helper methods -----------------------------------------------------------------------

    private List<CartItem> getCartItems() {
        List<CartItem> cartItems = new ArrayList<>();
        cartItems.add(new CartItem(ID, TITLE, DESCRIPTION, PRICE));
        return cartItems;
    }

    private void success() {
        doAnswer(new Answer() {
                @Override
                public Object answer(InvocationOnMock invocation) throws Throwable {
                    Object[] args = invocation.getArguments();
                    Callback callback = (Callback) args[1];
                    callback.onGetCartItemsSucceeded(getCartItemSchemes());
                    return null;
                }
            }).when(mGetCartItemsHttpEndpointMock).getCartItems(anyInt(), any(Callback.class));
    }

    private void networkError() {
        doAnswer(new Answer() {
                @Override
                public Object answer(InvocationOnMock invocation) throws Throwable {
                    Object[] args = invocation.getArguments();
                    Callback callback = (Callback) args[1];
                    callback.onGetCartItemsFailed(GetCartItemsHttpEndpoint.FailReason.NETWORK_ERROR);
                    return null;
                }
            }).when(mGetCartItemsHttpEndpointMock).getCartItems(anyInt(), any(Callback.class));
    }

    private void generaError() {
        doAnswer(new Answer() {
                @Override
                public Object answer(InvocationOnMock invocation) throws Throwable {
                    Object[] args = invocation.getArguments();
                    Callback callback = (Callback) args[1];
                    callback.onGetCartItemsFailed(GetCartItemsHttpEndpoint.FailReason.GENERAL_ERROR);
                    return null;
                }
            }).when(mGetCartItemsHttpEndpointMock).getCartItems(anyInt(), any(Callback.class));
    }

    // endregion helper methods --------------------------------------------------------------------

    // region helper classes -----------------------------------------------------------------------
    // endregion helper classes --------------------------------------------------------------------

}

As you can see, I use Mockito there to mock three objects: two listeners and an HTTP endpoint. [Side note: the excessive use of ArgumentCaptor was intentional.]

I’m totally fine with mocking the listeners this way, but at the bottom of this class you can see these three methods which contain some cryptic Mockito magic:

private void success() {
    doAnswer(new Answer() {
            @Override
            public Object answer(InvocationOnMock invocation) throws Throwable {
                Object[] args = invocation.getArguments();
                Callback callback = (Callback) args[1];
                callback.onGetCartItemsSucceeded(getCartItemSchemes());
                return null;
            }
        }).when(mGetCartItemsHttpEndpointMock).getCartItems(anyInt(), any(Callback.class));
}

private void networkError() {
    doAnswer(new Answer() {
            @Override
            public Object answer(InvocationOnMock invocation) throws Throwable {
                Object[] args = invocation.getArguments();
                Callback callback = (Callback) args[1];
                callback.onGetCartItemsFailed(GetCartItemsHttpEndpoint.FailReason.NETWORK_ERROR);
                return null;
            }
        }).when(mGetCartItemsHttpEndpointMock).getCartItems(anyInt(), any(Callback.class));
}

private void generaError() {
    doAnswer(new Answer() {
            @Override
            public Object answer(InvocationOnMock invocation) throws Throwable {
                Object[] args = invocation.getArguments();
                Callback callback = (Callback) args[1];
                callback.onGetCartItemsFailed(GetCartItemsHttpEndpoint.FailReason.GENERAL_ERROR);
                return null;
            }
        }).when(mGetCartItemsHttpEndpointMock).getCartItems(anyInt(), any(Callback.class));
}

Whenever usage of Mockito deteriorates into something like that, I immediately refactor to a manual “fake”. Here’s the result of such a refactoring:

@RunWith(MockitoJUnitRunner.class)
public class FetchCartItemsManualTestDoublesUseCaseTest {

    // region constants ----------------------------------------------------------------------------
    public static final int LIMIT = 10;
    public static final int PRICE = 5;
    public static final String DESCRIPTION = "description";
    public static final String TITLE = "title";
    public static final String ID = "id";
    // endregion constants -------------------------------------------------------------------------

    // region helper fields ------------------------------------------------------------------------
    GetCartItemsHttpEndpointTd mGetCartItemsHttpEndpointTd;
    @Mock FetchCartItemsUseCase.Listener mListenerMock1;
    @Mock FetchCartItemsUseCase.Listener mListenerMock2;

    @Captor ArgumentCaptor<List<CartItem>> mAcListCartItem;
    // endregion helper fields ---------------------------------------------------------------------

    FetchCartItemsUseCase SUT;

    @Before
    public void setup() throws Exception {
        mGetCartItemsHttpEndpointTd = new GetCartItemsHttpEndpointTd();
        SUT = new FetchCartItemsUseCase(mGetCartItemsHttpEndpointTd);
        success();
    }

    private List<CartItemSchema> getCartItemSchemes() {
        List<CartItemSchema> schemas = new ArrayList<>();
        schemas.add(new CartItemSchema(ID, TITLE, DESCRIPTION, PRICE));
        return schemas;
    }

    @Test
    public void fetchCartItems_correctLimitPassedToEndpoint() throws Exception {
        // Arrange
        // Act
        SUT.fetchCartItemsAndNotify(LIMIT);
        // Assert
        assertThat(mGetCartItemsHttpEndpointTd.mInvocationCount, is(1));
        assertThat(mGetCartItemsHttpEndpointTd.mLastLimit, is(LIMIT));
    }

    @Test
    public void fetchCartItems_success_observersNotifiedWithCorrectData() throws Exception {
        // Arrange
        // Act
        SUT.registerListener(mListenerMock1);
        SUT.registerListener(mListenerMock2);
        SUT.fetchCartItemsAndNotify(LIMIT);
        // Assert
        verify(mListenerMock1).onCartItemsFetched(mAcListCartItem.capture());
        verify(mListenerMock2).onCartItemsFetched(mAcListCartItem.capture());
        List<List<CartItem>> captures = mAcListCartItem.getAllValues();
        List<CartItem> capture1 = captures.get(0);
        List<CartItem> capture2 = captures.get(1);
        assertThat(capture1, is(getCartItems()));
        assertThat(capture2, is(getCartItems()));
    }

    @Test
    public void fetchCartItems_success_unsubscribedObserversNotNotified() throws Exception {
        // Arrange
        // Act
        SUT.registerListener(mListenerMock1);
        SUT.registerListener(mListenerMock2);
        SUT.unregisterListener(mListenerMock2);
        SUT.fetchCartItemsAndNotify(LIMIT);
        // Assert
        verify(mListenerMock1).onCartItemsFetched(any(List.class));
        verifyNoMoreInteractions(mListenerMock2);
    }

    @Test
    public void fetchCartItems_generalError_observersNotifiedOfFailure() throws Exception {
        // Arrange
        generalError();
        // Act
        SUT.registerListener(mListenerMock1);
        SUT.registerListener(mListenerMock2);
        SUT.fetchCartItemsAndNotify(LIMIT);
        // Assert
        verify(mListenerMock1).onFetchCartItemsFailed();
        verify(mListenerMock2).onFetchCartItemsFailed();
    }

    @Test
    public void fetchCartItems_networkError_observersNotifiedOfFailure() throws Exception {
        // Arrange
        networkError();
        // Act
        SUT.registerListener(mListenerMock1);
        SUT.registerListener(mListenerMock2);
        SUT.fetchCartItemsAndNotify(LIMIT);
        // Assert
        verify(mListenerMock1).onFetchCartItemsFailed();
        verify(mListenerMock2).onFetchCartItemsFailed();
    }

    // region helper methods -----------------------------------------------------------------------

    private List<CartItem> getCartItems() {
        List<CartItem> cartItems = new ArrayList<>();
        cartItems.add(new CartItem(ID, TITLE, DESCRIPTION, PRICE));
        return cartItems;
    }

    private void success() {
        // no-op
    }

    private void networkError() {
        mGetCartItemsHttpEndpointTd.mNetworkError = true;
    }

    private void generalError() {
        mGetCartItemsHttpEndpointTd.mGeneralError = true;
    }

    // endregion helper methods --------------------------------------------------------------------

    // region helper classes -----------------------------------------------------------------------

    private class GetCartItemsHttpEndpointTd implements GetCartItemsHttpEndpoint {

        private int mInvocationCount;
        private int mLastLimit;

        private boolean mNetworkError;
        private boolean mGeneralError;

        @Override
        public void getCartItems(int limit, Callback callback) {
            mInvocationCount++;
            mLastLimit = limit;
            if (mNetworkError) {
                callback.onGetCartItemsFailed(FailReason.NETWORK_ERROR);
            } else if (mGeneralError) {
                callback.onGetCartItemsFailed(FailReason.GENERAL_ERROR);
            } else {
                callback.onGetCartItemsSucceeded(getCartItemSchemes());
            }
        }
    }
    // endregion helper classes --------------------------------------------------------------------

}

The code didn’t get much shorter, but, at least, I can read it and understand what’s going on there.

You might’ve noticed one interesting aspect of that refactored test class: I use a combination of Mockito mocks and manual “fakes” there, and it works out absolutely great. So, they aren’t really mutually exclusive.

So, are mocking frameworks bad? I don’t think so. You can definitely write very ugly code if you commit to mocking frameworks exclusively, but, fortunately, you don’t have to do that.

In addition, I think that it’s totally reasonable to say that you don’t like frameworks and prefer to use manual “fakes”. We all have personal preferences. However, stating that mocking frameworks are universally bad is just counter-productive dogmatism in my opinion. There are trade-offs there, sure, but it’s never black or white.

Test-Doubles of Type Mock are Bad

In the previous section we discussed the idea that mocking frameworks are bad. Seemingly, this should’ve exhausted the subject and there should be nothing more to add. However, that’s not exactly the case. The next resource that I want to review is an article titled Mocking is not Practical – Use Fakes.

I can imagine that at this point you might think: “That was the exact premise of the previous article, even the terms are the same!”. Indeed, the terms are the same, but the premise is quite different. Surprise, surprise.

In the previous article, the term “mock” referred to framework-specific test double, whereas the term “fake” referred to a manual implementation of the same behavior. In this article, however, the same terms are defined in a different manner:

As you can see, the discussion of whether to mock or not in this article isn’t concerned with frameworks anymore. Instead, the author argues that one type of test-double objects is better than another. Consequently, the question I’m going to address now is: “are test-doubles of type mock not practical?”.

The author of that article claims that mocks imply white-box testing while fakes imply black-box testing. Therefore, mocks couple tests to internal implementation details of the production code, whereas fakes avoid that coupling. In my opinion, that’s a fundamental misunderstanding.

Unit testing, in the most general sense, implies white-box testing. It’s especially true if you do proper TDD with the red-green cycle, where you write tests for pretty much individual lines of code ahead of time. Therefore, you aren’t in a black-box situation if you do unit testing, regardless of which types of test doubles you’re using.

For example, consider the author’s demonstration of a unit test that uses fakes:

class AccountManagerFakeTest {

    private lateinit var accountRepository: AccountRepository
    private lateinit var transactionRepository: TransactionRepository
    private lateinit var transactionManager: TransactionManager

    private lateinit var accountManager: AccountManager

    @Before
    fun before() {
        accountRepository = FakeAccountRepository()
        transactionRepository = FakeTransactionRepository()
        transactionManager = TransactionManager(accountRepository, transactionRepository)
        accountManager = AccountManager(accountRepository, transactionManager)
    }

    @Test
    fun `GIVEN amount and interest to deposit WHEN deposited THEN should update account balance`() {
        // GIVEN
        val originalAccount = Account("account1", 0)
        val amount = 1000
        val interest = 100
        val expectedAccount = originalAccount.copy(balance = 1100)

        accountRepository.put(originalAccount.id, originalAccount)

        // WHEN
        val actualAccount = accountManager.deposit(originalAccount.id, amount, interest)

        // THEN
        assertEquals(expectedAccount, actualAccount)
    }
}

Note how this test adds an Account to AccountRepository. This immediately leaks an internal implementation detail of the SUT: it makes use of this object in a very specific way. At this point, the “black-box theory” flies out of the window, but that’s not all. Take a look at the before() method. See how it is coupled to the objects’ construction details? If you change any of these constructors, even without changing the functionality of the SUT in any way, this test will break.

All in all, as I said, the discussion of black-box vs white-box is irrelevant here. There is no black-box in unit testing (unless we redefine the term “unit testing” itself).

Another interesting claim in that article is that mocks make your tests verbose. To demonstrate this point, the author shows a long test that uses mocks and a shorter version that uses fakes. The difference in length is pretty impressive.

Well, first of all, the test that uses mocks is needlessly long. Let me remove the unneeded code:

class AccountManagerMockTest {

    private lateinit var accountRepository : AccountRepository
    private lateinit var transactionManager : TransactionManager

    private lateinit var accountManager: AccountManager

    @Before
    fun before() {
        accountRepository = mockk(relaxed = true)
        transactionManager = mockk(relaxed = true)
        accountManager = AccountManager(accountRepository, transactionManager)
    }

    @Test
    fun `GIVEN amount and interest to deposit WHEN deposited THEN should create deposit transactions and execute them`() {
        // GIVEN
        val originalAccount = Account("account1", 0)
        val amount = 1000
        val interest = 100
        val expectedAccount = originalAccount.copy(balance = 1100)

        every { accountRepository.get(originalAccount.id) } returns originalAccount
        every { transactionManager.executePendingTransactions(originalAccount.id) } returns expectedAccount

        // WHEN
        val actualAccount = accountManager.deposit(originalAccount.id, amount, interest)

        // THEN
        verifySequence {
            transactionManager.add(TransactionRequest(TransactionType.DEPOSIT, originalAccount, amount))
            transactionManager.add(TransactionRequest(TransactionType.DEPOSIT, originalAccount, interest))
            transactionManager.executePendingTransactions(originalAccount.id)
        }
        assertEquals(expectedAccount, actualAccount)
    }
}

This version is still longer than the “fake” one, but the difference isn’t staggering anymore. In practice, however, comparing the length of these tests is simply silly because they aren’t equivalent.

In the test with mocks, the SUT is AccountManager class. In the test with fakes, the SUT is AccountManager and TransactionManager classes together (it’s totally alright to test more than one class as a single unit). The test with mocks verifies that amount and interest were added separately and their respective values were correct. The test with fakes verifies that the resulting total balance is correct. These tests cover different aspects of the application and provide different guarantees. Apples vs oranges.

One thing you might’ve noticed is that in addition to simplification of the “mocking” version of the test, I also replaced verify with verifySequence. Otherwise, theoretically, SUT could call executePendingTransactions before adding the transactions themselves, and the test would still pass. Furthermore, in all real-world financial systems that I’m aware of, the order of transactions is important. Therefore, it’s probably a good idea to verify the ordering of amount and interest transactions as well.

Speaking of ordering, do you see anything odd in the implementation of the FakeTransactionRepository class?

class FakeTransactionRepository : TransactionRepository {
    private val transactionTable = HashMap<String, Transaction>()

    private var newId = 0
    override fun getAll(account: Account, status: TransactionStatus): List<Transaction> {
        return transactionTable.values
            .filter { it.status == status }
            .filter { it.account == account }
    }

    override fun put(request: TransactionRequest): Transaction {
        val id = (++newId).toString()
        val transaction = Transaction(
            id,
            request.type,
            request.account,
            request.amount,
            TransactionStatus.PENDING
        )
        transactionTable[id] = transaction
        return transaction
    }

    override fun update(transactionId: String, transaction: Transaction): Transaction {
        transactionTable[transactionId] = transaction
        return transaction
    }
}

This fake implementation uses HashMap under the hood. HashMap makes no guarantees about iteration order, so, in principle, when you get transactions from it, they can come back in a different order than they were added. This might not be important in the context of this simple tutorial, but it could be disastrous in a real project. When you use fakes, you must ensure that their behavior in tests is equivalent to “the real thing”, including all the nuances. In some cases, that’s not simple to achieve. Furthermore, you’ll need to invest additional effort to maintain this parity as you modify and change your production code.
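One possible way to close this particular gap is to back the fake with a map that iterates in insertion order. The sketch below is a simplified Java illustration, not the article’s Kotlin code: FakeTransactionStore and its String-based transactions are hypothetical stand-ins, and it assumes that the real repository returns transactions in the order they were stored.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OrderedFakeSketch {

    // Hypothetical, simplified fake: stores transaction descriptions keyed by a generated id.
    static class FakeTransactionStore {
        // LinkedHashMap iterates in insertion order; a plain HashMap gives no such guarantee.
        private final Map<String, String> mTransactions = new LinkedHashMap<>();
        private int mNewId = 0;

        String put(String description) {
            String id = String.valueOf(++mNewId);
            mTransactions.put(id, description);
            return id;
        }

        // Returns the stored descriptions in the order they were added.
        List<String> getAll() {
            return new ArrayList<>(mTransactions.values());
        }
    }

    public static void main(String[] args) {
        FakeTransactionStore store = new FakeTransactionStore();
        store.put("deposit amount");
        store.put("deposit interest");
        System.out.println(store.getAll());
    }
}
```

Whether insertion order is the right contract depends on what the production repository actually promises, which is exactly the parity question raised above.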

All in all, both tests presented in that article are white-box tests and, furthermore, they aren’t equivalent. I guess that’s enough to undermine the credibility of this resource, even if we forget about all other issues.

In reality, fakes and mocks are used for different purposes. It’ll take yet another article to describe their use cases in detail, but the main difference is behavior verification vs state verification. Neither of them is better or worse, as long as you use them judiciously and understand the trade-offs.

In other words: fake and mock test-doubles are complementary to each other, not alternatives. Use them both.
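To make the distinction between state verification and behavior verification a bit more concrete, here is a minimal sketch with hand-written doubles. Everything in it (BalanceStore, Depositor, FakeBalanceStore, MockBalanceStore) is a hypothetical example invented for this illustration and is not taken from the reviewed article.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class VerificationStylesSketch {

    // Hypothetical collaborator of the class under test.
    interface BalanceStore {
        int get(String accountId);
        void set(String accountId, int balance);
    }

    // Hypothetical class under test: adds an amount to the stored balance.
    static class Depositor {
        private final BalanceStore mStore;

        Depositor(BalanceStore store) {
            mStore = store;
        }

        void deposit(String accountId, int amount) {
            mStore.set(accountId, mStore.get(accountId) + amount);
        }
    }

    // Fake: a working in-memory implementation; the test inspects the resulting state.
    static class FakeBalanceStore implements BalanceStore {
        private final Map<String, Integer> mBalances = new HashMap<>();

        @Override
        public int get(String accountId) {
            return mBalances.getOrDefault(accountId, 0);
        }

        @Override
        public void set(String accountId, int balance) {
            mBalances.put(accountId, balance);
        }
    }

    // Hand-written mock: returns a canned balance and records every interaction.
    static class MockBalanceStore implements BalanceStore {
        final List<String> mCalls = new ArrayList<>();

        @Override
        public int get(String accountId) {
            mCalls.add("get(" + accountId + ")");
            return 100;
        }

        @Override
        public void set(String accountId, int balance) {
            mCalls.add("set(" + accountId + ", " + balance + ")");
        }
    }

    // State verification: run the SUT against the fake and look at the final balance.
    static int stateVerification() {
        FakeBalanceStore fake = new FakeBalanceStore();
        fake.set("a", 100);
        new Depositor(fake).deposit("a", 50);
        return fake.get("a");
    }

    // Behavior verification: run the SUT against the mock and look at the recorded calls.
    static List<String> behaviorVerification() {
        MockBalanceStore mock = new MockBalanceStore();
        new Depositor(mock).deposit("a", 50);
        return mock.mCalls;
    }
}
```

The first check only cares about the end result, while the second one pins down which calls the SUT made and in which order. These are different guarantees, which is why the two kinds of doubles complement each other.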

Test-Doubles are Bad

At this point you might think: “after these two long discussions, there is surely nothing more to say about mocks, right?”. Well, there is. Quite a bit.

The last resource that I’ll review here is a talk titled TDD, Where Did It All Go Wrong. This video is long, complex and loaded with many ideas. You can watch it from start to end if you want, but the parts about mocks are mostly concentrated in a few segments of the talk.

TL;DR: this speaker uses the term “mocks” to refer to all test-doubles, regardless of their types. The argument, though, is very similar to what we discussed in the previous section (but don’t forget that those arguments assumed a different meaning of “mock”): if you use mocks (i.e. test-doubles), you couple your tests to implementation details. Therefore, future refactorings can break the existing tests even though they don’t change the functionality.

Well, in the previous section we’ve already established that all types of test-doubles indeed couple the tests to implementation details to some degree. So, as a standalone statement, that’s correct and constitutes a valid concern. However, this is far from being the only concern that you need to be aware of. I’ll try to explain what I mean with an example.

Consider the following class:

public class DataValidator {

    public boolean validate(Data data) {
        ...
    }
}

This class is an abstraction which represents a class of objects that you might use to validate some data in your system. It might be as simple as validating the format of a password in registration form, or as complex as validating the authenticity and the integrity of an encrypted payload. Or even more complex than that.

To emphasize the point I’m going to make, let’s further assume that this class has no additional dependencies and it’s not stateful. It’s a pure, static and self-contained algorithm that returns a yes-no answer. In addition, it executes really fast, such that it won’t degrade the speed of our test suite even if used in hundreds of tests.

Given all the above assumptions, it’s trivially simple to use DataValidator in unit tests of other classes and even entire modules without mocking. However, I’d still write dedicated unit tests for this class alone, and I’d probably mock it out in the tests of other classes in most cases.

The reason to write dedicated unit tests for this class specifically is functional coverage (not to be confused with line coverage).

See, even for something as simple as “the password must have at least eight characters and contain at least one digit” I’ll need about five test cases. If I test this class alone, these test cases will be trivially simple and will take me minutes to write. If I test this functionality as part of a larger “unit”, however, they’ll probably end up much more complex and, potentially, more time-consuming (depending on the functionality that will constitute a “unit”).
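As a rough illustration, dedicated test cases for that password rule could look like the sketch below. The isValid method is a hypothetical implementation written only for this example, and the five cases are one possible selection, not a definitive list.

```java
class PasswordRuleSketch {

    // Hypothetical validator for the rule: at least eight characters and at least one digit.
    static boolean isValid(String password) {
        if (password == null || password.length() < 8) {
            return false;
        }
        for (int i = 0; i < password.length(); i++) {
            if (Character.isDigit(password.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Tiny assertion helper so that the sketch does not depend on a test framework.
    static void check(boolean condition, String caseName) {
        if (!condition) {
            throw new AssertionError(caseName);
        }
    }

    public static void main(String[] args) {
        check(isValid("abcdefg1"), "eight characters with a digit is valid");
        check(!isValid("abcdef1"), "seven characters with a digit is too short");
        check(!isValid("abcdefgh"), "eight characters without a digit is invalid");
        check(isValid("abcdefghij1234"), "longer password with digits is valid");
        check(!isValid(""), "empty password is invalid");
    }
}
```

Each case is a one-liner because the class is tested alone. Driving the same five conditions through a larger “unit” would require setting up that unit’s other inputs and collaborators for every case.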

Furthermore, let’s imagine that I use this class as part of a bigger “unit” and one of the bigger unit tests fails in the future. How do I know which part of that “unit” is responsible for the failure? In some cases, I won’t know this immediately and will need to spend time pinpointing the root cause. With dedicated unit tests, on the other hand, I’ll know immediately that the problem is somewhere inside this class.

You could say: “well, then, that’s a special case of a module which consists of one single class”. That’s fine, but it’s not a rare exception. It’s commonplace. For example, I totally recommend that you employ the same approach with all algorithms in your application. Unit testing those in isolation is the best ROI you’ll ever get with unit tests. Many (most?) of these algorithms will fit nicely into a single class.

Even though the above discussion is not about mocks specifically, it is closely related. See, once I have dedicated unit tests for DataValidator, I no longer need to test it as part of bigger “units”. So, I can mock it out to reduce coupling between other tests and internal implementation details of this class. I guess this statement might sound confusing, given we’ve already established that mocking implies some level of coupling. So, how come I mock to reduce coupling? Let me explain.

When I use DataValidator as part of a bigger unit, I exercise its internal functionality in some of the unit tests of that unit (maybe even in all of them). These tests directly depend on the specifics of this class’ implementation, which constitutes the highest degree of coupling possible. This statement might sound surprising, especially if you’ve already bought into the “mocks increase coupling” mindset, but it’s a fact. For example, if DataValidator’s functionality changes in the future, some of the tests of the bigger unit will fail.

But even if I write dedicated tests and change the functionality of a class, some of those tests will fail. So, what’s the difference? The difference is the scope of the affected logic.

Imagine that DataValidator validates the format of a username and it’s a part of five different larger “units”. Now imagine that two years from now the business decides to change username validation rules and make them more restrictive. If you used the actual DataValidator class when you tested the larger “units”, you can potentially get failures in five different test suites. Oops.

If, however, you used a simple stub test-double in those unit tests and configured it to return true or false to simulate specific conditions, you’re totally fine. Since the actual algorithm is not part of the larger units anymore, its change doesn’t affect those unit tests (as long as the public API is stable). In other words, mocking of this class allowed me to reduce the coupling between unit tests and its internal implementation.
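Here is a rough sketch of that setup. Everything except the DataValidator name is hypothetical: I introduce a `UsernameValidator` interface so that the stub can be written by hand without a mocking framework, and `RegisterUserUseCase` stands in for one of the larger “units”.

```kotlin
// Hypothetical abstraction, introduced only to keep this sketch dependency-free.
fun interface UsernameValidator {
    fun isValid(username: String): Boolean
}

// The real rule (an assumed one). It is covered by its own dedicated unit tests.
class DataValidator : UsernameValidator {
    override fun isValid(username: String): Boolean =
        username.length >= 3 && username.all { it.isLetterOrDigit() }
}

// Hypothetical larger "unit" that depends on the validator.
class RegisterUserUseCase(private val validator: UsernameValidator) {
    val registered = mutableListOf<String>()

    fun register(username: String): Boolean {
        if (!validator.isValid(username)) return false
        registered += username
        return true
    }
}

fun main() {
    // Stub configured to return true: simulates "the username is valid".
    val accepting = RegisterUserUseCase(UsernameValidator { true })
    check(accepting.register("any input"))
    check(accepting.registered == listOf("any input"))

    // Stub configured to return false: simulates "the username is invalid".
    val rejecting = RegisterUserUseCase(UsernameValidator { false })
    check(!rejecting.register("any input"))
    check(rejecting.registered.isEmpty())
}
```

Note that neither check mentions what a valid username looks like. If the rule inside DataValidator becomes more restrictive, only its dedicated tests need to change.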

In addition, if you test DataValidator as part of five different units, does it mean that in all of them you’ll need to write exhaustive tests? As I wrote above, I’ll need about five test cases to cover something like “the password must have at least eight characters and contain at least one digit”. However, if I need to do that in five different units, then it’s 25 unit tests! And if you think that you can write exhaustive tests in just one unit and get away with superficial tests in the other four, just imagine maintaining a codebase with that assumption for years. Not that practical, in my opinion.

Lastly, since the speaker in that video said that Kent Beck’s TDD book is the best and the only book about unit testing you’ll need, let me quote from that book.

How do you test an object that relies on an expensive or complicated resource? Create a fake version of the resource that answers constants. […] Mock Objects encourage you down the path of carefully considering the visibility of every object, reducing the coupling in your designs. They add a risk to the project – what if the Mock Object doesn’t behave like the real object? You can reduce this strategy by having a set of tests for the Mock Object that can also be applied to the real object when it becomes available.

Test-Driven Development by Example

Kent Beck evidently uses the term “mock object” to refer to test-doubles of type fake. Hopefully, you aren’t surprised by terminology differences anymore. Does Kent second the argument discussed in the previous section, that test-doubles of type mock shouldn’t be used? Not at all. Right after the section I quoted above, there is a discussion of “self shunt”:

How do you test that one object communicates correctly with another? Have the object under test communicate with the test case instead of with the object it expects.

Test-Driven Development by Example

The examples that follow show a manual implementation of test-doubles of type mock. So, Kent Beck was totally fine with using both fakes and mocks in unit tests. In fact, there is also a discussion of “crash test dummy” there, which is a test-double of type stub that throws an exception.
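To give a feel for these two patterns, here is a small Kotlin sketch of a “self shunt” and a “crash test dummy”. This is not the code from the book. All the names (`Uploader`, `Storage`, `ProgressListener` and so on) are hypothetical, and plain `check` calls stand in for a test framework.

```kotlin
// All names here are hypothetical and exist only to illustrate the two patterns.
interface ProgressListener {
    fun onProgress(percent: Int)
}

interface Storage {
    fun save(data: String)
}

// Hypothetical object under test: saves data and reports completion.
class Uploader(private val storage: Storage, private val listener: ProgressListener) {
    fun upload(data: String): Boolean =
        try {
            storage.save(data)
            listener.onProgress(100)
            true
        } catch (e: IllegalStateException) {
            false
        }
}

// Plain stub that accepts every write and does nothing.
class NoOpStorage : Storage {
    override fun save(data: String) {}
}

// "Crash test dummy": a stub whose only job is to throw.
class FailingStorage : Storage {
    override fun save(data: String) {
        throw IllegalStateException("simulated storage failure")
    }
}

// "Self shunt": the test case itself implements the collaborator's interface
// and records what the object under test tells it.
class UploaderTest : ProgressListener {
    private val reported = mutableListOf<Int>()

    override fun onProgress(percent: Int) {
        reported += percent
    }

    fun reportsCompletionOnSuccess() {
        reported.clear()
        check(Uploader(NoOpStorage(), this).upload("payload"))
        check(reported == listOf(100))
    }

    fun reportsNothingWhenStorageFails() {
        reported.clear()
        check(!Uploader(FailingStorage(), this).upload("payload"))
        check(reported.isEmpty())
    }
}

fun main() {
    val test = UploaderTest()
    test.reportsCompletionOnSuccess()
    test.reportsNothingWhenStorageFails()
}
```

The self shunt verifies an outgoing interaction, which is exactly what a test-double of type mock does. It is simply written by hand instead of being generated by a framework.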

Hopefully, after all this discussion, you see that the argument “avoid mocks (i.e. test-doubles) in your unit tests” is built on top of a very, very shaky foundation. In fact, I’m yet to see any kind of actionable tutorial, course, or even just a real-world open-source project that would demonstrate this approach. The speaker in the video we’re reviewing here, for example, has been giving this same talk for years, but didn’t produce any content to help the audience transition from theory to practice. And that’s not an exception. I’ve read numerous articles on the subject and watched numerous talks, and I’m more than willing to pay to learn. But, as far as I can see, the “avoid test-doubles” discussion stays purely theoretical.

If I had to distill something useful out of the “avoid test-doubles” argument, I’d summarize it as follows: “You can test each class in isolation by mocking all its dependencies, but, sometimes, it won’t be the optimal strategy. Unit testing real implementations of several interconnected classes as a single unit is a perfectly legitimate approach as well. Be careful not to abuse test doubles”.

Conclusion

That’s the end of this post. Seriously.

The main takeaway from this article is this:

There is no single “no mocks” movement among developers. Instead, there are several “no mocks” camps which advocate for completely different ideas. Some of these ideas are mutually exclusive. “No mocks” advocates often use arguments that sound clever and impressive, but turn out to be fundamentally incorrect on closer inspection in many cases. When it comes to showing actual practical examples, you’ll often find the most trivial code, showing a carefully cherry-picked scenario. But even those often contain fundamental errors and, in some cases, prove the opposite of their premise. In other cases, the discussion stays purely theoretical and it’s not even clear how to apply it to your actual day-to-day work.

Sounds harsh? It is. But that’s my professional opinion, based not just on the three resources I reviewed in this post, but on a much larger sample. I, personally, use all “mocks” which were covered in this post. But I also invested time to understand their respective use cases, so I don’t use any of them dogmatically.

One more consideration, if I might.

The first time I tried to produce a resource about unit testing which would be useful and wouldn’t be trivial, I ended up with this series of videos on YouTube. Almost an hour and a half of content to show the basics of TDD.

Then I wanted to make a resource which would teach developers the art of unit testing and TDD at a professional level. It resulted in this 5+ hour course.

The third time I dared to touch the topic of unit testing, I wanted to write a short post to express my opinion on one relatively widespread idea, which I always found to be odd. This post. It’s so “short” that I almost died of old age while writing it.

Therefore, following my experience with unit testing, and my experience teaching unit testing to others, I’m very skeptical of short articles that deal with fundamental aspects of this practice. I’m equally skeptical of conference talks which pack in a bunch of very far-reaching statements, but provide no resources to follow up with practice. Unit testing is a very difficult aspect of software development and I simply don’t think you can address its complexity in short bites. Unless, of course, it’s just a tutorial of a very specific technique.

I hope you weren’t bored too much while reading this article. As usual, leave your comments and questions below and subscribe if you liked the post.


3 comments on "To Mock, or Not to Mock, That is the Question"

  1. Thanks Vasiliy for the very detailed analysis of this issue. I have gained a lot more insight into this and hopefully will be able to apply it in my work. Thank you

  2. Well balanced article packed with heavily researched info and your field-tested real world experience. Nothing to add, I’m on a binge reading your blog and I’m very grateful for all this valuable content you create.
