Prefer test-doubles over mocking frameworks

Mocking frameworks are a path to insanity! They start out great. They're a great way to mimic methods. They initially save time. But they're a gateway drug that will ultimately break your encapsulation, break your builds, break your tests, and break your mental capacity to understand what's going on!

This post describes why I prefer using in-memory test doubles over mocking frameworks for all but the simplest of scenarios in unit tests. The examples are in C# and use NSubstitute as the mocking framework, and FluentAssertions for the asserts, but the idea applies to other languages and mocking frameworks.

For a bit of background: I used to love mocking frameworks! I marvelled at their use of generics. It was a personal challenge to see how far I could push the syntax of these frameworks, and I revelled in the satisfaction of controlling types that didn't actually exist. But gradually, the novelty wore off. The syntax became jarring. Other contributors had difficulty reading the tests. And changing signatures of methods in the code being tested would either cause compilation errors in the tests, or cause the tests to fail. I eventually concluded that creating small, simple test doubles was a better approach.

First, a quick recap of the differences and what they give us:

And next, just a quick recap of why we use either of them: when testing the behaviour of something (often called the System Under Test, or SUT), we often want to swap out certain dependencies that the SUT has. For example, dependencies that write to a database. We may not want to write to a real database in our unit tests, so we need to replace it with a test double or a fake.

What follows is why I think manually created test doubles are preferable to mocking frameworks.

Test doubles focus on behaviour, not implementation

When you write a test double, you focus on mimicking the behaviour of the real production code.
When you use a mocking framework, you focus on which methods are called with which parameters.

Having such intimate knowledge breaks encapsulation and is a burden. It tightly couples your test setup code to actual method signatures in the production code. And when a method signature changes, you will likely need to update all the places in your test setup code that have set up the call to that method. But with a test double, you only need to update the test double itself, usually in just one place.

To demonstrate, consider this interface and type that uses it:

public interface IProductRepository
{
    void Store(Product product);
    Product Get(int id);
}

public class ProductService(IProductRepository _productRepository)
{
    public void OnboardNewProduct(int id, string name) =>
        _productRepository.Store(new Product(id, name));
}

Here's a test using NSubstitute:

[Fact]
public void Using_mocks()
{
    var repo = Substitute.For<IProductRepository>();
    var sut = new ProductService(repo);

    sut.OnboardNewProduct(123, "Product 123");

    repo.Received().Store(Arg.Is<Product>(p => p.Id == 123));
}

There are a couple of things wrong with this approach. Aside from needing to understand the mocking framework syntax to read and modify the test, the test has explicit knowledge of the implementation of the OnboardNewProduct method. It knows that somewhere in the implementation, it calls a method named Store and it knows the parameters that are provided to that method.

Your test doesn't need to know this, it only needs to know that the product is stored in the repository. To explain further, here's the test double and revised test:

public class InMemoryProductRepository : IProductRepository
{
    private readonly List<Product> _products = new();

    public void Store(Product product) => _products.Add(product);
    public Product Get(int id) => _products.FirstOrDefault(p => p.Id == id);

    // This is not part of the interface, but is useful for testing
    public bool DidStore(int id) => Get(id) is not null;
}

And here's the test that uses the test double:

[Fact]
public void Using_test_doubles()
{
    var repo = new InMemoryProductRepository();

    var sut = new ProductService(repo);

    sut.OnboardNewProduct(123, "Product 123");

    repo.DidStore(123).Should().BeTrue();
}

The test no longer needs to have the intimate knowledge of specific methods being called. The only thing it needs to know is that the observable behaviour is correct. It does this with the call to repo.DidStore(...).

The main point to note here: if the signature of the Store method changes, then the test code only needs to be updated in one place, in InMemoryProductRepository. And if you use a refactoring tool, it will likely update the interface and the test double together.

One of the biggest objections I hear when suggesting test doubles is that it is more code and more typing. But generally, you type this stuff once, and it is read thousands of times. So if you optimise for reading, you're saving yourself time in the long run. And the test double evolves with the real code; changes happen in a single place, rather than being spread out across all the test setup code (more on that in the next item).

Another scenario that I often use is to tell the test double to always throw an exception. Let's see how that looks using both approaches. Here's the test using a mocking framework:

[Fact]
public void Could_not_store_new_product_using_mocks()
{
    var repo = Substitute.For<IProductRepository>();

    repo.When(x => x.Store(Arg.Any<Product>()))                  // 👈
        .Do(x => throw new InvalidOperationException("oh no!")); // 👈

    var sut = new ProductService(repo);

    Action a = () => sut.OnboardNewProduct(123, "Product 123");

    a.Should().ThrowExactly<InvalidOperationException>()
        .WithMessage("oh no!");
}

Again, the test setup needs to know explicit implementation details, namely that a method named Store is called and that it takes a Product argument.

Here's how to do the same thing with a test double. The test double from above needs modifying:

public class InMemoryProductRepository : IProductRepository
{
    private readonly List<Product> _products = new();
    private Exception? _alwaysThrowsWhenStoring; // 👈

    public void Store(Product product)
    {
        if (_alwaysThrowsWhenStoring is not null) // 👈
            throw _alwaysThrowsWhenStoring;       // 👈

        _products.Add(product);
    }

    public Product Get(int id) => _products.FirstOrDefault(p => p.Id == id);

    public bool DidStore(int id) => Get(id) is not null;

    public InMemoryProductRepository AlwaysThrowsWhenStoring(Exception e) // 👈
    {                                                                     // 👈
        _alwaysThrowsWhenStoring = e;                                     // 👈
        return this;                                                      // 👈
    }
}

The test setup now needs only a simple one-liner to say that storing throws an exception; no method name is needed:

[Fact]
public void Could_not_store_new_product_using_test_doubles()
{
    var repo = new InMemoryProductRepository()
        .AlwaysThrowsWhenStoring(new InvalidOperationException("oh no!")); // 👈

    var sut = new ProductService(repo);

    Action a = () => sut.OnboardNewProduct(123, "Product 123");

    a.Should().ThrowExactly<InvalidOperationException>()
        .WithMessage("oh no!");
}

The differences between the two are:

// mocking
var repo = Substitute.For<IProductRepository>();
repo.When(x => x.Store(Arg.Any<Product>()))
    .Do(x => throw new InvalidOperationException("oh no!"));

// test double
var repo = new InMemoryProductRepository()
    .AlwaysThrowsWhenStoring(new InvalidOperationException("oh no!"));

I think you'll agree that the test double version is more readable, which brings us onto the next point...

Readability and maintainability

Test doubles promote cleaner, more understandable code because the logic is in a separate class, not intertwined with test setup code.

In the example above, if a reader wanted to know how the repository is imitated, they'd just take a look at the test double to see what it does. That knowledge is in one place, as opposed to dozens or hundreds of mocked method calls spread throughout the tests. Also, it is clear where to add convenience methods such as DidStore.

DRY: the examples here are tiny, but in a real codebase there are likely hundreds or even thousands of calls setting up particular methods. A lot of this is duplication (and 'wordy' duplication at that), which usually ends up obscuring the real intent of the test.

And speaking of the DidStore method: such methods make the tests easier to read as you're asking questions of the behaviour, not individual methods. Which of the following would you prefer to read?

// mocking framework
repo.Received().Store(Arg.Is<Product>(p => p.Id == 123));

// or, using a test double
repo.DidStore(123).Should().BeTrue();

Better simulation of real scenarios

Test doubles can mimic real-life scenarios more effectively than mocking frameworks. They can preserve state across multiple method calls, which can be challenging to do with mocking frameworks. Mocking frameworks tend to be very specific and focused on individual calls, making them less suited to complex or state-dependent behaviour.
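
As a rough illustration of state surviving across calls, here is a minimal sketch that needs no test or mocking packages. The Product record shape is an assumption (the post never shows its definition), Get is declared as returning Product? to keep nullable warnings quiet, and StatefulDoubleDemo is a hypothetical name used only for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Product; the real type is not shown in this post.
public record Product(int Id, string Name);

public interface IProductRepository
{
    void Store(Product product);
    Product? Get(int id);
}

public class InMemoryProductRepository : IProductRepository
{
    private readonly List<Product> _products = new();

    public void Store(Product product) => _products.Add(product);
    public Product? Get(int id) => _products.FirstOrDefault(p => p.Id == id);
}

// Hypothetical helper: stores two products, then reads one back
// through the same double without any per-call setup.
public static class StatefulDoubleDemo
{
    public static string? NameOfProduct(int id)
    {
        IProductRepository repo = new InMemoryProductRepository();
        repo.Store(new Product(1, "First"));
        repo.Store(new Product(2, "Second"));
        return repo.Get(id)?.Name;
    }
}
```

Whatever was stored is what comes back, and an unknown ID yields null. With a mocking framework, the return value of Get would typically have to be configured separately from the call to Store.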

Performance

Test doubles can be more performant than mocking frameworks. With test doubles, there is no need for runtime generation of proxies or for methods to be intercepted and recorded. Performance may or may not be a problem for you, but I've seen it make a huge difference with thousands of tests. I've also seen this make a big difference when using automated test runners, such as the ReSharper/Rider continuous tests or NCrunch.

More expressiveness

In your test doubles, you can add a fluent interface.

In software engineering, a fluent interface is an object-oriented API whose design relies extensively on method chaining. Its goal is to increase code legibility by creating a domain-specific language (DSL). The term was coined in 2005 by Eric Evans and Martin Fowler.

For instance, you may want to just set up a test double to always throw an exception, or always return a specific value. Here's an example showing how that looks in your test:

[Fact]
public void TestGetProduct()
{
    var repo = new InMemoryProductRepository()
        .AlwaysReturns(new Product(123, "Test Product"));

    var sut = new MyService(repo);

    // any call to Get will return the product

    ...
}

AlwaysReturns is a fluent method that returns the type itself so that method calls can be chained. Here's what it looks like in the test double:

public InMemoryProductRepository AlwaysReturns(Product p)
{
    _alwaysReturns = p;
    return this;
}

Note that we've changed the behaviour of the test double to always return a specific product. We didn't change the test setup code to say that "when the Get method is called, with an ID parameter of any value, then return this particular product". We merely said "This repository always returns this product". This is an important consideration; we don't care how this is done (e.g. someone calls the Get method), we leave that to the in-memory abstraction (more on abstractions next).
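
The snippet above doesn't show how Get consults the configured product, so here is one way it might look. This is a sketch under assumptions: the _alwaysReturns field is nullable, a configured product takes priority over anything stored, and the Product record shape is assumed as before.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Product; the real type is not shown in this post.
public record Product(int Id, string Name);

public interface IProductRepository
{
    void Store(Product product);
    Product? Get(int id);
}

public class InMemoryProductRepository : IProductRepository
{
    private readonly List<Product> _products = new();
    private Product? _alwaysReturns;

    public void Store(Product product) => _products.Add(product);

    // Assumption: a configured product wins over stored state;
    // otherwise fall back to the normal in-memory lookup.
    public Product? Get(int id) =>
        _alwaysReturns ?? _products.FirstOrDefault(p => p.Id == id);

    public InMemoryProductRepository AlwaysReturns(Product p)
    {
        _alwaysReturns = p;
        return this;
    }
}
```

The decision about which member serves the canned value lives inside the double, so the tests never have to name Get at all.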

More focused abstractions

If your interfaces are small (and they should be), then writing test doubles should be trivial. But if your interfaces are big, then writing test doubles for them is not going to be trivial. The test doubles will end up with reams of code that will be difficult to follow and difficult to evolve along with the real production code. In this case, you are no better off than using a mocking framework.

If you find it difficult or onerous writing a simple imitation of an interface, then that is likely a sign that the interface is too large and needs to be broken down into smaller, more focused abstractions.
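
To make that concrete, here is a hypothetical split of the repository into a writer role and a reader role. IProductWriter, IProductReader, InMemoryProducts and ProductOnboarder are names invented for this sketch, and the Product record shape is assumed. It uses a C# 12 primary constructor, as the earlier examples do.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Product; the real type is not shown in this post.
public record Product(int Id, string Name);

// Hypothetical role-specific interfaces, each with a single member.
public interface IProductWriter
{
    void Store(Product product);
}

public interface IProductReader
{
    Product? Get(int id);
}

// One small double can still implement both roles.
public class InMemoryProducts : IProductWriter, IProductReader
{
    private readonly List<Product> _products = new();

    public void Store(Product product) => _products.Add(product);
    public Product? Get(int id) => _products.FirstOrDefault(p => p.Id == id);
}

// A consumer that only writes depends only on the writer role.
public class ProductOnboarder(IProductWriter writer)
{
    public void Onboard(int id, string name) =>
        writer.Store(new Product(id, name));
}
```

Each interface is small enough that its imitation is a line or two, and a consumer states exactly which role it needs.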

Earlier, I described that a common objection from fellow developers to using test doubles is "But it's more typing". This objection is compounded and affirmed when you have overly large interfaces. The correlation between a large interface and complexity isn't always obvious to them, but what is obvious to them is that mocking frameworks make their tests easier to write, and they therefore regard mocking frameworks as good.

🤔 If your imitation (test double) is difficult to write and understand, just imagine how difficult the real implementation will be to write and understand!

What have we seen?

While mocking frameworks have their uses, in-memory test doubles provide a robust, clear, and efficient approach to testing. They isolate tests from specific implementation details, make the code more readable, provide resilience to refactoring, and more effectively simulate real-world scenarios.

Remember, the goal of tests is not just to ensure that the system works as expected but also to make the codebase more maintainable. By using in-memory test doubles, you're taking a step towards writing clean, maintainable, and robust tests.

The overall theme here has been the benefits of focusing on behaviour and not implementation.

FAQ

Here are a couple of things that I've heard asked:

How can I test that my SUT has called the right method with the correct parameters?

This is testing implementation and not behaviour. Your SUT called something and there is likely an observable side-effect of that. Test the side-effect and not that a particular method was called. If the code is refactored (e.g. you change the implementation but not the behaviour), then your test that checked that a method was called will likely break, but your test that tested the behaviour should remain unchanged and should still pass.
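
Here is a small sketch of that refactoring scenario. StoreMany and the two service versions are hypothetical, added only to show an implementation change that leaves the behaviour alone; the Product record shape is assumed as well.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Product; the real type is not shown in this post.
public record Product(int Id, string Name);

// StoreMany is a hypothetical addition, used only to show a refactoring.
public interface IProductRepository
{
    void Store(Product product);
    void StoreMany(IEnumerable<Product> products);
    Product? Get(int id);
}

public class InMemoryProductRepository : IProductRepository
{
    private readonly List<Product> _products = new();

    public void Store(Product product) => _products.Add(product);
    public void StoreMany(IEnumerable<Product> products) => _products.AddRange(products);
    public Product? Get(int id) => _products.FirstOrDefault(p => p.Id == id);

    public bool DidStore(int id) => Get(id) is not null;
}

// Before the refactoring: stores through Store.
public class ProductServiceV1(IProductRepository repo)
{
    public void OnboardNewProduct(int id, string name) =>
        repo.Store(new Product(id, name));
}

// After the refactoring: same behaviour, but through StoreMany.
public class ProductServiceV2(IProductRepository repo)
{
    public void OnboardNewProduct(int id, string name) =>
        repo.StoreMany(new[] { new Product(id, name) });
}

// The same behavioural check is applied to both versions.
public static class RefactoringDemo
{
    public static bool V1StoresProduct()
    {
        var repo = new InMemoryProductRepository();
        new ProductServiceV1(repo).OnboardNewProduct(123, "Product 123");
        return repo.DidStore(123);
    }

    public static bool V2StoresProduct()
    {
        var repo = new InMemoryProductRepository();
        new ProductServiceV2(repo).OnboardNewProduct(123, "Product 123");
        return repo.DidStore(123);
    }
}
```

Both checks ask only whether the product ended up in the repository. A check along the lines of repo.Received().Store(...) would be expected to fail against the second version, because Store is never called there.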

How is DRY violated? Can't I just move duplicated setup code into a shared method in a 'TestHelper' class?

You could, but the problem remains of needing explicit knowledge of what methods need to be called. The method that you just created in your TestHelper class to avoid duplication will likely take some parameters that relate to the method that needs to be set up. So, instead of changing hundreds of lines of code that used to set up a method, you're now changing hundreds of lines of code that call your TestHelper method.

By focusing on behaviour and driving that behaviour with descriptively named methods in the test double itself (e.g. AlwaysThrowsWhenStoring), you both remove duplication and reduce the chances that you'll need to change parameters.

8 comments on this page

  • jofla

    commented

    I don't like in-memory repositories; I prefer mocking if you need to unit test some code that has dependencies. In my experience it is pointless keeping the test doubles updated when you have complex logic; on top of that, it is code that is never going to touch production.

    Nowadays, I like to limit my unit tests to pure functions/ methods.

  • Jhon

    commented

    Thanks for this article, I agree with this. One small mistake in Using_mocks() method, you meant to pass repo instead of "mock"

  • Jason Bock

    commented

    Your post's motivated me to make a video on my views of test doubles and mocks - https://www.youtube.com/watch?v=n__1wXwljtk

  • Krishna

    commented

    Thanks for the write up, it was illuminating!

  • Alex Rampp

    commented

    I like this approach since it keeps things simple and maintainable. But often people complain that they have to write additional test doubles. I think we need to point out more the advantages on maintainability which have more a mid term focus and do not pay out when writing the very first test.

    Btw.: there's a typo in the first code example using the test double. You instantiate the in memory repository in the variable 'repo' but pass it as 'testdouble' to the system under test.

  • nope

    commented

    "Your test doesn't need to know this, it only needs to know that the product is stored in the repository. To explain further, here's the test double and revised test" and so on is just smoke in the eyes.

    Mock leaking abstraction by exposing internals? Yup. Binds tightly tests to that internals? Partially, it binds to the interface of IRepository. But yeah, you could call that 'internals' since we're testing ProductService (it's not 'internals', it's DEPENDENCIES, but I'll let it slide). Modify "internals" means correct all tests? Yup. That's mocks and interfaces. Plain in the eyes.

    But now let's see your test doubles. You wrote i.e. InMemoryProductRepository. You have to implement the whole interface, even if 99% is not used in the test. You have to manually add switches and tricks like AlwaysThrowsWhenStoring. Is it all bound to internals? Oh yes! Since it basically refers to the same interface as the mocks, it also has to refer to methods like 'Store' so it has the same binding to the 'internals' like that was in the previous case.

    But there's a difference. Now the 'test double' is bound to that, not the test itself. Ok. Let's say that's a point.

    Now, let's look what happens NEXT.

    You will have 10, 100, 1000 tests like that. Either you will create test doubles for each test separately (and cry at the duplication), or sooner or later you will try to reuse test doubles. This means that all the SWITCHes and HACKs like AlwaysThrowsWhenStoring will start building up within the implementation of the test double. You will have to maintain, design, plan out, untangle, everything, or you will end up with an absolute ball of mud that everyone will be afraid to touch and that everyone will just add more new switches on top of old switches just to ensure that no old tests break.

    Ok. So for your 100s test(cases), depending on how much you care about impl of doubles, you either have 30+ slim hardly-reusable test doubles for IRepository, or you have just 1-3-maybe-5 of them, highly-reusable, ugly inside (if they just grew) or nice inside (if you spent time to design&refactor&..). Now let's say something in 'internals' changed. Sure, you don't have to correct 100s of tests. With mocks you probably would need to adjust not every single test, but those tests that touch the area that changed. Say, 40 tests. With doubles, you have to correct EVERY single double. You had lots of simple slim doubles? You have a PITA. You had a few reusable hairball/mudball-style test double? You have a PITA - adjusting a mudball in a non-breaking way is hard. Most probably you or your team mate will just add more switches for this one new test and thus grow the mudball. If you had 1-2 nicely designed polished doubles, yes, you won! You just update the double's impl in that 1-2 places, and done! But even with, or especially with nicely designed doubles, if a serious change shows up, you will want to adjust its public API (consider 'DidStore'), just to keep it nicely&cleanly&designed, and then you have to get back to correcting related tests. So, win, but kinda partial win.

    Now. Let's see those ugly abstraction-leaking mocks. We all know plain utility methods. Factories. Helpers. Extension methods. Code duplication because you have to set up a mock the same way in a few tests? Yeah. Maybe. But no. Just wrap that setup into a utility method and have fun reusing. Lots of such setups? Just wrap it up again, or make a mock factory/helper/whatever so it's easier to find and deduplicate. Repeated Verify/Assert like 'received().store()'? Just wrap that into a method or extension method, again. You probably see where I am going to, so I'll stop.

    Every single thing you wrote against mocks is smoke screen. Those issues can be solved with any other typical solutions for those issues. You don't need test doubles to solve them.

    Most of contemporary mocking frameworks (like Moq you mentioned) are all about 'behavior' as well. The 'expose internals' part is there just because it has to be somewhere. Either in the plain sight, or wrapped up with helper methods for dedupe'ing, or wrapped up in doubles, but it has to be somewhere, and that's because of DEPENDENCIES and INTERFACES like IRepository, not because of mocks/doubles/younameyourtools. And on top of 'behavior' those contemporary mocking frameworks go out of their ways to assist you in SKIPPING what's not needed and on SPECIFYING ONLY WHAT A TEST NEEDS. If you have IRepository with 100 precisely shaped querying methods, your test touches one, your mock will specify just one, and ignore 99. Your test double will throw-not-implemented 99 of them. For godssake, please, no.

    Mocks and test doubles are tools. You pick tools depending on the case. You have wide, complex interface, pick mocking framework and trim. You have dead simple interface, make a test double. You like writing classes and boilerplate, write test doubles. You like lambdas and combining simple functions into larger blocks, pick mocks. Your nice and coherent test double deteriorated into several loose blocks, each used in 1 or 2 places? Split it into short mocks written on-site-of-use. Your simple one-liner mock grew with captures and semi-persistent temporary storage so pair of methods like Add/Find actually kinda work? Scrap it and convert to a test double.

    Do NOT "Prefer test-doubles over mocking frameworks".

    Instead, consider mocks or test doubles equal and choose them where they make more sense and are easier to maintain in the foreseeable future.

  • James Moore

    commented

    If you find yourself reaching for tools like this, stop. You need a full end to end test that actually hits the real database or service.

  • Marc

    commented

    I like that more people are talking thoughtfully about friction points in unit tests.

    Some devs & teams might already have a lot of experience with mocking frameworks. And at a point, simple mocking code might be simpler to read across a codebase than many new and unique test double classes.

    Thinking out loud, I wonder if it's better to use custom test doubles for the most complicated parts of the codebase rather than as a default approach.
