Primitive Obsession

Primitive Obsession means being obsessed with primitives. Thanks for reading, see you next time!

But to elaborate more:

"Primitive Obsession is using primitive data types to represent domain ideas" #

It is also known as StringlyTyped, where important domain objects are represented as strings.

It is this:

int customerId = 42;

It is being obsessed with the seemingly convenient way that primitives, such as ints and strings, allow us to represent domain objects and ideas.

Having quite an addictive personality myself, I find the fact that I can do int customerId = 42 quite irresistible (heaven knows what'd happen if I ever discovered fags and booze!). But as with most things that are irresistible, there's a price to pay.

This isn't a long post. It describes the problem and some of the pitfalls of using primitives, and suggests an alternative.


💡 Update 2024: this post was updated to change the example code to use Vogen, a value object generator with
extremely low overhead compared to using primitives.

Primitive. Obsession. #

Primitive - "is a basic data type provided by a programming language" #
It's basic. It's provided by the people that did the programming language you program in. They didn't know about your domain objects.

Obsession - "the domination of one's thoughts or feelings by a persistent idea, image, desire, etc." #
The convenience of primitives makes them desirable and lets them dominate your thoughts (inappropriate joke removed) - but there's a price to be paid for that convenience...

Here's an example:

public class Customer
{
    public int Id { get; }
}

A customer ID likely cannot be fully represented by an int. An int can be negative or zero, but it's unlikely a customer ID can be. So, we have constraints on a customer ID. We can't represent or enforce those constraints on an int. An int has its own constraints, namely a fixed range of values (in C#, an int is always 32 bits, whatever the operating system), but this constraint has very little to do with identifying a customer (though it'd be a nice problem to have if you did have 2,147,483,647 customers!).

So, we need some validation to ensure the constraints of a customer ID are met. We could have a method on Customer that checks it. Then, if we use that int elsewhere, we could move that check out of Customer and into a common place and remember to call it every time we create a customer ID... everywhere except one place:

⚖️ Shalloway's Law #
Shalloway's Law says: "when N things need to change and N > 1, Shalloway will find at most N - 1 of these things."
Concretely: "When 5 things need to change, Shalloway will find at most 4 of these things."

But even if we could escape Shalloway's Law (which is impossible), we're still left with the fact that we have a primitive representing a domain idea. Let's say we're dealing with a customer whose ID is 42. In your programming language, it is likely that "42 == 42". This is intentional; the primitives come with your programming language - but in the real world of managing customers, "42 != 42"! That's because the 2nd "42" that we're comparing with is actually a Supplier ID. But on a primitive level, they're both the same. To give a more concrete example:

public void DoSomething(int customerId, int supplierId, int amount)

and a caller...

_something.DoSomething(_supplierId, _customerId, _amount);

We've messed up the order of the parameters but our compiler won't tell us. The best we can hope for is a failing unit test, but given the contrived data often used in unit tests, it could be that the data will hide the problem by using the same ID for customer and supplier.
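To see how quietly this goes wrong, here is a minimal, self-contained sketch. The `Ledger` and `SwapDemo` classes and the `Describe` method are hypothetical stand-ins for `DoSomething` and its caller, invented purely for illustration.

```csharp
// Hypothetical stand-in for the DoSomething method above.
public static class Ledger
{
    public static string Describe(int customerId, int supplierId, int amount) =>
        $"customer={customerId}, supplier={supplierId}, amount={amount}";
}

public static class SwapDemo
{
    public static string Run()
    {
        int customerId = 42;
        int supplierId = 7;
        int amount = 100;

        // The first two arguments are in the wrong order,
        // yet this compiles without so much as a warning.
        return Ledger.Describe(supplierId, customerId, amount);
    }
}
```

`SwapDemo.Run()` reports the customer as 7 and the supplier as 42. Had the test data used 42 for both IDs, the output would have looked perfectly correct.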

What's the answer? #

A ValueObject.

A ValueObject is something that's comparable by its content, not its identity. The examples below are from my
.NET library named Vogen, which is a source generator and a set of
analyzers. ValueObject implementations are trivial and quite general, so roll your own if you like. Essentially, a
ValueObject is just a type that wraps a primitive. By wrapping the primitive, you're giving it more meaning. An int becomes a CustomerId:

[ValueObject<int>]
public partial struct CustomerId
{
}

Here it is again with some validation:

[ValueObject<int>]
public partial struct CustomerId
{
    // Vogen passes the underlying primitive to Validate.
    private static Validation Validate(int input) => input > 0
        ? Validation.Ok
        : Validation.Invalid("Customer IDs cannot be zero or negative.");
}
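If you would rather roll your own, the same idea can be sketched by hand. This is only an illustration, not what Vogen generates: the choice of a `readonly record struct`, the private constructor, and the use of `ArgumentOutOfRangeException` are all assumptions made for the sketch.

```csharp
using System;

// A hand-rolled value object: a sketch, not the code Vogen generates.
public readonly record struct CustomerId
{
    public int Value { get; }

    // Private constructor: callers have to go through From, which validates.
    private CustomerId(int value) => Value = value;

    public static CustomerId From(int value)
    {
        if (value <= 0)
        {
            throw new ArgumentOutOfRangeException(
                nameof(value), value, "Customer IDs cannot be zero or negative.");
        }

        return new CustomerId(value);
    }
}
```

Because it is a record struct, two instances built from the same int compare as equal, which is the "comparable by its content" part. One caveat with this sketch: because it is a struct, `default(CustomerId)` still compiles and gives a `Value` of 0 without ever going through `From`.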

So, the Customer type from above now becomes:

public class Customer
{
    public CustomerId Id { get; }
}

To create a ValueObject, use:

CustomerId id = CustomerId.From(123);

If validation fails, then the object should not be created. The beauty of ValueObjects is that they represent a value in your domain. If a CustomerId can never be zero or negative, then it should be impossible to create one in that state. If the representation of an invalid ValueObject is required, then we can declare an instance of it, e.g.

[ValueObject<int>]
public partial struct CustomerId
{
    // A known instance, declared by the type itself.
    public static readonly CustomerId Invalid = new(-1);

    private static Validation Validate(int input) => input > 0
        ? Validation.Ok
        : Validation.Invalid("Customer IDs cannot be zero or negative.");
}

This allows us (the domain author) to specify what an invalid instance looks like, but prevents anyone else from
creating one. For example, when retrieving data from a legacy system, the IDs might not be present, so you do something like:

CustomerId id = primitiveFromLegacySystem == -1 ? CustomerId.Invalid : CustomerId.From(primitiveFromLegacySystem);

For Vogen, a ValueObjectValidationException exception is thrown when validation fails.
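The same pattern can be sketched without the library. Everything here is an assumption made for illustration: the hand-rolled `CustomerId`, the hypothetical `LegacyImport` helper, and `ArgumentOutOfRangeException` standing in for Vogen's `ValueObjectValidationException`.

```csharp
using System;

public readonly record struct CustomerId
{
    // Built with the private constructor, so it deliberately skips validation.
    public static readonly CustomerId Invalid = new(-1);

    public int Value { get; }

    private CustomerId(int value) => Value = value;

    // Everyone else has to come through From, which still rejects -1.
    public static CustomerId From(int value) =>
        value > 0
            ? new CustomerId(value)
            : throw new ArgumentOutOfRangeException(
                nameof(value), value, "Customer IDs cannot be zero or negative.");
}

public static class LegacyImport
{
    // Hypothetical mapping: the legacy system uses -1 to mean "no ID".
    public static CustomerId ToCustomerId(int primitiveFromLegacySystem) =>
        primitiveFromLegacySystem == -1
            ? CustomerId.Invalid
            : CustomerId.From(primitiveFromLegacySystem);
}
```

`LegacyImport.ToCustomerId(-1)` hands back `CustomerId.Invalid`, while `CustomerId.From(-1)` throws. Only the type itself gets to decide what an invalid instance looks like.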

Having all validation in one place means we've satisfied DRY (Don't Repeat Yourself), and, if we're careful, we might just have escaped Shalloway's Law!

Now we can rewrite the DoSomething method above, and instead of:

public void DoSomething(int customerId, int supplierId, int amount)

we can have:

public void DoSomething(CustomerId customerId, SupplierId supplierId, Amount amount)

We'll find that, given the code below, we can no longer call the method incorrectly:

CustomerId _customerId;
SupplierId _supplierId;
Amount _amount;

DoSomething(_supplierId, _customerId, _amount);
Error: Argument 1: cannot convert from 'SupplierId' to 'CustomerId'
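Here is a compilable sketch of the same point, assuming bare-bones wrappers with no validation. The positional record structs and the `Orders` class are hypothetical and only there to give each idea its own type. The two calls that the compiler rejects are left in as comments.

```csharp
// Bare-bones wrappers with no validation, just enough to give each idea its own type.
public readonly record struct CustomerId(int Value);
public readonly record struct SupplierId(int Value);
public readonly record struct Amount(int Value);

public static class Orders
{
    // Hypothetical stand-in for DoSomething.
    public static string Describe(CustomerId customerId, SupplierId supplierId, Amount amount) =>
        $"customer={customerId.Value}, supplier={supplierId.Value}, amount={amount.Value}";

    public static string Run()
    {
        var customerId = new CustomerId(42);
        var supplierId = new SupplierId(42);
        var amount = new Amount(100);

        // Neither of these lines compiles if uncommented:
        // Describe(supplierId, customerId, amount); // cannot convert 'SupplierId' to 'CustomerId'
        // bool same = customerId == supplierId;     // == is not defined between the two types

        return Describe(customerId, supplierId, amount);
    }
}
```

Both IDs wrap the int 42, yet the compiler refuses to compare them or to swap them. That is the "42 != 42" from earlier, now enforced by the type system rather than by a lucky unit test.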

ValueObjects are easy to use, but... #

ValueObjects are easy to use, but they're not as easy to use as primitives.
The difference might only be a line or two of code, e.g.

int customerId;

customerId = 42;

as opposed to

[ValueObject<int>]
partial struct CustomerId;

var customerId = CustomerId.From(42);

You initially think 'I'll just quickly use a primitive, it'll be fine', but then find you need to validate it, and then find you create and use them all over the place! Your code is eventually scattered with hundreds of method parameters all taking primitives and you have no certainty if you're dealing with valid domain objects or potentially invalid primitive values.

It can sometimes be a tricky thing to sell to your team when you want to introduce ValueObjects. Some languages have them built in, so it's not a problem. But if your language doesn't have them built in, then the allure of the seeming simplicity of primitives is difficult to break. The key word here is "seeming": you pay a cost at some point, and that cost is repetition, namely repetition of validation (has someone validated before you got it? who knows! better do it again!), and the longer it's left, the greater the cost.

What have we seen? #

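We've seen the problem with Primitive Obsession, some of the pitfalls of using primitives, and an alternative: ValueObjects.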

Hopefully this post has been interesting and useful. Thank you for reading and please feel free to use the comments below and/or share via Twitter etc.
