What Is Primitive Obsession And How To Avoid It

What primitive obsession is, the different forms it takes, the consequences it can have, and a solution for each case

Posted by : Giannis Akritidis on Jun 11, 2024

Category : Architecture C#

What Is Primitive Obsession

An important aspect of object-oriented programming is the ability to create custom types. For users, a type is defined by its behaviors, not its data. For example, a 32-bit integer and a 32-bit float contain the same data, 32 bits that can be zero or one, but there is a distinct difference between an int and a float because of their differing behaviors (or operations, in the context of numeric types).

Every OOP language provides some basic types, known as primitives, which can be used to construct more complex types.

Primitive obsession is the use of primitive types (e.g., int, float, string) to represent higher-level concepts. While not always bad, excessive reliance on primitive types can negatively impact a program’s readability, ease of maintenance, debugging, and correctness.

Although primitive obsession is a code smell, it is also very easy to avoid. A program often suffers from it because the programmer either isn’t aware of it or, more commonly, in a moment of laziness uses a primitive type instead of creating a custom one, and that choice then propagates throughout the code.

Let’s explore some common manifestations of the primitive obsession code smell, the problems it may cause, and solutions to it.

Primitives As Types

Often, a programmer uses a primitive type in place of a higher-level type. For example:

int age = 20;
float damage = 10f;

Here we use an int to represent age and a float to represent damage. If age and damage are important concepts in our program, representing them with primitive types in different places in our code can create several problems:

The Loss of Type Safety And Boundaries Checking

One of the problems that primitive obsession creates is that we lose the type safety of the higher-level type. In the previous example, the values that the age variable can have are a subset of the values an integer type can hold. For instance, an age variable can never be negative, and if it represents the age of an adult, it can never be below 18.

The same applies to our damage variable. Although it should never be negative, a mistake can easily be made, resulting in an invalid value being assigned to it. Even worse, someone could easily write something like this:

damage = age;

That obviously makes no sense, but because we don’t have type safety, the compiler won’t complain.

By using primitive types that are a superset of our intended type, we lack boundary checks and type safety. One could place guards throughout the code, but this increases the maintenance burden: any change to a rule has to be made in every place where a guard is used, instead of in a single new type with the appropriate guards built in.
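To make that maintenance cost concrete, here is a minimal sketch of guards scattered across the code. The Enemy and Shield classes and their members are hypothetical names chosen only for this illustration.

```csharp
using System;

// Hypothetical types: both must remember that damage can never be negative.
public class Enemy
{
    public float Health { get; private set; } = 100f;

    public void TakeDamage(float damage)
    {
        // Guard #1: repeated wherever a raw float stands in for damage.
        if (damage < 0f)
            throw new ArgumentOutOfRangeException(nameof(damage), "Damage cannot be negative.");

        Health = Math.Max(0f, Health - damage);
    }
}

public class Shield
{
    public float Strength { get; private set; } = 50f;

    public void Absorb(float damage)
    {
        // Guard #2: the same rule, duplicated. If the rule changes,
        // every copy has to be found and updated.
        if (damage < 0f)
            throw new ArgumentOutOfRangeException(nameof(damage), "Damage cannot be negative.");

        Strength = Math.Max(0f, Strength - damage);
    }
}
```

Every additional method that accepts a raw float for damage needs its own copy of the guard, and nothing in the type system tells us when one is missing.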

Behaviors In Primitives That Don’t Apply To Our Represented Type

Another problem that arises when using primitive types to represent domain concepts is that these types may include behaviors that make no sense for our concepts. For example, with our previous variables, dividing an age by another age or subtracting damage from damage would be allowed by the compiler without errors, because the compiler sees only an int and a float, not the specific concepts of age and damage.

If we had created our own types, we could define the appropriate behaviors and overload only the operators that make sense for those types. Any other operation allowed for the int and float types would result in a compiler error. Additionally, we could overload the checked operators with logic specific to our types, so that inside a checked context any overflow or underflow is detected at run time instead of silently wrapping around. For more information about checked operator overloading, you can check my post Encapsulation of primitive types and checked operator overloading in C# 11.
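As a rough sketch of this idea, the hypothetical Age type below defines only the one operator that makes sense for it (adding a number of years), in both its regular and its checked form. It assumes C# 11 or later for the checked operator syntax; the type and member names are illustrative and not taken from the post mentioned above.

```csharp
using System;

// Hypothetical wrapper around an int age, for illustration only.
public readonly record struct Age
{
    public int Years { get; }

    public Age(int years)
    {
        if (years < 0)
            throw new ArgumentOutOfRangeException(nameof(years), "Age cannot be negative.");
        Years = years;
    }

    // Used in an unchecked context: the int addition may wrap around,
    // but the constructor still rejects a negative result.
    public static Age operator +(Age age, int years) => new(unchecked(age.Years + years));

    // C# 11 checked operator, used inside checked(...) expressions and blocks:
    // an int overflow throws OverflowException at run time.
    public static Age operator checked +(Age age, int years) => new(checked(age.Years + years));

    // No operator / or * is defined, so `age / otherAge` does not compile.
}
```

With this type, `new Age(20) + 5` uses the regular operator and `checked(new Age(20) + 5)` uses the checked one, while dividing one Age by another is rejected by the compiler.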

The Implied If’s Problem

A problem that primitive obsession creates, and one that can make our code more complicated, harder to maintain, and difficult to reason about, is what I call the Implied If’s problem. In short, when we use primitives to represent a concept, we can inadvertently create variables that represent two different things depending on their current value. An example is an integer damage variable that represents damage when it has a positive value and healing when it has a negative value. This creates issues in our codebase because the if statement is never written down; it is only implied. Eventually, one implied if statement leads to more implied if statements, until a programmer spends more mental effort remembering all these implied rules than thinking of new solutions for their code.

You can read a more in-depth explanation of this in my post What Are The Implied If Statements In Code.
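The damage and healing example might look something like the following sketch. The Character, Damage, and Healing types are hypothetical and exist only to contrast the implied rule with an explicit one.

```csharp
using System;

// Hypothetical value objects: each one represents exactly one concept.
public readonly record struct Damage
{
    public int Amount { get; }

    public Damage(int amount)
    {
        if (amount < 0)
            throw new ArgumentOutOfRangeException(nameof(amount), "Damage cannot be negative.");
        Amount = amount;
    }
}

public readonly record struct Healing
{
    public int Amount { get; }

    public Healing(int amount)
    {
        if (amount < 0)
            throw new ArgumentOutOfRangeException(nameof(amount), "Healing cannot be negative.");
        Amount = amount;
    }
}

public class Character
{
    public int Health { get; private set; } = 100;

    // Before: one int with two meanings. Every caller must remember
    // that a negative value heals instead of hurting.
    public void ApplyDamage(int damage) => Health -= damage;

    // After: each concept has its own type and its own method,
    // so the rule is written down instead of implied.
    public void Apply(Damage damage) => Health -= damage.Amount;

    public void Apply(Healing healing) => Health += healing.Amount;
}
```

With the second pair of methods, a reader no longer has to know that a negative number secretly means healing; the intent is visible at every call site.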

Solution: Custom Immutable Value Object In Place Of Primitives

A solution to the primitive obsession code smell, when it manifests as using primitives in place of domain types, is to create custom immutable value objects that wrap the primitive and perform the necessary checks.

These types can have validation rules to ensure that they always represent a valid state and can also contain any common behavior. These value objects should be immutable and have value equality.

In C#, two mechanisms can help us with this, and both save us from writing a lot of boilerplate code: record types and record struct types. Both are described in the official C# documentation on records.

In short, both come with compiler-generated value equality. Record structs are value types, so they are more appropriate for wrapping value-type primitives without adding a heap allocation, and the readonly modifier ensures the record struct’s immutability. A record class has no such modifier, so we should ensure immutability ourselves, for example by exposing only get-only or init-only properties.
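A minimal sketch of such value objects for the earlier age and damage example, assuming C# 10 or later for record structs. The validation rules (no negative values) follow the example above; the member names Years and Amount are my own choice.

```csharp
using System;

public readonly record struct Age
{
    public int Years { get; }

    public Age(int years)
    {
        if (years < 0)
            throw new ArgumentOutOfRangeException(nameof(years), "Age cannot be negative.");
        Years = years;
    }
}

public readonly record struct Damage
{
    public float Amount { get; }

    public Damage(float amount)
    {
        if (amount < 0f)
            throw new ArgumentOutOfRangeException(nameof(amount), "Damage cannot be negative.");
        Amount = amount;
    }
}

public static class Demo
{
    public static void Main()
    {
        var age = new Age(20);
        var damage = new Damage(10f);

        // damage = age;              // does not compile: no conversion from Age to Damage
        // var invalid = new Age(-1); // would throw ArgumentOutOfRangeException

        Console.WriteLine(age == new Age(20)); // value equality: prints True
        Console.WriteLine(damage.Amount);
    }
}
```

One thing to keep in mind with structs is that `default(Age)` bypasses the constructor, so the all-zero value should itself be a valid state for the wrapped concept, as it is here.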

Multiple If Statements Instead Of Strategy (OCP violation)

Another manifestation of the primitive obsession code smell is using a primitive value solely as an identifier for multiple different cases. This often results in a block of code with numerous if statements or a large switch statement, each containing a significant amount of logic.

This violates the Open-Closed Principle. It can be solved with the ‘Replace conditional with polymorphism’ refactoring and is also a good opportunity to apply the strategy pattern.
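As an illustration, the sketch below contrasts a switch on a primitive identifier with a strategy-based version. All names (AttackCalculator, IAttackStrategy, GoblinAttack, TrollAttack, Enemy) and the damage values are hypothetical.

```csharp
using System;

// Before: a string used only as a case identifier.
// Each new enemy kind means editing this method.
public static class AttackCalculator
{
    public static int DamageFor(string enemyKind) => enemyKind switch
    {
        "goblin" => 5,
        "troll" => 20,
        _ => throw new ArgumentException($"Unknown enemy kind: {enemyKind}", nameof(enemyKind)),
    };
}

// After: each case becomes its own strategy behind a common interface.
public interface IAttackStrategy
{
    int CalculateDamage();
}

public sealed class GoblinAttack : IAttackStrategy
{
    public int CalculateDamage() => 5;
}

public sealed class TrollAttack : IAttackStrategy
{
    public int CalculateDamage() => 20;
}

// The consumer depends only on the interface. Adding a new enemy kind
// means adding a new class, not editing existing code.
public sealed class Enemy
{
    private readonly IAttackStrategy _attack;

    public Enemy(IAttackStrategy attack) => _attack = attack;

    public int Attack() => _attack.CalculateDamage();
}
```

In real code each branch would hold far more logic than a single number, which is exactly when moving it into its own class pays off.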

Primitives As Parameters

Finally, primitive obsession can also manifest when we pass primitives as parameters to methods. Although sometimes a primitive parameter is appropriate, often we have code that extracts a value from an object and then passes that value as a parameter.

For example, let’s suppose we have a type called Driver with an age property of type int. The Driver type performs all the necessary checks whenever we set the age property, ensuring each Driver object is always in a valid state. Now, suppose we need to create a method in another class that takes a driver’s age and, after some calculations, returns the percentage of accidents that drivers of this age have. We might be tempted to create this method with an int parameter called age, but that would be a primitive obsession code smell.

The reason is that although the age of the driver is of type int, the Driver type performs all the necessary checks to ensure the age value is within appropriate limits. The method, however, can take any int value. The correct approach is for our method to take a Driver object and then extract the age value from the age property of the Driver within the method.

The Fewer Dependencies Fallacy

Some may think that using a Driver type as a parameter instead of an int is wrong because it creates more dependencies in our code. After all, the fewer dependencies each type has, the easier the code is to understand.

While it is true that we should aim to minimize dependencies between classes, this is more of a conceptual issue than a technical one. Using an int type only eliminates a technical dependency, but conceptually, the dependency on the Driver type still exists. The parameter in our method is not truly of type int because it cannot take all the values an int type can. Instead, it is constrained by the same rules as the age property in the Driver type.

For example, if the Driver type restricts the driver’s age to be at least 18, this check should also exist inside our method. This creates a problem because now we have duplicate code performing the same checks in two different and unrelated parts of our program. Later, if the requirements change and drivers can be 16 years old or older, the maintainer of our program must also remember to update the checks in the method.

We have an implied dependency between this method and the Driver type, where the method’s parameter and the Driver’s age property must always obey the same rules and restrictions. Any change to the Driver’s checks must be mirrored in the method’s checks. Implied dependencies, like implied if statements, create problems with the complexity of our code.

Solution: Passing Objects As Parameters

Fewer dependencies are beneficial, but implied code is probably the worst thing in a codebase. Anything implied should be explicitly written down so that any reader can understand it. For this reason, our method should have a Driver parameter because it explicitly states the dependencies that exist.

One solution to the primitive parameters code smell is to always pass an object that contains the primitive type in a valid state, instead of extracting it, passing it as a parameter, and then performing the same checks to validate its state as those in the object’s type.
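For the Driver example, this could look roughly like the sketch below. The AccidentStatistics class, the MinimumAge constant, and especially the formula are made up for illustration; only the shape of the method signature matters here.

```csharp
using System;

// Hypothetical Driver type: it owns the rule about valid ages.
public class Driver
{
    public const int MinimumAge = 18;

    private int _age;

    public int Age
    {
        get => _age;
        set
        {
            if (value < MinimumAge)
                throw new ArgumentOutOfRangeException(nameof(value), $"A driver must be at least {MinimumAge}.");
            _age = value;
        }
    }

    public Driver(int age) => Age = age;
}

public static class AccidentStatistics
{
    // Smell: a method with an int parameter accepts any int, so it would
    // have to repeat the Driver's checks:
    // public static double AccidentPercentage(int age) { ... }

    // Taking a Driver states the dependency explicitly and guarantees a valid age.
    // The formula is a made-up placeholder, not real statistics.
    public static double AccidentPercentage(Driver driver)
    {
        int age = driver.Age;
        return Math.Max(1.0, 10.0 - (age - Driver.MinimumAge) * 0.25);
    }
}
```

If the minimum age later changes to 16, only the Driver type needs to change; the method keeps working without any edits.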

Solution: Create Parameter Object

Sometimes, we pass primitives into a method that don’t belong to any existing type. For example, let’s consider a method that creates an enemy with a random level by accepting two parameters: the minimum and maximum level of the enemy as integers.

This can be prone to mistakes, as we might inadvertently pass a minimum level that is higher than the maximum level. Of course, we can include appropriate checks inside our method. However, if we find ourselves using these two values in more than a couple of methods, it’s more appropriate to introduce a new type, such as EnemyLevelRange, that contains these values and any relevant checks.
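A possible shape for such a parameter object is sketched below, assuming .NET 6 or later. The Min and Max members and the EnemyFactory class are hypothetical, and the only rule enforced is the one described above: the maximum cannot be lower than the minimum.

```csharp
using System;

public readonly record struct EnemyLevelRange
{
    public int Min { get; }
    public int Max { get; }

    public EnemyLevelRange(int min, int max)
    {
        if (max < min)
            throw new ArgumentException("Maximum level cannot be lower than minimum level.", nameof(max));
        Min = min;
        Max = max;
    }
}

public static class EnemyFactory
{
    // The upper bound of NextInt64 is exclusive, hence the + 1.
    // Using long arithmetic avoids an int overflow when Max is int.MaxValue.
    public static int CreateRandomLevel(EnemyLevelRange range, Random random) =>
        (int)random.NextInt64(range.Min, (long)range.Max + 1);
}
```

Any method that receives an EnemyLevelRange can rely on the range being valid, so the check lives in exactly one place.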

Serialization Problems

While creating new types that wrap primitive types with appropriate behavior can benefit our code, we should be careful not to overdo it. Overdoing it may overly complicate our code dependencies and can also create another problem.

Serializing primitive types is easy because most serialization methods support them out of the box. However, when we create our own types and serialization is needed, we should be careful, as this can complicate things. Unwrapping the custom type to serialize it, and wrapping the value again when deserializing, can consume development time and effort.
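To give an idea of the extra work involved, here is a sketch of one way to do the unwrapping and wrapping with a custom converter. It assumes System.Text.Json is the serializer in use, which this article does not prescribe; the Age type and the AgeJsonConverter name are hypothetical.

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

// The attribute tells System.Text.Json to use our converter for this type.
[JsonConverter(typeof(AgeJsonConverter))]
public readonly record struct Age
{
    public int Years { get; }

    public Age(int years)
    {
        if (years < 0)
            throw new ArgumentOutOfRangeException(nameof(years), "Age cannot be negative.");
        Years = years;
    }
}

// Writes an Age as a plain JSON number and goes through the
// validating constructor again when reading.
public sealed class AgeJsonConverter : JsonConverter<Age>
{
    public override Age Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) =>
        new Age(reader.GetInt32());

    public override void Write(Utf8JsonWriter writer, Age value, JsonSerializerOptions options) =>
        writer.WriteNumberValue(value.Years);
}
```

With this in place, `JsonSerializer.Serialize(new Age(20))` produces the JSON number 20 and `JsonSerializer.Deserialize<Age>("20")` gives back an equal Age. A converter like this has to be written and maintained for every wrapped type, which is the cost to weigh against the benefits.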

Conclusion

OOP is about creating new types. Using primitives with implied boundaries, or having code that performs the same checks on primitives in different places, is a code smell called primitive obsession. By creating our own types, we can leverage one of the biggest advantages OOP has to offer: encapsulation.

