C# Equality and order comparisons (Part 1)


Category : C#

Introduction

In C# there are many ways to check for equality or ordering between our types. Some of those can be overloaded or overridden, some are determined at compile time, others are determined at runtime. Some are used for value equality, others for referential equality. Some are faster but not as reliable and others are pluggable protocols used for equality checking that is specific for collections of our types.

Specifically the following post will try to describe the differences between:

  • The == and != operators
  • The virtual Equals method in the object class
  • Calling the static Equals method from the Object type
  • The Equals method of the IEquatable<T> interface
  • The virtual GetHashCode method
  • Calling the static ReferenceEquals method from the Object type
  • The IEqualityComparer and IEqualityComparer<T> interfaces
  • The ReferenceEqualityComparer.Instance
  • The IStructuralEquatable interface

for equality checking and

  • The < and > operators
  • The IComparable and IComparable<T> interfaces
  • The IComparer and IComparer<T> interfaces
  • The StringComparer for string comparison
  • The IStructuralComparable interface

for ordering.

All of these have their uses. There are two kinds of equality checking, value equality and referential equality, and we also care about how fast the check is and about giving some of our types a specific equality behaviour inside collections.

Let’s start.

Value equality, structural equality and referential equality

For the rest of this post, I will use the following types in the equality examples:

public class EnemyClass
{
   public int Level { get; init; }
   public int HitPoints { get; set; }
   private readonly float _gold;

   public EnemyClass(float gold) => _gold = gold;
}

public struct EnemyStruct
{
   public int Level { get; init; }
   public int HitPoints { get; set; }
   private readonly float _gold;

   public EnemyStruct(float gold) => _gold = gold;
}

public record EnemyRecordClass
{
   public int Level;
   public int HitPoints;
   private float _gold;

   public EnemyRecordClass(float gold) => _gold = gold;
}

public record struct EnemyRecordStruct
{
   public int Level { get; init; }
   public int HitPoints { get; set; }
   private readonly float _gold;

   public EnemyRecordStruct(float gold) => _gold = gold;
}

The first two are the most important because they represent a reference type (class) and a value type (struct). The last two are a record class and a record struct, which are essentially a class and a struct with some members generated for us. Among those members are the ones that check for equality, so we won’t have to implement them ourselves.

There are two types of equality checking: Value and reference equality.

With value equality we check whether two values are equal according to rules that we define ourselves (or that are predefined for built-in types) and that make sense for the type we are checking.

Unless we have overridden something the following apply:

Value types can only use value equality, for example two integers are considered equal if they have the same value:

int x = 10;
int y = 10;
Console.WriteLine(x == y); // true

There is a special case of value equality called structural equality. With structural equality, two values are considered equal if all their members are equal. For example:

EnemyRecordStruct enemyRecordStruct = new(10f) { Level = 5 };
EnemyRecordStruct enemyRecordStruct2 = new(10f) { Level = 5 };
enemyRecordStruct.HitPoints = 100;
enemyRecordStruct2.HitPoints = 100;

Console.WriteLine(enemyRecordStruct == enemyRecordStruct2); // true

There is an exception to the rule that value types can only use value equality. If our value types are boxed, the == operator uses referential equality (because object is actually a class).

With referential equality, we check whether two references refer to the same object. If they do not, they are not considered equal, even if the objects hold the same values. For example:

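int x = 10;
int y = 10;
object xobj = (object)x; // boxing: the value of x is copied into a new object
object yobj = (object)y; // boxing: the value of y is copied into another new object

Console.WriteLine(xobj == yobj); // false, the two boxes are different objects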

Reference types use referential equality by default.

EnemyClass enemyClass = new(10f) { Level = 5 };
EnemyClass enemyClass2 = new(10f) { Level = 5 };
enemyClass.HitPoints = 100;
enemyClass2.HitPoints = 100;

EnemyClass enemyClass3 = enemyClass;

Console.WriteLine(enemyClass == enemyClass2); // false
Console.WriteLine(enemyClass == enemyClass3); // true

The == and != operators

The default use of the == and != operators is as shown above: value equality for the built-in value types and referential equality for reference types. Because they are operators, they are resolved statically: the compiler decides at compile time, from the declared types of the operands, which implementation of the operator to call. No virtual dispatch is involved, which makes them very fast.
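To see what static resolution means in practice, here is a small sketch that uses string, a built-in type that overloads ==. The variable names are only illustrative. The same two objects compare differently depending on the declared type of the variables, while the virtual Equals call always reaches the string implementation:

```csharp
using System;

string a = "hello";
// Build a second string with the same characters but a different reference.
string b = new string(new[] { 'h', 'e', 'l', 'l', 'o' });

object objA = a;
object objB = b;

// Operands declared as string: the compiler binds to string's == overload (value equality).
Console.WriteLine(a == b);             // True

// Operands declared as object: the compiler binds to the reference comparison.
Console.WriteLine(objA == objB);       // False

// Equals is virtual, so the runtime type (string) decides.
Console.WriteLine(objA.Equals(objB));  // True
```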

The thing to notice here is that if we decide to overload one of these operators, we have to overload the other too, otherwise the code does not compile.

For structs we have to overload the operators ourselves, but for record structs this has already been done for us, and that’s why the EnemyRecordStruct example above works.
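As a minimal sketch of what overloading the pair looks like on a plain struct, consider the following. EnemyWithOperators is a hypothetical type made up for this illustration, and comparing only Level and HitPoints is an assumption about what "equal" should mean:

```csharp
using System;

var first = new EnemyWithOperators(5, 100);
var second = new EnemyWithOperators(5, 100);

Console.WriteLine(first == second); // True
Console.WriteLine(first != second); // False

// Hypothetical struct used only for this illustration.
public readonly struct EnemyWithOperators
{
    public int Level { get; }
    public int HitPoints { get; }

    public EnemyWithOperators(int level, int hitPoints)
    {
        Level = level;
        HitPoints = hitPoints;
    }

    // Without this operator, `first == second` does not compile for a plain struct.
    public static bool operator ==(EnemyWithOperators left, EnemyWithOperators right) =>
        left.Level == right.Level && left.HitPoints == right.HitPoints;

    // The compiler requires != whenever == is overloaded (and vice versa).
    public static bool operator !=(EnemyWithOperators left, EnemyWithOperators right) =>
        !(left == right);
}
```

This compiles, but the compiler warns that Equals and GetHashCode are not overridden, because it expects the operators and those methods to agree with each other.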

Equals, object.Equals and the IEquatable<T> interface

The Equals method is useful when we want our equality checks to have a different meaning.

For example, we may want a class to implement value equality with the Equals method and referential equality with the == operator.

Or we may have a record for which we don’t want full structural equality, but equality over only some of its fields.
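The record case can be sketched as follows, assuming we want to ignore the gold when comparing. EnemyRecord is a hypothetical positional record, sealed to keep the custom Equals simple. A record lets us supply our own typed Equals in place of the generated one, and GetHashCode should then be written to match it:

```csharp
#nullable enable
using System;

var a = new EnemyRecord(5, 100, 10f);
var b = new EnemyRecord(5, 100, 99f); // differs only in Gold

Console.WriteLine(a.Equals(b)); // True
Console.WriteLine(a == b);      // True, the record's == calls our Equals

// Hypothetical record used only for this illustration.
public sealed record EnemyRecord(int Level, int HitPoints, float Gold)
{
    // Replaces the compiler-generated member-wise Equals: Gold is ignored.
    public bool Equals(EnemyRecord? other) =>
        other is not null && Level == other.Level && HitPoints == other.HitPoints;

    // Must agree with Equals, so it uses the same members.
    public override int GetHashCode() => HashCode.Combine(Level, HitPoints);
}
```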

The virtual Equals method of the object class is resolved at runtime. For reference types it performs referential equality by default; for structs it performs structural equality by default.

For EnemyClass the result is the same as in the previous example. For EnemyStruct:

EnemyStruct enemyStruct = new(10f) { Level = 5 };
EnemyStruct enemyStruct2 = new(10f) { Level = 5 };
enemyStruct.HitPoints = 100;
enemyStruct2.HitPoints = 100;

Console.WriteLine(enemyStruct.Equals(enemyStruct2)); // true

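There is a problem here. If the variable we call Equals on is null, which can happen with reference types, the call throws a NullReferenceException. Either we check for null before the equality check or, even better, we use the static object.Equals method, which handles null on both sides.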
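A short sketch of the difference, using a cut-down stand-in for EnemyClass (the variable names are illustrative):

```csharp
#nullable enable
using System;

EnemyClass? missing = null;
EnemyClass present = new();

// missing.Equals(present) would throw a NullReferenceException here.

// The static method checks both arguments for null before calling the virtual Equals.
Console.WriteLine(object.Equals(missing, present)); // False
Console.WriteLine(object.Equals(missing, null));    // True
Console.WriteLine(object.Equals(present, present)); // True

// Minimal stand-in for the EnemyClass defined earlier.
public class EnemyClass
{
    public int Level { get; init; }
}
```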

The static method is also useful in generics. In a generic method we cannot use the == operator on an unconstrained type parameter, because the compiler cannot decide at compile time which operator to call for a type it does not know. object.Equals is a solution, but not the best one.

The object.Equals method takes parameters of type object, and this causes boxing for our value types. In the previous example, both enemyStruct and enemyStruct2 get boxed: the first because it calls the virtual Equals method inherited from the object type, which the struct does not override, and the second because the method takes an object as its parameter.
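For instance, a generic helper could look like the sketch below. EqualityHelper and AreEqual are hypothetical names chosen for this example:

```csharp
#nullable enable
using System;

Console.WriteLine(EqualityHelper.AreEqual(10, 10));               // True
Console.WriteLine(EqualityHelper.AreEqual("a", "b"));             // False
Console.WriteLine(EqualityHelper.AreEqual<string?>(null, null));  // True

// Hypothetical helper used only for this illustration.
public static class EqualityHelper
{
    // `a == b` would not compile here: T is unconstrained, so the compiler
    // has no == operator to bind to. object.Equals works for any T and is null-safe.
    public static bool AreEqual<T>(T a, T b) => object.Equals(a, b);
}
```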

The solution to this is the IEquatable<T> interface. By implementing it, we can avoid boxing of our value types so that the equality check will be faster.

The IEquatable<T> interface has only one method, named Equals. In our EnemyClass we can implement it like this:

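public class EnemyClass : IEquatable<EnemyClass>
{
   public int Level { get; init; }
   public int HitPoints { get; set; }
   private readonly float _gold;

   public EnemyClass(float gold) => _gold = gold;

   // Strongly typed equality from IEquatable<EnemyClass>; _gold is deliberately ignored.
   public bool Equals(EnemyClass? other)
   {
      return other != null && GetType() == other.GetType() && Level == other.Level && HitPoints == other.HitPoints;
   }

   // Keep object.Equals consistent by delegating to the typed version.
   // If obj is not an EnemyClass, the cast yields null and the result is false.
   public override bool Equals(object? obj)
   {
      return Equals(obj as EnemyClass);
   }
}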

Here is what happens:

  • The Equals method checks only Level and HitPoints. This is a design decision: if our enemies are considered equal when they have the same Level and HitPoints, and we don’t care about their gold, we only compare those members in our method.
  • The GetType() == other.GetType() check exists because someone may have inherited from the class. A child of EnemyClass cannot be compared with its parent. We only compare objects of the same type.
  • The other != null check ensures that we won’t get a null reference error in our comparison.
  • Finally, we override the Equals method of object because we want both methods to have the same behavior. That is easy, as we just call the IEquatable<T> Equals method from the overridden Equals(object) method.

We can do the same with structs. In fact, because structs don’t get the == operator by default, it makes sense to overload it as well so that it agrees with Equals.
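A sketch of the struct version might look like this. EnemyValue is a hypothetical type for illustration. The operators are left out to keep the example short, and GetHashCode is included only so that it stays consistent with Equals:

```csharp
#nullable enable
using System;

var first = new EnemyValue(5, 100);
var second = new EnemyValue(5, 100);

// Overload resolution prefers Equals(EnemyValue), so nothing is boxed.
Console.WriteLine(first.Equals(second));          // True

// Passing an object still works and goes through the override.
Console.WriteLine(first.Equals((object)second));  // True
Console.WriteLine(first.Equals("not an enemy"));  // False

// Hypothetical struct used only for this illustration.
public readonly struct EnemyValue : IEquatable<EnemyValue>
{
    public int Level { get; }
    public int HitPoints { get; }

    public EnemyValue(int level, int hitPoints)
    {
        Level = level;
        HitPoints = hitPoints;
    }

    // Strongly typed: no null check is needed, a struct can never be null.
    public bool Equals(EnemyValue other) =>
        Level == other.Level && HitPoints == other.HitPoints;

    // Keep object.Equals consistent with the typed version.
    public override bool Equals(object? obj) => obj is EnemyValue other && Equals(other);

    // Built from the same members that Equals compares.
    public override int GetHashCode() => HashCode.Combine(Level, HitPoints);
}
```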

With classes, the most common approach is to keep the == operator for reference equality and to override the Equals method for value equality.
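With a trimmed-down variant of the EnemyClass from above (the gold field is dropped and GetHashCode is added here only for consistency with Equals), the two checks give different answers for two separate objects that hold the same values:

```csharp
#nullable enable
using System;

var first = new EnemyClass { Level = 5, HitPoints = 100 };
var second = new EnemyClass { Level = 5, HitPoints = 100 };

Console.WriteLine(first == second);      // False: == is not overloaded, so references are compared
Console.WriteLine(first.Equals(second)); // True: Equals compares Level and HitPoints

// Trimmed-down variant of the class shown earlier, for illustration only.
public class EnemyClass : IEquatable<EnemyClass>
{
    public int Level { get; init; }
    public int HitPoints { get; set; }

    public bool Equals(EnemyClass? other) =>
        other != null && GetType() == other.GetType()
        && Level == other.Level && HitPoints == other.HitPoints;

    public override bool Equals(object? obj) => Equals(obj as EnemyClass);

    public override int GetHashCode() => HashCode.Combine(Level, HitPoints);
}
```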

To be continued

For the Equals method, the statement enemyStruct.Equals(enemyStruct) must always return true; some collections depend on that. Those collections also depend on the GetHashCode() method, so it is recommended to override it whenever we override the Equals method.

But this post is already getting long. In the next post, let’s see the remaining ways to check equality and, most importantly, the GetHashCode() method: what it is, how we can implement it, why we should not depend on it alone but also need the Equals method, and why we should keep our data immutable when we use GetHashCode() with collections that depend on a hash for comparison.

Go to part 2 here

Thank you for reading, see you in the next part and if you have any questions or comments, you can use the comments section or contact me directly via the contact form or by email. Also if you don’t want to miss any of the new blog posts, you can always subscribe to my newsletter or the RSS feed.

