Introduction
This is a continuation of my previous post about equality in C#.
After looking at the == operator and the three Equals methods (the static object.Equals, the virtual object.Equals and the IEquatable<T> Equals), let’s continue with the GetHashCode() method, why it should always be implemented when we implement the Equals method, and the remaining ways to check for equality in C#.
HashCodes
Both the Dictionary and Hashtable collections depend on hashes to find their elements quickly. They are fast because a lookup relies primarily on the hash code, which in C# is a 32-bit integer, and only falls back to a full equality check for elements whose hash codes match.
The return value of object.GetHashCode must be the same for two objects for which the Equals method returns true. When we have referential equality, the hash code is calculated from an internal token that the runtime assigns to each instance. That is the default behaviour for our classes, but when we implement the Equals method to have value equality, we also have to find a way to calculate the hash code, so that it is the same number for two objects that are considered equal.
Let’s remember the previous example where we implemented the IEquatable<T> interface in our EnemyClass.
public class EnemyClass : IEquatable<EnemyClass>
{
    public int Level { get; init; }
    public int HitPoints { get; set; }
    private readonly float _gold;

    public EnemyClass(float gold) => _gold = gold;

    public bool Equals(EnemyClass? other)
    {
        return other != null && GetType() == other.GetType() && Level == other.Level && HitPoints == other.HitPoints;
    }

    // object? matches the nullable signature of object.Equals
    public override bool Equals(object? obj)
    {
        return Equals(obj as EnemyClass);
    }
}
Now we have to find a way to implement GetHashCode() so that it only depends on the Level and HitPoints properties. Fortunately, C# has some methods that can help with that.
The first thing we have to remember is that the hash code has to be the same value for two objects that are considered equal, or for the same object if it hasn’t changed. But that doesn’t mean that two different objects cannot have the same hash code.
The hash code is a 32-bit integer, which means that there are only 2^32 different values. Obviously, we may have more than 2^32 different objects. For example, there are infinitely many possible strings.
For that reason, the collections that depend on the hash code also do an equality check between two objects that have the same hash code. We should still try to implement the GetHashCode() method in a way that gives different values for different objects. The more unique hash codes we have, the less often these collections have to run the Equals method for objects that aren’t equal but happen to share a hash code, and the better they perform.
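To see that a collision only costs performance and does not break correctness, here is a minimal sketch. The BadHashKey type is hypothetical and not part of the earlier example; it deliberately returns the same hash code for every instance.

```csharp
using System;
using System.Collections.Generic;

// Both keys collide on the hash code, but Equals still tells them apart.
var dictionary = new Dictionary<BadHashKey, string>
{
    [new BadHashKey(1)] = "first",
    [new BadHashKey(2)] = "second"
};

Console.WriteLine(dictionary.Count);              // 2
Console.WriteLine(dictionary[new BadHashKey(2)]); // second

// Hypothetical key type, used only for this illustration.
public sealed class BadHashKey : IEquatable<BadHashKey>
{
    public int Id { get; }
    public BadHashKey(int id) => Id = id;

    public bool Equals(BadHashKey? other) => other is not null && Id == other.Id;
    public override bool Equals(object? obj) => Equals(obj as BadHashKey);

    // Legal but slow: every key ends up in the same bucket,
    // so each lookup has to call Equals on the colliding keys.
    public override int GetHashCode() => 42;
}
```

The dictionary still works, but with many keys every lookup would degrade into a linear search.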
The second thing is that hash codes are not guaranteed to be the same for the same object across runs of our program. Within a single run an unchanged object keeps the same hash code, but in another run it can be different. For example, on modern .NET the hash code of a string is randomized per process, so it will usually differ every time we run our program. So saving hash codes in a file is not a good way to keep track of our objects across multiple runs.
Finally, the last thing we have to remember is that the data from which we calculate our hash code should be immutable, or at least must not change while the object is stored in a hash-based collection. If that data changes, the hash code changes too, and the collection will look for our object in the wrong place, which makes it effectively inaccessible.
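The following sketch shows what "inaccessible" means in practice. MutableKey is a hypothetical type made up for this illustration, and its hash code is simply its Value so that the effect is easy to follow.

```csharp
using System;
using System.Collections.Generic;

var key = new MutableKey { Value = 1 };
var set = new HashSet<MutableKey> { key };

Console.WriteLine(set.Contains(key)); // True

// Mutating the key changes its hash code while it is inside the set.
key.Value = 2;

Console.WriteLine(set.Contains(key)); // False: the set searches with the new hash code
Console.WriteLine(set.Count);         // 1: the object is still stored, just unreachable

// Hypothetical key type whose hash code depends on mutable data.
public sealed class MutableKey
{
    public int Value { get; set; }

    public override bool Equals(object? obj) => obj is MutableKey other && Value == other.Value;
    public override int GetHashCode() => Value;
}
```

The set remembers the hash code the object had when it was added, so a lookup with the new hash code never reaches it.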
After all that, let’s see the implementation:
public class EnemyClass : IEquatable<EnemyClass>
{
    public int Level { get; init; }
    public int HitPoints { get; set; }
    private readonly float _gold;

    public EnemyClass(float gold) => _gold = gold;

    public bool Equals(EnemyClass? other)
    {
        return other != null && GetType() == other.GetType() && Level == other.Level && HitPoints == other.HitPoints;
    }

    // object? matches the nullable signature of object.Equals
    public override bool Equals(object? obj)
    {
        return Equals(obj as EnemyClass);
    }

    // Only the immutable Level takes part in the hash code
    public override int GetHashCode()
    {
        return Level.GetHashCode();
    }
}
Here are some things worth noticing:
- Here we calculate the hash code only from the Level property. The reason is that the HitPoints property is mutable, and if we change it after we have added our object to a collection that uses hash codes (for example as a key in a dictionary), the hash code would also change and our object would become inaccessible.
- That, though, means that the search for our object will not be as efficient, because all the objects with the same level have the same hash code and then the Equals method has to run.
- If our HitPoints property was immutable, for example public int HitPoints { get; init; } then we could use the Combine method of the HashCode class: return HashCode.Combine(Level, HitPoints);
- Finally, we can also create a HashCode object instead of calling the Combine method, like this:
HashCode hash = new HashCode();
hash.Add(Level);
hash.Add(HitPoints);
return hash.ToHashCode();
Again, this is only safe if HitPoints is immutable, or if we make sure that HitPoints will not change for as long as our object is inside a collection that depends on hash codes for equality.
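Putting the points above together, a version with an immutable HitPoints could look like the sketch below. ImmutableEnemy is a hypothetical variation of the EnemyClass (sealed, and without the gold field), written only to show HashCode.Combine in a complete, runnable form.

```csharp
using System;
using System.Collections.Generic;

var first = new ImmutableEnemy { Level = 3, HitPoints = 100 };
var second = new ImmutableEnemy { Level = 3, HitPoints = 100 };

Console.WriteLine(first.Equals(second));                        // True
Console.WriteLine(first.GetHashCode() == second.GetHashCode()); // True

// An equal object can be used to find the value stored under the first one.
var loot = new Dictionary<ImmutableEnemy, string> { [first] = "sword" };
Console.WriteLine(loot[second]); // sword

// Hypothetical variation of EnemyClass where both properties are init-only.
public sealed class ImmutableEnemy : IEquatable<ImmutableEnemy>
{
    public int Level { get; init; }
    public int HitPoints { get; init; }

    public bool Equals(ImmutableEnemy? other) =>
        other is not null && Level == other.Level && HitPoints == other.HitPoints;

    public override bool Equals(object? obj) => Equals(obj as ImmutableEnemy);

    // Uses exactly the same data as Equals, so equal objects get equal hash codes.
    public override int GetHashCode() => HashCode.Combine(Level, HitPoints);
}
```

Because both properties are init-only, the hash code cannot change after the object has been added to a collection.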
ReferenceEquals
The static object.ReferenceEquals method checks for referential equality. The reason it exists is that even if the == operator is overloaded and the Equals method is overridden, we still have a way to be sure that we use referential equality.
Effectively, it has the same result as casting both objects to the type object and then using the == operator.
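As a small illustration, the sketch below uses a record, because records get value-based == and Equals generated by the compiler. The Point record is hypothetical and only serves this example.

```csharp
using System;

var a = new Point(1, 2);
var b = new Point(1, 2); // same values, different instance

Console.WriteLine(a == b);                       // True: the record's == compares values
Console.WriteLine(a.Equals(b));                  // True
Console.WriteLine(object.ReferenceEquals(a, b)); // False: two separate instances
Console.WriteLine((object)a == (object)b);       // False: == on object compares references
Console.WriteLine(object.ReferenceEquals(a, a)); // True

// Hypothetical record with compiler-generated value equality.
public record Point(int X, int Y);
```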
Plug-in protocols
Sometimes we might not want the equality methods we have for an object to be used inside a collection.
For example, we don’t want the strings "uppercase" and "UpperCase" to be considered equal in general, but we do want them to be considered equal when we use them as keys in a dictionary.
For that reason, C# has some plug-in protocols:
The IEqualityComparer and IEqualityComparer<T> interfaces
Those two interfaces serve the same purpose: one is generic and the other is a non-generic version that works with the object class. Each has two methods, Equals and GetHashCode, which behave the same way as before, but take effect only when we pass an instance of the class that implements the interface as a parameter to our collection. For example:
public class EnemyClass
{
    public int Level { get; init; }
    public int HitPoints { get; set; }
    private readonly float _gold;

    public EnemyClass(float gold) => _gold = gold;
}

public class PluggedInEquality : IEqualityComparer<EnemyClass>
{
    public bool Equals(EnemyClass? enemy1, EnemyClass? enemy2)
    {
        if (ReferenceEquals(enemy1, enemy2)) return true;
        if (ReferenceEquals(enemy1, null)) return false;
        if (ReferenceEquals(enemy2, null)) return false;
        if (enemy1.GetType() != enemy2.GetType()) return false;
        return enemy1.Level == enemy2.Level && enemy1.HitPoints == enemy2.HitPoints;
    }

    public int GetHashCode(EnemyClass obj)
    {
        return HashCode.Combine(obj.Level, obj.HitPoints);
    }
}
then we can use it like this:
PluggedInEquality pluggedInEquality = new PluggedInEquality();
Dictionary<EnemyClass, string> dictionary = new Dictionary<EnemyClass, string>(pluggedInEquality);
As before, if we change the HitPoints while an object of the EnemyClass is being used as a key in the dictionary, the hash code will change and our object will become inaccessible.
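Returning to the string example from the beginning of this section, we don’t even have to write the comparer ourselves, because the built-in StringComparer class already provides ready-made comparers. A minimal sketch, where the dictionary name and its values are made up for the example:

```csharp
using System;
using System.Collections.Generic;

// In general the two strings are different.
Console.WriteLine("uppercase".Equals("UpperCase")); // False

// Inside this dictionary the keys are compared without regard to case.
var scores = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase)
{
    ["uppercase"] = 1
};

Console.WriteLine(scores.ContainsKey("UpperCase")); // True

scores["UPPERCASE"] = 2;          // overwrites the existing entry
Console.WriteLine(scores.Count);  // 1
Console.WriteLine(scores["uppercase"]); // 2
```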
If we don’t want to implement both of those interfaces ourselves, there is also the EqualityComparer<T> abstract class that implements them. If we derive from it, we only have to override one Equals and one GetHashCode method.
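A comparer derived from EqualityComparer<T> could be sketched like this. The Enemy and LevelComparer types are hypothetical; the comparer treats two enemies as equal when they have the same Level, which also keeps the hash code independent of the mutable HitPoints.

```csharp
using System;
using System.Collections.Generic;

var comparer = new LevelComparer();
var weak = new Enemy { Level = 1, HitPoints = 10 };
var alsoWeak = new Enemy { Level = 1, HitPoints = 25 };

Console.WriteLine(comparer.Equals(weak, alsoWeak)); // True

// The second enemy is rejected because the comparer considers it a duplicate.
var set = new HashSet<Enemy>(comparer) { weak, alsoWeak };
Console.WriteLine(set.Count); // 1

// Hypothetical simplified enemy type without any equality members of its own.
public sealed class Enemy
{
    public int Level { get; init; }
    public int HitPoints { get; set; }
}

// Deriving from EqualityComparer<T> provides both the generic and
// the non-generic interface; we override only these two methods.
public sealed class LevelComparer : EqualityComparer<Enemy>
{
    public override bool Equals(Enemy? x, Enemy? y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Level == y.Level;
    }

    public override int GetHashCode(Enemy obj) => obj.Level.GetHashCode();
}
```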
Finally, there is also the EqualityComparer<T>.Default static property. It behaves like the static object.Equals method, but it first uses the Equals method from the IEquatable<T> interface, if T implements it, so that it can avoid boxing. If not, it falls back to the virtual object.Equals method. For example:
EqualityComparer<T>.Default.Equals(obj1, obj2);
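This is mostly useful in generic code, where we don’t know anything about T. A minimal sketch, where AreEqual is a hypothetical helper and not something from the framework:

```csharp
using System;
using System.Collections.Generic;

Console.WriteLine(AreEqual(5, 5));                // True: int implements IEquatable<int>, no boxing
Console.WriteLine(AreEqual("a", "b"));            // False
Console.WriteLine(AreEqual<string?>(null, null)); // True: nulls are handled for us

// Hypothetical helper that works for any T without constraints.
static bool AreEqual<T>(T left, T right) =>
    EqualityComparer<T>.Default.Equals(left, right);
```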
From .NET 9 we also have the option to use the IAlternateEqualityComparer<TAlternate, T> interface to compare equality between two different types. This is useful for alternate lookups in hash-based collections. See: Alternate Lookup For Dictionary And HashSet With The IAlternateEqualityComparer
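As a rough sketch of what an alternate lookup looks like, the example below searches a string-keyed dictionary with a ReadOnlySpan<char>, so no new string has to be allocated for the lookup. It assumes .NET 9 or later and relies on the built-in StringComparer.Ordinal comparer supporting span lookups; the dictionary contents are made up.

```csharp
using System;
using System.Collections.Generic;

var counts = new Dictionary<string, int>(StringComparer.Ordinal)
{
    ["orc"] = 3
};

// Assumption: requires .NET 9+, and a comparer that implements
// IAlternateEqualityComparer<ReadOnlySpan<char>, string>.
var lookup = counts.GetAlternateLookup<ReadOnlySpan<char>>();

ReadOnlySpan<char> text = "orc,goblin".AsSpan();
ReadOnlySpan<char> firstWord = text.Slice(0, 3); // the characters "orc", no new string

Console.WriteLine(lookup.ContainsKey(firstWord)); // True
Console.WriteLine(lookup[firstWord]);             // 3
```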
ReferenceEqualityComparer.Instance
The ReferenceEqualityComparer.Instance static property (available since .NET 5) returns a plug-in comparer that always performs referential equality and ignores any overridden Equals or GetHashCode.
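For example, it lets a collection tell apart two instances that are equal by value. A minimal sketch, with a hypothetical Point record standing in for any type with value equality:

```csharp
using System;
using System.Collections.Generic;

var a = new Point(1, 2);
var b = new Point(1, 2); // equal by value, but a different instance

var byValue = new HashSet<Point> { a, b };
var byReference = new HashSet<Point>(ReferenceEqualityComparer.Instance) { a, b };

Console.WriteLine(byValue.Count);     // 1: the record's own equality is used
Console.WriteLine(byReference.Count); // 2: only references are compared

// Hypothetical record with compiler-generated value equality.
public record Point(int X, int Y);
```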
The IStructuralEquatable interface
In the first part, we saw that structs perform structural equality by default. Sometimes we may need structural value equality for other types.
For example, we may need a way to check if the values of two arrays are the same. Arrays and tuples implement the IStructuralEquatable interface, whose Equals method takes an IEqualityComparer that is used to compare the elements one by one. We can write our own comparer for that, or we can use an equality comparer that already exists, for example:
using System.Collections; // needed for IStructuralEquatable

string name1 = "John Smith";
string name2 = "JOHN SMITH";
string[] name1Array = name1.Split();
string[] name2Array = name2.Split();
// The cast gives access to the explicitly implemented Equals(object, IEqualityComparer)
IStructuralEquatable name1ArrayStatic = name1Array;
Console.WriteLine(name1.Equals(name2)); // false
Console.WriteLine(name1ArrayStatic.Equals(name2Array, StringComparer.CurrentCultureIgnoreCase)); // true
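When the elements should simply be compared with their default equality, one option is the built-in StructuralComparisons.StructuralEqualityComparer. The sketch below uses two made-up int arrays to contrast it with the default reference comparison of arrays.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

int[] first = { 1, 2, 3 };
int[] second = { 1, 2, 3 };

// Arrays are reference types, so Equals compares references by default.
Console.WriteLine(first.Equals(second)); // False

// The structural comparer walks both arrays and compares element by element.
Console.WriteLine(StructuralComparisons.StructuralEqualityComparer.Equals(first, second)); // True

// The same check through the interface, with an explicit element comparer.
IStructuralEquatable structural = first;
Console.WriteLine(structural.Equals(second, EqualityComparer<int>.Default)); // True
```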
Conclusion
And that’s it for equality in C#. It is a big subject, but hopefully these two posts gather in one place all the different ways we can check for equality. In part 3, we will see the different ways of comparison that exist between objects.
Thank you for reading. If you think I forgot something, or if you have any questions or comments, you can use the comments section or contact me directly via the contact form or by email. Also, if you don’t want to miss any of the new blog posts, you can always subscribe to my newsletter or the RSS feed.