Collection Expressions in C#

Category : C#

Introduction

Starting with C# 12, collection expressions provide a consistent syntax for initializing different collection types. Some collections, such as Dictionary, are not yet supported, although there are plans to extend support to more collections in future versions of C#.

Collection expressions are designed with performance and extensibility in mind. They have the potential to offer greater performance than collection initializers and can be utilized for custom types as well.

Collection Initializers and Collection Expressions

Collection initializers also let us initialize collections, but they offer neither a consistent syntax nor the ability to optimize how the collection is created. This is how we have been using collection initializers:

List<int> example = new List<int> {1, 2, 3, 4, 5};
List<int> example2 = new(){1, 2, 3, 4, 5};
var example3 = new List<int>{1, 2, 3, 4, 5};

Now we can do the same using collection expressions:

List<int> example = [1, 2, 3, 4, 5];

We can use square brackets to initialize a list, and the compiler will choose the best available way to create it for us. Here are two important points to note:

  1. We must specify the type explicitly, such as List<int>, before our variable. If we use the var keyword, the compiler cannot infer the type. This isn’t a significant issue, since with the var keyword the type would be unknown to us as readers as well. A default type for collection expressions, to be used when the type cannot be inferred, was considered, but the language designers decided against it.

  2. The phrase ‘the best available way’ is crucial. When we use collection initializers, the lowered C# code essentially calls the collection’s Add method. In the previous example, this results in five calls to the Add method of the List. However, with collection expressions, the best available method is used depending on the collection and the desired initialization.
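To make the first point concrete, here is a minimal sketch of the same bracket syntax converting to several target types. The variable names are made up for illustration, and the snippet assumes .NET 8 or later so that ImmutableArray<T> supports collection expressions.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;

// One syntax, several target types; the declared type drives how the collection is built.
int[] array = [1, 2, 3];
List<int> list = [1, 2, 3];
Span<int> span = [1, 2, 3];
ImmutableArray<int> immutable = [1, 2, 3];

// An empty collection expression works too, as long as a target type is known.
List<string> names = [];

// var inferred = [1, 2, 3]; // does not compile: there is no target type to convert to

Console.WriteLine(array.Length + list.Count + span.Length + immutable.Length + names.Count); // prints 12
```

The commented-out line is the var case described above: without a target type, the compiler reports an error instead of guessing a collection type.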

In the same example, the collection expression first creates a List with room for five elements. This matters because it avoids repeatedly resizing the List’s internal array: a List created without a capacity allocates room for four elements on the first Add, and then allocates a new array of twice the size, copying the existing elements over, each time it fills up. Every discarded array becomes garbage that the garbage collector has to reclaim. For more information on that see Lists in C#. After that, the lowered code uses the CollectionsMarshal class, whose methods are unsafe in the sense that they bypass the List’s usual checks, to access the underlying data of our List and initialize it. The lowered code would look something like this:

int num = 5;
List<int> list = new List<int>(num);
CollectionsMarshal.SetCount(list, num);
Span<int> span = CollectionsMarshal.AsSpan(list);
int num2 = 0;
span[num2] = 1;
num2++;
span[num2] = 2;
num2++;
span[num2] = 3;
num2++;
span[num2] = 4;
num2++;
span[num2] = 5;
num2++;

This code can vary depending on the collection we are using, because the compiler picks a construction strategy for each target type. For types annotated with the CollectionBuilder attribute, collection expressions use a factory class with a builder method that is responsible for constructing the collection. Unlike collection initializers, which rely on the Add method of each collection, this allows us to write code in the builder method that optimizes the initialization process. We can also create our own factory classes and builder methods for custom collections, ensuring the most efficient initialization for our specific needs.
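To see why sizing the List up front matters, the following sketch counts how often the backing array is replaced when five elements are added one at a time, and compares that with a List created with a capacity of five, as in the lowered code above. The variable names are illustrative, and the capacities in the comments are those of the current List<T> implementation.

```csharp
using System;
using System.Collections.Generic;

// Growing one Add at a time: the backing array is replaced whenever it fills up.
var grown = new List<int>();
int lastCapacity = grown.Capacity;
int reallocations = 0;
for (int i = 1; i <= 5; i++)
{
    grown.Add(i);
    if (grown.Capacity != lastCapacity)
    {
        reallocations++;
        lastCapacity = grown.Capacity;
    }
}

// Sized up front, like new List<int>(num) in the lowered code.
var presized = new List<int>(5) { 1, 2, 3, 4, 5 };

Console.WriteLine($"grown: capacity {grown.Capacity}, reallocations {reallocations}"); // grown: capacity 8, reallocations 2
Console.WriteLine($"presized: capacity {presized.Capacity}");                          // presized: capacity 5
```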

The Collection Expression Type

A collection expression has no inherent type. Therefore, if the compiler cannot infer the type of the collection during declaration and initialization, an error will occur. However, a collection expression will implicitly convert to a type. It will implicitly convert to the Span<T> and ReadOnlySpan<T> types, to an array, to any type that supports the IEnumerable<T> interface AND has an accessible Add method, as well as to any of the following interfaces:

System.Collections.Generic.IEnumerable<T>
System.Collections.Generic.IReadOnlyCollection<T>
System.Collections.Generic.IReadOnlyList<T>
System.Collections.Generic.ICollection<T>
System.Collections.Generic.IList<T>

When more than one conversion is applicable, for example when the compiler chooses between method overloads, the rule is that a Span<T>, a ReadOnlySpan<T> or another ref struct type is generally preferred over a non-ref-struct type, and that a concrete type is preferred over an interface type.
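The preference rule is easiest to observe with overloaded methods. In the sketch below, the Overloads class and its method names are hypothetical; each overload simply reports which parameter type the compiler selected for the same collection expression.

```csharp
using System;
using System.Collections.Generic;

Console.WriteLine(Overloads.Describe([1, 2, 3])); // ReadOnlySpan<int>
Console.WriteLine(Overloads.Pick([1, 2, 3]));     // List<int>

// Hypothetical helper class: every overload returns the name of its own parameter type.
static class Overloads
{
    // A span overload competes with an array overload.
    public static string Describe(ReadOnlySpan<int> values) => "ReadOnlySpan<int>";
    public static string Describe(int[] values) => "int[]";

    // A concrete type competes with an interface it implements.
    public static string Pick(List<int> values) => "List<int>";
    public static string Pick(IEnumerable<int> values) => "IEnumerable<int>";
}
```

Not every pair of overloads can be ranked this way. When the compiler cannot decide, the call is ambiguous and a cast on the collection expression is needed to pick a target type.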

Because collection expressions don’t have a type of their own, they can be passed as an argument to any method that expects a supported collection type. For example, the method:

void ExampleMethod(List<int> aList)
{
    // your code here
}

can be called like this:

ExampleMethod([1, 2, 3, 4, 5]);

This is helpful, because if we want to change the parameter type to another type that also supports collection expressions, we don’t have to change the statement that makes the call.

Finally, collection expressions support the spread element (..), which inlines the elements of another collection. For example, we can call our ExampleMethod like this:

List<int> example = new List<int> {1, 2, 3};
List<int> example2 = new(){1, 2, 3};

ExampleMethod([..example,..example2]);

This will create the list “1 2 3 1 2 3”. We can also combine the spread element with an already initialized collection using Indices and Ranges:

ExampleMethod([..example,..example2[^2..]]);

This time the result is the list “1 2 3 2 3”.
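Spread elements can also be mixed with ordinary elements in the same expression. The sketch below uses made-up variable names, and takes the range from an array, since arrays are known to support ranges.

```csharp
using System;
using System.Collections.Generic;

List<int> first = [1, 2, 3];
int[] second = [4, 5, 6];

// Literal elements and spreads can be mixed; the sources are read left to right.
// second[^2..] takes the last two elements of the array.
List<int> combined = [0, .. first, .. second[^2..], 7];

Console.WriteLine(string.Join(" ", combined)); // 0 1 2 3 5 6 7
```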

Be Careful Of Extra Allocations

When our target type is IReadOnlyList<T> or IEnumerable<T>, a collection expression incurs an additional allocation. First the elements are stored in an array, and then a new object is allocated that wraps it. This new object is a collection that doesn’t allow any modifications at runtime. This is a minor detail that may be important in performance-critical scenarios.

Refactoring Considerations

When refactoring existing code to use collection expressions, we must be mindful of the following scenario. Consider the method:

IEnumerable<int> RefactorExample() => new List<int> { 1, 2, 3 };

If we refactor this to a collection expression, it may break compatibility for anyone relying on casting the result of RefactorExample back to a List. This would no longer be possible since the [1, 2, 3] collection was never of type List<int>.
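The difference can be sketched with two local functions, named Before and After here purely for illustration. Both return the same elements, but only the first one promises a List<int> at run time.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

IEnumerable<int> Before() => new List<int> { 1, 2, 3 };
IEnumerable<int> After() => [1, 2, 3];

// Both sequences contain the same elements...
Console.WriteLine(Before().SequenceEqual(After())); // True

// ...but only the first is guaranteed to be a List<int> at run time.
Console.WriteLine(Before() is List<int>); // True

// The concrete type behind After() is chosen by the compiler, so callers should not depend on it.
Console.WriteLine(After() is List<int>);
```

A caller that does (List<int>)After() therefore risks an InvalidCastException, while the same cast on Before() succeeds.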

Creating Custom Types That Support Collection Expressions

If we have our own collection, we can support the collection expression syntax by creating an accessible static class that has a Create method, and by decorating our collection type with the CollectionBuilderAttribute. The Create method must return an object of our collection type and take a ReadOnlySpan<T> as its parameter.

Let’s see an example with a custom collection. Let’s suppose that we have the following:

public class MyCollection
{
   private readonly int[] _phoneDigits = new int[10];

   public MyCollection(int[] phoneDigits)
   {
      if(phoneDigits.Length != _phoneDigits.Length)
         throw new ArgumentOutOfRangeException(nameof(phoneDigits),"A phone has to have 10 digits!");
      
      _phoneDigits = phoneDigits;
   }

   public IEnumerator<int> GetEnumerator() => _phoneDigits.AsEnumerable().GetEnumerator();
}

Then we can use the collection expression syntax:

MyCollection myCollection = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9];

This works if we create the static class:

internal static class MyCollectionBuilder
{
   internal static MyCollection Build(ReadOnlySpan<int> digits) => new MyCollection(digits.ToArray());
}

And annotate the MyCollection class with:

[System.Runtime.CompilerServices.CollectionBuilder(typeof(MyCollectionBuilder), "Build")]

I chose this example to show some important details:

  1. The name of the method that creates the collection from a collection expression doesn’t have to be Create. It can be whatever we want, as long as we use that name in the CollectionBuilder attribute.
  2. Our collection doesn’t have to implement any interface: not IList<T>, not ICollection<T>, not even IEnumerable<T>. As long as it has a GetEnumerator method that returns an IEnumerator<T>, we are ok.
  3. The argument of our Create method has to be a ReadOnlySpan<T>, where T is the type of the elements in our collection.

Just because we can, doesn’t mean we should. Ideally, our collection should implement a collection interface, at least IEnumerable<T>, which only requires the GetEnumerator method. This approach enhances code readability and allows our collection to be used as a parameter for any method that accepts an IEnumerable. Additionally, unless there is a compelling reason to change the name of the Create method, it should remain unchanged to avoid confusing readers.

Beyond these points, there are other best practices that define a “well-behaved” collection. These practices ensure that collection expressions function correctly. Failure to follow these practices results in undefined behavior when using collection expressions with our collection:

  1. The value of Count or Length, if present in our collection, should always return the actual number of elements in the collection.
  2. An AddRange method, if implemented, should yield the same result as adding the elements individually and in the same order.

Finally, the static Create method can reside within our collection, but the class implementing the Create method cannot be a generic class.
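For a generic collection, this means the builder has to live in a separate non-generic class, and the Create method itself becomes generic. Here is a minimal sketch of that arrangement; Bag<T> and BagBuilder are hypothetical names used only for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

Bag<string> words = ["alpha", "beta", "gamma"];
Console.WriteLine(words.Count);             // 3
Console.WriteLine(string.Join(",", words)); // alpha,beta,gamma

// Hypothetical generic collection; the attribute points at a non-generic builder class.
[CollectionBuilder(typeof(BagBuilder), nameof(BagBuilder.Create))]
public sealed class Bag<T> : IEnumerable<T>
{
    private readonly T[] _items;

    internal Bag(T[] items) => _items = items;

    // Count always reflects the real number of elements, as a well-behaved collection should.
    public int Count => _items.Length;

    public IEnumerator<T> GetEnumerator() => ((IEnumerable<T>)_items).GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class BagBuilder
{
    // The method is generic instead of the class.
    // The span is copied into an array so the collection owns its own storage.
    public static Bag<T> Create<T>(ReadOnlySpan<T> items) => new Bag<T>(items.ToArray());
}
```

Implementing IEnumerable<T> here follows the recommendation above, and keeping the name Create avoids surprising readers who know the convention.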

Conclusion

That covers collection expressions. Most of the time, their usage is straightforward as they unify the way we initialize collections using square brackets, []. When we want to support collection expressions in our own collections, the requirements are minimal. However, we should also follow some recommended practices to ensure that any initialization with a collection expression of our type behaves as expected.

Thank you for reading, and as always, if you have any questions or comments you can use the comments section, or contact me directly via the contact form or by email. Also, if you don’t want to miss any of the new blog posts, you can subscribe to my newsletter or the RSS feed.

