Enumerables in .NET

IEnumerable and IEnumerator

Collections in .NET implement IEnumerable. This interface has just one method: GetEnumerator(). It returns an implementation of IEnumerator.

IEnumerator allows us to (MSDN):

move between elements in the collection
read the current element
reset state

Why do we even need IEnumerable instead of just implementing IEnumerator in every collection type? Because we want multiple actors to be able to enumerate a single collection. IEnumerator is stateful: it tracks the current position. We can’t share the same instance between two different services, because neither of them would get a full picture of the collection. Instead, each service should retrieve its own instance of IEnumerator (via IEnumerable) and use it.
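To make the point about state concrete, here is a minimal sketch with two enumerators over the same list. The class and method names (IndependentEnumerators, Demo) are made up for illustration; only List<T> and IEnumerator<T> come from .NET.

```csharp
using System.Collections.Generic;

public static class IndependentEnumerators
{
    // Returns the elements that two separate enumerators point at
    // after being advanced a different number of times.
    public static (int First, int Second) Demo()
    {
        var numbers = new List<int> { 10, 20, 30 };

        // Each call to GetEnumerator() gives us a fresh position tracker.
        using IEnumerator<int> a = numbers.GetEnumerator();
        using IEnumerator<int> b = numbers.GetEnumerator();

        a.MoveNext();
        a.MoveNext(); // a now points at 20
        b.MoveNext(); // b still points at 10

        return (a.Current, b.Current);
    }
}
```

Advancing one enumerator has no effect on the other, which is exactly why each consumer should ask the IEnumerable for its own.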

Examples

Array

An example of an IEnumerable is an Array. Here we can see that it implements IEnumerable.GetEnumerator(). The enumerator itself can be found here. It keeps a reference to the Array. One of its members is the Current property, which simply returns the element at the current _index. MoveNext() is also there; it basically increments the _index.

Infinite Enumerator

This is a simple enumerator that just returns consecutive numbers, forever (well, until it overflows):

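using System.Collections;

public class InfiniteEnumerable : IEnumerable
{
    // Every caller gets its own enumerator with its own state.
    public IEnumerator GetEnumerator() => new InfiniteEnumerator();
}

public class InfiniteEnumerator : IEnumerator
{
    private int _current = 0;

    // The non-generic interface exposes the value as object.
    public object Current => _current;

    public bool MoveNext()
    {
        _current++;
        return true; // there is always a next element
    }

    // Required by IEnumerator; goes back to the initial state.
    public void Reset() => _current = 0;
}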

Here’s how we’d use it:

foreach (var value in new InfiniteEnumerable())
{
    Console.WriteLine(value);
}

It would just print numbers forever.

foreach

The foreach keyword is basically syntactic sugar that relies on IEnumerable and IEnumerator:

Syntactic sugar:

foreach (var element in collection)
{
    Console.WriteLine(element);
}

Behind the scenes (simplified; the compiler also disposes the enumerator if it implements IDisposable):

var enumerator = collection.GetEnumerator();
while(enumerator.MoveNext())
{
    Console.WriteLine(enumerator.Current);
}

yield

The yield keyword is a shortcut that allows us to create our own IEnumerable/IEnumerators. For example, to create an infinite enumerator, we don’t have to create new implementations of IEnumerable and IEnumerator. All we have to do is this:

public IEnumerable<int> GetNumbersForEver()
{
    var i = 0;
    while(true)
    {
        yield return i++;
    }
}

It’s a method that uses the yield keyword. Each time we ask the returned IEnumerable for the next value, the method runs until it reaches the next yield. In our case there’s just one line of code in the loop, but there could be more.

So, basically, yield creates a custom IEnumerator (behind the scenes) that returns values only when we ask for them. The code in the method using yield runs ONLY when we ask for the next element. It is very similar to how LINQ works.
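The "runs only when asked" behaviour is easy to observe with a small sketch. LazyYieldDemo, GetNumbers and Log are hypothetical names used only for this illustration; the method records a message before each yield so we can see when its body actually executes.

```csharp
using System.Collections.Generic;

public static class LazyYieldDemo
{
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> GetNumbers()
    {
        Log.Add("before 1");
        yield return 1;
        Log.Add("before 2");
        yield return 2;
    }

    // Returns how many log entries exist at three points in time:
    // after calling the method, after the first MoveNext(), after the second.
    public static (int AfterCall, int AfterFirst, int AfterSecond) Run()
    {
        Log.Clear();

        IEnumerable<int> numbers = GetNumbers(); // nothing has run yet
        var afterCall = Log.Count;

        using IEnumerator<int> e = numbers.GetEnumerator();
        e.MoveNext(); // runs up to the first yield
        var afterFirst = Log.Count;

        e.MoveNext(); // resumes and runs up to the second yield
        var afterSecond = Log.Count;

        return (afterCall, afterFirst, afterSecond);
    }
}
```

Calling GetNumbers() alone logs nothing; each MoveNext() advances the method body by one yield.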

LINQ

A small remark about LINQ and IEnumerable: it is not always the case that one element is pulled from the source at a time. In some cases we have to know all the values up front before we’re able to return even the first one. Examples are:

Reverse
OrderBy

We can’t reverse a collection if we don’t know the last value, and we can’t sort it without seeing every element.
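Here is a rough way to see the difference between streaming and buffering operators. BufferingDemo, Source and PulledForFirst are names invented for this sketch; the source counts how many items were pulled from it before the first result became available.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BufferingDemo
{
    private static int _pulled;

    // A five-element source that counts every item it hands out.
    private static IEnumerable<int> Source()
    {
        for (var i = 1; i <= 5; i++)
        {
            _pulled++;
            yield return i;
        }
    }

    // Applies a LINQ operator, asks for the first result only,
    // and reports how many source items had to be pulled for it.
    public static int PulledForFirst(Func<IEnumerable<int>, IEnumerable<int>> op)
    {
        _pulled = 0;
        op(Source()).First();
        return _pulled;
    }
}
```

With this helper, PulledForFirst(s => s.Where(n => n > 0)) needs a single item, while PulledForFirst(s => s.Reverse()) and PulledForFirst(s => s.OrderBy(n => n)) have to go through all five.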

IQueryable

The IQueryable interface looks similar to IEnumerable (in fact, it inherits from it), but it works in a very different way. IQueryable is mostly used with LINQ and data providers. Its advantage is that it allows us to construct a query before executing it against a data source (e.g., a database). IQueryable has a property called Expression. This is the expression tree that a given instance of IQueryable represents. For example (using Entity Framework):

var people = context.People.Where(p => p.Name.StartsWith("B"));

This is turned into an expression tree, stored in the IQueryable.Expression.

Let’s say I add another line of code to what I had:

var threePeople = people.Take(3);

If we were using IEnumerable, we’d fetch all the people from the database first and then (locally) extract three entities from that, which is obviously inefficient. However, thanks to IQueryable, adding Take(3) modified the expression tree, so the generated SQL query could use something like TOP to return just three rows.

Here’s a quote from MSDN:

The second property (Expression) gives you the expression that corresponds to the query. This is quintessential essence of IQueryable’s being. The actual ‘query’ underneath the hood of an IQueryable is an expression that represents the query as a tree of LINQ query operators/method calls. This is the part of the IQueryable that your provider must comprehend in order to do anything useful. If you look deeper you will see that the whole IQueryable infrastructure (including the System.Linq.Queryable version of LINQ standard query operators) is just a mechanism to auto-construct expression tree nodes for you. When you use the Queryable.Where method to apply a filter to an IQueryable, it simply builds you a new IQueryable adding a method-call expression node on top of the tree representing the call you just made to Queryable.Where.
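We can peek at that tree without a database. The sketch below uses AsQueryable() over an in-memory list as a stand-in for a real provider such as Entity Framework; the class name ExpressionTreeDemo and the sample names are assumptions made for the example.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class ExpressionTreeDemo
{
    // Builds Where(...).Take(3) and returns the names of the two
    // method-call nodes at the top of the resulting expression tree.
    public static (string Outer, string Inner) DescribeQuery()
    {
        var names = new List<string> { "Bob", "Alice", "Bill", "Barbara", "Ben" };

        IQueryable<string> people = names.AsQueryable().Where(n => n.StartsWith("B"));
        IQueryable<string> threePeople = people.Take(3);

        // Nothing has been executed so far; we only built a tree.
        var takeCall = (MethodCallExpression)threePeople.Expression;
        var whereCall = (MethodCallExpression)takeCall.Arguments[0];

        return (takeCall.Method.Name, whereCall.Method.Name);
    }
}
```

The outermost node is the Take call and its first argument is the Where call. A real provider would walk this tree and translate it into a single query for the data source.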

If we had a similar case with LINQ to Objects, IEnumerables would be fine. Here’s an example:

var top3 = GetCollection().Where(n => n < 6).Take(3);

foreach (var item in top3)
{
    Console.WriteLine(item);
}

IEnumerable<int> GetCollection()
{
    for (int i = 0; i < 8; i++)
    {
        Console.WriteLine("YIELD");
        yield return i;
    }
}

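YIELD is printed only three times, even though the source could produce eight items: Take(3) stops pulling from the source as soon as it has three elements. That one-by-one pulling is cheap in memory, but with external systems (like databases) we can’t go and ask for items one by one, as that would be really inefficient. We need some mechanism to prepare an optimized query and send it once. That’s what IQueryable is for.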

IAsyncEnumerable and IAsyncEnumerator

An async version of IEnumerable and IEnumerator is the pair of IAsyncEnumerable and IAsyncEnumerator.

Here’s a comparison of their methods/properties:

Enumerable:

IEnumerable	IAsyncEnumerable
GetEnumerator	GetAsyncEnumerator

Enumerator:

IEnumerator<T>	IAsyncEnumerator<T>
Current	Current
MoveNext	MoveNextAsync
Dispose (via IDisposable)	DisposeAsync (via IAsyncDisposable)
Reset	-

Really, the most important difference between the traditional and the async enumerable is that the latter supports asynchronous loading of the next item via MoveNextAsync.

Here’s an example of how to use the IAsyncEnumerable:

var collection = GetCollection();
await foreach(var item in collection)
{
    Console.WriteLine(item);
}

private async IAsyncEnumerable<string> GetCollection()
{
    for (var i = 0; i < 5; i++)
    {
        yield return await LoadItemFromTheWeb(i);
    }
}

The compiler actually translates the await foreach... into code that invokes await e.MoveNextAsync() in a loop.
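Written out by hand, that loop looks roughly like the sketch below. It is a simplification of what the compiler generates, and the names AwaitForeachDemo, GetNumbersAsync and ConsumeManuallyAsync are invented for the example.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitForeachDemo
{
    public static async IAsyncEnumerable<int> GetNumbersAsync()
    {
        for (var i = 0; i < 3; i++)
        {
            await Task.Yield(); // stands in for real async work
            yield return i;
        }
    }

    // Roughly equivalent to:
    //   await foreach (var item in GetNumbersAsync()) result.Add(item);
    public static async Task<List<int>> ConsumeManuallyAsync()
    {
        var result = new List<int>();
        IAsyncEnumerator<int> e = GetNumbersAsync().GetAsyncEnumerator();
        try
        {
            while (await e.MoveNextAsync())
            {
                result.Add(e.Current);
            }
        }
        finally
        {
            await e.DisposeAsync();
        }

        return result;
    }
}
```

Awaiting ConsumeManuallyAsync() gives a list containing 0, 1 and 2, the same items an await foreach over GetNumbersAsync() would see.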

CancellationToken

Async methods normally support cancellation. IAsyncEnumerable’s GetAsyncEnumerator is no different. It accepts a CancellationToken. However, since we’re rarely calling that method directly (we use await foreach instead, which deals with enumerators behind the scenes), there’s a special way to pass the CancellationToken:

await foreach (int item in GetAsyncEnumerable().WithCancellation(ct))
  Console.WriteLine(item);

The WithCancellation(ct) extension method passes the token.

On the other side, our methods that return IAsyncEnumerable also need to be able to accept a CancellationToken. We use a special attribute for that, [EnumeratorCancellation] from the System.Runtime.CompilerServices namespace:

private async IAsyncEnumerable<string> GetCollection(
    [EnumeratorCancellation] CancellationToken cancellationToken = default
)
{
    for (var i = 0; i < 5; i++)
    {
        yield return await LoadItemFromTheWeb(i, cancellationToken);
    }
}

The compiler will know that it should pass the CancellationToken (that we could provide via WithCancellation(...)) to our method.
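Putting both sides together, a minimal end-to-end sketch could look like this. CancellationDemo, CountAsync and CountUntilCancelledAsync are hypothetical names; the producer is an endless counter that checks the token before every item, and the consumer cancels after it has seen a given number of items.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // Endless producer; the token arrives here via WithCancellation(...).
    public static async IAsyncEnumerable<int> CountAsync(
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        var i = 0;
        while (true)
        {
            cancellationToken.ThrowIfCancellationRequested();
            await Task.Yield(); // stands in for real async work
            yield return i++;
        }
    }

    // Consumes items until the token is cancelled, then reports how many were seen.
    public static async Task<int> CountUntilCancelledAsync(int cancelAfterItems)
    {
        using var cts = new CancellationTokenSource();
        var seen = 0;
        try
        {
            await foreach (var item in CountAsync().WithCancellation(cts.Token))
            {
                seen++;
                if (seen == cancelAfterItems)
                {
                    cts.Cancel();
                }
            }
        }
        catch (OperationCanceledException)
        {
            // Expected: the producer noticed the cancelled token.
        }

        return seen;
    }
}
```

Note that CountAsync() is called without arguments here. The token still reaches the method body, because the attribute tells the compiler to use the one given to GetAsyncEnumerator.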

LINQ

LINQ does not support IAsyncEnumerables by default. There’s a NuGet package for that: System.Linq.Async. Adding it is all we need to do; the namespace to import is just System.Linq, as with standard LINQ.
