Ayende @ Rahien

It's a girl

Checking For Empty Enumerations

Phil Haack has an interesting post about this topic, where he presents the following solution:

public static bool IsNullOrEmpty<T>(this IEnumerable<T> items) {
    return items == null || !items.Any();
}

This solution, unfortunately, suffers from a common problem related to handling IEnumerables. The assumption that you can iterate over enumerable more than once. This hold true for things like collections, but in many cases, this sort of code will silently hide data:

var files = Directory.EnumerateFiles(".","*.cs");
if(files.IsNullOrEmpty())
{
    Cosnole.WriteLine("No files");
}
else
{
   foreach(var file in files)
   {
          Console.WriteLine(file);
   }
}

The first file will never appear here.

A better solution is:

public static bool IsNullOrEmpty<T>(this IEnumerable<T> items, out IEnumerable<T> newItems) 
{
    newItems = items;
    if(items == null)
        return false;
    
    var enumerator = items.GetEnumerator();
    if(enumerator.MoveNext() == false)
        return false;
        
    newItems = new[]{enumerator.Current}.Concat(enumerator);
    
    return true;
}

That will not lose data.

Comments

SYM
06/10/2010 09:53 AM by
SYM

Good! But if we use the better solution, the client code will be like this:

IEnumerable <t items = ...;

if (items.IsNullOrEmpty(items)) {

  // ...

}

I think the parameter of IsNullOrEmpty() method would be confusing.

Dennis
06/10/2010 10:07 AM by
Dennis

But your solution will fail on infinite ranges.

Ayende Rahien
06/10/2010 10:12 AM by
Ayende Rahien

Dennis, why would it fail?

SYM,

I agree it can be confusing, but that is about the only thing that can work in this situation

Dennis
06/10/2010 10:20 AM by
Dennis

Actually it doesn't compile... There isn't a Concat operator for IEnumerators

Tom de Koning
06/10/2010 10:31 AM by
Tom de Koning

You might want to check up on that again

[TestClass]

public class EnumerableTests

{

    [TestMethod]

    public void will_enumerate_all_items()

    {

        var counter = 0;

        var ints = new[] { 1, 2, 3, 4 }.AsEnumerable();

        if (ints.Any())

        {

            foreach (var i in ints)

            {

                counter++;

            }

        }

        Assert.AreEqual(3, counter);

    }

}
Dennis
06/10/2010 10:54 AM by
Dennis

This will work, but it is vastly more code. (sorry I didnt dump i somewhere)

public class IsNullOrEmptyChecker <t : IEnumerable <t
{

  private IEnumerator

<t _enumerator;

  private readonly IEnumerable

<t _orig;

  private bool _isEmpty;


  public IsNullOrEmptyChecker(IEnumerable

<t orig)

  {

     _orig = orig;

  }


  private void Init()

  {

     if (_enumerator != null)

        return;

     if (_orig != null)

     {

        _enumerator = _orig.GetEnumerator();

        _isEmpty = !_enumerator.MoveNext();

     }

     else

     {

        _isEmpty = true;

     }

  }


  public bool IsNullOrEmpty

  {

     get

     {

        Init();

        return _isEmpty;

     }

  }


  public IEnumerator

<t GetEnumerator()

  {

     Init();

     if (_isEmpty)

        yield break;

     yield return _enumerator.Current;

     while (_enumerator.MoveNext())

        yield return _enumerator.Current;

     _enumerator = null;

  }


  IEnumerator IEnumerable.GetEnumerator()

  {

     return GetEnumerator();

  }

}

Use: var files = new IsNullOrEmptyChecker <int(Directory.EnumerateFiles(".","*.cs"));

Dennis
06/10/2010 10:56 AM by
Dennis

Tom, that will fail with this:

  private static bool hest = false;

  public static IEnumerable

<int GetEnumerator()

  {

     if (hest)

        throw new Exception();

     hest = true;

     yield return 1;

     yield return 2;

  }

An enumerable that only allows 1 iteration

KAE
06/10/2010 11:09 AM by
KAE

"The first file will never appear here."

I'm afraid you didn't try to run your code.

Because it will appear.

So your point is valid only for some really strange implementation of IEnumerable, that change content based on number of iteration of enumerator that it return through call to GetEnumerator.

LL
06/10/2010 11:09 AM by
LL

Ayende, enumerators are IDisposable and should be disposed.

Patrik H&#228;gne
06/10/2010 11:10 AM by
Patrik Hägne

That would be a totally wrong implemented IEnumerable. I think you confusing IEnumerable(T) with IEnumerator(T).

LL
06/10/2010 11:13 AM by
LL

Patrik, below is the structure of IEnumerator <t, courtesy of Reflector:

public interface IEnumerator <t : IDisposable, IEnumerator

{

// Properties

T Current { get; }

}

Morne
06/10/2010 11:18 AM by
Morne

Your example disproves your assertion.

Would be nice to see an actual example where Phil's IsNullOrEmpty() check will lose the first item.

Your assertion seems to be based on the assumption that when a foreach contstruct is called on a IEnumerable collection, it will use an existing enumerator for the colleciton if it is available instead of getting a new one. This assumption doesn't seem right to me.

Patrik H&#228;gne
06/10/2010 11:36 AM by
Patrik Hägne

LL, I wasn't commenting on your comment. I was commenting on Ayendes blog post.

An that IEnumerable would return an enumerator that skips the first element the second time GetEnumerator is called would be totally wrong implemented IEnumerable.

Patrik H&#228;gne
06/10/2010 11:58 AM by
Patrik Hägne

My last comment should read:

An IEnumerable that would return an enumerator that skips the first element the second time GetEnumerator is called would be totally wrong implemented IEnumerable.

Diego Mijelshon
06/10/2010 12:48 PM by
Diego Mijelshon

Oren,

As Patrick noted, separate enumerations of an enumerable should start at the first element again.

In fact, the example you gave (Directory.EnumerateFiles) is wrong: it works as expected when doing .Any() first.

tobi
06/10/2010 01:21 PM by
tobi

Ayende, did you actually run the code. I think that Directory.EnumerateFiles(".","*.cs"); returns an IEnumerable. The IsNullOrEmpty method will create ea fresh enumerator from it as well as the foreach loop.

tobi
06/10/2010 01:31 PM by
tobi

But you are right that this will do side-effects twice although only for one element.

Jason
06/10/2010 01:45 PM by
Jason

I don't like the semantics of this method. Calling a method in an instance fashion to check for null should throw a null pointer exception if it's null.

We don't say "somestring".IsNullOrEmpty, we use String.IsNullOrEmpty(someString)

Seems counter-intuitive.

Marcel Popescu
06/10/2010 01:48 PM by
Marcel Popescu

[OT] Jason, you mean I'm the only one who wrote an IsNullOrEmpty extension method? "string".IsNullOrEmpty() is part of all my projects :)

Jason
06/10/2010 01:59 PM by
Jason

Marcel,

Oh I've done it. But I decided against it b/c it just seemed so wrong to have a NULL instance return something from a method call rather than throwing and exception.

Patrik H&#228;gne
06/10/2010 02:09 PM by
Patrik Hägne

I totally agree with Jason that the semantics are off, but I guess that's a comment to Phils original post, not to this post.

Haacked
06/10/2010 09:22 PM by
Haacked

As many commenters mentioned, the implementation of the Any method gets a new Enumerator which starts at the beginning. So it should be safe to call. I wasn't able to produce your results.

Robin
06/11/2010 05:01 AM by
Robin

As a side note your implementation is inversed from the name. Or I might be missing something obvious? I. e. the method will return false when the IEnumerable is null or empty which is the opposite of you might expect when calling IsNullOrEmpty().

Bryan Avery
06/11/2010 09:31 AM by
Bryan Avery

It works fine for me, sounds like brain thinking before testing a solution.

The other issue with this is that you are making a copy of the object, lets just hope the object is not too large...!

Royston Shufflebotham
06/11/2010 12:15 PM by
Royston Shufflebotham

Oren, if you were talking about IEnumerator <t, you'd be bang on, but with IEnumerable <t you get a brand new IEnumerator <t each time. As long as the underlying data doesn't change, repeated calls to IEnumerable <t.First() would give you the same value each time, as each call gets a new IEnumerator <t.

(With Phil H's original post, we really shouldn't have to be checking for null IEnumerable <t objects - APIs that return null IEnumerables are really broken; you just don't want to have two representations of 'no value', and you can always return Enumerable.Empty <t() if performance is crazily critical.)

Paul Irwin
06/11/2010 01:13 PM by
Paul Irwin

In your example Ayende, what about IEnumerator.Reset()? Although I agree with previous post about it being IDisposable and should be disposed, which usually would call Reset. Enumerable.Any() in System.Linq does the using pattern to reset the enumerator after it is used.

Luis Abreu
06/11/2010 02:36 PM by
Luis Abreu

Guys, can't believe this.

X x = null;

x.DoSomething();

what's the expected result? null reference exception. Why are these extension methods behaving differently from instance methods? Why aren't they simply helper methods? There's simply no reasoning for keeping this madness...

Erlis Vidal
06/11/2010 04:27 PM by
Erlis Vidal

Hi Luis,

I totally agree, is damn confussing. It should be a Helper.

mike
06/11/2010 08:25 PM by
mike

Wow, that is some tricky code

Dhananjay Goyani
06/14/2010 07:51 AM by
Dhananjay Goyani

+1 for Phil. Any() doesn't iterate next onto passed enumerator. Also Oren's solution doesn't seem to be good one.

Lucas O.
06/14/2010 03:07 PM by
Lucas O.

Enumerables can be iterated just once.

It is not the same thing use the same enumerator instance that use 2 different instances. Remember that we cannot modify the iterated collection when we use enumerator, in fact if we do that, the next call to MoveNext() method throws an exception.

We can get so many as we want to enumerators but between

its creations, lot of things can happen. In the Oren's example, he use two differents enumerators but between them can be a line deleting all the files. That is the problem.

hazzik
06/17/2010 04:53 PM by
hazzik

In following sample booth implementation with self.Any()==false and self.Count() = 0 loses data. Any lose only first element, and Count lose all of it.

    static void Main(string[] args)

    {

        var enumerable = Enumerable();


        if (enumerable.IsNullOrEmpty())

        {

            Console.WriteLine("enumerable is empty");

        }

        else

        {

            foreach (var i in enumerable)

                Console.WriteLine(i);

        }

        Console.ReadKey();

    }


    private static int item;


    private static IEnumerable

<int Enumerable()

    {

        while (item < 10)

            yield return item++;

    }
Patrik
06/18/2010 09:37 AM by
Patrik

@hazzik That's just one more way in which the c# implementation of IEnumerable and IEnumerator with the yield-keyword is broken.

Eduard Tom&#224;s
06/20/2010 12:02 PM by
Eduard Tomàs

@Patrik

I disagree... the @hazzik problem is about using a static variable, not yield...

Yes you can build non-valid IEnumerables with yield, but that does not mean that yield was broken :)

Greetings

SinsI
06/26/2010 08:52 AM by
SinsI

1) You don't need to check for null - if program fails with that, it means there is an error that requires your immediate attention - so it SHOULD stop execution . Otherwise, you can always catch the resulting exception.

2) Why are there no .IsEmpty member in IEnumerable? Why must we try to re-invent the wheel?

Patrik
06/26/2010 09:47 PM by
Patrik

@Eduard

You're right, I didn't read the code in detail. I didn't catch the static variable. This is clearly a case where the implementation of the method that returns the IEnumerable is faulty, not the implementation of yield. It's still broken though in other ways.

Comments have been closed on this topic.