Beware of big Task Parallel Library Operations

Take a look at the following code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var list = Enumerable.Range(0, 10 * 1000).ToList();

        var task = ProcessList(list, 0);

        Console.WriteLine(task.Result);
    }

    private static Task<int> ProcessList(List<int> list, int pos, int acc = 0)
    {
        if (pos >= list.Count)
        {
            var tcs = new TaskCompletionSource<int>();
            tcs.TrySetResult(acc);
            return tcs.Task;
        }

        return Task.Factory.StartNew(() => list[pos] + acc)
            .ContinueWith(task => ProcessList(list, pos + 1, task.Result))
            .Unwrap();
    }
}

This is a fairly standard piece of code: it runs a “complex” async operation for each item and then moves on to the next one. In this case it is important to run the operations in the order they were given, and the real code actually does something that needs to be async (fetching data from a remote server).

It is probably easier to figure out what is going on when you look at the C# 5.0 code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var list = Enumerable.Range(0, 10 * 1000).ToList();

        var task = ProcessList(list, 0);

        Console.WriteLine(task.Result);

    }

    private static async Task<int> ProcessList(List<int> list, int pos, int acc = 0)
    {
        if (pos >= list.Count)
        {
            return acc;
        }

        var result = await Task.Factory.StartNew(() => list[pos] + acc);

        return await ProcessList(list, pos + 1, result);
    }
}

I played with user mode scheduling in .NET a few times in the past, and one of the things that I was never able to resolve properly was the issue of stack depth. I hoped that the TPL would resolve it, but it appears that it didn’t. Both code samples here will throw a StackOverflowException when run.

It sucks, quite frankly. I understand why it works this way, but I am quite annoyed by it. I expected this to be solved somehow. Using C# 5.0, I know how to solve it:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var list = Enumerable.Range(0, 10 * 1000).ToList();

        var task = ProcessList(list);

        Console.WriteLine(task.Result);

    }

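    private static async Task<int> ProcessList(List<int> list)
    {
        var acc = 0;
        foreach (var i in list)
        {
            // Copy acc so the lambda reads a stable value for this iteration.
            var currentAcc = acc;
            // Assign rather than +=: the task already returns the new running
            // total, matching the recursive versions above.
            acc = await Task.Factory.StartNew(() => i + currentAcc);
        }
        return acc;
    }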
}

The major problem is that I am not sure how to translate this code to C# 4.0. Any ideas?
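
One possible direction, offered as a sketch and not as a verified answer: instead of nesting tasks with Unwrap, pass a single TaskCompletionSource<int> through the whole chain, so that each continuation only schedules the next step and then returns. The ProcessNext helper is a name made up for this illustration, and the sketch assumes that continuations run on the default scheduler (no TaskContinuationOptions.ExecuteSynchronously), which is what should keep the stack shallow.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        var list = Enumerable.Range(0, 10 * 1000).ToList();

        var task = ProcessList(list);

        Console.WriteLine(task.Result);
    }

    public static Task<int> ProcessList(List<int> list)
    {
        // One completion source for the whole run, instead of one
        // unwrapped task per item.
        var tcs = new TaskCompletionSource<int>();
        ProcessNext(list, 0, 0, tcs);
        return tcs.Task;
    }

    // Hypothetical helper (not part of the TPL): starts the work for a single
    // item and registers a continuation that starts the next item.
    private static void ProcessNext(List<int> list, int pos, int acc,
        TaskCompletionSource<int> tcs)
    {
        if (pos >= list.Count)
        {
            tcs.TrySetResult(acc);
            return;
        }

        Task.Factory.StartNew(() => list[pos] + acc)
            .ContinueWith(task =>
            {
                if (task.IsFaulted)
                {
                    tcs.TrySetException(task.Exception.InnerExceptions);
                    return;
                }
                if (task.IsCanceled)
                {
                    tcs.TrySetCanceled();
                    return;
                }

                // This call only queues the next task and returns, so the
                // stack of this continuation unwinds before the next step runs.
                ProcessNext(list, pos + 1, task.Result, tcs);
            });
    }
}
```

The items are still processed strictly in order, because the next StartNew call is only made from the continuation of the previous one. The price is that error and cancellation handling have to be forwarded to the completion source by hand, which async/await does for you in C# 5.0.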