Mixing Sync & Async calls

Take a look at the following code:

int count = 0;
var sp = Stopwatch.StartNew();
var tasks = new List<Task>();   
for (int i = 0; i < 500; i++)
{
    var t = Task.Run(() =>
    {
        var webRequest = WebRequest.Create(new Uri("http://google.com"));
        webRequest.GetResponse().Close();
        Interlocked.Increment(ref count);
    });
    tasks.Add(t);
}
var whenAll = Task.WhenAll(tasks.ToArray());
while (whenAll.IsCompleted == false && whenAll.IsFaulted == false)
{
    Thread.Sleep(1000);
    Console.WriteLine("{0} - {1}, {2}", sp.Elapsed, count, tasks.Count(x => x.IsCompleted == false));
}
Console.WriteLine(sp.Elapsed);

As you can see, it is a pretty silly example of making 500 queries to Google. There is some stuff here about reporting & managing, but the key point is that we start 500 tasks to do some I/O. How long do you think this is going to run?

On my machine, this code completes in roughly 22 seconds. Now, WebRequest is old school, and we want to use HttpClient, because it has a much better interface, so we wrote:

int count = 0;
var sp = Stopwatch.StartNew();
var tasks = new List<Task>();
for (int i = 0; i < 500; i++)
{
    var t = Task.Run(() =>
    {
        var client = new HttpClient();
        client.SendAsync(new HttpRequestMessage(HttpMethod.Get, new Uri("http://google.com"))).Wait();
 
        Interlocked.Increment(ref count);
    });
    tasks.Add(t);
}
var whenAll = Task.WhenAll(tasks.ToArray());
while (whenAll.IsCompleted == false && whenAll.IsFaulted == false)
{
    Thread.Sleep(1000);
    Console.WriteLine("{0} - {1}, {2}", sp.Elapsed, count, tasks.Count(x => x.IsCompleted == false));
}
Console.WriteLine(sp.Elapsed); 

I’ve been running this code on my machine for the past 7 minutes, and it hasn’t yet issued a single request.

It took a while to figure out, but the problem is in how this is structured. We are creating a lot of tasks, more than the available threads in the thread pool. Each task makes an async call and then blocks on it, which blocks a thread pool thread. To process the result of that call, we need another thread pool thread. However, we have scheduled more tasks than we have threads, so there is no thread available to handle the reply. All the threads are stuck waiting for replies that no thread is free to process.

The thread pool notices that, and will decide to allocate more threads, but they are also taken up by the already existing tasks that will immediately block.
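The starvation described above can be reproduced without touching the network. What follows is a minimal sketch, not the code from this post: `StarvationDemo`, `RunBlocking` and `RunAsync` are names made up for illustration, and `Task.Delay` stands in for the HTTP call on the assumption that its completion also needs a thread pool thread. The exact timings depend on the runtime version and the machine, so no expected output is shown.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class StarvationDemo
{
    // Each work item blocks a thread pool thread while it waits for an
    // async operation, just like calling .Wait() on SendAsync.
    public static TimeSpan RunBlocking(int taskCount)
    {
        var sw = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, taskCount)
            .Select(_ => Task.Run(() => Task.Delay(100).Wait()))
            .ToArray();
        Task.WaitAll(tasks);
        return sw.Elapsed;
    }

    // Each work item awaits instead, so the thread goes back to the pool
    // while the delay is pending.
    public static TimeSpan RunAsync(int taskCount)
    {
        var sw = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, taskCount)
            .Select(_ => Task.Run(async () => await Task.Delay(100)))
            .ToArray();
        Task.WaitAll(tasks);
        return sw.Elapsed;
    }

    public static void Main()
    {
        // Queue a few more blocking items than the pool hands out on demand,
        // so the extra ones have to wait for the pool to inject new threads.
        ThreadPool.GetMinThreads(out int minWorkers, out _);
        int taskCount = minWorkers + 8;

        Console.WriteLine("min worker threads: {0}, tasks: {1}", minWorkers, taskCount);
        Console.WriteLine("blocking: {0}", RunBlocking(taskCount));
        Console.WriteLine("async:    {0}", RunAsync(taskCount));
    }
}
```

On runtimes where the pool injects threads slowly, the blocking run should take noticeably longer than the 100 ms a single delay needs, while the async run should stay close to it. Newer runtimes detect this kind of blocking and add threads faster, so the gap may be smaller there.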

Now, surprisingly, the thread pool will eventually allocate enough threads (although it will probably take several hours to do so) to start handling the requests, and the issue resolves itself. Except that this ends up basically crippling the application while it is happening.

Obviously, the solution is to not wait on async calls inside a task like that. Indeed, we can quite easily use the following code:

var t = Task.Run(async () =>
{
    var client = new HttpClient();
    await client.SendAsync(new HttpRequestMessage(HttpMethod.Get, new Uri("http://google.com")));

    Interlocked.Increment(ref count);
});

And this shows performance comparable to the WebRequest method.

However, that isn’t very helpful when we need to expose a synchronous API. In our case, in RavenDB, we have the IDocumentSession, which is a simple synchronous API. We want to use a common codebase for sync and async operations, but we need to expose the same interface even if we change things underneath. To make things worse, we have no control over how the HTTP client schedules its async operations, and it has no sync operations.

That means that we are left with the choice of writing everything twice, once for synchronous operations and once for async ops, or just living with this issue.

A workaround we managed to find is to run the root tasks (which we do control) on our own task scheduler, so they won’t compete with the I/O operations. But that is hardly something that I am thrilled about.
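To give an idea of what such a scheduler might look like, here is a rough sketch. It is not the RavenDB implementation: `DedicatedThreadTaskScheduler`, `SchedulerDemo` and `FakeIoAsync` are hypothetical names, and `FakeIoAsync` (a `Task.Delay`) stands in for the HTTP call. It also assumes the async code being blocked on uses `ConfigureAwait(false)`, so that its continuations run on the thread pool rather than being queued back to the dedicated threads that are busy waiting.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical scheduler: executes tasks on its own threads, so blocked
// root tasks never occupy thread pool threads.
public sealed class DedicatedThreadTaskScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection<Task> _queue = new BlockingCollection<Task>();

    public DedicatedThreadTaskScheduler(int threadCount)
    {
        for (int i = 0; i < threadCount; i++)
        {
            var thread = new Thread(() =>
            {
                // Blocks until a task is queued; ends once Dispose() was
                // called and the queue has been drained.
                foreach (var task in _queue.GetConsumingEnumerable())
                    TryExecuteTask(task);
            });
            thread.IsBackground = true;
            thread.Start();
        }
    }

    protected override void QueueTask(Task task)
    {
        _queue.Add(task);
    }

    // Never run tasks inline on the caller's thread; only the dedicated
    // threads execute them.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return false;
    }

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return _queue.ToArray();
    }

    public void Dispose()
    {
        _queue.CompleteAdding();
    }
}

public static class SchedulerDemo
{
    // Stand-in for an async-only I/O call such as HttpClient.SendAsync.
    private static async Task FakeIoAsync()
    {
        await Task.Delay(50).ConfigureAwait(false);
    }

    public static int Run(int taskCount, int threadCount)
    {
        int count = 0;
        using (var scheduler = new DedicatedThreadTaskScheduler(threadCount))
        {
            var tasks = new List<Task>();
            for (int i = 0; i < taskCount; i++)
            {
                tasks.Add(Task.Factory.StartNew(() =>
                {
                    // Blocks a dedicated thread, not a thread pool thread.
                    FakeIoAsync().Wait();
                    Interlocked.Increment(ref count);
                }, CancellationToken.None, TaskCreationOptions.None, scheduler));
            }
            Task.WaitAll(tasks.ToArray());
        }
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(Run(20, 4)); // 20 root tasks on 4 dedicated threads
    }
}
```

The trade-off is that the number of dedicated threads caps how many synchronous calls can be in flight at once, and the sketch deadlocks if the awaited code resumes on the current scheduler instead of the thread pool.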