Thread pool starvation? Just add another thread


One of the nastier edge cases with TaskCompletionSource is that a continuation attached to its task can run synchronously, on the very thread that completes it. You can avoid that to a certain extent by using TaskCreationOptions.RunContinuationsAsynchronously, and that works, but under load, it can still be problematic.
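To make the difference concrete, here is a minimal sketch that checks where an awaiting method resumes. The `ContinuationDemo` class and its method names are made up for illustration, and it assumes there is no custom SynchronizationContext on the completing thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class, not part of any library.
public static class ContinuationDemo
{
    // Returns true when the code after the await resumed on the thread
    // that completed the TaskCompletionSource, instead of on the thread pool.
    public static bool ContinuationRanInline(TaskCreationOptions options)
    {
        var tcs = new TaskCompletionSource<int>(options);

        // Starts awaiting right away, so the continuation is registered
        // before the task is completed.
        Task<bool> waiter = ResumedOffThreadPoolAsync(tcs.Task);

        // Complete the task from a plain (non thread pool) thread.
        var completer = new Thread(() => tcs.SetResult(42));
        completer.Start();
        completer.Join();

        return waiter.GetAwaiter().GetResult();
    }

    private static async Task<bool> ResumedOffThreadPoolAsync(Task<int> task)
    {
        await task.ConfigureAwait(false);
        // The completing thread above is not a thread pool thread, so this
        // tells us whether we were run inline by SetResult.
        return !Thread.CurrentThread.IsThreadPoolThread;
    }
}
```

Calling `ContinuationRanInline(TaskCreationOptions.None)` should report that the continuation ran inline, while passing `TaskCreationOptions.RunContinuationsAsynchronously` should move it to a thread pool thread, which is exactly the thread that may not be available under load.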

In particular, consider the case where we have a task that does the following:

  1. Do computation
  2. Enqueue a task to be completed by a different thread (getting a Task back)
  3. Continue computation until done
  4. Wait for previous operation to complete
  5. Go to 1
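The steps above can be sketched roughly as the loop below. The `compute` and `enqueueToOtherThread` delegates are hypothetical stand-ins for the real work and for whatever hands the operation to another thread; they are assumptions, not the actual code.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical sketch of the workload shape described in the list.
public static class WorkLoop
{
    public static async Task RunAsync(
        int iterations,
        Action<int> compute,                      // assumed: CPU-bound work
        Func<int, Task> enqueueToOtherThread)     // assumed: hands work to another thread
    {
        for (int i = 0; i < iterations; i++)      // 5. go to 1
        {
            compute(i);                               // 1. do computation
            Task pending = enqueueToOtherThread(i);   // 2. enqueue, get a Task back
            compute(i);                               // 3. continue computation
            await pending;                            // 4. wait for the operation
        }
    }
}
```

If `pending` is not yet marked complete at step 4, the loop can only resume once something runs its continuation, and that is where the thread pool comes into play.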

Even if we avoid running the continuation synchronously, that still results in an issue. In particular, an asynchronous continuation isn't magic. It still needs a thread to run on, and that will typically be a thread pool thread.

But if all the thread pool threads are busy doing the work above, the continuation may have to wait until one of them is done with the computation it is running and goes back to pull more work from the thread pool queue, and then until the queue gets to the notification that we are ready to continue. In other words, we may suffer from jitter, where the running task is waiting for an async operation that has already completed, but it doesn't know that, and so cannot move on, because there wasn't any available thread to run the notification.

We resolve it by adding a dedicated thread, which simply waits for those notifications and runs only them. Because they are typically very short, and there aren't that many of them, we can process them very quickly. In order to prevent stalls on that thread, we use what I think is a pretty nifty trick.

We register the event twice, once on our dedicated thread, and once on the normal thread pool. If somehow the dedicated thread is too busy, the thread pool (and its automatic growth) will handle it, but most of the time, the dedicated thread can catch it and run it first.
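As a rough illustration of the double registration, here is a minimal sketch. It is not the actual implementation; the `DedicatedCompletionRunner` name, the use of a `ConcurrentQueue` and the `ManualResetEventSlim` signal are all assumptions made for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: runs short notification callbacks on a dedicated
// thread, with the thread pool as a fallback.
public sealed class DedicatedCompletionRunner
{
    private readonly ConcurrentQueue<Action> _pending = new ConcurrentQueue<Action>();
    private readonly ManualResetEventSlim _signal = new ManualResetEventSlim(false);

    public DedicatedCompletionRunner()
    {
        var thread = new Thread(Run)
        {
            IsBackground = true, // do not keep the process alive
            Name = "Dedicated completion runner"
        };
        thread.Start();
    }

    public void Enqueue(Action notification)
    {
        _pending.Enqueue(notification);

        // Registration 1: wake the dedicated thread.
        _signal.Set();

        // Registration 2: also ask the thread pool to drain the queue.
        // If the dedicated thread got there first, this finds nothing to do.
        ThreadPool.QueueUserWorkItem(_ => Drain());
    }

    private void Run()
    {
        while (true)
        {
            _signal.Wait();
            _signal.Reset();
            Drain();
        }
    }

    // Each notification is dequeued by exactly one caller, so it runs once
    // no matter which thread wins. Callbacks are assumed not to throw.
    private void Drain()
    {
        while (_pending.TryDequeue(out Action notification))
        {
            notification();
        }
    }
}
```

A caller would hand the runner a short callback, for example `runner.Enqueue(() => tcs.TrySetResult(true))`. Whichever thread dequeues the callback first runs it, so the thread pool work item usually turns into a cheap no-op.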

Adding what is basically a no-op task to the thread pool isn't going to generate any pressure on it if there is no load. If there is load, the extra work items will encourage the thread pool to grow faster, which is what we want.

If you care to look at the code, it is here.