The edge case is in the timing
In a previous post, I showed how to get a 10x performance boost by batching remote calls. The code included a method along the following lines.
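In outline, the method did something like this (a minimal sketch; the BlockingCollection-based queue, the SendBatch helper, and the batch size of 256 are stand-ins reconstructed from the comments, not the original code):

    using System;
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Threading;

    public class RecordMailArgs { /* payload for a single mail record */ }

    public class MailBatcher
    {
        private readonly BlockingCollection<RecordMailArgs> _queue =
            new BlockingCollection<RecordMailArgs>();

        public void BackgroundSendWorker(CancellationToken token)
        {
            while (token.IsCancellationRequested == false)
            {
                var documents = new List<RecordMailArgs>();
                while (documents.Count < 256)
                {
                    // Take() blocks until an item arrives, so the batch
                    // is not sent until all 256 slots are filled.
                    documents.Add(_queue.Take(token));
                }
                SendBatch(documents);
            }
        }

        private void SendBatch(List<RecordMailArgs> documents)
        {
            // A single remote call carrying the whole batch.
        }
    }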
This code has a subtle bug in it. Take a look and see if you can find it.
Yes, I’ll wait, honestly.
Go read the code and think about interesting ways to break it.
Okay, found it? Awesome, now let me explain anyway.
This code suffers from slow arrival syndrome. The loop waits until it has a full batch before sending anything, so if a new request comes in only every 100 ms, the first request will languish in the queue for over 25 seconds (256 × 100 ms ≈ 25.6 s)!
Luckily, the fix is simple:
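The sketch below keeps the same stand-in names and adds a Stopwatch (System.Diagnostics) with a 250 ms budget; the 150 ms per-item wait is an assumed value, picked so the worst case lines up with the "up to 400 ms" figure raised in the comments. Note that the comparison here is already flipped to > 250; as the comments point out, the version first published read < 250.

    public void BackgroundSendWorker(CancellationToken token)
    {
        while (token.IsCancellationRequested == false)
        {
            var documents = new List<RecordMailArgs>();
            var sp = Stopwatch.StartNew();
            while (documents.Count < 256)
            {
                // Give up on filling the batch once the 250 ms budget
                // is spent, and send whatever we have so far.
                if (sp.ElapsedMilliseconds > 250)
                    break;

                RecordMailArgs doc;
                // Wait a bounded amount of time for the next item
                // instead of blocking indefinitely.
                if (_queue.TryTake(out doc, 150, token))
                    documents.Add(doc);
            }
            if (documents.Count > 0)
                SendBatch(documents);
        }
    }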
We just need to make sure that we aren't waiting too long. In this case, we will wait less than half a second at most before sending the messages over the wire.
Comments
Damn, and I thought the bug was that document is never declared and val is never assigned. :) But yes, waiting for 25s before doing any processing would probably be bad.

I think you just created a new bug :) Shouldn't sp.ElapsedMilliseconds < 250 be sp.ElapsedMilliseconds > 250, since that is your exit condition?
Another bug is that if your cancellation token is triggered, then the documents that you removed from the queue are lost.
Hangy, mk & Shaun, yes, you are correct.
Shaun, if the cancellation token is triggered, the entire operation is cancelled, not just a particular batch.
I know this is just a simple example, but just in case someone copies and uses this code: allocating new List<RecordMailArgs>(256) once before both loops and clearing the list instead of creating a new one for every batch could give you better performance. Not just because of the cost of new(): Clear() does not reset the capacity, so it also avoids repeated automatic growth (which you get when you don't use the capacity constructor). See the sketch below.
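In code, that suggestion amounts to something like this (my sketch of the commenter's idea, reusing the stand-in names from the post's example):

    // Allocate once, with the full batch capacity up front...
    var documents = new List<RecordMailArgs>(256);
    while (token.IsCancellationRequested == false)
    {
        // ...and reuse it: Clear() resets Count but keeps the 256-slot
        // backing array, so there is no per-batch allocation or regrowth.
        documents.Clear();
        RecordMailArgs doc;
        while (documents.Count < 256 && _queue.TryTake(out doc, 150, token))
            documents.Add(doc);
        if (documents.Count > 0)
            SendBatch(documents);
    }

Note that reusing the list this way is only safe if SendBatch does not keep a reference to it after returning.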
With this implementation and the correct conditions, it's still possible to wait up to 400 ms. Why not compute the remaining time by subtracting sp.ElapsedMilliseconds from the initial timeout value? I managed to get a stable wait time close to my initial timeout value using different conditions. Something like the sketch below.
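A sketch of that idea, with the same stand-in names as above, passing the remaining budget to each TryTake (the 250 ms budget is an assumed value):

    const int timeoutMs = 250; // the initial time budget
    var sp = Stopwatch.StartNew();
    var documents = new List<RecordMailArgs>();
    while (documents.Count < 256)
    {
        // Only wait for as long as is left of the overall budget,
        // so the total wait stays close to timeoutMs.
        var remaining = timeoutMs - (int)sp.ElapsedMilliseconds;
        if (remaining <= 0)
            break;
        RecordMailArgs doc;
        if (_queue.TryTake(out doc, remaining, token))
            documents.Add(doc);
    }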
Another approach to so-called smart batching is to not wait on an empty queue: if there's nothing left in the queue, proceed with what you have batched so far and then return to polling the queue. This approach is used by EventStore in its storage service. The same can be seen in the LMAX Disruptor pattern and in Aeron (the messaging protocol). The last two actually obtain the last occupied position (each library calls it something different) and process all the items up to that point. With this approach you ensure that small batches are finished in the amortized time of handling a single batch, whereas in your case you always wait out the additional timeout.
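A sketch of that no-wait variant, again with the stand-in names from the post's example: block only for the first item, then drain whatever is already queued and send immediately.

    public void BackgroundSendWorker(CancellationToken token)
    {
        while (token.IsCancellationRequested == false)
        {
            var documents = new List<RecordMailArgs>();
            // Block only while the queue is completely empty.
            documents.Add(_queue.Take(token));
            // Then drain whatever has already accumulated, without
            // waiting, and send as soon as the queue runs dry.
            RecordMailArgs doc;
            while (documents.Count < 256 && _queue.TryTake(out doc))
                documents.Add(doc);
            SendBatch(documents);
        }
    }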
@Oren so basically the good old difference between throttle and debounce?
http://benalman.com/projects/jquery-throttle-debounce-plugin/
(see the diagrams)