The HIGH cost of ConcurrentBag in .NET 4.0
I got some strange results when using the concurrent collections, so I decided to try to track them down, and wrote the following code:
var count = ?;
var list = new List<object>(count);
var sp = Stopwatch.StartNew();
for (int i = 0; i < count; i++)
{
    list.Add(new ConcurrentBag<int>());
}
sp.Stop();
Console.WriteLine("{0} {2} items in {1:#,#;;0}ms = {3:#,#;;0}ms per item",
    sp.Elapsed, sp.ElapsedMilliseconds, count, sp.ElapsedMilliseconds / count);
And then I started to play with the numbers, and the results are not good.
- 10 items in 2ms = 0ms per item
You have to understand, this is an incredibly high number. Just to compare, the same loop creating 100,000 List<int> instances takes 8 ms.
Let us see what happens as the numbers grow.
- 100 items in 5ms = 0ms per item
- 1,000 items in 37ms = 0ms per item
- 10,000 items in 2,319ms = 0ms per item
Note the numbers, will you?
1,000 items in 37 ms, but 10,000 items? 2.3 seconds!
- 20,000 items in 21,331ms = 1ms per item
And doubling the amount took ten times as long?
- 25,000 items in 32,588ms = 1ms per item
And at this point, I stopped trying, because I didn’t have the patience.
Note that the other concurrent collections, ConcurrentStack, ConcurrentQueue and ConcurrentDictionary, do not suffer from the same problem.
I contacted Microsoft about this, and it is already resolved in .NET 4.5. The underlying issue was that ThreadLocal, which ConcurrentBag uses internally, wasn't designed to have a lot of instances. That has been fixed, and ConcurrentBag creation is now fairly fast.
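The connection to ThreadLocal is easy to see once you know that each ConcurrentBag allocates its own thread-local storage for the per-thread lists it maintains. A minimal sketch of the same shape of benchmark, run directly against ThreadLocal itself (timings will vary by machine; on .NET 4.0 this exhibits the same blowup, while on 4.5 and later it should be fast):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

class ThreadLocalCostSketch
{
    static void Main()
    {
        const int count = 10000;
        var list = new List<object>(count);
        var sp = Stopwatch.StartNew();
        for (int i = 0; i < count; i++)
        {
            // Each ConcurrentBag<T> news up a ThreadLocal internally,
            // so creating many bags means creating many ThreadLocals.
            list.Add(new ThreadLocal<List<int>>(() => new List<int>()));
        }
        sp.Stop();
        Console.WriteLine("{0:#,#;;0} ThreadLocal instances in {1:#,#;;0}ms",
            count, sp.ElapsedMilliseconds);
    }
}
```

This isolates the cost to ThreadLocal allocation alone, without any ConcurrentBag involvement.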
Comments
This is why tests should fail if performance is too poor. It's not hard to write such tests for this kind of code, which has no external dependencies.
Bug in your 'per item' timings there...
I'm curious to see what they changed in ThreadLocal. The 4.0 implementation was extremely unusual and interesting.
You just stole 2min of my time, why?
Ayende, you can do better than that :)
The memory perspective is very interesting too; in some concurrent scenarios, using a ReaderWriterLockSlim is very costly in memory terms.
http://social.msdn.microsoft.com/Forums/da-DK/netfxbcl/thread/fe2ce8aa-dd78-42e6-b5f4-26df96a16bc2
The same applies to the basic concurrent classes like ConcurrentDictionary if you create a ton of them. (For example, we used a ConcurrentDictionary for caching in each entity, and after a bit of profiling we found the memory problem was in the ConcurrentDictionary.)
Cheers
How is this testing ConcurrentBag? All I can see is you testing the constructor for ConcurrentBag by adding it to a list, not actually adding items to the bag or dealing with the bag at all. Is that the only issue? When you have 10,000 bags created, do they perform reasonably?
You do realize you're testing how many ConcurrentBag objects you can synchronously instantiate, not how many items you can put into a ConcurrentBag, right? What's an example use case for creating n ConcurrentBag objects in a loop? I can imagine adding n items to a ConcurrentBag, but not creating n ConcurrentBags.
Chris, For example, we have state associated with queries in RavenDB, we initially put that state (for each unique query) inside a ConcurrentBag. But it turns out that just the act of creating the bag can be the most expensive thing there.
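One way to sidestep the construction cost, if you only need thread-safe add and drain semantics rather than ConcurrentBag's work-stealing behavior, is to hold the per-key state in a ConcurrentQueue instead, since its constructor does not allocate thread-local storage. A sketch of that shape (the names here, QueryStateCache and QueryState, are illustrative, not RavenDB's actual code):

```csharp
using System.Collections.Concurrent;

// Illustrative placeholder for per-query state.
class QueryState { }

// Hypothetical cache keyed by query, avoiding ConcurrentBag construction.
class QueryStateCache
{
    private readonly ConcurrentDictionary<string, ConcurrentQueue<QueryState>> _states =
        new ConcurrentDictionary<string, ConcurrentQueue<QueryState>>();

    public void Add(string query, QueryState state)
    {
        // ConcurrentQueue's constructor is cheap on .NET 4.0, so creating
        // one per unique query avoids the ThreadLocal cost entirely.
        _states.GetOrAdd(query, _ => new ConcurrentQueue<QueryState>())
               .Enqueue(state);
    }

    public bool TryTake(string query, out QueryState state)
    {
        ConcurrentQueue<QueryState> queue;
        if (_states.TryGetValue(query, out queue))
            return queue.TryDequeue(out state);
        state = null;
        return false;
    }
}
```

The trade-off is ordering and contention behavior: ConcurrentBag is optimized for the same thread adding and taking, while ConcurrentQueue is strictly FIFO, which is usually fine for this kind of state tracking.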
ConcurrentBag is another word for memory leak. See http://stackoverflow.com/questions/10850443/concurrentbag-does-cause-memory-leak It is very difficult to use it right. Besides this, it consumes way more memory for simple data types like an integer. I have in general a problem with a type that keeps a disposable object (ThreadLocal) inside it but does not implement the IDisposable interface. The only way to get rid of the thread locals is to terminate the thread which created the ConcurrentBag. This is still not fixed in .NET 4.5.
I thought @kkozmic found out about this a while back and said to stay away from the concurrent classes no?