Public Service Announcement: ConcurrentDictionary.Count is locking

Aug 11 2017

Public Service AnnouncementConcurrentDictionary.Count is locking

time to read 2 min | 233 words

During a performance problem investigation, we discovered that the following innocent looking code was killing our performance.

This is part of a code that allow users to subscribe to changes in the database using a WebSocket. This is pretty rare, so we check that there aren’t any and skip all the work.

We had a bunch of code that run on many threads and ended up calling this method. Since there are not subscribers, this should be very cheap, but it wasn’t. The problem was that the call to Count was locking, and that created a convoy that killed our performance.

We did some trawling through our code base and through the framework code and came up with the following conclusions.

ConcurrentQueue:

Clear locks
CopyTo, GetEnumerator, ToArray, Count creates a snapshot (consume more memory)
TryPeek, IsEmpty are cheap

Here is a piece of problematic code, we are trying to keep the last ~25 items that we looked at:

The problem is that this kind of code ensures that there will be a lot of snapshots and increases memory utilization.

ConcurrentDictionary:

Count, Clear, IsEmpty, ToArray - lock the entire thing
TryAdd, TryUpdate, TryRemove – lock the bucket for this entry
GetEnumerator does not lock
Keys, Values both lock the table and forces a temporary collection

If you need to iterate over the keys of a concurrent dictionary, there are two options:

Iterating over the entire dictionary is much better then iterating over just the keys.

Tweet Share Share 10 comments

Tags:

More posts in "Public Service Announcement" series:

(11 Aug 2017) ConcurrentDictionary.Count is locking
(28 Dec 2010) Git master repositories for the Rhino Tools projects

Comments

11 Aug 2017
09:11 AM

Damien

My first instinct would be - why is this not using Any? For much the same reason I'd tell off someone using COUNT(*) in SQL if all they care about is whether any row exists.

11 Aug 2017
09:15 AM

Scooletz

The enumerator is super friendly, using only volatile reads. No locking performance overhead. I've used the iterator over Keys/Count several times already. Not to mention, writing a custom structure if lookup/cache is needed.

11 Aug 2017
09:57 AM

TomCollins

@Damien: I think the comparison with sql's COUNT(*) is wrong. The database has to count the rows but the dictionary just looks it up.

11 Aug 2017
10:27 AM

Ian Davies

@Oren, whats was the actual solutions here?

BTW found this blog recently, https://blogs.msdn.microsoft.com/dotnet/2017/06/07/performance-improvements-in-net-core/ Would be interesting to know if you're playing with .Net Core 2 yet? Looks like there's a load of performance improvements around collections

11 Aug 2017
11:56 AM

Damien

@TomCollins - except as already observed here, with the concurrent version it can't just "look it up". It takes locks, it has to iterate over its internal tables, etc. Any is semantically what's being asked for and should be the default unless or until it's proven to be a performance issue itself.

11 Aug 2017
14:38 PM

Federico Lois

@Ian we are still not using CoreCLR 2.0, the work is under way and if everything goes well we should be able to have it running there within the coming weeks. There are improvements but the general locking mechanisms for concurrent data structures haven't changed. @Damien the problem with LINQ operations (in general) on hot spots (where this was used) is that the predicate requires a call. LINQ is essentially banned from the core of RavenDB (and afaik it is from other high throughput software like Roslyn) even if semantically is what you need.

11 Aug 2017
14:38 PM

peter

core keeps the same behavior, it seems. https://github.com/dotnet/corefx/issues/3357

11 Aug 2017
17:18 PM

peter borisute

not sure this is the same issue, I didn't look into c# source if it escalates locks, but here is a java discussion: http://stackoverflow.com/a/22996395/1268003

15 Aug 2017
06:18 AM

Ryan Heath

And how is it fixed? Implemented your own 'Any' that uses the non-blocking GetEnumerator?

// Ryan

20 Aug 2017
14:39 PM

Oren Eini

Ryan, We changed things so we are very careful about access Count or Keys, Values, etc. In some cases, we started tracking this value externally.

Comment preview

Comments have been closed on this topic.

Oren Eini

Oren Eini

CEO of RavenDB