Ayende @ Rahien

Refunds available at head office

Dictionary Puzzler

A while ago in the NHibernate mailing list we got a report that NHibernate is making use of a dictionary with an enum as the key, and that is causing a big performance issue.

The response was almost unanimous, I think, “what?! how can that be?!!?”. Several people went in to and tried to figure out what is going on there. The answer is totally non oblivious, Dictionary<K,V> force boxing for any value type that is used as the key.

That sound completely contradictory to what you would expect, after all, one of the major points in generics was the elimination of boxing, so what happened?

Well, the issue is that Dictionary<K,V> has to compare the keys, and for that, it must make some assumptions about the actual key. It is abstracted into EqualityComparer, and that is where the actual problem starts. EqualityComparer has some special cases for the common types (anything that is IEquatable<T>, which most of the usual suspects implements), to speed this up.

The problem is that the fall back is to an ObjectComparer, and that, of course, will box any value type.

And enum does not implements IEquatable<T>…

Omer has a good coverage on the subject, with really impressive results. Take a look at his results.

image

I am not going to steal his thunder, but I suggest going over and reading the code, it is very elegant.

Comments

Dmitry
02/21/2009 05:00 AM by
Dmitry

Is this really an issue with a generic dictionary? How many values can an enum have in real life, 10-15?

Ayende Rahien
02/21/2009 06:03 AM by
Ayende Rahien

The problem isn't with the number of elements, the problem is with the number of comparisions!

Gil Fink
02/21/2009 10:47 AM by
Gil Fink

Thanks for sharing!

Simone
02/21/2009 06:55 PM by
Simone

It can be accomplished in 10 lines with C# 3.0, I posted a comment on the article, I think it's been updated.

Omer Mor
02/21/2009 08:41 PM by
Omer Mor

Simone is right, and I'll update my article soon.

The general solution is still valid: overcoming generics' shortcoming using dynamic code generation.

Simone's idea is to use Expression Trees to generate the code instead of Lightweight Code Generation (DynamicMethod), which makes the code clearer (no need for IL).

LCG is still useful when you can't use .Net 3.5 & C#3.

Omer Mor

firefly
02/22/2009 01:44 AM by
firefly

I am more interested on how to identify the bottleneck in the first place.

Josh
02/22/2009 03:51 AM by
Josh

Well 1) you could forego the default comparison mechanism by supplying your own IComparer <t in the dictionary's constructor

2) It always bothered me that the CLR can't provide default implementations for IEquatable <t and GetHashCode() for value types based on its binary structure.

Comments have been closed on this topic.