Ayende @ Rahien

It's a girl

Performance, threading and double checked locks

A very common pattern for lazy initialization of expensive things is this:

GetDescriptionsHelper.Delegate getDescription;
if (GetDescriptionsHelper.Cache.TryGetValue(entityType, out getDescription) == false)
{
	MethodInfo getDescriptionInternalGeneric = getDescriptionInternalMethodInfo.MakeGenericMethod(entityType);
	getDescription = (GetDescriptionsHelper.Delegate)Delegate.CreateDelegate(
		typeof(GetDescriptionsHelper.Delegate),
		getDescriptionInternalGeneric);

	GetDescriptionsHelper.Cache.Add(entityType, getDescription);
}

This tends to break down when we are talking about code that can run in multiply threads. So we start writing this:

lock(GetDescriptionsHelper.Cache)
{
	GetDescriptionsHelper.Delegate getDescription;
	if (GetDescriptionsHelper.Cache.TryGetValue(entityType, out getDescription) == false)
	{
		MethodInfo getDescriptionInternalGeneric = getDescriptionInternalMethodInfo.MakeGenericMethod(entityType);
		getDescription = (GetDescriptionsHelper.Delegate)Delegate.CreateDelegate(
			typeof(GetDescriptionsHelper.Delegate),
			getDescriptionInternalGeneric);
	
		GetDescriptionsHelper.Cache.Add(entityType, getDescription);
	}
}

Except, this is really bad for performance, because we always lock when we try to get the value out. So we start playing with it and get this:

GetDescriptionsHelper.Delegate getDescription;
if (GetDescriptionsHelper.Cache.TryGetValue(entityType, out getDescription) == false)
{
	lock(GetDescriptionsHelper.Cache)
	{	
		MethodInfo getDescriptionInternalGeneric = getDescriptionInternalMethodInfo.MakeGenericMethod(entityType);
		getDescription = (GetDescriptionsHelper.Delegate)Delegate.CreateDelegate(
			typeof(GetDescriptionsHelper.Delegate),
			getDescriptionInternalGeneric);
	
		GetDescriptionsHelper.Cache.Add(entityType, getDescription);
	}
}

This code has a serious error in it, in the right conditions, two threads will evaluate the if at the same time, and then try to enter the lock at the same time. The end result is an exception as the second of them will try to add the entity type again.

So we come out with the doubly checked lock, like this:

GetDescriptionsHelper.Delegate getDescription;
if (GetDescriptionsHelper.Cache.TryGetValue(entityType, out getDescription) == false)
{
	lock(GetDescriptionsHelper.Cache)
	{	
		if (GetDescriptionsHelper.Cache.TryGetValue(entityType, out getDescription) )
			return;

		MethodInfo getDescriptionInternalGeneric = getDescriptionInternalMethodInfo.MakeGenericMethod(entityType);
		getDescription = (GetDescriptionsHelper.Delegate)Delegate.CreateDelegate(
			typeof(GetDescriptionsHelper.Delegate),
			getDescriptionInternalGeneric);
	
		GetDescriptionsHelper.Cache.Add(entityType, getDescription);
	}
}

Now we are handling this condition as well. But my preference of late has been to use this code in multi threaded environments.

Update: I should be wrong more often, I got some very good replies to this post. The code below happened to work by chance and luck alone, apparently. The solution above is the more correct one.

Actually, it is more complex than that. It is still possible to have readers attempt to read the Cache variable (which is of type Dictionary<Type, Delegate>) while it is being written to inside the lock. There is a potential for seriousĀ  bit of mayhem in that case.

The safe thing to do in this case would be to lock it always before access, but for things that are going to be used frequently (hot spots) this can be a problem. Reader/Writer lock is much worse in terms or performance than the usual lock statement, and ReaderWriterLockSlim is for .NET 3.5 only.

An interesting dilemma, and I apologize for misleading you earlier.

GetDescriptionsHelper.Delegate getDescription;
if (GetDescriptionsHelper.Cache.TryGetValue(entityType, out getDescription) == false)
{
	MethodInfo getDescriptionInternalGeneric = getDescriptionInternalMethodInfo.MakeGenericMethod(entityType);
	getDescription = (GetDescriptionsHelper.Delegate)Delegate.CreateDelegate(
		typeof(GetDescriptionsHelper.Delegate),
		getDescriptionInternalGeneric);

	GetDescriptionsHelper.Cache[entityType] = getDescription;
}

can you spot the main difference? We are not using locks, and we are using the indexer instead of the Add() method.

Empirical evidence suggest that using the indexer in a multi threaded environment is safe, in that it doesn't corrupt the dictionary and one of the values will remain there.

This means that assuming that the cost of creating the value isn't very high, it is just fine to have it created twice on those rare cases, the end result is after the initial flurry, we have the value cached, even if it was calculated more than once. In the long run, it doesn't matter.

Comments

Stefan Wenig
01/21/2008 11:58 AM by
Stefan Wenig

Is the cache class interlocked? Is the Cache member of the GetDescriptionsHelper object guaranteed to be thread safe? If not, what makes you assume that accessing the cache in different threads cannot corrupt its internal memory structures? To me, this code looks like an accident to happen.

Even in you third snippet you have this problem: Reading the cache will probably break if another thread modifies the internal cache structures (hashtable or whatever) during the read. To really avoid locking for the majority of cases, you'd need a cache that locks itself for write-operations and is guaranteed to be safe for reading even if the write locks are ignored by the reading thread. Combine this with .NETs memory model (and the fact that you cannot mark an entire object as volatile) and you're in for a real challenge. Worse, you'd have to formally prove the correctness of this code, because testing is virtually impossible: While the CLRs memory model makes certain guarantees and sets limits for what can actually happen, most processor architectures never go for those limits. So, on a x86/x64 computer, testing might succeed in all kinds of race conditions, while on certain Itanium systems, it won't (depending on the chipset, to make it really impossible to deal with this).

My recommendation: Lock everything. Double-checked locking is OK if you really need it and observe the memory model requirements (use volatile or explicit memory barriers). If you want to optimize anything, make sure the lock does not last for the actual creation, but the caching structure itself (hashtable) must be locked!

Eamon Nerbonne
01/21/2008 12:31 PM by
Eamon Nerbonne

"Empirical evidence suggest" doesn't fill me with trust as to this multithreading solution.

In almost all caching cases the TryGetValue will return a value anyhow. A lock inside that if statement is thus virtually free. So what's the advantage of your solution? Why would this solution continue working on a different architecture, potentially with a different .NET implementation? That's not entirely hypothetical, since mono runs on non-x86 platforms too, as does the .NET compact framework.

Further, even the double-checked locking might introduce invalid values, in the event that the cache is in an invalid state in the middle of a write and is read in another thread.

Locking isn't necessarily that bad for performance, in many cases.

Another interesting trick is that the .NET framework guarantees that a classes static constructor will have run before the class is otherwise used.

That means you can create a generic class - templated on your entityType, say - with a static constructor which does the "heavy lifting", and sets a static class variable to the result of the calculation.

In any case, before playing these dangerous locking tricks you want to benchmark and make sure you need to. Locking might not be a problem at all, for you.

Joshua McKinney
01/21/2008 12:37 PM by
Joshua McKinney

Reflector says nope to Dictionary's indexer being threadsafe. Specifically there is a race condition where:

dict[key1] = val1;

may be overwritten by a simultaneous

dict[key2] = val2;

(at least that's how my reading of the code has indicated)

Jon Skeet
01/21/2008 01:00 PM by
Jon Skeet

Has locking been shown to actually being a performance issue in this case? I pretty much always stick to "lock every time" when I can't use the normal static initializer trick.

It's much simpler to reason about, and I've never been near code where it's really been a bottleneck.

As Stefan says, the memory model is a pain to reason about. It also brings up the question of whether you want to work on non-.NET CLIs. The .NET 2.0 memory model is significantly "safer" than the ECMA model, for instance. I've no idea whether Mono has modelled itself on .NET 2.0 or ECMA...

Jon

Ayende Rahien
01/21/2008 01:08 PM by
Ayende Rahien

Stefan,

I did mention that I had empirical evidence only, didn't I?

The Cache in this case is a Dictionary<Type, Delegate>.

Francois Tanguay
01/21/2008 01:19 PM by
Francois Tanguay

That doesn't make it thread safe. The indexer calls InsertItem and there is plenty of room for threading disaster in there.

I would assume it is quite easy to simulate a threading problem by inserting a first item, simulating a lengthy operation in something like GetHashCode or Equals and provoking a second insert on another thread.

Jon Skeet
01/21/2008 01:31 PM by
Jon Skeet

Leaving the current implementation aside for the moment, the documentation specifically states:

"To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization."

So it may currently happen to be thread-safe (although I doubt it is) there's room for that to change in the future.

Jon

Max
01/21/2008 01:34 PM by
Max

I have found that a nice way to organize this logic is to create a SynchronizedKeyedCollection subclass that you can construct with a Converter<TKey, TValue> delegate that is used to create values in the dictionary when a non-existent key is accessed. This way all the locking logic can be hidden from consumers of the class.

If only .NET provided some thread safe hashtable by default :-( (Java has quite a nice one, with better lock granularity than simple "lock the whole table").

I have to agree with the posters above that locks are necessary in this case and your last code example is evil ;-)

Stefan Wenig
01/21/2008 01:45 PM by
Stefan Wenig

Ayende,

sure you did, but I wanted to make clear e that empirical evidence in this case might be very misleading, and may even differ depending on your hardware. Also, I think that many people using the double-checked-locking pattern are not aware that this works only for volatile reference fields, but never for unlocked hash-tables (even if they are locked for writing). I really don't want to know how much code is out there that gets this wrong, and those bugs are quite a pain to find out, let alone reproduce or even fix.

Joshua McKinney
01/21/2008 01:58 PM by
Joshua McKinney

The Dictionary class is categorically not threadsafe. I wanted to learn how to debug multithreaded apps and play with the VS 2008 .Net source availability at the same time.

My test code:

using System;

using System.Collections.Generic;

using System.Threading;

namespace TestDictionaryNotThreadSafe

{

class Program

{

    static Dictionary<string, string> items = new Dictionary<string, string>();

    static void Main(string[] args)

    {

        Thread t1 = new Thread(Start1);

        t1.Name = "t1";

        Thread t2 = new Thread(Start2);

        t2.Name = "t2";

        t1.Start();

        t2.Start();

        Console.WriteLine(items["item2"]);

    }


    private static void Start1()

    {

        items["item1"] = "1";

    }


    private static void Start2()

    {

        items["item2"] = "2";

    }

}

}

Method:

Break into Dictionary.InsertItem (called by indexer), thawing and freezing threads as necessary to exhibit result.

The result:

Joshua McKinney
01/21/2008 02:04 PM by
Joshua McKinney

Or to not have to worry about the image, I managed to break it such that items[0] = {["item2", 1]}

Daniel Grunwald
01/21/2008 03:49 PM by
Daniel Grunwald

Actually, even the double-checked version is wrong. Here is the comment on thread-safety from MSDN:

A Dictionary<TKey, TValue> can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with write accesses, the collection must be locked during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.

The only guarantee is that concurrent TryGetValue calls work.

There is no guarantee that TryGetValue behaves as expected when called concurrently to a Add call. There are not only the possibilities that TryGetValue finds the item or does not find it, it is completely undefined behavior, e.g. it could cause an Exception or return an incorrect item.

I'm not 100% sure that this is the case in the implementation, but it might be possible when the Add() calls causes the Dictionary to resize.

Stefan Wenig
01/21/2008 03:57 PM by
Stefan Wenig

I just noticed your update and wondered why you removed only the last part, but it seems I confused snippet numbers in my first comment. I mentioned a bug in snippet #3, but you had marked this as faulty anyway. The comment should have read, "Even in your FOURTH snippet you have this problem: Reading the cache will probably break if another thread modifies the internal cache structures (hashtable or whatever) during the read."

So I'm sorry, but you'll have to go with #2 - lock each time, for the whole time. #4 won't do, it could produce funny behavior if the TryGetValue outside the lock gets executed while the Add inside shifts hashtable buckets around.

Just noticed my mistake, sorry.

Stefan Wenig
01/21/2008 04:28 PM by
Stefan Wenig

See? If Daniel had aquired a lock on that blog entry, I wouldn't have had to write my last comment. So who says locks are expensive? ;-)

Alois Kraus
01/21/2008 10:50 PM by
Alois Kraus

A more thorough introduction to double checked locking, the CLR memory model is shown here:

http://geekswithblogs.net/akraus1/articles/90803.aspx

Your dictionary will sporadically throw KeyNotFoundExceptions when it is altered at random times by different threads.

Lock free immutable data structures (the functional programming way) are discussed e.g. here:

http://blogs.msdn.com/ericlippert/archive/2007/12/10/immutability-in-c-part-four-an-immutable-queue.aspx

Yours,

Alois Kraus

Comments have been closed on this topic.