Multi threaded design guidelines for libraries: Part I
The major difference between libraries and frameworks is that a framework runs your code and is generally in control of its own environment, while a library is something that you use in your own code, where you control the environment.
Examples for frameworks: ASP.Net, NServiceBus, WPF, etc.
Examples for libraries: NHibernate, RavenDB Client API, JSON.Net, SharpPDF, etc.
Why am I talking about the distinction between frameworks and libraries in a post about multi threaded design?
Simple: there are vastly different rules for multi threaded design with frameworks and libraries. In general, frameworks manage their own threads, and will let your code use one of their threads. Libraries, on the other hand, will use your own threads.
The simple rule for multi threaded design for libraries? Just don’t do it.
Multi threading is hard, and you are going to cause issues for people if you don’t know exactly what you are doing. Therefore, just write for a single threaded application and make sure to hold no shared state.
For example, JSON.Net pretty much does this. The sole place where it does do multi threading is where it is handling caching, and it must be doing this really well because I never paid it any mind and we got no error reports about it.
But the easiest thing to do is to just not support multi threading for your objects. If users want to use the code from multiple threads, they are welcome to instantiate multiple instances and use one per thread.
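As a minimal sketch of the "one instance per thread" approach, here is what it might look like from the caller's side. The sketch is in Java rather than C# (the shape is the same in both), and CsvLineWriter and OneInstancePerThread are hypothetical names made up for illustration, not types from any of the libraries mentioned above.

```java
// Hypothetical library type: NOT thread-safe, because it reuses an internal
// buffer between calls. It holds no static (shared) state, so separate
// instances never interfere with each other.
class CsvLineWriter {
    private final StringBuilder buffer = new StringBuilder();

    String write(String... fields) {
        buffer.setLength(0); // reuse the per-instance buffer
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) buffer.append(',');
            buffer.append(fields[i]);
        }
        return buffer.toString();
    }
}

class OneInstancePerThread {
    // Starts threadCount workers; each one creates its OWN writer.
    static String[] run(int threadCount) {
        String[] results = new String[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            threads[t] = new Thread(() -> {
                CsvLineWriter writer = new CsvLineWriter(); // one per thread
                results[id] = writer.write("thread", String.valueOf(id));
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join(); // wait for the worker before reading its slot
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results;
    }
}
```

The library type needs no locks at all; the only coordination is the caller waiting for its own threads to finish.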
In my next post, I’ll talk about what happens when you actually do need to hold some shared state.
Comments
There's a difference between not spawning multiple threads in your libraries vs ensuring your library is thread-safe. I agree you generally want to avoid creating threads in your libraries, but if you provide static utils that maintain static state, you need to make them thread-safe.
E.g. ServiceStack's JSON, JSV + CSV Serializers employ aggressive static delegate caching; they maintain the caches statically so the cost of creating delegates is only incurred once. I personally wouldn't want to use a serializer that doesn't do this; paying for start-up costs more than once is not a good perf strategy.
I'm sure many other fast .NET libs that touch reflection do the same thing, and there are different strategies to achieve thread-safety: e.g. in ServiceStack.Text we use lock-free code + immutable collections.
IMO this post could've been improved if it listed the different strategies to make your code thread-safe...
Personally I like to take advantage of static constructors and try to isolate multi-threaded code in as small a surface area as possible. E.g. rather than ensuring a single RedisClient and its socket connection is thread-safe in every API, I pass around a thread-safe Redis Client Factory instead, which is used to retrieve non-thread-safe RedisClient connection instances. This confines the multi-threaded code to one spot and means we don't have to worry about thread-safety in the instance clients.
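A rough sketch of that factory idea, written in Java for illustration: the factory is the only thread-safe type, and the clients it hands out are plain single-threaded objects. PooledClient and ClientFactory are hypothetical names, and nothing here reflects the actual ServiceStack Redis client API.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a connection-backed client; NOT thread-safe.
class PooledClient {
    final int id;
    private int commandsSent = 0; // plain field: one thread uses a client at a time

    PooledClient(int id) {
        this.id = id;
    }

    String send(String command) {
        commandsSent++;
        return "client-" + id + ":" + command + "#" + commandsSent;
    }
}

// The only thread-safe type: hands out clients and takes them back.
class ClientFactory {
    private final ConcurrentLinkedQueue<PooledClient> idle = new ConcurrentLinkedQueue<>();
    private final AtomicInteger created = new AtomicInteger();

    PooledClient acquire() {
        PooledClient client = idle.poll(); // reuse an idle client if there is one
        return client != null ? client : new PooledClient(created.incrementAndGet());
    }

    void release(PooledClient client) {
        idle.offer(client); // make the client available to other threads
    }

    int createdCount() {
        return created.get();
    }
}
```

All the concurrency lives in acquire and release; a caller that holds a client between those two calls can treat it as single-threaded code.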
Demis, wait for tomorrow's post, it discusses exactly that.
Ayende, check out ThreadSafeStore:
https://github.com/JamesNK/Newtonsoft.Json/blob/master/Src/Newtonsoft.Json/Utilities/ThreadSafeStore.cs
Creating a new dictionary for every change isn't elegant, but there is a finite amount of type data to cache, so that isn't a problem. It's ghetto, but it is compatible with every version of the .NET Framework and it has yet to break.
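The "new dictionary for every change" idea can be sketched roughly as follows. This is an illustrative Java version written for this discussion, not the ThreadSafeStore source; CopyOnWriteCache is a hypothetical name, and the sketch assumes the creator function never returns null.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class CopyOnWriteCache<K, V> {
    private final Function<K, V> creator;
    private final Object writeLock = new Object();
    // Readers see either the old map or the new one, never a half-updated map.
    private volatile Map<K, V> snapshot = new HashMap<>();

    CopyOnWriteCache(Function<K, V> creator) {
        this.creator = creator;
    }

    V get(K key) {
        V value = snapshot.get(key); // lock-free read; a published map is never mutated
        return value != null ? value : addValue(key);
    }

    private V addValue(K key) {
        // Under contention the creator may run more than once for a key,
        // but only one result is ever published.
        V value = creator.apply(key);
        synchronized (writeLock) {
            V existing = snapshot.get(key);
            if (existing != null) {
                return existing; // another thread added it first
            }
            Map<K, V> copy = new HashMap<>(snapshot); // copy the whole map
            copy.put(key, value);
            snapshot = copy; // publish the new map with a single volatile write
            return value;
        }
    }

    int size() {
        return snapshot.size();
    }
}
```

Every miss costs a full copy of the map, which is only reasonable because the set of keys (type data, in the case described above) is finite and small, while reads, the common case, take no lock.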
James, Absolutely, and it is actually quite elegant. I use similar methods here: http://ayende.com/blog/17409/caching-the-funny-way
I discuss those approaches in my next post.
I'd argue that doing any work to make sure there's no shared state in a library is multi-threaded (i.e. thread-safe) design.
Hey Ayende, how about writing concurrent code using the actor model, such as with F#'s agents (http://www.developerfusion.com/article/140677/writing-concurrent-applications-using-f-agents/) instead?
What are your thoughts on them?
Actors eradicate the use of shared state altogether, and since everything inside an actor is single-threaded, actors are thread-safe by default and there is no need for synchronization. If you need higher throughput, just employ multiple agents in a simple fan-out setup.
This model of concurrency is also employed in Erlang, and is the basis of many highly scalable NoSQL solutions such as Couchbase and Riak.
Nowadays, the PostSharp toolkit also has support for actor-based programming in C# (http://www.sharpcrafters.com/blog/post/Actor-Based-Programming-with-C-50-and-PostSharp-Threading-Toolkit.aspx).
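To make the agent idea concrete, here is a very small sketch in Java that uses a single-threaded executor as the mailbox. CounterAgent is a hypothetical name, and this is neither the F# MailboxProcessor nor the PostSharp API, just the same shape: state that is only ever touched by one thread.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class CounterAgent {
    // The single-threaded executor acts as the mailbox: messages queue up
    // and are processed one at a time, in order, on one thread.
    private final ExecutorService mailbox = Executors.newSingleThreadExecutor();
    private long total = 0; // only ever read or written on the mailbox thread

    // Any thread may post; the caller never touches the state directly.
    void post(long amount) {
        mailbox.execute(() -> total += amount);
    }

    // Reads go through the mailbox too, so they see every earlier message.
    long total() {
        try {
            return mailbox.submit(() -> total).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    void stop() {
        mailbox.shutdown(); // lets the mailbox thread exit once the queue drains
    }
}
```

No locks appear in the agent itself; the queue inside the executor is the only synchronized piece, and a fan-out setup would simply create several such agents.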