Cloned Dictionary vs. Immutable Dictionary vs. Frozen Dictionary in high-traffic systems
In my previous post, I explained what we are trying to do: create a way to carry a dictionary between transactions in RavenDB, allowing one write transaction to modify it while all other read transactions observe only the state of the dictionary as it was at publication time.
I want to show a couple of ways I tried solving this problem using the built-in tools in the Base Class Library. Here is roughly what I’m trying to do:
IEnumerable<object> SingleDictionary()
{
    var dic = new Dictionary<long, object>();
    var random = new Random(932);
    var v = new object();
    // number of transactions
    for (var txCount = 0; txCount < 1000; txCount++)
    {
        // operations in transaction
        for (int opCount = 0; opCount < 10_000; opCount++)
        {
            dic[random.NextInt64(0, 1024 * 1024 * 1024)] = v;
        }
        yield return dic; // publish the dictionary
    }
}
As you can see, we run a thousand transactions, each of which performs 10,000 operations, and we “publish” the state of the dictionary after each transaction.
This is just to establish a baseline for what I’m trying to do; I’m focusing solely on this one aspect of the table that is published. Note that I cannot actually use this particular code: the dictionary would be both mutable and shared across threads, which is exactly what I must avoid.
The easiest way to go about this is to just clone the dictionary. Here is what this would look like:
IEnumerable<object> ClonedDictionary()
{
    var dic = new Dictionary<long, object>();
    var random = new Random(932);
    var v = new object();
    // number of transactions
    for (var txCount = 0; txCount < 1000; txCount++)
    {
        // operations in transaction
        for (int opCount = 0; opCount < 10_000; opCount++)
        {
            dic[random.NextInt64(0, 1024 * 1024 * 1024)] = v;
        }
        // publish the dictionary
        yield return new Dictionary<long, object>(dic);
    }
}
This is basically the same code, but when I publish the dictionary, I create a new instance (which is then treated as read-only). This is exactly what I want: a cloned, read-only copy that the read transactions can use while I keep modifying the write copy.
The downside of this approach is twofold: it generates a lot of allocations, and the more items there are in the table, the more expensive each copy becomes.
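The key property of the clone-on-publish approach is that the copy is a true snapshot. A minimal sketch to illustrate it (variable names are mine, not from the original code):

```csharp
using System;
using System.Collections.Generic;

var write = new Dictionary<long, object> { [1] = "a" };

// Publish a snapshot: the copy constructor produces an independent dictionary.
var published = new Dictionary<long, object>(write);

// Further writes go only to the write-side copy.
write[2] = "b";

Console.WriteLine(published.Count); // 1 -- the snapshot is unaffected
Console.WriteLine(write.Count);     // 2
```

The copy constructor walks every entry, which is exactly why the cost grows with the size of the table.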
However, I can try using the ImmutableDictionary from the Base Class Library. Here is what that would look like:
IEnumerable<object> ClonedImmutableDictionary()
{
    var dic = ImmutableDictionary.Create<long, object>();
    var random = new Random(932);
    var v = new object();
    // number of transactions
    for (var txCount = 0; txCount < 1000; txCount++)
    {
        // operations in transaction
        for (int opCount = 0; opCount < 10_000; opCount++)
        {
            dic = dic.Add(random.NextInt64(0, 1024 * 1024 * 1024), v);
        }
        // publish the dictionary
        yield return dic;
    }
}
The benefit here is that the act of publishing is effectively a no-op: we just hand the immutable value out to the world. The downside is that each operation involves an allocation, and the underlying implementation (an AVL tree) is far less efficient than a regular dictionary's hash table.
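What makes publishing a no-op is that every mutation returns a fresh dictionary and never touches the one a reader might hold. A short sketch of that behavior (using SetItem, which has the same upsert semantics as the indexer in the other versions):

```csharp
using System;
using System.Collections.Immutable;

var v1 = ImmutableDictionary.Create<long, string>().SetItem(1, "one");

// SetItem returns a *new* dictionary; v1 is untouched, so a reader
// holding v1 keeps seeing a stable snapshot.
var v2 = v1.SetItem(2, "two");

Console.WriteLine(v1.Count);                // 1
Console.WriteLine(v2.Count);                // 2
Console.WriteLine(ReferenceEquals(v1, v2)); // False
```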
I can try to optimize this a bit by using the builder pattern, as shown here:
IEnumerable<object> BuilderImmutableDictionary()
{
    var builder = ImmutableDictionary.CreateBuilder<long, object>();
    var random = new Random(932);
    var v = new object();
    // number of transactions
    for (var txCount = 0; txCount < 1000; txCount++)
    {
        // operations in transaction
        for (int opCount = 0; opCount < 10_000; opCount++)
        {
            builder[random.NextInt64(0, 1024 * 1024 * 1024)] = v;
        }
        // publish the dictionary
        yield return builder.ToImmutable();
    }
}
Now we only pay the immutable cost once per transaction, right? However, the underlying implementation is still an AVL tree, not a proper hash table. This means that not only is publishing the state more expensive, but reads are slower as well. That is not something we want.
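The builder still gives us the snapshot guarantee we need: each ToImmutable call freezes the current state, and later writes to the builder do not leak into snapshots already handed out. A small sketch:

```csharp
using System;
using System.Collections.Immutable;

var builder = ImmutableDictionary.CreateBuilder<long, string>();

builder[1] = "one";
var snapshot1 = builder.ToImmutable(); // freeze the current state

builder[2] = "two";
var snapshot2 = builder.ToImmutable();

// Each snapshot is a stable view; later builder writes don't affect it.
Console.WriteLine(snapshot1.Count); // 1
Console.WriteLine(snapshot2.Count); // 2
```

The mutations within a transaction are done in place on the builder, which is what saves the per-operation allocations; the tree-based lookups remain the slow part.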
The BCL recently introduced a FrozenDictionary, which is meant to be super efficient for a really common case of dictionaries that are accessed a lot but rarely written to. I delved into its implementation and was impressed by the amount of work invested into ensuring that this will be really fast.
Let’s see what that would look like for our scenario, shall we?
IEnumerable<object> FrozenDictionary()
{
    var dic = new Dictionary<long, object>();
    var random = new Random(932);
    var v = new object();
    // number of transactions
    for (var txCount = 0; txCount < 1000; txCount++)
    {
        // operations in transaction
        for (int opCount = 0; opCount < 10_000; opCount++)
        {
            dic[random.NextInt64(0, 1024 * 1024 * 1024)] = v;
        }
        // publish the dictionary
        yield return dic.ToFrozenDictionary();
    }
}
The good thing is that we are using a standard dictionary on the write side and publishing it once per transaction. The downside is that we need to pay a cost to create the frozen dictionary that is proportional to the number of items in the dictionary. That can get expensive fast.
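To make the cost model concrete: ToFrozenDictionary (in System.Collections.Frozen, .NET 8+) copies every entry into a new read-optimized structure, so each publish is O(n) in the table size, while the resulting snapshot is fully independent of later writes. A minimal sketch:

```csharp
using System;
using System.Collections.Frozen;
using System.Collections.Generic;

var dic = new Dictionary<long, string> { [1] = "one" };

// ToFrozenDictionary copies every entry into a new read-optimized
// structure -- O(n) work on every publish.
var frozen = dic.ToFrozenDictionary();

dic[2] = "two"; // later writes don't affect the frozen snapshot

Console.WriteLine(frozen.Count); // 1
Console.WriteLine(frozen[1]);    // one
```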
After seeing all of those options, let’s check the numbers. The full code is in this gist.
I executed all of those using BenchmarkDotNet; let's see the results.
| Method                          | Mean          | Ratio    |
|---------------------------------|---------------|----------|
| SingleDictionaryBench           | 7.768 ms      | 1.00     |
| BuilderImmutableDictionaryBench | 122.508 ms    | 15.82    |
| ClonedImmutableDictionaryBench  | 176.041 ms    | 21.95    |
| ClonedDictionaryBench           | 1,489.614 ms  | 195.04   |
| FrozenDictionaryBench           | 6,279.542 ms  | 807.36   |
| ImmutableDictionaryFromDicBench | 46,906.047 ms | 6,029.69 |
Note that the difference in speed is absolutely staggering. SingleDictionaryBench isn't a fair comparison; it just fills a dictionary directly, with none of the publication costs. The cost of BuilderImmutableDictionaryBench is more reasonable, given what it has to do.
Just looking at the benchmark result isn’t sufficient. I implemented every one of those options in RavenDB and ran them under a profiler. The results are quite interesting.
Here is the version I started with, using a frozen dictionary. That is the right data structure for what I want: I have one thread that mutates the data, then publishes the frozen results for others to use.
However, take a look at the profiler results! Don’t focus on the duration values, look at the percentage of time spent creating the frozen dictionary. That is 60%(!) of the total transaction time. That is… an absolutely insane number.
It is clear that the frozen dictionary isn't suitable for our needs here; the ratio between reads and writes isn't high enough to justify the cost. FrozenDictionary is deliberately more expensive to create than a normal dictionary, because it tries hard to optimize read performance.
What about the ImmutableDictionary? Well, that is a complete non-starter. It takes close to 90%(!!) of the total transaction runtime. I know I called the frozen numbers insane; I should have saved that word, because now I have none left to describe this.
Remember that one problem here is that we cannot just use the regular dictionary or a concurrent dictionary. We need to have a fixed state of the dictionary when we publish it. What if we use a normal dictionary, cloned?
This is far better, at about 40%, instead of 60% or 90%.
You have to understand, better doesn't mean good. Spending that much of the transaction time on just publishing its state is beyond ridiculous.
We need to find another way to do this. Remember where we started? The PageTable in RavenDB that currently handles this is really complex.
I looked into my records and found this blog post from over a decade ago, discussing this exact problem. It certainly looks like this complexity is at least semi-justified.
I still want to be able to fix this… but it won’t be as easy as reaching out to a built-in type in the BCL, it seems.
Comments
My first thought after reading this is to use copy-on-write. That would allow all read transactions to get a read-only view of the dictionary cheaply since the allocation cost would be pushed onto the write transaction, and you'd only have to pay it when a write occurred. As long as no reader captures the dictionary, a write transaction can continue writing to it without paying any additional costs.
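[A minimal, non-thread-safe sketch of the copy-on-write idea described above; the `CowTable` type and its members are hypothetical names, not anything from RavenDB. The clone cost is paid only on the first write after a reader has captured the current dictionary:]

```csharp
using System;
using System.Collections.Generic;

// Hypothetical copy-on-write table: clone only when a reader has
// actually captured the current dictionary. Not thread-safe as written.
class CowTable
{
    private Dictionary<long, object> _current = new();
    private bool _captured;

    public IReadOnlyDictionary<long, object> Publish()
    {
        _captured = true; // a reader now holds _current
        return _current;
    }

    public void Set(long key, object value)
    {
        if (_captured)
        {
            // Pay the clone cost only on the first write after a capture.
            _current = new Dictionary<long, object>(_current);
            _captured = false;
        }
        _current[key] = value;
    }
}

var table = new CowTable();
table.Set(1, "one");
var view = table.Publish();
table.Set(2, "two"); // triggers a clone; the published view is unchanged
Console.WriteLine(view.Count); // 1
```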
The first thing I would try is returning a custom dictionary that contains a version number. Each entry becomes a linked list, and for every write we replace the head of that list. The overhead should be 16 bytes per entry, with almost no runtime cost for unchanged items, but potentially a linear lookup cost, relative to the age of the transaction and the number of transactions, for high-frequency items.
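[A hypothetical sketch of that versioned-entry idea; the `VersionedTable` and `Node` names are mine. Each key maps to a chain of (version, value) nodes, newest first, and a reader pinned at version V walks the chain to the first node written before V:]

```csharp
using System;
using System.Collections.Generic;

class VersionedTable
{
    private sealed record Node(long Version, object Value, Node? Next);

    private readonly Dictionary<long, Node> _entries = new();
    public long Version { get; private set; }

    public void Set(long key, object value)
    {
        _entries.TryGetValue(key, out var head);
        _entries[key] = new Node(Version, value, head); // prepend new head
    }

    public long Publish() => ++Version; // readers capture this number

    public bool TryGet(long key, long readerVersion, out object? value)
    {
        value = null;
        if (!_entries.TryGetValue(key, out var node)) return false;
        // Linear walk: cost grows with how often this key was rewritten.
        for (; node != null; node = node.Next)
        {
            if (node.Version < readerVersion) { value = node.Value; return true; }
        }
        return false;
    }
}

var t = new VersionedTable();
t.Set(1, "old");
var v1 = t.Publish();    // reader pinned at version 1
t.Set(1, "new");         // written at version 1, visible from version 2 on
t.TryGet(1, v1, out var seen);
Console.WriteLine(seen); // old
```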