Immutable Collections performance

time to read 25 min | 4939 words

After finding out that a lot of our costs in Voron is actually related to immutable collections, I decided to run some isolated perf tests.

   1: private static void Dictionary()
   2: {
   3:     var dic = new Dictionary<object, object>();
   4:     var sp = Stopwatch.StartNew();
   5:  
   6:     for (int i = 0; i < 10*1000*1000; i++)
   7:     {
   8:         var obj = new object();
   9:         dic[obj] = obj;
  10:     }
  11:  
  12:     Console.WriteLine(sp.Elapsed);
  13: }                                                          
   1: private static void ImmutableDictionary()
   2: {
   3:     var dic = ImmutableDictionary<object,object>.Empty;
   4:     var sp = Stopwatch.StartNew();
   5:  
   6:     for (int i = 0; i < 10 * 1000 * 1000; i++)
   7:     {
   8:         var obj = new object();
   9:         dic = dic.SetItem(obj, obj);
  10:     }
  11:  
  12:     Console.WriteLine(sp.Elapsed);
  13: }

3.639 seconds

1 minute and 58 seconds

Yes, that is correct, the immutable version is over thirty times slower. And given that we are only using 10 million items, that is a ridiculous high rate.

I decided to do some tests to see if it is just the number of calls that we are making here:

   1: private static void ImmutableDictionary()
   2: {
   3:     var dic = ImmutableDictionary<object,object>.Empty;
   4:     var sp = Stopwatch.StartNew();
   5:  
   6:     dic = dic.SetItems(Enumerable.Range(0, 10*1000*1000).Select(i =>
   7:     {
   8:         var obj = new object();
   9:         return new KeyValuePair<object, object>(obj, obj);
  10:     }));
  11:     
  12:     Console.WriteLine(sp.Elapsed);
  13: }

1 minute

So that it appears that it isn’t the number of calls, but something intrinsic to the way it works. And how about lists?

   1: private static void List()
   2: {
   3:     var list = new List<object>();
   4:     var sp = Stopwatch.StartNew();
   5:  
   6:     for (int i = 0; i < 10 * 1000 * 1000; i++)
   7:     {
   8:         var obj = new object();
   9:         list.Add(obj);
  10:     }
  11:  
  12:     Console.WriteLine(sp.Elapsed);
  13: }                                                           
   1: private static void ImmutableList()
   2: {
   3:     var list = ImmutableList<object>.Empty;
   4:     var sp = Stopwatch.StartNew();
   5:  
   6:     for (int i = 0; i < 10 * 1000 * 1000; i++)
   7:     {
   8:         var obj = new object();
   9:         list = list.Add(obj);
  10:     }
  11:  
  12:     
  13:     Console.WriteLine(sp.Elapsed);
  14: }

0.9 seconds

16 seconds

Ouch.

I think we are going to need to do something about this, even though I love the idea of immutable collections, 16 – 32 times slower is just going to be utterly unacceptable.

But this is for writes, how about reads?

I had written the following for both of them:

   1: var sp = Stopwatch.StartNew();
   2: for (int i = 0; i < 10*1000*1000; i++)
   3: {
   4:     object value;
   5:     dic1.TryGetValue(i, out value);
   6: }
   7:  
   8: Console.WriteLine(sp.Elapsed);
   1: var sp1 = Stopwatch.StartNew();
   2:  
   3: for (int i = 0; i < 10 * 1000 * 1000; i++)
   4: {
   5:     object value;
   6:     dic2.TryGetValue(i, out value);
   7: }
   8: Console.WriteLine(sp1.Elapsed);

0.45 seconds

3.11 seconds

So the summary is, the performance for our workloads is probably going to be simply unacceptable. We’ll have to find another way to handle this. The way to handle this nicely is to basically copy the values. So I wanted to know how long it would take to fully clone a 10 millions items dictionary. The answer: 0.6 seconds. Doing the same with immutable dictionary? 16 seconds.

So we are going to need to do something else Sad smile.