Ayende @ Rahien

Refunds available at head office

Immutable Collections performance

After finding out that a lot of our costs in Voron are actually related to immutable collections, I decided to run some isolated perf tests.

private static void Dictionary()
{
    var dic = new Dictionary<object, object>();
    var sp = Stopwatch.StartNew();

    for (int i = 0; i < 10 * 1000 * 1000; i++)
    {
        var obj = new object();
        dic[obj] = obj;
    }

    Console.WriteLine(sp.Elapsed);
}
private static void ImmutableDictionary()
{
    var dic = ImmutableDictionary<object, object>.Empty;
    var sp = Stopwatch.StartNew();

    for (int i = 0; i < 10 * 1000 * 1000; i++)
    {
        var obj = new object();
        dic = dic.SetItem(obj, obj);
    }

    Console.WriteLine(sp.Elapsed);
}

Dictionary: 3.639 seconds

ImmutableDictionary: 1 minute and 58 seconds

Yes, that is correct: the immutable version is over thirty times slower. And given that we are only inserting 10 million items, that is a ridiculously high cost.

I decided to run some tests to see whether the problem is just the number of calls that we are making here:

private static void ImmutableDictionary()
{
    var dic = ImmutableDictionary<object, object>.Empty;
    var sp = Stopwatch.StartNew();

    dic = dic.SetItems(Enumerable.Range(0, 10 * 1000 * 1000).Select(i =>
    {
        var obj = new object();
        return new KeyValuePair<object, object>(obj, obj);
    }));

    Console.WriteLine(sp.Elapsed);
}

1 minute

So it appears that it isn't the number of calls, but something intrinsic to the way it works. And what about lists?

private static void List()
{
    var list = new List<object>();
    var sp = Stopwatch.StartNew();

    for (int i = 0; i < 10 * 1000 * 1000; i++)
    {
        var obj = new object();
        list.Add(obj);
    }

    Console.WriteLine(sp.Elapsed);
}
private static void ImmutableList()
{
    var list = ImmutableList<object>.Empty;
    var sp = Stopwatch.StartNew();

    for (int i = 0; i < 10 * 1000 * 1000; i++)
    {
        var obj = new object();
        list = list.Add(obj);
    }

    Console.WriteLine(sp.Elapsed);
}

List: 0.9 seconds

ImmutableList: 16 seconds

Ouch.

I think we are going to need to do something about this. Even though I love the idea of immutable collections, being 16 to 32 times slower is simply unacceptable.

But that is for writes; what about reads?

I wrote the following test for both of them:

var sp = Stopwatch.StartNew();
for (int i = 0; i < 10 * 1000 * 1000; i++)
{
    object value;
    dic1.TryGetValue(i, out value);
}

Console.WriteLine(sp.Elapsed);
var sp1 = Stopwatch.StartNew();

for (int i = 0; i < 10 * 1000 * 1000; i++)
{
    object value;
    dic2.TryGetValue(i, out value);
}
Console.WriteLine(sp1.Elapsed);

Dictionary: 0.45 seconds

ImmutableDictionary: 3.11 seconds

So the summary is that the performance for our workloads is probably going to be simply unacceptable. We'll have to find another way to handle this. The way to handle this nicely is to basically copy the values. So I wanted to know how long it would take to fully clone a 10 million item dictionary. The answer: 0.6 seconds. Doing the same with an immutable dictionary? 16 seconds.
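To make the clone-as-snapshot idea concrete, here is a minimal sketch (not Voron's actual code; the long keys and the publish step are assumptions for illustration) of using Dictionary's copy constructor to freeze a view for readers:

```csharp
using System;
using System.Collections.Generic;

class SnapshotSketch
{
    static void Main()
    {
        // Single writer mutates a private dictionary...
        var current = new Dictionary<long, long> { [1] = 100, [2] = 200 };

        // ...then publishes a full copy; readers get an independent view
        // that the writer never touches again.
        IReadOnlyDictionary<long, long> snapshot = new Dictionary<long, long>(current);

        // The writer keeps mutating its private copy...
        current[3] = 300;

        // ...while the snapshot stays frozen.
        Console.WriteLine(snapshot.Count);  // 2
        Console.WriteLine(current.Count);   // 3
    }
}
```

The copy constructor walks every entry, which is where the 0.6 seconds for 10 million items comes from.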

So we are going to need to do something else.

Comments

Jahmai Lay
11/29/2013 02:52 AM

In zero contention scenarios, Dictionary is much much faster than ConcurrentDictionary, but in high contention it is the reverse.

It is not much surprise that a single threaded for-loop is faster for Dictionary than ImmutableDictionary.

What is Voron's dictionary usage pattern?

I would also compare the time it takes to construct the objects against the time it takes to add the items.

Ayende Rahien
11/29/2013 02:54 AM

Jahmai, We have no contention issue (access is already protected), but using immutable collections means that we don't have to fear mutation while reading. Now we have to figure out how to replace the immutable collections with something else.

erewhon
11/29/2013 03:36 AM

Yeah, it seems like instead of an immutable dictionary (which copies itself every time you alter its state) you want something that guarantees the immutability of the items it contains?

Chris Bernholt
11/29/2013 03:40 AM

Have you tried the static methods on ImmutableList and/or the extension methods? For example changing your ImmutableList test to the following drops the time down significantly:

        var list = new List<object>();
        var sp = Stopwatch.StartNew();

        for (int i = 0; i < 10 * 1000 * 1000; i++)
        {
            var obj = new object();
            list.Add(obj);
        }

        var iList = list.ToImmutableList<object>();

        Console.WriteLine(sp.Elapsed);
Ayende Rahien
11/29/2013 03:50 AM

Erewhon, Nope, I actually store long -> long there. So no issue with mutability of items. I need to have both readers and a single writer.

Ayende Rahien
11/29/2013 03:51 AM

Chris, Not really significant. Because I need to actually do a buildup of this over time, not just once.

Mikkel Christensen
11/29/2013 05:52 AM

Have you looked into the LMAX Disruptor ring buffer implementation for handling this?

Ayende Rahien
11/29/2013 05:53 AM

Mikkel, I don't need anything like that.

jgauffin
11/29/2013 07:23 AM

Why don't you just expose it as an IReadOnlyDictionary? Protected, but you can still modify it.

Hermit Dave
11/29/2013 09:24 AM

The performance isn't bad. I blogged about it a few minutes ago: http://invokeit.wordpress.com/2013/11/29/immutable-collections-performance-dotnet/

Stephan
11/29/2013 09:31 AM

The Immutable Collections API suggests using its builder classes for manipulating the collections:

    var builder = ImmutableDictionary.CreateBuilder<object, object>();
    var sp = Stopwatch.StartNew();

    for (int i = 0; i < 10 * 1000 * 1000; i++)
    {
        var obj = new object();
        builder.Add(obj, obj);
    }

    Console.WriteLine(sp.Elapsed);

When I run this test, it reduces the run time from 1:46.42 (using ImmutableDictionary) to 0:55.79. Now, compared to 3.67 seconds for the plain Dictionary version...

Frank Quednau
11/29/2013 11:47 AM

Hermit, as far as I can see, while ayende mutates the immutable list n times (meaning, creating a new one from the old one), you only generate it once per run - that is a very different test and not comparable. You are essentially copying the list into a new structure snapshot-style. You don't need Immutable* for that.

Ayende Rahien
11/29/2013 04:09 PM

jgauffin, IReadOnlyDictionary isn't useful for me. I want to have this read by many threads, and written by one thread. The dictionary is NOT safe to be read while it is being mutated.

Ayende Rahien
11/29/2013 04:10 PM

Stephan, That is the time for writing, but the read times are just as bad, and they impact us a LOT more.

Ayende Rahien
11/29/2013 04:13 PM

Hermit, What you are doing is creating a snapshot of the data. That is nice, but it is utterly useless for us. Try to imagine that you now need to do this in another operation (for example, you need to track something). That means that you cannot build the entire data set in one go.

Snapshot != immutable.

For fun, try iterating over both lists; for AND foreach are horrible.

Rafal
12/05/2013 10:28 AM

What implementation of ImmutableDictionary are you using?

Ayende Rahien
12/05/2013 10:29 AM

Rafal, The BCL Immutable collections: http://www.nuget.org/packages/Microsoft.Bcl.Immutable/

dupdob
12/05/2013 10:33 AM

Hello, I am not sure you can do much faster using any existing form of dictionary. The immutable collection works by creating a new version each time you alter it; this way each existing instance is read only and can be safely read by as many threads as you want.

The benefit you get from those is that the new version is not a full copy, which saves memory and copy time. To achieve that they have to use complex structures which are less iteration friendly than their basic counterparts.

Look at it this way: 2 minutes to create 10 M different dictionaries, with sizes from 1 to 10 M, is not that bad!

I understand this is not what you want to hear.

If it is possible in your implementation, you should favor batch modifications: SetItems or AddRange.
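To illustrate the structural sharing described above, here is a small sketch (assumes the System.Collections.Immutable / Microsoft.Bcl.Immutable package is referenced): each SetItem returns a new dictionary and leaves the old version untouched.

```csharp
using System;
using System.Collections.Immutable;

class SharingDemo
{
    static void Main()
    {
        // Each SetItem returns a new dictionary; the old one is unchanged.
        // Internally only O(log n) tree nodes are rebuilt, the rest is shared.
        var v1 = ImmutableDictionary<int, string>.Empty.SetItem(1, "one");
        var v2 = v1.SetItem(2, "two");

        Console.WriteLine(v1.Count);  // 1: v1 is still the old version
        Console.WriteLine(v2.Count);  // 2
    }
}
```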

Rafal
12/05/2013 10:33 AM

ah, .Net 4.5 only. ok.

Ayende Rahien
12/05/2013 10:37 AM

dupdob, I have tried with SetItems, it isn't really helpful for our scenario. In fact, performance is really bad around that.

And yes, I fully understand the implementation decision associated with this.

Paul Turner
12/05/2013 10:56 AM

For those unclear about these collections, the implementations for the immutable collections are substantially more complicated than their mutable counterparts. They are optimised to keep to a low memory footprint so that each change causes a new collection to be created with a reference to the previous state and a delta of the changes. It avoids repeatedly copying all the elements and the large memory allocation that would cost.

This immutability is perfect for representing a "snapshot" of state which you can pass around to other operations knowing that it will not mutate.

Compare this to a concurrent collection, which would provide support for concurrent reads and writes, but makes no promise that the state won't be shifting between reads.

I'm sure the BCL team have gone to great lengths to make sure these classes perform as fast as possible for their goals of low memory utilization, so it might be that the pattern isn't the most performant for this use case.

@Ayende What is the ratio between reads and writes? How "stale" can the data be for the readers? Is it important that the reads or writes aren't blocked?

Patrick McDonald
12/05/2013 11:23 AM

Tried something similar with F# immutable map

Writes:

    let elements = { 1 .. 10 * 1000 * 1000 } |> Seq.map (fun i -> i, i) |> Map.ofSeq

takes about 20 seconds.

Reads:

    elements |> Array.ofSeq |> ignore

takes 0.9 seconds.

(times on my machine; times for your C# dictionary samples are similar to yours)

I guess this means there's room for improvement in the C# ImmutableDictionary implementation

Ayende Rahien
12/05/2013 11:38 AM

Paul, The immutability contract is perfect, we need to have an immutable snapshot of the data while we are modifying it in another thread. There is one writer, and multiple readers, but the data cannot change from the moment the reader got the collection.

Ayende Rahien
12/05/2013 11:41 AM

Patrick, The problem is that the F# map doesn't have a way to batch changes. Our actual scenario is more complex, and we want to be able to batch those changes to avoid a lot of intermediary steps.

Paul Turner
12/05/2013 11:53 AM

If you are just iterating, it might be worth mentioning that the ConcurrentDictionary does provide a snapshot.

OmariO
12/05/2013 11:57 AM

Why not wrap ConcurrentDictionary in an IReadOnlyDictionary? ConcurrentDictionary is fairly good at reading.

Ayende Rahien
12/05/2013 12:05 PM

Paul, I am not iterating, I am calling TryGetValue a lot.

Ayende Rahien
12/05/2013 12:06 PM

OmariO, Because I don't get a snapshot of the state it was in.

Catalin
12/05/2013 01:44 PM

In the first test, you have created 10,000,000 snapshots of the immutable dictionary. I'm pretty sure you don't need that many.

If you try this instead, with only 1000 snapshots:

        var dic = ImmutableDictionary<object, object>.Empty;
        var sp = Stopwatch.StartNew();

        for (int j = 0; j < 1000; j++)
        {
            var builder = dic.ToBuilder();
            for (int i = j * 10 * 1000; i < 10 * 1000 * j; i++)
            {
                var obj = new object();
                builder.Add(i,obj);
            }
            dic = builder.ToImmutable();
        }

        Console.WriteLine(sp.Elapsed);

It runs in under a second on my machine.

Catalin
12/05/2013 01:52 PM

Oops, ignore the previous post, it has a bug (it does not add items).

Catalin
12/05/2013 02:18 PM

Proper version:

    private static void ImmutableDictionaryBatched()
    {
        var dic = ImmutableDictionary<object, object>.Empty;
        var sp = Stopwatch.StartNew();

        var builder = dic.ToBuilder();
        for (int i = 0; i < 10 * 1000 * 1000; i++)
        {
            if (i % 1000 == 0)
            {
                dic = builder.ToImmutable();
                builder = dic.ToBuilder();
            }
            var obj = new object();
            builder.Add(i, obj);
        }
        dic = builder.ToImmutable();

        Console.WriteLine(sp.Elapsed);
    }

Timings: 10,000,000 snapshots: 00:03:21.4288838; 1000 snapshots: 00:00:25.7278598

Read timings: 00:00:05.2422042 and 00:00:04.6549159 respectively.

macrohard
12/05/2013 02:29 PM

Try instantiating the dictionary with an initial capacity. This works wonders for List<>, especially when you know how many items it needs to hold. The underlying array that backs a List<> gets pre-sized, so you don't have to do any memory copies to resize the array. You get linear performance that way.
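As a rough sketch of that suggestion (the item count is smaller than the post's 10 million to keep the demo quick, and the exact timings will vary per machine):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

class CapacityDemo
{
    static void Main()
    {
        const int n = 1000 * 1000;

        var sp = Stopwatch.StartNew();
        var presized = new Dictionary<int, int>(n);  // capacity hint: backing storage allocated once
        for (int i = 0; i < n; i++) presized[i] = i;
        var presizedTime = sp.Elapsed;

        sp.Restart();
        var grown = new Dictionary<int, int>();      // starts small, rehashes repeatedly as it grows
        for (int i = 0; i < n; i++) grown[i] = i;
        var grownTime = sp.Elapsed;

        Console.WriteLine(presized.Count == grown.Count);  // True: same contents either way
        Console.WriteLine("pre-sized: " + presizedTime + ", grown: " + grownTime);
    }
}
```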

TTucker
12/05/2013 04:53 PM

I watched a video by Bjarne Stroustrup on LinkedLists vs Vectors in speed performance in reads and writes. Might be worth a look to explain why the immutable collection is so slow http://www.youtube.com/watch?v=YQs6IC-vgmo#t=6

alex
12/05/2013 07:30 PM

Well, this difference in performance should not be that surprising really. You are comparing an O(1) implementation against an O(log N) implementation. With 10 million inserts that is going to be noticeable. When I need a persistent data structure as a generic dictionary, I generally use an implementation of a hash array mapped trie. On these tests it is almost twice as fast as the BCL immutable dictionary, but still O(log N).

It seems like your needs are different: you do not want a persistent data structure, but something specific that delivers a "minimal change" immutable snapshot after each mutating operation on an underlying mutable data structure, and that is preferably array based vs. link/pointer based.

For these types of specific use cases it often makes sense to just bite the bullet and build your own customized data structure that is optimized for your use case.

Tyler
12/05/2013 08:54 PM

I tried a similar test with ConcurrentDictionary and performance lags big time when the key is an object. When it is an int, I get 2.0 seconds on Dictionary and 2.5 seconds on ConcurrentDictionary. In the past, I've tried to get better performance from Dictionary using my own locks and always given in to the superior performance of ConcurrentDictionary under heavy read/write contention (50 million entries). Straight sequential performance is probably not the best indicator of how it will hold up under load.

Ayende Rahien
12/06/2013 12:41 AM

Alex, Yes, that is the direction we are going with. We are currently trying out different purpose built implementations.

Ayende Rahien
12/06/2013 12:41 AM

Tyler, ConcurrentDictionary isn't good for us; it exposes the mutated state, and we need a frozen view.

Tyler
12/06/2013 01:02 AM

Okay, it exposes the mutated state. But how does a concurrency strategy around Dictionary prevent exposure of the mutated state? How do you "freeze" the view on the Dictionary?

flukus
12/06/2013 01:06 AM

Could you wrap the dictionary access instead of passing the dictionary itself around? The wrapper would hold an instance of the current dictionary and a couple of functions to control access. Something like:

CollectionWrapper.WithCurrent(x => /* do stuff, x is the current dictionary as an IReadOnlyDictionary */);

CollectionWrapper.ModifyCurrent(x => { /* x is a clone of the current dictionary, this will return a new dictionary */ return x; });

Some locking around ModifyCurrent and you could have multi-threaded writes as well.
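A minimal sketch of what such a wrapper could look like (the class and method names are illustrative, not a real API; clone-on-write plus an atomic reference swap):

```csharp
using System;
using System.Collections.Generic;

// Sketch of the wrapper described above: readers grab the current snapshot,
// the single writer clones, mutates the clone, and swaps it in.
class CollectionWrapper<TKey, TValue>
{
    private volatile Dictionary<TKey, TValue> _current = new Dictionary<TKey, TValue>();
    private readonly object _writeLock = new object();

    // Snapshots are never mutated after publication, so readers are safe.
    public IReadOnlyDictionary<TKey, TValue> GetCurrent() => _current;

    public void ModifyCurrent(Action<Dictionary<TKey, TValue>> change)
    {
        lock (_writeLock)  // serialize writers; readers never block
        {
            var clone = new Dictionary<TKey, TValue>(_current);  // clone on every write
            change(clone);
            _current = clone;  // atomic reference swap publishes the new version
        }
    }
}

class Demo
{
    static void Main()
    {
        var wrapper = new CollectionWrapper<string, int>();
        var before = wrapper.GetCurrent();
        wrapper.ModifyCurrent(d => d["abc"] = 1);

        Console.WriteLine(before.ContainsKey("abc"));                // False: old snapshot unchanged
        Console.WriteLine(wrapper.GetCurrent().ContainsKey("abc"));  // True: new snapshot
    }
}
```

The clone cost on every write is exactly the objection raised below, so this only pays off when writes are rare relative to reads.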

Ayende Rahien
12/06/2013 01:08 AM

Tyler, That is what immutable collections gives you.

Ayende Rahien
12/06/2013 01:09 AM

flukus, I cannot do that, I am going to keep the instance I get for a while, and need to make multiple separate calls to it.

flukus
12/06/2013 01:17 AM

You could have wrapper.GetCurrent(). As long as you can trust the calling code not to make multiple calls, i.e.: if (wrapper.GetCurrent().HasKey("key")) var x = wrapper["key"].

That's not exactly ideal though. I assume the data is too big to clone every time?

Tyler
12/06/2013 01:18 AM

Okay, I understand that. But the cost (making a new copy of the data on every write) seems prohibitively high, as you point out. I wonder if you could achieve something similar but faster by having two dictionaries (A & B) for data and a transaction/swap dictionary (C). Write to A, read from B. When you write to A, you "log" the change to C. When you swap A for B, C is used to update B, and reads occur on A.

Thanks for your patience as I'm poking around in the dark outside of the box, not knowing the internals of the problem you are solving.

Ayende Rahien
12/06/2013 01:22 AM

Flukus, Sure, you get the current dictionary, but how do you protect it from other changes?

Ayende Rahien
12/06/2013 01:23 AM

Tyler, The problem here is that we cannot assume we have just a single reader and a single writer. In practice we have multiple readers and one writer.

flukus
12/06/2013 01:34 AM

The readers only get access to an IReadOnlyDIctionary and the writer gets a cloned copy to modify. After modification wrapper.current will be replaced with the cloned dictionary. Every call to GetCurrent() will potentially get a new dictionary.

Do readers have to have the latest version or is it okay to have a stale version while the writes are taking place?

Ayende Rahien
12/06/2013 01:36 AM

flukus, That still means that you have to clone the dictionary on every write.

The readers get the version once (it has to be the latest), and then use it for a while.

Tyler
12/06/2013 01:50 AM

I think I understand that. If you have multiple readers, and they are all reading from the "read-only" dictionary (perhaps even a ConcurrentDictionary that is not taking writes at any time readers are reading from it). And you have a single writer that is writing to the "writable" dictionary. And at some point when writes are at a lull or some other strategy determined point, readers are switched to read from the previously "writeable" collection. At that point, the queued changes (in "C") are written to the previously "read only" collection and writes continue there, queuing changes until the next "swap".

If I'm missing the boat, can you explain how you expected immutables to behave differently except that every write would create a new copy that would be available to the next reader? Do you need that level of concurrency, that every write invalidates the current "read only" collection? If so, could you get away with segmenting or sharding the immutables into smaller collections, with reads and writes orchestrated across the "shards" to limit the amount of data being "copied" on each write/read cycle?

Ayende Rahien
12/06/2013 01:58 AM

Tyler, The writer makes its own changes, and they are not visible until it is done. Once it is done, the changes are public and available to any reader. Readers' lifetimes are independent, and you can have readers that last multiple write cycles, but they need the same data in their dictionary throughout.

flukus
12/06/2013 02:05 AM

Cloning for writes is the only lock-free way of doing it I know of. Otherwise you end up re-implementing ImmutableDictionary.

wrapper.GetCurrent would be the current snapshot so if we had the situation:

  1. reader thread a calls wrapper.getCurrent()
  2. writer thread adds "abc" to the dictionary
  3. writer thread goes about doing more writes or whatever it does
  4. reader thread checks if "abc" is in the dictionary

So at step 4 "abc" won't be in the dictionary because it is a snapshot, which is semantically the same as ImmutableDictionary but with hopefully better performance characteristics (it depends entirely on how many clones you're doing).

Either that, or as tyler said, additions/modifications go into a separate dictionary and at some point you combine the two, this requires either locking or cloning as well.

The big question is what are the write characteristics? Frequent small additions or less frequent large additions?

Tyler
12/06/2013 02:19 AM

Ayende, it sounds like you don't want mutability but versioning. Can a reader be given a version and all reads of the mutable concurrent dictionary read the value for that version or prior to that version but never a newer version of the value? And all writes simply adding a new value with a current "writer version" rather than updating an existing value? And then a trailing process to remove old versions once there are no more readers with that version. In this case, the "value" in the concurrent dictionary would have to support multiple "versions" with a version key. And deletes would be more complex. But it just might give you reader immutability (or the illusion of same) and writer freedom while allowing readers to have a lifetime across multiple reads with a guarantee that they read only their "version" of the data. Hmmm.....
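A sketch of how this versioning idea could look (all names invented; a single writer is assumed, and the per-key version lists are left unsynchronized for brevity, which a real implementation would have to address):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Every write appends a (version, value) pair instead of overwriting;
// a reader pinned to version V sees the newest value with version <= V.
class VersionedMap
{
    private readonly ConcurrentDictionary<long, List<(long Version, long Value)>> _map
        = new ConcurrentDictionary<long, List<(long Version, long Value)>>();
    private long _writerVersion;

    public long CurrentVersion => _writerVersion;

    public void Write(long key, long value)
    {
        _writerVersion++;  // single writer assumed, so no interlocking here
        var versions = _map.GetOrAdd(key, _ => new List<(long, long)>());
        versions.Add((_writerVersion, value));
    }

    public bool TryRead(long key, long readerVersion, out long value)
    {
        value = 0;
        if (!_map.TryGetValue(key, out var versions)) return false;
        for (int i = versions.Count - 1; i >= 0; i--)  // scan newest first
        {
            if (versions[i].Version <= readerVersion)
            {
                value = versions[i].Value;
                return true;
            }
        }
        return false;
    }

    static void Main()
    {
        var m = new VersionedMap();
        m.Write(1, 100);
        long pinned = m.CurrentVersion;  // reader pins its version
        m.Write(1, 999);                 // later write must stay invisible
        m.TryRead(1, pinned, out var v);
        Console.WriteLine(v);  // 100: the reader still sees its version
    }
}
```

As noted, deletes and trimming old versions once no reader needs them are the hard parts this sketch skips.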

Ayende Rahien
12/06/2013 05:40 AM

flukus, We frequently do small batches of additions. It can be changing 10 values, it could be changing 100 - 250, but those are the usual ranges.

Ayende Rahien
12/06/2013 05:41 AM

Tyler, That is actually a very good suggestion, and it just might work great! That would turn our process into O(1), with very little cost. I am going to try that soon.

Tyler
12/06/2013 06:09 AM

Cool. I look forward to hearing how it goes.

alex
12/06/2013 07:05 AM

I am not sure if your use case is similar to one of mine, where basically I am only appending to the end of a list (e.g. pages for a transaction, essentially leading to a transaction ordered list), removing from the front of the list (e.g. transaction pages that are no longer needed) and searching for the first entry matching some criterion and iterating sequentially from that first entry (e.g. those that still need to be data synced).

I created a custom immutable list for this purpose that performs O(1) amortized for appends, removals, indexing and enumeration, and thus can be searched in O(log N) for a given item, if the items are inserted in sorted order (which is the case in my use case).

For inserts it takes about twice the time of a BCL generic list; for random/sequential indexing and enumeration it performs about the same as the BCL list. For removals it is much faster. Performance in practice obviously depends on the actual mix of operations, but so far I am pretty happy with it.

I added the essentials to a gist if you are interested: https://gist.github.com/anonymous/7819554

alex
12/06/2013 07:27 AM

One note of caution: this implementation is only truly immutable if there is only a single thread at a time that appends new items.

Ayende Rahien
12/06/2013 07:42 AM

Alex, that is a very interesting implementation. I'll try it out, but it looks like it should be quite appropriate for our needs. Can I assume that you don't have an issue with me using this?

And yes, we only have a single writer thread.

alex
12/06/2013 08:32 AM

I have no issue with you using it. In fact that has already resulted in finding and fixing 2 bugs ;)

An update with 2 fixes: https://gist.github.com/anonymous/7820394

Patrick Huizinga
12/06/2013 10:23 AM

Some other idea that might be useful (though Tyler's is pretty good too):

Assume:

    class TransUndo { KVP old; TransUndo next; }
    static TransUndo last;
    static ConcurrentDictionary dict;

Write:

    last = new TransUndo { old = valuesReplacedByChanges, Next = last };
    dict.Write

Read:

    snapshot = last
    ... // skipping snapshot is intentional
    for (current = snapshot.Next; current != null; current = current.Next)
    {
        if (current.old.Key == readKey) return current.old.Value;
    }
    return dict.Read

The advantage of this solution over Tyler's is that reading is cheaper when no write happened during the read transaction.

Also I'm not sure how Tyler would deal with very old values. What about a read transaction version 200000 when you need value version 1? dict[key][200000] is not going to work. And iterating would be expensive.

The advantage over alex's solution is that cleaning up the old nodes is handled by the GC, but it comes at the cost of slower reads when a lot of writes happened during the read transaction. Also, alex's solution allows you to reuse the transaction object, which could relieve a bit of GC pressure.

Patrick Huizinga
12/06/2013 10:24 AM

ugh bitten by a newline is not a new line again :(

tx
12/06/2013 10:55 AM

I'm coming from a Java context, so I have not checked this with C#. But if the list is the same as in other functional languages, you are likely benchmarking the wrong way.

For immutable structures, adding to the end is usually an O(N) operation, while adding to the beginning is usually O(1). So the correct benchmark would be adding elements to the beginning (using Insert(0, x)) and then reversing the list at the end. Otherwise you will get quadratic cost on adding elements.

Howard Chu
12/06/2013 03:34 PM

You realize of course, that you're looking for an MVCC dictionary to interface to Voron, which is an MVCC dictionary. ???

Ayende Rahien
12/07/2013 06:51 PM

Alex, You have another bug here: https://gist.github.com/anonymous/7820394#file-immutableappendonlylist-cs-L205

The conditions should be reversed there.

Ayende Rahien
12/07/2013 06:55 PM

tx, We aren't using a list here, we are testing a dictionary, which doesn't really have the notion of adding at the beginning / end.

Ayende Rahien
12/07/2013 06:57 PM

Howard, Yes, I am. What I need to store is the page table for Voron. And that would NOT do well as a B+Tree with the same implementation semantics. To start with, the cost of a 4KB minimum per change is very prohibitive for the kind of things we do. And we don't need infinite scale; we know we can do it more efficiently if we concentrate on a memory resident collection.

Ayende Rahien
12/07/2013 06:57 PM

Patrick, I am afraid that I am not following your solution.

Ayende Rahien
12/07/2013 07:02 PM

Alex, Another issue here : https://gist.github.com/anonymous/7820394#file-immutableappendonlylist-cs-L194 You are decrementing count by 1, but nToRemove can be more than that.

Ayende Rahien
12/07/2013 08:06 PM

Alex, another issue here: https://gist.github.com/anonymous/7820394#file-immutableappendonlylist-cs-L82

You need to do i < _head + _count.

Ayende Rahien
12/07/2013 08:13 PM

Alex, Here is the fully fixed code: https://gist.github.com/ayende/7848030

alex
12/07/2013 11:49 PM

Well, that should teach me not to do copy-paste-strip-edit gists anymore.

The implementation I am using does a few additional things to keep track of both data sync & reader limits. I figured creating a gist of a modified version stripped down to the essentials would be a good idea. I should have checked the result a bit better, I guess...

Thanks for the corrections.

alex
12/08/2013 02:23 AM

There appears to be another copy-paste bug in there at https://gist.github.com/ayende/7848030#file-immutableappendonlylist-cs-L187

_count - 1 should be _count - nToRemove.

Patrick Huizinga
12/09/2013 09:41 AM

Ayende, Basically, the implementation is:

  * a concurrent dictionary with the current state
  * a single linked list / queue-without-head with change undoes.

When you start a read transaction, the latest node is part of its state. When you want to read a value, you first walk the nodes. (skip the node you remembered, because those changes happened before you started the transaction)

When you write, you append a node with the old values of the updates. Set this new node as the latest node. Let the GC clean up nodes that no transaction can reach anymore.

So when you want to write "newValue" to "key":

  1. append a node with ("key" : "oldValue")
  2. set the current node to this new node
  3. set "key" to "newValue" in the live dict.

This assumes the following is rare:

  T0: start read transaction + read a value
  T1: finish write transaction
  T2: read another value in the read transaction

Because only in such a situation is walking the nodes going to incur overhead, which will grow linearly with the number of items written / write transactions finished.
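A sketch of that undo-log scheme (types and names are invented for illustration; a single writer is assumed):

```csharp
using System;
using System.Collections.Concurrent;

// The live dict holds the newest values; each write prepends a node
// recording the value it replaced, so a reader that pinned an older
// node can recover the values as of its snapshot.
class UndoNode
{
    public long Key;
    public long OldValue;
    public bool Existed;   // false if the key was absent before the write
    public UndoNode Next;
}

class UndoLogMap
{
    private readonly ConcurrentDictionary<long, long> _live = new ConcurrentDictionary<long, long>();
    private UndoNode _latest = new UndoNode();  // sentinel

    public UndoNode PinSnapshot() => _latest;   // reader remembers this node

    public void Write(long key, long value)
    {
        bool existed = _live.TryGetValue(key, out var old);
        _latest = new UndoNode { Key = key, OldValue = old, Existed = existed, Next = _latest };
        _live[key] = value;
    }

    public bool TryRead(UndoNode snapshot, long key, out long value)
    {
        // Walk nodes newer than the snapshot; iterating newest-to-oldest and
        // letting the last match win leaves us with the OLDEST undo record,
        // i.e. the value as it was when the snapshot was taken.
        value = 0;
        bool found = false, existed = false;
        for (var n = _latest; n != snapshot; n = n.Next)
        {
            if (n.Key == key) { value = n.OldValue; existed = n.Existed; found = true; }
        }
        if (found) return existed;
        return _live.TryGetValue(key, out value);  // no newer write: live value is correct
    }

    static void Main()
    {
        var m = new UndoLogMap();
        m.Write(1, 100);
        var snap = m.PinSnapshot();
        m.Write(1, 999);  // newer write must stay invisible to snap
        m.TryRead(snap, 1, out var v);
        Console.WriteLine(v);  // 100
    }
}
```

Old nodes become unreachable once no pinned snapshot can reach them, which is what lets the GC do the cleanup.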

Ayende Rahien
12/09/2013 09:52 AM

Patrick, Yes, we ended up with something very similar.

Comments have been closed on this topic.