Rhino.DHT – Persistent & Distributed Storage

time to read 4 min | 796 words

This is an interesting project, even if I say so myself. For several reasons, I need a fast and easy key/value store. After looking a bit around to see what there is out there, I decided that I might as well write my own. I already wrote a distributed cache in the past, and it is a very simple project to do. The twist for now was that I had to do this in a persistent manner. A system restart should not rewipe the cache.

More than that, I wanted to duplicate some of the more interesting aspects of Amazon’s Dynamo of multi valued keys. I usually like to claim that the API that I strive for is simple. In this case, I don’t think that I can say so. It is simple, if you understand that there is a small twist. It is not just a simple distributed hashtable, we have the notion of versioned concurrency here as well.

Here is the class diagram:

image

The first thing to pay attention to is that every piece of the API contains the notion of a version. More than that, it usually contain the notion of versions.

I think that the best way of describing this would be with picture, so let us imagine the DHT initially.

image

It is empty, has no values, cleared of any versions. And the Actor A had to interfere and put  value in my DHT…

image

Actor A puts a value keyed ‘a’ with no previous versions. The DHT tag the key/value pair with the version 1 and move on with its life. Sometimes later Actor B comes along and decides to also put a value keyed by ‘a’ with no previous version…

image

The DHT noticed that it already have a key named ‘a’ that isn’t a direct ancestor of the new key, so it create a new version and store it. But what about the first version? Don’t we have concurrency issue here?

Indeed we do. Let us see how the world looks like to Actor C, who just wanted to check on the value of ‘a’.

image

When Actor C asks for the value of ‘a’, it gets a surprise, it gets two versions of ‘a’. #1 and #2. It is now Actor C’s responsibility to merge them somehow and send the updated version to the DHT.

image

Now, Actor C is saving key ‘a’ with a set of previous versions (#1 and #2). When the DHT notice that we write a value that is covering all the existing values, it will remove the previous versions and keep the latest one.

Now, at this point Actor B wakes up and decides that it also wants to save something for ‘a’, and it tries to save base on the #2 version.

image 

At that point, we go to the same behavior, we now have no direct ancestry between the existing value and the new value. As such, we maintain both of them until a client is kind enough to merge them.

Hopefully, by now I was able to explain what I am doing and why the notion of versions is such an important concept.

Other interesting aspects of the project is the distribution part, here we follow the pretty common memcached mode, of isolated nodes that are brought together by the client library. That isn’t really interesting, but it is worth mentioning.

Finally, persistence in this case is using Esent. Esent is the embedded DB that is embedded (fun intended) into Windows. That was my first real foray into implementing production level code with Esent. The code is very low level, but it is both fun to write and give you the sense of incredible power.