Ayende @ Rahien

Refunds available at head office

Raven.Munin

Raven.Munin is the actual implementation of a low level managed storage for RavenDB. I split it out of the RavenDB project because I intend to make use of it in additional projects.

At its core, Munin provides high performance transactional, non relational, data store written completely in managed code. The main point in writing it was to support the managed storage in RavenDB, but it is going to be used for Raven MQ as well, and probably a bunch of other stuff as well. I’ll post about Raven MQ in the future, so don’t bother asking about it.

Let us look at a very simple API example. First, we need to define a database:

public class QueuesStorage : Database
{
    public QueuesStorage(IPersistentSource persistentSource) : base(persistentSource)
    {
        Messages = Add(new Table(key => key["MsgId"], "Messages")
        {
            {"ByQueueName", key => key.Value<string>("QueueName")},
            {"ByMsgId", key => new ComparableByteArray(key.Value<byte[]>("MsgId"))}
        });

        Details = Add(new Table("Details"));
    }

    public Table Details { get; set; }

    public Table Messages { get; set; }
}

This is a database with two tables, Messages and Details. The Messages table has a primary key of MsgId, and two secondary indexes by queue name and by message id. The Details table is sorted by the key itself.

It is important to understand one very important concept about Munin. Data stored in it is composed of two parts. The key and the data. It is easier to explain when you look at the API:

bool Remove(JToken key);
Put(JToken key, byte[] value);
ReadResult Read(JToken key);

public class ReadResult
{
    public int Size { get; set; }
    public long Position { get; set; }
    public JToken Key { get; set; }
    public Func<byte[]> Data { get; set; }
}

Munin doesn’t really care about the data, it just saves it. But the key is important. In the Details table case, the table would be sorted by the full key. In the Messages table case, things are different. We use the lambda to extract the primary key from the key for each item, and we use additional lambdas to extract secondary indexes. Munin can build secondary indexes only from the key, not from the value. It is only the secondary indexes that allow range queries, the PK allow only direct access, which is why we define both the primary key and a secondary index on MsgId.

Let us see how we can save a new message:

public void Enqueue(string queue, byte[]data)
{
    messages.Put(new JObject
    {
        {"MsgId", uuidGenerator.CreateSequentialUuid().ToByteArray()},
        {"QueueName", queue},
    }, data);
}

And now we want to read it:

public Message Dequeue(Guid after)
{
    var key = new JObject { { "MsgId", after.ToByteArray()} };
    var result = messages["ByMsgId"].SkipAfter(key).FirstOrDefault();
    if (result == null)
        return null;

    var readResult = messages.Read(result);

    return new Message
    {
        Id = new Guid(readResult.Key.Value<byte[]>("MsgId")),
        Queue = readResult.Key.Value<string>("Queue"),
        Data = readResult.Data(),
    };
}

We do a bunch of stuff here, we scan the secondary index for the appropriate value, then get it from the actual index, load it into a DTO and return it.

Transactions

Munin is fully transactional, and it follows an append only, MVCC, multi reader single writer mode. The previous methods are run in the context of:

using (queuesStroage.BeginTransaction())
{
    // use the storage
    queuesStroage.Commit();
}

Storage on disk

Munin can work with either a file or in memory (which makes unit testing it a breeze). It uses an append only model, so on the disk it looks like this:image

Each of the red rectangle represent a separate transaction.

In memory data

You might have noted that we don’t keep any complex data structures on the disk. This is because all the actual indexing for the data is done in memory. The data on the disk is used solely for building the index in memory. Do note that the actual values are not held, only the keys. That means that search Munin indexes is lightning fast, since we never touch the disk for the search itself.

The data is actually held in an immutable binary tree, which gives us the ability to do MVCC reads, without any locking.

Compaction

Because Munin is using the append only model, it require periodic compaction. It does so automatically in RavenDB, waiting for periods of inactivity to do so.

Summary

Munin is a low level api, not something that you are likely to use directly. And it was explicitly modeled to give me an interface similar in capability to what Esent gives me, but in purely managed code.

Please note that is is released under the same license as RavenDB, AGPL.

Comments

Rafal
11/08/2010 02:58 PM by
Rafal

I like the name you selected

Patrik Svensson
11/08/2010 03:05 PM by
Patrik Svensson

So, when will we see "Raven.Hugin" :-)

Gian Maria
11/08/2010 03:06 PM by
Gian Maria

Great news :)

Alk.

Samuel Jack
11/08/2010 03:11 PM by
Samuel Jack

So Ravens are the new Rhinos then?

Danny
11/08/2010 03:12 PM by
Danny

You play Eve? :)

Dave
11/08/2010 03:25 PM by
Dave

Do you also plan to replace Esent in current projecs like Rhino Queues and Rhino ESB with Munin? Saving sagas in Munin (and outside your domain database) might be a good use..

Louis Hau&#223;knecht
11/08/2010 03:29 PM by
Louis Haußknecht

You mentioned, that for large scale applications, ESENT would be better suited, because of the fact that Munin stores keys in memory.

What scenario are you targeting with Munin (data volume wise)?

Ayende Rahien
11/08/2010 04:53 PM by
Ayende Rahien

Louis,

We currently use ~7 MB per 10,000 documents.

In other words, if you want to store 1,000,000 it would cost you 0.7 GB.

We can probably optimize it further, and the nature of the data structure that we means that we can handle paging of the in memory index pretty well.

Esent does much better once you go over tens of millions of documents, but until then, we are good.

Ayende Rahien
11/08/2010 04:54 PM by
Ayende Rahien

Dave,

I don't plan to do it, but someone else might

cowgaR
11/08/2010 05:49 PM by
cowgaR

__Munin name is already used in well-known (popular) server/cloud monitoring application.

http://munin-monitoring.org/

-google- results a bit problematic in the future?

Tonio
11/08/2010 05:54 PM by
Tonio

Esent is very slow with bulk insert; PersistentDictionary based on esent is very very slow compared to Raven.ManagedStorage.Degenerate

Corey
11/08/2010 06:21 PM by
Corey

Dave,

The license would be problematic for existing queues and esb users. It would essentially require them to purchase a raven db license and I don't think they would appreciate that. I would love to do the work if that were not the case.

Ayende Rahien
11/08/2010 09:02 PM by
Ayende Rahien

cowgaR,

Munin isn't intended to be a separate product.

Ayende Rahien
11/08/2010 09:03 PM by
Ayende Rahien

Tonio,

Actually, that depends on a lot of factors.

Esent is actually faster than Munin for bulk insert, because of the way we use it.

Boris Letocha
11/08/2010 09:36 PM by
Boris Letocha

It is maybe too soon to write about it on such high profile blog, but let's try... I am trying also write managed key value db with ACID and MVCC, but it has much lower level interface compared to Munin. I would definitely liked to know if somebody finds it interesting.

https://github.com/Bobris/BTDB

Frank Quednau
11/16/2010 07:47 AM by
Frank Quednau

Based on my yet limited understanding, this piece of software would work fairly well for persisting an Event Store...

I guess this thing ticks inside Raven MQ as well, in case you persist messages?

Royston Shufflebotham
11/18/2010 04:26 PM by
Royston Shufflebotham

Unfortunately, AGPL means that for the vast majority of situations in which I might consider deploying any of this, I can't.

(Same thing goes for RavenDB itself, really: the MongoDB approach to licensing (cross AGPL/Apache) [ http://blog.mongodb.org/post/103832439/the-agpl] is much more palatable and conducive to wider adoption.)

Shame, as it looks pretty cool...

Ayende Rahien
11/18/2010 04:34 PM by
Ayende Rahien

Royston,

I am aware of that, but I am more interesting in getting enough adoption to get people to pay for it than having total high adoption

Jimbo
12/02/2010 05:12 PM by
Jimbo

@Royston,

The real question is whether Ayende's interpretation of the AGPL license being used is right?

Several other AGPL licensed apps clearly indicate that only mods to the core are what is being protected. For example both MongoDB and CiviCRM clearly interpret the AGPL as covering only the core app/db and not any application you develop that connects through the API.

Despite the explanation in an earlier blogpost on the issue it is hard to conceive how the license actually is any different for RavenDB, Munin than these.

Comments have been closed on this topic.