Ayende @ Rahien

Hi!
My name is Ayende Rahien
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969

@

Posts: 5,947 | Comments: 44,541

filter by tags archive

Raven.Munin


Raven.Munin is the actual implementation of a low level managed storage for RavenDB. I split it out of the RavenDB project because I intend to make use of it in additional projects.

At its core, Munin provides high performance transactional, non relational, data store written completely in managed code. The main point in writing it was to support the managed storage in RavenDB, but it is going to be used for Raven MQ as well, and probably a bunch of other stuff as well. I’ll post about Raven MQ in the future, so don’t bother asking about it.

Let us look at a very simple API example. First, we need to define a database:

public class QueuesStorage : Database
{
    public QueuesStorage(IPersistentSource persistentSource) : base(persistentSource)
    {
        Messages = Add(new Table(key => key["MsgId"], "Messages")
        {
            {"ByQueueName", key => key.Value<string>("QueueName")},
            {"ByMsgId", key => new ComparableByteArray(key.Value<byte[]>("MsgId"))}
        });

        Details = Add(new Table("Details"));
    }

    public Table Details { get; set; }

    public Table Messages { get; set; }
}

This is a database with two tables, Messages and Details. The Messages table has a primary key of MsgId, and two secondary indexes by queue name and by message id. The Details table is sorted by the key itself.

It is important to understand one very important concept about Munin. Data stored in it is composed of two parts. The key and the data. It is easier to explain when you look at the API:

bool Remove(JToken key);
Put(JToken key, byte[] value);
ReadResult Read(JToken key);

public class ReadResult
{
    public int Size { get; set; }
    public long Position { get; set; }
    public JToken Key { get; set; }
    public Func<byte[]> Data { get; set; }
}

Munin doesn’t really care about the data, it just saves it. But the key is important. In the Details table case, the table would be sorted by the full key. In the Messages table case, things are different. We use the lambda to extract the primary key from the key for each item, and we use additional lambdas to extract secondary indexes. Munin can build secondary indexes only from the key, not from the value. It is only the secondary indexes that allow range queries, the PK allow only direct access, which is why we define both the primary key and a secondary index on MsgId.

Let us see how we can save a new message:

public void Enqueue(string queue, byte[]data)
{
    messages.Put(new JObject
    {
        {"MsgId", uuidGenerator.CreateSequentialUuid().ToByteArray()},
        {"QueueName", queue},
    }, data);
}

And now we want to read it:

public Message Dequeue(Guid after)
{
    var key = new JObject { { "MsgId", after.ToByteArray()} };
    var result = messages["ByMsgId"].SkipAfter(key).FirstOrDefault();
    if (result == null)
        return null;

    var readResult = messages.Read(result);

    return new Message
    {
        Id = new Guid(readResult.Key.Value<byte[]>("MsgId")),
        Queue = readResult.Key.Value<string>("Queue"),
        Data = readResult.Data(),
    };
}

We do a bunch of stuff here, we scan the secondary index for the appropriate value, then get it from the actual index, load it into a DTO and return it.

Transactions

Munin is fully transactional, and it follows an append only, MVCC, multi reader single writer mode. The previous methods are run in the context of:

using (queuesStroage.BeginTransaction())
{
    // use the storage
    queuesStroage.Commit();
}

Storage on disk

Munin can work with either a file or in memory (which makes unit testing it a breeze). It uses an append only model, so on the disk it looks like this:image

Each of the red rectangle represent a separate transaction.

In memory data

You might have noted that we don’t keep any complex data structures on the disk. This is because all the actual indexing for the data is done in memory. The data on the disk is used solely for building the index in memory. Do note that the actual values are not held, only the keys. That means that search Munin indexes is lightning fast, since we never touch the disk for the search itself.

The data is actually held in an immutable binary tree, which gives us the ability to do MVCC reads, without any locking.

Compaction

Because Munin is using the append only model, it require periodic compaction. It does so automatically in RavenDB, waiting for periods of inactivity to do so.

Summary

Munin is a low level api, not something that you are likely to use directly. And it was explicitly modeled to give me an interface similar in capability to what Esent gives me, but in purely managed code.

Please note that is is released under the same license as RavenDB, AGPL.


Comments

Rafal

I like the name you selected

Patrik Svensson

So, when will we see "Raven.Hugin" :-)

Gian Maria

Great news :)

Alk.

Samuel Jack

So Ravens are the new Rhinos then?

Danny

You play Eve? :)

Dave

Do you also plan to replace Esent in current projecs like Rhino Queues and Rhino ESB with Munin? Saving sagas in Munin (and outside your domain database) might be a good use..

Louis Hau&#223;knecht

You mentioned, that for large scale applications, ESENT would be better suited, because of the fact that Munin stores keys in memory.

What scenario are you targeting with Munin (data volume wise)?

Ayende Rahien

Louis,

We currently use ~7 MB per 10,000 documents.

In other words, if you want to store 1,000,000 it would cost you 0.7 GB.

We can probably optimize it further, and the nature of the data structure that we means that we can handle paging of the in memory index pretty well.

Esent does much better once you go over tens of millions of documents, but until then, we are good.

Ayende Rahien

Dave,

I don't plan to do it, but someone else might

cowgaR

__Munin name is already used in well-known (popular) server/cloud monitoring application.

http://munin-monitoring.org/

-google- results a bit problematic in the future?

Tonio

Esent is very slow with bulk insert; PersistentDictionary based on esent is very very slow compared to Raven.ManagedStorage.Degenerate

Corey

Dave,

The license would be problematic for existing queues and esb users. It would essentially require them to purchase a raven db license and I don't think they would appreciate that. I would love to do the work if that were not the case.

Ayende Rahien

cowgaR,

Munin isn't intended to be a separate product.

Ayende Rahien

Tonio,

Actually, that depends on a lot of factors.

Esent is actually faster than Munin for bulk insert, because of the way we use it.

Boris Letocha

It is maybe too soon to write about it on such high profile blog, but let's try... I am trying also write managed key value db with ACID and MVCC, but it has much lower level interface compared to Munin. I would definitely liked to know if somebody finds it interesting.

https://github.com/Bobris/BTDB

Frank Quednau

Based on my yet limited understanding, this piece of software would work fairly well for persisting an Event Store...

I guess this thing ticks inside Raven MQ as well, in case you persist messages?

Royston Shufflebotham

Unfortunately, AGPL means that for the vast majority of situations in which I might consider deploying any of this, I can't.

(Same thing goes for RavenDB itself, really: the MongoDB approach to licensing (cross AGPL/Apache) [ http://blog.mongodb.org/post/103832439/the-agpl] is much more palatable and conducive to wider adoption.)

Shame, as it looks pretty cool...

Ayende Rahien

Royston,

I am aware of that, but I am more interesting in getting enough adoption to get people to pay for it than having total high adoption

Jimbo

@Royston,

The real question is whether Ayende's interpretation of the AGPL license being used is right?

Several other AGPL licensed apps clearly indicate that only mods to the core are what is being protected. For example both MongoDB and CiviCRM clearly interpret the AGPL as covering only the core app/db and not any application you develop that connects through the API.

Despite the explanation in an earlier blogpost on the issue it is hard to conceive how the license actually is any different for RavenDB, Munin than these.

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. RavenDB Sharding (3):
    22 May 2015 - Adding a new shard to an existing cluster, splitting the shard
  2. The RavenDB Comic Strip (2):
    20 May 2015 - Part II – a team in trouble!
  3. Challenge (45):
    28 Apr 2015 - What is the meaning of this change?
  4. Interview question (2):
    30 Mar 2015 - fix the index
  5. Excerpts from the RavenDB Performance team report (20):
    20 Feb 2015 - Optimizing Compare – The circle of life (a post-mortem)
View all series

RECENT COMMENTS

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats