The design of managed memory map database

Aug 12 2013

The design of managed memory map database

time to read 3 min | 552 words

Since MMMD appears to be an airport in Mexico, I think that I’ll call this project ~~Nevar~~ Voron. It seems… appropriate, somehow.

I don’t learn by just reading code, I really need to do some writing to actually understand the concepts. In order to do that, I decided to try to create a managed implementation based on the ideas from LMDB. Note that I explicitly don’t try to do a port, I am trying to do something different, but based on the concepts that I learned from reading the code.

Goals:

Single process only – I have no need to do something that can be used by multiple processes.
No predefined file size – The reason LMDB does this is that it maps the entire file as a single continuous range. But there isn’t really any reason why you can grow the file and map the extra space in a different location.
Multiple trees (“tables”) – So we don’t have just a single B-Tree, we can have more than one. I won’t have a main DB exposed outside.
ACID, Transactional, MVCC with single writer
No File I/O – In LMDB, the code allows either standard file I/O or direct memory writes. The later is dangerous if you are using C, because you can overwrite and corrupt data, but in .NET, that isn’t an issue, since we don’t do direct memory access.
No duplicates – It really complicates the LMDB codebase, and I don’t think it is really needed.

Overall architecture:

One of the reasons I explicitly not trying to do a port is that I still don’t like the LMDB codebase. I think that it require a major refactoring, and the way it works is not really suitable for a managed implementation. The first major component is going to be the B-Tree. This is responsible for reading & writing data from a memory mapped file region. But it is focused only on that. Things like transactions, page allocations, free pages, concurrency and anything else is the responsibility of others. This is important because B-Tree is going to be useful for a lot of stuff. In LMDB, the B+Tree is used to maintain the free list, the main db, sub databases, etc. For that matter, just having a persistent B+Tree is going to be very useful in general.

After going away and playing with the code for a while, I can tell you that you need the following just to get started:

Page – represent a page in memory, of course. A page is a sorted list of items, where item can be a value or a reference to another page.
Tree – the actual tree implementation, responsible for such things as maintaining the tree, reads & searches. The only outside dependency it has is on Transaction, and that is only when it needs to allocate a pages.
Transaction – manages transactions, either read only or write. Handles page allocations. This one isn’t even started, because I want to see how far I can take the tree code before I need to actually complete it.

The basic idea is, I really like the ideas that LMDB uses, I just don’t like the codebase. So I want to see if I can do the same thing with a codebase that would be easier for me to work with.

And, of course, this is a cool project and quite interesting one.

0 comments

Oren Eini

Oren Eini

CEO of RavenDB