Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.

Voron’s implementation: Managed memory mapped database–getting to C memory model


In my previous post I started to talk about the architecture of Voron. But it is actually still too early for that, since before anything else I want to create the low level interface for working with pages. And even before that, I had to decide how to actually represent a page in C#. In LMDB, this is easy because you can just access the memory using a pointer, and working with memory in this fashion is really natural in C. But in C#, that is much harder. It took me some trial and error, but I realized that I was trying to write C code in C#, and that isn’t going to work. Instead, I have to write native C#, which is similar, but different. In C, I can just pass around a pointer and start doing evil things to it at will. In C#, there are a lot of roadblocks along the way that prevent you from doing that.

So I ended up with this:

    [StructLayout(LayoutKind.Explicit, Pack = 1)]
    public struct PageHeader
    {
        [FieldOffset(0)]
        public int PageNumber;
        [FieldOffset(4)]
        public PageFlags Flags;

        [FieldOffset(5)]
        public ushort Lower;
        [FieldOffset(7)]
        public ushort Upper;

        // Shares offset 5 with Lower and Upper, so these fields overlap in memory.
        [FieldOffset(5)]
        public int NumberOfPages;
    }
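One way to sanity-check a layout like this is to ask the runtime how big the struct is and to watch the overlapping fields alias each other. The sketch below assumes PageFlags is a byte-backed enum, which is what would make offset 5 the next free byte; the post does not show the enum, so its member names here are made up for illustration, as are LayoutCheck and its methods. It has to be compiled with unsafe code enabled.

```csharp
using System;
using System.Runtime.InteropServices;

// Assumption: PageFlags is one byte wide. The post does not show this enum,
// and the member names below are hypothetical.
public enum PageFlags : byte
{
    Branch = 1,
    Leaf = 2,
    Overflow = 4
}

[StructLayout(LayoutKind.Explicit, Pack = 1)]
public struct PageHeader
{
    [FieldOffset(0)] public int PageNumber;
    [FieldOffset(4)] public PageFlags Flags;
    [FieldOffset(5)] public ushort Lower;
    [FieldOffset(7)] public ushort Upper;
    [FieldOffset(5)] public int NumberOfPages; // overlaps Lower and Upper
}

// Hypothetical helper, not part of Voron.
public static class LayoutCheck
{
    // Size of the header in bytes, as the runtime lays it out.
    public static unsafe int HeaderSize()
    {
        return sizeof(PageHeader);
    }

    // Writes Lower and Upper, then reads the same four bytes back
    // through the overlapping NumberOfPages field.
    public static int OverlapDemo(ushort lower, ushort upper)
    {
        var header = new PageHeader();
        header.Lower = lower;
        header.Upper = upper;
        return header.NumberOfPages;
    }
}
```

With Pack = 1 and the last field ending at byte 9, HeaderSize() should come back as 9. What OverlapDemo returns depends on byte order: on a little-endian machine Lower supplies the low 16 bits of NumberOfPages and Upper the high 16 bits.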

This is the memory layout that I want. But there is no way in C# to say: allocate this struct at this location. Luckily, I found a workaround.

    public unsafe class Page
    {
        private readonly byte* _base;
        private readonly PageHeader* _header;

        public Page(byte* b)
        {
            _base = b;
            _header = (PageHeader*)b; // same address, viewed as a header
        }

        public int PageNumber { get { return _header->PageNumber; } set { _header->PageNumber = value; } }

        // ... the rest of the page's members
    }

What I can do is have a Page class that manages this for me. I just give the class a pointer, and it can treat the memory either as a PageHeader* or as a raw byte array. This also means that I get to do cool tricks, like having an array backed directly by the page memory:

    public ushort* KeysOffsets
    {
        get { return (ushort*)(_base + Constants.PageHeaderSize); }
    }

The Page class has a lot of methods related to managing the page itself, and it should abstract away all the pesky details of actually working with memory directly.
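To make this concrete, here is a minimal sketch of a Page wrapped around a block of unmanaged memory. It is not the real Voron code: the PageFlags member, the 9 byte Constants.PageHeaderSize, the 4096 byte page size and the PageDemo helper are all assumptions for the sake of the example, and Marshal.AllocHGlobal stands in for a memory mapped file. It has to be compiled with unsafe code enabled.

```csharp
using System;
using System.Runtime.InteropServices;

public enum PageFlags : byte { Leaf = 1 } // hypothetical, not shown in the post

[StructLayout(LayoutKind.Explicit, Pack = 1)]
public struct PageHeader
{
    [FieldOffset(0)] public int PageNumber;
    [FieldOffset(4)] public PageFlags Flags;
    [FieldOffset(5)] public ushort Lower;
    [FieldOffset(7)] public ushort Upper;
    [FieldOffset(5)] public int NumberOfPages;
}

public static class Constants
{
    public const int PageHeaderSize = 9; // assumption: header is the first 9 bytes
    public const int PageSize = 4096;    // assumption: page size for this sketch
}

public unsafe class Page
{
    private readonly byte* _base;
    private readonly PageHeader* _header;

    public Page(byte* b)
    {
        _base = b;
        _header = (PageHeader*)b;
    }

    public int PageNumber { get { return _header->PageNumber; } set { _header->PageNumber = value; } }

    // The "array" starts right after the header, inside the page itself.
    public ushort* KeysOffsets
    {
        get { return (ushort*)(_base + Constants.PageHeaderSize); }
    }
}

// Hypothetical helper, not part of Voron.
public static class PageDemo
{
    // Writes through a Page, then copies the start of the raw buffer back out
    // so the caller can see that the writes landed in the page memory.
    public static unsafe byte[] WriteThroughPage()
    {
        IntPtr buffer = Marshal.AllocHGlobal(Constants.PageSize);
        try
        {
            var page = new Page((byte*)buffer.ToPointer());
            page.PageNumber = 3;
            page.KeysOffsets[0] = 4000;
            page.KeysOffsets[1] = 3900;

            var raw = new byte[Constants.PageHeaderSize + 2 * sizeof(ushort)];
            Marshal.Copy(buffer, raw, 0, raw.Length);
            return raw;
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

Reading the returned bytes with BitConverter.ToInt32(raw, 0) and BitConverter.ToUInt16(raw, 9) gives back the page number and the first key offset, which shows that the Page instance holds no copy of the data, only pointers into it.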

So, this isn’t a really informative post, I fear. But it did take me a few hours to come up with the approach that I wanted. I kept trying to find ways to embed addresses inside the PageHeader, until I realized that I can just hold them externally. Note that all the important state about the page is actually stored in the page memory, outside the Page class, so it is shared globally, and you can have two Page instances that point to the same place. The Page class itself is where I decided to put local state, things like where we are currently positioned in the page, for example.
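The split between shared and local state can be sketched like this. The Page below is a cut-down stand-in that only knows about the page number in the first four bytes, and LastSearchPosition and SharedStateDemo are hypothetical names, not necessarily what Voron uses. Again, this needs unsafe code enabled.

```csharp
using System;
using System.Runtime.InteropServices;

// Cut-down stand-in for the Page class, just enough to show the idea.
public unsafe class Page
{
    private readonly byte* _base;

    public Page(byte* b)
    {
        _base = b;
    }

    // Shared state: lives in the page memory itself (the first 4 bytes).
    public int PageNumber
    {
        get { return *(int*)_base; }
        set { *(int*)_base = value; }
    }

    // Local state: lives only in this managed object. Hypothetical name.
    public int LastSearchPosition { get; set; }
}

// Hypothetical helper, not part of Voron.
public static class SharedStateDemo
{
    // Returns what a second Page instance sees after the first one is modified.
    public static unsafe int[] Run()
    {
        IntPtr buffer = Marshal.AllocHGlobal(4096);
        try
        {
            var a = new Page((byte*)buffer.ToPointer());
            var b = new Page((byte*)buffer.ToPointer());

            a.PageNumber = 7;         // goes to the shared page memory
            a.LastSearchPosition = 2; // stays on instance a only

            return new[] { b.PageNumber, b.LastSearchPosition };
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}
```

Run() returns 7 and 0: the second instance sees the page number written through the first one, but it keeps its own search position.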



"ushort Lower" at offset 5 will cause unaligned access. Not sure if that is a performance problem. At least the CLR and C compilers never do this on their own.


Guess I better start reading up on unsafe contexts and pointers to keep up with this blog post series! I'm looking forward to more posts like this!


Doing more research on this (http://lemire.me/blog/archives/2012/05/31/data-alignment-for-speed-myth-or-reality/) it seems there is no performance difference at all. Why are compilers avoiding this like the plague then?

Ayende Rahien

Tobi, look at the comments there, this is a fairly recent development.

Drew Noakes

Depending upon how you use the Page class, you can avoid the overhead of having a reference type altogether by just casting the byte* to a PageHeader* and then moving your methods/properties from the class to your struct directly. In that way, they will be manipulating the underlying byte[] directly, and there's nothing for the GC to clean up at the end of it.

For example:

    var page = (PageHeader*)bytePointer;
    page->PageNumber = 2;

No copying of memory. No GC.

Drew Noakes

BTW thanks for the great talk in London last night. I really enjoyed your review of the different databases you researched, and the detail on the improvements you've made with Voron.

Ayende Rahien

Drew, There is a non persistent state that I need to keep. For example, the last search pos in the page. I keep that in the Page class for that purpose.
