Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:


+972 52-548-6969

, @ Q c

Posts: 6,128 | Comments: 45,551

filter by tags archive

Reviewing Lightning memory-mapped database libraryA thoughtful hiatus

time to read 2 min | 386 words

I thought that I would stop a bit from focusing on what the LMDB code is doing in favor to some observations about the code itself. Going into this codebase it like getting hit in the face with a shovel. Now, this might be my personal experience, as someone who has done a lot of managed code work in the past. But I used to be a pretty horrible C/C++ guy (the fact that I say C/C++ should tell you exactly what my level was).

But I don’t think that it was just that. Even beyond the fact that the code is C, and not C++ (which I am much more used to), there is a problem that only become clear to me well after I read the code for the millionth time. It grew. Looking at the way the code is structured, it looks like it was about as nice a C codebase as you can get (don’t get excited, that isn’t saying much). But overtime, features were added, but the general structure of the codebase wasn’t adjusted to account for that.

I am talking about things like this:




There are actually 22 (!) ‘if(IS_LEAF(mp))’ references in the codebase.

Or what about this?



It looks like certain features (duplicate keys support, for example) was added that had a lot of implication on the code, but it wasn’t refactored accordingly. It make it very hard to go through.

More posts in "Reviewing Lightning memory-mapped database library" series:

  1. (08 Aug 2013) MVCC
  2. (07 Aug 2013) What about free pages?
  3. (06 Aug 2013) Transactions & commits
  4. (05 Aug 2013) A thoughtful hiatus
  5. (02 Aug 2013) On page splits and other painful things
  6. (30 Jul 2013) On disk data
  7. (25 Jul 2013) Stepping through make everything easier
  8. (24 Jul 2013) going deeper
  9. (15 Jul 2013) Because, damn it!
  10. (12 Jul 2013) tries++
  11. (09 Jul 2013) Partial



Can this code repetition be a case of unwrapping what otherwise would have been function calls, making those things inline for performance reasons?

Ayende Rahien

Anton, No, I don't believe it. There are other ways to do that, see: http://gcc.gnu.org/onlinedocs/gcc/Inline.html

Howard Chu

Eh. Yes, cursornext is mostly a mirror image of cursorprev. Likewise cursorfirst / cursorlast. But I made a conscious choice to keep them separate. I could easily have unified them but there would be additional branching and special casing going on.

As for the LEAF2 cases - it's cheaper to have the two lines of repeated code than to go thru the overhead of a function call.

Ayende Rahien

Howard, You can do inline function, which has the same cost of repeating the code and not have duplicate code. Now, if you need to change something, you have to search for all the places where this is happening.

Comment preview

Comments have been closed on this topic.


  1. The worker pattern - 10 hours from now

There are posts all the way to May 30, 2016


  1. The design of RavenDB 4.0 (14):
    26 May 2016 - The client side
  2. RavenDB 3.5 whirl wind tour (14):
    25 May 2016 - Got anything to declare, ya smuggler?
  3. Tasks for the new comer (2):
    15 Apr 2016 - Quartz.NET with RavenDB
  4. Code through the looking glass (5):
    18 Mar 2016 - And a linear search to rule them
  5. Find the bug (8):
    29 Feb 2016 - When you can't rely on your own identity
View all series


Main feed Feed Stats
Comments feed   Comments Feed Stats