Reviewing Lightning memory-mapped database library: Stepping through makes everything easier


Jul 25 2013

time to read 4 min | 624 words

Okay, I know that I have been critical of the LMDB codebase so far. But one thing that I really want to point out in its favor is that it was pretty easy to actually get things working on Windows. It wasn’t smooth, in the sense that I had to muck around with the source a bit (hack endianness, remove a bunch of Unix-specific header files, etc.). But that took less than an hour, and that was pretty much it. Since I am by no means an experienced C developer, I consider this a major win. Compare that to leveldb, which flat out wouldn’t run on Windows no matter how much time I spent trying, and it is a pleasure.

Also, stepping through the code, I am starting to get a sense of how it works that is very different from the one I had when I just read the code. It is like one of those 3D images: you suddenly see something.

The first thing that became obvious is that I totally missed the significance of the lock file. LMDB actually creates two files:

lock.mdb
data.mdb

Lock.mdb is used to synchronize data between different readers. It seems to mostly be there if you want to have multiple writers using different processes. That is a very interesting model for an embedded database, I have to admit. Not something that I think other embedded databases are offering. In order to do that, it creates two named mutexes (one for read and one for write).

A side note on Windows support:

LMDB supports Windows, but it is very much a 2nd class citizen. You can see it in things like a path not found error turning into a no such process error (because it tries to use GetLastError() codes as C error codes), or when it doesn’t create a directory even though it will fail if that directory is missing.

I am currently debugging through the code and fixing such issues as I go along (but no, I am not sending those fixes back: I am doing heavy-handed magic fixes just to get past this stage to the next one; otherwise I would have sent a pull request).
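
The mix-up is easy to see from the numbers alone: Win32's ERROR_PATH_NOT_FOUND has the value 3, and errno 3 in the C runtime is ESRCH, "No such process". Here is a minimal sketch of the kind of translation step that would avoid it. The WIN32_* constants are local stand-ins that carry the documented Win32 values so the code compiles on any platform, and win32_to_errno is a hypothetical helper, not something taken from LMDB.

```c
#include <errno.h>

/* Local stand-ins for the Win32 codes (same numeric values as in
   winerror.h), defined here so the sketch compiles on any platform. */
enum {
    WIN32_ERROR_FILE_NOT_FOUND = 2,
    WIN32_ERROR_PATH_NOT_FOUND = 3,
    WIN32_ERROR_ACCESS_DENIED  = 5
};

/* Hypothetical helper: translate a GetLastError()-style code into an
   errno value instead of passing the raw number through unchanged. */
int win32_to_errno(int win32_code)
{
    switch (win32_code) {
    case WIN32_ERROR_FILE_NOT_FOUND:
    case WIN32_ERROR_PATH_NOT_FOUND:
        return ENOENT;   /* "No such file or directory" */
    case WIN32_ERROR_ACCESS_DENIED:
        return EACCES;   /* "Permission denied" */
    default:
        return EIO;      /* catch-all for codes this sketch does not map */
    }
}
```

Passing the raw 3 to strerror() would report a missing process; passing win32_to_errno(3) reports a missing file or directory, which is what actually happened.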

Here is one such example. Here is the original code:

But ReadFile in Win32 will return false if the file is empty, so you actually need to write something like this to make the code work:

Past that hurdle, I think that I understand a lot more about how LMDB works than I did before.

Let us start with the way data.mdb works. It is important to note that for pretty much everything in LMDB we use the system page size. By default, that is 4KB.
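
As a rough illustration of what "using the system page size" means in practice, here is a small POSIX-only sketch that asks the OS for the page size and turns a page number into a file offset. The function names are mine, not LMDB's, and on Windows you would read dwPageSize from GetSystemInfo() instead of calling sysconf().

```c
#include <stddef.h>
#include <unistd.h>

/* Ask the OS for its page size (POSIX only). Returns 0 on failure. */
size_t system_page_size(void)
{
    long size = sysconf(_SC_PAGESIZE);
    return size > 0 ? (size_t)size : 0;
}

/* Byte offset of page number `pgno` in a file laid out in whole pages. */
size_t page_offset(size_t pgno, size_t page_size)
{
    return pgno * page_size;
}
```

With 4KB pages, page 0 starts at offset 0, page 1 at 4096, and page 2 at 8192.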

The data file starts with 2 pages allocated. Those pages contain the following information:

Looking back at how CouchDB did things, I am pretty sure that those two pages are going to be pretty important. I am guessing that they would always contain the root of the data in the file. There is also the last transaction on them, which is what I imagine determines how something gets committed. I don’t know yet; as I said, I am guessing based on how CouchDB works.

I’ll continue this review at another time. Next time, transactions…
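
To make that guess concrete, here is a sketch of how two header pages carrying a root and a last transaction id could be used to decide what is committed. This only illustrates the guess above and is not LMDB's actual layout: the struct, its field names, and both functions are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header page; field names are illustrative, not LMDB's. */
struct meta_page {
    uint64_t last_txn_id;  /* id of the transaction that wrote this page */
    uint64_t root_page;    /* page number of the data root as of that txn */
};

/* The committed state is whichever header the newest transaction wrote. */
const struct meta_page *current_meta(const struct meta_page *a,
                                     const struct meta_page *b)
{
    return (b->last_txn_id > a->last_txn_id) ? b : a;
}

/* A writer would overwrite the older header, so a crash in the middle of
   a commit still leaves the previous header (and its root) intact. */
const struct meta_page *meta_to_overwrite(const struct meta_page *a,
                                          const struct meta_page *b)
{
    return (current_meta(a, b) == a) ? b : a;
}
```

Under this scheme a reader never needs a lock to find the root: it reads both headers and follows the one with the higher transaction id.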

Comments

12 Jul 2013
16:42 PM

Howard Chu

No hacking of the files is needed if you compile on Windows using MinGW. And actually, I've also compiled this using MSVC without any trouble, so I don't know why you're going in hacking it apart.

And Windows is very much a 2nd class OS. 3rd class even. But despite that, there are companies like VMWare using LMDB in their Windows builds...

12 Jul 2013
16:54 PM

Ayende Rahien

Howard, see the error I have pointed out? There are quite a few of those that I managed to work around / ignore. That is what I meant by that. I am pretty sure that this is because most of the development is done elsewhere, so things might be broken on the Windows build without you noticing. None of that is something that would be very hard to fix, from what I saw.

25 Jul 2013
12:34 PM

Rodrigo Zechin

"That is a very interesting model for an embedded database, I’ve to admit. Not something that I think other embedded databases are offering"

SQLite also does that exceptionally well (at least for an embedded database).

25 Jul 2013
12:48 PM

Ayende Rahien

Rodrigo, SQLite does this by having global locks.

25 Jul 2013
12:58 PM

Rodrigo Zechin

To be honest, I never studied the SQLite source code in great detail. But I always believed that its "write-ahead-log" mode worked without global locks. http://www.sqlite.org/wal.html

25 Jul 2013
13:10 PM

Howard Chu

re: the above error, yes, you're right, the guy who patched that is not a Windows programmer. The code was correct when I wrote it originally, and I have fixed it again now that you've pointed it out.

SQLite is some pretty horrible code to read and modify. It took me over a day to retrofit LMDB into it. Most adaptations only take me a few hours.

An embedded database engine has more uses than just single-process applications. LMDB is being used in many large NoSQL servers as well as OpenLDAP slapd, systems where multi-process access is a big win. You can have multiple frontend processes serving requests from the same database, you can do message-passing thru the database, etc. etc...

25 Jul 2013
13:21 PM

Howard Chu

SQLite3 is an example of code written by a non-programmer; there is no concept of abstraction layer anywhere at all. High level code assumes it knows the exact byte layout of the data files, when it has no business mucking about in such low level details. There is no abstraction or isolation of internal data structures, they just go grubbing around wherever they like. To replace its storage engine requires gutting dozens of source files instead of just tweaking one or two interface definitions.

25 Jul 2013
14:14 PM

Jiggaboo

@Howard: Not very kind of you to comment about Windows being a 3rd class citizen when it has more than a billion installations and your code doesn't work on it. Also, it's kind of funny, after all the posts from Ayende about your code, to comment about SQLite's code being bad.

25 Jul 2013
14:45 PM

Howard Chu

The code builds perfectly fine on Windows using MinGW. And Windows is a 3rd class OS. http://www.gizmodo.com.au/2013/05/a-windows-developers-brutal-explanation-as-to-why-microsoft-is-falling-behind/

Just giving the facts. Whether you consider facts to be kind or unkind is irrelevant.

25 Jul 2013
16:19 PM

comment

Howard Chu: ok, but still your code looks like crap. optimisation(really?) > readability? nope. chuck testa.

25 Jul 2013
16:57 PM

Howard Chu

Your opinion. "Pretty" code that runs slowly is ugly, to me. Code that yields the correct answer, late, is as broken as code that yields the wrong answer, IMO. But hey, not everyone can be world's fastest. From writing the world's fastest software multiplier in college, to ethernet drivers and file servers, the world has improved tangibly from my work. It doesn't matter whether or not you think it's readable because a person like you will never have the responsibility to read it.

25 Jul 2013
17:12 PM

Jiggaboo

@Howard: I don't understand how that link answers why Windows is a 3rd class OS. Code building != code working. A great developer should know that. And that's bull... that "pretty" code is slow. Your code could be as fast as it is (if it is that is) and look good not. Hell, it's less readable than assembler.

25 Jul 2013
17:13 PM

Jiggaboo

I wanted to say: Your code could be as fast as it is (if it is that is) and look good.

25 Jul 2013
18:28 PM

Rafal

Guys, seriously, what's so 'ugly' about mdb code? It looks concise, clearly structured, documented just as much as necessary, well formatted - nothing to complain about imho.

25 Jul 2013
20:30 PM

JRB

If you want an easy way to use LevelDB on Windows, check out. https://github.com/jbandela/leveldb_cross_compiler

In releases, there is a .dll that works with Visual C++ 2013 and mingw gcc (same dll works on both). You just have to include the header files and copy the .dll to the app directory.

25 Jul 2013
20:30 PM

JRB

If you want an easy way to use LevelDB on Windows, check out. https://github.com/jbandela/leveldb_cross_compiler

In releases, there is a .dll that works with Visual C++ 2013 and mingw gcc (same dll works on both). You just have to include the header files and copy the .dll to the app directory.

25 Jul 2013
20:32 PM

JRB

Sorry double post. And URL got screwed up. The correct URL is above

Oren Eini

CEO of RavenDB
