Reviewing Lightning memory-mapped database library: Stepping through make everything easier
Okay, I know that I have been critical of the LMDB codebase so far. But one thing that I really want to point out in its favor is that it was pretty easy to actually get things working on Windows. It wasn’t smooth, in the sense that I had to muck around with the source a bit (hack endianness, remove a bunch of Unix-specific header files, etc). But that took less than an hour, and that was pretty much it. Since I am by no means an experienced C developer, I consider this a major win. Compare that to leveldb, which flat out won’t run on Windows no matter how much time I spent trying, and it is a pleasure.
Also, stepping through the code I am starting to get a sense of how it works that is much different than the one I had when I just read the code. It is like one of those 3D images, you suddenly see something.
The first thing that became obvious is that I totally missed the significance of the lock file. LMDB actually creates two files:
- lock.mdb
- data.mdb
Lock.mdb is used to synchronize data between different readers. It seems to be there mostly for the case where you want to have multiple writers using different processes. That is a very interesting model for an embedded database, I have to admit. Not something that I think other embedded databases are offering. In order to do that, it creates two named mutexes (one for read and one for write).
A side note on Windows support:
LMDB supports Windows, but it is very much a 2nd class citizen. You can see it in things like a path not found error turning into a no such process error (because it tries to use GetLastError() codes as C error codes), or when it doesn’t create a directory even though things fail if the directory is missing.
I am currently debugging through the code and fixing such issues as I go along. I am not sending a pull request, though: these are heavy-handed hacks, meant only to get me past this stage and on to the next one.
Here is one such example. Here is the original code:
But ReadFile in Win32 will return false if the file is empty, so you actually need to write something like this to make the code work:
Past that hurdle, I think that I understand a lot more about how LMDB works than I did before.
Let us start with the way data.mdb works. It is important to note that for pretty much everything in LMDB we use the system page size. By default, that is 4KB.
The data file starts with 2 pages allocated. Those pages contain the following information:
Looking back at how CouchDB did things, I am pretty sure that those two pages are going to be pretty important. I am guessing that they would always contain the root of the data in the file. There is also the last transaction on them, which is what I imagine determines how something gets committed. I don’t know yet; as I said, I am guessing based on how CouchDB works.
I’ll continue this review another time. Next time, transactions…
More posts in "Reviewing Lightning memory-mapped database library" series:
- (08 Aug 2013) MVCC
- (07 Aug 2013) What about free pages?
- (06 Aug 2013) Transactions & commits
- (05 Aug 2013) A thoughtful hiatus
- (02 Aug 2013) On page splits and other painful things
- (30 Jul 2013) On disk data
- (25 Jul 2013) Stepping through make everything easier
- (24 Jul 2013) going deeper
- (15 Jul 2013) Because, damn it!
- (12 Jul 2013) tries++
- (09 Jul 2013) Partial
Comments
No hacking of the files is needed if you compile on Windows using MinGW. And actually, I've also compiled this using MSVC without any trouble, so I don't know why you're going in and hacking it apart.
And Windows is very much a 2nd class OS. 3rd class even. But despite that, there are companies like VMWare using LMDB in their Windows builds...
Howard, see the error I have pointed out? There are quite a few of those that I managed to work around / ignore. That is what I meant by that. I am pretty sure that this is because most of the development is done elsewhere, so things might be broken on the Windows build without you noticing. None of that is something that would be very hard to fix, from what I saw.
"That is a very interesting model for an embedded database, I’ve to admit. Not something that I think other embedded databases are offering"
SQLite also does that exceptionally well (at least for an embedded database).
Rodrigo, SQLite does this by having global locks.
To be honest, I never studied the SQLite source code in deep detail. But I always believed that its "write-ahead-log" mode worked without global locks. http://www.sqlite.org/wal.html
re: the above error, yes, you're right, the guy who patched that is not a Windows programmer. The code was correct when I wrote it originally, and I have fixed it again now that you've pointed it out.
SQLite is some pretty horrible code to read and modify. It took me over a day to retrofit LMDB into it. Most adaptations only take me a few hours.
An embedded database engine has more uses than just single-process applications. LMDB is being used in many large NoSQL servers as well as OpenLDAP slapd, systems where multi-process access is a big win. You can have multiple frontend processes serving requests from the same database, you can do message-passing thru the database, etc. etc...
SQLite3 is an example of code written by a non-programmer; there is no concept of abstraction layer anywhere at all. High level code assumes it knows the exact byte layout of the data files, when it has no business mucking about in such low level details. There is no abstraction or isolation of internal data structures, they just go grubbing around wherever they like. To replace its storage engine requires gutting dozens of source files instead of just tweaking one or two interface definitions.
@Howard: Not very kind of you to comment about Windows being a 3rd class citizen when it has more than a billion installations and your code doesn't work on it. Also it's kind of funny, after all the posts from Ayende about your code, to comment about SQLite's code being bad.
The code builds perfectly fine on Windows using MinGW. And Windows is a 3rd class OS. http://www.gizmodo.com.au/2013/05/a-windows-developers-brutal-explanation-as-to-why-microsoft-is-falling-behind/
Just giving the facts. Whether you consider facts to be kind or unkind is irrelevant.
Howard Chu: ok, but still your code looks like crap. optimisation(really?) > readability? nope. chuck testa.
Your opinion. "Pretty" code that runs slowly is ugly, to me. Code that yields the correct answer, late, is as broken as code that yields the wrong answer, IMO. But hey, not everyone can be world's fastest. From writing the world's fastest software multiplier in college, to ethernet drivers and file servers, the world has improved tangibly from my work. It doesn't matter whether or not you think it's readable because a person like you will never have the responsibility to read it.
@Howard: I don't understand how that link answers why Windows is a 3rd class OS. Code building != code working. So a great developer should know that. And that's bull... that "pretty" code is slow. Your code could be as fast as it is (if it is that is) and look good not. Hell it's less readable than assembler.
I wanted to say: Your code could be as fast as it is (if it is that is) and look good.
Guys, seriously, what's so 'ugly' about mdb code? It looks concise, clearly structured, documented just as much as necessary, well formatted - nothing to complain about imho.
If you want an easy way to use LevelDB on Windows, check out. https://github.com/jbandela/leveldb_cross_compiler
In releases, there is a .dll that works with Visual C++ 2013 and mingw gcc (same dll works on both). You just have to include the header files and copy the .dll to the app directory.