Ayende @ Rahien

Refunds available at head office

LevelDB & Windows: It ain’t a love story

I have been investigating the LevelDB project for the purpose of adding another storage engine to RavenDB. The good news is that there is a very strong likelihood that we can actually use that as a basis for what we want.

The bad news is that it is insanely easy to get LevelDB to compile and work on Linux, while doing the same on Windows appears to be an insurmountable barrier.

Yes, I know that I can get it working by just using a precompiled binary, but that won’t work. I actually want to make some changes there (mostly in the C API, right now).

These instructions appear to be no longer current. And this thread was promising, but didn’t lead anywhere.

I am going to go over the codebase with a fine-tooth comb, but I am no longer a C++ programmer, and the intricacies of the build system are putting up a very high roadblock of frustration.

Comments

Rafal
03/01/2013 08:36 AM by
Rafal

Is transaction support in leveldb sophisticated enough for RavenDB needs?

j23
03/01/2013 09:25 AM by
j23

@Ayende There are so many virtualization solutions (and cloud) nowadays. I don't think we should care if something works on Windows anymore.

Matt warren
03/01/2013 10:04 AM by
Matt warren

@rafal

LevelDB doesn't support transactions; you have to implement your own as a layer on top. See IndexedDB in Chrome for an example implementation.

LevelDB gives you atomic batched updates, but that's it.
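
To illustrate the idea of layering transactions on top of a store that only offers atomic batches, here is a minimal Python sketch. It is a toy under stated assumptions: `ToyStore` and `ToyTransaction` are hypothetical names, the store is an in-memory dictionary rather than LevelDB, and this is not how IndexedDB or RavenDB actually do it. Writes are buffered and then applied in one batch on commit; there is no isolation and no read-your-own-writes.

```python
import threading


class ToyStore:
    """Hypothetical in-memory stand-in for a store that only offers atomic batches."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def write_batch(self, ops):
        # Apply every operation while holding the lock, so readers going
        # through get() see either none or all of the batch.
        with self._lock:
            for op, key, value in ops:
                if op == "put":
                    self._data[key] = value
                else:
                    self._data.pop(key, None)


class ToyTransaction:
    """Hypothetical transaction layer: buffer writes, apply them as one batch."""

    def __init__(self, store):
        self._store = store
        self._pending = []

    def put(self, key, value):
        self._pending.append(("put", key, value))

    def delete(self, key):
        self._pending.append(("delete", key, None))

    def commit(self):
        ops, self._pending = self._pending, []
        self._store.write_batch(ops)

    def rollback(self):
        # Nothing has touched the store yet, so dropping the buffer is enough.
        self._pending = []
```

Nothing reaches the store until `commit()` is called, which is what makes `rollback()` trivial in this sketch.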

Daniel Marbach
03/01/2013 10:29 AM by
Daniel Marbach

Hi Oren, what is the promise of having another DB option? Do you plan to supersede Esent?

Daniel

Matt Warren
03/01/2013 10:35 AM by
Matt Warren

There are a couple of LevelDB ports that should compile in VS, see https://code.google.com/p/leveldbwin/ and https://code.google.com/r/kkowalczyk-leveldb/

However I think that's the problem with a Windows version of LevelDB, you're relying on someone porting the low-level parts (such as threading, mutexes and I/O) from the official Google Linux version.

Compared to Esent that MS officially supports and ships in every version of Windows, that's a big difference for something that you want to be robust and fully tested.

Matt Warren
03/01/2013 10:37 AM by
Matt Warren

You might also want to take a look at this https://groups.google.com/forum/#!topic/leveldb/g_fWOcIwNDM, it should save you some time

Ayende Rahien
03/01/2013 10:52 AM by
Ayende Rahien

Rafal, No, but we already have written the code to compensate for that.

Ayende Rahien
03/01/2013 10:53 AM by
Ayende Rahien

j23, It matters, a lot. To start with, we are developing mostly on Windows. Having to develop on a separate platform is a huge barrier.

Ayende Rahien
03/01/2013 10:53 AM by
Ayende Rahien

Daniel, I don't trust munin, and I would like to have something better.

Ayende Rahien
03/01/2013 10:55 AM by
Ayende Rahien

Matt, Those were both last updated in 2011, that is a bit too old for me. And I agree on the problem there.

Ayende Rahien
03/01/2013 10:55 AM by
Ayende Rahien

Matt, I know of leveldb-sharp, but it has major implementation issues making it unsuitable for what we want to do.

tobi
03/01/2013 01:37 PM by
tobi

Ayende, you don't trust Munin you say (which I understand). I'm curious: did you have higher expectations before starting the project? Did it turn out to be too costly to get Munin to a "trustworthy" state?

Ayende Rahien
03/01/2013 01:40 PM by
Ayende Rahien

Tobi, Yes, it was always going to be a toy thing, but it got pretty expensive when we ran into race conditions there.

Wal
03/01/2013 02:49 PM by
Wal

Ayende, I notice you already have an alternative to Esent (not recommended for production, I see) - why are you considering other engines such as LevelDB instead of pursuing your own? (for interest's sake)

Ayende Rahien
03/01/2013 03:38 PM by
Ayende Rahien

Wal, We want something that can run on Linux. And building a proper storage engine is HARD, I want to skip doing all the hard work and take something that is already known to be working.

Justin
03/01/2013 04:29 PM by
Justin

Having looked at Raven's ESENT table structure and the source code that interacts with it, I would think SQLite would be an obvious choice as an alternative fully cross platform extremely reliable and fast data engine.

Have you guys looked at SQLite for this purpose?

Rob Ashton
03/01/2013 04:51 PM by
Rob Ashton

You don't really want to use a storage engine which has already made decisions about concurrency control and transactions - Sqlite wouldn't be a good fit for the first come first served model preferred by Raven.

Rob Ashton
03/01/2013 05:10 PM by
Rob Ashton

Hey Oren, switch to Linux then you just have to type "make" ;-)

Rob Ashton
03/01/2013 05:18 PM by
Rob Ashton

and spend 10 hours fiddling with your graphics and network drivers - totally worth it

Justin
03/01/2013 05:41 PM by
Justin

Rob,

I would be interested to hear some more details on what is meant by "already made decisions" on SQLite vs ESENT.

Is it that SQLite only does read_uncommitted and serializable vs ESENT only doing snapshot isolation?

Control over transactions seems comparable between them.

Data types on table columns seems comparable except for no built in multi-value tagged types.

SQLite can run in memory, journal in memory, fsync off, etc. to get varying level of performance vs safety.

You can spread data across multiple files to increase concurrency among other tricks.

Hard to beat how well tested and how stable the file format is, seems like a good fit for an embedded cross platform db engine with good .Net bindings.

Of course Postgresql would give you all the concurrency, transactions and datatypes you could ask for including a full JSON type or other multi-value types, but you would need to spin up a separate process for it since it won't run in-process.
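
The SQLite settings mentioned above (journal in memory, fsync off) can be sketched with Python's standard `sqlite3` module. This is only an illustration of the performance vs safety trade-off, not a recommendation for RavenDB; the helper names `open_fast_unsafe` and `current_settings` are made up for this example. `PRAGMA journal_mode = MEMORY` and `PRAGMA synchronous = OFF` are real SQLite pragmas, and both trade durability after a crash for speed.

```python
import os
import sqlite3
import tempfile


def open_fast_unsafe(path):
    """Open a database with the rollback journal in memory and fsync disabled."""
    conn = sqlite3.connect(path)
    # Keep the rollback journal in RAM instead of in a file next to the database.
    conn.execute("PRAGMA journal_mode = MEMORY")
    # Do not wait for the OS to flush writes to disk.
    conn.execute("PRAGMA synchronous = OFF")
    return conn


def current_settings(conn):
    """Return (journal_mode, synchronous) as SQLite currently reports them."""
    journal = conn.execute("PRAGMA journal_mode").fetchone()[0]
    sync = conn.execute("PRAGMA synchronous").fetchone()[0]
    return journal, sync


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        conn = open_fast_unsafe(os.path.join(directory, "example.db"))
        print(current_settings(conn))
        conn.close()
```

Passing `":memory:"` as the path instead of a file name gives the fully in-memory mode that the comment also mentions.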

Matt Warren
03/01/2013 06:17 PM by
Matt Warren

@Justin

I thought that Sqlite had issues with multi-threaded access, see http://ayende.com/blog/3400/in-search-of-an-embedded-db

Rafal
03/01/2013 06:45 PM by
Rafal

Isn't SQLite using BerkeleyDB underneath?

Rob
03/01/2013 06:51 PM by
Rob

No, ESENT

Matt
03/01/2013 07:02 PM by
Matt

SQLite has either database-file-level locks or, in the case of shared cache mode, table-level locks. These are shared-reader / single-writer locks, but you can allow reads while writing with read uncommitted if needed.

See: http://www.sqlite.org/sharedcache.html

That mode would be ideal for Raven IMO.

And as I mentioned, you could spread each of the 18 Raven tables into separate files if needed, and you can still do joins across files, which I am not sure you guys do. But the table-level locking should give the same result.

This is obviously not as nice as snapshot isolation with MVCC that is given to you for free by more complex DB's like PostgreSQL or ESENT, but it looks like Mongo is doing just fine with a per database global lock:

http://docs.mongodb.org/manual/faq/concurrency/

Mongo's was process wide up until 2.2!
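
As a rough illustration of spreading tables across files and still joining them, here is a Python sketch using the standard `sqlite3` module and SQLite's `ATTACH DATABASE` statement. The file names, table names, and the `join_across_files` helper are invented for this example and have nothing to do with Raven's actual 18 tables.

```python
import os
import sqlite3
import tempfile


def join_across_files(directory):
    """Create two database files, attach one to the other, and join across them."""
    docs_path = os.path.join(directory, "documents.db")
    stats_path = os.path.join(directory, "stats.db")

    conn = sqlite3.connect(docs_path)
    # The second file becomes visible under the schema name "stats".
    conn.execute("ATTACH DATABASE ? AS stats", (stats_path,))

    conn.execute("CREATE TABLE main.documents (id TEXT PRIMARY KEY, body TEXT)")
    conn.execute("CREATE TABLE stats.doc_stats (id TEXT PRIMARY KEY, reads INTEGER)")
    conn.execute("INSERT INTO main.documents VALUES ('users/1', '{}')")
    conn.execute("INSERT INTO stats.doc_stats VALUES ('users/1', 3)")
    conn.commit()

    # A single query can reference tables that live in different files.
    rows = conn.execute(
        "SELECT d.id, s.reads "
        "FROM main.documents AS d "
        "JOIN stats.doc_stats AS s ON s.id = d.id"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        print(join_across_files(directory))
```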

Rob Ashton
03/01/2013 07:15 PM by
Rob Ashton

Hey, I totally mis-read the question above as I was on my iPhone :-)

WRT to Mongo doing DB-wide locks, that's not the same thing, it's easy to do DB-wide locks if you have a single writer thread and you don't care for fsync (as an example)

Justin
03/01/2013 07:21 PM by
Justin

Rob,

Not sure I follow, you can turn off fsync with SQLite and use a single writer thread as well; isn't that up to Raven?

Of course SQLite will let you use multiple writer threads but they will wait on the same table.
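
The "writers wait on each other" behaviour can be observed with a small Python sketch using the standard `sqlite3` module. It assumes SQLite's default mode (no shared cache), where the write lock covers the whole database file rather than a single table. The `second_writer_blocked` helper is hypothetical; the second connection uses `timeout=0` so that it fails immediately instead of waiting.

```python
import os
import sqlite3
import tempfile


def second_writer_blocked(path):
    """Return (blocked, row_count) for two connections writing to one new file."""
    # isolation_level=None lets us issue BEGIN / COMMIT ourselves.
    first = sqlite3.connect(path, isolation_level=None)
    second = sqlite3.connect(path, timeout=0, isolation_level=None)

    first.execute("CREATE TABLE docs (id TEXT PRIMARY KEY, body TEXT)")

    # The first writer takes the write lock and holds it.
    first.execute("BEGIN IMMEDIATE")
    first.execute("INSERT INTO docs VALUES ('a', '1')")

    try:
        second.execute("BEGIN IMMEDIATE")
        second.execute("ROLLBACK")
        blocked = False
    except sqlite3.OperationalError:
        # With timeout=0 the second writer gives up at once ("database is locked").
        blocked = True

    first.execute("COMMIT")

    # Once the first writer has committed, the second one can proceed.
    second.execute("BEGIN IMMEDIATE")
    second.execute("INSERT INTO docs VALUES ('b', '2')")
    second.execute("COMMIT")

    count = first.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
    first.close()
    second.close()
    return blocked, count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        print(second_writer_blocked(os.path.join(directory, "writers.db")))
```

With a non-zero `timeout`, the second writer would wait for the lock instead of raising, which is the serialization of writers that this part of the discussion is about.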

Rob Ashton
03/02/2013 12:01 AM by
Rob Ashton

That wasn't what I was saying, I was saying that Mongo gets away with its full-db locks because it works that way, and Raven doesn't work that way so it's a bad comparison.

Justin
03/02/2013 12:13 AM by
Justin

Rob,

My point is that is an internal implementation detail to Raven. From the outside Mongo and Raven are very similar and Mongo is able to provide adequate performance, arguably as good or better than Raven even though it has a global lock.

If Mongo can do this with a global lock so should Raven, and if so then you now have the option of using a very well tested and performant database engine that's cross platform.

If this is not possible because Raven's design is so tied to a storage engine having snapshot isolation, then I guess SQLite is not a viable choice and my original question is answered. It will be interesting to see what solution is chosen to accomplish this goal in the end.

JDice
03/02/2013 03:39 AM by
JDice

Ayende,

Is LevelDB going to provide better performance than Esent? If so, how much would you estimate?

Matt Johnson
03/02/2013 04:25 AM by
Matt Johnson

Did anything ever come of looking into BangDB?

Ayende Rahien
03/02/2013 06:32 AM by
Ayende Rahien

Justin, I agree, but SQLite is pretty bad at multi-threaded access, and that is something that we rely heavily on.

Ayende Rahien
03/02/2013 06:35 AM by
Ayende Rahien

Matt, That would NOT be ideal for RavenDB, actually. We really do need to have multiple concurrent writers at the same time. Just to give you some examples: map/reduce indexes, stats, replication, user writes. A lot of those generate concurrent writes. We actually need to make a lot of writes to the same "table" at the same time, and any storage solution we use has to reflect that.

And Mongo's decision to do that is... well, let us say that it caused a lot of problems for Mongo's users (search the forums), and Mongo doesn't have nearly as much background stuff as we do.

Ayende Rahien
03/02/2013 06:36 AM by
Ayende Rahien

JDice, As I can't run it yet, I really have no way to tell. Performance is something that we would like to improve, but I have no idea how / whether that will be the case.

Ayende Rahien
03/02/2013 06:37 AM by
Ayende Rahien

Matt, We tried looking at BangDB, I couldn't find the code. It is supposed to be an OSS project, and I couldn't find the code (and I looked). That is the point when I gave up.

Chris Wright
03/03/2013 08:41 AM by
Chris Wright

And you would like to have one storage engine at least that can work on all platforms that you want RavenDB to work on. Otherwise you could just use ESENT on Windows and LevelDB on Linux.

Ayende Rahien
03/03/2013 09:17 AM by
Ayende Rahien

Chris, That is a VERY important issue, yes.

Yitzchok
03/03/2013 10:39 AM by
Yitzchok

Did you check out https://github.com/hsn10/leveldb-mingw it seems like it is active.

j23
03/03/2013 08:30 PM by
j23

@ayende Question is why should Google care to support Leveldb on Windows ? I don't think they have any serious server software on non *nix

Ayende Rahien
03/04/2013 12:18 AM by
Ayende Rahien

j23, Wherever did I say that Google is obligated to do so?

Giorgi
03/04/2013 09:48 AM by
Giorgi

There is a Berkeley DB back-end to SQLite which supports multiple writers: http://stackoverflow.com/questions/2824135/how-fast-is-berkeley-db-sql-compared-to-sqlite

Ayende Rahien
03/04/2013 11:50 AM by
Ayende Rahien

Giorgi, I don't trust BDB at all. See my previous experiments with it.

Giorgi
03/05/2013 10:19 AM by
Giorgi

Ayende,

The bug that you encountered is fixed and there is a .NET binding for 5.3, so why not give it a try again?

Ayende Rahien
03/05/2013 11:07 AM by
Ayende Rahien

Giorgi, Which bug are you talking about? And I want to be able to compile & step through the code myself.

Giorgi
03/05/2013 11:45 AM by
Giorgi

The bug which you linked to at http://ayende.com/blog/3411/observations-on-embedded-databases

Ayende Rahien
03/05/2013 12:26 PM by
Ayende Rahien

Giorgi, I lost trust in that, and that is important. I prefer concentrating my effort on things that didn't fall off & die the first time I touched them.

Matthias Götzke
04/21/2013 06:30 AM by
Matthias Götzke

@ayende ... have you looked at https://github.com/bitcoin/bitcoin/tree/master/src/leveldb

btw we use sqlite right now and are fine with it, but are also looking into leveldb (but need cross platform)

there is also an article on that:

http://www.codeproject.com/Articles/569146/LevelDB-DLL-for-Windows-A-New-Approach-to-Exportin

Ayende Rahien
04/21/2013 08:44 AM by
Ayende Rahien

Matthias, I am sorry, but I didn't know about that. At this point, however, it isn't that relevant.

Comments have been closed on this topic.