Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.


Rob’s Sprint: The cost of getting data from LevelDB


We are currently investigating the usage of LevelDB as a storage engine in RavenDB. Two of the things that we feel very strongly about are transactions (LevelDB doesn’t have them) and performance (by a different definition than the one usually bandied about).

LevelDB does have atomicity, and the rest of ACID (the "CID") can be built atop that without too much complexity (already done, in fact). But we ran into an issue when looking at read performance. I am not sure whether this is unique to us, but in our scenario we typically deal with relatively large values; documents of several MB are quite common. That means that we are pretty sensitive to memory allocations. It doesn’t help that we have very little control over the Large Object Heap, so it was with great interest that we looked at how LevelDB does things.

Reading the actual code makes a lot of sense (more on that later; I will probably go through a big review of it). But there was one story that really didn’t make any sense to us: reading a value by key.

We started out using LevelDB Sharp:


This in turn results in the following getting called:


A few things to note here. All from the point of view of someone who deals with very large values.

  • valuePtr is not released, even though it was allocated by us.
  • We copy the value from valuePtr into a string, resulting in two copies of the data and twice the memory usage.
  • There is no way to get just partial data.
  • There is no way to get binary data (for example, encrypted data).
  • This is going to be putting a lot of pressure on the Large Object Heap.
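To make the first two bullets concrete, here is a minimal C++ sketch of the pattern, not the LevelDB Sharp code itself. `fake_get` is a hypothetical stand-in for a C-style getter that hands back a malloc'd buffer the caller owns, and the `std::string` plays the role of the managed string. The 4 MB size is an assumption chosen to resemble the documents described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Hypothetical stored value: a 4 MB "document".
static const std::string kStored(4 * 1024 * 1024, 'x');

// Hypothetical stand-in for a C-style getter: returns a malloc'd copy of the
// value and reports its length. The caller owns the returned buffer.
char* fake_get(size_t* value_len) {
    *value_len = kStored.size();
    char* buf = static_cast<char*>(std::malloc(kStored.size()));
    if (buf != nullptr) std::memcpy(buf, kStored.data(), kStored.size());
    return buf;
}

// Deleter so the native buffer is always released, even on an early return.
struct FreeDeleter {
    void operator()(char* p) const { std::free(p); }
};
using NativeBuffer = std::unique_ptr<char, FreeDeleter>;

// Copies the native buffer into a second, caller-side string and reports how
// many bytes were held at the same time while both copies were alive.
std::string read_value(size_t* peak_bytes) {
    size_t len = 0;
    NativeBuffer native(fake_get(&len));
    if (!native) {
        *peak_bytes = 0;
        return std::string();
    }
    std::string managed(native.get(), len);  // second full copy of the value
    *peak_bytes = len + managed.size();      // both copies are alive here
    return managed;                          // native buffer freed on scope exit
}
```

Calling `read_value` shows the value briefly occupying twice its size. Without the deleter, which is the situation the first bullet describes, the native copy would never be returned to the allocator.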

But wait, it actually gets better. Let us look at the LevelDB method that gets called:


So we are actually copying the data multiple times now. For fun, the db->rep->Get() call also copies the data. And that is pretty much where we stopped looking.
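The chain of copies is easier to see when every byte movement is counted. The sketch below is a simulation under stated assumptions, not LevelDB's source: `engine_get`, `c_api_get`, and `binding_get` are hypothetical stand-ins for the storage engine, the C wrapper, and the managed binding, each copying the value the way the text above describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

struct CopyStats {
    size_t copies = 0;  // number of full-value copies performed
    size_t bytes = 0;   // total bytes moved
};

// Every byte movement goes through here so that it can be counted.
void counted_copy(char* dst, const char* src, size_t n, CopyStats& stats) {
    std::memcpy(dst, src, n);
    stats.copies += 1;
    stats.bytes += n;
}

// Layer 1 (stand-in for the engine): copies the stored value into a
// caller-supplied std::string.
void engine_get(const std::string& stored, std::string* out, CopyStats& stats) {
    out->resize(stored.size());
    counted_copy(&(*out)[0], stored.data(), stored.size(), stats);
}

// Layer 2 (stand-in for the C wrapper): copies the temporary string into a
// malloc'd buffer that the caller must free.
char* c_api_get(const std::string& stored, size_t* len, CopyStats& stats) {
    std::string tmp;
    engine_get(stored, &tmp, stats);
    *len = tmp.size();
    char* result = static_cast<char*>(std::malloc(tmp.size()));
    if (result != nullptr) counted_copy(result, tmp.data(), tmp.size(), stats);
    return result;
}

// Layer 3 (stand-in for the managed binding): copies the native buffer into
// its own string, then releases the native buffer.
std::string binding_get(const std::string& stored, CopyStats& stats) {
    size_t len = 0;
    char* native = c_api_get(stored, &len, stats);
    if (native == nullptr) return std::string();
    std::string managed(len, '\0');
    counted_copy(&managed[0], native, len, stats);
    std::free(native);
    return managed;
}
```

In this simulation a single read of a 1 MB value moves 3 MB in total. The real numbers depend on what each layer does internally, which this sketch does not attempt to reproduce.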

We are going to need to write a new C API and export it so that we can make use of it from our C# code. Fun, or not.
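As a rough idea of the shape such an API could take, here is a small sketch. It is purely illustrative and is not the API we ended up writing: `kv_store`, `kv_value_size`, and `kv_read_range` are hypothetical names, and an in-memory map stands in for the real engine. The point is that the caller supplies the destination buffer, can ask for just a range, and gets raw bytes rather than a string. A real export would also wrap the functions in `extern "C"`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>
#include <vector>

// Hypothetical in-memory store standing in for the real engine.
struct kv_store {
    std::map<std::string, std::vector<uint8_t>> data;
};

// Reports the size of a value without copying it.
// Returns 0 on success, -1 if the key is missing.
int kv_value_size(const kv_store* store, const char* key, size_t key_len,
                  size_t* size_out) {
    auto it = store->data.find(std::string(key, key_len));
    if (it == store->data.end()) return -1;
    *size_out = it->second.size();
    return 0;
}

// Copies up to dst_len bytes, starting at offset, into a buffer owned by the
// caller. Returns 0 on success, -1 if the key is missing or the offset is
// past the end of the value. The byte count is reported through written_out.
int kv_read_range(const kv_store* store, const char* key, size_t key_len,
                  size_t offset, uint8_t* dst, size_t dst_len,
                  size_t* written_out) {
    auto it = store->data.find(std::string(key, key_len));
    if (it == store->data.end()) return -1;
    const std::vector<uint8_t>& value = it->second;
    if (offset > value.size()) return -1;
    size_t n = std::min(dst_len, value.size() - offset);
    if (n > 0) std::memcpy(dst, value.data() + offset, n);
    *written_out = n;
    return 0;
}
```

With this shape, a caller could read a large document in fixed-size chunks into a reusable buffer, which keeps any single allocation small and avoids the intermediate string entirely.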

More posts in "Rob’s Sprint" series:

  1. (08 Mar 2013) The cost of getting data from LevelDB
  2. (07 Mar 2013) Result Transformers
  3. (06 Mar 2013) Query optimizer jumped a grade
  4. (05 Mar 2013) Faster index creation
  5. (04 Mar 2013) Indexes and the death of temporary indexes
  6. (28 Feb 2013) Idly indexing


Rei Roldan

Reading the limitations list, do you really think it would be a good match for RavenDB?


Only a single process (possibly multi-threaded) can access a particular database at a time.

James Nugent

@Rei - AFAIK Esent (as currently used by RavenDB) is also accessible from one process at a time.


@James: yep, it does seem so "... The database file cannot be shared between multiple processes simultaneously..."

source: http://blogs.msdn.com/b/windowssdk/archive/2008/10/23/esent-extensible-storage-engine-api-in-the-windows-sdk.aspx

Rob Ashton

LevelDB handles concurrency for most operations across threads - which is what is required.

Matt Johnson

Why write a new API? Why not contribute changes to the existing projects instead?

Rob Ashton

Matt - it's a rabbit hole - when you only need 10% of the functionality and you'd have to do 100% of the work to get that 10% into the other projects.

Especially the LevelDB C API - do you really think they're going to take pull requests for this?


Here's something from the LevelDB benchmarks:

LevelDB doesn't perform as well with large values of 100,000 bytes each. This is because LevelDB writes keys and values at least twice: first time to the transaction log, and second time (during a compaction) to a sorted file. With larger values, LevelDB's per-operation efficiency is swamped by the cost of extra copies of large values.

Ayende Rahien

JDice, Sure, but that is something that you are going to have in any ACID db with large values.

Ido Samuelson

Instead of going the platform invoke way, work with Managed C++ and IJW. It will not only be easier to integrate but will also perform better.

Ayende Rahien

Ido, The problem is that managed C++ won't work with Mono.

Ido Samuelson


Take a look at this solution, looks interesting... http://tirania.org/blog/archive/2011/Dec-19.html
