Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.

A persistence problem, irony, your name is…


The major goal that I had in mind for the profiler was online development usage. That is, do something, check the profiler, clear it, do something else, etc. One of the things that I am finding out is that people use it a lot more as a dumping ground. They push a lot of information into it, and then want to sift through that and look at how the application behaves as a whole, not just in a single scenario.

Surprisingly, it works quite well, especially with the recently implemented performance profiling sessions that we just ran through. One scenario, however, remains stubbornly outside what the profiler can currently do. When people talk to me about it, they call it load test profiling, or integration test profiling. This is when you pour literally gigabytes of information into the profiler. And it works, provided you have enough memory, that is.

If you don’t have enough memory, however, you get to say hello to OutOfMemoryException.

When I dove into this problem I was sure that I would simply find that I was doing something stupid, and that as soon as I figured it out, everything would be all right. I actually did find a few places where I could optimize memory usage (reducing lambda usage in favor of cached delegates to named methods, for example), but that only shaved off a few percentage points. Trying out string interning actually resulted in a huge saving in memory, but I feel that this is just a stopgap measure. I have to persist the data to disk, rather than keep it in memory.
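To make the string interning idea concrete, here is a minimal sketch in Python (the profiler itself is .NET; Python is used only for brevity). The `InternPool` class and the sample SQL text are hypothetical, not taken from the profiler. It only shows the principle: equal strings that arrive as separate objects are collapsed into a single shared instance.

```python
class InternPool:
    """Keeps one shared instance of each distinct string (hypothetical helper)."""

    def __init__(self):
        self._pool = {}

    def intern(self, value: str) -> str:
        # setdefault returns the instance already stored for an equal string,
        # or stores and returns the one passed in if it is new.
        return self._pool.setdefault(value, value)

    def __len__(self) -> int:
        return len(self._pool)


pool = InternPool()

# Build two equal strings at runtime so they start out as separate objects,
# the way repeated statements or stack frames would arrive from the wire.
first = pool.intern("".join(["SELECT * ", "FROM Users"]))
second = pool.intern("".join(["SELECT * FROM ", "Users"]))
```

After both calls, `first` and `second` refer to the same object, so the pool holds one copy no matter how often the statement repeats. This reduces memory use but does not bound it, which is why it reads as a stopgap.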

That led me to a very interesting problem. What I need is basically a key value store. Interestingly enough, I already wrote one. The problem is that while this would work great right now, I have future plans which mean that depending on Esent is an… unwise choice. Basically, I would like to be able to run on Mono and/or Silverlight, and that rules out using a Windows-only / full trust native dll. As they say, a bummer. That requirement also rules out the various embedded databases.

I considered ignoring this requirement and handling it when the time comes, but I decided that since this is going to significantly affect how I am going to use the store, I can’t really afford to delay that decision. With that in mind, I set out to figure out what I needed:

  • A fast way to store / retrieve session information along with its associated data (stack trace, alerts, statistics, etc).
  • Ability to store, at a minimum, tens of thousands of items of variable size.
  • A single file (or at least, very few files) – cannot afford to have one item per file (it usually kills the FS).
  • Support updates without re-writing the entire file.
  • Usable from Mono & Silverlight, or easily portable to them.

With that in mind, I decided to take a look at what is already out there.

  • C#-Sqlite looked like it might be the ticket. It is a C# port of the Sqlite database. Unfortunately, when I took a look at the code base, it is very much a port to C#, and the code gave me the willies. I don’t feel that I can trust it, and at any rate, it would require me to write a lot of data access code, which is something that I am trying to avoid :-). (And no, you can’t use NHibernate with that version; you would have to port the ADO.Net driver as well, and then you wouldn’t be able to use it in Silverlight anyway.)
  • Caching Application Block – because it looked like it had a persistent solution already. That persistent solution is based on several files per item, which is not acceptable. I already tried that route in the past, it is a good way to kill your file system.
  • SilverDB – this is an interesting code base, and a good solution for the problem it is meant to solve (saving a relatively small amount of information to disk). However, I need to save large amounts of information, and I need to handle a lot of updates. SilverDB rewrites the entire file whenever it saves. That has too high a perf cost for my needs.
  • TheCache – I took only a brief look here, but it looks like it is too heavily focused on being a cache to be useful for my purposes.

In fact, given my requirements, it might be interesting to see what I don’t need.

  • Reliability.
  • Thread safety.
  • Saving is just a way to free memory.

Given that, I decided to go with the following method:

  • Custom serialization format, allowing me to save space & time using file & memory based string interning.
  • No persistent file index; the index can be kept directly in memory.
  • Persisted string interning file.

As you can see, this is a very tailored solution, not something that would be generally useful, but I have great hopes for this.
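The approach listed above can be sketched roughly as follows, again in Python rather than .NET. This is an illustration under stated assumptions, not the profiler's actual format. `AppendOnlyStore`, `StringTable`, `encode`, `decode`, and the record layout (little-endian 32-bit string ids) are all hypothetical names and choices. The sketch also assumes that an update is handled by appending a new copy and repointing the in-memory index, and it leaves out persisting the string table to its own file.

```python
import os
import struct
import tempfile


class StringTable:
    """Maps each distinct string to a small integer id (hypothetical)."""

    def __init__(self):
        self._ids = {}
        self._strings = []

    def id_for(self, text: str) -> int:
        if text not in self._ids:
            self._ids[text] = len(self._strings)
            self._strings.append(text)
        return self._ids[text]

    def text_for(self, string_id: int) -> str:
        return self._strings[string_id]


def encode(table: StringTable, lines) -> bytes:
    # A record is just a sequence of 32-bit string ids.
    ids = [table.id_for(line) for line in lines]
    return struct.pack(f"<{len(ids)}I", *ids)


def decode(table: StringTable, payload: bytes):
    count = len(payload) // 4
    return [table.text_for(i) for i in struct.unpack(f"<{count}I", payload)]


class AppendOnlyStore:
    """Single data file plus in-memory index; not reliable, not thread safe."""

    def __init__(self, path: str):
        # The index is never persisted, so old file content is useless:
        # open with truncation.
        self._file = open(path, "w+b")
        self._index = {}  # key -> (offset, length)

    def put(self, key, value: bytes) -> None:
        # Updates append a new copy and repoint the index (an assumption);
        # the old bytes simply become garbage in the file.
        self._file.seek(0, os.SEEK_END)
        offset = self._file.tell()
        self._file.write(value)
        self._index[key] = (offset, len(value))

    def get(self, key) -> bytes:
        offset, length = self._index[key]
        self._file.seek(offset)
        return self._file.read(length)

    def close(self) -> None:
        self._file.close()


path = os.path.join(tempfile.mkdtemp(), "sessions.dat")
store = AppendOnlyStore(path)
table = StringTable()

store.put("session-1", encode(table, ["Load", "Query", "Load"]))
# Updating the session appends; nothing already written is rewritten.
store.put("session-1", encode(table, ["Load", "Query", "Load", "Flush"]))
restored = decode(table, store.get("session-1"))
store.close()
```

The trade-off in this sketch is that the file only grows and nothing survives a restart. That matches the requirements above, where saving is just a way to free memory.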



Have you looked at FirebirdSQL yet? It's designed as an embedded database. FirebirdSql also provides a service/daemon for client/server access.

The DotnetFirebird project (a subproject of FirebirdSql, much like the MySQL connector project) comes with both a Windows and a Mono 1.1.x build. It also supports the compact framework, so you should be able to use it with Silverlight or on a Windows Mobile phone.


Ayende Rahien


I tried it a while ago, yes. It had numerous problems that made me give it up.


Thinking outside the box now, but why not build the ability to persist all information to a database with a session tag?

Such a database could be handled by NHibernate and it would give some interesting features.

Storing the information to a specified database would give the whole team analyzing capabilities. Another 'feature' would be collecting the profiling data during (automated) unit- and regression tests and dynamically assign a session tag (version + date comes into mind).

Later a team member could 'open' that session and compare it with an earlier run.

We're talking about an ORM profiling tool, so specifying a database server (mssql, oracle, mysql, etc) shouldn't be a problem for most users. Even when profiling an application with an embedded database it's still possible to store the profiling data in a remote database. Visual Studio comes by default with a mssql express installation.

I'm sure if I give it some more thought I can come up with some more usages.

Scott White

What if you redesign it to be an NT service, so that you can disconnect the UI from the base functionality? Then you could put a website, Silverlight or whatever in front of the database that the service is actually using.

Ayende Rahien


That would make me lose something extremely valuable, the xcopy experience, and it would SERIOUSLY complicate my life.



Patrik Hägne

I was about to ask the same question as Jenser, have you looked at Db4o? It would be interesting to see if it would cut it. I've never tried it myself but I'd like to get around to it some day.


Isn't the regular SQLite fast enough? BTW, it would be nice to have a pure managed SQLite version...

Judah Himango

I'm with Jenser and Patrik, you ought to give DB4O a shot.

Alex Yakunin

It's probable you'll come to the solution we work on (full-featured integrated database):

  • Key-value store = no range queries. But in your case there is at least one obvious range query kind: time range queries.

  • Sessions can be very long with other ORMs, so it's possible that it won't be enough to store session data as the value part.

Just FYI: our original problem was looking very similar. I thought it would be nice to have a simple key-value pair storage for our local databases.

Alex Yakunin

Btw, if simple custom storage is really enough, it's definitely better to use this option - at least, because of flexibility.

E.g. if range queries are actually needed, but they can be emulated with sequential key processing (e.g. if the key is a minute number or something like this), a key-value store seems a good option.

Ayende Rahien


I have no range queries, so a K/V store is perfect. There is no size limitation for the value.
