Ayende @ Rahien

Unnatural acts on source code

Observations on Embedded databases

I spent significant parts of the last two weeks dealing with embedded databases. I have used, SQL CE, SQLite, FireBird, db4o and Berkeley DB.

My requirements were really simple, or so I thought. I just wanted safe for multi threading and support for transactions.

Let me go over them in order:

SQL CE

This is a really nice DB, syntax is comparable to SQL Server, so it makes a lot of things simpler. It has transaction support and is supposed to be multi threaded safe.

It is not.

It is very easy to get SQL CE into situations where it hang. Usually when it is attempting to open the database. Oh, and there is no lock timeout for this issue, so it literally hang.

Result: overruled.

SQLite

I just love this DB. It is simple, straightforward, and aside from crazy date manipulation support, just works. It has both transactions and multi threading support.

However, multi threading support is gained by locking the entire database. This is an acceptable behavior, in most scenarios, but for my needs, it meant that threads stepped on each other all too often, and that wasn't acceptable.

FireBird

I had a really hard time getting this to work correctly. I got some recommendations for it, so I decided to go for it.

It doesn't handle transaction isolation, so I never actually got to the point of testing multi threading behavior.

db4o

This was my first foray into using db4o, so take anything that I say with a grain of salt.

It looked like it would be a good solution. However, I couldn't figure out how to get two threads share the same file. Each connection lock the file, so they cannot be used concurrently. Using the server version might solve this, but I am looking for embedded solution, not server solution.

Berkeley DB

I spent the most amount of time here. And it is almost there.

I had one critical issue that I managed to overcome, but then BDB totally killed itself when it hang in my tests. After tearing up a lot of hair, I ended up finding out that there is a known bug in version 4.5 (which is the latest version that has a .NET binding) that can cause this.

Conclusion

I start using Dictionary as my data store. Just don't restart the server.

Comments

Simon Green
07/10/2008 11:04 PM by
Simon Green

Have you looked at VistaDB? This is a managed (C#) database that aims to be SQL Server compatible.

http://www.vistadb.net/

Ayende Rahien
07/10/2008 11:12 PM by
Ayende Rahien

Yes, I did.

Haven't tried that because there isn't a free license that allow commercial use.

Bob Archer
07/10/2008 11:18 PM by
Bob Archer

Did you look at Bamboo.Prevelence... it is basically an in-memory database but it backs itself up to the file system so shutdowns and restarts are safe. It is supposed to be extremely fast.

BOb

Ayende Rahien
07/10/2008 11:25 PM by
Ayende Rahien

I did. Last activity was in 2005

Andy Miller
07/11/2008 01:25 AM by
Andy Miller

Do you have some sample code that causes SQL CE to hang? I am using SQL CE in a multi-threaded app, but have never seen that problem. I'm thinking that my ORM is hiding that bad behavior from me, but if I saw the conditions that cause it I could try to force the problem to occur. I am using the latest CE (3.5)...if that makes a difference.

Ayende Rahien
07/11/2008 01:40 AM by
Ayende Rahien

http://www.ayende.com/Blog/archive/2008/04/08/SQL-CE-Issues-Part-2.aspx

James Arendt
07/11/2008 02:28 AM by
James Arendt

RE: db4o

You want to use the client/server functionality in db4o. There's two modes for the client/server -- embedded and remote. Embedded does not use any networking and provides the functionality you're looking for.

Here's the ref documentation on the site:

http://developer.db4o.com/Resources/view.aspx/Reference/Client-Server/Embedded

It's also brought up briefly in the tutorial:

http://www.db4o.com/about/productinformation/resources/db4o-6.3-tutorial-net.pdf

Dennis
07/11/2008 02:49 AM by
Dennis

mysql also comes in an embeddable version. Not sure a C# version is available though

Yann Trevin
07/11/2008 05:28 AM by
Yann Trevin

Well... You just found a new project to work on (Rhino.Store ?)

Ayende Rahien
07/11/2008 05:34 AM by
Ayende Rahien

My total lack of interest in building that is remarkable.

Oran
07/11/2008 05:39 AM by
Oran

So do the requirements "safe for multi-threading and supports transactions" still apply to your in-memory solution?

Ayende Rahien
07/11/2008 05:47 AM by
Ayende Rahien

Yes,

hell, it even supports distributed transactions :-)

Omer van Kloeten
07/11/2008 09:11 AM by
Omer van Kloeten

Great post. Loved the conclusion.

I used to work with SQL CE, although never in a multithreaded environment, I can still relate to the pain.

Sidar ok
07/11/2008 10:27 AM by
Sidar ok

Any chance that you can share the code that you used to test dbs with those requirements?

Joshka
07/11/2008 10:46 AM by
Joshka

but hold on, Dictionary isn't thread safe... (http://www.ayende.com/Blog/archive/2008/01/21/Performance-threading-and-double-checked-locks.aspx) ;)

Ryan Roberts
07/11/2008 11:36 AM by
Ryan Roberts

You can get multiple threads accessing db4o internally using embedded client-server, here's a copypaste from an autofac module setup:

private void RegisterDatabases(ContainerBuilder builder)

    {

        builder.Register(c => Db4oFactory.OpenServer(

                                  c.Resolve<IConfiguration>("DocumentsConfiguration"), DocumentStore,0)).Named("DocumentServer").SingletonScoped();

        builder.Register(c => Db4oFactory.OpenServer(

                                  c.Resolve<IConfiguration>("ResourcesConfiguration"), ResourceStore,0)).Named("ResourceServer").SingletonScoped();



        builder.Register(c => c.Resolve<IObjectServer>("DocumentServer").OpenClient()).Named(

            "DocumentClient").FactoryScoped();


        builder.Register(c => c.Resolve<IObjectServer>("ResourceServer").OpenClient()).Named(

            "ResourceServer").FactoryScoped();   

    }
vbandrade
07/11/2008 11:46 AM by
vbandrade

Dictionary was a long shot. It's better use EntLib cache manager with a file storage. I had a similar situation and the cache manager is up and running...

Bob Archer's solution sounds nice.

Ayende Rahien
07/11/2008 12:21 PM by
Ayende Rahien

Sure, here they are. Try to run the throughput sample

https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/branches/rhino-queues-bdb

https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/branches/rhino-queues-firebird

https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/branches/rhino-queues-bdb

https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/branches/rhino-queues-sqlite

Ayende Rahien
07/11/2008 12:22 PM by
Ayende Rahien

Joshka,

It is thread safe if you wrap it with locking

Ayende Rahien
07/11/2008 12:23 PM by
Ayende Rahien

vbandrade,

Try to several hundred thousand items in it from multi threads, and it breaks down.

Casey Duplantis
07/11/2008 02:03 PM by
Casey Duplantis

I agree - this is a missing part of the open.net approach. Have you tried hsqldb .net? (http://hsqldb.cvs.sourceforge.net/hsqldb/Dot.NET/Org.Hsqldb/)

I've used the java version for Java/Groovy/Grails work - especially handy for unit tests. The downside? It looks like the .NET version runs on IKVM (might not be the best option for you) and has no recent activity.

pete w
07/11/2008 02:04 PM by
pete w

"Conclusion

I start using Dictionary as my data store. Just don't restart the server."

Reading that one made me laugh out loud :)

Stiiifff
07/11/2008 02:32 PM by
Stiiifff

Check out ServiceModelEx lib from Juval Lowy ... it contains System.Transaction-aware collections. Or maybe that's you've implemented already ;o)

Joshka
07/11/2008 07:25 PM by
Joshka

but what makes any of the above db's not thread safe if you wrap them with locking...

Ayende Rahien
07/12/2008 03:10 AM by
Ayende Rahien

If I need to lock a DB, that is a problem with the DB.

A DB should be thread safe.

Sean Kearon
07/12/2008 09:32 AM by
Sean Kearon

Ayende, VistaDB does have an express version that allows what you need. You can use as long as you are not using it commercially, display the VistaDB logo, are not writing integration with VS, etc. Pretty straight forward and definitely allows open source projects to use it freely.

There are a couple of feature restrictions too - no VS GUI support, no isolated storage, no in-memory support. But hey, can't have everything ;)

They have a blog about it here:

http://www.vistadb.net/blog/vistadb-news/vistadb-express-release/

Ayende Rahien
07/12/2008 09:34 AM by
Ayende Rahien

I know.

The inability to use in a commercial version is a reason to reject it.

I do intend to make use of this in a commercial applications.

Tuna Toksoz
07/12/2008 04:16 PM by
Tuna Toksoz

I'm with James Arendt, It suports embedded server mode.

lg
07/13/2008 07:03 PM by
lg

derby http://db.apache.org/derby/papers/DerbyTut/embedded_intro.html

http://www.hsqldb.org/ hsqldb

Gregory Burd
07/14/2008 01:49 AM by
Gregory Burd

Hello,

Thanks for checking into Berkeley DB and using our forums to try to address your issues.  I think you missed the part where we posted you a fix as a patch to 4.5, did you try that?  Also, 4.7.x is the latest release, and as always it is a bit better than its predecessors.  Finally, you're looking for a .NET/C# API.  Guess what?  You're not alone.  You ended up using someone's .NET API, but not one from Oracle.  We're working on one that we will support.  We're also working to make our C++ API more STL friendly.  The .NET/C# API will evolve over time, we hope to be able to provide Direct Persistence Layer (DPL) style features (automatic object->database mappings) soon.  In general, I think we need to better support Windows developers like you.  Please keep going to our forums, or email me directly.

regards,

-greg

Gregory Burd | Senior Product Manager | Berkeley DB, Oracle Corp

Ayende Rahien
07/14/2008 02:37 AM by
Ayende Rahien

I didn't miss the patch at all. And I am grateful for the quick responses in the forums.

The problem that I have is that the cost of using BDB has risen to the point where I need to invest too much for a side project, which is what Iam using it for now.

I am very happy to hear that you are working on a .NET API, just making a 1:1 mapping between the languages would be incredibly valuable. Is there any beta program that would allow me to use this API right now?

Justin Rudd
07/14/2008 06:29 AM by
Justin Rudd

The inability to use in a commercial version is a reason to reject it.

I do intend to make use of this in a commercial applications.

Be careful with Oracle BDB (i.e. Sleepycat) with regard to their open source license - http://www.oracle.com/technology/software/products/berkeley-db/htdocs/licensing.html - if you plan on distributing what you are working on under a commercial license, you'll need the commercial BDB license unless you are going to give away all the code. But then it would be a dual license...hm...I dunno. I just know that I had plans to use BDB in a little app that I wanted to be able to take commercial later on. The language in the OSI license was vague enough to make me rethink it. Now if you've got a lawyer that says otherwise...

Ayende Rahien
07/14/2008 06:36 AM by
Ayende Rahien

The issue is different, here you have redistribution vs. commercial

blater
07/14/2008 10:38 AM by
blater

you have my sympathies - we spent a lot of time investigating embedded databases, and settled on derby/hsqldb... before getting burned by it's lack of proper support for some of the more "difficult" parts of sql 92.

h2 (h2database.com) did the trick though. Recommended. if you need an embedded java database that won't turn round and bite you.

ecards
07/14/2008 03:02 PM by
ecards

Just wondering about your reluctance to use a commercial db (vistadb) in a commercial product.

Is it based on philosophy or just affordability of the license?

Of course I understand wanting to use open source for non-commercial projects, and also for commercial projects where there is a clearly proven and ready solution.

But no doubt your time to research this was worth more than the cost of a license, so I'm curious about the motivation.

best regards,

lee

Ayende Rahien
07/14/2008 05:50 PM by
Ayende Rahien

It is based on me not liking limitations on what I can do with a piece of software I write.

If I am basing myself on a certain platform, I want to know what I can do with it. Having limitations on that is counter to what I would like to have.

Paying for something is not an issue, but in this case, I am working on a hobby project. As such, I would like to be able to use it in any circumstance.

Just to give you an idea, Rhino Mocks was a hobby project.

Marc
07/16/2008 08:47 AM by
Marc

Perhaps I'im mission something, but did you try to enable Sql CE for ASP.Net environment ?

AppDomain.CurrentDomain.SetData("SQLServerCompactEditionUnderWebHosting", true)

See http://blogs.msdn.com/stevelasker/archive/2006/11/27/sql-server-compact-edition-under-asp-net-and-iis.aspx for full explanation.

Ayende Rahien
07/16/2008 09:12 AM by
Ayende Rahien

I did, and we had this issue.

But note that I am not using this in an ASP.Net system

Pierre Meyer
07/16/2008 10:09 AM by
Pierre Meyer

intersting blog !!

for my AVS scm project, I tried at first OneDollarDB, which performances were quite great, but I had issues with indexes, which caused database crashes, so I dropped it out.

now I am using DerbyDB, which also provides an embedded database, very stable product. Did you spend some time on this one ?

Pierre

Gunnar Liljas
07/16/2008 11:45 AM by
Gunnar Liljas

db4o doesn't exactly have a "free license that allow commercial use" does it?

Eric Hauser
07/21/2008 04:34 PM by
Eric Hauser

I was browsing through the SQLite documentation. According to it in versions greater than 3.4, the whole database does not lock:

http://www.sqlite.org/34to35.html (Section 4.0)

Based on your experiences, Is this inaccurate?

Ayende Rahien
07/21/2008 04:51 PM by
Ayende Rahien

I am using the System.Data.Sqlite version, which has embedded version of the native DLL.

No idea what that version is, though. I think that I might have an earlier version.

I'll have to try that.

Jan Persson
08/04/2008 08:24 AM by
Jan Persson

What do you mean when you say that Firebird does not handle transaction isolation? Actually it handles multiple transactions very well and we have been using it embedded in a multi-threaded application for almost two years now.

http://www.firebirdsql.org/dotnetfirebird/blog/2005/02/transaction-isolation-levels-in.html

Ayende Rahien
08/04/2008 07:33 PM by
Ayende Rahien

Run the throughput sample here, you'll see the problem:

https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/branches/rhino-queues-firebird

Jan Persson
08/05/2008 09:42 AM by
Jan Persson

I'm sorry to say that I neither have the time nor the competence to understand the problems you experienced by looking through your code. Could you instead please (in a few words) explain what kind of problems you had with Firebird?

Since Firebird is using MVCC it doesn't behave in the same way as other databases when it comes to locking and isolation, but that is not the same thing as to say that it is not handling transaction isolation.

Darrell
08/05/2008 01:50 PM by
Darrell

May want to check out Postgresql:

http://www.postgresql.org/

You will like their license!

Ayende Rahien
08/05/2008 11:35 PM by
Ayende Rahien

Darrell,

What part of embedded database does PostgresSQL Has?

Ayende Rahien
08/06/2008 12:06 AM by
Ayende Rahien

http://ayende.com/Blog/archive/2008/07/07/Why-I-dont-like-FireBird-Part-II.aspx

Comments have been closed on this topic.