Observations on Embedded databases
I spent significant parts of the last two weeks dealing with embedded databases. I have used, SQL CE, SQLite, FireBird, db4o and Berkeley DB.
My requirements were really simple, or so I thought. I just wanted safe for multi threading and support for transactions.
Let me go over them in order:
SQL CE
This is a really nice DB, syntax is comparable to SQL Server, so it makes a lot of things simpler. It has transaction support and is supposed to be multi threaded safe.
It is not.
It is very easy to get SQL CE into situations where it hang. Usually when it is attempting to open the database. Oh, and there is no lock timeout for this issue, so it literally hang.
Result: overruled.
SQLite
I just love this DB. It is simple, straightforward, and aside from crazy date manipulation support, just works. It has both transactions and multi threading support.
However, multi threading support is gained by locking the entire database. This is an acceptable behavior, in most scenarios, but for my needs, it meant that threads stepped on each other all too often, and that wasn't acceptable.
FireBird
I had a really hard time getting this to work correctly. I got some recommendations for it, so I decided to go for it.
It doesn't handle transaction isolation, so I never actually got to the point of testing multi threading behavior.
db4o
This was my first foray into using db4o, so take anything that I say with a grain of salt.
It looked like it would be a good solution. However, I couldn't figure out how to get two threads share the same file. Each connection lock the file, so they cannot be used concurrently. Using the server version might solve this, but I am looking for embedded solution, not server solution.
Berkeley DB
I spent the most amount of time here. And it is almost there.
I had one critical issue that I managed to overcome, but then BDB totally killed itself when it hang in my tests. After tearing up a lot of hair, I ended up finding out that there is a known bug in version 4.5 (which is the latest version that has a .NET binding) that can cause this.
Conclusion
I start using Dictionary as my data store. Just don't restart the server.
Comments
Have you looked at VistaDB? This is a managed (C#) database that aims to be SQL Server compatible.
http://www.vistadb.net/
Yes, I did.
Haven't tried that because there isn't a free license that allow commercial use.
Did you look at Bamboo.Prevelence... it is basically an in-memory database but it backs itself up to the file system so shutdowns and restarts are safe. It is supposed to be extremely fast.
BOb
I did. Last activity was in 2005
Do you have some sample code that causes SQL CE to hang? I am using SQL CE in a multi-threaded app, but have never seen that problem. I'm thinking that my ORM is hiding that bad behavior from me, but if I saw the conditions that cause it I could try to force the problem to occur. I am using the latest CE (3.5)...if that makes a difference.
http://www.ayende.com/Blog/archive/2008/04/08/SQL-CE-Issues-Part-2.aspx
RE: db4o
You want to use the client/server functionality in db4o. There's two modes for the client/server -- embedded and remote. Embedded does not use any networking and provides the functionality you're looking for.
Here's the ref documentation on the site:
http://developer.db4o.com/Resources/view.aspx/Reference/Client-Server/Embedded
It's also brought up briefly in the tutorial:
http://www.db4o.com/about/productinformation/resources/db4o-6.3-tutorial-net.pdf
mysql also comes in an embeddable version. Not sure a C# version is available though
Well... You just found a new project to work on (Rhino.Store ?)
My total lack of interest in building that is remarkable.
So do the requirements "safe for multi-threading and supports transactions" still apply to your in-memory solution?
Yes,
hell, it even supports distributed transactions :-)
Great post. Loved the conclusion.
I used to work with SQL CE, although never in a multithreaded environment, I can still relate to the pain.
Any chance that you can share the code that you used to test dbs with those requirements?
but hold on, Dictionary isn't thread safe... (http://www.ayende.com/Blog/archive/2008/01/21/Performance-threading-and-double-checked-locks.aspx) ;)
You can get multiple threads accessing db4o internally using embedded client-server, here's a copypaste from an autofac module setup:
private void RegisterDatabases(ContainerBuilder builder)
Dictionary was a long shot. It's better use EntLib cache manager with a file storage. I had a similar situation and the cache manager is up and running...
Bob Archer's solution sounds nice.
Sure, here they are. Try to run the throughput sample
https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/branches/rhino-queues-bdb
https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/branches/rhino-queues-firebird
https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/branches/rhino-queues-bdb
https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/branches/rhino-queues-sqlite
Joshka,
It is thread safe if you wrap it with locking
vbandrade,
Try to several hundred thousand items in it from multi threads, and it breaks down.
I agree - this is a missing part of the open.net approach. Have you tried hsqldb .net? (http://hsqldb.cvs.sourceforge.net/hsqldb/Dot.NET/Org.Hsqldb/)
I've used the java version for Java/Groovy/Grails work - especially handy for unit tests. The downside? It looks like the .NET version runs on IKVM (might not be the best option for you) and has no recent activity.
"Conclusion
I start using Dictionary as my data store. Just don't restart the server."
Reading that one made me laugh out loud :)
Check out ServiceModelEx lib from Juval Lowy ... it contains System.Transaction-aware collections. Or maybe that's you've implemented already ;o)
but what makes any of the above db's not thread safe if you wrap them with locking...
If I need to lock a DB, that is a problem with the DB.
A DB should be thread safe.
Ayende, VistaDB does have an express version that allows what you need. You can use as long as you are not using it commercially, display the VistaDB logo, are not writing integration with VS, etc. Pretty straight forward and definitely allows open source projects to use it freely.
There are a couple of feature restrictions too - no VS GUI support, no isolated storage, no in-memory support. But hey, can't have everything ;)
They have a blog about it here:
http://www.vistadb.net/blog/vistadb-news/vistadb-express-release/
I know.
The inability to use in a commercial version is a reason to reject it.
I do intend to make use of this in a commercial applications.
I'm with James Arendt, It suports embedded server mode.
derby http://db.apache.org/derby/papers/DerbyTut/embedded_intro.html
http://www.hsqldb.org/ hsqldb
Hello,
regards,
-greg
Gregory Burd | Senior Product Manager | Berkeley DB, Oracle Corp
I didn't miss the patch at all. And I am grateful for the quick responses in the forums.
The problem that I have is that the cost of using BDB has risen to the point where I need to invest too much for a side project, which is what Iam using it for now.
I am very happy to hear that you are working on a .NET API, just making a 1:1 mapping between the languages would be incredibly valuable. Is there any beta program that would allow me to use this API right now?
<quote>The inability to use in a commercial version is a reason to reject it.
I do intend to make use of this in a commercial applications. </quote>
Be careful with Oracle BDB (i.e. Sleepycat) with regard to their open source license - http://www.oracle.com/technology/software/products/berkeley-db/htdocs/licensing.html - if you plan on distributing what you are working on under a commercial license, you'll need the commercial BDB license unless you are going to give away all the code. But then it would be a dual license...hm...I dunno. I just know that I had plans to use BDB in a little app that I wanted to be able to take commercial later on. The language in the OSI license was vague enough to make me rethink it. Now if you've got a lawyer that says otherwise...
The issue is different, here you have redistribution vs. commercial
you have my sympathies - we spent a lot of time investigating embedded databases, and settled on derby/hsqldb... before getting burned by it's lack of proper support for some of the more "difficult" parts of sql 92.
h2 (h2database.com) did the trick though. Recommended. if you need an embedded java database that won't turn round and bite you.
Just wondering about your reluctance to use a commercial db (vistadb) in a commercial product.
Is it based on philosophy or just affordability of the license?
Of course I understand wanting to use open source for non-commercial projects, and also for commercial projects where there is a clearly proven and ready solution.
But no doubt your time to research this was worth more than the cost of a license, so I'm curious about the motivation.
best regards,
lee
It is based on me not liking limitations on what I can do with a piece of software I write.
If I am basing myself on a certain platform, I want to know what I can do with it. Having limitations on that is counter to what I would like to have.
Paying for something is not an issue, but in this case, I am working on a hobby project. As such, I would like to be able to use it in any circumstance.
Just to give you an idea, Rhino Mocks was a hobby project.
Perhaps I'im mission something, but did you try to enable Sql CE for ASP.Net environment ?
AppDomain.CurrentDomain.SetData("SQLServerCompactEditionUnderWebHosting", true)
See http://blogs.msdn.com/stevelasker/archive/2006/11/27/sql-server-compact-edition-under-asp-net-and-iis.aspx for full explanation.
I did, and we had this issue.
But note that I am not using this in an ASP.Net system
intersting blog !!
for my AVS scm project, I tried at first OneDollarDB, which performances were quite great, but I had issues with indexes, which caused database crashes, so I dropped it out.
now I am using DerbyDB, which also provides an embedded database, very stable product. Did you spend some time on this one ?
Pierre
db4o doesn't exactly have a "free license that allow commercial use" does it?
I was browsing through the SQLite documentation. According to it in versions greater than 3.4, the whole database does not lock:
http://www.sqlite.org/34to35.html (Section 4.0)
Based on your experiences, Is this inaccurate?
I am using the System.Data.Sqlite version, which has embedded version of the native DLL.
No idea what that version is, though. I think that I might have an earlier version.
I'll have to try that.
What do you mean when you say that Firebird does not handle transaction isolation? Actually it handles multiple transactions very well and we have been using it embedded in a multi-threaded application for almost two years now.
http://www.firebirdsql.org/dotnetfirebird/blog/2005/02/transaction-isolation-levels-in.html
Run the throughput sample here, you'll see the problem:
https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/branches/rhino-queues-firebird
I'm sorry to say that I neither have the time nor the competence to understand the problems you experienced by looking through your code. Could you instead please (in a few words) explain what kind of problems you had with Firebird?
Since Firebird is using MVCC it doesn't behave in the same way as other databases when it comes to locking and isolation, but that is not the same thing as to say that it is not handling transaction isolation.
May want to check out Postgresql:
http://www.postgresql.org/
You will like their license!
Darrell,
What part of embedded database does PostgresSQL Has?
http://ayende.com/Blog/archive/2008/07/07/Why-I-dont-like-FireBird-Part-II.aspx
Comment preview