Ayende @ Rahien

Unnatural acts on source code

Rhino Divan DB – Design

One of the things that I wanted to do with RDB is to create an explicit actor model inside the codebase. I have been using a similar structure inside NH Prof, and it has been quite successful. The design goals for RDB are:

Assumptions for the database construction

Get / Put / Delete semantics for Json documents.

All of those operations can work on batches of documents, and they fully implement ACID. This means that if you got a successful response for a document Put, you can rely on the document always being there.

Those operations should be considered cheap.
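A minimal sketch of what batch Get / Put / Delete with all-or-nothing semantics could look like. This is not RDB's code: it is written in Python for brevity, SQLite stands in for the real transactional storage, and the DocumentStore class and its method names are hypothetical.

```python
import json
import sqlite3


class DocumentStore:
    """Hypothetical Get / Put / Delete store; SQLite stands in for the transactional storage."""

    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS documents (key TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )
        self.db.commit()

    def put(self, docs):
        """Store a batch of {key: document} pairs: either all are written or none."""
        with self.db:  # commits on success, rolls back if anything below raises
            for key, doc in docs.items():
                self.db.execute(
                    "INSERT OR REPLACE INTO documents (key, body) VALUES (?, ?)",
                    (key, json.dumps(doc)),
                )

    def get(self, keys):
        """Return {key: document} for those keys that exist."""
        found = {}
        for key in keys:
            row = self.db.execute(
                "SELECT body FROM documents WHERE key = ?", (key,)
            ).fetchone()
            if row is not None:
                found[key] = json.loads(row[0])
        return found

    def delete(self, keys):
        """Delete a batch of documents in one transaction."""
        with self.db:
            self.db.executemany(
                "DELETE FROM documents WHERE key = ?", [(key,) for key in keys]
            )
```

If one document in a batch cannot be serialized, the whole Put is rolled back, so a successful return means every document in the batch was committed.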

Reboot / crash resistant

The DB can crash / restart, but no loss of functionality may occur; as soon as it restarts, everything goes on as usual. There can be no in-memory data structures / work that cannot be recovered from persistent structures.

Views for searching

The DB uses views, defined using linq expressions, to support search capabilities. Those views are indexed in the background (so request processing is never held up by view indexing). When you get a result from a query, you always know whether the result is stale or not.

Adding a view to an existing database is a cheap operation, regardless of the database size. During view construction, the view can be queried (but its results will be considered stale). Reboot during view construction will not impact the construction process.

Indexing a document twice is a stable operation, which means that a view can always choose to re-index things if it so chooses.
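The view properties above (queryable while still being built, results flagged as stale, safe to index a document twice) can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the View class, the sequence numbers used to detect staleness, and the plain function standing in for the linq expression. It also assumes documents are indexed in sequence order.

```python
class View:
    """Hypothetical view: a map function plus an index that may trail the documents."""

    def __init__(self, map_fn):
        self.map_fn = map_fn      # document -> index key (stands in for the linq expression)
        self.entries = {}         # doc_id -> index key (stands in for the view's index)
        self.indexed_up_to = 0    # highest document sequence number indexed so far

    def index(self, doc_id, doc, seq):
        # Entries are keyed by doc_id, so indexing the same document twice
        # overwrites the earlier entry instead of duplicating it.
        self.entries[doc_id] = self.map_fn(doc)
        self.indexed_up_to = max(self.indexed_up_to, seq)

    def query(self, key, last_written_seq):
        """Return (matching doc ids, stale flag)."""
        ids = sorted(d for d, k in self.entries.items() if k == key)
        # The result is stale if documents were written that the view has not seen yet.
        stale = self.indexed_up_to < last_written_seq
        return ids, stale


docs = {1: {"name": "ayende"}, 2: {"name": "rafal"}, 3: {"name": "ayende"}}
people_by_name = View(lambda doc: doc["name"])

# A freshly added view can be queried immediately; it just reports stale results.
empty_result = people_by_name.query("ayende", last_written_seq=3)

people_by_name.index(1, docs[1], seq=1)
people_by_name.index(1, docs[1], seq=1)  # re-indexing is harmless
partial_result = people_by_name.query("ayende", last_written_seq=3)

people_by_name.index(2, docs[2], seq=2)
people_by_name.index(3, docs[3], seq=3)
final_result = people_by_name.query("ayende", last_written_seq=3)
```

Creating the view does no indexing work up front, which is what keeps adding a view cheap regardless of database size in this sketch.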

Overall design


RDB stores two major pieces of information in transactional storage.

Documents, obviously, which are stored in a format that allows the document content to be sent to the user quickly, and tasks.

Tasks are how RDB maintains state over crashes / reboots, and they also form the basis of the database's async work. Any work that is going to take some time for the database to perform is written to transactional storage as a task. Those tasks are things like: “View ‘peopleByName’ should index documents 1 – 42”.

There are background threads working off this task queue, performing the work and removing each task when it is completed.
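A rough sketch of the persisted task queue idea, in Python with SQLite standing in for the transactional storage. This is an illustration under assumptions, not RDB's implementation: the table layout and the open_storage, enqueue, and run_next_task names are hypothetical. The point it shows is that a task is removed only after its work has finished, so a crash in the middle leaves the task in storage to be picked up again.

```python
import json
import sqlite3


def open_storage(path=":memory:"):
    """Open the (stand-in) transactional storage and make sure the task table exists."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS tasks "
        "(id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
    )
    db.commit()
    return db


def enqueue(db, view, first_doc, last_doc):
    """Persist a task such as: view 'peopleByName' should index documents 1 - 42."""
    payload = json.dumps({"view": view, "first": first_doc, "last": last_doc})
    with db:  # the task is durable once this transaction commits
        db.execute("INSERT INTO tasks (payload) VALUES (?)", (payload,))


def run_next_task(db, perform):
    """Run the oldest task; return False when the queue is empty."""
    row = db.execute("SELECT id, payload FROM tasks ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return False
    task_id, payload = row
    # If perform() raises (or the process dies here), the task is still stored
    # and will be retried, so the work itself must be safe to repeat.
    perform(json.loads(payload))
    with db:
        db.execute("DELETE FROM tasks WHERE id = ?", (task_id,))
    return True
```

A background thread would call run_next_task in a loop. Because a task can run more than once after a crash, this design only works if the work is repeatable, which is the case when indexing a document twice is a stable operation.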

The results of each view are written to a Lucene index (one per view).

So far I have the entire structure done. I need to do some polishing, and I have a different OSS strategy to go with, but things are looking good.

Posted By: Ayende Rahien


Comments

03/01/2010 10:21 AM by Rafal

Can you justify using Esent as a transactional store? Many applications will use a relational database along with the 'nosql' document db, so it would be much simpler to use the same database server for storing both relational data and DivanDB documents. It would simplify many tasks like administration, backup/restore, debugging etc.

03/01/2010 11:12 AM by Ayende Rahien

Rafal,

I want it to be a single thing, not something that relies on a lot of external components, requires complex installation, etc.

Administration - there should be none

Backup/Restore - esentutl.exe is already part of windows

Debugging - you don't do that, because the server has no logic

Index checking - there is Luke

03/01/2010 11:48 AM by j23tom

Hmm... does this mean that this project will never run under Linux/Mono?

03/01/2010 01:12 PM by Ayende Rahien

j23tom,

It would, once someone ports the storage to BDB.

03/01/2010 01:18 PM by Ben

Ayende, is the code available to start poking around?

03/01/2010 01:31 PM by Steve

Good stuff

Let's say I use this for my 'command storage' in a command-query pattern. One piece that would be essential to me would be figuring out how to update my query database (let's say MS SQL).

03/01/2010 01:39 PM by Ayende Rahien

Ben,

Not at the moment, we are working on a different release plan

03/01/2010 01:40 PM by Ayende Rahien

You have a message dispatched to a consumer whenever a command is executed?

03/01/2010 02:53 PM by Jan Limpens

Ayende, under which license do you plan to release all this? Or is this a commercial project?

03/01/2010 04:06 PM by Ben

Thanks Ayende. Hurry up ;) I was all set to start with CouchDB or MongoDB but this sounds like it'd be a better fit for what I need. Like Jan, I'm curious on your release plan... if you're going dual license, commercial, purely oss, what.

03/01/2010 04:12 PM by Set

Does it support committing multiple documents at once?

03/01/2010 04:38 PM by Vadim Kantorov

I've heard some ramblings that there's a database size limit in Esent. Is there really?

03/01/2010 04:59 PM by Ben

Vadim,

  • Individual columns can be up to 2GB in size. A database can be up to 16TB in size.

Copied from: blogs.msdn.com/.../...-api-in-the-windows-sdk.aspx

So, yes, there is a limit but, from a practical standpoint, probably a livable one for most apps.

03/01/2010 06:41 PM by Ayende Rahien

Jan,

This will be OSS, but I am not sure under what license.

03/01/2010 06:45 PM by Ayende Rahien

Ben,

I am willing to give source access in exchange for work on the project.

Set,

Not at the moment, but it will soon.

03/01/2010 08:06 PM by Ben

Ayende,

I can certainly help with anything that doesn't require a big brain ;)

03/02/2010 04:03 AM by Kerja Jawatan Kosong

Ayende, I'm currently using XOOPS for my own project. Do you think this storage would work with it?

03/02/2010 07:54 AM by Ayende Rahien

It would be possible, yes.

We have a fully functional JSON / HTTP API.

Comments have been closed on this topic.