Ayende @ Rahien

Unnatural acts on source code

Rhino Divan DB – Consideration

It is interesting: I have been thinking about Divan DB for a long time now, and tonight I decided to give it some love and finally see exactly how hard it is going to be to write it.

It turns out that it isn’t really hard at all. You can see my progress here (http://github.com/ayende/rhino-divandb/tree/master). It now supports adding and retrieving documents, defining views, view caching, and (very simple) view application.

So you can now push a bunch of views and documents to the database and get the result back. There is a lot more to do (for example, handling edits with the views), but it seems to be fairly straightforward so far.
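
To make the "push documents and views, get the result back" flow concrete, here is a minimal in-memory sketch. It is written in Python rather than the project's own language, and every name in it (`TinyDocumentStore`, `define_view`, `query_view`, and so on) is hypothetical and not taken from the rhino-divandb source. It only illustrates the general idea of stored documents, named views, and a view cache that is dropped on writes.

```python
import json
from typing import Any, Callable, Dict, Iterable, List

# Hypothetical names throughout; this is not the rhino-divandb API.
Document = Dict[str, Any]
ViewFunc = Callable[[Document], Iterable[Any]]


class TinyDocumentStore:
    """In-memory stand-in for a document database with cached views."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}         # document id -> JSON text
        self._views: Dict[str, ViewFunc] = {}   # view name -> map function
        self._cache: Dict[str, List[Any]] = {}  # view name -> cached rows

    def add_document(self, doc_id: str, doc: Document) -> None:
        self._docs[doc_id] = json.dumps(doc)
        # Simplest possible strategy: any write drops every cached view.
        self._cache.clear()

    def get_document(self, doc_id: str) -> Document:
        return json.loads(self._docs[doc_id])

    def define_view(self, name: str, func: ViewFunc) -> None:
        self._views[name] = func
        self._cache.pop(name, None)

    def query_view(self, name: str) -> List[Any]:
        # Apply the view to every document on a cache miss, then reuse the rows.
        if name not in self._cache:
            rows: List[Any] = []
            for raw in self._docs.values():
                rows.extend(self._views[name](json.loads(raw)))
            self._cache[name] = rows
        return list(self._cache[name])


def posts_by_author(doc: Document) -> Iterable[Any]:
    # Emit one row per post; documents of other types produce no rows.
    if doc.get("type") == "post":
        yield {"author": doc["author"], "title": doc["title"]}


store = TinyDocumentStore()
store.define_view("posts_by_author", posts_by_author)
store.add_document("1", {"type": "post", "author": "ayende", "title": "Divan DB"})
store.add_document("2", {"type": "comment", "author": "rafal", "text": "Cool"})
rows = store.query_view("posts_by_author")
```

Clearing the whole cache on every write is the crudest approach that stays correct; updating only the rows affected by an edited document is the harder part that is still on the to-do list.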

Comments

09/03/2009 04:56 AM by Fabio

What is the target of the project?

09/03/2009 06:33 AM by Rafal

Cool, is it the first document database for .NET?

Anyway, I have been checking what options are out there and found many implementations for Java; the Persevere project especially caught my attention. All these JSON databases benefit from having JavaScript engines embedded in them, so advanced server-side document processing is possible. And there's no JavaScript engine for .NET, Microsoft knows why... Do they think the best language for handling JSON is Python or C#?

09/03/2009 07:51 AM by Frank Quednau

Well, that's interesting. In some ways SharePoint could be understood as a document database, but this one is quite a different thing :)

Here are a couple of questions, if you don't mind:

Is this meant to be used e.g. as infrastructure for your services that handle view reads / creations / etc., or did you have in mind adding some kind of client/server support that would allow clients to connect to a Divan instance?

Is Esent the only DataStorage one can consider or can you think of other stores?

09/03/2009 08:15 AM by Ayende Rahien

Fabio,

Right now?

It has been bouncing around in my head for months; I got tired of having it there.

09/03/2009 08:17 AM by Ayende Rahien

Rafal,

If you take a look at the source code, you'll see that I made the decision not to rely on JavaScript for the actual processing.

The view language is actually C#, and we are using LINQ to handle that, which brings a lot of advantages to the game.

09/03/2009 08:22 AM by Ayende Rahien

Frank,

This is the library itself; there will probably be a server component if I get interested enough to do it.

As for Esent, there could be other storages; it is just the simplest one to use.

09/03/2009 12:22 PM by Marco Parenzan

Ah ah ah... nice name for a competitor of CouchDB. But just a question: in Italian we translate couch as "divano"... did you take the name "divan" from Italian?

09/03/2009 05:08 PM by Rinat Abdullin

Marco, that's because "Divan" is "Couch" in Russian))

09/03/2009 05:27 PM by Rafal

And in Polish it's a carpet. Don't know why.

I'd rather look for the project name's origins in Arabic or Hebrew.

BTW can anyone explain why 'Couch' is a good name for a database?

09/03/2009 06:04 PM by Ralf Westphal

@Rafal: There are other "document databases" for .NET. I gave the topic a try with The Lounge Repository. You can find the source code at CodePlex: http://loungerepo.codeplex.com/

But it's not a document database but an entity database. You throw at it a graph of entity objects (with value objects mixed in) and it persists it to the file system - a separate file for each entity. This is different from document databases which take an object graph as a whole and persist it. With The Lounge Repository you can later on retrieve any entity directly.

A pragmatic limitation so far: It supports queries only for cached entities. (All entities are cached by default.) So you could call it an in-mem database with persistence. (But not snapshot persistence.)

There is another "document database" called StupidDB. See this German blog article for a short explanation: blog.aztec-project.org/.../stupiddb-object-pers...
Or the .NET source code at: stupiddb.svn.sourceforge.net/.../trunk/

-Ralf

09/03/2009 06:42 PM by Ayende Rahien

Rafal,

It is a "restful" database

09/04/2009 06:18 AM by Rafal

@Ralf

Heh, StupidDB, this really would make my sales rock :)

By document database I meant something like CouchDB, that is, a system able to store JSON documents with search/index capabilities, a REST interface and versioning support. So I wasn't checking various object persistence frameworks like Lounge Repository; it's not exactly what I need.

@Ayende

'Restful' explains everything. But is Divan restful?

09/04/2009 07:30 AM by Ralf Westphal

@Rafal: Why is the notion of "document databases" so attractive? What's so cool about JSON? Sure, if you transmit "objects" JSON encoded that's pretty cross-platform. But otherwise...

At first glance I liked CouchDB quite a bit - and still like it. However, I find thinking in terms of "documents" limiting in many scenarios. A document is a closed container for data. That's great for a text. But a customer, an invoice, a user, a product, a bid, an order, a payment... that's all data which benefits from being interconnected with other data. It should not be penned up in a container.

That's why I developed The Lounge Repository - and view it as a generalization of "document databases": if you like, define a document to be an object graph with just one entity as the root. You can stuff any number of value objects in it. They'll all be persisted within the entity's "document".

Once this doesn't fit your bill anymore, though, you can link your "document" to others and be sure each will be persisted separately. But then I wouldn't call them documents anymore, but entities.

Where does JSON enter the picture? Currently The Lounge Repository is just an embedded "database". There is no need for JSON. But I could easily replace the binary serializer with a JSON serializer.

Once The Lounge Repository is made a server "database", though, JSON earns more consideration. Less for storage, but more as a transport encoding.

-Ralf

09/04/2009 07:43 AM by Rafal

@Ralf

I like the idea of CouchDB, but not its implementation - its setup is difficult on Windows. I find Persevere much easier to work with, but haven't put it to any production-level work yet.

Now, why JSON and not an object persistence framework?

  1. My software relies on dynamic data structures, some of which can be modified at runtime, so strongly typed entities don't fit. JSON is elastic and can hold any data structure without hardcoding anything

  2. Having a single text document with all the data allows me to do versioning easily

  3. JSON is very portable and easy to work with in all application layers

  4. A RESTful interface has many advantages over .NET Remoting or web services or whatever communication protocol the database has

  5. Does your Lounge Repository allow for searching or indexing the data?

R

09/04/2009 05:29 PM by Ralf Westphal

@Rafal: True, CouchDB is not easy to install on Windows.

As for your points:

  1. The Lounge Repo offers a strongly typed API - but right under the surface it's working with tuple hierarchies. The first thing it does when storing a strongly typed entity is "normalize" it into a tuple.

JSON, on the other hand, is not needed, I'd say, if you want elasticity.

  2. The Lounge Repo stores entities in single files. Versioning is as easy as with other "databases" doing the same. Whether these files are text files or binary is just a matter of the internal serializer for normalized objects.

But keep in mind: During normalization entity hierarchies are "shredded", so that each entity can be saved in its own file.

  3. JSON is portable. True. But portability is some kind of optimization for many apps. I don't need it, and none of my clients need it - so far.

But in case you want it, JSON can easily be built in: just replace the serializer and serialize to JSON or XML or whatever.

  4. REST is a consideration for distributed scenarios. So far The Lounge Repo does not support client server architectures. But once it does it sure will sport a RESTful API - besides other means.

  5. The Lounge Repo currently is an in-mem database. All entities can be queried using LINQ. I'm waiting for someone to complain this is too slow. Until then I'm staying away from this kind of optimization. It doesn't add to the general programming model, which is what I want to explore for business apps.

-Ralf

09/15/2009 07:23 PM by alex p

the big thing isn't the storage of individual documents, it's storage and computation of "views". the lock-free map/reduce approach of couchdb has been the main selling point for me - aside from a few up-front design headaches, i can easily build out a repository that'll partition with no consideration of the data structure. that's a big deal.

09/15/2009 07:38 PM by Ayende Rahien

Alex,

That is actually easy to do. Look at the source, it is already there.

Comments have been closed on this topic.