Rhino Divan DB – Consideration
I have been thinking about Divan DB for a long time now, and tonight I decided to give it some love and finally see exactly how hard it is going to be to write.
It turns out that it isn’t really hard at all. You can see my progress here (http://github.com/ayende/rhino-divandb/tree/master). It now supports adding and retrieving documents, defining views, view caching, and (very simple) view application.
So you can now push a bunch of views and documents to the database and get the result back. There is a lot more to do (for example, handling edits with the views), but it seems to be fairly straightforward so far.
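To make the feature list concrete, here is a minimal sketch of the same four ideas (adding and retrieving documents, defining views, caching view results, and applying a view). It is written in Python rather than the project's C#, and every name in it (`TinyDocumentStore`, `define_view`, `query`, `posts_by_author`) is a hypothetical stand-in, not Divan DB's actual API. Storage is a plain dictionary instead of Esent.

```python
import json
from typing import Any, Callable, Dict, Iterable, List

Document = Dict[str, Any]
# A view maps one document to zero or more result rows.
ViewFn = Callable[[Document], Iterable[Any]]


class TinyDocumentStore:
    """Hypothetical in-memory stand-in; not Divan DB's real API."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}         # document id -> JSON text
        self._views: Dict[str, ViewFn] = {}     # view name -> view function
        self._cache: Dict[str, List[Any]] = {}  # view name -> cached rows

    def add(self, doc_id: str, doc: Document) -> None:
        self._docs[doc_id] = json.dumps(doc)
        # Simplest possible invalidation: any write drops all cached views.
        self._cache.clear()

    def get(self, doc_id: str) -> Document:
        return json.loads(self._docs[doc_id])

    def define_view(self, name: str, fn: ViewFn) -> None:
        self._views[name] = fn
        self._cache.pop(name, None)

    def query(self, name: str) -> List[Any]:
        # Apply the view to every document once, then serve from the cache.
        if name not in self._cache:
            fn = self._views[name]
            rows: List[Any] = []
            for raw in self._docs.values():
                rows.extend(fn(json.loads(raw)))
            self._cache[name] = rows
        return self._cache[name]


def posts_by_author(doc: Document) -> Iterable[Any]:
    # Illustrative view: emit (author, title) for documents of type "post".
    if doc.get("type") == "post":
        yield (doc["author"], doc["title"])


store = TinyDocumentStore()
store.add("1", {"type": "post", "author": "ayende", "title": "Divan DB"})
store.add("2", {"type": "comment", "author": "rafal", "text": "Cool"})
store.define_view("posts_by_author", posts_by_author)
print(store.query("posts_by_author"))
```

Clearing the whole cache on every write is the crudest way to keep views correct; updating a cached view when a document is edited is exactly the part described above as still to do.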
Comments
What is the target of the project?
Cool, is it the first document database for .NET?
Anyway, I have been checking what options are out there and found many implementations for Java; the Persevere project especially caught my attention. All these JSON databases benefit from the fact that there are JavaScript engines embedded in them, so advanced server-side document processing is possible. And there's no JavaScript engine for .NET, Microsoft knows why... Do they think the best language for handling JSON is Python or C#?
Well, that's interesting. In some ways SharePoint could be understood as a document database, but this one is quite a different thing :)
Here are a couple of questions, if you don't mind:
Is this meant to be used e.g. as infrastructure to your services which handle view reads / creations / etc. or had you in mind adding some kind of client/server support that would allow clients to connect to some divan instance?
Is Esent the only DataStorage one can consider or can you think of other stores?
Fabio,
Right now?
It has been bouncing around in my head for months; I got tired of having it there.
Rafal,
If you take a look at the source code, you'll see that I made the decision not to rely on JavaScript for the actual processing.
The view language is actually C#, and we are using Linq to handle that, which brings a lot of advantages to the game.
Frank,
This is the library itself, there will probably be a server component if I get interested enough to do it.
As for Esent, there could be other storage engines; it is just the simplest one to use.
Ah ah ah... nice name for a competitor of CouchDB. But just a question: in Italian we translate couch as "divano"... did you call it "divan" from Italian?
Marco, that's because "Divan" is "Couch" in Russian))
And in Polish it's a carpet. Don't know why.
I'd rather look for the project name's origins in Arabic or Hebrew.
BTW can anyone explain why 'Couch' is a good name for a database?
@Rafal: There are other "document databases" for .NET. I gave the topic a try with The Lounge Repository. You can find the source code at CodePlex: http://loungerepo.codeplex.com/
But it's not a document database; it's an entity database. You throw a graph of entity objects (with value objects mixed in) at it and it persists them to the file system, with a separate file for each entity. This is different from document databases, which take an object graph as a whole and persist it. With The Lounge Repository you can later on retrieve any entity directly.
A pragmatic limitation so far: It supports queries only for cached entities. (All entities are cached by default.) So you could call it an in-mem database with persistence. (But not snapshot persistence.)
There is another "document database" called StupidDB. See this German blog article for a short explanation: blog.aztec-project.org/.../stupiddb-object-pers...
Or the .NET source code at: stupiddb.svn.sourceforge.net/.../trunk/
-Ralf
Rafal,
It is a "restful" database
@Ralf
Heh, StupidDB, this really would make my sales rock :)
By document database I meant something like CouchDB, that is, a system able to store JSON documents with search/index capabilities, a REST interface and versioning support. So I wasn't checking various object persistence frameworks like Lounge Repository; it's not exactly what I need.
@Ayende
'Restful' explains everything. But is Divan restful?
@Rafal: Why is the notion of "document databases" so attractive? What's so cool about JSON? Sure, if you transmit "objects" JSON encoded that's pretty cross-platform. But otherwise...
At first glance I liked CouchDB quite a bit - and still like it. However, thinking in terms of "documents" I find limiting in many scenarios. A document is a closed container for data. That's great for a text. But a customer, an invoice, a user, a product, a bid, an order, a payment... that's all data which benefits from being interconnected with other data. It should not be penned up in a container.
That's why I developed The Lounge Repository - and view it as a generalization of "document databases": if you like, define a document to be an object graph with just one entity as the root. You can stuff any number of value objects into it. They'll all be persisted within the entity's "document".
Once this doesn't fit your bill anymore, though, you can link your "document" to others and be sure each will be persisted separately. But then I wouldn't call them documents anymore, but entities.
Where does JSON enter the picture? Currently The Lounge Repository is just an embedded "database". There is no need for JSON. But I could easily replace the binary serializer with a JSON serializer.
Once The Lounge Repository is made a server "database", though, JSON earns more consideration. Less for storage, but more as a transport encoding.
-Ralf
@Ralf
I like the idea of CouchDB, but not its implementation - its setup is difficult on Windows. I find Persevere much easier to work with, but haven't put it to any production-level work yet.
Now, why JSON and not an object persistence framework?
My software relies on dynamic data structures, some of them can be modified at runtime so strongly typed entities don't fit. JSON is elastic and can hold any data structures without hardcoding anything
Having a single text document with all the data allows me to do versioning easily
JSON is very portable and easy to work with in all application layers
RESTful interface has many advantages over .Net remoting or web services or whatever communication protocol the database has
Does your Lounge Repository allow for searching or indexing the data?
R
@Rafal: True, CouchDB is not easy to install on Windows.
As for your points:
JSON, on the other hand, I'd say, is not needed if you want elasticity.
But keep in mind: during normalization, entity hierarchies are "shredded", so that each entity can be saved in its own file.
But in case you want it, JSON can easily be built in: just replace the serializer and serialize to JSON or XML or whatever.
REST is a consideration for distributed scenarios. So far The Lounge Repo does not support client server architectures. But once it does it sure will sport a RESTful API - besides other means.
The Lounge Repo currently is an in-mem database. All entities can be queried using Linq. I'm waiting for someone to complain this is too slow. Until then I'm staying away from that kind of optimization. It doesn't add to the general programming model, which is what I want to explore for business apps.
-Ralf
the big thing isn't the storage of individual documents, it's storage and computation of "views". the lock-free map/reduce approach of couchdb has been the main selling point for me - aside from a few up-front design headaches, i can easily build out a repository that'll partition with no consideration of the data structure. that's a big deal.
Alex,
That is actually easy to do. Look at the source, it is already there.
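For readers following the map/reduce point, here is a rough Python sketch of why such views partition without knowing the data's structure. It is an illustration only, not CouchDB's or Divan DB's implementation; the function names and the sample "order" documents are made up. The one real requirement it shows is that the reduce step must give the same answer when it is run again over partial results (here it is a sum, which does).

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

Document = Dict[str, Any]


def map_doc(doc: Document) -> Iterable[Tuple[str, int]]:
    # Map step: emit (key, value) pairs for a single document.
    if doc.get("type") == "order":
        yield doc["customer"], doc["total"]


def reduce_values(values: List[int]) -> int:
    # Reduce step: must be safe to re-apply to its own partial results.
    return sum(values)


def view_for_partition(docs: Iterable[Document]) -> Dict[str, int]:
    # Each partition computes its view independently of the others.
    grouped: Dict[str, List[int]] = defaultdict(list)
    for doc in docs:
        for key, value in map_doc(doc):
            grouped[key].append(value)
    return {key: reduce_values(vals) for key, vals in grouped.items()}


def merge_partitions(partials: Iterable[Dict[str, int]]) -> Dict[str, int]:
    # Combine per-partition results by reducing the partial values again.
    grouped: Dict[str, List[int]] = defaultdict(list)
    for partial in partials:
        for key, value in partial.items():
            grouped[key].append(value)
    return {key: reduce_values(vals) for key, vals in grouped.items()}


partition_a = [
    {"type": "order", "customer": "ann", "total": 10},
    {"type": "order", "customer": "bob", "total": 5},
]
partition_b = [
    {"type": "order", "customer": "ann", "total": 7},
    {"type": "note", "text": "not an order"},
]

result = merge_partitions(
    [view_for_partition(partition_a), view_for_partition(partition_b)]
)
print(result)
```

Because each document is mapped on its own and the reduce can be re-applied to partial results, it does not matter which partition a document lands in.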