Designing a document database: What next?

Mar 17 2009

So far I have posted quite a few posts about building the document database. To be frank, the reason I did this is that the idea has been bouncing around in my head a lot recently, and sitting down and actually thinking about it has been great, especially since I now have the design dancing in my head, shiny & beautiful. Here is the full list, in case you missed anything:

A few days ago I asked on Twitter what people think: do I have this written up yet or not? Opinions seem to be divided on this score. Let me try to set the record straight. I have a lot of scattered code around this, yes. But it is not a project, it is a lot of tiny experiments to prove that one approach or the other would work. This series of posts has required a lot of research. But I don’t have anything that is even remotely close to a working system.

I am estimating that it would take a month or two to take this from the drawing board to something that I would be willing to use in production*. This is full-time work, by the way. It is likely that I can get something usable faster than that, depending on your definition of usable :-). Most of the challenge is going to be in implementing the views, as I see it now. Everything else seems to be pretty straightforward.

That is somewhat of a problem. I don’t really want to spend several months (and the associated support costs afterward) building an open source project. The main issue is that while it is fun, there is simply no money in it, and I hear that eating is mandatory. On the other hand, I don’t really see something like that selling as a commercial package. This is infrastructure, and infrastructure has been commoditized. The ideal solution from my point of view is what we tried to do with Linq to NHibernate: getting a company, or several companies, to sponsor its development as an OSS project.

The motivation would be the same as usual: this is something that the aforementioned companies need and are willing to pay for. It didn’t end up the way I expected with Linq to NHibernate, but it ended up very well after all, so I am happy about that.

Oh, and as an aside, if you want more posts in this series, do suggest a few topics that you want to hear about.

* Just to give you an idea about the complexity involved, I estimated Linq to NHibernate to be about 3 months.


Tags:

Databases

Comments

17 Mar 2009
07:05 AM

Simon Segal

Ayende

What about extending this into queuing? What would be involved in turning the idea on its head a little and using a document to describe a message? What about the API that would be required: would you simply make the messaging work within the confines of the REST API that you proposed? How about making it work in a sandboxed technology such as Silverlight? Silverlight currently makes the Store and Forward pattern very difficult. I am curious.

17 Mar 2009
07:19 AM

Rafal

Ayende, there is one question that wasn't asked before: why did you want to build Couch DB at all, since the original is already built and is free? This eliminates the need to pay for a document database. In your posts I didn't see any feature comparison of Couch DB and your project; it would probably help to state something like 'my database will have a better implementation of X and Y and will provide extra features like Z and ...'

How to make money on such a project? Incorporate it into some business application, like a document management system.

BTW, I spent half of last night updating an application on a customer's servers. I needed to change a table structure; unfortunately the table has millions of records and the update took an hour and a half. My service window was exactly 60 minutes, so I did not make it on time and had to ask for extra minutes, making the customer very anxious. And it was just adding several fields. What would happen if the operation failed and had to be repeated? Probably a catastrophe, and the next updates will only be worse. I regret that the application isn't built on a schema-less database.

17 Mar 2009
09:26 AM

Erik

I have to agree with Rafal, instead of building "CouchDB.Net" maybe we could build a new "Hibernate" or Linq layer for existing schema-less databases?

Of course it depends on what we intend to do with our document DB...

17 Mar 2009
10:00 AM

Ayende Rahien

Simon,

I am not sure that I understand what you mean.

I don't see how queuing is related to Doc DB.

17 Mar 2009
10:04 AM

Ayende Rahien

Rafal,

a) Couch DB is not supported on Windows. There is some lengthy process you can go through to get it working, but it is not supported there, and there are several things that it does that make it not work nicely.

b) Couch DB is running on Erlang. In most environments, it is... hard to get a new platform in. .Net is already acceptable for most, and that makes it much easier to adopt.

17 Mar 2009
10:06 AM

Ayende Rahien

Erik,

ORM == Object Relational Mapping

If there are no relations, there is no ORM.

There is absolutely no challenge in building a layer on top of the doc db.

Querying a doc db is not really feasible, that is why we have views.

17 Mar 2009
10:45 AM

Rafal

"Couch DB is not supported on Windows"

This reason is good enough for me. Please don't abandon your project :)

17 Mar 2009
11:30 AM

Simon Segal

Ayende

It's not related directly to DocDB. I know of a project where an API was written to use SQL Server's queues much in the same vein as MSMQ, but only to 'facilitate store and forward' at the point of publishing or sending a message. Transport in this scenario was handled by WCF, and the dequeuing of messages and their transport over the wire was wrapped in a transaction. Do you see any value in extending that idea to a doc DB? Do documents map nicely enough to messages, and with the transactional support, is this lightweight enough for it to be xcopy portable and a real alternative where MSMQ (or the like) is not going to be acceptable? So if you have durable storage with support for transactions (like the doc DB under discussion), could it fill the gap and help make something like Rhino.Queues durable?

17 Mar 2009
11:30 AM

Fernando Felman

Hmm, I might be really off track here, but isn't it all a re-implementation of Couch DB? I mean, yeah, I do get that Couch DB is not a feasible option for the Windows environment, fair enough, but why redesign it?

I would love to hear what the functional differences are between this and Couch DB, especially the reasoning behind them. I think this can provide some insights into what else Couch DB is lacking, or what kind of scenarios this new project is more suitable for.

Cheers,

17 Mar 2009
12:07 PM

Nuz

Hi,

What are your thoughts on using SharePoint as the document repository and using its API for some of the tasks?

Regards

17 Mar 2009
12:34 PM

eledu

Companies possibly interested in this project:

Microsoft

I don't see any others who may consider it. Worse if it is a Windows-only solution.

17 Mar 2009
13:33 PM

Uriel Katz

Isn't it easier to just build an installer for CouchDB on Windows to make it easier to install?

I understand the challenge and satisfaction of building something like that (building a database myself, but for really different needs), but there is already a really good product like CouchDB, written in a really good language for the task (except for the IO part, if I am not wrong), so why reinvent the wheel?

Having said that, for educational purposes, teaching people who aren't familiar with functional programming how to build something like that may be a good reason.

17 Mar 2009
16:35 PM

Ayende Rahien

Simon,

The main reason that Rhino Queues was such a pain to write is the persistence format.

Right now, I have a very easy solution for persistent data, Esent, so I don't think it would be much of a challenge.

About using the doc db for this, you _could_, I just see no reason that you would want to do that.

17 Mar 2009
19:01 PM

Travis

Did a company or someone step up for Linq to NHibernate? What was the outcome?

18 Mar 2009
08:47 AM

Paul Batum

Travis,

iMeta have provided a full time developer for 3 months. See here:

groups.google.com.ar/.../5111835e99d9a8e8?hl=en

18 Mar 2009
11:11 AM

Vadi

I see one problem though, which is using Esent. It's the unawareness of existence and reliability. How could I just use my own preferred database server? E.g., I could be interested in using SQL Server or Oracle, which solve a lot of other problems like clustering, replication, etc.

Do you plan to start this as an OSS project so that we can contribute one or two?

20 Mar 2009
10:58 AM

Ayende Rahien

Uriel,

Maybe, but I lack the skills to do so. In addition to that, putting Erlang in the enterprise is not something that goes quickly or easily.

I am running into problems just putting MSMQ into place, because it is an unfamiliar tech to the sys admins. Putting in a totally new platform meets high resistance.

20 Mar 2009
11:15 AM

Ayende Rahien

Nuz,

Not if you put me over hot coals and made me watch all the Drag & Drop demos in MSDN.

20 Mar 2009
11:18 AM

Ayende Rahien

Fernando,

In concept, they are very similar.

From a design perspective, there are significant changes all around because of different design constraints regarding the implementation.

20 Mar 2009
11:22 AM

Ayende Rahien

Vadi,

What problem do you have with Esent?

What do you mean by unawareness of existence and reliability?

Oren Eini

CEO of RavenDB
