That No SQL Thing: Document Databases – usages


Apr 19 2010


I described what a document database is, but I haven’t touched on why you would want to use one.

The major benefit, of course, is that you are dealing with documents. There is little or no impedance mismatch between DTOs and documents. That means that storing data in a document database is usually significantly easier than storing it in an RDBMS for most non-trivial scenarios.

It is usually quite painful to design a good physical data model for an RDBMS, because the way the data is laid out in the database and the way that we think about it in our application are drastically different. Moreover, an RDBMS has this little thing called a schema, and modifying a schema can be a painful thing indeed.

Sidebar: One of the most common problems that I find when reviewing a project is that the first step (or one of them) was to build the Entity Relations Diagram, thereby sinking a large time/effort commitment into it before the project really starts and real world usage tells us what we actually need.

The schemaless nature of a document database means that we don’t have to worry about the shape of the data we are using; we can just serialize things into and out of the database. It helps that the commonly used format (JSON) is both human readable and easily managed by tools.
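
As a minimal sketch of what "just serialize things" can look like, the snippet below round-trips two differently shaped documents through JSON. The in-memory `store` and the `put` and `get` helpers are hypothetical stand-ins for a database, not any particular product's API.

```javascript
// Hypothetical in-memory stand-in for a document database: ids map to JSON text.
const store = new Map();

// Serialize any object to JSON and save it under the given id.
function put(id, doc) {
  store.set(id, JSON.stringify(doc));
}

// Load the JSON text for an id and deserialize it back into an object.
function get(id) {
  const json = store.get(id);
  return json === undefined ? undefined : JSON.parse(json);
}

// Two documents with different shapes; storing the second needs no schema change.
put('users/1', { name: 'ayende' });
put('users/2', { name: 'someone', tags: ['admin'], address: { city: 'Example City' } });
```

Nothing here had to be declared up front: the second document simply carries extra fields that the first one lacks.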

A document database doesn’t support relations, which means that each document is independent. That makes it much easier to shard the database than it would be in a relational database, because we don’t need to either store all relations on the same shard or support distributed joins.
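
One way to picture this is a sketch in which the shard is chosen from the document id alone. The shard names and the hash function are assumptions made for illustration; real document databases use their own placement strategies.

```javascript
// Hypothetical shard names; a real deployment would list actual servers.
const shards = ['shard-a', 'shard-b', 'shard-c'];

// Simple deterministic string hash (a djb2-style variant), used only for illustration.
function hashKey(key) {
  let hash = 5381;
  for (let i = 0; i < key.length; i++) {
    // Mix in each character and keep the value an unsigned 32-bit integer.
    hash = ((hash * 33) ^ key.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Pick a shard from the document id alone; no other document has to be consulted.
function shardFor(documentId) {
  return shards[hashKey(documentId) % shards.length];
}
```

Because a document has no relations that must live beside it, `shardFor` needs nothing but the id, and reading the document back never requires a join across shards.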

Finally, I like to think about document databases as a natural candidate for DDD and DDDish (DDD-like?) applications. In DDD, we are instructed to think in terms of Aggregates and to always go through an aggregate. The problem is that, with a relational database, this tends to produce very bad performance in many instances, because we either need to traverse the aggregate's associations or rely on specialized knowledge in each context. With a document database, aggregates are quite natural and highly performant; they are just the same document, after all.
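
To make that concrete, here is a small sketch of an aggregate stored as a single document. The `Order` shape, the ids, and the in-memory `documents` map are all hypothetical and stand in for a real database.

```javascript
// Hypothetical Order aggregate stored as one document: the root and its lines travel together.
const orderDocument = {
  id: 'orders/1',
  customer: { id: 'customers/7', name: 'Example Customer' },
  lines: [
    { product: 'products/3', quantity: 2, unitPrice: 10 },
    { product: 'products/9', quantity: 1, unitPrice: 5 },
  ],
};

// Stand-in for the database: one lookup by id returns the entire aggregate.
const documents = new Map([[orderDocument.id, JSON.stringify(orderDocument)]]);

function loadOrder(id) {
  return JSON.parse(documents.get(id));
}

// Work on the aggregate without any further queries for its child lines.
function orderTotal(order) {
  return order.lines.reduce((sum, line) => sum + line.quantity * line.unitPrice, 0);
}
```

Loading `orders/1` is a single read, and everything needed to compute the total is already in hand, where a relational model would typically fetch the order row and its line rows separately or join them.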

I’ll post more about this issue tomorrow.


Comments

19 Apr 2010
21:53 PM

Ayende Rahien

Ralf,

We actually have a pretty strong client side API.

I would love to get your comments on it.

19 Apr 2010
22:30 PM

Mohammad Azam

I think one of the other advantages of using a NoSQL database is that the data can be consumed by any framework, since it is kept in the form of documents.

This means you can put data in using the .NET Framework and get the same data out using a Ruby or Python framework.

19 Apr 2010
23:06 PM

Michael J. Ryan

@Mohammad, so long as the framework you are using doesn't go too abstract. I find that interacting with a single system from different frameworks isn't always so easy. Example: using memcached from .NET, Java, and PHP. Depending on how the key hashes are generated, you can get very different results.

20 Apr 2010
01:24 AM

So how do you handle the case where you want to change the layout of a particular type of document? e.g. to support a new feature. Do you just have a process that upgrades all the documents in one go? Or do you end up with lots of conditionals in your deserialization code?

20 Apr 2010
01:53 AM

Ayende Rahien

MF,

You'll have your answer in 3 days.

In short, you never have conditionals in deserialization code if you do it right.

20 Apr 2010
04:55 AM

Steve

Good point Mohammad - since it's stored as JSON, it shouldn't much matter what the client is.

Ayende, would your client api include calls from , ie. a javascript call ?

I look forward to hearing more on the topic

20 Apr 2010
05:16 AM

Hendry Luk

Timing can't be better!

I've been exploring document-db lately, less the technology itself and more the usage patterns: the best way to benefit from it, and how to apply things that we have taken for granted in the rdbms-orm duet (e.g. transactions, n-level cache, lazy-load, join-load, stuff like that).

Looking forward to your next posts

20 Apr 2010
05:32 AM

Hendry Luk

Ah btw... anonymous classes in C# 3 and the new dynamic feature in C# 4 are the things that have made document-db a natural fit for .net applications... I mean, amazingly natural!

It is inevitable that document-db will now start gaining traction with the current state of .net language capability.

20 Apr 2010
06:11 AM

Rafal

Aggregates are quite natural for document databases? How? They are not supported by the db engine - if you want an aggregate you have to do everything yourself. In this way of thinking also statistics and reporting are 'natural' for document databases - provided that you bring in the missing data processing functionality.

20 Apr 2010
06:24 AM

Hendry Luk

Mongo does have support for basic aggregate functions...

Complex aggregate operations can cause performance problems with huge data sets; they're normally solved using map/reduce across distributed processing power.

20 Apr 2010
07:38 AM

Frank Quednau

As to working with aggregates, shouldn't an object database work equally well in this respect?

20 Apr 2010
08:42 AM

Ayende Rahien

Steve,

Well, using jQuery's API, here is how you insert a document:

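$.ajax({
    method: 'PUT', // jQuery versions before 1.9 use `type` instead of `method`
    url: 'http://localhost:8080/docs/users/',
    contentType: 'application/json', // send the body as JSON rather than form-encoded
    data: JSON.stringify({ name: 'ayende' }), // the document to store
    dataType: 'json' // expect a JSON response
});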

The Web UI for Raven is composed solely of calls like this one.

And yes, there is even a wrapper around that to give you things like EditDocument, GetDocumentsPage, etc.

20 Apr 2010
08:43 AM

Ayende Rahien

Rafal,

a) I was talking about DDD Aggregates, not Aggregation in general.

b) Most document databases have some support for aggregation. They call it map/reduce, but it is the same thing.

20 Apr 2010
08:45 AM

Ayende Rahien

Frank,

Probably. I am not sure how you set things up in an object database to control the scope of storage to reduce the number of remote calls.

And as I am not an expert on OODB, I don't really know.

20 Apr 2010
13:30 PM

Ayende Rahien

Wayne,

You might want to read the previous posts in the series, I am laying out a lot of information about how and why you want to use this.

20 Apr 2010
14:33 PM

Wayne

I have just spent the day telling the CIO why a NoSql/Document database would be a bad place to store production reporting data 30000+ records per hour.

He has read some blog posts saying that this is the solution to all storage needs and that NoSQL has reduced or no maintenance costs. So everything has to be converted.

20 Apr 2010
15:00 PM

Ayende Rahien

Wayne,

At a rate of 30,000 new records an hour, after a year you'll have 262,800,000 records.

I am not sure what you intend to do with them, but assuming that each row is 128 bytes in size (guid, couple of dates, an ip, maybe a url), you will have about 30 GB of data per year. I wouldn't worry about that.

As for whether a NoSQL solution would be good or not, that is impossible to say without more data :-)
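
The arithmetic in this reply can be checked with a short sketch. The 128-byte record size is the assumption stated in the comment, not a measured value, and the variable names are made up for illustration.

```javascript
// Rate quoted in the thread.
const recordsPerHour = 30000;

// 24 hours a day, 365 days a year.
const recordsPerYear = recordsPerHour * 24 * 365; // 262,800,000

// Assumed record size from the comment (guid, a couple of dates, an ip, maybe a url).
const bytesPerRecord = 128;

const bytesPerYear = recordsPerYear * bytesPerRecord; // 33,638,400,000
const gigabytesPerYear = bytesPerYear / 1e9;          // about 33.6 GB (decimal units)
const gibibytesPerYear = bytesPerYear / 2 ** 30;      // about 31.3 GiB (binary units)
```

Either way of counting lands close to the "about 30 GB" quoted above.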

26 Apr 2010
14:58 PM

sebastien

Can the xml type of SQL Server, with the help of XML indexes and XQuery, be called document-oriented storage?

26 Apr 2010
15:08 PM

Ayende Rahien

Sebastien,

It might, but it probably wouldn't do what you want.


Oren Eini


CEO of RavenDB
