I described what a document database is, but I haven’t touched about why you would want to use it.
The major benefit, of course, is that you are dealing with documents. There is little or no impedance mismatch between DTOs and documents. That means that storing data in the document database is usually significantly easier than when using an RDBMS for most non trivial scenarios.
It is usually quite painful to design a good physical data model for an RDBMS, because the way the data is laid out in the database and the way that we think about it in our application are drastically different. Moreover, RDBMS has this little thing called Schemas. And modifying a schema can be a painful thing indeed.
Sidebar: One of the most common problems that I find when reviewing a project is that the first step (or one of them) was to build the Entity Relations Diagram, thereby sinking a large time/effort commitment into it before the project really starts and real world usage tells us what we actually need.
The schemaless nature of a document database means that we don’t have to worry about the shape of the data we are using, we can just serialize things into and out of the database. It helps that the commonly used format (JSON) is both human readable and easily managed by tools.
A document database doesn’t support relations, which means that each document is independent. That makes it much easier to shard the database than it would be in a relational database, because we don’t need to either store all relations on the same shard or support distributed joins.
Finally, I like to think about document databases as a natural candidate for DDD and DDDish (DDD-like?) applications. When using a relational database, we are instructed to think in terms of Aggregates and always go through an aggregate. The problem with that is that it tends to produce very bad performance in many instances, as we need to traverse the aggregate associations, or specialized knowledge in each context. With a document database, aggregates are quite natural, and highly performant, they are just the same document, after all.
I’ll post more about this issue tomorrow.
More posts in "That No SQL Thing" series:
- (03 Jun 2010) Video
- (14 May 2010) Column (Family) Databases
- (09 May 2010) Why do I need that again?
- (07 May 2010) Scaling Graph Databases
- (06 May 2010) Graph databases
- (22 Apr 2010) Document Database Migrations
- (21 Apr 2010) Modeling Documents in a Document Database
- (20 Apr 2010) The relational modeling anti pattern in document databases
- (19 Apr 2010) Document Databases – usages