RavenDB in comparison to CouchDB

I ran into this question, and I thought it was an important enough topic to put on the blog as well.

A cursory dig suggests CouchDB is more mature, with a larger community to support it. That aside, what do you consider to be the significant differences?

RavenDB was heavily inspired by CouchDB. But when I sat down to build it, I tried to find all the places where you would have friction in using CouchDB and eliminate them, as well as to build a product that would be a natural fit for the .NET ecosystem. That isn't just being able to run easily on Windows, btw. It is about being a product that fits the thought processes, requirements and environment in which it is used.

Here are some of the things that distinguish RavenDB:

  • Transactions - support for single document, document batch, multi request and multi node transactions, including support for DTC. To my knowledge, CouchDB supports transactions only on a single document.
  • Patching - you can perform a PATCH op against a document, instead of having to send the entire document to the server.
  • Set based operations - basically, a way to do things like: "update active = false where last_login < '2010-10-01'"
  • Deployment options - can run embedded, as a separate executable, as a Windows service, in IIS, or on Windows Azure.
  • Client API - comes with a client API for .NET that is very mature. Supports things like unit of work, change tracking, etc.
  • Safe by default - both the server and the client have builtin limits (overridable) that prevent you from doing things that will kill your app.
  • Queries - Support the following querying options:
    • Indexes - similar to couch's views. Defined by specifying a linq query.
    • Just do a search - doesn't have to have an index. RavenDB will analyze the query and create a temporary index for you. However, unlike couch temp views, this is meant for production use, and those temp indexes will automatically become permanent ones based on usage. Note that you don't have to define anything, just issue the actual query: Name:ayende will give you back the correct result. Tags,Name:raven will also do the same, including when you have to deal with extracting information directly from the composite docs.
    • Run a linq query - this is similar to the way temp views work. It is an O(n) operation, but it allows you to do whatever you want with the full power of linq. (For the non .NET guys, it allows you to run a SQL query against the data store.) Mostly meant for testing.
  • Index backing store - Raven puts the index information in Lucene, which means we get full text searching OOTB. We can also do spatial queries OOTB.
  • Searching - It is very easy to say "index users by first name and last name", then search for them by either one. (As I understand it, I would have to define two separate views in couch for this).
  • Scaling - Raven comes with replication builtin, including master/master. Sharding is natively supported by the client API and requires you to simply define your sharding strategy.
  • Authorization - Raven has an auth system that allows defining queries based on user / role on document, set of documents (based on the doc data) and globally. You can define something like: "Only Senior Engineers can Support Strategic Clients"
  • Triggers - Raven gives you the option to register triggers that will run on document PUT/READ/DELETE/INDEX
  • Extensibility - Raven is intended to be customized by the user for a typical deployment. That means that you would typically have some sort of customization, such as triggers or additional operations that the DB can do.
  • Includes & Live projections - Let us say that we have the following set of documents: { "name": "ayende", "partner": "docs/123" }, { "name": "arava" }
    • Includes means that you can load the "ayende" document, while asking RavenDB to load the document referred to by the partner property. That means that you have only a single request to make, vs. 2 of them without this feature.
    • Live projections means that we can ask for the document's name and the partner's name, effectively joining the two together.
    • Those two features will only work on local data, obviously.
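
A few of the items in this list (patching, set based operations, includes and live projections) are easier to see in code than in prose. What follows is only a toy in-memory model written in Python, not RavenDB's API: the ToyStore class, its method names, the "docs/1" id and the active / last_login / tags fields are all made up for illustration. The requests counter stands in for round trips to the server.

```python
from copy import deepcopy


class ToyStore:
    """Hypothetical in-memory stand-in for a document database (not RavenDB's API)."""

    def __init__(self):
        self.docs = {}
        self.requests = 0  # counts simulated round trips to the "server"

    def put(self, doc_id, doc):
        self.requests += 1
        self.docs[doc_id] = deepcopy(doc)

    def patch(self, doc_id, changes):
        # Patching: send only the changed fields, not the entire document.
        self.requests += 1
        self.docs[doc_id].update(changes)

    def update_where(self, predicate, changes):
        # Set based operation: one request updates every matching document.
        self.requests += 1
        matched = [d for d in self.docs.values() if predicate(d)]
        for doc in matched:
            doc.update(changes)
        return len(matched)

    def load(self, doc_id, include=None):
        # Includes: return the document and the one its `include` property
        # refers to, in a single request.
        self.requests += 1
        doc = deepcopy(self.docs[doc_id])
        included = {}
        ref = doc.get(include) if include else None
        if ref in self.docs:
            included[ref] = deepcopy(self.docs[ref])
        return doc, included

    def project(self, doc_id, ref_field):
        # Live projection: join the document with the referenced one on the
        # "server" and return only the requested fields.
        self.requests += 1
        doc = self.docs[doc_id]
        partner = self.docs.get(doc.get(ref_field), {})
        return {"name": doc["name"], "partner_name": partner.get("name")}


store = ToyStore()
# The two documents from the example; the extra fields are assumptions.
store.put("docs/1", {"name": "ayende", "partner": "docs/123",
                     "active": True, "last_login": "2010-09-15"})
store.put("docs/123", {"name": "arava",
                       "active": True, "last_login": "2010-11-02"})

# Patch a single field instead of re-sending the document.
store.patch("docs/1", {"tags": ["raven"]})

# "update active = false where last_login < '2010-10-01'"
changed = store.update_where(
    lambda d: d["last_login"] < "2010-10-01", {"active": False})

# One request brings back both the document and its partner.
before = store.requests
doc, included = store.load("docs/1", include="partner")
requests_for_include = store.requests - before
partner_name = included[doc["partner"]]["name"]

projection = store.project("docs/1", "partner")
```

Without the include, getting the partner would take a second load call, which is exactly the extra request the feature is meant to save.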

And I am probably forgetting some stuff, to be honest. Oh, and I am naturally the most unbiased of observers :-)