Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.

Modeling hierarchical structures in RavenDB


The question pops up frequently enough and is interesting enough for a post. How do you store a data structure like this in Raven?

The problem is that the question alone doesn't give us enough information to answer it. When we think about how to model the data, we also need to consider how it is going to be accessed. In more precise terms, we need to define what the aggregate root of the data in question is.

Let us take the following two examples:

[Two images: the Person model and the Question model]

As you can imagine, a Person is an aggregate root. It can stand on its own. I would typically store a Person in Raven using one of two approaches:

Bare references:

    {
      "Name": "Ayende",
      "Email": "Ayende@ayende.com",
      "Parent": "people/18",
      "Children": [
            "people/59",
            "people/29"
      ]
    }

Denormalized references:

    {
      "Name": "Ayende",
      "Email": "Ayende@ayende.com",
      "Parent": { "Name": "Oren", "Id": "people/18"},
      "Children": [
            { "Name": "Raven", "Id": "people/59"},
            { "Name": "Rhino", "Id": "people/29"}
      ]
    }

The first option is bare references, which hold just the id of the associated document. This is useful if I only rarely need the referenced data. If, however, I also need to show some data from the associated documents (as is common), it is generally better to use denormalized references, which keep the data that we need from the associated document embedded inside the aggregate.
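To make the trade-off concrete, here is a minimal Python sketch. It is not RavenDB client code: documents are plain dictionaries, `store` is a hypothetical in-memory stand-in for the database, and the child ids in the bare document are assumed to match the ones in the denormalized example. It counts how many extra documents have to be loaded just to show the names of a person's relatives.

```python
# In-memory stand-in for a document store: id -> document (hypothetical).
store = {
    "people/18": {"Name": "Oren"},
    "people/59": {"Name": "Raven"},
    "people/29": {"Name": "Rhino"},
}

bare = {
    "Name": "Ayende",
    "Parent": "people/18",
    "Children": ["people/59", "people/29"],
}

denormalized = {
    "Name": "Ayende",
    "Parent": {"Name": "Oren", "Id": "people/18"},
    "Children": [
        {"Name": "Raven", "Id": "people/59"},
        {"Name": "Rhino", "Id": "people/29"},
    ],
}


def relative_names_bare(doc, store):
    """Return (names, extra_loads): every related name costs one load."""
    ids = [doc["Parent"], *doc["Children"]]
    names = [store[doc_id]["Name"] for doc_id in ids]
    return names, len(ids)


def relative_names_denormalized(doc):
    """Return (names, extra_loads): the names are already in the document."""
    refs = [doc["Parent"], *doc["Children"]]
    return [ref["Name"] for ref in refs], 0


print(relative_names_bare(bare, store))
print(relative_names_denormalized(denormalized))
```

Both functions produce the same names. The difference is that the bare version has to touch three other documents, while the denormalized version touches none, at the price of keeping copies of the names.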

But the same approach wouldn’t work for Questions. In the Question model, we have utilized the same data structure to hold both the question and the answer. This sort of double utilization is pretty common, unfortunately. For example, you can see it being used in StackOverflow, where both Questions & Answers are stored as posts.

The problem from a design perspective is that in this case a Question is not an aggregate root in the same sense that a Person is. A Question is an aggregate root only if it is an actual question, not if it is a Question instance that holds the answer to another question. I would model this using:

    {
       "Content": "How to model relations in RavenDB?",
       "User": "users/1738",
       "Answers": [
          {"Content": "You can use.. ", "User": "users/92" },
          {"Content": "Or you might...", "User": "users/94" }
       ]
    }

In this case, we are embedding the children directly inside the root document.
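As a rough sketch of the difference between the shared "post" shape and the aggregate shape, the Python below folds a flat list of posts into one document per real question. The `Id` and `ParentId` fields and the `to_question_documents` helper are hypothetical, introduced only for this illustration.

```python
# Flat rows where questions and answers share one shape.
# "Id" and "ParentId" are hypothetical fields, for illustration only.
posts = [
    {"Id": 1, "ParentId": None,
     "Content": "How to model relations in RavenDB?", "User": "users/1738"},
    {"Id": 2, "ParentId": 1, "Content": "You can use.. ", "User": "users/92"},
    {"Id": 3, "ParentId": 1, "Content": "Or you might...", "User": "users/94"},
]


def to_question_documents(posts):
    """Build one document per real question, embedding its answers."""
    documents = {}
    # First pass: only posts without a parent become root documents.
    for post in posts:
        if post["ParentId"] is None:
            documents[post["Id"]] = {
                "Content": post["Content"],
                "User": post["User"],
                "Answers": [],
            }
    # Second pass: every other post is embedded inside its question.
    for post in posts:
        if post["ParentId"] is not None:
            documents[post["ParentId"]]["Answers"].append(
                {"Content": post["Content"], "User": post["User"]}
            )
    return list(documents.values())


questions = to_question_documents(posts)
print(len(questions), len(questions[0]["Answers"]))
```

Three posts become a single document, because only one of them can stand on its own.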

So I am afraid that the answer to that question is: it depends.


Nathan Stott

In CouchDB, you would not want to embed the answers to a question directly in the document because if two people answered the question at about the same time, or if you were using replication and they answered it between replication cycles, then you would get a 409 (conflict). If you add the answers as documents of their own, two people adding at the same time will not cause conflicts.

Would this scenario not be a problem with RavenDB? What about RavenDB makes the proper choice of strategy different?

Ayende Rahien


That is a good point. With regard to replication, Raven would be in the same situation as CouchDB, but Raven also supports the notion of partial updates, things like "Add this answer to the Answers array", which means that two concurrent updates can both succeed.
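As a rough illustration of why a partial update sidesteps the conflict, the following Python sketch models a server that applies an "append to array" operation against whatever its current copy of the document is. This is not the RavenDB patch API: `TinyServer`, `append_to_array`, and the id `questions/1` are names made up for this example.

```python
import copy


class TinyServer:
    """Hypothetical in-memory document server, for illustration only."""

    def __init__(self):
        self._docs = {}

    def put(self, doc_id, doc):
        # Whole-document write: replaces whatever was stored before.
        self._docs[doc_id] = copy.deepcopy(doc)

    def get(self, doc_id):
        return copy.deepcopy(self._docs[doc_id])

    def append_to_array(self, doc_id, field, value):
        # Partial update: the server modifies its *current* copy, so the
        # client never sends back a possibly stale version of the document.
        self._docs[doc_id][field].append(copy.deepcopy(value))


server = TinyServer()
server.put(
    "questions/1",
    {"Content": "How to model relations in RavenDB?", "Answers": []},
)

# Two clients answer at about the same time; neither reads the document first.
server.append_to_array(
    "questions/1", "Answers", {"Content": "You can use.. ", "User": "users/92"}
)
server.append_to_array(
    "questions/1", "Answers", {"Content": "Or you might...", "User": "users/94"}
)

answers = server.get("questions/1")["Answers"]
print([a["User"] for a in answers])
```

Because each client sends only the change and not the whole document, neither can overwrite the other's answer.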

Nathan Stott

How do the partial updates work? Does the app have to specify that it is doing a partial update or does Raven do this behind the scenes? Got a link handy?

Brian Vallelunga

I have a similar question to Nathan's. Given the StackOverflow model you presented, if two people answer the question at about the same time, won't you get conflicts storing the data to the db?

I can imagine the following scenario:

1) Person A answers question.

2) Get question document for Person A

3) Append answer A

4) Person B answers question.

5) Get question document for Person B

6) Save Person A's answer to DB.

7) Append answer B

8) Save Person B's answer to DB.

If we let the last-in win, Person A's answer is completely gone. I've actually avoided working on a part of my application that requires this sort of modeling because I haven't figured out what to do yet.

Obviously storing the answers as entities themselves would help, but we'd almost always want to access the data as one document in this situation. Can you expand on a strategy here?
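The eight steps above can be replayed in a few lines of Python. This is only a simulation under stated assumptions: `db`, `load`, and `save` are hypothetical in-memory stand-ins for whole-document reads and writes with last-write-wins semantics, not a real client.

```python
import copy

# Hypothetical in-memory "database": id -> document.
db = {
    "questions/1": {
        "Content": "How to model relations in RavenDB?",
        "Answers": [],
    }
}


def load(doc_id):
    return copy.deepcopy(db[doc_id])


def save(doc_id, doc):
    # Last write wins: the stored document is replaced wholesale.
    db[doc_id] = copy.deepcopy(doc)


doc_a = load("questions/1")                 # step 2
doc_a["Answers"].append({"User": "A"})      # step 3
doc_b = load("questions/1")                 # step 5 (still sees no answers)
save("questions/1", doc_a)                  # step 6
doc_b["Answers"].append({"User": "B"})      # step 7
save("questions/1", doc_b)                  # step 8 overwrites A's answer

surviving = [a["User"] for a in db["questions/1"]["Answers"]]
print(surviving)
```

Only Person B's answer survives, because B's copy of the document was read before A's save and then written back over it.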


Ayende Rahien


As I told Nathan, the answer for that is to use Raven's partial document update support, which would resolve the issue.

Jason Young


So... for a limitlessly recursive hierarchy (e.g. parent-child relationship), you want each element in its own document, but for depth-limited relationships (e.g. question-answer), you can put all the "children" in a collection in the "parent" document, and "children" need not have documents of their own, correct? If so, that makes sense to me.

Brian Vallelunga

Ahh, thanks, I see now. I read that as only being available with replication. Reading the mailing list, it seems there is client support at the store level for this. I haven't seen any examples of it though. I'll go ahead and ask on the list.

Ayende Rahien



Although I would put it differently


Seems like the client API does not support the "patch" command, right?

c# model

Maybe I'm missing something here, but the denormalized Person example saves only the id and name. When you query the model, how do the children and parent convert back to a whole C# Person (with its own parent and children)?

Matt Warren

@c# model

You can use the id string and load the document based on that, i.e.

var person = session.Load<Person>("people/59");

Matt Warren

Just to add: Load is a generic method, so the type needs to be specified as "Person".

Daniel Cohen

@Matt Warren, I get this if you go with the bare reference approach, and then in your POCO class you have a string ParentId { get; set; }

But in the denormalized way, what kind of class do you get in return? It's neither an id field nor a full Person class.

btw "c# model" was intended to be the title not the name, a funny mistake :)
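One possible reading of the denormalized example, which this thread does not confirm, is a small dedicated reference type that holds only the copied fields. The sketch below uses Python dataclasses rather than C# POCOs; `PersonReference` and `person_from_document` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PersonReference:
    """Hypothetical slim type: only the fields copied into the parent document."""
    Id: str
    Name: str


@dataclass
class Person:
    Name: str
    Email: str
    Parent: Optional[PersonReference] = None
    Children: List[PersonReference] = field(default_factory=list)


def person_from_document(doc: dict) -> Person:
    """Map a denormalized JSON document onto the classes above."""
    parent = PersonReference(**doc["Parent"]) if doc.get("Parent") else None
    children = [PersonReference(**child) for child in doc.get("Children", [])]
    return Person(doc["Name"], doc["Email"], parent, children)


doc = {
    "Name": "Ayende",
    "Email": "Ayende@ayende.com",
    "Parent": {"Name": "Oren", "Id": "people/18"},
    "Children": [
        {"Name": "Raven", "Id": "people/59"},
        {"Name": "Rhino", "Id": "people/29"},
    ],
}

person = person_from_document(doc)
print(person.Parent.Name, [child.Id for child in person.Children])
```

Under this reading, the reference carries enough to display a name and enough (the id) to load the full document when it is actually needed.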


What would be the cost of updating the name in a partially denormalized reference?

Ayende Rahien


It shouldn't be very expensive.


Is there a way to load only a small part of the Answers for paging (e.g. if there will be hundreds or thousands of them), so the database won't have to send all of them?


... and what happens if a Username is stored for every Answer, as it always needs to be displayed, but the user is allowed to change his Username?

Will I have to loop through all documents in the database where the Username is stored (almost everywhere there is a user action) and update the Username? Will it be a problem?

Thanks for a great blog.

Ayende Rahien


Yes, you can.

You create an index that projects those out, and then query on that.
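The idea of projecting the answers out and paging over the projection can be sketched in plain Python. This only imitates the concept in memory and is not RavenDB index syntax: `project_answers`, `page`, and the `Position` field are assumptions made for this example.

```python
def project_answers(question_id, question):
    """Flatten a question into one entry per embedded answer."""
    return [
        {
            "Question": question_id,
            "Position": position,
            "Content": answer["Content"],
            "User": answer["User"],
        }
        for position, answer in enumerate(question["Answers"])
    ]


def page(entries, start, page_size):
    """Return one page of entries, starting at offset `start`."""
    return entries[start:start + page_size]


# A question with many embedded answers (synthetic data for the sketch).
question = {
    "Content": "How to model relations in RavenDB?",
    "Answers": [
        {"Content": f"Answer {i}", "User": f"users/{i}"} for i in range(250)
    ],
}

entries = project_answers("questions/1", question)
second_page = page(entries, start=10, page_size=10)
print(len(second_page), second_page[0]["Content"])
```

The caller receives only ten small entries instead of the whole document with all 250 answers.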

Ayende Rahien


Changing a username is a rare occasion, so you can handle that as a background process.
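A background process of that kind could look roughly like the Python sketch below. It assumes a hypothetical `UserName` field stored next to each `User` id and an in-memory `documents` dictionary. A real job would query for the affected documents and work in batches rather than scan everything.

```python
# Hypothetical in-memory documents; "UserName" is an assumed denormalized field.
documents = {
    "questions/1": {
        "User": "users/1738",
        "UserName": "asker",
        "Answers": [
            {"User": "users/92", "UserName": "old-name"},
            {"User": "users/94", "UserName": "someone-else"},
        ],
    },
    "questions/2": {"User": "users/92", "UserName": "old-name", "Answers": []},
}


def rename_user(documents, user_id, new_name):
    """Rewrite every denormalized copy of a user's name; return docs touched."""
    touched = 0
    for doc in documents.values():
        # The user may appear as the question's author or as an answer's author.
        entries = [doc, *doc["Answers"]]
        stale = [
            entry for entry in entries
            if entry["User"] == user_id and entry["UserName"] != new_name
        ]
        for entry in stale:
            entry["UserName"] = new_name
        touched += bool(stale)
    return touched


changed = rename_user(documents, "users/92", "new-name")
print(changed)
```

Running the function a second time changes nothing, which makes it safe to retry if the background job is interrupted.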
