RavenDB: Splitting entities across several documents

Sep 29 2010

There are occasions where it isn’t feasible or desirable to store our entity as a single document in RavenDB. A question that just came up was how to design votes for an entity using RavenDB.

The scenario is simple: we have our entity, Question (think Stack Overflow), which can have Up/Down votes. It would be very easy to design the system using a single document for the entity, like so:

{ //document id: questions/123
   Title: "How to handle Up/Down votes with Raven?",
   Content: "...",
   Votes: [
         { Up: true, User: "users/ayende" },
         { Up: false, User: "users/oren" },
  ]
}

As usual, the problem begins when you consider questions that may have a large number of votes, or the common scenario where you just want to display the vote totals without pulling the entire document to get them.

One option is to split things up. I guess you figured that out from the title of this blog post. The idea is to change the document structure to be:

{ //document id: questions/123
   Title: "How to handle Up/Down votes with Raven?",
   Content: "...",
}

{ //document id: questions/123/votes
   Votes: [
         { Up: true, User: "users/ayende" },
         { Up: false, User: "users/oren" },
  ]
}
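
As a rough illustration of the split, here is a small Python sketch that models the document store as a plain dictionary keyed by document id. The `split_question` helper and the in-memory `store` are hypothetical stand-ins, not the RavenDB client API; only the ids `questions/123` and `questions/123/votes` come from the documents above.

```python
def split_question(doc_id, doc):
    """Split one embedded question document into a question document
    and a companion votes document stored under '<doc_id>/votes'."""
    question = {k: v for k, v in doc.items() if k != "Votes"}
    votes = {"Votes": list(doc.get("Votes", []))}
    return {doc_id: question, doc_id + "/votes": votes}


# Hypothetical in-memory "store": document id -> document body.
embedded = {
    "Title": "How to handle Up/Down votes with Raven?",
    "Content": "...",
    "Votes": [
        {"Up": True, "User": "users/ayende"},
        {"Up": False, "User": "users/oren"},
    ],
}
store = split_question("questions/123", embedded)

# Loading only the question no longer drags the votes along.
question_only = store["questions/123"]
# The votes are still one lookup away when they are needed.
votes_doc = store["questions/123/votes"]
```

Deriving the votes id from the question id is what keeps the two documents easy to find from each other without an extra lookup.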

Note that we have two separate documents here. Now we can load just the questions, or the questions and the votes. We still have a problem with getting the totals without loading potentially thousands of votes. It is pretty easy to solve this, however, using the following index:

from voteDoc in docs.VoteDocs
from vote in voteDoc.Votes
group vote by vote.Up into g
select new { Up = g.Key, Count = g.Count() }
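
To make the grouping concrete, the following Python sketch mimics what the index text above computes: it flattens the Votes arrays of the votes documents and counts the entries per value of Up. The `vote_totals` function and the sample data (including `users/someone`) are assumptions made for illustration; a real RavenDB index is evaluated by the server, not by client code like this.

```python
from collections import Counter


def vote_totals(vote_docs):
    """Flatten every Votes array and count the entries per 'Up' value,
    returning rows shaped like the index output: {'Up': ..., 'Count': ...}."""
    counts = Counter(
        vote["Up"]
        for vote_doc in vote_docs      # from voteDoc in docs.VoteDocs
        for vote in vote_doc["Votes"]  # from vote in voteDoc.Votes
    )
    # Sort so that up-votes (True) come first; purely cosmetic.
    return [
        {"Up": up, "Count": count}
        for up, count in sorted(counts.items(), reverse=True)
    ]


# Sample data (made up for illustration).
vote_docs = [
    {"Votes": [
        {"Up": True, "User": "users/ayende"},
        {"Up": False, "User": "users/oren"},
        {"Up": True, "User": "users/someone"},
    ]},
]
totals = vote_totals(vote_docs)
```

Like the index text, this sketch groups only by Up, so the counts span every votes document it is given; getting totals per question would presumably require adding the question's id to the grouping key.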

Now we can query the index directly, to get the aggregated results:

session.LuceneQuery<VoteTotals>("Questions/VoteTotals")
            .SelectFields("__document_id", "Up", "Count")
            .ToList();

And if we want to get the votes themselves, they are easily available as well.

Tags:

Raven

Comments

29 Sep 2010
10:42 AM

gandjustas

It looks like relational tables and join in view\function\storedproc, isn't it? ;)

29 Sep 2010
10:43 AM

Benny Thomas

Is it just me, or does your index use the unsplit document in this sample?

29 Sep 2010
11:33 AM

configurator

A feature I'd like to see is the indexes updating entities, i.e.

{ //document id: questions/123
   Title: "How to handle Up/Down votes with Raven?",
   Content: "...",
}

{ //document id: questions/123/votes
   Votes: [
         { Up: true, User: "users/ayende" },
         { Up: false, User: "users/oren" },
   ]
}

Would be transformed automatically to

{ //document id: questions/123
   Title: "How to handle Up/Down votes with Raven?",
   Content: "...",
   UpVotes: 1,
   DownVotes: 1
}

{ //document id: questions/123/votes
   Votes: [
         { Up: true, User: "users/ayende" },
         { Up: false, User: "users/oren" },
   ]
}

And the UpVotes/DownVotes would be updated whenever the index is. Do you have such a feature?

29 Sep 2010
12:28 PM

Torkel

Yes, I am also a little puzzled.

The index definition seems to use the original document where the Votes were part of the Question document. Was that the intent? Then why show the split?

/confused

29 Sep 2010
12:41 PM

Dennis

How will you deal with the "did I vote" on this without fetching the whole thing in?

29 Sep 2010
16:29 PM

fschwiet

I am surprised you put all the votes in one document still. In the same situation, I had stored each vote as a document then used a map/reduce index to count the totals.

How would you check if someone already voted? I suppose you can create an index to split out the individual votes, so they can be read one at a time. As people vote, you're going to have concurrency issues that you wouldn't have if the votes are individual documents.

29 Sep 2010
18:01 PM

Guy

How is that better / worse than storing the votes in the Question document and having an index which projects a question with only the total votes?

29 Sep 2010
23:56 PM

Demis Bellot

This actually looks quite inefficient. If I needed to implement this in Redis I would store the user ids in 2 server-side sets, one for 'up' and the other for 'down' votes.

Recording a vote can easily be done in a single 'SADD' set operation without needing to serialize/deserialize the entire document. By comparison this looks like it would be orders of magnitude slower.

30 Sep 2010
06:34 AM

gandjustas

SADD is unsuitable here. You need to store Who voted.

30 Sep 2010
07:28 AM

Demis Bellot

@gandjustas

"SADD is unsuitable here. You need to store Who voted."

My recommendation was storing 'user ids', i.e. who's voted.

Using a set also ensures you only count each user's vote once.

01 Oct 2010
11:23 AM

Ayende Rahien

Benny,

No, it uses the votes document, not the unsplit one.

01 Oct 2010
11:24 AM

Ayende Rahien

Configurator,

You can absolutely do that using an index update trigger!

You just have to watch out not to modify the same document that the index is based on (otherwise you create a loop).

01 Oct 2010
11:25 AM

Ayende Rahien

Torkel,

No, it uses a separate document (//document id: questions/123/votes)

Both documents have a Votes array

01 Oct 2010
11:25 AM

Ayende Rahien

Dennis,

I would have an additional index, that would output who voted, and I could query that index.

01 Oct 2010
13:38 PM

Dennis

Ayende, wouldn't that easily be an N+1 query?

"Show a list of posts, and then for every post I need to add a javascript tag to see if I already voted on it, so I can get instant feedback in the gui."

01 Oct 2010
13:55 PM

configurator

@Ayende: Would it be a loop even if the index result doesn't change? Suppose I define the index B = 2 * A and put a document

{ A = 1 }

It would be changed at some point to be

{ A = 1, B = 2 }

And then when the index is rerun it would be changed to

{ A = 1, B = 2 }

as this is no change, the index doesn't have to be run again. Or does it?

03 Oct 2010
12:35 PM

Ayende Rahien

Dennis,

Actually, no.

You would simply make two queries.

a) get recent posts

b) get votes where post id is in (...) from the votes index

03 Oct 2010
12:37 PM

Ayende Rahien

Configurator,

Yes, it would be a loop as long as the index update trigger touches the same document.

We aren't comparing the old/new data when updating a document, and the etag is always updated.

Oren Eini

CEO of RavenDB
