Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.

Designing a document database: Looking at views


I was asked how we can handle more complex view scenarios. So let us take a look at how we can deal with them.


In a many-to-one scenario (a post and its comments), how can we get both of them in one call to the database? The answer to that is quite simple: you don’t, at least not directly. What you can do, however, is generate a view that will allow you to get both of them at the same time. I am afraid that I am not doing anything new here, since the technique is actually described here. For example:


The key here is that views are always calculated in sort order, so what is actually happening here is that we sort by post id, then by IsPost. Since false is higher than true, the actual post is always the first item, with the comments directly following it. This means that we can query for all of them in one DB call.
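
The original view definition is not reproduced here, so the following is only a minimal Python sketch of the idea, under stated assumptions. The document shapes, `map_doc`, `build_view` and `post_with_comments` are all hypothetical names. Because Python sorts `False` before `True`, the sketch keys on a hypothetical `is_comment` flag rather than `IsPost`, which gives the same effect: the post sorts first and its comments follow it directly.

```python
from bisect import bisect_left, bisect_right

# Hypothetical documents: posts and comments stored side by side.
docs = [
    {"id": "comments/3", "type": "comment", "post_id": "posts/1", "text": "Nice"},
    {"id": "posts/2", "type": "post", "title": "Second"},
    {"id": "posts/1", "type": "post", "title": "First"},
    {"id": "comments/1", "type": "comment", "post_id": "posts/1", "text": "Hi"},
    {"id": "comments/2", "type": "comment", "post_id": "posts/2", "text": "Hello"},
]


def map_doc(doc):
    """Return (key, doc), where key = (post id, is_comment)."""
    if doc["type"] == "post":
        return ((doc["id"], False), doc)
    return ((doc["post_id"], True), doc)


def build_view(documents):
    """Materialize the view: rows kept sorted by key, as a view would be."""
    return sorted((map_doc(d) for d in documents), key=lambda row: row[0])


def post_with_comments(view, post_id):
    """One range scan over the sorted keys returns the post and its comments."""
    keys = [key for key, _ in view]
    start = bisect_left(keys, (post_id, False))
    end = bisect_right(keys, (post_id, True))
    return [doc for _, doc in view[start:end]]


view = build_view(docs)
result = post_with_comments(view, "posts/1")
```

In this sketch `result[0]` is the post itself and the remaining items are its comments, all read from one contiguous slice of the view.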

Returning more than a single result per row

To be fair, I haven’t considered this, but it seems pretty obvious that this is needed. Here is the original request:

Question: can a view contain more rows than the underlying document database? For example: assume an invoice database (each document is an invoice with buyer's and seller's Tax ID). I want to create index: Tax ID -> #of invoices, where tax id can belong either to buyer or seller. In worst case scenario, unique tax IDs in every invoice, we'll have index with 2N entries. How view syntax would look like?

If I understand the problem correctly, this can be resolved using the following view definition:
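
The view definition itself did not survive in this copy of the post. As a stand-in, here is a hedged Python sketch of the general shape such a view could take: a map step that emits one row per tax ID found in an invoice, followed by a reduce step that counts rows per tax ID. The field names `buyer_tax_id` and `seller_tax_id`, and the functions `map_invoice`, `reduce_counts` and `invoices_per_tax_id`, are assumptions made for illustration, not the syntax proposed in this series.

```python
from collections import Counter

# Hypothetical invoice documents.
invoices = [
    {"id": "invoices/1", "buyer_tax_id": "A", "seller_tax_id": "B"},
    {"id": "invoices/2", "buyer_tax_id": "A", "seller_tax_id": "C"},
    {"id": "invoices/3", "buyer_tax_id": "C", "seller_tax_id": "B"},
]


def map_invoice(invoice):
    """Emit one (tax_id, 1) row per distinct tax ID on the invoice.

    A set is used so that an invoice whose buyer and seller share a tax ID
    is counted once (an assumption about the desired semantics).
    """
    for tax_id in {invoice["buyer_tax_id"], invoice["seller_tax_id"]}:
        yield (tax_id, 1)


def reduce_counts(rows):
    """Sum the emitted counts per tax ID."""
    counts = Counter()
    for tax_id, count in rows:
        counts[tax_id] += count
    return dict(counts)


def invoices_per_tax_id(documents):
    # The map step can produce up to 2N rows for N invoices.
    rows = [row for doc in documents for row in map_invoice(doc)]
    return reduce_counts(rows)


counts = invoices_per_tax_id(invoices)
```

The point of the sketch is only that a single document may contribute more than one row to the view, so the view can hold more entries than there are documents.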



More posts in "Designing a document database" series:

  1. (17 Mar 2009) What next?
  2. (16 Mar 2009) Remote API & Public API
  3. (16 Mar 2009) Looking at views
  4. (15 Mar 2009) View syntax
  5. (14 Mar 2009) Aggregation Recalculating
  6. (13 Mar 2009) Aggregation
  7. (12 Mar 2009) Views
  8. (11 Mar 2009) Replication
  9. (11 Mar 2009) Attachments
  10. (10 Mar 2009) Authorization
  11. (10 Mar 2009) Concurrency
  12. (10 Mar 2009) Scale
  13. (10 Mar 2009) Storage



Regarding the problem with joins, maybe there should be an option to left-join with an already existing view by its key? Not necessarily in v 1.0.

Ayende Rahien


Can you give me an example please?


Let's stick to the invoice db example. Suppose we have invoice documents with buyerTaxID and a corporate customer database, where customers are also identified by Tax ID. And we have a view Tax ID -> Customer (vCustomerByTaxID). Then, when doing the mapping on invoices, we could tell the system to fetch customer data from vCustomerByTaxID, where the key is in the invoice's buyerTaxId field. Something like:

select d as doc, vCustomerByTaxId.Value as customer
from docs
join vCustomerByTaxId on docs.buyerTaxId


On second thought, such a join will probably have the same cost as fetching the data 'by hand' in the map function...
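
As a rough illustration of the join described in this comment, the sketch below treats the vCustomerByTaxId view as a plain dictionary from tax ID to customer document and looks each invoice's buyerTaxId up in it while mapping. This is only an assumption about how such a left join might behave; `build_customer_by_tax_id`, `map_invoices_with_customer` and the sample documents are hypothetical.

```python
# Hypothetical customer and invoice documents.
customers = [
    {"id": "customers/1", "taxId": "A", "name": "Acme"},
    {"id": "customers/2", "taxId": "B", "name": "Bolt"},
]
invoices = [
    {"id": "invoices/1", "buyerTaxId": "A", "total": 100},
    {"id": "invoices/2", "buyerTaxId": "Z", "total": 250},
]


def build_customer_by_tax_id(customer_docs):
    """Stand-in for the vCustomerByTaxId view: tax ID -> customer document."""
    return {c["taxId"]: c for c in customer_docs}


def map_invoices_with_customer(invoice_docs, v_customer_by_tax_id):
    """Left join: every invoice is emitted, customer is None when no match."""
    for doc in invoice_docs:
        yield {
            "doc": doc,
            "customer": v_customer_by_tax_id.get(doc["buyerTaxId"]),
        }


v_customer_by_tax_id = build_customer_by_tax_id(customers)
joined = list(map_invoices_with_customer(invoices, v_customer_by_tax_id))
```

Each invoice costs one lookup against the other view, which is in line with the remark that the join costs about the same as fetching the data by hand in the map function.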

Nathaniel Neitzke


This is the problem that map-reduce-merge was intended to solve, relational algebra (joins especially) in map reduce. Sorry the link before was an ACM one, but if you google for map-reduce-merge there is a decent amount of info out there. This is what I am working on implementing at the moment.

Ayende Rahien


I read some about it. What seems to be missing is the concept of an updatable data source.

That is the problem that I am trying to solve at this stage. And the main goal is to reduce the amount of work that I have to do whenever I have to update a document to a minimum.
