Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.

Designing a document database: View syntax


The choice of Linq queries as the default syntax was not an accident. If you look at how Couch DB does things, you can see that choosing Javascript as the query language can lead to some really irritating imperative-style coding. For example, look at this piece of code:

function(doc) {
  if (doc.type == "comment") {
    map(doc.author, {post: doc.post, content: doc.content});
  }
}

This works, and it allows for some really complicated solutions, but it comes with its own set of problems. Unlike Couch DB, I actually want to enforce a schema for the views, and I need to know that schema at view creation time. This is partly because of the storage engine choice, and partly because the imperative style makes it very easy to violate some of the behaviors that map/reduce requires, such as repeatability of the results (by querying a separate data source, for example).
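To make the repeatability concern concrete, here is a minimal sketch in the same Javascript style. The `runMap` harness, the `emit` callback (standing in for the `map` call above) and the sample documents are all hypothetical, made up for illustration; this is not Couch DB's actual API. The counter plays the role of a separate data source: once a map function reads anything other than the document it was given, running it twice over the same documents no longer gives the same rows.

```javascript
// Hypothetical harness (not Couch DB's API): runs a map function over a
// list of documents and collects whatever key/value pairs it emits.
function runMap(mapFn, docs) {
  const rows = [];
  const emit = (key, value) => rows.push({ key, value });
  for (const doc of docs) {
    mapFn(doc, emit);
  }
  return rows;
}

// Repeatable: the output depends only on the document itself.
function commentsByAuthor(doc, emit) {
  if (doc.type == "comment") {
    emit(doc.author, { post: doc.post, content: doc.content });
  }
}

// Not repeatable: the output also depends on state outside the document.
// The counter is a stand-in for "querying a separate data source".
let externalCounter = 0;
function commentsWithExternalState(doc, emit) {
  if (doc.type == "comment") {
    externalCounter += 1;
    emit(doc.author, { post: doc.post, seq: externalCounter });
  }
}

// Made-up sample documents.
const docs = [
  { type: "comment", author: "ayende", post: "view-syntax", content: "first" },
  { type: "post", author: "ayende", title: "View syntax" },
];

// Run each map function twice over the same input and compare the rows.
const pureSame =
  JSON.stringify(runMap(commentsByAuthor, docs)) ===
  JSON.stringify(runMap(commentsByAuthor, docs));
const impureSame =
  JSON.stringify(runMap(commentsWithExternalState, docs)) ===
  JSON.stringify(runMap(commentsWithExternalState, docs));

console.log(pureSame, impureSame); // true false
```

Nothing in the imperative style stops you from writing the second function, which is exactly the problem.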

Linq queries are not imperative. They are a good way of expressing set-based logic, while still allowing an almost embarrassingly complex set of problems to be expressed. More than that, Linq queries are strongly typed, provide me with a whole bunch of information, and allow me to do some really interesting things along the way, some of which we will talk about later. There is also the question of how easy it would be to utilize such things as PLinq, the fact that the extensibility story for the DB becomes much easier in this scenario, and that, at least from a theoretical perspective, the performance should be much better than that of a Javascript-based solution.

Another property of Linq that I considered, much as I am loath to admit it in such a public forum, is the marketing aspect. A Linq-driven database is sure to get a lot of attention. You only have to look at the number of comments on the previous posts on this topic and compare the posts with Linq queries to those without. The difference is quite astounding.

All in all, that sounds like an impressive number of reasons to go with Linq.

The problem, of course, is that Linq implies C#, and I don’t really think that C# is the best language for doing language oriented programming. This time, however, we have the major advantage that the domain concepts that we want are already built into the language, so we don’t really need a lot of tweaking here to get things exciting.

I posted about the syntax before, but I don’t think that a lot of people actually got what I meant. Here is the entire view definition:


It is not a snippet, and it is not part of something larger that I am not showing. This is the view. And yes, it is not compilable on its own. Nor do I imagine that we will see people writing this code in Visual Studio. Or, at least, I imagine that it will be written there, but it will not stay there.

Much like in Couch DB today, you are going to have to create the view on the server, and you do that by creating a specially named document, which will contain this syntax as its content.

Internally, we are going to do some interesting things to it, but I think that I can stop for now by just showing you the first stage, which is what happens to the view code after preprocessing:


Readers of my book should recognize the pattern. I am using the notion of an Implicit Base Class here to get an executable class, which we can now compile and execute at will. Note that the query itself was modified to make it compilable. We can now proceed to do additional analysis of the actual query, generate the fixed schema out of it, and start doing the really interesting things that we want to do.

But I had better leave those for another post…

More posts in "Designing a document database" series:

  1. (17 Mar 2009) What next?
  2. (16 Mar 2009) Remote API & Public API
  3. (16 Mar 2009) Looking at views
  4. (15 Mar 2009) View syntax
  5. (14 Mar 2009) Aggregation Recalculating
  6. (13 Mar 2009) Aggregation
  7. (12 Mar 2009) Views
  8. (11 Mar 2009) Replication
  9. (11 Mar 2009) Attachments
  10. (10 Mar 2009) Authorization
  11. (10 Mar 2009) Concurrency
  12. (10 Mar 2009) Scale
  13. (10 Mar 2009) Storage



Now I finally see what you mean. Here, you're actually using a DSL that looks like C# and compiles to C#.

Two things come to mind:

  1. I'm not sure the definition should be "var pagesByTitleAndVersion", but "view pagesByTitleAndVersion". That's because you're not generating a variable but a class, and 'var' is a bit confusing in this syntax.

  2. Whenever we want any complex check (that is, any check other than == or !=), we'd need to cast the return type. i.e. we'd need an example view to be:

var oldPages = from doc in docs
               where doc.Type == "page"
               where (DateTime)doc.CreationTime > DateTime.Now.AddDays(-3);

or something like that. Any way to easily remove that cast? It would be rather cumbersome on the otherwise elegant syntax you're using.

Ayende Rahien


  1. The code totally ignores the variable type, so feel free to put whatever you want there.

  2. Yes, changing the syntax would be pretty easy.


Ayende, I've got a question about transaction management in your document db. Are you going to support transactions at all? Can they cover updates to multiple documents? What about distributed transactions?

Frank Quednau

Thanks for clarifying this. I suppose the "document" can carry arbitrary properties definable by some user of your DB. Could this not be expressed as an interface at runtime? If one provided some kind of editor to define the view, I could imagine that, once you have said interface for your document, the first expression could be compilable. Additionally, you know how to fill an empty interface with life (DynamicProxy2, etc.)?

Anyway, let's see what follows!

Ayende Rahien


I intend to support transactions, including over multiple docs in the same batch.

No DTC planned.

Ayende Rahien


Documents don't have to share schema, though.

Trying to express this as interface would lead to a whole bunch of trouble.


I'm still not sure about the original syntax. With Intellisense, Resharper, etc., isn't the original syntax going to give you all sorts of visual issues in the editor? Why not just go with the dictionary syntax and save the hassle? Plus, it stops the confusion of appearing to be strongly typed...


Question: can a view contain more rows than the underlying document database? For example, assume an invoice database (each document is an invoice with the buyer's and seller's Tax ID). I want to create an index: Tax ID -> # of invoices, where the tax ID can belong to either the buyer or the seller. In the worst-case scenario, with unique tax IDs in every invoice, we'll have an index with 2N entries. What would the view syntax look like?

Ayende Rahien


You missed the part about there being no editor?

Ayende Rahien


Look at the next post.
