Set based operations with RavenDB

time to read 4 min | 762 words

One of the most common problems that people run into with NoSQL databases is the inability to perform set based operations. A typical example of that would be:

DELETE FROM Users WHERE LastLogin < '2009-01-01'
UPDATE Users SET IsActive = 0 WHERE LastLogin < '2010-01-01'

The problem is that most NoSQL solutions really have no notion about set based operations, but they are something that:

  • Users really like
  • Can significantly simplify users’ code
  • Drastically reduce the number of remote calls

With all of that, I decided that I really want RavenDB to support SET based operations. And now it does :-)

The secret for set based operations is that you need two things:

  • What should you execute the operation on? (defining the set that you are working on). This is the WHERE clause.
  • What is the operation?

We will start with DELETE, because that is that operation is very simple to understand. So we only need to figure out what the set we operate on is. Well, Raven already have a way to express WHERE clauses, using its Indexing system. There is no reason why we can’t use the same mechanism for that.

Here is how we would execute the first statement using Raven. First, we need to define the appropriate index:

// Users/LastLogin
from user in docs.Users
select new { user.LastLogin }

And now we can issue an HTTP DELETE call:

DELETE /bulk_docs/Users/LastLogin?query=LastLogin:[00010101000000000 TO 20090101000000000]&allowStale=True

This asks Raven do delete all the documents where the LastLogin is between 01-01-0001 and 01-01-2009.

There is one important thing to note here, Raven’s indexes allow stale reads. But SET based operations will not work on stale indexes by default.

In this case, I am pretty sure that the index is capable of handling requests that are that far in the past, so I can safely add the allowStale=True. If we remove that, or specify allowStale=False, the operation will only succeed if the index is up to date. We can also specify a cutoff date for the stale check. This is similar to how this works everywhere with Raven.

So far, this has been pretty easy, what about the next step? Updating a set of documents? That presented some challenges, how do you define the operation?

Luckily, Raven has support for the PATCH command, which means that you can specify an update command very easily. Using the same index, we can specify:

PATCH /bulk_docs/Users/LastLogin?query=LastLogin:[00010101000000000 TO 20100101000000000]&allowStale=True

[
  { 
     "Type": "Set",
     "Name": "IsActive",
     "Value": false           
   }
]

Which will set the IsActive property to false on all the documents whose LastLogin is before 2010.

And yes, you can execute those operations using the Client API as well.