Enhancing the RavenDB consistency model
From the get-go, we have built RavenDB to be as easy to use as possible. Quite explicitly, wherever we could, we aimed to make it as simple as possible for someone familiar with .NET. The fun part is that sometimes, looking at people’s expectations and answering them is a pleasure, such as this one:
The reason this exchange is amusing is that RavenDB doesn’t have the concept of expensive queries. For the most part, queries merely scan through an already computed index. RavenDB does no computation and very little work to answer your queries, so the notion of an expensive query is meaningless; there aren’t any.
That said, this behavior does come at a cost. It means that we have to do the work of computing the indexes at some stage, and in our case, we have chosen to do so in the background. What that means, in turn, is that queries can potentially return stale results. As it turns out, there are actually surprisingly few cases where you can’t tolerate some staleness in your application, and that makes RavenDB highly suited for a large number of scenarios.
There is, however, one common scenario that is annoying: the Create –> List flow. As a user, I want to complete the Create screen:
And then I want to see the List screen:
And naturally I expect to see the just created item in the list. That is the one scenario where people usually run into RavenDB’s consistency model, and it brought a few complaints every now and then. If the timing is just wrong, you may be able to issue the next request before the addition to the database has had the chance to be indexed, so the new item is missing from the list.
Usually, the advice was to add a WaitForNonStaleResultsAsOfNow(), which resolved this issue, so we didn’t really consider this any further. It was just one of those things that you had to understand about the way RavenDB worked.
Recently, however, we got a bug report for this exact scenario, and the user couldn’t just use WaitForNonStaleResultsAsOfNow(), or to be rather more accurate, he was already using it, but it wasn’t working. We eventually figured out that the problem was clock synchronization between the server and client computers. That forced us to re-consider how to approach this. After looking at several alternatives, we ended up creating a new consistency model for RavenDB.
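To make the clock problem concrete, here is a minimal Python sketch of a time-based staleness check. It is a toy model, not RavenDB's implementation: `ToyServer`, `put`, `run_indexing` and `is_stale_as_of` are hypothetical names, and the assumption is that the client sends its own "now" as the cutoff while the server stamps writes with its own clock.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Hypothetical stand-in for a server that indexes in the background."""
    clock: float                                  # the server's idea of "now"
    writes: list = field(default_factory=list)    # (timestamp, doc) pairs
    indexed_up_to: float = 0.0                    # index covers writes stamped <= this

    def put(self, doc):
        # Every write is stamped with the server's clock, not the client's.
        self.clock += 1.0
        self.writes.append((self.clock, doc))

    def run_indexing(self):
        # Background indexing catches up with everything written so far.
        self.indexed_up_to = self.clock

    def is_stale_as_of(self, cutoff):
        # Stale if any write stamped at or before the cutoff is not indexed yet.
        return any(self.indexed_up_to < ts <= cutoff for ts, _ in self.writes)


server = ToyServer(clock=100.0)
server.run_indexing()        # index is current up to t=100
server.put("new item")       # stamped t=101, not yet indexed

# A client whose clock matches the server asks "as of now" with cutoff 101.
stale_for_synced_client = server.is_stale_as_of(101.0)

# A client whose clock runs 5 units behind sends cutoff 96 instead.
stale_for_lagging_client = server.is_stale_as_of(96.0)
```

In this model the synchronized client is told to wait, while the lagging client is told the index is fresh even though its own write is not in it yet. That is the kind of failure a time-based cutoff invites once the two clocks disagree.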
In this consistency model, we are basically ensuring a very simple metric: “you can query anything that you have written”. Instead of relying on the time being the same everywhere, we have decided to use the etags that are being reported back to the server. Using this approach, you can basically ignore the difference between RavenDB and a traditional relational database, since it will behave (externally) in much the same fashion.
You can enable this mode on a per query basis:
Or you can enable this for all queries:
I am pretty proud of this feature, since it simplifies a lot for most users, and it provides a very effective (and simple) way to approach consistency. This is especially true if we are talking about multiple clients working on the same database (which is the case in the issue that was raised).
Whenever a client writes, it will have to wait for its own changes to be indexed, but it won’t have to wait for the changes from any other client. That matches the user’s expectation, and also allows RavenDB to answer most queries without doing any waiting whatsoever.
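The "wait only for your own writes" behavior can be sketched with a toy model in Python. Everything here is hypothetical (`EtagServer`, `ClientStore`, `query_must_wait` are illustration names, not RavenDB APIs), and it assumes etags are simple increasing integers and that each client remembers the etag of its last write.

```python
class EtagServer:
    """Toy server: hands out increasing etags and indexes them one at a time."""

    def __init__(self):
        self.last_etag = 0       # etag of the most recent write
        self.indexed_etag = 0    # highest etag the index has processed

    def put(self, doc):
        self.last_etag += 1
        return self.last_etag    # the etag is reported back to the writer

    def index_one(self):
        # Simulates one step of background indexing.
        if self.indexed_etag < self.last_etag:
            self.indexed_etag += 1

    def must_wait(self, cutoff_etag):
        # A query has to wait only if its cutoff is not indexed yet.
        return self.indexed_etag < cutoff_etag


class ClientStore:
    """Toy client: remembers the etag of its own last write."""

    def __init__(self, server):
        self.server = server
        self.last_written_etag = 0

    def save(self, doc):
        self.last_written_etag = self.server.put(doc)

    def query_must_wait(self):
        # The cutoff is this client's last write, not the server's latest write.
        return self.server.must_wait(self.last_written_etag)


server = EtagServer()
client_a = ClientStore(server)
client_b = ClientStore(server)

client_a.save("a1")      # etag 1
server.index_one()       # index now covers etag 1
client_b.save("b1")      # etag 2, not indexed yet

a_waits = client_a.query_must_wait()   # A's own write is already indexed
b_waits = client_b.query_must_wait()   # B's own write is still pending

server.index_one()       # index now covers etag 2
b_waits_after_indexing = client_b.query_must_wait()
```

Because no clocks are involved, the comparison is unaffected by time differences between machines. Client A queries immediately even though client B's write is still pending, and client B waits only until its own etag has been indexed.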
Comments
As the user referred to here: "the user couldn’t just use WaitForNonStaleResultsAsOfNow(), or to be rather more accurate, he was already using it, but it wasn’t working" I obviously think this is a buh-rilliant feature. The implementation seems really clean and should be really easy to implement. Now, I'm sure you feel this coming...wait for it...wait for it...so when's the next stable release :-).
How does this work in web scenarios where the next request would be from a new session instance?
Eric, The information is held at document store, not the session level.
Sounds like a pretty awesome feature. Maybe even something that should be on by default? Guess I should look to see how it works because I don't understand how it would know what writes are mine in a stateless world like the web.
Eric, "Client" as in DocumentStore == machine. Everrett's case was literally different apps contacting the same server, so each would only need to consider its own changes.
I ran into this issue, but my solution at the time was adopting the Rails methodology of saving a new item. First you save the document, then redirect to a show/details page. Since you can query off the id directly you don't have to wait for an index to update. The user is just looking for confirmation that their work has been saved, so a details page is just as much confirmation as a list. Also the user will most likely want to enter another item, so you can transition them back to an insert/edit page while the indexes update (super fast).
Does this solution mean that RavenDB users should stop using the WaitForNonStaleResultsAsOfNow() method and use this (as long as they upgrade)?
Khalid, This is probably better, yes.
You are a genius! You have just solved the only pain point I have experienced with RavenDB. I concur with having this on by default, but that is just me :)
Thanks!
Randall, I thought about making this the default, but I am not sure yet, we'll see after we have some time with this feature in the field.
This is a feature that I've often referred to as "local consistency" to distinguish it from eventual consistency. Most CQRS implementations don't have this feature. It does quite a lot to put a user's mind at ease, so I insist upon it. Glad to see you've added it to Raven.
hug
awesome awesome awesome.
I'm going to start to set this as a default and see how things pan out. so kewl!
Has NuGet been updated, yet?
Justin, NuGet will be updated on Monday
Awesome. This solves the problem we have with Patching denormalized data, where we need to make sure our index includes everything that has been written.
Martin, Yes, we are working on the docs, we intend to release a new website soon, and that is probably when we will have all of the new info there. You can see the pending docs tasks here: https://github.com/ravendb/ravendb/issues
Have you remembered to document this feature?
...and also the session.Advanced.DocumentStore.AggressivelyCacheFor() feature (and all other features you mention here on the blog).
The documentation on the RavenDB website seems very limited compared to how many features I have read about here on your blog.
Something must be wrong with your blog and the way it handles dates, as your answer comes before my question (I assume you don't have magic powers) :)
The date for my question is 06/29/2011 02:03 and the date for your answer is 06/28/2011 10:42 PM.
Can you point me to an article talking about RavenDB and how to handle changes in the document?
That is my biggest concern with any NoSQL solution, because I am not experienced and I acknowledge that, so I know that what I start out with will probably not be what I want to end up with. It seems impossible to figure out all the data I will need in a document from the start.
Most likely, as the project grows, I will need to change the structure of the document and maybe rename something and merge data from one document to another and stuff like that, but how do I handle that easily with RavenDB?
Does RavenDB consume a lot of memory compared to, e.g., MS SQL Server because of all the indexes and because the data is not as normalized as in an RDBMS?
I still can't understand how you distinguish one client from another. Do they identify during the initial connection? In other words, how (if at all) can I make the user wait for the item they just created, without waiting for all other users, while being served by multiple web servers without a sticky session?
Avish, A client in this sense is each document store. The scenario at hand is that each client is running in a different application / process.
Martin, We had some issue with the server datetime, which caused it to drift. As for changes to the document structure, RavenDB basically handles them just fine. If you add a property, it will be added to the document on the first save with the new value. If you remove a property, it will be removed on the next save. The only thing that requires work is renaming a property, and we have explicit support for doing that.
As for memory usage, that really depends on the actual usage. The short answer is that it shouldn't take too much memory, and is usually much more efficient (joins take a LOT of memory).