Ayende @ Rahien

Unnatural acts on source code

Designing a document database: Concurrency

In my previous post, I asked about designing a document DB, and brought up the issue of concurrency, along with a set of questions that affect the design of the system:

  • What concurrency alternatives do we choose?

We have several options. Optimistic and pessimistic concurrency are the most obvious ones. Merge concurrency, such as the one implemented by Rhino DHT, is another. Note that we also have to handle the case where we have a conflict as a result of replication.

I think that it would make a lot of sense to support optimistic concurrency only. Pessimistic concurrency is a scalability killer in most systems. As for conflicts as a result of concurrency, Couch DB handles this using merge concurrency, which may be a good idea after all. We can probably support both of them pretty easily.
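To make the optimistic option concrete, here is a minimal in-memory sketch in Python. The names `DocumentStore`, `ConcurrencyError`, and `expected_version` are hypothetical and chosen only for illustration, and the version is a plain counter here rather than the full version string discussed later in the post.

```python
class ConcurrencyError(Exception):
    """Raised when a write is based on a version that is no longer current."""


class DocumentStore:
    """Hypothetical in-memory store illustrating optimistic concurrency."""

    def __init__(self):
        # doc_id -> (version, body); the version is a simple counter in this sketch
        self._docs = {}

    def get(self, doc_id):
        """Return the (version, body) pair for a document."""
        return self._docs[doc_id]

    def put(self, doc_id, body, expected_version=None):
        """Store body only if the caller saw the current version.

        expected_version=None means "I believe this document does not exist yet".
        """
        current = self._docs.get(doc_id)
        current_version = current[0] if current else None
        if expected_version != current_version:
            # No locks are taken; a stale writer is simply rejected.
            raise ConcurrencyError(
                f"expected version {expected_version}, found {current_version}"
            )
        new_version = (current_version or 0) + 1
        self._docs[doc_id] = (new_version, body)
        return new_version


store = DocumentStore()
v1 = store.put("users/1", {"name": "first"})
v2 = store.put("users/1", {"name": "second"}, expected_version=v1)
try:
    # This writer still holds v1, so the update is rejected.
    store.put("users/1", {"name": "stale"}, expected_version=v1)
except ConcurrencyError as error:
    print("rejected:", error)
```

Nothing is ever locked in this sketch, which is the property that makes the optimistic approach attractive for scalability: the cost of a conflict is paid only by the writer that lost the race.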

It does cause problems with the API, however. A better approach might be to fail reads of documents with multiple versions, and force the user to resolve them using a different API. I am not sure if this is a good idea or a time bomb. Maybe return the latest version along with a flag that indicates that there is a conflict? That would allow you to ignore the issue.
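The two read behaviors weighed above can be put side by side in a short Python sketch. Everything here (`ConflictAwareStore`, `read_strict`, `read_latest`, `resolve`, `ConflictError`, `ReadResult`) is a hypothetical name used for illustration, and "latest" is simply taken to be the most recently added version.

```python
from dataclasses import dataclass


class ConflictError(Exception):
    """Raised by the strict read when a document has unresolved versions."""

    def __init__(self, doc_id, versions):
        super().__init__(f"{doc_id} has {len(versions)} conflicting versions")
        self.versions = versions


@dataclass
class ReadResult:
    body: dict
    has_conflict: bool


class ConflictAwareStore:
    """Hypothetical store that can hold several conflicting versions per document."""

    def __init__(self):
        # doc_id -> list of bodies, oldest first; more than one entry means a conflict
        self._versions = {}

    def add_version(self, doc_id, body):
        """Record a version, e.g. one that arrived through replication."""
        self._versions.setdefault(doc_id, []).append(body)

    def read_strict(self, doc_id):
        """Option 1: fail the read until the conflict is resolved."""
        versions = self._versions[doc_id]
        if len(versions) > 1:
            raise ConflictError(doc_id, list(versions))
        return versions[0]

    def read_latest(self, doc_id):
        """Option 2: return the latest version plus a conflict flag."""
        versions = self._versions[doc_id]
        return ReadResult(body=versions[-1], has_conflict=len(versions) > 1)

    def resolve(self, doc_id, chosen_body):
        """Separate resolution API: replace all versions with the chosen one."""
        self._versions[doc_id] = [chosen_body]
```

With the strict read, a caller cannot miss a conflict but every reader must be prepared for the exception. With the flag, callers that do not check `has_conflict` silently work against whichever version happened to be latest.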

  • What about versioning?

In addition to the Document ID, each document will have an associated version. The Document ID is a UUID, which means that it can be generated on the client side. Each document is also versioned by the server accepting it. The version uses the following format: [server guid]/[increasing numeric id]/[time].

That will ensure global uniqueness, as well as giving us all the information that we need for the document version.
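As a rough illustration of that format, the Python sketch below builds and parses such version strings. The post does not say how the time component is encoded, so ISO 8601 in UTC is an assumption, and `DocumentVersion` and `VersionGenerator` are hypothetical names. The counter is not thread-safe as written.

```python
import itertools
import uuid
from datetime import datetime, timezone
from typing import NamedTuple


class DocumentVersion(NamedTuple):
    server_id: uuid.UUID   # identifies the server that accepted the write
    sequence: int          # increasing numeric id, local to that server
    timestamp: datetime    # time of acceptance (ISO 8601 encoding is an assumption)

    def __str__(self):
        return f"{self.server_id}/{self.sequence}/{self.timestamp.isoformat()}"

    @classmethod
    def parse(cls, text):
        # Neither a UUID nor an ISO 8601 timestamp contains "/", so a plain split is safe.
        server_id, sequence, timestamp = text.split("/")
        return cls(uuid.UUID(server_id), int(sequence), datetime.fromisoformat(timestamp))


class VersionGenerator:
    """Hands out versions for a single server."""

    def __init__(self, server_id=None):
        self.server_id = server_id or uuid.uuid4()
        self._counter = itertools.count(1)

    def next_version(self):
        return DocumentVersion(
            self.server_id, next(self._counter), datetime.now(timezone.utc)
        )


# The document ID can be created on the client; the version comes from the server.
document_id = uuid.uuid4()
generator = VersionGenerator()
version = generator.next_version()
assert DocumentVersion.parse(str(version)) == version
```

The server GUID is what makes a version globally unique, since two servers can both hand out sequence number 1, while the numeric id orders versions that come from the same server.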

Comments

03/10/2009 12:03 PM by Tobin Harris

When you say versioned, do you mean that we can rewind/fast forward in time to a particular version?

03/10/2009 12:12 PM by Ayende Rahien

No, I mean that you know which version of a document you have, which is useful for things like optimistic concurrency.

03/10/2009 03:35 PM by josh

So do you version at the record/document level or at the field level? Assuming you want to merge changes without overwriting someone else's update, you need a way to determine what changed at the field level. Maybe you don't care that much, or just make a user reload the latest version before allowing them to commit changes (which can be a bad user experience).

03/10/2009 03:44 PM by Ayende Rahien

No, I do not track this at the field level. I track it at the document level.

This is something like the way SVN tracks those changes.

03/10/2009 03:50 PM by josh

so you track what changes on the client/app side, and send only those changes with the relevant version?

03/10/2009 03:56 PM by Ayende Rahien

I am not sure that I understand what you mean.

What I intend to do is actually create a very simple system. If you update a document with a version that is not the latest, I am going to reject the update.

03/10/2009 04:02 PM by josh

ok. guess I was throwing partial updates in there too. If the user experience of requiring the document to be reloaded when it's not the latest before accepting changes is acceptable, then good. it is, by far, much easier to implement.

03/10/2009 05:21 PM by huey

The server rejecting an update does not translate directly into the user experience. It is up to the client code to handle this rejection in a way that pleases the user and is acceptable to the business logic.
