Designing a document database: Concurrency

Mar 10 2009

In my previous post, I asked about designing a document DB, and brought up the issue of concurrency, along with a set of questions that affect the design of the system:

What concurrency alternatives do we choose?

We have several options. Optimistic and pessimistic concurrency are the most obvious ones. Merge concurrency, such as the one implemented by Rhino DHT, is another. Note that we also have to handle the case where we have a conflict as a result of replication.

I think that it would make a lot of sense to support optimistic concurrency only. Pessimistic concurrency is a scalability killer in most systems. As for conflicts as a result of concurrency, Couch DB handles this using merge concurrency, which may be a good idea after all. We can probably support both of them pretty easily.
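To make the optimistic option concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration, not a real API: the `DocumentStore` class, the `ConcurrencyError` exception, and the plain integer versions. The idea is only that a write carries the version it was based on, and the store rejects it if that version is no longer the latest.

```python
class ConcurrencyError(Exception):
    """Raised when a write is based on a version that is no longer current."""


class DocumentStore:
    """Hypothetical in-memory store, used only to illustrate the check."""

    def __init__(self):
        self._docs = {}  # doc_id -> (version, body)

    def get(self, doc_id):
        # Returns the (version, body) pair for the document.
        return self._docs[doc_id]

    def put(self, doc_id, body, expected_version=None):
        # expected_version=None means "I believe this document is new".
        current = self._docs.get(doc_id)
        current_version = current[0] if current else None
        if current_version != expected_version:
            raise ConcurrencyError(
                f"{doc_id}: expected version {expected_version}, "
                f"but the current version is {current_version}"
            )
        new_version = (current_version or 0) + 1
        self._docs[doc_id] = (new_version, body)
        return new_version
```

No locks are held between the read and the write. A writer that lost the race simply gets an error and has to decide what to do about it.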

It does cause problems with the API, however. A better approach might be to fail reads of documents with multiple versions, and force the user to resolve them using a different API. I am not sure if this is a good idea or a time bomb. Maybe return the latest version, along with a flag that indicates that there is a conflict? That would allow you to ignore the issue.
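The two read behaviors can be sketched side by side. This is a rough illustration under stated assumptions: `ReadResult`, `read_latest`, `read_strict` and `ConflictedDocumentError` are made-up names, and "latest" is assumed to mean the highest comparable version number among the conflicting versions.

```python
from dataclasses import dataclass


class ConflictedDocumentError(Exception):
    """Raised by the strict read when a document has several live versions."""


@dataclass
class ReadResult:
    body: dict
    version: int
    has_conflict: bool


def read_latest(versions):
    """Lenient read: return the latest version plus a conflict flag.

    `versions` is a non-empty list of (version, body) pairs, where a
    higher version number is assumed to mean a later write.
    """
    version, body = max(versions, key=lambda pair: pair[0])
    return ReadResult(body=body, version=version, has_conflict=len(versions) > 1)


def read_strict(versions):
    """Strict read: fail until the conflict is resolved elsewhere."""
    if len(versions) > 1:
        raise ConflictedDocumentError(f"{len(versions)} conflicting versions")
    version, body = versions[0]
    return ReadResult(body=body, version=version, has_conflict=False)
```

The lenient read lets a caller ignore `has_conflict` entirely, which is convenient and also exactly the risk. The strict read makes the conflict impossible to miss, at the cost of a second API for resolving it.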

What about versioning?

In addition to the Document ID, each document will have an associated version. The Document ID is a UUID, which means that it can be generated on the client side. Each document is also versioned by the server accepting it. The version follows this format: [server guid]/[increasing numeric id]/[time].

That will ensure global uniqueness, as well as giving us all the information that we need for the document version.
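As a sketch of what generating and parsing such a version could look like, consider the following. The `DocumentVersion` and `VersionGenerator` names are hypothetical, and so is the choice of ISO 8601 in UTC for the time part, since the format above does not say how the time is encoded.

```python
import itertools
import uuid
from datetime import datetime, timezone
from typing import NamedTuple


class DocumentVersion(NamedTuple):
    server_id: uuid.UUID
    sequence: int
    timestamp: datetime

    def __str__(self):
        # [server guid]/[increasing numeric id]/[time]
        return f"{self.server_id}/{self.sequence}/{self.timestamp.isoformat()}"

    @classmethod
    def parse(cls, text):
        server, sequence, timestamp = text.split("/")
        return cls(uuid.UUID(server), int(sequence), datetime.fromisoformat(timestamp))


class VersionGenerator:
    """One generator per server; the counter only ever increases."""

    def __init__(self, server_id=None):
        self.server_id = server_id or uuid.uuid4()
        self._counter = itertools.count(1)

    def next(self):
        return DocumentVersion(
            self.server_id, next(self._counter), datetime.now(timezone.utc)
        )
```

The server GUID keeps versions from different servers apart, and the counter orders versions issued by the same server. The time is carried along as extra information.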

Comments

10 Mar 2009
12:03 PM

Tobin Harris

When you say versioned, do you mean that we can rewind/fast forward in time to a particular version?

10 Mar 2009
12:12 PM

Ayende Rahien

No, I mean that you know what version a document is at, which is useful for things like optimistic concurrency.

10 Mar 2009
15:35 PM

josh

So do you version at the record/document level or at the field level? Assuming you want to merge changes without overwriting someone else's update, you need a way to determine what changed at the field level. Maybe you don't care that much, or just make a user reload the latest version before allowing them to commit changes (which can be a bad user experience).

10 Mar 2009
15:44 PM

Ayende Rahien

No, I do not. I track this at field level.

This is something like the way SVN tracks those changes.

10 Mar 2009
15:50 PM

josh

So you track what changes on the client/app side, and send only those changes with the relevant version?

10 Mar 2009
15:56 PM

Ayende Rahien

I am not sure that I understand what you mean.

What I intend to do is actually create a very simple system. If you update a document with a version that is not the latest, I am going to reject the update.

10 Mar 2009
16:02 PM

josh

OK, I guess I was throwing partial updates in there too. If the user experience of requiring the document to be reloaded when it's not the latest before accepting changes is acceptable, then good. It is, by far, much easier to implement.

10 Mar 2009
17:21 PM

huey

The server rejecting an update does not translate into the user experience. It is up to the client code to know how to handle this rejection in a user pleasing and business logic acceptable way.

Oren Eini

CEO of RavenDB