A tale of eventually consistent ACID model

time to read 5 min | 830 words

imageI recently had a conversation about ACID, I don’t think it would surprise anyone that I’m a big proponent of ACID. After all, RavenDB was an ACID database from the first release.

When working with distributed systems, on the other hand, it is far harder to get ACID guarantees at a reasonable cost. Pretty much all the 1st generation NoSQL databases left ACID on the sidelines, because it is a hard problem. That was one of the primary reasons why RavenDB even exists. I couldn’t imagine living without transactions. This is a post from 2011, talking about just that topic.

Consistency in a distributed system is a hard problem, mostly because it has an impact on the design and performance of the system. It is also common to think about ACID as a binary property, which is sort of true (A for Atomic Smile). However, it turns out that the real world is a lot more nuanced than that.

I want to discuss the consistency model for RavenDB as it applies to running in a distributed cluster. It is ACID with eventual consistency, which doesn’t sound like it makes sense, right?

I found a good example to explain the importance of ACID operations from your database even in the presence of eventual consistency.

Consider the following scenario, we have a married couple with a shared bank account. Both husband and wife have a checkbook for the account and primarily use checks to pay for things in their day to day life.

Checks are anachronistic for some people, who are used to instant payments and wire transfers. The great thing about checks is that they are (by definition) a way to work in a distributed system. You hand someone a check and at some future point in time they will deposit that and get the money from your account.

One of the most important aspects of using checks was managing that delay. The amount of money you had in the account didn’t necessarily represent how much money you had available. If your rent check wasn’t deposited yet, you still had to consider the rest money “gone”, even if you could still see it in the bank statement.

Because of checks’ eventual consistency, a really important part of using checks was to keep track of all the outstanding checks that weren’t deposited yet. You did that by filling in the stub of the check in the checkbook whenever you wrote a check. In other words, you never gave a check before you properly filled the stub for that.

That brings us back to ACID. The act of filling the stub and writing the check is a transaction, composed of two separate actions. That action isn’t a global transaction. The husband and wife in our example do not have to coordinate with one another whenever they write a check. But they do need to ensure that no check would be handed off without a proper stub (and vice versa, if we want to be exact). If the act of writing a check and filling the stub isn’t atomic, you may have a check unexpectedly hit your account, which is… exciting (in the Chinese proverb  manner).

On the other side, the entity that you handed the check to also needs a transaction. They need to fill out an invoice for the check (even though it hasn’t been deposited yet). Having a check with no invoice or an invoice with no check is… bad (as in, IRS agents having shots and high fives during an audit).

The idea is that at the local level, you have to use transactions, otherwise, you cannot be sure about the consistency of your own data. If you don’t have transactions at the persistence layer, you’ll have to build it on top of that, which is… not ideal, really hard and usually not going to work in all cases.

With local transactions, you can then start pushing consistent data out and resolve all the distributed states you have.

Going back to our husband and wife example, for the most part, they can act completely independently of one another, and they’ll reconcile their account status with each other at a later date (weekly budget meeting). At the same time, there are certain transactions (pun intended) where they won’t act independently. A great example is buying a car, that sort of amount requires that both will be consulted on the purchase. That is a high value operation, so it is worth the additional cost of distributed consistency.

With RavenDB, we have the notion of local node transactions, which are then sent out to the rest of the nodes in the cluster in the background (async replication) as well as support for cluster wide transactions, which requires the consent of a majority of the nodes in the cluster. You can choose for each scenario exactly what level of transactions and consistency you want to have, local or global.