Ayende @ Rahien

Refunds available at head office

World’s smallest No SQL Database: ACID & Transactions

Even for a first attempt, World’s Smallest No  SQL Database is actually pretty good. We got 3 of the 4.

The DB is:

  • Atomic, since change will either go in or be rejected.
  • Consistent, you can’t see changes half way.
  • Isolated, one change can’t modify/view another.

It isn’t durable, because everything is in memory.

It even has transactions, in the sense that a change goes in or not in an atomic fashion.

But, I think you’ll agree, this is very much a poor man’s solution. And those only apply just as long as you need to work with just a single item. In practice, you would usually want to make a lot more than that.

The basic properties you want is the ability to do multi item transactions, so you can modify two items at the same time, and they either go in or not at all. And that is where it gets really  complex, because at that point we need to build our own way to handle atomic updates to multiple values, how to keep them isolated and how to handle this consistently.

There aren’t really simple way to handle that, especially if we want to handle concurrent transactions. You would have to make a lot of decisions all of a sudden. For example, what happens if you have two concurrent transactions trying to update the same value. Is one of them rejected? And if so, at what point? Initial write? Commit time?

And that is just the start.

  • Do you allow a transaction to live across multiple requests?
  • How do you handle transaction expiration?
  • Do you need to have a transaction span more than one node? How do you handle that?

And we haven’t even talked about making transactions durable. There is no point, since we don’t do any persistence. But we will touch on persistence in a future post, so let talk about durability right now.

In general, there are two major ways to handle transactions. Either an append only model or a transaction log. In both cases, concurrent transactions make it… interesting to figure out how to properly write to the log. And the moment that you have either option, you have to build in some form of a recovery tool for system or db crashes. You have better also love fsync, since it is a very important tool in making sure that you can actually get durable transactions.

Oh, and of course, I forgot, if we allow long running transactions…

  • Who can read the yet uncommitted value? Everyone? Just the current transaction? Can you explicitly ask to see uncommitted value?
  • How do you allow it to be read?
  • Is it available for queries?

This is usually the most touchy part in the system, since is so critical. It is also something that you have to take into account in pretty much the entire codebase.

Tags:

Posted By: Ayende Rahien

Published at

Originally posted at

Comments

tobi
07/18/2013 09:30 AM by
tobi

Transactions get a lot easier if you insist that all writes be submitted at once. That allows you to achieve deadlock freedom by sorting the keys.

Regarding writing to the log: I like the SQL Server model where they are able to write every record whenever they want as long as it is before the end of the transaction. They cannot write too early. They do this by ending the transaction with a final "commit" log record. If that is not found during recovery they rollback the transaction.

Harry McIntyre
07/18/2013 10:01 AM by
Harry McIntyre

This makes me feel very uneasy about some of the projects in my github account...

Ayende Rahien
07/18/2013 03:30 PM by
Ayende Rahien

Harray, What is the problem?

meisinger
07/19/2013 06:58 AM by
meisinger

so how long do you think it will be before you fall into the "two phase commit" paradigm?

Ayende Rahien
07/19/2013 07:12 AM by
Ayende Rahien

meisinger, This is actually a pretty big step from standard transactions to a 2PC transaction. 2PC is a LOT more complex.

Harry McIntyre
07/22/2013 09:33 AM by
Harry McIntyre

Heh, the problem is I have started to write my own database - your posts make me realise I'm walking off a cliff!

Comments have been closed on this topic.