Ayende @ Rahien

Time series feature design: The Consensus has dRafted a decision

So, after reaching the conclusion that replication is going to be hard, I went back to the office, discussed those challenges, and was generally pretty annoyed by the whole thing. Then Michael made a really interesting suggestion: why not put it on Raft?

And once he explained what he meant, I really couldn’t contain my excitement. We now have a major feature for 4.0. But before I get too excited about that (we’ll only be able to actually start working on it in a few months, anyway), let us talk about what the actual suggestion was.

Raft is a consensus algorithm. It allows a distributed set of computers to arrive at a mutually agreed-upon sequence of log records. Hm… I wonder where else we can find sequential log records, and yes, I am looking at you, Voron.Journal.

The basic idea is that we can take the concept of log shipping, but instead of having a single master/slave relationship, we put Raft in the middle. When committing a transaction, we’ll hold off on the actual commit until we have a Raft consensus that it should be committed. The advantage here is that we are no longer constrained by the master/slave issue. If a server is down, we can still process requests (we may need to elect a new cluster leader, but that is about it).
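To make the "hold off committing until we have consensus" step concrete, here is a minimal sketch in Python. It is a toy under stated assumptions, not how Voron or any real Raft implementation works: `Follower` and `commit_with_quorum` are hypothetical names, and terms, leader election, retries, and log repair are all left out.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    """Hypothetical follower node that stores the journal entries it receives."""
    name: str
    online: bool = True
    journal: list = field(default_factory=list)

    def append(self, entry: bytes) -> bool:
        # An offline follower never acknowledges the entry.
        if not self.online:
            return False
        self.journal.append(entry)
        return True


def commit_with_quorum(entry: bytes, followers: list) -> bool:
    """Return True only if a majority of the cluster acknowledged the entry.

    The leader counts as one acknowledgement. Simplification: a real
    implementation would also have to deal with entries that reached some
    followers but never reached a majority.
    """
    acks = 1  # the leader itself
    for follower in followers:
        if follower.append(entry):
            acks += 1
    cluster_size = len(followers) + 1
    return acks >= cluster_size // 2 + 1


# Three-node cluster (leader + two followers) with one follower down.
followers = [Follower("b"), Follower("c", online=False)]
print(commit_with_quorum(b"tx-1", followers))  # True: 2 of 3 nodes acknowledged
```

With both followers offline, the same call returns False and the transaction would not be committed, which is the behavior the paragraph above describes.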

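That means that from an architectural standpoint, we’ll be able to process write requests as long as a quorum of the N nodes (N/2+1, using integer division) is available. For example, a five-node cluster keeps accepting writes with two nodes down. That is a pretty standard requirement for distributed databases, so that is perfectly fine.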
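The quorum arithmetic is easy to check by hand, but a few lines of Python make the trade-off visible. The function names here are illustrative only, and N/2 is taken to mean integer division.

```python
def quorum_size(n: int) -> int:
    """Smallest majority of an n-node cluster: floor(n / 2) + 1."""
    if n < 1:
        raise ValueError("a cluster needs at least one node")
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many nodes can be down while a quorum is still reachable."""
    return n - quorum_size(n)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# prints:
# 3 2 1
# 4 3 1
# 5 3 2
```

Note that going from three nodes to four does not buy any extra fault tolerance, which is why odd cluster sizes are the usual choice.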

That is a pretty awesome thing to have, to be honest, and more importantly, this is happening at the low level storage layer. That means that we can apply this behavior not just to a single database solution, but to many database solutions.

I’m pretty excited about this.


02/28/2014 12:31 PM by Kijana Woodard

Lol. I've been pretty excited at the prospect of applying Raft to Voron/Raven since this series started. Good to know I wasn't completely insane. Wish I wasn't a month behind you guys. 😁

03/03/2014 01:26 PM by

Is this kind of consensus-based replication something you would apply only to increase fault tolerance between local nodes, or also to replicate between nodes distributed around the world?

03/03/2014 03:09 PM by Ayende Rahien

Frank, that is something I'd apply in general, whenever I need HA.
