Design exercise: Distributing (consistent) data at scale, answer


In my previous post I talked about an interesting challenge: distributing data among many different nodes, each of which can act independently on this data. Sadly, the best example for this scenario is distributing ads, but I hope you’ll excuse me on that.

The first thing to realize about this task is that it is basically a synchronization problem more than anything else. We define a certain location as the primary one, the one that will accept all the modifications to the data. Each of the nodes then simply needs to connect to that location every now and then and get all the updates.

“Basically a synchronization problem” is like saying that getting a P1 emergency bug on Friday night, just before the movie starts, is a “tad annoying”. The problem is that to do synchronization properly, you have to model your data properly, make sure that you keep track of changes, and be able to send partial changes down the wire efficiently. That is not a simple task at all.

In this post, I want to offer another option for handling this: using Raft. This is a strange use case for a consensus algorithm, I’ll admit. I guess that technically you could run a Raft consensus over 10,000 nodes. Just don’t expect it to be making any sort of decisions. So why am I offering Raft for a scenario where we have that many nodes? Because Raft, at its core, is a way to achieve consensus on a distributed log, that is all. And no one says that you must get that distributed log only via Raft directly.

The idea is basically to have a single source of truth. This can be a single server or it can be a Raft cluster with 3 – 7 nodes in it. All writes in the system are going to go there. Maintaining such a replicated log is very well understood and there are multiple ways to do it. The simplest one to consume is likely rqlite. The log, in the case of rqlite, is going to be the SQL statements that are going to be applied to a SQLite database.
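To make the "log of SQL statements" idea concrete, here is a minimal sketch using only Python's standard `sqlite3` module. It does not use rqlite or its API; the `LOG` contents, the `ads` table, and the `replay` and `snapshot` helpers are all made up for illustration. The point it tries to show is that two databases that replay the same ordered log end up in the same state, assuming every statement is deterministic (no `random()`, no "current time" calls).

```python
import sqlite3

# Hypothetical committed log: an ordered list of (statement, parameters).
LOG = [
    ("CREATE TABLE ads (id INTEGER PRIMARY KEY, title TEXT)", ()),
    ("INSERT INTO ads (id, title) VALUES (?, ?)", (1, "Spring sale")),
    ("UPDATE ads SET title = ? WHERE id = ?", ("Summer sale", 1)),
]

def replay(log):
    """Build a fresh in-memory database by applying the log in order."""
    db = sqlite3.connect(":memory:")
    for statement, params in log:
        db.execute(statement, params)
    db.commit()
    return db

def snapshot(db):
    """Return the full contents of the illustrative `ads` table."""
    return db.execute("SELECT id, title FROM ads ORDER BY id").fetchall()

# Two independent replicas that replay the same log.
replica_a = replay(LOG)
replica_b = replay(LOG)
```

Comparing `snapshot(replica_a)` with `snapshot(replica_b)` should show identical rows, which is the property the rest of the design leans on.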

But how does this solve the problem of distributing the data? The answer is simple: we already have a way to disseminate distributed state, the log itself. Each of the nodes at the edge connects to the cluster and asks for the portion of the log that follows the last committed entry it already has. When it gets that, it can apply these statements to its own local copy of SQLite and know that it is now up to date with the state of the system as of that point in the log.
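The catch-up step described above might look roughly like the following. This is a sketch under stated assumptions, not how rqlite exposes its log: `SourceOfTruth`, `entries_after`, `EdgeNode`, and the `sync_state` bookkeeping table are hypothetical names, and the "network call" is just an in-process method call.

```python
import sqlite3

class SourceOfTruth:
    """Hypothetical stand-in for the primary (a single server or a Raft cluster)."""

    def __init__(self):
        self._log = []  # committed statements; the entry at position i has index i + 1

    def append(self, statement, params=()):
        self._log.append((statement, params))
        return len(self._log)  # index of the entry just committed

    def entries_after(self, index):
        """Return (index, statement, params) for every committed entry past `index`."""
        return [(i + 1, s, p) for i, (s, p) in enumerate(self._log) if i + 1 > index]

class EdgeNode:
    """An edge node with its own local SQLite copy of the data."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        # Assumed bookkeeping table: remembers the last log index applied locally.
        self.db.execute("CREATE TABLE sync_state (last_applied INTEGER NOT NULL)")
        self.db.execute("INSERT INTO sync_state VALUES (0)")
        self.db.commit()

    def last_applied(self):
        return self.db.execute("SELECT last_applied FROM sync_state").fetchone()[0]

    def catch_up(self, source):
        """Fetch and apply everything committed since our last applied entry."""
        entries = source.entries_after(self.last_applied())
        with self.db:  # commits on success, rolls back the open transaction on error
            for index, statement, params in entries:
                self.db.execute(statement, params)
                self.db.execute("UPDATE sync_state SET last_applied = ?", (index,))
        return len(entries)

source = SourceOfTruth()
source.append("CREATE TABLE ads (id INTEGER PRIMARY KEY, title TEXT)")
source.append("INSERT INTO ads VALUES (?, ?)", (1, "Spring sale"))

node = EdgeNode()
first = node.catch_up(source)   # applies both entries
source.append("INSERT INTO ads VALUES (?, ?)", (2, "Free shipping"))
second = node.catch_up(source)  # applies only the new entry
titles = [row[0] for row in node.db.execute("SELECT title FROM ads ORDER BY id")]
```

Storing the last applied index in the same database as the data is a design choice of this sketch: it lets the node resume from the right place after a restart without any extra coordination.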

This approach skips over the need to architect your data for sync (which is hard) and pushes all of that complexity down the stack to your infrastructure. If the number of nodes you have is large enough, you might need to introduce mirrors to reduce the load. But that fits very nicely into the architecture without really needing to change anything.
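One way to picture why mirrors fit in without changes: a mirror can be treated as a node that copies the log from upstream and then serves it through the same interface the primary offers. The sketch below assumes that shape; `LogStore`, `Mirror`, `pull`, and `sync_edge` are illustrative names, not part of Raft or rqlite.

```python
import sqlite3

class LogStore:
    """An ordered list of committed SQL statements; the entry at position i has index i + 1."""

    def __init__(self):
        self.statements = []

    def last_index(self):
        return len(self.statements)

    def entries_after(self, index):
        return self.statements[index:]

class Mirror(LogStore):
    """Hypothetical read-only relay: copies the log from upstream, serves the same interface."""

    def __init__(self, upstream):
        super().__init__()
        self.upstream = upstream

    def pull(self):
        new = self.upstream.entries_after(self.last_index())
        self.statements.extend(new)
        return len(new)

def sync_edge(db, applied, source):
    """Apply everything past `applied` from `source`; return the new applied index."""
    for statement in source.entries_after(applied):
        db.execute(statement)
        applied += 1
    db.commit()
    return applied

primary = LogStore()
primary.statements += [
    "CREATE TABLE ads (id INTEGER PRIMARY KEY, title TEXT)",
    "INSERT INTO ads VALUES (1, 'Spring sale')",
]

mirror = Mirror(primary)
mirror.pull()  # the mirror catches up from the primary

edge_db = sqlite3.connect(":memory:")
applied = sync_edge(edge_db, 0, mirror)  # the edge node talks only to the mirror
count = edge_db.execute("SELECT COUNT(*) FROM ads").fetchone()[0]
```

Because `sync_edge` only relies on `entries_after`, it does not care whether `source` is the primary or a mirror, so mirrors can be added or chained without touching the edge nodes.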

More posts in "Design exercise" series:

  1. (19 Dec 2018) Distributing (consistent) data at scale, answer
  2. (18 Dec 2018) Distributing (consistent) data at scale
  3. (26 Nov 2018) A generic network protocol