Large scale distributed consensus approaches: Large data sets

time to read 3 min | 464 words

In my previous post, I talked about how we can design a large cluster for compute bound operations. The nice thing about this is that the actual amount of shared data that you need is pretty small, and you can just distribute that information among your nodes, let them do stateless computation on it, and you are done.

A much more common scenario is when we can’t just do stateless operations, but need to keep track of what is actually going on. The typical example is a set of users changing data. For example, let us say that we want to keep track of the pages each user visits on our site. (Yes, that is a pretty classic Big Table scenario, I’ll ignore the prior art issue for now.) How would we design such a system?

Well, we still have the same considerations. We don’t want a single point of failure, and we want to have a very large number of machines and make the most of their resources.

In this case, we are merely going to change the way we look at the data. We still have the following topology:

[Image: the consensus cluster surrounded by the data nodes]

There is the consensus cluster, which is responsible for cluster-wide, immediately consistent operations. And there are all the other nodes, which actually handle processing requests and keeping the data.

What kind of decisions do we get to make in the consensus cluster? Those would be:

  • Adding & removing nodes from the entire cluster.
  • Changing the distribution of the data in the cluster.

In other words, the state that the consensus cluster is responsible for is the entire cluster topology. When a request comes in, the cluster topology is used to decide which set of nodes to direct it to.
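To make that concrete, here is a minimal sketch of what such a topology document might look like. The names and layout here are assumptions for illustration, not an actual implementation:

```python
# Hypothetical sketch of the cluster-wide state the consensus cluster owns.
from dataclasses import dataclass, field

@dataclass
class ClusterTopology:
    # Bumped on every consensus decision, so nodes can detect stale copies.
    version: int
    # All nodes currently part of the cluster.
    nodes: list[str] = field(default_factory=list)
    # Shard number -> the three nodes responsible for that shard's data.
    shard_assignments: dict[int, list[str]] = field(default_factory=dict)

topology = ClusterTopology(
    version=42,
    nodes=["node-01", "node-02", "node-03", "node-04", "node-05"],
    shard_assignments={
        0: ["node-01", "node-02", "node-03"],
        1: ["node-02", "node-04", "node-05"],
        2: ["node-03", "node-04", "node-01"],
    },
)
```

The consensus cluster only ever has to agree on changes to this small document, which is why it can stay small while the rest of the cluster grows.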

Typically in such systems, we want to keep the data on three separate nodes, so when we get a request, we route it to one of the three nodes that hold that data. This is done by sharding the data according to the user id whose page views we are trying to track.
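A sketch of that routing, building on the hypothetical topology above (the hashing scheme here is just one possible choice, not a prescription):

```python
import hashlib

NUMBER_OF_SHARDS = 3  # assumed fixed for this sketch

def shard_for_user(user_id: str) -> int:
    # Hash the user id so the same user always maps to the same shard.
    digest = hashlib.sha1(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUMBER_OF_SHARDS

def nodes_for_user(topology: ClusterTopology, user_id: str) -> list[str]:
    # The three nodes that hold this user's page view data.
    return topology.shard_assignments[shard_for_user(user_id)]

# A request for this user can be served by any of these three nodes.
print(nodes_for_user(topology, "users/1234"))
```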

Distributing the sharding configuration is done as described in the compute cluster example, and the actual handling of requests, as well as sending data between the sharded instances, is handled by the cluster nodes directly.
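One way to picture that data path, again as a sketch only (store_locally and send_to_node are hypothetical placeholders for the node's storage and networking):

```python
def store_locally(user_id: str, page: str) -> None:
    # Placeholder for the node's actual storage write.
    print(f"stored locally: {user_id} -> {page}")

def send_to_node(node: str, message: dict) -> None:
    # Placeholder for sending the write to another replica over the network.
    print(f"sent to {node}: {message}")

def record_page_view(local_node: str, topology: ClusterTopology,
                     user_id: str, page: str) -> None:
    # The node that received the request writes locally, then fans the write
    # out to the other replicas directly, without involving the consensus cluster.
    replicas = nodes_for_user(topology, user_id)
    store_locally(user_id, page)
    for node in replicas:
        if node != local_node:
            send_to_node(node, {"user": user_id, "page": page,
                                "topology_version": topology.version})
```

Note that the topology version travels with the write, so a replica holding a stale topology can notice the mismatch and refresh its configuration.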

Note that in this scenario, you cannot ensure any kind of safety. Two requests for the same user might hit different nodes and do separate operations without being aware of each other. Usually, that is a good thing, but that isn’t always the case. That, however, is a topic for the next post.

More posts in "Large scale distributed consensus approaches" series:

  1. (24 Nov 2014) Concurrent consistent decisions
  2. (20 Nov 2014) Large data sets
  3. (19 Nov 2014) Computing with a hundred node cluster
  4. (17 Nov 2014) Calculating a way out