RavenDB 3.5 whirl wind tour: A large cluster goes into a bar and orders N^2 drinks


Imagine that you have a two-node cluster, set up as master-master replication, and then you write a document to one of them. The node you wrote the document to now contacts the 2nd node to let it know about the new document. The data is replicated, and everything is happy in this world.

But now let us imagine it with three nodes. We write to node 1, which will then replicate to nodes 2 and 3. But node 2 is also configured to replicate to node 3, and given that it has a new document, it will do just that. Node 3 will detect that it already has this document and turn that into a no-op. But at the same time that node 3 is getting the document from node 2, it is also sending the document it got from node 1 to node 2.

This works, and it evens out eventually, because replicating a document that was already replicated is safe to do. And on high load systems, replication is batched, so you typically don't see a problem until you get to bigger cluster sizes.
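To see why the duplicate deliveries are harmless, here is a minimal sketch of an idempotent receive (plain Python, purely illustrative; `apply_incoming`, the `version` field, and the in-memory `store` are assumptions for the example, not RavenDB's actual replication code):

```python
# Illustrative sketch only -- not RavenDB's actual replication code.
# The destination checks whether it already has this version of the document;
# if it does, the incoming copy is ignored, which is why receiving the same
# document from several siblings is safe.
def apply_incoming(store: dict, doc_id: str, version: int, body: dict) -> bool:
    existing = store.get(doc_id)
    if existing is not None and existing["version"] >= version:
        return False  # already have it (or something newer): no-op
    store[doc_id] = {"version": version, "body": body}
    return True       # actually applied; will be replicated onward in turn
```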

Let us take the following six-way cluster. In this graph, we are going to have 15 round trips across the network on each replication batch.*

* Nitpicker corner: yes, I know that the number of connections is ( N * (N-1) ) / 2, but N^2 looks better in the title.
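For the curious, the arithmetic as a couple of lines of Python (nothing RavenDB specific here):

```python
# Number of connections in a fully connected mesh of n nodes: n * (n - 1) / 2.
def mesh_connections(n: int) -> int:
    return n * (n - 1) // 2

print(mesh_connections(6))   # 15 -- the six-way cluster below
print(mesh_connections(10))  # 45 -- the cost grows roughly quadratically with cluster size
```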

[Figure: fully connected mesh topology]

The typical answer we have for this is to change the topology. Instead of having a fully connected graph, with each node talking to all other nodes, we use something like this:

[Figure: tree topology]

Real topologies typically have more than a single path, and it isn't a hierarchy, but this is to illustrate a point.

This works, but it requires the operations team to plan ahead when they deploy, and if you didn't allow for breakage, a single node going down can disconnect a large portion of your cluster. That is not ideal.

So in RavenDB 3.5 we have taken steps to avoid it; nodes are now much smarter about the way they go about talking about their changes. Instead of getting all fired up and starting to send replication messages all over the place, potentially putting some serious pressure on the network, the nodes will be smarter about it and wait a bit to see if their siblings already got the documents from the same source. In that case, we now only need to ping them periodically to ensure that they are still connected, and we save a whole bunch of bandwidth. The idea, very roughly, looks like the sketch below.
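This is an illustrative sketch of the idea only, not the actual RavenDB 3.5 implementation; `already_has`, `heartbeat`, and `send` are made-up names for the example:

```python
import time

# Illustrative sketch only -- not the actual RavenDB 3.5 replication code.
# Before pushing a batch to a sibling, wait briefly; if the sibling already
# received these changes from the original source, fall back to a cheap ping.
def replicate_to_sibling(sibling, batch, source_node_id, last_etag, wait_seconds=1.0):
    time.sleep(wait_seconds)                            # give siblings a chance to get it first
    if sibling.already_has(source_node_id, last_etag):  # sibling already saw these changes
        sibling.heartbeat()                             # just confirm we are still connected
        return 0                                        # nothing sent, bandwidth saved
    sibling.send(batch)                                 # sibling is behind: send the real data
    return len(batch)
```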

More posts in "RavenDB 3.5 whirl wind tour" series:

  1. (25 May 2016) Got anything to declare, ya smuggler?
  2. (23 May 2016) I'm no longer conflicted about this
  3. (19 May 2016) What did you subscribe to again?
  4. (17 May 2016) See here, I got a contract, I say!
  5. (13 May 2016) Deeper insights to indexing
  6. (11 May 2016) Digging deep into the internals
  7. (09 May 2016) I'll have the 3+1 goodies to go, please
  8. (04 May 2016) I’ll find who is taking my I/O bandwidth and they SHALL pay
  9. (02 May 2016) You want all the data, you can’t handle all the data
  10. (29 Apr 2016) A large cluster goes into a bar and order N^2 drinks
  11. (27 Apr 2016) I’m the admin, and I got the POWER
  12. (25 Apr 2016) Can you spare me a server?
  13. (21 Apr 2016) Configuring once is best done after testing twice
  14. (19 Apr 2016) Is this a cluster in your pocket AND you are happy to see me?