
Data roles don’t scale up/down dynamically


Originally posted at 12/5/2010

One question that comes up frequently with regard to running RavenDB on the cloud is how to take advantage of the cloud's ability to scale up & down dynamically. The answer to that is actually quite interesting.

You don’t.

That seems to give most people pause, because it is totally unexpected. On the cloud, people expect to be able to dynamically adjust the number of servers they have based on their current load. After all, that is what you do with web and worker roles, no?

The problem with that logic chain is that it assumes equality between a database and a web/worker role. For the most part, web/worker roles are pretty much stateless; they may have caches, but that is about it. That makes it very easy to add new servers when there is heavy load and remove them when the load goes down.

But for data roles, you can't really do that. What is going to happen to the data on that node when you take it down because there is less work to be done?

There are actually solutions for that, to tell you the truth, at least for RavenDB, because we can manipulate offline databases very easily, so we can shuffle them off & on machines by just copying the data files. But for the most part, it is actually too much work. Even for very large loads, a small number of sharded servers can more than keep up with your application, and while it is theoretically nice to have the ability to do so, you usually don't care.
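To make that concrete, here is a minimal sketch of such an offline move, assuming each database lives in a self-contained data directory that can be copied while it is not being served. The paths and database name are hypothetical, and the routing step is whatever your deployment uses.

import shutil

DB_NAME = "Northwind"  # hypothetical database name
SOURCE = f"/mnt/node-a/raven-data/{DB_NAME}"
TARGET = f"/mnt/node-b/raven-data/{DB_NAME}"

# 1. Take the database offline on node A so nothing writes to it.
# 2. Copy its data directory over to node B.
shutil.copytree(SOURCE, TARGET)
# 3. Bring the database online on node B and point clients that
#    used to talk to node A at node B instead.

The whole thing is just a file copy plus a routing change, which is exactly why it is possible, and also why it is operational work you would rather avoid doing on every load spike.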



The ability to linearly scale your data storage up or down is one of the core tenets of Amazon's Dynamo paper, along with the various implementations such as Cassandra and Riak. So while it is being done, it does have the potential to introduce some interesting temporal and consistency problems into your application code, which then has to reconcile divergent copies of the underlying data. This is probably why Facebook opted to go with the fully consistent HBase for their new messaging platform rather than continue with Cassandra or another Dynamo clone.
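To illustrate what that reconciliation burden looks like, here is a minimal sketch of Dynamo-style sibling merging with vector clocks, along the lines of the shopping-cart example from the Dynamo paper. The node names and the merge policy (set union) are assumptions for the example, not anything a particular store mandates.

def descends(a, b):
    # Clock a descends from clock b if it is >= b for every node.
    return all(a.get(node, 0) >= n for node, n in b.items())

def reconcile(siblings):
    # siblings: list of (vector_clock, cart) pairs returned by a read.
    # Keep only the versions that no other version strictly dominates.
    concurrent = [
        (ca, va) for ca, va in siblings
        if not any(cb != ca and descends(cb, ca) for cb, _ in siblings)
    ]
    # Merge the surviving concurrent versions: union the values and
    # take the component-wise maximum of the clocks.
    clock, cart = {}, set()
    for c, v in concurrent:
        cart |= v
        for node, n in c.items():
            clock[node] = max(clock.get(node, 0), n)
    return clock, cart

# Two replicas accepted writes concurrently during a partition:
v1 = ({"n1": 2, "n2": 1}, {"milk", "bread"})
v2 = ({"n1": 1, "n2": 2}, {"milk", "eggs"})
print(reconcile([v1, v2]))
# -> ({'n1': 2, 'n2': 2}, {'bread', 'eggs', 'milk'})

That merge logic lives in the application, and it gets much harder when the values are not naturally mergeable sets, which is the cost you pay for the dynamic scaling.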



Speaking of Cassandra, you should scale it up part by part, remembering that there is a growth factor you should not exceed in one scaling iteration; but in the end it is linearly scalable :)
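For illustration only, here is a hypothetical sketch of what growing a cluster in bounded steps looks like, assuming a maximum growth factor of 2x per scaling iteration (the real limit depends on the cluster and on rebalancing cost):

current, target = 3, 24
steps = []
while current < target:
    # Each iteration may at most double the node count, then the
    # cluster rebalances before the next iteration starts.
    current = min(current * 2, target)
    steps.append(current)
print(steps)  # [6, 12, 24] -- three iterations from 3 nodes to 24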


Membase does live cluster reconfiguration, but it is merely a persistent memcached; no queries / indexes yet, but they will get there one day...
