Ayende @ Rahien

Hi!
My name is Ayende Rahien
Founder of Hibernating Rhinos LTD and RavenDB.

Data roles don’t scale up/down dynamically


Originally posted at 12/5/2010

One question that comes up frequently with regard to running RavenDB on the cloud is how to take advantage of the cloud’s ability to scale up & down dynamically. The answer to that is actually quite interesting.

You don’t.

That seems to give most people pause, because it is totally unexpected. On the cloud, people expect to be able to dynamically adjust the number of servers they have based on their current load. After all, this is what you do with web and worker roles, no?

The problem with that logic chain is that it assumes equality between a database and a web/worker role. For the most part, web/worker roles are pretty much stateless. They may have caches, but that is about it. That makes it very easy to add new servers when there is heavy load and remove them when the load goes down.

But for data roles, you can’t really do that. What is going to happen to the data in that node when you take it down when there is less work to be done?
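To make that question concrete, here is a minimal sketch in Python. It assumes a naive scheme in which each document key is placed on a node by hashing the key modulo the node count. That scheme, along with the names `shard_for` and `moved_fraction`, is hypothetical and only for illustration; it is not how RavenDB places data. It counts how many documents would need a new home if a four-node data tier shrank to three.

```python
import hashlib


def shard_for(key: str, node_count: int) -> int:
    """Map a document key to a node index (hypothetical hash-modulo placement)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


def moved_fraction(keys, before: int, after: int) -> float:
    """Fraction of keys whose owning node changes when the node count changes."""
    moved = sum(1 for k in keys if shard_for(k, before) != shard_for(k, after))
    return moved / len(keys)


# Made-up document keys, purely for the illustration.
keys = [f"users/{i}" for i in range(10_000)]

# Shrinking the data tier from 4 nodes to 3 under this naive scheme.
fraction = moved_fraction(keys, before=4, after=3)
print(f"{fraction:.0%} of documents would have to move")
```

Under this assumed scheme, roughly three quarters of the documents change owner when one node goes away, and every one of them has to be copied before that node can be shut down. A stateless web role has nothing comparable to move, which is why removing one is cheap.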

To tell the truth, there are solutions for that, at least for RavenDB: we can manipulate offline databases very easily, so we can shuffle them on and off machines by just copying the document. But for the most part, it is too much work. Even for very large loads, a small number of sharded servers can more than keep up with your application, and while it is theoretically nice to be able to scale data nodes dynamically, you usually don’t care.


Comments

Jonathan

The ability to linearly scale your data storage up or down is one of the core tenets of Amazon's Dynamo paper, along with the various implementations such as Cassandra and Riak. So, while it is being done, it does have the potential to introduce some interesting temporal and consistency problems into your application code, which then has to reconcile divergent copies of the underlying data. This is probably why Facebook opted to go with a fully consistent HBase for their new messaging platform rather than continue with Cassandra or another Dynamo clone.

Scooletz

@Jonathan,

speaking about Cassandra, you should scale it up part by part, remembering that there is a growth factor you should not exceed in one scaling iteration, but in the end it is linearly scalable :)

zvolkov

Membase does live cluster reconfiguration, but it is merely a persistent memcached, with no queries / indexes yet. They will get there one day...
