
RavenDB Auto Sharding Bundle Design–Early Thoughts


Originally posted at 4/19/2011

RavenDB Auto Sharding is an implementation of sharding on the server. As the name implies, it aims to remove all sharding concerns from the user. At its core, the basic idea is simple. You have a RavenDB node with the sharding bundle installed. You just work with it normally.

At some point you realize that the data has grown too large for a single server, so you need to shard the data across multiple servers. You bring up another RavenDB server with the sharding bundle installed. You wait for the data to re-shard (during which time you can still read / write to the servers). You are done.

At least, that is the goal. In practice, there is one step that you have to do: you have to tell us how to shard your data. You do that by defining a sharding document, which looks like this:

{ // Raven/Sharding/ByUserName
  "Limits": [3],
  "Replica": 2
  "Definitions": [
    {
      "EntityName": "Users",
      "Paths": ["Username"]
    },
    {
      "EntityName": "Posts",
      "Paths": ["AuthorName"]
    }
  ]
}

There are several things to note here. We define a sharding document that shards on just one key, and the shard key has a length of 3. We also define different ways to retrieve the sharding key from the documents, based on the entity name. This is important, since you want to be able to say that posts by the same user would sit on the same shard.
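
To make that concrete, here is a minimal sketch (in C#, not the actual bundle code) of how the shard key could be extracted, assuming the document is modeled as a flat dictionary and that ShardingDefinition mirrors the Definitions entries above:

using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only; mirrors the sharding document, not the bundle's types.
public class ShardingDefinition
{
    public string EntityName;  // e.g. "Users" or "Posts"
    public string[] Paths;     // e.g. ["Username"] or ["AuthorName"]
}

public static class ShardKeyExtractor
{
    public static string GetShardKey(
        IDictionary<string, string> document,
        string entityName,
        IEnumerable<ShardingDefinition> definitions,
        int limit) // the 3 from "Limits": [3]
    {
        var definition = definitions.First(d => d.EntityName == entityName);
        // A real implementation would walk nested paths and handle multiple
        // Paths entries; here we just read a single top-level property.
        var value = document[definition.Paths[0]];
        // The shard key is the first `limit` characters of the value
        // (the lower-casing is an assumption of this sketch).
        return value.Substring(0, Math.Min(limit, value.Length))
                    .ToLowerInvariant();
    }
}

Under this scheme, a Users document with Username = "ayende" and a Posts document with AuthorName = "ayende" both yield the shard key "aye", which is exactly what keeps a user and his posts together.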

Based on the shard keys, we generate the sharding metadata:

{ "Id": "chunks/1", "Shards": ["http://shard1:8080", "http://shard1-backup:8080"], "Name": "ByUserName", "Range": ["aaa", "ddd"] }
{ "Id": "chunks/2", "Shards": ["http://shard1:8080", "http://shard2-backup:8080"], "Name": "ByUserName", "Range": ["ddd", "ggg"] }
{ "Id": "chunks/3", "Shards": ["http://shard2:8080", "http://shard3-backup:8080"], "Name": "ByUserName", "Range": ["ggg", "lll"] }
{ "Id": "chunks/4", "Shards": ["http://shard2:8080", "http://shard1-backup:8080"], "Name": "ByUserName", "Range": ["lll", "ppp"] }
{ "Id": "chunks/5", "Shards": ["http://shard3:8080", "http://shard2-backup:8080"], "Name": "ByUserName", "Range": ["ppp", "zzz"] }
{ "Id": "chunks/6", "Shards": ["http://shard3:8080", "http://shard3-backup:8080"], "Name": "ByUserName", "Range": ["000", "999"] }

This information gives us a way to make queries that are either directed (against a specific node, assuming we include the shard key in the query) or global (against all shards).

Note that we split the data into chunks; each chunk is going to sit on two different servers (because of the Replica setting above). We can determine which shard holds which chunk by using the Range data.
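
As a rough sketch of that lookup (assuming from-inclusive / to-exclusive range semantics, which the example above doesn't pin down):

using System.Collections.Generic;
using System.Linq;

// Illustrative only; a hypothetical in-memory model of the chunk documents.
public class Chunk
{
    public string Id;         // e.g. "chunks/1"
    public string[] Shards;   // primary url first, then the replica
    public string RangeStart; // inclusive, e.g. "aaa"
    public string RangeEnd;   // exclusive, e.g. "ddd"
}

public static class ChunkRouter
{
    public static Chunk FindChunk(IEnumerable<Chunk> chunks, string shardKey)
    {
        // A chunk owns every shard key in [RangeStart, RangeEnd).
        return chunks.First(c =>
            string.CompareOrdinal(shardKey, c.RangeStart) >= 0 &&
            string.CompareOrdinal(shardKey, c.RangeEnd) < 0);
    }
}

A directed query would go to FindChunk(chunks, shardKey).Shards[0] (falling back to the replica), while a global query fans out to the distinct set of all shard URLs.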

Once a chunk grows too large (25,000 documents, by default), it will split, potentially moving to another server or servers.
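
A split could be as simple as cutting the range at a median key; here is a minimal sketch, reusing the hypothetical Chunk class from above:

public static class ChunkSplitter
{
    // Illustrative only. `medianKey` is the shard key that divides the
    // chunk's documents into two roughly equal halves; the chunk ids
    // are made up for the sketch.
    public static (Chunk Left, Chunk Right) Split(Chunk chunk, string medianKey)
    {
        var left = new Chunk
        {
            Id = chunk.Id + "-a",
            Shards = chunk.Shards,
            RangeStart = chunk.RangeStart,
            RangeEnd = medianKey,
        };
        var right = new Chunk
        {
            // A rebalancer may later reassign this half to other servers.
            Id = chunk.Id + "-b",
            Shards = chunk.Shards,
            RangeStart = medianKey,
            RangeEnd = chunk.RangeEnd,
        };
        return (left, right);
    }
}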

Thoughts?


Comments

Jer0enH

How should we interpret the Range in your example?

From (inclusive) to (exclusive), I presume, but what's up with the two last ranges (ppp-zzz and 000-999)?

Richard Slater

Brings a tear to my eye, something so simple and so beautiful in comparison to other options in the .NET arena.

Ayende Rahien

Jer0enH,

The example is meant to demonstrate an idea, not be executable code.

Michael L Perry

I'm somewhat concerned by the idea of keeping the configuration itself in a document. I understand that it's a common pattern (for example, import a SQL database into Visio and you might see sysdiagrams), but I've seen cases where meta information interferes with application information. I can't identify a specific problem in this case, yet.

tobi

We would have to worry about skewed data distributions with this scheme. We would probably not shard on a natural key (Customer Name), but mostly on a surrogate (ID). Those might be skewed (and identity columns only grow at the end, making only one shard bigger). This might benefit from hash distribution (losing range queries in the process).

A solution to skew would also be to lower the chunk size considerably, so we could keep range queries.
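
For comparison, a minimal sketch of the hash distribution tobi mentions: the spread is even regardless of key skew, but a range query can no longer be directed at a single shard.

public static class HashRouter
{
    // Illustrative only. A stable FNV-1a hash is used so that routing
    // does not depend on string.GetHashCode, which can differ between runs.
    public static int GetShardIndex(string shardKey, int shardCount)
    {
        unchecked
        {
            const uint offsetBasis = 2166136261;
            const uint prime = 16777619;
            uint hash = offsetBasis;
            foreach (char c in shardKey)
            {
                hash ^= c;
                hash *= prime;
            }
            // Adjacent keys like "aaa" and "aab" now land on unrelated
            // shards, which is why range queries must touch every shard.
            return (int)(hash % (uint)shardCount);
        }
    }
}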

Ayende Rahien

Michael,

Just about all the configuration for RavenDB and RavenDB bundles is done using documents. That means we have drastically simplified a lot of problems for ourselves, because we have a consistent medium.

For example, change notifications are handled once, and handled easily, unlike if you were storing this anywhere else.

Simon Hughes

When sharded, will you be able to perform parallel queries on each shard?

SQL Server is able to do this when it's partitioned, and it greatly improves performance.

Ayende Rahien

Simon,

Sure, that is just an issue on the client side.

We actually already have client-side sharding, including the ability to run parallel queries.

Simon Hughes

For auto-sharding, you won't need to specify a key. You distribute the data evenly between the specified number of shards, perhaps in a round-robin fashion.

For retrieval, you send the same query to all databases, collate the data they return into a single list, and return that to the user.

By not specifying a key, you always need to query all shards for the data, but this happens in parallel, so the query will still be fast.
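
A minimal sketch of that scatter/gather step, with queryShard standing in for whatever runs the query against a single shard:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ScatterGather
{
    // Illustrative only: fan the same query out to every shard in
    // parallel, then collate the results into a single list.
    public static async Task<List<T>> QueryAllShards<T>(
        IEnumerable<string> shardUrls,
        Func<string, Task<List<T>>> queryShard)
    {
        var tasks = shardUrls.Select(queryShard).ToArray();
        var results = await Task.WhenAll(tasks);
        return results.SelectMany(r => r).ToList();
    }
}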

Ayende Rahien

Simon,

The problem is that then you have:

a) a very hard time replicating data (how do you replicate, without a key to decide what to replicate and where?)

b) you have to query all servers, which means that the load is still heavy on all servers.

c) as the number of servers grows, the cost of querying them gets very high.

Simon Hughes

Ok, for auto-sharding with no key specified by the user, you would have to do it like this:

A)

  • Have a master auto-key shard document that is replicated to all databases. Or alternatively, provide a separate auto-key shard database.

  • Auto-shard data evenly between shards, recording the ID in the master shard document.

B) Yes, you have to query all servers, as no key is specified; that is the penalty the user has to pay for auto-sharding without a key. This is similar to a table scan in SQL Server.

C) Yes, again that is a penalty the user has to pay for auto-sharding without a key. That is why you would always recommend that the designer specify a key. The auto-key is simple to get running across your whole database instantly, but it has a cost implication. However, auto-sharding with no key will still outperform a database running on a single server, given a large enough dataset.


Random thought: this got me thinking about the separate auto-key shard database. I wonder if you could create a parity database, like RAID 5, and provide data redundancy in case a shard went offline? Then auto-rebuild the shard when it comes back up.

Ayende Rahien

Simon,

There is already a unique key involved: the document key. No need to get complicated. But it is actually much more common for you to want to control how you are doing this sort of thing, because you desire locality of reference.

