RavenDB Auto Sharding Bundle Design–Early Thoughts
Originally posted at 4/19/2011
RavenDB Auto Sharding is an implementation of sharding on the server. As the name implies, it aims to remove all sharding concerns from the user. At its core, the basic idea is simple. You have a RavenDB node with the sharding bundle installed. You just work with it normally.
At some point you realize that the data has grown too large for a single server, so you need to shard the data across multiple servers. You bring up another RavenDB server with the sharding bundle installed. You wait for the data to re-shard (during which time you can still read / write to the servers). You are done.
At least, that is the goal. In practice, there is one step that you would have to do: you would have to tell us how to shard your data. You do that by defining a sharding document, which looks like this:
{ // Raven/Sharding/ByUserName
  "Limits": [3],
  "Replica": 2,
  "Definitions": [
    { "EntityName": "Users", "Paths": ["Username"] },
    { "EntityName": "Posts", "Paths": ["AuthorName"] }
  ]
}
There are several things to note here. We define a sharding document that shards on just one key, and the shard key has a length of 3. We also define different ways to retrieve the sharding key from documents based on the entity name. This is important, since you want to be able to say that posts by the same user will sit on the same shard.
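As a purely illustrative sketch (not actual bundle code), this is roughly how a shard key could be derived from a document using the definitions above: pick the path configured for the entity and truncate the value to the configured key length. The function name and the lower-casing are my assumptions.

SHARDING_DEFINITION = {
    "Limits": [3],
    "Definitions": [
        {"EntityName": "Users", "Paths": ["Username"]},
        {"EntityName": "Posts", "Paths": ["AuthorName"]},
    ],
}

def shard_key_for(document, entity_name, definition=SHARDING_DEFINITION):
    """Pick the configured path for this entity and truncate to the key length."""
    limit = definition["Limits"][0]
    for entry in definition["Definitions"]:
        if entry["EntityName"] == entity_name:
            path = entry["Paths"][0]
            return str(document[path])[:limit].lower()
    raise KeyError("no sharding definition for entity " + entity_name)

# Both documents produce the shard key "aye", so they land in the same chunk.
print(shard_key_for({"Username": "ayende"}, "Users"))
print(shard_key_for({"AuthorName": "ayende", "Title": "..."}, "Posts"))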
Based on the shard keys, we generate the sharding metadata:
{ "Id": "chunks/1", "Shards": ["http://shard1:8080", "http://shard1-backup:8080"], "Name": "ByUserName", "Range": ["aaa", "ddd"] } { "Id": "chunks/2", "Shards": ["http://shard1:8080", "http://shard2-backup:8080"], "Name": "ByUserName", "Range": ["ddd", "ggg"] } { "Id": "chunks/3", "Shards": ["http://shard2:8080", "http://shard3-backup:8080"], "Name": "ByUserName", "Range": ["ggg", "lll"] } { "Id": "chunks/4", "Shards": ["http://shard2:8080", "http://shard1-backup:8080"], "Name": "ByUserName", "Range": ["lll", "ppp"] } { "Id": "chunks/5", "Shards": ["http://shard3:8080", "http://shard2-backup:8080"], "Name": "ByUserName", "Range": ["ppp", "zzz"] } { "Id": "chunks/6", "Shards": ["http://shard3:8080", "http://shard3-backup:8080"], "Name": "ByUserName", "Range": ["000", "999"] }
This information gives us a way to make queries that are either directed (against a specific node, assuming we include the shard key in the query) or global (against all shards).
Note that we split the data into chunks; each chunk is going to sit on two different servers (because of the Replica setting above). We can determine which shard holds which chunk by using the Range data.
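As a small sketch of how that routing could work (illustrative only; the exact lookup the bundle performs isn't described here): find the chunk whose range covers the shard key and talk only to the servers that hold it. Ranges are treated as from-inclusive / to-exclusive.

CHUNKS = [
    {"Id": "chunks/1", "Shards": ["http://shard1:8080", "http://shard1-backup:8080"], "Range": ("aaa", "ddd")},
    {"Id": "chunks/2", "Shards": ["http://shard1:8080", "http://shard2-backup:8080"], "Range": ("ddd", "ggg")},
    {"Id": "chunks/3", "Shards": ["http://shard2:8080", "http://shard3-backup:8080"], "Range": ("ggg", "lll")},
    {"Id": "chunks/4", "Shards": ["http://shard2:8080", "http://shard1-backup:8080"], "Range": ("lll", "ppp")},
    {"Id": "chunks/5", "Shards": ["http://shard3:8080", "http://shard2-backup:8080"], "Range": ("ppp", "zzz")},
    {"Id": "chunks/6", "Shards": ["http://shard3:8080", "http://shard3-backup:8080"], "Range": ("000", "999")},
]

def shards_for_key(shard_key):
    """Directed query: only the chunk covering this key needs to be asked."""
    for chunk in CHUNKS:
        low, high = chunk["Range"]
        if low <= shard_key < high:
            return chunk["Shards"]
    # No shard key (or no covering range): fall back to a global query.
    return sorted({url for chunk in CHUNKS for url in chunk["Shards"]})

print(shards_for_key("aye"))   # -> shard1 and its replica only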
Once a chunk grows too large (25,000 documents, by default), it will split, potentially moving to another server / servers.
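To make that concrete, here is a sketch of what a split might look like; the midpoint choice and the pick_servers callback (which decides where the new half lives) are assumptions on my part, since the post doesn't spell out the split or placement policy.

MAX_DOCS_PER_CHUNK = 25_000   # the default mentioned above

def maybe_split(chunk, doc_count, keys_in_chunk, pick_servers):
    """Return the chunk unchanged, or two chunks that together cover its range."""
    if doc_count <= MAX_DOCS_PER_CHUNK:
        return [chunk]
    keys = sorted(keys_in_chunk)
    midpoint = keys[len(keys) // 2]             # split where the data actually is
    low, high = chunk["Range"]
    left = {"Range": (low, midpoint), "Shards": chunk["Shards"]}
    right = {"Range": (midpoint, high), "Shards": pick_servers()}   # possibly new servers
    return [left, right]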
Thoughts?
Comments
How should I interpret the Range in your example?
From (inclusive) to (exclusive), I presume, but what's up with the last two ranges (ppp-zzz and 000-999)?
Brings a tear to my eye, something so simple and so beautiful in comparison to other options in the .NET arena.
Jer0enH,
The example is meant to demonstrate an idea, not be executable code.
I'm somewhat concerned by the idea of keeping the configuration itself in a document. I understand that it's a common pattern (for example, import a SQL database into Visio and you might see sysdiagrams), but I've seen cases where meta information interferes with application information. I can't identify a specific problem in this case, yet.
We would have to worry about skewed data distributions with this scheme. We would probably not shard on a natural key (Customer Name), but mostly on a surrogate (ID). Those might be skewed (and identity columns would only grow at the end, making only one shard bigger). This might benefit from hash distribution (losing range queries in the process).
A solution to skew would also be to lower the chunk size considerably, so we could keep range queries.
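Purely as an illustration of the trade-off Michael is describing (none of this is bundle code): hashing a surrogate ID spreads monotonically increasing keys evenly across shards, but a range scan over consecutive IDs then has to touch every shard, while range-based placement keeps ranges local at the cost of skew.

import hashlib

NUM_SHARDS = 3

def hash_shard(doc_id):
    # Even spread for monotonically increasing surrogate IDs, but a range
    # query such as "users/1000 .. users/2000" now has to touch every shard.
    digest = hashlib.md5(doc_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# With range-based placement these consecutive IDs would all land in the last
# (ever-growing) chunk; hashed, they scatter across the shards.
print([hash_shard(f"users/{i}") for i in range(1000, 1006)])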
Michael,
Just about all the configuration for RavenDB and RavenDB bundles is done using documents. That means that we have drastically simplified a lot of problems for ourselves, because we have a consistent medium.
For example, change notifications are handled once, and handled easily, which would not be the case if you were storing this anywhere else.
When sharded, will you be able to perform parallel queries on each shard?
SQL Server is able to do this when it's partitioned, and it greatly improves performance.
Simon,
Sure, that is just an issue at the client side.
We actually already have client side sharding, including the ability to run parallel queries.
For auto-sharding, you won't need to specify a key. You distribute the data evenly between the specified number of shards, perhaps in a round robin fashion.
For retrieval, you send the same query to all databases, and collate the data returned from all databases into a list and return back to the user.
By not specifying a key, it would mean you always need to query all shards for the data, but this happens in parallel so the query will still be fast.
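A sketch of the fan-out Simon describes: send the same query to every shard in parallel and collate the results. The shard URLs and the query_shard placeholder are made up for illustration; a real implementation would issue actual queries against each shard.

from concurrent.futures import ThreadPoolExecutor

SHARD_URLS = ["http://shard1:8080", "http://shard2:8080", "http://shard3:8080"]

def query_shard(url, query):
    # Stand-in for an HTTP call to one RavenDB shard; a real implementation
    # would run the query against `url` and return the matching documents.
    return [{"shard": url, "query": query}]

def query_all_shards(query):
    # Fan the same query out to every shard in parallel...
    with ThreadPoolExecutor(max_workers=len(SHARD_URLS)) as pool:
        futures = [pool.submit(query_shard, url, query) for url in SHARD_URLS]
        # ...then collate the partial results into one list for the caller.
        results = []
        for future in futures:
            results.extend(future.result())
    return results

print(len(query_all_shards("Username: ayende")))   # results from all 3 shards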
Simon,
The problem is that then you have:
a) a very hard time replicating data (how do you replicate without a key to decide what to replicate and where?)
b) you have to query all servers, which means the load is still heavy on all servers.
c) as the number of servers grows, the cost of querying them gets very high.
Ok, for auto-sharding with no key specified by the user, you would have to do it like this:
A)
Have a master auto-key shard document that is replicated to all databases. Or alternatively, provide a separate auto-key shard database.
Auto-shard data evenly between shards, recording the ID in the master shard document.
B) Yes, you have to query all servers as no key is specified; that is the penalty the user has to pay for auto-sharding without a key. This is similar to a table scan in SQL Server.
C) Yes, again that is a penalty the user has to pay for auto-sharding without a key. That is why you would always recommend that the designer specify a key. The auto-key is simple to get running across your whole database instantly, but has a cost implication. However, auto-sharding with no key will still out-perform a database running on a single server given a large enough dataset.
Random thought) Got me thinking about the separate auto-key shard database. I wonder if you can create a parity database like RAID 5, and provide data redundancy in case a shard goes off-line? Then auto-rebuild the shard when it comes back up.
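For what it's worth, a rough sketch of the keyless, round-robin placement described in (A) above; every name here is made up, and the "master auto-key shard document" is just an in-memory stand-in for illustration.

from itertools import cycle

SHARD_URLS = ["http://shard1:8080", "http://shard2:8080", "http://shard3:8080"]

class AutoKeyShardDocument:
    """Stand-in for the replicated 'master auto-key shard document'."""

    def __init__(self, shard_urls):
        self._next_shard = cycle(shard_urls)   # round-robin placement
        self.placements = {}                   # document id -> shard url

    def place(self, doc_id):
        shard = next(self._next_shard)
        self.placements[doc_id] = shard
        return shard

master = AutoKeyShardDocument(SHARD_URLS)
for doc_id in ["users/1", "users/2", "users/3", "users/4"]:
    print(doc_id, "->", master.place(doc_id))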
Simon,
There is already a unique key involved, the document key. No need to get complicated. But it is actually much more common for you to want to control how you are doing this sort of thing, because you desire locality of reference.