What is up with RavenDB 2.0? Performance…

Well, one thing that we put a lot of focus on was performance. In order to test that, I had a dataset of 4.66 million documents (IMDB data set, if you care) as well as two indexes defined.

The results for RavenDB 2.0 (drum roll):

Loading 4.66 million records in 44 minutes. Average rate of a little over half a millisecond per document.
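As a quick sanity check, the arithmetic behind those figures can be written out as below. This is only a sketch built from the two numbers quoted above (4.66 million documents, 44 minutes); the variable names are illustrative and not part of any RavenDB tooling.

```python
# Figures quoted in the post; everything else is derived from them.
documents = 4_660_000   # 4.66 million documents
load_minutes = 44       # total load time reported

load_seconds = load_minutes * 60

# Throughput: documents written (and indexed) per second.
docs_per_second = documents / load_seconds

# Average cost per document, in milliseconds.
ms_per_document = load_seconds * 1000 / documents

print(f"{docs_per_second:,.0f} documents/second")
print(f"{ms_per_document:.2f} ms per document")
```

With the rounded inputs from the post, this comes out to roughly 1,765 documents per second, or about 0.57 ms per document.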

But wait, what about the indexes? Well, RavenDB indexes documents as they come in, so while we were inserting the documents, they were being indexed along the way. That meant that 11 seconds after we were done putting 4.66 million documents into RavenDB, we were done indexing (across all indexes).

Pretty nice perf, even if I say so myself.

Tags:

raven

Comments

22 Nov 2012
16:23 PM

Pop Catalin

"Loading 4.66 millions records in 44 minutes"

This means about 1,765 documents / second.

Is the I/O channel saturated from this? Disk write speed maxed out?

I don't know about the complexity of those documents, however an ETL process can reach over 50k "rows" per second on my modest machine using bulk load.

Therefore I think it would be interesting to see some benchmarks for small documents (1 property), medium (10-100 properties) and large (1000+ properties), and the I/O characteristics of RavenDB during such operations.

22 Nov 2012
16:32 PM

Ayende Rahien

Pop, This is meant to show indexing performance more than anything else. Bulk load is doing something quite different.

22 Nov 2012
16:42 PM

Jamie

What options do you have for doing an actual bulk load? Say if we wanted to load 250m moderately complex documents - is there some kind of bulk load option which can do batch indexing after?

22 Nov 2012
16:43 PM

Jeús López

May I ask where you downloaded the IMDB dataset from?

22 Nov 2012
19:23 PM

Remco Ros

@Jeús http://www.imdb.com/interfaces

22 Nov 2012
20:51 PM

Nabil

Would be great if you could direct us to your ETL process. I noticed the old ETL project in the raven source is no longer there?

22 Nov 2012
22:13 PM

Will Hughes

Is it possible to get a comparison with Raven 1.x's performance using the same dataset and hardware?

22 Nov 2012
23:41 PM

Ayende Rahien

Jamie, We will have bulk load work done after the release. It is a bit involved, as you might imagine.

22 Nov 2012
23:41 PM

Ayende Rahien

The "ETL code" is just the smuggler.

23 Nov 2012
08:29 AM

Daniel Lang

I really don't understand why people care so much about 'bulk load' performance. I mean really, what's the difference between writing 1.000 or 5.000 documents per second WITHOUT indexing?

The whole point about raven is that it has indexes for you to do calculations or queries. If you don't need that, you have a key/value store for which you don't need raven in the first place.

Perf metrics without indexing are useless.

23 Nov 2012
09:10 AM

AndersM

Daniel: Of course it matters, Jamie clearly stated why. If you need to store large amounts of data quickly, and only need indexes later, bulking makes sense.

23 Nov 2012
09:12 AM

Ayende Rahien

AndersM, Not really. Just loading the data and then waiting for indexing, versus loading the data with indexing, would result in about the same time frame.

23 Nov 2012
09:45 AM

AndersM

OK, I did not know how Raven would handle this, but answered based on Daniel's numbers :)

23 Nov 2012
09:51 AM

Guillaume

"AndersM, Not really, just loading the data and waiting for indexing, and loading the data with indexing would result in about the same time frame"

Maybe in the RavenDB world... As Will Hughes suggested, it would be more interesting to see the difference with the previous release. Right now it's just some random numbers.

23 Nov 2012
10:43 AM

Daniel Lang

AndersM: My point is - the only metric I care about is the time it takes to do both, writing and indexing. No, I don't mean bulk import of data-sets because this is something you don't do frequently and when you do it, it's generally not time sensitive (like migrate from another database).

23 Nov 2012
11:36 AM

Catalin Pop

@Daniel Bulk loading should include indexing. In my earlier example, indexing during bulk load is enabled.

23 Nov 2012
16:24 PM

Sean Kearon

Very nice! What do the indexes look like?

26 Nov 2012
17:18 PM

Alexei K

So, how did the older version do on this? What improvement (if any) does 2.0 bring?

06 Dec 2012
13:07 PM

Alexey

Do you have the full source code for this perf test?

Oren Eini

CEO of RavenDB