Ayende @ Rahien

RavenDB performance optimizations

Just to note, you’ll probably read this post about a month after the change was actually committed.

I spent the day working on a very simple task: reducing the number of writes that RavenDB makes when we perform a PUT operation. I managed to remove one write operation from the process, but it took a lot of work.

I thought that I might show you what removing a single write operation means, so I built a simple test harness to give me consistent numbers (in the source, look for Raven.Performance).

Please note that the perf numbers are for vanilla RavenDB, with the default configuration, running in debug mode. We can do better than that, but what I am interested in is not absolute numbers, but the change in those numbers.

Here are the results for build 124, before the change:

Wrote 5,163 documents in 5,134ms: 1.01: docs/ms
Finished indexing in 8,032ms after last document write

And here are the numbers for build 126, after the change:

Wrote 5,163 documents in 2,559ms: 2.02: docs/ms
Finished indexing in 2,697ms after last document write

So we get double the speed at write time, and we also get much better indexing speed. The indexing gain is an accidental by-product: we now index documents based on range, rather than on a specific key. But it is a very pleasant accident.
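To make the comparison concrete, here is a minimal sketch that recomputes the throughput and speedup from the four timings reported above. The function and variable names are illustrative and are not taken from Raven.Performance.

```python
def throughput(docs: int, elapsed_ms: int) -> float:
    """Documents written per millisecond."""
    return docs / elapsed_ms


DOCS = 5163  # documents written in both runs

before = throughput(DOCS, 5134)  # build 124, before the change
after = throughput(DOCS, 2559)   # build 126, after the change

# Speedups are ratios of the reported timings.
write_speedup = after / before
index_speedup = 8032 / 2697  # indexing time after the last write, in ms

print(f"build 124: {before:.2f} docs/ms")
print(f"build 126: {after:.2f} docs/ms")
print(f"write speedup: {write_speedup:.2f}x")
print(f"indexing speedup: {index_speedup:.2f}x")
```

On these numbers the write path is about 2x faster and indexing finishes about 3x sooner.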

Posted By: Ayende Rahien

Comments

Demis Bellot
09/08/2010 11:39 AM by
Demis Bellot

Pretty good results, Ayende. They should satisfy a lot of use cases.

What type of documents are you using in these benchmarks?

josh
09/08/2010 04:53 PM by
josh

Cool. I'm probably using a version before this was added and it was already fast. Faster than SQL and MongoDB in my simplistic tests. ..and I mean a LOT!! faster than both.

Demis Bellot
09/08/2010 10:59 PM by
Demis Bellot

For anyone interested, I have modified the benchmarks to include timings for Redis as well. I've kept it as close as possible to the RavenDB example, including the 128 batch size, which Redis doesn't need.

Basically, the results show that Redis stores all 5,163 documents in 981ms, making it 2.85x quicker than RavenDB in this scenario.

I have more information available on my blog post here:

http://www.servicestack.net/mythz_blog/?p=474

Although Redis and RavenDB are not exactly the same type of NoSQL data store (RavenDB is a document database while Redis is a data structures server) they still have some overlapping use cases.

Ryan Heath
09/09/2010 10:14 AM by
Ryan Heath

Demis, you seem to have copied over the bug Simon Labrecque is talking about.

Does it make any difference when you reset the batch counter when 128 is reached?

// Ryan

Ayende Rahien
09/09/2010 10:52 AM by
Ayende Rahien

Simon,

That is a bug, it should be batchSize % 128 == 0
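The benchmark code under discussion is not shown in this thread, so the following is only a hypothetical sketch of what a modulo-based flush every 128 documents might look like. The names `write_in_batches` and `flush` are assumptions for illustration and are not part of the RavenDB client API.

```python
from typing import Callable, Iterable, List

BATCH_SIZE = 128  # batch size mentioned in the thread


def write_in_batches(
    docs: Iterable[dict],
    flush: Callable[[List[dict]], None],
    batch_size: int = BATCH_SIZE,
) -> int:
    """Send docs to `flush` in groups of `batch_size`; return the flush count."""
    batch: List[dict] = []
    flushes = 0
    for count, doc in enumerate(docs, start=1):
        batch.append(doc)
        # Flush whenever the running count reaches a multiple of batch_size.
        if count % batch_size == 0:
            flush(batch)
            batch = []  # start a fresh batch so documents are not re-sent
            flushes += 1
    if batch:  # flush the final, partially filled batch
        flush(batch)
        flushes += 1
    return flushes


# Usage: 5,163 documents yield 40 full batches plus one batch of 43.
sent: List[List[dict]] = []
total_flushes = write_in_batches(({"id": i} for i in range(5163)), sent.append)
print(total_flushes, len(sent[-1]))
```

Starting a fresh batch after each flush is what keeps every batch at 128 documents or fewer.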

Ayende Rahien
09/09/2010 10:53 AM by
Ayende Rahien

Demis,

Just to point out: Redis writes to memory, while RavenDB writes to disk.

Demis Bellot
09/09/2010 11:05 AM by
Demis Bellot

OK, so there seems to be some confusion about how Redis works, so I'll just copy a few paragraphs from my blog explaining it in more detail:

http://www.servicestack.net/mythz_blog/?p=474

Why is Redis so fast?

Based on the comments below, there appears to be some confusion as to what Redis is and how it works. Redis is a high-performance data structures server written in C that operates predominantly in-memory, routinely persists to disk, and maintains an append-only transaction log file for integrity. Both of these are configurable and can be made to write to disk on every operation.

For redundancy it includes built-in replication, where you can turn any Redis instance into a slave of another, which can be configured at runtime. It also features its own Virtual Machine implementation, so if your dataset exceeds your available memory, infrequently used values are swapped out to disk whilst the hot values remain in memory.

Like other high-performance network servers (e.g. Nginx, Node.js), it achieves maximum efficiency by running each Redis instance as a single process where all IO is asynchronous and no time is wasted context-switching between threads.

It achieves concurrency by being really fast and achieves integrity by having all operations atomic. You are not limited to the built-in atomic operations either, as you can compose any combination of Redis commands together and process them atomically in a single transaction.

Ayende Rahien
09/09/2010 11:09 AM by
Ayende Rahien

Demis,

Did you configure your Redis server to write to disk on every operation (to match more closely what RavenDB is doing)?

Demis Bellot
09/09/2010 11:13 AM by
Demis Bellot

Both benchmarks use the standard configuration for each server, so no.

I will re-run the benchmarks with the bug fix and configure it to write on every operation when I get home tonight.

Demis Bellot
09/10/2010 01:15 AM by
Demis Bellot

Okay new benchmarks are in - details in my blog under the heading: Benchmarks – Take 2

http://www.servicestack.net/mythz_blog/?p=474

As any additional overhead is multiplied when the 'fsync' option is on, I removed some of the overheads imposed on the Redis client, i.e. active entity id tracking and batching (as it's not required for Redis), before enabling the append-only transaction log with the ‘fsync always’ option.

Note: I’m using Redis's batch-ful MSET operation behind the scenes, so the fsync penalty is only paid once.
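To illustrate why batching matters when every operation is synced, here is a minimal sketch using plain file appends and `os.fsync`. It is not how Redis implements its append-only log. The helper names are hypothetical, and the sketch only counts sync calls for 100 small records written one at a time versus as a single batch.

```python
import os
import tempfile


def append_fsync_each(path, records):
    """Append records, forcing each one to disk before writing the next."""
    syncs = 0
    with open(path, "ab") as f:
        for record in records:
            f.write(record)
            f.flush()
            os.fsync(f.fileno())  # one sync per record
            syncs += 1
    return syncs


def append_fsync_once(path, records):
    """Append all records, then force the whole batch to disk once."""
    syncs = 0
    with open(path, "ab") as f:
        for record in records:
            f.write(record)
        f.flush()
        os.fsync(f.fileno())  # one sync for the entire batch
        syncs += 1
    return syncs


def read_bytes(path):
    with open(path, "rb") as f:
        return f.read()


records = [f"doc-{i}\n".encode() for i in range(100)]
with tempfile.TemporaryDirectory() as tmp:
    each_path = os.path.join(tmp, "each.log")
    once_path = os.path.join(tmp, "once.log")
    per_write = append_fsync_each(each_path, records)
    batched = append_fsync_once(once_path, records)
    same_contents = read_bytes(each_path) == read_bytes(once_path)

print(per_write, batched, same_contents)  # prints "100 1 True"
```

Both files end up with identical contents, but the batched version pays the sync cost once instead of 100 times. The trade-off is that a crash before the single sync can lose the whole batch.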

The new benchmarks show Redis is now 11.75x faster than RavenDB with this configuration.

If you disable the append only transaction log Redis becomes 16.9x faster than RavenDB.

I'm not saying performance is the most important metric; I just wanted to show that Redis provides a high-performance NoSQL solution for .NET clients. Multiple choices benefit everyone.

- Demis

Chance
09/10/2010 05:05 PM by
Chance

Nice! I'd love to have accidents like this!

Side question:

Any idea when you guys are going to implement geocoding support at the core of Raven? I thought about hacking it in myself, but at the rate of change right now I figured that would be a bad idea. Alternatively, I could perform the algos outside in our logic but I'd rather they be native. (Map/Reduce seems like our best bet atm).

Thanks,

Chance

Ayende Rahien
09/10/2010 10:17 PM by
Ayende Rahien

Chance,

RavenDB already supports spatial queries. I need to document it, though.

Chance
09/17/2010 02:10 PM by
Chance

Ah! I can't believe I missed the email alert for your comment Ayende. That's awesome man, thanks!

By the way, it's still on your Todo list. If you've finished that, I can only imagine what else you've knocked off of that list. You guys are rocking hard on Raven - keep it up!
