With RavenDB 3.5, we are focusing on performance as one of the key features. I’ve already spoken at length about the kind of changes that we have made to improve performance. A few percentage points here and there end up being quite significant when you add them all together.
But just sanding over the rough edges isn’t quite enough for us. We want to have a major impact, not just an avalanche of small improvements. In order to handle that, we needed to be much more aware of how we are making use of resources in the system.
The result of several months of work is now ready to enter into performance testing, and I’m quite excited about it. But before I show you the results, what is it? Well, RavenDB does quite a lot in the background, to avoid holding up a request thread when you are calling RavenDB. This means that we have a lot of background work: indexing, map/reduce, etc.
We have been using the default .NET thread pool for a long time to do that, and it has served us very well. But it is a generic construct, without awareness of the unique needs that RavenDB has. Therefore, we worked for quite some time to create our own thread pool that matches what we do.
The major changes with the RavenThreadPool (RTP from now) are:
- There is a fixed (and dedicated) number of threads that will do the work, sharing (and stealing) work among themselves.
- Indexing tasks are continuous and shared, so a big indexing job will spread across all threads, but with a preference for locality of work if we have a lot of stuff to parallelize.
- A slow index doesn’t stop us from working on other indexes. We’ll let it process on its own, and let the other indexes run ahead without it.
- Dynamic adjustment of the amount of work that is allowed for the indexes means that under load, we can rapidly reduce the amount of work we are doing to free up resources for processing requests.
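To make the first point more concrete, here is a minimal sketch of a fixed set of worker threads that share and steal work. It is written in Python purely for illustration and is not the actual RTP code. The `WorkStealingPool` class and all of its method names are hypothetical, and it assumes every task is queued before the workers start.

```python
import threading
from collections import deque


class WorkStealingPool:
    """Toy fixed-size pool: each worker owns a queue and steals when idle.

    Hypothetical illustration only; not RavenDB's implementation.
    """

    def __init__(self, worker_count):
        self.queues = [deque() for _ in range(worker_count)]
        self.locks = [threading.Lock() for _ in range(worker_count)]
        self.executed_by = {}  # task name -> id of the worker that ran it
        self.results_lock = threading.Lock()

    def submit(self, worker_id, name, func):
        # Queue the task on a preferred worker so related work stays local.
        with self.locks[worker_id]:
            self.queues[worker_id].append((name, func))

    def _take_local(self, worker_id):
        # Owners take the newest task from their own queue.
        with self.locks[worker_id]:
            if self.queues[worker_id]:
                return self.queues[worker_id].pop()
        return None

    def _steal(self, thief_id):
        # Idle workers take the oldest task from someone else's queue.
        for victim in range(len(self.queues)):
            if victim == thief_id:
                continue
            with self.locks[victim]:
                if self.queues[victim]:
                    return self.queues[victim].popleft()
        return None

    def _worker(self, worker_id):
        while True:
            task = self._take_local(worker_id) or self._steal(worker_id)
            if task is None:
                return  # all queues are empty; a real pool would wait for work
            name, func = task
            func()
            with self.results_lock:
                self.executed_by[name] = worker_id

    def run_until_empty(self):
        threads = [
            threading.Thread(target=self._worker, args=(i,))
            for i in range(len(self.queues))
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
```

If you call `submit(0, ...)` for every task and then `run_until_empty()` on a four-worker pool, the other three workers have nothing of their own and will pull tasks from worker 0's queue. That is the same idea as a big indexing job spreading across all threads while still preferring local work first.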
There are other changes, but they are mostly of interest to the people who work on RavenDB, not to those who use it.
And the results, they are pretty good. Here is the before and after sample.
Note that we have a mix here of various types of indexes. The X axis is time, and the Y axis is the number of documents indexed.
As you can see, in the before (without RTP), we are processing all indexes roughly on the same course, with a pretty flat growth over time. However, with RTP, the situation is different. You can see that very quickly the fast indexes are starting to outpace both their version without RTP and the slower indexes.
That, in turn, means that they complete much faster. In the case of the Simple Map index, it completes indexing roughly 50% faster than without RTP. But even more interesting is what happens globally. Because we are able to complete the fast indexes quickly, we can then process the slow index (HeavyMapReduce) with more resources. So even this slowpoke completes about 15% faster with RTP than without it.
We are still running tests, but even so, this is quite exciting.