Ayende @ Rahien


The RavenDB indexing process: Optimization–De-parallelizing work

One of the major dangers in doing perf work is that you have a scenario, and you optimize the hell out of that scenario. It is pretty easy to do without even noticing. The problem is that you are likely making that single scenario perform really well while hurting the overall system performance.

In this example, we moved heaven and earth to make sure that we were indexing things as fast as possible, and we tested with 3 indexes on a 4-core machine. As it turned out, we actually had improved things for that particular scenario.

Using the same test case on a single-core machine was suddenly far more heavyweight, because we were pushing a lot of work at the same time, more than the machine could process. The end result was that it did get there, but much more slowly than if we had run things sequentially.

Of course, I am giving you the outliers, but they are good indicators of what we found. Initially, we thought that we could resolve that by using the TPL’s MaxDegreeOfParallelism, but it turned out to be more complex than that. We have both IO-bound and CPU-bound tasks that we need to execute, and trying to execute IO-heavy tasks under that setting would actually cause issues in this scenario.

We had to manually throttle things ourselves, both to limit the amount of work running in parallel and because we have a lot more information about the actual tasks than the TPL has. We can schedule them far more efficiently because we can tell what is actually going on.

The end result is that we are actually using less parallelism, overall, but in a more efficient manner.
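To make the idea concrete, here is a minimal sketch of throttling by partitioning the work up front, before any parallel work starts. It is written in Python for brevity and is not the .NET code RavenDB uses. The names `partition`, `index_in_partitions`, `index_one` and `max_parallel` are made up for illustration, and it assumes each document can be indexed independently.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def partition(items, parts):
    """Split items into at most `parts` contiguous, roughly equal chunks."""
    if not items:
        return []
    parts = max(1, min(parts, len(items)))
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def index_in_partitions(documents, index_one, max_parallel=None):
    """Apply index_one to every document using at most max_parallel workers.

    Assumption: defaulting the cap to the core count is an illustrative
    choice, not a tuned value.
    """
    max_parallel = max_parallel or os.cpu_count() or 1
    chunks = partition(documents, max_parallel)

    def index_chunk(chunk):
        return [index_one(doc) for doc in chunk]

    if len(chunks) <= 1:
        # Single core or a tiny batch: run sequentially, no pool overhead.
        return [result for chunk in chunks for result in index_chunk(chunk)]

    # One worker per chunk, so the amount of parallel work is decided up front.
    with ThreadPoolExecutor(max_workers=len(chunks)) as pool:
        per_chunk = pool.map(index_chunk, chunks)
        # map() yields chunk results in submission order, so order is preserved.
        return [result for chunk_result in per_chunk for result in chunk_result]
```

With `max_parallel=1` this degrades to a plain sequential loop, which is the behavior we wanted on the single-core machine. A real scheduler would also pick different limits for IO-bound and CPU-bound work, which this sketch does not attempt.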

In my next post, I’ll discuss the auto batch tuning support, which allows us to do some really amazing things from the point of view of system performance.

Comments

gandjustas
04/20/2012 09:27 AM by
gandjustas

What actual API did you use for parallel computations? Parallel.ForEach? It's not suitable for IO-bound concurrency. For IO-bound work you should use Tasks.

Daniel O
04/20/2012 12:47 PM by
Daniel O

Enjoying these posts on the ongoing development of RavenDB. Could you imagine if the SQL Server or Oracle devs did posts like this?

Gene Hughson
04/20/2012 02:25 PM by
Gene Hughson

Performance tuning on a line of business app is expensive and labor-intensive to get right: you really need a comprehensive suite of load tests on the same hardware profile as production using an equivalent network profile - easy-peasy. I can only imagine the headache involved with a more general-purpose tool like Raven. Like Daniel above, I'm enjoying the peek into your world.

Matthew Sullivan
04/20/2012 05:20 PM by
Matthew Sullivan

Sorry if this question is too naive, but is the indexing primarily CPU bound, memory bound or I/O bound? Would it be helpful or possible to use cloud computing to create indexes in a speedy fashion?

I've just heard that slow indexing speed is a major drawback of doc dbs, a prime reason why reporting etc needs to be done on sql... just wondering if you can throw a little cloud money at the problem to get faster turnaround on ad hoc reporting or index fixes.

Ayende Rahien
04/21/2012 08:17 AM by
Ayende Rahien

Matthew, it uses a lot of CPU for full text indexing, it requires a lot of memory, and it writes a lot to disk. It wouldn't be workable to do this on the cloud, because the cost of actually sending the data up there and then getting it back would be too high.

Ayende Rahien
04/21/2012 08:18 AM by
Ayende Rahien

Madhav, RavenDB is DivanDB

Rafal
04/21/2012 10:37 AM by
Rafal

A nice thing about indexing in Raven is that usually you have all recently modified documents in memory, so you can index them without reading from the storage. You will not have such luxury when the Lucene index is external to the application.

Frank Quednau
04/21/2012 10:53 AM by
Frank Quednau

Have you implemented your own Task Scheduler for the Task library?

Ayende Rahien
04/21/2012 10:57 AM by
Ayende Rahien

Frank, no, we didn't do that. We handle the control in a much simpler way, by partitioning the work before starting the parallel work.

Adam Langley
04/22/2012 09:00 PM by
Adam Langley

The wonderful thing about that scenario is that if the region of code you are optimising is modular (which I'm sure it is), the problem space is not variable once the software is installed. Hence, you could provide two indexing modules, one designed for single-core, and one for multi-core parallelism.

Of course you would increase your code maintenance, but that's just another decision...
