Shiny features in the depth: New index optimization

time to read 2 min | 305 words

One of the nice features in RavenDB 3.0 is optimizing the process of creating a new index. In particular, we want to optimize it when you create a new index on a small collection in a large database.

If you have a small database, you don’t care, it is going to complete quickly anyway. If you are creating an index on a collection that compose a significant amount of the documents in the database, you don’t care, you are going to have to do a lot of work anyway. But the common case for a big database is that you usually have one very big collection, and much smaller collections for everything else.

In RavenDB 2.x, you still have to pay the full price for indexing everything, but that isn’t the case in RavenDB 3.0. What we have done is to effectively optimize the process so that in this case, we will preload all of the documents taking part in the relevant collection, and send them directly to be indexed.

We do this by utilizing the Raven/DocumentsByEntityName index. Which has already indexed everything in the database anyway. This is a nice little feature, because it allows us to really take advantage of the work we already did long ago. Using one index to pre-populate another is a neat trick, and one that I am very happy about.

Because this is a new code path, it also means that it is actually executed outside of standard indexing. And that in turn means that adding a new index will not impact other indexes at all.

This is a small feature, but it does address a common pain point our users have with working in RavenDB in production.

Reminder, we have our upcoming RavenDB Conference in April, where we’ll discuss other stuff in the 3.0 release.