Rob’s Sprint: Query optimizer jumped a grade
RavenDB’s query optimizer is pretty smart, it knows how to find the appropriate index for your queries, and even create a new index to match your query if it didn’t exist. But that was the limits of its abilities. A human could still go into the database and say, look at those:
Those all operate on Posts, and you should be able to merge them all into a single index. Reducing the number of indexes is a good thing, as it reduces the amount of IO on the system, which is typically our limiting factor.
Now, there was no real reason why we couldn’t actually tell the query optimizer that it should be smart enough that when it creates a new index, it will use all of the properties that have been previously indexed.
However, doing so would actually make no difference to us. Because until now, we didn’t have a way to stop an index. With the new index idling feature, we can now have the query optimizer create a new merged index, and then the database will just mark the extra index as idle after a while.
Almost, there is still another issue that we have to resolve. What happens when we have a big database, and we introduce a new (and wider) index? By default, all matching queries would actually hit that index, and not the previously existing index. That is great, except… the new index is stale, and might remain stale for a few minutes. During that time, we have a perfectly servicable index that is just sitting there.
The query optimizer can now take into account the staleness level of an index as well when selecting it, meaning that there should be no interruption from the point of view of other queries. The new index will be introduced, go through all the documents, and then take over as the serving index for all queries. The existing index will wither away and die.