Ayende @ Rahien

Refunds available at head office

Complex indexing, simplified

RavenDB indexes are Turing complete, which means that you can do whatever you want with them. This is a very powerful feature, but it also comes with a heavy burden: you can get yourself into some serious trouble. Take a look at this index:

image

 

We ran into it during a troubleshooting session with a customer, and it was frankly quite hard to figure out what was going on.

Luckily, I could just throw this into RavenDB 3.0, and look at the indexing options:

image

This turned the above index into this:

image

Which was much clearer, but we could improve it a bit by removing the into clauses, so I ended up with:

image

Now, just from looking at this, can someone tell me what the likely issue with this kind of index is?

Posted By: Ayende Rahien

Comments

08/14/2014 12:40 PM by David Zidar

The cardinality could get rather large?

08/14/2014 12:44 PM by Mark

Cartesian product.

08/14/2014 01:35 PM by Dominic Zukiewicz

1 update to any of the collections requires a full run through all permutations of the data...

i.e. cartesian join :)

@Ayende,

Would you say the error lies between keyboard and chair (user error ;-) ) or just a bad data model? Their intent looks okay on the surface with the model they have.

What would be interesting is in this sort of situation - where the data model could be improved, what are the steps you should take to co-ordinate the patching, code changes and deployment to manage such a change.

It's one thing to have a bad data model and have good intentions to fix it, but it's another to know how to approach this sort of problem.

08/14/2014 04:02 PM by Ayende Rahien

Dominic, Yes, that is the issue. And to resolve it you can split it into multiple maps, so the work doesn't have to be multiplied by each level.

08/22/2014 08:47 AM by Pure Krome

Could we please have some code to show how it should be remodelled, please?

08/22/2014 08:48 AM by Ayende Rahien

Pure Krome, I don't understand the question

08/22/2014 09:07 AM by Pure Krome

Ah. I mean this..

And to resolve it you can split it into multiple maps, so the work doesn't have to be multiplied by each level.

some code to explain this answer.

(I know some of us noobs still need help with this).

08/22/2014 09:10 AM by Ayende Rahien

Pure Krome,

The issue is this:

// index #1

from doc in docs.Phones
from docModelItem in doc.Models
from docModelItemExtensions in docModelItem.Extentions
from docExtensions in doc.Extensions

Assume you have 10 phones, each with 10 models, 10 extensions per model, and 10 extensions per phone. The output from this index is:

10 (phones) x 10 (models) x 10 (extensions per model) x 10 (extensions per phone)

Gives us a total of: 10,000
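To make the multiplication concrete, here is a minimal Python sketch (not RavenDB code) that mimics the four chained `from` clauses with nested loops. The document shape, the `make_phone` helper, and the placeholder values are assumptions for illustration; only the property names `Models`, `Extentions`, and `Extensions` come from the index above.

```python
def make_phone(n_models=10, n_model_ext=10, n_phone_ext=10):
    """Build one hypothetical phone document (shape assumed for illustration)."""
    return {
        "Models": [
            {"Extentions": [f"m{m}-x{e}" for e in range(n_model_ext)]}
            for m in range(n_models)
        ],
        "Extensions": [f"p-x{e}" for e in range(n_phone_ext)],
    }


phones = [make_phone() for _ in range(10)]

# Each `for` mirrors one `from` clause of index #1. Every clause
# multiplies the number of entries produced by the clauses before it.
single_map_entries = [
    (model_ext, phone_ext)
    for doc in phones
    for model in doc["Models"]
    for model_ext in model["Extentions"]
    for phone_ext in doc["Extensions"]
]

print(len(single_map_entries))  # 10 * 10 * 10 * 10 = 10000
```

Note that under these assumed sizes each single phone document fans out into 1,000 entries, so every update to one document means redoing all 1,000 of them.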

// index #2

// map #1
from doc in docs.Phones
from docModelItem in doc.Models
from docModelItemExtensions in docModelItem.Extentions

// map #2
from doc in docs.Phones
from docExtensions in doc.Extensions

Then you have:

10 (phones) x 10 (models) x 10 (extensions per model) + 10 (phones) x 10 (extensions per phone)

Gives us a total of 1,100 results.
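The same arithmetic can be checked with a small Python simulation (a sketch, not RavenDB code). The document shape and the tuple labels are assumptions for illustration; each list comprehension stands in for one of the two maps, and the index output is simply their concatenation, so the counts add instead of multiplying.

```python
# Ten hypothetical phone documents: 10 models with 10 extensions each,
# plus 10 extensions directly on the phone (sizes taken from the example).
phones = [
    {
        "Models": [{"Extentions": list(range(10))} for _ in range(10)],
        "Extensions": list(range(10)),
    }
    for _ in range(10)
]

# Map #1: phones -> models -> extensions per model.
map_1 = [
    ("model-extension", ext)
    for doc in phones
    for model in doc["Models"]
    for ext in model["Extentions"]
]

# Map #2: phones -> extensions per phone, independent of map #1.
map_2 = [
    ("phone-extension", ext)
    for doc in phones
    for ext in doc["Extensions"]
]

# A multi-map index emits the entries of every map side by side.
multi_map_entries = map_1 + map_2

print(len(map_1), len(map_2), len(multi_map_entries))  # 1000 100 1100
```

Per document, that is 100 + 10 = 110 entries under these assumed sizes, and the gap between the two approaches widens as any of the nested collections grows.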