Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.

Complex indexing, simplified


RavenDB indexes are Turing complete, which means that you can do whatever you want with them. This is a very powerful feature, but it also comes with a heavy burden: you can get yourself into some serious trouble. Take a look at this index:

image

 

We ran into it during a troubleshooting session with a customer, and it was frankly quite hard to figure out what was going on.

Luckily, I could just throw this into RavenDB 3.0, and look at the indexing options:

image

This turned the above index into this:

image

Which was much clearer, but we could improve it a bit by removing the into clauses, so I ended up with:

image

Now, just from looking at this, can someone tell me what the likely issue with this kind of index is?


Comments

David Zidar

The cardinality could get rather large?

Mark

Cartesian product.

Dominic Zukiewicz

1 update to any of the collections requires a full run through all permutations of the data...

i.e. a Cartesian join :)

@Ayende,

Would you say the error lies between keyboard and chair (user error ;-) ) or just a bad data model? Their intent looks okay on the surface with the model they have.

What would be interesting in this sort of situation, where the data model could be improved, is which steps you should take to coordinate the patching, code changes and deployment needed to manage such a change.

It's one thing to have a bad data model and good intentions to fix it, but it's another to know how to approach this sort of problem.

Ayende Rahien

Dominic, Yes, that is the issue. And to resolve it you can split it into multiple maps, so the work doesn't have to be multiplied by each level.

Pure Krome

Could we please have some code to show how it should be remodelled, please?

Ayende Rahien

Pure Krome, I don't understand the question

Pure Krome

Ah. I mean this..

And to resolve it you can split it into multiple maps, so the work doesn't have to be multiplied by each level.

some code to explain this answer.

(I know some of us noobs still need help with this).

Ayende Rahien

Pure Krome,

The issue is this:

// index #1

from doc in docs.Phones
from docModelItem in doc.Models
from docModelItemExtensions in docModelItem.Extentions
from docExtensions in doc.Extensions

Assume you have 10 phones, each with 10 models, with 10 extensions per model and 10 extensions per phone. The output from this index is:

10 (phones) x 10 (models) x 10 (extensions per model) x 10 (extensions per phone)

Gives us a total of: 10,000

// index #2

// map #1
from doc in docs.Phones
from docModelItem in doc.Models
from docModelItemExtensions in docModelItem.Extentions

// map #2
from doc in docs.Phones
from docExtensions in doc.Extensions

Then you have:

10 (phones) x 10 (models) x 10 (extensions per model) + 10 (phones) x 10 (extensions per phone)

Gives us a total of 1,100 results.
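
For anyone who wants to check the arithmetic, here is a minimal sketch in plain Python (not RavenDB code) that mimics the two index shapes with nested loops over in-memory data. The helper names `make_phones`, `single_map_entries` and `multi_map_entries` are made up for illustration, and the document shape is assumed from the field names in the maps above (including the `Extentions` spelling on models).

```python
def make_phones(n_phones=10, n_models=10, n_model_exts=10, n_phone_exts=10):
    """Build hypothetical phone documents shaped like the ones in the index."""
    return [
        {
            "Models": [
                {"Extentions": [f"model{m}-ext{e}" for e in range(n_model_exts)]}
                for m in range(n_models)
            ],
            "Extensions": [f"phone-ext{e}" for e in range(n_phone_exts)],
        }
        for _ in range(n_phones)
    ]


def single_map_entries(phones):
    """Index #1: one map that nests all four 'from' clauses.

    Every model extension is paired with every phone extension,
    so the two independent lists are multiplied together.
    """
    return [
        (model_ext, phone_ext)
        for doc in phones
        for model in doc["Models"]
        for model_ext in model["Extentions"]
        for phone_ext in doc["Extensions"]
    ]


def multi_map_entries(phones):
    """Index #2: two separate maps whose outputs are added, not multiplied."""
    map1 = [
        (model_ext, None)
        for doc in phones
        for model in doc["Models"]
        for model_ext in model["Extentions"]
    ]
    map2 = [
        (None, phone_ext)
        for doc in phones
        for phone_ext in doc["Extensions"]
    ]
    return map1 + map2


phones = make_phones()
print(len(single_map_entries(phones)))  # 10 x 10 x 10 x 10 = 10000
print(len(multi_map_entries(phones)))   # 10 x 10 x 10 + 10 x 10 = 1100
```

The gap widens quickly as the lists grow, because the single map multiplies the two extension lists while the split maps only add them.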
