Ayende @ Rahien

Refunds available at head office

Raven’s Scripted Index Results

Scripted Index Results (I wish it had a better name) is a really interesting new feature in RavenDB 2.5. As the name implies, it allows you to attach scripts to indexes. Those scripts can operate on the results of the indexing.

Sounds boring, right? But the options that it opens are anything but. Using Scripted Index Results you can get recursive map/reduce indexes, for example. But we won’t be doing that today. Instead, I’ll show how you can enhance entities with additional information from other sources.

Our sample database is Northwind, and we have defined the following index to get some statistics about our customers:

image
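As a rough sketch of what an index like this computes, assuming it groups orders by company and produces a count and a total (the comments below refer to `Orders/ByCompany` and to count and total), the logic looks something like the following in plain Python. The order shape and the field names are assumptions, and this is not RavenDB index syntax.

```python
from collections import defaultdict

# Hypothetical, simplified Northwind-style orders (field names are assumptions).
orders = [
    {"Company": "companies/1", "Lines": [{"Quantity": 2, "PricePerUnit": 10.0}]},
    {"Company": "companies/1", "Lines": [{"Quantity": 1, "PricePerUnit": 5.0},
                                         {"Quantity": 3, "PricePerUnit": 4.0}]},
    {"Company": "companies/2", "Lines": [{"Quantity": 4, "PricePerUnit": 2.5}]},
]

def map_order(order):
    """Map step: emit one entry per order with its own total."""
    total = sum(line["Quantity"] * line["PricePerUnit"] for line in order["Lines"])
    return {"Company": order["Company"], "Count": 1, "Total": total}

def reduce_by_company(mapped):
    """Reduce step: group the mapped entries by company and sum Count and Total."""
    acc = defaultdict(lambda: {"Count": 0, "Total": 0.0})
    for entry in mapped:
        acc[entry["Company"]]["Count"] += entry["Count"]
        acc[entry["Company"]]["Total"] += entry["Total"]
    return {company: {"Company": company, **stats} for company, stats in acc.items()}

index_results = reduce_by_company(map(map_order, orders))
print(index_results["companies/1"])
```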

And we can query it like this:

image

However, what we want is to embed those values inside the company document, so we won’t have to query for them separately. Here is how we can use the new Scripted Index Results bundle to do this:

image

Once we have defined that, whenever the index finishes indexing, it will run these scripts, and that, in turn, means that this is what our dear ALFKI looks like:

image
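To make the effect concrete, here is a minimal simulation of the idea in Python, assuming one script runs when an index entry is written and another when it is removed. The function names, the `Orders` property layout, the document id, and the numbers are all made up for illustration. The real scripts run inside RavenDB and are not shown here.

```python
# In-memory stand-in for the document store (illustrative data only).
documents = {
    "companies/1": {"Name": "Alfreds Futterkiste", "ExternalId": "ALFKI"},
}

def on_index_entry(docs, entry):
    """Hypothetical 'index' script: copy the stats onto the company document."""
    company = docs.get(entry["Company"])
    if company is None:
        return  # nothing to enrich
    company["Orders"] = {"Count": entry["Count"], "Total": entry["Total"]}

def on_index_entry_deleted(docs, company_id):
    """Hypothetical 'delete' script: remove the embedded stats again."""
    company = docs.get(company_id)
    if company is not None:
        company.pop("Orders", None)

# An index entry arrives (made-up values), and the company document is enriched.
on_index_entry(documents, {"Company": "companies/1", "Count": 2, "Total": 37.0})
print(documents["companies/1"])
```

After this runs, the company document carries the statistics itself, so reading the document is enough and no separate index query is needed.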

I’ll leave recursive map/reduce as a tidbit for my dear readers :-)

Posted By: Ayende Rahien

Comments

05/30/2013 12:17 PM by Khalid Abuhakmeh

I have a few questions about this feature:

  1. What makes this better than using a transformer?
  2. What kind of performance hit does this put on the indexing process?

I'm sure I will have more, but those are the first two that come to mind.

05/30/2013 02:22 PM by Ayende Rahien

Khalid, 1) This happens during indexing, and the scripts can update document(s), so you can then index those updated documents. 2) It would slow indexing down a bit, but I don't expect it to be too much.

05/30/2013 02:30 PM by Khalid Abuhakmeh

Your answer to question one seems very interesting. So how do you handle this scenario?

  1. An Order is added.
  2. Orders/ByCompany is updated, which updates the Companies collection.

Does step 2 have to run all indexes again, and can you cause a weird cyclical issue with indexes, where they will constantly be running?

Index A is dependent on Collection A and updates Collection B, and Index B is dependent on collection B which updates Collection A.

How do you prevent something like that from happening, or at least warn the developer that they are doing something stupid?

05/30/2013 02:40 PM by Ayende Rahien

The Companies doc gets updated, and the relevant indexes get run. You cannot modify a document that will be indexed by the same index that triggered this operation. The reason for that is to avoid infinite recursion.

If you have it in two indexes, yes, you have a problem.
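The two-index loop from the question can be pictured as a cycle in a graph where index X points at index Y whenever X's script writes to a collection that Y indexes. The sketch below is a static check a team could run over its own index definitions. It is not something RavenDB is described as doing here, and `reads`, `writes`, and `find_cycle` are hypothetical names.

```python
def find_cycle(reads, writes):
    """reads:  index name -> set of collections the index covers.
    writes: index name -> set of collections its script updates.
    Returns a list of index names forming a cycle, or None."""
    # Edge a -> b when a's script writes a collection that b indexes.
    graph = {a: [b for b in reads if writes.get(a, set()) & reads[b]] for a in reads}
    visiting, done = [], set()

    def visit(node):
        if node in done:
            return None
        if node in visiting:
            # Reached a node already on the current path: report the loop.
            return visiting[visiting.index(node):] + [node]
        visiting.append(node)
        for nxt in graph[node]:
            cycle = visit(nxt)
            if cycle:
                return cycle
        visiting.pop()
        done.add(node)
        return None

    for name in graph:
        cycle = visit(name)
        if cycle:
            return cycle
    return None

# Index A reads collection A and updates B; Index B reads B and updates A.
print(find_cycle({"IndexA": {"A"}, "IndexB": {"B"}},
                 {"IndexA": {"B"}, "IndexB": {"A"}}))
```

A check like this only flags the possibility of a loop at the collection level. A flagged cycle may still be intentional, for example when the recursion is known to stop after a fixed depth.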

05/30/2013 02:45 PM by Khalid Abuhakmeh

Do you and the team have any ideas on how to prevent someone from shooting themselves in the foot, or is that just accepted as collateral damage?

It seems like an issue that could be common across a development team. Two developers could be working on separate but related features on separate branches, where the issue would manifest after the two features were merged into the same branch.

05/30/2013 02:51 PM by Ayende Rahien

Khalid, I don't know how to handle that scenario. In order to do that, you would have to keep track of every action by every index for all time. We keep track of that for a single index, though.

05/30/2013 03:02 PM by Khalid Abuhakmeh

I am just thinking out loud, but what if during the indexing process you also tracked the source of what caused that indexing to happen?

  • External Document Put
  • Internal Document Put (script) with Source

You could do pattern matching based on Index, Source, Document Id that caused the index, and frequency within a certain time. If you hit a threshold, you can log a warning in the Management Studio. In addition, you could stop the indexes to save the system.

"Woah we sure touched this one document a lot within a certain time, and it seems what is affecting it is internal to RavenDB, we think there might be something wrong with these indexes: {Index}, {Source}."

You don't have to save this data for a long period of time; you just need a buffer that covers your frequency window (say, 1 minute). If an item falls out of the window, just throw away that info. Funnily enough, I think this is a good use case for your previous idea of an event stream.

Not sure if this is possible, just thinking out loud.
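The buffered-window idea above could be sketched roughly as follows. This is a thought experiment in Python, not a RavenDB feature. `WriteLoopDetector`, its parameters, and the threshold values are hypothetical, and only script-originated ("internal") puts would be fed into it.

```python
from collections import defaultdict, deque

class WriteLoopDetector:
    """Counts internal puts per (index, document id) inside a sliding time window."""

    def __init__(self, window_seconds=60.0, threshold=100):
        self.window = window_seconds
        self.threshold = threshold
        self.events = defaultdict(deque)  # (index, doc_id) -> timestamps in window

    def record(self, index, doc_id, now):
        """Record one internal put at time `now` (seconds).
        Returns True when the count inside the window reaches the threshold."""
        stamps = self.events[(index, doc_id)]
        stamps.append(now)
        # Throw away anything that has fallen out of the window.
        while stamps and now - stamps[0] > self.window:
            stamps.popleft()
        return len(stamps) >= self.threshold

detector = WriteLoopDetector(window_seconds=60, threshold=3)
for t in (0, 10, 20):
    suspicious = detector.record("Orders/ByCompany", "companies/1", now=t)
print(suspicious)  # the third write within a minute trips the threshold
```

Memory stays bounded by the window, but every internal write still has to be tagged with its source, and tracking that source is where the cost of this approach lies.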

05/30/2013 03:05 PM by Ayende Rahien

Khalid, And now you need to track a WHOLE lot of information in the system. Not only that, but you need to track the source of each write, and it is pretty expensive and hard to do. For something that is purely theoretical right now.

Also, there might be valid reasons why you would want to do that (if you know that you do recursion only to a certain level, for example).

06/05/2013 03:20 PM by Simone

Hey,

but shouldn't the Orders class, with count and total, already be there (and empty) in the .NET class?

If not, how will that information be deserialized?

06/06/2013 12:08 PM by Ayende Rahien

Simone, If the properties aren't there, they will be ignored and removed on the next save.
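As an analogy for that behaviour (in Python rather than .NET, and not the RavenDB client's actual code), here is what happens when a stored document has an `Orders` property that the class does not declare. The `Company` class and the `load` helper are hypothetical.

```python
import json
from dataclasses import dataclass, asdict, fields

@dataclass
class Company:
    Name: str  # note: no Orders property declared

def load(cls, raw):
    """Deserialize, silently ignoring properties the class does not declare."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in json.loads(raw).items() if k in known})

stored = '{"Name": "Alfreds Futterkiste", "Orders": {"Count": 2, "Total": 37.0}}'
company = load(Company, stored)      # Orders is ignored on load
saved = json.dumps(asdict(company))  # ...and therefore missing on the next save
print(saved)
```

If you want the embedded values to survive a load and save round trip, the class needs a matching property.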

Comments have been closed on this topic.