Raven’s Scripted Index Results
Scripted Index Results (I wish it had a better name) is a really interesting new feature in RavenDB 2.5. As the name implies, it allows you to attach scripts to indexes. Those scripts can operate on the results of the indexing.
Sounds boring, right? But the options that it opens are anything but. Using Scripted Index Results you can get recursive map/reduce indexes, for example. But we won’t be doing that today. Instead, I’ll show how you can enhance entities with additional information from other sources.
Our sample database is Northwind, and we have defined the following index to get some statistics about our customers:
And we can query it like this:
However, what we want to do is to be able to embed those values inside the company document, so we won’t have to query for it separately. Here is how we can use the new Scripted Index Results bundle to do this:
Once we have defined that, whenever the index is done, it will run these scripts, and that, in turn, means that this is what our dear ALFKI looks like:
I’ll leave recursive map/reduce as a tidbit for my dear readers.
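To make the flow above a bit more concrete, here is a minimal sketch in plain Python. It is a simulation only and does not use the RavenDB API: the document ids, the sample orders, and the function names (`orders_stats_index`, `index_script`, `run_indexing`) are all assumptions made up for illustration. It computes a count and total per company, then runs a script over each index result that embeds those values in the company document.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for a few Northwind documents (made-up data).
documents = {
    "companies/ALFKI": {"Name": "Alfreds Futterkiste"},
    "orders/1": {"Company": "companies/ALFKI", "Total": 100.0},
    "orders/2": {"Company": "companies/ALFKI", "Total": 50.0},
}


def orders_stats_index(docs):
    """Map/reduce stand-in: group orders by company, compute count and total."""
    results = defaultdict(lambda: {"Count": 0, "Total": 0.0})
    for doc_id, doc in docs.items():
        if doc_id.startswith("orders/"):
            entry = results[doc["Company"]]
            entry["Count"] += 1
            entry["Total"] += doc["Total"]
    return dict(results)


def index_script(docs, company_id, result):
    """Script stand-in: embed one index result in the matching company document."""
    company = docs.get(company_id)
    if company is not None:
        company["Orders"] = {"Count": result["Count"], "Total": result["Total"]}


def run_indexing(docs):
    """When the index is done, run the script once per index result."""
    for company_id, result in orders_stats_index(docs).items():
        index_script(docs, company_id, result)


run_indexing(documents)
print(documents["companies/ALFKI"])
```

After the run, the company document carries an `Orders` property with the count and total, so reading the company no longer needs a separate query against the index.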
Comments
I have a few questions about this feature:
I'm sure I will have more, but those are the first two that come to mind.
Khalid, 1) This happens during indexing, and the scripts can update document(s), so you can index those items. 2) It would slow things down a bit, but I don't expect it to be too much.
Your answer to question one seems very interesting. So how do you handle this scenario?
Does step 2 have to run all indexes again, and can you cause a weird cyclical issue with indexes, where they will constantly be running?
Index A is dependent on Collection A and updates Collection B, and Index B is dependent on collection B which updates Collection A.
How do you prevent something like that from happening, or at least warn the developer that they are doing something stupid?
The Companies doc gets updated, and the relevant indexes get run. You cannot modify a document that will be indexed by the same index that triggered this operation. The reason for that is to avoid infinite recursion.
If you have it in two indexes, yes, you have a problem.
Do you and the team have any ideas on how to prevent someone from shooting themselves in the foot, or is that just accepted as collateral damage?
It seems like an issue that could be common across a development team. Two developers could be working on separate but related features on separate branches, where the issue would manifest after the two features were merged into the same branch.
Khalid, I don't know how to handle that scenario. In order to do that, you would have to keep track of every action by every index for all time. We do keep track of that for a single index, though.
I am just thinking out loud, but what if during the indexing process you also tracked the source of what caused that indexing to happen?
You could do pattern matching based on Index, Source, Document Id that caused the index, and frequency within a certain time. If you hit a threshold, you can log a warning in the Management Studio. In addition, you could stop the indexes to save the system.
"Woah we sure touched this one document a lot within a certain time, and it seems what is affecting it is internal to RavenDB, we think there might be something wrong with these indexes: {Index}, {Source}."
You don't have to save this data for a long period of time; you just need a buffered window that matches your frequency window (1 minute?). If an item falls out of the window, just throw away that info. Funny enough, I think this is a good use case for your previous idea of an event stream.
Not sure if this is possible, just thinking out loud.
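For what it's worth, the buffered-window idea could be sketched roughly like this in plain Python. This is only an illustration of the suggestion, not anything RavenDB does: the class name `WriteFrequencyMonitor`, its parameters, and the warning text are all hypothetical, and the caller is assumed to pass in the timestamp of each write.

```python
from collections import defaultdict, deque


class WriteFrequencyMonitor:
    """Hypothetical tracker: counts index-triggered writes per document in a time window."""

    def __init__(self, window_seconds=60.0, threshold=100):
        self.window_seconds = window_seconds
        self.threshold = threshold
        # (index name, document id) -> timestamps of recent writes
        self._writes = defaultdict(deque)

    def record(self, index_name, doc_id, now):
        """Record one write at time `now`; return a warning string or None."""
        stamps = self._writes[(index_name, doc_id)]
        stamps.append(now)
        # Throw away anything that has fallen out of the window.
        while stamps and now - stamps[0] > self.window_seconds:
            stamps.popleft()
        if len(stamps) >= self.threshold:
            return (
                f"{doc_id} was written {len(stamps)} times in "
                f"{self.window_seconds:g}s by index {index_name}"
            )
        return None


monitor = WriteFrequencyMonitor(window_seconds=60.0, threshold=3)
for second in (0.0, 1.0, 2.0):
    warning = monitor.record("IndexA", "companies/1", now=second)
print(warning)
```

Only the timestamps inside the window are kept, so memory grows with the write rate rather than with history, but it still means recording the source of every write.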
Khalid, and now you need to track a WHOLE lot of information in the system. Not only that, but you need to track the source of each write, and that is pretty expensive and hard to do, all for something that is purely theoretical right now.
Also, there might be valid reasons why you would want to do that (if you know that you do recursion only to a certain level, for example).
Hey,
but shouldn't the Orders class with count and total already be there, empty, in the .NET class?
If not, how will that information be deserialized?
Simone, If the properties aren't there, they will be ignored and removed on the next save.
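A rough way to picture that behavior, using plain Python rather than the actual .NET client: the `Company` class, the `load` helper, and the stored values below are hypothetical. The class has no `Orders` property, so it is skipped on load, and when the entity is written back the stored document no longer has it.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class Company:
    # Hypothetical entity class that does not declare an Orders property.
    Name: str


def load(cls, stored):
    """Build an entity from a stored document, ignoring properties the class lacks."""
    known = {f.name for f in fields(cls)}
    return cls(**{key: value for key, value in stored.items() if key in known})


# Stored document after the index script has embedded the stats (made-up values).
stored = {"Name": "Alfreds Futterkiste", "Orders": {"Count": 2, "Total": 150.0}}

company = load(Company, stored)  # Orders is silently ignored here
saved = asdict(company)          # saving writes back only what the class knows about
print(saved)
```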