Feature discussion: Spicing up document refreshes in RavenDB
I posted about the @refresh feature in RavenDB, explaining why it is useful and how it can work. Now, I want to discuss a possible extension to this feature. It might be easier to show than to explain, so let’s take a look at the following document:
The idea is that in addition to the data inside the document, we also specify behaviors that will run at specified times. In this case, if the user is three days late in paying the rent, they’ll have a late fee tacked on. If enough time has passed, we’ll mark this payment as past due.
The basic idea is that in addition to just having a @refresh timer, you can also apply actions. And you may want to apply a set of actions, at different times. I think that the lease payment processing is a great example of the kind of use cases we envision for this feature. Note that when a payment is made, the code will need to clear the @refresh array, to avoid it being run on a completed payment.
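As a rough sketch of what such a document might look like, here is a lease payment modeled as a Python dictionary. The `At` and `Script` field names, the dates, the amounts, and the script bodies are all illustrative assumptions, not a confirmed RavenDB format. The `mark_paid` helper is hypothetical as well; it shows the "clear the @refresh array when a payment is made" step.

```python
# Hypothetical lease-payment document. The "At" / "Script" field names and
# the script bodies are assumptions made for illustration only.
lease_payment = {
    "Amount": 1500,
    "Status": "Pending",
    "@metadata": {
        "@collection": "LeasePayments",
        "@refresh": [
            # Rent assumed due on Feb 1st: three days late adds a fee.
            {"At": "2020-02-04T00:00:00Z", "Script": "this.LateFee = 50;"},
            # Much later, the payment is marked as past due.
            {"At": "2020-03-01T00:00:00Z", "Script": "this.Status = 'PastDue';"},
        ],
    },
}


def mark_paid(doc: dict) -> dict:
    """Record a payment and drop pending @refresh actions so they never run."""
    doc["Status"] = "Paid"
    doc["@metadata"].pop("@refresh", None)
    return doc
```

Calling `mark_paid(lease_payment)` leaves the document with no scheduled actions, so neither the late fee nor the past due script would fire on a completed payment.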
The idea is that you can apply operations to the documents at a future time, automatically. This is a way to enhance your documents with behaviors and policies with ease. You don’t need to set up your own code to execute this; you can simply let RavenDB handle it for you.
Some technical details:
- RavenDB will take the time from the first item in the @refresh array. At the specified time, it will execute the script, passing it the document to be modified. The @refresh item we are executing will be removed from the array. And if there are additional items, the next one will be scheduled for execution.
- RavenDB looks only at the first element in the @refresh array. So if the items aren’t sorted by date, the first one will be executed and the document persisted again. The next one (which was earlier than the first one) is already due for execution, so it will be run on the next tick.
- Once all the items in the @refresh array have been processed, RavenDB will remove the @refresh metadata property.
- Modifications to the document because of the execution of @refresh scripts are going to be handled as normal writes. It is just that they are executed by RavenDB directly. In other words, features such as optimistic concurrency, revisions and conflicts are all going to apply normally.
- If any of the scripts cause an error to be raised, the following will happen:
  - RavenDB will not process any future scripts for this document.
  - The full error information will be saved into the document with the @error property on the failing script.
  - An alert will be raised for the operations team to investigate.
- The scripts can do anything that a patch script can do. In other words, you can put(), load(), del() documents in here.
- We’ll also provide a debugger experience for this in the Studio, naturally.
- Amusingly enough, the script is able to modify the document, which obviously includes the @refresh metadata property. I’m sure you can imagine some interesting possibilities for this.
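The processing rules above can be sketched as a small simulation. This is not how RavenDB implements it; it is a minimal model under stated assumptions: scripts are Python functions standing in for JavaScript patch scripts, `Script` holds a name looked up in a hypothetical `SCRIPTS` registry, and `tick` is an invented name for one pass of the scheduler.

```python
from datetime import datetime, timezone


# Stand-ins for JavaScript patch scripts; each one mutates the document.
def add_late_fee(doc):
    doc["LateFee"] = 50


def mark_past_due(doc):
    doc["Status"] = "PastDue"


SCRIPTS = {"late_fee": add_late_fee, "past_due": mark_past_due}


def parse_utc(text):
    """Parse an ISO 8601 UTC timestamp such as 2020-02-04T00:00:00Z."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))


def tick(doc, now):
    """Process at most one @refresh item. Returns True if a script ran."""
    meta = doc["@metadata"]
    items = meta.get("@refresh") or []
    if not items:
        return False
    item = items[0]  # only the first element is ever considered
    if "@error" in item or parse_utc(item["At"]) > now:
        return False  # failed earlier, or not due yet
    try:
        SCRIPTS[item["Script"]](doc)
    except Exception as exc:
        item["@error"] = repr(exc)  # keep the failure on the failing item
        return False  # a real server would also raise an alert here
    # Drop the executed item; the script may have edited the array itself.
    remaining = [i for i in (meta.get("@refresh") or []) if i is not item]
    if remaining:
        meta["@refresh"] = remaining
    else:
        meta.pop("@refresh", None)  # nothing left, remove the property
    return True
```

With an unsorted array, the first call to `tick` runs whatever item is first once it is due, and the earlier-dated item behind it runs on the following call. An item carrying `@error` stays at the head of the array, which is what blocks any later scripts for that document.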
We also considered another option (look at the Script property):
The idea is that instead of specifying the script to run inline, we can reference a property on a document. The advantage is that we can apply changes globally much more easily. We can fix a bug in the script once. The disadvantage here is that you may be modifying a script for new values, but not accounting for the old documents that may be referencing it. I’m still in two minds about whether we should allow a script reference like this.
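To make the trade-off concrete, here is a small sketch of how a script reference might be resolved. The reference shape (a `Document` id plus a `Property` name), the `resolve_script` helper, and the `scripts/leases` document are all assumptions for illustration; the actual format is not settled.

```python
def resolve_script(item, load_document):
    """Return the script text for a @refresh item, inline or referenced."""
    script = item["Script"]
    if isinstance(script, str):
        return script  # inline script, stored on the item itself
    # Hypothetical reference shape: {"Document": <id>, "Property": <name>}
    shared = load_document(script["Document"])
    return shared[script["Property"]]


# An in-memory dictionary stands in for the database in this sketch.
store = {"scripts/leases": {"LateFee": "this.LateFee = 50;"}}
item = {
    "At": "2020-02-04T00:00:00Z",
    "Script": {"Document": "scripts/leases", "Property": "LateFee"},
}
```

Because the script is looked up when the item runs, editing `scripts/leases` once changes the behavior for every document that references it. That is the upside (one bug fix applies everywhere) and the downside (old documents silently pick up the new script).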
This is still an idea, but I would like to solicit your feedback on it, because I think that this can add quite a bit of power to RavenDB.
Comments
My first reaction to this possible feature was, "Whoa! That is cool!"
Then my next thought was, "Should that kind of logic really be inside the database?"
I don't actually know. But I know that when I was slinging SQL, I grew a disdain for having app logic in stored procedures, rather than in the app code. Might this have the same problem?
Just thinking out loud.
Architecture issues aside, I think this is a super cool feature. I have an idea how to apply it in my own projects.
Judah,
The idea here is that in many cases, you can just have this as part of your processing pipeline.
The ability to define logic that would run at predefined times and modify the document is quite powerful, and it can replace a lot of glue code that you'll need to run.
It does turn RavenDB more into a platform than a database, but I think that can reduce complexity for many scenarios. Especially since you can combine that with subscriptions to react to these changes as needed.
But yes, you need to think about your overall architecture. It may be easier to define it in your own code, but a lot depends on how you are doing things.
If you are working in a micro service environment, this can be just an extension of this as a very natural model.
Would you be able to support "At" as a relative datetime, possibly as a combination of a document property and a timespan?
For example, a grace period for a late payment. A mortgage payment is due on the {n} day of the month, but it is not considered late until day {n+15}.
I agree with Judah. When I read your first post, I thought, cool. Then I read this new post on the expanded refresh feature and thought, oh, not so much.
I think this gets into the slippery slope of what a DB is for. This level of embedded logic should not be in the db. Raven is great because it does 2 things really well: documents + indexing at its core.
Not feeling this Spicy Refresh feature.
If you really want to introduce something like this, allow for the storage/scheduling of patches, but outside of the documents.
I also have mixed feelings about this, but not because of the logic-inside-database issue.
Firstly, triggers are already sort of logic-inside-database, so extending them to run in response to other events (apart from CRUD) seems reasonable.
I have always wanted to be able to schedule stored procs to run inside the database on a schedule.
My question is if this capability is already being implemented, why limit it to such a narrow purpose - @refresh? Allow jobs to target whatever tag we want?
Phil S,
Relative time, no. You can absolutely hook up an event listener at save time that would generate the right time from the document data.
The reason I don't want to support it is that it is already crazy complex to figure out time. That is why we use UTC only times when we ask the user to give us a time. That gives us a point in time, vs. the complexity of local time. It gets worse when you are talking about relative to a property, because at this point, you have to also encode what time frame, etc.
Best to avoid the complexity entirely by using a UTC value.
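As a sketch of that save-time approach, the hook below computes an absolute UTC time from the document's own data before the document is stored. The `DueDate` property, the 15 day grace period, the `schedule_late_check` name, and the script body are assumptions taken from the mortgage example in the question, not an existing API.

```python
from datetime import datetime, timedelta, timezone


def schedule_late_check(doc, grace=timedelta(days=15)):
    """Turn 'DueDate + grace period' into an absolute UTC @refresh time.

    Assumes doc["DueDate"] is an ISO 8601 UTC string ending in 'Z'.
    """
    due = datetime.fromisoformat(doc["DueDate"].replace("Z", "+00:00"))
    at = (due + grace).astimezone(timezone.utc)
    refresh = doc.setdefault("@metadata", {}).setdefault("@refresh", [])
    refresh.append({
        "At": at.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "Script": "this.Status = 'Late';",  # illustrative script body
    })
    return doc
```

The relative rule lives in application code, where the time frame is known, and the database only ever sees a fixed point in time.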
Hassan,
That is the point with getting feedback on proposed features early.
As for storing the patch script, that is the last code sample in the post, did you see that? What do you think about this? This is probably something that we'll do more in the future. For example, we are thinking about allowing "local" subscriptions. So the subscription handling code will run inside RavenDB itself, so you don't need to maintain a separate client process.
Peter,
We currently don't have the ability to run code on any event. That is something that we are thinking about (it is complex, due to the distributed nature of RavenDB). I'm not sure that I'm following your question. The @refresh is just the invocation mechanism, you can execute whatever you want in response. Can you explain?
@Oren, I was confused, actually the @refresh is almost exactly the type of trigger I was referring to. I suppose for a recurring scheduled update the script can just add a new @refresh value.
Peter,
Yes, the @refresh script is free to modify the @refresh element as well, which will schedule the next call to the script.
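A recurring update along those lines might look like the sketch below, where a Python function stands in for the JavaScript script. The `send_reminder` name, the `RemindersSent` counter, the 30 day interval, and the `ran_at` parameter are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def send_reminder(doc, ran_at, interval=timedelta(days=30)):
    """Do the periodic work, then schedule the next run of the same script."""
    doc["RemindersSent"] = doc.get("RemindersSent", 0) + 1
    next_at = (ran_at + interval).astimezone(timezone.utc)
    refresh = doc.setdefault("@metadata", {}).setdefault("@refresh", [])
    refresh.append({
        "At": next_at.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "Script": "send_reminder",  # refers back to this same script
    })
    return doc
```

Each run appends one new entry, so the document keeps rescheduling itself until some other code clears the @refresh array.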