RavenDB 4.0: Data subscriptions, Part I
I’ll be talking about this feature more once the UI for it is complete, but this feature just landed in our v4.0 branch and it is so awesome that I can’t help talking about it right away.
In RavenDB 4.0 we have taken the idea of subscriptions and pushed it up a few notches. Data subscriptions give you a reliable, push-based method to get documents from RavenDB. You set up a subscription, and then you open it and RavenDB will stream to you all the documents that are relevant to your subscription. New documents will be sent immediately, and failures are handled and retried automatically. Subscriptions are a great way to build all sorts of background jobs.
In RavenDB 3.x their main strength was that they gave you a reliable push-based stream of documents, but in RavenDB 4.0, we decided that we want more. Let us take it in stages, here is the most basic subscription usage I can think of:
This is subscribing to all User documents, and RavenDB will first go through all the User documents, sending them to us, and then keep the connection alive and send us the document whenever a User document is updated. Note that we aren’t talking about just a one time thing. If I modify a document once an hour, I’ll be getting a notification on each change. That allows us to hook this up to jobs, analytics, etc.
The really fun thing here is that this is resilient to failure. If the client maintaining the subscription goes down, it can reconnect and resume from where it left off. Or another client can take over the subscription and continue processing the documents. In RavenDB 4.0, we now also have high availability subscriptions. That means that if a server goes down, the client will simply reconnect to a sibling node and continue operating normally, with no interruption in service.
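One way to picture "resume from where it left off" is a position that only advances after a document was processed successfully. The sketch below is a toy model under that assumption, not how RavenDB tracks progress internally; ToySubscription and its fields are hypothetical names.

```python
from typing import Callable, List, Tuple

Change = Tuple[str, dict]


class ToySubscription:
    """Remembers how far processing got, independent of any one client."""

    def __init__(self, log: List[Change]) -> None:
        self.log = log    # ordered list of (doc_id, doc) changes
        self.acked = 0    # number of changes acknowledged so far

    def run(self, worker: Callable[[str, dict], None]) -> None:
        while self.acked < len(self.log):
            doc_id, doc = self.log[self.acked]
            worker(doc_id, doc)   # if this raises, the position is unchanged
            self.acked += 1       # acknowledge only after success


log = [
    ("users/1", {"name": "Alice"}),
    ("users/2", {"name": "Bob"}),
    ("users/3", {"name": "Carol"}),
]
sub = ToySubscription(log)

seen_by_first = []


def flaky_worker(doc_id: str, doc: dict) -> None:
    # Simulate the first client going down while handling users/2.
    if doc_id == "users/2":
        raise ConnectionError("client went down")
    seen_by_first.append(doc_id)


try:
    sub.run(flaky_worker)
except ConnectionError:
    pass

# A second client takes over the same subscription and continues.
seen_by_second = []
sub.run(lambda doc_id, doc: seen_by_second.append(doc_id))
```

The second worker starts at users/2, the document the first one failed on, so nothing is skipped.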
But you aren’t limited to just blindly getting all the documents in a collection. You can apply a filter, like so:
In this manner, we’ll now only get notified about active users, not all of them. This filtering allows you to handle some really complex scenarios. If you want to apply logic to the stream of changed documents, you can, getting back only the documents that match whatever logic you have in your script.
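As a rough illustration of filtering, the sketch below applies a predicate to each changed document and passes along only the matches. It is plain Python over an in-memory list, not a RavenDB subscription script; the filtered helper and the is_active field name are assumptions for the example. The predicate looks at the whole current document, not at which fields changed.

```python
from typing import Callable, Iterable, Iterator, Tuple

Change = Tuple[str, dict]


def filtered(changes: Iterable[Change],
             predicate: Callable[[dict], bool]) -> Iterator[Change]:
    """Yield only the changes whose current document matches the predicate."""
    for doc_id, doc in changes:
        if predicate(doc):
            yield doc_id, doc


changes = [
    ("users/1", {"name": "Alice", "is_active": True}),
    ("users/2", {"name": "Bob", "is_active": False}),
    # Only the name changed here, but the user is still active, so it matches.
    ("users/1", {"name": "Alice B.", "is_active": True}),
]

active_ids = [doc_id for doc_id, _ in
              filtered(changes, lambda doc: doc.get("is_active", False))]
```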
But the script can do more than just filter, it can also transform. Let us say that we want to get all the active users, but we don’t need the full document (which may be pretty big), we just want a few fields from it.
In this manner, you can select just the right documents, and just the right values you need from the document and process them in your subscription code.
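Filtering and transforming can be thought of as a single script that either rejects a document or returns the trimmed-down shape you want. The following is a minimal Python sketch of that idea, not the actual subscription script syntax; the script and apply_script names, and the fields used, are hypothetical.

```python
from typing import Callable, Iterable, Iterator, Optional, Tuple

Change = Tuple[str, dict]


def script(doc: dict) -> Optional[dict]:
    """Return None to skip the document, or a small projection of it."""
    if not doc.get("is_active", False):
        return None
    # Keep only the fields the subscriber actually needs.
    return {"name": doc["name"], "email": doc["email"]}


def apply_script(changes: Iterable[Change],
                 fn: Callable[[dict], Optional[dict]]) -> Iterator[Change]:
    for doc_id, doc in changes:
        result = fn(doc)
        if result is not None:
            yield doc_id, result


changes = [
    ("users/1", {"name": "Alice", "email": "alice@example.com",
                 "is_active": True, "bio": "a very long text"}),
    ("users/2", {"name": "Bob", "email": "bob@example.com",
                 "is_active": False, "bio": "another long text"}),
]

projected = list(apply_script(changes, script))
```

Here the inactive user is dropped and the active one arrives without the large bio field.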
There is actually more, which I’ll post in the next post, but I’m so excited about this feature that I’m not even going to wait for the next publishing schedule and push this immediately. You can read it here.
Comments
If the User has IsActive=true but the document is changed in other properties, will the subscription receive the document?
Is it possible to only receive the document when a certain property has changed?
Could subscriptions be used with dateTime.now? For example I would want a subscription to only receive documents that became 'expired' after their expiration date.
// Ryan
Ryan, Yes, the check is on the entire document, not just the modifications. And you really want to check part II (link at the end of the post) for what you want. :-)
Subscriptions operate on the document data only, not based on time. What you are asking for with regards to time is not a state change (which fits subscriptions), but a query.
Yes, but I would have to set up a background thread that periodically polls the db with the query, whereas a subscription would have been a nice fit in this regard ...
// Ryan
Ryan, The problem with stuff that is based on time is that it changes, so every tick you'll need to re-evaluate it. What you can do, based on the second part of this post, is set up expiration and a versioned subscription, and then check for the (prev, null) scenario, which will be called when the expiration removes this document. That will give you what you want WRT expiration and being notified, but it isn't a general solution to all time based stuff.
I have used Oracle CQN for some time (seems to be the same), having lots of trouble with reliability. I know it is not easy to provide, but some sort of max delivery time (or an exception being thrown) would be nice. Also I think it should be impossible to set up a subscriber without registering an error handler for when the connection is lost.
Stig, You are correct that error handling is tough. You can read some of our thoughts in the matter here: https://ayende.com/blog/174913/api-design-robust-error-handling-and-recovery
There are actually several different layers of error handling here.
And I'm probably forgetting a few.
We're considering changing the type that we expose to the lambda to include additional details (including script error information). Errors in connection to the server / servers are handled via events, because we'll retry to reconnect as needed. Errors in the user code are fed into a dedicated error handler OnError method.
I'm using quick & dirty demos here, but basically, you need to pass an IObserver<T>, which has an OnError method that we'll call. And that is the only thing we expose externally. Users can use reactive extensions for a better API, but that is explicitly ignoring errors.
It would be nice if you could specify the subscription id or equivalent instead of having to use the one that comes back from Create. When you create a subscription, you effectively have to create a bespoke management system to be able to look up existing subscriptions.
For example, you could look for a subscription under "subscriptions/orders" rather than having to create a document with that id that stores the subscription id in order to find the subscription.
Kijana, We have done the work so this is possible, but this is still something I haven't decided on yet. There are issues with what happens when you try to create a subscription that already exists, maybe with a different configuration, etc.
I'd assume it would behave like any other document - you'd get an exception.
Perhaps that's what really strikes me as odd. So many things in the db work naturally with a string document id, and then here we're back to a long like it's a row in a table.
Kijana, Take a look at http://issues.hibernatingrhinos.com/issue/RavenDB-7436