Using RavenDB Subscriptions with complex object graphs
RavenDB Subscriptions allow you to create a query and subscribe to documents that match the query. They are very useful in many scenarios, including backend processing, queues and more.
Subscriptions allow you to define a query on a document, and get all the documents that match this query. The key here is that "all documents" doesn't refer to just the documents that exist now, but also to future documents that match the query. That is what the subscription part is all about.
The subscription query operates on a single document at a time, which leads to open questions when we have complex object graphs. Let's assume that we want to handle via subscriptions all Orders that are managed by an employee residing in London. There isn't a straightforward way of doing this. One option would be to add EmployeeCity to the Orders document, but that is a decidedly inelegant solution. Another option is to use the full capabilities of RavenDB. For Subscription queries, we actually allow you to ask questions about other documents, so the query on an order can look at the employee that the order refers to and filter on that employee's city.
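To make the idea concrete, here is a rough in-memory model in Python. It is not the RavenDB client API or the subscription query syntax, and every name in it (Employee, Order, managed_from, the document IDs) is made up for illustration. It only shows the shape of the check: the filter runs once per order and loads the referenced employee to test the city.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    id: str
    city: str


@dataclass
class Order:
    id: str
    employee: str  # ID of the employee document that manages this order


# Hypothetical in-memory "database": document ID -> employee document.
employees = {
    "employees/1": Employee("employees/1", "London"),
    "employees/2": Employee("employees/2", "Seattle"),
}


def managed_from(order, city, load):
    """Per-order filter: load the referenced employee and test their city."""
    employee = load(order.employee)
    return employee is not None and employee.city == city


orders = [
    Order("orders/1", "employees/1"),
    Order("orders/2", "employees/2"),
    Order("orders/3", "employees/1"),
]

# Each order is evaluated on its own; the employee is looked up per order.
matched = [o.id for o in orders if managed_from(o, "London", employees.get)]
print(matched)
```

The filter never sees more than one order at a time, which matches the "single document at a time" constraint; the related employee is pulled in by ID only for the duration of the check.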
Now we’ll only get the Orders whose employee is in London. Simple and quite elegant.
It does have a caveat, though. We will only evaluate this condition whenever the order changes, not when the employee changes. So if the employee moves, old orders will not be matched against the subscription, but new ones will.
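The caveat is easier to see in a toy model. The sketch below is plain Python, not RavenDB code, and the Subscription class and on_order_changed method are hypothetical names. The one assumption it encodes is the one stated above: the filter is evaluated only when an order changes, and it reads the employee's city at that moment.

```python
class Subscription:
    """Toy model: the filter runs only when an order is stored or updated."""

    def __init__(self, city):
        self.city = city
        self.delivered = []  # order IDs handed to the subscriber, in order

    def on_order_changed(self, order_id, employee_id, employees):
        # The employee's city is read at the moment the order changes.
        if employees.get(employee_id) == self.city:
            self.delivered.append(order_id)


employees = {"employees/1": "Leeds"}  # employee ID -> current city
sub = Subscription("London")

# The order is written while the employee lives in Leeds: no match.
sub.on_order_changed("orders/1", "employees/1", employees)

# The employee moves. Only the employee document changed, so nothing is
# re-evaluated and orders/1 is still not delivered.
employees["employees/1"] = "London"
after_move = list(sub.delivered)

# A new order for the same employee is evaluated now and matches.
sub.on_order_changed("orders/2", "employees/1", employees)

print(after_move)
print(sub.delivered)
```

In this model, the only way for orders/1 to be picked up after the move is for the order itself to change, which is exactly the behavior described above.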
Comments
"useful in many scenarios, including backend processing, queues..."
i already had a couple of use cases where in the end i couldn't use it as a queue because it's not really possible to implement competing consumers in a dynamic way. "dynamic": it needs to be possible to dynamically assign more workers. can you show how to implement that with subscriptions, maybe i'm missing something? in a webcast you used the changes API for this, but the changes API brings a whole world of other problems i don't want to solve (optimistic concurrency etc).
the only way i see to implement it is with a flag on each JobDoc to say which worker will do it. Each worker filters those in the subscription. But this "assignment" needs to be implemented, which i think is not that trivial and can get cumbersome.
it's interesting to see that with all the features in ravendb over time i was able to reduce the amount of other db's needed: (e.g. SQL, document expiration for caches (redis), distributed counters (redis), search (elastic search), time series (influxdb), in the near future office doc search (sharepoint PITA)...). unfortunately i still need rabbit mq or redis to implement a proper competing consumer scenario.
btw: what's happening with the graph feature? seems to be "experimental" for a very long time
Tobias,
Competing consumers with subscriptions run into a problem. With competing consumers, how would you know when to send the next batch? How can you ensure reliable delivery in this case? Consider the case of two consumers on the same subscription. One of them got a batch and is processing it; should the next one also get a batch? What happens on failure of the first client? We'll need to keep track of a non-trivial amount of state in this case: all the outgoing batches that should be sent.
With concurrent consumers, there is also another worry. A document may be modified multiple times in short order, in which case the same document may be in two concurrent batches. That is likely to lead to issues with your code, because of the concurrent processing.
I would love to have good answers to those issues. As for graphs, they are still experimental because we haven't had enough use of them in the field to give us more confidence in their utilization.