RavenDB Subscriptions & Messaging patterns
RavenDB is a database, not a queue or a service bus. That said, you can use RavenDB subscriptions to get behavior that is very similar to a service bus. Let’s see how much effort it takes to implement backend processing using RavenDB alone.
We assume that we have commands, or messages, that are written to the Commands collection and are handled via a subscription (which may have multiple concurrent workers). In terms of your messaging models, we have:
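The model definition itself is not reproduced here; a minimal sketch consistent with the properties described below might look like this (the `SendEmailCommand` type and its fields are purely illustrative):

```csharp
// Hypothetical sketch of the command model described below.
public enum CommandStatus
{
    Initial,
    Processing,
    Failed,
    Completed
}

public abstract class CommandBase
{
    public string Id { get; set; }
    public CommandStatus Status { get; set; } = CommandStatus.Initial;
    public int RetriesCount { get; set; } = 3;   // remaining attempts
    public string Error { get; set; }            // null until a failure occurs
}

// A concrete command carries its own payload:
public class SendEmailCommand : CommandBase
{
    public string To { get; set; }
    public string Subject { get; set; }
}
```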
The CommandBase we have here defines the following infrastructure properties:
- Status – enum [Initial, Processing, Failed, Completed] – default value is Initial
- RetriesCount – int – default value is 3 (the number of remaining delivery attempts)
- Error – string – null by default
We can now define our subscription using the following query:
from Commands as c
where c.RetriesCount > 0 and c.Status != 'Completed' and c.'@metadata'.'@refresh' == null
This query is pretty simple, but it allows me to get all the documents that haven’t exceeded their retry count. The @refresh option allows me to register a command to be executed at a later point in time. See the documentation here, this is a feature that exists specifically to allow you to schedule commands with subscriptions.
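Scheduling a command for later execution is then just a matter of setting the `@refresh` metadata value when storing the document. A sketch, assuming the RavenDB .NET client and a command type such as the hypothetical `SendEmailCommand` above:

```csharp
using (var session = store.OpenSession())
{
    var cmd = new SendEmailCommand { To = "...", Subject = "..." };
    session.Store(cmd);

    // The subscription query filters out documents that have @refresh set,
    // so this command becomes visible to the workers only after the given
    // time, when RavenDB removes the @refresh value and the document
    // matches the subscription again.
    var metadata = session.Advanced.GetMetadataFor(cmd);
    metadata["@refresh"] = DateTime.UtcNow.AddMinutes(15);

    session.SaveChanges();
}
```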
In my subscription workers, I can now execute:
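The worker code is not reproduced here; a hedged sketch of what it might look like with the RavenDB .NET client follows. The subscription name, the `CommandBase` type, and the `ProcessAsync` handler are placeholders, not part of the original post:

```csharp
// Create the subscription once (the name is illustrative):
store.Subscriptions.Create(new SubscriptionCreationOptions
{
    Name = "CommandsSubscription",
    Query = @"from Commands as c
              where c.RetriesCount > 0 and c.Status != 'Completed'
                and c.'@metadata'.'@refresh' == null"
});

var worker = store.Subscriptions
    .GetSubscriptionWorker<CommandBase>("CommandsSubscription");

await worker.Run(async batch =>
{
    using var session = batch.OpenSession();
    foreach (var item in batch.Items)
    {
        var cmd = item.Result;
        try
        {
            cmd.Status = CommandStatus.Processing;
            await ProcessAsync(cmd);   // your actual handler - placeholder
            cmd.Status = CommandStatus.Completed;
        }
        catch (Exception e)
        {
            cmd.RetriesCount--;
            cmd.Error = e.ToString();
            cmd.Status = cmd.RetriesCount > 0
                ? CommandStatus.Initial
                : CommandStatus.Failed;
            if (cmd.RetriesCount > 0)
            {
                // Back off: hide the command from the subscription until
                // the @refresh time passes, then it will be retried.
                session.Advanced.GetMetadataFor(cmd)["@refresh"] =
                    DateTime.UtcNow.AddMinutes(1);
            }
        }
    }
    // Saving the modified documents re-evaluates the subscription query;
    // Completed commands and those with exhausted retries no longer match.
    session.SaveChanges();
});
```

Because the status, error, and retry count live on the command document itself, every state transition is durable and queryable, which is what gives you the introspection described below.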
The code above is sufficient to get most of the way toward a robust message handling system.
I can easily see which messages are being processed, how long they take, etc. I can see what failed and why. And I can see the history of commands.
That covers scenarios such as error handling and retries, as well as introspection into the state of the system, and from there you can derive all the relevant numbers on throughput, capacity, etc.
It isn’t a complete solution, but for very little code, you can take this quite a long way.
Comments
A couple of nice things about this:
Transaction guarantees become a lot simpler because there's only one piece of infrastructure involved (RavenDB).
Because of that, troublesome outbox patterns are kinda redundant (you can change the entity/document and publish the event (another doc) in the same RavenDB transaction).
Some of the other features of RavenDB compose nicely if you need them (subscriptions, queries, revisions, expiry, refresh etc.).
For a world of microservices, the fewer infrastructure dependencies the better. Very likely we need a database to do anything useful. If that db can nicely fill the role of a queue for asynchronous processing too, with some transactional and reliability guarantees, while avoiding the pitfalls of "relational DBs as queues", I'm all for that simplification of the infrastructure dependencies.
For the future, I'd love to see this taken a step further: ETL to Kafka or RabbitMQ. RavenDB would then nicely fill the outbox pattern, where events are transactional with the data updates, but keep the application isolated from the dependency on the queue infrastructure while still being able to participate in a larger ecosystem.
Cool, I've always wanted to have RavenDb transport in Rebus or NServiceBus as an alternative to SQL Server transports.
Correct me if I'm wrong, but I tend to believe that pulling messages from RavenDb by querying, while preserving a reasonable order of messages, would require waiting for stale indexes. This subscription-based approach seems to be the way to avoid that, plus we get the benefit of being served via TCP.
It may be the case, though, that some service bus libraries do not expect transports to have their own message pump, so I suppose some parts of SubscriptionWorker<T> would need to be extracted as a lower-level API (e.g. to connect and read a batch) to make it work for them.
Trev,
You are correct about the synergy of features; we are quite explicitly thinking about how we can merge them all for this scenario. As for ETL to Kafka, take a look at this post, I would love your feedback:
https://ayende.com/blog/195585-B/feature-design-etl-for-queues-in-ravendb?key=3f64714eb3f340ea93e0e9b17714fff5
Milosz,
If you wanted to do that with queries, you'd likely need to poll. The typical way would be to marry the Changes() API (which will let you know when the index changes) with a query to get the latest values, etc. It is actually fairly simple to turn a SubscriptionWorker into an IEnumerable, if that is needed. The key issue is that you need to be able to mark completion to acknowledge the batch.
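One possible shape for that adapter, sketched here as an illustration (the `PulledBatch` type and `AsStream` helper are hypothetical, not RavenDB API): bridge the worker's push-based callback into a pull-based stream, and hand the consumer an explicit completion handle, since the batch is acknowledged only when the Run callback returns.

```csharp
// Sketch: exposing a SubscriptionWorker<T> as an async stream for a
// library that brings its own message pump.
public sealed record PulledBatch<T>(
    IReadOnlyList<T> Items,
    TaskCompletionSource Done) where T : class;

public static async IAsyncEnumerable<PulledBatch<T>> AsStream<T>(
    SubscriptionWorker<T> worker,
    [EnumeratorCancellation] CancellationToken ct = default) where T : class
{
    var channel = Channel.CreateBounded<PulledBatch<T>>(1);
    var run = worker.Run(async batch =>
    {
        var done = new TaskCompletionSource();
        var items = batch.Items.Select(i => i.Result).ToList();
        await channel.Writer.WriteAsync(new PulledBatch<T>(items, done), ct);
        // Hold the batch open until the consumer signals completion;
        // returning from this callback acknowledges the batch.
        await done.Task;
    });
    await foreach (var batch in channel.Reader.ReadAllAsync(ct))
        yield return batch;
    await run;
}
```

The consumer calls `batch.Done.SetResult()` after handling a batch, which lets the callback return and the subscription advance.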