Ayende @ Rahien

filter by tags archive

architecture (618) rss
bugs (451) rss
challanges (123) rss
community (381) rss
databases (481) rss
design (896) rss
development (647) rss
hibernating-practices (72) rss
miscellaneous (592) rss
performance (397) rss
programming (1093) rss
raven (1459) rss
ravendb.net (545) rss
reviews (184) rss

2025
- August (6)
- July (7)
- June (7)
- May (10)
- April (10)
- March (10)
- February (7)
- January (12)
2024
- December (3)
- November (2)
- October (1)
- September (3)
- August (5)
- July (10)
- June (4)
- May (6)
- April (2)
- March (8)
- February (2)
- January (14)
2023
- December (4)
- October (4)
- September (6)
- August (12)
- July (5)
- June (15)
- May (3)
- April (11)
- March (5)
- February (5)
- January (8)
2022
- December (5)
- November (7)
- October (7)
- September (9)
- August (10)
- July (15)
- June (12)
- May (9)
- April (14)
- March (15)
- February (13)
- January (16)
2021
- December (23)
- November (20)
- October (16)
- September (6)
- August (16)
- July (11)
- June (16)
- May (4)
- April (10)
- March (11)
- February (15)
- January (14)
2020
- December (10)
- November (13)
- October (15)
- September (6)
- August (9)
- July (9)
- June (17)
- May (15)
- April (14)
- March (21)
- February (16)
- January (13)
2019
- December (17)
- November (14)
- October (16)
- September (10)
- August (8)
- July (16)
- June (11)
- May (13)
- April (18)
- March (12)
- February (19)
- January (23)
2018
- December (15)
- November (14)
- October (19)
- September (18)
- August (23)
- July (20)
- June (20)
- May (23)
- April (15)
- March (23)
- February (19)
- January (23)
2017
- December (21)
- November (24)
- October (22)
- September (21)
- August (23)
- July (21)
- June (24)
- May (21)
- April (21)
- March (23)
- February (20)
- January (23)
2016
- December (17)
- November (18)
- October (22)
- September (18)
- August (23)
- July (22)
- June (17)
- May (24)
- April (16)
- March (16)
- February (21)
- January (21)
2015
- December (5)
- November (10)
- October (9)
- September (17)
- August (20)
- July (17)
- June (4)
- May (12)
- April (9)
- March (8)
- February (25)
- January (17)
2014
- December (22)
- November (19)
- October (21)
- September (37)
- August (24)
- July (23)
- June (13)
- May (19)
- April (24)
- March (23)
- February (21)
- January (24)
2013
- December (23)
- November (29)
- October (27)
- September (26)
- August (24)
- July (24)
- June (23)
- May (25)
- April (26)
- March (24)
- February (24)
- January (21)
2012
- December (19)
- November (22)
- October (27)
- September (24)
- August (30)
- July (23)
- June (25)
- May (23)
- April (25)
- March (25)
- February (28)
- January (24)
2011
- December (17)
- November (14)
- October (24)
- September (28)
- August (27)
- July (30)
- June (19)
- May (16)
- April (30)
- March (23)
- February (11)
- January (26)
2010
- December (29)
- November (28)
- October (35)
- September (33)
- August (44)
- July (17)
- June (20)
- May (53)
- April (29)
- March (35)
- February (33)
- January (36)
2009
- December (37)
- November (35)
- October (53)
- September (60)
- August (66)
- July (29)
- June (24)
- May (52)
- April (63)
- March (35)
- February (53)
- January (50)
2008
- December (58)
- November (65)
- October (46)
- September (48)
- August (96)
- July (87)
- June (45)
- May (51)
- April (52)
- March (70)
- February (43)
- January (49)
2007
- December (100)
- November (52)
- October (109)
- September (68)
- August (80)
- July (56)
- June (150)
- May (115)
- April (73)
- March (124)
- February (102)
- January (68)
2006
- December (95)
- November (53)
- October (120)
- September (57)
- August (88)
- July (54)
- June (103)
- May (89)
- April (84)
- March (143)
- February (78)
- January (64)
2005
- December (70)
- November (97)
- October (91)
- September (61)
- August (74)
- July (92)
- June (100)
- May (53)
- April (42)
- March (41)
- February (84)
- January (31)
2004
- December (49)
- November (26)
- October (26)
- September (6)
- April (10)

Couchbase vs RavenDB Performance at Rakuten Kobo Whitepaper

Nov 26 2021

RavenDB 5.3 New FeaturesRevisions includes

time to read 2 min | 313 words

Tweet Share Share 2 comments

Tags:

Enabling RavenDB’s revisions allows you to ask RavenDB to keep immutable copies of a document. We originally envisioned this feature as a way to have easy audit trails and a time travel feature. Revisions were meant to be something that you’ll typically access as the administrator, not something that we expected to be used in normal course of events.

Usage in the field showed that users often want to make revisions a core part of the domain model. We have a user that uses revisions to mark the Approved (and thus, locked) version of a Plan document, for example. Another example is a payroll processing system where the Contract for a particular employee isn’t pointing to a document, but to a specific revision of that contract. Modifications to the contract have no impact on the employee unless they are explicitly moved to a new version of the contract.

Seeing all those use cases popping up for revisions, we added more features around revisions to make it easier to work with them (such as allowing to explicitly create a revision on command or enforcing revisions policies after the fact).

In RavenDB 5.3 we have added the ability to include revisions as part of the query, so you get the same benefit of reduction in the number of remote calls as you get when working with document references. Let’s say that I want to calculate payroll for a set of employees, here is how I can do that (the Contract property contains the change vector of the specific version of the contract):

In terms of API, this is how you use it:

This smooths out a scenario where you had to deal with making multiple remote calls into a single one for the whole process. Just the kind of improvement that I like to see.

Nov 25 2021

RavenDB 5.3 New FeaturesJSON Patch

time to read 1 min | 150 words

Tweet Share Share 0 comments

Tags:

JSON Patch is a feature that allows the frontend to send a set of changes on documents. If you are working with complex documents, that can result in a significant reduction in bandwidth. There are many scenarios where the client can modify a document on the browser, then produce the JSON Patch to make the server match the changes.

In RavenDB 5.3, we added direct support for implementing JSON Patch inside of RavenDB. The frontend code can forward the patch operations directly to RavenDB, where they will be executed by the database. Concurrent patches on the same document aren’t going to contend with one another and will be processed correctly. For example, in this post I’m using patch scripts to modify a document. I can do the same using JSON patch as well.

Here is how you can use the new ability:

Small improvement, but can make some scenarios much easier.

Nov 24 2021

RavenDB 5.3 New FeaturesStudio & Query improvements

time to read 3 min | 424 words

Tweet Share Share 4 comments

Tags:

I like to think about myself as a database guy. My go to joke about building user interfaces is that a <table> is all I need for layout (it’s not a joke). About a decade ago I just gave up on trying to follow what is going on in the frontend land and accepted that I’ll reside in the backend from here on after.

Being ignorant of the ways you’ll write a modern frontend doesn’t affect the fact that I like to use a good user interface. I have seriously mixed feelings about the importance of RavenDB Studio to the project. On the one hand, I care that it is easy to use, obvious and functional. I love that it is beautiful and will generally make your life easier. And at the same time, I abhor the fact that it has such an impact on people’s decisions. I mean, the backend of RavenDB is absolutely beautiful, from a technical perspective. But everyone always talk about the studio.

Leaving aside my mini rant, we spend quite a lot of time and effort on the studio and the User Experience in general. This release is not an exception and we have a couple of major new updates to the studio.

One of the most common things you’ll do in the studio is run queries. In this release we have done a complete revamp of the automatic code completion for the client-side RQL queries written in the studio.
The new code assistance is available when writing any query in the Query view, Patch view, and in the Subscription Query. That was actually quite interesting, from a computer science perspective. We have formal grammar for RQL now, for example, which means that we can provide much better experience for query editing. For example, take a look:

Full code completion assistance and better error handling directly at the studio makes it easier to work with RavenDB for both developers and operations.

The second feature is the Identities page:

Identities has been a feature in RavenDB for a long time, and somehow they have never been front and center. Maybe the discoverability of the feature suffered? You can now create, edit and modify the identities directly in the studio, not just through the API.

Nov 23 2021

RavenDB 5.3 New FeaturesTCP Compression

time to read 2 min | 378 words

Tweet Share Share 2 comments

Tags:

Most of the time, you’ll communicate with RavenDB using HTTP, making REST calls. When you are doing that, you can take advantage of request compression. If the client indicates that it is able to by sending a Content-Encoding: gzip, RavenDB will send the data to you compressed. Given that we are working with JSON texts, which compress very well, we are looking at pretty significant savings in network bandwidth. This has been the case for RavenDB for many years (I didn’t check, but at least a decade, I believe).

There are certain cases, however, where RavenDB will use a binary protocol instead of HTTP. Those are usually scenarios where we are communicating directly with another RavenDB instance. All internal communications between RavenDB nodes will use direct TCP connections and when using Subscriptions, the client will open a TCP connection for the server and use that on a long term basis.

One of the fallacies of distributed computing is that bandwidth is infinite. One of the realities of cloud computing, on the other hand, is that you are paying for bandwidth. Even when you are running inside the same cloud region, cross availability zone network traffic is still charged. As you can imagine, on active systems, you may notice that you are spending a lot of bandwidth on inter cluster communication.

With RavenDB 5.3, we have added compression support for the replication and subscription connections . That means that replication and subscriptions will default for compressing the data. We are using the Zstd algorithm. In our tests, it produced both a higher compression ratio and faster performance than GZip. You don’t have to do anything for this to work (although there is a configuration option "Server.Tcp.Compression.Disable" to disable that if you really want to). When you upgrade to RavenDB 5.3, the cluster will automatically start compressing all traffic.

In our tests, we are seeing 85% (!) reduction in the amount of network traffic that we send out. That is something that I’m very much looking to seeing in our metrics once this is rolled out completely.

This is a RavenDB 5.3 feature (expected mid November) and will be available in the Professional and Enterprise editions of RavenDB.

Nov 18 2021

RavenDB 5.3 New FeaturesExperimental PostgreSQL wire protocol

time to read 4 min | 649 words

Tweet Share Share 4 comments

Tags:

A really nice feature that we have in RavenDB 5.3 is support for wire protocol compatibility with PostgreSQL. That opens up RavenDB to the entire PostgreSQL ecosystem. You are now able to connect to a RavenDB instance using the tools such as psql, Npgsql, etc. This feature is both surprisingly simple and incredibly complex at the same time.

The actual wire protocol from Postgres is well documented and pretty clean. Doing a clean room implementation of that is a straightforward process. Adding that to RavenDB, on the other hand, led to a number of interesting challenges. To start with, the protocol assumes, at the wire level, that all the rows that you return for a query have the same structure. This is a very reasonable assumption to make for a relation database protocol, but it doesn’t hold true when you are talking about a schema-less database such as RavenDB.

Then there is the fact that clients will generate all sort of pretty scary queries before you even get to running a user’s query. For example, take a look at how Npgsql is detecting the capabilities of the database that it connects to. Just supporting the wire protocol isn’t sufficient, you also need to support quite a bit of additional behavior.

When we implemented this feature, we decided that we’ll support the wire protocol, so you’ll be able to connect, issue queries and get results. However, the query language itself is going to be RQL. We aren’t attempting to pretend that we are a PostgreSQL instance to the outside world, only implement enough to make integration and compatibility work.

Here is an example of running a query through the Postgres integration.

This is an experimental feature, mind you. It is showing a lot of promise, but we want to get some more feedback from our users about which ways we should take it. The feature opens up many doors, but it is also bringing with it a non trivial amount of complexity.

This feature requires that we’ll open up another port to the world, this is something that we require the user to explicitly allow. To enable this feature, you’ll need to set the following options in the settings.json configuration file:

"Integrations.PostgreSQL.Enabled": true
"Features.Availability" : "Experimental"

You can also control which port it will use using the Integrations.PostgreSQL.Port configuration option. We default to 5433 if none is specified.

At the current time, we only allow to issue queries and not modify data using the Postgres integration. This is something that we would very much like more feedback on, what kind of scenarios would you like to have where write scenario is supported? What kind of writes do you expect to have at that point?

Finally, a word about security. The PostgreSQL protocol supports using TLS for encryption. When running in insecure mode, RavenDB will reject SSL/TLS connections from Postgres client. When running in a secured mode (the default), the same server certificate that is used inside of RavenDB will also be used for the Postgres connection. However, while we usually require that the other side also authenticates using a client certificate, in the case of Postgres connection, we run into a problem. There are quite a few scenarios where we found out that while the Postgres protocol supports mutual authentication using client certificates, clients aren’t supporting it.

For that reason, we are allowing user & password authentication (on top of TLS connection, obviously) for the Postgres connections at this time. Note that there is no correlation between the Postgres login and the access to any other RavenDB features (where client certificates is the only option).

This is part of the RavenDB 5.3 release (expected in mid November) and will be available in the Professional and Enterprise editions.

Nov 17 2021

RavenDB 5.3 New FeaturesElasticsearch ETL

time to read 3 min | 564 words

Tweet Share Share 0 comments

Tags:

RavenDB tries to be a good neighbor in your systems. RavenDB is typically used in polyglot solutions and we are often brought in to existing ecosystems. One of the things that we do to make it easier to use RavenDB is to have a full suite of built-in tools to make pushing data to other destinations.

For example, you can define an ETL process that will push document changes from RavenDB (potentially transforming & filtering them) to a relational database, another RavenDB instance, a data lake / OLAP system and much more.

In RavenDB 5.3 we have added Elasticsearch as an ETL target for RavenDB. If you are familiar with RavenDB ETL processes, the behavior is pretty much the same as you would expect. You select which collections you want to push to Elasticsearch, you provide a script that filters and transform the data and then you are done. From that point on, it is RavenDB’s responsibility to keep the Elasticsearch target up to date with any changes that are happening inside of RavenDB.

I’ll discuss the exact details on how to make it work shortly, but first I want to talk a bit about the usage scenario for this. Elasticsearch, just like RavenDB, it using Lucene behind the scenes to implement indexes. Unlike RavenDB, however, Elasticsearch is all about… well, searching. In that context, there is a pretty big overlap between RavenDB and Elasticsearch. In fact, one of the primary reasons we see people selecting RavenDB is that they now don’t need to maintain multiple environments (one to store the data and an Elasticsearch cluster for searching on that), RavenDB is able to undertake both needs in a single highly integrated and performant package.

The most common scenario for Elasticsearch ETL is when you already have an existing investment in Elasticsearch. RavenDB will naturally integrate into your environment, without needing to make any significant changes. That can enable you to start running queries, Kibana dashboard, etc on your RavenDB documents.

Here is the transformation script:

And the configuration telling RavenDB where to go:

You can push multiple collections to multiple Elasticsearch indexes. It is important to note that you must include the RavenDB document Id as a property in the script and also set it in the destination index configuration. If the Elasticsearch index doesn't already exist, RavenDB will create it for you on the fly.

This is… pretty much it. The actual feature is fully fledged, of course. You get monitoring and tracking, it will run in high availability mode and will be assigned an owner node in the cluster, etc. If there is a failure on Elasticsearch, there is no data loss, RavenDB will wait for the target to come back up and push all the data that was changed in the meantime. The ETL process is an online process, which means that you can expect to see changes in RavenDB reflected in Elasticsearch index within a few milliseconds of the transaction commit.

This feature is available in the Professional and Enterprise editions of RavenDB and will be included in the RavenDB 5.3 released, scheduled for mid November.

Nov 15 2021

RavenDB 5.3 New FeaturesIncremental time series & implementing lambda based accounting

time to read 2 min | 285 words

Tweet Share Share 0 comments

Tags:

Everyone is on the cloud these days, and one of the things that I keep seeing pushed is the notion of usage based billing. Basically, the idea that you are paying for what you use.

Let’s assume that we are building a software as a service where users can submit an image and you’ll do some computation on that. The actual details aren’t relevant. What matters is that your pricing model is based around how much time processing each image takes and how much memory is used. You are running this on many machines and need to figure out how to do billing at the end of the month. It turns out that this can be quite a challenge. With incremental time series, a lot of the details around that just go away.

Here is how you can implement this:

You count the required memory as well as the actual runtime and record that in an incremental time series. We are also storing the details in a separate document for that particular run in the same transaction (if the user cares about that level of detail). The interesting bit about how this can be used is that the data is now immediately available for the user to see how much they are going to be billed.

Typically, a lot of time is spent in figuring out how to record those details efficiently and then how to query and aggregate those. We tested time series in RavenDB to billions of data points, and the internal format lends itself very well to aggregated queries.

Now you can take the code above, run it on 100s of machines, and it will all end up giving you the proper result in the end.

Nov 12 2021

RavenDB 5.3 New FeaturesIncremental time series

time to read 6 min | 1111 words

Tweet Share Share 2 comments

Tags:

In RavenDB 5.0 we had a major new feature, native time series support. Using this feature, you can store values over time, query and aggregate them, store them efficiently, produce rollups, etc.

The classic example for time series data in RavenDB is when you have data coming from sensors. For example a Fitbit monitoring heartrate, a stock exchange feed giving you stock values. You don’t care about a particular value, you care about the value over time. It turns out that there are quite a lot of use cases for those kind of details. We have seen a major pick up in IoT related fields in particular.

However, the API we provided for users to insert data for time series had a limitation, have a look:

The API gives you the ability to record a value (or a set of values) at a particular point in time, with an optional tag for additional meaning. What is the problem with this API, then?

Well, it works great if you are processing data from a singular source (the stock exchange feed, or a medical device), but it fails to do its job if you may need to record multiple values for the same timestamp.

Huh? What does that even mean? If we a are storing a value per timestamp, obviously there should be a value for that timestamp. How can there be multiple values? Note that here I’m not talking about something like location (with latitude and longitude coordinates), those are covered under storing an array of values on the same timestamp.

The issue happens when you have the need to record multiple different values at the same timestamp. Typical time series are things like Heartrate, Location, StockPrice, etc. Having multiple values for the same thing at the same time frame doesn’t really work. In the Location time series, if I’m both here and there, you can expect trouble (if only because the paradox cops will show up). A stock may have different prices at the same time in different exchanges, sure, but that is not the same value, by its very nature.

There is a common scenario where this will happen. When what I’m recording is not the full value, but part of that value. The classic example for that is tracking page views. Let’s say that I want to know how many people are looking at this blog post, I cannot use the Append() API for that purpose. Each individual operation is going to belong to a particular timestamp. What happens if I have two views on this post at the exact same millisecond? For that matter, what happens in the more “interesting” case of having writes to the same millisecond on two different nodes in the cluster?

With timeseries as we envisioned them for the 5.0 release, that wasn’t an issue, a timeseries had a value in a particular timestamp. But supporting a scenario such as tracking views, or any scenario where we want to record partial data and have RavenDB take care of everything else isn’t served well by this model.

Note that RavenDB already has the notion of distributed counters, they are intended specifically for doing such things. It is trivial in RavenDB to implement a counter that would track the overall views on a post. It will also handle concurrency, distributing data between nodes, everything that needs to be handled. So why can’t I use that?

It turns out that I typically want to know more than just the total number of views on the post, I want to know when they happened. Counters are only a partial answer for that.

That is why incremental time series were created. They are here to marry the ability of time series to track a value over time and the distributed counters ability to aggregate information concurrently and in a safe distributed manner. Here is the new API for incremental time series:

The changes are apparent at the API level, the Increment() is not setting the value, it is incrementing it with a delta value. So two increments on the same timestamp will give you the right result. Note that we don’t have a way to tag the entry any longer. That is no longer meaningful, because a single timestamp may have multiple different values. The method is called increment, but note that you can also pass negative values, if you want to reduce the amount.

You can see in the image on the right how this looks like in the studio. An incremental time series is one that has the “INC:” prefix in the name. Such a time series is able to accept only increment operations, it will reject attempts to append values to it. In the same sense, a non incremental time series will not allow you to increment a value, only append new entries. We wanted to have a strong separation between the two time series modes because mixing them up resulted in a huge mess of edge cases that are really hard to solve.

I probably should explain the terminology here, because it reflects an important distinction:

Append – add a new timestamp and the value(s) for that time. This appends to the time series a new entry. Appending an entry to a time that is already in the timeseries will overwrite that time.
Increment – add a new timestamp and its values. If there is already value for that time in the time series, we’ll add the new value and existing value together, writing their sum as the new value.

That isn’t actually how it works internally, but that is the conceptual model.

Aside from using increment to set the values, incremental time series behave just like any other time series. You can query over them, aggregate, index, etc. They can create rollups (a rolled up incremental time series is a normal time series, not an incremental one), apply retention polices, and everything else that you can do with a time series, the special behavior of incremental time series does not extend to its rolled-up versions.

Here is a full example of how you can use this feature:

As usual, this is transactional with any other operation you may want to do, so you can increment a time series along side uploading an attachment and modifying a document, as a single atomic transaction.

And now we can ask about view counts on an hourly basis for the last week, like so:

This feature is going to be available in all editions of RavenDB 5.3, expected for release in mid November. I got so many ideas about what you can use this for Smile .

Nov 11 2021

RavenDB 5.3 New FeaturesConcurrent Subscriptions & Serial operations

time to read 6 min | 1086 words

Tweet Share Share 4 comments

Tags:

Almost as soon as we introduced concurrent subscriptions, we ran into a serious problem in their use. The desire was to do things in a serial fashion. That was quite infuriating, because we spent to much time working on making things concurrent, and now we had to deal with making them serial again? What the hell?

Before I dive any further, it will probably be for the best if I explained a bit more about the context of this very strange feature request.

Consider a system where the subscription is used to process commands, which may relationships between one another. For example, consider the following commands (all of them belonging to the same “Commands” collection):

EmployeePayroll – commands/40-A
EmployeeBankAccountChange – commands/34-A
EmployeeContractUpdate – commands/49-C

For each one of those commands (and many more), we want to run some logic. Some of this requires us to touch third party services, which means that we are likely to be slow / stalled on some cases. That is the exact case for using concurrent subscriptions.

The developers quickly jumped on the new system, setting the mode of the subscription as concurrent and running multiple workers. Things worked, latency was down and everyone was happy. Everyone, that is, except for George. The problem was George had gotten married recently. Well, that wasn’t the actual problem. George is happily married. The problem is that George and his wife have a new joint bank account. George let the HR department know about the new bank account in advance, which resulted in the EmployeeBankAccountChange command being generated. Then payroll day hit, and we have an EmployeePayroll command as well.

This is where things started to get iffy. In terms of timing, the EmployeeBankAccountChange happened before the EmployeePayroll command. When the subscription was running in serial mode, it was guaranteed that it will always process the commands in the order that they were modified. That meant that handling things like changing the bank account and actually paying had a very natural order. If you made the change before payroll, it got processed before hand, otherwise, it was processed afterward.

With concurrent subscriptions, this is no longer the situation. We are still working roughly in the order of modification, but we are no longer guaranteeing it. And it is possible to process documents out of order.

RavenDB’s concurrent subscriptions will ensure that you’ll not have to worry about concurrent processing of a single document, but in this case, there are different documents, so they can be processed concurrently. An EmployeeBankAccountChange may take a long time (verifying accounts, etc) while EmployeePayroll is just adding a line to a ACH file, so it is very likely that we’ll process the payroll before the account change. And that makes George very sad. Let’s see how we can avoid depressing the newlywed.

One option is to make use of another RavenDB feature, the compare exchange support. This allows you to use strongly consistent, cluster-wide, values which are suitable for distributed locks. I looked into what it will take to build this and quailed in fear. I don’t want to let things become this complicated.

The key issue here is that we want both concurrency and serial work. An interesting observation is that there is a scope for such things. Commands on the same employee should run in the same order they were issued, commands on different employees are free to run in whatever order they like. How can we make this work without diving head first into complexity the like of which will keep you up at night?

For the most part, we can assume that concurrent operations for the same employee is rare. Even when we have multiple commands for the same employee, we can expect that there won’t be many of them. Given that, we can change the way we model the commands themselves. Instead of creating a document per command, we’ll have a document per employee.

Where before we had this model:

We’ll now have the following model:

What does this give us? We now have a commands/employees/1-A for the first employee, all operations on the employee and handled as a single unit, guaranteed by the concurrent subscription. Let’s explore further how that works, okay?

With the previous model/modeling, to register a command, we need to just call:

All the commands were using the Commands collection, so the subscription worker will look like::

from Commands

And if we process this concurrently, we may process the commands for the same employee at the same time, leading to sadness in the household of George. Instead, with the new model/modeling, we can use the patching API to handle this. Here is what this looks like:

The idea in this case is that all commands for the same employee use the same document. If there isn’t already such a value, we’ll create a new instance, otherwise, we’ll apply the patch script and add to it. The end result is that we can have multiple concurrent operations and they will all be added to the same document in order of execution. However, so far this has nothing to do with concurrent subscriptions. What do we do from here? Here is what the subscription worker looks like after these changes:

The idea is that when we enqueue a command, we register them in the document specifically for the employee (the scope for serial work in a concurrent subscription) and when we process the command in the subscription worker we patch out all the commands that we already executed.

This behavior will guarantee that we can process commands serially within a concurrent worker. All commands for the same employee will be processed serially in the order they were submitted, while different employees will be processed concurrently.We even support adding additional commands to the employee document while the worker is processing commands, we’ll simply handle them in the next batch after the employee commands are all done.

One thing that I’m not discussing here is what to do in case we have concurrent modifications on the commands document in multiple nodes? That would generate a conflict and RavenDB defaults to selecting the latest version. You can configure RavenDB to resolve this property, I talk about this at length here.

Aside from leaning on the new concurrent subscriptions feature, all the rest of the things that we have been using in this post to solve the problem are long standing features of RavenDB and both conceptually and in practice this gives us a great deal of simplicity to handle a non trivial issue.

As usual, I would very much welcome your feedback.

Nov 10 2021

RavenDB 5.3 New FeaturesConcurrent subscriptions

time to read 7 min | 1214 words

Tweet Share Share 2 comments

Tags:

RavenDB supports a dedicated batch processing mode, using the notion of subscriptions. A subscription is simply a way to register a query with the database and have the database send the subscriber the documents that match the query.

The previous sentence is taken directly from the Inside RavenDB book, and it is a good intro for the topic. A subscription is a way to process documents that match a query. A good example might be to run various business processes as a result of data changes. Let’s assume that we have a bank, and a new customer was registered. We need to run plenty of such processes (Know Your Customer, Anti Money Laundering, Credit Score, in-house estimation, credit limits & authorization, etc).

A typical subscription query would then be:

from Customers where Onboarded = false

And then we can register to that subscription. At this point, the database will start sending us all the customers that haven’t been onboarded yet. This is a persistent query, so restarts and failures are handled properly. And the key aspect is that RavenDB will push the matching documents to the subscription worker. RavenDB will handle batching of the results, ensure that we can process humungous amount of data safely and easily and in general remove a lot of hassle from backend processing.

Up until RavenDB 5.3, however, a subscription was defined to be a singleton. In other words, at any given point, only a single subscription worker could be running. That is enforced by the server and help making it much easier to reason about processing documents. One comment that we got is that this is great, if the processing that we are doing is internal, but if there is the need to make a remote call to a potentially slow service, that can be an issue.

For example, consider the following worker code:

What happens when the CheckCreditScore() is slow? We are halting processing for everything. In some cases, it is only particular customers that are slow, and we absolutely want to process them in parallel. However, RavenDB did not allow that.

In RavenDB 5.3, we are bringing concurrent subscriptions to the table. When you create the subscription worker, you can define it with a Concurrent mode, like so:

When you have done that, RavenDB will allow multiple concurrent workers to run at the same time, processing batches in parallel. That means that a single slow customer will not halt your entire processing pipeline.

In general, I would like you to think about this flag as just removing a limitation. Previously we blocked you from an operation, and now you can run freely. However…

We didn’t decide to limit your capabilities just because we like doing that. One of the key aspects of subscriptions is that they offer reliable processing of documents. If an exception has been thrown when processing a batch, RavenDB will resend the batch to the worker again, until processing is susccessful. If we handed a batch of documents to process to a worker, and that worker crashed without letting us know, we need to make sure that the next client to connect will start processing from the last acknowledged batch.

It turns out that adding concurrency and the ability for workers to work completely independently of one another make such promises a lot harder to implement.

There is also another aspect that we have to consider. When we have just a single worker, certain concurrency issues never happen, but when we allow you to run concurrently, we have to deal with them.

Consider the subscription above, running on two workers. We handed a new customer document to Worker A, which started processing it. While Worker A is processing the document, that document has changed. That means that it needs to be processed again by the subscription. We have Worker B available and ready, but if we allow such a scenario, we risk getting a race between the workers, working on the same document.

We could punt that to the user and ask them to ensure that this is something that they handle, but that isn’t the philosophy of RavenDB. Instead, we have implemented the following behavior for concurrent subscriptions:

When the server sends a batch of documents to a worker, that worker “checks them out”. Until that worker signals the server that the batch has been either processed or failed, we’ll not send those documents out to other workers, even if they have been modified. Once a batch is acknowledged as processed, we’ll scan all the documents in that batch and see if we need to schedule them for the next batch, because they were missed while they were checked out.

That means that from the perspective of the user, they can write code knowing that only a single subscription worker will run on a given document at a time. This is a very powerful promise and can significantly simplify the complexity of building your systems. A single worker that is stalling will not prevent the other workers from making progress. There aren’t any timeouts to deal with. If you have a process that may take a long time, as long as the worker is alive and functioning (maintaining the TCP connection to the server), the server will consider the documents that the worker is processing as checked out.

Concurrent subscriptions require you to opt in, using the Concurrent flag. All workers for a subscription must agree to run in a concurrent mode. This is to ensure that there aren’t any workers that expect pure serial work model. If you aren’t setting this flag, you’ll keep getting the usual serial behavior of subscriptions. We require opting in to this behavior because we violate an important guarantee of the subscription, that you’ll process the documents in the order in which they were modified. This is now no longer the case, obviously.

The first worker to connect to a subscription will determine if it will run in concurrent mode or serial mode. Any new worker trying to run on that subscription needs to be concurrent (if the first one was concurrent) and no concurrent worker can join a subscription that has a serial worker active. This is a transient setting, it is important to note. When the last worker is shut down, the subscription state is reset, and then you can connect a worker for the first time again (which will then be able to set the mode of the subscription).

You can see in the benchmark image on the right the impact of adding concurrent workers when there is a non trivial processing time. It is important to note that the concurrent part of the concurrent subscriptions is the fact that the workers are running in parallel. We are still sending batches of documents for each worker independently and then waiting for confirmation. If you have no significant processing time for a batch, you’ll not see a significant improvement in processing time (the server side cost of processing the documents, sending the batch, etc is related to the total number of documents, and won’t be impacted).

Concurrent subscriptions are available in RavenDB 5.3 (due to be released by mid November) and will be available in the Professional and Enterprise editions of RavenDB.

Oren Eini

Oren Eini

CEO of RavenDB

RavenDB 5.3 New FeaturesRevisions includes

RavenDB 5.3 New FeaturesJSON Patch

RavenDB 5.3 New FeaturesStudio & Query improvements

RavenDB 5.3 New FeaturesTCP Compression

RavenDB 5.3 New FeaturesExperimental PostgreSQL wire protocol

RavenDB 5.3 New FeaturesElasticsearch ETL

RavenDB 5.3 New FeaturesIncremental time series & implementing lambda based accounting

RavenDB 5.3 New FeaturesIncremental time series

RavenDB 5.3 New FeaturesConcurrent Subscriptions & Serial operations

RavenDB 5.3 New FeaturesConcurrent subscriptions

FUTURE POSTS

RECENT SERIES

RECENT COMMENTS

Syndication

Main feed
Comments feed