Ayende @ Rahien

Oren Eini aka Ayende Rahien CEO of Hibernating Rhinos LTD, which develops RavenDB, a NoSQL Open Source Document Database.

You can reach me by:


+972 52-548-6969

Posts: 6,990 | Comments: 49,635

filter by tags archive
time to read 2 min | 303 words

Recently the time.gov site had a complete makeover, which I love. I don’t really have much to do with time in the US in the normal course of things, but this site has a really interesting feature that I love.

Here is what this shows on my machine:


I love this feature because it showcase a real world problem very easily. Time is hard. The concept we have in our head about time is completely wrong in many cases. And that leads to interesting bugs. In this case, the second machine will be adjusted on midnight from the network and the clock drift will be fixed (hopefully).

What will happen to any code that runs when this happens? As far as it is concerned, time will move back.

RavenDB has a feature, document expiration. You can set a time for a document to go away. We had a bug which caused us to read the entries to be deleted at time T and then delete the documents that are older than T. Expect that in this case, the T wasn’t the same. We travelled back in time (and the log was confusing) and go an earlier result. That meant that we removed the expiration entries but not their related documents. When the time moved forward enough again to have those documents expire, the expiration record was already gone.

As far as RavenDB was concerned, the documents were updated to expire in the future, so the expiration records were no longer relevant. And the documents never expired, ouch.

We fixed that by remembering the original time we read the expiration records. I’m comforted with knowing that we aren’t the only one having to deal with it.

time to read 6 min | 1077 words

I run across this article, which talks about unit testing. There isn’t anything there that would be ground breaking, but I run across this quote, and I felt that I have to write a post to answer it.

The goal of unit testing is to segregate each part of the program and test that the individual parts are working correctly. It isolates the smallest piece of testable software from the remainder of the code and determines whether it behaves exactly as you expect.

This is a fairly common talking point when people discuss unit testing. Note that this isn’t the goal. The goal is what you what to achieve, this is a method of applying unit testing. Some of the benefits of unit test, are:

Makes the Process Agile and Facilitates Changes and Simplifies Integration

There are other items in the list on the article, but you can just read it there. I want to focus right now on the items above, because they are directly contradicted by separating each part of the program and testing it individually, as is usually applied in software projects.

Here are a few examples from posts I wrote over the years. The common pattern is that you’ll have interfaces, and repositories and services and abstractions galore. That will allow you to test just a small piece of your code, separate from everything else that you have.

This is great for unit testing. But unit testing isn’t a goal in itself. The point is to enable change down the line, to ensure that we aren’t breaking things that used to work, etc.

An interesting thing happens when you have this kind of architecture (and especially if you have this specifically so you can unit test it): it becomes very hard to make changes to the system. That is because the number of times you repeated yourself has grown. You have something once in the code and a second time in the tests.

Let’s consider something rather trivial. We have the following operation in our system, sending money:


A business rule says that we can’t send money if we don’t have enough in our account. Let’s see how we may implement it:

This seems reasonable at first glance. We have a lot of rules around money transfer, and we expect to have more in these in the future, so we created the IMoneyTransferValidationRules abstraction to model that and we can easily add new rules as time goes by. Nothing objectionable about that, right? And this is important, so we’ll have unit tests for each one of those rules.

During the last stages of the system, we realize that each one of those rules generate a bunch of queries to the database and that when we have load on the system, the transfer operation will create too much pain as it currently stand. There are a few options that we have available at this point:

  • Instead of running individual operations that will each load their data, we’ll do it once for every one. Here is how this will look like:

As you can see, we now have a way to use Lazy queries to reduce the number of remote calls this will generate.

  • Instead of taking the data from the database and checking it, we’ll send the check script to the database and do the validation there.

And here we moved pretty much the same overall architecture directly into the database itself. So we’ll not have to pay the cost of remote calls when we need to access more information.

The common thing for both approach is that it is perfectly in line with the old way of doing things. We aren’t talking about a major conceptual change. We just changed things so that it is easier to work with properly.

What about the tests?

If we tested each one of the rules independently, we now have a problem. All of those tests will now require non trivial modification. That means that instead of allowing change, the tests now serve as a barrier for change. They have set our architecture and code in concrete and make it harder to make changes.  If those changes were bugs, that would be great. But in this case, we don’t want to modify the system behavior, only how it achieve its end result.

The key issue with unit testing the system as a set of individually separated components is that concept that there is value in each component independently. There isn’t. The whole is greater than the sum of its parts is very much in play here.

If we had tests that looked at the system as a whole, those wouldn’t break. They would continue to serve us properly and validate that this big change we made didn’t break anything. Furthermore, at the edges of the system, changing the way things are happening usually is a problem. We might have external clients or additional APIs that rely on us, after all. So changing the exterior is something that I want to enforce with tests.

That said, when you build your testing strategy, you may have to make allowances. It is very important for the tests to run as quickly as possible. Slow feedback cycles can be incredibly annoying and will kill productivity. If there are specific components in your system that are slow, it make sense to insert seams to replace them. For a example, if you have a certificate generation bit in your system (which can take a long time) in the tests, you might want to return a certificate that was prepared ahead of time. Or if you are working with a remote database, you may want to use an in memory version of that. An external API you’ll want to mock, etc.

The key here isn’t that you are trying to look at things in isolation, the key is that you are trying to isolate things that are preventing you from getting quick feedback on the state of the system.

In short, unless there is uncertainty about a particular component (implementing new algorithm or data structure, exploring unfamiliar library, using 3rd party code, etc), I wouldn’t worry about testing that in isolation. Test it from outside, as a user would (note that this may take some work to enable that as an option) and you’ll end up with a far more robust testing infrastructure.

time to read 6 min | 1149 words

I recently got into an interesting discussion about one of the most interesting features of RavenDB, the ability to automatically deduce and create indexes on the fly, based on actual queries made to the server. This is a feature that RavenDB had for a very long time, over a decade and one that I’m quite proud of. The discussion was about whatever such a feature was useful or not in real world scenario. I obviously leant on this being incredibly useful, but I found the discussion good enough to post it here.

The gist of the argument against automatic indexes is that the developers should be in control of what is going on in the database and create the appropriate indexes on their own accord. The database should be not be in the business of creating indexes on the fly, which is scary to do in production.

I don’t like the line of thinking that says that it is the responsibility of the developers / operators / database admins to make sure that all queries use the optimal query plans. To be rather more exact, I absolutely think that they should do that, I just don’t believe that they can / will / are able to.

In many respects, I consider the notion of automatic index creation to be similar to garbage collection in managed languages. There is currently one popular language that still does manual memory management, and that is C. Pretty much all other languages have switched to some other mode that mean that the developer don’t need to track things manually. Managed languages has a GC, Rust has its ownership model, C++ has RAII and smart pointers, etc. We have decades and decades of experience telling us that no, developers actually can’t be expected to keep track of memory properly. There is a direct and immediate need for systematic help for that purpose.

Manual memory management can be boiled down to: “for every malloc(), call free()”. And yet it doesn’t work.

For database optimizations, you need to have a lot knowledge. You need to understand the system, the actual queries being generated, how the data is laid out on disk and many other factors. The SQL Server Query Performance Tuning book is close to a thousand pages in length. So that is decidedly not a trivial topic.

It is entirely possible to expect experts to know the material and have a checkpoint to deployment that would ensure that you have done the Right Thing before deploying to production. Expect that this is specialized knowledge, so now you have gate keepers, and going back to manual memory management woes, we know that this doesn’t always work.

There is a cost / benefit calculation here. If we make it too hard for developers to deploy, the pace of work would slow down. On the other hand, if a bad query goes to production, it may take the entire system down.

In some companies, I have seen weekly meetings for all changes to the database. You submit your changes (schema or queries), it get reviewed in the next available meeting and deploy to production within two weeks of that. The system was considered to be highly efficient in ensuring nothing bad happened to the database. It also ensured that developers would cut corners. In a memorable case, a developer needed to show some related data on a page. Doing a join to get the data would take three weeks. Issuing database calls over the approved API, on the other hand, could be done immediately. You can guess how that ended up, don’t you?

RavenDB has automatic indexes because they are useful. As you build your application, RavenDB learn from the actual production behavior. The more you use a particular aspect, the more RavenDB is able to optimize it. When changes happen, RavenDB is able to adjust, automatically. That is key, because it remove a very tedious and time consuming chore from the developers. Instead of having to spend a couple of weeks before each deployment verifying that the current database structure still serve for the current set of queries, they can rest assured that the database will handle that.

In fact, RavenDB has a mode where you can run your system on a test database and take the information gather from the test run and apply it on your production system. That way, you can avoid having to learn the new behavior on the fly. You can introduce the new changes to the system model at an idle point in time and let RavenDB adjust to it without anything much going on.

I believe that much of the objection for automatic indexes comes from the usual pain involved in creating indexes in other databases. Creating an index is often seen as a huge risk. It may lock tables and pages, it may consume a lot of system resources and even if the systems has an online index creation mode (and not all do), it is something that you Just Don’t do.

RavenDB, in contrast, has been running with this feature for a decade. We have had a lot of time to improve the heuristics and behavior of the system under this condition. New indexes being introduced are going to have bounded resources allocated to them, no locks are involved and other indexes are able to server requests with no interruption in service. RavenDB is also able to go the other way, it will recognize which automatic indexes are superfluous and remove them. And automatic indexes that see no use will be expired by the query optimizer for you. The whole idea is that there is an automated DBA running inside the RavenDB Query Optimizer that will constant monitor what is going on, reducing the need for manual maintenance cycles.

As you can imagine, this is a pretty important feature and has been through a lot of optimization and work over the years. RavenDB is now usually good enough in this task that in many cases, you don’t ever need to create indexes yourself. That has enormous impact on the ability to rapidly evolve your product. Because you are able to do that instead of going over a thousand pages book telling you how to optimize your queries. Write the code, issue your queries, and the database will adjust.

Will all those praises that I heap upon automatic index creation, I want to note that it is a most a copper bullet, not a silver one. Just like with garbage collection, you are free from the minutia and tedium of manual memory management, but you still need to understand some of the system behavior. The good thing about this is that you are free()-ed  from having to deal with that all the time. You just need to pay attention in rare cases, usually at the hotspots of your application. That is a much better way to invest your time.

time to read 7 min | 1291 words

A system that runs on a single machine is an order of magnitude simpler than one that reside on multiple machines. The complexity involved in maintaining consistency across multiple machines is huge. I have been dealing with this for the past 15 years and I can confidently tell you that no sane person would go for multi machine setup in favor of a single machine if they can get away with it. So what was the root cause for the push toward multiple machines and distributed architecture across the board for the past 20 years? And why are we see a backlash against that today?

You’ll hear people talking about the need for high availability and the desire to avoid a single point of failure. And that is true, to a degree. But there are other ways to handle that (primary / secondary model) rather than the full blown multi node setup.

Some users simply have too much data to go around and have to make use of a distributed architecture. If you are gathering a TB / day of data, no single system is going to help you, for example. However, most users aren’t there. A growth rate of GB / day (fully retained) is quite respectable and will take over a decade to start becoming a problem on a single machine.

What I think people don’t remember so well is that the landscape has changed quite significantly in the past 10 – 15 years. I’m not talking about Moore’s law, I’m talking about something that is actually quite a bit more significant. The dramatic boost in speed that we saw in storage.

Here are some numbers from the start of the last decade a top of the line 32GB SCSI drive with 15K RPM could hit 161 IOPS. Looking at something more modern disk with 14 TB will have 938 IOPS. That is a speed increase of over 500%, which is amazing, but not enough to matter. These two numbers are from hard disks. But we have had a major disruption in storage at the start of the millennium. The advent of SSD drives.

It turns out that SSDs aren’t nearly as new as one would expect them. They were just horribly expensive. Here are the specs for such a drive around 2003. The cost would be tens of thousands (USD) per drive. To be fair, this was meant to be used in rugged environment (think military tech, missiles and such), but there wasn’t much else in the market. In 2003 the first new commodity SSD started to appear, with sizes that topped at 512MB.

All of this is to say, in the early 2000s, if you wanted to store non trivial amount of data, you had to face the fact that you had to deal with hard disks. And you could expect some pretty harsh limitations on the number of IOPS available. And that, in turn, meant that the deciding factor for scale out wasn’t really the processing speed. Remember that the C10K problem was still a challenge, but reasonable one, in 1999. That is, handling 10K concurrent connections on a single server (to compare, millions of connections per server isn’t out of the ordinary).

Given 10K connections per server, with each one of them needing a single IO per 5 seconds, what would happen? That means that we need to handle 2,000 IOPS. That is over ten times what you can get from a top of the line disk at that time. So even if you had a RAID0 with ten drives and was able to get perfect distribution of IO to drive, you would still be about 20% short. And I don’t think you’ll want to get 10 drives at RAID0 in production. Given the I/O limits, you could reasonably expect to serve 100 – 300 operations per second per server. And that is assuming that you were able to cache some portion of the data in memory and avoid disk hits. The only way to handle this kind of limitation was to scale out, to have more servers (and more disks) to handle the load.

The rise of commodity SSDs changed the picture significantly and NVMe drives are the icing on the cake. SSD can do tens of thousands of IOPS and NVMe can do hundreds of thousands IOPS (and some exceed the million IOPS with comfortable margin).

Going back to the C10K issue? A 49.99$ drive with 256GB has specs that exceed 90,000 IOPS. Those 2000 IOPS we had to get 10 machines for? That isn’t really noticeable at all today. In fact, let’s go higher. Let’s say we have 50,000 concurrent connections each one issuing an operation once a second. This is a hundred times more work than the previous example. But the environment in which we are running is very different.

Given an operating budget of 150$, I will use the hardware from this article, which is basically a Raspberry PI with SSD drive (and fully 50$ of the budget go for the adapter to plug the SSD to the PI). That gives me the ability to handle 4,412 requests per second using Apache, which isn’t the best server in the world. Given that the disk used in the article can handle more than 250,000 IOPS, we can run a lot on a “server” that fits into a lunch box and cost significantly less than the monthly cost of a similar machine on the cloud. This factor totally change the way you would architect your systems.

The much better hardware means that you can select a simpler architecture and avoid a lot of complexities along the way. Although… we need to talk about the cloud, where the old costs are still very much a factor.

Using AWS as the baseline, I can get a 250GB gp2 SSD drive for 25$ / month. That would give me 750 IOPS for 25$. That is nice, I guess, but it puts me at less than what I can get from a modern HDD today. There is the burst capability on the cloud, which can smooth out some spikes, but I’ll ignore that for now. Let’s say that I wanted to get higher speed, I can increase the disk size (and hence the IOPS) at linear rate. The max I can get from gp2 is 16,000 IOPS at a cost of 533$.  Moving to io1 SSD, we can get 500GB drive with 3,000 IOPS for 257$ per month, and exceeding 20,000 IOPS on a 1TB drive would cost 1,425$.

In contrast, 242$ / month will get us a r5ad.2xlarge machine with 8 cores, 64 GB and 300 GB NVMe drive. A 1,453$ will get us a r5ad.12xlarge with 48 cores, 384 GB and 1.8TB NVMe drive. You are better off upgrading the machine entirely and running on top of the local NVMe drive and handling the persistency yourself than paying the storage costs associated with having it out as a block storage.

This tyranny of I/O costs and performance has had a huge impact on the overall system architecture of many systems. Scale out was not, as usually discussed, a reaction to the limits of handling the number of users. It was a limit on how fast the I/O systems could handle concurrent load. With SSD and NVMe drives, we are in a totally different field and need to consider how that affect our systems.

In many cases, you’ll want to have just enough data distribution to ensure high availability. Otherwise, it make more sense to get larger but fewer machines. The reduction in management overhead alone is usually worth it, but the key aspect is reducing the number of moving pieces in your architecture.

time to read 3 min | 462 words

imageWe are exposing an API to external users at the moment. This API is currently exposed only via a web app. The internal architecture is that we have an controller using ASP.Net MVC that will accept the request from the browser, validate the data and then put the relevant commands in a queue for backend processing.

The initial version of the public API is going to be 1:1 identical to the version we have in the web application. That leads to an obvious question. How do we avoid code duplication between the two?

Practically speaking, there are some differences between the two versions:

  • The authentication is different (session cookie vs. api key)
  • The base class for the controller is different

Aside from those changes, they are remarkedly the same. So how should you share the code?

My answer, in this case, was that the proper strategy was Ctrl+C and Ctrl+V. I do not want to share code between those two locations. In this case, code duplication is the appropriate strategy for ensuring maintainable code.

Here is my reasoning:

  • This is the entry point to our system, not where the real work is done. The controller is in charge of validation, creating the command to run and sending it to the backend. The real heavy lifting happens elsewhere and will be common regardless.
  • That doesn’t mean that the controller doesn’t have work to do, mind. However, that work is highly dependent on the interaction model. In other words, it depends on what we show to the user in the web application. The web application and the existing controller are deployed in tandem and have no versioning boundary between them. Changes to the UI (adding a new field, for example, or additional behaviors) will be reflected in the controller.
  • The API, on the other hand, is a release once maintain for a long time. We don’t want any unforeseen changes here. Changes to the web controller code should not have an impact on the API behavior.

In other words, the fact that we have value equality doesn’t mean that we have identity equality. The code is the same right now, sure. And would be fairly easy to abstract so we’ll only have it in a single location. But that would create a maintenance burden. Whenever we make a modification in the web portion, we’ll have to remember to keep the old behavior for the API. Or we won’t remember, and create backward compatibility issues.

Creating separate codebases for ease use case result in a decision we make once, and then the most straightforward approach for writing the code will give us the behavior we want.

You might also want to read about the JFHCI approach.

time to read 2 min | 388 words

I recently had what amounted to a drive by code review. I was looking into code that wasn’t committed or PR. Code that might not have been even saved to disk at the time that I saw it. I saw that while working with the developer on something completely different. And yet even a glace was enough to cause me to pause and make sure that this code will be significantly changed before it ever move forward. The code in question is here:

What is bad about this code? No, it isn’t the missing ConfigureAwait(false), in that scenario we don’t need it. The problem is in the very first line of code.

This is meant to be public API. It will have consumers from outside our team. That means that the very first thing that we need to ensure is that we don’t expose our own domain model to the outside world.

There are multiple reasons for this. To start with, versioning is a concern. Sure, we have the /v1/  in the route, but there is nothing here that would make break if we changed our domain model in a way that a third party client relies on. We have a compiler, we really want to be able to use it.

The second issue, which I consider more important, is that this leaks information that I may not really want to share. By exposing my full domain model to the outside world, I risk quite a bit. For example, I may have internal notes on the support ticket which I don’t want to expose to the public. Any field that I expose to the outside world is a compatibility concern, but any field that I add is a problem as well. This is especially true if I assume that those fields are private.

The fix is something like this:

Note that I have class that explicitly define the shape that I’m giving to the outside world. I also manually map between the internal and external fields. Doing something like auto mapper is not something that I want, because I want all of those decisions to be made explicitly. In particular, I want to be sure that every single field that I share with the outside world is done in such a way that it is visible during PR reviews.

time to read 2 min | 260 words

These are not the droids you are looking for! – Obi-Wan Kenobi

Sometimes you need to find a set of documents not because of their own properties, but based on a related document. A good example may be needing to find all employees that blue Nissan car. Here is the actual model:


In SQL, we’ll want a query that goes like this:

This is something that you cannot express directly in RavenDB or RQL. Luckily, you aren’t going to be stuck, RavenDB has a couple of options for this. The first, and the most closely related to the SQL option is to use a graph query. That is how you will typically query over relationships in RavenDB. Here is what this looks like:

Of course, if you have a lot of matches here, you will probably want to do things in a more efficient manner. RavenDB allows you to do so using indexes. Here is what the index looks like:

The advantage here is that you can now query on the index in a very simple manner:

RavenDB will ensure that you get the right results, and changing the Car’s color will automatically update the index’s value.

The choice between these two comes down to frequency of change and how large the work is expected to be. The index favors more upfront work for faster query times while the graph query option is more flexible but requires RavenDB to do more on each query.

time to read 3 min | 568 words

Compression is a nice way to trade off time for space. Sometimes, this is something desirable, especially as you get to the higher tiers of data storage. If your data is mostly archived, you can get significant savings in storage in trade for a bit more CPU. This perfectly reasonable desire create somewhat of a problem for RavenDB, we have competing needs here. On the one hand, you want to compress a lot of documents together, to benefit for duplications between documents. On the other hand, we absolutely must be able to load a single document as fast as possible. That means that just taking 100MB of documents and compressing them in a naïve manner is not going  to work, even if this is going to result in great compression ratio. I have been looking at zstd recently to help solve this issue. 

The key feature for zstd is the ability to train the model on some of the data, and then reuse the resulting dictionary to greatly increase the compression ratio.

Here is the overall idea. Given a set of documents (10MB or so) that we want to compress, we’ll train zstd on the first 16 documents and then reuse the dictionary to compress each of the documents individually. I have used a set of 52MB of JSON documents as the test data. They represent restaurants critics, I think, but I intentionally don’t really care about the data.

Raw data: 52.3 MB. Compressing it all with 7z gives us 1.08 MB. But that means that there is no way to access a single document without decompressing the whole thing.

Using zstd with the compression level of 3, I was able to compress the data to 1.8MB in 118 milliseconds. Choosing compression level 100 reduced the size to 1.02MB but took over 6 seconds to run.

Using zstd on each document independently, where each document is under 1.5 KB in size gave me a total reducing from to 6.8 MB. This is without the dictionary. And the compression took 97 milliseconds.

With a dictionary whose size was set to 64 KB, computed from the first 128 documents gave me a total size of 4.9 MB and took 115 milliseconds.

I should note that the runtime of the compression is variable enough that I’m pretty much going to call all of them the same.

I decided to use this on a different dataset and run this over the current senators dataset. Total data size is 563KB and compressing it as a single unit would give us 54kb. Compressing as individual values, on the other hand, gave us a 324 kb.

When training zstd on the first 16 documents with 4 KB of dictionary to generate we got things down to 105 kb.

I still need to mull over the results, but I find them really quite interesting. Using a dictionary will complicate things, because the time to build the dictionary is non trivial. It can take twice as long to build the dictionary as it would be to compress the data. For example, 16 documents with 4 kb dictionary take 233 milliseconds to build, but only take 138 milliseconds to compress 52 MB. It is also possible for the dictionary to make the compression rate worse, so that is fun.

Any other idea on how we can get both the space savings and the random access option would be greatly appreciated.

time to read 2 min | 383 words

I did a code review recently and pretty much the most frequent suggestion was something along the line of: “This needs to be pushed to the infrastructure”. I was asked to be clearer about this, so I decided to write a blog post about it.

In general, whenever you see a repeating code pattern, you don’t need to start extracting it. What you need to do is to check whatever this code pattern serves a purpose. If it doesn’t serve a purpose, only then it is time to see if we can abstract that to remove duplication. I phrase things in this manner because all too often we see a tendency to immediately want to abstract things out. The truth is that in many cases, trying to abstract things is going to cause things to be less clear down the line. That is why I wanted to call it out first, even when I want to explain how to do the exact thing that I caution you about it.  Resource cleanup in performance sensitive code is a good example of one scenario where you don’t want to put things to the infrastructure, you want everything to be just there. There are other reasons, too.

After all the disclaimers, let’s talk about a concrete example in which we should do something about it.

Error handling is a great case for moving to infrastructure. This code is running inside an MVC Controller, and we can move our error handling from inside each action to the infrastructure, you can read about it here. I’m not sure if this is the most up to date reference for error handling, but that isn’t the point. The exact mechanism that you do it doesn’t matter. The whole idea is that you don’t want to see it. You push it to the infrastructure and then it is handled.

In the same manner, if you need to do logging or auditing, push them down the stack if they are in the form of: “User X accessed Y”. On the other hand, if you need something like: “Manager X authorized N vacations days for Y”, that is a business audit which should be recorded in the business logic, not in the infrastructure.  I wrote about this a lot in the past.

time to read 3 min | 557 words

One of the first features that RavenDB had, from the very first release, was multi document ACID transactions. With RavenDB you could modify multiple documents at the same time and then save them all, knowing that they would be be saved as a single atomic unit. Other NoSQL databases had you jump through fire if you wanted transactional behavior (building your own two phase commit protocol at the application level, not fun). I consider this to be a pretty important feature, obviously. But why?

In Inside RavenDB book, there is the following advice about modeling considerations for documents:

  • Independent, meaning a document should have its own separate existence from any other documents.
  • Isolated, meaning a document can change independently from other documents.
  • Coherent, meaning a document should be legible on its own, without referencing other documents.

In other words, with proper modeling, you shouldn’t need to have multi document transactions. Any transaction should only contain a single document, so that should be enough, no? Why spend all the time and effort on building multi document transactions?

Well, to start with “proper modeling” is a very loaded term. It is great if you can get it, but there are many cases where you need to deviate from it for various reasons. For example, in this blog, we represent a blog post as two documents. One holds the text of the post, the other holds the comments for the post. The reason for this separation is that there are many cases where you want only the blog post, and not the (potentially very many) comments.

In this case, the layout of the document is subject to the physical realities. It is better to split the document to multiple documents based on their purpose. However, if I add a comment to a post, I want to both add it to the Comments document and to increase the NumberOfComments property on the Post document. Doing this in a single transaction means that I don’t have to worry about consistency.

Another good example of wanting to work with multiple documents at the same time is when my documents aren’t using just holding data. Consider the case of accepting a new employee to the company. I need to:

  • Create the new Employee document.
  • Create initial workflow requirements (orientation, employee handbook, tax papers, setup machine / user / vpn access, etc).

In other words, each document here is independent, but they are created together. I want to create the new Employee’s document and at the same time setup a task for the IT to create a new user, allocate a machine, etc. I want to have the employee go through orientation, do all the require paper work, etc.

Each one of those is modeled as a separate document, because they are. See the definition above and consider how they match. But I absolutely don’t want to have a partial state. That I have a new employee, but I didn’t setup payroll for them is a big problem. Having multi document transaction make things a lot simpler.

You can argue that it would be better to model things as a set of related services with independent databases. I’m not sure that I disagree, but this is a much more complex architecture. Not having to go there on day one, while having clear and easy to use consistency guarantees is a major plus in my eyes.


No future posts left, oh my!


  1. Production postmortem (29):
    23 Mar 2020 - high CPU when there is little work to be done
  2. RavenDB 5.0 (3):
    20 Mar 2020 - Optimizing date range queries
  3. Webinar (2):
    15 Jan 2020 - RavenDB’s unique features
  4. Challenges (2):
    03 Jan 2020 - Spot the bug in the stream–answer
  5. Challenge (55):
    02 Jan 2020 - Spot the bug in the stream
View all series


Main feed Feed Stats
Comments feed   Comments Feed Stats