This post is here to answer several queries on the mailing list, and some questions that were raised in this blog post. I think that this is important enough to warrant a post here, instead of an email to the list, or just a comment.
To summarize, we had a few issues recently that impacted our users’ systems. Those are usually (but not always) cases where a combination of features wasn’t working properly (feature intersection), or just actual bugs. That led to some questions that are worth answering. You can find all the details below, but I would like to talk about what we are actually doing.
In the past 4 or 5 years, we have managed to create a NoSQL database for the .NET platform, and it has been doing nothing but picking up steam ever since we released it. We have been working hard to provide performance, features and stability for our users. On a personal note, it has been quite an amazing ride, seeing more people put RavenDB to use and creating interesting applications and features.
First, there seem to be some concerns about the new things that we are doing. Voron, in particular, appears to be a cause for concern. We have relied on Esent as our storage engine for the past four or five years, to great success. Not least of its properties is the fact that Esent has been around the block for a while now, and has been proven robust and safe by the simplest of methods: heavy and constant use over multiple decades. Esent also has its share of problems, but we haven’t forgotten why we chose it in the first place. Indeed, I still think that it was an excellent choice. With Voron, the only change you’ll see is that Esent won’t be the only choice.
Voron is meant to allow us to run on Linux machines, and to provide us with a fully owned stack, so we can do more interesting things across the board. But we aren’t letting go of Esent, and in any way you care to name, Esent is still going to be the core (and default) option we have for storage in RavenDB. With RavenDB 3.0, you’ll have the option to make an informed choice about selecting Voron as a storage engine, with a list of pros & cons.
Second, we do acknowledge that we suffer from a typical blindness in how we approach RavenDB. Since we built it, we know how things are supposed to be, and that is how we usually test them. Even when we try to go for the edge cases, we are constrained by our own thinking. We are currently working on getting an external testing team to do just that: actively use RavenDB in creative ways, specifically to try to break it.
Third, our own internal policies for releasing RavenDB need to be adjusted slightly. In particular, we are usually faced with two competing pressures: Release Already and Super Stable. We have always tried to release both unstable and stable versions, and the process for moving from unstable to stable is a pretty good one, I think. We have:
- The test suite, now clocking at just over 3,000 tests.
- A separate test suite that is meant to stress test the database.
- A performance test suite, to make sure that general performance stays in line.
- Longevity tests, making sure that we don’t have any issues in long term usage.
- Finally, as an act of dogfooding, we upgrade our own servers to the new build, and let it run in production for a while, just to make absolutely sure.
We are going to add additional tests (see the second point) to the process, and we are going to extend the duration of all of those steps. I think that in the past few months we have leaned too far toward the “Release Already” mode, so we are going to try to lean back (hopefully not too much) the other way.
Fourth, with regard to licensing: it has been our policy to provide anyone with a free trial license of RavenDB if they want to test it on a temporary basis. We require permanent non-developer servers to have a license. I think that this strikes the appropriate balance.
Fifth, we are going to be working on additional tooling around deployment and upgrades. For customers that jump multiple versions (moving from 1.x to 2.5, for example), updating the RavenDB internal storage data during an upgrade can be lengthy, and there is too little visibility into that process at the moment. We are also working on building tools that help figure out what is going on with a production instance (more ops endpoints, more visibility into internal operations, etc).
In summary, we are grateful to our users for bringing any issues to our attention. We are trying hard to have a very responsive feedback cycle, and we can usually resolve most issues within 24 – 48 hours. But I know we need to do better in making sure that users have a more streamlined experience.