Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.


Reviewing the Bleve search library

time to read 6 min | 1191 words

Bleve is a Go search engine library, and that means that it hit a few good points with me. It is interesting, it is familiar ground and it is in a language that I’m not too familiar with, so that is a great chance to learn some more.

I reviewed revision: 298302a511a184dbab2c401e2005c1ce9589a001

I like to start with reading from the bottom up, and in this case, the very first thing that I looked at was the storage level. Bleve uses a pluggable storage engine and currently has support for:

  • BoltDB
  • LevelDB
  • Moss
  • In memory tree

This is interesting, if only because I put BoltDB and Moss on my queue of projects to read.

The actual persistent format for Bleve is very well documented here, which makes it much easier to understand what is going on. The way Bleve uses the storage, it has a flat key/value store view of the world, as well as needing prefix range queries. Nothing else is required. Navigating the code is a bit hard for me as someone who isn’t too familiar with Go, but the interesting things start here, in scorch.go (no idea why this is called scorch, though).
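To make the "flat key/value store plus prefix range queries" requirement concrete, here is a minimal Python sketch. It is not Bleve's storage interface: the `SortedKV` class and the key layout (`t/<field>/<term>/<doc>`) are made up for illustration. The point is only that point reads and ordered prefix scans are enough to ask "which documents contain this term in this field?".

```python
import bisect


class SortedKV:
    """Toy ordered key/value store: flat byte keys, point reads, prefix scans."""

    def __init__(self):
        self._keys = []   # keys kept in sorted order
        self._data = {}   # key -> value

    def put(self, key: bytes, value: bytes) -> None:
        if key not in self._data:
            bisect.insort(self._keys, key)
        self._data[key] = value

    def get(self, key: bytes):
        return self._data.get(key)

    def prefix_scan(self, prefix: bytes):
        """Yield (key, value) pairs whose key starts with prefix, in key order."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if not key.startswith(prefix):
                break  # sorted keys: once the prefix stops matching, we are done
            yield key, self._data[key]


store = SortedKV()
# Hypothetical key layout, not Bleve's real row encoding.
store.put(b"t/title/bleve/doc1", b"freq=1")
store.put(b"t/title/bleve/doc7", b"freq=2")
store.put(b"t/title/bolt/doc3", b"freq=1")
store.put(b"d/doc1", b"stored fields")

hits = [key for key, _ in store.prefix_scan(b"t/title/bleve/")]
```

Here `hits` contains the two `bleve` rows in key order, and the scan stops as soon as it reaches the `bolt` row.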


We get a batch of changes, and run over them, adding an _id field to the document. So far, pretty simple to figure out. The next part is interesting:


You can see that we are running in parallel here, starting the analysis work and queuing it all up. Bleve then waits for the analysis to run. I’ll dig a bit deeper into how that works in a bit, first I want to understand how the whole batch concept works.
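The shape of that step (stamp each document with its id, queue the analysis for every document, then wait for all of it) can be sketched in Python roughly like this. The `analyze` function is a stand-in I made up; Bleve's real analyzers do far more, and the Go code uses goroutines rather than a thread pool.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze(doc):
    """Hypothetical analyzer: lowercase each field and split it into terms."""
    return {
        "id": doc["_id"],
        "terms": {f: v.lower().split() for f, v in doc.items() if f != "_id"},
    }


def analyze_batch(batch):
    # Add the _id field to every document in the batch.
    docs = [dict(fields, _id=doc_id) for doc_id, fields in batch.items()]
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Queue all the analysis work up front...
        futures = [pool.submit(analyze, doc) for doc in docs]
        # ...then wait for every result, preserving batch order.
        return [f.result() for f in futures]


results = analyze_batch({
    "doc1": {"title": "Reviewing Bleve"},
    "doc2": {"title": "Go search library"},
})
```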


So that tells us some interesting things. First, even though there is the concept of a store, there is also this idea of a segment. I’m familiar with this from Lucene, but there it is tied very closely to the on disk format. Before looking at the analysis, let’s look at this concept of segments.

The “zap” package, in this context, seems to refer to the encoding that is used to store the analysis results. It looks like it is running over all the results of the batch and writing them into a single binary value. This is very similar to the way Lucene works so far, although I’m still confused about the key/value store. What is happening is that after the segment is created, it is sent to prepareSegment. This eventually sends it to a Go channel that is used in the Scorch.mainLoop function (which is being run as a separate thread).

Here is the relevant code:


The last bit is the one that is handling the segment introduction, whatever that is. Note that this seems to be strongly related to the store, so hopefully we’ll see why this is showing up here. What seems to be going on here is that there is a lot of concurrency in the process, the code spawns multiple go funcs to do work. The mainLoop is just one of them. The persisterLoop is another as well as the mergerLoop. All of which sounds very much like how Lucene works.
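As a rough mental model of "send the segment down a channel to a loop that owns the index state", here is a Python sketch using a queue and a thread. All the names (`main_loop`, `introductions`, `root`) are mine, not Bleve's, and the real loop also coordinates with the persister and merger.

```python
import queue
import threading


def main_loop(introductions, root):
    """Single owner of the index state; writers talk to it only via the queue."""
    while True:
        item = introductions.get()
        if item is None:          # sentinel: shut the loop down
            return
        segment, applied = item
        root.append(segment)      # "introduce" the segment into the current view
        applied.set()             # tell the writer that its batch is now visible


introductions = queue.Queue()
root = []
worker = threading.Thread(target=main_loop, args=(introductions, root))
worker.start()

# A writer hands over a freshly built segment and waits until it is introduced.
applied = threading.Event()
introductions.put(({"docs": ["doc1", "doc2"]}, applied))
applied.wait(timeout=5)

introductions.put(None)
worker.join()
```

Because only the loop touches `root`, the writers never need a lock around it, which is the usual reason for this kind of design.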

I’m still not sure how this is all tied together. So I’m going to follow just this path for now and see what is going on with these segments. A lot of the work seems to be around managing this structure:


The Segment itself is an interface with the following definition:


There are Go in-memory and mmap versions of this interface, it seems. So far, I’m not following the relation between the storage interface and this segments idea. I think that I’m lost here so I’m going to go a slightly different route. Instead of seeing how Bleve writes stuff, let’s focus on how it reads. I’ll try to follow the path of a query. This path of inquiry leads me to this guy:


Again, very similar to Lucene. And the TermFieldReader is where we are probably going to get the matches for this particular term (field, value). Let’s dig into that. And indeed, following the code for this method leads to the inverted index, called upside_down in this code. I managed to find how the terms are being read, and it makes perfect sense, exactly as expected, it does a range query and parses both key and values for the relevant values. Still not seeing why there is the need for segments.

And here is where things start to come together. Bleve uses the key/value interface to store some data that it searches on, but document values are stored in segments, and are loaded directly from there on demand. At a glance, it looks like the zap encoding is used to store values in chunks. It looks like I didn’t pay attention before, but the zap format is actually documented and it is very helpful. Basically, all the per document (vs. per term / field) data is located there, as well as a few other things.

I think that this is where I’ll stop. The codebase is interesting, but I now know enough to have a feeling how things work. Some closing thoughts:

  • Really good docs.
  • I didn’t use my usual “read the project in lexical file order” to figure out things, and I had a hard time navigating the codebase because of that. Probably my lack of Go chops.
  • There seems to be a lot more concurrency for stuff that I would usually assume to be single threaded than I’m used to. I’m aware that Go has builtin concurrency primitives and it is more common to use them there, but it seems strange to see. As a consumer of search libraries, I’m not sure that I’m happy about this. I like to control my threading behaviors.
  • It seems that a lot of the data is held in memory (mmap) but in a format that requires work to handle, or in the key/value store, but again, in a format that requires work.

The problem with work is that you have to do it each and every time. I’m used to Lucene (read it once from disk and keep a cached version in memory that is very fast) or Voron, in which the data is held in memory and can be accessed with zero work.

I didn’t get to any of the core parts of the library (analysis, full text search). This is because they aren’t likely to be that different and they are full of the storage interaction details that I just went over.

RavenDB Security Review: Encrypt, don’t obfuscate

time to read 2 min | 293 words

The second finding in the security report was Inaccurate Key Size. Basically, we initialized a 512 bytes buffer as the master key we will use if the user didn’t give us one. The problem is that our encryption algorithms were using 256 bits out of this range.

The 512 bytes value wasn’t selected at random. This is the minimum sector size on all hard disks that you are likely to encounter and it was chosen to ensure that writing this value to disk would be atomic.

Fortunately, nothing in the code actually depended on that atomic property and it was fairly misleading. While it isn’t as if we are likely to run out of random numbers, reading the code and understanding that you use a 4096 bits buffer as the encryption key but only expect the first 256 bits to be used (note the difference between bits & bytes in this post, they matter) is confusing.

Another issue was that we reused this value for several different encryption algorithms. All of them were taking the same size key, and while that works, for code clarity and ensuring the longevity of the code, it is better to explicitly separate them.

In this case, it was very much the case of premature planning causing us to use something that was a fair bit more complex in practice than what we needed. The solution there was to be explicit and separate the requirements. Even if we had to write a bit more code, the cost over time would be much lower because it is now clearer what is going on and you don’t have to guess from prior knowledge.
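To illustrate what "explicit and separate" can look like, here is a small Python sketch: a master key that is exactly 256 bits, and a distinct key derived from it for each purpose. The report excerpt quoted later in this series mentions Blake2b with a usage-specific context identifier; the sketch below uses Python's `hashlib.blake2b` as a stand-in, not the libsodium calls RavenDB makes, and the context names are invented.

```python
import hashlib
import secrets

KEY_SIZE_BYTES = 32  # 256 bits, the size the ciphers actually consume


def derive_key(master_key: bytes, context: bytes) -> bytes:
    """Derive a purpose-specific 256-bit key from the master key."""
    if len(master_key) != KEY_SIZE_BYTES:
        raise ValueError("master key must be exactly 32 bytes (256 bits)")
    # Keyed BLAKE2b; `person` is the personalization string (at most 16 bytes).
    return hashlib.blake2b(
        key=master_key, person=context, digest_size=KEY_SIZE_BYTES
    ).digest()


master = secrets.token_bytes(KEY_SIZE_BYTES)
# Hypothetical usage contexts, one per consumer of the master key.
journal_key = derive_key(master, b"journal")
pages_key = derive_key(master, b"data-pages")
```

Each consumer now gets its own key of exactly the size it needs, and nobody has to remember that only the first 32 bytes of a 512 bytes buffer matter.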

RavenDB Security Review: Encrypting data on disk

time to read 3 min | 474 words

Continuing our discussion on nonce reuse issues that were raised in the security report, I want to talk about the way we encrypt the most important thing, your data.

RavenDB uses an algorithm called XChaCha20Poly1305 with 256 bits key to encrypt the data. But as we have learned, just using a key is not good enough, we need to use a nonce as well. This is easy when you need to encrypt a message in one go, but the way encryption in RavenDB works, we need to encrypt pieces of the data (randomly, depending on the way users are accessing the system).

In order to do that, RavenDB encrypts each page (usually 8KB in size) independently of the others. We actually use a different key for each page, derived from the master key, but I’ll touch on that in a different post. Here, I want to talk about the nonce.

Encryption today is not just about hiding data, it is also about being able to detect if the value has been tampered with, typically called authenticated encryption (AEAD). The algorithm we use requires 16 bytes for the message authentication code (MAC). The problem is that we need to store that MAC somewhere, and the nonce as well. And that value cannot be in the page itself, since that is encrypted.

Luckily for us, we have the page header, 64 bytes that are reserved at the beginning of each page. And we planned things accordingly to ensure that RavenDB will use only 32 bytes out of the header, giving us 32 bytes free for the encryption to use. The problem is that the XChaCha20Poly1305 algorithm uses a 16 bytes MAC and a 24 bytes nonce. And that is a bit too much to fit in 32 bytes, as you can imagine. Here is the kind of space allocation we have:


Increasing the size of the page header will have repercussions throughout our system, very late in the game, so we didn’t want to do that. Instead, we cheated. The 16 bytes of the nonce are generated using a cryptographic random number generator, but we pass a pointer to the page header 8 bytes before the nonce, so the encryption algorithm also takes the last 8 bytes of the page header itself into account in the nonce. We are guaranteed at least 128 bits of strong randomness there, and the page header itself will change from time to time, obviously, but we rely on the nonce random bytes to ensure uniqueness.
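The trick is easier to see with concrete offsets. The Python sketch below assumes a layout that I made up for illustration (32 bytes of header fields, then 16 random nonce bytes, then the 16 bytes MAC); the real field order in the page header may differ. What matters is that the 24 bytes nonce handed to the cipher starts 8 bytes before the random part.

```python
import secrets

HEADER_SIZE = 64
NONCE_SIZE = 24          # what XChaCha20Poly1305 expects
RANDOM_PART_SIZE = 16    # what we can afford to store
RANDOM_PART_OFFSET = 32  # hypothetical: right after the 32 bytes of header fields
MAC_OFFSET = 48          # hypothetical: the last 16 bytes hold the MAC


def fresh_nonce(header: bytearray) -> bytes:
    """Write 16 new random bytes into the header and return the 24 bytes nonce."""
    end = RANDOM_PART_OFFSET + RANDOM_PART_SIZE
    header[RANDOM_PART_OFFSET:end] = secrets.token_bytes(RANDOM_PART_SIZE)
    # Start 8 bytes early so the tail of the header fields is part of the nonce.
    start = RANDOM_PART_OFFSET - (NONCE_SIZE - RANDOM_PART_SIZE)
    return bytes(header[start:start + NONCE_SIZE])


header = bytearray(HEADER_SIZE)
header[0:32] = bytes(range(32))  # stand-in for the real header fields
nonce = fresh_nonce(header)
```

The first 8 bytes of `nonce` come from the header fields and the remaining 16 are the random bytes, so uniqueness rests on the 128 random bits.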

In this manner, we are able to fit the encryption requirements into our existing file structure and have strong encryption without uprooting everything.

RavenDB Security Review: Nonce reuse

time to read 4 min | 768 words

Nonce reuse was an issue in four separate locations in the RavenDB security report. But what is a nonce? And why does this matter? A cryptographic nonce is a number that can only be used once.

Let’s consider what encryption does. Given some initial state (a key, for example) it takes an input and outputs what to an outside observer should look like completely random noise. Let’s assume that I have the following secret message that I want to send: “Attack at dawn”. I run it through my most sophisticated encryption algorithm (with a pre-shared key) and get the following secret message:

Assume that I have an adversary that is capable of intercepting such messages, even if they don’t have the key. What can they do with this knowledge?

Well, if I’m always using the same key, and encryption is a pure mathematical computation, that means that encrypting the same string twice with the same key is going to result in the same encrypted output. Now, assume that I have some way to get you to encrypt a message of my choosing. For example, if I know that in reaction to something that I will do you’ll send a message saying “Attack imminent”, I can move some troops and then watch for a message to go by:

By comparing the two messages I can deduce that this: “✏” = “Attack”. From there, I can probably crack everything else in a short order.

Now, to be fair, everything above is very far from how things actually behave, but it should allow you to build a mental model of what is going on and why this is important. If you are interested in learning cryptography, I highly recommend the book Serious Cryptography.

One way of avoiding these issues is to not generate the same output for the same input each time. In order to do that we need to add something to the mix, and that is the nonce. The nonce is some number that is added to the state of the encryption / decryption and will ensure that two identical messages are not going to generate the same output (because they aren’t going to use the same nonce).

It’s important to understand that without a nonce, you don’t actually need to have identical inputs. In fact, the risk is that an attacker will get two different encrypted messages with the same key. At which point, depending on the exact encryption algorithm used, the attacker can get quite far into breaking the encryption. Again, I’m skipping over a lot of details here, trying to give you the general idea rather than the details.
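Here is a toy Python demonstration of both problems. The "cipher" below is a deliberately simplistic stream cipher I built from SHA-256 purely for illustration; it is not secure and it is not what RavenDB uses. It only shows what goes wrong when the same key is used without a fresh nonce, and how a nonce changes the picture.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key + nonce + counter. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def toy_encrypt(key: bytes, nonce: bytes, message: bytes) -> bytes:
    return xor(message, keystream(key, nonce, len(message)))


key = b"pre-shared key"
p1, p2 = b"Attack at dawn", b"Attack imminent"

# Same key and no nonce: the shared prefix "Attack " encrypts identically.
c1 = toy_encrypt(key, b"", p1)
c2 = toy_encrypt(key, b"", p2)
same_prefix = c1[:7] == c2[:7]

# Worse, XOR of the ciphertexts equals XOR of the plaintexts: the key drops out.
key_cancels = xor(c1, c2) == xor(p1, p2)

# A distinct nonce per message gives unrelated keystreams for the same key.
n1 = toy_encrypt(key, b"nonce-1", p1)
n2 = toy_encrypt(key, b"nonce-2", p2)
```

Both `same_prefix` and `key_cancels` hold with the repeated (empty) nonce, which is exactly the kind of leak described above.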

Pretty much all cryptographic protocols have the notion of a nonce. Sometimes it is called an IV, but that generally has the same purpose and it seems like nonce is the more popular term these days.

That leads to an interesting issue, if you reuse the same (key, nonce) pair to encrypt two different messages, it is game over, from a cryptographic point of view. So you really want to avoid that. In general, there are two ways to do that. Either use a counter and increment that each time you encrypt a message or generate a random number that is big enough that collisions don’t matter (usually, 192 bits number).

The first finding in the report was the use of a 64 bits randomly generated nonce. The problem is that this is susceptible to a birthday attack and a 64 bits value gives us only a 2^32 level of security, and that is low, given today’s standards. A proper way to handle that is to use a 192 bits number. In order to attack that you’ll need 2^96 attempts, and that is roughly 79,228,162,514,264,300,000,000,000,000 attempts, which is safe. The answer here was to change the encryption algorithm to one that indeed uses a 192 bits nonce and generate that using a cryptographically secure random number generator.
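The arithmetic behind those numbers is the birthday bound: with an n bits random value, collisions become likely after about 2^(n/2) values. A quick Python check (the function name is mine, and `secrets` stands in for whatever CSPRNG you actually use):

```python
import secrets


def birthday_bound(nonce_bits: int) -> int:
    """Rough number of random nonces before a collision becomes likely."""
    return 2 ** (nonce_bits // 2)


weak = birthday_bound(64)      # 2^32 = 4,294,967,296
strong = birthday_bound(192)   # 2^96 = 79,228,162,514,264,337,593,543,950,336

# Generating a 192 bits (24 bytes) nonce from the OS CSPRNG.
nonce = secrets.token_bytes(192 // 8)
```

Four billion messages under one key is well within reach of a busy system, while 2^96 is not something anyone is going to hit by accident.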

The third finding in the report had the same issue of 64 bits value, but in a somewhat nastier form. We were accepting the secret and entropy from our callers, and that gave them too much control over what we can do. We changed the code so we’ll only accept the secret to be encrypted and handled all the cryptographic details (now using 192 bits randomly generated nonce) directly, instead of exposing details that can be incorrectly used.

The final nonce reuse is a bit more complex to explain, and I’ll dedicate a post just for that.

RavenDB Security Review: Finding and details

time to read 4 min | 646 words

In Jan 2018 we asked Edge Security to do a thorough review of RavenDB security and cryptography usage. We wanted to get an outside expert opinion before the RTM release, to make sure that we put out a system that is robust and secured.

As an aside, I strongly recommend doing such a thing on major version releases (at least). Especially if you need an expert opinion, and security is certainly one area in which you want to have things verified.

In the spirit of full transparency, we have made the full report available here. I want to point out that all the issues that were raised in the report were fixed before the RTM release, but I think that it is worth going over each of the items that were brought up in the report and exploring them. We have tried our best to create a secured system and it was… humbling to get the report and see fifteen different locations where we failed to do so.

Security is hard to do and even harder to get right. The good news from our perspective was that all those issues were high risk in terms of their impact on the security of the product, but minor in terms of their effect on the overall architecture, so we were able to fix them quickly.

I’m going to take the time now to address each type of failure that was brought up in the report, discuss what kind of risk it represents and how it was resolved. I’ll deal with that in the next posts in this series.

The most important parts of the report are quoted below:

RavenDB deploys cryptography essentially on two different fronts: symmetric cryptography of all data on disk, and asymmetric cryptography via X.509 certificates as a means of authentication between clients and servers.

All symmetric encryption uses Daniel J. Bernstein’s XChaCha20Poly1305 algorithm, as implemented in libsodium, with a randomized 192-bit nonce. While opting for XChaCha20 over ChaCha20 means more calls to the RNG and a computation of HChaCha20, it also means that there is no possibility of nonce-reuse, which means that it is considerably more resilient than adhoc designs that might make a best-effort attempt to avoid nonce-reuse, without ensuring it. Symmetric encryption covers the database main data store, index definitions, journal, temporary file streams, and secret handling.

Such secret handling uses the Windows APIs for protected data, but only for a randomly generated encryption key, which is then used as part of the XChaCha20Poly1305 AEAD, to add a form of authentication. All long-term symmetric secrets are derived from a master key using the Blake2b hash function with a usage-specific context identifier.

At setup time, client and server certificates are generated. Clients trust the server’s self-signed certificate, and the server trusts each client based on a fingerprint of each client’s certificate. All data is exchanged over TLS, and TLS version failures for certificate failures are handled gracefully, with a webpage being shown indicating the failure status, rather than aborting the TLS handshake. Server certificates are optionally signed by Let’s Encrypt using a vendor-specific domain name. Certificates are generated using BouncyCastle and are 4096-bit RSA.

Keys, nonces, and certificate private keys are randomly generated using the operating system’s CSPRNG, either through libsodium or through BouncyCastle.

If you aren’t familiar with cryptographic terms, this can be pretty scary. There are lots of terms and names that are thrown around. I want to increase the knowledge of my readers, and after seeing the reactions of the guys internally to the report, I think it would do a lot of good to actually go over a real world report and its findings and discuss how we resolved them. Along the way, I’ll attempt to cover many of these cryptographic terms and decipher (pun intended) their meaning.

Book recommendations in the test of time

time to read 8 min | 1409 words

Technical books are interesting. Some of them last for decades, some of them are valid only for a session. I had a few discussions recently about books in a conference, in particular, what books would I recommend. That got me to really think about the topic. There are a lot of books that I think were really valuable for me when I read them that wouldn’t really make sense to recommend / talk about today. Not because they are bad books, but because both the industry and the reader have changed.

Consider a book that I was really impressed with at the time: Patterns of Enterprise Application Architecture.

It is a great book, and quite interesting. But if you’ll try to write anything based on its contents, you are doing yourselves and your employer a great disservice. This is a book that you’ll read, today, to understand how the already pre-existing libraries and frameworks are put together. Interesting, certainly, but much less relevant than when it came out over 15 years ago.

There are a lot of such cases. Books that are relevant for either the time period in which they were written, or sometimes even to the time period in the career of their reader. An example of a book that I was quite taken with is Code Complete. For very much the same reasons as the PoEAA book, it is much less relevant now than when it came out. This is because the ideas espoused in these books have won, they are both ubiquitous and expected.

That is not to say that you’ll always find them, or proper behavior, but that is the ground floor on which you’re expected to start from, not something that you need to aim at and strive for.

Because of this, I am actually struggling to think about good technical books that I believe would withstand the test of time. Anything that is too tech specific usually has an expiration date attached to it. And even if we are talking about concepts and ideas, I’m interested in things that will give me more than just information, but actually provide something more. A good book in this regard is something that would change how I’m doing things for a long time. A compiler book would tell me about how to write parsers and work with an AST, how to generate code and a lot of details of this nature. And that would be very valuable, but it would also usually be knowledge that is very specific to a task and place. It won’t be generally applicable.

Thinking back over all the technical books I have read, there are just a few that I can point to and say: “This book changed the way I write code and build systems”. And these books typically are still relevant today and I can happily recommend them to developers at every stage of their career.

The ones that really pop to mind are:

Release It!

I read it a few times, which is pretty rare for me with technical books and I got a few copies floating around in the office so I can tell people, “Read this and you’ll get it”.

The ideas there about building robust production systems, what are the challenges and the things to watch out for are invaluable. The patterns outlined in the book, anything from circuit breaker to explicit transparency have been invaluable for the software I write.

I do have to point out that the tech in the book is often Java (circa 2010, I guess). So when the book discusses specific options, that is often not relevant, but the content and the ideas are fascinating and have made a major impact on how I write code and architect systems.

Working Effectively with Legacy Code

I remember reading this book and going “Ahhh” several times over. The book talks about how you can approach a legacy codebase and make changes to it, and presumably it is useful in this regard. I read it very early in my career, before I really had the chance to get enough code to call it legacy and I have used the techniques in the book to avoid getting myself into too much trouble over time.

I should note that a lot of the things that are discussed, such as creating seams in the system so you can write tests for it are actually very useful for many other things. One of the things that I have noticed is that I will routinely make use of such seams for debugging things. To provide additional behavior and insight into what the system is doing explicitly to find a particular issue.

For some reason, I haven’t seen a lot of usage of that, but I consider debug hooks to be a really important feature of good software and I started doing that as a direct result of the kind of things that I read in Working Effectively with Legacy Code.

Operating Systems Concepts

I have to admit that I have a much older edition of this book, and I like that cover a lot more. But I think you will be more interested in hearing about the contents of this book.

This book, as well as Operating Systems Design and Implementation, cover how operating systems actually work, the major components and how they are actually put together. The topic may seem pretty academic and not of much use for application developers but I found it fascinating when I read it for the first time and I think it is very relevant today. Not so much the details, which are quite often different between operating systems and operating systems versions but the actual high level concepts.

It also helps in understanding what is actually going on when you are running code on a machine. Things like how threading is implemented and the idea of how the OS gets to decide what runs, how memory works and how you can make use of that, etc.

I wouldn’t be able to write RavenDB if I didn’t have a good grasp of all these details, and the 4.0 release has been quite explicit about building software so the operating system can help us, instead of having to fight it. In order to do that, we needed to understand how the OS works, what it expects applications to do and how to actually make the best use of that.

You might note that there are a lot of books that aren’t here. Nothing about source control or writing tests, the Pragmatic Programmer or Design Patterns. To head things off at the pass, it isn’t that these are not important, but at this point, talking to an experienced developer, I just assume that that kind of knowledge is already ingrained.


Federico had the following recommendation. Modern C++ Design is one of those books that literally break your understanding of how code is built and interpreted. I remember having taken this book back in early 2002 when I saw it standing on the counter of the computer science department library. I somehow convinced the secretary to give it to me under the promise of returning it, because there was a professor waiting for it. I read it completely in a week. End result, either this guy was absolutely crazy or I didn’t understand a thing (later I discovered it was not the former). So I did what anyone responsible enough would do; start all over again, and not return the book. Had to read it 3 times in the space of a month to barely grasp the concepts and got fined because of the late return 1 month later :D

I would love to hear about the books that you found fundamental to your career.

Book recommendation: Serious Cryptography

time to read 2 min | 291 words

I haven’t done a technical book recommendation for a while, but I think that this is a really great book to break that streak.

Serious Cryptography talks about cryptography, obviously, but it does it in such a way that it is understandable. I think that it is unique in the sense that most of the other cryptography books and materials that I have read started from so many baseline assumptions or were so math heavy that they were not approachable. The other types of cryptography books, like the Code Book are more in the sense of popular science. They give you background, but nothing actionable.

What I really liked about Serious Cryptography (henceforth, the book) is that it is a serious discussion of cryptography without delving too deeply into math (but with clear explanations and details on it) and that it is practical. Oh, it isn’t an API guideline and it isn’t something that you can just pick up and learn cryptography from, but it does an amazing job in laying out the field and explaining all sorts of concepts and ideas that are generally just assumed.

I read it in two days, because it was fascinating reading and because it is relevant to what I’m actually doing. One of the most fun parts is “how things fail”, where the author discusses various failures that happened in the real world, what caused them and what actions were taken as a result.

If you have any interest in security whatsoever, I highly recommend this book.

And if you have good technical books, especially in a similar vein, I would love to hear about them.

PR Review: The simple stuff will trip you

time to read 2 min | 285 words

In a recent PR, I ran into this code, which is used in query generation to decide if we need to quote a particular alias. The code itself is pretty straightforward and easy to follow:



It also has two distinct issues. First, there is the allocation because of the ToUpper call and second, we are doing an O(N) search on the alias array every single time.

I asked for a change, to use HashSet and to use the OrdinalIgnoreCase comparer.

Here is the change I got back:



This is exactly what I asked for, and it is very subtly wrong. We are now saving an allocation, which is great, but the problem is with the Contains method.


This looks okay, but this is not HashSet.Contains. Instead, this is the Enumerable.Contains extension method, which iterates over the set and compares each value.

The fix is also simple:



And now we don’t have O(N) anymore.

Although I’ll admit that for such small size, it probably doesn’t matter.

PR Review: Encapsulation stops at the assembly boundary

time to read 2 min | 247 words

The following set of issues all fall into code that is used within the scope of a single assembly, and that is important. I’m writing this blog post before I got the chance to talk to the dev in question, so I’m guessing about intent.


This change is likely motivated by the fact that callers are not expected to make a modification to the resulting dictionary.

That said, this is used between different components in the same assembly, and is never exposed outside. That means that we have a much higher trust between the components, and reading IReadOnlyDictionary means that we need to spend more cycles trying to figure out who you are trying to protect from.

Equally important, in this case, the Dictionary methods can be called without any virtual call overhead, while the IReadOnlyDictionary needs interface dispatch to work.


This is a case that is a bit more subtle. The existingData is a variable that is passed to a method. The problem is that in this case, no one is ever going to send null, and sending a null is actually an error.

In this case, if we did get a null, I would rather that the code would immediately crash with “what just happened?” rather than limp along with bad data.

PR Review: Beware the things you can’t see

time to read 1 min | 110 words

I had to reject the following change in a recent PR. In this context, the flags and conflicted.Flags are the same, and that wasn’t the problem. Can you spot the issue?


The problem is that the second version does an allocation. It does this silently, and you need to know about this issue to know that this happens. There is good discussion on this in this StackOverflow question.

It looks like this has been fixed in the JIT for CoreCLR and will be part of the 2.1 release when it is out.

