Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by email or phone:


+972 52-548-6969

, @ Q c

Posts: 6,434 | Comments: 47,591

filter by tags archive

All I asked was Hello World

time to read 2 min | 262 words

When running a full cluster on a single machine, you end up with a lot of console windows running instances of RavenDB, and it can be a bit hard to identify which is which.

We had the same issue when we have multiple tabs, it takes effort to map urls to the specific node, we solved this particular problem by making sure that this is very explicit in the browser.


And that turned out to be a great feature.

So I created an issue to also print the node id in the console, so we can quickly match a console window to the node id that we want. The task is literally to write something like:


And the only reason I created an issue for that is that I didn’t want to be side tracked by figuring out where to put this line.

Instead of a one line change, I got this:


Now, here is the difference between a drive by line of code and an actual resolution. Here is what this looks like:


And this handles nodes that haven’t been assigned and ID yet and color the nodes differently based on their topology so they can be easily be told apart. It also make the actual important information quite visible at a glance.

Error handling belongs at Layer 7 (policy)

time to read 3 min | 411 words

imageWhen you setup a RavenDB server securely, it doesn’t require a valid X509 certificate to be able to connect to it.

You might want to read that statement a few times to realize what I’m saying. You don’t need a certificate to connect to a RavenDB service. But on the other hand, we use client certificate to authenticate connections. So which is it?

The problem is with the specific OSI layer that we are talking about. When using SSL / TLS you can require a client certificate on connection. That ensures that only connections with a client certificate can be processed by the server. This is almost what we want, but it has the sad effect of having really poor error reporting capabilities.

For example, let us assume that you are trying to connect to a secured RavenDB server without a certificate. If we just require client certificate, then the error would be raised a the SSL layer, giving you errors that look something like:

  • The connection has been closed
  • Could not establish trust relationship for the SSL/TLS

This can also happen if there is a firewall in the middle, if enough packets were dropped, if there is a solar storm or just because it is Monday.

Or, it can be because you don’t have a certificate, your certificate expired, it is not known or you are trying to access a resource you are not authorized to.

In all of those cases, the error you’ll receive if you require & validate the certificate at OSI Layer 6 is going, to put it plainly, suck. Just imagine trying to debug something like that in a protocol that by design is intended to be hard / impossible to eavesdrop to. Instead, we can allow access to the server without a certificate and enforce the required certificate at a higher level.

That means that when you connect to RavenDB and you don’t have a certificate, we’ll accept the connect and then either give you an error (if you are an API) that you can reason about (because it will tell you what is wrong) or you’ll be redirected into a nice error page (if you are using the browser). Alternatively, if you have a certificate and it is not valid for any number of reasons, RavenDB will tell you all about it.

That reduce the amount of hair loss in the case of errors significantly.

RavenDB 4.0Unbounded results sets

time to read 3 min | 503 words

Unbounded result sets are a pet peeve of mine. I have seen them destroy application performance more then once. With RavenDB, I decided to cut that problem at the knees and placed a hard limit on the number of results that you can get from the server. Unless you configured it differently, you couldn’t get more than 1,024 results per query. I was very happy with this decisions, and there have been numerous cases where this has been able to save an application from serious issues.

Unfortunately, users hated it. Even though it was configurable, and even though you could effectively turn it off, just the fact that it was there was enough to make people angry.

Don’t get me wrong, I absolutely understand some of the issues raised. In particular, if the data goes over a certain size we suddenly show wrong results or error, leaving the app in a “we need to fix this NOW”. It is an easy mistake to make. In fact, in this blog, I noticed a few months back that I couldn’t get entries from 2014 to show up in the archive. The underlying reason was exactly that, I’m getting the number of items per month, and I’ve been blogging for more than 128 months, so the data got truncated.

In RavenDB 4.0 we removed the limit. If you don’t specify a limit in a query, you’ll get exactly how many results there are in the database. You can ask RavenDB to raise an error if you didn’t specify a limit clause, which is a way for you to verify that you won’t run into this issue in production, but it is off by default and will probably better match the new user expectations.

The underlying issue of loading too many results is still there, of course. And we still want to do something about it. What we did was raise alerts.

I have made a query on a large set (160,000 results, about 400 MB in all) and the following popped up in the RavenDB Studio:


This tells the admin that it have some information that it needs to look at. This is intentionally non obtrusive.

When you click on the notifications, you’ll get the following message.


And if you’ll click on the details, you’ll see the actual details of the operations that triggered this warning.


I actually created an issue so we’ll supply you with more information (such as the index, the query, duration and the total size that it generated over the network).

I think that this gives the admin enough information to act upon, but will not cause hardship to the application. This make it something that we Should Fix instead Get the OnCall Guy.

RavenDB 4.0Managing encrypted databases

time to read 6 min | 1012 words

imageOn the right you can see how the new database creation dialog looks like, when you want to create a new encrypted database. I talked about the actual implementation of full database encryption before, but todays post’s focus is different.

I want to talk a out managing encrypted databases. As an admin tasked working with encrypted data, I need to not only manage the data in the database itself, but I also need to handle a lot more failure points when using encryption. The most obvious of them is that if you have an encrypted database in the first place, then the data you are protecting is very likely to be sensitive in nature.

That raise the immediate question of who can see that information. For that matter, are you allowed to see that information? RavenDB 4.0 has support for time limited credentials, so you register to get credentials in the system, and using whatever workflow you have the system generate a temporary API key for you that will be valid for a short time and then expire.

What about all the other things that an admin needs to do? The most obvious example is how do you handle backups, either routine or emergency ones. It is pretty obvious that if the database is encrypted, we also want the backups to be encrypted, but are they going to use the same key? How do you restore? What about moving the database from one machine to the other?

In the end, it all hangs on the notion of keys. When you create a new encrypted database, we’ll generate a key for you, and require that you confirm for us that you have persisted that information in some manner. You can print it, download it, etc. And you can see the key right there in plain text during the db creation. However, this is the last time that the database key will ever reside in plain text.

So what about this QR code, what is it doing there? Put simply, it is there to capture attention. It replicates the same information that you have in the key itself, obviously. But what for?

The idea is that users are often hurrying through the process, (the “Yes, dear!” mode) and we want to encourage them to stop of a second and think. The use of the QR code make it also much more likely that the admin will print and save the key in an offline manner, which is likely to be safer than most methods.

So this is how we encourage administrators to safely remember the encryption key. This is useful because that give the admin the ability to take a snapshot on one machine, and then recover it on another, where the encryption key is not available, or just move the hard disk between machines if the old one failed. It is actually quite common in cloud scenarios to have a machine that has an attached cloud storage, and if the machine fails, you just spin up a new machine and attach the storage to the new one.

We keep the encryption keys secret by utilizing system specific keys (either DPAPI or decryption key that only the specific user can access), so moving machines like that will require the admin to provide the encryption key so we can continue working.

The issue of backups is different. It is very common to have to store long term backups, and needing to restore them in a separate location / situation. At that point, we need the backup to be encrypted, but we don’t want it it use the same encryption key as the database itself. This is mostly about managing keys. If I’m managing multiple databases, I don’t want to record the encryption key for each as part of the restore process. That is opening us to a missed key and a useless backup that we can do nothing about.

Instead, when you setup backups (for encrypted databases it is mandatory, for regular databases, optional) we’ll give you the option to provide a public key that we’ll then use to encrypted the backup. That means that you can more safely store it in cloud scenarios, and regardless of how many databases you have, as long as you have the private key, you’ll be able to restore the backup.

Finally, we have one last topic to cover with regards to encryption, the overall system configuration. Each database can be encrypted, sure, but the management of the database (such as connection strings that it replicates to, API keys that it uses to store backups and a lot of other potentially sensitive information) is still stored in plain text. For that matter, even if the database shouldn’t be encrypted, you might still want to encrypted the full system configuration. That lead to somewhat of a chicken and egg problem.

On the one hand, we can’t encrypt the server configuration from the get go, because then the admin will not know what the key is, and they might need that if they need to move machines, etc. But once we started, we are using the server configuration, so we can’t just encrypt that on the fly. What we ended up using is a set of command line parameters, so if the admins wants to run encrypted server configuration, they can stop the server, run a command to encrypt the server configuration and setup the appropriate encryption key retrieval process (DPAPI, for example, under the right user).

That gives us the chance to make the user aware of the key and allow to save it in a secure location. All members in a cluster with an encrypted server configuration must also have encrypted server configuration, which prevents accidental leaks.

I think that this is about it, with regards to the operations details of managing encryption, Smile. Pretty sure that I missed something, but this post is getting long as it is.

Beautiful errors

time to read 2 min | 322 words

The following code in a recent PR caused it to be rejected, can you figure out why?


The error clearly states that what the error is, but it fails to provide crucial details. Namely, which files have been corrupted. If I’m seeing an error like this in my logs, I need to be able to figure out what happened, and not hope & guess.

I’m picking up on this particular change because I found myself tallying in my head the number of comments I make on PRs from our team, and quite a large portion of that involve these kind of changes. What I’m looking for with error handling is not just to do it properly and handle all edge cases, but to also properly report it so the person who will end up reading this error message (very likely years from now) will have a clue about what they are supposed to do now.

Sometimes we go to great lengths to ensure that this is the case. We have an entirely different server mode dedicated to handling catastrophic errors, so when you try to get into RavenDB, you’ll get a meaningful error page that will at least try to give you an idea about how to fix this issue, for example. The sad part is that it is very easy to have a single error sour up a really good experience, because it doesn’t provide you with the right information to fix it.

We spend a lot of time just crafting errors properly. They go to the log, they are sometimes blocking the UI (if the server cannot start), we have dedicated alert system that handles error and alert distribution across the cluster so an admin can get into any node and see all the stuff that they need to know about across the entire cluster.

Beginning the RavenDB 4.0 book

time to read 12 min | 2208 words

You might have seen me talking about how close we are to a RavenDB beta release. Today marked a very important step along the route to an actual release. I’ve shifted my focus. Instead of going head down in the code and pushing things forward and doing all the sort of crazy stuff that you have seen me talking about for the past year and a half, I got started on the Inside RavenDB 4.0 book.

I say started because just the rough table of contents took me almost the entire day to complete. I’m expecting that this will take the majority of my time for the next few months, which means that you’ll get all the drib and drabs from the raw drafts as they are composed.  I’m also using this as a pretty nice way to go over the entire product and see how it all comes together as a cohesive whole, instead of looking at just a single piece every time.

Given that the period of putting bugs in the code is almost over, I feel that I can safely let the rest of the team fish out all the oopsies hat I managed to get in and focus on the product, rather than the code. This is the second time that I have made such a shift (and the third time I’m writing a book), and it still feels awkward. On the other hand, there is a great sense of accomplishment when you see how things just click together and all that hard work is finally real in a way that no code review or artificial scenario can replicate.

Here is what I have planned so far for the book. Your comments are welcome as always.

One of the major challenges in writing this book came in considering how to structure it. There are so many concepts that relate to one another that it can be difficult to try to understand them in isolation. We can't talk about modeling documents before we understand the kind of features that we have available for us to work with, for example. Considering this, I'm going to introduce concepts in stages.

Part I - The basics of RavenDB

Focus: Developers

This part contains a practical discussion on how to build an application using RavenDB, and we'll skip theory and concepts in favor of getting things done. This is what you'll want new hires to read before starting to work with an application using RavenDB, we'll keep the theory and the background information for the next part.

  • Chapter 2 - Zero to RavenDB - focuses on setting you up with a RavenDB instance, introduce the studio and some key concepts and walk you through the Hello World equivalent of using RavenDB by building a very simple To Do app.
  • Chapter 3 - CRUD operations - discusses RavenDB the basics of connecting from your application to RavenDB and the basics that you need to know to do CRUD properly. Entities, documents, attachments, collections and queries.
  • Chapter 4 - The Client API - explores more advanced client scenarios, such as lazy requests, patching, bulk inserts, and streaming queries and using persistent subscriptions. We'll talk a bit about data modeling and working with embedded and related documents.

Part II - Ravens everywhere

Focus: Architects

This part focuses on the theory of building robust and high performance systems using RavenDB. We'll go directly to working with a cluster of RavenDB nodes on commodity hardware, discuss distribution of data and work across the cluster and how to best structure your systems to take advantage of what RavenDB brings to the table.

  • Chapter 5 - Clustering Setup - walks through the steps to bring up a cluster of RavenDB nodes and working with a clustered database. This will also discuss the high availability and load balancing features in RavenDB.
  • Chapter 6 - Clustering Deep Dive - takes you through the RavenDB clustering behavior, how it works and how the both servers & clients are working together to give you a seamless distributed experience. We'll also discuss error handling and recovery in a clustered environment.
  • Chapter 7 - Integrating with the Outside World - explores using RavenDB along side additional systems, for integrating with legacy systems, working with dedicated reporting databases, ETL process, long running background tasks and in general how to make RavenDB fit better inside your environment.
  • Chapter 8 - Clusters Topologies - guides you through setting up several different clustering topologies and their pros and cons. This is intend to serve as a set of blueprints for architects to start from when they begin building a system.

Part III - Indexing

Focus: Developers, Architects

This part discuss how RavenDB index data to allow for quick retrieval of information, whatever it is a single document or aggregated data spanning years. We'll cover all the different indexing methods in RavenDB and how you can should use each of them in your systems to implement the features you want.

  • Chapter 9 - Simple Indexes - introduces indexes and their usage in RavenDB. Even though we have performed queries and worked with the data, we haven't actually dealt with indexes directly so far. Now is the time to lift the curtain and see how RavenDB is searching for information and what it means for your applications.
  • Chapter 11 - Full Text Search - takes a step beyond just querying the raw data and shows you how you can search your entities using powerful full text queries. We'll talk about the full text search options RavenDB provides, using analyzers to customize this for different usages and languages.
  • Chapter 13 - Complex indexes - goes beyond simple indexes and shows us how we can query over multiple collections at the same time. We will also see how we can piece together data at indexing time from related documents and have RavenDB keep the index consistent for us.
  • Chapter 13 - Map/Reduce - gets into data aggregation and how using Map/Reduce indexes allows you to get instant results over very large data sets with very little cost. Making reporting style queries cheap and accessible at any scale. Beyond simple aggregation, Map/Reduce in RavenDB also allows you to reshape the data coming from multiple source into a single whole, regardless of complexity.
  • Chapter 14 - Facet and Dynamic Aggregation - steps beyond static aggregation provided by Map/Reduce and give you the ability to run dynamic aggregation queries on your data, or just facet your search results to make it easier for the user to find what they are looking for.
  • Chapter 15 - Artificial Documents and Recursive Map/Reduce - guides you through using indexes to generate documents, instead of the other way around, and then use that both for normal operations and to support recursive Map/Reduce and even more complex reporting scenarios.
  • Chapter 16 - The Query Optimizier - takes you into the RavenDB query optimizer, index management and how RavenDB is treating indexes from the inside out. We'll see the kind of machinery that is running behind the scenes to get everything going so when you make a query, the results are there at once.
  • Chapter 17 - RavenDB Lucene Usage - goes into (quite gory) details about how RavenDB is making use of Lucene to implement its indexes. This is intended mostly for people who need to know what exactly is going on and how does everything fit together. This is how the sausage is made.
  • Chapter 18 - Advanced Indexing Techniques - dig into some specific usages of indexes that are a bit... outside the norm. Using spatial queries to find geographical information, generating dynamic suggestions on the fly, returning highlighted results for full text search queries. All the things that you would use once in a blue moon, but when you need them you really need them.

Part IV - Operations

Focus: Operations

This part deals with running and supporting a RavenDB cluster or clusters in production. From how you spina new cluster to decommissioning a downed node to tracking down performance problems. We'll learn all that you need (and then a bit more) to understand what is going on with RavenDB and how to customize its behavior to fit your own environment.

  • Chapter 19 - Deploying to Production - guides you through several production deployment options, including all the gory details about how to spin up a cluster and keep it healthy and happy. We'll discuss deploying to anything from a container swarm to bare metal, the networking requirements and configuration you need, security implications and anything else that the operation teams will need to comfortably support a RavenDB cluster in that hard land called production.
  • Chapter 20 - Security - focuses solely on security. How you can control who can access which database, running an encrypted database for highly sensitive information and securing a RavenDB instance that is exposed to the wild world.
  • Chapter 21 - High Availability - brings failure to the table, repeatedly. We'll discuss how RavenDB handles failures in production, how to understand, predict and support RavenDB in keeping all of your databases accessible and high performance in the face of various errors and disasters.
  • Chapter 22 - Recovering from Disasters - covers what happens after disaster strikes. When machines melt down and go poof, or someone issues the wrong command and the data just went into the incinerator. Yep, this is where we talk about backups and restore and all the various reasons why operations consider them sacred.
  • Chapter 23 - Monitoring - covers how to monitor and support a RavenDB cluster in production. We'll see how RavenDB externalize its own internal state and behavior for the admins to look at and how to make sense out of all of this information.
  • Chapter 24 - Tracking Performance - gets into why a particular query or a node isn't performing up to spec. We'll discuss how one would track down such an issue and find the root cause responsible for such a behavior, a few common reasons why such things happen and how to avoid or resolve them.

Part V - Implementation Details

Focus: RavenDB Core Team, RavenDB Support Engineers, Developers who wants to read the code

This part is the orientation guide that we throw at new hires when we sit them in front of the code. It is full of implementation details and background information that you probably don't need if all you want to know is how to build an awesome system on top of RavenDB.

On the other hand, if you want to go through the code and understand why RavenDB is doing something in a particular way, this part will likely answer all those questions.

  • Chapter 25 - The Blittable Format - gets into the details of how RavenDB represents JSON documents internally, how we go to this particular format and how to work with it.
  • Chapter 26 - The Voron Storage Engine - breaks down the low-level storage engine we use to put bits on the disk. We'll walk through how it works, the various features it offers and most importantly, why it had ended up in this way. A lot of the discussion is going to be driven by performance consideration, extremely low-level and full of operating system and hardware minutiae.
  • Chapter 27 - The Transaction Merger - builds on top of Voron and comprise one of the major ways in which RavenDB is able to provide high performance. We'll discuss how it came about, how it is actually used and what it means in terms of actual code using it.
  • Chapter 28 - The Rachis Consensus - talks about how RavenDB is using the Raft consuensus protocol to connect together different nodes in the cluster, how they are interacting with each other and the internal details of how it all comes together (and fall apart and recover again).
  • Chapter 31 - Cluster State Machine - brings the discussion one step higher by talking about how the RavenDB uses the result of the distributed consensus to actually manage all the nodes in the cluster and how we can arrive independently on each node to the same decision reliably.
  • Chapter 30 - Lording over Databases - peeks inside a single node and explores how a database is managed inside that node. More importantly, how we are dealing with multiple databases on the same node and what kind of impact each database can have on its neighbors.
  • Chapter 31 - Replication - dives into the details of how RavenDB manages multi master distributed database. We'll go over change vectors to ensure conflict detection (and aid in its resolution) how the data is actually being replicated between the different nodes in a database group.
  • Chapter 32 - Internal Architecture - gives you the overall view of the internal architecture of RavenDB. How it is built from the inside, and the reasoning why the pieces came together in the way they did. We'll cover both high-level architecture concepts and micro architecture of the common building blocks in the project.

Part VI - Parting

This part summarizes the entire book and provide some insight about what our future vision for RavenDB is.

  • Chapter 33 - What comes next - discusses what are our (rough) plans for the next major version and our basic roadmap for RavenDB.
  • Chapter 34 - Are we there yet? Yes! - summarize the book and let you go and start actually using all of this awesome information.

Implementing low level trieDigging into the C++ impl

time to read 4 min | 682 words

The low level trie challenge is storing a trie structure inside a 32KB buffer, and allowing read, write and delete operations on it. The idea is that you can have no additional data about the trie other than the buffer. The full implementation is available on github.

Because of that, you need to think very differently about how you approach the solution. Here is the header file with the main trie operations:

You’ll note that the only field that I have on the class is a 32KB buffer. All the data is stored inside it. In order to handle that, we are going to treat this buffer as our main memory and “allocate” our data inside it. I guess we could do this by using C++ allocators, but that seem to be very heavy weight and I don’t think it will work very well for this scenario. Instead, we define the following basic structures:

The trie header is located at the very beginning of the buffer, and contains the basic information about the trie. And then we have each node entry, which is about a specific entry. In particular, the offset fields in the node_header_info are actually pointers into the buffer itself. And the next_alloc field in the trie_header_info is used to “allocate” additional memory inside the buffer.

The idea is that whenever we need more memory, we just take it from the top of the buffer, until we run out of room. Here is how I add the very first entry:

I’m forcing it into a specific well known place in the buffer (line 4), and then I handle all rest of the memory assignments from there. The same holds true when we start getting into the nested structure. Each entry has a array of offsets that points to its children, which is pointed to by the children_offset value.

When we need to add a new child to a node, we aren’t modifying the old array, instead, we are allocating a completely new one, adding the new value and sorting the whole thing.

The sorting thing is actually fun, because it means that when I’m reading from the trie, I can reduce the amount of work that I have to do per level.

Here is how I read from the trie:

We first try to find a match in the trie, starting from the root node. The find_match method is then called to find if the current node is a match, and if not, whatever we can go further. It does so by comparing the key to the stored value, and if there is a partial match,recursing into the next level.

A fun part happens inside the find_child method, where we do a sorted search over the children array to find the node with the matching starting byte.

The whole idea is that because I’m using constrained memory, I can make do with very little actual work to manage the memory, I’m just using and discarding it all the time. But I am keeping track of when I can next allocate the memory, and how much of the buffer is in use. When I hit the memory limit, I’m going to defrag the trie. This is like so:

There is quite a lot going on in there, but the basic idea is that we are going to copy the buffer to a temporary buffer, reset our own buffer (this all happens in the first 4 lines), and then we are going to traverse the data in the temp buffer and insert it into the real buffer directly. One of the things that is most important here is that we properly calculate the size of the children array for each node.

I could have probably have done more here:

  • Ensure that the data is aligned
  • Merge nodes that have only a single child and no value of their own

And probably a lot of other stuff, but this has been a fun experiment.

Implementing low level trieSolving with C++

time to read 3 min | 440 words

The low level trie question has been a favorite question of mine for a while. It is simple in concept, but the limitations placed on it make it pretty hard to actually build. Previous posts in this series outlined the approach I had for solving this, but I always got caught up with something and didn’t get around to actually sitting down and resolving this completely.

As part of learning Rust, I decided to go ahead and implement this low level trie using Rust. I have failed, it was just too much babysitting by the compiler and having to fight it. I knew exactly what I wanted to do, but I kept having to jump through hops to get it to it. Eventually, I just called it quits  and decided to abandon the attempt to use Rust.

But I still want to do something out of my comfort zone, so I decided to run this exercise using C++. Now, I used to write quite a lot of C++ (along with VB, VBScript and ASP classic). But that was in the late 90s, and very early 2000s. I heard through the grapevine that someone kicked the C++ standard committee into high gear and started actually improve the language.

The result was three evenings spent on building a low level trie impl in C++, and quite a lot of fun. I’ll have another post about the actual details of the implementation, but in this post I mostly wanted to talk about the experience of getting back to C++. And it is… strange.

On the one hand, because I’m so used to C# and have used C++ before, this is oh so comfortable. Like wearing old set of gloves that you forgot that you even had.

On the other hand, I forgotten quite a lot of details about the language and the libraries, and they changed. My old C++ code would be newing up stuff and fighting to manage memory and very likely leaking like crazy. In this codebase? I don’t have a single new call throughout. And being able to do things like lambdas in C++ feels like magic.

I’ll admit that the codebase is heavily influenced by my Rust work. To start with, I’m using snake_case convention, and I found that I’m using a lot more std::pair that I would expect myself to use.

I would appreciate any code review on this, the core code is about 400 lines or so, and I’m mostly interested to know whatever I managed to write idiomatic modern C++, and if not, how this can be improved.

Implementing low level triePart II

time to read 1 min | 172 words

I was asked recently why I’m “burning” my interview questions by posting them on the blog. That actually has several reasons:

  1. If a candidate reads my blog and is able to produce high quality code based on this, well… that is pretty much the job description right there.
  2. We just finish another recruitment round, and we aren’t planning another one for at least 4 – 6 months.
  3. The fact that I’m posting the answers to a specific question doesn’t mean that the the subject matter is closed.

For example, let us take this question & answer. Note that this is approachable because  there is just standard .NET code.

However, the code as posted contains a bug, it is a small one, and all unit tests are passing, but it result is slightly inefficient behavior. Exposing that to a unit test is relatively easy, but going from the failing test back to the root cause and then fixing it would be an interesting investigative technique and good show of skills.

Implementing low level triePart I

time to read 5 min | 826 words

A few months ago I asked this question, and since then I’ve been asking it from some candidates. This is my 2nd tier question, and answering this correctly is going to give you quite a few brownie points.

The idea is to implement this interface:

But the only field that you can put in the implementation is a 32 KB byte array. See the original post for the full details.

Let us get rid of the easy stuff first, since the only value I can keep is a 32 KB byte array, saving and loading the values is trivial:

We can just save and load it directly, nothing much to do except validate that the size match on load.

Now, let us see what we need to do, here is an example of a trie:

Typically, you would implement that using nested dictionaries, but that isn’t going to work given this exercise limitations.

So let us consider a more primitive alternative. We need to store the following information about each node in the trie. We start by defining the following node:

This wouldn’t work for the limits we have, but we are going to be building this in pieces. So this is an important step to understand how we go about it.

We are going to store the data in the following format:


So on each level, we are able to jump to the next stage and check its values. Searching for the right value in the children array is going to be simply a matter of doing  binary search on the children array.

The full code listing is linked from the bottom of the post, if you want to see it in context.

And here it is when it is stored as an object tree. Note that each child store the full key, not just the part it owns, because it make it easier to visualize.


Let us look at how we are building this kind of trie. We’ll start with two pretty simple helper methods.

There isn’t much to really see here, FindDifference will find the first difference between two strings from a given offset and FindMatch does a standard binary search on the array. The only special thing here is the fact that we are returning the match position even if we failed, we needed that to be able to know where to put the next entry. This will be clear when we’ll look at the Insert method.

This is quite a bit of code, but it can neatly divide into three separate options. In the first case, we reach an empty section in the tree, so we can just create a new leaf and call it a day. This is why we use a ref variable here, because it allows us to mutate the given parameter, which can be either the root, or the array on a nested node.

If there are already values at this level, we check if we need to go into the next level (creating the new level if necessary).

If there isn’t a value at this level that start with the current prefix, we just add it to this level as a value.

Pretty simple, once you break it down. The fun part about this is that as written, this is actually safe for multi threaded use, so a single write can make modifications at the same time that multiple readers can read, without any need for synchronization.

Reading from the trie given this structure is pretty simple:

We simply use the same logic as before, but because we have to make no changes, this is much simple to work with.

The last bit that we still need to handle is the deletes. Here is how this is handled.

That is basically the inverse of Insert, but it need to handle patching up the trie on the way back up again.

And this is pretty much it, right?

Except… that this is not what I started with. Where is the byte array? We are allocating memory here like crazy and all we did was implement a pretty standard trie without the use of dictionaries.

What is the point?

Well, now that we have this implementation, we have a good understanding on what is actually required of us, and we can take this code and move it to the array, but that would be in the next post.

In the meantime, you can read the entire code in one go here.


No future posts left, oh my!


  1. Optimizing JavaScript and solving the halting problem (2):
    18 Aug 2017 - Part II
  2. RavenDB 4.0 (12):
    14 Aug 2017 - Maintaining transaction boundary integrity in a distributed cluster
  3. Public Service Announcement (2):
    11 Aug 2017 - ConcurrentDictionary.Count is locking
  4. PR Review (4):
    10 Aug 2017 - Errors, errors and more errors
  5. Production postmortem (19):
    07 Aug 2017 - 30% boost with a single line change
View all series


Main feed Feed Stats
Comments feed   Comments Feed Stats