Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by email or phone:


+972 52-548-6969

, @ Q c

Posts: 6,465 | Comments: 47,699

filter by tags archive

Implementing low level trieDigging into the C++ impl

time to read 4 min | 682 words

The low level trie challenge is storing a trie structure inside a 32KB buffer, and allowing read, write and delete operations on it. The idea is that you can have no additional data about the trie other than the buffer. The full implementation is available on github.

Because of that, you need to think very differently about how you approach the solution. Here is the header file with the main trie operations:

You’ll note that the only field that I have on the class is a 32KB buffer. All the data is stored inside it. In order to handle that, we are going to treat this buffer as our main memory and “allocate” our data inside it. I guess we could do this by using C++ allocators, but that seem to be very heavy weight and I don’t think it will work very well for this scenario. Instead, we define the following basic structures:

The trie header is located at the very beginning of the buffer, and contains the basic information about the trie. And then we have each node entry, which is about a specific entry. In particular, the offset fields in the node_header_info are actually pointers into the buffer itself. And the next_alloc field in the trie_header_info is used to “allocate” additional memory inside the buffer.

The idea is that whenever we need more memory, we just take it from the top of the buffer, until we run out of room. Here is how I add the very first entry:

I’m forcing it into a specific well known place in the buffer (line 4), and then I handle all rest of the memory assignments from there. The same holds true when we start getting into the nested structure. Each entry has a array of offsets that points to its children, which is pointed to by the children_offset value.

When we need to add a new child to a node, we aren’t modifying the old array, instead, we are allocating a completely new one, adding the new value and sorting the whole thing.

The sorting thing is actually fun, because it means that when I’m reading from the trie, I can reduce the amount of work that I have to do per level.

Here is how I read from the trie:

We first try to find a match in the trie, starting from the root node. The find_match method is then called to find if the current node is a match, and if not, whatever we can go further. It does so by comparing the key to the stored value, and if there is a partial match,recursing into the next level.

A fun part happens inside the find_child method, where we do a sorted search over the children array to find the node with the matching starting byte.

The whole idea is that because I’m using constrained memory, I can make do with very little actual work to manage the memory, I’m just using and discarding it all the time. But I am keeping track of when I can next allocate the memory, and how much of the buffer is in use. When I hit the memory limit, I’m going to defrag the trie. This is like so:

There is quite a lot going on in there, but the basic idea is that we are going to copy the buffer to a temporary buffer, reset our own buffer (this all happens in the first 4 lines), and then we are going to traverse the data in the temp buffer and insert it into the real buffer directly. One of the things that is most important here is that we properly calculate the size of the children array for each node.

I could have probably have done more here:

  • Ensure that the data is aligned
  • Merge nodes that have only a single child and no value of their own

And probably a lot of other stuff, but this has been a fun experiment.

Implementing low level trieSolving with C++

time to read 3 min | 440 words

The low level trie question has been a favorite question of mine for a while. It is simple in concept, but the limitations placed on it make it pretty hard to actually build. Previous posts in this series outlined the approach I had for solving this, but I always got caught up with something and didn’t get around to actually sitting down and resolving this completely.

As part of learning Rust, I decided to go ahead and implement this low level trie using Rust. I have failed, it was just too much babysitting by the compiler and having to fight it. I knew exactly what I wanted to do, but I kept having to jump through hops to get it to it. Eventually, I just called it quits  and decided to abandon the attempt to use Rust.

But I still want to do something out of my comfort zone, so I decided to run this exercise using C++. Now, I used to write quite a lot of C++ (along with VB, VBScript and ASP classic). But that was in the late 90s, and very early 2000s. I heard through the grapevine that someone kicked the C++ standard committee into high gear and started actually improve the language.

The result was three evenings spent on building a low level trie impl in C++, and quite a lot of fun. I’ll have another post about the actual details of the implementation, but in this post I mostly wanted to talk about the experience of getting back to C++. And it is… strange.

On the one hand, because I’m so used to C# and have used C++ before, this is oh so comfortable. Like wearing old set of gloves that you forgot that you even had.

On the other hand, I forgotten quite a lot of details about the language and the libraries, and they changed. My old C++ code would be newing up stuff and fighting to manage memory and very likely leaking like crazy. In this codebase? I don’t have a single new call throughout. And being able to do things like lambdas in C++ feels like magic.

I’ll admit that the codebase is heavily influenced by my Rust work. To start with, I’m using snake_case convention, and I found that I’m using a lot more std::pair that I would expect myself to use.

I would appreciate any code review on this, the core code is about 400 lines or so, and I’m mostly interested to know whatever I managed to write idiomatic modern C++, and if not, how this can be improved.

Implementing low level triePart II

time to read 1 min | 172 words

I was asked recently why I’m “burning” my interview questions by posting them on the blog. That actually has several reasons:

  1. If a candidate reads my blog and is able to produce high quality code based on this, well… that is pretty much the job description right there.
  2. We just finish another recruitment round, and we aren’t planning another one for at least 4 – 6 months.
  3. The fact that I’m posting the answers to a specific question doesn’t mean that the the subject matter is closed.

For example, let us take this question & answer. Note that this is approachable because  there is just standard .NET code.

However, the code as posted contains a bug, it is a small one, and all unit tests are passing, but it result is slightly inefficient behavior. Exposing that to a unit test is relatively easy, but going from the failing test back to the root cause and then fixing it would be an interesting investigative technique and good show of skills.

Implementing low level triePart I

time to read 5 min | 826 words

A few months ago I asked this question, and since then I’ve been asking it from some candidates. This is my 2nd tier question, and answering this correctly is going to give you quite a few brownie points.

The idea is to implement this interface:

But the only field that you can put in the implementation is a 32 KB byte array. See the original post for the full details.

Let us get rid of the easy stuff first, since the only value I can keep is a 32 KB byte array, saving and loading the values is trivial:

We can just save and load it directly, nothing much to do except validate that the size match on load.

Now, let us see what we need to do, here is an example of a trie:

Typically, you would implement that using nested dictionaries, but that isn’t going to work given this exercise limitations.

So let us consider a more primitive alternative. We need to store the following information about each node in the trie. We start by defining the following node:

This wouldn’t work for the limits we have, but we are going to be building this in pieces. So this is an important step to understand how we go about it.

We are going to store the data in the following format:


So on each level, we are able to jump to the next stage and check its values. Searching for the right value in the children array is going to be simply a matter of doing  binary search on the children array.

The full code listing is linked from the bottom of the post, if you want to see it in context.

And here it is when it is stored as an object tree. Note that each child store the full key, not just the part it owns, because it make it easier to visualize.


Let us look at how we are building this kind of trie. We’ll start with two pretty simple helper methods.

There isn’t much to really see here, FindDifference will find the first difference between two strings from a given offset and FindMatch does a standard binary search on the array. The only special thing here is the fact that we are returning the match position even if we failed, we needed that to be able to know where to put the next entry. This will be clear when we’ll look at the Insert method.

This is quite a bit of code, but it can neatly divide into three separate options. In the first case, we reach an empty section in the tree, so we can just create a new leaf and call it a day. This is why we use a ref variable here, because it allows us to mutate the given parameter, which can be either the root, or the array on a nested node.

If there are already values at this level, we check if we need to go into the next level (creating the new level if necessary).

If there isn’t a value at this level that start with the current prefix, we just add it to this level as a value.

Pretty simple, once you break it down. The fun part about this is that as written, this is actually safe for multi threaded use, so a single write can make modifications at the same time that multiple readers can read, without any need for synchronization.

Reading from the trie given this structure is pretty simple:

We simply use the same logic as before, but because we have to make no changes, this is much simple to work with.

The last bit that we still need to handle is the deletes. Here is how this is handled.

That is basically the inverse of Insert, but it need to handle patching up the trie on the way back up again.

And this is pretty much it, right?

Except… that this is not what I started with. Where is the byte array? We are allocating memory here like crazy and all we did was implement a pretty standard trie without the use of dictionaries.

What is the point?

Well, now that we have this implementation, we have a good understanding on what is actually required of us, and we can take this code and move it to the array, but that would be in the next post.

In the meantime, you can read the entire code in one go here.


  1. RavenDB 4.0 nightly builds are now available - about one hour from now
  2. Writing SSL Proxy: Part I, routing - about one day from now
  3. Writing SSL Proxy: Part II, delegating authentication - 2 days from now
  4. 0.1x or 10x, time matters - 3 days from now
  5. RavenDB 4.0 Unsung Heroes: Map/reduce - 4 days from now

And 2 more posts are pending...

There are posts all the way to Oct 03, 2017


  1. RavenDB 4.0 Unsung Heroes (4):
    22 Sep 2017 - Field compression
  2. RavenDB 4.0 (13):
    11 Sep 2017 - Support options
  3. Optimizing select projections (5):
    01 Sep 2017 - Part IV–Understand, don’t do
  4. Optimizing JavaScript and solving the halting problem (2):
    18 Aug 2017 - Part II
View all series


Main feed Feed Stats
Comments feed   Comments Feed Stats