Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:


+972 52-548-6969



Protocol design implications: REST vs. TCP

time to read 3 min | 444 words

I was going over design documents today, and I noticed some common themes in the changes that we have between RavenDB 3.5 and RavenDB 4.0.

With RavenDB 3.5 (and all previous versions), we always had the communication layer as HTTP REST calls between nodes. When I designed RavenDB, REST was the thing to do, and it is reflected in the design of RavenDB itself. However, 8 years later, we sat down and considered whether this is really appropriate for everything. The answer was a resounding no. In fact, while over 95% of RavenDB is still pure REST calls, we have moved certain key functions to using TCP directly.

Note that this goes in direct contrast to this post of mine from 2012: Why TCP is evil and HTTP is king.

The concerns in this post are still valid, but we have found that there are a few major reasons why we want to switch to TCP for certain stuff. In particular, the basic approach is that a client will communicate with the server using HTTP calls, but servers communicate with one another using TCP. The great thing about TCP is that it is a stream oriented protocol, so I don’t need to carry state with me on every call.

With HTTP, each call is stateless, and I can’t assume anything about the other side. That means that I need to send the state, manage the state on the other side, and have to deal with potential issues such as concurrency in the same conversation, restarts of one side that the other side can’t easily detect, repeated validation on each call, etc.

With TCP, on the other hand, I can make a lot of assumptions about the conversation. I have state that I can carry between calls to the other side, and as long as the TCP connection is open, I can assume that it is valid. For example, if I need to know what the last item I sent to the remote end was, I can query that at the beginning of the TCP connection, as part of the handshake, and then I can just assume that what I sent to the other side has arrived (since otherwise I’ll eventually get an error, requiring me to create a new TCP connection and do another handshake). On the other side, I can verify the integrity of a connection once, without having to repeatedly verify our mutual state on each and every message being passed.

This has drastically simplified a lot of code on both the sending and receiving ends, and reduced the number of network roundtrips by a significant amount.
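To make the handshake idea concrete, here is a minimal sketch of the sending side, written in Rust with only the standard library. The wire format (a `LAST <id>` line followed by `ITEM <id> <body>` lines) and the name `resume_stream` are assumptions made up for this illustration; they are not RavenDB's actual protocol. The point is only that the state is exchanged once per connection and everything after that is a plain stream.

```rust
use std::io::{self, BufRead, Write};

/// Sender side of a hypothetical replication stream.
/// `from_peer` and `to_peer` are the two halves of one long-lived connection.
/// Returns how many items were sent.
fn resume_stream<R: BufRead, W: Write>(
    mut from_peer: R,
    mut to_peer: W,
    log: &[(u64, &str)],
) -> io::Result<u64> {
    // Handshake, done once per connection: the peer reports the last item it has.
    let mut line = String::new();
    from_peer.read_line(&mut line)?;
    let last_seen: u64 = line
        .trim()
        .strip_prefix("LAST ")
        .and_then(|id| id.parse().ok())
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "bad handshake"))?;

    // From here on we only stream. There is no per-message state or validation;
    // a write error means the connection is gone and we start over with a new handshake.
    let mut sent = 0;
    for (id, body) in log.iter().filter(|(id, _)| *id > last_seen) {
        writeln!(to_peer, "ITEM {} {}", id, body)?;
        sent += 1;
    }
    to_peer.flush()?;
    Ok(sent)
}
```

With a real socket, the two halves would be a buffered reader and a writer over the same TCP stream; the generic parameters are there so the logic can be exercised without a network.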

Getting the design ready for production troubleshooting

time to read 2 min | 339 words

The following is an excerpt from a design document for a major feature in RavenDB 4.0 that I’m currently reviewing, written by Tal.

One of the major problems when debugging such issues in production is the fact that most of the interesting information resides in memory and goes away when the server restarts, the sad thing is that the first thing an admin will do when having issues with the server is to recycle it, giving us very little to work with. Yes, we have logs, but debug level logs are very expensive and usually are not enabled in production (nor should they), we already have the ability to turn logs on, on a production system which is a great option but not enough. The root cause of a raft problem usually resides in the past so unless we have logs from the beginning of time there is not much use for them. The suggested solution is a persistent log for important events that indicate that things went south.

This is based on our experience (and frustration) from diagnosing production issues. By the time the admin sees that something is wrong, the problem has already occurred, and in the process of handling the problem, the admin will typically focus on fixing it, rather than figuring out what exactly is going on.

These kinds of features, focusing explicitly on giving us enough information to find the root cause of the issue, have been an ongoing effort for us. Yesterday they enabled us to get a debug package from a customer (a zip file that the server can generate with a lot of important information), go through it and figure out exactly what the problem was (the customer was running in 32-bit mode and running into virtual memory exhaustion) in one support roundtrip, rather than having to go back and forth multiple times to try to get a bunch of different data points to figure out the issue.

Also, go and read Release It; it has a huge impact on actual system design.
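The persistent log suggested in the excerpt can be very small. The following is a rough sketch in Rust, assuming a plain append-only text file with one event per line; the function names and the file format are hypothetical and are not taken from the design document.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Appends one important event to a file that survives server restarts.
fn record_event(path: &Path, event: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{}", event)?;
    // Ask the OS to push the data to disk, so a crash right after this call
    // does not take the event with it.
    file.sync_data()
}

/// Reads back everything recorded so far, oldest first.
fn read_events(path: &Path) -> io::Result<Vec<String>> {
    Ok(fs::read_to_string(path)?.lines().map(String::from).collect())
}
```

Because only events that indicate trouble are written, the cost stays low enough to leave this on in production, which is the whole point of the suggestion.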

Rust based load balancing proxy server with async I/O

time to read 5 min | 981 words

In my previous Rust post, I built a simple echo server that spun a whole new thread for each connection. In this one, I want to do this in an async manner. Rust doesn’t have the notion of async/await, or something similar to Go’s green threads (it seems that it used to, and it was removed as a costly abstraction for a low level systems language).

I’m going to use Tokio.rs to do that, but sadly enough, the example on the front page is about doing an async echo server. That kinda killed the mood for me there, since I wanted to deal with actually implementing it from scratch. Because of that, I decided to do something different and build an async Rust based TCP level proxy server.

Expected usage:

cargo run live-test.ravendb.net:80 localhost:8080

Which should print the port that this proxy runs on and then route the connection to one of those endpoints.

This led to something pretty strange, check out the following code:


Can you figure out what the type of addr is? It is inferred, but from what? The addr definition line does not have enough detail to figure it out. Therefore, the compiler actually goes down and sees that we are passing it to the bind() method, which takes a std::net::SocketAddr value. So it figures out that the value must be a std::net::SocketAddr.
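The original snippet is not reproduced here, so the following is only a minimal sketch of the same effect using the standard library. The `bind` function is a hypothetical stand-in for an API that takes a concrete `SocketAddr`; it is not the Tokio call from the post.

```rust
use std::net::SocketAddr;

// Hypothetical stand-in for a bind() that accepts a concrete SocketAddr.
fn bind(addr: &SocketAddr) -> u16 {
    addr.port()
}

fn parse_and_bind(text: &str) -> u16 {
    // No type annotation: parse() is generic over the type it produces,
    // so this line alone does not say what `addr` is.
    let addr = text.parse().unwrap();
    // This call is what pins the type of `addr` to SocketAddr.
    bind(&addr)
}
```

Writing `let addr: SocketAddr = text.parse().unwrap();` makes the type visible at the definition and turns a later mismatched use into an error at that use.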

This seems to be utterly backward and fragile to me.  For example, I added this:


And the compiler was very upset with me:


I’m not used to the variable type being impacted by its usage. It seems very odd and awkward. It also seems to be pretty hard to actually figure out what the type of a variable is from just looking at the code. And there isn’t an easy way to get it short of causing an intentional compiler error that would reveal those details.

The final code looks like this:

At the same time, there is a lot going on here and this is very simple.

Lines 1 – 15 are really not interesting. Lines 17 – 29 are about parsing the user’s input, but the fun stuff begins from line 30 and onward.

I use fun cautiously; it wasn’t very fun to work with, to be honest. On lines 30 & 31 I set up the event loop handlers, and then bind them to a TCP listener.

On lines 40 – 62 I’m building the server (more on that later) and on line 64 I’m actually running the event loop.

The crazy stuff is all in the server handling. The incoming().for_each() call will call the method for each connected client, passing the stream and the remote IP. I then split the TCP stream into a read half and a write half, and select a node to load balance to.

Following that, I’m doing an async connect to that node, and if it is successful I’m splitting the server stream as well and then cross-connecting the halves using the copy methods, basically attaching the input of each side to the output of the other. Finally, I’m joining the two together, so we’ll have a future that will only be done when both sending and receiving are done, and then I’m sending it back to the event loop.

Note that when I’m accepting a new TCP connection, I’m not actually pausing to connect to the remote server. Instead, I’m going to set up the call and then pass the next stage to the event loop (the spawn method).
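For readers who want to see the split, copy and join shape without the futures machinery, here is a blocking analogue that uses a scoped thread from the standard library instead of Tokio. It is a sketch of the same data flow, not the code from the post, and the name `relay` is made up for the example.

```rust
use std::io::{self, Read, Write};
use std::thread;

/// Copies client -> upstream on a helper thread and upstream -> client on the
/// calling thread, then waits for both. Returns (bytes sent, bytes received).
fn relay<CR, CW, UR, UW>(
    mut client_read: CR,
    mut client_write: CW,
    mut upstream_read: UR,
    mut upstream_write: UW,
) -> io::Result<(u64, u64)>
where
    CR: Read + Send,
    UW: Write + Send,
    UR: Read,
    CW: Write,
{
    thread::scope(|scope| -> io::Result<(u64, u64)> {
        // One direction runs on its own thread...
        let upload = scope.spawn(move || io::copy(&mut client_read, &mut upstream_write));
        // ...while the other runs here.
        let received = io::copy(&mut upstream_read, &mut client_write)?;
        // The "join": we are only done when both directions are done.
        let sent = upload.join().expect("upload thread panicked")?;
        Ok((sent, received))
    })
}
```

With real sockets the four halves would come from cloning the two TCP streams, and each direction would also need to shut down the write side of its target once its copy finishes, so the peer sees end of stream. The async version avoids dedicating a thread to every connection, which is the reason for going through the event loop in the first place.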

This was crazy hard to do and generated a lot of compilation errors along the way. Why? See line 57, where we erase the types?

The type of send_data without this line is something like Future<Result<(u64,u64), Error>>. But the map & map_err turn it into just a Future. If you don’t do that? Well, the compiler errors are generally very good, but it seems that inference can take you into la-la land, see this compiler error. That reminds me of trying to make sense of C++ template errors in 1999.


Now, here is the definition of the spawn method:


And I didn’t understand this syntax at all. Future is a trait, and it has associated types, but I’m thinking about generics as only the stuff inside the <>, so that was pretty confusing.

Basically, the problem was that I was passing a future that returns values, while the spawn method expected one that returns nothing (neither a value nor an error).
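The same mismatch can be shown with a plain `Result`, which has `map` and `map_err` methods of the same shape. In this sketch `run_detached` and `send_data` are hypothetical stand-ins (for spawn and for the joined copy stage respectively); only the type erasure step mirrors what the post describes.

```rust
use std::io;

// Hypothetical stand-in for an API that, like spawn, only accepts work
// that produces no value and no error.
fn run_detached<F>(work: F) -> bool
where
    F: FnOnce() -> Result<(), ()>,
{
    work().is_ok()
}

// Hypothetical stand-in for the joined copy stage: bytes moved in each direction.
fn send_data(fail: bool) -> io::Result<(u64, u64)> {
    if fail {
        Err(io::Error::new(io::ErrorKind::BrokenPipe, "peer went away"))
    } else {
        Ok((5, 6))
    }
}

fn proxy_once(fail: bool) -> bool {
    run_detached(move || {
        send_data(fail)
            .map(|_counts| ()) // erase the success value
            .map_err(|_err| ()) // erase the error value
    })
}
```

Dropping either of the two calls leaves a `Result<(u64, u64), _>` or a `Result<_, io::Error>`, and `run_detached` no longer accepts the closure.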

I also tried to change the and_then to just then, but at that point I got:


At which point I just gave up.

However, just looking at the code on its own, it is quite nicely done, and it expresses exactly what I want it to. My problem is that every single change that I make there has repercussions down the line, which is hard for me to predict.

Database security and defaults

time to read 3 min | 454 words

The nightmare scenario for a database vendor is something like this: Over 27,000 databases managed by MongoDB held to ransom; 99,000 still vulnerable.

To be fair, this isn’t quite the nightmare scenario. The nightmare scenario would be if this were due to some vulnerability in the database, but in this case, this isn’t that at all. It is simply that admins have set up a publicly visible database with no permissions on the internet, and said “okay, we are done, what is the next ticket?”.

Now, I presume that it didn’t really go like that, but the problem is that while you are fine if you follow the proper instructions, by default all your data is exposed over the network. I’m assuming that a few of those were set up by a proper dev ops team, and mostly they were done by “Joe, we are going to prod, here are the server credentials, make sure that the db is running there”. Or, also likely, “We are done with dev, we can just use the same servers for prod”, with no one going in and setting them up properly.

You should note that this isn’t really about MongoDB specifically (although this is the one that has the most noise at the moment). This makes for pretty sad reading: you literally need to do nothing to “hack” into production systems and access over 600 TB of data (just for MongoDB).

The scary thing is that you have questions like this: bind_ip = does not work but works.

So the user will actively try to fight any measure you have to protect them.

With RavenDB, we have actually made it a startup error (the server will abort) if you are running a production instance (identified with a license) but you don’t require authentication. Now, there are scenarios where this is valid, such as running on a secured network, but they are pretty rare, so you have a configuration option that you can set that will enable this scenario, but that requires an explicit step and hopefully gets the user thinking. With RavenDB 4.0, we’ll require authentication (or an explicit configuration override) whenever a user asks us to bind to an interface other than localhost.
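The RavenDB 4.0 rule described above fits in a few lines. This is a sketch of the policy only, written in Rust; the struct, the field names and the error text are hypothetical and do not correspond to RavenDB's real configuration keys.

```rust
use std::net::IpAddr;

// Hypothetical configuration shape, for illustration only.
struct ServerConfig {
    bind: IpAddr,
    require_authentication: bool,
    // The explicit "I know what I am doing" override.
    allow_unsecured_access: bool,
}

/// Refuses to start when the server is reachable from the network without
/// authentication, unless the operator explicitly opted in.
fn check_startup(cfg: &ServerConfig) -> Result<(), String> {
    if cfg.bind.is_loopback() || cfg.require_authentication || cfg.allow_unsecured_access {
        return Ok(());
    }
    Err(format!(
        "refusing to start: bound to {} without authentication and without an explicit override",
        cfg.bind
    ))
}
```

The safe configuration needs no extra work, and the unsafe one needs a deliberate step that someone has to type and own.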

I think that is one case where you have to reverse “let’s make it easy to use us” and also consider putting hurdles to actually get it running. Because in the long run, getting this wrong means that it is very easy to shoot yourself in the foot.

Building a low level trie with Rust: Part I

time to read 3 min | 481 words

Before getting to grips with a distributed gossip system in Rust, I decided that it would be better to look at something a bit more challenging, but smaller in scope. I decided to implement the low level trie challenge in Rust.

This is interesting, because it is a complex enough problem to require thinking even for experienced developers, but at the same time, it isn’t complicated, it just has a lot of details. It also requires us to do a lot of low level stuff and manipulate memory directly, so it would be an interesting test for a system level programming language.

On the one hand, even with just a few hours with Rust, I can see some elegance coming out of certain pieces.  For example, take a look at the following code:

This is responsible for searching the trie for a value, and I like that the find_match function traverses the tree and allows me to return both an enum value and the closest match when it fails (so I can continue the process directly from there).
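The code from the post is not reproduced here, so here is a small sketch of that shape. It uses an ordinary `Vec` of nodes rather than the raw memory layout the challenge calls for, and the `MatchResult` variants and field names are assumptions made for the example. What it keeps is the idea that `find_match` returns both an outcome and the closest node, so `insert` can continue from where the search stopped.

```rust
#[derive(Debug, PartialEq)]
enum MatchResult {
    Exact,    // the whole key was consumed and the node holds a value
    Partial,  // the whole key was consumed, but it is only a prefix of other keys
    NotFound, // the trie ran out before the key did
}

struct Node {
    children: Vec<(u8, usize)>, // (byte, index of the child in `nodes`)
    value: Option<u32>,
}

struct Trie {
    nodes: Vec<Node>,
}

impl Trie {
    fn new() -> Self {
        Trie { nodes: vec![Node { children: Vec::new(), value: None }] }
    }

    /// Returns the outcome, the deepest node reached, and how many key bytes matched.
    fn find_match(&self, key: &[u8]) -> (MatchResult, usize, usize) {
        let mut node = 0;
        for (depth, byte) in key.iter().enumerate() {
            match self.nodes[node].children.iter().find(|(b, _)| b == byte) {
                Some(&(_, child)) => node = child,
                None => return (MatchResult::NotFound, node, depth),
            }
        }
        let result = if self.nodes[node].value.is_some() {
            MatchResult::Exact
        } else {
            MatchResult::Partial
        };
        (result, node, key.len())
    }

    fn insert(&mut self, key: &[u8], value: u32) {
        // Continue from the closest match instead of walking from the root again.
        let (_, mut node, matched) = self.find_match(key);
        for &byte in &key[matched..] {
            let child = self.nodes.len();
            self.nodes.push(Node { children: Vec::new(), value: None });
            self.nodes[node].children.push((byte, child));
            node = child;
        }
        self.nodes[node].value = Some(value);
    }

    fn get(&self, key: &[u8]) -> Option<u32> {
        match self.find_match(key) {
            (MatchResult::Exact, node, _) => self.nodes[node].value,
            _ => None,
        }
    }
}
```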

On the other hand, we have pieces of code like this:


And any line that has four casts in it is already suspect. And as I’m dealing with the raw memory, I have quite a bit of this.

And I am certainly feeling the pain of the borrow checker. Here is where I’m currently stumped.

This is a small and simple example that shows the issue. It fails to compile:


I have a method that takes a mutable MyTrie reference, and passes it to a method that expects an immutable reference. This is fine, and would work. But I need to use the value from the find method in the delete_internal method, which again needs a mutable instance. But this fails with:

error[E0502]: cannot borrow `*self` as mutable because it is also borrowed as immutable

I understand the problem, but I am not really sure how to solve it. The problem is that I kinda want the find method to remain immutable, since it is also used in the read method, which can run on immutable instances. Technically speaking, I could copy the values that I want out of the node reference and use a lexical scope that would force the immutable borrow to end, but I’m unsure yet what would be the best option.
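The copy-the-values-out option looks roughly like this. The failing example is not reproduced here, so the types below are a hypothetical reconstruction: `MyTrie`, `find` and `delete_internal` are the names used in the text, while the fields are invented for the sketch.

```rust
struct Node {
    key: String,
    offset: usize,
}

struct MyTrie {
    nodes: Vec<Node>,
}

impl MyTrie {
    // Stays immutable, so read-only callers can use it too.
    fn find(&self, key: &str) -> Option<&Node> {
        self.nodes.iter().find(|node| node.key == key)
    }

    fn delete_internal(&mut self, offset: usize) {
        self.nodes.retain(|node| node.offset != offset);
    }

    fn delete(&mut self, key: &str) -> bool {
        // Copy the plain value out of the node. The shared borrow of `self`
        // taken by find() ends with this statement, because nothing that
        // points into `self` survives it.
        let offset = match self.find(key) {
            Some(node) => node.offset,
            None => return false,
        };
        // Now a mutable borrow is fine.
        self.delete_internal(offset);
        true
    }
}
```

Holding on to the `&Node` itself across the call to `delete_internal` is what triggers E0502; keeping only copied values (offsets, indexes, lengths) sidesteps it without making `find` mutable.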

It seems like a lot of work to get what I want in spite, and not with the help of, the compiler.

The “average” developer field of interest

time to read 3 min | 445 words

I recently got a comment that included this:

…this "Making code faster" series is pretty useless for the average developer working on the usual application.

And I couldn’t disagree more.

Now, to be fair, the kind of challenges that we have to deal with while building a high performance database engine are quite different from the kind of challenges that a typical enterprise developer has to deal with. That isn’t quite true, we have the studio, which behaves very much like an application, but you’ll rarely see me talking about the JavaScript aspects of building the RavenDB Studio. I’ll just say that from my perspective, this post summarizes my feelings about modern JavaScript dev.

But back to the topic, the average developer is a mythical beast, who apparently has very little time to look around from coding yet another login page that has to be delivered now. I have had several discussions about this in the past. And I think that this post summarizes the opposing view, pretty much saying that it is offensive to expect someone to have the time to improve themselves.

My thinking is that if you value your career, you need to continuously put in the effort to actually improve and extend yourself, period. And that isn't to say that this is easy.

Here is the deal, if you are only interested in what can bring you immediate value (the hottest JS libraries, or some design pattern that you need to use tomorrow), you are doing yourself a disservice. In order to be good, you need to continuously invest in learning new stuff. And you need to do it in such a way that you aren’t continuously learning the same stuff over and over again (no, learning WebForms, MVC 1, MVC 2 … MVC 5, MVC Core doesn’t count).

Quite a bit of this isn’t really going to be useful in the near future, but expanding your knowledge base is going to be useful in the long term. You are going to run into things and go “Ah! I know that already”, or be able to provide much better solutions than the stuff that has already been tried.

Yes, that actually takes both work and effort. You need to make time to do so, and when you have family and kids that isn’t easy. But it is worth it.

And just because I know people are going to read it as such, that does not mean that you've got to abandon the kids to raise themselves while you are hacking away at your latest interest. For most people, putting in two to four hours a week is possible. Feel free to cut down the time you are browsing Facebook, for example.

First run with Rust, the echo server

time to read 3 min | 440 words

I have an idea for a relatively large Rust project, but I want to get up to speed on the language first. So I decided to write a simple Echo server. In other words, listen to the network, and just echo back what the user is sending us.

Simple and relatively trivial, but it also involves enough moving pieces that it isn’t hello world.

Here is what I came up with:

I have no idea if this is actually idiomatic Rust, but I tried to keep it as close as possible to the spirit of the Rust book as I could.

One thing to note: the Windows telnet takes several seconds to connect, which made me think that my code was somehow slow. Using telnet on Linux gives an instant response.

There are a couple of things to note here.

On lines 7 and 9-10 you can see me using expect. This is effectively a way to say “if this returns an error, kill the program with this message”. This is an interestingly named method (read, I think it is backward), which I think is suitable for the high level portions of your code, but you probably shouldn’t use it in anything that doesn’t have full control over the environment.

On line 14, we start to listen to the network, accepting connections. On each accepted connection, we spin a new thread and then pass it the new connection. I actually expected that passing the connection to the thread would be harder (or at least require a move), I’m not sure why it worked. I’m guessing that this is because the stream we got is the result of the iteration, and we already took ownership of that value?

The rest happens on line 26, in the handle_client method (incidentally, the Rust compiler will complain if you don’t match the expected naming conventions, which is a good way to ensure that you have consistent experience).

You might note that I have an issue with error handling here. Rust’s methods return a Result, and that requires unpacking it. In the first case, line 28, we just assume that it isn’t even a valid connection, but in the second, we actually handle it via nested ifs. As I understand it, it might be done by composing this, but I tried using and_then, or_else, map and map_err and wasn’t really able to come up with something that would actually work.
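One way to flatten the nested ifs is the `?` operator, which returns early with the error and otherwise unwraps the value. The sketch below is a rewrite under that assumption, not the listing from the post; `echo`, `handle_client` and `serve` are names chosen for the example. It also shows why no `move` is needed on the thread closure: the closure body passes `stream` by value to `handle_client`, so the closure takes ownership of it on its own.

```rust
use std::io::{self, Read, Write};
use std::net::{TcpListener, TcpStream};
use std::thread;

/// Copies everything from `input` back to `output`; returns the bytes echoed.
fn echo<R: Read, W: Write>(mut input: R, mut output: W) -> io::Result<u64> {
    let mut buf = [0u8; 512];
    let mut total = 0u64;
    loop {
        let n = input.read(&mut buf)?; // an error here returns from echo()
        if n == 0 {
            return Ok(total); // the peer closed the connection
        }
        output.write_all(&buf[..n])?;
        total += n as u64;
    }
}

fn handle_client(stream: TcpStream) -> io::Result<u64> {
    // A shared reference to a TcpStream can be used for both reading and writing.
    echo(&stream, &stream)
}

fn serve(listener: TcpListener) {
    for stream in listener.incoming() {
        match stream {
            Ok(stream) => {
                // `stream` is consumed inside the closure, so it is captured by value.
                thread::spawn(|| {
                    if let Err(err) = handle_client(stream) {
                        eprintln!("connection failed: {}", err);
                    }
                });
            }
            Err(err) => eprintln!("accept failed: {}", err),
        }
    }
}
```

The error is still handled in exactly one place, at the top of the thread, which is about as far as an echo server needs to go.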

My next challenge, let us avoid taking a thread per connection and do async I/O. It looks like we can do something similar to TPL as well as an event loop.

Uninformed comments on Rust

time to read 3 min | 591 words

So my current project is to learn Rust. This is part of my firm belief that if you don’t actively extend your reach, you’ll be forever stuck in an infinite loop. So Rust it is.

Why Rust? Because the previous time I did something like that I looked at Go, which is nice, but except for green threads (which I’m familiar with from Erlang), it is pretty much the same old thing. I’ll admit that the compilation speeds are pretty attractive there, but I’m working mostly on system software these days, and it seems like it is almost there, but not quite.

So the choice was between Rust and modern C++. The last time I actually did any C++ professionally was in 2006, I needed to build a tool to make a machine join the domain automatically, or something like that. I remember being stuck for an embarrassingly long time on “using std” vs “using namespace std”. Before that, I actually did C++ for a few years, in 1999-2002 or so. I was below average, at best, but I had read all the Meyers books, and I had great confidence that I can mess up real well.

But C++ today is a wildly different beast, much nicer, but at the same time… The core concepts (memory management, RAII, etc) I already know, and a lot of the rest seems like stuff that I now take for granted (foreach, auto, lambdas, not doing manual memory management all the time, etc). Looking into modern C++ codebases and discussions, I see a lot of stuff about move constructors and variadic templates. That caused me some pain, but basically, there isn’t anything new there for me, just detail work.

Rust is new, in the sense that it has the borrow checker and it is supposed to be a safe & fast system programming language. That seems like a big contradiction in terms, but it is at least interesting.

So I set out to read the Rust book. I have done so, and below you can see some of my impressions while reading it.

This post is written before I did anything more interesting than writing a hello world.

Rust has macros, and I like that. Should be interesting to see what it can do. Okay, I saw what it can do, and I sort of want to go back to variadic templates. To be rather more exact, that syntax looks sort of the same. And it reminds me strongly of trying to grok XSLT in the 90s.

I like that it has explicit threading support, and it is interesting that this is baked directly into the language and checked by the compiler. Although some of the stuff that is mentioned there is fishy (Rc<RefCell<Vec<T>>> and other strangeness), but then again, I’ve literally just read the book, so I need to see how this works out in practice.

Something that was really strange during the reading is that Rust uses the type of the assignment to infer types.

Speaking of this, I’m used to languages that have far less syntax. In comparison, it seems like Rust has a lot of stuff going on. Most of it seems to be optional / inferred, but that is surprising.

Loop labels and named break / continue are really nice. They are always ugly when you need them.
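As a quick illustration of loop labels, here is a small sketch (the function is made up for the example) where the inner loop needs to stop both loops at once:

```rust
/// Returns the indexes of the first pair of values that add up to `target`.
fn first_pair_with_sum(values: &[i32], target: i32) -> Option<(usize, usize)> {
    let mut found = None;
    'outer: for i in 0..values.len() {
        for j in (i + 1)..values.len() {
            if values[i] + values[j] == target {
                found = Some((i, j));
                // Leaves both loops; a plain `break` would only leave the inner one.
                break 'outer;
            }
        }
    }
    found
}
```

Without the label, this takes either a flag variable that both loops check or an early return from a helper function.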

And obviously the type system is quite sophisticated. I’m going to see how hard it is going to hurt me when I try writing actual stuff with it.

Searching shouldn’t be so hard

time to read 3 min | 411 words

The trigger for this post is a StackOverflow question that caught my eye.

Let us imagine that you have the following UI, and you need to implement the search function:


For simplicity’s sake, I’ll assume that you have the following class:

And we need to implement this search. We want users to be able to search by the restaurant name, or its location or its cuisine, or all of the above, for that matter. A query such as “Rama Thai” or “coffee 48th st” should give us results.

One way of doing that is to do something like this:

Of course, that would only find stuff that matches directly. It will find “Rama” or “Thai”, but “Rama Thai” would confuse it. We can make it better, somewhat, by doing a bit of work on the client side and changing the query, like so:

That would now find results for “Rama Thai”, right? But what about “Mr Korean”? Consider a user who has no clue about the restaurant name, let alone how to spell it, but just remembers enough pertinent information: “it was Korean food and had a Mr in its name, on Fancy Ave”.

You can spend a lot of time trying to cater for those needs. Or you can stop thinking about the data you search as the same shape of your results and use this index:

Note that what we are doing here is picking from the restaurant document all the fields that we want to search on and plucking them into a single location, which we then mark as analyzed. This lets RavenDB know that it is time to start cranking. It merges all of those details together and arranges them in such a way that the following query can be done:

And now we don’t do a field by field comparison, instead, we’ll apply the same analysis rules that we applied at indexing time to the query, after which we’ll be able to search the index. And now we have sufficient information not just to find a restaurant named “Fancy Mr Korean” (which to my knowledge doesn’t exist), but to find the appropriate Korean restaurant in the appropriate street, pretty much for free.

These kinds of features can dramatically uplift your applications’ usability and attractiveness to users. “This site gets me”.
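To show what analyzing the combined fields buys you, here is a toy version of the idea in Rust. It is not RavenDB's or Lucene's implementation: the `Restaurant` fields, the `analyze` rules (split on non-alphanumeric characters, lowercase) and the requirement that every query term must match are simplifying assumptions for this sketch, and a real engine would also rank the results.

```rust
use std::collections::{HashMap, HashSet};

struct Restaurant {
    name: &'static str,
    street: &'static str,
    cuisine: &'static str,
}

/// The same analysis is applied to documents at indexing time and to queries.
fn analyze(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|term| !term.is_empty())
        .map(|term| term.to_lowercase())
        .collect()
}

struct SearchIndex {
    // term -> ids of the restaurants that contain it in any of the indexed fields
    postings: HashMap<String, HashSet<usize>>,
}

impl SearchIndex {
    fn build(restaurants: &[Restaurant]) -> Self {
        let mut postings: HashMap<String, HashSet<usize>> = HashMap::new();
        for (id, r) in restaurants.iter().enumerate() {
            // Pluck all searchable fields into a single location, then analyze it.
            let combined = format!("{} {} {}", r.name, r.street, r.cuisine);
            for term in analyze(&combined) {
                postings.entry(term).or_default().insert(id);
            }
        }
        SearchIndex { postings }
    }

    /// Returns the ids of restaurants that contain every term of the query.
    fn search(&self, query: &str) -> Vec<usize> {
        let mut result: Option<HashSet<usize>> = None;
        for term in analyze(query) {
            let docs = self.postings.get(&term).cloned().unwrap_or_default();
            result = Some(match result {
                Some(acc) => acc.intersection(&docs).copied().collect(),
                None => docs,
            });
        }
        let mut ids: Vec<usize> = result.unwrap_or_default().into_iter().collect();
        ids.sort();
        ids
    }
}
```

Because the query no longer cares which field a term came from, “Mr Korean fancy” can match a restaurant whose name contains “Mr”, whose cuisine is Korean and whose street is Fancy Ave.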

Tricks of working with native memory

time to read 2 min | 362 words

I want to start by saying that this isn’t my idea, I read about it a few times, and I recently encountered it with sodium_malloc, so I decided to write my own implementation of the world’s most expensive memory allocator.

What this code does is pretty simple, and quite brutal. It allocates memory in such a fashion that it absolutely guarantees that you can’t get away with a whole host of memory problems.

For example, if you try to write past the end of a buffer allocated by this method, you’ll immediately hit the guard page and die horribly (and predictably, in the place where the error actually happened, not a long way off). If you somehow write before the buffer, that will be detected on free if this is a small underwrite (which tends to be much rarer, by the way), or immediately if this is a big change.

What is more, once the memory is freed, it is poisoned, and can never be used again. This pretty much relies on us running on 64 bits with effectively unlimited virtual memory, and has the nasty side effect of turning a 50 byte allocation into something requiring 12 KB. Having said that, as a debugging tool, this is invaluable.

And yes, I’m aware that Windows already has that with the heap verifier. But as I’m using this in .NET code, I needed to write my own (this also pretty much works the same way with Linux, you just need to switch the API, but the functionality is the same).

This was built because we were chasing a memory corruption error, and I ran this, but it pointed me to a totally different location than suspected. So it is either doing a very good job, or it found me another issue.
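The 12 KB figure falls out of simple page arithmetic. The allocator code is not reproduced here, so the layout below is an assumption: 4 KB pages, one leading page for bookkeeping, enough data pages to hold the request, and one trailing guard page, with the user buffer pushed up so that it ends exactly where the guard page begins. The sketch only computes sizes and offsets; it does not allocate or protect any memory.

```rust
// Assumed page size; the real value should come from the operating system.
const PAGE: usize = 4096;

struct GuardedLayout {
    total_bytes: usize, // how much virtual memory the allocation reserves
    user_offset: usize, // where the caller's buffer starts inside that range
}

fn guarded_layout(size: usize) -> GuardedLayout {
    // Round up to whole pages; an exact multiple of the page size must not
    // get an extra page, and a zero-sized request still gets one.
    let data_pages = ((size + PAGE - 1) / PAGE).max(1);
    // [leading page][data pages][guard page]
    let total_bytes = (1 + data_pages + 1) * PAGE;
    // Place the buffer so that its last byte is right before the guard page.
    let user_offset = (1 + data_pages) * PAGE - size;
    GuardedLayout { total_bytes, user_offset }
}
```

Under these assumptions a 50 byte request reserves three pages, which is where 12 KB comes from, and writing even one byte past the end lands on the guard page.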

… investigating …

Or, as the case may be, we found a bug in the actual memory guard (we didn’t handle allocations of exact page size correctly, and they broke), but at least it broke consistently and was pretty easy to find once I looked in the right place.

