Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:


+972 52-548-6969

, @ Q c

Posts: 6,282 | Comments: 46,766

filter by tags archive

First run with Rust, the echo server

time to read 3 min | 440 words

I have an idea for a relatively large Rust project, but I want to get up to speed on the language first. So I decided to write a simple Echo server. In other words, listen to the network, and just echo back what the user is sending us.

Simple and relatively trivial, but also involve enough moving pieces that it isn’t hello world.

Here is what I came up with:

I have no idea if this is actually idiomatic Rust, but I tried to keep it as close as possible to the spirit of the Rust book as I could.

One thing to note, the Windows telnet takes several seconds to connect, which made me think that my code is somehow slow, using telnet on Linux gives instant response.

There are a couple of things to note here.

On lines 7 and 9-10 you can see me using expect. This is effectively a way to say “if this returns an error, kill the program with this message”. This is an interestingly named method (read, I think it is backward), which I think is suitable for the high level portions of your code, but you shouldn’t probably use it in anything that doesn’t have full control over the environment.

On line 14, we start to listen to the network, accepting connections. On each accepted connection, we spin a new thread and then pass it the new connection. I actually expected that passing the connection to the thread would be harder (or at least require a move), I’m not sure why it worked. I’m guessing that this is because the stream we got is the result of the iteration, and we already took ownership on that value?

The rest happens on line 26, in the handle_client method (incidentally, the Rust compiler will complain if you don’t match the expected naming conventions, which is a good way to ensure that you have consistent experience).

You might note that I have an issue with error handling here. Rust’s methods return a Result struct, and that requires unpacking it. In the first case, line 28, we just assume that it isn’t even a valid connection, but in the second, we actually handle it via nested ifs. As I understand it, it might be done with composing this, but I tried using and_then, or_else, map and map_error and wasn’t really able to come up with something that would actually work.

My next challenge, let us avoid taking a thread per connection and do async I/O. It looks like we can do something similar to TPL as well as an event loop.

Uninformed comments on Rust

time to read 3 min | 591 words

So my current project is to learn Rust. This is part of firm belief that if you don’t actively extend your reach, you’ll be forever stuck in an infinite loop. So Rust it is.

Why Rust? Because the previous time I did something like that I looked at Go, which is nice, but except for green threads (which I’m familiar with from Erlang), it is pretty much the same old thing. I’ll admit that the compilation speeds are pretty attractive there, but I’m working mostly on system software this days, and it seems like it is almost there, but not quite.

So the choice was between Rust and modern C++. The last time I actually did any C++ professionally was in 2006, I needed to build a tool to make a machine join the domain automatically, or something like that. I remember being stuck for an embarrassing long time on “using std” vs “using namespace std”. Before that, I actually did C++ for a few years, in 1999- 2002 or so. I was below average, at best, but I had read all the Myers books, and I had great confidence that I can mess up real well.

But C++ today is a wildly different beast, much nicer, but at the same time… The core concepts (memory management, RAII, etc) I already know, and a lot of the rest seems like stuff that I now take for granted (foreach, auto,lambdas, not doing manual memory management all the time, etc). Looking into modern C++ codebase and discussion, I see a lot of stuff about move constructors and variadic templates. That caused me some pain, but basically, there isn’t anything new there for me, just detail work.

Rust is new, in the sense that it has the Burrow checker and it is supposed to be a safe & fast system programming language. That seems like a big contradiction in term, but it is at least interesting.

So I set out to read the Rust book. I have done so, and below you can see some of my impressions while reading it.

This post is written before I did anything more interesting than writing a hello world.

Rust has macros, and I like that. Should be interesting to see what it can do.  Okay, I saw what it can do, and I sort of want to go back to variadic templates. To be rather more exact, that syntax looks sort of the same. And remind me strongly of trying to grok XSLT in the 90s.

I like that it has explicit threading support, and it is interesting that this is baked directly into the language and checked by the compiler. Although some of the stuff that is mentioned there is fishy (Rc<RefCell<Vec<T>>> and other strangeness), but then again, I’ve literally just read the book, so need to see how this work out in practice.

Something that was really strange during the reading is that Rust uses the type of the assignment to infer types.

Speaking of this, I’m used to languages that have far less syntax. In comparison, it seems like Rust have a lot of stuff going on. Most of it seems to be optional / inferred, but that is surprising.

Loop labels and named break / continue are really nice. They are always ugly when you need them.

And obviously the type system is quite sophisticated. I’m going to see how hard it is going to hurt me when I try writing actually stuff with it.

What does this tidbit DO?

time to read 1 min | 176 words

I run into the following inside a C codebase.


That caused some head scratching, until I realized that this is probably just a funny way to write a static assert.

Basically, sizeof() is a constant time value that is being evaluated by the compiler, I assume that the cast to void is to make this clear to the compiler that we are throwing away the value intentionally and that it shouldn’t actually generate any code here.

The rest is pretty basic, the compiler need to compute the size of the array, and to do that, it need to compute the value, but if the above statement is false, the value is –1, and that is not a valid size for the array, so a compiler error will be raised.

I think that there are better ways to do it, but I’m guessing this a more portable version?

Searching shouldn’t be so hard

time to read 3 min | 411 words

The trigger for this post is a StackOverflow question that caught my eye.

Let us imagine that you have the following UI, and you need to implement the search function:

enter image description here 

For simplicity’s sake, I’ll assume that you have the following class:

And we need to implement this search, we want users to be able to search by the restaurant name, or its location or its cuisine, or all of the above, for that matter. A query such as”Rama Thai” or “coffee 48th st” should all give us results.

One way of doing that is to do something like this:

Of course, that would only find stuff that matches directly. It will find “Rama” or “Thai”, but “Rama Thai” would confuse it. We can make it better, somewhat, but doing a bit of work on the client side and changing the query, like so:

That would now find results for “Rama Thai”, right? But what about “Mr Korean” ? Consider a user who have no clue about the restaurant name, let alone how to spell it, but just remember enough pertinent information “it was Korean food and had a Mr in its name, on Fancy Ave”.

You can spend a lot of time trying to cater for those needs. Or you can stop thinking about the data you search as the same shape of your results and use this index:

Note that what we are doing here is picking from the restaurant document all the fields that we want to search on and plucking them into a single location, which we then mark as analyzed. This let RavenDB know that it is time to start cranking. It merge all of those details together and arrange them in such a way that the following query can be done:

And now we don’t do a field by field comparison, instead, we’ll apply the same analysis rules that we applied at indexing time to the query, after which we’ll be able to search the index. And now we have sufficient information not just to find this a restaurant named “Fancy Mr Korean” (which to my knowledge doesn’t exist), but to find the appropriate Korean restaurant in the appropriate street, pretty much for free.

Those kind of features can dramatically uplift your applications’ usability and attractiveness to users. “This sites gets me”. 

Tricks of working with native memory

time to read 2 min | 362 words

I want to start by saying that this isn’t my idea, I read about it a few times, and I recently encountered it with sodium_malloc, so I decided to write my own implementation of the world’s most expensive memory allocator.

What this code does is pretty simple, and quite brutal. It allocates memory in such a fashion that absolutely guarantee that you can’t get away with a whole host of memory problems.

For example, if you try overwrite a buffer allocated by this method, you’ll immediately hit the guard page and die horribly (and predictably, in the place where the error actually happened, not a long way off). If you somehow write before the buffer, that will be detected on free if this is a small under write (which tend to be much rarer, by the way), or immediately if this is a big change.

What is more, once the memory is freed, it is poisoned, and can never be used again. This pretty much rely on us running on 64 bits with effectively unlimited virtual memory, and has the nasty side effect of turning a 50 bytes allocation to something requiring 12 KB. Having said that, as a debugging tool, this is invaluable.

And yes, I’m aware that windows already have that with the heap verifier. But as I’m using this in .NET code, I needed to write my own (this also pretty much work the same way with Linux, you just need to switch the API, but the functionality is the same).

This was built because we were chasing a memory corruption error, and I run this, but it pointed me to a totally different location than suspected. So it is either doing a very good job, or it found me another issue.

… investigating …

Or, as the case may be, we found a bug in the actual memory guard (we didn’t handle allocations of exact page size correctly, and they broke), but at least it broke consistently and was pretty easy to find once I looked in the right place Smile.

Strong data encryption questions

time to read 3 min | 428 words

Image result for encryption iconWith RavenDB 4.0, we are looking to strengthen our encryption capabilities. Right now RavenDB is capable of encrypting document data and the contents of indexes at rest. That is, if you look at the disk, the data is securely encrypted. However, in memory, we keep quite a bit of information in plain text (mostly in caches of various kinds), and the document metadata isn’t encrypted, so documents keys are visible.

With RavenDB 4.0 we are looking into making some stronger guarantees. That means that we want to keep all data encrypted on disk, and only decrypt it during transaction, after which it will immediately be encrypted back.

Now, encryption and security in general are pretty big fields, and I’m by no means an expert, so I thought that I would outline the initial goals of our research and see if you have anything to add.

  • All encryption / decryption operations are done on data that is aligned on 4KB boundary and is always in multiples of 4 KB. It would be extremely helpful if the encryption would not change the size of the data. Given that the data is always in 4KB increments, I don’t think that this is going to be an issue.
  • We can’t use managed API to do so. Out data is actually residing in unmanaged memory, so ideally we would need something like this:

  • I also need to do this be able to call this from C#, and it needs to run on Windows, Linux and hopefully Mac OS.
  • I’ve been looking at stuff like this page, trying to understand what it means and hoping that this is actually using best practices for safety.

Another problem is that just getting the encryption code right doesn’t help without managing all the rest of it properly. Selecting the appropriate algorithm and mode, making sure that the library we use is both well known and respected, etc. How do we distributed / deploy / update it over multiple platforms?

Any recommendations?

You can see some sample code that I have made here: https://gist.github.com/ayende/13b206b9d83e7aa126df77d6b12711f3

This is basically the sample OpenSSL translated to C# with a bit of P/Invoke. Note that this is meant for our own use, so we don't need padding since we always pass a buffer that is a multiple of 4KB. 

I'm assuming that since this is based on the example on the OpenSSL wiki, it is also a best practice sample. There is a chance that I am mistaken, however, which is why we have this post.

RavenDB 4.0 Indexing Benchmark

time to read 4 min | 645 words

I run into this post, which was pretty interesting to read. It compares a bunch of ways to index inside CouchDB, so I decided to see how RavenDB 4.0 compares.

I wrote the following code to generate the data inside RavenDB:

In the CouchDB post, this took… a while. With RavenDB, this took 7.7 seconds and the database size at the end was 48.06 MB. This is my laptop, 6th gen i7 with 16 GB RAM and SSD drive. The CouchDB tests were run over a similar machine, but with 8 GB RAM, but RavenDB didn’t get to use much memory at all throughout the benchmark.

It is currently sitting on around 205MB working set, and allocated a total of 70 MB managed memory and 19 MB of native memory.

I then created the following index:


This does pretty much the same as the indexes created in CouchDB. Note that in this case, this is running in process, so the nearest equivalent would be to Erlang native views.

This took 1.281 seconds to index 100,000 documents, giving us a total of 78,064 indexed documents per second.

The working during the indexing process grew to 312 MB.

But those were small documents, what about larger ones? In the linked post, there was another test done, this time with each index also having a 300KB text property. Note that this is a single property.

The math goes like this: 300KB * 100,000 documents = 30,000,000 KB = 29,296 MB = 28.6 GB.

Inserting those took 1 minute and 40 seconds. However, the working set for RavenDB peaked at around 500 MB, and the total size of the database at the end was  512.06 MB. It took me a while to figure out what was going on.

One of the features that we have with the blittable format is the fact that we can compress large string values. The large string property was basically lorem ipsum repeated 700 times, so it compressed real well. And the beauty of this is that unless we actually need to touch that property and do something to it, it can remain compressed.

Indexing that amount of data took 1.45 seconds, for 68,965 indexes documents per second.

My next test was to avoid creating a single large property and go with a lot of smaller properties.

This generates 100,000 documents in the 90KB – 900 KB range. Inserting this amount of data took a while longer. This is mostly because of the amount of data that we write in this scenario which is in the many GB range. In fact, this is what my machine looked like while the insert was going on:


Overall, the insert process took 8:01 minutes to complete. And the final DB size was around 12 GB.

Indexing that amount of information took a lot longer, and consume a few resources:


Time to index? Well, it was about twice as much as before, with RavenDB taking a total of 2.58 seconds to index 100,000 documents totaling 12 GB in size.

That comes to 38,759 indexes documents per second.

A large part of that is related to the way we are storing information in blittable format. We don’t need to do any processing to it, so we are drastically faster.

Comparing to the CouchDB results in the linked post, we have:

  RavenDB CouchDB
Small documents 78,064 19,821
Large documents 38,759   9,363

In other words, we are talking about being 4 times faster than the faster CouchDB option.

RavenDB 4.0 Alpha is out!

time to read 3 min | 428 words

imageFor the past two years or so, we have been working very hard on RavenDB 4.0. You have seen some of the posts in this blog detailing some of the work we have been doing. In RavenDB 4.0 we have put a tremendous amount of emphasis on performance and reducing the amount of work we are doing. We have been quite successful. Our metrics show a minimum of 10x improvement across the board, and often we are even faster.

I’m going to be posting a lot about RavenDB 4.0 in the next few weeks, to highlight some of the new stuff, this time in context, but for now, let me just say that I haven’t been this excited about releasing software since the 1.0 release of RavenDB.

You can download the new alpha here, we have pre-built binaries for Windows, Linux (Ubuntu) and Raspberry PI. Mac OS support is scheduled and will likely be in the next alpha.

The alpha version supports building databases, working with documents, subscriptions, changes, indexing (simple, full text and map/reduce). There have been significant improvements to indexing speed and how they work, memory utilization across the board is much lower and CPU utilization is much reduced.

We aren’t ready to do full blown benchmarks yet, but a sample scenario that has RavenDB 3.5 pegged at 100% with 45,000 requests per second has RavenDB 4.0 relaxing with 70% CPU with 153,000 requests per second.

We would like to get feedback on RavenDB 4.0, and so please download the software and give it a whirl.

You need to make sure that you use both server & client from the same version (older versions will not work with the new server). We actually have a couple of clients, we have the updated client for 4.0, which is pretty much the same as the old client, just updated to work with the new server, and we have a completely new client, which is built to provide much higher performance and apply the same lessons we learned while building the server to the client code. Over time, the new client is going to be the default one, but while we aren’t there yet, we want people to have access to it from the get go.

Please remember that this is pre-release alpha software, as such, bugs are expected, and should be reported.

Business features vs. technical features

time to read 4 min | 613 words

A feature is something that your application/service does. Usually we don’t give it a lot more thought, but I recently had an interesting discussion about the exact distinctions between a business feature and a technical feature.

Let us imagine that we are talking about an application that allow to send snail mail, we already seen it before. A user will call the API and then a few days later a physical letter will show up at your door. So far, it is pretty simple. The question is, what can you offer in addition to expand the business.

For example, we might offer:

  • Mail tracking – providing a way to ensure that the recipient actually got the letter.
  • Snail mail to email – getting a physical email, and having that sent to the customer.

Those two are obvious extensions to the core business, and from the point of view of the business, it is great. From a technical perspective, that is a whole lot of complexity. You need to integrate with FedEx to handle the mail tracking, and you need to setup some sort of an automated system that will sort the mail, scan it and upload it to the customer’s account.

The problem is that at this point, you don’t really know what kind of reaction those features are going to have. They are both non trivial and in some cases require major capital expenditure to implement and are pretty hard to properly size upfront.

So you split it. Instead of doing this as a single feature, you have a business feature and a technical feature. A business feature means that your business offers this service, building that requires research to show that we can actually offer that, check whatever there are legal ramifications (some mail can be sensitive, privacy concerns, etc), check what kind of pricing we can charge, etc.  The technical feature is actually implementing all of that.

But the key observation here is that you don’t actually do the technical implementation, at least not just yet. You do the work around the business end of the feature, and then you announce this feature availability. As in, right now you can track the snail mail, or right now you can get your mail scanned and uploaded. This is done with minimal technical work in the backend, and with the caveat that this still experimental and pricing might change.

This isn’t cheating, mind you. Once you announced this feature, you wait to see what kind of reaction we’ll have. One of the options is that users will really love this feature, and start immediately using it. In this case, you have a good problem, people are flocking to give you money. In the meantime, you have Joe and Samantha, from the local high school working for minimum wage in the afternoon to manually do the work. So you can complete the customer expectations, as you are now working to complete the technical side and automate the whole thing (firing Joe and Samantha along the way).

The key here is that you don’t have to do any major upfront investment, in development or in facilities, before you can have this feature. And most of the time, even if it is a major feature, the ramp up time is enough for you to have a pretty good idea about what you actually need to do. And in the meantime, you have a micro service architecture, it is just that the services aren’t called FedExTrackingService and ScanAndSortPhysicalMailService but Joe and Samantha.

In other words, you have mechanical Turk the feature until you can teach you system to properly play chess.


  1. Answer: What does this code do? - 13 hours from now
  2. The metrics calculation methods - 4 days from now

There are posts all the way to Jan 23, 2017


  1. Answer (9):
    16 Aug 2011 - Modifying execution approaches
  2. Challenge (48):
    19 Jan 2017 - What does this code do?
  3. Implementing low level trie (2):
    14 Dec 2016 - Part II
  4. The performance regression in the optimization (2):
    01 Dec 2016 - Part II
  5. Digging into the CoreCLR (4):
    25 Nov 2016 - Some bashing on the cost of hashing
View all series



Main feed Feed Stats
Comments feed   Comments Feed Stats