Ayende @ Rahien

Oren Eini aka Ayende Rahien CEO of Hibernating Rhinos LTD, which develops RavenDB, a NoSQL Open Source Document Database.

time to read 7 min | 1321 words

Almost by accident, it turned out that I implemented a pretty simple, but non-trivial, task in both C and Rust and blogged about them.

Now that I’m done with both of them, I thought it would be interesting to talk about the differences in the experiences.

The Rust version clocks at exactly 400 lines of code and uses 12 external crates.

The C version has 911 lines of C code and another 140 lines in headers and depends on libuv and openssl.

Both took about two weeks of evenings of me playing around. If I was working full time on that, I could probably do that in a couple of days (but probably more, to be honest).

The C version was very straightforward. The C language is pretty much not there, and on the one hand, it didn’t get in my way at all. On the other hand, you are left pretty much on your own. I had to write my own error handling code to be sure that I got good errors, for example. I had to copy some string processing routines that aren’t available in the standard library, and I had to always be sure that I’m releasing resources properly. Adding dependencies is something that you do carefully, because it is so painful.

The Rust version, on the other hand, uses the default error handling that Rust has (and much improved since the last time I tried it). I’m pretty sure that I’m getting worse error messages than the C version I used, but that is good enough to get by, so that is fine. I had to do no resource handling. All of that is already handled for me, and that was something that I didn’t even consider until I started doing this comparison.
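To make the "no resource handling" point concrete, here is a minimal sketch of how Rust releases a resource when its owner goes out of scope. The `Connection` type, the `OPEN` counter and the `handle` function are hypothetical names for illustration, not taken from the protocol code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical counter of live connections, only here to make cleanup observable.
static OPEN: AtomicUsize = AtomicUsize::new(0);

// Hypothetical resource wrapper standing in for a socket or TLS session.
struct Connection;

impl Connection {
    fn open() -> Connection {
        OPEN.fetch_add(1, Ordering::SeqCst);
        Connection
    }
}

impl Drop for Connection {
    // Runs automatically when the value goes out of scope, including on early return.
    fn drop(&mut self) {
        OPEN.fetch_sub(1, Ordering::SeqCst);
    }
}

fn handle(fail: bool) -> Result<(), String> {
    let _conn = Connection::open();
    if fail {
        // No explicit cleanup: `_conn` is dropped on this path as well.
        return Err("bad request".to_string());
    }
    Ok(())
}
```

In C, each return path in `handle` would need its own release call (or a goto to a cleanup label); here the compiler inserts the call to `drop` on every path.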

When writing the C version, I spent a lot of time thinking about the structure of the code, debugging through it (to understand what is going on, since I also learned how OpenSSL works) and seeing if things worked. Writing the code and compiling it were both things that I spent very little time on.

In comparison, the Rust version (although benefiting from the fact that I did it second, so I already knew what I needed to do) made me spend a lot more time on just writing code and getting it to compile.  In both cases, I decided that I wanted this to be a production worthy code, which meant handling all errors, producing good errors, etc. In C, that was simply a tax that needed to be dealt with. With Rust, that was a lot of extra work.

The syntax and language really make it obvious that you want to handle errors, but in most of the Rust code that I reviewed, there are a lot of unwrap() calls, because trying to handle all errors is too much of a burden. When you don’t take that shortcut, your code size balloons, but the complexity of the code doesn’t, which was a great thing to see.
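As a rough illustration of that trade-off, here is a small sketch comparing the unwrap() shortcut with full error propagation. `ConfigError`, `port_or_panic` and `port` are hypothetical names made up for this example.

```rust
use std::fmt;
use std::num::ParseIntError;

// Hypothetical error type for this sketch.
#[derive(Debug)]
enum ConfigError {
    Missing(&'static str),
    BadNumber(ParseIntError),
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            ConfigError::Missing(name) => write!(f, "missing setting '{}'", name),
            ConfigError::BadNumber(e) => write!(f, "invalid number: {}", e),
        }
    }
}

// Lets `?` convert a parse failure into ConfigError.
impl From<ParseIntError> for ConfigError {
    fn from(e: ParseIntError) -> Self {
        ConfigError::BadNumber(e)
    }
}

// Short, but panics on missing or malformed input.
fn port_or_panic(raw: Option<&str>) -> u16 {
    raw.unwrap().trim().parse().unwrap()
}

// Longer, but every failure becomes a value the caller can report.
fn port(raw: Option<&str>) -> Result<u16, ConfigError> {
    let text = raw.ok_or(ConfigError::Missing("port"))?;
    let value = text.trim().parse::<u16>()?;
    Ok(value)
}
```

Most of the extra lines are the error type and its conversions; the function body itself stays flat, which matches the observation that size grows but complexity does not.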

What was really annoying is that in C, if I got a compiler error, I knew exactly what the problem was, and errors were very localized. In Rust, a compiler error could stymie me for hours, just trying to figure out what I need to do to move forward. Note that the situation is much better than it used to be, because I eventually managed to get there, but it took a lot of time and effort, and I don’t think that I was trying to explore any dark corners of the language.

What really sucked is that Rust, by its nature, does a lot of type inference for you. This is great, but this type inference goes both backward and forward. So if you have a function and you create a variable using HashMap::new(), the actual type of the variable depends on the parameters that you pass to the first usage of this instance. That sounds great, and for the first few times, it looked amazing. The problem is that when you have errors, they compound. A mistake in one location means that Rust has no information about other parts of your code, so it generates errors about those as well. It was pretty common to make a change, run cargo check and see three or four screens’ worth of errors pass by, and then go into a “let’s fix the next compiler error” mode for a while.
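A minimal sketch of that backward inference, using only the standard library (the function name `max_word_count` is made up for the example):

```rust
use std::collections::HashMap;

fn max_word_count(text: &str) -> u32 {
    // No type annotation: at this point the key and value types are still unknown.
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // This use, further down, is what fixes the type as HashMap<String, u32>.
        *counts.entry(word.to_string()).or_insert(0u32) += 1;
    }
    counts.values().copied().max().unwrap_or(0)
}
```

If the loop body is deleted, the compiler can no longer work out the map's type and asks for a type annotation at the `HashMap::new()` line, even though that line did not change. That is the kind of action at a distance described above.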

The type inference bit also comes into play when you write the code. Because you don’t have the types in front of you (and because Rust loves composing types), it can be really hard to understand what a particular method will return.

C’s lack of async/await meant that when I wanted to do async operations, I had to decompose that to event loop mode. In Rust, I ended up using tokio, but I think that was a mistake. I should have used the event loop model there as well. It isn’t as nice, in terms of the code readability, but the fact that Rust doesn’t have proper async/await meant that I had a lot more additional complexity to deal with, and that nearly caused me to give up on the whole thing.

I do want to mention that for C, I ran Valgrind a few times to catch memory leaks and invalid memory accesses (it found a few, even though I was extra careful). In Rust, the compiler was very strict and several times complained about stuff that, if allowed, would have caused problems. I did like that, but most of the time, it felt like fighting the compiler.

Speaking of which, the compilation times for Rust felt really high. Even with 400 lines of code, it can take a couple of seconds to compile (with cargo check, mind, not full build). I do wonder what it will do with a project of significant size.

I gotta say, though, compiling the C code meant that I would have to test the code. Compiling the Rust code meant that I could run things and they usually worked. That was nice, but at the same time, getting the thing to compile at all was a major chore many times. Even with the C code not working properly, the feedback loop was much shorter with C than with Rust. And some part of that was that I already had a working implementation for most of what I needed, so I had a lot less need to explore when I wrote the Rust code.

I don’t have any hard conclusions from the experience. I like the simplicity of C, and if I had something like Go’s defer to ensure resource disposal, that would probably be enough (I’m aware of libdefer and friends). I find the Rust code elegant (except the async stuff) and the standard library is great. The fact that the crates system is there means that I have very rich access to additional libraries and that this is easy to do. However, Rust is full of ceremony that sometimes seems really annoying. You have to use Cargo.toml and extern crate, for example.

There is a lot more to be done to make the compiler happy. And while it does sometimes catch you doing something you shouldn’t, I found that it usually felt like busy work more than anything else. In some ways, it feels like Rust is trying to do too much. I would have liked to see something less ambitious, just focusing on one or two concepts, instead of trying to be both a high and low level language, with type inference set to the max, a borrow checker and memory safety, etc. It feels like this is a very high bar to cross, and I haven’t seen that the benefits are clearly on the plus side here.

time to read 2 min | 345 words

I’m pretty much done with my Rust protocol impl. The last thing that I wanted to try was to see what it would look like when I allow for messages to be handled out of band.

Right now, my code consuming the protocol library looks like this:

This is pretty simple, but note that the function definition forces us to return a value immediately, and that we don’t have a way to handle a command asynchronously.

What I wanted to do is to change things around so I could do that. I decided to implement the command:

remind 15 Nap

Which should help me remember to nap. In order to handle this scenario, I need to provide a way to do async work and to keep sending messages to the client. Here was the first change I made:

image

Instead of returning a value from the function, we are going to give it the sender (which will render the value to the client) and can return an error if the command is invalid in some form.
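The shape of that change can be sketched with the standard library alone. This is not the tokio-based code from the post: it uses `std::sync::mpsc` and a background thread as stand-ins, the handler signatures are hypothetical, and the delay is read as milliseconds purely to keep the example quick to run.

```rust
use std::sync::mpsc::Sender;
use std::thread;
use std::time::Duration;

// Hypothetical handler shape: reply through `sender`, return Err for an invalid command.
fn echo(args: &[&str], sender: &Sender<String>) -> Result<(), String> {
    sender.send(args.join(" ")).map_err(|e| e.to_string())
}

// remind <delay> <text>: validates now, replies later from a background thread.
fn remind(args: &[&str], sender: &Sender<String>) -> Result<(), String> {
    if args.len() < 2 {
        return Err("usage: remind <delay> <text>".to_string());
    }
    let delay = args[0]
        .parse::<u64>()
        .map_err(|_| "delay must be a number".to_string())?;
    let text = args[1..].join(" ");
    // Each handler gets its own clone of the sender, so replies can arrive out of order.
    let sender = sender.clone();
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(delay));
        // The client may be gone by now; the send error is ignored in this sketch.
        let _ = sender.send(text);
    });
    Ok(())
}
```

The caller creates the channel once per connection with `std::sync::mpsc::channel()`, hands the sender to handlers, and has a single writer loop drain the receiver and write each message to the client.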

That said, it means that the echo implementation is a bit more complex.

There is… a lot of ceremony here, even for something so small. Let’s see what happens when we do something bigger, shall we? Here is the implementation of the reminder handler:

Admittedly, a lot of that is error handling, but there is a lot of code here to do something that simple.  Compare that to something like C#, where the same thing could be written as:

I’m not sure that the amount of complexity that is brought about by the tokio model, even with the async/await macros is worth it at this point. I believe that it needs at least a few more iterations before it is going to be usable for the general public.

There is way too much ceremony and work to be done, and a single miss and you are faced with a very pissed off compiler.

time to read 3 min | 410 words

After a lot of trouble, I’m really happy that I was able to build an async I/O implementation of my protocol. However, for real code, I think that I would probably recommend using the sync API instead, since at least that is straightforward and doesn’t incur so much overhead at development time. The async stuff is still very much a “use at your own risk” kind of deal from my perspective. And I can’t imagine trying to use it in a large project and not suffering from the complexity.

As a good example, take a look at the following bit of code:

image

It doesn’t seem to be doing much, right? And it is clear what the intent of the code is.

However, if you try to compile this code you’ll get:

image

Now, it took me a long while to figure out what is going on.  The issue is that the code I’m seeing isn’t the actual code, because of macro expansions.

So let’s resolve this and see what the expanded code looks like:

This is after formatting, of course, but it certainly looks scary. Glancing at this code doesn’t tell me what the problem was, so I tried replacing the method with the expanded result, and I got the same error, but this time I got it on a line that helped me figure it out. Here is the issue:

image

We use the ? to return early from the poll method, and the Receiver I’m using in this case is defined to have a Result<String, ()>, so this is the cause of the problem.

I returned my own error type as a result, giving me the ability to convert from (), but that was a really hard thing to resolve.
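A minimal sketch of that kind of fix, with synchronous stand-ins instead of the real futures and receiver (`ConnError`, `next_message`, `write_message` and `pump` are all hypothetical names):

```rust
use std::io;

// Hypothetical error type that both error sources can convert into.
#[derive(Debug)]
enum ConnError {
    Io(io::Error),
    ChannelClosed,
}

impl From<io::Error> for ConnError {
    fn from(e: io::Error) -> Self {
        ConnError::Io(e)
    }
}

// The conversion that `?` needs when the failing call has `()` as its error type.
impl From<()> for ConnError {
    fn from(_: ()) -> Self {
        ConnError::ChannelClosed
    }
}

// Stand-in for polling a receiver whose error type is `()`.
fn next_message(closed: bool) -> Result<String, ()> {
    if closed {
        Err(())
    } else {
        Ok("PING".to_string())
    }
}

// Stand-in for a write that fails with io::Error.
fn write_message(msg: &str) -> Result<usize, io::Error> {
    if msg.is_empty() {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "empty message"));
    }
    Ok(msg.len())
}

fn pump(closed: bool) -> Result<usize, ConnError> {
    let msg = next_message(closed)?; // () becomes ConnError via From<()>
    let written = write_message(&msg)?; // io::Error becomes ConnError via From<io::Error>
    Ok(written)
}
```

If `pump` returned `Result<usize, io::Error>` instead, the first `?` would fail to compile, because there is no conversion from `()` to `io::Error`. That is the same class of error, just without a macro hiding the `?`.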

It might be better to have Rust also offer to show the error on the expanded code by default, because it was somewhat of a chore to actually get to this.

What made this oh so confusing is that I had the exact same code, but using a Stream<String, io::Error>, and that worked, obviously. But it was decidedly non obvious to see what the difference was between two identical pieces of code.

time to read 3 min | 588 words

In my last post, I got really frustrated with tokio’s complexity and wanted to move to using mio directly. The advantage is that the programming model is pretty simple, even if actually working with it is hard. Event loops can cause your logic to spread over many different locations and make it hard to follow. I started to go down that path until I figured out just how much work it would take. I decided to give tokio a second chance, and at this point, I looked into attempts to provide async/await functionality to Rust.

It seems that at least some work is already available for this, using futures + some Rust macros. That let me write code that is much more natural looking, and I actually managed to make it work.

Before I get to the code, I want to point out some concerns that I have right now. The futures-await crate (and indeed, all of tokio) seems to be in a state of flux. There is an await in tokio, and I think that there is some merging around of all of those libraries into a single whole. What I don’t know, and can’t find any information about, is what I should actually be using, and how all the pieces come together. I have to note that even with async/await, the programming model is still somewhat awkward, but it is at a level that I can live with. Here is how I built it.

First, we need to accept connections, which is done like so:

Note that I have two #[async] annotations. One for the method as a whole and one for the for loop. This just accepts the connection and spawns a task to handle it. The most interesting tidbits are in the actual processing of the connection:

You can see that this is fairly straightforward code. We first do the TLS handshake, then we validate the certificate. If there is an auth error, we send it to the user and back off. If we are successful, however, things get interesting.

I create a channel, which allows me to split off the read and write portions of the task. This means that I can send results out of order, if I wanted to, which is great for the actual protocol handling. The first thing to do is to send the OK string to the client, so they know that we successfully connected, then we spawn the read/write tasks. The write task is pretty simple, overall:

You can see the funny .0 references, which are an artifact of the fact that the write_all() function consumes the writer we pass to it and returns (a potentially different) writer in the result. This is pretty common for functional languages.
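The ownership pattern behind those .0 references can be shown with a synchronous stand-in. `write_all_owned` and `send_line` are hypothetical helpers written for this sketch, and the "\r\n" terminator is an assumption; the real code goes through tokio's future-returning write_all instead.

```rust
use std::io::{self, Write};

// Hypothetical helper mirroring the shape described above: it takes ownership of
// the writer and hands it back, together with the buffer, in a tuple.
fn write_all_owned<W: Write>(mut writer: W, buf: Vec<u8>) -> io::Result<(W, Vec<u8>)> {
    writer.write_all(&buf)?;
    Ok((writer, buf))
}

fn send_line<W: Write>(writer: W, line: &str) -> io::Result<W> {
    // `.0` picks the writer back out of the returned tuple so it can be reused.
    let writer = write_all_owned(writer, line.as_bytes().to_vec())?.0;
    // Second call for the terminator, matching the two-call shape in the post.
    let writer = write_all_owned(writer, b"\r\n".to_vec())?.0;
    Ok(writer)
}
```

Because the writer is moved into each call, the only way to keep using it is to rebind the value that comes back, which is why every step ends in `.0`.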

I’m pretty sure that I can avoid the two calls to write_all for the postfix, but that is easier for now.

Processing the commands is simple as well:

For each command we support, we have an entry on the server configuration and we fetch and invoke it. The result of the command will be written to the client by the write task. Right now we have a 1:1 association between them, but this is now easily broken.
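A minimal sketch of such a lookup table, assuming a plain function pointer per command. `ServerConfig`, `Handler`, `register` and `dispatch` are illustrative names, not the ones from the branch, and the handler here returns its reply directly rather than going through the write task.

```rust
use std::collections::HashMap;

// Hypothetical handler type: takes the arguments, returns the reply or an error text.
type Handler = fn(&[&str]) -> Result<String, String>;

struct ServerConfig {
    commands: HashMap<String, Handler>,
}

impl ServerConfig {
    fn new() -> Self {
        ServerConfig { commands: HashMap::new() }
    }

    fn register(&mut self, name: &str, handler: Handler) {
        self.commands.insert(name.to_string(), handler);
    }

    // Split the line into a command name and arguments, then fetch and invoke the handler.
    fn dispatch(&self, line: &str) -> Result<String, String> {
        let mut parts = line.split_whitespace();
        let name = parts.next().ok_or_else(|| "empty command".to_string())?;
        let args: Vec<&str> = parts.collect();
        match self.commands.get(name) {
            Some(handler) => handler(args.as_slice()),
            None => Err(format!("unknown command '{}'", name)),
        }
    }
}

fn echo(args: &[&str]) -> Result<String, String> {
    Ok(args.join(" "))
}
```

Usage is `config.register("echo", echo)` at startup and `config.dispatch(line)` for each line read from the client.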

And finally, having an actual command run and running the server itself:

This is pretty simple now, and it gives us a nice model to program commands and responses.

I pushed the whole code to this branch, if you care to look at it.

I have some more comments about this code, but I’ll reserve them for another post.

time to read 2 min | 346 words

I kept going with tokio for a while, and I even got something that I think would eventually work. The whole concept is built around streams, so I created a way to generate them. This is basically taking this code and making it async.

I gave up well into the second hour. Here is where I stopped:

image

I gave up when I realized that the reader I’m using (which is SslStream) didn’t seem to have poll_read. The way I’m reading the code, it is supposed to, but I just threw up my hands in disgust at this point. If it is this hard, it ain’t going to happen.

I wrote a significant amount of async code in C# at the time when events and callbacks were the only option and then when the TPL and ContinueWith were the way to go. That was hard, and async/await is a welcome relief, but the level of frustration and “is this wrong, or am I really this stupid?” that I got midway through is far too much.

Note that this isn’t even about Rust. Some number of the issues that I ran into were because of Rust, but the major issue that I have here is that I’m trying to write a function that can be expressed in a sync manner in less than 15 lines of code and took me about 10 minutes to write the first time. And after spending more hours than I’m comfortable admitting, I couldn’t get it to work. The programming model you have here, even if everything did work, means that you have to either decompose your behavior to streams and interact with them in this manner or you put everything as nested lambdas.

Either option doesn’t make for a nice model to work with. I believe that there is another async I/O model for Rust, the MIO crate, which is based on the event loop model. I’ve already implemented that in C, so that might be a more natural way to do things.

time to read 5 min | 941 words

Now that we have a secured and authenticated connection, the next stage in making a proper library is to make it run more than a single connection at a time. I could have used a thread per connection, of course, or even a thread pool, but neither of those options is valid for the kind of work that I want to see, so I’m going to jump directly into async I/O in Rust and see how that goes.

The sad thing about this is that I expect that this will make me lose some / all of the nice API that I get for OpenSSL in the sync mode.

Async in Rust is handled by a crate called tokio, and there seems to be active work to bring async/await to the language itself. In the meantime, we have to make do with the usual facilities, which ought to make this interesting.

It actually looks like there is a crate that gives pretty nice handling of tokio async I/O and OpenSSL, so that is encouraging. However, as part of trying to re-write everything in tokio style, I got the compiler very upset with me. Here is the (partial) error message:

image

Last time I had to parse such errors, I was working in C++ templated code and the year was 1999.

And here is the piece of code it so dislikes:

image

I googled around and there is this detailed answer on a similar topic that frankly, frightened me. I shouldn’t have to dig this deeply and have to start drawing diagrams on so many disparate pieces of the code just to figure out a compiler error.

Let’s try to break it into its component parts and see if that makes sense. I reduced the code in question to just:

image

Got another big scary error message. Okay, let’s try it without the OpenSSL stuff?

image

This produces the same error, but in a much less scary tone:

image

Okay, now this looks about as simple as it can be. And now the fix is pretty obvious:

image

The key thing to understand here, I believe (I haven’t tested it yet), is that the write_all call will either perform its work or schedule it, so any future work based on it should go in a nested and_then call. So the result of the single for_each invocation is not the direct continuation of the previous call.

That is fine, I’ll deal with that, I guess.

Cue here about six hours of programming montage.

I have been programming for over 20 years, and I like to think that I have been around the block a few times. And the simple task of reading a message from TCP using async I/O took me far too long. Here is what I eventually ended up with:

image

This is after fighting with the borrow checker (a lot, it ended up winning) and trying to wrap my head around the model that tokio has. It is like they took the worst parts of async programming, married it to stream programming’s ugly second cousin and then decided to see if any of the wedding guests is open for adoption.

And if the last sentence doesn’t make sense to you, you are welcome, that is how I felt at certain points. Here is one of the errors that I ran into:

image

What is this string, where did it come from and why do we have a unit “()” there? Let me see if I can explain what is going on here. Here is a very simple bit of code that would explain things.

image

And here is the error it generates:

image

The problem is that spawn is expecting a future whose result carries no meaning, something like: Future<Result<(), ()>>. This makes sense, since there isn’t really anything that it can do with whatever the result is. But the error can be really confusing. I spent a lot of time trying to actually parse this, then I had to go and check the signatures of the methods involved, and then I had to reconstruct which generic parameters are required, etc.

The fix, btw, is this:

image

Ask yourself how long it would take you to figure what the changes between these versions of the code are without the marker.
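The general idea behind this kind of fix can be sketched without tokio, using a plain Result as a stand-in for a future. This is not the code from the screenshots: `run_detached` is a hypothetical function that, like spawn, only accepts work whose success and error types are both `()`.

```rust
use std::io;

// Hypothetical stand-in for a spawn function that only accepts work whose
// success and error types are both `()`.
fn run_detached<F>(task: F) -> bool
where
    F: FnOnce() -> Result<(), ()>,
{
    task().is_ok()
}

// Stand-in for real work that produces a value or an io::Error.
fn read_greeting(fail: bool) -> Result<String, io::Error> {
    if fail {
        Err(io::Error::new(io::ErrorKind::UnexpectedEof, "connection closed"))
    } else {
        Ok("HELLO".to_string())
    }
}

fn spawn_greeting(fail: bool) -> bool {
    run_detached(move || {
        read_greeting(fail)
            // Discard the success value so the Ok type becomes ().
            .map(|_greeting| ())
            // Report the error, then discard it so the Err type becomes ().
            .map_err(|e| eprintln!("task failed: {}", e))
    })
}
```

Dropping either the `map` or the `map_err` line makes the closure's type stop matching `Result<(), ()>`, and the resulting error points at the call site rather than at the missing adapter.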

Anyway, although I’m happy that I got something done, this approach is really not sustainable. I’m pretty sure that I’m either doing something wrong or missing something. It shouldn’t be this hard. I got some ideas that I want to try, which I’ll talk about in the next post.

time to read 3 min | 478 words

After running into a few hurdles, I managed to get rust openssl bindings to work, which means that this is now the time to actually wire things properly in my network protocol, let’s see how that works, shall we?

First, we have the OpenSSL setup:

As you can see, this is pretty easy and there isn’t really anything there that is of actual interest. It does feel a whole lot easier than dealing with OpenSSL directly in C, though.

That said, when I started actually dealing with the client certificate, things got a lot more complicated. The first thing that I wanted to do is to do my authentication, which is defined as:

  • The client presents a client certificate (can be any client certificate).
  • If a client doesn’t give a certificate, we accept the connection, send a message (using the encrypted tunnel) and abort.
  • If the client provides a certificate, it must be one that was previously registered in the server. That is what allowed_certs_thumbprints is for. If it isn’t, we accept the connection, write an error and abort.
  • If the client certificate has expired or is not yet valid, accept, write error & abort.
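The rules above can be sketched as a pure function, independent of OpenSSL. The `ClientCert` struct and `AuthError` enum are hypothetical; only the allowed_certs_thumbprints name comes from the text, and validity times are assumed to be seconds since the Unix epoch.

```rust
use std::collections::HashSet;

// Hypothetical view of the client certificate; times are seconds since the Unix epoch.
struct ClientCert {
    thumbprint: String,
    not_before: u64,
    not_after: u64,
}

#[derive(Debug, PartialEq)]
enum AuthError {
    NoCertificate,
    UnknownCertificate(String),
    NotYetValid,
    Expired,
}

// Every Err here is meant to be written to the client over the already
// established encrypted connection before the server closes it.
fn authenticate(
    cert: Option<&ClientCert>,
    allowed_certs_thumbprints: &HashSet<String>,
    now: u64,
) -> Result<(), AuthError> {
    let cert = cert.ok_or(AuthError::NoCertificate)?;
    if !allowed_certs_thumbprints.contains(&cert.thumbprint) {
        return Err(AuthError::UnknownCertificate(cert.thumbprint.clone()));
    }
    if now < cert.not_before {
        return Err(AuthError::NotYetValid);
    }
    if now > cert.not_after {
        return Err(AuthError::Expired);
    }
    Ok(())
}
```

Note that in every failure case the TLS handshake has already been accepted; the rejection happens at the protocol level so that the client gets a readable error instead of a dropped connection.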

You get the gist. Here is what I had to do to implement the first part:

Most of the code, actually, is about generating proper and clear error messages, more than anything else. I’m not sure how to get the friendly name from the certificate, but this seems to be a good enough stand-in for now.

We validate that we have a certificate, or send an error back. We validate that the certificate presented is known to us, or we send an error back.

The next part I wanted to implement was… far harder than it should be. I just wanted to verify that the certificate’s not before/not after dates are valid. And the problem is that the rust bindings for OpenSSL do not expose that information. Luckily, because it is using OpenSSL, I can just call to OpenSSL directly. That led me to some interesting search into how Rust calls out to C, how foreign types work and a lot of “fun” like that. Given that I’m doing this to learn, I suppose that this is a good thing, though.

Here is what I ended up with (take a deep breath):

Notice that I’m doing all of this (defining external function, defining helper functions) inside the authenticate_certificate function. Coming up with that was harder than expected, but I really liked the fact that it was possible, and that I can just shove this into a corner of my code and not have to make a Big Thing out of it.

And with that, I have the authentication portion of my network protocol in Rust done.

The next stage is going to be implementing a server that can handle more than a single connection at a time.

time to read 2 min | 340 words

After getting really frustrated with the state of Rust & TLS, I decided to sit down and figure out what it would take to make the OpenSSL crate actually build successfully. Even though the crate claims to support vcpkg, it seems that there were issues there. I started from a clean slate, and checked that I have openssl via vcpkg installed:

image

I then got into a rabbit hole of errors in the build, first:

image

This seems like it wants to statically link to them by default, but when I set the env variable, I got:

image

Looking closely at the error (always read the error message), you can see that it is looking for a 64 bit build, but I have an x86 build.

That very likely explains the issues that I previously had. I tried to point it to the SSL build directory, and I’m pretty sure that I used the 32bits directory. It rejected the attempted link, but didn’t bother to tell me about it.

To be fair, this isn’t Rust’s fault, it is link.exe’s fault for not providing a clear error about this case. Actually, this is the case where you are going to invest some time writing a feature whose only purpose is to get good errors when the user messed up. But that kind of attention to detail makes a world of difference.

Here is what fixed this for me.

image

And with that, I can build using:

image

Hurray! That is enough for now, I guess. I’ll get things actually working in another post.

time to read 2 min | 274 words

After trying (and failing) to use rustls to handle client authentication, I tried to use the rust-openssl bindings. It crapped out on me with a really scary link error. I spent some time trying to figure out what was going on, but given that I wanted to write Rust code, not deal with link errors, I decided to see if the final alternative in the Rust ecosystem would work: the native-tls package.

And… that is a no go as well. Which is sad, because the actual API was quite nice. The reason it isn’t going to work? The native-tls package just has no support for client certificate authentication when running as a server, so not usable for me.

That leaves me with strike three out of three:

  • rustls – native Rust API, easy to work with, but doesn’t allow accepting arbitrary client certificates, only ones from known issuers.
  • rust-openssl – I have built this on top of OpenSSL before, so I know it works. However, trying to build it on Windows resulted in link errors, so that was out.
  • native-tls – doesn’t have support for client certificates, so not usable.

I think that at this point, I have three paths available to me:

  • Give up and maybe try doing something else with Rust.
  • Fork rustls and add support for accepting arbitrary client certificates. I’m not happy with this because it requires changing not just rustls but also probably webpki package and I’m unsure if the changes I have in mind will not hurt the security of the system.
  • Try to fix the OpenSSL link issue.

I think that I’ll go with the third option, but this is really annoying.

time to read 6 min | 1017 words

The task that I have for now is to add client authentication via X509 client certificate. That is both obvious and non obvious, unfortunately. I’ll get to that, but before I do so, I want to go back to the previous post and discuss this piece of code:

I’ll admit that I’m enjoying exploring Rust features, so I don’t know how idiomatic this code is, but it is certainly dense. This basically does the setup for a TCP listener and setting up of the TLS details so we can accept a connection.

Rust allows us to define local functions (inside a parent function), this is mostly just a way to define a private function, since the nested function has no access to the parent scope. The open_cert_file function is just a way to avoid code duplication, but it is an interesting one. It is a generic function that accepts an open ended function of its own. Basically, it will open a file, read it and then pass it to the function it was provided. There is some error handling, but that is pretty much it.
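A small sketch of that pattern: a nested generic function that reads its input and hands the text to whatever parser it was given. The names `load_config` and `read_with` are made up, and the helper takes any reader rather than opening a file by path, an assumption made to keep the example self-contained.

```rust
use std::io::{self, Read};

fn load_config<R: Read>(certs: R, key: R) -> io::Result<(usize, usize)> {
    // Nested helper: private to `load_config`, and unlike a closure it cannot
    // capture variables from the enclosing function.
    fn read_with<S, T, F>(mut source: S, parse: F) -> io::Result<T>
    where
        S: Read,
        F: FnOnce(&str) -> Option<T>,
    {
        let mut text = String::new();
        source.read_to_string(&mut text)?;
        parse(&text)
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "could not parse input"))
    }

    // Hypothetical parsers: they only count lines and characters to keep the sketch small.
    let cert_count = read_with(certs, |t| Some(t.lines().count()))?;
    let key_len = read_with(key, |t| t.lines().next().map(|l| l.len()))?;
    Ok((cert_count, key_len))
}
```

The helper is written once and reused for both inputs; only the closure passed as `parse` differs, which is where the duplication would otherwise be.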

The next fun part happens when we want to read the certs and key file. The certs file is easy, it can only ever appear in a single format, but the key may be either PKCS8 or RSA Private Key. And unlike the certs, where we expect to get a collection, we need to get just a single value. To handle that we have:

image

First, we try to open and read the file as a RSA Private Key, if that isn’t successful, we’ll attempt to read it as PKCS8 file. If either of those attempts was successful, we’ll try to get the first key, clone it and return.  However, if there was an error in any part of the process, we abort the whole thing (and exit the function with an error).
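The flow described above can be sketched as one chained expression. The parsers below are toy stand-ins, not the rustls ones: they only look for the PEM header line, return placeholder bytes instead of key material, and report Err(()) when nothing matches, which is a simplification made for this example.

```rust
#[derive(Clone, Debug, PartialEq)]
struct PrivateKey(Vec<u8>);

// Toy parser: one "key" per matching PEM header, Err(()) if there is none.
fn parse_blocks(pem: &str, label: &str) -> Result<Vec<PrivateKey>, ()> {
    let header = format!("-----BEGIN {}-----", label);
    let keys: Vec<PrivateKey> = pem
        .lines()
        .filter(|line| line.trim() == header)
        .map(|_| PrivateKey(label.as_bytes().to_vec())) // placeholder bytes only
        .collect();
    if keys.is_empty() {
        Err(())
    } else {
        Ok(keys)
    }
}

fn rsa_private_keys(pem: &str) -> Result<Vec<PrivateKey>, ()> {
    parse_blocks(pem, "RSA PRIVATE KEY")
}

fn pkcs8_private_keys(pem: &str) -> Result<Vec<PrivateKey>, ()> {
    parse_blocks(pem, "PRIVATE KEY")
}

fn load_private_key(pem: &str) -> Result<PrivateKey, String> {
    rsa_private_keys(pem)
        // Fall back to the second format only if the first attempt failed.
        .or_else(|_| pkcs8_private_keys(pem))
        // Any failure so far exits the function with an error.
        .map_err(|_| "no private key in a known format".to_string())?
        // Only a single key is needed, so take the first and clone it.
        .first()
        .cloned()
        .ok_or_else(|| "key file contained no keys".to_string())
}
```

It reads top to bottom as a sentence, but there is nowhere to put a breakpoint between the steps, which is the debugging difficulty discussed next.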

From my review of Rust code, it looks like this isn’t non idiomatic code, although I’m not sure I would call it idiomatic at this point.  The problem with this code is that it is pretty fun to write, when you read it is obvious what is going on, but it is really hard to actually debug this. There is too much going on in too little space and it is not easy to follow in a debugger.

The rest of the code is boring, so I’m going to skip that and start talking about why client authentication is going to be interesting. Here is the core of the problem:

image

In order to simplify my life, I’m using the rustls’ Stream to handle transparent encryption and decryption. This is similar to how I would do it when using C#, for example. However, the stream interface doesn’t have any way for me to handle this explicitly. Luckily, I was able to dive into the code and I think that given the architecture present, I can invoke the handshake manually on the ServerSession and then hand off the session as is to the stream.

What I actually had to do was to setup client authentication here:

image

And then manually complete the handshake first:

image

And this is when I ran into a problem. When trying to connect via my client certificate, I got the following error:

image

I’m assuming that this is because rustls is actually verifying the certificate against PKI, which is not something that I want. I don’t like to use PKI for this, instead, I want to register the allowed certificates thumbprints, but first I need to figure out how to make rustls accept any kind of client certificate. I’m afraid that this means that I have to break out the debugger again and dive into the code to figure out where we are being rejected and why…

After a short travel in the code, I got to something that looks promising:

image

This looks like something that I should be able to control to see if I like or dislike the certificate. Going inside it, it looks like I was right:

image

I think that I should be able to write an implementation of this that would do the validation without checking for the issuer. However, it looks like my plan ran into a snag, see:

image

I’m not sure that I’m a good person to talk about the details of X509 certificate validation. In this case, I think that I could have done enough to validate that the cert is valid enough for my needs, but it seems like there isn’t a way to actually provide another implementation of the ClientCertVerifier, because the entire package is private. I guess that this is as far as I can use rustls. I’m going to move to the OpenSSL binding, which I’m more familiar with, and see how that works for me.

Okay, I tried using the rust OpenSSL bindings, and here is what I got:

image

So this is some sort of link error, and I could spend half a day to try to resolve it, or just give up on this for now. Looking around, it looks like there is also something called native-tls for Rust, so I might take a peek at it tomorrow.
