Ayende @ Rahien

Oren Eini aka Ayende Rahien CEO of Hibernating Rhinos LTD, which develops RavenDB, a NoSQL Open Source Document Database.

You can reach me by:

oren@ravendb.net

+972 52-548-6969

, @ Q j

Posts: 6,759 | Comments: 48,829

filter by tags archive
time to read 6 min | 1064 words

Now that I have a good idea on how to use OpenSSL and libuv together, I’m going to change my code to support that mode of operation. I have already thought about this a lot, and the code I already have is ready to receive the change in behavior, I think.

One of the things that I’m going to try to do while I move the code over is properly handle all error conditions. We’ll see how that goes.

I already have the concept of a server_state_run() method that handles all the network activity, dispatching,  etc. So that should make it easy. I’m going to start by moving all the libuv code there. I’m also going to take the time to refactor everything to an API that is more cohesive and easier to deal with.

There is some trouble here, with having to merge together two similar (but not quite identical) concepts. My libuv & openssl post dealt with simply exposing a byte stream to the calling code. My network protocol code is working at a higher level. Initially I tried to layer things together, but that quickly turned out to be a bad idea. I decided to have a single layer that handles both the reading from the network, using OpenSSL and parsing the commands over the network.

The first thing to do was to merge the connection state, I ended up with this code:

There are a few things that are interesting here. On the one hand, I want to keep the state of the connection private, but on the other, we need to expose this out to the user to use some parts of it. The way libuv handles it is with comments denoting what are considered public / private portions of the interface. I decided to stick it in a dedicated struct. This also allowed me to get the size of the private members, which is important for what I wanted to do next.

The connection state struct have the following sections:

  • private / reserved – 64 bytes
  • available for user to use – 64 bytes (and aligned on 64 bytes boundary)
  • msg buffer – 8,064 bytes

The idea here is that we give the user some space to keep their own data in, and that the overall connection state size is exactly 8KB, so can fit in two OS pages. On Linux, in most cases, we’ll not need a buffer that is over 3,968 bytes long, we can even save the second page materialization (because the OS lazily allocate memory to the process). I’m using 64 bytes alignment for the user’s data to reduce any issues that the user have for storing data about the connection. It will also keep it nicely within the data the user need to handle the connection nearby the actual buffer.

I’m 99% sure that I won’t need any of these details, but I thought it is best to think ahead, and it was fun to experiment.

Here is how the startup code for the server changed:

I removed pretty much all the functions that were previously used to build it. We have the server_state_init_t struct, which contains everything that is required for the server to run. Reducing the number of functions to build this means that I have to do less and there is a lot less error checking to go through. Most of the code that I had to touch didn’t require anything interesting. Take the code from the libuv/openssl project, make sure it compiles, etc. I’m going to skip talking about the boring stuff.

I did run into an couple of issues that are worth talking about. Error handling and authentication. As mentioned, I’m using client certificates for authentication, but unlike my previous code, I’m not explicitly calling SSL_accept(), instead, I rely on OpenSSL to manage the state directly.

This means that I don’t have a good location to put the checks on the client certificate that is used. For that matter, our protocol starts with the server sending an: “OK\r\n” message to the client to indicate successful connection. Where does this go? I put all of this code inside the handle_read() method.

This method is called whenever libuv has more data to give us on the connection. The actual behavior is on ensure_connection_intialized(), where we check a flag on the connection, and if we haven’t done the initialization of the connection, we check i OpenSSL consider the connection established. If it is established, we validate the connection and then send the OK to start the ball rolling.

You might have noticed a bunch of work with flags CONNECTION_STATUS_WRITE_AND_ABORT and CONNECTION_STATUS_INIT_DONE. What is that about?

Well, CONNECTION_STATUS_INIT_DONE is self explanatory, I hope. This just tells us whatever the connection has already been checked or not. This save us the cost of validate the client cert of each packet. Usually, SSL handshake means that we could do this check only inside the “need to read more from the network”, but I think that there are certain communication patterns in which the SSL handshake could be completed and the packet will already have additional encrypted information for the connection. For example, I’m pretty sure that TLS 1.3 0-RTT is one such case. This is why the ensure_connection_initialized() is called twice in the code.

Of more interest is the CONNECTION_STATUS_WRITE_AND_ABORT flag. This is set in one of two locations. First, if we fail to validate the certificate for the connection. Second, if we failed to process the message that was sent to us (inside read_message()).

In either case, we want to close the connection, but we have a problem: Error handling. We use libuv for all I/O, and that is asynchronous in nature. We want to write an error to the other side, to be nice, and in order to do that, we need to process the write and keep the connection around long enough that we’ll actually send it to the other side. Because of this, when this flag is set, we have the following behaviors:

  • Any newly available data on that connection is immediately discarded
  • The next write will flush all the data to the network, wait for confirmation that this was sent and close the connection.

This works very nicely to allow me to abort on an error and still get really nice errors on the other side.

As usual, you can read the full code for the network protocol for this post here.

time to read 3 min | 490 words

Now that I’m actually doing real work with input from the network, I thought it would be a good time to stop and take a look at whatever I’m exposing stuff. C is known for buffer overruns and security issues, and compounding that with network software that accepts untrusted input, that is something that we should take a look at.

The first line of defense is to use Valgrind and see if it reports any errors. It reported a memory leak (I didn’t free the command’s buffer, it seemed), which was easy to fix. But it also reported a much more serious issue:

Conditional jump or move depends on uninitialised value(s)

This is not something that I wanted to see. The nice thing about Valgrind is that it prints a nice stack trace, even if in this case, it didn’t make any sense. It was deep inside the strcmp() function, which I assume is fine. I dug around on how this warning in implemented before I got what was going on. Basically, I was handing strcmp memory that was never initialized, which caused this warning.

Here is the relevant piece of the code:

Take a look and see if you can see the error.

It might be helpful if I told you that this is an error that would only be raised if I’m writing to the stream manually, not via any client API.

It took me a while to figure out what was going on. This piece of code is meant to be called multiple times, building a single buffer (of up to 8KB in size) from potentially multiple network reads.

I had a small optimization there to avoid scanning from the beginning of the string and just scan from where we already scanned. This is the to_scan variable. As it turned out, this darling had nasty consequences.

Look at line 7 in the code sample, I’m telling strnstr() to start reading from the string from the specified position, but I pass the original size. I read past the end of the buffer. Likely still in my own memory, but that would almost certainly have caused issues down the road, and it is easy to construct a sequence of operations that would cause me to thing that a message is over when I haven’t finished actually sending it (reading the \r\n\r\n divider from a previous message).

Once that was fixed, it was green across the board. But I’m not a C programmer, and I’m not sure if there are other stuff that I should be looking at. I’m using string functions with explicit length, doing proper error checking, etc. Code review for this hasn’t show any issue, but I’m sure that there are more stuff there.

The actual code is about 100 lines of C code that I think is fairly straightforward. I would be very happy to hear what else I can do to this piece of code to increase my confidence in it.

time to read 3 min | 549 words

So far, I did a whole lot of work around building the basic infrastructure of just building a trivial echo server with SSL. But the protocol I have in mind is a lot more complex, let’s get started with actually implementing the parsing of messages.

To start with, we need to implement parsing of lines. In C, this is actually a decidedly non trivial operation, because you need to read the data from the network into someplace and parse it. This area is rife with errors, so that is going to be fun.

Here is a simple raw message:

GET employees/1-A employees/2-B
Timeout: 30
Sequence: 293
Include: ReportsTo

The structure goes:

CMD args1 argN\r\n

And then header lines with:

Name: value\r\n

The final end of the message is \r\n\r\n.

To make things simple for myself, I’m going to define the maximum size of a message as 8KB (this is the common size in HTTP as well). Here is how I decided to represent it in memory:

image

The key here is that I want to minimize the amount of work and complexity that I need to do. That is why the entire message is limited to 8KB. I’m also simplifying how I’m going to be handling things from an API perspective. All the strings are actually C strings, null terminated, and I’m using the argv, argc convention for naming, just like in the main function.

This means that I can simply read from the network until I find a “\r\n\r\n” in there. Here is how I do this:

There is a bit of code here, but the gist of it is pretty simple. The problem is that I need to handle partial state. That is, a single message may come in two separate packets, or multiple messages may come in a single packet. I don’t have a way to control that, so I need to be careful about tracking past state. The connection has a buffer that is used to hold the state in memory, whose size is large enough to hold the largest possible message. I’m reading from the network to a buffer and then scanning to find the message separator.

If I couldn’t find it, I’m recording the last location where it could be starting, and then issuing another network read and will try searching for \r\n\r\n again. Once that is found, the code will call to the parse_commnad() method that operates over the entire command in memory (which is much easier). With that done, my message parsing is actually quite easy, from a conceptual point of view, although I’ll admit that C make it a bit long.

I’m copying the memory from the network buffer to my own location, this is important because the read_message() function will overwrite it in a bit, and it also allow me to modify the memory more easily, which is required for using strtok(). This basically allow me to tokenize the message into its component parts. First on a line by line basis (with splitting on space for the first line and then treating this as headers lines).

I added the ability to reply to a command, which means that we are pretty much almost done. You can see the current state of the code here.

time to read 4 min | 638 words

I decided that this is time to take my network protocol and make it cross platform, so I tried to compile it on the Linux subsystem for Windows. As an aside, the fact that I could edit everything in Visual Studio while compiling using GCC and having immediate feedback is amazing, given the usual workflow that this entails. And I would very much like to hear about an IDE that is comparable to Visual Studio out there.

I got to say, the situation for dependencies on C/C++ is flat out horrible. I’m depending on OpenSSL for this code, and I have used VCPkg for actually setting up the dependency. On WSL, on the other hand, the OpenSSL packages are from 2014(!), so they were incompatible. I re-compiled the lastest stable on WSL and tried to get it to work. It didn’t, first it didn’t put the new code in the right place and when I forced it to use the right paths, it failed with missing methods. It looks like I’m spoiled from the point of backward compatibility, but several methods has been deprecated or flat our removed.

I fixed that stuff only to find out that what the WSL version requires is actually newer than what the OpenSSL version I have on the Winodws machine has. I could compile the latest on Windows as well but for now it was just easier to conditional compile the stuff that I needed. I find it sad, but I want to get things done.

After a lot of grunt work, I’m happy to say that this looks like it is actually working. You can find the code for that in this link. I also wrote a small C# code to connect to the server, which looks like:

I gotta admit, the difference in the lines of code and complexity between the two code bases is pretty staggering.

I have to admit, I had an ulterior motive behind wanting to run on Linux, I wanted to see just how badly I managed to manage memory in this short bit of C code, and Valgrind is one of the more common tools to do that.

Of course, Valgrind doesn’t run on WSL, so I had to build that from source as well. I have to say, I was really careful about the actual memory usage and freeing everything. Valgrind was still able to find an issue within the first two minutes of me running it. I got to say, I’m really impressed. Here is the leak:

image

As you can see, this is happening because I’m using client certificates and it is deep inside OpenSSL (narrator voice: it wasn’t a problem with OpenSSL).

This was a bit of a problem to try to figure out, because I was really careful and properly match each allocation call with the call to release it (in this case, SSL_new() with SSL_free()). I went over the code to establish and tear down a connection many times, but couldn’t see any problem.

The issue here is that Valgrind, in this case, shows me where the memory was allocate, but the leaks is actually elsewhere. OpenSSL implements reference counting for many objects, and as part of the client certificate usage I’m getting the client’s certificate an examining it. The act of getting the certificate increment the ref count on that certificate. I have to explicitly release that certificate once I’m done with it. Naturally Valgrind wasn’t able to figure all of that and just told me that I actually have a leak.

Lesson learned, look at the allocation stack, but don’t marry it or assume it is known correct. Here is the latest drop of code, which is able to pass Valgrind with no leaks detected.

time to read 3 min | 443 words

Initially I thought that my next step would be to write the code to handle more than a single connection at a time, but I decided that this isn’t the natural next step. It would be better to build the API and abstractions that are required to use this network protocol before committing to an actual implementation. I took a peek at what is available, and it looks like we have the following options:

  • Thread per connection – obviously out.
  • select() / pool() – allow to have a single thread manage multiple connections, but require a lot of state management.
  • libuv – or similar, that handle all of those details for me.

Regardless of what I’ll end up in the end, I think that the externally facing API is always going to be the same, so I’m going to focus on that. Right now we have:

image

In other words, we have an interface that explicitly require us to manage state, manage the socket, etc. I think the first thing to do would be to change that. The first thing I did was to remove the connection_create and connection_read. They are now handled internally and no longer the concern of the calling code. Instead, we have:

image

This is basic OO in C, we define a bunch of function pointers and allow the user to send them to us to be trigger appropriately. On connection create, we’ll call connection_create(), and the user can return a state object that will be used in future callbacks. I’m currently using connection_recv() that accepts a raw buffer, because I haven’t dealt with parsing yet.

Note the call to server_state_run(), which actually does all the work. I’m currently just going to take whatever we already have and convert to this API.

That turned out to be fairly straightforward to do, I’m happy to say. I did find that working on C code after midnight has negative implications, I wrote:

image

Do you see the bug? The first time I’m adding a new cert, I’m going to allocate a single byte, then I’m going to write the full certificate_thumbprint to that memory, well beyond the buffer I actually allocated. The fix?

image

That’s simple enough, but it was an unpleasant thing to realize that I made this error.

time to read 4 min | 769 words

Getting errors from SSL isn’t easy. Sometimes, I think that so much encryption has wrapped things up and error reporting are treated as secret information that must be withheld. The root of the problem is that SSL doesn’t really have a way for the two communicating parties to tell each other: “I don’t trust you because you wear glasses”. There are some well known error codes, but for the most part, you’ll get a connection abort. I want to see what it would take to provide good error handling for network protocol using SSL that handles:

  • No certificate provided
  • Expired / not yet valid certificate provided
  • Unfamiliar certificate provided

In order to do that, we must provide this error handling at a higher level than SSL. Therefor, we need to provide something in a higher layer. In this protocol, the first thing that the server will send to the client on connection will be: “OK\r\n” if everything is okay, or some error string that will explain the issue, otherwise. This turned out to be rather involved, actually.

First, I had to ask OpenSSL to send the “please gimme a client cert”:

image

Note the callback? It will accept everything, since doing otherwise means that we are going to just disconnect with very bad error experience. After completing the SSL_accept() call, I’m explicitly checking the client certificate, like so:

image

I have to say, the sheer amount of code (and explicit error handling) can be exhausting. And the fact that C is a low level language is something that I certainly feel. For example, I have this piece of code:

image

This took a long time to write. And it should have been near nothing, to be honest. We get the digest of the certificate, convert it from raw bytes to hex and then do a case insensitive comparison on the registered certificates that we already know.

To be honest, that piece of code was stupidly hard to write. I messed up and forgot that snprintf() will always insert a null terminator and I specified that it should insert only 2 characters. That led to some confusion, I have to say.

The last thing that this piece of code does is scan through the registered certificates and figure out if we are familiar with the one the client is using. You might note that this is a linked list, because this is pretty much the only data structure that you get in C. The good thing here is that it is easy to hide the underlying implementation, and this is a just some code that I write for fun. I don’t expect to have a lot of certificates to deal with, nor do I expect o have to optimize the connection portion of this code, but it bothers me.

In C#, I would use a dictionary and forget about it. In C, I need to make a decision on what dictionary to use. That adds a lot of friction to the process, to be honest. If I wasn’t using a dictionary, I would use a List<T>, which has the advantage that it is contiguous in memory and fast to scan. For completion’s sake, here is the code that register a certificate thumbprint:

image

As you can see, this isn’t really something that interesting. But even though it doesn’t really matter, I wonder how much effort it will take to avoid the usage of a link list… I think that I can do that without too much hassle with realloc. Let’s see, here is the new registration code:

image

And here is how I’m going to be iterating on it:

image

So that isn’t too bad, and we can be sure that this will be all packed together in memory. Of course, this is pretty silly, since if I have enough certificates to want to benefits from sequential scan performance, I might as well use a hash…

Next topic I want to tackle is actually handling more than a single connected client at a time, but that will be for the next post. The code for this one is here.

time to read 3 min | 406 words

In my last post, I introduced a basic error handling mechanism for C API. The code I showed had a small memory leak in it. I love that fact about it, because it is hidden away and probably won’t show up very easily for production code until very late in the game. Here is the fix:

image

The problem was that I was freeing the error struct, but wasn’t freeing the actual error message. That is something that is very easy to miss, unfortunately. Especially as we assume that errors are rare. With that minor issue out of the way, let’s look at how we actually write the code. Here is the updated version of creating a connection, with proper error handling:

image

The calling code will check if it got NULL and then can decide whatever it should add it’s own error (for context) and return a failure to its own caller or handle it. The cleanup stuff is still annoying with goto, but I don’t believe that there is much that can be done about it.

I’m going to refactored things a bit, so I have proper separation and an explicit API. Here is what the API looks like now:

image

This is basically a wrapper around all the things we need to around SSL. It handles one time initialization and any other necessary work that is required. Note that I don’t actually expose any state outside, instead just forward declaring the state struct and leaving it at that. In addition to the overall server state, there is also a single connection state, whose API looks like this:

image

Now, let’s bring it all together and see how it works.

This is a pretty trivial echo server, I’ll admit, but it actually handles things like SSL setup, good error handling and give us a good baseline to start from.

The code for this series can be found here and this post’s code is this commit. Next, I want to see what it would take to setup client authentication with OpenSSL.

time to read 4 min | 792 words

As part of my usual routine, I’m trying out writing some code in C, to get a feeling for a different environment. I wanted to build something that is both small enough to complete in a reasonable time and complex enough that it would allow to really explore how to use things. I decided to use C (not C++) because it is both familiar and drastically different from what I usually do. The project in question? Implementing the network protocol I wrote about here.  Another part of the challenge that I set out to myself, this should be as production quality code as I could make it, which means paying all the usual taxes you would expect.

The first thing that I had to do was to figure out how to actually networking and SLL working in C. Something that would take me 5 minutes in C# took me a several hours of exploring and figuring things out. Eventually, I settled down on the obvious choice for SSL with OpenSSL. That is portable, reasonably well documented and seems fairly easy to get started with.

Here is some code from the sample TCP server from the OpenSSL documentation:

image

This is interesting, it show what needs to be done and does it quite clearly. Unfortunately, this isn’t production quality code, there are a lot of stuff here that can go wrong that we need to handle. Let’s wrap things in proper functions. The first thing to do is to capture the state of the connection (both its socket and the SSL context associated with it). I created a simple structure to hold that, and here is how I close it.

image

So far, pretty simple. However, take a look on the code that creates the connection:

image

As you can see, quite a lot of this function is error handling and cleanup. This code looks like it does the right thing and cleanup after itself in all cases. So far, so good, but we are still missing a very important component. We handled the error, but we haven’t reported it. In other words, from the outside, any failure will look exactly the same to the caller. That is not a good thing if you want to create software that is expressive and will tell you what is wrong so you can fix it.

If this was C#, I would be throwing an exception with the right message. As this is C, we run into some interesting issues. As part of your error handling, you might run into an error, after all… In particular, good error handling usually requires string formatting, and that can cause issues (for example, being unable to allocate memory).  There is also the issue of who frees the memory allocated for errors, of course.

Typically, you’ll see code that either prints to the console or to a log file and it is usually a major PITA. OpenSSL uses a thread local error queue for this purpose, which gives you the ability to hold a context, but it requires an awful lot of ceremony to use and doesn’t seem to be useful for generic error handling. I decided to see if I can do a quick and dirty approach to the same problem, with something that is slightly more generic.

The purpose was to get reasonable error handling strategy without too much hassle. Here is what I came up with. This is a bit much, but I’ll explain it all in a bit.

image

Whenever we get an error, we can call push_error. Note that I included an error code there as well, in addition to the string. This is if there will be a need to do programtic error handling, for example, to handle a missing file that is reported “upstairs”.

I also included how we get errors out, which is pretty simple. You can figure out from here how you would handle error handling in code, I’m going to assume.

Here is how this is used:

image

And the output that this will print to the console would be:

consoleapplication4.cpp:211 - main() - 1065 can't open db: northwind
consoleapplication4.cpp:210 - main() - 2 cannot open file: raven.db

And yay, I invented exceptions!

Also, I have a memory leak there, can you find it?

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. Design exercise (3):
    19 Dec 2018 - Distributing (consistent) data at scale, answer
  2. Refactoring C code (8):
    12 Dec 2018 - Going to async I/O
  3. RavenDB 4.2 Features (2):
    28 Nov 2018 - Let’s get colorful
  4. Production postmortem (23):
    23 Nov 2018 - The ARM is killing me
  5. Graphs in RavenDB (11):
    08 Nov 2018 - Real world use cases
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats