Writing SSL ProxyPart II, delegating authentication

time to read 4 min | 681 words

I mentioned that RavenDB 4.0 uses x509 client certificate for authentication, right? As it turns out, this can create some issues for us when we need to do more than just blind routing to the right location.

Imagine that our proxy is setup in front of the RavenDB Docker Swarm not just for handling routing but to also apply some sort of business logic. It can be that you want to do billing per client basis on their usage, or maybe even inspect the incoming data into RavenDB to protect against XSS (don’t ask, please). But those are strange requirements. Let us go with something that you can probably emphasize with more easily.

We want to have a RavenDB cluster that uses a Let’s Encrypt certificate, but that certificate has a very short life time, typically around 3 months. So you probably don’t want to setup these certificates within RavenDB itself, because you’ll be replacing them all the time. So we want to write a proxy that would handle the entire process of fetching, renewing and managing Let’s Encrypt certificates for our database, but the certificates that the RavenDB cluster will use are internal ones, with much longer expiration times.

So far, so good. Except…

The problem that we have here is that here we have a problem. Previously, we used the SNI extension in our proxy to know where we are going to route the connection, but now we have different certificates for the proxy and for the RavenDB server. This means that if we’ll try to just pass the connection through to the RavenDB node, the client will detect that it isn’t using a trusted certificate and fail the request. On the other hand, if we terminated the SSL connection at the proxy, we have another issue, we use x509 client certificate for ensuring that the user actually have the access they desire. And we can’t just pass the client certificate forward, since we terminated the SSL connection.

Luckily, we don’t have to deal with a true man in the middle simulation here, because we can configure the RavenDB server to trust the proxy. All we are left now is to figure out how the proxy can tell the RavenDB server what is the client certificate that the proxy authenticated. A common way to do that is to send the client certificate details over in a header, and that would work, but…

Sending the certificate details in a header has two issues for us. First, it would mean that we need to actually parse and mutate the incoming data stream. Not that big a deal, but it is something that I would like to avoid if possible. Second, and more crucial for us, we don’t want to have to validate the certificate on each and every request. What we want to do is take advantage on the fact that connections are reused and do all the authentication checks once, when the client connects to the server. Authentication doesn’t cost too much, but when you are aiming at tens and thousands of requests a second, you want to reduce costs as much as possible.

So we have two problems, but we can solve them together. Given that the RavenDB server can be configured to trust the proxy, we are going to do the following. Terminate the SSL connection at the proxy, and validate the client certificate (just validate the certificate, not check permissions or such) and then the magic happens. The proxy will generate a new certificate, signed with the proxy own key and registering the original client certificate thumbprint in the new client certificate (caching that certificate, obviously). Then the proxy route the request to its destination, signed with its own client certificate. The RavenDB server will recognize that this is a proxied certificate, pull the original certificate thumbprint from the proxied client certificate and use that to verify the permissions to assign to the user.

The proxy can then manage things like refreshing the certificates from Let’s Encrypt and RavenDB can get proxied requests.

More posts in "Writing SSL Proxy" series:

  1. (27 Sep 2017) Part II, delegating authentication
  2. (26 Sep 2017) Part I, routing