Writing SSL ProxyPart I, routing
RavenDB 4.0 uses x509 client certificates for authentication. That is good, because it means that we get both encryption and authentication on both ends, but it does make is more complex to handle some deployment scenarios. It turns out that there is quite a big demand for doing things to the data that goes to and from RavenDB.
We’ll start with the simpler case, of having dynamic deployment on Docker, with nodes that may be moved from location to location. Instead of exposing the nastiness of the internal network to the outside world with URLs such as (https://129-123-312-1.rvn-srv.local:59421) we want to have nice and clean urls such as https://orders.rvn.cluster. The problem is that in order to do that, we need to put a proxy in place.
That is pretty easy when you deal with HTTP or plain TCP, but much harder when you deal with HTTPS and TLS because you also need to handle the encrypted stream. We looked at various options, such as Ngnix and Traefik as well as a peek at Squid but we rule them out for various reasons, mostly related to the deployment pattern (Ngnix doesn’t handle dynamic routing), feature set (Traefik doesn’t handle client certificates properly) and usecase (Squid seems to be much more focused on being a cache). All of them didn’t support the proper networking model we want (1:1 connection matches from client to server, which we would really like to preserve because it simplify authentication costs significantly).
So I set out to explore what it would take to build an SSL Proxy to fit our needs. The first thing I looked at was how to handle routing. Given a user that type https://orders.rvn.cluster in the browser, how does this translate to actually hitting an internal Docker instance with a totally different port and host?
The answer, as it turned out, is that this is not a new problem. One of the ways to do that is to just intercept the traffic. We can do that because in this deployment model, we control both the proxy and the server, so we can put the certificate fro “orders.rvn.cluster” in the proxy, decrypt the traffic and then forward it to the right location. That works, but it means that we have a man in the middle. Is there another option?
As it turns out, this is such a common problem that there are multiple solutions for it. These are SNI (Server Name Indication) and ALPN (Application Layer Protocol Negotiation), both of which allow allow the client to specify what they want to get from the server as part of the initial (and unencrypted) negotiation. This is pretty sweet from the point of view of the proxy, because it can make routing decisions without needing to do the TLS negotiation but not so much for the user if they are currently trying to check “super-shady.site”, since while the contents of their request is masked, the destination is not. I’m not sure how big of a security problem this is (the end IP isn’t encrypted, after all, and even if you host a thousands sites on the same server, it isn’t that big a deal to narrow it down).
Anyway, the key here is that this is possible, so let’s make this happen. The solution is almost literally pulled from the StreamExtended readme page.
We get a TCP stream from a client, and we peek into it to read the TLS header, at which point we can pull the server name out. At this point, you’ll note, we haven’t touched SSL and we can forward the stream toward its destination without needing to inspect any other content, just carrying the raw bytes.
This is great, because it means that things like client authentication can just work and authenticate against the final server without any complexity. But it can be a problem if we actually need to do something with the traffic. I’ll discuss how to handle this properly in the next post.