Security analysis on error reporting
When talking about the security errors we generate in RavenDB 4.0, we got some really good comments, which are worth discussing. The following are some of the interesting tidbits from the comments there.
- can this behavior help some malevolent user trying to hack the system?
- at least make it an admin setting so by default it just gives a 403 and you have to turn on the detailed error reporting.
- I'm not questioning the technical decision, but simply the fact to returning info to the user. simply logging this error server-side, or activate this behavior to a specific admin setting as Paul suggests, looks more "safe" to me.
- often when you log to an account with wrong user or password, error message don't specify of password or username is wrong. just an example.
Let me start from scratch and outline the problem. We are using an X.509 certificate to authenticate against the server. The user may use a wrong / expired / invalid certificate, the user may use no certificate at all, or the user may use a valid certificate but attempt to do something that they don’t have access to.
By default, we can just reject the connection attempt, which will result in the following error:
We consider such an error utterly unacceptable. So we have to accept the connection, figure out what the higher level protocol is (usually HTTP, but sometimes our own internal TCP protocol) and send an error that the user can understand. In particular, we send a 403 HTTP error back with a message saying what the error is.
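To make the "accept, then explain" idea concrete, here is a minimal sketch in Python (not RavenDB's actual code). The function name `build_rejection`, the JSON field names and the newline-terminated framing for the internal TCP case are all assumptions made for illustration; the only thing taken from the text above is that HTTP clients get a 403 with a readable message.

```python
import json


def build_rejection(protocol: str, message: str) -> bytes:
    """Render an authentication failure in a form the client can read.

    `protocol` is whatever the server detected after accepting the
    connection: "http" or "tcp" (hypothetical labels for this sketch).
    """
    if protocol == "http":
        # Plain HTTP/1.1 403 response carrying the explanation as JSON.
        body = json.dumps({"Message": message}).encode("utf-8")
        head = (
            "HTTP/1.1 403 Forbidden\r\n"
            "Content-Type: application/json; charset=utf-8\r\n"
            f"Content-Length: {len(body)}\r\n"
            "Connection: close\r\n"
            "\r\n"
        ).encode("ascii")
        return head + body
    if protocol == "tcp":
        # Assumed framing: a single JSON document followed by a newline.
        payload = {"Status": "Forbidden", "Message": message}
        return json.dumps(payload).encode("utf-8") + b"\n"
    raise ValueError(f"unknown protocol: {protocol!r}")
```

The point of the sketch is only that the rejection happens at the application protocol level, where a message can be attached, rather than at the transport level, where the client sees nothing but a dropped connection.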
The worry is that there is some information disclosure inherent in the error message. Let us analyze that:
- a request without a certificate. We error and announce that they didn’t use a certificate, and that one is required. There is no information here that the user is not already in possession of. They may not be aware of it, but they are in possession of the fact that they aren’t using a certificate.
- a request with an expired certificate. We error and let the user know that. The user already has the certificate, and therefore can already figure out that it is expired. We don’t disclose any new information here, except that the user may try to use this to figure out what the server time is. This is generally not sensitive information, and it is safe to assume that it is close to what the rest of the world believes to be the current time, so I don’t see anything here that should worry us.
- a request with a certificate that the server doesn’t know about. This results in an error saying that the server doesn’t know about the certificate. Here we get into an actual information disclosure issue. We let the user know that the certificate isn’t known. But what can they gain from this?
In the case of username / password, we always say that the username or password is incorrect. We shouldn’t let the user know that a username exists and that just the password is wrong, because that provides the user with information that they didn’t have (that the username is correct). Now, in practical terms, that secrecy rarely holds, since you have password reset links and they tend to tell you if the email / username you want to reset the password for is valid.
However, with a certificate, there aren’t two pieces of information here, there is just one, so we don’t provide any additional information.
- a request with a certificate to do an operation that the certificate isn’t authorized to do. This will result in one of the following errors:
There are a few things to note here. First, we don’t disclose any information beyond what the user has already provided us. The operation and the database are both provided by the user, and we use the FriendlyName so the user can tell what the certificate in question was.
Note that this check runs before we check whether the database actually exists on the server, so there isn’t any information disclosed about the existence of databases either.
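The four cases above can be sketched as a single check, shown here in Python. This is an illustration of the reasoning, not RavenDB's implementation: the `ClientCertificate` type, the `check_request` function, the per-database permission set and the exact message wording are all assumptions. What it tries to show is that every message is built only from things the caller supplied (the certificate, the operation, the database name), and that the authorization check never consults the list of databases that actually exist.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, Optional, Set


@dataclass
class ClientCertificate:
    # Hypothetical shape of a certificate as the server sees it.
    thumbprint: str
    friendly_name: str
    not_after: datetime
    allowed_databases: Set[str] = field(default_factory=set)


def check_request(
    cert: Optional[ClientCertificate],
    known: Dict[str, ClientCertificate],
    database: str,
    operation: str,
    now: datetime,
) -> Optional[str]:
    """Return None when the request may proceed, else the 403 message."""
    if cert is None:
        return "This server requires a client certificate, but none was provided."
    if now > cert.not_after:
        # The caller holds the certificate, so its expiry date is not news.
        return f"The supplied certificate expired on {cert.not_after:%Y-%m-%d}."
    registered = known.get(cert.thumbprint)
    if registered is None:
        return "The supplied certificate is not known to this server."
    # Authorization is decided from the registered permissions only;
    # whether `database` exists on the server is never looked up here.
    if database not in registered.allowed_databases:
        return (
            f"The certificate '{registered.friendly_name}' is not authorized "
            f"to perform '{operation}' on database '{database}'."
        )
    return None
```

For example, a known certificate asking for a database it has no access to gets the same answer whether or not that database exists:

```python
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
app = ClientCertificate("AB12", "reports-app",
                        datetime(2025, 1, 1, tzinfo=timezone.utc), {"Orders"})
print(check_request(app, {"AB12": app}, "NoSuchDb", "write", now))
```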
Given that the user tried to perform an operation that they are not allowed to, we need to reject that operation (honey pot behavior is a totally separate issue and probably shouldn’t be part of any sane API design). Given that we reject the operation, we need to provide clear and concise errors for the user, rather than the CONNECTION REFUSED error. Given the kind of errors we generate, I believe that they provide sufficient information for the user to figure out what is going on.
As to whether we should only log this / hide this behind a configuration setting, that is really confusing to me. We are already logging rejected connections, because the admin may be interested in reviewing them. But requiring the user to call the admin and have them look at some obscure log file is bad design in terms of usability. The same is true for hiding this behind a configuration option. Either this is a secure solution, and we can report these errors, or we have to put the system into a known unsecured state (in production) just to be able to debug a connection issue. I would be far more afraid of that scenario, especially since that would be the very first thing that an admin would do any time there is a connection issue.
So this is the default (and only) behavior, because we have verified that this is both a secure solution and a sane one.