The cost of the authentication method
You might have noticed that we are doing a lot of work around performance. Some of it can be done by just optimizing how we perform certain operations, but in some cases optimization alone isn't enough, and a behavior change is also required.
In this case, we are talking about RavenDB's authentication method. The current flow works something like this:
- The user is making a request to a RavenDB server.
- The server requires security and asks the user to authenticate. For this case, we'll use API Keys.
- The user & server will have a side channel discussion for authentication via API Keys, which will result in a token.
- This token is sent as a header in all future requests.
- The token is cryptographically signed, so the server can validate that it is valid.
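The signed-token validation described in the last two steps can be sketched roughly like this. This is a minimal illustration using an HMAC signature; all names are hypothetical, and the actual RavenDB implementation differs:

```python
import base64
import hashlib
import hmac

SERVER_SECRET = b"server-side-signing-key"  # hypothetical signing key
SIG_LEN = hashlib.sha256().digest_size      # 32 bytes for SHA-256

def issue_token(user_name: str) -> str:
    """Sign the payload so the server can later verify it wasn't forged."""
    payload = user_name.encode("utf-8")
    signature = hmac.new(SERVER_SECRET, payload, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(payload + signature).decode("ascii")

def validate_token(token: str) -> bool:
    """Every single request pays for this HMAC computation."""
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    payload, signature = raw[:-SIG_LEN], raw[-SIG_LEN:]
    expected = hmac.new(SERVER_SECRET, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, signature)
```

The point of the sketch is the cost model: validation recomputes a cryptographic function on every request.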
So far, so good, but this does pose some issues.
To start with, we took a lot from OAuth, which means we assume there are multiple parties involved: the user, the server, and the authenticator. The cryptographic signature is meant to ensure that the server can trust a token generated by the authenticator and presented by the user.
However, in pretty much all cases, the server and the authenticator are the same. There are some special cases relating to replication and load balancing, but they aren’t relevant at this point, and we can work around them if need be.
And we’ll want to do that. The reason this is problematic is simple: right now, we need to cryptographically validate the token on every single request, and that is expensive. Of course, to some extent it is meant to be expensive; that is part of what makes it secure.
So we need to reduce this cost, and we can do so by keeping a simple token instead. Conceptually, the flow now becomes:
- The user is making a request to a RavenDB server.
- The server requires security and asks the user to authenticate. For this case, we'll use API Keys.
- The user & server will have a side channel discussion for authentication via API Keys, which will result in a token.
- This token is sent as a header in all future requests.
- The token is just a unique id (a guid), which is stored in the server’s memory.
Because the token is unique and per server, we don’t need to do any crypto validation on the value. We just check whether the value is in our memory, and that is it.
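A minimal sketch of this guid-in-memory scheme (the names here are hypothetical, not RavenDB's actual code):

```python
import uuid

# Hypothetical in-memory token store: token -> user name.
active_tokens: dict[str, str] = {}

def issue_token(user_name: str) -> str:
    token = str(uuid.uuid4())        # a unique id, no signature involved
    active_tokens[token] = user_name
    return token

def validate_token(token):
    # Authentication is now just a dictionary lookup.
    return active_tokens.get(token)
```

Because the store lives only in the server's memory, the token carries no meaning outside that server and needs no cryptographic structure.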
The first reaction we typically get is: “but what about security? Someone can re-use that guid to authenticate as someone else.” That assumes you can sniff the conversation between client and server. If you can do that, then you probably already don’t care about security, since you aren’t using HTTPS. And note that the same applies to the crypto token as well: if you manage to steal it, you can present it to the server as your own, and the server can’t tell the difference between a valid client and a malicious one.
The good part is that the authentication portion of a request is now a dictionary lookup instead of a cryptographic signature validation, and the performance of authenticated requests is much higher.
Comments
Probably a bright decision, but easy for the competitors' salesmen to attack.
Carsten, How would such an attack work?
They simply say it is not secure, refer to your blog, and talk about man-in-the-middle attacks.
Carsten, There is no new MITM. If you are using HTTPS, then you are obviously safe, which is the recommended way to do this anyway. If you aren't, someone could reuse a token, but someone can already reuse the crypto tokens in exactly the same way.
This is somewhat like WCF's establishSecurityContext=true, where SSL and KeepAlive headers were used to establish a durable session, and made life miserable for people trying to introduce Load balancing for the services.
Which brings up the question: why not use WebSockets? https://msdn.microsoft.com/en-us/library/system.net.websockets.websocket(v=vs.110).aspx
They are session based, use KeepAlive on SSL connections, authentication can be done once, and you could use them to send data both ways.
Pop, Because our entire API uses HTTP request/response, and we need each individual request to use a previous auth cycle.
So we'll auth once and get the token, and all future requests will use that token until it expires.
HTTPS is no longer obviously safe. See the Heartbleed bug https://en.wikipedia.org/wiki/Heartbleed and http://www.zdnet.com/article/microsoft-reveals-windows-vulnerable-to-freak-ssl-flaw/
https://twitter.com/bsdphk/status/532517760035979264 One broken SSL is a mistake. Two is an accident. Three is sabotage. Apple, OpenSSL + Microsoft = NSA Full House?
Maybe change the token at each request, or calculate it from a sample of the provided data.
Carsten, If you can't rely on HTTPS, you can't rely on encryption in general. At some point, you have to rely on something.
You always need two-factor security. Several backup systems, power supplies, and so on, when talking enterprise.
HTTPS can be broken by, e.g., false certificates, and is no longer considered as secure as before.
I understand that it makes sense in the RavenDB scenario. But what do you think about using this approach in a popular public API? I mean, do you think it is scalable? Is it still cheaper to go to the database to validate the token?
Juan, The easiest would be to have a service that you query for the validity of the token, sure. Or do the crypto check yourself.
If you needed to keep using an external authentication party, you could keep using regular signed tokens but keep a cache (dictionary) of verified tokens on the server, to speed up the authorization process on each request (do a lookup first, which will work most of the time).
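That caching suggestion could be sketched like this (hypothetical names; `crypto_validate` is a stand-in for the real, expensive signature check):

```python
# Cache of tokens that have already passed the expensive crypto check.
verified_tokens: set[str] = set()

def crypto_validate(token: str) -> bool:
    # Stand-in for real signature verification (assumed expensive).
    return token.endswith("-signed-ok")

def is_authenticated(token: str) -> bool:
    if token in verified_tokens:   # fast path: a set/dictionary lookup
        return True
    if crypto_validate(token):     # slow path, paid only on first sight
        verified_tokens.add(token)
        return True
    return False
```

The first request for a token pays the full crypto cost; every later request is a lookup, which is the "work most of the time" case the comment describes.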
You can make it a configurable option, so you don't risk RavenDB missing the security requirements of a new customer.
It seems you already have the code for both options.
What the default is, is up to you :-)
Carsten, Adding options is a bad idea. Every option you add doubles your test matrix, and you open yourself up to issues that only pop up with a certain set of options turned on.
Nico, Yes, this is probably what we'll do when we need to handle it
One thing you could do, if it's a sync perf issue, would be to do the calculation asynchronously on another thread and only join it right before you send the response. If you're just trying to reduce total CPU usage this wouldn't help, but if the problem is having to wait for the auth to finish before your real code starts running, then async would help. But I like your solution better. I'm amazed how many people seem to think you always need in-depth protection at every possible layer. Having worked in security, I've come to learn that perimeter defense is the most important, and only for very sensitive data do you really need intra-network encryption between your app servers and db servers.
Have you checked that the time to perform auth is the same irrespective of varying degrees of token correctness, to gain confidence you're not vulnerable to a timing attack? (I.e., no char in the token matches an existing token, 1st char matches, 2nd char matches, etc.) My first thought when I read you had ditched the crypto library was that you'd introduced a timing attack scenario.
String compare/equality will almost certainly fail fast, character by character, so you need to ensure your dictionary lookup is not similarly vulnerable. Consider adding tests to ensure a change in implementation detail does not introduce vulnerabilities in the future.
I suspect the use of a dictionary will bail you out here, but (assuming it's currently secure) if the implementation of the dictionary or GetHashCode changes, it could introduce a timing-related vulnerability in the future, if one doesn't exist already.
There may be deeper exploits in the use of a guid or string as well. If an attacker can use variations in authentication execution time to determine they've found a token that hashes to an existing dictionary bucket (a birthday attack; it doesn't need to be the correct token), and if the attacker knows how to compute other guids that result in the same hash (because you're not using a secure hash function, it might be possible to engineer this), they might have a shortcut to your actual guid that would pass the eventual equality check.
How long do you allow a client to use a given token, before reauthenticating or refreshing? If a client maintains long sessions and if we know the token won't expire during the session, that gives an attacker an incentive because they have time on their hands.
Just knowing you're not using a crypto library gives me confidence you'll have a vulnerability though, so you really need to test that tokens can't be mined using timing variations.
D
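For readers following Darran's point about fail-fast comparison: the difference between an ordinary equality check and a constant-time one can be illustrated like this (a generic sketch, not RavenDB code; `hmac.compare_digest` is Python's constant-time comparison):

```python
import hmac

def equals_fail_fast(a: str, b: str) -> bool:
    # Ordinary equality may bail out at the first mismatching character,
    # so the comparison time can leak the length of the matching prefix.
    return a == b

def equals_constant_time(a: str, b: str) -> bool:
    # compare_digest examines the full input regardless of mismatches,
    # removing that timing signal.
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))
```

Both return the same answers; only the timing behavior differs, which is exactly what a timing attack probes.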
Darran, We don't validate the token one char at a time. We go through a dictionary, so the cost is always hashing the value, and if the hash isn't correct, it bounces. You could try to use timing for this, first hitting the right hash value and then going char by char, but the token is only valid for about 30 minutes.
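The roughly-30-minute validity mentioned in the reply could be modeled with an expiring in-memory store along these lines (a hypothetical sketch, not the actual implementation):

```python
import time
import uuid

TOKEN_LIFETIME = 30 * 60  # seconds; the reply mentions roughly 30 minutes

# Hypothetical store: token -> (user name, expiry timestamp).
active_tokens: dict[str, tuple[str, float]] = {}

def issue_token(user_name: str) -> str:
    token = str(uuid.uuid4())
    active_tokens[token] = (user_name, time.monotonic() + TOKEN_LIFETIME)
    return token

def validate_token(token):
    entry = active_tokens.get(token)
    if entry is None:
        return None
    user_name, expires_at = entry
    if time.monotonic() > expires_at:
        del active_tokens[token]   # expired: client must re-authenticate
        return None
    return user_name
```

The short lifetime limits the window an attacker has to mine a token, which is the mitigation the reply leans on.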
"The token is just a unique id (guid)" doesn't sound very secure, since guids are predictable. Don't you think that "cryptographically secure random" would sound (and work) much better?
Konstantins, Guids aren't predictable in this manner. We also aren't using it for anything except to serve as the "this is me" notice.
This still sounds compatible with the OAuth client_credentials flow and Bearer token specs, since none of those specifications mandate that the token is a crypto-signed message, and they don't require specific validation routines. The token should be cryptographically secure (and crypto-secure random is OK here) and verifiable, that's it, AFAIK.
What will be the behavior of the following scenario?
1. Client sends a request to the server.
2. After negotiation, the server sends a token to the client.
3. Server crashes.
4. Server recovers.
5. Client sends a request with the now-lost token.
What would be the outcome? I) The server replies unauthorized and the client has to negotiate again. II) The server automatically starts a negotiation due to the token not being found.
In my scenarios we've done some work on the performance of bearer token validation by implementing a cache for token validation results. The first call requires a cryptographic check of the token and subsequent calls just hit the cache to find the previously validated result. There are still checks after the cache-hit to ensure the token hasn't expired, but otherwise it's solved the problem.
If we wanted to eliminate the cryptographic check entirely, there's also the option of moving to reference tokens. It reduces the bearer token to just a small reference value that is passed to the OAuth 2.0 token introspection endpoint by the protected resource to obtain the "value" of the token. It's the same deal as far as caching the validation results too. The only difference this makes is reducing the size of the header needed to be sent over the wire.
What made RavenDB choose its own authentication strategy, inspired by OAuth 2.0 rather than directly implementing an OAuth 2.0 flow?
Paul, What we describe here is reference tokens, except that we don't have a separate server; RavenDB is also the auth server. We are implementing OAuth 2.0 in the current version, but we want something more efficient in the next version.
I have been reading a lot about JSON Web Tokens lately. If you need to communicate (non-secret) data from server to client, it is an interesting way to get token-like behavior. https://jwt.io/introduction/
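For illustration of the JWT point above: a JWT-shaped token's payload is just base64url-encoded JSON, readable by anyone without any key, which is why it must only carry non-secret data. A toy sketch (the token here is built locally and unsigned; real JWTs carry a cryptographic signature in the third segment):

```python
import base64
import json

def b64url_encode(obj: dict) -> str:
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

# A toy token in the JWT shape: header.payload.signature (signature empty).
token = ".".join([
    b64url_encode({"alg": "none", "typ": "JWT"}),
    b64url_encode({"sub": "alice", "admin": False}),
    "",
])

def decode_payload(jwt: str) -> dict:
    # The payload segment is plain base64url-encoded JSON.
    payload = jwt.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))
```

Anyone holding the token can read the claims; only the signature (absent in this toy) protects them from tampering.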