RavenDB Security ReviewFinding and details
In Jan 2018 we asked Edge Security to do a thorough review of RavenDB security and cryptography usage. We wanted to get an outside expert opinion before the RTM release, to make sure that we put out a system that is robust and secured.
As an aside, I strongly recommend doing such a thing on major version releases (at least). Especially if you need an expert opinion, and security is certainly one area in which you want to have things verified.
In the spirit of full transparency, we have made the full report available here. I want to point out that all the issues that were raised in the report were fixed before the RTM release, but I think that it it worth going over each of the items that were brought up in the report and explore them. We have tried our best to create a secured system and it was… humbling to get the report and see fifteen different locations where we failed to do so.
Security is hard to do and even harder to get right. The good news from our perspective was that all those issues were high risk in terms of their impact on the security of the product, but minor in terms of their effect on the overall architecture, so we were able to fix them quickly.
I’m going to take the time now to address each type of failure that was brought up in the report, discuss what kind of risk it represents and how it was resolved. I’ll deal with that in the next posts in this series.
The most important parts the report are quoted below:
RavenDB deploys cryptography essentially on two different fronts: symmetric cryptography of all data on disk, and asymmetric cryptography via X.509 certificates as a means of authentication between clients and servers.
All symmetric encryption uses Daniel J. Bernstein’s XChaCha20Poly1305 algorithm, as implemented in libsodium, with a randomized 192-bit nonce. While opting for XChaCha20 over ChaCha20 means more calls to the RNG and a computation of HChaCha20, it also means that there is no possibility of nonce-reuse, which means that it is considerably more resilient than adhoc designs that might make a best-effort attempt to avoid nonce-reuse, without ensuring it. Symmetric encryption covers the database main data store, index definitions, journal, temporary file streams, and secret handling.
Such secret handling uses the Windows APIs for protected data, but only for a randomly generated encryption key, which is then used as part of the XChaCha20Poly1305 AEAD, to add a form of authentication. All long-term symmetric secrets are derived from a master key using the Blake2b hash function with a usage-specific context identifier.
At setup time, client and server certificates are generated. Clients trust the server’s self-signed certificate, and the server trusts each client based on a fingerprint of each client’s certificate. All data is exchanged over TLS, and TLS version failures for certificate failures are handled gracefully, with a webpage being shown indicating the failure status, rather than aborting the TLS handshake. Server certificates are optionally signed by Let’s Encrypt using a vendor-specific domain name. Certificates are generated using BouncyCastle and are 4096-bit RSA.
Keys, nonces, and certificate private keys are randomly generated using the operating system’s CSPRNG, either through libsodium or through BouncyCastle.
If you aren’t familiar with cryptographic terms, this can be pretty scary. There are lots of terms and names that are thrown around. I want to increase the knowledge of my readers, and after seeing the reactions of the guys internally to the report, I think it would do a lot of good to actually go over a real world report and its mitigations and discuss how we resolved them. Along the way, I’ll attempt to cover many of these cryptographic terms and dechiper (pun intended) their meaning.
More posts in "RavenDB Security Review" series:
- (27 Mar 2018) Non-Constant Time Secret Comparison
- (26 Mar 2018) Encrypt, don’t obfuscate
- (23 Mar 2018) Encrypting data on disk
- (22 Mar 2018) Nonce reuse
- (20 Mar 2018) Finding and details
One begins to see the value in piggy-backing off existing security mechanisms like HTTPS, SSL, certs, rather than rolling everything yourself.
Reading through the security report is fascinating...and much of it is over my head. Looking forward to the set of posts on this.
Really looking forward to this - are all the relevant change histories present in the github repo?
Peter, Hi, Yes, all the changes are in there, including multiple commits to fix some of these issues. You can start from the file names and look at the histories.
Hi. Can you clarify something for me?
You said "XChaCha20 over ChaCha20 means more calls to the RNG". Why? Because are you calling
randombytes_buf()to generate every Nonce? or is it something inside the algorithm that I can not see?
In the first case maybe using just a counter is fine as long as you do not repeat the counter. Am I right?
Crowley, With the
XChaCha20we use 24 bytes of random values every time. With
ChaCh20we called it once for 8 bytes, then incremented it each time. But
random_bufisn't that expensive that it matters, to be frank.
Crowley, The problem with a counter is that you now need to guard against a malicious user trying to get you to reuse a counter