Initial design for strong encryption in RavenDB 4.0
The previous post generated some great discussion, and we have done a bit of research in the meantime about what is going to be required in order to provide strong encryption support in RavenDB.
Note: I’m still no encryption expert. I’m basing a lot of what I have here on reading libsodium code and docs.
The same design goals that we had before still hold. We want to encrypt the data at the page level, but it looks like it is going to be impossible to just encrypt the whole page. The reason behind that is that encryption is actual a pure mathematical operation, and given the same input text and the same key, it is always going to generate the same value. Using that, you can create certain attacks on the data by exploiting the sameness of the data, even if you don’t actually know what it is.
In order to prevent that, you would use an initialization vector or nonce (seems to be pretty similar, with the details about them being relevant only with regards to the randomness requirements they have). At any rate, while I initially hoped that I can just use a fixed value per page, that is a big “nope, don’t do that”. So we need some place to store that information.
Another thing that I run into is the problem with modifying the encrypted text in order to generate data that can be successfully decrypted but is different from the original plain text. A nice example of that can be seen here (see the section: How to Attack Unauthenticated Encryption). So we probably want to have that as well.
This is not possible with the current Voron format. Luckily, one of the reasons we built Voron is so we can get it to do what we want. Here is what a Voron page will look after this change:
Voron page: 8 KB in size, 64 bytes header +-------------------------------------------------------------------------+ |Page # 64 bits|Page metadata up to 288 bits |mac 128 bits| nonce 96 bits| +-------------------------------------------------------------------------+ | | | Encrypted page information | | | | 8,128 bytes | | | | | +-------------------------------------------------------------------------+
The idea is that when we need to encrypt a page, we’ll do the following:
- First time we need to encrypt the page, we’ll generate a random nonce. Each time that we encrypt the page, we’ll increment the nonce.
- We’ll encrypt the page information and put it in the page data section
- As well as encrypting the data, we’ll also sign both it and the rest of the page header, and place that in the mac field.
The idea is that modifying either the encrypted information or the page metadata will generate an error because the tampering will be detected.
This is pretty much it as far as the design of the actual data encryption goes. But there is more to it.
Voron uses a memory mapped file to store the information (actually, several, with pretty complex interactions, but it doesn’t matter right now). That means that if we want to decrypt the data, we probably shouldn’t be doing that on the memory mapped file memory. Instead, each transaction is going to set aside some memory of its own, and when it needs to access a page, it will be decrypted from the mmap file into that transaction private copy. During the transaction run, the information will be available in plain text mode for that transaction. When the transaction is over, that memory is going to be zeroed. Note that transactions RavenDB tend to be fairly short term affairs. Because of that, each read transaction is going to get a small buffer to work with and if there are more pages accessed than allowed, it will replace the least recently used page with another one.
That leaves us with the problem of the encryption key. One option would be to encrypt all pages within the database with the same key, and use the randomly generated nonce per page and then just increment that. However, that does leave us with the option that two pages will be encrypted using the same key/nonce. That has a low probability, but it should be considered. We can try deriving a new key per page from the master page, but that seems… excessive. But it looks like there is another option is to generate use a block chipper, where we pass different block counter for each page.
This would require a minimal change to crypto_aead_chacha20poly1305_encrypt_detached, allowing to pass a block counter externally, rather than have it as a constant. I asked the question with more details so I can have a more authoritative answer about that. If this isn’t valid, we’ll probably use a nonce that is composed of the page # and the number of changes that the page has went through. This would limit us to about 2^32 modifications on the same page, though. We could limit the a single database file size to mere 0.5 Exabyte rather than 128 Zettabyte, but somehow I think we can live with it.
This just leave us with the details of key management. On Windows, this is fairly easy. We can use the CryptProtectData / CryptUnprotectData to protect the key. A transaction will start by getting the key, doing its work, then zeroing all the pages it touched (and decrypted) and its key. This way, if there are no active transactions, there is no plaintext key in memory. On Linux, we can apparently use Libsecret to do this. Although it seems like it has a much higher cost to do so.
Comments
On windows rather than decrypting the key each time you could use a SecureString, which under the covers uses (RtlDecryptMemory ) https://msdn.microsoft.com/en-us/library/windows/desktop/aa387692(v=vs.85).aspx and which from my experience is significantly faster.
Pop Catalin, SecureMemory seems to be using: CryptProtectMemory and CryptUnprotectMemory. On Linux, interestingly enough, they don't do that (because there is nothing that can emulate those).
https://github.com/dotnet/coreclr/blob/master/src/mscorlib/corefx/System/Security/SecureString.Unix.cs#L18
I personally like a particular definition of encryption:
"Encryption is ability that allows you to keep a large amount of data safe if you can keep a small piece of data safe".
The small piece of data is the key and keeping the key safe is done in two ways:
1) preventing the key from being read / intercepted. 2) preventing the key from being guessed.
It's easier to improve point 2) by strengthening the encryption and eliminating various pats of attack that can lead to guessing the key. However what a single entity cannot do by itself is, Point 1), keep the key safe.
This is the reason all the software is cracked no matter the protection scheme., they cannot do point 1) by themselves.
There's no way to protect the key without external support (OS support or a Dedicated service). Basically you need to hand the secret to a trusted party of higher authority like the OS for safe keeping (or encryption).
I'm not a Linux guy, but it looks like that's exactly what libsecret does. Unfortunately it use IPC (inter process communication) which could be too costly for your particular use case. One possible option would be to use Intel SGX in Linux when available (Intel Skylake or newer), but it looks like the efforts for an implementation are in early stages https://github.com/01org/linux-sgx.
You should encrypt the data, then compute a MAC on the encrypted data and page header. You shouldn't MAC the unencrypted data, that's not as good: http://crypto.stackexchange.com/a/205
Pop Catalin,
One interesting option to reduce the risk is that we'll generate a random key on startup, and instead of calling the OS to encrypt / decrypt it for us when we need it, we'll call the OS _once_, and then we'll use the random key (which is ephemeral) when we need to decrypted key itself. Of course, this just means that the ephemeral key is there, and there are turtles all the way down... :-(
The likely scenario for us is that we'll have to keep the key in storage in encrypted format, then decrypt it via libsecret and keep it there.
Mircea, Yes, the idea is that we MAC the actual encrypted bytes. But, we also MAC some plain text data that we need, so we can authenticate it didn't change.
Oren, that is good to know. It's OK to MAC plain text if it's not secret, the issue was with ciphertext. Thanks for the reply!
This looks like a solid alternative/compromise to your initial design ideas. Congrats on taking community feedback early in the process and incorporating in a sane way!
As an additional sanity check, I'd float the final design on crypto.stackexchange.com for additional feedback. Or even hire one of the high-rep experts there for a short design review; I'm sure a design sign-off by someone like The Bear would give you and your customers even more confidence.
http://crypto.stackexchange.com/users/28/thomas-pornin
Ryan, Yes, that is something that we'll very likely do
In theory looks viable. But, as libsodium chacha20 is incresing automatically block counter every 64 bytes if page has a high page# and has enough data you could overflow counter. Maybe you need some more changes to control this and continue with the first counter value (maybe 1U like the constant?) once you reach max value.
Crowley, Are you _sure_? That isn't how I read the code.
See here for the constant value currently being passed: https://github.com/jedisct1/libsodium/blob/master/src/libsodium/crypto_aead/chacha20poly1305/sodium/aead_chacha20poly1305.c#L42
And then this is used; https://github.com/jedisct1/libsodium/blob/68564326e1e9dc57ef03746f85734232d20ca6fb/src/libsodium/crypto_stream/chacha20/ref/stream_chacha20_ref.c#L263
I'm not seeing this increasing.
No. I'm going to try to explain what I think and if I'm wrong your could help me to understand it. I just wanna learn.
In libsodium doc I can read: Internally, ChaCha20 works like a block cipher used in counter mode. It uses a dedicated 64-bit block counter to avoid incrementing the nonce after each block.
As you can see in the picture, in counter block mode the keystream is generated combining the nonce and the counter and increase the counter for each block. This allows to reuse the nonce for a large amount of data (2^64 blocks) without being another epic fail like wifi WEP.
The combination of nonce and initial counter can be seen in here In ctx->input[12] and ctx->input[13] the counter is stored.
Here you have the loop that encritp the stream using bytes blocks. At the begining you can see the last stream bytes edgecase. Below you can see the encritpion itself. In L208 you can see how the loop breaks because last block of the stream was encrpted.
In L184 just before storing the encrypted stream to the output I see what it looks like and increment of the counter (j12, j13).
It has sense or I am missing something?
Crowley, I'm not 100% sure, but it looks like you are correct, yes.
Another thing I was thinking; if I'm right about counter increase; is about the option that two pages will be encrypted using the same key/nonce. As you said: that has a low probability, but it should be considered.
Maybe the solution about use page# as initial counter could lead to a vulnerability even if you managed to modify crypto_stream_chacha20_xor_icto take a page number and control the potential overflow.
Giving 2 pages #1 and #2 each with enough data to cover 3 blocks of encryption and with the bad luck to get the same random nonce:
The first page will cipher their blocks with: nonce+1 ---- nonce+2 ---- nonce+3 The secon page will cipher their blocks with : nonce+2 --- nonce+3 ---nonce+4
This give us 2 blocks encrypted with the same nonce/counter/key.
Crowley, No, nonce and counter are two separate values.
Comment preview