With RavenDB 4.0, we are looking to strengthen our encryption capabilities. Right now RavenDB is capable of encrypting document data and the contents of indexes at rest. That is, if you look at the disk, the data is securely encrypted. However, in memory, we keep quite a bit of information in plain text (mostly in caches of various kinds), and the document metadata isn’t encrypted, so document keys are visible.
With RavenDB 4.0 we are looking into making some stronger guarantees. That means that we want to keep all data encrypted on disk, and only decrypt it for the duration of a transaction, after which it is immediately encrypted again.
Now, encryption and security in general are pretty big fields, and I’m by no means an expert, so I thought that I would outline the initial goals of our research and see if you have anything to add.
- All encryption / decryption operations are done on data that is aligned on a 4 KB boundary and whose size is always a multiple of 4 KB. It would be extremely helpful if the encryption did not change the size of the data. Given that the data is always in 4 KB increments, I don’t think that this is going to be an issue.
- We can’t use a managed API to do so. Our data actually resides in unmanaged memory, so ideally we would need something like this:
- I also need to be able to call this from C#, and it needs to run on Windows, Linux and hopefully macOS.
- I’ve been looking at stuff like this page, trying to understand what it means and hoping that this is actually using best practices for safety.
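To make the first two requirements above a bit more concrete, here is a minimal C# sketch of a guard that checks whether an unmanaged buffer starts on a 4 KB boundary and spans a whole number of 4 KB pages. The names `PageGuard`, `IsPageAligned` and `AllocateAligned` are hypothetical and used only for illustration; this is not RavenDB code.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical helper, for illustration only.
public static class PageGuard
{
    public const int PageSize = 4096;

    // True when the buffer starts on a 4 KB boundary and its length
    // is a positive multiple of 4 KB.
    public static bool IsPageAligned(IntPtr buffer, long length)
    {
        return buffer != IntPtr.Zero
            && buffer.ToInt64() % PageSize == 0
            && length > 0
            && length % PageSize == 0;
    }

    // Allocates unmanaged memory with one extra page of slack and returns
    // a 4 KB aligned pointer inside it. The caller must free rawBlock
    // (with Marshal.FreeHGlobal), not the aligned pointer.
    public static IntPtr AllocateAligned(int pages, out IntPtr rawBlock)
    {
        long size = (long)pages * PageSize;
        rawBlock = Marshal.AllocHGlobal(new IntPtr(size + PageSize));
        long address = rawBlock.ToInt64();
        // Round up to the next multiple of PageSize.
        long aligned = (address + PageSize - 1) & ~((long)PageSize - 1);
        return new IntPtr(aligned);
    }
}
```

A caller would run `IsPageAligned` before handing a buffer to the encryption routine, so a misaligned or odd-sized buffer fails early instead of producing output of a different size.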
Another problem is that just getting the encryption code right doesn’t help without managing all the rest of it properly. We also need to select the appropriate algorithm and mode, make sure that the library we use is both well known and respected, and so on. And how do we distribute, deploy and update it across multiple platforms?
You can see some sample code that I have made here: https://gist.github.com/ayende/13b206b9d83e7aa126df77d6b12711f3
This is basically the OpenSSL sample translated to C# with a bit of P/Invoke. Note that this is meant for our own use, so we don't need padding since we always pass a buffer that is a multiple of 4 KB.
I'm assuming that since this is based on the example on the OpenSSL wiki, it is also a best practice sample. There is a chance that I am mistaken, however, which is why we have this post.
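For readers who don't want to open the gist, here is a minimal sketch of what such a P/Invoke wrapper over the OpenSSL EVP API might look like. It is not the code from the gist. The class and method names (`PageCrypto`, `EncryptPages`) are hypothetical, the cipher (AES-256-CBC) is an assumption made for illustration, and the library name `libcrypto` is also an assumption: the actual file name differs between platforms and OpenSSL versions. With a 16-byte block size, a 4 KB buffer is exactly 256 blocks, so disabling padding keeps the output the same size as the input.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper, for illustration only.
public static class PageCrypto
{
    // Assumption: the OpenSSL crypto library resolves under this name.
    private const string Lib = "libcrypto";
    private const CallingConvention Cdecl = CallingConvention.Cdecl;

    [DllImport(Lib, CallingConvention = Cdecl)]
    private static extern IntPtr EVP_CIPHER_CTX_new();
    [DllImport(Lib, CallingConvention = Cdecl)]
    private static extern void EVP_CIPHER_CTX_free(IntPtr ctx);
    [DllImport(Lib, CallingConvention = Cdecl)]
    private static extern IntPtr EVP_aes_256_cbc();
    [DllImport(Lib, CallingConvention = Cdecl)]
    private static extern int EVP_EncryptInit_ex(IntPtr ctx, IntPtr cipher, IntPtr engine, byte[] key, byte[] iv);
    [DllImport(Lib, CallingConvention = Cdecl)]
    private static extern int EVP_CIPHER_CTX_set_padding(IntPtr ctx, int padding);
    [DllImport(Lib, CallingConvention = Cdecl)]
    private static extern int EVP_EncryptUpdate(IntPtr ctx, IntPtr output, out int outLen, IntPtr input, int inLen);
    [DllImport(Lib, CallingConvention = Cdecl)]
    private static extern int EVP_EncryptFinal_ex(IntPtr ctx, IntPtr output, out int outLen);

    public const int PageSize = 4096;

    // Encrypts `length` bytes from unmanaged `input` into unmanaged `output`.
    // The output is exactly `length` bytes because padding is disabled.
    public static void EncryptPages(IntPtr input, IntPtr output, int length, byte[] key, byte[] iv)
    {
        // Validate arguments before touching the native library.
        if (length <= 0 || length % PageSize != 0)
            throw new ArgumentException("Length must be a positive multiple of 4 KB.", nameof(length));
        if (key == null || key.Length != 32)
            throw new ArgumentException("AES-256 needs a 32-byte key.", nameof(key));
        if (iv == null || iv.Length != 16)
            throw new ArgumentException("CBC mode needs a 16-byte IV.", nameof(iv));

        IntPtr ctx = EVP_CIPHER_CTX_new();
        if (ctx == IntPtr.Zero)
            throw new InvalidOperationException("EVP_CIPHER_CTX_new failed.");
        try
        {
            if (EVP_EncryptInit_ex(ctx, EVP_aes_256_cbc(), IntPtr.Zero, key, iv) != 1)
                throw new InvalidOperationException("EVP_EncryptInit_ex failed.");
            // Padding off: the input is always a whole number of cipher blocks.
            EVP_CIPHER_CTX_set_padding(ctx, 0);
            if (EVP_EncryptUpdate(ctx, output, out int written, input, length) != 1)
                throw new InvalidOperationException("EVP_EncryptUpdate failed.");
            if (EVP_EncryptFinal_ex(ctx, output + written, out int tail) != 1)
                throw new InvalidOperationException("EVP_EncryptFinal_ex failed.");
            if (written + tail != length)
                throw new InvalidOperationException("Ciphertext size differs from plaintext size.");
        }
        finally
        {
            // Always release the native context, even on failure.
            EVP_CIPHER_CTX_free(ctx);
        }
    }
}
```

The argument checks run before any native call, so a buffer of the wrong size is rejected even on a machine where OpenSSL is not installed. Decryption would mirror this with the matching `EVP_Decrypt*` calls. Whether this mode is the right choice is exactly the kind of question the post is asking, so treat the cipher selection here as a placeholder.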