Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by email or phone:

ayende@ayende.com

+972 52-548-6969


I was wrong, reflecting on the .NET design choices

time to read 3 min | 480 words

I have been re-thinking some of my previous positions with regards to development, and it appears that I have been quite wrong in the past.

In particular, I’m talking about things like:

Note that those posts are part of a much larger discussion, and both are close to a decade old. They aren’t really relevant anymore, I think, but it still bugs me, and I wanted to outline my current thinking on the matter.

C# is non virtual by default, while Java is virtual by default. That seems like a minor distinction, but it has huge implications. It means that proxying / mocking / runtime subclassing is a lot easier with Java than with C#. In fact, a lot of frameworks that were ported from Java rely on this heavily, and that made them much harder to use in C#. The most common one for me being NHibernate, where this was one of the chief frustrations that I kept running into.

However, given that I’m working on a database engine now, not on business software, I can see a whole different world of constraints. In particular, a virtual method call is significantly more expensive than a direct call, and that adds up quite quickly. One of the things that we routinely do is try to de-virtualize method calls using various tricks, and we are eagerly awaiting .NET Core 2.0 with its de-virtualization support in the JIT (we have already started writing code to take advantage of it).
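
To make that concrete, here is a minimal sketch of one such trick (not necessarily the exact shape of our code): a struct passed as a generic type parameter constrained to an interface gets a specialized method body from the JIT, so the interface call becomes a direct call that can also be inlined.

// A minimal sketch of a common de-virtualization trick. Because TComparer is
// constrained to be a struct, the JIT generates a specialized body per struct
// type, so comparer.Compare() is resolved at JIT time instead of going through
// virtual / interface dispatch.
public interface IKeyComparer
{
    int Compare(long x, long y);
}

public struct AscendingComparer : IKeyComparer
{
    public int Compare(long x, long y) => x.CompareTo(y);
}

public static class Searcher
{
    public static int BinarySearch<TComparer>(long[] keys, long value, TComparer comparer)
        where TComparer : struct, IKeyComparer
    {
        int lo = 0, hi = keys.Length - 1;
        while (lo <= hi)
        {
            int mid = lo + ((hi - lo) >> 1);
            int cmp = comparer.Compare(keys[mid], value);
            if (cmp == 0)
                return mid;
            if (cmp < 0)
                lo = mid + 1;
            else
                hi = mid - 1;
        }
        return ~lo; // not found, complement of the insertion point
    }
}

// Usage: var idx = Searcher.BinarySearch(keys, 42L, new AscendingComparer());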

Another issue is that my approach to software design has significantly changed. Where I would previously use a lot of inheritance and explicit design patterns, I’m now far more inclined toward composition instead. I’m also marking very clear boundaries between My Code and Client Code. In My Code, I don’t try to maintain encapsulation or hide state, whereas with stuff that is expected to be used externally, that is very much the case. But that gives a very different feel to the API and the usage patterns that we handle.

This also relates to abstract classes vs. interfaces, and why you should care. As a consumer, unless you are busy doing some mocking or such, you likely don’t, but as a library author, it matters a lot to the amount of flexibility you get.
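
A hedged example of what that flexibility means, using hypothetical types: once an interface has shipped, adding a member to it breaks every external implementation, while an abstract base class lets the library author add a virtual member with a default behavior without breaking anyone.

// Hypothetical library types, purely to illustrate the difference in flexibility.

// v1 shipped this interface. Adding a member to it in v2 is a breaking change,
// because every external implementation stops satisfying the contract.
public interface IStorageEngine
{
    void Put(string key, byte[] value);
    byte[] Get(string key);
    // void Delete(string key);   // <-- adding this later breaks all implementers
}

// An abstract base class gives the library author room to evolve: a new virtual
// member with a default behavior does not break existing subclasses.
public abstract class StorageEngineBase
{
    public abstract void Put(string key, byte[] value);
    public abstract byte[] Get(string key);

    // Added in v2; existing subclasses keep compiling and running unchanged.
    public virtual void Delete(string key)
    {
        throw new System.NotSupportedException("This engine does not support deletes.");
    }
}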

I think that a lot of this has to do with my viewpoint, not just as an Open Source author, but as someone who runs a project where customers are using us for years on end, and they really don’t want us to make any changes that would impact their code. That leads to a lot more emphasis on backward compatibility (source, binary & behavior), and if you mess it up, you get ricochets from people who pay you money because their job is now harder.

Emoji Encoding: A new style for binary encoding for the web

time to read 4 min | 604 words

Computers think in binary, and you would have thought that sending binary data around would be pretty easy. But that turns out to be a completely non trivial task. The problem is those pesky humans and needing to interface with them.

For example, if I need to send some binary data over email, I can either send it as an attachment, with a high probability of at least a few people never getting it, or I can encode it somehow. Typical choices are Base64 encoding for the low tech, and barcodes / QR codes and the like. For the fancy among us, we can go with Base85 and other such things. That is pretty standard, but it really has a lot of limitations. Base64 will increase the size of the data by about a third, and it is case sensitive, so it is hard to get right if you need to actually look at it and not just copy/paste it. It is also limited to plain old ASCII, for compatibility reasons that don’t make a lot of sense in today’s world.

I have been thinking about this for a long time, because we need to send binary data (license information) as text, and we also need it to look good and be nicely formatted.

After a lot of thought and experimentation, I’m proud to announce a new form of encoding: the Emoji Encoder, available currently for .NET, but soon to be available for Ruby, Python, Go, Node.JS, Ember.js, React.JS and maybe jQuery.

The idea for this innovation came to me because of the following observations:

  • Emojis are becoming much more important in any textual conversation (to the point where people will just say an emoji). That means that we can rely on them for the long term, which is very important for storage technology.
  • Trying to read meaning from the emojis being sent is clearly impossible, as anyone taking a peek at a text conversation between two teenage girls can attest. (Although they appear to have a hidden meaning; if she sent the red heel and not the blue heel emoji, that apparently means something.)
  • Because emojis are so prevalent, they can be sent anywhere normal text would go, including email, social media, printing, etc.
  • There are a lot of emojis, allowing us to overcome the bloat of Base64 and its friends by dedicating a single emoji to each byte, in a 1:1 mapping (see the sketch below).
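
Here is a minimal sketch of that 1:1 mapping. The emoji alphabet below is made up for illustration (a contiguous block of 256 emoji code points); any table of 256 distinct emojis that both sides agree on would work just as well, so don’t treat this as the actual encoder.

using System.Collections.Generic;
using System.Linq;
using System.Text;

// A minimal sketch of the 1:1 byte-to-emoji mapping. The alphabet here is an
// illustrative choice, not the real table.
public static class EmojiEncoder
{
    private const int AlphabetStart = 0x1F400; // start of an emoji block, illustrative choice

    private static readonly string[] Alphabet =
        Enumerable.Range(AlphabetStart, 256)
                  .Select(char.ConvertFromUtf32)
                  .ToArray();

    public static string Encode(byte[] data)
    {
        var sb = new StringBuilder(data.Length * 2); // each emoji is a surrogate pair
        foreach (var b in data)
            sb.Append(Alphabet[b]);                  // one emoji per byte
        return sb.ToString();
    }

    public static byte[] Decode(string text)
    {
        var result = new List<byte>();
        for (int i = 0; i < text.Length; )
        {
            int codePoint = char.ConvertToUtf32(text, i);
            i += char.IsSurrogatePair(text, i) ? 2 : 1;
            result.Add((byte)(codePoint - AlphabetStart)); // reverse the 1:1 mapping
        }
        return result.ToArray();
    }
}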

That means that in terms of characters, Emoji Encoding is a net win. Consider the following equivalent information:

  • I5xy4dT9Qyjp7DKwuVI6y95EwlDeO/NBeiuc3GJ5Mjo= <—45 characters
  • ℹ⤴⚫✔⭕㊗◀☔➖✂♥⛵✖♍❤⛵✅✏ℹ⛲✂ <—33 characters

That is quite important when dealing with constrained textual formats, such as twitter, where the above will be rendered as:

There are other advantages. This data is actually a 256-bit key for use in encryption. And you can actually show it to a user and have a reasonably good chance that they will be able to tell it apart from something else. It relies on the ability of humans to recognize shapes, but it will be very hard for them to actually tell someone your key. There has been a lot of research around such things, and while it isn’t a primary motivation for us, it is a very nice perk.

I mentioned that a key interest for us is the usage in licensing code. Here is an example of how a license email will now look:

I think that in addition to being pretty, it is also going to bring a smile to people’s faces, so the Emoji Encoder is a win all around.

Low level Voron optimizations: Primitives & abstraction levels

time to read 5 min | 858 words

One of the things that I noticed with the recent spate of work we have been doing is that we are doing things that we had already tried before, and failed at. But suddenly we are far more successful. What is the difference?

Case in point, transaction merging and early lock release. Those links both go to our initial implementations, which were written in 2013. That is four years ago. Yes, today I can tell you that transaction merging was able to give us two orders of magnitude improvement and early lock release gave us a 45% boost in performance. But looking at the timeline, we rolled back early lock release in early 2014.

The complexity of the feature is certainly non trivial, but the major point that led to its removal in 2014 was that it wasn’t worth it. That is, it didn’t pay enough to be worth the complexity it brought. When we sat down to design Voron for RavenDB 4.0, one of the first areas that we sought to eliminate was the transaction merging. We wanted Voron to be single threaded, by design.

And I still very much stand by those decisions. So how can I reconcile both statements? The core difference between then and now is where those features live, and what that means.

Transaction merging now is not done by Voron, instead, this is something that RavenDB does on top of Voron. But why?

When we had transaction merging in Voron, it meant that we had to submit transactional work to Voron in a format that it could understand. And Voron is a very low level library, so it doesn’t really understand much. This gave us a very small “vocabulary” to work with. More than that, it also meant that we had to deal with features such as explicit concurrency (at the Voron level, on top of the concurrency primitives exposed by RavenDB). Let us take the simplest example: we have two threads that want to write to the same document.

That means that we have to build the buffer we want to write in memory, then submit it to Voron with the right concurrency setting at the Voron level. This is after we have already checked the concurrency semantics at the RavenDB level, so we pay double to ensure that concurrency conflicts at all levels are properly handled. From the point of view of Voron, that meant far more frequent merged transaction failures (which kill performance) and much higher complexity overall when using it. Alongside that, we also had much higher memory usage, because we had to allocate buffers to hold the data we needed to write before submitting it to Voron, so the rate of allocations was much higher.

We still saw a performance improvement over not using it, but nothing as dramatic as the two orders of magnitude that we see today. Another aspect of this is that when we built Voron, we built it to fit our existing architecture (which was built on top of Esent), so it reflects a lot of design decisions coming from there.

With RavenDB 4.0, we took a few giant steps back and decided to design the whole system as a single integrated piece. In fact, that meant that any attempt to do concurrency at the Voron level was abandoned. The moment you had a write transaction, you were safe from concurrency; you didn’t have to worry about anyone modifying the data you were looking at. There was no need to allocate special buffers and hold them, because we are always writing directly to Voron, instead of buffering in memory.

This was a dramatic simplification of the API and its usage, and it meant that the code is much more approachable and easy to understand, work with and make performant. Of course, it also meant that we had a serial lock, which is where the transaction merger became such a huge deal. But the point here is that this kind of transaction merging isn’t done at the Voron level, but at the RavenDB level, and instead of submitting primitive operations we can submit full fledged work items, including logic. So writing a document is now done by the request thread parsing the document, preparing a MergedPutCommand class and submitting it to the transaction merger.
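
To give a feel for the shape of this, here is a rough sketch of a transaction merger. The names and the storage types are simplified stand-ins, not the actual RavenDB code, but the structure is the same idea: request threads queue full commands, and a single thread executes whatever is waiting under one write transaction.

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Minimal stand-ins so the sketch compiles; in reality these are Voron's types.
public class WriteTransaction : IDisposable
{
    public void Commit() { /* write to the journal, etc. */ }
    public void Dispose() { }
}

public class StorageEnvironment
{
    public WriteTransaction BeginWriteTransaction() => new WriteTransaction();
}

// A command carries the full logic of an operation (parse, validate, put),
// not a low level primitive, so it can run directly against the storage.
public abstract class MergedCommand
{
    public readonly TaskCompletionSource<object> Completion =
        new TaskCompletionSource<object>(TaskCreationOptions.RunContinuationsAsynchronously);
    public abstract void Execute(WriteTransaction tx);
}

public class TransactionMerger
{
    private readonly BlockingCollection<MergedCommand> _queue = new BlockingCollection<MergedCommand>();
    private readonly StorageEnvironment _env;

    public TransactionMerger(StorageEnvironment env)
    {
        _env = env;
        new Thread(Run) { IsBackground = true }.Start();
    }

    // Request threads enqueue a command and await its completion.
    public Task Enqueue(MergedCommand cmd)
    {
        _queue.Add(cmd);
        return cmd.Completion.Task;
    }

    private void Run()
    {
        while (true)
        {
            var batch = new List<MergedCommand> { _queue.Take() }; // wait for work
            MergedCommand extra;
            while (_queue.TryTake(out extra))      // merge whatever is already waiting
                batch.Add(extra);

            using (var tx = _env.BeginWriteTransaction())          // one serial write tx
            {
                foreach (var cmd in batch)
                    cmd.Execute(tx);
                tx.Commit();
            }
            foreach (var cmd in batch)             // signal callers only after the commit
                cmd.Completion.TrySetResult(null);
            // (Error handling per command is omitted from the sketch.)
        }
    }
}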

The transaction merger will then execute the command under a write transaction, and it will directly manipulate Voron. This means that we get both high concurrency and safety from concurrency issues at the same time. Early lock release plays into that as well; we had to modify Voron to allow it, but what we did was build low level primitives that can be used by higher levels, without making assumptions about their usage.

On the Voron side of things, we just have the notion of an async commit (with a list of requirements that happens to exactly fit what is going on in the transaction merging portion of RavenDB), and the actual transaction lock handoff / early lock release is handled at a higher layer, with a lot more information about the system.

Why you should avoid graceful error handling like the plague that it is

time to read 3 min | 536 words

A while ago I was reviewing a pull request by a team member and I realized that I’m looking at an attempt to negotiate graceful termination of a connection between two nodes. In particular, the code in question was invoked when one node was shutting down or had to tear down the connection for whatever reason.

That code was thrown out, and it made a very graceful arc all the way to the recycle bin.

But why? The underlying reason for this was to avoid needless error messages in the logs, which can trigger support calls and cost time & effort to figure out what is going on. That is an admirable goal, but at the same time, it is a false hope and a dangerous one at that.

Let us consider what it means that a node is shutting down. It means that it now needs to notify all its peers about this. It is no longer enough to just tear down all the connections; it needs to talk to them, and that means that we have introduced network delays into the shutdown procedure. It also means that we now have to deal with error handling when we are trying to notify a peer that this node is shutting down, and that way lies madness.

On the other hand, we have the other node, which now needs to handle its peer getting up in the middle of the conversation and saying “I’m going away now” mid sentence. For that matter, since the shutdown signal (which is the common case for this to be triggered) can happen at any time, we now need thread safety on shutdown so we can send a legible message to the other side, and the other side must be ready to accept the shutdown message at any time. (A “Do you have any new documents for me?” request that expects a “There are N messages for you” reply now also needs to handle a “G’dbye world” notification.)

Doing this properly complicates the code at every level, and you still need to handle the rude shutdown scenario.

Furthermore, what is the other side supposed to do with the information that this node is shutting down the connection voluntarily? Is it supposed to not connect to it again? If so, what policy should it use to decide whether the other side is down for valid reasons or actually unavailable?

Assuming that there is actually a reason why there is a TCP connection between the two nodes, any interruption in service, for whatever reason, is not a valid state.

And if we ensure that we are always ending the connection in the same rude manner, we also gain a very valuable feature. We make sure that the error handling portion of the code gets exercised on a regular basis, so if there are any issues there, they will be discovered easily.

As for the original issue of reducing support calls because of transient / resolved errors: that can be solved by not logging the error immediately, but waiting a bit to verify that the situation actually warrants writing to the operations log (writing to the info log should obviously happen regardless).
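
A sketch of what that “wait a bit” approach can look like (all the names here are hypothetical, the point is just the shape): record the failure immediately in the info log, and only escalate to the operations log if the problem hasn’t resolved itself within a grace period.

using System;
using System.Collections.Concurrent;
using System.Threading;

// A hedged sketch of deferred error escalation; names are hypothetical.
public class DeferredErrorLog
{
    private static readonly TimeSpan Grace = TimeSpan.FromSeconds(15);
    private readonly ConcurrentDictionary<string, Timer> _pending =
        new ConcurrentDictionary<string, Timer>();

    // Always record the failure in the info log; only escalate to the operations
    // log if it is still unresolved once the grace period has passed.
    public void ReportFailure(string source, Exception error)
    {
        InfoLog(source, error);
        var timer = new Timer(_ => OperationsLog(source, error), null, Grace, Timeout.InfiniteTimeSpan);
        _pending.AddOrUpdate(source, timer, (key, existing) => { existing.Dispose(); return timer; });
    }

    // Called when the connection / operation recovers, cancelling the escalation.
    public void ReportRecovery(string source)
    {
        Timer timer;
        if (_pending.TryRemove(source, out timer))
            timer.Dispose();
    }

    private static void InfoLog(string source, Exception e) =>
        Console.WriteLine($"INFO {source}: {e.Message}");

    private static void OperationsLog(string source, Exception e) =>
        Console.WriteLine($"OPS  {source}: still failing after the grace period: {e.Message}");
}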

The struggle with Rust

time to read 5 min | 983 words

So I spent a few evenings with Rust, and I have managed to do some pretty trivial stuff with it, but then I tried to do something non trivial (a low level trie that relies on low level memory manipulation). After realizing that I just have to fight the language non stop, I am not going to continue forward with this.

Here is where I gave up:

[image: screenshot of the Rust code in question]

I have a buffer that I want to mutate using pointers, so I allocated a buffer, with the intent to use the first few bytes for a header and use the rest of the memory directly and efficiently. Unfortunately, I can’t. I need to mutate the buffer in multiple places at the same time (both the trie header and the actual node), but Rust refuses to let me do that because then I’ll have multiple mutable references, which is exactly what I want.

It just feels that there is so much ceremony involved in getting a Rust program to actually compile that there isn’t any time left to do anything else.  This post certainly resonated with me strongly.

That is about the language and what it requires. But the environment isn’t really nice either. It starts from the basics: I want to allocate some memory.

Sure, that is easy to do, right?

  • alloc::heap::allocate is only available on unstable Rust, and might change underneath you.
  • alloc::raw_vec::RawVec, which gives you raw memory directly, is unstable and likely to remain so, even though it is much safer to use than allocating memory directly.

We are talking about allocating memory, in a systems level language, and unless you are jumping through hoops, there is just no way to do that.

I’ll admit that I’m also spoiled in terms of tooling (IDEs, debuggers, etc.), but the Rust environment is pretty much “grab a text editor, you’ll have syntax highlighting and maybe something a bit more”, and that is it. I tried three or four different editors. intellij-rust, for example, was able to do some code analysis, but it wasn’t able to actually build anything (I think that I needed to install the JDK or some such). VS Code could build and run (but not debug), and it marks every single warning, which, combined with Rust’s eagerness to warn, made it very hard to write code. Consider when all your code looks like this:

[image: editor screenshot with warning markers on nearly every line]

No debugger beyond println! (and yes, I know about GDB, but that ain’t a good debugging experience) is another major issue.

I really want to like Rust, and it has some pretty cool ideas, but the problem is that it is just too hard to actually get something done in any reasonable timeframe.

What is really concerning is that any time I want to do anything really interesting, I need to either go and find a crate to do it (without any assurances of quality, maintainability, etc.) or I have to use a nightly version, enable various feature flags or use unstable API versions. And you have to do that for anything beyond the most trivial stuff.

The very same trie code that I tried to write in Rust I wrote in one and a half evenings in C++ (including copious searches for the correct modern ways to do things), and it works, it is obvious and I don’t have to fight the compiler all the time.

Granted, I’ve written in C++ before, and C# (my main language) is similar, but the differences are staggering. It isn’t just the borrow checker; it is the sum of the language features that make it very hard to reason about what the code is doing. I mentioned before that the fact that generics are resolved on usage, which can happen quite a bit further down from the actual declaration, is very confusing. It might have been different if I had been coming from an ML background, but Rust is just too much work for too little gain.

One of my core requirements, the ability to write code and iterate over behavior quickly, is blocked because every time I try to compile, I get weird compilation errors and then I need to implement workarounds to make the compiler happy, which introduces complexity into the code for very simple tasks.

Let us take a simple example, I want to cache the result of a DNS lookup. Here is the code:

We’ll ignore the unstable API usage with lookup_host, or the fact that it literally took me over an hour to get 30 lines of code into a shape where I can actually demonstrate the issue.

There is a lot of stuff going on here. We have a cache of boxed strings in the hash map to maintain ownership of them, and we look them up and then add a cloned key to the cache because the owner of the host string is the caller, etc. But most importantly, this code is simple, expressive and wrong. It won’t compile, because we have both an immutable borrow (on line 17) and a mutable one (on lines 25 & 26).

And yes, I’m aware of the entry API on HashMap that is meant to deal with this situation. The problem is that all those details, and making the compiler happy in a very simple code path, add a lot of friction to the code. To the point where you don’t get anything done other than fighting the compiler all the time. It’s annoying, and it doesn’t feel like I’m accomplishing anything.

The metrics calculation methods

time to read 3 min | 418 words

Any self respecting database needs to be able to provide a whole host of metrics for the user.

Let us talk about something simple, like the requests / second metrics. This seems like a pretty easy metric to have, right? Every second, you have N number of requests, and you just show that.

But it turns out that just showing the latest req/sec number isn’t very useful, primarily because a lot of traffic actually has valleys & peaks. So you want to have the req/sec not just for a specific second, but over a period of time (like the req/sec over the last minute & the last 15 minutes).

One way to do that is to use an exponentially weighted moving average. You can read about their use in Unix in these articles. But the idea is that as we add samples, we put more weight on the recent samples, while still taking historical data into account.

That has the nice property that it reacts quickly to changes in behavior, but it smooths them out so that you see a gradual change over time. The bad thing about it is that it is not accurate (in the sense that it isn’t easy for us to correlate it to exact numbers) and it smooths out changes.
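
For reference, here is a minimal sketch of such an exponentially weighted moving average. The tick interval and the one minute decay window are illustrative choices, not necessarily what we ended up using.

using System;
using System.Threading;

// A minimal EWMA for req/sec, loosely modeled on the Unix load average idea.
public class EwmaMeter
{
    private const double TickSeconds = 5.0;
    private static readonly double Alpha = 1 - Math.Exp(-TickSeconds / 60.0); // ~1 minute window

    private long _uncounted;
    private double _rate;          // requests per second
    private bool _initialized;

    public void Mark() => Interlocked.Increment(ref _uncounted);

    // Call this every TickSeconds (e.g. from a timer).
    public void Tick()
    {
        var count = Interlocked.Exchange(ref _uncounted, 0);
        var instantRate = count / TickSeconds;
        if (!_initialized)
        {
            _rate = instantRate;
            _initialized = true;
        }
        else
        {
            // Recent samples get weight Alpha, history keeps (1 - Alpha).
            _rate += Alpha * (instantRate - _rate);
        }
    }

    public double RequestsPerSecond => _rate;
}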

On the other hand, you can take exact metrics. Going back to the req/sec number, we can allocate an array of 900 longs (enough for 15 minutes with one measurement per second) and just use this cyclic buffer to store the details. The good thing about that is that it is very accurate; we can easily correlate results to external numbers (such as the results of a benchmark).
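
And a sketch of the exact approach with the cyclic buffer (again simplified; the once-per-second Roll call is assumed to be driven by a timer).

using System;
using System.Threading;

// A cyclic buffer with one counter per second, 900 entries for 15 minutes of history.
public class RequestsPerSecondBuffer
{
    private readonly long[] _buckets = new long[900];
    private int _current;

    public void Mark() => Interlocked.Increment(ref _buckets[_current]);

    // Called by a timer once per second: move to the next slot and clear it.
    public void Roll()
    {
        int next = (_current + 1) % _buckets.Length;
        Interlocked.Exchange(ref _buckets[next], 0);
        _current = next;
    }

    // Average req/sec over the last N complete seconds (e.g. 60 or 900).
    public double Average(int seconds)
    {
        long total = 0;
        for (int i = 1; i <= seconds; i++)
        {
            int idx = (_current - i + _buckets.Length * 2) % _buckets.Length;
            total += Interlocked.Read(ref _buckets[idx]);
        }
        return total / (double)seconds;
    }
}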

With the exact metrics, we get the benefit of being able to get the per second data and look at the peaks & valleys and measure them. With the exponentially weighted moving average, we have a more immediate response to changes, but it is never actually accurate.

It is a bit more work, but it is much more understandable code. On the other hand, it can result in strangeness. If you have a burst of traffic, let’s say 3,000 requests over 3 seconds, then the average req/sec over the last minute will stay fixed at 50 req/sec for a whole minute. Which is utterly correct and completely misleading.

I’m not sure how to handle this specific scenario in a way that is both accurate and expected by the user.

Protocol design implications: REST vs. TCP

time to read 3 min | 444 words

I was going over design documents today, and I noticed some common themes in the changes that we have between RavenDB 3.5 and RavenDB 4.0.

With RavenDB 3.5 (and all previous versions), we always had the communication layer as HTTP REST calls between nodes. When I designed RavenDB, REST was the thing to do, and that is reflected in the design of RavenDB itself. However, 8 years later, we sat down and considered whether this is really appropriate for everything. The answer was a resounding no. In fact, while over 95% of RavenDB is still pure REST calls, we have moved certain key functions to using TCP directly.

Note that this stands in direct contrast to this post of mine from 2012: Why TCP is evil and HTTP is king.

The concerns in that post are still valid, but we have found that there are a few major reasons why we want to switch to TCP for certain things. In particular, the basic approach is that a client will communicate with the server using HTTP calls, but servers communicate with one another using TCP. The great thing about TCP is that it is a stream oriented protocol, so I don’t need to carry state with me on every call.

With HTTP, each call is stateless, and I can’t assume anything about the other side. That means that I need to send the state, manage the state on the other side, and have to deal with potential issues such as concurrency in the same conversation, restarts of one side that the other side can’t easily detect, repeated validation on each call, etc.

With TCP, on the other hand, I can make a lot of assumptions about the conversation. I have state that I can carry between calls to the other side, and as long as the TCP connection is open, I can assume that it is valid. For example, if I need to know what the last item I sent to the remote end was, I can query that at the beginning of the TCP connection, as part of the handshake, and from then on I can just assume that what I sent to the other side has arrived (since otherwise I’ll eventually get an error, requiring me to create a new TCP connection and do another handshake). On the other side, I can verify the integrity of a connection once, without needing to repeatedly verify our mutual state on each and every message being passed.
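
To illustrate the difference, here is a rough sketch of the “negotiate once, then stream” pattern. This is not the actual RavenDB wire protocol, just the shape of the idea: the handshake establishes where the other side is up to, and after that we keep pushing items without re-validating shared state on every message.

using System;
using System.IO;
using System.Net.Sockets;

// Illustrative only; message framing and names are made up for the sketch.
public class ReplicationSender
{
    public void Run(string host, int port, Func<long, byte[]> readNextItemAfter)
    {
        using (var client = new TcpClient(host, port))
        using (var stream = client.GetStream())
        using (var reader = new BinaryReader(stream))
        using (var writer = new BinaryWriter(stream))
        {
            // Handshake: ask once what the other side already has.
            writer.Write("HELLO");
            long lastReceived = reader.ReadInt64();

            // From here on, the connection itself carries the conversation state.
            // If anything goes wrong the socket errors out, and we simply
            // reconnect and do a fresh handshake.
            while (true)
            {
                var item = readNextItemAfter(lastReceived);
                if (item == null)
                    break;
                writer.Write(item.Length);
                writer.Write(item);
                lastReceived++; // simplification: items are numbered sequentially
            }
        }
    }
}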

This has drastically simplified a lot of code on both the sending and receiving ends, and reduced the number of network roundtrips by a significant amount.

Building a low level trie with Rust: Part I

time to read 3 min | 481 words

Before getting to grips with a distributed gossip system in Rust, I decided that it would be better to look at something a bit more challenging, but smaller in scope. I decided to implement the low level trie challenge in Rust.

This is interesting, because it is a complex enough problem to require thinking even for experienced developers, but at the same time, it isn’t complex, it just has a lot of details. It also requires us to do a lot of low level stuff and manipulate memory directly, so it is an interesting test for a systems level programming language.

On the one hand, even with just a few hours with Rust, I can see some elegance coming out of certain pieces.  For example, take a look at the following code:

This is responsible for searching the trie for a value, and I like that the find_match function traverses the tree and allows me to return both an enum value and the closest match when it fails (so I can continue the process directly from there).

On the other hand, we have pieces of code like this:

[image: code snippet with four casts in a single line]

And any line that has four casts in it is already suspect. And since I’m dealing with raw memory, I have quite a bit of this.

And I’m certainly feeling the pain of the borrow checker. Here is where I’m currently stumped.

This is a small and simple example that shows the issue. It fails to compile:

[image: the failing code sample]

I have a method that takes a mutable MyTrie reference and passes it to a method that expects an immutable reference. This is fine, and would work. But I need to use the value from the find method in the delete_internal method, which again needs a mutable instance. And this fails with:

error[E0502]: cannot borrow `*self` as mutable because it is also borrowed as immutable

I understand the problem, but I am not really sure how to solve it. The thing is that I kinda want the find method to remain immutable, since it is also used in the read method, which can run on immutable instances. Technically speaking, I could copy the values that I want out of the node reference and use a lexical scope to force the immutable borrow to end, but I’m not sure yet what the best option would be.

It seems like a lot of work to get what I want in spite of, and not with the help of, the compiler.

Initial design for strong encryption in RavenDB 4.0

time to read 7 min | 1322 words

The previous post generated some great discussion, and we have done a bit of research in the meantime about what is going to be required in order to provide strong encryption support in RavenDB.

Note: I’m still no encryption expert. I’m basing a lot of what I have here on reading libsodium code and docs.

The same design goals that we had before still hold. We want to encrypt the data at the page level, but it looks like it is going to be impossible to just encrypt the whole page. The reason is that encryption is actually a pure mathematical operation: given the same input text and the same key, it is always going to generate the same value. Using that, you can mount certain attacks on the data by exploiting the sameness of the data, even if you don’t actually know what it is.

In order to prevent that, you use an initialization vector or nonce (they seem to be pretty similar, with the differences between them being relevant only with regard to the randomness requirements they have). At any rate, while I initially hoped that I could just use a fixed value per page, that is a big “nope, don’t do that”. So we need some place to store that information.

Another thing that I ran into is the problem of modifying the encrypted text in order to generate data that can be successfully decrypted but is different from the original plain text. A nice example of that can be seen here (see the section: How to Attack Unauthenticated Encryption). So we probably want to protect against that as well.

This is not possible with the current Voron format. Luckily, one of the reasons we built Voron is so we can get it to do what we want. Here is what a Voron page will look like after this change:

  Voron page: 8 KB in size, 64 bytes header
+-------------------------------------------------------------------------+
|Page # 64 bits|Page metadata up to 288 bits  |mac 128 bits| nonce 96 bits|
+-------------------------------------------------------------------------+
|                                                                         |
|  Encrypted page information                                             |
|                                                                         |
|       8,128 bytes                                                       |
|                                                                         |
|                                                                         |
+-------------------------------------------------------------------------+

The idea is that when we need to encrypt a page, we’ll do the following:

  • The first time we need to encrypt the page, we’ll generate a random nonce. Each time we encrypt the page after that, we’ll increment the nonce.
  • We’ll encrypt the page information and put it in the page data section.
  • As well as encrypting the data, we’ll also sign both it and the rest of the page header, and place that signature in the mac field.

The idea is that modifying either the encrypted information or the page metadata will generate an error because the tampering will be detected.
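
For the encryption step itself, here is a rough sketch of what that could look like with libsodium’s detached AEAD mode. I’m using the _ietf variant here purely because its 96 bit nonce matches the layout above; the buffer layout and the key / nonce management are glossed over, so treat this as an illustration rather than the final design.

using System;
using System.Runtime.InteropServices;

public static unsafe class PageEncryption
{
    // Detached mode keeps the ciphertext the same size as the plaintext and
    // returns the 16 byte mac separately, which lets us store it in the header.
    [DllImport("libsodium", CallingConvention = CallingConvention.Cdecl)]
    private static extern int crypto_aead_chacha20poly1305_ietf_encrypt_detached(
        byte* c, byte* mac, ulong* maclen,
        byte* m, ulong mlen,
        byte* ad, ulong adlen,
        byte* nsec, byte* npub, byte* k);

    // Encrypts payloadLen bytes of page data, authenticating the plain text
    // header fields (page # + metadata) as additional data, so tampering with
    // either the ciphertext or the header is detected on decryption.
    public static void EncryptPage(
        byte* ciphertext, byte* plaintext, int payloadLen,
        byte* pageHeader, int headerLen, // page # + metadata, authenticated but not encrypted
        byte* mac,                       // 16 bytes, stored in the page header
        byte* nonce,                     // 96 bits, stored in the page header
        byte* key)                       // 256 bit key
    {
        ulong macLen;
        var rc = crypto_aead_chacha20poly1305_ietf_encrypt_detached(
            ciphertext, mac, &macLen,
            plaintext, (ulong)payloadLen,
            pageHeader, (ulong)headerLen,
            null, nonce, key);
        if (rc != 0)
            throw new InvalidOperationException("Page encryption failed");
    }
}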

This is pretty much it as far as the design of the actual data encryption goes. But there is more to it.

Voron uses a memory mapped file to store the information (actually several, with pretty complex interactions, but that doesn’t matter right now). That means that if we want to decrypt the data, we probably shouldn’t be doing that on the memory mapped file’s memory. Instead, each transaction is going to set aside some memory of its own, and when it needs to access a page, the page will be decrypted from the mmap file into that transaction’s private copy. During the transaction run, the information will be available in plain text for that transaction. When the transaction is over, that memory is going to be zeroed. Note that transactions in RavenDB tend to be fairly short term affairs. Because of that, each read transaction is going to get a small buffer to work with, and if more pages are accessed than allowed, it will replace the least recently used page with another one.

That leaves us with the problem of the encryption key. One option would be to encrypt all pages within the database with the same key, using the randomly generated nonce per page and then just incrementing it. However, that does leave us with the chance that two pages will be encrypted using the same key/nonce. That has a low probability, but it should be considered. We can try deriving a new key per page from the master key, but that seems… excessive. But it looks like there is another option: to use a block cipher where we pass a different block counter for each page.

This would require a minimal change to crypto_aead_chacha20poly1305_encrypt_detached, allowing us to pass the block counter externally, rather than have it as a constant. I asked the question with more details so I can have a more authoritative answer about that. If this isn’t valid, we’ll probably use a nonce that is composed of the page # and the number of changes that the page has gone through. This would limit us to about 2^32 modifications of the same page, though. It would also limit a single database file size to a mere 0.5 exabytes rather than 128 zettabytes, but somehow I think we can live with that.

That just leaves us with the details of key management. On Windows, this is fairly easy. We can use CryptProtectData / CryptUnprotectData to protect the key. A transaction will start by getting the key, doing its work, then zeroing all the pages it touched (and decrypted) and its copy of the key. This way, if there are no active transactions, there is no plaintext key in memory. On Linux, we can apparently use Libsecret to do this, although it seems to have a much higher cost.
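
Just to make the Windows side concrete, here is a sketch using ProtectedData, the managed wrapper over CryptProtectData / CryptUnprotectData. This is simplified (the Linux / Libsecret path would sit behind the same sort of interface), so treat it as an outline rather than the actual design.

using System;
using System.Security.Cryptography;

// A sketch of keeping only the DPAPI protected form of the master key around.
public class MasterKeyHolder
{
    private readonly byte[] _protectedKey;

    public MasterKeyHolder(byte[] masterKey)
    {
        // Keep only the protected form; wipe the plaintext copy we were given.
        _protectedKey = ProtectedData.Protect(masterKey, null, DataProtectionScope.CurrentUser);
        Array.Clear(masterKey, 0, masterKey.Length);
    }

    // A transaction grabs the key, uses it, and is responsible for zeroing it
    // (along with any pages it decrypted) when it is done.
    public byte[] UnprotectForTransaction()
    {
        return ProtectedData.Unprotect(_protectedKey, null, DataProtectionScope.CurrentUser);
    }

    public static void Zero(byte[] sensitive)
    {
        Array.Clear(sensitive, 0, sensitive.Length);
    }
}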

Strong data encryption questions

time to read 3 min | 428 words

With RavenDB 4.0, we are looking to strengthen our encryption capabilities. Right now RavenDB is capable of encrypting document data and the contents of indexes at rest. That is, if you look at the disk, the data is securely encrypted. However, in memory, we keep quite a bit of information in plain text (mostly in caches of various kinds), and the document metadata isn’t encrypted, so document keys are visible.

With RavenDB 4.0 we are looking into making some stronger guarantees. That means that we want to keep all data encrypted on disk, and only decrypt it during a transaction, after which it will immediately be encrypted back.

Now, encryption and security in general are pretty big fields, and I’m by no means an expert, so I thought that I would outline the initial goals of our research and see if you have anything to add.

  • All encryption / decryption operations are done on data that is aligned on a 4KB boundary and is always a multiple of 4KB in size. It would be extremely helpful if the encryption did not change the size of the data. Given that the data is always in 4KB increments, I don’t think that this is going to be an issue.
  • We can’t use a managed API to do so. Our data is actually residing in unmanaged memory, so ideally we would need something along the lines of the sketch after this list.
  • I also need to be able to call this from C#, and it needs to run on Windows, Linux and hopefully Mac OS.
  • I’ve been looking at stuff like this page, trying to understand what it means and hoping that this is actually using best practices for safety.
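
To make that requirement concrete, this is roughly the shape of the API that would fit. The names and signatures below are hypothetical; they describe what we need rather than any existing library.

using System.Runtime.InteropServices;

// Hypothetical native API shape - this library does not exist, it just captures
// the requirements: in place, on unmanaged memory, in 4KB multiples, without
// changing the size of the data.
public static unsafe class NativeCrypto
{
    [DllImport("our_native_crypto_lib", CallingConvention = CallingConvention.Cdecl)]
    public static extern int encrypt_pages(
        byte* data,   // 4KB aligned buffer, encrypted in place
        ulong size,   // always a multiple of 4,096
        byte* key,    // 256 bit key
        byte* nonce); // per call nonce, managed by the caller

    [DllImport("our_native_crypto_lib", CallingConvention = CallingConvention.Cdecl)]
    public static extern int decrypt_pages(
        byte* data, ulong size, byte* key, byte* nonce);
}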

Another problem is that just getting the encryption code right doesn’t help without managing all the rest of it properly. Selecting the appropriate algorithm and mode, making sure that the library we use is both well known and respected, etc. How do we distribute / deploy / update it over multiple platforms?

Any recommendations?

You can see some sample code that I have made here: https://gist.github.com/ayende/13b206b9d83e7aa126df77d6b12711f3

This is basically the OpenSSL sample translated to C# with a bit of P/Invoke. Note that this is meant for our own use, so we don't need padding since we always pass a buffer that is a multiple of 4KB.

I'm assuming that since this is based on the example on the OpenSSL wiki, it is also a best practice sample. There is a chance that I am mistaken, however, which is why we have this post.
