One of the surprising points for improvement in our performance run was the following logic, responsible for copying the data from the user to our own memory:
Those three lines of code accounted for no less than 25% of the cost in that run. It was obvious that something needed to be done. My belief is that the unmanaged memory stream is just not optimized for this scenario, resulting in a lot of copying, allocations and other costs.
Here is what we did instead. We create a temporary space that is allocated once, like this:
You can see that we are doing some interesting stuff there. In particular, we are allocating a managed buffer, but we also force the GC to pin it. We keep it around for the entire lifetime of the database, too. The idea here is that we want to avoid the cost of pinning & unpinning it all the time, even if it means that we have a block of memory that the GC can never move.
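To make the idea concrete, here is a minimal sketch of a buffer that is allocated and pinned once. This is not the actual code from the database. The type name `PinnedTempBuffer` and its members are hypothetical; the only real APIs used are `GCHandle.Alloc` with `GCHandleType.Pinned` and `AddrOfPinnedObject`.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical type, for illustration only: a managed buffer that is
// pinned once and stays pinned until it is disposed.
public sealed class PinnedTempBuffer : IDisposable
{
    private GCHandle _handle;

    // The managed view of the memory (what Stream.Read can write into).
    public byte[] Managed { get; }

    // The unmanaged view of the very same memory.
    public IntPtr Pointer { get; }

    public PinnedTempBuffer(int size)
    {
        Managed = new byte[size];
        // Pinning tells the GC it must not move this array, so the
        // address below stays valid for as long as the handle is held.
        _handle = GCHandle.Alloc(Managed, GCHandleType.Pinned);
        Pointer = _handle.AddrOfPinnedObject();
    }

    public void Dispose()
    {
        // Unpin exactly once, when the owner (e.g. the database) shuts down.
        if (_handle.IsAllocated)
            _handle.Free();
    }
}
```

The trade-off is the one described above: we pay for the pin once, and in exchange the GC has one block it has to work around for the lifetime of the owner.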
At any rate, the important thing about this is that it gives us access to the same memory from both the managed and the unmanaged perspectives. And that, in turn, leads to the following code:
We first read the values from the stream into the managed buffer, then copy them from the buffer's unmanaged (pinned) pointer into our own memory.
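A rough sketch of that loop might look like the following. Again, this is an illustration under assumptions, not the real implementation: the `StreamCopy` class, the `CopyTo` method and its parameter names are all hypothetical, and the code needs to be compiled with unsafe blocks enabled. It assumes the caller has already pinned `temp` and passes its address as `tempPtr`.

```csharp
using System;
using System.IO;

// Hypothetical helper, for illustration only.
public static class StreamCopy
{
    // Copies exactly `count` bytes from `source` into the unmanaged memory
    // at `destination`. `temp` is a managed buffer that the caller has
    // already pinned, and `tempPtr` is the address of that pinned buffer.
    public static unsafe void CopyTo(
        Stream source, IntPtr destination, long count,
        byte[] temp, IntPtr tempPtr)
    {
        byte* dest = (byte*)destination;
        byte* src = (byte*)tempPtr;

        while (count > 0)
        {
            // Stream only knows how to fill managed arrays.
            int toRead = (int)Math.Min(temp.Length, count);
            int read = source.Read(temp, 0, toRead);
            if (read == 0)
                throw new EndOfStreamException("Stream ended before all bytes were read.");

            // Same bytes, seen through the pinned pointer: a single
            // memory copy into our own memory, with no extra allocations.
            Buffer.MemoryCopy(src, dest, count, read);

            dest += read;
            count -= read;
        }
    }
}
```

Note that `Stream.Read` is allowed to return fewer bytes than requested, which is why the loop advances by `read` rather than by `toRead`.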
The idea here is that we accept a Stream abstraction, and that can only work with managed buffers, so we have to go through this route instead of copying the memory directly. The reason we do that is that we don’t want to force the user of our API to materialize the data fully. We want to be able to stream it into the database.
At any rate, this made a serious improvement to our performance, but I’ll be showing the details in a future post.