Note, this post was written by Federico.
The .Net Just-In-Time compiler (or JIT for short) is one of those marvels you don't even realize is there, unless you have to smooth talk it to do your bidding. At the Voron level, most routines are so tight that the assembler emitted becomes important. It is not uncommon to have to reason about how the instructions are going to be decoded by the CPU, if the microarchitecture is able to execute two instructions independently or not or we are memory-bound (having too many cache misses).
Those with a C/C++ background know the optimizing compiler is actually quite good at its job; and even so, tricks are required to squeeze the last performance bits. The JIT is no different, with the addition that while the C++ compiler has all the time in the world (sort-of); the restrictions imposed by compiling while your code is executing are far more severe. However, even on a budget, the CoreCLR does a pretty amazing job at optimizing. Until, of course, you expect the performance feats of the offline C++ compiler from it.
So the question you may probably have is: "Why bother? Why don't you just go and write the database in C++?”. I won't go in detail on that, since we covered that already . What I can tell you is, that when you dig into the JIT world, it is pretty easy to achieve performance on par with C++ if the JIT by design doesn't get in your way with unsupported features. Good thing is the CoreCLR team is quite open to tackle those issues if presented with a detailed analysis on the cost-effectiveness of your optimization.
In this series we will talk about the tricks we had to teach our code to squeeze those last bits of micro-optimizations. From specific optimizations requested to the development team and workarounds until they are available, to tricks we use to ensure the assembler code generated is of the highest quality possible.
One word of caution: The JIT is ever improving so it is important to understand that the optimizations opportunities presented here may not apply tomorrow or just be done automatically in future versions of CoreCLR.