With a little knowledge and a profiler, let us optimize!


As part of the work we have been doing on Voron, I wrote a few benchmarks and looked at where the hot spots are. One of the major ones was this function:

    public override void Flush()
    {
        if (_flushMode == FlushMode.None)
            return;

        PagerState.Accessor.Flush();
        _fileStream.Flush(_flushMode == FlushMode.Full);
    }

This is effectively the “fsync()” call. The Accessor.Flush() call resolves to FlushViewOfFile(0, size), and _fileStream.Flush(true) resolves to FlushFileBuffers on Windows.
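
For reference, here is roughly what those two calls boil down to at the Win32 level. This is a minimal P/Invoke sketch of my own (the NativeFlush class name is made up), not code from Voron:

    using System;
    using System.Runtime.InteropServices;
    using Microsoft.Win32.SafeHandles;

    internal static class NativeFlush
    {
        // Flushes a range of pages from a memory-mapped view to the OS file
        // cache (not necessarily all the way to the disk).
        [DllImport("kernel32.dll", SetLastError = true)]
        internal static extern bool FlushViewOfFile(IntPtr lpBaseAddress, UIntPtr dwNumberOfBytesToFlush);

        // Forces the OS to write the file's cached data to disk; this is the
        // part that actually has to wait for the hardware.
        [DllImport("kernel32.dll", SetLastError = true)]
        internal static extern bool FlushFileBuffers(SafeFileHandle hFile);
    }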

It isn’t surprising that this would be THE hot spot; it is the part where we actually have to wait for the hardware to do its work, after all. But further investigation revealed that it wasn’t FlushFileBuffers that was really costly, it was FlushViewOfFile. What FlushViewOfFile does is scan all of the pages in the range and flush the dirty ones to the OS (not to disk). That is great, but it is effectively an O(N) operation. We have more knowledge about what is going on, so we can do better. We already know which pages are dirty, so we can use that information ourselves instead of letting the OS do all the work.
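
The dirty page numbers themselves have to be collected while we write. Voron's actual bookkeeping isn't shown here; the following is just a hypothetical sketch of the idea, with made-up member names:

    // Hypothetical bookkeeping, not Voron's actual code: remember every page
    // we modify so that the flush can be limited to exactly those pages.
    private readonly SortedSet<long> _dirtyPages = new SortedSet<long>();

    public void MarkPageDirty(long pageNumber)
    {
        _dirtyPages.Add(pageNumber); // duplicates are ignored, order is kept
    }

    public List<long> GetSortedPagesToFlushAndReset()
    {
        // SortedSet enumerates in ascending order, so the list is already sorted.
        var pages = new List<long>(_dirtyPages);
        _dirtyPages.Clear();
        return pages;
    }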

But then we run into another problem. If we just call FlushViewOfFile for every page separately, we are going to spend a lot of time calling into the OS whenever we have a large write. So we need to balance the amount of data we send to FlushViewOfFile against the number of times we call it. Therefore, I came up with the following logic: we group calls to FlushViewOfFile as long as the pages are nearby (within 256KB, or 64 pages, of one another). This gives us the best balance between reducing the number of pages that FlushViewOfFile needs to scan and reducing the number of times we call it.

The new version looks like this:

    public override void Flush(List<long> sortedPagesToFlush)
    {
        if (_flushMode == FlushMode.None || sortedPagesToFlush.Count == 0)
            return;

        // here we try to optimize the amount of work we do, we will only
        // flush the actual dirty pages, and we will do so in sequential order
        // ideally, this will save the OS the trouble of actually having to flush the
        // entire range
        long start = sortedPagesToFlush[0];
        long count = 1;
        for (int i = 1; i < sortedPagesToFlush.Count; i++)
        {
            var difference = sortedPagesToFlush[i] - sortedPagesToFlush[i - 1];
            // if the difference between them is not _too_ big, we will just merge it into a single call
            // we are trying to minimize both the size of the range that we flush AND the number of times
            // we call flush, so we need to balance those needs.
            if (difference < 64)
            {
                count += difference;
                continue;
            }
            FlushPages(start, count);
            start = sortedPagesToFlush[i];
            count = 1;
        }
        FlushPages(start, count);

        if (_flushMode == FlushMode.Full)
            _fileStream.Flush(true);
    }
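
FlushPages isn't shown above. Presumably it translates the page range into an address and length inside the mapped view and hands just that slice to FlushViewOfFile. Here is a sketch under those assumptions, reusing the NativeFlush declarations from the earlier sketch; the 4KB page size and the _baseAddress field are assumptions of mine, not Voron's actual code:

    private const int PageSize = 4096; // assumed page size

    private void FlushPages(long startPage, long count)
    {
        // Compute the starting address and byte length of the dirty range
        // within the mapped view, then ask the OS to flush only that range.
        var address = new IntPtr(_baseAddress.ToInt64() + startPage * PageSize);
        var length = new UIntPtr((ulong)(count * PageSize));
        if (NativeFlush.FlushViewOfFile(address, length) == false)
            throw new System.ComponentModel.Win32Exception(Marshal.GetLastWin32Error());
    }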

A side effect of this change is that we are also more likely to write to the disk in a sequential fashion.

The end result of this change ranged from doubling the performance of the system in the worst-case scenario to “just” a 25% improvement under the best conditions.