time to read 8 min | 1411 words

One of the most frequent operations we make in RavenDB is getting a page pointer. That is the basis of pretty much everything you can think of inside the storage engine. On the surface, that is pretty simple; here is the API we'll call:

public Page GetPage(long pageNumber)

Easy, right? But internally, we need to ensure that we get the right page. And that is where some complexity enters the picture. Voron, our storage engine, implements a pattern called MVCC (multi-version concurrency control). In other words, two transactions loading the same page may see different versions of the page at the same time.

What this means is that the call to GetPage() needs to check whether the page (see the sketch after this list):

  • Has been modified in the current transaction
  • Has been modified in a previously committed transaction that has not yet been flushed to disk
  • Is unmodified, so the on-disk version is the most up-to-date one
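Voron's real lookup code is internal, but a hypothetical, self-contained sketch of those three checks might look like the following. The dictionaries and the data-file delegate are invented names purely for illustration, not Voron's actual types:

using System;
using System.Collections.Generic;

public unsafe class PageResolverSketch
{
    // Pages modified by the current transaction (writable copies).
    private readonly Dictionary<long, IntPtr> _txModifiedPages = new();
    // Pages committed by earlier transactions but not yet flushed to disk.
    private readonly Dictionary<long, IntPtr> _unflushedPages = new();
    // Fallback: read the page from the memory-mapped data file.
    private readonly Func<long, IntPtr> _readFromDataFile;

    public PageResolverSketch(Func<long, IntPtr> readFromDataFile) =>
        _readFromDataFile = readFromDataFile;

    public byte* Lookup(long page, out bool isWritable)
    {
        if (_txModifiedPages.TryGetValue(page, out var ptr))
        {
            isWritable = true;   // check 1: modified in the current transaction
            return (byte*)ptr;
        }
        if (_unflushedPages.TryGetValue(page, out ptr))
        {
            isWritable = false;  // check 2: committed, but not yet flushed
            return (byte*)ptr;
        }
        isWritable = false;      // check 3: the on-disk version is current
        return (byte*)_readFromDataFile(page);
    }
}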

Each one of those checks is cheap, but getting a page is a common operation. So we implemented a small cache in front of these operations, which resulted in a substantial performance improvement. 

Conceptually, here is what that cache looks like:

public unsafe class PageLocator
{
    private struct PageData
    {
        public long PageNumber;
        public byte* Pointer;
        public bool IsWritable;
    }

    // Direct-mapped cache: each page number maps to exactly one slot.
    private PageData[] _cache = new PageData[512];

    public byte* Get(long page, out bool isWritable)
    {
        var index = page & 511; // page number modulo the cache size
        ref var data = ref _cache[index];
        if (data.PageNumber == page)
        {
            // Fast path: this transaction already resolved the page.
            isWritable = data.IsWritable;
            return data.Pointer;
        }
        // Slow path: resolve the page and populate the slot.
        return LookupAndGetPage(page, ref data, out isWritable);
    }

    public void Reset()
    {
        // Invalidate every slot; -1 is never a valid page number.
        for (int i = 0; i < _cache.Length; i++)
            _cache[i].PageNumber = -1;
    }
}

This is intended to be a fairly simple cache, and it is fine if certain access patterns don't produce optimal results. After all, the source it sits in front of is already quite fast; we simply want to get even better performance when we can. This matters because caching is quite a complex topic on its own. The PageLocator itself is used in the context of a transaction and is pooled. Transactions in RavenDB tend to be pretty short, which alleviates a lot of the complexity around cache management.
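The pooling pattern described here could look something like the following minimal sketch, using Microsoft.Extensions.ObjectPool; RavenDB's actual transaction and pooling code is internal, so the shape is illustrative only:

using System;
using Microsoft.Extensions.ObjectPool;

public class TransactionSketch : IDisposable
{
    private static readonly ObjectPool<PageLocator> LocatorPool =
        new DefaultObjectPool<PageLocator>(
            new DefaultPooledObjectPolicy<PageLocator>());

    private readonly PageLocator _pageLocator = LocatorPool.Get();

    public void Dispose()
    {
        _pageLocator.Reset();             // make it ready for reuse
        LocatorPool.Return(_pageLocator); // hand it to the next transaction
    }
}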

This is actually a pretty solved problem for us, and has been for a long while. We have been using some variant of the code above for about 9 years. The reason for this post, however, is that we are trying to optimize things further. And this class showed up in our performance traces as problematic.

Surprisingly enough, what is actually costly isn't the lookup part, but making the PageLocator ready for the next transaction. We are talking about the Reset() method.

The question is: how can we significantly optimize the process of making the instance ready for a new transaction? We don’t want to allocate, and resetting the page numbers is what is costing us.

One option is to add a Generation field to the PageData structure, which we then check when reading from the cache. Resetting the cache can be done by incrementing the locator's generation with a single instruction. That is pretty awesome, right?
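Conceptually, the reset then collapses to a single increment; a sketch of the naive version, assuming an int _generation field on the locator:

    // Naive version: invalidate every cached entry by bumping the generation.
    public void Reset() => _generation++;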

Sadly, that exposes a problem: what happens when we use the locator enough times for the generation counter to overflow? If a sequence of events brings us back to the same generation value as a stale cached entry, we risk getting an old page (from a previous transaction, which happened long ago).

The chances for that are low, seriously low. But that is not an acceptable risk for us. So we need to consider going further. Here is what we ended up with:

using System.Runtime.InteropServices; // for MemoryMarshal

public unsafe class PageLocator
{
    private struct PageData
    {
        public long PageNumber;
        public byte* Pointer;
        public ushort Generation;
        public bool IsWritable;
    }

    private PageData[] _cache = new PageData[512];
    private int _generation = 1;

    public byte* Get(long page, out bool isWritable)
    {
        var index = page & 511;
        ref var data = ref _cache[index];
        // A hit now requires matching the current generation as well.
        if (data.PageNumber == page && data.Generation == _generation)
        {
            isWritable = data.IsWritable;
            return data.Pointer;
        }
        return LookupAndGetPage(page, ref data, out isWritable);
    }

    public void Reset()
    {
        _generation++;
        if (_generation >= 65535)
        {
            // The ushort generation is about to wrap around; zero the whole
            // cache so no stale entry can match a recycled generation value.
            _generation = 1;
            MemoryMarshal.Cast<PageData, byte>(_cache).Fill(0);
        }
    }
}

Once every 64K operations, we'll pay the cost of resetting the entire buffer, but we do that with a single, heavily optimized fill operation. If needed, we can take it further, but comparing the profiler results before and after the change tells the story: the cost of the Renew() call, which was composed mostly of the Reset() call, basically dropped off the radar, and the performance roughly doubled.

That is a pretty cool benefit for a straightforward change.
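For the curious, here is a hypothetical BenchmarkDotNet sketch of the two Reset() strategies in isolation; it is not the benchmark behind the traces above, just a way to see the difference for yourself:

using System.Runtime.InteropServices;
using BenchmarkDotNet.Attributes;

public unsafe class ResetBenchmarks
{
    private struct PageData
    {
        public long PageNumber;
        public byte* Pointer;
        public ushort Generation;
        public bool IsWritable;
    }

    private readonly PageData[] _cache = new PageData[512];
    private int _generation = 1;

    [Benchmark(Baseline = true)]
    public void LoopReset() // the old approach: touch all 512 slots
    {
        for (int i = 0; i < _cache.Length; i++)
            _cache[i].PageNumber = -1;
    }

    [Benchmark]
    public void GenerationReset() // the new approach: usually one increment
    {
        _generation++;
        if (_generation >= 65535)
        {
            _generation = 1;
            MemoryMarshal.Cast<PageData, byte>(_cache).Fill(0);
        }
    }
}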

 

time to read 1 min | 183 words


I recently talked about how RavenDB is now using ZStd as the default compression algorithm for backups. That reduced both the amount of storage we consume for backups and, significantly, the time it takes to actually run them.

We have been exploring where else we can get those benefits and the changes were recently released in RavenDB 6.0.2.

RavenDB now supports ZStd for HTTP compression, which you can control using the DocumentConventions.HttpCompressionAlgorithm setting.
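Enabling it from the client looks something like the following sketch. I'm assuming the 6.x client API shape here, and the exact enum value name is worth verifying against the release notes:

using Raven.Client.Documents;
using Raven.Client.Http; // assumed location of the HttpCompressionAlgorithm enum

var store = new DocumentStore
{
    Urls = new[] { "http://localhost:8080" },
    Database = "Orders"
};
// Switch HTTP compression to ZStd; must be set before Initialize(),
// since conventions are frozen once the store is initialized.
store.Conventions.HttpCompressionAlgorithm = HttpCompressionAlgorithm.Zstd;
store.Initialize();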

You can find all the gory details about the performance impact in the release announcement here.

The really nice thing is that you can expect to see about a 50% reduction in the amount of bandwidth being used at comparable or better timings. That is especially true if you are using bulk inserts, where the benefit is most noticeable.
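For example, a bulk insert (sketched here with a hypothetical Order class) is exactly the kind of large, compressible payload where the saving shows up:

// Any large batch of documents behaves similarly.
using (var bulkInsert = store.BulkInsert())
{
    for (int i = 0; i < 100_000; i++)
        bulkInsert.Store(new Order { Amount = i });
}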

If you are running on the cloud, that matters a lot, since a reduction in bandwidth to and from the database translates directly into dollars being saved.

time to read 1 min | 139 words

This was actually released a while ago; I was occupied with other matters and missed it.

I had a blast talking with Carl & Richard about data sharding and how we implemented that in RavenDB.

What is data sharding, and why do you need it? Carl and Richard talk to Oren Eini about his latest work on RavenDB, including the new data sharding feature. Oren talks about the power of sharding a database across multiple servers to improve performance on massive data sets. While a sharded database is typically in a single data center, it is possible to distribute the shards across multiple locations. The conversation explores the advantages and disadvantages of the different approaches, including that you might not need it today, but it's great to know it's there when you do!

You can listen to the podcast here.

time to read 1 min | 103 words

Jaime and I had a really good discussion about RavenDB, why I took the time to create my own NoSql database engine, and the fact that I built it using .NET Core before it was released (back in the pre-1.0 days, when it was known as dnx), and some of the optimisation stories that I worked on when creating RavenDB. Along the way, we cover what the GC (or garbage collector) is, performance issues to look out for when dealing with large JSON objects, and some tips for those who want to optimise their applications.

You can listen to it here.

Would love your feedback.
