Optimization story: GetNextIdentityValueWithoutOverwritingOnExistingDocuments
A customer had a problem. They were mostly using the RavenDB HiLo algorithm for saving documents to the database, which is very fast and cheap. That client, however, chose to use the identity method, which means that RavenDB assigns the value itself.
This is usually used if you need to have sequential values. The identity is actually being managed internally by RavenDB, and that works perfectly fine.
Except… What happens when you add replication to the mix? The documents with the identity values are replicated to the secondary server, and there we don’t have the identity value, we just have the docs being written with their full ids (users/1, users/2, users/3, etc.).
So far, so good. But what happens when you have a failover and you need to write to the secondary, and you use the identity? Well, RavenDB ain’t stupid, and it won’t overwrite the users/1 document. Instead, it will search for the next available opening from the smallest identity value generated and use that. The code looks like this:
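A minimal sketch of that brute force search, written in Python for brevity rather than taken from RavenDB itself. The function name and the `exists` callback are hypothetical stand-ins for the real document lookup, not RavenDB APIs.

```python
from typing import Callable


def next_free_identity_linear(prefix: str, start: int,
                              exists: Callable[[str], bool]) -> int:
    """Walk forward one value at a time until an unused id is found.

    `exists` is a hypothetical lookup that returns True when a document
    with the given full id (for example "users/3") is already stored.
    """
    candidate = start
    while exists(f"{prefix}{candidate}"):
        candidate += 1  # one lookup for every document already written
    return candidate


# users/1 through users/5 were replicated to the secondary.
replicated = {f"users/{i}" for i in range(1, 6)}
print(next_free_identity_linear("users/", 1, replicated.__contains__))
```

The cost is one lookup per existing document, which is fine for a handful of documents and painful for millions.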
This works great, except when you have a large number of documents that have already been written. Instead of the brute force search, we now use the following approach:
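One way to get that behavior is to double the step until a free id turns up, then binary search between the last busy id and the first known free one. The Python sketch below illustrates the idea; it is not RavenDB's actual code. The function name and the `exists` callback are hypothetical, and the sketch assumes the taken ids form one contiguous run starting at `start`, as they would after replicating sequential identities.

```python
from typing import Callable, Tuple


def next_free_identity(prefix: str, start: int,
                       exists: Callable[[str], bool]) -> Tuple[int, int]:
    """Return (first free value, number of lookups performed).

    Assumption: every id from `start` up to some boundary is taken and
    everything above that boundary is free.
    """
    lookups = 0

    def is_taken(n: int) -> bool:
        nonlocal lookups
        lookups += 1
        return exists(f"{prefix}{n}")

    if not is_taken(start):
        return start, lookups

    # Phase 1: double the step until we overshoot into free territory.
    last_busy = start
    step = 1
    while is_taken(last_busy + step):
        last_busy += step
        step *= 2
    first_known_free = last_busy + step

    # Phase 2: binary search between the last busy and first known free id.
    while first_known_free - last_busy > 1:
        mid = (last_busy + first_known_free) // 2
        if is_taken(mid):
            last_busy = mid
        else:
            first_known_free = mid
    return first_known_free, lookups


def simulated_exists(key: str) -> bool:
    # Pretend users/1 through users/1000000000 already exist.
    return int(key.rsplit("/", 1)[1]) <= 1_000_000_000


value, lookups = next_free_identity("users/", 1, simulated_exists)
print(value, lookups)
```

Both phases take a number of lookups proportional to the logarithm of the number of existing documents, instead of one lookup per document.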
This can figure out the first free item in a range of a billion documents in under 100 tries, which I am pretty sure is good enough.