Generating sequential numbers in a distributed manner

time to read 4 min | 631 words

On its face, we have a simple requirement:

  • Generate sequential numbers
  • Ensure that there can be no gaps
  • Do that in a distributed manner

Generating the next number in the sequence is literally as simple as ++, so surely that is a trivial task, right?

The problem is with the second requirement. The need to ensure that there are no gaps comes often when dealing with things like invoices. The tax authorities are really keen on “show me all your invoices”, and if there are gaps in the numbers, you may have to provide Good Answers.

You may think that the third one, running in a distributed environment, is the tough challenge, but that isn’t actually the case. If we are running in a single location, that is fairly easy. Run the invoice id generation as a transaction, and you are done. But the normal methods of doing that are usually wrong in edge cases.

Let’s assume that we use an Oracle database, which uses the following mechanism to generate the new invoice id:

invoice_seq.NEXTVAL

Or SQL Server with an identity column:

CREATE TABLE invoices ( invoice_id INT IDENTITY(1,1) PRIMARY KEY, ... );

In both cases, we may insert a new value to the invoices table, consuming an invoice id. At some later point in time, we may roll back the transaction. Care to guess what happens then?

You have INVOICE #1000 and INVOICE #1002, but nothing in between. In fact, no way to even tell what happened, usually.

In other words, identity, sequence, serial, or autonumber – regardless of what database platform you use, are not suitable for generating gapless numbers.

The reasoning is quite simple. Assume that you have two concurrent transactions, which generate two new invoices at roughly the same time. You commit the later one before the first one, and roll back the first. You now have:

  • Invoice #1
  • Invoice #2
  • Invoice #1000
  • Invoice #1002

However, you don’t have Invoice #1001, and you cannot roll back the sequence value there, because if you do so, it will re-generate the #1002 on the next call.

Instead, for gapless numbers, we need to create this as a dedicated part of the transaction. So there would be a record in our system that contains the NextInvoiceId, which will be incremented as part of the new invoice creation.

In order to ensure that there are no gaps, you need to ensure that the NextInvoideId record increment is handled as a user operation, not a database operation. In other words, in SQL Server, that is a row in a table, that you manually increment as part of adding a new invoice. Here is what this will look like:

As you can see, we increment the row directly. So it will be rolledback as well.

The downside here is that we can no longer create two invoices concurrently. The second transaction would have to wait on the lock on the row in the next_gapless_ids table.

All of that happens inside a single database server. What happens when we are running in a distributed environment?

The answer in this case, is the exact same thing. You need a transaction, a distributed one, using a consensus algorithm. Here is how you can achieve this using RavenDB’s cluster wide transactions, which use the Raft protocol behind the scenes:

The idea is simple, we have a transaction that modifies the gapless ids document and creates a new invoice at the same time. We have to handle a concurrency exception if two transactions try to create a new invoice at the same time (because they both want to use the same invoice id value), but in essence this is pretty much exactly the same behavior as when you are running on a single node.

In other words, to ensure the right behavior, you need to use a transaction. And if you need a distributed transaction, that is just a flag away with RavenDB.