After talking about transaction merging, there is another kind of database trick that you can use to optimize your performance even further. It is called early lock release. With early lock release, we rely on the fact that the client will only know when the transaction has been committed after we told him. And that we don’t have to tell it right away.
That sounds strange, how can not telling the client that its transaction has been committed improve performance. Well, it improve throughput, but it can also improve the performance. But before we get into the details, let us see the code, and then talk about what it means:
The code is nearly identical to the one in the previous post, but unlike the previous post, here we play a little game. Instead of flushing the journal synchronously, we are going to do this in an async manner. And what is more important, we don’t want for it to complete. Why is that important?
It is important because notifying the caller that the transaction has been complete has now moved into the async callback, and we can start processing additional operations are write them to the journal file at the same time that the I/O for the previous operation completes.
As far as the client is concerned, it doesn’t matter how it works, it just need to get the confirmation that it has been successfully committed. But from the point of view of system resources, it means that we can parallelize a key aspect of the code, and we can proceed with the next set of transactions before the previous one even hit the disk.
There are some issues to consider. Typically you have only a single pending write operation, because if the previous journal write had an error, you need to abort all future transactions that relied on its in memory state (effectively, rollback that transaction and any speculative transactions we executed assuming we can commit that transaction to disk. Another issue is that error handling is more complex, if a single part of the merge transaction failed, it will fail unrelated operations, so you need to unmerge the transaction, run them individually, and report on each operation individually.
More posts in "The Guts n’ Glory of Database Internals" series:
- (08 Aug 2016) Early lock release
- (05 Aug 2016) Merging transactions
- (03 Aug 2016) Log shipping and point in time recovery
- (02 Aug 2016) What goes inside the transaction journal
- (18 Jul 2016) What the disk can do for you
- (15 Jul 2016) The curse of old age…
- (14 Jul 2016) Backup, restore and the environment…
- (11 Jul 2016) The communication protocol
- (08 Jul 2016) The enemy of thy database is…
- (07 Jul 2016) Writing to a data file
- (06 Jul 2016) Getting durable, faster
- (01 Jul 2016) Durability in the real world
- (30 Jun 2016) Understanding durability with hard disks
- (29 Jun 2016) Managing concurrency
- (28 Jun 2016) Managing records
- (16 Jun 2016) Seeing the forest for the trees
- (14 Jun 2016) B+Tree
- (09 Jun 2016) The LSM option
- (08 Jun 2016) Searching information and file format
- (07 Jun 2016) Persisting information