As you can imagine, I’m really interested in what is going on with storage engines. So when I read the RocksDB Tuning Guide with interest. Actually, mounting horror was more like it, to be frank.
It all culminated pretty nicely with the final quote:
Unfortunately, configuring RocksDB optimally is not trivial. Even we as RocksDB developers don't fully understand the effect of each configuration change. If you want to fully optimize RocksDB for your workload, we recommend experiments…
That is pretty scary statement. Now, to be fair, RocksDB is supposed to be a low level embedded storage engine, it isn’t meant to be something that is directly exposed to the user or operations people.
I’m literally writing databases for a living, and I had a hard time following all the options they had there. And it appears that from the final thought that the developers of RocksDB are also at a loss here. It is a very problematic state to be in. Because optimizations and knowing how to get the whole thing working is a pretty important part of using a database engine. And if your optimizations process relies on “change a bunch of settings”, and see what happens, that is not kind at all.
Remember, RocksDB is actually based on LevelDB, which is a database which was designed to be the backend of IndexdDB, which runs in the browser and has a single threaded client and a single user, pretty much. LevelDB can do a lot more than that, but that was the primary design goal, at least initially.
The problems with LevelDB are pretty well known, it suffers from write amplification, as well as known to hang if there is a compaction going on, and… well, you can read on that elsewhere.
RocksDB was supposed to take LevelDB and improve on all the issues that LevelDB had. But it appears that most of what was done was to actually create more configuration options (yes, I’m well aware that this is an unfair statement, but it sure seems so from the tuning guide). What is bad here is that the number of options is very high, and the burden it puts on the implementers is pretty high. Basically, it is a “change & pray” mindset.
Another thing that bugs me is the number of “mutex bottlenecks” that are mentioned in the tuning guide, with the suggestions to shard things to resolve them.
Now, to be fair, an embedded database require a bit of expertise, and cooperation from the host process, but that seems to be pushing it. It is possible that this is only required for the very high end usage, but that is still not fun.
Compared to that, Voron has about 15 configuration options, most of them about controlling the default sizes we use. And pretty much all of them are set to be auto tuned on the fly. Voron on its on actually have very few moving parts, which makes reasoning about it much simpler and efficient.