I’m not smart enough to build this feature

May 11 2020

There have been a couple of cases where I was working on a feature, usually a big and complex one, that made me realize that I’m just not smart enough to actually build it.

A long time ago (five or six years), I had to tackle free space handling inside of Voron. When an operation causes a page to be released, we need to register that page somewhere, so we’ll be able to reuse it later on. This doesn’t seem like too complex a feature, right? It is just a free list, what is the issue?

The problem was that the free list itself was managed as pages, and changes to the free list might cause pages to be allocated or de-allocated. This can happen while the free list operation is running. Basically, I had to make the free list code re-entrant. That was something that I tried to do for a long while, before just giving up. I wasn’t smart enough to handle that scenario properly. I could plug the leaks, but it was too complex, and it was obvious that I was going to run into additional cases where this would cause issues.

I had to use another strategy to handle this. Instead of allowing the free list to do dynamic allocations, I allocated the free list once, and then used a bitmap to store the data itself. That means that modifications to the free list couldn’t cause us to mutate the structure of the free list itself. The crisis was averted and the feature was saved.
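To make the idea concrete, here is a minimal sketch of a bitmap-backed free list. It is not Voron's actual code: the `BitmapFreeList` class, its method names, and the one-bit-per-page layout are all assumptions made for illustration. The point it tries to show is that releasing or allocating a page only flips a bit in storage that was sized once, so the operation can never trigger another allocation.

```python
from typing import Optional


class BitmapFreeList:
    """Hypothetical fixed-size free-page tracker: one bit per page."""

    def __init__(self, page_count: int) -> None:
        self.page_count = page_count
        # The storage is sized once, up front, and never grows or shrinks.
        self.bits = bytearray((page_count + 7) // 8)

    def release(self, page: int) -> None:
        """Mark a page as free. Only flips a bit, never allocates."""
        if not 0 <= page < self.page_count:
            raise IndexError(f"page {page} is out of range")
        self.bits[page // 8] |= 1 << (page % 8)

    def try_allocate(self) -> Optional[int]:
        """Return the lowest free page and mark it used, or None."""
        for byte_index, byte in enumerate(self.bits):
            if byte == 0:
                continue  # no free pages tracked in this byte
            bit = (byte & -byte).bit_length() - 1  # index of lowest set bit
            self.bits[byte_index] = byte & ~(1 << bit) & 0xFF
            return byte_index * 8 + bit
        return None


free_list = BitmapFreeList(page_count=20)
free_list.release(9)
free_list.release(3)
first = free_list.try_allocate()   # lowest free page is handed out first
second = free_list.try_allocate()
```

The trade-off in a design like this is that the bitmap costs space for every page whether it is free or not, in exchange for operations that cannot re-enter themselves.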

I just checked, and the last meaningful change that happened for this piece of code was in Oct 2016, so it has been really stable for us.

The other feature that I realized I’m not smart enough to handle came up recently. I was working on the documents compression feature. One of the more important aspects of this feature is the ability to retrain, which means that when compressing a document, we’ll use a dictionary to reduce the size, and if the dictionary isn’t well suited, we’ll create a new one. The first implementation used a dictionary per file range. So all the documents that are placed in a particular location will use the same dictionary. That correlates well with documents written at the same time and had the benefit of ensuring that new updates to the same document will use its original dictionary. That is likely to result in a good compression rate over time.

The problem was that during this process, we may find out that the dictionary isn’t suited for our needs and that we need to retrain. At this point, however, we had already computed the required space. But… the new dictionary may compress the data differently (the result can be either larger or smaller than what was expected). The fact that I could no longer rely on the size of the data during the save process led to a lot of heartache. The issue was worse because we first try to compress the value using a specific dictionary, find that we can’t place it in the right location and need to put it in another place.

However, to find the new place, we need to know the size that we need to allocate. And the new location may have a different dictionary, and there the compressed size is different…

Data may move inside RavenDB for a host of reasons: compaction, defrag, etc. Whenever we would move the data, we would need to recompress it, which led to a lot of complexity. I tried fighting that for a while, but it was obvious that I couldn’t manage the complexity.

I changed things around. Instead of having a dictionary per file range, I tagged the compressed value with a dictionary id. That way, each document could store the dictionary that it was using. That simplified the problem space greatly, because I only need to compress the data once, and afterward the size of the data remains the same. It meant that I had to keep a registry of the dictionaries, instead of placing a dictionary at the end of the specified file range, and it somewhat complicates recovery, but the overall system is ever so much simpler and it took a lot less effort and complexity to stabilize.
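A rough sketch of what tagging a value with its dictionary id can look like is below. This is an illustration, not RavenDB's implementation: the `DictionaryRegistry` class, the 4-byte id header, and the use of zlib's preset dictionary (`zdict`) as the compression scheme are all assumptions I picked to keep the example self-contained.

```python
import struct
import zlib


class DictionaryRegistry:
    """Hypothetical registry that maps a dictionary id to its bytes."""

    def __init__(self) -> None:
        self._dictionaries = {}
        self._next_id = 1

    def register(self, dictionary: bytes) -> int:
        dict_id = self._next_id
        self._next_id += 1
        self._dictionaries[dict_id] = dictionary
        return dict_id

    def get(self, dict_id: int) -> bytes:
        return self._dictionaries[dict_id]


def compress_value(registry: DictionaryRegistry, dict_id: int, raw: bytes) -> bytes:
    """Compress once and prefix the result with the dictionary id."""
    compressor = zlib.compressobj(zdict=registry.get(dict_id))
    payload = compressor.compress(raw) + compressor.flush()
    return struct.pack("<I", dict_id) + payload  # assumed 4-byte header


def decompress_value(registry: DictionaryRegistry, stored: bytes) -> bytes:
    """Read the id from the header, then decompress with that dictionary."""
    (dict_id,) = struct.unpack_from("<I", stored)
    decompressor = zlib.decompressobj(zdict=registry.get(dict_id))
    return decompressor.decompress(stored[4:]) + decompressor.flush()


registry = DictionaryRegistry()
old_id = registry.register(b'{"Name": "", "Email": "", "Address": ""}')
document = b'{"Name": "Oren", "Email": "oren@example.com", "Address": "n/a"}'
stored = compress_value(registry, old_id, document)

# Retraining adds a new dictionary; values already written are untouched.
new_id = registry.register(b'{"OrderId": "", "Lines": [], "Total": 0}')

# Moving the value is a plain byte copy: same bytes, same size.
moved = bytes(stored)
restored = decompress_value(registry, moved)
```

Because the stored bytes carry their own dictionary id, the size computed at compression time stays valid wherever the value ends up, which is the property that the per-range design could not give.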

I’m not sure why these particular problems have shown themselves to be beyond my capabilities. But it does bring to mind a very important quote:

When I Wrote It, Only God and I Knew the Meaning; Now God Alone Knows

I also have to admit that I had no idea that this quote predates code and computing. The earliest known reference for it is from 1826.

Comments

12 May 2020
08:56 AM

Rafal

Great quote, this one for sure comes from a prophet. But I think it's not about being smart enough or not. It's about how much time and how much brain you can dedicate - if you have to handle lots of stuff at the same time then it's just impossible to handle complex tasks, even if you were clearly able to do so a few years back. I have single-handedly created big & complex stuff in the past that has become a base for my company's product, but now I find it quite difficult to work on - some parts of the code became inaccessible to me, not to mention the rest of the team, who just steer clear of anything that requires understanding how it works internally. And it's not because the code is bad or something, it's just complex and touches so many areas that there's a really big chance nobody understands the full consequences of their modifications.

23 May 2020
11:15 AM

Olivier

Very interesting Ayende & Rafal.

I've lived exactly the same situations. I tend to write all alone very intricate things (for me at least) and I don't pretend to be super smart, but sometimes only some months after I struggle to go back to some portions of code I wrote before. The code can be clean, well constructed, sometimes it's just too much mentally to go back straight into it. And other members of the team just don't want to mess with it (and frankly I prefer they don't mess with it).

And (in my opinion) no amount of unit testing can guarantee that some very complex portions of code can be modified by someone other than the initial author. And it's tiring and a concern.

I've developed many applications over 25 years, and this is only a concern for very small parts of each application.

Software is such an imperfect human activity.

24 May 2020
08:54 AM

Oren Eini

Olivier,

I think that this sums this up pretty well:

https://heeris.id.au/2013/this-is-why-you-shouldnt-interrupt-a-programmer/

Oren Eini

CEO of RavenDB