Pesky code review comments

time to read 5 min | 866 words

A large portion of my day to day tasks is to review code. I’m writing this post barely two weeks into the new year, and we already had over 150 PRs going into RavenDB alone.

As a result, I’ve gotten sensitive to certain issues. For example, the following is a suggestion made for fixing an issue in this method declaration:

image

This is a piece of code (in C) that is meant to handle some low level details for RavenDB. We use the CLR coding conventions for C#, but for C, we have chosen to use a different convention, using snake_case for methods, arguments and variables and SHOUTING_CASE for constants / defines. When reading through the code, I marked this violation of the naming convention for a fix.

This may seem minor, but it is probably annoying for the author of the code. They are interested in comments about the code functionality and utility. Why spend any time on something that doesn’t really have any impact? Both forms of the parameter name are just as readable to me, after all.

Before I get to this part, I want to show another piece of code. Or, to be rather more exact, two pieces of code. One of the reasons that we are using C code is that we can abstract large parts of the underlying platform inside the native code. That means that we have certain parts of the code that are written twice. Once for Windows and once for Linux.

Here is some code for Windows:

image

And here is the same code for Linux:

image

You can see that this is pretty much the same thing, just calling the different APIs for each platform. Once thing to notice here is that part of this method’s task is to ensure that the file that we open is at least as big as the initially requested size.

In Windows, to increase or decrease the file size you call SetFilePointer() followed by SetEndOfFile(). On Linux, you have fallocate() and ftruncate()*.This is reflected in the code. The Windows code has a single method to do this work and the Linux method has two methods. rvn_allocate_file_space() and rvn_truncate_file() which isn’t shown here.

* Actually, you might have fallocate(). Some file systems do not support it, and you need to use another workaround.

One of my code review comments has been that this need to be fixed, that we should have a _resize_file() method for Linux that would simple call the appropriate method based on the file size. But the question is, why?

These are two separate implementations for two separate operating systems. We are already creating the higher level abstraction level with operations that hide many system details. Why do I want to have a passthrough method just because the Windows code has this method?

The answer here, as in the case above with the parameter name, is the same. Consistency.

This it most obvious in the naming convention, but it is the same reasoning I had when I wanted to have the same method structure for both Linux and Windows.

Consistency is key for being able to slog through a lot of code. It is how I (and the rest of the team) can go through thousands of lines of code per week and understand what is going on. Because when we look at a piece of code, it follow certain conventions and structure. Reading the code is easy because we can ignore a lot of cruft around it and focus on what is going on.

In the case of the Windows / Linux methods, I first read one method and then the next, making sure that we are doing the same thing on all platforms. The different behavior (resize vs. allocate) was very obvious to me, which meant that I had to stop, go and look at each method’s implementation to figure out whatever there is any meaningful difference between them. That was a chore, and it would only become worse over time as we add additional functionality, so anything that isn’t different because it has to be different should match.

In general, I like code reviews where I can scan through the changes and not see the code, but it’s purpose. That happens when there isn’t anything there that I really have to think about and the intent is clear.

When writing in C#, we have decades (literally) of experience and organizational inertia that push us in the right direction. When push C code into the repository, I started to actually pay attention to those details explicitly, because I suddenly need to.

This is apparent in code reviews, but it isn’t just the case of me trying to make my own tasks easier. Code is read a lot more often than it is written, and focusing on making the code itself boring will pay off, because what the code is doing is what should be interesting in the long run.