Reviewing pull requests with large number of commits

time to read 2 min | 343 words

I have to go over a fairly compress pull request, adding a pretty big feature. The feature, if you care, is compressing data in the leaf pages of a B+Tree for use in Map/Reduce entries storage. The current and non final results are reducing storage costs from about 650MB to 32MB, so we are pretty happy with it, but this isn’t the topic of this post.

At this stage, the feature is not complete, but it is ready for initial review, so we created a temporary PR to review it, here is a partial view:


There are over 30 commits over a period of three weeks there, and they represent a lot of experimentation as we go along. This is fine & proper as far as we are concerned, since we this represent quite a bit of work, and we certainly want to be able to track it. But since this isn’t completed, the commit history is long, which means that reviewing it one commit at a time is not trivial. In particular, there are a lot of back and forth changes in the same place, and I don’t want to review code that has been changed.

Luckily, git make it easy to change that view.

I pull the changes to my local branch, then reset the log to the last commit before the merge, then I just squashed all those commit into a single one. You can check it out here.

The final result is:


Which is much nicer way to go about reviewing this, since I can look at all the changes in one shot, and more to the point, I can make comments on the changes using our normal workflow.