Boldly & confidently fail, it is better than the alternative
Recently I had the chance to sit with a couple of the devs in the RavenDB Core Team to discuss “keep & discard” habits*.
The major problem we now have with RavenDB is that it is big, and there are a lot of things going on in there that you need to understand. I ran the numbers, and it turns out that the current RavenDB contains:
- 835,000 Lines of C#
- 67,500 Lines of TypeScript
- 87,500 Lines of HTML
That is divided into many areas of functionality, but it is still a big chunk of stuff to go through. And that is ignoring things that require us to understand additional components (like Esent, Lucene, etc). What is more, a lot of expertise goes into understanding the full picture. For example: we limit this value here because too much of it would result in high memory consumption under this set of circumstances.
The problem is that it takes time, and sometimes a lot of it, to get a good understanding of how things all come together. To handle that, we typically assign new devs issues from all around the code base. The idea isn’t so much to give them a chance to become an expert in a particular field, but to make sure that they get a general idea of how the code is structured and how the project comes together.
Over time, people tend to gravitate toward a particular area (M** is usually the one handling the SQL Replication stuff, for example), but that isn’t fixed (T fixed the most recent issue there), and the areas of responsibility shift (M is doing a big task, we don’t want to disturb him, let H do that).
Anyway, back to the discussion that we had. What I realized is that we have a problem. Most of our work is either new features or fixing issues. That means that nearly all the time, we don’t really have any fixed template to give developers that says “here is how you do this”. A recent example was an issue where invoking smuggler with a particular set of filters would result in very high cost. The task was to figure out why, and then fix it. But the next task for this developer is to implement sharded bulk insert.
I’m mentioning this to explain a part of the problem. We don’t see a lot of “exactly the same as before”, and a new dev on the team leans on the other members quite heavily initially. That is expected, of course, and encouraged. But we identified a key problem in the process. Because the other team members also don’t have a ready-made answer, they need to dig into the problem before they can offer assistance, which sometimes (all too often, to be honest) leads to a “can you slide the keyboard my way?” and taking over the hunt. The result is that the new dev does learn, but a key part of the process is missing: finding out for themselves what is going on.
We are going to ask both sides of this interaction to keep an eye out for that, and to stop it as soon as they realize that this is what is going on.
The other issue that was raised was fear. RavenDB is a big system, and it can be quite complex. It is quite a reasonable apprehension: what if I break something by mistake?
Here it comes back to the price of failure. Trying something out means that at worst you wasted a work day, nothing else. We are pretty confident in our QA process and system, so we can allow people to experiment. Analysis paralysis is a much bigger problem. And I wasn’t being quite right just now: trying the wrong thing isn’t wasting a day, because you learned what doesn’t work, and hopefully also why.
“I have not failed. I've just found 10,000 ways that won't work.”
― Thomas A. Edison
* Keep & discard is a literal translation of a term that is very common in the IDF. After most activities, there is an investigation performed, and one of the first questions asked is what we want to keep (good things that happened that we need to preserve for the next time we do this) and what we need to discard (bad things that we need to watch out for).
** The actual people are not relevant for this post, so I’m using letters only.