Boldly & confidently fail, it is better than the alternative
Recently I had the chance to sit with a couple of the devs in the RavenDB Core Team to discuss “keep & discard” habits*.
The major problem we now have with RavenDB is that it is big. And there are a lot of things going on there that you need to understand. I ran the numbers, and it turns out that the current RavenDB contains:
- 835,000 Lines of C#
- 67,500 Lines of TypeScript
- 87,500 Lines of HTML
That is divided into many areas of functionality, but that is still a big chunk of stuff to go through. And that is ignoring things that require we understand additional components (like Esent, Lucene, etc). What is more, there is a lot of expertise in understanding what is going on in terms of the full picture. For example: "We limit this value here because too much of it would result in high memory consumption under this set of circumstances."
The problem is that it takes time, and sometimes a lot of it, to get a good understanding of how things all come together. In order to handle that, we typically assign new devs issues from all around the code base. The idea isn’t so much to give them a chance to become an expert in a particular field, but to make sure that they get the general idea of how the code is structured and how the project comes together.
Over time, people tend to gravitate toward a particular area (M** is usually the one handling the SQL Replication stuff, for example), but that isn’t fixed (T fixed the most recent issue there), and the areas of responsibility shift (M is doing a big task, we don’t want to disturb him, let H do that).
Anyway, back to the discussion that we had. What I realized is that we have a problem. Most of our work is either new features or fixing issues. That means that nearly all the time, we don’t really have a fixed “here is how you do this” template to give developers. A recent example was an issue where invoking smuggler with a particular set of filters would result in very high cost. The task was to figure out why, and then fix it. But the next task for this developer is to implement sharded bulk insert.
I’m mentioning this to explain a part of the problem. We don’t see a lot of “exactly the same as before”, and a new dev on the team leans on the other members quite heavily initially. That is expected, of course, and encouraged. But we identified a key problem in the process. Because the other team members also don’t have a ready-made answer, they need to dig into the problem before they can offer assistance, which sometimes (all too often, to be honest) leads to a “can you slide the keyboard my way?” and taking over the hunt. The result is that the new dev does learn, but a key part of the process is missing: finding out what is going on.
We are going to ask both sides of this interaction to keep track of that, and stop it as soon as they realize that this is what is going on.
The other issue that was raised was the issue of fear. RavenDB is a big system, and it can be quite complex. It is quite a reasonable apprehension: what if I break something by mistake?
Here it comes back to the price of failure. Trying something out means that at worst you wasted a work day, nothing else. We are pretty confident in our QA process and system, so we can allow people to experiment. Analysis paralysis is a much bigger problem. And I wasn’t quite right just now: trying the wrong thing isn’t wasting a day, because you learned what doesn’t work, and hopefully also why.
“I have not failed. I've just found 10,000 ways that won't work.”
― Thomas A. Edison
* Keep & discard is a literal translation of a term that is very common in the IDF. After most activities, there is an investigation performed, and one of the first questions asked is what we want to keep (good things that happened that we need to preserve for the next time we do this) and what we need to discard (bad things that we need to watch out for).
** The actual people are not relevant for this post, so I’m using letters only.
I hate to say it, but it looks like you need some documentation for the project, in the form of:
Design meeting notes: "Forever keep track of why something was decided to be done in a certain way and what was the reasoning and trade offs behind it"
Feature specific technical documentation: "Technical designs and notes for a specific feature: keep notes, goals, decisions"
This is not bureaucracy (and should not become bureaucracy), it should be a way to capture context, intent, goals, reasoning and trade offs (if any). All the above are really, really hard to infer from code alone, and if they become folklore it's even worse. Simply the presence of too much folklore in a big project creates a feeling of uncertainty and fear: you might be able to pass all the QA tests but you could still implement something that's against the original intent and design, and no matter how much reassurance you get, it becomes a very unpleasant situation for certain individuals.
Yes, you get the reassurance that it's ok to fail, but most human beings simply do not like it, and only a limited few are actually ok with it (researchers and entrepreneurial risk-taking types, who are still a limited subset of the population).
Catalin, I think that you are right. We do have a lot of that, some of it in the form of blog posts or internal design documents. But they don't cover everything, just the things that we considered super hard. That said, we are already working on internal documentation, and we'll have a lot more focus there in the future.
If you are looking to document architecture decisions, I've found this template to be helpful:
I've found that deliberately clumsy code reviews are a good approach here.
If I come into a project that has 100Ks of lines of code, then I won't be a high performer on day one or even day 90 (depending on the quality of the code). I've missed 5+ years of design meetings and requirements gathering. Every part of the application that I touch is going to carry a ton of baggage that I couldn't know beforehand. Since I want to become self-sufficient, what I do is write very clean, concise code that goes into review... and gets rejected... and leads to rework. Usually, when I do this, I'll have to rewrite quite a lot of code. A lead developer will often be very confused and suspicious, and he'll ask me why I do it that way. My answer: "Code review is not a syntax analyzer. I use it to learn multiple things: 1) does this particular implementation meet the requirements (especially the ones I can't even know exist) 2) does this piece of code meet the team's expectation of quality and thoroughness" The answers to those questions I put into my next review. Do this enough times and it gets easier and easier to make progress.
Are the devs pairing during the "finding out what is going on" phase? I wonder if that would help.
Fschwiet, Very often, yes, but not 100% of the time.
Design documents have one major flaw - they get outdated rather quickly. Having all the documentation in one single place under version control (in sync with the code) would be ideal.
Voignt, I actually disagree with that. In the same way you have comments rot, just having the design docs in the same repo isn't going to do anything special
@voight, if the design documents get out of date quickly, they contain the wrong kind of information (mostly tentative decisions for the actual implementation).
It's human nature to try to put design decisions in the design documents, while actually that's the least important part of the document.
The really important parts are: goals, intent, pitfalls, reasoning for or against something or trade offs that are acceptable or are not acceptable.
You should not have in a design document: classes, interfaces, modules, these are implementation details (which change) and should be free to change.
See some examples:
@Ayende, by 'documentation' I had in mind a bare-bones spec (which should be kept current) saying 'this is why this module/sub-project should work like this', which could of course link to design documents, etc.
@Catalin, I agree that for stable projects with a clear vision that is true. For some interpretations of 'agile', requirements can change daily after the customer actually sees the result of first iterations.
@voight, Specs aren't that interesting. A lot more interesting are the details about why certain choices were made (or not made). And those change as we have more data. For example, a particular usage pattern might cause us to rethink a certain optimization scenario.
But a spec, in my eyes, is the kind of contract we give to the user.
For example, facets in RavenDB changed quite a lot internally multiple times. Several times we had an entire rewrite of the inner workings. The external behavior didn't change.
very nice new blog look!!
Very Nice! Big like for the new design :)
C# to F# would cut the line count dramatically. And also you could manage the dependencies: http://evelinag.com/blog/2014/06-09-comparing-dependency-networks/#.VVGftvk2bqg
...and here is another reference: http://simontylercousins.net/does-the-language-you-use-make-a-difference-revisited/
Thorium, F# would also lead to a lot more complexity, reduced participation in the community, harder to find developers and increased costs all around.
Well... Do you have any reference to back up your claims? :-)
Thorium, Sure, here is the data. Note that in terms of salary range and the number of positions available, there is an order of magnitude difference.
C# Developer: http://www.indeed.com/jobs?q=C%23+Developer
Salary Estimate $70,000+ (20128) $90,000+ (15353) $110,000+ (10286) $130,000+ (6727) $150,000+ (4227)
F# developer: http://www.indeed.com/jobs?q=F%23+Developer
Salary Estimate $70,000+ (83) $100,000+ (64) $120,000+ (47) $150,000+ (32) $170,000+ (16)
Ok, thanks. You have that option to hire cheaper developers. I think that the cheapest developers usually will actually increase your costs. But if that is your way, then I wish you good luck, and I accept that as an answer. How about "a lot more complexity"?
Thorium, See full reply here: http://ayende.com/blog/170849/why-ravendb-isnt-written-in-f-or-the-cost-of-the-esoteric-choice