One of the most important things that I learned about working with software is that you have to design, from the get go, the debug ability of the system. This is important, because by default, here is how the your system reports its state:
To make things worse, in many cases, we are talking about things that are really hard to figure out. With Rhino Service Bus, I had started, from day one, to add a lot of debug logs, we had the idea of the logging endpoint, error messages went to the error queue along with the error that they generated, etc.
With NHibernate, it took me three days after starting to build NH Prof to figure out that NH Prof was a great tool to use when working on NHibernate itself. With NH Prof, the actual file format that we used is an event stream, which is logged locally. Which means that the process of opening a saved file and listening to a running application is the same. That means that you can send me you File > Save output, and I can simulate locally the exact conditions that led to whatever problems you had.
As software becomes more and more complex, threading is involved, complex scenarios are introduced, multiple actors involved and more and more data is streaming in. The old patterns of debugging through a problem become less and less relevant. To put it simply, it is less and less possible to easily reconstruct a problem in most scenarios, at least, it is very hard to do from the information given.
Last week I started to add support for profiling RavenDB. The intent was mostly to allow us to build better integration with web applications, but during the process of investigating a very nasty bug, I figured out that I could use this profiling information to figure out what is going on. I did, and seeing the information, it was immediately obvious what was wrong.
I decided that I might was well go the whole hog there and introduced this:
No, it isn’t RavenDB Profiler (not yet), but it is a debug visualizer that you can use to look at the raw requests. It is a bit more than just a Fiddler clone, too, since it has awareness of things like RavenDB sessions, caching, aggressive caching, etc.
In the short time that this API is available, it has already help resolve at least two separate (and very hard to figure out) bugs.
I learned two things from the experience:
- Having the debug hooks is critically important, because it means that you can access that information at need.
- Just having the debug hooks is not enough, you have to provide a good way for them to be used.
In this case, it is a debugger visualizer, but we will have additional ways to expose this information to the users.