Ayende @ Rahien


Things we learned from production, part IV: is your paperwork in order?

One of the major points that we worked on in the 1.2 release was making the ops team's work easier. That included additional logging, as we have previously discussed, making RavenDB play nicer with other parts of the system, adding performance counters, etc.

But those are the obvious things, and this series isn’t about the obvious things. One of the problems that we ran into is that we already had a moderately good porthole into how RavenDB works.

The problem was that this porthole gave you access to the state of a single database, which was great…

Except that in order to get a database's statistics, you had to actually load that database. Imagine a system under load, where the admin needs to check what is causing the load. The act of checking a database's statistics will actually force that database to load, generating even more load. This is especially dangerous when we are talking about automated health monitoring tools: the fact that we monitor the health of our software shouldn’t cause it to do additional work.

In RavenDB 1.2 we have taken steps to make sure that we can report on all the active databases without having to guess which ones are active and which aren’t. We have also taken additional steps to make sure that we give the admin even more information about what is going on.
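To make the idea concrete, here is a minimal sketch in Python of a stats hook that only reports on databases that are already loaded. This is not RavenDB's actual implementation; `Database`, `DatabaseRegistry`, `get`, and `loaded_stats` are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Database:
    """Stand-in for a database that is expensive to load (hypothetical)."""
    name: str
    document_count: int = 0


class DatabaseRegistry:
    """Loads databases lazily and reports only on the ones already loaded."""

    def __init__(self, loader: Callable[[str], Database]) -> None:
        self._loader = loader
        self._loaded: Dict[str, Database] = {}
        self.load_calls = 0  # how many times the expensive load actually ran

    def get(self, name: str) -> Database:
        """Request path: loads the database on first access."""
        db = self._loaded.get(name)
        if db is None:
            self.load_calls += 1
            db = self._loader(name)
            self._loaded[name] = db
        return db

    def loaded_stats(self) -> Dict[str, Dict[str, int]]:
        """Admin path: reads what is already in memory and never loads anything."""
        return {
            name: {"document_count": db.document_count}
            for name, db in self._loaded.items()
        }


# Hypothetical loader: pretend the document count is the length of the name.
registry = DatabaseRegistry(lambda name: Database(name, document_count=len(name)))
registry.get("orders")            # a real request loads this database
stats = registry.loaded_stats()   # monitoring sees "orders" and loads nothing
```

The point of the split is that a health monitor can poll `loaded_stats` as often as it likes without ever going through `get`, so checking on the server never wakes up an idle database.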

You can see this pattern pretty much everywhere, in indexes, in operations, in database and server stats. There are a lot more places where we explicitly built the hooks to make it possible for the admin to figure out what is going on.

The lesson from that is that you have to provide a lot of information for the administrators, so they can figure out what is going on (and that administrator may very well be you, at 2 AM, trying to diagnose a problem). At the same time, you have to be sure to provide those hooks in a way that has minimal impact on the system. Having admin hooks in place that put undue burden on the application is seriously not a cool thing to do.
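One common way to keep such hooks cheap, sketched here in Python under my own assumptions rather than taken from RavenDB, is to maintain counters incrementally on the hot path so that the admin read is just a copy of a small dictionary. `OperationCounters`, `record`, and `snapshot` are hypothetical names.

```python
import threading
from collections import Counter
from typing import Dict


class OperationCounters:
    """Counters updated as work happens, so reading them does no extra work."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._counts: Counter = Counter()

    def record(self, operation: str) -> None:
        """Hot path: a single increment under a short-lived lock."""
        with self._lock:
            self._counts[operation] += 1

    def snapshot(self) -> Dict[str, int]:
        """Admin path: copy the current totals; nothing is scanned or recomputed."""
        with self._lock:
            return dict(self._counts)


counters = OperationCounters()
for op in ("put", "put", "query"):
    counters.record(op)
totals = counters.snapshot()
```

Because `snapshot` returns a copy, a monitoring tool can hold on to it or modify it without affecting the live counters.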

Comments

Sergey Shumov
09/28/2012 10:29 AM

Ayende, when will you update the RavenDB repository on GitHub?

Ayende Rahien
09/28/2012 10:31 AM

Sergey, it was last updated about 2 hours ago. Check the 1.2 branch.

tobi
09/28/2012 11:36 AM

The SQL Server guys take the approach of exposing lots of implementation details through DMVs (wait types, latches, ...) so that users can take a peek and diagnose stuff. I think that is the right approach.

Daan Le Duc
09/28/2012 12:58 PM

Ayende, there are only 3 branches, and none of them is called 1.2:

https://github.com/ravendb/ravendb/branches

Am I looking in the wrong place? Would love to see how you implemented your latest blog articles.

Thanks!

Daan Le Duc
09/28/2012 01:04 PM

Found it! It seems to be on your private GitHub account: https://github.com/ayende/ravendb/tree/1.2

Jesper
09/28/2012 01:15 PM

@Daan Try https://github.com/ayende/ravendb/tree/1.2

Ayende Rahien
09/28/2012 03:46 PM

Daan, yes, you are looking at the stable branch. You need to look here: https://github.com/ayende/ravendb/tree/1.2

Ayende Rahien
09/28/2012 03:47 PM

Jesus, they are already per database.

dotnetchris
09/28/2012 05:34 PM

@Ayende I think your response to @Jesus was meant to be on the previous post.

Karep
09/30/2012 07:18 PM

Would love to hear about some details. After reading "Release It!" I'm thinking about making my app better for ops, but I'm afraid my application will be full of logging statements, making it hard to find the business logic in the code.
