Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969


Things we learned from production, part IV–is your paperwork in order?


One of the major points that we worked on in the 1.2 release was making the ops team's work easier. That included additional logging, as we have previously discussed, making RavenDB play nicer with other parts of the system, adding performance counters, etc.

But those are the obvious things, and this series isn’t about the obvious things. One of the problems that we ran into was that we already had a moderately good porthole into how RavenDB works.

The problem was that this porthole gave you access to the state of a single database, which was great…

Except that in order to get a database’s statistics, you had to actually load that database. Imagine a system under load, where the admin needs to check what is causing the load. The act of checking a database’s statistics will actually force that database to load, generating even more load. This is especially dangerous when we are talking about automated health monitoring tools. The fact that we monitor the health of our software shouldn’t cause it to do additional work.

In RavenDB 1.2 we have taken steps to make sure that we can report on all the active databases without having to guess which ones are active and which aren’t. We have also taken additional steps to make sure that we give the admin even more information about what is going on.
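
To make the difference concrete, here is a minimal sketch of the idea in Python. It is not RavenDB's implementation; the names DatabaseRegistry, stats_for and loaded_stats are hypothetical, made up purely for illustration. It contrasts a stats call that forces a database to load with one that only reports on databases already resident in memory.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Database:
    """Hypothetical stand-in for a loaded database instance."""
    name: str
    document_count: int = 0


@dataclass
class DatabaseRegistry:
    """Lazily loads databases and tracks which ones are resident in memory."""
    loader: Callable[[str], Database]  # assumed to be the expensive part
    _loaded: Dict[str, Database] = field(default_factory=dict)
    load_count: int = 0  # number of (expensive) loads triggered so far

    def get(self, name: str) -> Database:
        # Normal request path: load the database on first access.
        if name not in self._loaded:
            self._loaded[name] = self.loader(name)
            self.load_count += 1
        return self._loaded[name]

    def stats_for(self, name: str) -> dict:
        # The problematic approach: asking for stats forces a load.
        db = self.get(name)
        return {"name": db.name, "documents": db.document_count}

    def loaded_stats(self) -> List[dict]:
        # The monitoring-friendly approach: report only on databases
        # that are already loaded, and never trigger a load.
        return [
            {"name": db.name, "documents": db.document_count}
            for db in self._loaded.values()
        ]


registry = DatabaseRegistry(loader=lambda name: Database(name, document_count=10))
registry.get("orders")            # a real request loads "orders"
print(registry.loaded_stats())    # monitoring sees "orders", loads nothing
print(registry.load_count)
```

A health monitor polling loaded_stats() in this sketch leaves load_count unchanged, while polling stats_for() for every known database name would load each one.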

You can see this pattern pretty much everywhere, in indexes, in operations, in database and server stats. There are a lot more places where we explicitly built the hooks to make it possible for the admin to figure out what is going on.

The lesson from that is that you have to provide a lot of information for the administrators, so they can figure out what is going on (and that administrator may very well be you, at 2 AM, trying to diagnose a problem). At the same time, you have to be sure to provide those hooks in a way that has minimal impact on the system. Having admin hooks in place that put an undue burden on the application is seriously not a cool thing to do.


Comments

Sergey Shumov

Ayende, when will you update RavenDB repository at github?

Ayende Rahien

Sergey, It was last updated about 2 hours ago. Check the 1.2 branch.

tobi

The SQL Server guys use the approach to expose even lots of implementation details through DMVs (wait types, latches, ...) so that users can take a peek and diagnose stuff. I think that is the right approach.

Daan Le Duc

Ayende there are only 3 branches none of them is called 1.2

https://github.com/ravendb/ravendb/branches

Am I looking in the wrong place? Would love to see how you implemented your latest blog articles.

Thanks!

Daan Le Duc

Found it! seems to be on your private github account https://github.com/ayende/ravendb/tree/1.2

Jesper

@Daan Try https://github.com/ayende/ravendb/tree/1.2

Ayende Rahien

Daan, Yes, you are looking at the stable branch, you need to look here: https://github.com/ayende/ravendb/tree/1.2

Ayende Rahien

Jesus, They are already per database

dotnetchris

@Ayende i think your response to @Jesus was meant to be on the previous post.

Karep

Would love to hear about some details. After reading "Release It!" I'm thinking about making my app better for ops, but I'm afraid my application will be full of logging statements, making it hard to find the business logic in the code.
