Ayende @ Rahien

Refunds available at head office

Think about production, silly!

We just finished doing a big optimization in RavenDB, and one of the things that we needed to do was to store additional (internal) information so we could act upon it later on. If you must know, we now keep track of stats during indexing and can select the appropriate indexing approach based on the amount of data that we have available.

The details about this aren’t that important. What is important is that this is a piece of data that RavenDB uses to make decisions. That means that just about the worst thing we could possibly do is leave that data locked inside the server, where no one can see it.

Think about what will happen in production, when you have an annoyed (and tired) ops team trying to figure out what is going on. Having a black box is the worst thing that you could possibly do, because you give the admin absolutely no information to work with. And remember, you are going to be the one on call when the support phone rings.

One of the final touches was to add a debug endpoint that exposes those details to the user, so we can inspect them at runtime, in production. We have a lot of those. Some are intended for monitoring, such as the /admin/stats or the /databases/db-name/stats endpoints. Some are meant for troubleshooting, such as the /databases/db-name/logs?type=error endpoint. And some are purely for debugging, such as /databases/db-name/indexes/index/name?debug=keys, which gives you the stats about all the keys in a map/reduce index.
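To make the shape of those endpoints concrete, here is a minimal sketch in Python that builds the four URLs mentioned above and fetches one of them. The base URL, the database name, the index name, and the helper names `debug_endpoints` and `fetch_json` are all hypothetical, and the sketch assumes the server answers with JSON. Treat it as an illustration, not as an official client.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def debug_endpoints(base_url, db_name, index_name):
    """Build the URLs for the endpoints named in the post.

    base_url, db_name and index_name are placeholders supplied by the
    caller; only the path shapes come from the post itself.
    """
    base = base_url.rstrip("/")
    db = f"{base}/databases/{quote(db_name, safe='')}"
    return {
        # Monitoring
        "server_stats": f"{base}/admin/stats",
        "database_stats": f"{db}/stats",
        # Troubleshooting
        "error_logs": f"{db}/logs?type=error",
        # Debugging: index names may contain '/', so keep it unescaped
        "index_keys": f"{db}/indexes/{quote(index_name, safe='/')}?debug=keys",
    }


def fetch_json(url, timeout=5.0):
    """GET a URL and parse the body as JSON (assumed response format)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical server address, database and index name.
    urls = debug_endpoints("http://localhost:8080", "db-name", "index/name")
    for name, url in urls.items():
        print(f"{name}: {url}")
    # Uncomment against a running server:
    # print(fetch_json(urls["database_stats"]))
```

Keeping the URL construction separate from the HTTP call means an ops script can log or print the exact endpoint it is about to hit before touching the server.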

Trust me, you are going to need those, at some point.

Posted By: Ayende Rahien

Comments

Matthew Bonig, 12/20/2012 03:39 PM

Is there a cheatsheet or reference somewhere of these various debugging endpoints? I can see this being incredibly useful to have a list of options I can start looking at. Didn't see anything in the docs.

Gene Hughson, 12/20/2012 03:43 PM

"Trust me, you are going to need those, at some point."

Indeed. Sometimes things get YAGNI'ed just because the affected stakeholder isn't represented. Not designing for operations is the new not designing for exceptions.

Ayende Rahien, 12/20/2012 09:29 PM

Matthew, they are scheduled to be documented. For now, you can see them here: http://issues.hibernatingrhinos.com/issue/RDoc-50

Sergey Shumov, 12/21/2012 04:09 AM

Good post, one can certainly learn a couple of good practices from RavenDB. Speaking of RavenDB cluster, how would you implement logs (and stats) aggregation? Simple HTTP GET -> Concat -> Display or something exotic like UDP broadcasting?

Ayende Rahien, 12/21/2012 06:24 AM

Sergey, yes, you can do that by getting the stats from all servers.
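For illustration, the "HTTP GET -> Concat -> Display" approach from this exchange could look roughly like the Python sketch below. The server list and the helper names `fetch_stats` and `aggregate_stats` are hypothetical, and the JSON response format is an assumption. Only the /admin/stats path comes from the post.

```python
import json
from urllib.request import urlopen


def fetch_stats(server_url, timeout=5.0):
    """GET /admin/stats from one server; assumes a JSON response body."""
    url = f"{server_url.rstrip('/')}/admin/stats"
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def aggregate_stats(servers, fetch=fetch_stats):
    """Collect stats from every server into one dict keyed by server URL.

    A server that cannot be reached or returns invalid JSON is recorded
    with its error instead of aborting the whole run, so the admin still
    sees the nodes that did answer.
    """
    combined = {}
    for server in servers:
        try:
            combined[server] = {"ok": True, "stats": fetch(server)}
        except (OSError, ValueError) as exc:  # network error or bad JSON
            combined[server] = {"ok": False, "error": str(exc)}
    return combined


if __name__ == "__main__":
    # Hypothetical cluster nodes.
    nodes = ["http://raven-1:8080", "http://raven-2:8080"]
    print(json.dumps(aggregate_stats(nodes), indent=2))
```

The `fetch` parameter exists so the aggregation logic can be exercised with a stub, without a live cluster.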
