Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:


+972 52-548-6969

, @ Q c

Posts: 6,125 | Comments: 45,492

filter by tags archive

Things we learned from production, part I–shutting down is hard to do

time to read 3 min | 507 words

This series of posts is going to talk about the things that we have learned ourselves and via our customers about running RavenDB in production. Those customers include people running on a single database on a Celeron 600 Mhz with 512 MB all the way to monsters like what RavenHQ is doing.

This particular story is about the effect of shutdown on RavenDB in production environments. Before we can do that, I have to explain the sequence of operations when RavenDB shuts down:

  • Stop accepting new connections
  • Abort all existing connections
  • For each loaded database:
    • Shut down indexing
    • For each index:
      • Wait for current indexing batch to complete
      • Flush the index
      • Close the index
    • Close database
  • Repeat the same sequence for the system database
  • Done

I am skipping a lot of details, but that is the gist of it.

In this case, however, you might have noticed something interesting. What happen if we have a large number of active databases, with a large number of actual indexes?

In that case, we have to wait for the current indexing batch to complete, then shut down each of the indexes, then move to the next db, and do the same.

In some cases, that can take a while. In particular, long enough while that we would get killed. Either by automated systems that decided we passed our threshold (in particular, iisreset gives you mere 20 seconds to restart, which tend to be not enough) or by an out of patience admin.

That sucks, because if you get killed, you don’t have the time to do a proper shutdown. You crashed & burned and died and now you have to deal with all the details of proper resurrection. Now, RavenDB prides itself on actually being a regular in this matters. You can yank the power cord out and once everything is back up, RavenDB will recover gracefully and with no data loss.

But, recovering from such scenarios can take precious time. Especially if, as is frequently the case in such scenarios, we have a lot of databases and indexes to recover.

Because of that, we actually had to spend quite a bit of time on optimizing the shut down sequence. It sounds funny, isn’t it? Very few people actually care about the time it takes them to shut down. But as it turned out, we have a fairly limited budget for that.  In particular, we parallelized the process of shutting down all of the databases together, and all of their indexes together as well.

That means more IO contention than before, but at least we could usually meet the deadline.  Speaking of which, we also added additional logging and configuration that told common hosts (such as IIS) that we really would like some more time before we would be hang out to dry.

On my next post, I’ll discuss the other side, how hard it is to actually wake up in the morning Smile.



When ASP.NET notifies you that it wants to unload the appdomain you can stay alive indefinitely by just not returning from your callback. Only after all callbacks have been called the countdown begins.

Certainly not what the architects intended but it works.

Ayende Rahien

Tobi, That is ASP.Net, that isn't IIS. IIS give you severe time limits.


In an azure 'worker role' environment would you be able to request more time if the server is self hosted (Raven.Database.Server.HttpServer)?

As an aside - would you recommend using the HttpServer instead of hosting via IIS (I wasn't sure how to use IIS hosting)

Ayende Rahien

Andrew, I am not familiar enough with the way worker roles work. We run RavenDB in production inside IIS. Self hosted, service mode, also have some limit on shutdown, in the sense that the service manager will give an error if you take too long, but won't kill you. IIS will kill you if you take too long


Ayende, out of curiosity, could you please expand a bit on the RavenHQ Monster ;)

Ayende Rahien

Louis, RavenHQ is hosting a LOT of databases, and it is one of the heaviest loaded RavenDB setups. That is what I meant, it has brought to our attention a lot of interesting issues like that.


Is there a way for a custom bundle to participate in the shutdown process?

I'm developing the update cascade bundle. This bundle might have several task in progress that probably need to be canceled.

Ayende Rahien

Jesus, We will call the IStartupTask dispose method, if it implements IDisposable, which is the hook you get when we shut down.


RavenHD is insanely expensive.. I just can't understand anyone paying that much. If I need that capacity, I would go MySQL, jezuz

Comment preview

Comments have been closed on this topic.


  1. RavenDB 3.5 whirl wind tour: I'll have the 3+1 goodies to go, please - 3 days from now
  2. The design of RavenDB 4.0: Voron has a one track mind - 4 days from now
  3. RavenDB 3.5 whirl wind tour: Digging deep into the internals - 5 days from now
  4. The design of RavenDB 4.0: Separation of indexes and documents - 6 days from now
  5. RavenDB 3.5 whirl wind tour: Deeper insights to indexing - 7 days from now

And 10 more posts are pending...

There are posts all the way to May 30, 2016


  1. The design of RavenDB 4.0 (14):
    05 May 2016 - Physically segregating collections
  2. RavenDB 3.5 whirl wind tour (14):
    04 May 2016 - I’ll find who is taking my I/O bandwidth and they SHALL pay
  3. Tasks for the new comer (2):
    15 Apr 2016 - Quartz.NET with RavenDB
  4. Code through the looking glass (5):
    18 Mar 2016 - And a linear search to rule them
  5. Find the bug (8):
    29 Feb 2016 - When you can't rely on your own identity
View all series


Main feed Feed Stats
Comments feed   Comments Feed Stats