Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969

, @ Q c

Posts: 6,124 | Comments: 45,475

filter by tags archive

Things we learned from production, part III–singleton thinking makes long queues

time to read 3 min | 471 words

One of the more interesting things that we had to learn in production was that we aren’t an only child. It is a bit more complex than that, and I am not explaining well, let me start at the beginning.

Usually, when we work on RavenDB, we work within the scope of a single database, all of our efforts are usually scoped to that. That means that when we worked on the multi database feature for RavenDB, we actually focused on the process of loading a single database up in the air. We considered how multiple databases will interact, and we made sure that they are isolated from one another, but that was about it.

In particular, as mentioned in the previous post, starting up and shutting down were done sequentially, on a per database basis. In order to prevent issues, we had a lock on the initialize database part of the process, so two requests to the same database will not result in the same database being loaded twice.

I mentioned that we were thinking on a single database mindset, right?

Can you guess what happened?

  • Request for DB #1 – lock acquired, starting up
    • Request for DB #1 – waiting for lock to release
    • Request for DB #1 – waiting for lock to release
    • Request for DB #1 – waiting for lock to release
  • DB initialized, lock released
  • All requests are now freed and can be processed.

What happen when we have multiple databases, however?

  • Request for DB #1 – lock acquired, starting up
    • Request for DB #1 – waiting for lock to release
    • Request for DB #2 – waiting for lock to release
    • Request for DB #3 – waiting for lock to release
  • DB initialized, lock released
  • Request for DB #2 – lock released, lock acquired, starting up
    • Request for DB #3 – waiting for lock to release

You guessed it, we actually had a global lock for starting (or disposing, for that matter) databases. That meant that a single db that took time to start would impact other databases.

More importantly, it would means that other requests, which were waiting for that database to load and then had to load their own database, had far less time to actually do the processing they needed. Which meant that they were far more likely to run into the request time limit and be aborted by IIS. Which left them in an inconsistent state. Which was a nightmare to figure out.

We resolved this issue by making sure that the lock is now handled only on the same database, and that we won’t lock forever, if after a while we still don’t have the db, we will error early and give you a 503 Service Unavailable error until the db is ready to rock.


Comments

Gene Hughson

The Highlander rule...if there can be only one, expect lots of mayhem ;-)

Andrew

Would be great ayende if you started tagging your posts with the build number that the changes came into effect

jesús

I'm afraid I have fallen in the same singleton thinking while developing the update cascade bundle https://github.com/jesuslpm/UpdateCascadeBundle. Now I need to support multiple databases. I guess there will be an instance of IStartupTask, AbstractPutTrigger and so on per database ?

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

  1. The design of RavenDB 4.0: Making Lucene reliable - 6 hours from now
  2. RavenDB 3.5 whirl wind tour: I’ll find who is taking my I/O bandwidth and they SHALL pay - about one day from now
  3. The design of RavenDB 4.0: Physically segregating collections - 2 days from now
  4. RavenDB 3.5 Whirlwind tour: I need to be free to explore my data - 3 days from now
  5. RavenDB 3.5 whirl wind tour: I'll have the 3+1 goodies to go, please - 6 days from now

And 13 more posts are pending...

There are posts all the way to May 30, 2016

RECENT SERIES

  1. RavenDB 3.5 whirl wind tour (14):
    02 May 2016 - You want all the data, you can’t handle all the data
  2. The design of RavenDB 4.0 (13):
    28 Apr 2016 - The implications of the blittable format
  3. Tasks for the new comer (2):
    15 Apr 2016 - Quartz.NET with RavenDB
  4. Code through the looking glass (5):
    18 Mar 2016 - And a linear search to rule them
  5. Find the bug (8):
    29 Feb 2016 - When you can't rely on your own identity
View all series

RECENT COMMENTS

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats