Ayende @ Rahien

My name is Ayende Rahien
Founder of Hibernating Rhinos LTD and RavenDB.

Death by 70,000 facets

Just sit right back and you'll hear a tale, a tale of a fateful bug. That started from a simple request, about a feature that was just a bit too snug.

Okay, leaving aside my attempts at humor. This story is about a customer reporting an issue. “Most of the time we have RavenDB running really fast, but sometimes we have high latency requests”.

After a while, we managed to narrow it down to the following scenario:

  • We have multiple concurrent requests.
  • Those requests contain a Lazy request that has a facet query.
  • The concurrent requests appear to all halt, and then complete together.

In other words, it looked like all those requests were waiting on a lock, and when it was released, all of them were free to return. This made sense: there is a cache lock in the facet code that could behave in this manner. But when we looked at it, it didn't really behave the way we expected it to.
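The actual RavenDB code is C# and more involved, but the failure mode we suspected can be sketched like this (a minimal Python sketch with hypothetical names, not the real implementation): a cache guarded by a single lock, where every miss computes its result while still holding that lock.

```python
import threading


class FacetCache:
    """Sketch of a cache guarded by one lock, where misses compute
    under the lock. When results rarely repeat (e.g. 70,000 distinct
    facets), nearly every request misses, so concurrent requests
    queue up behind one another and then all complete together."""

    def __init__(self, compute):
        self._lock = threading.Lock()
        self._cache = {}
        self._compute = compute

    def get(self, key):
        with self._lock:                 # only one request at a time
            if key not in self._cache:   # almost always a miss here
                self._cache[key] = self._compute(key)
            return self._cache[key]
```

With four concurrent requests for four distinct keys, the total time is roughly four times the cost of one computation, because the lock serializes them.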

Eventually we got to test this out on the client's data, and that is when we were able to pinpoint the issue.

Usually, you have facets like this:

The one on the left is from searching Amazon for HD; the one on the right is from searching Amazon for TV.


In RavenDB, you typically express this sort of query using:

session.Query<Product>().Search("Search", query).ToFacets(new[] { new Facet { Name = "Brand" } });

And we expect the number of facets in a query to be on the order of a few dozen.

However, the client in question did something a bit different. I think that this is because they brought the system over from a relational database. Each product in the system had a list of facets associated with it. It looked something like:

“Facets”: [13124,87324,32812,65743]

Obviously, this means that this product belongs to the “2,000 – 3,500” price range facet (electronics), the “Red” color facet (electronics, handheld), etc.

In total, there were over 70,000 facets in the database, and that is just something we never expected. Because we didn't expect it, we reacted… poorly when we had to deal with it. That pattern of usage meant that we were effectively worse off than having no cache at all: we always had to do the work, and never gained any benefit from it, because there wasn't enough sharing between queries to actually trigger the benefits of the cache. And since we took locks on the cache to make sure we didn't get into a giant mess… well, you can figure out how it went from there.

The fix was to actually devolve the code into something simpler. Instead of trying to be smart and figure out exactly what we needed to compute for this query, we are aggressive and load everything we need up front. All subsequent requests incur no wait time, because the data is already there. The code became much simpler.
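The simpler approach can be sketched in the same style (again a hypothetical Python sketch, not the actual RavenDB code): the first request loads all the facet data once, under a lock taken only for initialization, and every later request reads it without waiting.

```python
import threading


class EagerFacetCache:
    """Sketch of the simpler, aggressive approach: instead of computing
    per-query slices under a lock on every miss, the first request loads
    *all* the facet data once; later requests read it lock-free."""

    def __init__(self, load_everything):
        self._load_everything = load_everything
        self._data = None
        self._init_lock = threading.Lock()

    def get(self, key):
        data = self._data
        if data is None:
            with self._init_lock:
                if self._data is None:  # double-checked initialization
                    self._data = self._load_everything()
                data = self._data
        return data[key]
```

The trade-off is doing more work on the first request in exchange for never blocking the ones that follow, which matches the observed result: no more requests piling up behind a lock.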

Oh, and we deployed it to production and saw a 75% decrease in the average request time, and no more sudden stalls when requests piled up.



A 400% decrease in response time?

Paul Turner

I must be really stupid, even for a Monday: I can't understand what you mean by a "400% decrease" in request time. I read it as if a 50% decrease would mean "half the time". Need more Red Bull.

I assume the method which was doing poorly in this scenario had been written with some kind of optimization around the "in the order of a few dozens" scenario which didn't work for this 70,000 facets scenario. Did your simplification of the behaviour negatively or positively affect the "a few dozens" scenario? I'm curious whether it was straight-up enhancement or a compromise.

hilton smith

400% decrease just sounds weird. if you decrease by more than 100% you end up in negative.

Tom Derisk

it's a predictive system! LOL

Ayende Rahien

Guys, I fixed the post. It was meant to be the reverse of a 400% increase.

hilton smith

I figured that's what you meant, but for some reason it just became a focal point of the article...

When a customer has a large number of facets like they do here, is it not worth restructuring the data in some way?

Ayende Rahien

Hilton, if we can, sure. If we can't, we see if there is anything that we can do to help.

Ryan Heath

Was this a case of premature optimization? Since the code was devolved into something simpler.

// Ryan

Ayende Rahien

Ryan, Pretty much, we tried to read less data, but it actually turned out that just reading it all was cheaper.


Which build did this happen in? Please include the relevant builds in posts like these, Oren.
