reInvestigating query performance issue in RavenDB

time to read 3 min | 533 words

At the beginning of the year, we run into a problematic query. The issue was the use of an in clause vs. a series of OR. You can see the previous investigation results here. We were able to pinpoint the issue pretty well, very deep in the guts of Lucene, our query engine.

Fast Query Slow Query
image image
Time: 1 – 2 ms Time: 60 – 90 ms
image image

The key issue for this query was simple. There are over 600,000 orders with the relevant statuses, but there are no orders for CustomerId “customers/100”. In the OR case, we would evaluate the query lazily. First checking the CustomerId, and given that there have been no results, short circuiting the process and doing no real work for the rest of the query. The IN query, on the other hand, would do things eagerly. That would mean that it would build a data structure that would hold all 600K+ documents that match the query, and then would throw that all away because no one actually needed that.

In order to resolve that, I have to explain a bit about the internals of Lucene. As its core, you can think of Lucene in terms of sorted lists inside dictionaries. I wrote a series of posts on the topic, but the gist of it is:

 

Note that the ids for documents containing a particular term are sorted. That is important for a lot of optimizations in Lucene, which is also a major problem for the in query. The problem is that each component in the query pipeline needs to maintain this invariant. But when we use an IN query, we need to go over potentially many terms. And then we need to get the results in the proper order to the calling code. I implemented a tiered approach. If we are using an IN clause with a small number of terms in it (under 128), we will use a heap to manage all the terms and effectively do a merge sort on the results.

When we have more than 128 terms, that stops being very useful, however. Instead, we’ll create a bitmap for the possible results and scan through all the terms, filling the bitmap. That can be expensive, of course, so I made sure that this is done lazily by RavenDB.

The results are in:

  OR Query IN Query
Invalid CustomerId 1.39 – 1.5 ms 1.33 – 1.44 ms
Valid CustomerId 17.5 ms 12.3 ms

For the first case, this is now pretty much a wash. The numbers are slightly in favor of the IN query, but it is within the measurement fluctuations.

For the second case, however, there is a huge performance improvement for the IN query. For that matter, the cost is going to be more noticeable the more terms you have in the IN query.

I’m really happy about this optimization, it ended up being quite elegant.

More posts in "re" series:

  1. (27 Oct 2020) Investigating query performance issue in RavenDB
  2. (27 Dec 2019) Writing a very fast cache service with millions of entries
  3. (26 Dec 2019) Why databases use ordered indexes but programming uses hash tables
  4. (12 Nov 2019) Document-Level Optimistic Concurrency in MongoDB
  5. (25 Oct 2019) RavenDB. Two years of pain and joy
  6. (19 Aug 2019) The Order of the JSON, AKA–irresponsible assumptions and blind spots
  7. (10 Oct 2017) Entity Framework Core performance tuning–Part III
  8. (09 Oct 2017) Different I/O Access Methods for Linux
  9. (06 Oct 2017) Entity Framework Core performance tuning–Part II
  10. (04 Oct 2017) Entity Framework Core performance tuning–part I
  11. (26 Apr 2017) Writing a Time Series Database from Scratch
  12. (28 Jul 2016) Why Uber Engineering Switched from Postgres to MySQL
  13. (15 Jun 2016) Why you can't be a good .NET developer
  14. (12 Nov 2013) Why You Should Never Use MongoDB
  15. (21 Aug 2013) How memory mapped files, filesystems and cloud storage works
  16. (15 Apr 2012) Kiip’s MongoDB’s experience
  17. (18 Oct 2010) Diverse.NET
  18. (10 Apr 2010) NoSQL, meh
  19. (30 Sep 2009) Are you smart enough to do without TDD
  20. (17 Aug 2008) MVC Storefront Part 19
  21. (24 Mar 2008) How to create fully encapsulated Domain Models
  22. (21 Feb 2008) Versioning Issues With Abstract Base Classes and Interfaces
  23. (18 Aug 2007) Saving to Blob
  24. (27 Jul 2007) SSIS - 15 Faults Rebuttal
  25. (29 May 2007) The OR/M Smackdown
  26. (06 Mar 2007) IoC and Average Programmers
  27. (19 Sep 2005) DLinq Mapping