What is new in RavenDB 3.0Query diagnostics

time to read 6 min | 1113 words

concept-18290_640We talked a lot about the changes we made for indexing, now let us talk about the kind of changes we are talking about from the query side of things. More precisely, this is when we start asking questions about our queries.

Timing queries. While it is rare that we have slow queries in RavenDB, it does happen, and when it does, we treat it very seriously. However, in the last few cases that we have seen, the actual problem wasn’t with RavenDB, it was with sending the data back to the client when we had a large result set and large number of documents.

In RavenDB 3.0, we have added the ability to get detailed statistics about what is the cost of the query in every stage of the pipeline.

RavenQueryStatistics stats;
var users = session.Query<Order>("Orders/Totals")
    .Statistics(out stats)
    .Customize(x => x.ShowTimings())
    .Where(x=>x.Company == "companies/11" || x.Employee == "employees/2")
    .ToList();

foreach (var kvp in stats.TimingsInMilliseconds)
{
    Console.WriteLine(kvp.Key + ": " + kvp.Value);
}

Console.WriteLine("Total: " + stats.DurationMilliseconds);

We can now ask RavenDB to explain us its reasoning when doing so:

  • Lucene search: 10
  • Loading documents: 2
  • Transforming results: 0
  • Total: 21

As you can see, the total time for this query is 21 ms, and we have 12 ms accounted for in the actual search time. The rest is network traffic.  This can help you diagnose more easily where the problem is, and hence, how to solve it.

Query timeout and cancellation. As I mentioned, we don’t really have long queries in RavenDB very often. But that is actually is something that happens, and we need a way to deal with that. RavenDB now places a timeout on the amount of time a query gets to run (including querying Lucene, loading documents or transforming the results). A query that doesn’t complete in time will be cancelled, and an error will be returned to the user.

You can also view the currently executing queries and kill a long running query (if you have specified a high timeout, for example).

Explaining queries. Sometimes it is easy to understand why RavenDB has decided to give you documents in a certain order. You asked them sorted by date, and you get them sorted by date. But when you are talking about complex queries, that is much harder. RavenDB will sort the results by default based on relevancy, and that can sometimes be a bit puzzling to understand.

Here is how we can do this:

session.Advanced.DocumentQuery<Order>("Orders/Totals")
                    .Statistics(out stats)
                    .WhereEquals("Company", "companies/11")
                    .WhereEquals("Employee", "employees/3")
                    .ExplainScores()
                    .ToList();

var explanation = stats.ScoreExplantaions["orders/759"];

The result of this would be something that looks like this:

0.6807194 = (MATCH) product of:
  1.361439 = (MATCH) sum of:
    1.361439 = (MATCH) weight(Employee:employees/3 in 469), product of:
      0.4744689 = queryWeight(Employee:employees/3), product of:
        2.869395 = idf(docFreq=127, maxDocs=830)
        0.165355 = queryNorm
      2.869395 = (MATCH) fieldWeight(Employee:employees/3 in 469), product of:
        1 = tf(termFreq(Employee:employees/3)=1)
        2.869395 = idf(docFreq=127, maxDocs=830)
        1 = fieldNorm(field=Employee, doc=469)
  0.5 = coord(1/2)

And if we were to ask for the explanation for orders/237, we will get:

6.047595 = (MATCH) sum of:
  4.686156 = (MATCH) weight(Company:companies/11 in 236), product of:
    0.8802723 = queryWeight(Company:companies/11), product of:
      5.32353 = idf(docFreq=10, maxDocs=830)
      0.165355 = queryNorm
    5.32353 = (MATCH) fieldWeight(Company:companies/11 in 236), product of:
      1 = tf(termFreq(Company:companies/11)=1)
      5.32353 = idf(docFreq=10, maxDocs=830)
      1 = fieldNorm(field=Company, doc=236)
  1.361439 = (MATCH) weight(Employee:employees/3 in 236), product of:
    0.4744689 = queryWeight(Employee:employees/3), product of:
      2.869395 = idf(docFreq=127, maxDocs=830)
      0.165355 = queryNorm
    2.869395 = (MATCH) fieldWeight(Employee:employees/3 in 236), product of:
      1 = tf(termFreq(Employee:employees/3)=1)
      2.869395 = idf(docFreq=127, maxDocs=830)
      1 = fieldNorm(field=Employee, doc=236)

In other words, we can see that orders/237 is ranked much higher than orders/759. That is because is matched both clauses of the query. And a match on Company is a much stronger indication for relevancy because Companies/11 appears only in 10 documents out out 830, while Employees/3 appears in 127 out of 830.

For details about this format, see this presentation, it actually talks about Solr here, but this data comes from Lucene, so it applies to both.

That is it about queries diagnostics, next, we’ll deal with transformers and another important optimization, the staleness reduction system.

More posts in "What is new in RavenDB 3.0" series:

  1. (24 Sep 2014) Meta discussion
  2. (23 Sep 2014) Operations–Optimizations
  3. (22 Sep 2014) Operations–the nitty gritty details
  4. (22 Sep 2014) Operations–production view
  5. (19 Sep 2014) Operations–the pretty pictures tour
  6. (19 Sep 2014) SQL Replication
  7. (18 Sep 2014) Queries improvements
  8. (17 Sep 2014) Query diagnostics
  9. (17 Sep 2014) Indexing enhancements
  10. (16 Sep 2014) Indexing backend
  11. (15 Sep 2014) Simplicity
  12. (15 Sep 2014) JVM Client API
  13. (12 Sep 2014) Client side
  14. (11 Sep 2014) The studio
  15. (11 Sep 2014) RavenFS
  16. (10 Sep 2014) Voron