Graphs in RavenDB: Pre-processing the queries


A query has two audiences: the users and the query engine. Ideally, you need to come up with a query language that serves both. One of the early decisions that we made with the query language is that we want it to be:

  • Very flexible for the user, giving them several ways to express themselves.
  • Very rigid in the query engine, with only one way to do something.

These two requirements directly contradict one another, which is indeed somewhat of a problem. The key here is that we don’t want to have multiple ways to do the same thing in the query engine. That is a great way to introduce:

  • Different actual execution plans.
  • Features that only work with a specific syntax.
  • More complexity overall.

Anyone who has ever worked with the internals of LINQ can attest to the complexity that is involved here.

Let’s take the simple query that we have been inspecting so far:

image

Now, let’s ask RavenDB to spit it back out for us, shall we? Here is how RavenDB thinks about this query:

image

In other words, the way RavenDB sees the query and the way the user sees the query are very different. You can see that we have the with edges clauses here, defining the edges on the query.

Put differently, all of the query definitions happen in the with and with edges clauses. When we need to actually perform the matches, the match clause only defines the graph pattern that we need to match on. It is the responsibility of the query parser to rearrange the query from the multiple ways that the user may want to define it into the single representation that is actually going to be executed by the query runner.
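To make the idea concrete, here is a minimal Python sketch of that kind of pre-processing. It is not RavenDB's parser: the `Node`, `Edge` and `normalize` names, the tuple layout and the filter strings are all assumptions made up for illustration. It only shows the shape of the transformation, where inline definitions get hoisted into the equivalent of with and with edges clauses and the match pattern is left holding aliases only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Inline node definition (hypothetical): a collection plus an optional filter."""
    alias: str
    collection: str
    where: Optional[str] = None


@dataclass(frozen=True)
class Edge:
    """Inline edge definition (hypothetical): the property to follow plus an optional filter."""
    alias: str
    path: str
    where: Optional[str] = None


def normalize(pattern, with_nodes=None, with_edges=None):
    """Return (with_nodes, with_edges, match) in a single canonical shape.

    `pattern` mixes bare alias strings (already defined in a with clause)
    and inline Node/Edge definitions. Inline definitions are hoisted out,
    so `match` ends up holding aliases only.
    """
    nodes = dict(with_nodes or {})
    edges = dict(with_edges or {})
    match = []
    for element in pattern:
        if isinstance(element, str):
            if element not in nodes and element not in edges:
                raise ValueError(f"unknown alias: {element}")
            match.append(element)
            continue
        if element.alias in nodes or element.alias in edges:
            raise ValueError(f"alias defined twice: {element.alias}")
        if isinstance(element, Node):
            nodes[element.alias] = (element.collection, element.where)
        else:
            edges[element.alias] = (element.path, element.where)
        match.append(element.alias)
    return nodes, edges, match


# Two ways of writing the same query (filter strings are placeholders)...
inline = normalize([
    Node("f1", "Users", "id() = 'users/1'"),
    Edge("e1", "Friends"),
    Node("f2", "Users"),
])
explicit = normalize(
    ["f1", "e1", "f2"],
    with_nodes={"f1": ("Users", "id() = 'users/1'"), "f2": ("Users", None)},
    with_edges={"e1": ("Friends", None)},
)
# ...end up as one representation for the engine.
assert inline == explicit
```

Whatever the user wrote, the code that runs the query only ever has to deal with the output of `normalize`, which is the point of keeping the engine side rigid.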

This may seem like a lot of ceremony, but that is only because we have a very simple query. Let’s change the “friends of friends who aren’t my friends” to something a bit more interesting: “Close friends of my close friends who aren’t my friends”. We are also going to want to limit the friends that we follow only to Users (so, for example, we’ll not follow a FriendOf link to a Pet).

Here is what the query looks like when we use a more concise syntax, and how RavenDB translates it:

image

You’ll note that even for the query above, I still used a separate with clause to make things easier. The following query is exactly the same:

image

The basic idea is that for trivial filtering, you’ll probably want to do that inline, inside the match clause. But anything more complex should go to the with clause, where you can more easily express your logic. Also note that aliases matter. The f1 and f2 here are not duplicated for no reason: part of processing the query is to bind a value to each of the aliases, and you cannot bind a single result to multiple aliases.
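The role of the aliases can be sketched with a toy matcher in Python. Everything here is an assumption for illustration (the `users` data, the `match_path` helper, and the duplicate-alias check, which is this sketch's own rule and not a statement about how RavenDB enforces it). Each result row binds exactly one document to each alias, which is why the two hops need two names.

```python
# Toy documents: each user lists the ids of their friends (hypothetical data).
users = {
    "users/1": {"Friends": ["users/2"]},
    "users/2": {"Friends": ["users/1", "users/3"]},
    "users/3": {"Friends": []},
}


def match_path(sources, aliases, edge):
    """Walk `edge` from alias to alias, producing one row per full match.

    `sources` maps each alias to the ids its with clause produced.
    Every row binds exactly one document id to every alias.
    """
    if len(set(aliases)) != len(aliases):
        # Reusing an alias would overwrite the earlier binding in the row.
        raise ValueError("each step in the pattern needs its own alias")
    rows = [{aliases[0]: doc_id} for doc_id in sources[aliases[0]]]
    for prev, cur in zip(aliases, aliases[1:]):
        allowed = set(sources[cur])
        rows = [
            {**row, cur: target}
            for row in rows
            for target in users[row[prev]][edge]
            if target in allowed
        ]
    return rows


everyone = list(users)
rows = match_path(
    {"me": ["users/1"], "f1": everyone, "f2": everyone},
    ["me", "f1", "f2"],
    "Friends",
)
for row in rows:
    print(row)
```

If the same name were used for both hops, the second binding would replace the first inside the row, and the result could no longer say which friend led to which friend of a friend.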

Another key aspect of this model is that, while the examples so far are pretty easy to follow, a with clause can contain any query. That means that you can use indexes as well, including Map/Reduce indexes. Here is one such example:

image

In this case, I’m not sure how good a graph query this is, I’ll admit, but it does a good job of demonstrating what you can do. We are taking a few queries, mixing them together and then mashing the results to find London companies who didn’t order as much as they used to.

This means that the source information for graph queries can be things like spatial queries, full text search, map/reduce, etc. A lot of the complexity in graph queries is just getting to the start of the graph pattern matching. With RavenDB, you have a very strong query language and facilities to help you get past that and directly into the graph operations.
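As a rough illustration of sources coming from different kinds of queries, the Python sketch below feeds one plain filter and one map/reduce style aggregation into a single matching step. The data, field names and helper functions are all hypothetical, and this is not the query from the post; it only mirrors the "London companies that ordered less than before" idea under those assumptions.

```python
from collections import defaultdict

# Hypothetical documents standing in for two collections.
companies = {
    "companies/1": {"City": "London"},
    "companies/2": {"City": "Paris"},
    "companies/3": {"City": "London"},
}
orders = [
    {"Company": "companies/1", "Year": 2017, "Total": 900},
    {"Company": "companies/1", "Year": 2018, "Total": 400},
    {"Company": "companies/2", "Year": 2017, "Total": 800},
    {"Company": "companies/2", "Year": 2018, "Total": 100},
    {"Company": "companies/3", "Year": 2017, "Total": 100},
    {"Company": "companies/3", "Year": 2018, "Total": 250},
]


def london_companies():
    """Source 1: a plain filtered query."""
    return {cid for cid, doc in companies.items() if doc["City"] == "London"}


def totals_by_company_and_year():
    """Source 2: a map/reduce style aggregation over orders."""
    totals = defaultdict(int)
    for order in orders:
        totals[(order["Company"], order["Year"])] += order["Total"]
    return dict(totals)


def declining(sources, recent, previous):
    """Combine both sources: companies whose recent total dropped."""
    candidates = sources["companies"]()
    totals = sources["totals"]()
    return sorted(
        cid
        for cid in candidates
        if totals.get((cid, recent), 0) < totals.get((cid, previous), 0)
    )


sources = {"companies": london_companies, "totals": totals_by_company_and_year}
result = declining(sources, recent=2018, previous=2017)
print(result)
```

The matching step does not care how each source was computed. Swapping `london_companies` for a spatial or full text search would leave `declining` unchanged.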

That is enough about pre-processing the query. In my next post, I’m going to go in depth into how graph queries work with document models.

More posts in "Graphs in RavenDB" series:

  1. (08 Nov 2018) Real world use cases
  2. (01 Nov 2018) Recursive queries
  3. (31 Oct 2018) Inconsistency abhorrence
  4. (29 Oct 2018) Selecting the syntax
  5. (26 Oct 2018) What’s the role of the middle man?
  6. (25 Oct 2018) I didn’t mean to build this feature!
  7. (22 Oct 2018) Query results
  8. (21 Sep 2018) Graph modeling vs. document modeling
  9. (20 Sep 2018) Pre-processing the queries
  10. (19 Sep 2018) The query language
  11. (18 Sep 2018) The overall design