Graphs in RavenDB: Pre-processing the queries

Sep 20 2018

A query has two audiences: the users and the query engine. Ideally, you need to come up with a query language that would serve both. One of the early decisions that we made with the query language is that we want to be:

Very flexible for the user, giving them several ways to express themselves.
Very rigid in the query engine, with only one way to do something.

These two requirements directly contradict one another, which is indeed somewhat of a problem. The key here is that we don’t want to have multiple ways to do the same thing inside the query engine. That is a great way to introduce:

Different actual execution plans.
Features that only work with a specific syntax.
More complexity overall.

Anyone who has ever worked with the internals of LINQ can attest to the complexity that is involved here.

Let’s take the simple query that we have been inspecting so far:

Now, let’s ask RavenDB to spit it back out for us, shall we? Here is how RavenDB thinks about this query:

In other words, the way RavenDB sees the query and the way the user sees the query are very different. You can see that we have the with edges clauses here, defining the edges on the query.

Put differently, all of the query definitions happen in the with and with edges clauses. When we actually perform the matches, the match clause only defines the graph pattern that we need to match on. It is the responsibility of the query parser to rearrange the query from the multiple ways that the user may want to define it into the single representation that is actually going to be executed by the query runner.
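To make the idea concrete, here is a minimal Python sketch of this kind of normalization. It is not RavenDB's parser: the Element and CanonicalQuery types, the normalize function, the _autoN alias naming, and the duplicate-alias check are all hypothetical, invented for illustration. The only thing taken from the text is the shape of the result: every node lands in a with clause, every edge in a with edges clause, and the match pattern refers to aliases only.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Optional


@dataclass
class Element:
    """One node or edge as the user wrote it (hypothetical model)."""
    kind: str                     # "node" or "edge"
    source: str                   # collection name or edge property
    alias: Optional[str] = None   # the user may leave the alias out
    filter: Optional[str] = None  # optional inline filter, kept as text


@dataclass
class CanonicalQuery:
    """The single representation handed to the query runner (assumed shape)."""
    with_clauses: dict = field(default_factory=dict)        # alias -> (source, filter)
    with_edges_clauses: dict = field(default_factory=dict)  # alias -> (source, filter)
    match: list = field(default_factory=list)               # aliases, in pattern order


def normalize(pattern):
    """Hoist every inline definition out of the pattern into a named clause."""
    auto_ids = count(1)
    query = CanonicalQuery()
    for element in pattern:
        # Elements without an alias get a generated one, so the match
        # pattern can always refer to them by name.
        alias = element.alias or f"_auto{next(auto_ids)}"
        if alias in query.with_clauses or alias in query.with_edges_clauses:
            # Assumption of this sketch: an alias may be defined only once.
            raise ValueError(f"alias {alias!r} is already defined")
        if element.kind == "node":
            query.with_clauses[alias] = (element.source, element.filter)
        else:
            query.with_edges_clauses[alias] = (element.source, element.filter)
        query.match.append(alias)
    return query
```

Whether the user writes a filter inline or gives it a name up front, the runner in this sketch only ever sees the with_clauses / with_edges_clauses / match triple, which is the point of doing the work in the parser.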

This may seem like a lot of ceremony, but that is only because we have a very simple query. Let’s change the “friends of friends who aren’t my friends” to something a bit more interesting: “Close friends of my close friends who aren’t my friends”. We are also going to want to limit the friends that we follow only to Users (so, for example, we’ll not follow a FriendOf link to a Pet).

Here is what the query looks like, when we use more concise syntax, and how RavenDB translates it:

You’ll note that even for the query above, I still used a separate with clause to make things easier. The following query is exactly the same:

The basic idea is that for trivial filtering, you’ll probably want to do that inline, inside the match clause. But anything more complex should go to the with clause, where you can more easily express your logic. Also note that aliases matter. The f1 and f2 here are not duplicated for no reason: part of processing the query is to bind a value to each of the aliases, and you cannot bind a single result to multiple aliases.
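As a rough illustration of what binding a value to each alias means, here is a small Python sketch that evaluates “close friends of my close friends who aren’t my friends” over in-memory documents. It is not how RavenDB executes the query. The USERS dictionary, the ids, and both function names are made up for the example; the only assumption about the data is that each user has a Friends array whose entries carry an Id and a Close flag. Skipping the starting user on the second hop is also an assumption of this sketch.

```python
# Hypothetical in-memory stand-in for the Users collection.
USERS = {
    "users/1": {"Name": "Ross", "Friends": [
        {"Id": "users/2", "Close": True},
        {"Id": "users/3", "Close": False},
    ]},
    "users/2": {"Name": "Rachel", "Friends": [
        {"Id": "users/1", "Close": True},
        {"Id": "users/3", "Close": True},
        {"Id": "users/4", "Close": True},
        {"Id": "pets/1", "Close": True},   # not a User, must not be followed
    ]},
    "users/3": {"Name": "Chandler", "Friends": []},
    "users/4": {"Name": "Monica", "Friends": []},
}


def close_friend_ids(user_id):
    """Follow only edges marked Close that lead to a User document."""
    return [
        friend["Id"]
        for friend in USERS[user_id]["Friends"]
        if friend["Close"] and friend["Id"] in USERS
    ]


def close_friends_of_close_friends(start_id):
    """Return one dict per match, with a separate binding for each alias."""
    my_friends = {friend["Id"] for friend in USERS[start_id]["Friends"]}
    matches = []
    for f1 in close_friend_ids(start_id):   # first hop binds alias f1
        for f2 in close_friend_ids(f1):     # second hop binds alias f2
            # Assumption: the starting user is not their own suggestion.
            if f2 != start_id and f2 not in my_friends:
                matches.append({"u1": start_id, "f1": f1, "f2": f2})
    return matches
```

Each match carries its own value for u1, f1 and f2, which is why the two hops need two distinct aliases even though both point at the same collection.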

Another key aspect of this model is that, while the examples so far are pretty easy to follow, a with clause can contain any query. That means that you can use indexes as well, including Map/Reduce indexes. Here is one such example:

In this case, I’m not sure how good a graph query this is, I’ll admit, but it does a good job of demonstrating what you can do. We are taking a few queries, mixing them together and then mashing the results to find London companies that didn’t order as much as they used to.

This means that the source information for graph queries can be things like spatial queries, full text search, map/reduce, etc. A lot of the complexity in graph queries is just getting to the starting point of the graph pattern matching. With RavenDB, you have a very strong query language and facilities to help you get past that and directly into the graph operations.

That is enough about pre-processing the query. In my next post, I’m going to go in depth into how graph queries work with document models.

More posts in "Graphs in RavenDB" series:

(08 Nov 2018) Real world use cases
(01 Nov 2018) Recursive queries
(31 Oct 2018) Inconsistency abhorrence
(29 Oct 2018) Selecting the syntax
(26 Oct 2018) What’s the role of the middle man?
(25 Oct 2018) I didn’t mean to build this feature!
(22 Oct 2018) Query results
(21 Sep 2018) Graph modeling vs. document modeling
(20 Sep 2018) Pre-processing the queries
(19 Sep 2018) The query language
(18 Sep 2018) The overall design

Comments

20 Sep 2018
18:48 PM

Ed Toro

Can you give an example of what a User document would look like? I'm interested in how the edge directions (like "Close" in your example) are specified. Thanks!

20 Sep 2018
19:47 PM

Oren Eini

Ed, That would be something like:

{
    "Name": "Ross",
    "Friends": [
        { "Name": "Rachel", "Close": true, "Id": "users/23-B" },
        { "Name": "Chandler", "Close": false, "Id": "users/21-B" }
    ]
}

24 Sep 2018
11:35 AM

Ryan Heath

How would the graph query differ when you put parts of the friends in different (child) documents, to avoid getting too-large documents?

// Ryan

24 Sep 2018
21:22 PM

Oren Eini

Ryan, I'm not sure that I'm following. You'll either start from the root and follow the link, or start from the other document and go from there. Can you explain in more detail?

25 Sep 2018
08:19 AM

Ryan Heath

Like the bids model on your blog.

What if you split the friends into multiple 'friendlist' documents, where a single friendlist document contains, say, 100 friends:

users/1-A/friends/1
users/1-A/friends/2
etc etc

I'm not familiar with the graph language, so excuse me if it is a nooby question :)

// Ryan

26 Sep 2018
06:09 AM

Oren Eini

Ryan, In this case, we'll have something like:

UserFriends collection, and we'll issue a query like so:

(uf:UserFriends ( User == 'users/1-a') )-[:FriendOf]->(friend:Users)

And that would be that, pretty much.

Oren Eini

CEO of RavenDB