Reading the NSA’s codebase: LemonGraph review–Part VII–Summary


As the final post in this series, I decided to see how I can create a complex query. Given the NSA’s statement, I decided to see if I can use LemonGraph to find a dog of interest. In particular, given our graph, I wanted to start with a particular dog and find another dog that likes this dog and that also likes a dog that dislikes the original.

As a reminder, here is the graph:


And starting from Arava, I want to find a dog that likes Arava that also likes a dog that isn’t liked by Arava.

The LemonGraph query language isn’t very expressive, or at least I’m not familiar enough with it to make it work properly. I decided on the following query:

n(type="dog", value="arava")->
               @e(type="likes", value="yes")->
               @e(type="likes", value="yes")->
               @e(type="likes", value="no")->
@N(type="dog", value="arava")

This is a bit of a brute force way to do it, since it encodes the path directly. There are a few minor things that might not be obvious here. The @ prefix means don’t return this to the user, and the N() indicates that we shouldn’t filter already seen values. I can certainly see how this can be useful for certain things. I wonder if LemonGraph has a better way to express such a query.
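
To make the path that the query encodes a bit more concrete, here is a minimal sketch in plain Python. It doesn’t use LemonGraph at all. The edge list is made up for illustration (only "arava" comes from the query above; the other dog names, the edges, and the helper functions are hypothetical), and I’m assuming that each -> step follows an outgoing edge.

```python
# Toy edge list: (source, target, value of the "likes" edge).
# Only "arava" comes from the post; the other dogs and all edges are made up.
LIKES = [
    ("arava", "oscar", "yes"),
    ("oscar", "phoebe", "yes"),
    ("phoebe", "arava", "no"),
    ("oscar", "arava", "yes"),
]


def outgoing(edges, source, value):
    """Return the targets of all edges leaving `source` that carry `value`."""
    return [dst for src, dst, val in edges if src == source and val == value]


def find_paths(edges, start):
    """Follow yes -> yes -> no edges from `start`, keeping paths that end at `start`."""
    paths = []
    for first in outgoing(edges, start, "yes"):
        for second in outgoing(edges, first, "yes"):
            for last in outgoing(edges, second, "no"):
                # Ending on the start node means revisiting a node we have
                # already seen, which is what N() permits in the query.
                if last == start:
                    paths.append((start, first, second, last))
    return paths


paths = find_paths(LIKES, "arava")
print(paths)
```

The three nested loops mirror the three edge steps in the query, which is exactly why this counts as brute force: the shape of the path is hard coded, so a longer or shorter chain would need a different query.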

This is the first time I have actually reviewed this kind of codebase, where some things are done in C and some in Python. It was quite interesting to see the interaction between them. The codebase itself is really interesting, but I found it really hard to follow at times. The love affair with tuples and dynamic behavior made the code nontrivial and is likely going to cause maintenance issues down the line. It is also quite obvious that this is intended for internal consumption, with very little time or effort spent on “productization”. By that I mean things like better query errors and more obvious ways to do things.

It has an interesting approach to solving the problem of graph queries and I’ve learned quite a few things from it.

More posts in "Reading the NSA’s codebase" series:

  1. (13 Aug 2018) LemonGraph review–Part VII–Summary
  2. (10 Aug 2018) LemonGraph review–Part VI–Executing queries
  3. (09 Aug 2018) LemonGraph review–Part V–Query parsing
  4. (08 Aug 2018) LemonGraph review–Part IV–Compressed, sortable integers
  5. (07 Aug 2018) LemonGraph review–Part III - Figuring out queries
  6. (06 Aug 2018) LemonGraph review–Part II - Storing edges and properties
  7. (03 Aug 2018) LemonGraph review–Part I - Storing Nodes