Database Building 101High level graph operations
I talked about high level and low level data operations. So far, all we have seen are very low level operations (get node, get edges for, etc).
Let us see how we’ll deal with a bigger challenge. In this case, we want to implement a classic graph operation, doing a depth first search, filtering by both nodes and edges.
Here is how we can implement this:
In the real world, we’ll need quite a bit more. On each node (and edge) we’ll need to decide if to return it from the query, or just traverse through it, etc. And that is just to start with.
But I think this demonstrate the point of how to layer behavior on top of the lower level database operations.
There is one thing that we need to talk about still, this code will actually use a lot of individual transactions, one for each independent operation. That is quite expensive, we can open a single transaction and pass it to the functions we call, so there is just a single cost for the entire duration of the operation.
Other things we can do is to explicitly designate specific scenarios as important and change the design so we can answer them very quickly (as in the O(1) cost for accessing nodes/edge data).
More posts in "Database Building 101" series:
- (25 Aug 2016) Graph querying over large datasets
- (24 Aug 2016) Stable node ids
- (22 Aug 2016) High level graph operations
- (19 Aug 2016) Graphs aren’t toys
- (18 Aug 2016) The cost of graph storage
- (17 Aug 2016) The cost of graphing
- (16 Aug 2016) The storage layer
- (15 Aug 2016) Let us graph this for real
Comments
I'm wondering where this series is heading. Will you show us that document DBs are better than graph DBs or are you planning to build your own graph DB?
Tom, This is so we can talk about building databases. It isn't related to current work.
@TomCollins There is no better here and there. If your problem has graph bias a graph database;(probably no matter how low tech you can go, if well designed, will work better. The same can be said to documents databases and even relations databases. Your database should match your problem. As a rule of thumb, always use the type of database your problem has bias for.
Comment preview