Database Building 101: The storage layer

Aug 16 2016

Database Building 101The storage layer

time to read 3 min | 559 words

We are going to be using Voron to actually handle the storage of the data. Voron is a low level storage engine, which provide, among other things, fully ACID, MVCC and high performance.

For today, what'll look at are the following operations:

Add a node.
Add an edge between two nodes.
Traverse from a node to all its edges (cheaply)
Traverse from an edge to the other node (cheaply).

Here is the interface code that we will need to build:

I tried to make it as simple as it possible could be. I’m using NameValueCollection because it is a trivial property bag to serialize / deserialize without bringing any additional dependencies.

Let us focus on the initialization of this class. We need to create a StorageEnvironment in Voron, and setup some structures.

Voron has several different storage options for data, and in this case, we are using a table. A table (which is very much unlike a RDMBS table) it a way to store records that has some indexes, but in this case, we are doing something strange. We are defining a table that has no indexes (this is so strange that we need to explicitly allow it). This is useful because tables manages indexes for us automatically, but in this case we will use them strictly for storage. We’ll see why in just a bit.

The code here is a bit strange. Tables are actually meant to hold multiple values, and define indexes on this. So using them to store a single value is something that you wouldn’t normally do. Except that the id that we get back from the table has a very interesting property. It has O(1) cost of access. So given a node id, I can access it directly, regardless of how big my database is. That is a property that I want, because effectively random access of nodes is something that happens quite a lot and is highly desirable.

Now, let us see how we can connect two edges, shall we. This code ignores all error handling, missing nodes, etc. It is meant to explain the core stuff, this is a toy database, after all.

Here we continue doing strange stuff. We already know that we use the empty schema table to have an O(1) access to data. And we store the edge’s data there. But then we run into some new stuff. We create a B+Tree called “Edges_”+type, which hold all of the edges of a particular type. But the content of this B+Tree is not simple. Instead, it is using fixed size trees. Those are also B+Trees, but they have well known size both for keys (which must be long) and the value (which must be small < 256 bytes). Because they are very compact, we can pack quite a lot of data into them, and work with them efficiently.

The end result is that we are now storing the node data and access it at O(1) cost. We also store a B+Tree full of fixed size tree (whose name is the source node id) and whose keys are the destination nodes, and the values are the edge data.

Confusing yet? Not much different than Dictionary<SourceNodeId, Dictionary<DestinationNodeId, EdgeDataId>>. That is quite enough for today, tomorrow I’ll show how we can traverse the data, and what kind of costs are associated with it.

0 comments

Tags:

Oren Eini

Oren Eini

CEO of RavenDB