That No SQL Thing: Graph databases
After a short break, let us continue the discussion. Think of a graph database as a document database with a special type of document: relations. A simple example is a social network:
There are four documents and three relations in this example. Relations in a graph database are more than just pointers. A relation can be unidirectional or bidirectional, but more importantly, a relation is typed: I may be associated with you in several ways; you may be a client, family, or my alter ego. The relation itself can also carry information. In the case of the relation documents in the example above, we simply record the type of the association.
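To make the idea of a relation as a special kind of document concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Document` and `Relation` classes, the `relations_of` helper, the `since` property, and every name other than ayende are invented, and no particular graph database necessarily stores things this way.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    id: str
    data: dict = field(default_factory=dict)


@dataclass
class Relation:
    source: str                 # id of the document the relation starts at
    target: str                 # id of the document the relation points to
    rel_type: str               # relations are typed: "Client", "Family", ...
    bidirectional: bool = False
    data: dict = field(default_factory=dict)  # a relation can carry information


# Four documents and three relations (hypothetical people).
documents = {
    name: Document(name) for name in ("ayende", "alice", "bob", "carol")
}
relations = [
    Relation("ayende", "alice", "Client"),
    Relation("ayende", "bob", "Family", bidirectional=True),
    Relation("ayende", "carol", "Alter Ego", bidirectional=True,
             data={"since": 2008}),
]


def relations_of(doc_id, relations):
    """Return (type, other id) pairs that can be followed from doc_id."""
    found = []
    for rel in relations:
        if rel.source == doc_id:
            found.append((rel.rel_type, rel.target))
        elif rel.bidirectional and rel.target == doc_id:
            # A bidirectional relation is visible from both ends.
            found.append((rel.rel_type, rel.source))
    return found


print(relations_of("bob", relations))    # [('Family', 'ayende')]
print(relations_of("alice", relations))  # [] (the Client relation is one-way)
```

Note that the direction matters: alice cannot see the unidirectional Client relation from her side, while bob can follow the bidirectional Family relation back to ayende.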
And that is about it, mostly. Once you have figured out that a graph database can be seen as a document database with a special document type, you are pretty much done.
Except that graph databases have one additional quality that makes them very useful: they allow you to perform graph operations. The most basic graph operation is traversal. For example, let us say that I want to know which of my friends are in town so I can go and have a drink. That is pretty easy to do, right? But what about indirect friends? Using a graph database, I can define the following query:
new GraphDatabaseQuery
{
    SourceNode = ayende,
    MaxDepth = 3,
    RelationsToFollow = new[] { "As Known As", "Family", "Friend", "Romantic", "Ex" },
    Where = node => node.Location == ayende.Location,
    SearchOrder = SearchOrder.BreadthFirst
}.Execute();
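What a query like this has to do under the covers can be approximated with a depth-limited breadth-first traversal. The following is only a sketch in Python, not how any particular graph database implements it. The `traverse` function, the adjacency-list layout, and all people and locations other than ayende are assumptions made up for the example.

```python
from collections import deque


def traverse(graph, source, max_depth, relations_to_follow, where):
    """Breadth-first walk from source, following only the allowed relation
    types, at most max_depth hops away. Returns the nodes matching where."""
    allowed = set(relations_to_follow)
    seen = {source}
    queue = deque([(source, 0)])
    matches = []
    while queue:
        node, depth = queue.popleft()
        if node != source and where(node):
            matches.append(node)
        if depth == max_depth:
            continue  # do not expand past the depth limit
        for rel_type, neighbor in graph.get(node, []):
            if rel_type in allowed and neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, depth + 1))
    return matches


# Hypothetical data: node -> list of (relation type, neighbor).
graph = {
    "ayende": [("Friend", "alice"), ("Client", "bob")],
    "alice": [("Friend", "ayende"), ("Family", "carol")],
    "bob": [("Client", "ayende"), ("Friend", "dave")],
    "carol": [("Family", "alice"), ("Friend", "erin")],
    "dave": [("Friend", "bob")],
    "erin": [("Friend", "carol")],
}
location = {"ayende": "London", "alice": "Paris", "bob": "London",
            "carol": "London", "dave": "London", "erin": "London"}


def in_town(node):
    return location[node] == location["ayende"]


print(traverse(graph, "ayende", 2, ["Friend", "Family"], in_town))
# ['carol']
print(traverse(graph, "ayende", 3, ["Friend", "Family"], in_town))
# ['carol', 'erin']
```

In this sketch bob and dave are never returned, even though they are in town, because the Client relation is not in the list of relations to follow. Raising the depth limit from 2 to 3 is what brings erin into the result.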
I can execute more complex queries, filtering on the relation properties, considering weights, etc.
Graph databases are commonly used to solve network problems. In fact, most social networking sites use some form of a graph database to do things like “You might know…”.
Because graph databases are intentionally designed to make graph traversal cheap, they can also provide other operations that tend to be very expensive without cheap traversal. For example, finding the shortest path between two nodes. That turns out to be frequently useful when you want to do things like: “Who can recommend me to this company’s CTO so they would hire me?”
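As an illustration of the shortest path operation, here is a minimal sketch in Python that uses a plain breadth-first search over an unweighted "knows" graph. It is one simple way to do it under stated assumptions, not a description of what any specific graph database runs internally. The `shortest_path` function and all the people in the data are hypothetical.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return the shortest chain of nodes from start to goal in an
    unweighted graph, or None if goal cannot be reached."""
    if start == goal:
        return [start]
    parent = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, []):
            if neighbor in parent:
                continue
            parent[neighbor] = node
            if neighbor == goal:
                # Walk the parent links back to the start.
                path = []
                step = goal
                while step is not None:
                    path.append(step)
                    step = parent[step]
                return path[::-1]
            queue.append(neighbor)
    return None


# Hypothetical "knows" graph: who could introduce me to the CTO?
knows = {
    "me": ["alice", "bob"],
    "alice": ["me", "carol"],
    "bob": ["me", "dave"],
    "carol": ["alice", "cto"],
    "dave": ["bob", "erin"],
    "erin": ["dave", "cto"],
    "cto": ["carol", "erin"],
}

print(shortest_path(knows, "me", "cto"))
# ['me', 'alice', 'carol', 'cto']
```

Breadth-first search finds the path with the fewest hops because it explores everyone one introduction away before anyone two introductions away. If relations carried weights, such as how well two people know each other, a weighted algorithm like Dijkstra's would be the usual replacement.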
More posts in "That No SQL Thing" series:
- (03 Jun 2010) Video
- (14 May 2010) Column (Family) Databases
- (09 May 2010) Why do I need that again?
- (07 May 2010) Scaling Graph Databases
- (06 May 2010) Graph databases
- (22 Apr 2010) Document Database Migrations
- (21 Apr 2010) Modeling Documents in a Document Database
- (20 Apr 2010) The relational modeling anti pattern in document databases
- (19 Apr 2010) Document Databases – usages
Comments
Not sure how many more posts are coming on this topic, but it's probably a good idea to list some of the leading open source social graph databases in this category:
Neo4J appears to be the leading social graph db that everyone else compares themselves to:
http://neo4j.org/
Although Twitter invented and uses FlockDB, so by that definition it's also worth a look:
http://github.com/twitter/flockdb
And because it's relevant: one of the latest big announcements coming out of Facebook was that they've opened their graph DB, so you can access it using their Graph API and even connect your own digital content to it using the Open Graph Protocol:
blogs.neotechnology.com/.../...raph-databases.html
Don't tell us you've written a graph DB extension to Raven as well... ;-)
G,
I am prototyping a lot of things. :-)
I partially agree with you. You can easily implement a GraphDB-like model on top of any document DBMS, but you would also need special operators to walk and traverse graphs.
I'm working on OrientDB. It's a NoSQL hybrid document-graph DBMS with special operators for graph operations. The main difference is that you can query it using SQL with some extensions, like:
select from People where friends TRAVERSE(1,7) (name = 'Ayende' and surname = 'Rahien')
This means: get all the people who are connected to you through the friends relationship, up to the 7th level of separation.
bye,
Lvc@
I'm not sure I understand...
From everything I have read, it seems that a document DB is suited to a database of largely "independent" objects which are aggregates. Document DBs are not well suited for relationships.
But this post states the exact opposite. It's actually creating a database of largely "dependent" objects.
I thought this is exactly what a document DB is not good at? Isn't this better suited to a relational database? Or am I missing something?
When is it "ok" to use relationships and when is it not? Is there a general rule to this? I am really struggling with this.
Eric,
Document DBs and graph DBs are similar, but not identical.
Graph DBs are optimized for graph operations; doc DBs aren't.