That No SQL Thing
Probably the worst thing about relational databases is that they are so good at what they do. Good enough to conquer the entire data storage market and hold it for decades.
Wait! That is a bad thing? How?
It is a bad thing because relational databases are appropriate for a wide range of tasks, but not for every task. Yet it is exactly that success that caused them to be used in contexts where they are not appropriate. In the last month alone, my strong recommendation for two different clients was that they switch to a non-relational data store, because it would greatly simplify the work that they need to do.
That met with some (justified) resistance, predictably. People think that RDBMS are the way to store data. I decided to write a series of blog posts about the topic, trying to explain why you might want to work with a No SQL database.
Relational Databases have the following properties:
- Table / Row based
- Rich querying capabilities
- Foreign keys
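To make those three properties concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `customers` and `orders` tables and their columns are made up for illustration; they are not taken from any real system.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Table / row based: data lives in tables with a fixed set of columns.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " total REAL NOT NULL)"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 25.0)")
conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 17.5)")

# Rich querying: join and aggregate across tables in a single statement.
rows = conn.execute(
    "SELECT c.name, COUNT(o.id), SUM(o.total)"
    " FROM customers c JOIN orders o ON o.customer_id = c.id"
    " GROUP BY c.id"
).fetchall()


def insert_orphan_order(connection):
    """Try to insert an order for a customer that does not exist."""
    try:
        connection.execute("INSERT INTO orders (customer_id, total) VALUES (999, 1.0)")
        return True
    except sqlite3.IntegrityError:
        # Foreign keys: the database itself rejects the dangling reference.
        return False
```

Here `rows` holds one summary row per customer, and `insert_orphan_order(conn)` returns False because the database refuses the order that points at a missing customer.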
Just about any of the No SQL approaches give up on some of those properties. Usually, they give up on all of them. But think about how useful an RDBMS is, how flexible it can be. Why give it up?
Indeed, the most common reason to want to move away from an RDBMS is running into its limitations. In short, an RDBMS doesn’t scale. Actually, let me phrase that a little more strongly: RDBMS systems cannot be made to scale.
The problem is inherent in the basic requirements of a relational database system: it must be consistent in order to handle things like foreign keys, maintain relations over the entire dataset, etc. The trouble starts when you try to scale a relational database over a set of machines. At that point, you run head on into the CAP theorem, which states that if consistency is your absolute requirement, you need to give up on either availability or partition tolerance.
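The trade-off is easier to see in a toy model. The sketch below is not a real database, just an illustration under simple assumptions: two replicas, a flag that simulates a network partition, and a hypothetical `mode` setting ("CP" or "AP") that decides what happens to writes while the replicas cannot talk to each other.

```python
class Replica:
    """One machine holding a copy of the data."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class TwoNodeStore:
    """Toy two-replica store. The 'CP' and 'AP' modes are illustrative labels."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a = Replica("a")
        self.b = Replica("b")
        self.partitioned = False  # True simulates a broken link between replicas

    def write(self, replica, key, value):
        """Write through one replica; return True if the write was accepted."""
        if not self.partitioned:
            # Both replicas are reachable, so the write is applied everywhere.
            self.a.data[key] = value
            self.b.data[key] = value
            return True
        if self.mode == "CP":
            # Stay consistent by refusing the write: availability is lost.
            return False
        # Stay available by accepting the write locally: replicas may diverge.
        replica.data[key] = value
        return True

    def consistent(self):
        """True when both replicas hold exactly the same data."""
        return self.a.data == self.b.data
```

With `partitioned` set to True, a "CP" store rejects every write and stays consistent, while an "AP" store accepts writes on both sides and its replicas drift apart. There is no third mode that accepts every write and keeps both replicas identical during a partition, which is the corner a relational database finds itself in.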
In most high scaling environments, it is not feasible to give up on either of those, so relational databases are out. That leaves you with the No SQL options. I am going to dedicate a post to each of the following; let me know if I missed something:
- Key/Value store
- Key/Value store – sharding
- Key/Value store – replication
- Key/Value store – eventually consistent
- Document Databases
- Graph Databases
- Column (Family) Databases
Other databases, namely XML databases and object databases, exist. Object databases suffer from the same problem regarding CAP as relational databases, and XML databases are basically a special case of a document database.