Schema-less databases
This post about how FriendFeed is using schema-less storage for most of their work is fascinating. There was a session about this at ALT.Net Seattle, and it generated a lot of interest.
My next post will cover the actual implementation details of doing something like that in a manner easily accessible in .Net, but the post is very interesting reading on its own. Another item that I found interesting, although it is far harder to read, is: http://highscalability.com/how-i-learned-stop-worrying-and-love-using-lot-disk-space-scale
Comments
This is a very interesting subject. The trick is to know when to use it - on large, scalable systems, which seems counter-intuitive since normal RDBMS usage is intended for large, scalable systems.
When do you use an RDBMS, then, and when do you use this approach? (Which is quite similar to Google's approach with GData, I might add.)
And now that I read the second link, I see that it does in fact speak of GData's approach. Huh.
Great read. RDBMSs are meant to be performant and scalable, and maybe they are. But their manageability ends quickly when the amount of data grows. I have had a very similar experience with a system built on top of SQL Server and an OR mapper. The system was meant for high availability, but upgrades are now very time-consuming and dangerous because it takes hours to update the database structure. You have only one chance to update the database, because if something goes wrong you don't have time to retry within the service window. What's worse, a rollback takes as long as completing the operation.
If I were designing the software again, I'd consider a more unstructured approach with custom indexing instead of rigid relational structure.
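To make "unstructured storage with custom indexing" a bit more concrete, here is a minimal sketch of one way it could look. It assumes entities are stored as opaque JSON blobs in a single table, with one small lookup table per indexed field. It uses SQLite from the Python standard library purely for illustration; the `EntityStore` class, the table names, and the method names are all hypothetical and are not taken from FriendFeed's post or from any particular library.

```python
import json
import sqlite3
import uuid


class EntityStore:
    """Hypothetical schema-less store: JSON blobs plus per-field index tables."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        # The only fixed "schema": an id and an opaque body.
        self.db.execute(
            "CREATE TABLE entities (id TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )
        self.indexed_fields = []

    def save(self, entity):
        """Store a dict of any shape and refresh its entries in every index."""
        entity_id = entity.setdefault("id", uuid.uuid4().hex)
        self.db.execute(
            "INSERT OR REPLACE INTO entities (id, body) VALUES (?, ?)",
            (entity_id, json.dumps(entity)),
        )
        for field in self.indexed_fields:
            self._index(field, entity_id, entity)
        return entity_id

    def _index(self, field, entity_id, entity):
        # Drop any stale entry, then index the current value if there is one.
        table = f"index_{field}"
        self.db.execute(f"DELETE FROM {table} WHERE entity_id = ?", (entity_id,))
        value = entity.get(field)
        if value is not None:
            self.db.execute(
                f"INSERT INTO {table} (value, entity_id) VALUES (?, ?)",
                (str(value), entity_id),
            )

    def add_index(self, field):
        """Create a new index table and backfill it; entities is not altered."""
        if not field.isidentifier():
            raise ValueError(f"invalid field name: {field!r}")
        self.db.execute(
            f"CREATE TABLE IF NOT EXISTS index_{field} ("
            "value TEXT NOT NULL, entity_id TEXT NOT NULL, "
            "PRIMARY KEY (value, entity_id))"
        )
        rows = self.db.execute("SELECT id, body FROM entities").fetchall()
        for entity_id, body in rows:
            self._index(field, entity_id, json.loads(body))
        if field not in self.indexed_fields:
            self.indexed_fields.append(field)

    def find_by(self, field, value):
        """Look up entities through an index table that was added earlier."""
        if field not in self.indexed_fields:
            raise KeyError(f"no index on {field!r}")
        rows = self.db.execute(
            f"SELECT e.body FROM index_{field} AS i "
            "JOIN entities AS e ON e.id = i.entity_id WHERE i.value = ?",
            (str(value),),
        )
        return [json.loads(body) for (body,) in rows]
```

In this sketch, adding a field to an entity needs no schema change at all, and adding an index means creating and filling a new table rather than altering the existing one. Whether that avoids long locks in practice depends on the database engine, so treat it as an illustration of the idea rather than a recipe.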
"In particular, making schema changes or adding indexes to a database with more than 10-20 million rows completely locks the database for hours at a time."
In MySQL -- yes. It is the single worst performing DB I have ever worked with, especially when it comes to index changes. Finding workarounds for this cannot be considered a breakthrough -- it would be much better if they just fixed it.
However, I do not yet see the value in having a schema-less DB2 or SQL Server (though it may be that I just have not had a real use case).