I found myself reading this post, and at some point, I really wanted to cry:
We had relatively long, descriptive names in MySQL such as timeAdded or valueCached. For a small number of rows, this extra storage only amounts to a few bytes per row, but when you have 10 million rows, each with maybe 100 bytes of field names, then you quickly eat up disk space unnecessarily. 100 * 10,000,000 = ~900MB just for field names!
We cut down the names to 2-3 characters. This is a little more confusing in the code but the disk storage savings are worth it. And if you use sensible names then it isn’t that bad e.g. timeAdded -> tA. A reduction to about 15 bytes per row at 10,000,000 rows means ~140MB for field names – a massive saving.
Let me do the math for a second, okay?
A two terabyte hard drive now costs 120 USD. By my math, that makes:
- 1 TB = 60 USD
- 1 GB = 0.058 USD
In other words, that massive saving that they are talking about? Roughly 5 cents!
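To make that arithmetic easy to check, here is a minimal sketch using the figures above. The 900 MB and 140 MB numbers come from the quoted post; treating 1 TB as 1024 GB and 1 GB as 1024 MB is my assumption, and the variable names are only illustrative.

```python
# Drive price from the post: a 2 TB drive for 120 USD.
DRIVE_PRICE_USD = 120.0
DRIVE_SIZE_TB = 2
GB_PER_TB = 1024  # assumption: binary units
MB_PER_GB = 1024  # assumption: binary units

price_per_gb = DRIVE_PRICE_USD / DRIVE_SIZE_TB / GB_PER_TB

# Field-name overhead quoted in the original post, in MB.
before_mb = 900
after_mb = 140
saved_gb = (before_mb - after_mb) / MB_PER_GB

# Value of the disk space that shorter field names free up.
saving_usd = saved_gb * price_per_gb

print(f"price per GB: ${price_per_gb:.4f}")
print(f"space saved:  {saved_gb:.2f} GB")
print(f"saving:       ${saving_usd:.3f}")
```

With these inputs the saving comes out at a bit over 4 cents, so calling it 5 cents is already rounding in their favor.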
Let me do another math problem, okay?
A developer costs about 75,000 USD per year.
- (52 weeks – 2 vacation weeks) x 40 work hours = 2,000 work hours per year.
- 75,000 / 2,000 = 37.5 $ / hr
- 37.5 / 60 minutes = 62.5 cents per minute.
In other words, assuming that this change cost a single minute of developer time, the entire saving is worse than moot.
And it is going to take a lot more than one minute.
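The same comparison as a small sketch, assuming the salary and working hours listed above and taking the storage saving as 5 cents. The names are hypothetical and exist only for this illustration.

```python
# Developer cost figures from the post.
SALARY_USD = 75_000
WORK_WEEKS = 52 - 2      # two weeks of vacation
HOURS_PER_WEEK = 40

hours_per_year = WORK_WEEKS * HOURS_PER_WEEK
cost_per_hour = SALARY_USD / hours_per_year
cost_per_minute = cost_per_hour / 60

# Storage saving from the earlier calculation, rounded up to 5 cents.
STORAGE_SAVING_USD = 0.05

# How many seconds of developer time the saving pays for.
break_even_seconds = STORAGE_SAVING_USD / cost_per_minute * 60

print(f"hours per year:  {hours_per_year}")
print(f"cost per hour:   ${cost_per_hour:.2f}")
print(f"cost per minute: ${cost_per_minute:.3f}")
print(f"break-even:      {break_even_seconds:.1f} seconds of developer time")
```

Under these assumptions the saving pays for less than five seconds of a developer's time, which is not enough to rename a single field, let alone all of them.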
Update: Fixed decimal placement error in the cost per minute. Fixed mute/moot issue.
To those of you pointing out that the cost of real server storage is much higher: you are correct, of course. I am trying to make a point. Even assuming that it costs two orders of magnitude more than what I said, that is still only $5. Are you going to tell me that saving the price of a single cup of coffee is actually meaningful?
To those of you pointing out that MongoDB effectively stores the entire DB in memory: the post talked about disk size, not memory, and even so, that is still not relevant. MongoDB only requires indexes to fit in memory, and (presumably) indexes don't need to store the field name for each indexed entry. If they do, then there is something very wrong with the implementation.