History of storage costs and the software design impact


Edgar F. Codd formulated the relational model in 1969. Ten years later, Oracle 2.0 came to market, and Sybase SQL Server came out with its first version in 1984. By the early 90s, it was clear that relational databases had pushed the competition (such as navigational or object-oriented databases) to the sidelines. It made sense: you could do a lot more with a relational database, and you could do it more easily, usually faster, and certainly in a more convenient manner.

Let us look at the environment those relational databases were written for. In 1979, you could buy IBM's 3370 direct access storage device. It offered a stunning 571 MB (read: megabytes) of storage for the mere cost of $35,100. For reference, the yearly salary of a programmer at that time was $17,535. In other words, a single 571 MB hard drive cost as much as two full-time developers for an entire year.

In 1980, the first drives with more than 1 GB of storage appeared. The IBM 3380, for example, was able to store a whopping 2.52 GB of information. The low-end version cost $97,650 at the time, and it was about as big as a washing machine. By 1986, the situation had improved, and you could purchase a good internal hard drive with all of 20 MB for merely $800. For reference, a good car at the time would cost you less than $7,000.

Skipping ahead again, by 1996 you could actually purchase a 2.83 GB drive for merely $2,900. A car at that time would cost you $12,371. I could go on, but I'm sure that you get the point by now. Storage used to be expensive. So expensive that it dominated pretty much every other concern that you can think of.

At the time of this writing, you can get a hard disk with 10 TB of storage for about $400 [1], and a 1 TB SSD will cost you less than $300 [2]. Those numbers give us about a quarter of a dollar (26 cents, to be exact) per GB for the SSD, and about 4 cents per GB for the hard disk.

Compare that to a price of $38,750 per gigabyte in 1980 (the IBM 3380's $97,650 spread over 2.52 GB). Oh, except that we forgot about inflation, so the inflation-adjusted price for a single GB was $114,451.63. Now, you would be right to point out that this is a very unfair comparison: I'm comparing consumer-grade hardware to high-end systems. Enterprise storage systems, the kind you actually run databases on, tend to be a bit above that price range. We can instead compare the cost of storing information in the cloud. Based on current pricing, it looks like storing a GB on Amazon S3 for 5 years (to compare with the expected lifetime of a hard disk) will cost less than $1.50, with Azure being even cheaper.
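The per-gigabyte figures are simple division, and it can help to see them side by side. The following is a rough sketch that only uses the prices and capacities quoted in this post. It assumes decimal units (1 GB = 1,000 MB), and the inflation multiplier is just the ratio of the two 1980 figures given here, not an official index value.

```python
# Figures quoted in the post: (label, price in USD, capacity in GB).
# Assumption: decimal units, so 571 MB = 0.571 GB and 10 TB = 10,000 GB.
DRIVES = [
    ("1979 IBM 3370", 35_100, 0.571),
    ("1980 IBM 3380", 97_650, 2.52),
    ("1986 internal 20 MB drive", 800, 0.020),
    ("1996 2.83 GB drive", 2_900, 2.83),
    ("10 TB hard disk [1]", 400, 10_000),
]


def price_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Return the cost of one gigabyte in dollars."""
    return price_usd / capacity_gb


for label, price, capacity in DRIVES:
    print(f"{label:<28} ${price_per_gb(price, capacity):>12,.2f} per GB")

# Multiplier implied by the post's own two numbers
# ($114,451.63 inflation-adjusted vs. $38,750 nominal).
implied_inflation = 114_451.63 / 38_750
print(f"implied inflation multiplier: {implied_inflation:.2f}")
```

One thing the table makes visible: the $800 drive from 1986 works out to roughly $40,000 per GB, so the lower sticker price came from the smaller capacity rather than from cheaper storage.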

The really interesting aspect of those numbers is the way they shaped the software written in that time period. It made a lot of sense to put a lot more of the burden on the user, not because you were lazy, but because it was the only way to do things. Most document databases, for example, store the document structure alongside the document itself (so property names are stored in each document). It would be utterly insane to try to do that in a system where hard disk space was so expensive. On the other hand, decisions such as “normalization is critical” were mostly driven by the necessity to reduce storage costs, and only later transitioned to the “purity of data model” reasoning, once the cost of disk space became a non-issue.
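To get a feel for why storing property names in every document would have been unthinkable, here is a minimal sketch. The record, the collection size of one million documents, and the use of compact JSON as the storage format are all assumptions made for illustration; real document databases use their own encodings. The two prices per GB are the ones computed in this post.

```python
import json

# Hypothetical record, used only to illustrate the overhead.
record = {"first_name": "Edgar", "last_name": "Codd", "year": 1969}

# Self-describing form: property names travel with every document.
with_names = json.dumps(record, separators=(",", ":")).encode("utf-8")
# Schema-bound form: only the values are stored; names live once in a schema.
values_only = json.dumps(list(record.values()), separators=(",", ":")).encode("utf-8")

overhead_bytes = len(with_names) - len(values_only)

DOCUMENTS = 1_000_000   # illustrative collection size (assumption)
PRICE_1980 = 38_750.0   # USD per GB, IBM 3380 figure from the post
PRICE_NOW = 0.04        # USD per GB, 10 TB hard disk figure from the post

overhead_gb = overhead_bytes * DOCUMENTS / 1e9

print(f"per-document overhead: {overhead_bytes} bytes")
print(f"cost of names in 1980: ${overhead_gb * PRICE_1980:,.2f}")
print(f"cost of names today:   ${overhead_gb * PRICE_NOW:,.5f}")
```

Under these assumptions, the repeated property names alone would have cost over a thousand dollars of disk in 1980, and a fraction of a cent today.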

 


[1] ST10000VN0004 - 7200RPM with 256MB Cache

[2] The SDSSDHII-1T00-G25 - with greater than 500 MB/sec read/write speeds and close to 100,000 IOPS