The cost of the cloud
In a previous post, I mentioned that I’m teaching a Cloud Computing course at university. That led to some good questions about established wisdom, the kind that I have to really think about before I can answer. One such question was about the intersection of databases and the cloud.
One of the most important factors for database performance is the I/O rate that you can get. Let’s take a fairly typical off-the-shelf drive, shall we?
The drive costs less than $500 US for a 2TB disk. It can write at close to 5GB/sec, with sustained writes sitting at 3GB/sec according to User Benchmark, and it is rated to hit 1 million IOPS. That is a lot. And that is what you get when you spend less than $500.
On the other hand, a comparable drive would be Azure P40, which costs $235.52 per month for 2TB of disk space. It also offers a stunning rate of 7,500 IOPS (with bursts of 30,000!). The write rate is 250MB/sec with bursts of 1GB/sec. The best you can get on Azure, though, is an Ultra disk, where a disk comparable to the on-premises option would cost you literally thousands per month (and would deliver about a tenth of the performance).
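To put rough numbers on that gap, here is a small Python sketch that uses only the figures quoted above. The drive price is taken as the $500 upper bound, and the variable names are mine, purely for illustration.

```python
# Figures quoted in the post. The drive price is an upper bound ("less than $500").
drive_price_usd = 500.0
drive_iops = 1_000_000

p40_monthly_usd = 235.52
p40_iops = 7_500
p40_burst_iops = 30_000

# Months of P40 rent that add up to the one-time price of the drive.
breakeven_months = drive_price_usd / p40_monthly_usd

# How many times more IOPS the local drive is rated for.
iops_ratio = drive_iops / p40_iops
burst_iops_ratio = drive_iops / p40_burst_iops

print(f"P40 rent matches the drive price after {breakeven_months:.1f} months")
print(f"Rated IOPS gap: {iops_ratio:.0f}x sustained, {burst_iops_ratio:.0f}x at burst")
```

This compares raw disk numbers only. It ignores the server, power and hosting that the physical drive needs.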
In other words, the cloud option is drastically more costly. To be fair, we aren’t comparing the same thing at all. A cloud disk is more than just renting the hardware. There is redundancy to consider, the ability to “move” the disk between instances, the ability to take snapshots and restore, etc.
A more comparable scenario would be to look at NVMe instances. If we take an L8sv2 instance on Azure, that gives us a 2TB NVMe drive with 400,000 IOPS and 2GB/sec throughput. That is at least within reach of the off-the-shelf disk I pointed out before. The cost? About $500 per month. But now we are talking about a machine that has 8 cores and 64 GB of RAM.
The downside of NVMe instances is that the disks are transient. If there is a failure that requires us to stop and start the machine (basically, moving hosts), the data is lost. You can reboot the machine, but you cannot stop the cloud allocation of the machine without losing the data.
The physical hardware option is much cheaper, it seems. If we add everything around the disk, we are going to get somewhat different costs. I found a server similar to the L8sv2 on Dell for about $7,000 US, for example. I’m pretty sure that you can get it for less if you know what you are doing, but it was my first try and it included 3.2 TB of enterprise-grade NVMe drives.
Colocation pricing can run about $100 a month (again, first search result, you can get it for less). If we spread the price of the server over a single year, that means that the total monthly cost is roughly $685. That is comparable to the cloud, actually, but doesn’t account for the fact that you can use the same server for much longer than a single year. It is also probably wasting a lot of money on bigger hardware. What you don’t get, which you probably want, is the envelope around that: the ability to say “I want another server” (or ten), the ability to move and manage your resources easily, etc. And that is as long as you are managing just hardware resources.
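The monthly figure above comes from spreading the server price over a single year and adding colocation. Here is a quick Python sketch of that arithmetic. The 36-month lifetime is my own assumption for illustration, not a number from Dell or any provider, and the function name is hypothetical.

```python
def monthly_cost(server_price_usd: float, colocation_monthly_usd: float,
                 lifetime_months: int) -> float:
    """Server price spread evenly over its lifetime, plus the monthly colocation fee."""
    return server_price_usd / lifetime_months + colocation_monthly_usd

SERVER_PRICE_USD = 7_000       # Dell quote from the post
COLOCATION_MONTHLY_USD = 100   # colocation figure from the post
CLOUD_MONTHLY_USD = 500        # approximate L8sv2 price from the post

one_year = monthly_cost(SERVER_PRICE_USD, COLOCATION_MONTHLY_USD, 12)
three_years = monthly_cost(SERVER_PRICE_USD, COLOCATION_MONTHLY_USD, 36)  # assumed lifetime

# Cumulative spend over the assumed 36 months.
cloud_total = CLOUD_MONTHLY_USD * 36
own_total = SERVER_PRICE_USD + COLOCATION_MONTHLY_USD * 36

print(f"Own server, 12-month lifetime: ${one_year:.2f}/month")
print(f"Own server, 36-month lifetime: ${three_years:.2f}/month")
print(f"36-month totals: cloud ${cloud_total}, own hardware ${own_total}")
```

The sketch covers hardware and hosting only. It leaves out labor, spare parts and everything else involved in actually running the machine.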
You don’t get any of the services or the expertise in running things. Given that even professional organizations can suffer devastating issues, you want to have an expert manage that, because an amateur handling that topic leads to problems.
A lot of the attraction of the cloud comes from a very simple reason: I don’t want to deal with all of that stuff. None of that is your competitive advantage and you would rather just pay and not think about that. The key for the success of the cloud is that globally, you are paying less (in time, effort and manpower) than you would by taking on the cost of managing it yourself.
There are two counterpoints here, though.
- At some scale, it would make sense to move out of the cloud to your own hardware. Dropbox did that at some point, moving some of its infrastructure off the cloud for savings of over 75 million dollars. You don’t have to be at Dropbox size to benefit from having some of your own servers, but you do need to hit some tipping point before that would make sense.
- StackOverflow is famously running on their own hardware, and is able to get great results out of that. I wonder how much the age of StackOverflow has to do with that, though.
The cloud is a pretty good abstraction, but it isn’t one that you get for free. There are a lot of scenarios where it makes a lot of sense to have some portions of your system outside of the cloud. The default of “everything is in the cloud”, however, makes a lot of sense, specifically because you don’t need to do complex (and costly) sizing computations. Once you have the system running and the load figured out, you can decide if it makes sense to move things to your own servers.
And, of course, this all assumes that we are talking about just the hardware. That is far from the case in today’s cloud. Cloud services are another really important aspect of what you get in the cloud. Consider the complexity of running a Kubernetes cluster, or setting up a system for machine vision or distributed storage or any of the things that the cloud providers have commoditized.
The decision of cloud usage is no longer a simple buy vs. rent but a much more complex one about where you draw the line between what should be your core concerns and what should be handled outside of your purview.
A very good open-ended article. I like how it removes the biggest false statement from the table about a simple buy vs. rent which is no longer relevant.
This article makes particularly good points, but it fails to mention one obvious cost - the human cost of keeping things running. If you are a large enough organization and the software you are supporting has a very direct impact on the business bottom line (like StackOverflow) then you can afford the humans needed to maintain the on-premises software. If not, you will often find that having one (1) employee working with a cloud-based service instead of a half dozen (6) employees working on-premises is cheaper. Each employee might cost between $6000 and $12000 per month (in the U.S. anyway). You might be able to have these people work on other things, but that will require more middle management, which also costs money. Also, the people specialized in your database technology might be unsuited for developing your SPA.
Any cost calculation that does not include labor is likely to be far off the mark.
The labor cost is probably the most significant aspect, yes. But be aware that you aren't trading that off completely. And having a single employee handling the cloud is likely to bite you down the road: you still need capacity to handle 24x7 operations, etc.
The fact that there is 24x7 support at the cloud doesn't really help if you don't have someone on your team that even knows what is going on there.