We have been working on a big benchmark of RavenDB recently. The data size that we are working on is beyond the TB range and we are dealing with over a billion documents. Working with such data sizes can be frustrating, because it takes quite a bit of time for certain things to complete. Since I had downtime while I was waiting for the data to load, I reached to a new toy I just got, a Raspberry PI 400. Basically, a Raspberry Pi 4 that is embedded inside a keyboard. It is a pretty cool machine and an awesome thing to play around with:
Naturally, I had to try it out with RavenDB. We have had support on running on ARM devices for a long while now, and we have dome some performance work on the Raspberry PI 3. There are actually a whole bunch of customers that are using RavenDB in production on the Pi. These range from embedding RavenDB in industrial robots, using RavenDB to store traffic analysis data on vehicles and deploying Raspberry PI servers to manage fleets of IoT sensors in remote locations.
The Raspberry PI 4 is supposedly much more powerful and I got the model with 4GB of RAM to play around with. And since I already had a data set that hit the TB ranges lying around, I decided to see what we could do with both of those.
I scrounged an external hard disk that we had lying around that had sufficient capacity and started the import process. This is where we are after a few minutes:
A couple of things to notice about this. At this point the import process is running for about two and half minutes and imported about 4 million documents. I want to emphasize that this is running on an HDD (and a fairly old one at that). Currently I can feel its vibrations on the table, so we are definitely I/O limited there.
Once I’ll be done with the data load (which I expect to take a couple of days), we’ll be testing this with queries. Should be quite fun to compare the costs of this to a cloud instance. Given typical cloud machines, we can probably cover the costs of the PI in a few days.