Awesome RavenDB feature of the Day, Compression

RavenDB stores JSON documents. Internally on disk, we actually store the values in BSON format. This works great, but there are occasions where users are storing large documents in RavenDB.

In those cases, we have found that compressing those documents can drastically reduce the on-disk size of the documents.

Before we go on, we have to explain what this is for. It isn’t actually disk space that we are trying to save, although that is a nice benefit. What we are actually trying to do is reduce the IO cost that we pay when loading and saving documents. By compressing the documents before they hit the disk, we save valuable IO time (at the expense of relatively bountiful CPU time). Reducing the amount of IO we do has a nice impact on performance, and it means that we can put more documents in our page cache without running out of room.

And yes, it does reduce the total disk size, but the major thing is the IO cost.
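To make the trade-off concrete, here is a minimal sketch of compressing a large JSON document before it is written. It is only an illustration: it uses Python's `zlib` as a stand-in codec (the post does not say which algorithm RavenDB uses), and `make_large_document`, `compress_document`, and `decompress_document` are hypothetical names, not RavenDB APIs.

```python
import json
import zlib


def make_large_document(n_items=2000):
    """Build a hypothetical large document with many similar nested items."""
    return {
        "Id": "orders/1",
        "Lines": [
            {
                "Product": f"products/{i % 50}",
                "Quantity": i % 7 + 1,
                "Note": "standard shipping",
            }
            for i in range(n_items)
        ],
    }


def compress_document(doc):
    """Serialize the document and compress it (zlib is an assumed stand-in codec)."""
    raw = json.dumps(doc).encode("utf-8")
    return raw, zlib.compress(raw)


def decompress_document(blob):
    """Reverse of compress_document: spend CPU to get the document back."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


doc = make_large_document()
raw, packed = compress_document(doc)
# Fewer bytes to write and read means less IO per document.
print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes")
```

Because the whole document is read or written in one go, the decompression cost is paid once per load, which is the case where trading CPU for IO tends to pay off.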

Note that we only support compression for documents, not for indexes. The reason for that is quite simple: for indexes, we are doing a lot of random reads, whereas with documents, we almost always read or write the whole thing.

Because of that, we would have needed to break the index apart into manageable chunks (and thus allow random reads), but that would pretty much ensure a poor compression ratio. We ran some tests, and it just wasn’t worth the effort.
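The chunking problem can be sketched in a few lines. This is not how RavenDB's indexes are laid out; the fake index data, the 512-byte `CHUNK_SIZE`, and the helper names are all assumptions chosen for illustration, again with `zlib` as a stand-in codec. The point is only that compressing small chunks independently allows random reads, but each chunk loses the redundancy it shares with its neighbours.

```python
import zlib

CHUNK_SIZE = 512  # hypothetical chunk size, picked only for the illustration


def build_fake_index(n_entries=5000):
    """Produce made-up, highly repetitive index-like data."""
    lines = [
        f'{{"term":"user-{i:06d}","doc":"users/{i}"}}' for i in range(n_entries)
    ]
    return "\n".join(lines).encode("utf-8")


def compress_chunks(data, chunk_size=CHUNK_SIZE):
    """Compress each fixed-size chunk on its own so it can be read independently."""
    return [
        zlib.compress(data[i:i + chunk_size])
        for i in range(0, len(data), chunk_size)
    ]


def read_at(chunks, offset, length, chunk_size=CHUNK_SIZE):
    """Random read: decompress only the chunks that overlap the requested range."""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    buf = b"".join(zlib.decompress(c) for c in chunks[first:last + 1])
    start = offset - first * chunk_size
    return buf[start:start + length]


index = build_fake_index()
whole = zlib.compress(index)          # best ratio, but no random access
chunks = compress_chunks(index)       # random access, worse ratio
chunked_total = sum(len(c) for c in chunks)
print(f"raw: {len(index)}, whole: {len(whole)}, chunked: {chunked_total}")
```

Shrinking the chunks makes random reads cheaper and the ratio worse; growing them does the opposite, which is the tension described above.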

A final thought: this feature is going to be available for RavenDB Enterprise only.

I am not showing any code because the only thing you need to do to get it to work is use:

<add key="Raven/ActiveBundles" value="Compression"/>

And everything works, just a little bit smaller on disk :)