Negative feature response: Automatic attachment compression in RavenDB

Following my previous post, which mentioned that you can save significantly on disk space if you store a plain text attachment using gzip, we got a feature request:

Perhaps in future attachments could have built-in compression as well?

The answer to that is no, but I thought that it is worth a post to explain why not.

Let’s consider the typical types of attachments that you’ll store in RavenDB. Based on experience, we usually see:

  • PDF files
  • Word / Excel / PowerPoint
  • Images (JPEG, PNG, GIF, etc.)
  • Videos
  • Designs (floor plans, CAD / DWG, etc)
  • Text files

Aside from the text files, pretty much all the data you’ll store as an attachment is already compressed. In fact, you’ll be hard pressed today to find any file format that does not already have built-in compression.

Compressing already compressed data is… suboptimal. It will not usually lead to significant space savings and can actually make the file size larger. It also burns CPU cycles unnecessarily.
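
To make that concrete, here is a minimal Python sketch that compares what gzip does to plain text with what it does to data that is already compressed. Note the assumption: I am using random bytes as a stand-in for a real JPEG or video file, since well-compressed data looks close to random. The `gzip_ratio` helper is just a name I made up for this illustration.

```python
import gzip
import os


def gzip_ratio(data: bytes) -> float:
    """Return compressed size / original size (below 1.0 means space was saved)."""
    return len(gzip.compress(data)) / len(data)


# Repetitive plain text: the best case for gzip.
plain_text = b"2024-01-01 INFO request handled successfully\n" * 2000

# Assumption: random bytes stand in for an already compressed file
# (JPEG, MP4, DOCX, ...), since well-compressed data looks close to random.
compressed_like = os.urandom(len(plain_text))

print(f"plain text:         {gzip_ratio(plain_text):.3f}")
print(f"already compressed: {gzip_ratio(compressed_like):.3f}")
```

The text should shrink to a tiny fraction of its size, while the random stand-in should come out no smaller than it went in, because gzip still has to add its own header and trailer.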

It is better to shift the responsibility to the users in this case, since they have a lot more information about what they actually put into RavenDB and won’t have to guess.
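
As a rough sketch of what that can look like on the user's side, the Python snippet below compresses an attachment only when its extension suggests plain text, and only keeps the result if it is actually smaller. Everything here is hypothetical: `prepare_attachment`, the `TEXT_EXTENSIONS` set, and the `.gz` naming convention are my own illustration and not part of the RavenDB API. You would hand the returned name and bytes to whatever client call you use to store the attachment.

```python
import gzip
from pathlib import PurePath

# Assumption: the extensions this application treats as plain text.
TEXT_EXTENSIONS = {".txt", ".csv", ".json", ".xml", ".log"}


def prepare_attachment(name: str, data: bytes) -> tuple[str, bytes]:
    """Return the (name, bytes) pair to store, gzipping only when it helps."""
    # Leave PDFs, images, videos and other compressed formats untouched.
    if PurePath(name).suffix.lower() not in TEXT_EXTENSIONS:
        return name, data

    compressed = gzip.compress(data)

    # Very small files can grow once the gzip header is added; keep the original.
    if len(compressed) >= len(data):
        return name, data

    # Hypothetical convention: mark compressed attachments with a .gz suffix
    # so the reader knows to decompress them.
    return name + ".gz", compressed
```

The application knows what it is storing, so the check is a cheap extension lookup, with no guessing and no CPU spent on data that will not shrink.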