Re: Saving to Blob


Mats (NPersist) is talking about Saving to Blob, basically suggesting that you can turn your persistence strategy into Table(Id, Blob):

An alternative would be to serialize the non-identity fields into a blob of some kind and save that to just one column in the database. The database table would then have primary key columns matching the identity fields and then just one more blob column for the serialized non-identity fields of the object.

Since this basically prevents any useful filtering on the data, he suggests that you can extract selected columns outside the blob, much in the way we currently do with indexes.
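To make the proposal concrete, here is a minimal sketch of the Table(Id, Blob) layout with one extracted column, using Python's built-in sqlite3 module and JSON as the serialization format. The table name `people`, the `last_name` column, and the `save` / `find_by_last_name` helpers are all my own assumptions for illustration, not anything from Mats' post or NPersist.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: identity column, one extracted column, and the blob.
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, last_name TEXT, data BLOB)")
conn.execute("CREATE INDEX ix_people_last_name ON people (last_name)")

def save(person_id, fields):
    # All non-identity fields go into the blob; last_name is duplicated
    # outside it so the database can filter on it.
    blob = json.dumps(fields).encode("utf-8")
    conn.execute(
        "INSERT OR REPLACE INTO people (id, last_name, data) VALUES (?, ?, ?)",
        (person_id, fields["last_name"], blob),
    )

def find_by_last_name(last_name):
    rows = conn.execute(
        "SELECT id, data FROM people WHERE last_name = ? ORDER BY id",
        (last_name,),
    ).fetchall()
    return [(person_id, json.loads(data)) for person_id, data in rows]

save(1, {"first_name": "Ada", "last_name": "Lovelace", "email": "ada@example.com"})
save(2, {"first_name": "Alan", "last_name": "Turing", "email": "alan@example.com"})
save(3, {"first_name": "Byron", "last_name": "Lovelace", "email": "byron@example.com"})

matches = find_by_last_name("Lovelace")
```

Note that every extracted column has to be kept in sync with the blob by the application, and anything you did not think to extract (here, `email`) stays invisible to the database.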

As far as I am concerned, this is the worst possible persistence strategy. It means that you have taken some of the great strengths of the database and turned it into a stupid file server, for no real benefit. Mats suggests that this can have positive performance implications, but I doubt that this is the case. Just consider the cost of doing something as simple as providing a sortable, paged grid using this approach. If the user has chosen to sort on a column that is inside the blob, then you would need to pull the entire table into memory, deserialize every row, sort the result, and then hand the application just the page it is interested in.
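The paging problem can be sketched in a few lines. This is an illustration under my own assumptions (sqlite3, JSON blobs, hypothetical `customers_blob` and `customers_cols` tables), not a benchmark: it just counts how many rows each approach has to read to produce one page of ten.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: the same data stored once as a blob and once as columns.
conn.execute("CREATE TABLE customers_blob (id INTEGER PRIMARY KEY, data BLOB)")
conn.execute("CREATE TABLE customers_cols (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

for i in range(1, 1001):
    name, city = f"name-{i:04d}", f"city-{i % 50:02d}"
    payload = json.dumps({"name": name, "city": city}).encode("utf-8")
    conn.execute("INSERT INTO customers_blob VALUES (?, ?)", (i, payload))
    conn.execute("INSERT INTO customers_cols VALUES (?, ?, ?)", (i, name, city))

def page_from_blob(page, size):
    # The database cannot see inside the blob, so every row is fetched,
    # deserialized and sorted in application memory before paging.
    rows = conn.execute("SELECT id, data FROM customers_blob").fetchall()
    decoded = [(row_id, json.loads(data)) for row_id, data in rows]
    decoded.sort(key=lambda item: (item[1]["city"], item[0]))
    start = page * size
    page_rows = [(row_id, obj["city"]) for row_id, obj in decoded[start:start + size]]
    return page_rows, len(rows)

def page_from_columns(page, size):
    # The database sorts and pages; only the requested rows come back.
    rows = conn.execute(
        "SELECT id, city FROM customers_cols ORDER BY city, id LIMIT ? OFFSET ?",
        (size, page * size),
    ).fetchall()
    return rows, len(rows)

blob_page, blob_rows_read = page_from_blob(2, 10)
cols_page, cols_rows_read = page_from_columns(2, 10)
```

Both functions return the same page, but the blob version reads all 1,000 rows to get there, and that cost grows with the table rather than with the page size.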

It also completely destroys the possibility of working with the database using database-oriented tools. I can go into my DBs and look at the tabular data, but 0x20495bacfca12312 doesn't have much meaning to me. ETL tools, reporting, etc. are going to become useless. Not to mention that now you are in versioning hell. Serialization is notoriously difficult to handle correctly across versions, and having all the data in the DB in a format that is sensitive to serialization changes is quite a problem.

Handling associations between objects is another issue. I can use the DB's FK capabilities if I am using the standard mode, but references buried inside a blob get no FK enforcement from the DB. In fact, I can't really think of a good reason to do this, and I can think of plenty of reasons not to.
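Here is a small sketch of the FK point, again with sqlite3 and hypothetical `customers`, `orders` and `orders_blob` tables of my own invention. SQLite only enforces foreign keys when the `foreign_keys` pragma is turned on, so the sketch enables it first.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id))"
)
conn.execute("CREATE TABLE orders_blob (id INTEGER PRIMARY KEY, data BLOB)")
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Northwind')")

def insert_order(customer_id):
    # Standard mode: the database checks that the customer exists.
    try:
        conn.execute("INSERT INTO orders (customer_id) VALUES (?)", (customer_id,))
        return True
    except sqlite3.IntegrityError:
        return False

def insert_order_blob(customer_id):
    # Blob mode: the reference is just bytes, so nothing can be checked.
    payload = json.dumps({"customer_id": customer_id}).encode("utf-8")
    try:
        conn.execute("INSERT INTO orders_blob (data) VALUES (?)", (payload,))
        return True
    except sqlite3.IntegrityError:
        return False

valid_accepted = insert_order(1)
dangling_accepted = insert_order(999)            # no such customer
dangling_blob_accepted = insert_order_blob(999)  # no such customer either
```

The dangling reference is rejected in the relational table and silently accepted in the blob table, which leaves referential integrity entirely up to application code.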

As far as I know, there were "Object <-> Blob Mappers" on the Java side at one time, and they are one of the points that always comes up when an argument about OR/M starts, because that is a really bad way to handle this requirement.

Note: There is a really good article on this issue, about Java serialized objects and reporting, that I read a few years ago on an Oracle site, but I can't find it.
