Re: Saving to Blob


Mats (NPersist) is talking about Saving to Blob, basically suggesting that you can turn your persistence strategy into Table(Id, Blob):

An alternative would be to serialize the non-identity fields into a blob of some kind and save that to just one column in the database. The database table would then have primary key columns matching the identity fields and then just one more blob column for the serialized non-identity fields of the object.

Since this basically prevents any useful filtering on the data, he suggests that you can extract columns outside the blob, much in the way we currently do with indexes.

As far as I am concerned, this is the worst possible persistence strategy. It means that you have taken some of the great strengths of the database and turned it into a stupid file server, for no real benefit. Mats suggests that this can have positive performance implications, but I doubt that this is the case. Just consider the cost of doing something as simple as providing a sortable, paged grid using this approach. If the user has chosen to sort on a column that is inside the blob, then you would need to pull the entire table into memory, deserialize every row, sort the result, and then hand the application just the page it is interested in.
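To make the paging cost concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, the `city` sort column, and the use of JSON as the blob format are all assumptions made for illustration; Mats' post does not prescribe any of them.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Conventional layout: one column per field.
conn.execute("CREATE TABLE customers_cols (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
# Table(Id, Blob) layout: everything except the key lives in one opaque column.
conn.execute("CREATE TABLE customers_blob (id INTEGER PRIMARY KEY, data BLOB)")

# Hypothetical sample data, 200 rows.
rows = [(i, f"name-{i:03d}", f"city-{(i * 7) % 50:02d}") for i in range(1, 201)]
conn.executemany("INSERT INTO customers_cols VALUES (?, ?, ?)", rows)
conn.executemany(
    "INSERT INTO customers_blob VALUES (?, ?)",
    [(i, json.dumps({"name": n, "city": c}).encode("utf-8")) for i, n, c in rows],
)


def page_from_columns(page, size):
    """The database sorts and pages; only `size` rows reach the application."""
    return conn.execute(
        "SELECT id, name, city FROM customers_cols ORDER BY city, id LIMIT ? OFFSET ?",
        (size, page * size),
    ).fetchall()


def page_from_blobs(page, size):
    """The application must load and deserialize every row before it can sort."""
    everything = []
    for row_id, data in conn.execute("SELECT id, data FROM customers_blob"):
        obj = json.loads(data.decode("utf-8"))
        everything.append((row_id, obj["name"], obj["city"]))
    everything.sort(key=lambda r: (r[2], r[0]))
    start = page * size
    return everything[start:start + size], len(everything)
```

Both functions return the same page, but `page_from_blobs` also reports how many rows it had to load to get there: the whole table, on every request, regardless of the page size.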

It also completely destroys the possibility of working with the database using database-oriented tools. I can go into my DBs and look at the tabular data, but 0x20495bacfca12312 doesn't have much meaning to me. ETL tools, reporting, etc. are going to become useless. Not to mention that you are now in versioning hell. Serialization is notoriously difficult to handle correctly across versions, and having all the data in the DB in a format that is sensitive to serialization changes is quite a problem.
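The versioning problem can be sketched in a few lines. This example uses Python's `struct` module as a stand-in for a binary serializer; the two record layouts (`V1`, `V2`) and the field names are hypothetical, and real serializers fail in their own ways, but the shape of the problem is the same: rows written by the old version of the class are still sitting in the database when the new version tries to read them.

```python
import struct

# Hypothetical version 1 of the serialized object: name (16 bytes) + age.
V1 = struct.Struct("<16sI")
# Hypothetical version 2 adds a balance field at the end.
V2 = struct.Struct("<16sId")

# A blob written to the database by the old version of the application.
old_blob = V1.pack(b"Alice", 30)
# A blob written by the new version.
new_blob = V2.pack(b"Bob", 41, 12.5)


def can_read_with_v2(blob):
    """Return True if the current (v2) code can deserialize this blob."""
    try:
        V2.unpack(blob)
        return True
    except struct.error:
        return False
```

With plain columns, adding a field is an `ALTER TABLE` and the existing rows stay readable. With blobs, every existing row has to be rewritten, or the reading code has to carry knowledge of every format it has ever produced.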

Handling associations between objects is another issue. I can use the DB's FK capabilities if I am using the standard mode, but a reference buried inside a blob gets no FK enforcement from the DB. In fact, I can't really think of a good reason to do this, and I can think of plenty of reasons not to.
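A small sketch of the foreign key point, again with `sqlite3`. The `customers` and `orders_*` tables and the JSON blob format are assumptions for illustration only. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders_cols ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id))"
)
conn.execute("CREATE TABLE orders_blob (id INTEGER PRIMARY KEY, data BLOB)")
conn.execute("INSERT INTO customers VALUES (1, 'Alice')")


def insert_relational(order_id, customer_id):
    """Return True if the database accepted the order row."""
    try:
        conn.execute("INSERT INTO orders_cols VALUES (?, ?)", (order_id, customer_id))
        return True
    except sqlite3.IntegrityError:
        # The database rejected a reference to a customer that does not exist.
        return False


def insert_blob(order_id, customer_id):
    """The reference is hidden inside the blob, so the database cannot check it."""
    data = json.dumps({"customer_id": customer_id}).encode("utf-8")
    conn.execute("INSERT INTO orders_blob VALUES (?, ?)", (order_id, data))
    return True
```

An order pointing at customer 999, which does not exist, is rejected by `insert_relational` and silently accepted by `insert_blob`. Any integrity checking for the blob table has to be reimplemented in application code.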

As far as I know, there were "Object <-> Blob Mappers" on the Java side at one time, and they are one of the points that always comes up whenever an argument about OR/M starts, because they are a really bad way to handle this requirement.

Note: There is a really good article on this issue, about Java Serialized Objects & Reporting, that I read a few years ago on an Oracle site, but I can't find it.
