Re: Saving to Blob

Mats (NPersist) is talking about Saving to Blob, basically suggesting that you can turn your persistence strategy into Table(Id, Blob):

An alternative would be to serialize the non-identity fields into a blob of some kind and save that to just one column in the database. The database table would then have primary key columns matching the identity fields and then just one more blob column for the serialized non-identity fields of the object.

Since this basically prevents any useful filtering on the data, he suggests that you can extract columns outside the blob, much in the way we currently do with indexes.

As far as I am concerned, this is the worst possible persistence strategy. It means that you have taken a database, with all of its great strengths, and turned it into a stupid file server, for no real benefit. Mats suggests that this can have positive performance implications, but I doubt that this is the case. Just consider the cost of doing something as simple as providing a sortable, paged grid using this approach. If the user has chosen to sort on a column that is inside the blob, then you would need to pull the entire table into memory, deserialize every row, sort the result, and then hand the application just the page it is interested in.
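To make the paging cost concrete, here is a minimal sketch in Python using the standard library's SQLite bindings. The table names, the JSON encoding of the blob, and the two helper functions are assumptions made for illustration, not anything taken from Mats' post or from NPersist.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the same data stored once as Table(Id, Blob)
# and once as ordinary columns.
conn.execute("CREATE TABLE customers_blob (id INTEGER PRIMARY KEY, data BLOB)")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")

rows = [(i, f"name-{i:03d}", f"city-{(i * 7) % 50:02d}") for i in range(1, 201)]
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
conn.executemany(
    "INSERT INTO customers_blob VALUES (?, ?)",
    [(i, json.dumps({"name": n, "city": c}).encode("utf-8")) for i, n, c in rows],
)


def page_from_blob(conn, page, page_size):
    # The database cannot see inside the blob, so every row has to be
    # fetched and deserialized before a single page can be produced.
    everything = [
        (row_id, json.loads(blob))
        for row_id, blob in conn.execute("SELECT id, data FROM customers_blob")
    ]
    everything.sort(key=lambda r: (r[1]["city"], r[0]))
    start = page * page_size
    return [(rid, d["name"], d["city"]) for rid, d in everything[start:start + page_size]]


def page_from_columns(conn, page, page_size):
    # With real columns, the database sorts and returns only the requested page.
    return conn.execute(
        "SELECT id, name, city FROM customers ORDER BY city, id LIMIT ? OFFSET ?",
        (page_size, page * page_size),
    ).fetchall()
```

Both functions return the same page, but `page_from_blob` touches all 200 rows to deliver 10 of them, and that cost grows with the table rather than with the page size.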

It also completely destroys the possibility of working with the database using database-oriented tools. I can go into my DBs and look at the tabular data, but 0x20495bacfca12312 doesn't have much meaning to me. ETL tools, reporting, etc. are going to become useless. Not to mention that you are now in versioning hell. Serialization is notoriously difficult to handle correctly across versions, and having all the data in the DB in a format that is sensitive to serialization changes is quite a problem.
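The versioning problem can be sketched in a few lines. This assumes a hypothetical fixed-layout binary format built with Python's `struct` module. The field names and both layouts are invented for illustration, and real serializers fail in their own ways, but the shape of the problem is the same.

```python
import struct

# Hypothetical version 1 layout of the serialized object: age (int), balance (double).
V1 = struct.Struct("<i d")
# Hypothetical version 2 layout: the class gained a loyalty_points (int) field.
V2 = struct.Struct("<i d i")

# A blob that was written to the database by the old version of the application.
stored = V1.pack(42, 199.5)


def can_read_with_v2(blob):
    # The new code expects the new layout, and the old bytes no longer fit it.
    try:
        V2.unpack(blob)
        return True
    except struct.error:
        return False


# This is roughly what a database tool would show for the row: opaque hex.
print(stored.hex())
print(can_read_with_v2(stored))
```

With ordinary columns, adding the new field is an `ALTER TABLE`, and every existing row stays readable. With blobs, every stored row has to be migrated, or every reader has to understand every historical layout.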

Handling associations between objects is another issue. I can use the DB's FK capabilities if I am using the standard mode, but a blob can have no FK enforcement by the DB. In fact, I can't really think of a good reason to do this, and I can think of plenty of reasons not to.
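The FK point above can be shown with SQLite as well. This is a rough sketch under assumed table names (`customers`, `orders`, `orders_blob`) and an assumed JSON blob format. Note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, "
    "customer_id INTEGER NOT NULL REFERENCES customers(id))"
)
conn.execute("CREATE TABLE orders_blob (id INTEGER PRIMARY KEY, data BLOB)")
conn.execute("INSERT INTO customers VALUES (1, 'Alice')")


def try_insert(sql, params):
    # Returns True if the database accepted the row, False if a constraint rejected it.
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


# Customer 999 does not exist.
relational_ok = try_insert("INSERT INTO orders VALUES (?, ?)", (1, 999))
blob_ok = try_insert(
    "INSERT INTO orders_blob VALUES (?, ?)",
    (1, json.dumps({"customer_id": 999}).encode("utf-8")),
)
print(relational_ok, blob_ok)
```

The relational insert is rejected by the database, while the blob insert goes through with a dangling reference that nothing in the database will ever notice.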

As far as I know, there were "Object <-> Blob Mappers" on the Java side at one time, and they are one of the points that always comes up in arguments about OR/M, because that is a really bad way to handle this requirement.

Note: There is a really good article on this issue, about Java serialized objects and reporting, that I read a few years ago on an Oracle site, but I can't find it.
