The fallacy of IRepository

I haven't had a controversial title in a few days :-)

The cause for this post is a post by Rob Conery where he suggests removing the RDBMS from the equation, at least during development. In particular, he suggests using an OODB during development and switching to an RDBMS late in the game, when going to production. The reason for doing so is to reduce the friction of having to maintain a database during development.

One thing that I feel I have to point out: the DB is a friction point only under certain conditions. If your tool doesn't support your rapid changes, that is usually an indication of a problem with the tool. I can tell you that on my most recent project, there wasn't a day in which the DB schema wasn't changed in some way, and several times I had to make significant changes (rearranging the entire model). NHibernate takes away a lot of the pain of using a DB, because you don't really care about what is going on. And using Active Record attributes or Fluent NHibernate makes it an even easier task. I don't know what the state of convention-based configuration is for Fluent NHibernate, but that is a very promising direction.

Anyway, that is not the point of this post.

I agree with a lot of the points that Rob is making, and I'll expand on them in an additional post, but right now I wanted to actually address a comment I made on Rob's post, which I feel wasn't clear enough.

One big problem is that for most applications, trying to change OODB to RDBMS would not work without a LOT of work.

There are a couple of things that I tried to put into a very terse comment. The first is that you should practice the way you play, and that includes putting any constraints that you have for production into the development environment. But even this isn't the point of this post.

If you look at the title, you'll see that I am decrying the fallacy of IRepository. In particular, this is what I disagree with:

Hi Oren - if you implemented IRepository<T> as I've done here, how would this not work? Can you be specific in terms of "a lot of work" and what that means?

In this case, the problem is that the interface for IRepository contains a lot of unspoken assumptions about the way you deal with persistent storage. Let us take the example of moving an IRepository between an OODB and an RDBMS. OODB query access patterns are completely different from the ones that you would use for an RDBMS. A trivial difference that has profound implications is getting a Blog with all its Posts and all their Comments. The only way of doing this with an RDBMS is using joins (in a single statement), which is going to cause a Cartesian product, which is expensive in the DB and has to be dealt with in the app layer. In the case of an OODB, you just let the OODB handle that and move on. It is not using relational algebra, and it can handle this specific scenario pretty well.
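To make the join cost concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in RDBMS. The schema, table names and data are all made up for illustration; they are not taken from Rob's post or from any real application. It loads one blog with two posts and three comments each in a single statement, then folds the flat rows back into an object graph, which is the app-layer work mentioned above.

```python
import sqlite3

# In-memory database with a hypothetical Blog -> Posts -> Comments schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blogs    (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE posts    (id INTEGER PRIMARY KEY, blog_id INTEGER, title TEXT);
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT);
""")
conn.execute("INSERT INTO blogs VALUES (1, 'My blog')")
conn.executemany("INSERT INTO posts VALUES (?, 1, ?)",
                 [(1, "first"), (2, "second")])
# Comments 1-3 belong to post 1, comments 4-6 to post 2.
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [(i, 1 + (i - 1) // 3, f"comment {i}") for i in range(1, 7)],
)

# One statement, two joins: every row repeats the blog and post columns.
rows = conn.execute("""
    SELECT b.id, b.title, p.id, p.title, c.id, c.body
    FROM blogs b
    JOIN posts p    ON p.blog_id = b.id
    JOIN comments c ON c.post_id = p.id
    ORDER BY p.id, c.id
""").fetchall()

# The app layer has to de-duplicate the flat rows into a graph again.
blog = None
for b_id, b_title, p_id, p_title, c_id, c_body in rows:
    if blog is None:
        blog = {"id": b_id, "title": b_title, "posts": {}}
    post = blog["posts"].setdefault(p_id, {"title": p_title, "comments": []})
    post["comments"].append(c_body)

row_count = len(rows)             # one row per comment
post_count = len(blog["posts"])   # distinct posts after de-duplication
```

With this data the query returns six rows for what is logically one blog and two posts, and the blog title travels over the wire six times. The duplication grows with the size of the child collections, and it gets worse if you join two sibling collections in the same statement.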

Let us take it from the other direction now. All my IRepository implementations recently have been using the future pattern, in which they return an IEnumerable<T> implementation that is aggregated with all the other queries for the request and then sent to the DB as a single remote call. That works really well. But what is going to happen if the OODB doesn't support this notion? (A cursory search didn't reveal anything enlightening, so I am assuming it is not supported for now.)

Your code previously assumed 1 remote call for N queries, but now you are faced with N remote calls for N queries. Even assuming that each query's time is constant, the performance difference between the two is significant and crippling.
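The difference in remote calls can be sketched roughly as follows. This is a toy model, not NHibernate's API or anyone's real repository: FakeSession, FutureQuery, run_now, future and flush are hypothetical names, and a "remote call" is simulated by a counter. A future query is only registered when it is created; the first time any future is enumerated, every pending query is sent in one simulated round trip.

```python
from typing import Callable, Dict, Iterator, List, Optional

Row = dict
Predicate = Callable[[Row], bool]


class FutureQuery:
    """Lazy result set; enumerating it triggers the session's batch."""

    def __init__(self, session: "FakeSession", table: str,
                 predicate: Predicate) -> None:
        self.session = session
        self.table = table
        self.predicate = predicate
        self.rows: Optional[List[Row]] = None

    def __iter__(self) -> Iterator[Row]:
        if self.rows is None:
            self.session.flush()  # fills in rows for every pending query
        return iter(self.rows)


class FakeSession:
    """Hypothetical stand-in for a DB session; counts simulated remote calls."""

    def __init__(self, tables: Dict[str, List[Row]]) -> None:
        self.tables = tables
        self.round_trips = 0
        self._pending: List[FutureQuery] = []

    def run_now(self, table: str, predicate: Predicate) -> List[Row]:
        # Eager query: one remote call per query.
        self.round_trips += 1
        return [r for r in self.tables[table] if predicate(r)]

    def future(self, table: str, predicate: Predicate) -> FutureQuery:
        # Deferred query: only registered, nothing is sent yet.
        query = FutureQuery(self, table, predicate)
        self._pending.append(query)
        return query

    def flush(self) -> None:
        # All pending queries travel together in one remote call.
        if not self._pending:
            return
        self.round_trips += 1
        for query in self._pending:
            query.rows = [r for r in self.tables[query.table]
                          if query.predicate(r)]
        self._pending = []


def make_session() -> FakeSession:
    return FakeSession({
        "posts": [{"id": 1, "blog_id": 1}, {"id": 2, "blog_id": 1}],
        "comments": [{"id": 1, "post_id": 1}],
    })


# Store with futures: two queries, one simulated remote call.
batched = make_session()
posts = batched.future("posts", lambda r: r["blog_id"] == 1)
comments = batched.future("comments", lambda r: r["post_id"] == 1)
post_ids = [r["id"] for r in posts]        # first enumeration flushes both
comment_ids = [r["id"] for r in comments]  # already loaded, no extra call
batched_trips = batched.round_trips

# Store without futures: the same two queries cost two remote calls.
eager = make_session()
eager.run_now("posts", lambda r: r["blog_id"] == 1)
eager.run_now("comments", lambda r: r["post_id"] == 1)
eager_trips = eager.round_trips
```

Both stores could sit behind the same repository interface and return the same data, yet the calling code was written on the assumption that asking for N things is cheap. That assumption lives in how the callers are structured, not in the interface.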

IRepository is a good way of decoupling you from the nitty-gritty details of how things work, but it doesn't decouple you from the abstract notions. Not for any real-world implementation, at least.