Northwind Starter Kit ReviewData Access and the essence of needless work, Part I
This is a review of the Northwind Starter Kit project, this review revision 94815 from Dec 18 2011.
Update: Andrea, the project’s leader has posted a reply to this series of posts.
I like to start reviewing applications from their database interactions. That it usually low level enough to tell me what is actually going on, and it is critical to the app, so a lot of thought usually goes there.
In good applications, I have hard time finding the data access code, because it isn’t there. It is in the OR/M or the server client API (in the case of RavenDB). In some applications, if they work against legacy databases or without the benefit of OR/M or against a strange data source (such as a remote web service target) may need an explicit data layer, but most don’t.
NSK actually have 5 projects dedicated solely to data access. I find this.. scary.
Okay, let me start outlying things in simple terms. You don’t want to do things with regards to data access the way NSK does them.
Let us explore all the ways it is broken. First, in terms of actual ROI. There is absolutely no reason to have multiple implementations with different OR/Ms. There is really not a shred of reason to do so. The OR/M is already doing the work of handling the abstraction from the database layer, and the only thing that you get is an inconsistent API, inability to actually important features and a whole lot more work that doesn’t' get you anything.
Second, there are the abstractions used:
I don’t like repositories, because they abstract important details about the way you work with the database. But let us give this the benefit of the doubt and look at the implementation. There is only one implementation of IRepository, which uses NHibernate.
As you can see, this is pretty basic stuff. You can also see that there are several methods that aren’t implemented. That is because they make no sense to a data. The reason they are there is because IRepository<T> inherits from ICollection<T>. And the reason for that is likely because of this:
Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.
The fact that this is totally the wrong abstraction to use doesn’t enter to the design, it seems.
Note that we also violate the contract of ICollection<T>.Remove:
true if item was successfully removed from the ICollection<T>; otherwise, false. This method also returns false if item is not found in the original ICollection<T>.
There are other reasons to dislike this sort of thing, but I’ll touch on that on my next post.
More posts in "Northwind Starter Kit Review" series:
- (26 Jan 2012) Conclusion
- (24 Jan 2012) That CQRS thing
- (23 Jan 2012) It is all about the services
- (20 Jan 2012) From start to finishing–tracing a request
- (18 Jan 2012) If you won’t respect the database, there will be pain
- (16 Jan 2012) Refactoring to an actual read model
- (13 Jan 2012) Data Access review thoughts
- (12 Jan 2012) The parents have eaten sour grapes, and the children’s teeth are set on edge
- (11 Jan 2012) Data Access and the essence of needless work, Part II
- (10 Jan 2012) Data Access and the essence of needless work, Part I
just my 2 (euro)cents: http://goo.gl/DQUma
@Ayende: I do not like your tone, always arrogance and haughtiness! @Andrea: Nice post! But when a wise man points the moon, the fool looks at the finger [Cit] ...it's mean that 2(euro)cents are too many, Ayende cannot understand the goal of the project and will continue to look at the finger!
@Marta THEN DON'T READ HIS BLOG!
Few comments: there is not reason to have multiple implementations with different OR/Ms Not true, the first target of NSK is a learning audience and everyone uses a different OR/M. NSK shows you how to abstract your architecture from the OR/M so that you can switch between them and recycle your data layer for future applications.
I'm with Oren on this. From what I have seen NSK is a big ball of mud. The goal of the project may be demonstrate "layering", "common design patterns" and a domain model, but it fails to in it's implementation.
I'm curious, if you don't use Repositories what do you use for your data access? I can't remember if you touched on it in a previous blog post but I would be interested to know.
I tend to use a Repository-like pattern (it's really more like the Table Data Gateway than the DDD Repository; in fact I only refer to it as "Repository" because I think it sounds nicer and because it's based on what I saw Rob Conery do in the MVC Storefront series) so I don't have to use ActiveRecord (which I detest for all but the simplest scenarios).
Also, is there any complete sample application that you would recommend as a general "best practices" guideline? Macto, obviously, but that's not quite finished yet :)
Andrea's post does raise an interesting point that puts the practice of supporting multiple ORMs into perspective: NSK has a long history and dates back to a time when the needs of .NET development were quite different to what they are today.
Back in 2004, when NSK first got going, .NET O/R mappers were all commercial offerings produced by two-guys-in-a-garage ISVs. (Anyone remember Pragmatier, EntityBroker, WilsonORMapper, Objectz.net?) These were coming and going like nobody's business, and you couldn't rely on the one you chose still being around six months later. With that in mind, abstracting out your ORM "just in case" you needed to replace it was the only sensible thing to do.
These days, of course, we have NHibernate and Entity Framework, both of which are well established and not going away any time soon, and any alternatives worth using are on Github. With that in mind, it's perfectly sensible to pick one and stick with it, and planning for the contingency where you'll need to swap one for another is much more a waste of time.
As others have pointed out, this project apparently started 6-7 years ago. I think it's important to realize what a different mindset everyone had back then, in terms of .NET development.
While the project has been maintained since, I doubt very much that the architecture has changed much at all in the last 6 years.
I'm not sure it's valid to use such an old architectural example (which was never intended to be used in a commercial/production system) as a learning tool for today.
@Ayende : can you elaborate on
You mean that ICollection<T> is the wrong abstraction, right?
@James : out of idle, can you list some of the "alternatives worth using" that are on Github. I'd like to take a look at them. You are talking of .NET tools, aren't you?
@Joseph: On the contrary, architectural patterns and practices have changed a lot since then. They're a lot more lightweight for starters (e.g. Don't Repeat Yourself, Convention over Configuration, XML bad/JSON good etc) and, as I said, there's the difference between having to choose from a couple of mature, stable ORMs versus a larger number of fleeting, immature ones.
@Giacomo: I'm thinking here of, for example, Subsonic and micro-ORMs such as Massive, Dapper etc. Just off the top of my head...
@James: Point of clarification. I meant that the architecture of NSK hasn't changed much at all, while patterns and practices have. NSK is just too far out-of-date to be used as a learning of example.
Wayne, For queries, I just make the queries from the controller. For reusable queries, I make an extension method for that. For sample application, look at this blog source code.
Giacomo, Yes, ICollection is bad here.
+1 for @Wayne M! @Ayende, have you come across any other applications we can have a look at, besides your own Raccoon Blog, who have implemented 'good practices'? By the way, I implemented the abstraction you talk about here in my first major mvc app, and I have to agree, it is just too much. Thank you and I am learning a great deal from the source of this blog.
I agree with ayende, the ICollection is not suitable for IRepository
Why is ICollection (and/or IQuerably/IEnumerable) the wrong abstraction? Saying that ICollection is bad is ok, but without explanation/reason we can't discuss/learn from it. The only risk I see, is that a developer could load a complete table into memory (you don't want that with large tables).
Also I understand your point about using the OR/M as the repository if all data can be called this way. But if you have several kind of data sources in different technologies, then it would be nice if you have one kind of interface. Also when an object (like Customer) is combined from data out of different data sources, the repository is for me a good place to initialize the object and return it. How would you solve this cases?
Because you already have IQueryable. What other abstraction do you need? Jumping through hoops to try to pretend the database isn't there just isn't worth the effort.
ICollection is the wrong abstraction because the interface promises things the concrete class can't (and shouldn't) do.
Thanks for the reply. I'll definitely check out the blog source code; I have to say I always thought it was "wrong" to put data retrieval methods directly in a controller without some kind of abstraction layer, but maybe that's because I have a mostly WebForms background and it's a cardinal sin to put raw data access code in a code-behind file.
DELETE * FROM TABLE_NAME and SELECT * FROM TABLE_NAME
are possible implementations of Clear and CopyTo :)
Implementing ICollection is absolutely the wrong approach for a repository pattern like this.
Inheritance is "Is A", not "Is Something Like A". 'nuff said.
Regarding the multiple ORM implementation and abstraction, I agree it's definitely not something people should try in real-world projects. You only have to try it once to hopefully realize how much extra work you're doing for something that will never actually be used. Perhaps their goal was to show how the patterns can be applied with different ORM solutions but they should have simply picked one and noted that others could be applied easily enough.
Most modern ORMs already use IQueryable<T> as a repository. Sometimes an interface implements the IQueryable to add Include/Fetch methods.
Operations like Insert/Update/Delete should be done by the unit-of-work.
Jim, I wrote several blog posts expressing my frustrations with different parts of NHibernate. Query parsing in particular has been a problematic area, the way NHibernate SQL generation is dumb, the fact that there are no less than 4 (or maybe 6) different ways to do a query is horrendous. I can go on, probably longer than you care.
It doesn't change the fact that NHibernate is actually: * A really good OR/M * Not meant to be a sample app
Speaking of needless abstractions, I can point out several in NHibernate.
You should write a blog post on what's WRONG with NHibernate. That would actually be productive.
What's the point of abstracting sql with hql for example, and there is no point abstracting data access so that it is db agnostic, no one changes db engines on a whim.
"In good applications, I have hard time finding the data access code, because it isn’t there. It is in the OR/M or the server client API"
I would say the opposite. Repetitive plumbing code might be a bad thing but explicit data access is not.
@jim These two features are a major point in my Point Of Sale application
hql has a better coding interface than SQL, and allows you to use your domain model objects and the property names as they appear in the model, rather than the table and field names, which when using a legacy database, can have absolutely no relationship between your domain name and the field name. It also allows you to express complicated joins between tables in a way that makes it easier to read later on. I have in my application one large hql query, which is the heart of the application, which if I had to write it in sql, would have grown to over 2 pages of sql and would be near unreadable.
It's very useful actually, this feature enabled me to support clients with both SQL and Oracle databases, without having to change my domain model at all (the server database is created from a third party application, so it has the same structure). All I had to change were a few minor references to schema names, and everything kept working. If I didn't had this feature, I would have lost of my largest clients.
SQL is imho abstract enough - it hides the details of the underlying data store behind a nice high-level language. It's bad IMHO that ORMs are all trying to hide it behind their own, often leaky and limited abstractions. It lets you forget about SQL when coding but you are quickly reminded of it's existence when there are any problems in runtime. Anyway, trying to hide ORM's abstraction behind another abstraction (the repository) makes everything worse - so Ayende's right about it. But C# is IMHO ugly and very limited as a data manipulation language and instead of trying to fix it with ORM we should rather think about ditching it and handling the data in a language designed for that purpose. So that ORMs would not be needed at all.