The wages of sinProper and improper usage of abstracting an OR/M
Originally posted at 3/11/2011
This time, this is a review of the Sharp Commerce application. Again, I have stumbled upon the application by pure chance, and I have very little notion about who wrote it.
In this case, I want to focus on the ProductRepository:
In particular, those methods also participate in Useless Abstraction For The Sake Of Abstraction Anti Pattern. Here is how they are implemented:
public AttributeItem GetAttributeItem(int attributeItemId) { return Session.Get<AttributeItem>(attributeItemId); } public Attribute GetAttribute(int attrbuteId) { return Session.Get<Attribute>(attrbuteId); } public IEnumerable<Attribute> GetAllAttributes() { return Session.QueryOver<Attribute>() .Future<Attribute>(); } public void SaveOrUpdate(Attribute attribute) { Session.SaveOrUpdate(attribute); }
And here is how they are called (from ProductService):
public AttributeItem GetAttributeItem(int attributeItemId) { return productRepository.GetAttributeItem(attributeItemId); } public Attribute GetAttribute(int attrbuteId) { return productRepository.GetAttribute(attrbuteId); } public void SaveAttribute(Attribute attribute) { productRepository.SaveOrUpdate(attribute); } public IList<Product> GetProducts() { return productRepository.GetAll(); } public Product GetProduct(int id) { return productRepository.Get(id); } public void SaveOrUpdate(Product product) { productRepository.SaveOrUpdate(product); } public void Delete(Product product) { productRepository.Delete(product); } public IEnumerable<Attribute> GetAllAttributes() { return productRepository.GetAllAttributes(); }
Um… why exactly?
But as I mentioned, this post is also about the proper usage of abstracting the OR/M. A repository was originally conceived as a to abstract away messy data access code into nicer to use code. The product repository have one method that actually do something meaningful, the Search method:
public IEnumerable<Product> Search(IProductSearchCriteria searchParameters, out int count) { string query = string.Empty; if (searchParameters.CategoryId.HasValue && searchParameters.CategoryId.Value > 0) { var categoryIds = (from c in Session.Query<Category>() from a in c.Descendants where c.Id == searchParameters.CategoryId select a.Id).ToList(); query = "Categories.Id :" + searchParameters.CategoryId; foreach (int categoryId in categoryIds) { query += " OR Categories.Id :" + categoryId; } } if (!string.IsNullOrEmpty(searchParameters.Keywords)) { if (query.Length > 0) query += " AND "; query += string.Format("Name :{0} OR Description :{0}", searchParameters.Keywords); } if (query.Length > 0) { query += string.Format(" AND IsLive :{0} AND IsDeleted :{1}", true, false); var countQuery = global::NHibernate.Search.Search.CreateFullTextSession(Session) .CreateFullTextQuery<Product>(query); var fullTextQuery = global::NHibernate.Search.Search.CreateFullTextSession(Session) .CreateFullTextQuery<Product>(query) .SetFetchSize(searchParameters.MaxResults) .SetFirstResult(searchParameters.PageIndex * searchParameters.MaxResults); count = countQuery.ResultSize; return fullTextQuery.List<Product>(); } else { var results = Session.CreateCriteria<Product>() .Add(Restrictions.Eq("IsLive", true)) .Add(Restrictions.Eq("IsDeleted", false)) .SetFetchSize(searchParameters.MaxResults) .SetFirstResult(searchParameters.PageIndex * searchParameters.MaxResults) .Future<Product>(); count = Session.CreateCriteria<Product>() .Add(Restrictions.Eq("IsLive", true)) .Add(Restrictions.Eq("IsDeleted", false)) .SetProjection(Projections.Count(Projections.Id())) .FutureValue<int>().Value; return results; } }
I would quibble about whatever this is the best way to actually implement this method, but there is little doubt that something like this is messy. I would want to put this in a very distant corner of my code base, but it does provides a useful abstraction. I wouldn’t put it in a repository, though. I would probably put it in a Search Service instead, but that isn’t that important.
What is important is to understand where there is actually a big distinction between code that merely wrap code for the sake of increasing the abstraction level and code that provide some useful abstraction over an operation.
More posts in "The wages of sin" series:
- (24 Mar 2011) Hit that database one more time…
- (23 Mar 2011) Inverse leaky abstractions
- (22 Mar 2011) Proper and improper usage of abstracting an OR/M
- (21 Mar 2011) Re-creating the Stored Procedure API in C#
- (18 Mar 2011) Over architecture in the real world
Comments
Maybe it's time to stop bashing that poor application?
If I get you right, service and repository abstractions should be used only to wrap complex code behind a useful abstraction, and there is no issue with using session.Get <product(productId) right in the controller.
Yet some customers will have the following requirement: controllers and services should not expose (depend on) the underlying ORM because "we may want to change in the future", eg we're coding with NHibernate but may want to switch over to LLBLGen later on, and the business & presentation layers should not get affected.
Apart from running away ;-) how would you address that requirement?
class LLBGenSession : NHibernate.ISession { ... }
then run your tests?
@Stephen - Ayende's answer will be "I've never had to do that" or something along those lines.
While it seems silly to have a repository which exposes nothing more than get/save/delete and basic queries that, according to Ayende should be shoved right up in the controller, most people don't WANT to shove all that NH stuff in the controller, they want it abstracted away regardless of it being a method with nothing more than 'return Session.Get...'
The only real issue with the repository pattern is when we end up with silghtly variating methods like:
ListAll
ListAllActive
ListAllActivePaged...
This is when the repository pattern is stupid and becomes a nightmare. (in my opinion)
I completely disagree with shoving all that NH query up in the front of the application however.
When dealing with paged data, dynamic sorting, etc, I typically have an abstraction to deal directly with this (IPagedDataSource), rather than piggy backing it on a repository. IPagedDataSource contains property such as CurrentPage, Limit, PageSize, List <sortby, Expression <func<entity,>
Not only does this give clients a firm contract to deal with but everything is very DRY and encapsulated. Sometimes I pull data using ORM, something View, or whatever.
The only thing I really like about the repository pattern is hiding prefetch paths, but now you get methods like FetchWith, FetchWithThis, etc.
I'm moving towards using my ORM for the WRITE side directly in my command handlers and using encapsulated expressions and extension methods to hand off some of the more complex logic.
@NC you can easily avoid the slightly variating methods using Expression Of Func(T,bool) as the parameter of the search method (using LINQ or QueryOver).
IMO. The biggest problem with the (decoupled) repository pattern is that you lose a lot of functionality of NHibernate (Fetch and Future, and probably some others that I don't remember) .
Although you can still create a repository that always returns .Future() ,which returns an IEnumerable, at the end of the query.
Also, If you don't mind creating a tightly coupled Repository you can also return QueryOver. This kind of repository might be helpful to get a DRY approach, and not just about abstracting the ORM.
Also, an unabstracted repository can still make it easier to change an ORM for 80% of the code... While you don't get a 100% abstracted ORM you don't have to change most of your code.
I have seen your last few posts on how this application has done OR/M Access the wrong way. Can you go into more detail on how you believe that you should be doing it? Or provide some links to where you already have?
Stop bashing this application. You should review an application considered a reference. Not an unknown application. Maybe the author was just trying stuff and playing around. Why don't you review the Orchard project, I think it is using Nhibernate. Better review one of your own applications.
@Max - I agree. What I meant however was that I think repositories are fine when used correctly, but are bad when we add similar or slightly varying methods. It's those sorta of queries which make me cringe :(
@Franck - I just wish Ayende would 'bash' an application and then show how he would do it, instead of just saying 'this is wrong' and never showing how he would do it.
Ok so don't use the repository pattern, great, demonstrate a rewrite of that section for how you would implement it, as a working example, so people can learn from it.
In my case I have been through many version of data access, having a nice abstraction layer that encapsulates that seems like a good idea, moving that logic into controllers seems to violate the open close principal, changing my data access technology might mean changes to a lot of code not just repositories?
I guess if a assumption can me made that the data access technology is not going to change probably having that logic in the controller is not a bad idea.
The author of the app already posted here yesterday, he doesn't seem to mind too much and he states it was more of an exercise. Granted, a comment is not very visible.
If you want to see how Ayende makes Apps, goto https://github.com/ayende, there are over 20 repositories right now that show off his way of doing stuff.
What a breath of fresh air. I can’t tell you how many applications I’ve worked on that implement “just in case” coding. Well written code -- code that has good separations of concerns does not need layers of abstractions. The side effect of separations of concerns is malleable and extensible code.
Fabian
I've never seen any application switch out it's data access technology. Nowadays dataaccess technologies have different providers built into them i.e. EF4, nHibernate
"I've never seen any application switch out it's data access technology."
Really? I have, quite a bit. Usually it is not "swapping out," but plugging a different database technology. It is not uncommon to run an SQL database, especially if you have an accounting package that uses it, alongside a document database.
Arguably you could say that it would be better to abstract this away with a middleware message broker, but for a lot of apps that's not necessary.
gs,
When I say swapping out, I'm referring to the database technology. For example switching out nHiberate for EF4. nHibernate and EF4 both support most databases, switching the database at the database technology level seems more prudent that moving that abstraction into your application.
If you're willing to take it as a given that persistence ignorance is an impractical goal, does this mean that container ignorance is also? I'm wondering about this as I write a system that uses something like IServiceLocator to hide it. I think it's clear it's necessary for a framework such as MVC, but what about for your corporate default architecture?
I agree with the comment regarding useless abstractions. For example I would avoid having IProductRepository inherit anything. However, I do like the repository pattern for a few reasons and with a few caveats. The repository can be a clear specification of how data is accessed. When developing an application, determining the types of CRUD and queries that will be performed upon an entity is an important step. It can affect database indexes, constraints and table structure. A repository interface can expose this information in a clear and concise manner. One caveat is over-abstraction, such as using a generic repository IRepository <t. A method which exposes IQueryable to the service or application layer hides the many of the semantics of IQueryable. There will always be differences in semantics of this interfaces based on the underlying provider. Moreover, even if switching to a different data access technology is not foreseen, it is more clear to encapsulate different queries that can be performed in a simple query object. By simple I mean something that does not utilize expression trees. Therefore, isolating data access and all associated data access semantics in the repository seems like a great decision.
Having a service layer which wraps a repository is a decision which depends on the situation. If a service class merely wraps a repository it certainly does seem redundant. However, there may be other service classes in the project which do perform additional non-data access operations. For example, a shopping cart service must be able to retrieve a shopping cart from a data source, but it must also have a submit operation which contacts a payment gateway. In this case, for repository operations the service can either expose the repository as a property, or mirror the methods. Mirroring the methods may bring an important degree of consistency. For example, it is nice to be able to say that the application layer performs all operations on the domain through the service layer. Furthermore, if the application is of a distributed nature, having a facade service layer can allow mapping to DTOs. A service facade layer may lead to the duplicate class hierarchy anti-pattern, however this is not caused by the act of wrapping a repository.
It gets boring i think. Instead of indicating again and again that it is a bad thing to use repositories or any other abstraction for NH, show us the opposite: How would you do it nowadays? Mainly in big applications with data from different sources, complex queries and so on?
Leo,
Abstraction for the sake of consistency leads you down the path of useless abstractions.
Its ironic that Ayende preaches simplicity and pragmatism over overcomplicated architectures when his own project (NHProf) is an overcomplicated mess. Nothing but a glorified profiler, and buggy at that.
@Ayende,
I too would like to see some code showing how it all should be done.
Maybe you can make a simple demo web app showing what you consider best practice?
@Bane k [MSFT] ... it WORKS for me (unlike Sharepoint, Commerce Server, Dynamics, EF, errr and pretty much any junk MSFT has slung at us over the last few years).
I have worked on this project for more than 3 years. 1.5 year into the project it was suddenly a requirement that we were not allowed to call any database directly, but would have to go through a WCF service tier. Luckily we had a data access layer, so this switch only affected the repository classes which were changed to call a WCF service instead of a database.
What I've learned from IT projects throughout my years is that the future is unpredictable and things will change. It's hard to predict what will change, but looking back in time it’s a big chance that frontend technology will change, so putting too much logic in MVC controllers is a bad thing if the application is to last for many years and is relatively big in size. While putting things in the business layer and data access layer is a good thing since they generally change less frequently.
@Bane k [MSFT] "preaches simplicity and pragmatism over overcomplicated architectures when his own project (NHProf) is an overcomplicated mess"
This is a logical fallacy. Just because you are on fire doesn't invalidate the credibility of the statement "being on fire is bad".
@Lars "we had a data access layer, so this switch only affected the repository classes which were changed to call a WCF service instead of a database"
I don't understand this logic at all. This statement could easily be "I coded all my data access in the controllers so all I had to change was my controllers" or "I used the query object pattern so all I had to change was my query objects". If you change data access methods you still have to change all the data access code. Tautological.
As long as you have some semblance or separation of concerns then it really doesn't matter if you've used an overly abstracted and useless repository pattern or whatever the heck else.
Ayende is simply arguing against useless repository abstractions. Nobody is advocating having the UI layer have hard coded sql strings or anything.
@Chuck Conway
I don't think wrapping calls to a repository in a service layer is an example of an abstraction as some of the other examples presented are. Even if it is viewed as an abstraction in that you're abstracting the data access component away from the service when there is potentially no need to, the scope of the abstraction is limited to just that and if limited to that scope it shouldn't grow into a useless abstraction. Furthermore, I believe it is a valuable distinction between layers in an application. You have a service layer which exposes the functionality of an application. A lot of times this functionality is data access, however that is left up to the implementation of the service.
All your recent posts seem to target (or assume) tiny web projects, which, to be fair, most .Net apps apparently are.
Repository pattern is part of DDD, which is specifically suited for large and complex application, typically enterprise projects. E.g. many (java) projects I've worked on typically have been running non-stop for over a decade by dozenz of developers (with high turnover), and have gone trough several major platform upgrades (newer JDK, EJB version, frameworks, app-servers).
Scattering all your data-access concerns all over your controllers would NOT have been a sustainable approach in such projects. You can ducttape changes after changes as the application grows for only so many years before the project starts to fall apart by its own weight.
A more structured and formalised way of working is absolutely necessary to make for a stronger foundation to support the project for many generations to come.
"Structured and formalised" often means ceremonial bureaucratic code. E.g. we even often make the presentation layer to strictly ONLY talk to app/domain services, even when the service is just a stupid wrapper around a single line of code, or a repository that's just a stupid wrapper over an ORM call. Nonetheless this consistency is enforced in this kind of formal working environment. Therefore, when you need to create a new Order, for example, you dont get the confusion of wether you should just call the repository directly or if there's already a service created for it, because you know there will ALWAYS be a service for it. The service may do smart thing (sending email, create a case, put in an audit, check security, etc), or it may be just a single line of stupid code, but it eliminates confusions. Even when it's a stupid single-liner, it may grow in the future, and you don't have to look all over the project (based on some unreliable source-code search-pattern) and apply this new logic everywhere.
Services and repositories have very specific roles, even when it only contains a single-line of code, it still has to be there to fill that role. The role can't be simply delegated elsewhere for simplicity sake, because in large apps, consistency is FAR more important than simplicity. Consistencies in the code is the only medium of communication between generations of developers in the project.
And to eliminate slightly variating "Findxxx" methods in Repositories, we can use QueryObject or Speification objects that can be tested in isolation.
Repository's role is primarily to maintain a collection of aggregate-roots and its lifetime, while the role to search specific objects within this collection can be wrapped behind those QueryObjects or Specifications.
Unfortunately one of Ayende's recent posts also strongly argue against these patterns, which makes sense in short-lived web projects.
One of the things that often happen in large projects, for example, is that the client would sometimes just decide to bring in some random backend services that replaces/overlaps with certain existing functionality in the system, just because they feel like it. Be it an order-provisioning system, inventory-management, case-management, etc or replacing the billing-engine with a more powerful one.
Many stuff that were originally stored in our own database will suddenly have to be moved to web-services or remote EJBs. This does happen quite often actually.
If all our data-access requirements are encapsulated behind a repository interface, switching over a new backend system is quite a walk in a park. We typically just give this interface requirement of ours to the new third-party vendor, and they will start working to deliver a backend service that meets our requirement.
How disasterous would that be had we implemented our data-access against ISession directly all over our presentation layers?
Comment preview