Analyzing the MVC Music Store: Data Access
I thought that it would be a good idea to take EF Prof for a spin one the sample MVC Music Store. It can illustrate some things about EF Prof and about the sample app.
The first step is to download EF Prof, once that is completed, extract the files, there is no install necessary.
We need to reference the HibernatingRhinos.Profiler.Appender in the application:
Then, from the Global.asax Application_Start, we initialize the profiler:
That is all the setup that you need to do :-)
Now, we go to http://localhost:1397/, and we can see the following in EF Prof:
There are several interesting this to see here.
- We executed three queries to load the main page.
- We actually opened four object contexts to handle this request.
- We have a query that generate a warning.
Let us deal with each one in turn:
Multiple queries for the same request. This requires analysis because often we are querying too much. But in this instance there is no problem, we are querying different things and we are doing so efficiently.
Four object contexts to handle a single request is bad. We can see that each query was actually executed through a different object context (and one was idle). There are several problems with having multiple object contexts per request:
- Each object context would open its own connection to the database. I am not sure, but I think that they do so lazily, which means that a single page request resulted in three connections to the database.
- Each object context implements its own unit of work. You might get two different instances that represent the same row in the database.
- You can’t aggregate all your operations into a single SaveChanges() call, you have to make multiple trips to the database.
- You require using System.Transactions and distributed transactions if you want to ensure a transaction boundary around your code.
In short, there are a lot of good reasons to go with request scoped object context, you should do so.
Now, let us look at that query with the warning:
EF Prof generated a warning about unbounded result set for this query. What does this mean? It means that if you had 100,000 genres, this query would attempt to load all of them. I am not an expert on music, but even I think that 100,000 genres are unlikely. The problem with these sort of queries is that it is likely that the number of genres will grow, and not adding a TOP or LIMIT clauses to the query means that you are open to problems when the data does grow.
And with that, let us see what happens when we look at a single album (http://127.0.0.1:1397/Store/Details/392):
This is pretty much the same as the before, with 5 queries required to process this required, but let us dig just a tiny bit deeper:
What we see here is that we have queries that are being generated from rendering the views. That usually trips a warning flag with me, because queries that are being generated from the view are likely to cause problems down the road, data access patterns should rarely change because of view changes, and that is usually and indication that at some point, we will have a SELECT N+1 here.
Now, let us try to add a new item to the cart (http://127.0.0.1:1397/ShoppingCart/AddToCart/669):
We can see that adding an item to the cart is a two steps process, first, we go to /ShoppingCart/AddToCart/669, then we are redirected to /ShoppingCart. Overall, we require 8 queries in two requests.
Let us look at the actual queries required to process AddToCart (note, however, that we are talking about two different object contexts):
Look at the first query, I don’t see us doing anything with the data here. Let us look a bit more closely on where it is generated:
And AddToCart looks like this:
It seems that the ShoppingCart.AddToCart requires an album instance, take a look at it, and see what it does with it. The only thing it does is to use the album id. But we already know the album id, we used that to get the album instance in the first place.
It seems that the first query is there solely to check that the value exists and throw an error from the Single() method. I think we can optimize that:
We removed that query, but we preserve the same behavior (if the application tries to save a new cart item with a missing album id, a FK violation would be thrown). Querying the database is one of the most expensive things that we can do, so it pays to watch out to where we can save in queries.
Now, let us add a few more items to the shopping cart and see what happens :-) I added 6 albums to my cart, and then went to (http://127.0.0.1:1397/ShoppingCart):
Wow! We require 10 queries to process this request, and you can see that EF Prof is urgently requesting that you’ll take a closer look at the last three.
EF Prof has detected that we have a SELECT N+1 error here, where we issue a query per album in the shopping cart. If we will look at the stack trace, we will find a familiar sight:
We have queries being generated from the views, in this case, from this code:
You can see that on line 47, as shown in the profiler, we are access the title property of the album, forcing a lazy load.
Tracking it further, we can see that Model.CartItems is set here:
And that GetCartItems is defined as:
And this generate the following query:
Well, that is easy enough to fix :-)
Which result in this query:
We now bring all the related albums in a single query. Which means that viewing the shopping cart result in:
Just four queries, instead of 10!
Not too shabby, for 15 minutes work with the profiler, even if I say so myself.
In general, when I am developing applications, the profiler is always running in the background, and I keep an eye on it to see if I am doing something that it warns me about, such as SELECT N +1, multiple contexts in the same request, too many calls to the database, etc.
Comments
Nice one Ayende :)
How are you loading EF Prof from a url? is it a silverlight app? or have you registered a url scheme with the OS that you redirect to onload? or something else?
I don't understand the question
Sorry, maybe I misunderstood what was going on:
Are you loading the EF Prof UI from this url? or is it just the web app and you have EF Prof running side-by-side?
I really, really like post like this one. You were professionally reviewing a piece of code, showing how you use your tools and making small changes to remove some bad O/RM performance areas. Additionally, it is great to get a peek into your thought process and how you approach a problem.
Demis, the url is most likely for the web app being run from the visual studio development server...
Microsoft sample bashing seems to be fashion now.
I would love to see a video of Nhprof used like this to fully see it in action.
Demis,
I meant, go to the url in the MVC Music Store application, and then see what queries are being generated in EF Prof.
Tobi,
What sample bashing?
I put the sample through an analysis process, which I full shared.
I point out potential problems with the code and am fully transparent with how I found those problems.
Paul,
TekPub has a series of videos on doing just that.
Ayende, how do you share one context with all repositories? Use Dependency Injection with a per Request Life Manager?
About GetCartItems, putting Include will generate overhead to all calls, including when I dont need Albums
This is so far away from sample bashing, I don't even have words for it. It is just an example to show off the analytical powers of the profiler that is so good that I almost feel like going home and never program again due to inferiority complex. An error like the select + 1 happens quite easily with the lazy loading features. This post shows a (very doable) way to find and avoid them.
The alerting system + Stack trace plain rocks. Great post!
Very enlightening view of EFProf. Impressive!
Would you just log the exception raised from the foreign key constraint in the global handler and redirect to a generic error page? When would you rely on foreign key constraints as opposed to an explicit business rule in the code?
Would you also recommend no queries from the view? I've noticed "Open Session in View" being recommended as the way to handle NHibernate sessions in MVC. Is there a way to stop queries being fired from the view but still have commands (inserts/updates) fired at the end of the request?
Felipe,
I would introduce a Context per Request. This can be done via DI, via HttpContext.Current.Item["ObjectContext"} or however else you wish to do it
Paul,
Depending on how likely I thought that error to be. This sort of error? I would probably let it go to generic error page.
And in this case, it is a very rare error, only occurring if the user manually modified stuff.
As for session management, I would handle that in an action filter. The same advice here applies for NH as well, you shouldn't have queries being generated from the view.
Nice!
In this comment am referring to an earlier post about the sample app.
Taking an EF based app and saying "component based design should be retired" in the earlier post is a bit over the board!
And also mentioning COM in same sentence with EJB :-)
COM was all about interop, where as EJB entity beans etc was how somebody "thought" data ought to be done before Hibernate.
Had a question on the oft repeated topic of anemic domain models.
In your opinion what behavior do we exactly put in the domain?
All the examples I see is some customer.AddOrder (that most probably actually sets the order.Customer back-ref to keep NH happy)
Or say a derived property cart.Total that you unit-test based on items added to cart.
Anything more complex like say tax calculation you put in a service / "component"!
So where exactly do we draw the distinction between a data holder entity v/s true domain?
Ajai
Ajai,
Everything that relates to the entity should be done internally.
For example, payroll calculation should be in the employee.
Because historically Microsoft ASP.NET samples have been done in favor of showcasing the product rather on good design. The ASP.NET 2.0 starter kits were a prime example of this. When they were released they all had terrible performance issues right from the start that had to be fixed by the community.
Eduardo,
They are true, I would suspect that Jeff executed his test on a local database. Because the cost of going to the database over the network is very high
@Ayende - just continuing on this discussion if you don't mind
"For example, payroll calculation should be in the employee"
What should be deposited into my account is my income tax withholding , other deductions such as medical insurance, dependent care etc.
So when doing DDD:
Do all these become properties/associations of the Employee class
Do we have variants of an Employee domain object depending on if it is the payroll module that is accessing it, or say organization chart module
Whatever you said about lazy loading etc in this post applies even more when you do an employee.CalculatePayroll (something out there has to be smart enough to eager load the empoyee with all the associations)
Finally does your domain make use of "services" - I am referring to say injecting services into domain classes
Ajai
Ajai,
You probably want to read this:
ayende.com/.../aspects-of-domain-design.aspx
In short, there are several players here:
var salary = Employee.CalculateTotalSalary(month);
var afterTaxes = TaxService.ApplyTaxes(Employee, salary)
Thanks for the link Ayende!
Looks like the EF, L2S code generation camp were not that far off with their partial method stuff!
Just kidding :-)
Liked the mixin comments by the re-motion guys...
Ajai
EF, bleh... when is NH 3.0 going to be out?
"For example, payroll calculation should be in the employee"
Payroll calculation clearly is a collaborative effort between the HR singleton and the management flyweights, with some random fluctuation coming from the employee's IEnumerable(Of Achievement).
Oren -
I must admit the profiler is a pretty slick tool. I am not looking forward to using it on some of code I own and discovering the undesirables ;)
I would say that it is kind of related to the weaknesses of EF. I wouldn't rewrite any NH-backed logic to take an Id, I would use Load instead. (In fact Id is not a part of the domain anyway).
I would also say that the main WTF there is that shopping cart has a reference to the datacontext. And actually uses it.
Looking at the AddToCart method where you replace the album instance with the int, I have one question. Why is the query being executed twice to get the album? Without your profiler, I would never guess there are two queries. I would have thought that since the album object is already populated, the album id would be pulled from the property. Is this due to how EF works, or is it due to Linq? Just curious and nice and informative post.
Tim,
The code execute a Linq query, EF execute that query.
Comment preview