NH Prof AlertsSelect N + 1
This is a bit from the docs for NH Prof, which I am sharing in order to get some peer review.
Select N+1 is a data access anti pattern, in which we are accessing the database in one of the least optimal ways. Let us take a look at a code sample, and then discuss what is going on. I want to show the user all the comments from all the posts, so they can delete all the nasty comments. The naןve implementation would be something like:
// SELECT * FROM Posts foreach (Post post in session.CreateQuery("from Post").List()) { //lazy loading of comments list causes: SELECT * FROM Comments where PostId = @p0 foreach (Comment comment in post.Comments) { //do something with comment } }
In this example, we can see that we are loading a list of posts ( the first select ) and then traversing the object graph. However, we access the lazily loaded collection, causing NHibernate to go to the database and bring the results one row at a time. This is incredibly inefficient, and the NHibernate Profiler will generate a warning whenever it encounters such a case. The solution for this example is simple, we simple force an eager load of the collection up front.
Using HQL:
var posts = session
.CreateQuery("from Post p left join fetch p.Comments")
.List();
Using the criteria API:
session.CreateCriteria(typeof(Post)) .SetFetchMode("Comments", FetchMode.Eager) .List();
In both cases, we will get a join and only a single query to the database. Note, this is the classic appearance of the problem, it can also surface in other scenarios, such as calling the database in a loop, or more complex object graph traversals. In those cases, it it generally much harder to see what is causing the issue.
NHibernate Profiler will detect those scenarios as well, and give you the exact line in the source code that cause this SQL to be generated. Another option for solving this issue is: MutliQuery and MultiCriteria, which are also used to solve the issue of Too Many Queries.
More posts in "NH Prof Alerts" series:
- (31 Dec 2008) Use statement batching
- (30 Dec 2008) Too many database calls per session
- (29 Dec 2008) Excessive number of rows returned
- (29 Dec 2008) Unbounded result set
- (28 Dec 2008) Use of implicit transactions is discouraged
- (28 Dec 2008) Select N + 1
Comments
I think this should say 'non-optiimal' or 'sub-optimal' as I'm not 100% certain that I cannot think of an even less-optimal way to access the DB :D
Minor point, but I think '...solve the issue of...' is what's meant here (recognizing that english isn't your first language and its my ONLY (non-programming) language, so offering as helpful input, not criticism).
Looking (really) forward to this becoming a product I can buy~!
I believe a more common name for "Select N+1" is "N+1 selects" antipattern, I don't think there's anything bad in selecting a number N+1 from a database :)
Steve,
Thanks, I updated the docs & the post.
Indirectly related, I'm not so sure how people deal with this cases when you have a separate DAL (i.e Repositories), maybe a hint would be useful for your users too.
Alberto,
I don't follow you, can you explain further?
If you have a PostRepository and a GetAll() method that lazy loads Comments. Now you want run into this problem for some scenario. What do you do?
Alberto,
You have GetAllWithComments(), or something like that
My concerns with that solution are:
1) Does it really work in more complex, real world apps? Seems like it would lead to method explosion in the repository, polluting the API.
2) It doesn't seem a big difference from a PI point of view to the (openly criticized) "explicit lazy loading" in EF.
1) Yes, it does. In practice, there are only a few ways of processing the graph.
2) There is a big difference, the only code that needs to know about this is the persistence code, not all the rest of the code, as is the case with EF.
1) Well, if you have N lazy loaded relations there are 2^N possible combinations (hence the method explosion) but you are probably right that in practice YAGNI all those.
2) But the consumers of the repository must now know which of the methods to call. Maybe I'm worrying to much, as it should be obvious (while "1" remains true) depending on what they want.
Thanks for the insight!
I might be mistaken, but if I do an eager load, doesn't that like join posts and comments? So when I have like a "bundle" of comments per post, I redundantly retrieve the post-data from the db :(
I would rather like two calls to the DB:
all the posts
get all the comments where the post-id is in the list of post-ids of the posts
This would reduce the DB-traffic as much as possible. But how would I do that painless with NH?
If I would have a methode like "GetAllWithComments()" this could first get all posts, and then issue a second query with a "in"-restriction or something. Or is there a better way?
Henning,
There are many options for eager loading with NH.
In this case, you are talking about the sub select option.
maybe I'm missing your point :) what are you trying to tell me by mentioning the "sub select option" - I'm not sure.
I did a quick test: I have two orders, one with one orderitem, the otherone with two orderitems.
When I do the following:
DetachedCriteria criteria = DetachedCriteria.For(typeof (Order))
<order orders = criteria.GetExecutableCriteria(_sessionFactory.OpenSession()).List <order();
I get three SQL-statements (N+1 select!).
When I do the following instead:
DetachedCriteria criteria = DetachedCriteria.For(typeof (Order))
<order orders = criteria.GetExecutableCriteria(_sessionFactory.OpenSession()).List <order();
I only get one statement, but each row contains the redundant order information.
I would like to get two statements: one for all orders and one for all orderitems, belonging to the previously fetched orders (in this case the orders are not restricted for simplicity).
... ahh; does it matter that I'm still using NH 1.2? :)
Yes, NH 1.2 doesn't have subselect fetching.
OK - so I linked my little playground to the 2.0.1 GA ... so what do I do next :) I mean - how do I this subselect fetching?
www.google.co.il/search?q=nhibernate+subselect
Select N+1 also seems to be caused by specifying lazy = false for the collection in the mapping file. I am guessing that NH2 defaults to using FetchMode.Select if you do not explicitly specify a fetch mode (I need to go read the docs / look at the code but commenting while I remember).
Might be worth adding something about this to the alert docs?
Comment preview