Ayende @ Rahien

Refunds available at head office

Solving the Select N+1 Problem

Note, I am currently writing the NH Prof documentation, so you are probably going to get quite a bit of NHibernate tidbits over the next few days.

I want to show the user all the comments from all the posts, so they can delete all the nasty comments. The naןve implementation would be something like:

foreach (Post post in session.CreateQuery("from Post").List()) // SELECT * FROM Posts

{
foreach (Comment comment in post.Comments) //lazy loading of comments list causes: SELECT * FROM Comments where PostId = @p0
{
//print comment...
}
}
In this example, we can see that we are loading a list of posts ( the first select ) and then traversing the object graph. However, we access the lazily loaded collection, causing NHibernate to go to the database and bring the results one row at a time. This is incredibly inefficient, and the NHibernate Profiler will generate a warning whenever it encounters such a case.
The solution for this example is simple, we simple force an eager load of the collection up front.
Using HQL:
var  posts = session.CreateQuery("from Post p left join fetch p.Comments").List();

Using the criteria API:
session.CreateCriteria(typeof(Post))

.SetFetchMode("Comments", FetchMode.Eager)
.List();

In both cases, we will get a join and only a single query to the database.
Note, this is the classic appearance of the problem, it can also surface in other scenarios, such as calling the database in a loop, or more complex object graph traversals. In those cases, it it generally much harder to see what is causing the issue.
NHibernate Profiler will detect those scenarios as well, and give you the exact line in the source code that cause this SQL to be generated.
Another option for solving this issue is: MutliQuery and MultiCriteria, which are also used to solved the issue of Too Many Queries.

Comments

Ray Vega
12/01/2008 08:21 AM by
Ray Vega

The Select N +1 problem is certainly something commonly used by proponents of stored procedures to dismiss ORMs as being inefficient and suffering from poor performance. Obviously, it takes a careful and somewhat experienced eye with ORMs to recognize when this occurs and how to fix it.

Your NHibernate Profiler should hopefully help mitigate this.

As an aside, any NHibernate tools in the future to help identify potential security holes such as not using named parameters (another favorite target of ORM opponents)?

Peter Morris
12/01/2008 09:00 AM by
Peter Morris

When accessing elements by index I can understand why an OPF would want to fetch individual elements, but when you are iterating over them using an IEnumerable it is pretty obvious you are going to be accessing them each in turn so why not fetch them in chunks of 50 or something by default?

Peter Morris
12/01/2008 09:01 AM by
Peter Morris

PS: By "why not fetch them" I meant the class in NHibernate implementing the enumerator.

Frans Bouma
12/01/2008 09:19 AM by
Frans Bouma

That join query is still inefficient, as you can introduce a lot of duplicates for Post which you have to wade through when materializing Post instances. With multi-branched graphs, it gets downhill pretty quickly.

Steve Degosserie
12/01/2008 12:12 PM by
Steve Degosserie

It seems that 'SetFetchMode(propertyPath, FetchMode.Eager)' doesn't take into account the 'SubSelect' fetch mode declared in the mapping, which is usually better than outer-joining.

Ayende Rahien
12/01/2008 03:23 PM by
Ayende Rahien

Peter,

It can, you need to configure it, NH in general doesn't try to be too smart in trying to figure out what the best data access strategy is.

That subject is too complex to guess successfully, so you need to tell it how you want it to behave.

See batch="50"

Ayende Rahien
12/01/2008 03:25 PM by
Ayende Rahien

Frans,

For multi branched graphs, you use other options. I am using this to show the classic issue.

Also, you can do eager loading in NH using other options than a join.

Andres G. Aragoneses
12/01/2008 03:46 PM by
Andres G. Aragoneses

Ayende: how does this NH Prof tool work? I mean, does it need to see the app running this code in order to report the warning. If yes, that's a shame because you're doing "lazy loading" of the NH usage of the code in your app :)

I mean, with a compile-time tool you would get all the warnings in a row, without trying to reproduce every bottleneck.

Ayende Rahien
12/01/2008 04:19 PM by
Ayende Rahien

Yes, you need to execute the code in order to get the warning.

Trying to do this with static analysis would be extremely complex.

The piece of code above is completely valid, but you would need to search for all calling code paths and analyze them to try to figure out what is going on here.

Not impossible, but an extremely complex solution

In addition to that, those sort of things are usually not something that you have to work hard to reproduce, they are generally very easy to find.

Andrés G. Aragoneses
12/01/2008 07:23 PM by
Andrés G. Aragoneses

Well, you would be amazed of the kind of things you can do with Gendarme today (it uses Cecil BTW, now that you mention it), so I'd say it's not so difficult.

In this case you would for IL that resembles a for/foreach construct on a property whose name you have not used in a fetch clause.

Ayende Rahien
12/01/2008 08:07 PM by
Ayende Rahien

Andres,

That is a very simplistic solution.

You would need to scan all the code paths to get to that, and that means that you have to do full system analysis.

You suggestion would create false positive for code path that just created the objects, or would ignore something that does Select N+1 by stepping through the Parent.Parent.Parent relationship

Andrés G. Aragoneses
12/01/2008 09:28 PM by
Andrés G. Aragoneses

That's why I said "in this case". Ideally you would build the rule meanwhile you come up with more test cases. I just mentioned an initial idea for the first testcase of the rule that would cover the code in this blogspot :)

I'd love to start doing Gendarme rules for NH but don't have too much time for it :(

Jeremie.Chassaing
12/01/2008 09:55 PM by
Jeremie.Chassaing

Ok this is a manual pingback...

I've seen quite a lot of people saying that lazy load is evil. But remember when they were saying that automatic memory management was bad !

For sure I hope one day will come where data fetching will totally transparent, with a minor performance impact.

This day we'll be able to spend our time on more important things.

But today we have to decide between fetch early (performance hit on some scenarii), lazy loads (perf hit on other cases) or no fetch (the data is not available). The 3rd choice can be ok to be Yagni. Fighting 1 vs 2 before reaching a real performance problem is premature optimisation...

Steve
12/03/2008 05:37 PM by
Steve

Stupid question:

if I have a Blog with a lazy-load collection of Posts.

And do:

Blog b = GetById(1);

foreach(Post p in b.Posts)

...

is that foreach calling each - or since I reference the collection, it gets them all at that point - one call ?

Ayende Rahien
12/03/2008 06:35 PM by
Ayende Rahien

Steve,

That will be two calls.

select blog

select posts where blogid =?

Steve
12/03/2008 09:15 PM by
Steve

Ok, thank you.

Adam
01/26/2009 04:02 PM by
Adam

So I guess the best solution is to pass your fetch strategy to your persistence layer wherever you make your calls?

I'm using NHprofiler and I'm getting a SELECT N+1 alert .. but im not sure why.

My object graph is as follows: AuthUsers have SystemUsers, SystemUsers have Permissions with a many-many (SystemUserPermission)

In my mappings, the realtionship is not lazy between either of those entities, and I get the warning. I do see that 4 queries were executed and it looks like it's looping, but my code isn't doing that and it seems to be the result of lazy loading..

SELECT permission0.SystemUserId as SystemUs1_1,

   permission0_.permissionid          as permissi2_1_,

   permission1_.PermissionId          as Permissi1_3_0_,

   permission1_.Name                  as Name3_0_,

   permission1_.PermissionDescription as Permissi3_3_0_

FROM Omega.dbo.systemuserpermission permission0_

   left outer join Omega.dbo.Permission permission1_

     on permission0_.permissionid = permission1_.PermissionId

WHERE permission0_.SystemUserId = 89 /* @p0 */

Any suggestions?

Ayende Rahien
01/26/2009 04:37 PM by
Ayende Rahien

You aren't using lazy load, so it is forcing the load of all the items in sub optimal fashion

Adam
01/26/2009 04:42 PM by
Adam

Ok, thanks for the reply.

So I should be using lazy load here?

I'm trying to implement your commons library for your unit of work/session in view pattern. When I was using lazy loading I was getting lazy exceptions, so I suppose this was a fix in the wrong direction.

Obviously I'm not the NHibernate expert you are, but I plan to go through everything with your profiler and learn/clean.

Comments have been closed on this topic.