Eagerly loading entity associations efficiently with NHibernate

time to read 29 min | 5706 words

One of the things that seems to pop up frequently is people wanting to load an entity with all of its associations eagerly. That is pretty easy to do when the associations are many-to-one (that is, there is only one of them for each root entity). Example of those would be things like Owner, Site, etc.

Here is an HQL query that would load a blog with its owner and site as well:

from Blog b left join fetch b.Owner left join fetch b.Site

The problem starts when you try to do the same for multiple collection associations. NHibernate allows you to do so, but the result is probably not what you would initially expect. This query, for example:

from Blog b left join fetch b.Posts left join fetch b.Users where b.Id = :id

Will result in the following SQL statement:

select blog0_.Id             as Id7_0_,
       posts1_.Id            as Id0_1_,
       user3_.Id             as Id5_2_,
       blog0_.Title          as Title7_0_,
       blog0_.Subtitle       as Subtitle7_0_,
       blog0_.AllowsComments as AllowsCo4_7_0_,
       blog0_.CreatedAt      as CreatedAt7_0_,
       posts1_.Title         as Title0_1_,
       posts1_.Text          as Text0_1_,
       posts1_.PostedAt      as PostedAt0_1_,
       posts1_.BlogId        as BlogId0_1_,
       posts1_.UserId        as UserId0_1_,
       posts1_.BlogId        as BlogId0__,
       posts1_.Id            as Id0__,
       user3_.Password       as Password5_2_,
       user3_.Username       as Username5_2_,
       user3_.Email          as Email5_2_,
       user3_.CreatedAt      as CreatedAt5_2_,
       user3_.Bio            as Bio5_2_,
       users2_.BlogId        as BlogId1__,
       users2_.UserId        as UserId1__
from   Blogs blog0_
       left outer join Posts posts1_
         on blog0_.Id = posts1_.BlogId
       left outer join UsersBlogs users2_
         on blog0_.Id = users2_.BlogId
       left outer join Users user3_
         on users2_.UserId = user3_.Id
where  blog0_.Id = 1 /* @p0 */

Something that may not be apparent immediately is going to result in a Cartesian product. This is pointed out in the documentation, but I think that we can all agree that while there may be reasons for this behavior, it is far from ideal.

Let us look at what other OR/Ms are doing, shall we?

The comparable query using Entity Framework would look like this:

db.Blogs
    .Include("Posts")
    .Include("Users")
    .Where(x => x.Id == i)
    .ToArray();

And the resulting SQL would be:

SELECT   [UnionAll1].[Id]             AS [C1],
         [UnionAll1].[Title]          AS [C2],
         [UnionAll1].[Subtitle]       AS [C3],
         [UnionAll1].[AllowsComments] AS [C4],
         [UnionAll1].[CreatedAt]      AS [C5],
         [UnionAll1].[C2]             AS [C6],
         [UnionAll1].[C1]             AS [C7],
         [UnionAll1].[C3]             AS [C8],
         [UnionAll1].[Id1]            AS [C9],
         [UnionAll1].[Title1]         AS [C10],
         [UnionAll1].[Text]           AS [C11],
         [UnionAll1].[PostedAt]       AS [C12],
         [UnionAll1].[BlogId]         AS [C13],
         [UnionAll1].[UserId]         AS [C14],
         [UnionAll1].[C4]             AS [C15],
         [UnionAll1].[C5]             AS [C16],
         [UnionAll1].[C6]             AS [C17],
         [UnionAll1].[C7]             AS [C18],
         [UnionAll1].[C8]             AS [C19],
         [UnionAll1].[C9]             AS [C20]
FROM     (SELECT CASE 
                   WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int)
                   ELSE 1
                 END AS [C1],
                 [Extent1].[Id]             AS [Id],
                 [Extent1].[Title]          AS [Title],
                 [Extent1].[Subtitle]       AS [Subtitle],
                 [Extent1].[AllowsComments] AS [AllowsComments],
                 [Extent1].[CreatedAt]      AS [CreatedAt],
                 1                          AS [C2],
                 CASE 
                   WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int)
                   ELSE 1
                 END AS [C3],
                 [Extent2].[Id]             AS [Id1],
                 [Extent2].[Title]          AS [Title1],
                 [Extent2].[Text]           AS [Text],
                 [Extent2].[PostedAt]       AS [PostedAt],
                 [Extent2].[BlogId]         AS [BlogId],
                 [Extent2].[UserId]         AS [UserId],
                 CAST(NULL AS int)          AS [C4],
                 CAST(NULL AS varbinary(1)) AS [C5],
                 CAST(NULL AS varchar(1))   AS [C6],
                 CAST(NULL AS varchar(1))   AS [C7],
                 CAST(NULL AS datetime)     AS [C8],
                 CAST(NULL AS varchar(1))   AS [C9]
          FROM   [dbo].[Blogs] AS [Extent1]
                 LEFT OUTER JOIN [dbo].[Posts] AS [Extent2]
                   ON [Extent1].[Id] = [Extent2].[BlogId]
          WHERE  [Extent1].[Id] = 1 /* @p__linq__1 */
          UNION ALL
          SELECT 2                          AS [C1],
                 [Extent3].[Id]             AS [Id],
                 [Extent3].[Title]          AS [Title],
                 [Extent3].[Subtitle]       AS [Subtitle],
                 [Extent3].[AllowsComments] AS [AllowsComments],
                 [Extent3].[CreatedAt]      AS [CreatedAt],
                 1                          AS [C2],
                 CAST(NULL AS int)          AS [C3],
                 CAST(NULL AS int)          AS [C4],
                 CAST(NULL AS varchar(1))   AS [C5],
                 CAST(NULL AS varchar(1))   AS [C6],
                 CAST(NULL AS datetime)     AS [C7],
                 CAST(NULL AS int)          AS [C8],
                 CAST(NULL AS int)          AS [C9],
                 [Join2].[Id]               AS [Id1],
                 [Join2].[Password]         AS [Password],
                 [Join2].[Username]         AS [Username],
                 [Join2].[Email]            AS [Email],
                 [Join2].[CreatedAt]        AS [CreatedAt1],
                 [Join2].[Bio]              AS [Bio]
          FROM   [dbo].[Blogs] AS [Extent3]
                 INNER JOIN (SELECT [Extent4].[UserId]    AS [UserId],
                                    [Extent4].[BlogId]    AS [BlogId],
                                    [Extent5].[Id]        AS [Id],
                                    [Extent5].[Password]  AS [Password],
                                    [Extent5].[Username]  AS [Username],
                                    [Extent5].[Email]     AS [Email],
                                    [Extent5].[CreatedAt] AS [CreatedAt],
                                    [Extent5].[Bio]       AS [Bio]
                             FROM   [dbo].[UsersBlogs] AS [Extent4]
                                    INNER JOIN [dbo].[Users] AS [Extent5]
                                      ON [Extent5].[Id] = [Extent4].[UserId]) AS [Join2]
                   ON [Extent3].[Id] = [Join2].[BlogId]
          WHERE  [Extent3].[Id] = 1 /* @p__linq__1 */) AS [UnionAll1]
ORDER BY [UnionAll1].[Id] ASC,
         [UnionAll1].[C1] ASC

At this point, I am pretty sure, your eyes shut down in self defense. This is one complex query. But, basically, this is a complex query because EF is executing the following two queries and unioning them.

Eager load Blog Posts:

SELECT CASE 
       WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int)
       ELSE 1
     END AS [C1],
     [Extent1].[Id]             AS [Id],
     [Extent1].[Title]          AS [Title],
     [Extent1].[Subtitle]       AS [Subtitle],
     [Extent1].[AllowsComments] AS [AllowsComments],
     [Extent1].[CreatedAt]      AS [CreatedAt],
     1                          AS [C2],
     CASE 
       WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int)
       ELSE 1
     END AS [C3],
     [Extent2].[Id]             AS [Id1],
     [Extent2].[Title]          AS [Title1],
     [Extent2].[Text]           AS [Text],
     [Extent2].[PostedAt]       AS [PostedAt],
     [Extent2].[BlogId]         AS [BlogId],
     [Extent2].[UserId]         AS [UserId],
     CAST(NULL AS int)          AS [C4],
     CAST(NULL AS varbinary(1)) AS [C5],
     CAST(NULL AS varchar(1))   AS [C6],
     CAST(NULL AS varchar(1))   AS [C7],
     CAST(NULL AS datetime)     AS [C8],
     CAST(NULL AS varchar(1))   AS [C9]
FROM   [dbo].[Blogs] AS [Extent1]
     LEFT OUTER JOIN [dbo].[Posts] AS [Extent2]
       ON [Extent1].[Id] = [Extent2].[BlogId]
WHERE  [Extent1].[Id] = 1 /* @p__linq__1 */

Eager load Blog Users:

SELECT 2                          AS [C1],
     [Extent3].[Id]             AS [Id],
     [Extent3].[Title]          AS [Title],
     [Extent3].[Subtitle]       AS [Subtitle],
     [Extent3].[AllowsComments] AS [AllowsComments],
     [Extent3].[CreatedAt]      AS [CreatedAt],
     1                          AS [C2],
     CAST(NULL AS int)          AS [C3],
     CAST(NULL AS int)          AS [C4],
     CAST(NULL AS varchar(1))   AS [C5],
     CAST(NULL AS varchar(1))   AS [C6],
     CAST(NULL AS datetime)     AS [C7],
     CAST(NULL AS int)          AS [C8],
     CAST(NULL AS int)          AS [C9],
     [Join2].[Id]               AS [Id1],
     [Join2].[Password]         AS [Password],
     [Join2].[Username]         AS [Username],
     [Join2].[Email]            AS [Email],
     [Join2].[CreatedAt]        AS [CreatedAt1],
     [Join2].[Bio]              AS [Bio]
FROM   [dbo].[Blogs] AS [Extent3]
     INNER JOIN (SELECT [Extent4].[UserId]    AS [UserId],
                        [Extent4].[BlogId]    AS [BlogId],
                        [Extent5].[Id]        AS [Id],
                        [Extent5].[Password]  AS [Password],
                        [Extent5].[Username]  AS [Username],
                        [Extent5].[Email]     AS [Email],
                        [Extent5].[CreatedAt] AS [CreatedAt],
                        [Extent5].[Bio]       AS [Bio]
                 FROM   [dbo].[UsersBlogs] AS [Extent4]
                        INNER JOIN [dbo].[Users] AS [Extent5]
                          ON [Extent5].[Id] = [Extent4].[UserId]) AS [Join2]
       ON [Extent3].[Id] = [Join2].[BlogId]
WHERE  [Extent3].[Id] = 1 /* @p__linq__1 */) 

The query may be complex, but it get the job done and does so without bothering the us much. The question is, can we do the same with NHibernate?

As it run out, we can, pretty easily too. The following code will do just that:

 var blogs = s.CreateQuery("from Blog b left join fetch b.Posts where b.Id = :id")
     .SetParameter("id", 1)
     .Future<Blog>();

 s.CreateQuery("from Blog b left join fetch b.Users where b.Id = :id1")
     .SetParameter("id1", 1)
     .Future<Blog>();

So, what is going on here? We are actually issuing two queries, each of them to eagerly load a single collection. The trick is that we are using future queries to do so. That means that the query that is going to be sent to the database to get those results is:

select blog0_.Id             as Id7_0_,
       posts1_.Id            as Id0_1_,
       blog0_.Title          as Title7_0_,
       blog0_.Subtitle       as Subtitle7_0_,
       blog0_.AllowsComments as AllowsCo4_7_0_,
       blog0_.CreatedAt      as CreatedAt7_0_,
       posts1_.Title         as Title0_1_,
       posts1_.Text          as Text0_1_,
       posts1_.PostedAt      as PostedAt0_1_,
       posts1_.BlogId        as BlogId0_1_,
       posts1_.UserId        as UserId0_1_,
       posts1_.BlogId        as BlogId0__,
       posts1_.Id            as Id0__
from   Blogs blog0_
       left outer join Posts posts1_
         on blog0_.Id = posts1_.BlogId
where  blog0_.Id = 1 /* @p0 */
select blog0_.Id             as Id7_0_,
       user2_.Id             as Id5_1_,
       blog0_.Title          as Title7_0_,
       blog0_.Subtitle       as Subtitle7_0_,
       blog0_.AllowsComments as AllowsCo4_7_0_,
       blog0_.CreatedAt      as CreatedAt7_0_,
       user2_.Password       as Password5_1_,
       user2_.Username       as Username5_1_,
       user2_.Email          as Email5_1_,
       user2_.CreatedAt      as CreatedAt5_1_,
       user2_.Bio            as Bio5_1_,
       users1_.BlogId        as BlogId0__,
       users1_.UserId        as UserId0__
from   Blogs blog0_
       left outer join UsersBlogs users1_
         on blog0_.Id = users1_.BlogId
       left outer join Users user2_
         on users1_.UserId = user2_.Id
where  blog0_.Id = 1 /* @p1 */

Note that we are essentially discarding the results of the second query. The reason for that is that we aren’t actually interested in those results, we execute this query solely to get NHibernate to fill the Users’ collection of the relevant entity.

I generally use this method so I would have the first query to eager load all the many-to-one associations, and then a series of queries (one per required loaded collection) to load one-to-many or many-to-many associations.