Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969

, @ Q c

Posts: 10 | Comments: 37

filter by tags archive

Reviewing OSS ProjectWhiteboard Chat–The Select N+1 issue

time to read 3 min | 430 words

As a reminder, I am reviewing the problems that I found while reviewing the Whiteboard Chat project during one of my NHibernate’s Courses. Here is the method:

[Transaction]
[Authorize]
[HttpPost]
public ActionResult GetLatestPost(int boardId, string lastPost)
{
    DateTime lastPostDateTime = DateTime.Parse(lastPost);
    
    IList<Post> posts = 
        _postRepository
        .FindAll(new GetPostLastestForBoardById(lastPostDateTime, boardId))
        .OrderBy(x=>x.Id).ToList();

    //update the latest known post
    string lastKnownPost = posts.Count > 0 ? 
        posts.Max(x => x.Time).ToString()
        : lastPost; //no updates
    
    Mapper.CreateMap<Post, PostViewModel>()
        .ForMember(dest => dest.Time, opt => opt.MapFrom(src => src.Time.ToString()))
        .ForMember(dest => dest.Owner, opt => opt.MapFrom(src => src.Owner.Name));

    UpdatePostViewModel update = new UpdatePostViewModel();
    update.Time = lastKnownPost; 
    Mapper.Map(posts, update.Posts);

    return Json(update);
}

In this post, I am going to discuss the SELECT N+1 issue that exists in this method.

Can you see the issue? It is actually hard to figure out.

Yep, it is in the Mapper.Map call, for each post that we return, we are going to load the post’s owner, so we can provide its name.

So far, I have had exactly 100% success rate in finding SELECT N+1 issues in any application that I reviewed. To be fair, that also include applications that I wrote.

And now we get to the real point of this blog post. How do you fix this?

Well, if you were using plain NHibernate, that would have been as easy as adding a Fetch call. But since this application has gone overboard with adopting repository and query objects mode, it is actually not an issue of just fixing this.

Welcome to the re-architecting phase of the project, because your code cannot be fixed to work efficiently.

I’ll discuss this in more details in my next post…

More posts in "Reviewing OSS Project" series:

  1. (15 Mar 2011) Whiteboard Chat–The Select N+1 issue
  2. (14 Mar 2011) Whiteboard Chat–Unbounded Result Sets and Denial of Service Attacks
  3. (11 Mar 2011) Whiteboard Chat–setup belongs in the initializer
  4. (10 Mar 2011) Whitboard Chat–overall design
  5. (09 Mar 2011) Whiteboard Chat

Comments

Paulo Quicoli

I don't see problem about adopting repositories. I want to believe they saw every time they need a post, they need also its owner.

So, maybe, in FindAll implementation they have a Fetch call for that relationship...

Scooletz

@Paulo Quicoli

... or even better: the query object of type GetPostLastestForBoardById can have some Fetch applier, which shows which paths of relations should be considered during query, but

pure ISession is much easier to use though. I took this course in my latest code review of my colleagues' project and it did work. The number of lines of code and the complexity were highly reduced.

Bas
Bas

Looking forward to your next post. I ran into this issue many times.

I dont want to use fetch/thenfetch inside my repository methods because I dont need to fetch it everytime.

Richard Dingwall

This problem has been present in pretty much every NHibernate application I've ever seen.

@ Paulo's assumption is a dangerous one - using the repository pattern to hide a session forces you into sub-optimal scenarios, where you typically lose performance via either SELECT N+1 (slow), being forced to disable lazy loading (slow), not able to use NH features like fetching, or blossoming of hundreds of wrapper query objects/methods for each load scenario. Just use the session directly.

zvolkov

In all the cases I've seen people adopting repository had no idea what they were doing. I would suggest somebody to live with repository for a year in production with moderately complex system (say, 10 times bigger than a blog engine) and see if they hold on to their opinion. When I was just learning NHibernate I started with repository and query objects before I realized that is not the right way to encapsulate logic in a cohesive way.

Jon wingfield

How do you unit test a controller that directly uses ISession? That's an host question. Wouldn't mocking the session, criteria, etc. be cumbersome?

Jon wingfield

sorry I meant honest. typing on a phone sucks.

Jon Wingfield

Ayende,

Thanks, that's a very interesting take and one that gives a huge advantage to NHibernate. I'm used to thinking in ADO.NET terms.

I have a few other concerns, though. The first is separation of concerns. I do think that Hibernate abstracts things away enough that typing:

_postRepository

    .FindAll(new GetPostLastestForBoardById(lastPostDateTime, boardId))

    .OrderBy(x=>x.Id).ToList();

isn't much different than typing session.CreateCriteria <t()... Either way, the controller now has to know HOW to get the data it needs as well as how to control the UI. It just has to be determined if that additional responsibility will be worth the reduction in code.

My one other concern is that changing the database to a document-type database would require rewriting large portions of the app. Sure, this doesn't happen on short (<3 year projects), but I'm actually in the situation now where we COULD benefit from switching to a document database. Problem is, we are too tightly coupled to the database implementation to do so (which would be the case here too, I believe).

Ayende Rahien

Jon,

Any switch would be a BIG undertaking, and you can't abstract that.

It is just not possible.

Oh, and if you want to talk about document databases...

peter

a bit off-topic, but the header for this post is a day ahead (wednesday the 16th).

Chris Wright

I don't like having a controller create queries directly. I want to be able to test the query in isolation if it has any complexity. (I think I'd be okay with a controller using criteria for relatively simple queries; not entirely certain.)

So I have the exact same code as you would have here, except extracting the query into a method on the repository. I don't mind having a repository supporting a hundred different queries, each of which is only used in one class -- except then I'd probably find some way of splitting it up into separate classes for ease of maintenance. In practice, for a mid-size application, that ended up being between zero and eight queries per repository (but this application wasn't very database-centric).

We weren't considering the possibility of switching from nhibernate when doing this, but I suppose it keeps all the queries in one place, which should help when scoping the task.

John Farrell

@Chris

" I don't mind having a repository supporting a hundred different queries, each of which is only used in one class"

So you enjoy having 100 classes be dependent on a single class even though the each class uses 1% of the total functionality?

Dmitry

They way I deal with this is by having the Find method return IQueryable and specification objects (like the GetPostLastestForBoardById) that encapsulate Expression<Func<T, bool>>. This way Fetch could be applied on a per-specification basis.

My problem with using ICriteria/IQuery inside the controller is you cannot reuse queries anywhere else.

Scooletz

@Dmitry

You can always write a query and name it. Then you have it reusable in cost of not knowing whats going on behind the curtain.

flukus

On a related note, how do you feel about using Automapper in this situation?

On a current project I keep changing my opinion. On one hand it moves the transform logic into it's own class. On the other hand it potentially creates performance problems like this.

At the moment I've settled on using automapper and cleaning up performance problems later. I hear you have a tool that can help this ;)

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

  1. Production postmortem: The case of the memory eater and high load - one day from now
  2. Production postmortem: The case of the lying configuration file - about one day from now
  3. Production postmortem: The industry at large - 3 days from now
  4. The insidious cost of allocations - 4 days from now
  5. Find the bug: The concurrent memory buster - 5 days from now

And 4 more posts are pending...

There are posts all the way to Sep 10, 2015

RECENT SERIES

  1. Find the bug (5):
    20 Apr 2011 - Why do I get a Null Reference Exception?
  2. Production postmortem (10):
    14 Aug 2015 - The case of the man in the middle
  3. What is new in RavenDB 3.5 (7):
    12 Aug 2015 - Monitoring support
  4. Career planning (6):
    24 Jul 2015 - The immortal choices aren't
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats