Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,612
|
Comments: 51,245
Privacy Policy · Terms
filter by tags archive
time to read 5 min | 868 words

Originally posted at 3/10/2011

This time, this is a review of the Sharp Commerce application. Again, I have stumbled upon the application by pure chance, and I have very little notion about who wrote it. The problem is that this system seems to be drastically more complicated than it should be.

In this case, I want to look at the type of API that is exposed:

image

If this reminds you of the bad old days of having only Stored Procedure API available, that is not by chance. Far worst than that, however, is the call paths where this is used.

  • IEmailTemplateRepository.Get(EmailTemplateLookup) is implemented as
    public EmailTemplate Get(EmailTemplateLookup emailId)
    {
        return Get((int)emailId);
    }
    and is only used in:
    • EmailService.Get(EmailTemplateLookup) whose implementation is:
      public EmailTemplate GetEmail(EmailTemplateLookup template)
      {
          return emailTemplateRepository.Get(template);
      }
  • ICategoryRepository.GetParentCategories is only used from:
    • CategoryService.GetParentCategories which is implemented as:
      public IEnumerable<Category> GetParentCategories()
      {
          IEnumerable<Category> categories = categoryRepository.GetParentCategories();
      
          return categories;
      }
  • ICurrencyRepository.GetEnabledCurrencies is only used from:
    • CurrencyService.GetEnabledCurrencies which is implemented as:
      public IEnumerable<Currency> GetEnabledCurrencies()
      {
           return currencyRepository.GetEnabledCurrencies();
      }
  • For that matter, let us take a look at the entire CurrencyService class, shall we?

    public class CategoryService : ICategoryService
    {
        private readonly ICategoryRepository categoryRepository;
    
        public CategoryService(ICategoryRepository categoryRepository)
        {
            this.categoryRepository = categoryRepository;
        }
    
        public IList<Category> GetCategories()
        {
            return categoryRepository.GetAll();
        }
    
        public Category GetCategory(int id)
        {
            return categoryRepository.Get(id);
        }
    
        public void SaveOrUpdate(Category categoryModel)
        {
            categoryRepository.SaveOrUpdate(categoryModel);
        }
    
        public void Delete(Category category)
        {
            categoryRepository.Delete(category);
        }
    
        public IEnumerable<Category> GetParentCategories()
        {
            IEnumerable<Category> categories = categoryRepository.GetParentCategories();
    
            return categories;
        }
    }

    To be honest, I really don’t see the point.

    Now, just a hint on the next few posts, there are places where I think wrapping the usage of the NHibernate API was a good idea, even if I strongly disagree with how this was done.

    time to read 2 min | 291 words

    Originally posted at 3/10/2011

    This time, this is a review of the Sharp Commerce application. Again, I have stumbled upon the application by pure chance, and I have very little notion about who wrote it. The problem is that this system seems to be drastically more complicated than it should be.

    I am going to focus  on different parts of the system in each of those posts. In this case, I want to focus on the very basis for the application data access:

    image

    Are you kidding me? This is before you sit down to write a single line of code, mind you. Just the abstract definitions for everything makes my head hurt.

    It really hits you over the head when you get this trivial implementation:

    public class EmailTemplateRepository : Repository<EmailTemplate>, IEmailTemplateRepository
    {
        public EmailTemplate Get(EmailTemplateLookup emailId)
        {
            return Get((int)emailId);
        }
    }

    Yes, this is the entire class.  I am sorry, but I really don’t see the point. The mental weight of all of this is literally crashing.

    time to read 2 min | 232 words

    Originally posted at 3/10/2011

    You might have noticed that I am routinely pointing out that there are issues, drastic issues, with any piece of code that issues a query without setting a limit to the number of results.

    I got several replies saying that I am worrying too much about this issue. I don’t really understand that sort of thinking. Leaving aside the possibility of literally killing our application (as a result of Out of Memory Exception), unbounded result sets are dangerous. The main problem is that they aren’t just a theoretical problem, it is a problem that happens with regular intervals in production systems.

    The real issue is that it is not just System Down issues, this might be the best scenario, actually. In most production deployment, you are actually paying for the amount of data that you are passing around. When you start dealing with unbounded result set, you are literally writing an open check and handing it to strangers.

    I don’t know many people who can do something like this with equanimity.

    It doesn’t take a lot to get to the point where this sort of thing really hurts. Moreover, there are truly very few cases where you actually need to have access to the entire result set. For the most part, when I see developers doing that, it is usually out of sheer laziness.

    time to read 12 min | 2309 words

    Originally posted at 3/10/2011

    This is part of the series of posts about the Whiteboard Chat project. In this post, I am going to explore the data access methodology and why is it so painful to work with. In particular, I am going to focus on the problems arising in this method:

    [Transaction]
    [Authorize]
    [HttpPost]
    public ActionResult GetLatestPost(int boardId, string lastPost)
    {
        DateTime lastPostDateTime = DateTime.Parse(lastPost);
        
        IList<Post> posts = 
            _postRepository
            .FindAll(new GetPostLastestForBoardById(lastPostDateTime, boardId))
            .OrderBy(x=>x.Id).ToList();
    
        //update the latest known post
        string lastKnownPost = posts.Count > 0 ? 
            posts.Max(x => x.Time).ToString()
            : lastPost; //no updates
        
        Mapper.CreateMap<Post, PostViewModel>()
            .ForMember(dest => dest.Time, opt => opt.MapFrom(src => src.Time.ToString()))
            .ForMember(dest => dest.Owner, opt => opt.MapFrom(src => src.Owner.Name));
    
        UpdatePostViewModel update = new UpdatePostViewModel();
        update.Time = lastKnownPost; 
        Mapper.Map(posts, update.Posts);
    
        return Json(update);
    }

    You can clearly see that there was a lot of though dedicated to making sure that the architecture was done right. Here is the overall data access approach:

    image

    So are using the Specification & Repository patterns, which seems okay, at first glance.  Except that the whole system is seriously over engineered, and it doesn’t really provide you with any benefits. The whole purpose of a repository is to provide an in memory collection interface to a data source, but that pattern was established at a time where the newest thing on the block was raw SQL calls. Consider what NHibernate is doing, and you can see that here, one implementation of an in memory collection interface on top of a data store is wrapped in another, and it does nothing for us except add additional code and complexity.

    As an aside, even ignoring the uselessness of the repository pattern here, I really don’t like that the FindOne() method returns an IQueryable<T>, and I don’t really see a reason for that there.

    Then there is the issue of the Specification itself. It is not quite pulling its own weight. To be frank, it is actually sinking down the entire ship, rather than helping any. For example, let us look at the GetPostLastestForBoardById implementation:

    public class GetPostLastestForBoardById : Specification<Post>
    {
        private readonly DateTime _lastPost;
        private readonly int _boardId;
    
        public GetPostLastestForBoardById(DateTime lastPost, int boardId)
        {
            _lastPost = lastPost;
            _boardId = boardId;
        }
    
        public override Expression<Func<Post, bool>> MatchingCriteria
        {
            get
            {
                return x =>
                    x.Time > _lastPost &&
                    x.Board.Id == _boardId;
            }
        }
    }

    Are you kidding me, all of this infrastructure, all of those nested generics, for something like this? Is this really something that deserves its own class? I don’t think so.

    Worse than that, this sort of design implies that there is a value in sharing queries among different parts of the application. And the problem with that sort of thinking is that this premise is usually false. Except for querying by id, most queries in an application tend to be fairly unique to their source, and even when they are the actual physical query, they usually have different logical meaning. For example, “select all the late rentals” is a query that we might use as part of the rental system home screen, and also as part of the business logic to charge people for late rentals. The problem is that while on the surface, this query is the same, the meaning of what we are doing is usually drastically different. And because of that, trying to reuse that query in some manner is actually going to cause us problems. We have a single piece of code that now have two masters, and two reasons for change. That is not a good place to be. You’ll probably have to touch that piece of code too often, and have to make it more complicated than it would have been because it has dual purposes.

    You can see an example of how this is problematic in the method above. In my last post, I pointed out that there is a SELECT N+1 issue in the code. For each of the loaded posts, we also load the Owner’s name. NHibernate provides easy ways to fix this issue, by simply specifying what is the fetch paths that we want to use.

    Except… this architecture make is drastically harder to make something like that. The repository abstraction layer hides the NHibernate API that allows you to specify the fetch paths. So you are left with a few options. The most common approach for this issue is usually to re-architect specifications so that they can also provide the fetch path for the query. It looks something like this:

    public class GetPostLastestForBoardById : Specification<Post>
    {
        private readonly DateTime _lastPost;
        private readonly int _boardId;
    
        public GetPostLastestForBoardById(DateTime lastPost, int boardId)
        {
            _lastPost = lastPost;
            _boardId = boardId;
        }
    
        public override Expression<Func<Post, bool>> MatchingCriteria
        {
            get
            {
                return x =>
                    x.Time > _lastPost &&
                    x.Board.Id == _boardId;
            }
        }
        
        public override IEnumerable<Expression<Func<Post, bool>>> FetchPaths
        {
          get
          {
              yield return x => x.Owner;
          }
        }
    }

    Note that this design allows you to specify multiple fetch paths, which seems to fix the issue that we had in our API.  Expect that it doesn’t really work, even identical queries near one another (say, in different actions of the same controller) usually have different fetching requirements.

    Oh, and we haven’t even talked about the need to handle projections.

    Okay, so we can do it differently, we can let developer the option to configure the fetch paths for each query at the location where that query is made. Of course, you then get the following architecture:

    image

    If you think that this is ridiculous, I agree. If you don’t think so… well, never mind, I was told I can’t actually revoke someone’s Touch-The-Keyboard license.

    But let us say that you have solved even that problem somehow, probably by adding yet another abstraction layer on specifying fetch paths. At that point, your code is so abstract, it deserve being put in the Museum for Modern Art, but I’ll let you get away with it.

    The problem is that fetch paths are actually only part of the solution for things like Select N+1. One very powerful option that we have with NHibernate is the use of Future Queries, and we can utilize that feature to significantly reduce both the number of time that we have to go to the database and the size of the data that we have to read from it.

    Except… this design means that I am pretty much unable to implement this sort of  behavior. I will have to drastically change the system design before I can actually do a meaningful performance improvement.

    I see this sort of problem all the time when I am doing code reviews. And the problem with this is that it is usually a stop ship issue because we literally cannot fix things properly without doing a lot of very painful changes.

    Now, allow me to explain the underlying logic behind this post. Reading from the database is a common operation, and it should be treated as such. There should be very few abstractions involved, because there isn’t much to abstract. For the vast majority of cases, there is no logic involved in read operations, and even when there is, it is most often cross cutting logic. The answer that I give for things like "well, so how can I apply security filtering” is that you throw that into an extension method and do something like this:

    var query = 
      (
            from post in session.Query<Post>()
            where post.DueDate > DateTime.Now 
            select post
       )
       .FilterAccordingToUserPermissions(CurrentUser)
       .ToList();

    This make it very easy to review such queries and see that the appropriate actions where taken, and this type of action is only required when you can’t handle this as truly a part of the infrastructure.

    It is also important that I am talking specifically about read operations here. For write operations, the situation is (slightly) different. Write operations usually have business logic, and you want to keep that separate. I would still want to avoid doing anything too abstract, and I would certainly not introduce a repository for writes.

    What I would do is either create a service layer for writes that handle the actual business of doing writes, but (and this is important), I would only do so if there is actually logic to be executed. If this is merely a simple CUD operation that we are executing, there is really very little point to going with a service. Yes, that means using NHibernate directly from the controller method to save whatever it is that you are editing, assuming that there is nothing that requires business logic there.

    To be frank, for the pieces that actually require business logic, I would probably prefer to avoid an explicit service implementation and move to using an internal publish / subscribe (event publisher) in order to have a nicer internal architecture. But at that point, it is more of a preference than anything else.

    To summarize, do try to avoid needless complexity. Getting data from the database is a common operation, and should be treated as such. Adding additional layers of abstractions usually only make it hard. Worse, it creates, in your own application code, the sort of procedural operations that made it so hard to work with systems using Stored Procedures.

    You want to avoid that, it was a bad time, and we don’t want to go back there. But code such as the above is literally something that you would write for those sort of systems, where you have only the pre-defined paths for accessing the data store.

    It is no longer the case, and it is time that our code started acknowledging this fact.

    time to read 3 min | 430 words

    As a reminder, I am reviewing the problems that I found while reviewing the Whiteboard Chat project during one of my NHibernate’s Courses. Here is the method:

    [Transaction]
    [Authorize]
    [HttpPost]
    public ActionResult GetLatestPost(int boardId, string lastPost)
    {
        DateTime lastPostDateTime = DateTime.Parse(lastPost);
        
        IList<Post> posts = 
            _postRepository
            .FindAll(new GetPostLastestForBoardById(lastPostDateTime, boardId))
            .OrderBy(x=>x.Id).ToList();
    
        //update the latest known post
        string lastKnownPost = posts.Count > 0 ? 
            posts.Max(x => x.Time).ToString()
            : lastPost; //no updates
        
        Mapper.CreateMap<Post, PostViewModel>()
            .ForMember(dest => dest.Time, opt => opt.MapFrom(src => src.Time.ToString()))
            .ForMember(dest => dest.Owner, opt => opt.MapFrom(src => src.Owner.Name));
    
        UpdatePostViewModel update = new UpdatePostViewModel();
        update.Time = lastKnownPost; 
        Mapper.Map(posts, update.Posts);
    
        return Json(update);
    }

    In this post, I am going to discuss the SELECT N+1 issue that exists in this method.

    Can you see the issue? It is actually hard to figure out.

    Yep, it is in the Mapper.Map call, for each post that we return, we are going to load the post’s owner, so we can provide its name.

    So far, I have had exactly 100% success rate in finding SELECT N+1 issues in any application that I reviewed. To be fair, that also include applications that I wrote.

    And now we get to the real point of this blog post. How do you fix this?

    Well, if you were using plain NHibernate, that would have been as easy as adding a Fetch call. But since this application has gone overboard with adopting repository and query objects mode, it is actually not an issue of just fixing this.

    Welcome to the re-architecting phase of the project, because your code cannot be fixed to work efficiently.

    I’ll discuss this in more details in my next post…

    time to read 3 min | 468 words

    Originally posted at 3/8/2011

    As a reminder, I am reviewing the problems that I found while reviewing the Whiteboard Chat project during one of my NHibernate’s Courses. Here is the method:

    [Transaction]
    [Authorize]
    [HttpPost]
    public ActionResult GetLatestPost(int boardId, string lastPost)
    {
        DateTime lastPostDateTime = DateTime.Parse(lastPost);
        
        IList<Post> posts = 
            _postRepository
            .FindAll(new GetPostLastestForBoardById(lastPostDateTime, boardId))
            .OrderBy(x=>x.Id).ToList();
    
        //update the latest known post
        string lastKnownPost = posts.Count > 0 ? 
            posts.Max(x => x.Time).ToString()
            : lastPost; //no updates
        
        Mapper.CreateMap<Post, PostViewModel>()
            .ForMember(dest => dest.Time, opt => opt.MapFrom(src => src.Time.ToString()))
            .ForMember(dest => dest.Owner, opt => opt.MapFrom(src => src.Owner.Name));
    
        UpdatePostViewModel update = new UpdatePostViewModel();
        update.Time = lastKnownPost; 
        Mapper.Map(posts, update.Posts);
    
        return Json(update);
    }

    In this post, I want to focus on a very common issue that I see over & over. The problem is that people usually don’t notice those sort of issues.

    The problem is quite simple, there is no limit to the amount of information that we can request from this method. What this mean is that we can send it 1900-01-01 as the date, and force the application to get all the posts in the board.

    Assuming even a relatively busy board, we are talking about tens or hundreds of thousands of posts that are going to be loaded. That is going to put a lot of pressure on the database, on the server memory, and on the amount of money that you’ll pay in the end of the month for the network bandwidth.

    There is a reason why I strongly recommend to always use a limit, especially in cases like this, where it is practically shouting at you that the number of items can be very big.

    On the next post, we will analyze the SELECT N+1 issue that I found in this method (so far, my record is 100% success in finding this type of issue in any application that I reviewed)…

    time to read 1 min | 139 words

    We just finished a pretty major change to how the NHibernate Profiler interacts with NHibernate. That change was mostly driven out of the desire to fully support running under the client profile and to allow us to support the new logging infrastructure in NHibernate 3.x.

    The good news, this is done Smile, you can now use NH Prof from the client profile, and you don’t need to do anything to make it work for NHibernate 3.x.

    The slightly bad news is that if you were relying on log4net conifguration to configure NH Prof, there is a breaking change that affects you. Basically, you need to update your configuration. You can find the full details on how to do this in the documentation.

    time to read 3 min | 471 words

    Originally posted at 3/8/2011

    As a reminder, I am reviewing the problems that I found while reviewing the Whiteboard Chat project during one of my NHibernate’s Courses. Here is the method:

    [Transaction]
    [Authorize]
    [HttpPost]
    public ActionResult GetLatestPost(int boardId, string lastPost)
    {
        DateTime lastPostDateTime = DateTime.Parse(lastPost);
        
        IList<Post> posts = 
            _postRepository
            .FindAll(new GetPostLastestForBoardById(lastPostDateTime, boardId))
            .OrderBy(x=>x.Id).ToList();
    
        //update the latest known post
        string lastKnownPost = posts.Count > 0 ? 
            posts.Max(x => x.Time).ToString()
            : lastPost; //no updates
        
        Mapper.CreateMap<Post, PostViewModel>()
            .ForMember(dest => dest.Time, opt => opt.MapFrom(src => src.Time.ToString()))
            .ForMember(dest => dest.Owner, opt => opt.MapFrom(src => src.Owner.Name));
    
        UpdatePostViewModel update = new UpdatePostViewModel();
        update.Time = lastKnownPost; 
        Mapper.Map(posts, update.Posts);
    
        return Json(update);
    }

    In this post, I want to focus on the usage of AutoMapper, in the form of Mapper.CreateMap() call.

    It was fairly confusing to figure out why that was going on there. In fact, since I am not a regular user of AutoMapper, I assumed that this was the correct way of doing things, which bothered me. And then I looked deeper, and figured out just how troubling this thing is.

    Usage of Mapper.CreateMap is part of the system initialization, in general. Putting it inside the controller method result in quite a few problems…

    To start with, we are drastically violating the Single Responsibility Principle, since configuring the Auto Mapper really have very little to do with serving the latest posts.

    But it actually gets worse, Auto Mapper assumes that you’ll make those calls at system initialization, and there is no attempt to make sure that those static method calls are thread safe.

    In other words, you can only run the code in this action in a single thread. Once you start running it on multiple thread, you are open to race conditions, undefined behavior and plain out weirdness.

    On the next post, I’ll focus on the security vulnerability that exists in this method, can you find it?

    time to read 2 min | 374 words

    As a reminder, I am reviewing the problems that I found while reviewing the Whiteboard Chat project during one of my NHibernate’s Courses. Here is the method:

    [Transaction]
    [Authorize]
    [HttpPost]
    public ActionResult GetLatestPost(int boardId, string lastPost)
    {
        DateTime lastPostDateTime = DateTime.Parse(lastPost);
        
        IList<Post> posts = 
            _postRepository
            .FindAll(new GetPostLastestForBoardById(lastPostDateTime, boardId))
            .OrderBy(x=>x.Id).ToList();
    
        //update the latest known post
        string lastKnownPost = posts.Count > 0 ? 
            posts.Max(x => x.Time).ToString()
            : lastPost; //no updates
        
        Mapper.CreateMap<Post, PostViewModel>()
            .ForMember(dest => dest.Time, opt => opt.MapFrom(src => src.Time.ToString()))
            .ForMember(dest => dest.Owner, opt => opt.MapFrom(src => src.Owner.Name));
    
        UpdatePostViewModel update = new UpdatePostViewModel();
        update.Time = lastKnownPost; 
        Mapper.Map(posts, update.Posts);
    
        return Json(update);
    }

    In this post, I’ll just cover some of the easy to notice issues:

    The method name is GetLatestPost, but it is actually getting the latest posts (plural). Yes, it is minor, but it is annoying.

    Action filter attributes for authorization or transactions shouldn’t be there. To be rather more exact, the [Authorize] filter attribute shows up on all the methods, and transactions with NHibernate should always be used.

    Using them as attributes means that you can select to use them or not (and indeed, there are some actions where the [Transaction] attribute is not found).

    Stuff that is everywhere should be defined once, and you don’t have to re-define it all over the place.

    FUTURE POSTS

    No future posts left, oh my!

    RECENT SERIES

    1. Recording (18):
      29 Sep 2025 - How To Run AI Agents Natively In Your Database
    2. Webinar (8):
      16 Sep 2025 - Building AI Agents in RavenDB
    3. RavenDB 7.1 (7):
      11 Jul 2025 - The Gen AI release
    4. Production postmorterm (2):
      11 Jun 2025 - The rookie server's untimely promotion
    5. RavenDB News (2):
      02 May 2025 - May 2025
    View all series

    Syndication

    Main feed ... ...
    Comments feed   ... ...
    }