NHibernate performance concerns

Darrel investigated NHibernate and came back with Post Traumatic Stress Disorder. The issue he had was with NHibernate's Automatic Dirty Checking.

This feature is implemented by keeping, inside the session, the state of the object as it was when loaded from the database, and comparing that initial data to the current state of the object when flushing. The problem, in Darrel's words, is:

From my perspective there are two major drawbacks to this approach.  First, considering the data you are manipulating is probably consuming the majority of the applications memory, you have double the memory requirements for the writable objects.  I am assuming you only take this hit for objects retrieved as writable, but even so, this just seems unreasonable especially considering the fact that many people use Hibernate on a server that is a shared resource!

The second problem is that when doing a flush, it is necessary to do a comparison on each and every entity property.  This is necessary even if no changes have been made.

Let me tackle the second problem first. Yes, calling Flush() when you have a large number of entities in the session is going to perform poorly. That is why NHibernate provides you with ways to tell it that it shouldn't make those checks (calling Evict() on the stuff that you know wasn't changed, etc.). In practice, you only need to do this in rare cases. Loading a large number of objects, modifying a few and saving them is a use case that doesn't turn up that often, in my experience, and when it does, it is simple to optimize. If you really wish, you can provide a way (through the interceptor) to tell NHibernate which entities are dirty and which are not, and save it the effort.
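To make the mechanism and its cost concrete, here is a minimal, language-agnostic sketch in Python. It is not NHibernate's implementation or API. `ToySession`, `load`, `flush`, `evict` and `Customer` are hypothetical names made up for illustration. It only shows the general idea: keep a snapshot at load time, compare every tracked entity at flush time, and skip whatever was evicted.

```python
import copy

_MISSING = object()  # sentinel for attributes that did not exist at load time


class ToySession:
    """Toy unit of work (hypothetical): keeps a snapshot of each loaded entity."""

    def __init__(self):
        self._tracked = {}  # id(entity) -> (entity, snapshot of loaded state)

    def load(self, entity):
        # Keep a copy of the state as loaded; this is the extra memory cost.
        self._tracked[id(entity)] = (entity, copy.deepcopy(vars(entity)))
        return entity

    def evict(self, entity):
        # Stop tracking: the entity is skipped at flush and its snapshot released.
        self._tracked.pop(id(entity), None)

    def flush(self):
        # Compare every tracked entity to its snapshot, property by property,
        # even when nothing has changed.
        dirty = []
        for entity, snapshot in list(self._tracked.values()):
            current = vars(entity)
            changed = sorted(
                name for name in current
                if current[name] != snapshot.get(name, _MISSING)
            )
            if changed:
                dirty.append((entity, changed))
                # Refresh the snapshot so the next flush starts clean.
                self._tracked[id(entity)] = (entity, copy.deepcopy(current))
        return dirty


class Customer:
    def __init__(self, name, city):
        self.name = name
        self.city = city


session = ToySession()
alice = session.load(Customer("Alice", "Oslo"))
bob = session.load(Customer("Bob", "Rome"))

session.evict(bob)      # we know bob will not change, so skip checking him
alice.city = "Bergen"   # plain property change, no persistence call

changes = session.flush()
print([(entity.name, fields) for entity, fields in changes])
# prints: [('Alice', ['city'])]
```

The flush cost grows with the number of tracked entities, not with the number of changed ones, which is why evicting (or otherwise telling the session what is clean) helps in the rare bulk-load case.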

The first problem that Darrel has is with the memory consumption. I will start right off by saying that I have been a devout user of NHibernate* for over two years now, and I'm building big, complex systems using it. Memory consumption was never an issue with NHibernate. Optimizing NHibernate's performance is almost solely focused on reducing the number of queries and getting NHibernate to fetch the data in the most efficient way.

The one time I heard about an issue with memory consumption and NHibernate, it wasn't because of the entities cache, but because the developers had abused NHibernate's query translation cache by generating hundreds of thousands of queries from user input.
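As a rough illustration of how that kind of abuse can happen, here is a toy sketch in Python. It assumes the queries were built by concatenating user input into the query text, which the story above does not spell out. `ToyQueryCache` and the `Customer` query are hypothetical; this is not how NHibernate's cache is written, only a cache keyed by query text.

```python
class ToyQueryCache:
    """Toy cache keyed by query text (hypothetical stand-in for a translation cache)."""

    def __init__(self):
        self._plans = {}

    def translate(self, query_text):
        # "Translate" (here: just tokenize) once per distinct query text.
        if query_text not in self._plans:
            self._plans[query_text] = tuple(query_text.split())
        return self._plans[query_text]

    def __len__(self):
        return len(self._plans)


user_inputs = [f"user{i}" for i in range(1000)]

# Anti-pattern: user input is concatenated into the query text,
# so every distinct input creates a new cache entry.
concatenated = ToyQueryCache()
for name in user_inputs:
    concatenated.translate(f"from Customer c where c.Name = '{name}'")

# Parameterized: the query text is constant and the value travels separately.
parameterized = ToyQueryCache()
for name in user_inputs:
    plan = parameterized.translate("from Customer c where c.Name = :name")
    params = {"name": name}  # bound at execution time, not part of the cache key

print(len(concatenated), len(parameterized))  # prints: 1000 1
```

With parameters the cache holds one entry no matter how many users search; with concatenation it grows with every distinct input.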

The basic assumption that I contest is that the entities are going to be the majority of the application's memory. This is not the case in nearly any application that I know of. Let us say that we have an entity that is 1Kb in size (this is a fairly large entity, by the way). Let us say that we have 250 of those in the current request (again, this is a very big number). So we have 250Kb for the entities, and another 250Kb for the loaded state. 500Kb is a lot, you say?

Let us try finding other stuff (per request) that has a similar size. The first thing that I can think of, of course, is ViewState. I am sure that I am not the only one who got a "WTF is ViewState doing that it is 80% of the page size, including pictures!?". And, of course, you pay for ViewState (memory-wise) twice: once for the serialized data, which remains in the request's variables collection until the request ends, and once for the deserialized data.

The page itself and its controls collection is another very big object. And I am sure that you can see where I am going from here. So, even for a case with a lot of very big entities, I would bet that the other factors in the request would overshadow the amount of memory consumed by NHibernate by a large percentage.

But, and this is important, since when has memory been a problem for scaling an application? (Leaving aside doing silly stuff like putting a 400Mb dataset per user in the session). I/O is a far bigger factor in scaling applications, especially database I/O.

To conclude, I don't see the memory consumption of NHibernate's Automatic Dirty Checking as a problematic issue. Quite the reverse, actually. You have no idea how powerful this is until you realize that your business logic can do its stuff (and you can test it) completely separated from the persistence mechanism, and NHibernate will simply pick up the changes and persist them for you. The first time I did it, I felt like I was doing magic.

Oh, and just to point out, there isn't really any other way to handle this, except by doing exactly that. When I spoke with Luca Bolognese about DLinq, he mentioned that it uses basically the same approach to solve this issue.

* Mainly because whenever I stray from the Path I get hit hard on the head by the ADO.Net gods of vengeance and boring code.