NHibernate is lazy, just live with it
At a client site, I found the following in all the mapping files:
    <?xml version="1.0" encoding="utf-8" ?>
    <hibernate-mapping xmlns="urn:nhibernate-mapping-2.2" default-lazy="false">
      <class name="Machine" table="Machines" lazy="false">
        <id name="Id">
          <generator class="identity"/>
        </id>
        <property name="Name"/>
        <set name="Parts" table="Parts" lazy="false">
          <key column="MachineId"/>
          <one-to-many class="Part"/>
        </set>
      </class>
    </hibernate-mapping>
I think that you can figure out by now what is wrong. It is too bad that the <blink/> tag has gone out of use, because that would be an accurate reflection of how I responded when I saw that. After I finished frothing at the mouth, I explained that you never, ever do things like that, because it leads to pain, mayhem and tears.
The lead dev heard me out, then explained that their situation wasn't a standard web application: their server had some business functionality, but it served mainly as a way for clients to come in and get some data. The architecture looks like this:
As you can imagine, NHibernate is sitting in the Application tier, but a lot of the work is done on the smart client application.
The lead dev explained that in their scenario, they needed to send the full entity to the smart client app, so they didn't want lazy loading. Their scenario also precluded the need to do lazy loading over the network, so all was good, and they made the decision to use lazy="false" intentionally. Their model was pretty good about not having deeply connected object graphs, too.
I hit Google and searched for the relevant posts:
- ayende strippers – Which leads to the post: The Stripper Pattern.
- castrate ayende * – Which leads to the post: Don’t castrate your architecture
* Googling that made me very nervous.
I include the search terms as a reminder to myself that post titles are important, because that is how I search for them after the fact. And I got some really strong (and strange) looks when I did that.
After we discussed that, we put it aside and moved on to other topics, until we came to the part where we profiled the system.
We immediately found some trouble spots that we needed to resolve. One of them was getting a list of machines that required fixing.
The problem was that the code looked something like this:
    public IEnumerable<Machine> GetMachinesRequiringFixes()
    {
        return (from machine in session.Linq<Machine>()
                where machine.Status == Statuses.Broken
                select machine).Take(10);
    }
Looks easy, right?
Except that this seemingly innocent piece of code was responsible for 400 queries.
What went wrong? How could loading 10 machines result in that many queries?
Well, let us look at what NHibernate did, shall we?
    select top 10 * from Machines where Status = 'Broken'
This gives NHibernate ten Machine instances, except that when NHibernate investigates the mapping for Machine, it realizes that it needs to load the Parts collection, since it was marked lazy="false". So NHibernate executes ten more queries, one to load each Machine's Parts.
    select * from Parts where MachineId = 40
    select * from Parts where MachineId = 41
    select * from Parts where MachineId = 42
    select * from Parts where MachineId = 43
    select * from Parts where MachineId = 44
    select * from Parts where MachineId = 45
    select * from Parts where MachineId = 46
    select * from Parts where MachineId = 47
    select * from Parts where MachineId = 48
    select * from Parts where MachineId = 49
But each Part also has an association for a MaintenanceHistory, which was also marked lazy="false", so NHibernate had to load those as well. (I'll spare you that one :-) ).
From the client's perspective, Machine, Parts and Maintenance History were all part of the same entity, and they wanted to be able to return that to the user in a single call. (More on that fallacy in my next post.)
The problem is that when you set lazy="false", you literally tie NHibernate's hands. Now, you could start playing with the fetch mode option, changing it to join or subselect, but that works only when you have a single collection association in the entire graph; otherwise you run into severe Cartesian product issues.
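To make the fetch option concrete: it is an attribute on the collection mapping. Here is a minimal sketch, assuming the Parts set from the mapping at the top of the post and the MachineId key column that appears in the generated SQL. It is an illustration, not the client's actual file.

```xml
<!-- fetch accepts select (the default), join or subselect -->
<set name="Parts" table="Parts" lazy="false" fetch="subselect">
  <!-- key column assumed from the SQL shown earlier -->
  <key column="MachineId"/>
  <one-to-many class="Part"/>
</set>
```

With subselect, the Parts for all the loaded machines should come back from one additional query instead of one query per machine. The setting still lives in the mapping, though, so it applies to every query that touches Machine.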
Now, NHibernate has the ability to handle just that scenario, but by setting lazy="false", you prevent NHibernate from having the chance to actually utilize it.
There is a good reason why lazy is set to true by default, and while I am sure that there is some limited number of scenarios where lazy="false" is the appropriate choice, it isn't for your scenario.
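For comparison, here is a sketch of the same Machine mapping with the defaults left alone. It assumes the same class and table names as the mapping at the top of the post, plus the MachineId key column from the generated SQL. It is meant as an illustration rather than a drop-in replacement.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<!-- No default-lazy and no lazy attributes: NHibernate's defaults
     (lazy classes and lazy collections) apply. -->
<hibernate-mapping xmlns="urn:nhibernate-mapping-2.2">
  <class name="Machine" table="Machines">
    <id name="Id">
      <generator class="identity"/>
    </id>
    <property name="Name"/>
    <set name="Parts" table="Parts">
      <!-- key column assumed from the SQL shown earlier -->
      <key column="MachineId"/>
      <one-to-many class="Part"/>
    </set>
  </class>
</hibernate-mapping>
```

With this mapping, nothing about Parts is decided globally. Whether to load them along with a Machine becomes a decision made per query. One practical consequence is that lazy classes need their public members to be virtual so that NHibernate can generate proxies for them.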
Comments
I think it is a common misconception that NHibernate is simpler without lazy loading. But actually I think it is far more difficult to get right - especially when you throw in further non-lazy 'simplifications' like treating sessions as transient in your architecture and using disconnected objects (preventing you from ever enabling lazy loading again).
Glad you wrote about it, now I can clobber people with a link next time
Is the downside of lazy=false strictly performance in this case? (I'm not implying that performance is unimportant.)
Could you talk more about the scenarios where lazy=false should be used? For example, I am using NHibernate for a reporting-only model, with no associations. Does lazy=false make sense here?
When you load this page, do you have a content -> comment association and do you load the comments lazily? or do you load the content and the comments at once and display it?
Why isn't NHibernate's second query set a single query with an IN clause?
Yes, it would still return too much data but at least there would be less overhead/latency.
[)amien
I don't really see why NHibernate needs to fetch Parts etc. when the query doesn't refer to it. In fact, by doing so (and thus resulting in 400 queries) it is doing lazy loading (load on demand) while there's actually no need for it.
If you switch off lazy loading, like in the mapping file, the developer should instead EAGER load the data of the graphs. The queries generated now are the same as when he'd traverse the rows and access the Parts collection, wouldn't it? He also could have solved this by using a fetch strategy btw, not ideal either (always eager loading isn't ideal).
Jason,
It is eventually perf. It is also limiting what NH can do.
João P. Bragança,
Read the post, you shouldn't use lazy=false
If you are using a reporting-only model with no associations, then lazy=false still isn't right.
It wouldn't do anything, anyway
Frank,
I didn't write the blog, and the blog isn't using NH, so I have no idea
Damien,
Because it is too early for NH to know.
Imagine a polymorphic query that loads some things with different collections, for example.
Makes writing standard SQL seem appealing.
Frans,
It has to load that because the mapping explicitly told it to.
Frans,
Also note that NH makes a distinction between whether to load things lazily (lazy=true/false) and how to eagerly load them (fetch=join/select).
So what is the answer? What should he actually have done in this situation to load all that data? It is a problem we come across a lot.
I'm surprised NHIbernate doesn't efficiently eagerly load graphs via joins.
I couldn't live without this feature in EF.
The answer to Damien seems a bit weird. Why wouldn't NHibernate know about further polymorphic collections or other nonsense? Aren't these things set up in the mapping files?
Frans,
If you set the mapping to "lazy=true" and do not eagerly load parts, what would the Parts property be set to? Setting it to null is something EF 1.0 tried to do, and it is a bad idea because you end up with a broken model. Do you throw an exception? I am not sure I like that either.
John,
The way NHibernate works, it has to process each loaded entity immediately, because that entity is being exposed to user code.
So before the entire list is done, it has to process all the work for the entity. It doesn't have the chance to look at everything.
The problem then becomes that you set lazy=false and didn't set fetch (so it is at the default, select), which instructs NH to issue a separate select.
So, playing devil's advocate, if default_lazy=false is so bad, why does NH allow you to do it?
I understand that in specific instances there might be a reason to override a relationship so that it isn't lazy loaded, but from what's said here there is no reason to have a default of default_lazy=false.
I'm reading a lot of NH stuff recently which says "x is bad, don't do it". It seems the tweaking and configuration of NH adds a fairly significant overhead to any project.
Mark,
Two reasons, backward compatibility and that there are scenarios where you want that.
Dmitry:
Yes, null. Because you didn't fetch it and lazy loading isn't loading it. That's why a proper eager loading strategy is essential, e.g. through prefetch paths, Includes or what have you.
(so you'd issue 3 queries, one for each entity type and merge them internally)
I agree @sql. For straight entities stored in the database, using nhib is great. Insert, update and query are very easy. As soon as you have any other query (most others), writing a sql statement and mapping that using linq is far easier.
Maybe I'm missing something, but you seem a bit harsh on lazy=false.
In cases where you know related entities are needed, lazy-loading has massive performance implications, especially in cases of smart client applications where server connectivity comes at great cost.
Your client was right in mapping its entities with lazy="false"; his only mistake was not setting the fetch mode.
I agree the decision not to lazy load has great implications and should be dealt with carefully, but that is a long way from getting goosebumps over it.
The main lesson we ought to learn from your client is not "we heart lazy", but "NHibernate is not magic". Or, in short, RTFM ;)
Sternr,
Nope, the problem with doing those things in the mapping is that they are global.
See my post about the different scenarios and how they do not apply