NHibernate – The difference between Get, Load and querying by id
One of the more common mistakes that I see people making with NHibernate is related to how they load entities by primary key. This matters because there are important differences between the three options.
The most common mistake that I see is using a query to load by id, in particular when using Linq for NHibernate.
var result = ( from customer in s.Linq<Customer>() where customer.Id == customerId select customer ).FirstOrDefault();
Every time that I see something like that, I wince a little inside. The reason for that is quite simple. This is doing a query by primary key. The key word here is a query.
This means that we have to hit the database in order to get a result for this query. Unless you are using the query cache (which by default you won’t), this forces a query on the database, bypassing both the first level identity map and the second level cache.
Get and Load are here for a reason: they provide a way to get an entity by primary key. That matters in several respects; most importantly, it means that NHibernate can apply quite a few optimizations to the process.
But there is another side to that, there is a significant (and subtle) difference between Get and Load.
Load will never return null. It will always return an entity or throw an exception. Because that is the contract that we have with it, it is permissible for Load to not hit the database when you call it; it is free to return a proxy instead.
Why is this useful? Well, if you know that the value exists in the database, and you don’t want to pay for an extra select just to get it, but you still need a reference to it so that you can attach it to another object, you can use Load to do so:
s.Save(
    new Order
    {
        Amount = amount,
        // Load can return a proxy here, so no select is issued for the customer
        Customer = s.Load<Customer>(1)
    }
);
The code above will not result in a select to the database, but when we commit the transaction, we will set the CustomerID column to 1. This is how NHibernate maintains the OO facade while giving you the same optimization benefits as working directly with the low-level API.
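To show where the "entity or exception" contract actually surfaces, here is a minimal sketch. It assumes a hypothetical Customer entity with virtual members (so that it can be proxied) and a Name property that is not in the original example. With a lazy-mapped class, the exception is typically raised when the proxy is first initialized, not when Load is called.

```csharp
using NHibernate;

// Hypothetical entity for illustration; members are virtual so NHibernate can proxy it.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public static class LoadSketch
{
    // Returns the customer's name, or null when no row matches the id.
    public static string TryReadName(ISession s, int customerId)
    {
        // Load is free to hand back an uninitialized proxy, so usually no select runs here.
        var customer = s.Load<Customer>(customerId);

        try
        {
            // Reading a non-identifier property forces the proxy to initialize;
            // this is the point where the select is issued.
            return customer.Name;
        }
        catch (ObjectNotFoundException)
        {
            // The row did not exist, so Load's contract was broken by the data.
            return null;
        }
    }
}
```

Catching the exception like this is only meant to show when the failure happens. If a missing row is an expected case, Get is the better fit.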
Get, however, is different. Get will return null if the object does not exist. Since this is its contract, it must return either the entity or null, so it cannot give you a proxy if the entity is not known to exist. Get will usually result in a select against the database, but it will check the session cache and the 2nd level cache first to get the values first.
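As a small sketch of the Get contract, assuming the same kind of hypothetical Customer entity with a Name property (neither the property nor the helper method is from the original example):

```csharp
using NHibernate;

// Hypothetical entity for illustration.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public static class GetSketch
{
    public static string DescribeCustomer(ISession s, int customerId)
    {
        // Get consults the session cache and the second level cache (if one is configured)
        // before it falls back to a select against the database.
        var customer = s.Get<Customer>(customerId);

        // Get returns null for a missing row, so a plain null check is enough.
        return customer == null
            ? "no such customer"
            : "found " + customer.Name;
    }
}
```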
So, next time that you need to get some entity by its primary key, just remember the differences…
Comments
should the where clause in the first snippet
("where customer.Id = customerId")
really be:
where customer.Id == customerId
Would this be caught by NHProf?
So,
Session.Delete(Session.Load<Customer>(1))
Only goes to the DB once? Cool. I thought it went twice, which always bugged me.
MF,
Yes, it should be
Andres,
Not currently, but that is a good suggestion.
Ng,
That depends on a lot of things, mostly if you have cascade associations.
Get will bring back an initialized entity and will eager load all associations?
Or loading associations will depend on your explicit mappings?
Valeriu,
Get works based on your mapping, it doesn't do any eager loading outside of what is defined there
Another difference -
Query based selects such as the first example will include all defined and active filters on the entity.
Load / Get will IGNORE all filters for that entity. Filters set up on the entity's relationships will still be used when loading sets/references on the entity returned from Load / Get.
This can cause quite a bit of headache if you're making heavy use of filters and not expecting this behavior.
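A rough sketch of that difference, assuming a filter named "activeOnly" with an "isActive" parameter has been declared in the mappings for Customer (both names are made up for illustration):

```csharp
using NHibernate;

// Hypothetical entity for illustration; the filter itself lives in the mapping.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual bool IsActive { get; set; }
}

public static class FilterSketch
{
    // Returns true when the row is hidden from the query by the filter
    // but still reachable through Get.
    public static bool IsHiddenOnlyFromQueries(ISession s, int customerId)
    {
        // Assumed filter name and parameter; they must match the mapping.
        s.EnableFilter("activeOnly").SetParameter("isActive", true);

        // A query by id goes through the query pipeline, so the enabled filter applies.
        var viaQuery = s.CreateQuery("from Customer c where c.Id = :id")
            .SetParameter("id", customerId)
            .UniqueResult<Customer>();

        // A lookup by primary key does not apply the entity's filters.
        var viaGet = s.Get<Customer>(customerId);

        return viaQuery == null && viaGet != null;
    }
}
```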
I have seen this before:
s.Save(
);
I think it works, but I assume that's not recommended?
Will,
I haven't even considered that, but of course, you are right.
Neil,
Yuck, that is likely to cause "an object with the same id but with different reference is already associated with the current session"
Very enlightening. How does this all relate to custom fetching strategies? I can't see a way to apply them using anything but a criteria/query.
Rob,
It doesn't apply. If you need custom fetching, you need to use a query.
If you consider the reasons for Get / Load, you would see that it makes sense that a custom fetching strategy would require a query.
There is no way for Get or Load to handle that.
That's what I thought. Just wanted to make sure that I wasn't missing something.
I know that you said that "if you know that the value exist in the database" but, thinking about concurrency, what if in your call it did exist but another user has deleted it before you make your call, will NHibernate still check with a select before insert or trust you and hope that you have used a FK constraint in the DB otherwise?
Anthony,
That is what we have FKs for
Makes sense, but only if the object has already been loaded into the session I presume? I have seen it on some projects I worked on and always thought it looked a bit smelly! Having read your post I can see that calling Load is definitely the way to do it.
Cheers
Neil
Neil,
Yes, if it is already there, it might cause that.
Another nasty side effect that can happen is if there is cascade defined on the object, which might cause NH to initialize all columns to null / default because of this trick
Would it be bad to always use Load() and never use Get()? Or is there some scenario where you should use Get() over Load()? It sounds like Load() lets NHibernate figure out how to best handle the loading.
Jon,
You should use Get() if you don't know that the entity exists.
Because Load will always return a value, Get will return null if the value does not exist
Perhaps this will show I do not use NH (yet!), but how do you handle a case where you are 95% sure the object is in the database (i.e. a blog post) and it should be loaded from the cache instead of from the database (it will not change frequently)? How do you handle the 5% that is not in the database (deleted, wrong id through url hacking, google finds an old url, etc.)? It feels wrong that you have to catch the specific Load exception instead of checking for null in order to return a nice 404 instead of an awful 502 ...
// Ryan
Ryan,
You use a Get
Aaah, I somehow had the impression Get never checked the cache, but upon rereading your post this sentence "Get will usually result in a select against the database, but it will check the session cache and the 2nd level cache first to get the values first." makes me happy again :)
// Ryan
Which of the 3 ways is best to load an Author by Id AND his blogs (mapped one-to-many, lazy)?
Note: author.Blogs is lazy because 95% of the time the blogs are not needed; this scenario is about the other 5%, when the blogs must be available without making another round-trip to the database.
In general: how to best load an entity by Id and a series of associations at the same time?
Suiden,
HQL or Criteria are the things to use
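For instance, a minimal HQL sketch, assuming Author and Blog entities shaped roughly as described in the question (the class and property names here are guesses, not taken from a real mapping):

```csharp
using System.Collections.Generic;
using NHibernate;
using NHibernate.Transform;

// Hypothetical entities for illustration.
public class Blog
{
    public virtual int Id { get; set; }
    public virtual string Title { get; set; }
}

public class Author
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
    public virtual IList<Blog> Blogs { get; set; }
}

public static class FetchSketch
{
    // Loads one author and its blogs in a single round-trip.
    public static Author LoadAuthorWithBlogs(ISession s, int authorId)
    {
        return s.CreateQuery("from Author a left join fetch a.Blogs where a.Id = :id")
            .SetParameter("id", authorId)
            // The join yields one row per blog; collapse them to a single Author.
            .SetResultTransformer(Transformers.DistinctRootEntity)
            .UniqueResult<Author>();
    }
}
```

The mapping stays lazy for the common case, and only this query pays for the join.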