Ayende @ Rahien

NHibernate – The difference between Get, Load and querying by id

One of the more common mistakes that I see people making with NHibernate is related to how they load entities by primary key. It matters because there are important differences between the three options: Get, Load and querying by id.

The most common mistake that I see is using a query to load by id, in particular when using Linq for NHibernate:

var customer = (
	from customer in s.Linq<Customer>()
	where customer.Id == customerId
	select customer
	).FirstOrDefault();

Every time that I see something like that, I wince a little inside. The reason is quite simple: this is doing a query by primary key, and the key word here is query.

This means that we have to hit the database in order to get a result for this query. Unless you are using the query cache (which by default you won’t be), this forces a query against the database, bypassing both the first level identity map and the second level cache.

Get and Load are here for a reason: they provide a way to get an entity by primary key. That matters in several respects; most importantly, it means that NHibernate can apply quite a few optimizations to the process.

But there is another side to that: there is a significant (and subtle) difference between Get and Load.

Load will never return null. It will always return an entity or throw an exception. Because that is the contract that we have with it, it is permissible for Load not to hit the database when you call it; it is free to return a proxy instead.
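Here is a minimal sketch of that contract, not a definitive implementation. It assumes an already open ISession and a mapped Customer entity; the Customer class, its Name property and the LoadSketch helper are hypothetical names introduced only for illustration. It also assumes the class is mapped as lazy (the default) and that the entity is not already in the session.

```csharp
using System;
using NHibernate;

// Hypothetical entity, used only for this sketch. Members are virtual so that
// NHibernate can generate a proxy subclass for it.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public static class LoadSketch
{
    public static void Show(ISession s, int customerId)
    {
        // Under the assumptions above, this hands back an uninitialized proxy
        // without issuing a select.
        Customer customer = s.Load<Customer>(customerId);
        Console.WriteLine(NHibernateUtil.IsInitialized(customer));

        try
        {
            // Reading a non-identifier property forces the proxy to initialize,
            // and that is when the select actually runs.
            Console.WriteLine(customer.Name);
        }
        catch (ObjectNotFoundException)
        {
            // The row was not there after all: Load keeps its "never null"
            // contract by throwing instead.
            Console.WriteLine("No customer with id " + customerId);
        }
    }
}
```

Note that the exception, if any, surfaces at the point where the proxy is first used, which may be far from the Load call itself.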

Why is this useful? Well, if you know that the value exists in the database, and you don’t want to pay for an extra select just to get it, but you need the value so that you can add a reference to it from another object, you can use Load to do so:

s.Save(
	new Order
	{
		Amount = amount,
		customer = s.Load<Customer>(1)
	}
);

The code above will not result in a select to the database, but when we commit the transaction, we will set the CustomerID column to 1. This is how NHibernate maintains the OO facade while giving you the same optimization benefits as working directly with the low level API.

Get, however, is different. Get will return null if the object does not exist. Since this is its contract, it must return either the entity or null, so it cannot give you a proxy if the entity is not known to exist. Get will usually result in a select against the database, but it will check the session cache and the 2nd level cache first to get the values first.
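As a rough sketch of how that looks in calling code, assuming an open ISession and a mapped Customer entity (the Customer class, its Name property and the GetSketch helper are hypothetical names, not part of NHibernate):

```csharp
using NHibernate;

// Hypothetical entity, used only for this sketch.
public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

public static class GetSketch
{
    // Returns the customer's name, or null when no such customer exists.
    public static string FindCustomerName(ISession s, int customerId)
    {
        // Get looks in the session cache first, then in the second level cache
        // (if one is configured), and only then issues a select.
        Customer customer = s.Get<Customer>(customerId);

        // A missing row is reported as null, not as an exception,
        // so a plain null check is all the caller needs.
        if (customer == null)
        {
            return null;
        }

        return customer.Name;
    }
}
```

This is the shape to prefer over the Linq query shown earlier whenever all you have is the primary key and you are not sure the row exists.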

So, next time that you need to get some entity by its primary key, just remember the differences…

Comments

04/30/2009 07:14 AM by MF

should the where clause in the first snippet

("where customer.Id = customerId")

really be:

where customer.Id == customerId

04/30/2009 07:34 AM by Anders

Would this be caught by NHProf?

04/30/2009 07:45 AM by NG

So,

Session.Delete(Session.Load<Customer>(1))

Only goes to the DB once? Cool. I thought it went twice, which always bugged me.

04/30/2009 09:07 AM by Ayende Rahien

Anders,

Not currently, but that is a good suggestion.

Ng,

That depends on a lot of things, mostly if you have cascade associations.

04/30/2009 10:09 AM by Valeriu

Get will bring back an initialized entity and will eager load all associations?

Or loading associations will depend on your explicit mappings?

04/30/2009 10:10 AM by Ayende Rahien

Valeriu,

Get works based on your mapping; it doesn't do any eager loading outside of what is defined there.

04/30/2009 02:03 PM by Will Shaver

Another difference -

Query based selects such as the first example will include all defined and active filters on the entity.

Load / Get will IGNORE all filters for that entity. Filters set up on the entity's relationships will still be used when loading sets/references on the entity returned from Load / Get.

This can cause quite a bit of headache if you're making heavy use of filters and not expecting this behavior.

04/30/2009 03:36 PM by Neil Mosafi

I have seen this before:

s.Save(
	new Order
	{
		Amount = amount,
		customer = new Customer { Id = 1 }
	}
);

I think it works, but I assume that's not recommended?

04/30/2009 04:23 PM by Ayende Rahien

Will,

I haven't even considered that, but of course, you are right.

04/30/2009 04:26 PM by Ayende Rahien

Neil,

Yuck, that is likely to cause "an object with the same id but with different reference is already associated with the current session"

04/30/2009 05:01 PM by Rob

Very enlightening. How does this all relate to custom fetching strategies? I can't see a way to apply them using anything but a criteria/query.

04/30/2009 05:07 PM by Ayende Rahien

Rob,

It doesn't apply. If you need custom fetching, you need to use a query.

If you consider the reasons for Get / Load, you will see that it makes sense that a custom fetching strategy would require a query.

There is no way for Get or Load to handle that.

04/30/2009 05:13 PM by Rob

That's what I thought. Just wanted to make sure that I wasn't missing something.

04/30/2009 05:51 PM by Anthony Dewhirst

I know that you said "if you know that the value exists in the database", but thinking about concurrency: what if it did exist when you checked, but another user deleted it before you made your call? Will NHibernate still check with a select before the insert, or will it trust you and hope that you have an FK constraint in the DB?

04/30/2009 05:53 PM by Ayende Rahien

Anthony,

That is what we have FKs for.

04/30/2009 10:32 PM by Neil Mosafi

Makes sense, but only if the object has already been loaded into the session I presume? I have seen it on some projects I worked on and always thought it looked a bit smelly! Having read your post I can see that calling Load is definitely the way to do it.

Cheers

Neil

05/01/2009 06:30 AM by Ayende Rahien

Neil,

Yes, if it is already there, it might cause that.

Another nasty side effect can happen if there is a cascade defined on the object: NH might then set all columns to null / default because of this trick.

05/04/2009 12:00 AM by Jon Kruger

Would it be bad to always use Load() and never use Get()? Or is there some scenario where you should use Get() over Load()? It sounds like Load() lets NHibernate figure out how to best handle the loading.

05/04/2009 02:07 AM by Ayende Rahien

Jon,

You should use Get() if you don't know that the entity exists.

Load will always return a value; Get will return null if the value does not exist.

05/06/2009 09:58 AM by Ryan Heath

Perhaps this will show I do not use NH (yet!), but how do you handle a case where you are 95% sure the object is in the database (e.g. a blog post) and it had better be loaded from the cache instead of from the database (it will not change frequently)? How do you handle the 5% that is not in the database (deleted, wrong id through url hacking, google finds an old url, etc.)? It feels wrong that you have to catch the specific Load exception instead of checking for null in order to return a nice 404 instead of an awful 502 ...

// Ryan

05/06/2009 10:48 AM by Ryan Heath

Aaah, I somehow had the impression Get never checked the cache, but upon rereading your post this sentence "Get will usually result in a select against the database, but it will check the session cache and the 2nd level cache first to get the values first." makes me happy again :)

// Ryan

06/03/2009 07:04 AM by Suiden

Which of the 3 ways is best to load an Author by Id AND his blogs (mapped one-to-many, lazy)?

Note: author.Blogs is lazy because 95% of the time the blogs are not needed on the fly; this scenario is about the other 5%, when blogs must be available without making another round-trip to the database.

In general: how to best load an entity by Id and a series of associations at the same time?

06/03/2009 10:49 AM by Ayende Rahien

Suiden,

HQL or Criteria are the things to use
