NHibernate: Avoid identity generator when possible
Before I start, I want to make clear that NHibernate fully supports the identity generator, and you can work with it easily and without pain.
There are, however, implications of using the identity generator in your system. Tuna does a great job in detailing them. The most common issue that you’ll run into is that identity breaks the notion of unit of work. When we use an identity, we have to insert the row into the database as soon as we save the entity, because that is the only way to get its id, instead of deferring the insert to a later time. It also renders batching useless.
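To make the unit of work point concrete, here is a minimal sketch in Python. It is not NHibernate code: SQLite's AUTOINCREMENT column stands in for a SQL Server identity column, and the table names, `save_with_identity`, and `UnitOfWork` are all hypothetical names made up for the illustration.

```python
import sqlite3
import uuid


def open_demo_db():
    """Create an in-memory database with one table per id strategy (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # AUTOINCREMENT is used here as a stand-in for a SQL Server identity column.
    conn.execute(
        "CREATE TABLE identity_orders (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
    )
    conn.execute("CREATE TABLE guid_orders (id TEXT PRIMARY KEY, name TEXT)")
    return conn


def save_with_identity(conn, name):
    """The id is only known after the INSERT runs, so saving must hit the database now."""
    cursor = conn.execute("INSERT INTO identity_orders (name) VALUES (?)", (name,))
    return cursor.lastrowid


class UnitOfWork:
    """Assigns ids on the client, queues the inserts, and sends them together on flush."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = []

    def save(self, name):
        new_id = str(uuid.uuid4())  # id is known without touching the database
        self.pending.append((new_id, name))
        return new_id

    def flush(self):
        # All queued rows go out in one batched call.
        self.conn.executemany(
            "INSERT INTO guid_orders (id, name) VALUES (?, ?)", self.pending
        )
        self.pending.clear()
```

With the identity table, every save is a round trip of its own. With client-assigned ids, nothing reaches the database until `flush()`, which is what lets an ORM defer and batch the work.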
And, just to put some additional icing on the cake, on SQL Server 2005 and SQL Server 2008, identity is broken.
I know that “select ain’t broken” most of the time, but this time, it appears it is :-)
We strongly recommend using some other generator strategy, such as guid.comb (similar to SQL Server’s NEWSEQUENTIALID()) or HiLo (which also generates human-readable values).
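For readers who have not met HiLo before, here is a rough sketch of the general idea in Python. It is a simplified illustration, not NHibernate's implementation: `FakeHiTable` is a hypothetical in-memory stand-in for the database table that hands out "hi" values, and `HiLoGenerator` and `max_lo` are names chosen for the example.

```python
class FakeHiTable:
    """Hypothetical stand-in for the database row that stores the next 'hi' value."""

    def __init__(self):
        self.value = 0
        self.round_trips = 0

    def reserve(self):
        # In a real system this would be an atomic read-and-increment in the database.
        self.round_trips += 1
        self.value += 1
        return self.value


class HiLoGenerator:
    """Hands out ids from a reserved block, going back for a new 'hi' only when it runs out."""

    def __init__(self, reserve_hi, max_lo=100):
        self._reserve_hi = reserve_hi
        self._max_lo = max_lo
        self._hi = None
        self._lo = max_lo + 1  # forces a reservation on the first call

    def next_id(self):
        if self._lo > self._max_lo:
            self._hi = self._reserve_hi()
            self._lo = 0
        value = self._hi * (self._max_lo + 1) + self._lo
        self._lo += 1
        return value
```

Each reserved "hi" covers `max_lo + 1` ids, so the generator touches the shared counter once per block rather than once per row, and two generators that share the same counter never overlap. The ids stay small integers, which is where the human-readable part comes from.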
More posts in "NHibernate" series:
- (19 Nov 2010) Complex relationships
- (27 Jun 2010) Streaming large result sets
- (25 May 2009) Why do we need to specify the query type twice?
- (20 Mar 2009) Avoid identity generator when possible
- (26 Mar 2007) Nullable DateTime Issues
- (08 Jan 2007) Fetching multiply collections in one roundtrip
Comments
Sometimes I feel you should be clearer in your posts. A couple more sentences wouldn't hurt your 25h-a-day schedule (you aren't sleeping anyway, so what ;-).
But the excellent article helped; now it's clearer (so no SQL Server generated identities :)
What is your take on key table identity generation (in some other O/RMs)?
Though he doesn't mention it (at least on my quick scan through it), identifiers like Guid.Comb (which I've been using on nearly everything lately, NHibernate or not) help solve the problem of migrating data that was created in a non-production environment into the production environment.
On several systems I've worked on, no one is allowed to load data directly into production. Instead, it has to be loaded, edited, etc. in a staging or other environment first. It rarely moves in nice, tidy packages. Instead, new records and changes move in odd batches that require knowing for certain which things are new, which are updates and which are collisions.
The worst of these systems went to immense effort to avoid collisions in the int identifiers that they were using. Worse, because they seemed unable to handle the foreign key problems, they actually went completely without foreign keys. The result was that when my team showed up, the database had 100+ tables and absolutely no foreign keys between them and only a handful of primary keys.
While I know that guids (much less guid.combs) have a non-zero chance of colliding, they solve enough problems well enough that I'm not going back to ints ever again.
Maybe you shouldn't even always use an explicit ID at all on entities. Let the objects themselves be the identifier and leave it to the persistence layer to handle the references (just like we let the CLR handle memory pointers, or Db4o handle ids).
You can let the repository handle ID tracking (IDictionary <guid,> ). Navigational collections between entities can be made using IDictionary as well. The point being that IDs used for relational primary keys are a navigational persistence concern rather than a concern of the Entity.
In a DDD-like system it can be more appropriate to solve retrieval of entities and distributed communication using natural IDs (e.g. Product #) or navigational mechanisms. I'll be exploring this in an upcoming blog post.
Why?
On a related note, I never really liked the SQL Server identity mechanism. Oracle has solved this in a much more polished way IMO (sequence, INSERT INTO ... RETURNING).
The link to the nhforge article seems broken.
Identity insert is the kiss of death when you are dealing with complex object graphs in NHibernate thick clients.
It defies the logic behind "FlushMode.Never", because any call to save an object will immediately result in an insert statement, without a call to flush or the commit of a transaction.
An ORM-style generator can generate the identifiers before objects are sent to the database. This is advantageous because you don’t need to go to the database in order to get the ID and only then set a relation based on that ID.
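As a rough illustration of that point, here is a Python sketch of a comb-style GUID and a parent/child relation wired up entirely in memory. The byte layout is a simplification chosen for the example (random bytes followed by a millisecond timestamp), not NHibernate's exact guid.comb format, and `comb_guid`, `Order`, and `OrderLine` are hypothetical names.

```python
import os
import time
import uuid
from dataclasses import dataclass, field


def comb_guid(now_ms=None):
    """Build a GUID whose last 6 bytes are a timestamp, so newer ids sort after older ones.

    Simplified layout for illustration: 10 random bytes + 6-byte big-endian milliseconds.
    The result does not carry standard UUID version bits.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    random_part = os.urandom(10)
    time_part = (now_ms & 0xFFFFFFFFFFFF).to_bytes(6, "big")
    return uuid.UUID(bytes=random_part + time_part)


@dataclass
class OrderLine:
    id: uuid.UUID
    order_id: uuid.UUID  # foreign key value, known before anything is saved
    product: str


@dataclass
class Order:
    id: uuid.UUID = field(default_factory=comb_guid)
    lines: list = field(default_factory=list)

    def add_line(self, product):
        # The parent's id already exists, so the relation needs no database round trip.
        line = OrderLine(id=comb_guid(), order_id=self.id, product=product)
        self.lines.append(line)
        return line
```

The whole graph, ids and foreign key values included, exists before a single statement is sent, so the inserts can wait until the unit of work is flushed.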
Petar Repac, SQL Server 2005 introduced the OUTPUT clause on the INSERT, UPDATE and DELETE statements, with which you can generate a result set of the changed rows. Getting the generated identity value with the OUTPUT clause is always safe.