Performance and Explicit Domain Models
Udi talks about limitations for DDD as a result of performance constraints. He says:
Ayende and I had an email conversation that started with me asking what would happen if I added an Order to a Customer’s “Orders” collection, when that collection was lazy loaded. My question was whether the addition of an element would result in NHibernate hitting the database to fill that collection. His answer was a simple “yes”. In the case where a customer can have many (millions) of Orders, that’s just not a feasible solution.
He then goes on to describe several solutions for the issue, including [ThreadStatic] events, something I had never considered but which has some interesting possibilities.
I wanted to add another option to the mix. You can extend NHibernate so calling customer.Orders.Add( new Order() ) will not force a load of the collection if the collection is not loaded, but would still add the new order to the unit of work.
What is the caveat? It violates the principle of least surprise.
My NHibernate.Generics collection used to do just that. I did it in the belief that it would improve performance by saving unneeded trips to the database. I paid for that performance in grief when I found bugs where this failed:
    var theOrder = new Order();
    customer.Orders.Add(theOrder);
    // sometime later, still in the same Unit of Work
    if (customer.Orders.Contains(theOrder) == false)
    {
        // do something bad
    }
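To make the failure mode concrete, here is a minimal, self-contained sketch of a collection that queues additions instead of loading. It is not the NHibernate.Generics code and not NHibernate's API: DeferredAddCollection, ContainsNaive and the loadFromDatabase delegate are hypothetical names used only for illustration. It shows one way such a bug could arise (a Contains that looks only at loaded data) next to a version that also checks the queued additions.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a lazily loaded collection (not NHibernate's API).
public class DeferredAddCollection<T>
{
    private readonly Func<IEnumerable<T>> loadFromDatabase; // simulates the query
    private readonly List<T> pendingAdds = new List<T>();   // adds made before loading
    private List<T> loaded;                                  // null until the first load

    public DeferredAddCollection(Func<IEnumerable<T>> loadFromDatabase)
    {
        this.loadFromDatabase = loadFromDatabase;
    }

    public bool IsLoaded => loaded != null;

    // Add never triggers a load: items are queued while the collection is unloaded.
    public void Add(T item)
    {
        if (IsLoaded) loaded.Add(item);
        else pendingAdds.Add(item);
    }

    // The surprising variant: it only looks at what has already been loaded,
    // so an item added before loading is reported as missing.
    public bool ContainsNaive(T item)
    {
        return IsLoaded && loaded.Contains(item);
    }

    // Checks the queued additions first and loads only if it has to.
    public bool Contains(T item)
    {
        if (pendingAdds.Contains(item)) return true;
        EnsureLoaded();
        return loaded.Contains(item);
    }

    // Loads the existing rows, then merges in whatever was queued.
    private void EnsureLoaded()
    {
        if (IsLoaded) return;
        loaded = new List<T>(loadFromDatabase());
        loaded.AddRange(pendingAdds);
        pendingAdds.Clear();
    }
}
```

With a collection created as new DeferredAddCollection<string>(() => new[] { "order-1" }) and a single Add("order-2"), ContainsNaive("order-2") returns false while Contains("order-2") returns true, and neither call loads the collection. Every other read operation (Count, enumeration, Remove) needs the same care, which is where the surprise tends to creep in.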
Comments
I don't think your initial response to Udi was correct. If the underlying collection is a bag, then the collection shouldn't be initialized on an add. See here:
http://www.hibernate.org/hib_docs/nhibernate/1.2/reference/en/html/performance.html#performance-collections-mostefficentinverse
I'm not sure if this method has the same caveat as your code snippet.
That being said, this should act as a minor efficiency boost. If the collection is so large that NHibernate can't deal with it efficiently, then I think it is a really bad idea to map this direction of the association. It might be buying transparent persistence, but at the cost of having a collection that you shouldn't use for anything other than inserting, which is the polar opposite of transparent. I'd probably encapsulate the addition on a customer repository.
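A rough sketch of what encapsulating the addition on a repository might look like. The names ICustomerRepository, AddOrder and InMemoryCustomerRepository are hypothetical, and the in-memory list only stands in for the database; a real implementation would presumably save the order through the NHibernate session instead.

```csharp
using System.Collections.Generic;

public class Order
{
    public int CustomerId { get; set; }
    public decimal Total { get; set; }
}

// Hypothetical repository: the only way to attach an order to a customer.
public interface ICustomerRepository
{
    void AddOrder(int customerId, Order order);
}

// In-memory stand-in so the sketch runs without a database.
public class InMemoryCustomerRepository : ICustomerRepository
{
    private readonly List<Order> insertedOrders = new List<Order>();

    public IReadOnlyList<Order> InsertedOrders => insertedOrders;

    public void AddOrder(int customerId, Order order)
    {
        // Set the foreign key side directly; no Orders collection is touched or loaded.
        order.CustomerId = customerId;
        insertedOrders.Add(order);
    }
}
```

The point of the design is that Customer no longer exposes an Orders collection at all, so nothing can accidentally enumerate it.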
"In the case where a customer can have many (millions) of Orders"... one shouldn't expose a naive customer.Orders collection in the first place, DDD or not.
This isn't a lazy loading problem, this is a fundamental modeling error.
I can see where you're going with respect to the issue you raise in your code example, but it would be better served by an example that hasn't already been invalidated.
Trisk,
I rarely use bags, but you are correct in your point, and I should have mentioned that.
Jeremy,
Agreed. Take a look at my post about "Architecting for Performance", which deals with this issue.
Even if you don't expose the "Orders" collection but make use of it internally for business rules, you still have to deal with this issue. For instance, a "Preferred Customer" is a Customer whose lifetime value is greater than 100K, and preferred customers get a 10% discount on all orders.
There are other ways of implementing these rules in the Domain Model such that you don't have to load all the Orders to know if a customer is preferred or not.
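One such way, sketched here as an assumption rather than as the approach the comment has in mind, is to keep a running LifetimeValue on the Customer and update it whenever an order is recorded. The names RecordOrder, DiscountFor and the two constants are hypothetical; the 100K threshold and 10% rate come from the rule described above.

```csharp
using System;

public class Order
{
    public Order(decimal total) { Total = total; }
    public decimal Total { get; }
}

public class Customer
{
    private const decimal PreferredThreshold = 100000m;  // "greater than 100K"
    private const decimal PreferredDiscountRate = 0.10m; // 10% discount

    // Stored with the customer itself, so no Orders need to be loaded to read it.
    public decimal LifetimeValue { get; private set; }

    public bool IsPreferred => LifetimeValue > PreferredThreshold;

    // Called when a new order is placed; keeps the running total current.
    public void RecordOrder(Order order)
    {
        LifetimeValue += order.Total;
    }

    // Discount that would apply to an order of the given amount.
    public decimal DiscountFor(decimal amount)
    {
        return IsPreferred ? amount * PreferredDiscountRate : 0m;
    }
}
```

The trade-off is that the total is now denormalized state: it has to be updated in the same unit of work as the order insert, or it drifts from the real order history.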
I'll check out the link about the bag.
Wouldn't that be rather easily fixed by having your collection class actually add the entities as soon as its underlying NHibernate collection is loaded from the database?
If we ignore the problem with millions of rows in this case, of course.