NHibernate's Xml In
I wrote it because of a particular problem that I have run into, which is not something that I have heard much discussion about. In my application, I need to query the database about a certain data, and the best way to do that would be using an in query. For the purpose of discussion, I want to find all the customers associated with the current user.
Unfortantely, I can't do something as simple as "Where.Customer.User == CurrentUser", because a customer may be associated to a user in many complex and interesting ways (the end result is a 3 pages query, btw). Therefor, I make the calculation of who are the relevant customers for a user when they login, and cache that.
So, I need to ask the database for all the customers assoicated with the user, and since I already know about the user's customers. In() was a natrual way to go. The problem is that I actually began to run into SQL Server 2,100 parameters per query limit when important users (who has a lot of associations) started to use that.
Can you say, major stumbling block? There are several solutions for that, and the one I choose is to extend NHibernate to perform an IN query on an XML datatype, as described here. You can see the implementation here.
Why use an IN on XML instead of a join against the data? I can send it to the database using BulkCopy and then join against that very easily, no? (I describe one such way here)
Using Bulk Copy & Join approach would probably turn out to be faster than an IN on an XPath (haven't tested, though). But as it turn out, I had several reasons for that:
- Using the BulkCopy & Join approach would mean that I need to perform two DB Queries, instead of one.
- Someone should be responsible for clearing the joined table at one point, that is another thing to deal with.
- It requires a two steps process, with no easy way to back out of that if there is a small amount of items that we want to check.
- Caching
Using the BulkCopy & Join basically means that I have no real way to avoid hitting the database altogether. Using this approach, I am merely adding a (potentially large) parameter to the query, and let NHibernate deals with everything else.
The way you use this is simple:
session.CreateCriteria(typeof(Customer))
.Add(XmlIn.Create("id", potentiallyLargeAmount))
List();
It will automatically default to normal IN behavior if you are not running on SqlServer 2005 or if the amount of items that you are checking is smaller than 100.
Comments
I don't see why a subquery wouldn't work there, but maybe I don't understand the problem...
SELECT * FROM CustomerOrders
WHERE CustomerID IN (
)
That would work with 2100 results from the inner query, wouldn't it?
Another way would be to coalesce the different columns into a single comma separated string and perform IN on that... can't think of the query off the top of my head, but that's possible as well.
Ben, how are you going to get the data for the subquery in the first place?
That is really a monster one, and I would rather not run it overmuch.
I thought about using string splitting, but that seems like a hack .The XML is also a hack, but less so.
I'm going to hit a similar situation this iteration.. a treeview where the user can select multiple leaves (of users), then create an aggregate report off the selected users...
This is way cool!
Why not use literals instead of parameters? So instead of '... IN (@P1, @P2, @P3, ....)' have it emit '... IN (213, 267, 3080, ...)'. I don't know much about NHibernate, but I would be surprised if one could not hook into its sql generation infrastructure.
Yves,
There are several reasons, the first is that it can blow up very fast, because NH is keeping cached plans.
The second is the DB compiled plans cache.
The third and most important is that you now opens yourself to SQL vulnerability.
Interesting technique and nice that you use the simpler method if less than 2100 parameters or a non-SQL2K5 database. My only complaint is the XmlIn naming. You are letting an implementation detail leak into your naming. Shouldn't this be called something like "In" so that similar hacks, though not necessarily XML-based, could be implemented for other databases?
Initially I called in BigIn, but that sounds... not nice.
Comment preview