Ayende @ Rahien

Refunds available at head office

A guide into OR/M implementation challenges: Custom Queries

Continuing to shadow Davy’s series about building your own DAL, this post is about the Executing Custom Queries.

The post does a good job of showing how you can create a good API on top of what is pretty much raw SQL. If you do have to create your own DAL, or need to use SQL frequently, please use something like that rater than using ADO.Net directly.

ADO.Net is not a data access library, it is the building blocks you use to build a data access library.

A few comments on queries in NHibernate. NHibernate uses a more complex model for queries, obviously. We have a set of abstractions between the query and generated SQL, because we are supporting several query options and a lot more mapping options. But while a large amount of effort goes into translating the user desires to SQL, there is an almost equivalent amount of work going into hydrating the queries. Davy’s support only a very flat model, but with NHibernate, you can specify eager load options, custom DTOs or just plain value queries.

In order to handle those scenarios, NHibernate tracks the intent behind each column, and know whatever a column set represent an entity an association, a collection or a value. That goes through a fairly complex process that I’ll not describe here, but once the first stage hydration process is done, NHibernate has a second stage available. This is why you can write queries such as “select new CustomerHealthIndicator(c.Age,c.SickDays.size()) from Customer c”.

The first stage is recognizing that we aren’t loading an entity (just fields of an entity), the second is passing them to the CustomerHealthIndicator constructor. You can actually take advantage of the second stage yourself, it is called Result Transformer, and you can provide you own and set it a query using SetResultTransformer(…);

Comments

josh
08/28/2009 09:18 PM by
josh

"NHibernate tracks the intent behind each column"

I'm coming to appreciate just how hard that part can be. Thanks to you, Tuna, and everyone else for the work on NH.

Andrey Shchekin
08/28/2009 10:14 PM by
Andrey Shchekin

string sql? I have a re-invented DAL/OR used on older projects, and queries are one of the most interesting functionality pieces.

My final solution was sql-ish but still had Intellisense, etc:

Prepare.Update(Tables.Person)

.Set(Tables.Person.Salary).To(Tables.Person.Salary * 2)

.Where(Tables.Person.Age > 10)

which was unperfect (there was no Linq then), but allowed Update/Insert queries and much easier query composition than strings or even Linq allows.

This allows to write query transformers (which take customer's search request and turn it into a query) in an sql-independent terms, and it also actually allows to optimize a query for a certain SQL engine before it gets sent (in some cases evaluate the query before it gets sent).

I do not really understand why strings (HQL or SQL) are not considered a bad smell -- they are not auto-refactorable or searchable and require tests that have access to either original schema or a memory DB to be verified.

Mr_Simple
08/29/2009 01:38 AM by
Mr_Simple

SQL isnt a bad smell because it's simple - mapped against your final solution - I'd rather just fire a simple SQL UPDATE statement.

Rafal
08/29/2009 06:35 AM by
Rafal

@Andrey

Nice idea, it has also been implemented in Sooda ORM (sooda.sourceforge.net) long time before Linq. In Sooda it is used only for building where clauses, but a text query language (SOQL) is also supported. The nice part is that both approaches generate the same query AST, not raw sql, so it allows for advanced query post-processing - for example transparently adding security restrictions to all queries executed by user. NHibernate excels in many features, but I think lags behind in query API (the criteria API is ugly, HQL AST has been introduced only recently) - I could not believe it when I tried to use NH after several years of working with Sooda. Hope the situation has improved with recent additions of Linq support and AST, but haven't yet tested these features.

Andrey Shchekin
08/29/2009 08:30 AM by
Andrey Shchekin

__SQL isnt a bad smell because it's simple - mapped against your final solution - I'd rather just fire a simple SQL UPDATE statement.

That is if you do not have any variables (which require both creating a place holder in SQL and adding a variable) and if you do not have to build WHERE from several independent parts (that can have their own variables). And, as Rafal said, this is allows for much better security, for example-- you can write an interceptor that checks what fields are going to be updated and whether this is allowed for current user.

More importantly, plain SQL will be untestable without real DB or memory DB which has all features SQL requires in the same syntax as real DB.

SqlDataSource is also very simple, but it does not make it smell good.

Ajai Shankar
08/30/2009 05:32 AM by
Ajai Shankar

Hi Ayende

Have a question about eager loading...

If I have an entity with more than one association (person having many addresses and contacts) when trying to eager fetch both of them - duplicate entries seem to be returned due to left joins.

Same is the case when eager fetching both child & grand child collections (person having contacts each having many phones)

What would be the best way to eager load them without getting duplicates?

If I understand correctly distinct root entity transformer eliminates duplicates only on the root...

And today came across fabiomaulo.blogspot.com/.../...mance-analisys.html which has session.GetNamedQuery("NHCompany.All").List <nhcompany()

that loads the entire graph in a single SQL!

How is that possible?

Thanks

Ajai

Ayende Rahien
08/30/2009 06:50 AM by
Ayende Rahien

Ajai,

The duplicate results because of a join are expected. Use SetResultTransofmer( new DistinctRootEntityResultTransformer())

There wouldn't be any duplicates inside the collections.

Not sure how Fabio is doing this, but take into account that there is a difference between loading the graph and loading the graph connected

Just nobody
08/31/2009 09:48 AM by
Just nobody

"The duplicate results because of a join are expected. Use SetResultTransofmer( new DistinctRootEntityResultTransformer())

There wouldn't be any duplicates inside the collections."

What about bag collections in combination with eager fetching? I don't believe that DistinctRootEntityResultTransformer would eliminate duplicate rows generated by left join from second level of the entity graph.

Ayende Rahien
08/31/2009 11:57 AM by
Ayende Rahien

Nobody,

To get duplicate results in a bag you would have to create a triple join.

Between the entities, the bag and another collection.

NH doesn't recommend eagerly loading several collections at one shot, because of the problems with Cartesian products.

Just nobody
09/01/2009 03:01 AM by
Just nobody

"To get duplicate results in a bag you would have to create a triple join."

This is exactly the case described by Ajai.

"NH doesn't recommend eagerly loading several collections at one shot, because of the problems with Cartesian products."

Correct...but sometimes you need "all" tree loaded. Until now, I identified 3 solutions:

  1. Use multicriteria/multiquery to get subsets.

  2. Eliminate duplicate rows with a custom ResultTransformer

  3. Iterate each level and use Initialize()

1 is the best, but it cant be used in all scenario

3 is a pain for deeper hierarchies.

Other suggestions/analyses?

Ayende Rahien
09/01/2009 07:40 AM by
Ayende Rahien

Nobody,

4) issue multiple queries without multi criteria.

That would be faster.

5) design your model so you don't need that.

Just nobody
09/01/2009 08:45 AM by
Just nobody

"5) design your model so you don't need that."

Yup...this is the one :p

About #4: NH docs says:

"An interesting usage of this feature is to load several collections of an object in one round-trip, without an expensive cartesian product (blog * users * posts).

Blog blog = s.CreateMultiQuery()

.Add("select b from Blog b left join fetch b.Users where b.Id = :id")

.Add("select b from Blog b left join fetch b.Posts where b.Id = :id")

.SetInt32("id", 123)

.UniqueResult

<blog();"

Could you give us more details about how to load the blog with multiple queries(in a faster way)?

Ayende Rahien
09/01/2009 08:52 AM by
Ayende Rahien

Nobody,

It is just executing them one at a time:

Blog blog = s.CreateQuery("select b from Blog b left join fetch b.Users where b.Id = :id")

.UniqueResult();

s.CreateQuery("select b from Blog b left join fetch b.Posts where b.Id = :id").List();

Just nobody
09/01/2009 09:39 AM by
Just nobody

I didn't know that the second query will hydrate the blog instance with his posts. Still, it's not clear to me if this is faster than multi criteria: it makes 2 round-trips. Or I misunderstand you and this option is for cases where multi criteria is not available?

Thank you for all

Ayende Rahien
09/01/2009 09:42 AM by
Ayende Rahien

Nobody,

In this case, those are two round trips, yes.

So same as with lazy loading.

When you are loading more than a single object, it matters, a lot.

Nick Aceves
09/01/2009 05:14 PM by
Nick Aceves

"SQL isnt a bad smell because it's simple - mapped against your final solution - I'd rather just fire a simple SQL UPDATE statement."

You're right. I'm going to quit using C# and start programming in machine code now... after all, how can you get simpler than a string of 1's and 0's?

I'm being a little bit extreme, but you get the point. SQL is the machine code of data access. You should use an abstraction on top of it whenever possible. It'll be safer, cleaner, and save you time (both short-term and long-term).

Comments have been closed on this topic.