Ayende @ Rahien

Refunds available at head office

A bug story

I run into a bug today with the way NHibernate dealt with order clauses. In particular, it can only happen if you are:

  • Use parameters in the order clause
  • Using SQL Server 2005
  • Using a limit clause

If you met all three conditions, you would run into a whole host of problems (in particular, NH-1527 and NH-1528). They are all fixed now, and I am writing this post as the build run. The underlying issue is that SQL Server 2005 syntax for paging is broken, badly.

Let us take the this statement:

SELECT   THIS_.ID         AS ID0_0_,
         THIS_.AREA       AS AREA0_0_,
         THIS_.PARENT     AS PARENT0_0_,
         THIS_.PARENTAREA AS PARENTAREA0_0_,
         THIS_.TYPE       AS TYPE0_0_,
         THIS_.NAME       AS NAME0_0_
FROM     TREENODE THIS_
WHERE    THIS_.NAME LIKE ?
         AND THIS_.ID > ?
ORDER BY (SELECT THIS_0_.TYPE AS Y0_
          FROM   TREENODE THIS_0_
          WHERE  THIS_0_.TYPE = ?) ASC

And let us say that we want to get a paged view of the data. How can we do it? Here is the code:

SELECT   TOP 1000 ID0_0_,
                  AREA0_0_,
                  PARENT0_0_,
                  PARENTAREA0_0_,
                  TYPE0_0_,
                  NAME0_0_
FROM     (SELECT ROW_NUMBER()
                   OVER(ORDER BY __HIBERNATE_SORT_EXPR_0__) AS ROW,
                 QUERY.ID0_0_,
                 QUERY.AREA0_0_,
                 QUERY.PARENT0_0_,
                 QUERY.PARENTAREA0_0_,
                 QUERY.TYPE0_0_,
                 QUERY.NAME0_0_,
                 QUERY.__HIBERNATE_SORT_EXPR_0__
          FROM   (SELECT THIS_.ID         AS ID0_0_,
                         THIS_.AREA       AS AREA0_0_,
                         THIS_.PARENT     AS PARENT0_0_,
                         THIS_.PARENTAREA AS PARENTAREA0_0_,
                         THIS_.TYPE       AS TYPE0_0_,
                         THIS_.NAME       AS NAME0_0_,
                         (SELECT THIS_0_.TYPE AS Y0_
                          FROM   TREENODE THIS_0_
                          WHERE  THIS_0_.TYPE = ?) AS __HIBERNATE_SORT_EXPR_0__
                  FROM   TREENODE THIS_
                  WHERE  THIS_.NAME LIKE ?
                         AND THIS_.ID > ?) QUERY) PAGE
WHERE    PAGE.ROW > 10
ORDER BY __HIBERNATE_SORT_EXPR_0__

Yes, in this case, we could use TOP 1000 as well, but that doesn't work if we want pages data that isn't started at the beginning of the data set.

Now, here is an important fact, the question marks that you see? Those are positional parameters. Do you see the bug now?

SQL Server 2005 (and 2008) paging support is broken. I find it hard to believe that a feature that is just a tad less important than SELECT is so broken. Any other database get it right, for crying out load.

Anyway, by now you noticed that when we processed the statement to add the limit clause, we had re-written the structure of the statement and changed the order of the parameters. Tracking that problem down was a pain, just to give an idea, here is a bit of the change that I had to make:

/// <summary>
/// We need to know what the position of the parameter was in a query
/// before we rearranged the query.
/// This is used only by dialects that rearrange the query, unfortunately, 
/// the MS SQL 2005 dialect have to re shuffle the query (and ruin positional parameter
/// support) because the SQL 2005 and 2008 SQL dialects have a completely broken 
/// support for paging, which is just a tad less important than SELECT.
/// See  	 NH-1528
/// </summary>
public int? OriginalPositionInQuery;

I fixed the issue, but it is an annoying problem that keep occurring. Paging in SQL Server 2005/8 is broken!

Oh, and just to clarify some things. The ability to use complex expressions for the order by clause using the projection API is fairly new for NHibernate, it is incredibly powerful and really scares me.

Comments

Tuna Toksoz
10/15/2008 10:19 PM by
Tuna Toksoz

I had never loved mssql paging syntax, and never will.

Jason Stangroome
10/16/2008 03:08 AM by
Jason Stangroome

I've always avoided positional parameters because it fails to fully encapsulate the structure of the SQL statement. Refactoring the SQL has the potential to break the calling code.

Surely NHibernate can be adapted to use named parameters with RDBMSes that support it.

Ayende Rahien
10/16/2008 05:57 AM by
Ayende Rahien

Jason,

This is when using the criteria API, and the criteria API uses positional parameters because that avoids the need to create named parameters.

They are actually translated to named parameters anyway, the bug happened during construction of the SQL by NH, not at the DB level

Andrey Shchekin
10/16/2008 06:24 AM by
Andrey Shchekin

__This is when using the criteria API, and the criteria API uses positional parameters because that avoids the need to create named parameters.

What do you mean by the 'need to create named parameters' ?

I think it makes more sense to use named parameters (by some GUID?) that are translated into positional for the RDBMSs that do not support named.

Ayende Rahien
10/16/2008 06:31 AM by
Ayende Rahien

Andrey,

This related to the way NHibernate's works internally.

Ken Egozi
10/16/2008 08:03 AM by
Ken Egozi

@Ayende: Being the huge NHibernate fan that I am, I must admit here that the problems lay in NH this time.

As Jason and Andrey has pointed out, had the query used named-parameters the 'problem' would have gone away.

as for the 'broken' paging syntax:

the thing is, that this syntax, although not trivial in first look, is actually very smart, and consistent with the way CTEs are used in SQL Server 2k5+. inventing a propriety SQL verbs like "LIMIT" looks cool, but it actually increases complexity.

plus, the SQL syntax you show (which NH is generating) is weird.

there is a redundant nesting level and the TOP is unneeded.

what should be done to a query:

  1. take out the ORDER BY clause and extract it's expression parts.

  2. add the ORDER BY expression parts (without DESC/ASC) to the SELECT clause. number them NHsort0, NHsort1, etc.

  3. rename the expressions in the ORDER BY clause to match their new designated names

  4. add ROWNUMBER() OVER(ORDER BY theorder_by ) as ROW

  5. wrap the SELECT with an external select, with the field names from the original select, without the ROW and the injected orders

  6. add "ROW BETWEEN thefirst AND thelast" to the WHERE clause

  7. add ORDER BY theneworder_by

so

 SELECT

  THIS.ID AS ID00_,

  THIS.AREA AS AREA00_,

  THIS.PARENT AS PARENT00_,

  THIS.PARENTAREA AS PARENTAREA00_,

  THIS.TYPE AS TYPE00_,

  THIS.NAME AS NAME00_

 FROM TREENODE THIS_

 WHERE

  THIS_.NAME LIKE ?

  AND THIS_.ID > ?

 ORDER BY (WHADEVER) A

would become

 SELECT

  ID00,

  AREA00,

  PARENT00,

  PARENTAREA00,

  TYPE00,

  NAME00

 FROM (

  SELECT

   ROWNUMBER OVER(ORDER BY (WHATEVER) AS NHROW,

   THIS.ID AS ID00_,

   THIS.AREA AS AREA00_,

   THIS.PARENT AS PARENT00_,

   THIS.PARENTAREA AS PARENTAREA00_,

   THIS.TYPE AS TYPE00_,

   THIS.NAME AS NAME00_,

   (WHATEVER) AS NH_SORT0

  FROM TREENODE THIS_

  WHERE

   THIS_.NAME LIKE ?

   AND THIS_.ID > ?

 )

 WHERE NHROW_ BETWEEN 10 AND 1000

 ORDER BY NH_SORT0

is the middle query only for not repeating the order-by clause? that's crazy. I'm not sure but I think it will cause an extra table scan - first scan to generate the sort_field, second scan (in the middle query) to order by it.

if you want to avoid the duplication, it's better to use a CTE for that I think.

I'll see if I can spare a little time to look into the SQL dialect and try to make things a wee bit nicer in that aspect

jrnail23
10/16/2008 02:43 PM by
jrnail23

To clarify, is the bug here a problem with NHibernate, or a bug in SQL Server 2005/2008?

Ayende Rahien
10/16/2008 04:22 PM by
Ayende Rahien

The bug was with NHibernate, the problem is that SQL Server syntax sucks

Bogdan Pietroiu
10/19/2008 05:15 PM by
Bogdan Pietroiu

yeah, nasty bug and is not the only one

if you execute an HQL like this

select this.ProductID, this.Discontinued, this.ProductName, this.UnitPrice, this.UnitsInStock, this.UnitsOnOrder, this.ReorderLevel, this.QuantityPerUnit, this.Category.CategoryName, this.Category.CategoryID, this.Supplier.SupplierID, this.Supplier.CompanyName from Product as this order by this.Category.CategoryID asc,this.ProductName asc

on sql server 2005 (Northwind) also using a limit and ordered by more than two fields will generate invalid SQL Server statement like this

SELECT TOP 50 x00, x10, x20, x30, x40, x50, x60, x70, x80, x90, x100, x110 FROM (SELECT ROWNUMBER() OVER(ORDER BY hibernatesortexpr0, hibernatesortexpr1) as row, query.x00, query.x10, query.x20, query.x30, query.x40, query.x50, query.x60, query.x70, query.x80, query.x90, query.x100, query.x110, query.hibernatesortexpr0, query.hibernatesortexpr1_ FROM (select product0.ProductID as x00, product0.Discontinued as x10, product0.ProductName as x20, product0.UnitPrice as x30, product0.UnitsInStock as x40, product0.UnitsOnOrder as x50, product0.ReorderLevel as x60, product0.QuantityPerUnit as x70, category1.CategoryName as x80, category1.CategoryID as x90, product0.SupplierID as x100, supplier2.CompanyName as x110, product0.CategoryID asc as hibernatesortexpr0, product0.ProductName as hibernatesortexpr1 from dbo.Products product0, dbo.Categories category1, dbo.Suppliers supplier2_ where product0.CategoryID=category1.CategoryID and product0.SupplierID=supplier2.SupplierID) query ) page WHERE page.row > 0 ORDER BY hibernatesortexpr0, hibernatesortexpr1

Note the "product0.CategoryID asc as hibernatesortexpr0__" in the select field list.

this is a new behavior starting R3860, previous the same setup was generating invalid orderby

ORDER BY hibernatesortexpr0hibernatesortexpr1

-> concatenating fields in the order by clause

B.

Ayende Rahien
10/19/2008 05:59 PM by
Ayende Rahien

Is this still a problem in the trunk?

If so, please create a JIRA issue

Anonymous
10/21/2008 02:55 PM by
Anonymous

Is this the same as NH-1350 ?

Ayende Rahien
10/22/2008 04:04 PM by
Ayende Rahien

Ken,

Named parameters would solve the issue, yeah, but that is not really related to the core of the issue, which is how much trouble you have to go in trying to get paging to work.

I don't care that CTE are smart. I want a simple solution to a very common and very simple problem. Smart == complex == hard to use.

The reason for the query being so complex is that we need to handle a lot more scenarios in a generic fashion.

In particular, take a look at http://jira.nhibernate.org/browse/NH-1155 to see one issue that significantly complicated the code.

Comments have been closed on this topic.