Ayende @ Rahien

Unnatural acts on source code

Another paging approach

I have hard time believing that, but this is something that I learned from how MS CRM works. This is the first time that I ever saw this approach, although it seems so obvious when you think about it.

Updated: Fixed a bug found by Nathan Baulch, thanks!

The idea here is to solve the problem of SQL 2000 not really having a good way to do paging. The basic pattern is this:

select top [N] * from [TableName] 
where [orderCriteria] > @orderCriteriaPreviousMaxValue or 
(
	[Id] > @PreviousID and 
	[orderCriteria] = @orderCriteriaPreviousMaxValue
) 
order by [orderCriteria], [Id]

What do I mean by that?

Let us say that I want to get the first page of customers, I would issue the following query:

select top 25 * from Customers 
order by Customers.Name, Customers.Id

Now, I want to get the next page, I can do this using this approach:

select top 25 * from Customers
where Customers.Name >= @CustomerNamePreviousMaxValue or
(
	Customers.Id > @PreviousId and 
	Customers.Name > @CustomerNamePreviousMaxValue
)
order by Customers.Name, Customers.Id

All we need to remember is the last row's value from the previous query, and pass it to this one. Using this approach we can efficently page through result sets, without having to deal with the limitations of SQL Server limited top capabilities.

The usage of the filtering and the ordering ensure that we will always get the next result back. I find it quite elegant. What is surprising, I don't think that I have ever seen it in any other place.

Comments

Jon Skeet
12/06/2007 11:24 AM by
Jon Skeet

Hmmm. I can see that it will work when the ordering value changes per row, but it doesn't work in other cases.

For instance, display a set of products ordering them by price - if you need to display 25 per page and there are 40 products with the same price, I think the above would skip the bottom 15 of them.

Still, nice strategy when it works.

Jon

Ayende Rahien
12/06/2007 11:30 AM by
Ayende Rahien

It will work, notice that we also order by the ID of the row, so this allows us not to skip anything

Jon Skeet
12/06/2007 11:47 AM by
Jon Skeet

Doh, I missed the second part of the where clause. Apologies :)

Nathan Baulch
12/06/2007 11:49 AM by
Nathan Baulch

I also don't think this strategy would work.

Consider this dataset, ordered by Name then Id:

Aaron, 4

Bart, 5

Colleen, 6

Daniel, 1

Ernie, 2

Fred, 3

If you wanted to get the second page of three, your query would use this condition:

Name >= 'Colleen' and Id > '6'

Clearly this will not return anything!

What I think it should be is:

Name > @lastName or (Name = @lastName and Id > @lastId)
Ken Egozi
12/06/2007 12:04 PM by
Ken Egozi

and you won't be able to write:

showing 26-50 of 400

(cuz maybe new orders came in with name<= lastReturnedName and youre are actually showing 29-53 or something)

Jon Skeet
12/06/2007 12:06 PM by
Jon Skeet

Nathan: Yes, that looks right to me. In particular, it still copes with the situation I was worried about.

Jon

Ayende Rahien
12/06/2007 12:10 PM by
Ayende Rahien

Nathan ,

That is the approach that the CRM is using, I thought it was too complicated and removed that, thanks for noticing this.

I fixed it the post, thanks

Ayende Rahien
12/06/2007 12:15 PM by
Ayende Rahien

Ken,

Actually, that is not a worry with the fix that Nathan suggested.

Never said it was perfect, it is there specifically to deal with SQL 2000 broken paging model

John Chapman
12/06/2007 12:44 PM by
John Chapman

Why not just use a temp table to solve this problem? If you want items 26-50 select the top 50 rows in to a temp table where you use an identity column starting at 1 in the temp table. Then simply return the results of the temp table where TempId >= 26.

This seems far easier to read and understand than the prior suggestions. Not only that but the logic does not change when you modify the order by clause. The approach above requires thought and effort to ensure that mistakes are not made while checking the where clause (such as the original post's mistake).

While someone may have concerns over the amount of data in the temp table I would argue that it would not be an issue. How many pages is a user likely to move through? 500 items maybe? In that case the maximum size of the temp table would be 500, which is not much. I wouldn't expect the user to make it to page 1000. If they page that much then there are far bigger issues with the available searching mechanism and I feel for those users.

Matt L.
12/06/2007 01:47 PM by
Matt L.

I've seen approaches like John's, but that can be pretty slow if each row is really big, each page is really big or you're many pages into the dataset. I like the solution Ayende offers, but I'm assuming that if you have, say "order by lastname asc, salary desc" can't you find yourself in a weird situation? Let's say there are a lot of Smiths and your page ends on Smith. Your select would have to be "lastname > @lastlastname OR (lastname = @lastlastname and salary < @lastsalary)."

Is that correct? It's early, and I have not yet finished my coffee.

Matt L.
12/06/2007 01:48 PM by
Matt L.

And that is a lesson in not skimming ... you addressed this in the post itself. I am an idiot.

Bunter
12/06/2007 02:07 PM by
Bunter

While this works with next-> (and can be modified to work with previous->) paging, it doesn't really help you if you have "jump to page 12 ... N" functionality. Since most of the paging works like that, you obviously haven't seen this solution much...

Shane
12/06/2007 02:43 PM by
Shane

I've used that approach in some VB6 apps. It works great for us, but as mentioned it does lack the goto page N functionality.

Funny thing is, I originally used this same pattern in the early 90's on Cobol/CICS/DB2.

John Chapman
12/06/2007 03:25 PM by
John Chapman

@Matt

If the row size of your table is really a problem you could always place just the primary key values of your row in to the temp table, then perform a join between the temp table and the actual query table to pull the results you need. This could be a real pain with many joins, but I think it would resolve the issue you raised.

Or alternatively just leave out the large columns to keep the size of the temp table down and join to just those columns upon returning the results.

Jacques Philip
12/06/2007 05:54 PM by
Jacques Philip

This works good also if you overlap the pages by one record by using:

where [orderCriteria] >= @orderCriteriaPreviousMaxValue or

(

[Id] >= @PreviousID and 

[orderCriteria] = @orderCriteriaPreviousMaxValue

)

Then you can use an Ajax AutoComplete ListBox to bring up a page starting with a particular record.

When the user types some characters in a TextBox, an Ajax call checks for records where [orderCriteria] >= @CharactersTyped and returns the filtered list in a ListBox.

When the user selects a record in the ListBox, the page starting with that record is called in the same paging scheme.

Steve Campbell
12/06/2007 06:25 PM by
Steve Campbell

Excellent! Can you backdate your post so I can read it last week?

Avish
12/06/2007 07:51 PM by
Avish

My solution is a little less performant, but allows for "jump to page N" and doesn't require the ID column to be sortable (think GUIDs):

SELECT TOP PageLength *

FROM

(

SELECT TOP (PageLength * PageNum) *

FROM TargetTable

ORDER BY OrderingField DESCENDING

) AS Sql2KRequiresAliasesForNestedSubQueries

ORDER BY OrderingField ASCENDING

Avish
12/06/2007 09:29 PM by
Avish

Agh, screw it, that's buggy. I hope the idea gets through, though. Fixing it is left as an exercise to the reader.

Mark Pearce
12/08/2007 07:35 PM by
Mark Pearce

John, with larger datasets you'll find that a memory-variable table runs up to twice as fast as a temp table. Declared like this:

DECLARE @TempTable TABLE (OrderId int, RowId int identity)

Mark Pearce
12/08/2007 07:45 PM by
Mark Pearce

This is a good essay on the temp table variable approach, including a cost-and-efficiency analysis:

http://www.4guysfromrolla.com/webtech/042606-1.shtml

Thomas
12/09/2007 03:22 PM by
Thomas

Just a quick note that MS SQL Server 2005 has a 'ROW_NUMBER()' feature that may help with paging. For more information see:

http://www.j-dee.com/2007/08/08/efficient-data-paging-with-sql-server-2005-using-row_number/

Comments have been closed on this topic.