December 20, 2004 06:20 PM

Implementing Paging

Let users navigate to adjacent and non-adjacent pages
Rating: (0)
SQL Server Magazine
InstantDoc ID #44138
When you write applications that return query results to the client and display the results on the screen, you need to take into account the end user's screen-size limitations. If a query's result set contains too many rows to fit on the screen, you have to introduce logic into the application to split the original result set into chunks or pages. You then provide buttons or other graphical elements that let the user navigate between the result pages. The process of splitting the data into chunk...

...This article is for paid Professional Members only.

Already a Professional Member? Please log in now:

NOT A PROFESSIONAL MEMBER? YOU CHOOSE:

Professional Membership

Monthly

Annual

VIP Membership

Monthly

Annual

Add a Comment

asdfadsf

Bob7/15/2006 5:13:44 PM


I’d like to point out a very important knowledge base article that was revised after I opened a case with Microsoft to clarify the confusion surrounding IDENTITY issues. I strongly recommend going over it as it has many implications related to paging in SQL Server 2000.
Here’s the link to the KB article (273586) The behavior of the IDENTITY function when used with SELECT INTO or INSERT .. SELECT queries that contain an ORDER BY clause:

http://support.microsoft.com/default.aspx?scid=kb;en-us;273586

Regards,
Itzik Ben-Gan

BG_SQL 5/17/2005 9:41:18 PM


Andy,

You can clear the data cache using DBCC DROPCLEANBUFFERS. Also, make sure you have an appropriate index in place for the paging technique that I suggested to work fast. An ideal index would be a covering one on (sort_columns, covered_columns), but if you cannot afford one, you can make it a non-covering index on sort_columns only. Just be aware that each target row would cause a lookup, which is much slower and I/O consuming than the covering index.

As for the anonymous user that requested an example, here’s one using SELECT INTO:

set nocount on
use tempdb
go
drop table ta, tb, #t
go
create table ta(a int not null)
insert into ta values(3)
insert into ta values(1)
insert into ta values(2)

create table tb(b int not null primary key)
insert into tb values(4)
insert into tb values(1)
insert into tb values(3)
insert into tb values(2)

select b, identity(int, 1, 1) as rn
into #t
from ta join tb on a = b
order by b

select * from #t order by b

-- Output:
b rn
----------- -----------
1 2
2 3
3 1

-- Plan:
|--Table Insert(OBJECT:([#t]), SET:([#t].[rn]=[Expr1005],
[#t].[b]=[tb].[b]))
|--Top(ROWCOUNT est 0)
|--Compute Scalar(DEFINE:([Expr1005]=setidentity([Expr1004], -7, 0,
'#t')))
|--Sort(ORDER BY:([tb].[b] ASC))
|--Compute Scalar(DEFINE:([Expr1004]=getidentity(-7, 0, '#t')))
|--Nested Loops(Inner Join, OUTER REFERENCES:([ta].[a]))
|--Table Scan(OBJECT:([tempdb].[dbo].[ta]))
|--Clustered Index
Seek(OBJECT:([tempdb].[dbo].[tb].[PK__tb__522A84E8]),
SEEK:([tb].[b]=[ta].[a]) ORDERED FORWARD)

I’m afraid I currently don’t have an example for INSERT INTO, but I’ll keep looking for one.

Regards,
Itzik Ben-Gan


BG_SQL 12/21/2004 4:11:43 AM


Knowledge Base article ID 273586 seems very pertinent. The article confirms the wisdom of Itzik's warnings about using the SELECT INTO syntax. However, the article does not mention the INSERT SELECT syntax. This could lead one to believe that the two situations are handled differently, and that the INSERT SELECT syntax is not subject to the same problems. There does not seem to be a KB article advising against depending upon the INSERT SELECT syntax to behave as commonly expected.


Anonymous User 12/20/2004 1:45:47 PM


Itzik,

Thanks for sharing the fact that a case exists where the Sort operation followed the Compute Scalar operation. Are you at liberty to say which syntax the case used? If so, did the case use SELECT...INTO or INSERT...SELECT (not that the syntax would necessarily make any difference)?


Anonymous User 12/19/2004 9:50:37 PM


The technique described in this column has been implemented a little bit better (more flexible) here:

http://weblogs.sqlteam.com/jeffs/archive/2003/12/22/672.aspx

There's another post in the blog that shows a dynamic stored proc that implements this procedure in a more generic maner.

Anonymous User 12/19/2004 4:23:13 PM


Itzik,

I tested the two techniques above over a table containing 20 million rows (with a row size of around 4000 bytes). Every test resulted in the "SELECT TOP" technique as being considerable faster.

I expected that the technique in the article would be quicker once you started to request the end pages (say out of 20M rows, requesting page 2M wite 10 rows per page), since the "SELECT TOP" technique would have to scan the entire table in just to get the last ten rows. But again it was faster, both in execution time and according to the execution plan.

However, I am concerned that my tests are being skewed by caching etc, so any tips for setting up a good test bed would be appreciated.

Andy

asdavey12/19/2004 4:00:10 PM


Hi asdavey,

I’ve also been using a similar technique, but usually not for flexible paging of multiple pages, rather for ad-hoc page requests where the user is interested in a single or a very small number of pages. For many pages, the technique is slow because you scan all history leading to the desired page every time, and end up scanning data multiple times. The exception is small tables, where you can use the technique for multiple page requests. Once you scan the data for the first time it already resides in cache, and that’s probably why you found it fast. For large tables it’s problematic in performance, and consuming so much space in cache usually comes at the expense of other data.

Itzik Ben-Gan


BG_SQL 12/18/2004 1:43:39 AM


I’m saying this because I discussed the issue with the developers from the engine/optimizer team. I was explicitly told that they do not guarantee that the ORDER BY would be applied before assigning the IDENTITY values in SQL Server 2000. The issue came up as a result of a case where IDENTITY values did not reflect the ORDER BY clause, and you could see in the plan the compute scalar operator generating the values before the sort. I’m afraid I cannot provide details as I’m under NDA, and the whole discussion was in a private forum.
This is not considered a bug, since Microsoft never guaranteed anything about the relationship between ORDER BY and IDENTITY. And this also relates to the risk in relying on a certain “expected” behavior that is not documented in BOL or elsewhere. However, since they are very well aware that so many developers assume IDENTITY values would reflect the sort, and have lots of code relying on that, I was told that they’re going to guarantee this in SQL Server 2005. But again, in SQL Server 2005 you won’t need such a guarantee, since you’ll be able to use the ROW_NUMBER() function.


BG_SQL 12/18/2004 1:15:19 AM


Itzik,

When you say that a sort before identity assignment is simply not guaranteed do you mean that it's not explicitly stated in BOL or do you mean that you have inside knowledge about optimizer behavior?

Thanks!


Anonymous User 12/18/2004 1:06:14 AM


You must log on before posting a comment.

Are you a new visitor? Register Here
GOOGLE LINKS
SPONSORED LINKS
FEATURED LINKS