Data Paging: How To?

  • Thread starter Thread starter George Shubin
  • Start date Start date
G

George Shubin

I'm working on a windows app in VS2003 whose SQL2005 data has outgrown
expectations. The main data entry form has a grid full of records with a
few identifying fields to select from. As each line on the grid is
selected, that single record is grabbed from the database and displayed in a
set of data entry controls for editing. Pretty simple and straightforward.

The problem is the data table has now grown to almost 200,000 records, and
form loading and network traffic and response time have been impacted. I
have been assigned the task of implementing some sort of data paging logic
to alleviate the problems. I've never done anything like this, and as I
think about it, it becomes pretty complex.

How would you go about doing this?

Thanks for any ideas.

GS
 
Ah, 200,000 rows is not that large. I expect there are some more fundamental
problems here. How are your tables indexed? Are you fetching any BLOBs? How
many users? What is the server hardware configuration? How much RAM on the
server? What other applications/services are running on the server? What
version of SQL Server?

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
INETA Speaker
www.betav.com/blog/billva
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________
Visit www.hitchhikerguides.net to get more information on my latest book:
Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition)
Between now and Nov. 6th 2006 you can sign up for a substantial discount.
Look for the "Early Bird" discount checkbox on the registration form...
 
How are your tables indexed?
Primary Key is an int, assigned by the program, not SQL Server.
No additional indexes.

The primary key column plus four other columns are loaded into a dataset, a
dataview is defined and assigned as the datasource for a grid. (Infragistics
UltraGrid). Filtering and sorting for searches is applied to the dataview.
When the grid's row changes, a one-record datatable is created using the
primary key value to grab the record from the database. This is edited and
updated. The large 200,000 dataset is loaded only once and is not edited,
but can take up to 45-60 seconds to load.
4 -5 using this database, 12-15 using other databases on same server.

All users' workstations are less than a year old and have 2 GB RAM and at
least 3.2 GHz P4s
4GB RAM, 1 TB Raid0 Storage, P4 3.4 GHz, two one-gigabit Ethernet NICs
Nothing, strictly dedicated to SQL Server and some misc. file storage.
Latest SQL Express


Thanks.
 
George said:
Primary Key is an int, assigned by the program, not SQL Server.
No additional indexes.

The primary key column plus four other columns are loaded into a
dataset, a dataview is defined and assigned as the datasource for a
grid. (Infragistics UltraGrid). Filtering and sorting for searches
is applied to the dataview. When the grid's row changes, a
one-record datatable is created using the primary key value to grab
the record from the database. This is edited and updated. The large
200,000 dataset is loaded only once and is not edited, but can take
up to 45-60 seconds to load.

you shouldnt do this, don't load 200,000 rows in a clientside
construct. The user can see a limited set of rows anyway, so you should
load that set from the db and not the 200,000 rows.

As you're using Sqlserver 2005, you can create a rather fast paging
query pretty easily using CTE's. Below is an example outline of such a
query.

WITH __actualSet AS
(
SELECT TOP (pagenumber * pagesize)+1, field1, field2, ..., fieldn,
ROW_NUMBER() OVER (sortclause | first field) AS __rowcnt
FROM table1 ...
)
SELECT * FROM __actualSet
WHERE __rowcnt BETWEEN startCount AND endCount
ORDER BY __rowcnt ASC

The '(sortclause | first field)' clause means that you either have to
specify the orderby clause of the query there or an ORDER BY of the
first field in the set.

startCount = ((pageNumber - 1) * pageSize)
endCount = (pageNumber * pageSize)

and pageNumber and pageSize start both at 1

So if you can view say 50 rows per page, the user can thus page per 50
rows. If the user wants to load page 2 of the customers table (which
now contains 200,000 rows), and you want to sort the set on
companyname, you end up with something like: (I use 3 fields here, you
get the idea)

WITH __actualSet AS
(
SELECT TOP (2 * 50)+1, CustomerID, CompanyName, ContactAddress,
ROW_NUMBER() OVER (ORDER BY CompanyName ASC) AS __rowcnt
FROM Customers
)
SELECT * FROM __actualSet
WHERE __rowcnt BETWEEN 50 AND 100
ORDER BY __rowcnt ASC

You can eventually write a stored proc for this or write the query in
a string and send it to the db. In sqlserver 2005, TOP can have a
parameter as value, so that's pretty flexible.

Then in your screen, grab the PK of the row to fetch for editing,
which should be a simple select query and should be very fast.

Frans

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------
 
Let's review your answers:

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
INETA Speaker
www.betav.com/blog/billva
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________
Visit www.hitchhikerguides.net to get more information on my latest book:
Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition)
Between now and Nov. 6th 2006 you can sign up for a substantial discount.
Look for the "Early Bird" discount checkbox on the registration form...
-----------------------------------------------------------------------------------------------------------------------

George Shubin said:
Primary Key is an int, assigned by the program, not SQL Server.
No additional indexes.

Indexes help SQL Server fetch rows quickly. However in your case since
you're fetchin 200,000 rows for some reason no amount of indexing will
help--it will actually hurt. Franz is right. Fetch JUST the rows you need in
the immediate future. This means dozens, not hundreds or thousands of rows.
The primary key column plus four other columns are loaded into a dataset,
a dataview is defined and assigned as the datasource for a grid.
(Infragistics UltraGrid). Filtering and sorting for searches is applied
to the dataview. When the grid's row changes, a one-record datatable is
created using the primary key value to grab the record from the database.
This is edited and updated. The large 200,000 dataset is loaded only once
and is not edited, but can take up to 45-60 seconds to load.

4 -5 using this database, 12-15 using other databases on same server.

Every version of SQL Server can handle this number of users on a simple
system if your application is written correctly. Again, this means fetching
fewer rows using parameters to focus the rowset.
All users' workstations are less than a year old and have 2 GB RAM and at
least 3.2 GHz P4s

This is irrelevant. These (very nice) systems all wait at the same speed.
4GB RAM, 1 TB Raid0 Storage, P4 3.4 GHz, two one-gigabit Ethernet NICs

This could support hundreds of users (or more) given a properly designed
application.
Nothing, strictly dedicated to SQL Server and some misc. file storage.

Just make sure not to add Reporting Services or a Print server if you expect
to maintain performance on this system.
Latest SQL Express

SQL Server Express edition is designed to be performance capped, but this
should not be an issue until you try to fetch one or more tables in a single
query.

Order a copy of my book. I describe how to best configure systems like yours
and get the most performance out of SQL Express.
 
William said:
SQL Server Express edition is designed to be performance capped, but
this should not be an issue until you try to fetch one or more tables
in a single query.

I never knew Express was performance capped, I thought MSDE was capped
for performance, so this is interesting. Do you have a link for this,
as I remember to have searched for this before but couldn't find clear
info in what way the performance is capped.

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------
 
No, it does not have a governor like MSDE, but consider that SQL Server
gains much of its performance benefit from RAM cached procedures and data.
SQL Express limits that cache to 1GB of RAM. It also is designed to ignore
additional processors when managing its worker threads. This will limit the
amount of work you can expect the server to accomplish in a given length of
time.

And I agree. These details are tough to find in the MS content on the web--I
know, I just spent 20 minutes looking for them and did not find references
to these limiting factors.

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
INETA Speaker
www.betav.com/blog/billva
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________
Visit www.hitchhikerguides.net to get more information on my latest book:
Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition)
Between now and Nov. 6th 2006 you can sign up for a substantial discount.
Look for the "Early Bird" discount checkbox on the registration form...
 
William said:
No, it does not have a governor like MSDE, but consider that SQL
Server gains much of its performance benefit from RAM cached
procedures and data. SQL Express limits that cache to 1GB of RAM. It
also is designed to ignore additional processors when managing its
worker threads. This will limit the amount of work you can expect the
server to accomplish in a given length of time.

Ah, thanks Bill :). Although a limited cache of 1GB might not be a
thing one would run into easily IMHO, ignoring additional processors
(and I assume with that thus: 2nd cores on a dual core) IS indeed
limiting.

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------
 
Actually, a dual core CPU does improve performance--it permits the OS to use
one processor and the SQL engine to use the other. It's just that the SQL
engine can't use both...

--
____________________________________
William (Bill) Vaughn
Author, Mentor, Consultant
Microsoft MVP
INETA Speaker
www.betav.com/blog/billva
www.betav.com
Please reply only to the newsgroup so that others can benefit.
This posting is provided "AS IS" with no warranties, and confers no rights.
__________________________________
Visit www.hitchhikerguides.net to get more information on my latest book:
Hitchhiker's Guide to Visual Studio and SQL Server (7th Edition)
Between now and Nov. 6th 2006 you can sign up for a substantial discount.
Look for the "Early Bird" discount checkbox on the registration form...
 
Back
Top