Where can I find a design pattern for paging through a list?

  • Thread starter Thread starter John
  • Start date Start date
J

John

Hi,

How do websites like google page through large numbers of records?

For example; if I do a search on Google, get 187,000 matches, and go to the
22nd page of the results; I'm sure Google doesn't do a new SELECT statement
and traverse through 440 records (22pgs x 20 per page). But I'm also pretty
sure Google doesn't keep a hell of a lot of session data open either.

I know there are all kinds of ways to do this, I've created hacks which did
the job in the past. But there has to be a 'best practice' way of handling
this.

Can anybody point me in the right direction? I'm open to technology
specific ways, but would prefer generic.

Thanks,
John
 
In the ADO days, you could do this with built-in objects. A small recordset
that returns (in googles case) probably 2 or 3 columns, but that is 200,000
deep - it's not really that significant. The hardest part is rendering and
formatting it.

So - I believe they may just do that. Any Design Pattern would need to be
more low level than what we can do. In other words, if this functionality
isn't built into the objects we're using then chances are, our attempts at
trying to be more efficient will likely not be good.


All that aside, if you had some huge resultset and needed to break it up -
you could always do it inside the stored procedure.. something like:

declare @pagenum as integer
declare @count as integer

set @pagenum = 1

select @count=count(*) from myviewortable where mywhereclause=1

select top 25 field1, field2, @count from myviewortable where
mywhereclause=1 and
field1 not in (select top (25 * @pagenum) field1 from myviewortable
where mywhereclause=1)


(this is completely off the top of my head) so basically, you can get a
rowcount (which is fast), then pull back 25 at a time.. when you get to page
3 for example, the exclude list will be "select top 75" - so the main select
will bring back from 76 to 100 - and it will also bring back the total
number of rows (@count), so you can create the facade on the front end of
"pages".. hopefully that's not confusing, I'm not explaining it well.. but
do you get the point?
 
Drebin said:
In the ADO days, you could do this with built-in objects. A small
recordset that returns (in googles case) probably 2 or 3 columns, but
that is 200,000 deep - it's not really that significant. The hardest
part is rendering and formatting it.

So - I believe they may just do that. Any Design Pattern would need
to be more low level than what we can do. In other words, if this
functionality isn't built into the objects we're using then chances
are, our attempts at trying to be more efficient will likely not be
good.


All that aside, if you had some huge resultset and needed to break it
up - you could always do it inside the stored procedure.. something
like:

declare @pagenum as integer
declare @count as integer

set @pagenum = 1

select @count=count(*) from myviewortable where mywhereclause=1

select top 25 field1, field2, @count from myviewortable where
mywhereclause=1 and
field1 not in (select top (25 * @pagenum) field1 from myviewortable
where mywhereclause=1)


(this is completely off the top of my head) so basically, you can get
a rowcount (which is fast), then pull back 25 at a time.. when you
get to page 3 for example, the exclude list will be "select top 75" -
so the main select will bring back from 76 to 100 - and it will also
bring back the total number of rows (@count), so you can create the
facade on the front end of "pages".. hopefully that's not confusing,
I'm not explaining it well.. but do you get the point?


news:[email protected]...


I don't think you can specify an expression for "top" (at least, I have
never gotten that to work :-) )

What we do here:
- create a temp table
- use a cursor with the full where-clause
- fetch absolute to the first record in the page
- add that row to the temp table
- fetch next, repeat until one page is copied
- select all from the temp table
(you can store the entire row in the temp table, or just the key-value
and join original and temp tables)

Hans Kesting
 
Hans Kesting said:
I don't think you can specify an expression for "top" (at least, I have
never gotten that to work :-) )

I just tried this, this works:

select top 10 * from AccountData where net_worth > 0
and account_nbr not in (select top 30 account_nbr from AccountData where
net_worth > 0)

and if I increment:

select top 10 * from AccountData where net_worth > 0
and account_nbr not in (select top 40 account_nbr from AccountData where
net_worth > 0)

select top 10 * from AccountData where net_worth > 0
and account_nbr not in (select top 50 account_nbr from AccountData where
net_worth > 0)

select top 10 * from AccountData where net_worth > 0
and account_nbr not in (select top 60 account_nbr from AccountData where
net_worth > 0)

These all work - and as expected!!
 
Hi Drebin,

I never even considered a NOT IN subquery, but that definitely looks like
the way to go.

Thanks,
John

PS-I think Hans is probably using a non-SQLServer database, hence the mix up
with the 'TOP' expression.
 
Understood - but let me add, this is more likely a last resort type
scenario, depending on the size and data you are bringing back, returning a
high number of rows like that, but only displaying a handful at a time, is
*relatively* scalable.. your mileage may vary.

Point is, you may want to spend some time on metering your network and
CPU/memory to see when it will start being a problem and act on it
accordingly. It may not even become an issue if your app is not too big.
 
John said:
Hi Drebin,

I never even considered a NOT IN subquery, but that definitely looks
like the way to go.

Thanks,
John

PS-I think Hans is probably using a non-SQLServer database, hence the
mix up with the 'TOP' expression.

No, it is sqlserver. What I meant was: in a *stored procedure* I can use
"TOP 10" (or whatever), but I can't use "TOP @amount" or
"TOP (10*@pagenumber)". A fixed number works, an expression is
not accepted.
I realize (now) that you can also build a query-string where you can use
any expression to come up with a "TOP n" phrase, before you send it to
the database engine. It's just that we *always* use stores procedures,
never strings of sql code.

Hans Kesting
 
Ahh.. got it. I didn't catch that. I wonder if you couldn't create a dynamic
SQL statement that fills a temp table, then do your "not in" from there..
I'll mess around with this more...

Good one!
 
Back
Top