Creating 'Top 20' Query

  • Thread starter Thread starter Jaek
  • Start date Start date
J

Jaek

I'm trying to create a query to return the top 20 of each of a 'type' of
record in a table.

So say the table has columns:

Book_Type {Good, Bad, Indifferent}
Book_Title
Book_Sales [in $]

What I'm trying to do is get the top 20 sellers in each of the three
Book_Type values.

I'm thinking some kind of subquery...?

Thanks,
Jaek
 
Try a correlated subquery in the where clause

SELECT SomeTable.*
FROM SomeTable
WHERE Book_Sales in
(SELECT Top 20 Book_Sales
FROM SomeTable as S
WHERE S.Book_Type = SomeTable.Book_Type
ORDER BY Book_Sales DESC)

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
Awesome, thank you John! I think that's exactly the trick, I'm going to try
it now.

I think what was conceptually difficult for me to figure was that you could
correlate on that 'type' field using two aliased versions of the table.


John Spencer MVP said:
Try a correlated subquery in the where clause

SELECT SomeTable.*
FROM SomeTable
WHERE Book_Sales in
(SELECT Top 20 Book_Sales
FROM SomeTable as S
WHERE S.Book_Type = SomeTable.Book_Type
ORDER BY Book_Sales DESC)

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
I'm trying to create a query to return the top 20 of each of a 'type' of
record in a table.

So say the table has columns:

Book_Type {Good, Bad, Indifferent}
Book_Title
Book_Sales [in $]

What I'm trying to do is get the top 20 sellers in each of the three
Book_Type values.

I'm thinking some kind of subquery...?

Thanks,
Jaek
 
Hi - Just wanted to follow up to say that using the format John suggested
accomplished the task, thanks again John.

The one hiccup, and I guess a followup question, is just that on a table of
about 84,000 records the query I wrote took a pretty long time to run (>a
couple of minutes, I got distracted after that), and the next step for me is
to do the same kind of query but involving several joins including a big
sales history table (1.5 million records) and at least three calculated
fields.

So the followup question would be are there any tricks for maybe lightening
the load so I don't have to run the query overnight!?

Here's the actual SQL I used for the first query (with 'Bisac_corr' being
the field 'type' stood in for in the simplified example, and 'Price_Latest'
being the field examined for top 20 values.)

SELECT T1.ISBN13, T1.TitleShort, T1.Author, T1.DatePub, T1.SAC,
T1.Bisac_Corr, T1.Publisher_Corr, T1.Price_Latest
FROM tblEntityTitles AS T1
WHERE (((T1.Price_Latest) In (SELECT Top 20 Price_Latest FROM
tblEntityTitles as T2 WHERE (T2.Bisac_Corr = T1.Bisac_Corr) ORDER BY
Price_Latest DESC)))
ORDER BY T1.Bisac_Corr;


John Spencer MVP said:
Try a correlated subquery in the where clause

SELECT SomeTable.*
FROM SomeTable
WHERE Book_Sales in
(SELECT Top 20 Book_Sales
FROM SomeTable as S
WHERE S.Book_Type = SomeTable.Book_Type
ORDER BY Book_Sales DESC)

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
I'm trying to create a query to return the top 20 of each of a 'type' of
record in a table.

So say the table has columns:

Book_Type {Good, Bad, Indifferent}
Book_Title
Book_Sales [in $]

What I'm trying to do is get the top 20 sellers in each of the three
Book_Type values.

I'm thinking some kind of subquery...?

Thanks,
Jaek
 
If you are going to use a correlated subquery, make sure you have indexes on
the fields you are using in the where and the order by clause. That can make
a significant difference in performance.

Correlated sub-queries are slow since you are running a query for each and
every record in the main query. If you can limit the records you are working
with by additional criteria in the where clause I believe that will limit the
number of times the correlated subquery runs.

You might try something like the following query. It could be more efficient.
I get confused sometimes on whether to use > or < in the comparison between
T1.Price_Latest and T2.Price_Latest in the JOIN clause. If it gives you the
20 lowest prices try changing the comparison operator.

SELECT T1.ISBN13, T1.TitleShort, T1.Author, T1.DatePub, T1.SAC,
T1.Bisac_Corr, T1.Publisher_Corr, T1.Price_Latest
FROM tblEntityTitles AS T1 INNER JOIN tblEntityTitles as T2
ON T1.Bisac_Corr = T2.Bisac_Corr
AND T1.Price_Latest < T2.PriceLatest
GROUP BY T1.ISBN13, T1.TitleShort, T1.Author, T1.DatePub, T1.SAC,
T1.Bisac_Corr, T1.Publisher_Corr, T1.Price_Latest
HAVING Count(T2.PriceLatest) <= 20
ORDER BY T1.Bisac_Corr

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
Back
Top