Pecentile Rank formula

Matt Moore · Nov 28, 2003

Does anyone know the best way to determine percentile rank
for any set of values in a query?

Regards,

m

Tom Ellison · Nov 28, 2003

Dear Matt:

The general answer is to divide the rank of each row by the total
number of rows and multiply by 100.

If there are columns I1, I2, . . . which form a unique key to rows in
your query, additional data columns C1, C2, . . . plus column N, the
column of which you want to show the percentile rank:

SELECT I1, I2, C1, C2,
(SELECT COUNT(*) FROM YourQuery Q1
WHERE Q1.I1 = Q.I1 AND Q1.I2 = Q.I2
AND Q1.N <= Q.N) / COUNT(*) * 100 AS PercentileRank
FROM YourQuery Q
GROUP BY I1, I2, C1, C2
ORDER BY N DESC

The ORDER BY N will put them in decreasing order by the ranked column.
This is optional, but it's good to see them this way as a test of the
query.

A specific answer could be produced for your situation if I knew the
names of your columns.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

Matt Moore · Nov 28, 2003

I don't understand this example. Here's more info...

Here's a dataset called DATA...

Name StatA StatB StatC PRStatA PRStatB PRStatC
A 3.5 600 50
B 15.6 456 34
C 7.8 750 20
D 10.1 656 52

What I'm trying to understand is the formula that would
calculate the Percentile Rank of each stat showing how
that stat falls in this domain of values.

I didn't understand why you would put the conditions you
put in the WHERE statement and how that would help. So I
hope this can help you explain your approach better.

Regards,

matt

Tom Ellison · Nov 28, 2003

Dear Matt:

So then, using my previous post, Name would be the C1, you have no I1,
I2, or C2, and StatA is the N column, as are also StatB and StatC.

SELECT [Name], StatA, StatB, StatC,
(SELECT COUNT(*) FROM DATA Q1
WHERE Q1.StatA <= Q.StatA) / (SELECT COUNT(*) FROM DATA)
* 100 AS PRStatA,
(SELECT COUNT(*) FROM DATA Q1
WHERE Q1.StatB <= Q.StatB) / (SELECT COUNT(*) FROM DATA)
* 100 AS PRStatB,
(SELECT COUNT(*) FROM DATA Q1
WHERE Q1.StatC <= Q.StatC) / (SELECT COUNT(*) FROM DATA)
* 100 AS PRStatC
FROM DATA Q

Forget the grouping - I was thinking of a much more complex set of
data than what you describe. For testing, use ORDER BY StatA DESC so
the PRStatA column will run from 100 and downward. You can later
order by Name or whaterev you like.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

Matt Moore · Nov 28, 2003

Thanks so much for the help on this one!!

I never would have thought of doing it this way!

Warmest regards and holiday wishes,

matt

-----Original Message-----
Dear Matt:

So then, using my previous post, Name would be the C1, you have no I1,
I2, or C2, and StatA is the N column, as are also StatB and StatC.

SELECT [Name], StatA, StatB, StatC,
(SELECT COUNT(*) FROM DATA Q1
WHERE Q1.StatA <= Q.StatA) / (SELECT COUNT(*) FROM DATA)
* 100 AS PRStatA,
(SELECT COUNT(*) FROM DATA Q1
WHERE Q1.StatB <= Q.StatB) / (SELECT COUNT(*) FROM DATA)
* 100 AS PRStatB,
(SELECT COUNT(*) FROM DATA Q1
WHERE Q1.StatC <= Q.StatC) / (SELECT COUNT(*) FROM DATA)
* 100 AS PRStatC
FROM DATA Q

Forget the grouping - I was thinking of a much more complex set of
data than what you describe. For testing, use ORDER BY StatA DESC so
the PRStatA column will run from 100 and downward. You can later
order by Name or whaterev you like.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

I don't understand this example. Here's more info...

Here's a dataset called DATA...

Name StatA StatB StatC PRStatA PRStatB PRStatC
A 3.5 600 50
B 15.6 456 34
C 7.8 750 20
D 10.1 656 52

What I'm trying to understand is the formula that would
calculate the Percentile Rank of each stat showing how
that stat falls in this domain of values.

I didn't understand why you would put the conditions you
put in the WHERE statement and how that would help. So I
hope this can help you explain your approach better.

Regards,

matt

a
test of the if
I knew the

Click to expand...

.

Tom Ellison · Dec 1, 2003

Dear Matt:

A way to simplify it slightly might be:

SELECT [Name], StatA, StatB, StatC,
(SELECT COUNT (*) FROM Data) AS TCount,
(SELECT COUNT(*) FROM DATA Q1
WHERE Q1.StatA <= Q.StatA) AS CtStatA,
(SELECT COUNT(*) FROM DATA Q1
WHERE Q1.StatB <= Q.StatB) AS CtStatB,
(SELECT COUNT(*) FROM DATA Q1
WHERE Q1.StatC <= Q.StatC) AS CtStatC
FROM DATA Q

In a form or report, do the dividing ty TCount and multiply by 100 as
a calculated control.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

Strange issue with this formula:

I needed to perform this calculation 4 times (in order to
create 4 columns) and I get one of 2 errors. "System
memory exceeded" or "Query too complex".

I thought about creating a summary table to aggregate some
of the data before performing this calculation in order to
successfully perform it. Do you know of a better solution
or have any other ideas on the approach to working around
this issue?

Regards,

Matt

-----Original Message-----
Dear Matt:

So then, using my previous post, Name would be the C1, you have no I1,
I2, or C2, and StatA is the N column, as are also StatB and StatC.

SELECT [Name], StatA, StatB, StatC,
(SELECT COUNT(*) FROM DATA Q1
WHERE Q1.StatA <= Q.StatA) / (SELECT COUNT(*) FROM DATA)
* 100 AS PRStatA,
(SELECT COUNT(*) FROM DATA Q1
WHERE Q1.StatB <= Q.StatB) / (SELECT COUNT(*) FROM DATA)
* 100 AS PRStatB,
(SELECT COUNT(*) FROM DATA Q1
WHERE Q1.StatC <= Q.StatC) / (SELECT COUNT(*) FROM DATA)
* 100 AS PRStatC
FROM DATA Q

Forget the grouping - I was thinking of a much more complex set of
data than what you describe. For testing, use ORDER BY StatA DESC so
the PRStatA column will run from 100 and downward. You can later
order by Name or whaterev you like.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

I don't understand this example. Here's more info...

Here's a dataset called DATA...

Name StatA StatB StatC PRStatA PRStatB PRStatC
A 3.5 600 50
B 15.6 456 34
C 7.8 750 20
D 10.1 656 52

What I'm trying to understand is the formula that would
calculate the Percentile Rank of each stat showing how
that stat falls in this domain of values.

I didn't understand why you would put the conditions you
put in the WHERE statement and how that would help. So I
hope this can help you explain your approach better.

Regards,

matt

-----Original Message-----
Dear Matt:

The general answer is to divide the rank of each row by
the total
number of rows and multiply by 100.

If there are columns I1, I2, . . . which form a unique
key to rows in
your query, additional data columns C1, C2, . . . plus
column N, the
column of which you want to show the percentile rank:

SELECT I1, I2, C1, C2,
(SELECT COUNT(*) FROM YourQuery Q1
WHERE Q1.I1 = Q.I1 AND Q1.I2 = Q.I2
AND Q1.N <= Q.N) / COUNT(*) * 100 AS
PercentileRank
FROM YourQuery Q
GROUP BY I1, I2, C1, C2
ORDER BY N DESC

The ORDER BY N will put them in decreasing order by the
ranked column.
This is optional, but it's good to see them this way as a
test of the
query.

A specific answer could be produced for your situation if
I knew the
names of your columns.

Tom Ellison
Microsoft Access MVP
Ellison Enterprises - Your One Stop IT Experts

On Fri, 28 Nov 2003 06:10:38 -0800, "Matt Moore"

Does anyone know the best way to determine percentile
rank
for any set of values in a query?

Regards,

m

.

Click to expand...

.

Click to expand...