Rankings within a database

  • Thread starter Thread starter David Crane
  • Start date Start date
D

David Crane

I have a database that includes both numeric and categorization
information about a set of retail stores. I want to create one
additional column in the database that represents the rank of each store
within its segment (i.e. the ranking for a numeric performance field
within the store's segment, as defined by one of the categorization
fields). I want the ranking to be in any of three possible forms -
either a numeric rank, or a percentile, or simple categorization (e.g.,
Top 20%, Bottom 80%).

I have not been able to figure out any way to do this other than by
manual, brute force - creating a page for each segment, calculating
rankings there, and then bringing results back to the main database by
formula. I have tried the percentile function - which appears to work on
an entire database, even if the database is filtered (not the way sums
and averages work). I have also hunted for database functions, but have
not found anything to do what I need.

Can anyone offer ideas on whether there is something other than my brute
force approach?

Thanks in advance
 
First off you should not store this data in a new column in your table.
It should be calculated in a query.

I would calculate the new column to be a percent of the top performer in that RetailSegment

You can see this by creating an new blank access database

Add the following table

tblRetailStore
------------------
RetailStoreID (autonum) <--- PrimaryKey
NumericInfo (number)
RetailSegment (text)


Add some values to that table then create a new query

use the following formula to calculate the RankByRetailSegment

RankByRetailSegment: [tblRetailStore].[NumericInfo]/DMax("NumericInfo","tblRetailStore","RetailSegment='" & [tblRetailStore]![RetailSegment] & "'")


Or just cut and paste this SQL Statement into the SQL design view of the query

SELECT tblRetailStore.RetailStoreID, tblRetailStore.NumericInfo, tblRetailStore.RetailSegment, [tblRetailStore].[NumericInfo]/DMax("NumericInfo","tblRetailStore","RetailSegment='" & [tblRetailStore]![RetailSegment] & "'") AS RankByRetailSegment
FROM tblRetailStore
ORDER BY tblRetailStore.RetailSegment, [tblRetailStore].[NumericInfo]/DMax("NumericInfo","tblRetailStore","RetailSegment='" & [tblRetailStore]![RetailSegment] & "'") DESC;

HTH
-Merk

I have a database that includes both numeric and categorization
information about a set of retail stores. I want to create one
additional column in the database that represents the rank of each store
within its segment (i.e. the ranking for a numeric performance field
within the store's segment, as defined by one of the categorization
fields). I want the ranking to be in any of three possible forms -
either a numeric rank, or a percentile, or simple categorization (e.g.,
Top 20%, Bottom 80%).

I have not been able to figure out any way to do this other than by
manual, brute force - creating a page for each segment, calculating
rankings there, and then bringing results back to the main database by
formula. I have tried the percentile function - which appears to work on
an entire database, even if the database is filtered (not the way sums
and averages work). I have also hunted for database functions, but have
not found anything to do what I need.

Can anyone offer ideas on whether there is something other than my brute
force approach?

Thanks in advance
 
Thank you for this idea. My problem is that I am constrained to a one
file Excel solution, so I cannot use this idea for my current effort.
Any thoughts on how to do this within Excel?

Thanks

First off you should not store this data in a new column in your table.
It should be calculated in a query.

I would calculate the new column to be a percent of the top performer in that RetailSegment

You can see this by creating an new blank access database

Add the following table

tblRetailStore
------------------
RetailStoreID (autonum) <--- PrimaryKey
NumericInfo (number)
RetailSegment (text)


Add some values to that table then create a new query

use the following formula to calculate the RankByRetailSegment

RankByRetailSegment: [tblRetailStore].[NumericInfo]/DMax("NumericInfo","tblRetailStore","RetailSegment='" & [tblRetailStore]![RetailSegment] & "'")


Or just cut and paste this SQL Statement into the SQL design view of the query

SELECT tblRetailStore.RetailStoreID, tblRetailStore.NumericInfo, tblRetailStore.RetailSegment, [tblRetailStore].[NumericInfo]/DMax("NumericInfo","tblRetailStore","RetailSegment='" & [tblRetailStore]![RetailSegment] & "'") AS RankByRetailSegment
FROM tblRetailStore
ORDER BY tblRetailStore.RetailSegment, [tblRetailStore].[NumericInfo]/DMax("NumericInfo","tblRetailStore","RetailSegment='" & [tblRetailStore]![RetailSegment] & "'") DESC;

HTH
-Merk

I have a database that includes both numeric and categorization
information about a set of retail stores. I want to create one
additional column in the database that represents the rank of each store
within its segment (i.e. the ranking for a numeric performance field
within the store's segment, as defined by one of the categorization
fields). I want the ranking to be in any of three possible forms -
either a numeric rank, or a percentile, or simple categorization (e.g.,
Top 20%, Bottom 80%).

I have not been able to figure out any way to do this other than by
manual, brute force - creating a page for each segment, calculating
rankings there, and then bringing results back to the main database by
formula. I have tried the percentile function - which appears to work on
an entire database, even if the database is filtered (not the way sums
and averages work). I have also hunted for database functions, but have
not found anything to do what I need.
 
First off you should not store this data in a new column in your table.
It should be calculated in a query.

I don't think you should be doing this in a query either, rather it's
something for the target application (Excel in this case). Try
re-writing your query using ANSI-standard SQL to see what I'm getting
at. The OP didn't specify MS Access so your use of the proprietary
function DMAX is inappropriate.

Anyhow, I suspect there is no RDBMS involved here.

--

Merkling said:
First off you should not store this data in a new column in your table.
It should be calculated in a query.

I would calculate the new column to be a percent of the top performer in that RetailSegment

You can see this by creating an new blank access database

Add the following table

tblRetailStore
------------------
RetailStoreID (autonum) <--- PrimaryKey
NumericInfo (number)
RetailSegment (text)


Add some values to that table then create a new query

use the following formula to calculate the RankByRetailSegment

RankByRetailSegment: [tblRetailStore].[NumericInfo]/DMax("NumericInfo","tblRetailStore","RetailSegment='" & [tblRetailStore]![RetailSegment] & "'")


Or just cut and paste this SQL Statement into the SQL design view of the query

SELECT tblRetailStore.RetailStoreID, tblRetailStore.NumericInfo, tblRetailStore.RetailSegment, [tblRetailStore].[NumericInfo]/DMax("NumericInfo","tblRetailStore","RetailSegment='" & [tblRetailStore]![RetailSegment] & "'") AS RankByRetailSegment
FROM tblRetailStore
ORDER BY tblRetailStore.RetailSegment, [tblRetailStore].[NumericInfo]/DMax("NumericInfo","tblRetailStore","RetailSegment='" & [tblRetailStore]![RetailSegment] & "'") DESC;

HTH
-Merk

I have a database that includes both numeric and categorization
information about a set of retail stores. I want to create one
additional column in the database that represents the rank of each store
within its segment (i.e. the ranking for a numeric performance field
within the store's segment, as defined by one of the categorization
fields). I want the ranking to be in any of three possible forms -
either a numeric rank, or a percentile, or simple categorization (e.g.,
Top 20%, Bottom 80%).

I have not been able to figure out any way to do this other than by
manual, brute force - creating a page for each segment, calculating
rankings there, and then bringing results back to the main database by
formula. I have tried the percentile function - which appears to work on
an entire database, even if the database is filtered (not the way sums
and averages work). I have also hunted for database functions, but have
not found anything to do what I need.

Can anyone offer ideas on whether there is something other than my brute
force approach?

Thanks in advance
 
Is this something like what you want to achieve:

Store | Segm | Perms
--------------------
Store1 | A | 1
Store2 | A | 2
Store3 | A | 3
Store4 | A | 4
Store5 | B | 5
Store6 | B | 6
Store7 | B | 7
Store8 | B | 8
Store9 | C | 9
Store10| C | 10
Store11| C | 11
Store12| C | 12

Put this data in cells A2:C13

i.e. 'Store1' will be A2, 'A' in B2, '1' in C3 etc.

Put this formula in cell D2 and 'array enter' it (i.e.
ctrl+shift+enter):

=1+SUM(--(($B$2:$B$13=B2)*$C$2:$C$13>C2))

Copy the formula down to D13.

The values in D should be the rank as regards performance within the
store's segment e.g. Store1 is ranked 4 out of 4 because it has the
lowest performance value of the four segment A stores.

The values in column D should stay the same if you sort the rows into
another order.

--

David Crane said:
Thank you for this idea. My problem is that I am constrained to a one
file Excel solution, so I cannot use this idea for my current effort.
Any thoughts on how to do this within Excel?

Thanks

First off you should not store this data in a new column in your table.
It should be calculated in a query.

I would calculate the new column to be a percent of the top performer in that RetailSegment

You can see this by creating an new blank access database

Add the following table

tblRetailStore
------------------
RetailStoreID (autonum) <--- PrimaryKey
NumericInfo (number)
RetailSegment (text)


Add some values to that table then create a new query

use the following formula to calculate the RankByRetailSegment

RankByRetailSegment: [tblRetailStore].[NumericInfo]/DMax("NumericInfo","tblRetailStore","RetailSegment='" & [tblRetailStore]![RetailSegment] & "'")


Or just cut and paste this SQL Statement into the SQL design view of the query

SELECT tblRetailStore.RetailStoreID, tblRetailStore.NumericInfo, tblRetailStore.RetailSegment, [tblRetailStore].[NumericInfo]/DMax("NumericInfo","tblRetailStore","RetailSegment='" & [tblRetailStore]![RetailSegment] & "'") AS RankByRetailSegment
FROM tblRetailStore
ORDER BY tblRetailStore.RetailSegment, [tblRetailStore].[NumericInfo]/DMax("NumericInfo","tblRetailStore","RetailSegment='" & [tblRetailStore]![RetailSegment] & "'") DESC;

HTH
-Merk

David Crane<[email protected]> 11/25/03 11:07AM >>>
I have a database that includes both numeric and categorization
information about a set of retail stores. I want to create one
additional column in the database that represents the rank of each store
within its segment (i.e. the ranking for a numeric performance field
within the store's segment, as defined by one of the categorization
fields). I want the ranking to be in any of three possible forms -
either a numeric rank, or a percentile, or simple categorization (e.g.,
Top 20%, Bottom 80%).

I have not been able to figure out any way to do this other than by
manual, brute force - creating a page for each segment, calculating
rankings there, and then bringing results back to the main database by
formula. I have tried the percentile function - which appears to work on
an entire database, even if the database is filtered (not the way sums
and averages work). I have also hunted for database functions, but have
not found anything to do what I need.
 
Wow! This is exactly what I wanted to achieve. Thak you.

I thought I understood Excel. But your formula is beyond me. Can you
explain how this formula works?
 
You are welcome. I'm sure my formula is NOT beyond you; rather, you
probably haven't investigated array formulas yet. They offer a whole
new dimension so it's well worth spending an hour or so getting to
grips with them.

The Excel help isn't very informative, so read Chip's introduction:

http://www.cpearson.com/excel/array.htm

then move on to these *extremely* detailed explanations and examples:

http://www.emailoffice.com/excel/arrays-bobumlas.html

Note that not all worksheet functions can be used in an array formula
(CONCATENATE is a notable omission). The array formula in my example
would be simpler if the RANK function could be used but unfortunately
it errors when used with an array (despite what it says in the help!)

One other thing that may need explaining is the use of the double
minus sign. It's a method to convert a boolean logical value (i.e.
TRUE or FALSE) to a numeric:

=--(TRUE) returns 1
=--(FALSE) returns 0
=-+(TRUE) returns -1
=-+(FALSE) returns 0

Glad I could help. Post back if you want any further details or help
getting those other categorizations you require.

--
 
Hi Onedaywhen,
One other thing that may need explaining is the use of the double
minus sign. It's a method to convert a boolean logical value (i.e.
TRUE or FALSE) to a numeric:

Is there any advantage of that over the N() function?

Regards

Stephen Bullen
Microsoft MVP - Excel
www.BMSLtd.co.uk
 
Hi Stephen,
No advantage I know of. I merely find the double minus more intuitive
i.e. easier for me to remember. Similarly, why +- when just minus will
do? Answer, using two signs sticks in my mind better!

I'll try to use N in future posts; it'll save me having to explain my
usage.

--
 
Hi Onedaywhen,
No advantage I know of. I merely find the double minus more intuitive
i.e. easier for me to remember. Similarly, why +- when just minus will
do? Answer, using two signs sticks in my mind better!

I'll try to use N in future posts; it'll save me having to explain my
usage.

No problem - it was just that I've seen it used quite frequently
recently, and wondered whether there had been a 'group consensus' that
it was maybe quicker or something.

Regards

Stephen Bullen
Microsoft MVP - Excel
www.BMSLtd.co.uk
 
Back
Top