Categorize a list of values?

  • Thread starter Thread starter Matthew White
  • Start date Start date
M

Matthew White

Hello all,

I need to categorize or group a list of numbers. For example, consider
this list of ten numbers:

1,1,2,2,5,6,7,8,9,9

I need to group them into three groups (A, B, C). A is the lowest 3 or
them, B is the "next lowest" three, and C is the highest 4 of them.

I've had some success using percentilerank, then using an if
statement:

"IF(value>=0.6,"C",IF(value>=0.3,"B","A"))"

to place in the group based on percentilerank, but it breaks down if
my list gets too heavy in one area, such as my 1's in the example.

My real list contains 68 numbers, and I break them up in to groups of
16,16,16,and 20.

Any better ideas?

Thanks,

Matt
 
Matthew,

I'm intrigued when you say '... but it breaks down if my list gets too
heavy in one area, such as my 1's in the example.'. In what way does it
break down, as I have tried 70 rows all with 1 and it seems to work okay for
me.

Once you have categorised them, you could then use conditional formatting,
as you only have 3 categories, to highlight them.

One other thing, you categorisation formula can be written as
=IF(value>=60%,"C",IF(value>=30%,"B","A"))
for better clarity.
 
"relevant size" should, of course, have been "relative sizes". Oh, well.


For first option (always grouped in 3, 3 and 6 numbers):

=IF(($A2<=SMALL($A$2:$A$11,3))*(COUNTIF(D1:D$1,"A")<3),"A",
IF(($A2<=SMALL($A$2:$A$11,6))*(COUNTIF(D1:D$1,"B")<3),"B","C"))


For second (3 groups split by relative sizes):

=IF($A2<=SMALL($A$2:$A$11,3),"A",IF($A2<=SMALL($A$2:$A$11,6),"B","C"))


No intermediate (percentilerank) formulae.

I'm sure there are better solutions, but at the moment I seem to be having a
bad brain day...


Steve D.
 
Note that the 6 in both formulae refers to 3+3.


Stephen Dunn said:
For first option (always grouped in 3, 3 and 6 numbers):

=IF(($A2<=SMALL($A$2:$A$11,3))*(COUNTIF(D1:D$1,"A")<3),"A",
IF(($A2<=SMALL($A$2:$A$11,6))*(COUNTIF(D1:D$1,"B")<3),"B","C"))

For second (3 groups split by relative sizes):

=IF($A2<=SMALL($A$2:$A$11,3),"A",IF($A2<=SMALL($A$2:$A$11,6),"B","C"))
 
Thanks for the replies!

A little more info:

* The number of categories can vary 2-5

* The number of numbers in the list can vary ~10 - 80

* The number values can fall outside the "number range" e.g. myy list
of 68 numbers can have many with a value of 69.

Here is a sample list that gives me trouble (even using the formula
Steve suggested):

1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,18,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,69,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,1,2,3,4,5,6,7,8,9,10,11,18,18,18,18,18,18
(I hope I got these in right)

These go in groups A(16) B(16) C(16) D(20)

I am fine if one of the 18s is switched to, say, a 12. With the list
the way it is here, I get A(16) B(16) C(18) D(18).


Now - reading this newsgroup, I discovered Chip Pearson, and his
ranking page:
http://www.cpearson.com/excel/rank.htm

I am using the Reverse Unique Ranks with some success. Still testing
though...

Thanks again,

Matt
 
Back
Top