Excel query

  • Thread starter Thread starter The PC Guy
  • Start date Start date
T

The PC Guy

Hello:
I have a fairly simple query task that I've been unable to find an
answer to.
I have a spreadsheet with the data arranged in five columns and a little
over 600 rows.
I am interested in finding the top five values that appear most often.
I know you can use the MODE function to determine the most common value.
I've been unable to devise a way to determine the other four most common
values.
Any help you can would be greatly appreciated.

--
regards,
bruce

Bruce Von Deylen
Pierceton, Ind. USA

E-mail: brucevd<at>michiana<dot>org
 
Put the unique values in another column

Assume the first value is in H2

in I2 put in
=Countif(C:C,H2)

then drag fill down.

Then in J2 and down you can put in the RANK function.
 
Hello Tom:
Thanks for your reply.
Unfortunately, I'm not sure I understand you. I've got more than 3,000
entries on this spreadsheet. How would I determine the unique values and
then place them in another column as you describe?

--
regards,
bruce

The PC Guy
Bruce Von Deylen
Pierceton, Ind. USA
Serving the computer impaired in northern
Indiana and southwest lower Michigan

E-mail: brucevd<at>michiana<dot>org
 
How about this:

Copy all the values to a new column (3000 rows long) (say F)

then select that column and do Data|Filter|Advanced filter and put the unique
entries in the next column over (column G).
See Debra Dalgleish's site for some tips on how to do this:
http://www.contextures.com/xladvfilter01.html#FilterUR

Then you can use Tom suggestion
=countif(f:f,g2)
in column H

and the rank formula in column I.
 
Even easier if the column you wish to sort contains numeric values (no
additional columns needed). Select all of the column headings and then
choose

DATA --> FILTER --> AUTO FILTER

Little drop down boxes will appear next to each heading. Click the dropdown
to the right of the column you need the top five list for and choose the
option Top 10 in the drop down box. You can then specify if you want the
top 5, 10, 12 whatever (by count or percentage) and it will hide all
non-top-5 items until you turn the filter off.

HTH

ryanb.
 
That doesn't do what he wants.

However, he doesn't need to use a separate column to use
Data=>Filter=>Advanced filter - he only needs to select the single column he
wants to work with (in his original data).

--
Regards,
Tom Ogilvy

ryanb. said:
Even easier if the column you wish to sort contains numeric values (no
additional columns needed). Select all of the column headings and then
choose

DATA --> FILTER --> AUTO FILTER

Little drop down boxes will appear next to each heading. Click the dropdown
to the right of the column you need the top five list for and choose the
option Top 10 in the drop down box. You can then specify if you want the
top 5, 10, 12 whatever (by count or percentage) and it will hide all
non-top-5 items until you turn the filter off.

HTH

ryanb.

Dave Peterson said:
How about this:

Copy all the values to a new column (3000 rows long) (say F)

then select that column and do Data|Filter|Advanced filter and put the unique
entries in the next column over (column G).
See Debra Dalgleish's site for some tips on how to do this:
http://www.contextures.com/xladvfilter01.html#FilterUR

Then you can use Tom suggestion
=countif(f:f,g2)
in column H

and the rank formula in column I.
find
and
 
I may have read more into the OP's problem. I thought that the PC Guy wanted to
choose from all the columns as a group--not each separately.

But I've been wrong before...


Tom said:
That doesn't do what he wants.

However, he doesn't need to use a separate column to use
Data=>Filter=>Advanced filter - he only needs to select the single column he
wants to work with (in his original data).
 
I saw it as a table (database) - where only one column had a specific value
that he wanted to count.

any interpretation such as this would be reading into the problem since
there was nothing to indicate one or the other or anthing in between.

if the values occur across all columns and there may be unique value in
each, then off course one would have to create a single column with all
values to use the advanced filter.
 
Hello Dave:
Thanks to you and everyone else for your responses.
I was primarily interested in finding the top (most frequently
occurring) values for all 5 columns and 600 rows of data (all numeric).
I gather from the discussion so far you can do this for individual
columns but not for multiple columns.
Am I correct?
--
regards,
bruce

Bruce Von Deylen
Pierceton, Ind. USA

E-mail: brucevd<at>michiana<dot>org
 
You could get the results you want for each column by adding a helper column for
each column and then use the =rank() function.

But if I wanted the ranks for all 3000 as an aggregate, I'd put them in their
own column and do it that way. (A few copy|Pastes and you'll be half done.)
 
Oops. I'd use two helper columns per column. One for the unique entries
(data|filter|Advanced filter stuff) and one for the =rank() function.
 
Do all of your cells have numeric values? If so, you could try the
following:

Assuming your range of values is in A1:E600

F1: =MODE(A1:E600)

*F2: =MODE(IF(A1:E600<>F1,A1:E600,""))

*F3: =MODE(IF((A1:E600<>F1)*(A1:E600<>F2),A1:E600,""))

*F4: =MODE(IF((A1:E600<>F1)*(A1:E600<>F2)*(A1:E600<>F3),A1:E600,""))

*F5:
=MODE(IF((A1:E600<>F1)*(A1:E600<>F2)*(A1:E600<>F3)*(A1:E600<>F4),A1:E600,"")
)

The formulae marked with an asterisk need to be Array-Entered. (Instead of
hitting Enter after typing in the formula, hit Ctrl-Shift-Enter.)

If you have non-numeric or blank cells in the range, then you'll need
something a little more complicated...

/i.
 
That's pretty good thinking!
Do all of your cells have numeric values? If so, you could try the
following:

Assuming your range of values is in A1:E600

F1: =MODE(A1:E600)

*F2: =MODE(IF(A1:E600<>F1,A1:E600,""))

*F3: =MODE(IF((A1:E600<>F1)*(A1:E600<>F2),A1:E600,""))

*F4: =MODE(IF((A1:E600<>F1)*(A1:E600<>F2)*(A1:E600<>F3),A1:E600,""))

*F5:
=MODE(IF((A1:E600<>F1)*(A1:E600<>F2)*(A1:E600<>F3)*(A1:E600<>F4),A1:E600,"")
)

The formulae marked with an asterisk need to be Array-Entered. (Instead of
hitting Enter after typing in the formula, hit Ctrl-Shift-Enter.)

If you have non-numeric or blank cells in the range, then you'll need
something a little more complicated...

/i.
 
'Add a command button (command1) to the sheet where your
'data is located and copy all of this code into the sheets'
'general delcarations area... Now, press the command button
'and answer the questions: this will ask for the row start,
'number of rows, and column start positions & will take the data
'from that range.

'Note: Only accepts starting column positions of A-V
' and will only get 9,999,999 entries before exploding!

Option Explicit

Private Sub CommandButton1_Click()

Dim MY_ARRAY1(1 To 9999999) As String
Dim MY_ARRAY2(1 To 9999999, 2) As String
Dim STEP_ARRAY1 As Double
Dim STEP_ARRAY2 As Double
Dim STEP_ARRAY3 As Double
Dim TMPCOL$
Dim TMPCOL_FIX1 As Double
Dim CURR_VAL$
Dim DOES_EXIST As Boolean
Dim MY_ARRAY2_END As Double

Dim NumRows
NumRows = InputBox("Enter the number of rows of data:", "", 600)
If Not IsNumeric(NumRows) Then MsgBox "Invalid Entry!", vbInformation +
vbOKOnly, "": Exit Sub

Dim RowStart
RowStart = InputBox("Enter the starting row number:", "", 1)
If Not IsNumeric(RowStart) Then MsgBox "Invalid Entry!", vbInformation
+ vbOKOnly, "": Exit Sub

Dim ColStart
ColStart = InputBox("Enter the starting Column letter:", "", "A")
Select Case UCase(ColStart)
Case "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L",
"M", "N", "O", "P", "Q", "R", "S", "T", "U", "V"
TMPCOL = UCase(ColStart)
Case Else
MsgBox "Invalid Entry! Please enter a valid column (A - V)",
vbInformation + vbOKOnly, "": Exit Sub
End Select

For STEP_ARRAY1 = 1 To NumRows * 5
If STEP_ARRAY1 = NumRows Then MsgBox Now
If STEP_ARRAY1 < NumRows + 1 Then
TMPCOL_FIX1 = -(RowStart - 1)
ElseIf STEP_ARRAY1 < NumRows * 2 + 1 Then
If STEP_ARRAY1 = NumRows + 1 Then TMPCOL =
Set_New_Column(TMPCOL)
TMPCOL_FIX1 = NumRows - (RowStart - 1)
ElseIf STEP_ARRAY1 < NumRows * 3 + 1 Then
If STEP_ARRAY1 = NumRows * 2 + 1 Then TMPCOL =
Set_New_Column(TMPCOL)
TMPCOL_FIX1 = NumRows * 2 - (RowStart - 1)
ElseIf STEP_ARRAY1 < NumRows * 4 + 1 Then
If STEP_ARRAY1 = NumRows * 3 + 1 Then TMPCOL =
Set_New_Column(TMPCOL)
TMPCOL_FIX1 = NumRows * 3 - (RowStart - 1)
ElseIf STEP_ARRAY1 < NumRows * 5 + 1 Then
If STEP_ARRAY1 = NumRows * 4 + 1 Then TMPCOL =
Set_New_Column(TMPCOL)
TMPCOL_FIX1 = NumRows * 4 - (RowStart - 1)
End If
MY_ARRAY1(STEP_ARRAY1) = Range(TMPCOL & (STEP_ARRAY1 -
TMPCOL_FIX1)).Value
Next

MY_ARRAY2_END = 0
For STEP_ARRAY1 = 1 To NumRows * 5
DOES_EXIST = False
CURR_VAL = MY_ARRAY1(STEP_ARRAY1)
For STEP_ARRAY2 = 1 To NumRows * 5
If MY_ARRAY2(STEP_ARRAY2, 0) = CURR_VAL Then DOES_EXIST = True:
Exit For
Next
If DOES_EXIST = False Then
MY_ARRAY2_END = MY_ARRAY2_END + 1
MY_ARRAY2(MY_ARRAY2_END, 0) = CURR_VAL
End If
Next

For STEP_ARRAY2 = 1 To MY_ARRAY2_END
For STEP_ARRAY1 = 1 To NumRows * 5
If MY_ARRAY1(STEP_ARRAY1) = MY_ARRAY2(STEP_ARRAY2, 0) Then _
MY_ARRAY2(STEP_ARRAY2, 1) = Val(MY_ARRAY2(STEP_ARRAY2, 1)) + 1
Next
Next

Dim GreatestValue(1 To 5) As Integer
Dim GreatestString(1 To 5) As String
GreatestValue(1) = 0
For STEP_ARRAY2 = 1 To 5
GreatestValue(STEP_ARRAY2) = 0
GreatestString(STEP_ARRAY2) = ""
For STEP_ARRAY3 = 1 To MY_ARRAY2_END
If Val(MY_ARRAY2(STEP_ARRAY3, 1)) > GreatestValue(STEP_ARRAY2)
Then
If STEP_ARRAY2 = 1 Then
GreatestValue(STEP_ARRAY2) = Val(MY_ARRAY2(STEP_ARRAY3,
1))
GreatestString(STEP_ARRAY2) = MY_ARRAY2(STEP_ARRAY3,
0)
ElseIf STEP_ARRAY2 = 2 Then
If MY_ARRAY2(STEP_ARRAY3, 0) <> GreatestString(1) Then
GreatestValue(STEP_ARRAY2) =
Val(MY_ARRAY2(STEP_ARRAY3, 1))
GreatestString(STEP_ARRAY2) =
MY_ARRAY2(STEP_ARRAY3, 0)
End If
ElseIf STEP_ARRAY2 = 3 Then
If MY_ARRAY2(STEP_ARRAY3, 0) <> GreatestString(1) And
MY_ARRAY2(STEP_ARRAY3, 0) <> GreatestString(2) Then
GreatestValue(STEP_ARRAY2) =
Val(MY_ARRAY2(STEP_ARRAY3, 1))
GreatestString(STEP_ARRAY2) =
MY_ARRAY2(STEP_ARRAY3, 0)
End If
ElseIf STEP_ARRAY2 = 4 Then
If MY_ARRAY2(STEP_ARRAY3, 0) <> GreatestString(1) And
MY_ARRAY2(STEP_ARRAY3, 0) <> GreatestString(2) And
MY_ARRAY2(STEP_ARRAY3, 0) <> GreatestString(3) Then
GreatestValue(STEP_ARRAY2) =
Val(MY_ARRAY2(STEP_ARRAY3, 1))
GreatestString(STEP_ARRAY2) =
MY_ARRAY2(STEP_ARRAY3, 0)
End If
ElseIf STEP_ARRAY2 = 5 Then
If MY_ARRAY2(STEP_ARRAY3, 0) <> GreatestString(1) And
MY_ARRAY2(STEP_ARRAY3, 0) <> GreatestString(2) And
MY_ARRAY2(STEP_ARRAY3, 0) <> GreatestString(3) And
MY_ARRAY2(STEP_ARRAY3, 0) <> GreatestString(4) Then
GreatestValue(STEP_ARRAY2) =
Val(MY_ARRAY2(STEP_ARRAY3, 1))
GreatestString(STEP_ARRAY2) =
MY_ARRAY2(STEP_ARRAY3, 0)
End If
End If
End If
Next
Next

Dim TmpMsg$
TmpMsg = "Top 5 Greatest Values:" & vbCrLf
For STEP_ARRAY1 = 1 To 5
TmpMsg = TmpMsg & vbCrLf & "#" & STEP_ARRAY1 & " " &
GreatestString(STEP_ARRAY1) & " = " & GreatestValue(STEP_ARRAY1)
Next

MsgBox TmpMsg

End Sub

'Helper funtion to determine new column...
Private Function Set_New_Column(ByVal Starting_Column As String) As
String

Select Case Starting_Column
Case "A"
Set_New_Column = "B"
Case "B"
Set_New_Column = "C"
Case "C"
Set_New_Column = "D"
Case "D"
Set_New_Column = "E"
Case "E"
Set_New_Column = "F"
Case "F"
Set_New_Column = "G"
Case "G"
Set_New_Column = "H"
Case "H"
Set_New_Column = "I"
Case "I"
Set_New_Column = "J"
Case "J"
Set_New_Column = "K"
Case "K"
Set_New_Column = "L"
Case "L"
Set_New_Column = "M"
Case "M"
Set_New_Column = "N"
Case "N"
Set_New_Column = "O"
Case "O"
Set_New_Column = "P"
Case "P"
Set_New_Column = "Q"
Case "Q"
Set_New_Column = "R"
Case "R"
Set_New_Column = "S"
Case "S"
Set_New_Column = "T"
Case "T"
Set_New_Column = "U"
Case "U"
Set_New_Column = "V"
Case "V"
Set_New_Column = "W"
Case "W"
Set_New_Column = "X"
Case "X"
Set_New_Column = "Y"
Case "Y"
Set_New_Column = "Z"
End Select

End Function
 
Back
Top