Matching lists of words

  • Thread starter Thread starter pauljoyce
  • Start date Start date
P

pauljoyce

Hi,

I've a long list of about 15000 different words (they are in order o
frequency in the English language). And I have a long list of word
that occur in some literature I'm looking at, and how often they occur
I'd like to match the two lists together so that on say the left han
side I have the list of words in order of frequency and on the righ
side I can see how often each of these words occurs in the literature
Many of the words in the literature don't occur in the frequency list
so I can't just so sort them alphabetically or something like this.

Any help would be greatly appreciated
 
Here's one crack at it ..

Assuming your frequency list is in Sheet1,cols A and B (word - freq)
the literature list is in cols C and D (word - freq), all data from row1
down,
as per sample data / structure below

Text1 5000 Text5 400
Text2 4000 Text2 300
Text3 3000 Text7 200
Text4 2000
Text5 1000

Put:
in E1: =IF(COUNTIF($C$1:$C$3,A1)>=1,A1,"")
in F1: =IF(E1="","",VLOOKUP(E1,C:D,2,0))
in G1: =IF(F1="","",F1-ROW())

Select E1:G1, fill down as many rows as there is data in col A (15,000
rows?)

Note: Adjust the range $C$1:$C$3 to suit

In another Sheet2
---------------------
Put in A1:

=IF(ISERROR(MATCH(LARGE(Sheet1!$G:$G,ROW(A1)),Sheet1!$G:$G,0)),"",OFFSET(She
et1!$A$1,MATCH(LARGE(Sheet1!$G:$G,ROW(A1)),Sheet1!$G:$G,0)-1,COLUMN(D1)))

Copy A1 across to B1,
fill down as many rows as as there is data in col A in Sheet1

Cols A and B will return the descending sort of the list of words
in the frequency list which match / exist in the literature list,
and the frequency of these words in the literature list
 
Based on the sample data in Sheet1,
the results in Sheet2 would appear as:

Text5 400
Text2 300
 
Back
Top