Frequency issue

  • Thread starter Thread starter Mark
  • Start date Start date
M

Mark

Hi,

I am trying to compare two sets of five numbers on the
same row, (B5:F5) with (H5:L5). Each set of numbers does
not have any duplicate values. And the value of the
numbers in both sets range from 1-75 (not sure if this
matters much). I would like to count how many numbers in
one set matches numbers in the other set. I have tried
using the following code in cell N5, but am not getting
the right count. I have tried typing the code in cell N5
and pressing "ctrl/shift/enter," but am not having any
luck. Any suggestions?

=SUM(IF(FREQUENCY(B5:F5,H5:L5)>0,1))

=FREQUENCY(B5:F5,H5:L5)

Thanks for the assistance!
 
One way:

=SUM(N(B5:F5=TRANSPOSE(H5:L5)))

Array-entered, meaning press ctrl/shift/enter after
inserting the formula. Excel will place {} around the
formula to indicate it is an array formula.

HTH
Jason
Atlanta, GA
 
One way:

=SUM(N(B5:F5=TRANSPOSE(H5:L5)))

Array-entered, meaning press ctrl/shift/enter after
inserting the formula. Excel will place {} around the
formula to indicate it is an array formula.
...

That does a lot of extra, unnecessary work. If you rely on the OP's assurance
that neither array contains duplicates, then

=SUMPRODUCT(COUNTIF(B5:F5,H5:L5))

would suffice, and doesn't need to be array-entered.

This would give the wrong result whenever the largest value in B5:F5 exceeds the
largest value in H5:L5 because FREQUENCY returns one entry more than the number
of entries in its second argument, with the extra entry containing the count of
values in the first argument exceeding all the values in the second.
 
Mark

Select 5 adjacent cells N5:R5

Type in your Frequency formula =FREQUENCY(B5:F5,H5:L5) to active cell then
hit CRTL/SHIFT/ENTER

Gord Dibben Excel MVP
 
I like that, Harlan. But if I asked you to explain *why*
your method is better in terms of Excel efficiency, could
you explain it? I'm often asked by users why one formula
is "better" than another, even though both formulas
produce the correct results.

Thanks.
Jason
 
Select 5 adjacent cells N5:R5

Type in your Frequency formula =FREQUENCY(B5:F5,H5:L5) to active cell then
hit CRTL/SHIFT/ENTER
...

First off, FREQUENCY's result is always a vertical array, so entering it into a
horizontal range won't give a very useful result.

Also, OP is looking for exact matches, i.e., equality. Try

=FREQUENCY({2,4,6,8,10},{3,6,9,12,15})

the result is {1;2;1;1;0;0} rather than {0;1;0;0;0;0}. While FREQUENCY(X,X)
returns pretty much the same thing as COUNTIF(X,X), the latter is more robust,
and FREQUENCY(X,Y) is decidedly different than COUNTIF(X,Y).
 
I like that, Harlan. But if I asked you to explain *why*
your method is better in terms of Excel efficiency, could
you explain it? I'm often asked by users why one formula
is "better" than another, even though both formulas
produce the correct results.
...

Efficiency first.

SUM(N(B5:F5=TRANSPOSE(H5:L5)))

effectively performs 25 equality comparisons, then 25 boolean to number
conversions using N, finally sums the 25 numbers.

SUMPRODUCT(COUNTIF(B5:F5,H5:L5))

also effectively performs 25 equality comparisons and 25 sums in the COUNTIF
function, but the sums in the COUNTIF function are very likely implemented as
CPU register increment operations rather than sums of arbitrary numbers.
Register incrementing is much faster than arbitrary addition. The result of
SUMIF is an array of only 5 numbers, so there are only 5 more adds.

Jason's: 25 =, 25 conversions, 25 adds
Harlan's: 25 =, 25 increments, 5 adds

Clearly 5 adds is fewer, so faster, than 25 adds. The question remains whether
25 boolean-to-number conversions are slower or faster than 25 register
increments. Even given the most efficient implementation of the conversion
immaginable, each conversion would likely take at least 4 times as many clock
cycles as each register increment.


Extensibility second.

Excel has a rather nasty limit of 7 nested function call levels. Any time you
don't need to use a function call, especially if there's an operator that gives
the same result, the function call should be avoided. Your formula could be
written as

=SUMPRODUCT(--(B5:F5=TRANSPOSE(H5:L5)))

and it'd use the same number of function calls as my formula (but it'd still
involve more adds). As an added bonus, it wouldn't need to be array-entered.

If you'd claim using N() is clearer than --, so more easily maintained, I'd
point out that summing the result of COUNTIF is clearer still.
 
Back
Top