Question: Relative CPU Usage of SUMIFS & SUMPRODUCT

  • Thread starter Thread starter Matthew
  • Start date Start date
M

Matthew

I'm building an Excel model that will summarize data from a table (18,000
rows x 65 columns), and I want to minimize the number of calculations / CPU
utilization. So my question is: Is it more efficient (CPU utilization) to
use SUMIFS or SUMPRODUCT?

Thanks,
 
Matthew said:
I'm building an Excel model that will summarize data from
a table (18,000 rows x 65 columns), and I want to minimize
the number of calculations / CPU utilization.
So my question is: Is it more efficient (CPU utilization)
to use SUMIFS or SUMPRODUCT?

In my experience, it is a misdirection to focus on the CPU efficiency of one
particular operation, unless you are designing interrupt code. (And even
then, it depends.)

Generally, as long as you pay reasonable attention to good design, you
should choose the function to use based on which one makes more sense in the
context (readability, ease of maintenance, etc). If it's a toss-up based on
those factors, it is probably a toss-up with respect to efficiency overall.

Especially for calculations that will take several seconds, not to mention
many minutes in your case, the total computation time is influenced more by
extraneous factors, namely: other workbook calculations and external system
processing such as interrupts.

I do not have Excel 2007, so I cannot compare SUMIFS to SUMPRODUCT. The
following is a comparison of SUMIF and SUMPRODUCT in Excel 2003.

Yes, in some contexts, SUMIF is more efficient than SUMPRODUCT. This is
based on a macro that I used to compare the calculation time for
SUMIF(B1:B10000,">=0",C1:C10000) and SUMPRODUCT(--(B1:B10000>=0),C1:C10000).

But the important thing to notice is: the total run time of the macro was
about the same in both cases. This demonstrates the general principle
stated above: extraneous factors often outweigh the differences between
small parts of the entire operation.

Moreover, the difference between SUMIF and SUMPRODUCT depends on the
complexity of the formulas in the referenced cells. For example, when the
referenced cells contained constants, SUMPRODUCT took 5 times longer than
SUMIF. But when the referenced cells contained =RAND(), SUMPRODUCT took 3.5
times longer. This, again, demonstrates the general principle: as the
complexity of the calculations in the referenced cells increases, the
differences between SUMIF and SUMPRODUCT decreases.

(In all cases, on my computer, we are talking about 2 to 13 milliseonds per
formula calculation. But the actual times and even the relative differences
will vary greatly from system to system.)

Finally, it should be noted that these calculation comparisons were done in
a pristine environment where the only calculations involved the SUMIF or
SUMPRODUCT formula and the referenced cells. In real life, if your workbook
is complicated enough to have many tens of thousands SUMIFS or SUMPRODUCT
formulas, it probably also has many thousands of other formulas that may or
may not be recalculated. Again, that overhead may outweigh the difference
between the SUMIFS and SUMPRODUCT efficiency.

One last comment, for which I have no empirical data. I suspect that the
performance difference between SUMIFS and SUMPRODUCT is less than the
difference between SUMIF and SUMPRODUCT. The basis for that is: generally,
more complex features cost more in over-all computational time, even when
used in an equivalent manner (one criterion argument, in this case). For
example, whereas the internal implementation of SUMIF can be optimized for
the one criterion, SUMIFS must be able to handle a variable number of
arguments, just as SUMPRODUCT does.

(Of course, SUMPRODUCT will always have the additional overhead of adding
products. But as that is "mouse nuts" compared to the other overhead.)

I'm sorry if this is not the simple canned response ("this one" or "that
one") you are looking for. But the fact is: performance choices are not
simple to make. My point is simply: don't sweat the small stuff.
Undoubtedly, that "microfocus" will cause you to lose sight of the "big
picture".
 
Clarification....
Matthew said:
I'm building an Excel model that will summarize data
from a table (18,000 rows x 65 columns)
[....]
for calculations that will take several seconds, not
to mention many minutes in your case

I'm sorry: I don't know how long your calculations will take. I don't know
the complexity of and how many SUMIFS or SUMPRODUCT formulas you are talking
about.

On the other hand, if your calculations do not take "several seconds, not to
mention many minutes", it is even more questionable whether you should focus
on the CPU utilization of SUMIFS v. SUMPRODUCT.


----- original message -----
 
You made some good points for consideration but I think you may have drifted
from the question posed:
So my question is: Is it more efficient (CPU utilization)
to use SUMIFS or SUMPRODUCT?

Granted, "CPU utilization" may not be the true consideration, but I think
they meant overall efficiency.

In Excel 2007 SUMIFS is "significantly" more efficient than SUMPRODUCT.

Range = 18,000 rows
Average calc time (5 calcs)

......................SUMPRODUCT vs SUMIFS
1 condition..........0.025024...........0.009200
2 conditions........0.042650...........0.012822
3 conditions........0.060202...........0.014202
4 conditions........0.077808...........0.014776

Another advantage is that SUMIFS only calculates based on the used range.

There is no difference in efficiency between these formulas:

=SUMIFS(E:E,A:A,"w",B:B,"x",C:C,"y",D:D,"z")

=SUMIFS(E1:E18000,A1:A18000,"w",B1:B18000,"x",C1:C18000,"y",D1:D18000,"z")

One disadvantage of SUMIFS is that it's limited to "straight comparisons"
(just like SUMIF).

--
Biff
Microsoft Excel MVP


JoeU2004 said:
Clarification....
Matthew said:
I'm building an Excel model that will summarize data
from a table (18,000 rows x 65 columns)
[....]
for calculations that will take several seconds, not
to mention many minutes in your case

I'm sorry: I don't know how long your calculations will take. I don't
know the complexity of and how many SUMIFS or SUMPRODUCT formulas you are
talking about.

On the other hand, if your calculations do not take "several seconds, not
to mention many minutes", it is even more questionable whether you should
focus on the CPU utilization of SUMIFS v. SUMPRODUCT.


----- original message -----

JoeU2004 said:
In my experience, it is a misdirection to focus on the CPU efficiency of
one particular operation, unless you are designing interrupt code. (And
even then, it depends.)

Generally, as long as you pay reasonable attention to good design, you
should choose the function to use based on which one makes more sense in
the context (readability, ease of maintenance, etc). If it's a toss-up
based on those factors, it is probably a toss-up with respect to
efficiency overall.

Especially for calculations that will take several seconds, not to
mention many minutes in your case, the total computation time is
influenced more by extraneous factors, namely: other workbook
calculations and external system processing such as interrupts.

I do not have Excel 2007, so I cannot compare SUMIFS to SUMPRODUCT. The
following is a comparison of SUMIF and SUMPRODUCT in Excel 2003.

Yes, in some contexts, SUMIF is more efficient than SUMPRODUCT. This is
based on a macro that I used to compare the calculation time for
SUMIF(B1:B10000,">=0",C1:C10000) and
SUMPRODUCT(--(B1:B10000>=0),C1:C10000).

But the important thing to notice is: the total run time of the macro
was about the same in both cases. This demonstrates the general
principle stated above: extraneous factors often outweigh the
differences between small parts of the entire operation.

Moreover, the difference between SUMIF and SUMPRODUCT depends on the
complexity of the formulas in the referenced cells. For example, when
the referenced cells contained constants, SUMPRODUCT took 5 times longer
than SUMIF. But when the referenced cells contained =RAND(), SUMPRODUCT
took 3.5 times longer. This, again, demonstrates the general principle:
as the complexity of the calculations in the referenced cells increases,
the differences between SUMIF and SUMPRODUCT decreases.

(In all cases, on my computer, we are talking about 2 to 13 milliseonds
per formula calculation. But the actual times and even the relative
differences will vary greatly from system to system.)

Finally, it should be noted that these calculation comparisons were done
in a pristine environment where the only calculations involved the SUMIF
or SUMPRODUCT formula and the referenced cells. In real life, if your
workbook is complicated enough to have many tens of thousands SUMIFS or
SUMPRODUCT formulas, it probably also has many thousands of other
formulas that may or may not be recalculated. Again, that overhead may
outweigh the difference between the SUMIFS and SUMPRODUCT efficiency.

One last comment, for which I have no empirical data. I suspect that the
performance difference between SUMIFS and SUMPRODUCT is less than the
difference between SUMIF and SUMPRODUCT. The basis for that is:
generally, more complex features cost more in over-all computational
time, even when used in an equivalent manner (one criterion argument, in
this case). For example, whereas the internal implementation of SUMIF
can be optimized for the one criterion, SUMIFS must be able to handle a
variable number of arguments, just as SUMPRODUCT does.

(Of course, SUMPRODUCT will always have the additional overhead of adding
products. But as that is "mouse nuts" compared to the other overhead.)

I'm sorry if this is not the simple canned response ("this one" or "that
one") you are looking for. But the fact is: performance choices are not
simple to make. My point is simply: don't sweat the small stuff.
Undoubtedly, that "microfocus" will cause you to lose sight of the "big
picture".
 
T. Valko said:
You made some good points for consideration
Thanks.


but I think you may have drifted from the question posed:


Granted, "CPU utilization" may not be the true consideration,
but I think they meant overall efficiency.

Whether you call it "CPU utilization" or "overall efficiency", my point
remains the same. It might have gotten lost in the "noise" (TMI).

I understood the OP's question as it was posed, and I understand that SUMIFS
might be more efficient than SUMPRODUCTS, at least under some/most
circumstances.

I do not believe I "drifted" from those facts. In fact, I reiterated them.
("Yes, in some contexts, SUMIF is more efficient than SUMPRODUCT".)

But my point is still the same: the difference is usually "mouse nuts" in
the grand scheme of things. Ergo, users are usually misdirected when they
focus on such miniscule details.

Arguably, since I believe the differences are probably miniscule, it really
should not matter if the OP chooses SUMIFS because he has deluded himself
into believing it will make a significant difference performance-wise.

But I'm concerned with the principle of the matter.

Certainly, there are instances where the choice of function does make a
difference. But those are functions that have global impact; for example,
volatile functions. That's what I mean by paying "reasonable attention to
good design".

In Excel 2007 SUMIFS is "significantly" more efficient than SUMPRODUCT.
[....]
Range = 18,000 rows
[....]
.....................SUMPRODUCT vs SUMIFS
[....]
4 conditions........0.077808...........0.014776

How complex were the formulas that you had in the referenced cells? That
is, how significant was the "extraneous" computations compared to the SUMIF
and SUMPRODUCT evaluation themselves?

Perhaps the following will illustrate the concepts I am trying to explain.

I cannot offer a counter-example using SUMIFS, since I don't have Excel
2007. Perhaps you can follow my lead (below) and post an update.

In B1:B18000, I have the constants 1 through 18000. In C1:F18000, I have
the formula =CHOOSE(RANDBETWEEN(1,4),"w","x","y","z"). Of course, since
RANDBETWEEN() is volatile, this will ensure that all 72,000 cells are
re-evaluated when they are calculated.

I compare the performance of each of the following solutions separately:

A1: =SUMPRODUCT(--(C1:C18000="w"),B1:B18000) +
SUMPRODUCT(--(D1:D18000="x"),B1:B18000) +
SUMPRODUCT(--(E1:E18000="y"),B1:B18000) +
SUMPRODUCT(--(F1:F18000="z"),B1:B18000)

A2: =SUMIF(C1:C18000,"w",B1:B18000) +
SUMIF(D1:D18000,"x",B1:B18000) +
SUMIF(E1:E18000,"y",B1:B18000) +
SUMIF(F1:F18000,"z",B1:B18000)

A3: =SUMPRODUCT(--(C1:C18000="w"),B1:B18000)

A4: =SUMIF(C1:C18000,"w",B1:B18000)

A1 and A2 are not the same logic as SUMIFS, of course. And I am making 4
function calls per execution instead of 1, incurring more overhead.

But then again, that's the point: to demonstrate the effect of overhead v.
individual function execution.

When I measure the time to calculate A3 and A4 10 times without
recalculating C1:F18000, I get the following results:

A3 (SUMPROD): total 0.294 sec, avg 26.278 msec

A4 (SUMIF): total 0.073 sec, avg 6.656 msec

So the SUMIF solution seems to be 3.9 to 4.0 times faster (on my computer).

(Note: "total" does not equal to 10 times "avg" because "total" includes
macro overhead as well, e.g. looping logic, timer function calls, timer
maintenance, and VB overhead.)

When I measure the time to calculate A1 and A2 without recalculating
C1:F18000, I get the following results:

A1 (SUMPROD): total 0.747 sec, avg 61.958 msec

A2 (SUMIF): total 0.285 sec, avg 25.677 msec

So the SUMIF solution seems to be 2.4 to 2.6 times faster, but the
difference is about 37% less.

When I measure A1 and A2 with recalculation of C1:F18000 each time, I get
the following results:

A1 (SUMPROD): total 5.030 sec, avg 442.844 msec

A2 (SUMIF): total 4.576 sec, avg 405.718 msec

So the SUMIF solution seems to be only about 1.1 times faster.

As you can see, as the overhead computation increases, the difference
between the SUMIF and SUMPRODUCTS diminishes to the point where there is
almost no difference.


To reiterate my point: whether or not to focus on the performance
difference between SUMIFS and SUMPRODUCT depends on the relative
contribution that those calculations make to the whole set of calcuations.

If the user presumes that their contribution is large, we can see that even
though SUMIFS might be significantly faster, the total time difference is
not likely to be large (less 1 sec in my case).

On the other hand, if their contribution is small, we can see that the
difference between SUMIFS and SUMPRODUCT is diminished significantly by the
other calculations.

In either case, it does not seem to make sense to worry about the
performance difference between SUMIFS and SUMPRODUCT in particular.

One disadvantage of SUMIFS is that it's limited to
"straight comparisons" (just like SUMIF).

Of course, it does not make sense to ask about the relative performance of
SUMIFS v. SUMPRODUCT if SUMIFS cannot do the job. In that case, SUMPRODUCTS
is "infinitely" more efficient ;-).

So, since the OP is considering SUMPRODUCT v. SUMIFS, it is reasonable to
assume that both functions can be used. Ergo, the SUMPRODUCT expression
uses equivalently simple "straight comparisons".


An aside, for my edification....
There is no difference in efficiency between these formulas:
=SUMIFS(E:E,A:A,"w",B:B,"x",C:C,"y",D:D,"z")
=SUMIFS(E1:E18000,A1:A18000,"w",B1:B18000,"x",C1:C18000,"y",D1:D18000,"z")

That could make a significant difference between SUMPRODUCT v. SUMIFS with
sparse tables; more correctly, tables of potentially long columns, but with
significantly fewer rows filled in at the top.

Arguably, that is something for the OP to consider, if it is applicable.

But I'm curious about the "no difference" assertion.

I presume that is based on the assumption that some range property (I forget
which one) demarcates the last-used cell in a column. Right?

Is that demarcation always correct?

In Excel 2003, I notice that when I put a value or formula in some distance
cell (e.g. AI65000), then delete it, ctrl+End still goes to that distance
cell.

Does that also screw up the last-used cell demarcation in that column? For
example, would that cause the range AI:AI to be processed as AI1:AI65000,
for example? In Excel 2007 as well as Excel 2003?


----- original message -----

T. Valko said:
You made some good points for consideration but I think you may have
drifted from the question posed:
So my question is: Is it more efficient (CPU utilization)
to use SUMIFS or SUMPRODUCT?

Granted, "CPU utilization" may not be the true consideration, but I think
they meant overall efficiency.

In Excel 2007 SUMIFS is "significantly" more efficient than SUMPRODUCT.

Range = 18,000 rows
Average calc time (5 calcs)

.....................SUMPRODUCT vs SUMIFS
1 condition..........0.025024...........0.009200
2 conditions........0.042650...........0.012822
3 conditions........0.060202...........0.014202
4 conditions........0.077808...........0.014776

Another advantage is that SUMIFS only calculates based on the used range.

There is no difference in efficiency between these formulas:

=SUMIFS(E:E,A:A,"w",B:B,"x",C:C,"y",D:D,"z")

=SUMIFS(E1:E18000,A1:A18000,"w",B1:B18000,"x",C1:C18000,"y",D1:D18000,"z")

One disadvantage of SUMIFS is that it's limited to "straight comparisons"
(just like SUMIF).

--
Biff
Microsoft Excel MVP


JoeU2004 said:
Clarification....
I'm building an Excel model that will summarize data
from a table (18,000 rows x 65 columns)
[....]
for calculations that will take several seconds, not
to mention many minutes in your case

I'm sorry: I don't know how long your calculations will take. I don't
know the complexity of and how many SUMIFS or SUMPRODUCT formulas you are
talking about.

On the other hand, if your calculations do not take "several seconds, not
to mention many minutes", it is even more questionable whether you should
focus on the CPU utilization of SUMIFS v. SUMPRODUCT.


----- original message -----

JoeU2004 said:
I'm building an Excel model that will summarize data from
a table (18,000 rows x 65 columns), and I want to minimize
the number of calculations / CPU utilization.
So my question is: Is it more efficient (CPU utilization)
to use SUMIFS or SUMPRODUCT?

In my experience, it is a misdirection to focus on the CPU efficiency of
one particular operation, unless you are designing interrupt code. (And
even then, it depends.)

Generally, as long as you pay reasonable attention to good design, you
should choose the function to use based on which one makes more sense in
the context (readability, ease of maintenance, etc). If it's a toss-up
based on those factors, it is probably a toss-up with respect to
efficiency overall.

Especially for calculations that will take several seconds, not to
mention many minutes in your case, the total computation time is
influenced more by extraneous factors, namely: other workbook
calculations and external system processing such as interrupts.

I do not have Excel 2007, so I cannot compare SUMIFS to SUMPRODUCT. The
following is a comparison of SUMIF and SUMPRODUCT in Excel 2003.

Yes, in some contexts, SUMIF is more efficient than SUMPRODUCT. This is
based on a macro that I used to compare the calculation time for
SUMIF(B1:B10000,">=0",C1:C10000) and
SUMPRODUCT(--(B1:B10000>=0),C1:C10000).

But the important thing to notice is: the total run time of the macro
was about the same in both cases. This demonstrates the general
principle stated above: extraneous factors often outweigh the
differences between small parts of the entire operation.

Moreover, the difference between SUMIF and SUMPRODUCT depends on the
complexity of the formulas in the referenced cells. For example, when
the referenced cells contained constants, SUMPRODUCT took 5 times longer
than SUMIF. But when the referenced cells contained =RAND(), SUMPRODUCT
took 3.5 times longer. This, again, demonstrates the general principle:
as the complexity of the calculations in the referenced cells increases,
the differences between SUMIF and SUMPRODUCT decreases.

(In all cases, on my computer, we are talking about 2 to 13 milliseonds
per formula calculation. But the actual times and even the relative
differences will vary greatly from system to system.)

Finally, it should be noted that these calculation comparisons were done
in a pristine environment where the only calculations involved the SUMIF
or SUMPRODUCT formula and the referenced cells. In real life, if your
workbook is complicated enough to have many tens of thousands SUMIFS or
SUMPRODUCT formulas, it probably also has many thousands of other
formulas that may or may not be recalculated. Again, that overhead may
outweigh the difference between the SUMIFS and SUMPRODUCT efficiency.

One last comment, for which I have no empirical data. I suspect that
the performance difference between SUMIFS and SUMPRODUCT is less than
the difference between SUMIF and SUMPRODUCT. The basis for that is:
generally, more complex features cost more in over-all computational
time, even when used in an equivalent manner (one criterion argument, in
this case). For example, whereas the internal implementation of SUMIF
can be optimized for the one criterion, SUMIFS must be able to handle a
variable number of arguments, just as SUMPRODUCT does.

(Of course, SUMPRODUCT will always have the additional overhead of
adding products. But as that is "mouse nuts" compared to the other
overhead.)

I'm sorry if this is not the simple canned response ("this one" or "that
one") you are looking for. But the fact is: performance choices are
not simple to make. My point is simply: don't sweat the small stuff.
Undoubtedly, that "microfocus" will cause you to lose sight of the "big
picture".
 
Just venturing a passing thought here:
Isn't SUMIFS a brand new function available only in xl 2007?
Is the topic SUMIF vs SUMPRODUCT or SUMIFS vs SUMPRODUCT?
I'm quite confused ..
--
Max
Singapore
http://savefile.com/projects/236895
Downloads:27,000 Files:200 Subscribers:70
xdemechanik
---
 
Max said:
Isn't SUMIFS a brand new function available only in xl 2007?
Is the topic SUMIF vs SUMPRODUCT or SUMIFS vs SUMPRODUCT?
I'm quite confused ..

I can see how you might be, since I never said, "I do not have Excel 2007,
so I cannot compare SUMIFS to SUMPRODUCT. The following is a comparison of
SUMIF and SUMPRODUCT in Excel 2003", or "I cannot offer a counter-example
using SUMIFS, since I don't have Excel
2007. Perhaps you can follow my lead (below) and post an update".

Oh, wait a minute: those are direct quotes from my previous postings.
Klunk!


----- original message -----
 
Well, I disagree with your entire premise.

Basically, what you're saying is:

The difference in efficiency isn't significant enough to use the most
efficient methods available so why bother.

While that might be true in some cases it also doesn't allow someone to
develop the skills/gain the experience needed when those considerations can
be a significant factor.

You're a programmer. Surely you try to make your code as efficient as
possible, don't you? So what's the difference between writing efficient code
and writing efficient formulas? There is none from my perspective.

Why use a LOOKUP function when you can use a many level nested IF function?
The results are the same so why bother?

Writing efficient formulas/code doesn't take any extra effort if you have
the knowledge to do so. If you don't have that knowledge then these forums
are a great place to learn.

Is it more difficult to write this formula:

=SUMIFS(E1:E18000,A1:A18000,"w",B1:B18000,"x",C1:C18000,"y",D1:D18000,"z")

Or this formula:

=SUMPRODUCT(--(A1:A18000="w"),--(B1:B18000="x"),--(C1:C18000="y"),--(D1:D18000="z"),E1:E18000)

The few extra keystrokes notwithstanding.
 
Isn't SUMIFS a brand new function available
only in xl 2007?

Yes. It works just like SUMIF but it can reference multiple ranges with
multiple criteria.

x...y...2
a...y...3
x...y...2
y...x...2
x...x...1

=SUMIFS(C1:C5,A1:A5,"x",B1:B5,"y")

The SUMPRODUCT equivalent:

=SUMPRODUCT(--(A1:A5="x"),--(B1:B5="y"),C1:C5)

It also shares the same performace advantage that SUMIF has when compared to
the equivalent of SUMPRODUCT.
Is the topic SUMIF vs SUMPRODUCT or
SUMIFS vs SUMPRODUCT?

Well, here's the question asked by the OP:
 
T. Valko said:
Well, I disagree with your entire premise.

That's a fair comment.

Basically, what you're saying is:
The difference in efficiency isn't significant enough
to use the most efficient methods available so why bother.

Not exactly.

I am saying that the OP is deluding himself if he believes the choice
between SUMIFS and SUMPRODUCT is critical to "minimize the number of
calculations / CPU utilization" and to make his model "more efficient (CPU
utilization)".

And I am saying that is not the primary consideration when just "building an
Excel model that will summarize data from a table (18,000 rows x 65
columns)", as the OP wrote.

I would have no objection with a qualified response like: SUMIFS is faster
than SUMPRODUCT in some instances, but whether or not that will benefit your
design depends on how much of the total computation is actually consumed by
the SUMIFS or SUMPRODUCT statement(s).

I think that is the response I provided, somewhere in all the "noise" :-).

At one time, I had written something to the effect that it would help us
help him if the OP shared a little more of his design with us, especially
how the SUMIFS or SUMPRODUCT usage scaled. But I don't see that statement
off-hand. It might have gotten deleted in my attempt to make my posting
"more efficient" ;-).

You're a programmer. Surely you try to make your code as
efficient as possible, don't you?

Well, first and foremost, I focus on making __algorithms__ and __designs__
as efficient as possible. I don't sweat the small stuff like how efficient
a line of code is, at least not until I suspect it is a problem. (We are
usually wrong.)

I certainly used to, in my less-experienced days, especially when I was
writing in assembly language and especially when I was keypunching my own
cards and lugging 2000-card boxes around.

But the computing world is a very different place now.

That is not to say that I never wrestle over data structure, data type, and
operator and function usage. But I try to put that into proper perspective.

So what's the difference between writing efficient code
and writing efficient formulas?

None, IMHO.

"Efficient code" means efficient algorithms and design, first and foremost.

"Efficient formulas" is more akin to writing lines of code, IMHO. But it
can also mean efficient worksheet design and "algorithms", like doing an
expensive lookup in one place, perhaps a helper cell, instead of repeating
it ad nauseum.

Why use a LOOKUP function when you can use a many level
nested IF function? The results are the same so why bother?

There may be circumstances where the nested IF is more efficient than
LOOKUP. But generally, that is not why I opt for one or the other.

There are circumstances where you must use IF instead of LOOKUP. There are
circumstances where you must use LOOKUP instead of IF, at least in Excel
2003. There are circumstances when using LOOKUP is simply "tidier" -- more
readable and easier to maintain. There are circumstances when using nested
IFs may be more readable and easier to maintain.

Those are my primary considerations when choosing IF v. LOOKUP.

Honestly, I never considered performance, except where the nested IF might
cause needless re-evaluation of a complex conditional expression that can be
avoided by using LOOKUP.

But note that the performance issue there is not the performance difference
between IF and LOOKUP per se, but in the evaluation of their arguments.
That is a design issue.

(You mentioned the fact that SUMIFS permits a variable range like A:A,
whereas SUMPRODUCT does not. There are circumstances where that could be a
significant performance advantage over a fixed range like A1:A18000. But
again, that's a qualified statement.)

Writing efficient formulas/code doesn't take any extra effort
if you have the knowledge to do so. If you don't have that
knowledge then these forums are a great place to learn.

They can also be a great place to learn bad habits, if they are not exposed.
We all have different knowledge and experiences to share, different areas of
expertise.

Is it more difficult to write this formula:
=SUMIFS(E1:E18000,A1:A18000,"w",B1:B18000,"x",C1:C18000,"y",D1:D18000,"z")
Or this formula:
=SUMPRODUCT(--(A1:A18000="w"),--(B1:B18000="x"),--(C1:C18000="y"),--(D1:D18000="z"),E1:E18000)
The few extra keystrokes notwithstanding.

I would say they are equally easy to write.

I certainly would never consider keystrokes per se, at least not until I
approach the cell character limit. However, I do consider readability; in
some sense, they are related.

But I would write the following instead:

=SUMPRODUCT((A1:A18000="w")*(B1:B18000="x")*(C1:C18000="y")*(D1:D18000="z"),E1:E18000)

(Hmm, I never compared the two forms to see if there is a significant
performance difference. :-)

In that form, there is a difference of 3 keystrokes per criterion (minus
one). Again, we do not know how the OP's formula scales with the size of
his table(s). If he is talking about 65 criteria per formula, 210
keystrokes could make a big difference. On the other hand, SUMIFS would not
permit more than 29 criteria (14 pairs), whereas SUMPRODUCT would be
unbounded in the form that I use except for the cell character limit.

In fact, that is one primary consideration that I believe the OP should be
concerned with: the flexibility of SUMPRODUCT v. SUMIFS in terms of the
criteria that can be expressed, scalability to his design, and perhaps
familiarity.

For example, if he is writing SUMPRODUCT in a majority of places where
SUMIFS cannot be used, there is probably no harm in his continuing to use
SUMPRODUCT in places where he has a choice. There may be some benefit to
consistency of formulation in terms of readability and maintainability, as
well as the flexibility to add unforeseen criteria that SUMIFS does not
permit.

Conversely, if he is using mostly SUMIF and SUMIFS, there is no harm in his
continuing to use SUMIFS in places where he has a choice.


To summarize, the point is: there are a lot of other things to consider
first in choosing between SUMIFS and SUMPRODUCT before considering any
potential performance difference.

The operative words are "first" and "potential".

That's just my opinion. "You can lead a horse to water, but you cannot make
him drink", especially if he's beaten to death already :-).


----- original message -----
 
I have to disagree with Joel that the differences are probably miniscule.

using efficient formulae rather than inefficient formulae will frequently
reduce calculation times from hours or minutes to seconds or fractions of a
second.

Given the fact that slow response time is a significant factor in user
errors its definitely worth using more efficient formulae.

But I do agree that if a workbook recalculates in less than a second its
usually not worth optimising (unless you need a large number of repeated
calculations of course).
But my point is still the same: the difference is usually "mouse nuts" in
the grand scheme of things. Ergo, users are usually misdirected when they
focus on such miniscule details.

Arguably, since I believe the differences are probably miniscule, it
really
should not matter if the OP chooses SUMIFS because he has deluded himself
into believing it will make a significant difference performance-wise.

Charles
___________________________________
The Excel Calculation Site
http://www.decisionmodels.com
 
I would have no objection with a qualified response like:
SUMIFS is faster than SUMPRODUCT in some instances, but whether or not that
will benefit your
design depends on how much of the total computation
is actually consumed by the SUMIFS or
SUMPRODUCT statement(s).

I'll meet you half way:

SUMIFS is faster than SUMPRODUCT in some instances, but whether or not that
will benefit your design depends on how much of the total computation is
actually consumed by the SUMIFS or SUMPRODUCT statement(s). However, using
SUMIFS versus the equivalent SUMPRODUCT certainly won't hurt your efforts.
 
Just to throw another consideration in, using SUMIFS is no good if you
want backwards compatibility, i.e. if the workbook might be used with
Excel versions prior to 2007 - mind you, using SUMPRODUCT with full
coloumn references (which should be avoided, but which is allowed in
Excel 2007) will also screw up with earlier versions.

Pete
 
As the OP, let me clarify a couple of points:
1). This model will intentionally only work in Excel 2007 (or 2010), so I'm
not worried about Excel 2003 compatibility
2). CPU utilization is very important concern. With limited number of
SUMPRODUCTS / SUMIFS I currently, the model takes ~2 seconds to completely
recalculate on a P9600 (2.53GHz) with 3GB of RAM.
3). The full model I'm building will contain ~50 times the number of
calculations I currently have, so if SUMIFS is even 10% faster than
SUMPRODUCT, that will make a noticeable difference to the end user
experience.

Does the above clarify why I'm concerned with relative CPU usage?
 
Does the above clarify why I'm concerned with relative CPU usage?

Yes!

My recommendation remains unchanged....

If you can replace variations of SUMPRODUCT with SUMIFS then do so. Also,
the new COUNTIFS function is equally more efficient than the equivalent
version of SUMPRODUCT.
 
Back
Top