SUMPRODUCT QUESTION REGARDING DASHES

  • Thread starter Thread starter DAKOTASTEVE
  • Start date Start date
D

DAKOTASTEVE

Just wondering, this formula was suggested and it works,
but I was curious why the two instances of using double
dashes is necessary for the formula to work.

Formula:
=SUMPRODUCT(--($D$2:$D$4=A2),--($E$2:$E$4=B2),($F$2:$F$4))

Data I was using:
Fund Acct# Amount Fund Org # Amount
100 22001 $20.02 100 22001 $30.10
100 22002 30.00 100 22002 $40.24
200 24003 60.00 200 24001 31.00

Basically what the formula does above is replace values in
List A with Values in List B, and if list B doesn't
correspond with fund and acct in List A, list A gets a
zero value returned. The formula works great, I just want
to learn why the dashes are used, if I don't use the
dashes, the formula doesn't work!
 
Steve,

I'm not sure since i wasn't involved in the initial problem, but it looks
that he/she who proposed the formula to you wanted to transform a text (like
20002) to an actual number.

To shwo this enter in cell A1 '20002 (mind the apostroph; also note that
the "number" is aligned left). In cell B1 enter =istext(a1) (You
should get TRUE)
Now in cell C1 =--A1 (the number in now right aligned) and in
cell D1 =istext(C1) (You showld now get FALSE)

--
Regards,
Auk Ales

* Please reply to this newsgroup only *
* I will not react on unsolicited e-mails *
 
No, it's transforming an array of booleans (true/false) to an array of
numbers (1/0) that SUMPRODUCT can actually work on.

It's an (unnecesarry ) practice that has crept in recently, it could just as
easily be written as

=SUMPRODUCT(($D$2:$D$4=A2)*($E$2:$E$4=B2),($F$2:$F$4))

or even

=SUMPRODUCT(($D$2:$D$4=A2)*($E$2:$E$4=B2)*($F$2:$F$4))

--

HTH

Bob Phillips
... looking out across Poole Harbour to the Purbecks
(remove nothere from the email address if mailing direct)
 
Hi Guys,

=SUMPRODUCT(--($D$2:$D$4=A2),--($E$2:$E$4=B2),($F$2:$F$4))

This formula is the same as:

=SUMPRODUCT(($D$2:$D$4=A2)*($E$2:$E$4=B2)*$F$2:$F$4)

The first formula uses the -- to eliminate a process cycle
in the formula interepter and thus makes it faster. The
best way to actually see what I'm talking about is to use
the EVALUATE FORMULA menu command on the two formulas and
notice the difference.

Biff
 
Unnecessary? It's not recent either, -- is but not the syntax.

[1]

=SUMPRODUCT(($D$2:$D$4=A2)*($E$2:$E$4=B2)*($F$2:$F$4))

has been the practice here. A single array SumProduct formula, so to speak.
A form that ignores a nice feature that goes with SumProduct's native
syntax. Any text value in the range to sum would error out with [1].

[2]

=SUMPRODUCT(($D$2:$D$4=A2)*($E$2:$E$4=B2),($F$2:$F$4))

allows for the text values in the range to sum. And it's a 2-array
SumProduct formula. Just as in [1], the required coercion of logical values
into numbers takes place as "side-effect" of the multiplication.

[3a]

=SUMPRODUCT(($D$2:$D$4=A2)+0,($E$2:$E$4=B2)+0,($F$2:$F$4))

follows the native syntax of the SumProduct function. I for one argued for
[3] and [2] equally often here and elsewhere.

The +0 bit does the coercion in [3]. Alternative coercers like *1 is also
often used. Harlan argued against *1 and stated that -- would even be
better. The timing results shows that coercion with -- is faster than with
+0 and with +0 is faster than *1 (mutatis mutandis faster than * as implicit
coercer in [1]). The time difference between -- and +0 is quite small.
However, the SumProduct formulas with -- are stable qua temporal profile
than those with +0 (which is a bit jumpy from run to run), although they
converge qua speed to the same figure (mean). Hence:

[3b]

=SUMPRODUCT(--($D$2:$D$4=A2),--($E$2:$E$4=B2),($F$2:$F$4))

That said, we also know that if the range to sum would span F2:H4, we are
forced to use the * syntax (that is, [1]).
 
Aladin Akyurek said:
Unnecessary? It's not recent either, -- is but not the syntax.

The original question was about the double-dash, and it was that I was
commenting on, not the syntax.
. Any text value in the range to sum would error out with [1].

It could equally be argued that that is a reson for using that syntax as it
may point to an error in the data.
follows the native syntax of the SumProduct function. I for one argued for
[3] and [2] equally often here and elsewhere.

I have seen many of your arguments for 2, but I confess that I have not seen
any regarding 3 as yet.
That said, we also know that if the range to sum would span F2:H4, we are
forced to use the * syntax (that is, [1]).

Exactly. Which suggests that we have at least 3 syntax for different
situation. As I have debated many times with Harlan in these forums, in my
experience maintenance costs as least as much as the development, so the
design should seek to mitigate the maintenance burden. Although speed is a
consideration, I think it would be most liely that there are (very) few
instances of a spreadsheet that would warrant the savings of double dash
coercion over asterisk coercion - hence unnecessary in my view.

Bob
 
Biff
THanks very much, you added real insight!
I tried to thank you but your anti-spam address prevented
me from doing so!
thx again
SteveC
Los Gatos,CA
 
Hi Steve,

Thanks for the feedback. When I was writing the reply, I
couldn't think of how to accurately describle the process
as Bob Phillips did so well. So I thought maybe a visual
demonstration would help.

Biff
 
Back
Top