Hi Dana,
Inline
Would using Solver be an acceptable solution? If I understand the question,
you have a number like 90.
You multiply 90 by .33 to get 29.7. You have to decide to use integer 29,
or rounded up 30. You do this for each percentage given.
These rounded numbers must total the original 90.
I hope I understand this correctly. This is hard to explain, so here goes.
Have two columns that round the percentages down, and up.( to get the 29,
and 30 from above). In the next 3 columns, put 0,1, and for the third, use
"SUM" and sum the 0 & 1 cells. Since you can not use IF functions, this is
a way to have Solver decide on the two cells. The 0/1 cells will be a
constraint that are Binary. The Sum cell is another constraint that says it
should total 1. Solver will alternate the two 0/1 cells so that one of them
will be 1, and the other will be zero. Now, use Sumproduct on the two
rounded values (29/30, and the two cells 0/1). This will be the value that
Solver picks. To determine the "Error" I used for example (29.7- Solver's
The above can be done with just one binary variable. Suppose the
ROUNDUP values are in C16:C23 and the ROUNDDOWN values in D16
23.
Then, suppose the binary variables are in E16:E23. In F16, use the
formula =C16*E16+D16*(1-E16). This ensures that one of C16 or D16 is
picked. Copy F16 down to F17:F23.
Solver picks. To determine the "Error" I used for example (29.7- Solver's
Pick)^2.
Basically, the difference squared. (Note that using ABS(difference) will
not work in Solver, so that is why ^2).
Yes, ABS will work. It just makes the problem non-linear, but in this
case, it already is non-linear.
Copy these down for the other percentages.
Add a Sum formulas that sum the differences. Add another formula that does
this: =90-Sum(Solver's pick). Solver needs to make this equal zero.
For Solver, try to minimize the total of the "differences."
For constraints:
0/1 cells are binary
Each of Sum(x,y) cells that sum the 0/1 cells should total 1.
Set the cell that has "90-Sum(Solver's pick). " equal to zero.
You can also just designate F16:F23 as non-negative integer decision
variables. That too works just fine. ;-)
In Excel XP, you often have to run Solver twice to get the "better"
solution.
For some numbers, Solver suggested a solution that was not similar to
rounding the 7 smallest percentages, and then adjusting the 33%.
I'm not a Stat's expert, so I'm not sure if (difference)^2 is the best
measure. It appears that if you round the 7 smallest percentages, some
From what I remember of my PhD introductory stats class, sum-of-ABS is
statistically a better measure that sum-of-squares. However, sum-of-
square has become the preferred method because it was easier to program
in days-gone-by. There is a similar issue with how one calculates
variance. SUM((Xi-X-bar)^2) is the numerically superior method, but
SUM(Xi^2)-n*X-bar^2 is easier to calculate with 'pen and paper.'
measure. It appears that if you round the 7 smallest percentages, some
numbers will make the larger .33 value have to move a larger integer amount
to get back to the original value.
For example, if using (difference)^2, Solver suggests taking you number (90
* 4% = 3.6) and rounding down to 3.
This results from the use of the difference rather than the % of the
difference. When using just the difference, consider the choices for
adjusting 29.7 and 3.6. It is preferable to drop 3.6 to 3 rather than
29.7 to 29.0! However, one could use ((Adjusted-Original)/Original)^2
and now the changing 3.6 to 3 is more 'expensive' than changing it to
4!
This allowed your 90*33% = 29.7 to use 30 instead of a value 28!
A more sophisticated method would be to allocate the difference to not
just the largest value but among some of the largest values. I believe
Harlan Grove has posted some UDF solutions along those lines (though I
am not sure).
The penalty of (29.7 - 28)^2 was too high doing it this way.
Yeah, depending on how one defines the penalty function, one can get
different results. Here are a few I ran:
A B C D E F
1.42 1.02 0.043 0.042 0.043 0.44
30 30 29 30 29 29
16 16 16 16 16 16
14 13 14 13 14 13
12 12 11 11 11 12
10 10 10 10 10 10
3 3 4 4 4 4
2 3 3 3 3 3
3 3 3 3 3 3
90 90 90 90 90 90
(A): Two variables as you documented with sum-of-difference-squared
(B): One variable as illustrated above with sum-of-difference-squared
(C): One variable with sum-of-percent-difference-squared
(D): Just INT specification with sum-of-percent-difference-squared
(E): One variable as above with sum-of-percent-difference-squared
(F): Just INT specification with sum-of-abs-percent-difference
The top row is the minimum value achieved by Solver. The numbers are
*not* comparable across different methods.
My guess from looking at the results of the sum-of-percent-difference-
squared is that the objective function is reasonably flat near the
optimum and Solver finds (and indeed cycles through) one of a few
'reasonably' close choices.
--
Regards,
Tushar Mehta, MS MVP -- Excel
www.tushar-mehta.com
Excel, PowerPoint, and VBA add-ins, tutorials
Custom MS Office productivity solutions