Matching rows to an ideal row

  • Thread starter Thread starter CamENG
  • Start date Start date
C

CamENG

Hello,

I have an interesting situation, that I hope this deep well of knowledge can help me solve. I basically have an ideal row of data in excel (each column represents a different variable), and I then have a list of other rows which have varied values in their columns (as shown below).

Ideal Row 40 20 12300 0.50

Row A 38 18 12000 0.48
Row B 45 11 12299 0.25
Row C 28 21 18000 0.52

What I want to now do is see which of these rows is the most like the idealrow (based on equal weighting of the columns).

The only idea I have, is to take the difference between each value in the test row against the value in the ideal row and then sum the differences to see which row contains the smallest sum of differences.

however, is there a better way to do this? Anybody have any thoughts?

All the best,
CamENG
 
Hello,

I have an interesting situation, that I hope this deep well of knowledge can help me solve. I basically have an ideal row of data in excel (each column represents a different variable), and I then have a list of other rows which have varied values in their columns (as shown below).

Ideal Row   40   20   12300   0.50

Row A       38   18   12000   0.48
Row B       45   11   12299   0.25
Row C       28   21   18000   0.52

What I want to now do is see which of these rows is the most like the ideal row (based on equal weighting of the columns).

The only idea I have, is to take the difference between each value in thetest row against the value in the ideal row and then sum the differences to see which row contains the smallest sum of differences.

however, is there a better way to do this? Anybody have any thoughts?

All the best,
CamENG

There is no "best" answer to this. The method you have described as
"equal weight" actually gives the large column D most weight because
differences in the other columns are swamped by the much larger
differences in the D column. You've probably already noticed this.
One simple way to cope with this would be to find the percentage
difference from the ideal row =ABS((Ideal - Row))/Ideal*100% and add
the absolute % differences. (You don't need the 100%)

This method still gives most weight to the column with the greatest %
variation. A statistician would very likely standardize the
differences (Ideal - Row)/STDEV(column), square these and add them up.

Cheers, xt
 
Hi XT,

Ya, I did notice that the difference from the largest column would swamp out the other columns. Thanks for your suggestions and I'm going to have a play today with the spreadsheet.

The reason I initially thought about posting this in this group, is that I was thinking that perhaps there already exists some kind of ideal 'solver' for this situation in Excel, but since it's so simple, probably not.

Cheers,
CamENG
 
Back
Top