Increaseing Precision in polynomial trendline equations

  • Thread starter Thread starter KevinW
  • Start date Start date
K

KevinW

How can I increase the precision in Excel's "Display Equation" option
for trendlines?

I am trying to fit a polynomial regression (trendline) to a X, Y
scatterplot I have created.

Excel's built-in trendline function shows a good fit (R2=0.9999) for a
6th order polynomial, however, If I re-plot the data using the
coefficients from the "Display equation" option, the data diverge
significantly.

Apparently, for high order polynomials, you need to have a high level
of precision in your coefficients (many decimal places accurate) in
order to actually re-plot the same curve.

I also tried using the "linest" function as described in Mr. Liengme's
website (http://www.stfx.ca/people/bliengme/ExcelTips/Polynomial.htm),
however, I can't seem to make this work properly. I can make it
calculate, but the values don't match data.

Any help with this problem would be appreciated.

Thanks
Kevin

PS Here is the equation that Excel displays on my chart;

y = 4E-09x6 - 9E-07x5 + 8E-05x4 - 0.0038x3 + 0.0996x2 - 1.5179x +
45.146

I am using Excel 2004 for Mac (OSX) Ver11.2
 
For a 6th order polynomial:
select a block 7 columns wide, 5 rows deep
enter =LINEST(y-range, x-range^{1,2,3,4,5,6},TRUE,TRUE)
press SHIFT+CTRL+ENTER (hold Shift down, hold CTRL down and the tap the
ENTER key)
the top line will display your coefficients

by the way: do you have a good reason for fitting to a 6th order?
how many data points?

happy new year
 
Dr Liegme;

Thanks for your quick reply.

I had previously extended the example from your website from a 3r
order poly to a 6th order, no problem. At least it returned values fo
all for all the cells. Unfortunately, the coefficients don't mak
sense. When I try to re-create the curve using those values, the curv
does not match.

I would expect that the results be fairly similar to that shown b
Excel in the "display equation" option, however they are obviousl
different.

I've tried to re-create your 3rd order polynomial example exactly a
you have displayed it on your webpage, but without success. Instead o
the coefficients,

{2,3,-6,8}

I get;

{2.16667, 2, 148.8333, -152.333}.

Clearly, I am having the same issue with the higher order polynomials
As far as I can tell, I am recreating your example and the LINES
function exactly as you have shown and I can't explain the differenc
between your results and mine. I will go back and try to trouble-shoo
the 3rd order poly example to get your results. Any insight you hav
would again be appreciated.

Thanks
Kevin

PS. I am using a 6th order polynomial to describe a curve that I hav
digitized. I have ~1000 x-y data points. I would like to have th
equation so that I can input a given x value (in my case stress) an
return a given y value (in my case, the Larson-Miller Parameter).
don't plan to extrapolate as I understand that high order polynomial
tend to rapidly diverge. Indeed, when I use excel's trendline functio
to extend the chart, it is okay on one end of the curve, but 'curl
over' on the other end
 
I tried to re-create the example at;

http://www.stfx.ca/people/bliengme/ExcelTips/Polynomial.htm

Here is the equation I used (using the 'array enter' crtl-shft-rtrn);

=LINEST(B2:B5,A2:A5^{1,2,3},TRUE,TRUE)

and here is the matrix result;

x3 x2 x b
2.166666667 2 148.8333333 -152.3333333
0 0 0 0
1 0 N/A N/A
0 0 N/A N/A
14105 0 N/A N/A


Excel's automatic trendline seems to work fine, correctly identifying
the equation as y = 2x^3 + 3x^2 - 6x + 8

I can't figure out why it won't work correctly. Am I missing something
simple? Do I need to change a preference setting I don't know about?

Thanks
Kevin
 
On the Windows side, versions of XL before 2003 had a variety of
problems with LINEST (not that 2003 is guaranteed to be error proof
{grin}). Maybe, XL2004 hasn't caught up with XL2003? Of course, it is
possible you are doing something wrong though I cannot of think of
what.

To get the results of a chart's trendline result into a Excel worksheet
see my enhancements of Dave Braden's code at
http://groups.google.com/group/microsoft.public.excel.charting/msg/0eda
30f29434786d?hl=en&

--
Regards,

Tushar Mehta
www.tushar-mehta.com
Excel, PowerPoint, and VBA add-ins, tutorials
Custom MS Office productivity solutions
 
Right click on the displayed trendline equation and format as scientific
notation with 14 decimal places.

Often fitting a polynomial with this high a degree is overfitting the
data. Even if the polynomial degree is theoretically justified, fitting
it will often be an extremely difficult numerical problem, well beyond
the capabilities of pre-2003 LINEST. If you provide your data (inline
text, not attachments in newsgroups. please), I could provide more
information.

Jerry
 
First off, thanks to Bernard, Tushar & Jerry for their helpful comments
and suggestions. I learned alot about a useful function (LINEST) and
regression curve fitting in general, something I hadn't expected when I
posted my question.

Right click on the displayed trendline equation and format as
scientific
notation with 14 decimal places.

It worked, and that was exactly what I had been trying to do
originally. With more (numeric)precision, my re-ploted data matched my
original line exactly.

I find that sometimes Excel is 'fussy' about whether a text box is
already selected when you open a formatting menu - sometimes it only
shows the "Font" menu and not the "Colors &
Lines/Font/Number/Alignment" menu depending on exactly what you have
selected. Once I figured that out, I was able to increase the displayed
precisions as I needed.

One general observation (which might be old news to some)- whenever I
work with trendlines and curve fitting, I find that Excel does not
always properly 'refresh' the trendline equation on the chart. If I
switch curves (exponential to log etc) or change the order of the
polynomial, the equation doesn't change, or shows something different
than If I plot an entirely new trendline with the same fitting
equation.

This was the case when I tried your (Jerry's) suggestion as well. I
increased the precision on the equation I had showing in Excel, and
plotted it, but the data diverged again. When I set a new trendline and
compared the equations, the coefficient of the 4th term was completely
different, and the new equation worked properly.

I'd still like to be able to use the LINEST function sometimes in the
future, but until I can figure out my issue with replicating Bernard's
example,
(<http://www.stfx.ca/people/bliengme/ExcelTips/Polynomial.htm>)
I'll have to be cautious. I will be back at my office this week, and
will try the example on my Windows computer and report back on any
results. For simplicity's sake, I hope the problem was between the
keyboard and the chair.

Thanks again
Kevin
Often fitting a polynomial with this high a degree is overfitting the
data. Even if the polynomial degree is theoretically justified,
fitting
it will often be an extremely difficult numerical problem, well beyond
the capabilities of pre-2003 LINEST. If you provide your data (inline
text, not attachments in newsgroups. please), I could provide more
information.

Jerry


Just to be clear, I am -not- trying to find the 'best fit' for a plot
of scattered data. Rather, I am trying to find an equation to describe
an existing line. I scanned and digitized a Larson-Miller curve I will
be using extensively for my thesis. The data points are very close
together. I'd like to be able to enter a value and return the
corresponding value from the curve. As long as the answer is the same
as the original curve, then I'm happy with the equation for the line.

Digitizing and inputing the curve into Excel is probably more accurate
than trying to manually read values over and over from a hardcopy
plot.

Here is an example of my data;

Row X Y
1) 30.07 62.08
2) 30.08 62.08
3) 30.09 62.08
4) 30.09 61.96
5) 30.10 61.96
6) 30.11 61.83
7) 30.12 61.83
8) 30.13 61.71
...(snip 1000 data points)....
1047) 38.81 6.25
1048) 38.82 6.25
1049) 38.83 6.24

I may want to know what the 'X' is for Y=61. Since my digitizer didn't
input a number for exactly 61, I would have to interpolate, maybe using
some sort of look up table etc or just use something close. Either way
it would be slow and semi-manual. However, Since I have *alot* of data,
the polynomial equation equation fits the line well (at least within the
accuracy of the scan etc). I can get my X for any Y I select (but bound
by {30.07,62.08} and {38.83,6.24} i.e. no extrapolating).

I hope that makes sense - if you like I could still post the data,
however there is alot of it.
 
The progression of X's is not clear. For 30.07 to 38.83 by 0.01, there
are only 877 points, yet you indicate that you have 1049 points.

Fitting a 6th degree polynomial to this narrow a range of x-values is a
very difficult numerical problem. Assuming the obvious 877 points, the
condition number for X'X is ~ 6.8E+32. For pre-2003 LINEST to give
meaningful coefficients for anything higher than a cubic would be a
numerical accident.

The chart polynomial trendline is numerically better than LINEST, and
has the potential to give reasonable results here. You might also try
R, the free open-source implementation of the S statistical programming
language
http://www.r-project.org
LINEST in Excel 2003 or later may also give reasonable results.

Since you seem to be more interested in interpolation than the actual
coefficient values, you may be OK. Prediction within this range should
be much more numerically stable than the coefficient estimates themselves.

Jerry

KevinW wrote:

....
 
KevinW said:
Just to be clear, I am -not- trying to find the 'best fit' for a plot
of scattered data. Rather, I am trying to find an equation to describe
an existing line. I scanned and digitized a Larson-Miller curve I will
be using extensively for my thesis. The data points are very close
together. I'd like to be able to enter a value and return the
corresponding value from the curve. As long as the answer is the same
as the original curve, then I'm happy with the equation for the line.

Jon Peltier (a PhD metalurgist) might be able to add more, but what I have
seen of Larson-Miller curves (based on a Google search) they are smooth
monotonic curves that may not be fit well by a single low-order polynomial
over the entire range.

You might do better with local interpolation. y = (a+b*x)/(1+c*x) is a
simple function that you can fit with 3 observations for local monotonic
interpolation. Linear interpolation (c=0) is a special case.

To fit the function, you would need 3 points bracketing your desired point,
all having distinct x-values and distinct y-values. Given the discreteness
of your observations, I would tend to use the median x-value for a given
y-value, ... You might compare or even average the results from 3 points
with 2 of them to the left of the desired point and 3 poitns with 2 of them
to the right of the desired point.

Jerry
 
As Previously mentioned in this thread, I was having difficulty with the
LINEST function in Excel 2004 for Mac (OSX) Ver11.2

I couldn't get it to return the coefficients for even a simple
quadratic equation, let alone the 6th order polynomial I was using to
describe my line.

After opening my file at my office, I was able to get LINEST to perform
correctly for my 6th order polynomial as well as reproduce the example
at http://www.stfx.ca/people/bliengme/ExcelTips/Polynomial.htm no
problem.

At work I am running Excel 2002 on Windows XP.

Since I didn't make any mods to my spreadsheet from my home computer,
I'd have to say that LINEST doesn't work for Excel 2004 for Mac.

Thanks
Kevin

P.S. With regards to Jerry’s suggestion on fitting multiple curves etc-
I could probably increase the accuracy of reproducing the best fit line
that I digitize, however the underlying scatter in the raw data that
curve was created from is huge. The Excel polynomial fit is good enough
for my needs. Generally, the Larson Miller Parameter is reported with
zero decimal places (ie for a given design stress the corresponding
LMP=29 or 30).
 
Back
Top