Calculating linear regression slope

  • Thread starter Thread starter Bonnie
  • Start date Start date
B

Bonnie

I need to make a query that will calculate the slope for
the change in weight over time. My data includes: ID,
date, weight. Can anyone help?
 
Hi,




slope = ( COUNT(*)*SUM(x*y) -SUM(x)*SUM(y) ) /
(COUNT(*)*SUM(x^2)-SUM(x)^2)

where x = [date] and y=weight; the slope, if positive, is the increase of
weight in function of time (if you use standard date_time value, the time
unit is "per day")


intercept=(SUM(y)-slope*SUM(x))/COUNT(*)


estStdErr=(SUM((y-intercept-slope*x)^2)/(COUNT(*)-2))^0.5


then, roughly, if the distribution of the data is from a bell shape
population, the real slope is, 99%, or more, of the cases, between the two
values: slope-k and slope+k where

k= 3 * estStdErr * SUM((x-AVG(x))^2) ^-0.5

if COUNT(*)>=16


( ref. CRC Standard Mathematical Tables and Formulae, 30th ed., article
7.10.1 and table 7.12.3)


Clearly, that is only an approximation, since a weight cannot be negative,
so, technically, the weight distribution is not a perfect bell shape curve.



Hoping it may help,
Vanderghast, Access MVP
 
Thanks for your reply, Mike. I haven't had much experience with formulas. Does the COUNT refer to all the weights, say for a distinct ID? I'm trying to get the slope for many distinct IDs with many weights, each with different date ranges so I can track any significant weight changes. I'll try to apply this to my data. If you have any more helpful comments I'd appreciate it.
 
Hi,


The COUNT and the SUM apply to the groups as defined in the GROUP BY clause.
A complete query could look like:

SELECT Id,
( COUNT(*)*SUM(x*y) -SUM(x)*SUM(y) ) /
(COUNT(*)*SUM(x^2)-SUM(x)^2) AS Slope

FROM MyTable
GROUP BY Id


Hoping it may help,
Vanderghast, Access MVP




Bonnie said:
Thanks for your reply, Mike. I haven't had much experience with formulas.
Does the COUNT refer to all the weights, say for a distinct ID? I'm trying
to get the slope for many distinct IDs with many weights, each with
different date ranges so I can track any significant weight changes. I'll
try to apply this to my data. If you have any more helpful comments I'd
appreciate it.
 
I'm still having trouble. The slope returned is always 0. First I made a query to extract patient weights for the last three years, grouping by ID. I used that query for making the slope query

SELECT [MyTable].NewID, [MyTable].Patient,
(Count(*)*Sum([Date]*[weight])-Sum([Date])*Sum([weight]))
(Count(*)*Sum([Date]^2)-Sum([Date])^2) AS Slop
FROM [MyTable
GROUP BY [MyTable].NewID, [MyTable].Patient

Was I supposed to define the date to be "in days" somewhere in the query? If so, how do I do that
I would appreciate any help. Thanks


----- Michel Walsh wrote: ----

Hi


The COUNT and the SUM apply to the groups as defined in the GROUP BY clause
A complete query could look like

SELECT Id
( COUNT(*)*SUM(x*y) -SUM(x)*SUM(y) )
(COUNT(*)*SUM(x^2)-SUM(x)^2) AS Slop

FROM MyTabl
GROUP BY I


Hoping it may help
Vanderghast, Access MV




Bonnie said:
Thanks for your reply, Mike. I haven't had much experience with formulas
Does the COUNT refer to all the weights, say for a distinct ID? I'm tryin
to get the slope for many distinct IDs with many weights, each wit
different date ranges so I can track any significant weight changes. I'l
try to apply this to my data. If you have any more helpful comments I'
appreciate it
 
Hi,


Is Date really a date_time data field?


With


weightsDate
ID weight date
0 100 2001-01-01
0 101 2001-01-02
0 102 2001-01-03
0 103 2001-01-04
1 10 2001-01-01
1 20 2001-01-02
1 30 2001-01-03



and the query

SELECT Id,
( COUNT(*)*SUM(date*weight) -SUM(date)*SUM(weight) ) /
(COUNT(*)*SUM(date^2)-SUM(date)^2) AS Slope

FROM WeightsDate
GROUP BY Id


I obtain

Query30
Id Slope
0 1
1 10



and indeed, I increased the weigth by one, for each date, for id=0 , and by
10 for id=1 (if you look back at my data), so the slopes are right, with
that data.

Can you get the numerator:

SELECT NewID, Patient, COUNT(*), SUM(date), SUM(weight), SUM(date*weight)
FROM MyTable
GROUP BY NewID, Patient


and manually check if COUNT(*) * SUM( date*weight ) = SUM(date) *
SUM(weight)

which the only way to get a zero, without a division by zero. Note that you
must have more than a single point per group to have a non zero value for
the slope.





Hoping it may help,
Vanderghast, Access MVP



Bonnie said:
I'm still having trouble. The slope returned is always 0. First I made a
query to extract patient weights for the last three years, grouping by ID.
I used that query for making the slope query:
SELECT [MyTable].NewID, [MyTable].Patient,
(Count(*)*Sum([Date]*[weight])-Sum([Date])*Sum([weight])) /
(Count(*)*Sum([Date]^2)-Sum([Date])^2) AS Slope
FROM [MyTable]
GROUP BY [MyTable].NewID, [MyTable].Patient;

Was I supposed to define the date to be "in days" somewhere in the query? If so, how do I do that?
I would appreciate any help. Thanks.


----- Michel Walsh wrote: -----

Hi,


The COUNT and the SUM apply to the groups as defined in the GROUP BY clause.
A complete query could look like:

SELECT Id,
( COUNT(*)*SUM(x*y) -SUM(x)*SUM(y) ) /
(COUNT(*)*SUM(x^2)-SUM(x)^2) AS Slope

FROM MyTable
GROUP BY Id


Hoping it may help,
Vanderghast, Access MVP




Bonnie said:
Thanks for your reply, Mike. I haven't had much experience with
formulas.
Does the COUNT refer to all the weights, say for a distinct ID? I'm trying
to get the slope for many distinct IDs with many weights, each with
different date ranges so I can track any significant weight changes. I'll
try to apply this to my data. If you have any more helpful comments I'd
appreciate it.
 
Thank you for your help, Mike. I got it to work. Couldn't have done it without you!

----- Michel Walsh wrote: -----

Hi,


Is Date really a date_time data field?


With


weightsDate
ID weight date
0 100 2001-01-01
0 101 2001-01-02
0 102 2001-01-03
0 103 2001-01-04
1 10 2001-01-01
1 20 2001-01-02
1 30 2001-01-03



and the query

SELECT Id,
( COUNT(*)*SUM(date*weight) -SUM(date)*SUM(weight) ) /
(COUNT(*)*SUM(date^2)-SUM(date)^2) AS Slope

FROM WeightsDate
GROUP BY Id


I obtain

Query30
Id Slope
0 1
1 10



and indeed, I increased the weigth by one, for each date, for id=0 , and by
10 for id=1 (if you look back at my data), so the slopes are right, with
that data.

Can you get the numerator:

SELECT NewID, Patient, COUNT(*), SUM(date), SUM(weight), SUM(date*weight)
FROM MyTable
GROUP BY NewID, Patient


and manually check if COUNT(*) * SUM( date*weight ) = SUM(date) *
SUM(weight)

which the only way to get a zero, without a division by zero. Note that you
must have more than a single point per group to have a non zero value for
the slope.





Hoping it may help,
Vanderghast, Access MVP



Bonnie said:
I'm still having trouble. The slope returned is always 0. First I made a
query to extract patient weights for the last three years, grouping by ID.
I used that query for making the slope query:
SELECT [MyTable].NewID, [MyTable].Patient,
(Count(*)*Sum([Date]*[weight])-Sum([Date])*Sum([weight])) /
(Count(*)*Sum([Date]^2)-Sum([Date])^2) AS Slope
FROM [MyTable]
GROUP BY [MyTable].NewID, [MyTable].Patient;
Was I supposed to define the date to be "in days" somewhere in the query?
If so, how do I do that?
I would appreciate any help. Thanks. clause.
A complete query could look like:
SELECT Id,
( COUNT(*)*SUM(x*y) -SUM(x)*SUM(y) ) /
(COUNT(*)*SUM(x^2)-SUM(x)^2) AS Slope
FROM MyTable GROUP BY Id Thanks for your reply, Mike. I haven't had much experience with
formulas.
Does the COUNT refer to all the weights, say for a distinct ID? I'm trying
to get the slope for many distinct IDs with many weights, each with
different date ranges so I can track any significant weight changes. I'll
try to apply this to my data. If you have any more helpful comments I'd
appreciate it.
 
Back
Top