Analysis of variance approach to regression analysis - PowerPoint PPT Presentation

Provided by: lsi4

Title: Analysis of variance approach to regression analysis


1
Analysis of variance approach to regression
analysis
  • an alternative approach to testing for a linear
    association

2
(No Transcript)
3
(No Transcript)
4
Basic idea (well, kind of)
  • Break down the variation in Y (total sum of
    squares) into two components:
      • a component that is due to the change in X
        (regression sum of squares)
      • a component that is just due to random error
        (error sum of squares)
  • If the regression sum of squares is much greater
    than the error sum of squares, conclude that
    there is a linear association.

5
Row  Year  Men200m
  1  1900    22.20
  2  1904    21.60
  3  1908    22.60
  4  1912    21.70
  5  1920    22.00
  6  1924    21.60
  7  1928    21.80
  8  1932    21.20
  9  1936    20.70
 10  1948    21.10
 11  1952    20.70
 12  1956    20.60
 13  1960    20.50
 14  1964    20.30
 15  1968    19.83
 16  1972    20.00
 17  1976    20.23
 18  1980    20.19
 19  1984    19.80
 20  1988    19.75
 21  1992    20.01
 22  1996    19.32
Winning times (in seconds) in men's 200 meter
Olympic sprints, 1900-1996. Are men getting
faster?
6
(No Transcript)
7
Analysis of Variance Table
Analysis of Variance
Source          DF      SS      MS      F      P
Regression       1  15.796  15.796  177.7  0.000
Residual Error  20   1.778   0.089
Total           21  17.574
The regression sum of squares, 15.796, accounts
for most of the total sum of squares, 17.574.
There appears to be a significant linear
association between year and winning times.
Let's formalize it.
8
(No Transcript)
9
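The cool thing is that the decomposition holds
for the sum of the squared deviations, too. That
is, SSTO = SSR + SSE:
$$\sum_{i=1}^{n} (Y_i - \bar{Y})^2 = \sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2 + \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$$
Total sum of squares (SSTO): $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$
Regression sum of squares (SSR): $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$
Error sum of squares (SSE): $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$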
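The decomposition SSTO = SSR + SSE can be checked numerically. The following is a minimal Python sketch, assuming an ordinary least-squares line fitted to the 200 meter data from slide 5; the variable names are illustrative and do not come from the slides.

```python
# Winning times (seconds) in the men's 200 m, 1900-1996, copied from slide 5.
years = [1900, 1904, 1908, 1912, 1920, 1924, 1928, 1932, 1936, 1948, 1952,
         1956, 1960, 1964, 1968, 1972, 1976, 1980, 1984, 1988, 1992, 1996]
times = [22.20, 21.60, 22.60, 21.70, 22.00, 21.60, 21.80, 21.20, 20.70,
         21.10, 20.70, 20.60, 20.50, 20.30, 19.83, 20.00, 20.23, 20.19,
         19.80, 19.75, 20.01, 19.32]

n = len(years)
x_bar = sum(years) / n
y_bar = sum(times) / n

# Least-squares slope (b1) and intercept (b0).
sxx = sum((x - x_bar) ** 2 for x in years)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, times))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
fitted = [b0 + b1 * x for x in years]

# The three sums of squares.
ssto = sum((y - y_bar) ** 2 for y in times)             # total
ssr = sum((f - y_bar) ** 2 for f in fitted)             # regression
sse = sum((y - f) ** 2 for y, f in zip(times, fitted))  # error

print(f"SSTO = {ssto:.3f}, SSR = {ssr:.3f}, SSE = {sse:.3f}")
print(f"SSR + SSE = {ssr + sse:.3f}")
```

Up to rounding, the printed values should agree with the Minitab table on slide 7 (SSR 15.796, SSE 1.778, SSTO 17.574).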
10
Breakdown of degrees of freedom
$$n - 1 = 1 + (n - 2)$$
Degrees of freedom associated with SSTO: $n - 1$
Degrees of freedom associated with SSR: $1$
Degrees of freedom associated with SSE: $n - 2$
11
Definition of Mean Squares
The regression mean square (MSR) is defined as
$$MSR = \frac{SSR}{1} = SSR$$
and, as you already know, the error mean square
(MSE) is defined as
$$MSE = \frac{SSE}{n - 2}$$
12
The Analysis of Variance (ANOVA) Table
13
Expected Mean Squares
If there is no linear association (β1 = 0), we'd
expect the ratio MSR/MSE to be about 1. If there
is a linear association (β1 ≠ 0), we'd expect the
ratio MSR/MSE to be greater than 1. So, use the
ratio MSR/MSE to draw a conclusion about whether
or not β1 = 0.
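The reasoning behind the ratio can be made explicit with the standard expected mean squares for the simple linear regression model. These formulas are not in the transcript; they are the usual textbook result, assuming independent errors with constant variance $\sigma^2$.

```latex
E(MSE) = \sigma^2
\qquad
E(MSR) = \sigma^2 + \beta_1^2 \sum_{i=1}^{n} (X_i - \bar{X})^2
```

When $\beta_1 = 0$, both mean squares estimate $\sigma^2$, so the ratio tends to be near 1. When $\beta_1 \neq 0$, the extra term is positive, so MSR tends to exceed MSE.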
14
The F-test
Hypotheses: H0: β1 = 0 versus HA: β1 ≠ 0
Test statistic: F = MSR/MSE
P-value: What is the probability that we'd get
an F statistic at least as large as the one we
observed, if the null hypothesis is true?
(One-tailed test!)
Determine the P-value by comparing F to an F
distribution with 1 numerator degree of freedom
and n-2 denominator degrees of freedom.
Reject the null hypothesis if the P-value is
small, that is, smaller than the level of
significance.
15
Analysis of Variance Table
Analysis of Variance
Source          DF     SS     MS      F      P
Regression       1   15.8   15.8  177.7  0.000
Residual Error  20    1.8   0.09
Total           21   17.6
DFE = n-2 = 22-2 = 20
DFTO = n-1 = 22-1 = 21
MSR = SSR/1 = 15.8
MSE = SSE/(n-2) = 1.8/20 = 0.09
F = MSR/MSE = 15.796/0.089 = 177.7
P = Probability that an F(1,20) random variable
is greater than 177.7 = 0.000 (to three decimal
places)
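The arithmetic behind the table entries can be written out as a short Python sketch. It starts from the sums of squares reported on slide 7 (15.796 and 1.778) and n = 22; the variable names are illustrative.

```python
n = 22        # number of Olympic Games in the data set
ssr = 15.796  # regression sum of squares, from the Minitab output
sse = 1.778   # error sum of squares, from the Minitab output

# Degrees of freedom: 1 for regression, n-2 for error, n-1 in total.
df_regression = 1
df_error = n - 2
df_total = n - 1

# Mean squares are sums of squares divided by their degrees of freedom.
msr = ssr / df_regression
mse = sse / df_error

# The F statistic is the ratio of the two mean squares.
f_stat = msr / mse

print(f"DF: regression {df_regression}, error {df_error}, total {df_total}")
print(f"MSR = {msr:.3f}, MSE = {mse:.4f}, F = {f_stat:.1f}")
```

The sketch keeps MSE unrounded (0.0889) when forming the ratio. Dividing by the rounded 0.089 gives about 177.5 rather than the reported 177.7.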
16
Equivalence of F-test to t-test
Predictor     Coef  SE Coef       T      P
Constant    76.153    4.152   18.34  0.000
Year       -0.0284  0.00213  -13.33  0.000

Analysis of Variance
Source          DF      SS      MS      F      P
Regression       1  15.796  15.796  177.7  0.000
Residual Error  20   1.778   0.089
Total           21  17.574
17
Equivalence of F-test to t-test
  • For a given significance level, the F-test of
    H0: β1 = 0 versus HA: β1 ≠ 0 is algebraically
    equivalent to the two-tailed t-test.
  • The two tests give the same P-values.
  • If one test leads to rejecting H0, then so will
    the other. And, if one test leads to not
    rejecting H0, then so will the other.
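In simple linear regression the equivalence shows up as t² = F, since F = MSR/MSE = b1² Σ(Xi − X̄)² / MSE = (b1 / SE(b1))². The following Python sketch checks this against the rounded t and F values in the two Minitab outputs in this presentation (the sprint data and the oxygen consumption example on slide 22). The list name `reported` is illustrative.

```python
# (label, reported t statistic for the slope, reported F statistic)
reported = [
    ("200 m winning times vs. year", -13.33, 177.7),
    ("vo2 vs. treadmill duration", 12.80, 163.77),
]

for label, t_stat, f_stat in reported:
    # Squaring the slope's t statistic should reproduce F up to rounding.
    t_squared = t_stat ** 2
    print(f"{label}: t^2 = {t_squared:.2f}, reported F = {f_stat}")
```

The small differences that remain come from the t statistics being rounded to two decimal places in the output.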

18
F test versus T test?
  • The F-test is only appropriate for testing that
    the slope differs from 0 (β1 ≠ 0). Use the t-test
    if you want to test that the slope is positive
    (β1 > 0) or negative (β1 < 0).
  • The F-test will be more useful to us later when
    we want to test that more than one slope
    parameter is 0.

19
Getting ANOVA table in Minitab
  • Default output for either command:
      • Stat >> Regression >> Regression
      • Stat >> Regression >> Fitted line plot

20
Example: Oxygen consumption related to treadmill
duration?
21
(No Transcript)
22
The regression equation is
vo2 = - 1.10 + 0.0644 duration

Predictor      Coef   SE Coef      T      P
Constant     -1.104     3.315  -0.33  0.741
duration   0.064369  0.005030  12.80  0.000

S = 4.128   R-Sq = 79.6%   R-Sq(adj) = 79.1%

Analysis of Variance
Source          DF      SS      MS       F      P
Regression       1  2790.6  2790.6  163.77  0.000
Residual Error  42   715.7    17.0
Total           43  3506.2
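The summary statistics in this output can be recovered from the ANOVA table alone. The following Python sketch assumes the standard definitions S = sqrt(MSE), R-Sq = SSR/SSTO, and R-Sq(adj) = 1 − [SSE/(n−2)] / [SSTO/(n−1)]; these formulas are not spelled out in the slides, and the variable names are illustrative.

```python
import math

# Values from the ANOVA table above; total DF is 43, so n = 43 + 1.
n = 44
ssr, sse, ssto = 2790.6, 715.7, 3506.2

mse = sse / (n - 2)                               # error mean square
s = math.sqrt(mse)                                # Minitab's "S"
r_sq = ssr / ssto                                 # R-Sq as a proportion
r_sq_adj = 1 - (sse / (n - 2)) / (ssto / (n - 1)) # R-Sq(adj) as a proportion
f_stat = (ssr / 1) / mse                          # F = MSR / MSE

print(f"S = {s:.3f}")
print(f"R-Sq = {100 * r_sq:.1f}%, R-Sq(adj) = {100 * r_sq_adj:.1f}%")
print(f"F = {f_stat:.2f}")
```

Because the table entries are rounded to one decimal place, the recomputed F can differ from the reported 163.77 in the last digit.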