Title: Two Variable Relationships (Positive Association)

1. Two Variable Relationships (Positive Association)
[Scatter plot of Y vs. X, panel (a): Linear]

2. Two Variable Relationships (Negative Association)
[Scatter plot of Y vs. X, panel (b): Linear]

3. Two Variable Relationships
[Scatter plot of Y vs. X, panel (c): Curvilinear]

4. Two Variable Relationships
[Scatter plot of Y vs. X, panel (d): Curvilinear]

5. Two Variable Relationships (No Association)
[Scatter plot of Y vs. X, panel (e): No Relationship]
6. Correlation
- The correlation coefficient is a quantitative measure of the strength of the linear relationship between two variables. The correlation ranges from +1.0 to -1.0. A correlation of ±1.0 indicates a perfect linear relationship, whereas a correlation of 0 indicates no linear relationship.
7. Correlation
- SAMPLE CORRELATION COEFFICIENT

  r = \frac{\sum(x - \bar{x})(y - \bar{y})}{\sqrt{\sum(x - \bar{x})^2 \sum(y - \bar{y})^2}}

- where
  - r = sample correlation coefficient
  - n = sample size
  - x = value of the independent variable
  - y = value of the dependent variable
8. Correlation
- SAMPLE CORRELATION COEFFICIENT, or the algebraic equivalent:

  r = \frac{n\sum xy - \sum x \sum y}{\sqrt{[\,n\sum x^2 - (\sum x)^2\,][\,n\sum y^2 - (\sum y)^2\,]}}
9. Correlation (Midwest Sales Example)

10. Correlation

11. Correlation
[Software correlation output: correlation between Years and Sales]
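The two forms of the sample correlation coefficient can be checked numerically. A minimal sketch in Python, using illustrative years-of-experience and sales data (an assumption for demonstration, not the slide's actual Midwest sample):

```python
import numpy as np

# Illustrative data (assumed): x = years of experience, y = sales
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([3.0, 5.0, 4.0, 7.0, 8.0, 7.0, 10.0, 11.0])
n = len(x)

# Definitional form: deviations from the means
sxy = np.sum((x - x.mean()) * (y - y.mean()))
sxx = np.sum((x - x.mean()) ** 2)
syy = np.sum((y - y.mean()) ** 2)
r = sxy / np.sqrt(sxx * syy)

# Algebraic equivalent using raw sums
r_alt = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / np.sqrt(
    (n * np.sum(x ** 2) - np.sum(x) ** 2) * (n * np.sum(y ** 2) - np.sum(y) ** 2)
)
```

Both forms give the same value, which also matches `np.corrcoef(x, y)[0, 1]`.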
12. Correlation
- TEST STATISTIC FOR CORRELATION

  t = \frac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}}, \quad df = n - 2

- where
  - t = number of standard deviations r is from 0
  - r = simple correlation coefficient
  - n = sample size
13. Correlation Significance Test
- H0: ρ = 0, HA: ρ ≠ 0
- Rejection region: α/2 = 0.025 in each tail
- Since t = 4.752 > 2.048, reject H0: there is a significant linear relationship.
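The decision rule above can be sketched in code. The values below are assumptions chosen to reproduce the slide's numbers: the critical value 2.048 is t at 0.025 with 28 degrees of freedom (so n = 30), and r = 0.668 gives t near the slide's 4.752:

```python
import math

r = 0.668   # assumed sample correlation (illustrative)
n = 30      # assumed sample size, consistent with df = 28

# Test statistic: t = r / sqrt((1 - r^2) / (n - 2))
t = r / math.sqrt((1 - r ** 2) / (n - 2))

t_crit = 2.048              # t_{0.025, 28} from a t table
reject_H0 = abs(t) > t_crit  # two-tailed test at alpha = 0.05
```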
14. Correlation
- Spurious correlation occurs when there is a correlation between two otherwise unrelated variables.
15. SUM OF SQUARED RESIDUALS

  SSE = \sum(y - \hat{y})^2
16. TOTAL SUM OF SQUARES

  SST = \sum(y - \bar{y})^2

- where
  - SST = total sum of squares
  - n = sample size
  - y = values of the dependent variable
  - \bar{y} = average value of the dependent variable
17. Measures of Variation: The Sum of Squares
- SST = Total Sum of Squares: measures the variation of the Yi values around their mean, Ȳ.
- SSR = Regression Sum of Squares: explained variation attributable to the relationship between X and Y.
- SSE = Error Sum of Squares: variation attributable to factors other than the relationship between X and Y. This is the measure you minimize to get the coefficient estimates.
18. Measures of Variation: The Sum of Squares
[Scatter plot showing, at a point X_i, the decomposition of the deviation of Y_i around the fitted line \hat{Y}_i = b_0 + b_1 X_i]

  SST = \sum(Y_i - \bar{Y})^2
  SSR = \sum(\hat{Y}_i - \bar{Y})^2
  SSE = \sum(Y_i - \hat{Y}_i)^2
19. Two data points (x1, y1) and (x2, y2) of a certain sample are shown.
[Scatter plot labeling, for each point, the total variation in y, the variation explained by the regression line, and the unexplained variation (error)]
20. Measures of Variation: The Sum of Squares (Example)
[Worked example computing SSR, SSE, and SST]
21. SUM OF SQUARES ERROR (RESIDUALS)

  SSE = \sum(y - \hat{y})^2

- where
  - SSE = sum of squares error
  - n = sample size
  - y = values of the dependent variable
  - \hat{y} = estimated value for the average of y for the given x value
22. SUM OF SQUARES REGRESSION

  SSR = \sum(\hat{y} - \bar{y})^2

- where
  - SSR = sum of squares regression
  - \bar{y} = average value of the dependent variable
  - y = values of the dependent variable
  - \hat{y} = estimated value for the average of y for the given x value
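The three sums of squares, and the identity SST = SSR + SSE, can be verified on a small data set (the numbers here are assumed for illustration):

```python
import numpy as np

# Illustrative data (assumed)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 2.5, 4.0, 4.5, 6.0])

# Least squares fit: b1 = Sxy / Sxx, b0 = ybar - b1 * xbar
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

sst = np.sum((y - y.mean()) ** 2)      # total variation
sse = np.sum((y - y_hat) ** 2)         # unexplained variation (error)
ssr = np.sum((y_hat - y.mean()) ** 2)  # variation explained by the regression
```

For these data the slope is exactly 1.0 and SST (10.3) splits into SSR (10.0) plus SSE (0.3).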
24. The coefficient of determination is the portion of the total variation in the dependent variable that is explained by its relationship with the independent variable. The coefficient of determination is also called R-squared and is denoted as R².
25. COEFFICIENT OF DETERMINATION (R²)

  R^2 = \frac{SSR}{SST}
26. Midwest Sales Example
- COEFFICIENT OF DETERMINATION (R²)
- 69.31% of the variation in the sales data for this sample can be explained by the linear relationship between sales and years of experience.
27. Simple Linear Regression (only!)
- COEFFICIENT OF DETERMINATION, SINGLE INDEPENDENT VARIABLE CASE:

  R^2 = r^2

- where
  - R² = coefficient of determination
  - r = simple correlation coefficient
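The single-variable identity R² = r² is easy to check numerically on assumed data:

```python
import numpy as np

# Illustrative data (assumed)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 2.5, 4.0, 4.5, 6.0])

# Fit the least squares line and compute R^2 = SSR / SST
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x
r2 = np.sum((y_hat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)

# Sample correlation coefficient
r = np.corrcoef(x, y)[0, 1]
```

Squaring r reproduces R², which holds only in the single-independent-variable case.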
28. Coefficients of Determination (R²) and Correlation (r)
[Four scatter plots, each with a fitted line \hat{Y}_i = b_0 + b_1 X_i, illustrating:
 R² = 1, r = +1; R² = 1, r = -1; R² = 0.8, r = +0.9; R² = 0, r = 0]
29. STANDARD DEVIATION OF THE REGRESSION SLOPE COEFFICIENT (POPULATION)

  \sigma_{b_1} = \frac{\sigma_\varepsilon}{\sqrt{\sum(x - \bar{x})^2}}

- where
  - \sigma_{b_1} = standard deviation of the regression slope (called the standard error of the slope)
  - \sigma_\varepsilon = population standard error of the estimate
30. ESTIMATOR FOR THE STANDARD ERROR OF THE ESTIMATE

  s_\varepsilon = \sqrt{\frac{SSE}{n - k - 1}}

- where
  - SSE = sum of squares error
  - n = sample size
  - k = number of independent variables in the model
31. ESTIMATOR FOR THE STANDARD DEVIATION OF THE REGRESSION SLOPE

  s_{b_1} = \frac{s_\varepsilon}{\sqrt{\sum(x - \bar{x})^2}}

- where
  - s_{b_1} = estimate of the standard error of the least squares slope
  - s_\varepsilon = sample standard error of the estimate
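The two estimators chain together: first the standard error of the estimate from SSE, then the standard error of the slope. A sketch using assumed summary values (matching the small example used earlier, where SSE = 0.30, n = 5, and Σ(x - x̄)² = 10):

```python
import math

# Assumed summary values (illustrative)
sse = 0.30   # sum of squares error
n = 5        # sample size
k = 1        # one independent variable (simple regression)
sxx = 10.0   # sum of (x - xbar)^2

# Standard error of the estimate: s_e = sqrt(SSE / (n - k - 1))
s_e = math.sqrt(sse / (n - k - 1))

# Standard error of the slope: s_b1 = s_e / sqrt(Sxx)
s_b1 = s_e / math.sqrt(sxx)
```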
32. TEST STATISTIC FOR TEST OF SIGNIFICANCE OF THE REGRESSION SLOPE

  t = \frac{b_1 - \beta_1}{s_{b_1}}, \quad df = n - 2

- where
  - b_1 = sample regression slope coefficient
  - \beta_1 = hypothesized slope
  - s_{b_1} = estimator of the standard error of the slope
33. Significance Test of Regression Slope (Model Utility)
- H0: β1 = 0, HA: β1 ≠ 0
- Rejection region: α/2 = 0.025 in each tail
- Since t = 4.753 > 2.048, reject H0: conclude that the true slope is not zero.
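The slope test above can be sketched with assumed values for b1 and its standard error, chosen so the resulting t is near the slide's 4.753 (2.048 is again t at 0.025 with 28 degrees of freedom):

```python
import math

b1 = 1.487       # assumed sample slope (illustrative)
beta1_0 = 0.0    # hypothesized slope under H0
s_b1 = 0.313     # assumed standard error of the slope (illustrative)

# Test statistic: t = (b1 - beta1) / s_b1
t = (b1 - beta1_0) / s_b1

reject_H0 = abs(t) > 2.048  # two-tailed test, alpha = 0.05, df = 28
```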
34. Simple Regression Steps
- Develop a scatter plot of y and x. You are looking for a linear relationship between the two variables.
- Calculate the least squares regression line for the sample data.
- Calculate the correlation coefficient and the simple coefficient of determination, R².
- Conduct one of the significance tests.
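The steps above can be sketched end to end in Python on assumed data (the scatter-plot step is noted in a comment since it is purely visual):

```python
import numpy as np

# Illustrative data (assumed)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([3.0, 5.0, 4.0, 7.0, 8.0, 7.0, 10.0, 11.0])
n = len(x)

# Step 1: develop a scatter plot, e.g. plt.scatter(x, y), and look for linearity.

# Step 2: least squares regression line
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Step 3: correlation coefficient and R^2
r = np.corrcoef(x, y)[0, 1]
r2 = r ** 2

# Step 4: significance test (t test of the correlation, df = n - 2)
t = r / np.sqrt((1 - r2) / (n - 2))
```

Comparing t against the two-tailed critical value for n - 2 degrees of freedom completes the final step.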
35. CONFIDENCE INTERVAL FOR THE SLOPE

  b_1 \pm t_{\alpha/2}\, s_{b_1}

- or equivalently

  b_1 \pm t_{\alpha/2}\, \frac{s_\varepsilon}{\sqrt{\sum(x - \bar{x})^2}}

- where
  - s_{b_1} = standard error of the regression slope
  - s_\varepsilon = standard error of the estimate
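The slope interval is a direct plug-in. A sketch using the same assumed values as the slope test (b1 = 1.487, s_b1 = 0.313, critical value 2.048):

```python
import math

b1 = 1.487       # assumed slope estimate (illustrative)
s_b1 = 0.313     # assumed standard error of the slope (illustrative)
t_crit = 2.048   # t_{0.025} with n - 2 = 28 d.f.

# Confidence interval: b1 +/- t * s_b1
lower = b1 - t_crit * s_b1
upper = b1 + t_crit * s_b1
```

Because the interval excludes zero, it leads to the same conclusion as the t test of the slope.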
36. CONFIDENCE INTERVAL FOR THE AVERAGE y, GIVEN x_p

  \hat{y} \pm t_{\alpha/2}\, s_\varepsilon \sqrt{\frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum(x - \bar{x})^2}}

- where
  - \hat{y} = point estimate of the dependent variable
  - t = critical value with n - 2 d.f.
  - s_\varepsilon = standard error of the estimate
  - n = sample size
  - x_p = specific value of the independent variable
  - \bar{x} = mean of independent variable observations
37. PREDICTION INTERVAL FOR A SINGLE y, GIVEN x_p

  \hat{y} \pm t_{\alpha/2}\, s_\varepsilon \sqrt{1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{\sum(x - \bar{x})^2}}
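The two interval half-widths differ only by the extra "1" under the square root, so the prediction interval for a single y is always wider than the confidence interval for the average y. A sketch with assumed summary values:

```python
import math

# Assumed summary values (illustrative)
y_hat_p = 7.0    # point estimate of y at x = xp
s_e = 1.2        # standard error of the estimate
n = 30           # sample size
xp = 6.0         # specific value of the independent variable
x_bar = 4.5      # mean of the x observations
sxx = 42.0       # sum of (x - xbar)^2
t_crit = 2.048   # t_{0.025} with n - 2 = 28 d.f.

# Half-width of the confidence interval for the average y at xp
half_ci = t_crit * s_e * math.sqrt(1 / n + (xp - x_bar) ** 2 / sxx)

# Half-width of the prediction interval for a single y at xp (extra 1 under the root)
half_pi = t_crit * s_e * math.sqrt(1 + 1 / n + (xp - x_bar) ** 2 / sxx)
```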
38. Residual Analysis
- Before using a regression model for description or prediction, you should check whether the assumptions concerning the normal distribution and constant variance of the error terms have been satisfied. One way to do this is through the use of residual plots.
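Computing the residuals is the first step toward these plots. A minimal sketch on assumed data, with the plotting calls left as comments since the deck does not prescribe a particular package:

```python
import numpy as np

# Illustrative data (assumed)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 2.9, 4.2, 4.4, 6.1, 6.3])

# Fit the least squares line and form the residuals e_i = y_i - y_hat_i
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()
fitted = b0 + b1 * x
residuals = y - fitted

# For a least squares fit with an intercept, the residuals sum to zero.
# To check the assumptions visually (e.g., with matplotlib):
#   plt.scatter(fitted, residuals); plt.axhline(0)   # constant variance
#   plt.hist(residuals)                              # rough normality check
```

A funnel shape in the residuals-vs-fitted plot suggests non-constant variance; strong skew in the histogram suggests non-normal errors.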