Title: Statistics Section 2
1 Statistics Section 2
- Relationships: two-variable statistics
2 Review
- One-variable frequency statistics
- Descriptive statistics
- The sum of squares song
- Z problems and using the table
- Characteristics of the normal curve/Z curve
- Mean of 0 and standard deviation of 1
- Inflection point
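As a quick check of the "mean of 0 and standard deviation of 1" property, the sketch below converts a set of scores to Z scores. The scores are illustrative (they happen to be the Y values from the regression example later in this section), and the population standard deviation is assumed.

```python
import statistics

# Illustrative scores; any set of scores with some spread would do.
scores = [160, 180, 180, 210, 240]

mean = statistics.fmean(scores)
sd = statistics.pstdev(scores)          # population standard deviation (assumed)

# Z score: distance from the mean, measured in standard deviations.
z = [(x - mean) / sd for x in scores]

z_mean = statistics.fmean(z)            # 0, up to floating-point rounding
z_sd = statistics.pstdev(z)             # 1, up to floating-point rounding
```

Standardizing rescales the scores without changing the shape of their distribution, so the Z table applies only when the original scores are approximately normal.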
3 Linear regression
- Scatter plot
- Linear relationship and the line of best fit
- Y intercept and slope
- Positive/negative (direct/inverse) relationship: compare slope coefficients
- Perfect/imperfect relationship
- The concept of error
4 More on regression
- The least-squares regression line
- The standard error of estimate
- Homoscedasticity
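To make "least squares" concrete, the sketch below uses the five score pairs from the example later in this section and checks that the fitted line has a smaller sum of squared errors than two slightly shifted lines. The function name `sum_squared_error` and the size of the shifts are choices made for this illustration only.

```python
X = [30, 38, 52, 90, 95]
Y = [160, 180, 180, 210, 240]
N = len(X)

# Least-squares coefficients from the sum of squares and the sum of products.
ss_x = sum(x * x for x in X) - sum(X) ** 2 / N
sp = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / N
b = sp / ss_x
a = sum(Y) / N - b * sum(X) / N

def sum_squared_error(intercept, slope):
    """Sum of squared differences between actual and predicted Y."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(X, Y))

best = sum_squared_error(a, b)
shifted_intercept = sum_squared_error(a + 1, b)   # same slope, intercept moved by 1
shifted_slope = sum_squared_error(a, b + 0.01)    # same intercept, slope moved by 0.01

print(best < shifted_intercept, best < shifted_slope)
# prints: True True
```

Any other change to a or b raises the sum of squared errors in the same way, which is what makes this particular line the "line of best fit".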
5 Computing coefficients
$Y = a + bX$
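For reference, the slope and intercept of this line can be written in terms of the sum of squares of X and the sum of products, which is the form the worked example on the following slides uses. The subscripted name $SS_X$ is notation chosen here for the quantity the slides call SSX.

```latex
\[
\begin{aligned}
SS_X &= \sum X^2 - \frac{\left(\sum X\right)^2}{N} \\
SP   &= \sum XY - \frac{\left(\sum X\right)\left(\sum Y\right)}{N} \\
b    &= \frac{SP}{SS_X}, \qquad a = \bar{Y} - b\,\bar{X}
\end{aligned}
\]
```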
6 An example
Row    Case 1   Case 2   Case 3   Case 4   Case 5   Sum
X      30       38       52       90       95       305      ($\sum X$)
Y      160      180      180      210      240      970      ($\sum Y$)
X^2    900      1,444    2,704    8,100    9,025    22,173   ($\sum X^2$)
XY     4,800    6,840    9,360    18,900   22,800   62,700   ($\sum XY$)
Y^2    25,600   32,400   32,400   44,100   57,600   192,100  ($\sum Y^2$)
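The five sums in the table can be reproduced with a few lines of code. This is only a sketch; the variable names are chosen here for illustration.

```python
# Scores from the example table.
X = [30, 38, 52, 90, 95]
Y = [160, 180, 180, 210, 240]

N = len(X)
sum_x = sum(X)                                  # sum of X
sum_y = sum(Y)                                  # sum of Y
sum_x2 = sum(x * x for x in X)                  # sum of squared X scores
sum_y2 = sum(y * y for y in Y)                  # sum of squared Y scores
sum_xy = sum(x * y for x, y in zip(X, Y))       # sum of cross-products

print(N, sum_x, sum_y, sum_x2, sum_y2, sum_xy)
# prints: 5 305 970 22173 192100 62700
```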
7 An example
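$SS_X = \sum X^2 - \frac{(\sum X)^2}{N} = 22{,}173 - \frac{305^2}{5} = 22{,}173 - \frac{93{,}025}{5} = 22{,}173 - 18{,}605 = 3{,}568$
$SP = \sum XY - \frac{(\sum X)(\sum Y)}{N} = 62{,}700 - \frac{(305)(970)}{5} = 62{,}700 - \frac{295{,}850}{5} = 62{,}700 - 59{,}170 = 3{,}530$
- $b = SP / SS_X = 3{,}530 / 3{,}568 = 0.989$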
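The same arithmetic, sketched in code from the sums in the example table (variable names are illustrative):

```python
N = 5
sum_x, sum_y = 305, 970
sum_x2, sum_xy = 22173, 62700

ss_x = sum_x2 - sum_x ** 2 / N        # 22173 - 93025/5
sp = sum_xy - sum_x * sum_y / N       # 62700 - 295850/5
b = sp / ss_x                         # slope

print(ss_x, sp, round(b, 3))
# prints: 3568.0 3530.0 0.989
```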
8 Example, continued
- $a = \bar{Y} - b\bar{X} = (970 / 5) - 0.989\,(305 / 5)$
- $= 194 - 0.989\,(61) = 194 - 60.329$
- $= 133.671$
- $Y = a + bX = 133.671 + 0.989X$
- If $X = 50$, $Y = a + bX = 133.671 + 0.989\,(50)$
- $= 133.671 + 49.45 = 183.121$
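A short sketch of the intercept and the prediction for X = 50. It assumes the slope has already been rounded to 0.989, as in the hand calculation; the function name `predict` is illustrative.

```python
N = 5
sum_x, sum_y = 305, 970
b = 0.989                      # slope rounded to three decimals, as in the hand calculation

mean_x = sum_x / N             # 61.0
mean_y = sum_y / N             # 194.0
a = mean_y - b * mean_x        # intercept

def predict(x):
    """Predicted Y for a given X on the fitted line."""
    return a + b * x

print(round(a, 3), round(predict(50), 3))
# prints: 133.671 183.121
```

Carrying the unrounded slope 3,530 / 3,568 through the calculation instead gives an intercept of about 133.650 and a prediction of about 183.117, so the third decimal depends on when the rounding is done.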
9 Rules of thumb
- Sums of squares are always 0 or positive
- SP may be 0, negative, or positive
- Consequently, the coefficients a and b may be 0, negative, or positive.
- Be especially careful with the formula $a = \bar{Y} - b\bar{X}$ when b is negative: subtracting a negative product adds it.
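To illustrate the sign rules, here is a small sketch with hypothetical scores (not from the slides) that fall exactly on a downward-sloping line.

```python
# Hypothetical scores with a perfect negative relationship (not from the slides).
X = [1, 2, 3, 4, 5]
Y = [10, 8, 6, 4, 2]

N = len(X)
ss_x = sum(x * x for x in X) - sum(X) ** 2 / N                 # never negative
sp = sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y) / N    # negative here
b = sp / ss_x
a = sum(Y) / N - b * (sum(X) / N)    # 6 - (-2)(3) = 6 + 6

print(ss_x, sp, b, a)
# prints: 10.0 -20.0 -2.0 12.0
```

Dropping the sign of b in the last step would give 6 - 6 = 0 instead of 12, which is the slip the rule of thumb warns about.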
10 When not to use linear regression for prediction
- When the assumption of homoscedasticity is seriously violated
- When the line of best fit is not straight
- When the actual scores for both variables are already known
- For scores outside the range of the data used to compute coefficients
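The last rule can be enforced mechanically. The sketch below uses the X scores and the coefficients from the worked example and refuses to predict outside the observed range of X; the function name `predict_within_range` is hypothetical.

```python
X = [30, 38, 52, 90, 95]       # X scores used to compute the coefficients
a, b = 133.671, 0.989          # coefficients from the worked example

def predict_within_range(x, xs=X):
    """Return a + b*x, refusing to extrapolate beyond the observed X range."""
    lo, hi = min(xs), max(xs)
    if not lo <= x <= hi:
        raise ValueError(f"X = {x} is outside the observed range [{lo}, {hi}]")
    return a + b * x
```

With these data, `predict_within_range(50)` returns a prediction, while `predict_within_range(120)` raises a `ValueError` because 120 lies above the largest observed X of 95.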
11 The standard error of estimate
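$S_{YX} = \sqrt{\frac{SS_Y - SP^2 / SS_X}{N - 2}}$
$SS_Y = \sum Y^2 - \frac{(\sum Y)^2}{N} = 192{,}100 - \frac{970^2}{5} = 192{,}100 - \frac{940{,}900}{5} = 192{,}100 - 188{,}180 = 3{,}920$
$SP = 3{,}530, \quad SS_X = 3{,}568$
$S_{YX} = \sqrt{\frac{3{,}920 - 3{,}530^2 / 3{,}568}{3}} = \sqrt{\frac{3{,}920 - 12{,}460{,}900 / 3{,}568}{3}}$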
12 Standard error, continued...
$S_{YX} = \sqrt{427.595 / 3} = \sqrt{142.532} = 11.94$
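The standard error of estimate for the example can be checked with a short sketch. The variable names, including `residual_ss` for the quantity SSY minus SP squared over SSX, are chosen here for illustration.

```python
import math

N = 5
sum_y, sum_y2 = 970, 192100
sp, ss_x = 3530, 3568

ss_y = sum_y2 - sum_y ** 2 / N           # 192100 - 940900/5
residual_ss = ss_y - sp ** 2 / ss_x      # SSY - SP^2 / SSX
variance_of_estimate = residual_ss / (N - 2)
s_yx = math.sqrt(variance_of_estimate)   # standard error of estimate

print(ss_y, round(variance_of_estimate, 3), round(s_yx, 2))
# prints: 3920.0 142.532 11.94
```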
13 Multiple regression
- Adding predictor variables improves the accuracy of prediction
- But there are diminishing returns