Title: Using Statistics To Make Inferences 10
1Using Statistics To Make Inferences 10
- Summary
- To fit a straight line through data.
- Goals
- Given raw data, or the appropriate sums, to fit
a straight line through the data.
- Practical
- Recall last week's practical. Perform scatter
plots, evaluate correlations and add regression
lines.
2Regression
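We wish to fit the straight line $y = ax + b$
through our data $(x_i, y_i)$, $i = 1, \dots, n$, by estimating
$a$ and $b$.
$\hat{a} = S_{xy} / S_{xx}$ and $\hat{b} = \bar{y} - \hat{a}\bar{x}$,
where $S_{xx} = \sum x_i^2 - (\sum x_i)^2 / n$
and $S_{xy} = \sum x_i y_i - (\sum x_i)(\sum y_i) / n$.
Recall these terms from lecture 9 on correlation.
Note the hat symbol (^) denoting an estimate and the
bar for average.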
3Example
- Is the height of sons related to that of their
fathers?
Father (x)  63 68 70 64 66 72 67 71 68 62
Son (y)     65 66 72 66 69 74 69 73 65 66
Note the choice of x and y variables.
First plot the data.
4Scatterplot
[Figure: scatterplot of Son (y) against Father (x)]
5Calculation
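Father (x)  63 68 70 64 66 72 67 71 68 62
Son (y)     65 66 72 66 69 74 69 73 65 66
$n = 10$
$\sum x_i = 63 + 68 + \dots + 62 = 671$
$\sum y_i = 65 + 66 + \dots + 66 = 685$
$\sum x_i^2 = 63^2 + 68^2 + \dots + 62^2 = 45127$
$\sum x_i y_i = 63 \times 65 + 68 \times 66 + \dots + 62 \times 66 = 46047$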
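The five sums on this slide can be checked directly from the raw data. The following is a minimal sketch in Python; the variable names are illustrative choices, not part of the lecture.

```python
# Heights copied from the slide: father is x, son is y.
father = [63, 68, 70, 64, 66, 72, 67, 71, 68, 62]
son = [65, 66, 72, 66, 69, 74, 69, 73, 65, 66]

n = len(father)
sum_x = sum(father)                                # sum of x_i
sum_y = sum(son)                                   # sum of y_i
sum_x2 = sum(x * x for x in father)                # sum of x_i squared
sum_xy = sum(x * y for x, y in zip(father, son))   # sum of x_i * y_i

print(n, sum_x, sum_y, sum_x2, sum_xy)  # 10 671 685 45127 46047
```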
6Calculation Sxx
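$n = 10$, $\sum x_i = 671$, $\sum y_i = 685$, $\sum x_i^2 = 45127$, $\sum x_i y_i = 46047$
$S_{xx} = \sum x_i^2 - (\sum x_i)^2 / n = 45127 - 671^2 / 10 = 102.9$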
7Calculation Sxy
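$n = 10$, $\sum x_i = 671$, $\sum y_i = 685$, $\sum x_i^2 = 45127$, $\sum x_i y_i = 46047$
$S_{xy} = \sum x_i y_i - (\sum x_i)(\sum y_i) / n = 46047 - 671 \times 685 / 10 = 83.5$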
8Calculation Gradient a
$n = 10$, $\sum x_i = 671$, $\sum y_i = 685$, $S_{xx} = 102.9$, $S_{xy} = 83.5$
Gradient or slope: $\hat{a} = S_{xy} / S_{xx} = 83.5 / 102.9 = 0.8115$
9Calculation Intercept b
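$n = 10$, $\sum x_i = 671$, $\sum y_i = 685$
$S_{xx} = 102.9$, $S_{xy} = 83.5$
Gradient or slope: $\hat{a} = 0.8115$
Intercept or constant: $\hat{b} = \bar{y} - \hat{a}\bar{x} = 68.5 - 0.8115 \times 67.1 = 14.05$
Son = 14.05 + 0.81 × Father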
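The calculation of the gradient and intercept from the sums can be sketched in a few lines of Python. The names `s_xx`, `s_xy`, `a` and `b` mirror the slide notation; nothing here is specific to SPSS.

```python
# Sums taken from the Calculation slide.
n, sum_x, sum_y = 10, 671, 685
sum_x2, sum_xy = 45127, 46047

s_xx = sum_x2 - sum_x ** 2 / n       # corrected sum of squares of x
s_xy = sum_xy - sum_x * sum_y / n    # corrected sum of products
a = s_xy / s_xx                      # gradient (slope)
b = sum_y / n - a * sum_x / n        # intercept: mean of y minus a times mean of x

print(round(s_xx, 1), round(s_xy, 1), round(a, 4), round(b, 2))
# 102.9 83.5 0.8115 14.05
```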
10Results
To produce the line, select two extreme values for
the x variable.
Father = 72, then Son = 14.05 + 0.8115 × 72 = 72.478
Father = 62, then Son = 14.05 + 0.8115 × 62 = 64.363
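As a quick check of the two end points, the fitted line can be written as a small function. This is only a sketch; `predict_son` is a hypothetical helper name, and the coefficients are the rounded values quoted on the slides.

```python
GRADIENT = 0.8115   # rounded gradient from the slides
INTERCEPT = 14.05   # rounded intercept from the slides

def predict_son(father_height):
    """Predicted son's height from the fitted line."""
    return INTERCEPT + GRADIENT * father_height

# The two extreme father heights used to draw the line.
for father_height in (62, 72):
    print(father_height, round(predict_son(father_height), 3))
```

Any two x values would define the same line; extreme values are chosen so that the drawn line spans the whole scatterplot.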
11Results
Father = 72, Son = 72.48 and Father = 62, Son = 64.36
12SPSS
Analyze > Regression > Linear
13SPSS
Same intercept and gradient
14Aside
- A test of extraversion was administered to two
groups of subjects: 8 adolescent boys and 5
adolescent girls. Assume both populations are
normally distributed. Do the two group means
differ significantly at a level of 0.05?
- Observed data for boys
- 119.81 127.59 126.61 139.07 152.12 140.11 155.77
111.71
- Observed data for girls
- 111.55 108.60 118.92 122.41 114.82
What are the key words in the question?
16Aside
- What key words describe this data?
17Aside
- two sample
- comparison of means
Which tests might be appropriate?
18Aside
- z or t
- Which is appropriate here?
Since the population standard deviation $\sigma$ is not
available, we use a two-sample t test.
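A minimal sketch of the two-sample t statistic for these data is given below. It assumes the pooled-variance (equal population variances) form of the test, which the slides do not state explicitly; `pooled_t` is a hypothetical helper name.

```python
import math
from statistics import mean, variance

# Observed extraversion scores from the Aside slide.
boys = [119.81, 127.59, 126.61, 139.07, 152.12, 140.11, 155.77, 111.71]
girls = [111.55, 108.60, 118.92, 122.41, 114.82]

def pooled_t(sample_1, sample_2):
    """Two-sample t statistic and degrees of freedom.

    Assumption: both populations share a common variance (pooled test).
    """
    n1, n2 = len(sample_1), len(sample_2)
    dof = n1 + n2 - 2
    # statistics.variance is the sample variance (divisor n - 1).
    pooled_var = ((n1 - 1) * variance(sample_1)
                  + (n2 - 1) * variance(sample_2)) / dof
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample_1) - mean(sample_2)) / std_err, dof

t_stat, dof = pooled_t(boys, girls)
print(f"t = {t_stat:.3f} on {dof} degrees of freedom")
```

The statistic would then be compared with the two-tailed 5% critical value for 11 degrees of freedom, which standard t tables give as 2.201.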
19Example
- On January 28, 1986, the space shuttle
Challenger was launched at a temperature of 31°F.
The ensuing catastrophe was caused by a
combustion gas leak through a joint in one of the
booster rockets, which was sealed by a device
called an O-ring. The data in the print version
of the notes relate launch temperature to the
number of O-rings under thermal distress for 24
previous launches.
20Challenger's rollout from Orbiter Processing
Facility to the Vehicle Assembly Building
21The crew of the final, ill-fated flight of the
Challenger
22The Challenger breaks apart 73 seconds into its
final mission
23Debris recovered from Space Shuttle Challenger
24Example
- First plot the data.
Variables: number of rings that fail and
temperature. Which is dependent?
25Scatterplot
26Calculation Sxx
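Here the independent variable is Temp (x) and the
dependent variable is Ring (y).
$n = 24$, $\sum \mathrm{Temp}_i = 1680$, $\sum \mathrm{Ring}_i = 10$, $\sum \mathrm{Temp}_i^2 = 118800$, $\sum \mathrm{Temp}_i \mathrm{Ring}_i = 627$
$S_{xx} = 118800 - 1680^2 / 24 = 1200$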
27Calculation Sxy
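Here the independent variable is Temp (x) and the
dependent variable is Ring (y).
$n = 24$, $\sum \mathrm{Temp}_i = 1680$, $\sum \mathrm{Ring}_i = 10$, $\sum \mathrm{Temp}_i^2 = 118800$, $\sum \mathrm{Temp}_i \mathrm{Ring}_i = 627$
$S_{xy} = 627 - 1680 \times 10 / 24 = -73$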
28Calculation Gradient a
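$n = 24$, $\sum \mathrm{Temp}_i = 1680$, $\sum \mathrm{Ring}_i = 10$
$S_{xx} = 1200$, $S_{xy} = -73$
Gradient or slope: $\hat{a} = S_{xy} / S_{xx} = -73 / 1200 = -0.0608$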
29Calculation Intercept b
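$n = 24$, $\sum \mathrm{Temp}_i = 1680$, $\sum \mathrm{Ring}_i = 10$
Intercept or constant: $\hat{b} = \bar{y} - \hat{a}\bar{x} = 10 / 24 + (73 / 1200) \times (1680 / 24) = 4.675$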
30Results
Prediction: at Temp = 31, Ring = 2.789; that
is, at 31°F, 2.789 rings might be expected to fail!
The temperature of interest is a gross
extrapolation!
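The whole Challenger calculation, from the quoted sums to the prediction at 31°F, can be sketched as a single function. `fit_from_sums` is a hypothetical helper name; the inputs are the sums quoted on the calculation slides.

```python
def fit_from_sums(n, sum_x, sum_y, sum_x2, sum_xy):
    """Least-squares gradient and intercept computed from the raw sums."""
    s_xx = sum_x2 - sum_x ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    gradient = s_xy / s_xx
    intercept = sum_y / n - gradient * sum_x / n
    return gradient, intercept

# x is Temp, y is Ring, n = 24 previous launches.
a, b = fit_from_sums(n=24, sum_x=1680, sum_y=10, sum_x2=118800, sum_xy=627)
rings_at_31 = b + a * 31   # prediction at the Challenger launch temperature

print(round(a, 4), round(b, 3), round(rings_at_31, 3))
# -0.0608 4.675 2.789
```

The arithmetic is the same as for the father and son example; what differs is that 31°F is the extrapolation the slide warns about, so the number is an illustration rather than a reliable forecast.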
31SPSS
Same intercept and gradient
32SPSS
Scatter plots/Regression
Graphs > Legacy Dialogs > Scatter/Dot
Simple scatter
33SPSS
34SPSS
To fit a line:
1. Open the output file
2. Double click on the graph (the chart editor will open)
3. Click on the reference line icon
4. Check "attach label to line" to add the regression equation
5. Click Apply and Close
35SPSS
Alternatively: Graphs > Interactive > Scatterplot
36SPSS
Use the Fit tab at the top of the window
37SPSS
As you will often find, the interactive graph makes
the scatter plot but without the regression line
requested. One way to fix this problem is to
set the type of each variable to scale.
Right click on the variable to set this.
38SPSS
39Future Lectures
- During the last two weeks of the course there
will be no formal lectures.
- During the lecture session I will review
examination questions as proposed by the audience.
40Future Tutorials
- During the remaining tutorial sessions we will
not require the cluster rooms.
- I will be available in my office to deal with
individual queries.
41Read
- Read Howitt and Cramer pages 75-82
- Read Russo (e-text) pages 202-213
- Read Davis and Smith pages 173-192