Regression and Prediction - PowerPoint PPT Presentation

About This Presentation
Title:

Regression and Prediction


Slides: 30
Provided by: drrober7

Transcript and Presenter's Notes


1
Regression and Prediction
  • Chapter 7

2
Linear Regression
  • Closely related to correlation, but goes a step
    further.
  • Regression refers to a type of analysis in which
    we use an individual's score on one variable to
    predict his/her score on the other.
  • This prediction is obtained by applying a best
    fitting straight line to the data set.

3
  • Theoretically, this is how it works.

First we are given a data set with a correlation
coefficient of r. Next we must obtain a line of
best fit.
4
Once a line of best fit is completed, we can
predict a score for y, based on any score for x.
5
By using regression, we match the score obtained
on x to a point on the regression line. This
point is then matched to y to obtain the
predicted score on y. Note, our predictions are
not perfect because the correlation between x and
y is not perfect. There are extraneous factors
that influence the relationship between x and y.
6
Linear Regression Formula
However, we don't simply draw in this regression
line. Really, we draw the line only
theoretically, using the following formula:

$Y' = a + b_Y X$

where
$Y'$ represents the predicted score of y,
$a$ represents the Y-intercept of the regression line, and
$b_Y$ (also written $b$) represents the slope of the regression line.
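As a minimal sketch of how the formula is applied, the Python function below returns the predicted score as the intercept plus the slope times X. The function name `predict` and the sample intercept and slope are illustrative assumptions, not values from the slides.

```python
def predict(x, a, b):
    """Return the predicted score Y' = a + b * x.

    a is the Y-intercept and b is the slope of the regression line.
    """
    return a + b * x


# Hypothetical line with intercept 1.0 and slope 2.0 (not from the slides)
print(predict(3, a=1.0, b=2.0))  # 7.0
```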
7
(No Transcript)
8
More Formulas
9
An Example
10
(No Transcript)
11
(No Transcript)
12
Now, given the regression equation, find the
predicted values of y for each x value.
13
To do this, we just plug each x value into
the derived regression equation...
14
(No Transcript)
15
Another Formula (Formula 7.6)
Here's another formula that allows us to predict Y:

$Y' = \bar{Y} + r \frac{s_y}{s_x} (X - \bar{X})$

where
$s_y$ is the standard deviation of y, and
$s_x$ is the standard deviation of x.

If we figure out the means and standard deviations of x
and y from the previous example, we can apply
this new formula to predict y scores based on x.
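A minimal Python sketch of Formula 7.6 follows: the prediction is the mean of y plus r times the ratio of the standard deviations (s_y over s_x) times the distance of X from its mean. The function name `predict_from_r` and the numbers in the usage lines are hypothetical, chosen only to show the call.

```python
def predict_from_r(x, mean_x, mean_y, r, s_x, s_y):
    """Predict Y from X using the correlation and the two standard deviations.

    The slope of the implied regression line is r * (s_y / s_x).
    """
    slope = r * (s_y / s_x)
    return mean_y + slope * (x - mean_x)


# Hypothetical summary statistics (not taken from the slides)
print(predict_from_r(10, mean_x=10, mean_y=50, r=0.5, s_x=2, s_y=4))  # 50.0
print(predict_from_r(12, mean_x=10, mean_y=50, r=0.5, s_x=2, s_y=4))  # 52.0
```

Note that an X equal to the mean of x always predicts the mean of y, whatever the value of r.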
16
An Example
Note, this is the exact same equation as
determined by the previous regression formula.
17
Another Formula (7.7)
$Y' = 110.5 + \frac{8(2113) - 20(884)}{8(64) - (20)^2} (X - 2.5)$
$Y' = 110.5 + (-6.93)(X - 2.5)$
$Y' = 110.5 - 6.93X + 17.33$
$Y' = 127.83 + (-6.93)X$
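The arithmetic on this slide can be checked with a short Python sketch. The variable names are assumptions based on reading the numbers as N = 8, a sum of x of 20, a sum of y of 884, a sum of xy of 2113, and a sum of squared x of 64, which matches the means of 2.5 and 110.5 used on the slide.

```python
# Summary values as read from the slide (interpretation is an assumption)
n, sum_x, sum_y, sum_xy, sum_x2 = 8, 20, 884, 2113, 64

mean_x = sum_x / n  # 2.5
mean_y = sum_y / n  # 110.5

# Raw-score slope: (N * sum(xy) - sum(x) * sum(y)) / (N * sum(x^2) - sum(x)^2)
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = mean_y - slope * mean_x

print(round(slope, 2), round(intercept, 2))  # -6.93 127.82
```

The intercept prints as 127.82 rather than 127.83 because the slide rounds the slope to -6.93 before multiplying by 2.5.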
18
Another Example
A researcher believes that Drug A has the
potential to reduce human reaction time. Five
subjects are given different doses of Drug A and
then told to press a button each time they hear a
tone. Their reaction times are measured.
Calculate the regression equation using the
previous three formulas.
Drug Dose X (ml/kg)    Reaction Time Y (msec)
1                      1
2                      1
3                      2
4                      2
5                      4

19
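$\bar{X} = 3$, $\sum X = 15$, $\sum X^2 = 55$, $s_x = 1.41$
$\bar{Y} = 2$, $\sum Y = 10$, $\sum Y^2 = 26$, $s_y = 1.10$
$\sum XY = 37$, $r = 0.90$

$SS_{xy} = \sum XY - \frac{(\sum X)(\sum Y)}{N} = 37 - \frac{(15)(10)}{5} = 7$

$SS_x = \sum X^2 - \frac{(\sum X)^2}{N} = 55 - \frac{(15)^2}{5} = 10$

$b = \frac{SS_{xy}}{SS_x} = \frac{7}{10} = 0.7$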
20
$a = \bar{Y} - b\bar{X} = 2 - 0.7(3) = -0.1$

$Y' = a + bX = -0.1 + 0.7X$

X = 1: $Y' = -0.1 + 0.7(1) = 0.6$
X = 2: $Y' = -0.1 + 0.7(2) = 1.3$
X = 3: $Y' = -0.1 + 0.7(3) = 2$
X = 4: $Y' = -0.1 + 0.7(4) = 2.7$
X = 5: $Y' = -0.1 + 0.7(5) = 3.4$
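The same calculation can be reproduced in a few lines of Python. This is a sketch using the five dose and reaction-time pairs from the example; the variable names are illustrative.

```python
doses = [1, 2, 3, 4, 5]   # X, drug dose (ml/kg)
times = [1, 1, 2, 2, 4]   # Y, reaction time (msec)
n = len(doses)

sum_x, sum_y = sum(doses), sum(times)

# Sums of squares, as on the slide
ss_xy = sum(x * y for x, y in zip(doses, times)) - sum_x * sum_y / n  # 7.0
ss_x = sum(x * x for x in doses) - sum_x ** 2 / n                     # 10.0

b = ss_xy / ss_x                  # slope
a = sum_y / n - b * (sum_x / n)   # intercept: mean of Y minus b times mean of X

predictions = [round(a + b * x, 1) for x in doses]
print(round(a, 1), round(b, 1), predictions)
# -0.1 0.7 [0.6, 1.3, 2.0, 2.7, 3.4]
```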
21
Another Example (7.6)
$Y' = 2 + (0.90)(1.10/1.41)(X - 3)$
$Y' = 2 + (0.7)(X - 3)$
$Y' = 2 + 0.7X - 2.1$
$Y' = -0.1 + 0.7X$
22
Another Example (7.7)
$Y' = 2 + \frac{5(37) - (15)(10)}{5(55) - (15)^2} (X - 3)$
$Y' = 2 + (0.7)(X - 3)$
$Y' = 2 + 0.7X - 2.1$
$Y' = -0.1 + 0.7X$
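To see that the three approaches give the same slope, the sketch below computes it three ways for the drug-dose data: from the sums of squares, from r and the standard deviations (Formula 7.6), and from the raw sums (Formula 7.7). It assumes population standard deviations (dividing by N), which is what reproduces the 1.41 and 1.10 on the slides.

```python
import math
from statistics import pstdev

x = [1, 2, 3, 4, 5]
y = [1, 1, 2, 2, 4]
n = len(x)

sum_x, sum_y = sum(x), sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_x2 = sum(xi ** 2 for xi in x)
sum_y2 = sum(yi ** 2 for yi in y)

ss_xy = sum_xy - sum_x * sum_y / n
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n
r = ss_xy / math.sqrt(ss_x * ss_y)

slope_ss = ss_xy / ss_x                    # sums of squares
slope_76 = r * pstdev(y) / pstdev(x)       # r * (s_y / s_x)
slope_77 = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # raw sums

print(round(r, 2), round(slope_ss, 2), round(slope_76, 2), round(slope_77, 2))
# 0.9 0.7 0.7 0.7
```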
23
Standard Error
  • We now know that our predictions for y based
    on x will not be perfect; however, we can
    calculate approximately how far off our
    predictions will be.
  • This is called the standard error of the
    estimate.
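The transcript does not reproduce the formula for the standard error of the estimate. One common definition is the square root of the summed squared prediction errors divided by N - 2; some textbooks divide by N instead. The sketch below supports both through a `ddof` parameter, which is an assumption introduced here, and reuses the drug-dose data and the regression line Y' = -0.1 + 0.7X from the earlier example.

```python
import math


def standard_error_of_estimate(actual, predicted, ddof=2):
    """Return sqrt(sum((Y - Y')^2) / (N - ddof)).

    ddof=2 is a common convention; ddof=0 divides by N.
    Which one the slides intend is not stated, so treat this as an assumption.
    """
    squared_errors = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    return math.sqrt(squared_errors / (len(actual) - ddof))


times = [1, 1, 2, 2, 4]                               # observed Y
predicted = [-0.1 + 0.7 * x for x in [1, 2, 3, 4, 5]]  # Y' from the regression line

print(round(standard_error_of_estimate(times, predicted), 2))          # 0.61
print(round(standard_error_of_estimate(times, predicted, ddof=0), 2))  # 0.47
```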

24
Explained and Unexplained Variation
This diagram represents each score's variation
from the mean, i.e., it tells how far each score
is from the mean. We can add up this variation
using $\sum (Y - \bar{Y})^2$ to get what is referred to
as total variation.
25
This diagram represents each predicted
score's variation from the mean, i.e., it tells
us how far each predicted score is from the
mean. We can add this up using the formula
$\sum (Y' - \bar{Y})^2$ to get the explained variation. The
explained variation in Y is attributed to X.
26
This diagram represents each score's variation
from the regression line, i.e., it tells how far
each score is from the regression line. We can
add up this variation using $\sum (Y - Y')^2$ to get
what is referred to as unexplained variation.
27
  • The total variation is made up of
    the explained variation and the unexplained
    variation. Thus...

Total variation = unexplained variation + explained variation
28
  • What's important about all of this is that
    because we can break up the total variation in a
    variable into explained and unexplained
    variation, we can calculate an important ratio.

$r^2 = \frac{\text{explained variation}}{\text{total variation}}$

$r^2$ is called the coefficient of determination.
It tells us what percentage of the total
variation in Y can be attributed to X.
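As a rough check of this decomposition, the sketch below computes the total, explained, and unexplained variation for the drug-dose data from the earlier example, using the regression line Y' = -0.1 + 0.7X, and then forms the ratio of explained to total variation. Variable names are illustrative.

```python
doses = [1, 2, 3, 4, 5]
times = [1, 1, 2, 2, 4]

mean_y = sum(times) / len(times)
predicted = [-0.1 + 0.7 * x for x in doses]

total = sum((y - mean_y) ** 2 for y in times)                    # scores vs. mean
explained = sum((p - mean_y) ** 2 for p in predicted)            # predictions vs. mean
unexplained = sum((y - p) ** 2 for y, p in zip(times, predicted))  # scores vs. line

r_squared = explained / total

print(round(total, 2), round(explained, 2), round(unexplained, 2))  # 6.0 4.9 1.1
print(round(r_squared, 2))  # 0.82
```

The explained and unexplained parts sum to the total (4.9 + 1.1 = 6.0). The ratio of about 0.82 agrees with squaring the unrounded r of about 0.904; squaring the rounded 0.90 gives 0.81.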
29
Example
  • How much of the variation in IQ scores from
    our previous example can be attributed to the
    number of children?
  • Recall, we calculated r to be -0.82.

Simply calculate $r^2$:
$r^2 = (-0.82)^2$
$r^2 = 0.67$
67% of the variation in IQ scores can be
attributed to the number of children in the family.
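The same calculation as a short Python sketch, using the r of -0.82 quoted above:

```python
r = -0.82

r_squared = r ** 2           # squares the whole value, sign included
print(round(r_squared, 2))   # 0.67
print(f"{r_squared:.0%}")    # 67%
```

One caution when typing this directly: in Python the literal expression `-0.82 ** 2` evaluates to -0.6724, because exponentiation binds more tightly than the minus sign. Storing r in a variable first, or writing `(-0.82) ** 2`, avoids the problem.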