Week 5 objectives

Slides: 37

Provided by: University520

Transcript and Presenter's Notes

1
Week 5 objectives

The simple linear regression model
Standard errors
The meaning of R-squared
Accuracy of the estimation of slope and
intercept?
Predictions and accuracy of predictions
Diagnostic plots
Multiple regression models
Prediction accuracy in multiple regression

2
1. Review questions about fitting straight lines

Suppose that a scatterplot shows a reasonably
strong, linear association between x and y
variables. It is then natural to represent that
linear association by a straight line.
The fitted line is called a regression line.

3
How to fit a straight line?
Least squares (LS) minimizes the sum of squared residuals,
which are the vertical (i.e. y-direction) distances
from the points to the line.
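
As a minimal sketch of the least-squares idea, the Python below computes the slope and intercept that minimize the sum of squared vertical residuals, then checks that a slightly different line does worse. The data values and the function names `fit_line` and `sum_squared_residuals` are made up for illustration and are not from the lecture.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope = S_xy / S_xx; the intercept makes the line pass through (x_bar, y_bar).
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def sum_squared_residuals(xs, ys, intercept, slope):
    """Sum of squared vertical distances from the points to the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = fit_line(xs, ys)
best = sum_squared_residuals(xs, ys, a, b)
# A line with a slightly different slope has a larger sum of squares.
worse = sum_squared_residuals(xs, ys, a, b + 0.1)
print(round(a, 3), round(b, 3), best < worse)
```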
4
Example of fitted line Price vs. age of houses
in Adelaide
How do we interpret slope and intercept?
5
What is the simple linear regression model?
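In fitting a line by least squares, in effect an
underlying model is proposed: $y = \beta_0 + \beta_1 x + \varepsilon$.
Here, $\beta_0$ is the true intercept, $\beta_1$ is the true
slope, and $\varepsilon$ is a random error term.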
6
What is the relationship between the true line
and the fitted line?
The true slope and intercept are $\beta_1$ and $\beta_0$. These
parameters are estimated by $b_1$ and $b_0$. How accurate are
these estimates, and how accurate is the
prediction $\hat{y}$ given an $x$?
7
2. What is standard error?

The standard error of any estimate or predicted
value is an estimate of its standard deviation.
For the present work, we use standard error
simply as a general indication of accuracy.
Minitab can provide standard errors.

8
Lecture Example 1
Selling price of houses (y) and LGA (local
government area) valuation (x). Units are thousands ('000).
9
3. What is R-squared (R²) and what does it mean?

R-squared is the coefficient of determination. It
measures the proportion of the variance among the
original y observations that is explained by
the linear regression upon x.
So the closer R-squared is to 100%, the better the
regression line fits the data.
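
To make the definition concrete, here is a small Python sketch that computes R-squared as one minus the ratio of the residual sum of squares to the total sum of squares about the mean of y. The data and the function name `r_squared` are hypothetical and chosen only for illustration.

```python
def r_squared(xs, ys):
    """Proportion of the variation in ys explained by a least-squares line on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    # Total variation of y about its mean, and the part left over after the fit.
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_resid / ss_total


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
r2 = r_squared(xs, ys)
print(f"R-squared = {100 * r2:.1f}%")
```

Points lying exactly on a straight line give an R-squared of 100%, because no residual variation is left over.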

10
Example of least squares fit
How accurate is this fit?
11
How good is the linear regression model?
Interpretations using R-squared
12
4. How accurate will be the estimation of slope
and intercept?

High values of R-squared indicate a generally
good fit of the data to the regression line
But specific measures of the accuracy of
estimates are provided by standard errors, which
can be found under the StDev column of the
regression output (headed SE Coef in the Minitab
output reproduced in the lecture exercises).
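
For readers who want to see where such standard errors come from, the sketch below applies the usual textbook formulas for simple linear regression: the residual standard deviation s is the square root of the residual sum of squares over n - 2, the standard error of the slope is s / sqrt(S_xx), and the standard error of the intercept is s * sqrt(1/n + x_bar^2 / S_xx). The data and the function name `line_with_standard_errors` are hypothetical. The lecture itself relies on Minitab for these values.

```python
import math


def line_with_standard_errors(xs, ys):
    """Return (intercept, slope, se_intercept, se_slope) for a least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(ss_resid / (n - 2))  # residual standard deviation
    se_slope = s / math.sqrt(s_xx)
    se_intercept = s * math.sqrt(1.0 / n + x_bar ** 2 / s_xx)
    return intercept, slope, se_intercept, se_slope


# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, se_b0, se_b1 = line_with_standard_errors(xs, ys)
print(f"intercept {b0:.3f} (SE {se_b0:.3f}), slope {b1:.3f} (SE {se_b1:.3f})")
```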

13
Lecture Example 1 continued
Selling price of houses (y) and LGA (local
government area) valuation (x). Units are thousands ('000).
Intercept 5.59, slope 1.285. How accurate
are these values?
14
Lecture example 1 continued: accuracy of
estimates of intercept and slope

R-squared is 96%, which suggests the overall
model fit is excellent.
The estimate of slope seems to be quite precise:
the estimate is 1.285 with a standard error of
0.107.
The estimate of intercept seems to contain a
great deal of uncertainty, because the estimate
is 5.59 and the standard error is 17.4.
But the intercept measures the sale price of a
house whose LGA valuation is 0! This unrealistic situation
is well outside the range of existing LGA values.
It is not surprising for the estimate to be
inaccurate.
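
One informal way to compare the two estimates is to divide each estimate by its standard error. The short Python sketch below does this arithmetic with the numbers quoted on the slides (intercept 5.59 with standard error 17.4, slope 1.285 with standard error 0.107). The variable names are illustrative only.

```python
# Estimates and standard errors quoted on the slides for Lecture Example 1.
estimates = {"intercept": (5.59, 17.4), "slope": (1.285, 0.107)}

ratios = {}
for name, (coef, se) in estimates.items():
    # A large ratio means the standard error is small relative to the estimate.
    ratios[name] = coef / se
    print(f"{name}: estimate / standard error = {ratios[name]:.2f}")
```

The slope is roughly twelve standard errors away from zero, while the intercept is well under one, which matches the comments above.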

15
Lecture exercise 1

How good is the regression model?
How accurate are the estimates of the intercept
and slope?
Use the Minitab output on the next slide for the
relationship between weekly profit and traffic
volume for a fast food outlet

16
Lecture exercise 1 continued
Regression Analysis: weekly profit versus traffic volume

The regression equation is
weekly profit = 3.45 + 0.143 traffic volume

Predictor   Coef      SE Coef   T      P
Constant    3.4544    0.7396    4.67   0.005
traffic     0.14254   0.02936   4.85   0.005

S = 0.6897   R-Sq = 82.5%
17
Lecture exercise 1 solution

R-squared is 82.5%, which suggests the overall
model fit is very good.
The estimate of slope seems to be very precise:
the estimate is 0.14254 with a standard error of
0.02936.
The estimate of intercept seems to be less
accurate, because the estimate is 3.4544 and the
standard error is 0.7396.
Note that the intercept measures the weekly
profit of an outlet with no traffic, which is not
a particularly realistic situation, but it is not
too far outside the range of existing traffic
volumes.

18
5. Predictions and their accuracy
Lecture Example 2: Regression model from
Lecture example 1. The fitted equation is
y = 5.6 + 1.29x. How to predict the selling price of a
house when its LGA is 140,000?
Answer: 5.59 + 1.285 × 140 ≈ 185.5
(thousands), using the unrounded slope 1.285.
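
The arithmetic can be checked with a few lines of Python. The coefficients are the ones quoted on the slides for Lecture Example 1 (intercept 5.59, slope 1.285, both in thousands). The variable names are illustrative only.

```python
intercept = 5.59  # thousands, from the slides
slope = 1.285     # unrounded slope from the slides
lga = 140         # LGA valuation, in thousands

predicted_price = intercept + slope * lga
print(f"Predicted selling price: {predicted_price:.1f} thousand")

# Using the slope rounded to 1.29 shifts the answer noticeably,
# so it is safer to keep the extra decimal place.
rounded_slope_price = intercept + 1.29 * lga
print(f"With rounded slope: {rounded_slope_price:.1f} thousand")
```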
19
So, when are predictions from a regression line
likely to be accurate?

When R-squared indicates that the quality of fit
to the linear regression model is good
When we use either interpolation, or
extrapolation close to the existing range of
x-values
The standard error of predictions measures the
accuracy of predictions and gives the definitive
answer.
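
The link between these points can be sketched numerically. For simple linear regression, the standard error of the fitted mean at a new value x0 (what Minitab reports as SE Fit) is usually written as s * sqrt(1/n + (x0 - x_bar)^2 / S_xx), so it grows as x0 moves away from the centre of the observed x-values. The Python below is a sketch under that formula. The data and the function name `se_fit` are hypothetical.

```python
import math


def se_fit(xs, ys, x0):
    """Standard error of the fitted mean response at x0 for a least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(ss_resid / (n - 2))  # residual standard deviation
    return s * math.sqrt(1.0 / n + (x0 - x_bar) ** 2 / s_xx)


# Hypothetical data with x-values between 1 and 5.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
se_centre = se_fit(xs, ys, 3.0)   # at the mean of x
se_edge = se_fit(xs, ys, 5.0)     # at the edge of the observed range
se_far = se_fit(xs, ys, 10.0)     # extrapolation well beyond the range
print(round(se_centre, 3), round(se_edge, 3), round(se_far, 3))
```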

20
Lecture example 2 continued: accuracy of the
prediction of 185,000
Interpolation or extrapolation for this
prediction?

R-squared of 96% indicates that the quality of
fit to the linear regression model is good.

21
How to find the standard error of the prediction?

In performing regression you chose Stat >
Regression > Regression, then entered sale price
under Response and LGA under Predictor.
Also select Options, and enter 140 in the box
Prediction intervals for new observations.

22
Lecture example 2 continued
23
Lecture exercise 2

Based on the fitted regression model, what is the
average weekly profit for a fast food outlet with
traffic volume of 35?
Discuss the accuracy of that prediction.

Regression Analysis: weekly profit versus traffic volume

The regression equation is
weekly profit = 3.45 + 0.143 traffic volume

Predictor   Coef      SE Coef   T      P
Constant    3.4544    0.7396    4.67   0.005
traffic     0.14254   0.02936   4.85   0.005

S = 0.6897   R-Sq = 82.5%

Predicted Values for New Observations
Fit     SE Fit
8.443   0.425
24
Lecture exercise 2 solution

Average weekly profit for an outlet with traffic
volume of 35 is 8,443.
Accuracy of prediction is reasonably high.
Reasons:
The R-squared value indicates that the overall
fit is very good.
The prediction was obtained using interpolation
rather than extrapolation.
The standard error of prediction is 425, which
is reasonably small compared to the fitted value.
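
The fitted value and the size of its standard error relative to the fit can be checked directly from the Minitab output. This short Python sketch uses only the coefficients and SE Fit quoted above. The variable names are illustrative.

```python
# Coefficients and SE Fit as quoted in the Minitab output (units of '000).
intercept = 3.4544
slope = 0.14254
traffic = 35

fit = intercept + slope * traffic
se_of_fit = 0.425
relative_se = se_of_fit / fit

print(f"Fit = {fit:.3f}, SE Fit = {se_of_fit}, ratio = {100 * relative_se:.1f}%")
```

The standard error is about 5% of the fitted value, which is the sense in which it is "reasonably small".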

25
6. Robustness questions in fitting lines

Outliers can have an effect, but
points of high leverage (far-flung values of the
explanatory variable x) can also be very influential,
especially for estimating slope.
Some residual plots are useful; see the text,
pages 97-98.
The next two slides show plots which identify
three possible outliers. The data are in
restrnt.mtp, and Sales is regressed upon newcap,
value and seats.
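
To illustrate what "high leverage" means, the sketch below computes the leverage of each point in a simple linear regression using the standard formula h_i = 1/n + (x_i - x_bar)^2 / S_xx. The x-values are hypothetical (they are not the restrnt.mtp data), with one deliberately far-flung value, and the function name `leverages` is made up for this example.

```python
def leverages(xs):
    """Leverage of each observation in a simple linear regression on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    return [1.0 / n + (x - x_bar) ** 2 / s_xx for x in xs]


# Hypothetical x-values; the last one is far from the rest.
xs = [1.0, 2.0, 3.0, 4.0, 20.0]
h = leverages(xs)
for x, h_i in zip(xs, h):
    print(f"x = {x:5.1f}  leverage = {h_i:.3f}")
```

The leverages always add up to 2 in simple linear regression (one for the intercept and one for the slope), so a single point with leverage close to 1 is largely determining the fitted slope on its own.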

26
(No Transcript)
27
(No Transcript)
28
7. Multiple linear regression: add more
explanatory variables to the model
Extend the model in Example (4.7.1) to
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$, where
$x_1$ is traffic volume (1000/day), $x_2$ is seating
capacity of the outlet, and $y$ is weekly profit
('000). An extra explanatory variable is
$x_2$. See the textbook for details.
29
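Example of multiple regression
$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{3i} + \varepsilon_i$,
where $y_1, \dots, y_n$ are scores given by tasters to n
different randomly assigned food portions. There
are three additives A, B, C being investigated.
$x_{1i}$, $x_{2i}$ and $x_{3i}$ are the amounts of A, B and
C, respectively, in the ith portion. $y_i$ is the score
given to the ith portion. What do the "slopes"
$\beta_1$, $\beta_2$ and $\beta_3$ measure?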
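
A hedged sketch may help with the question about the "slopes". In a multiple regression, each slope is usually read as the change in the expected score for a one-unit increase in that additive with the other additives held fixed. The Python below fits such a model by solving the normal equations with plain Gaussian elimination. The additive amounts, the scores, and the function names `solve` and `fit_multiple` are all made up for illustration and are not lecture data. The scores are generated exactly from known coefficients so that the fit can be seen to recover them.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_multiple(rows, ys):
    """Least-squares coefficients [b0, b1, ..., bk] via the normal equations."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical amounts of additives A, B, C in each portion.
amounts = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0),
           (1, 0, 1), (0, 1, 1), (1, 1, 1), (0, 0, 0)]
# Hypothetical scores built exactly from 5 + 2*A - 1*B + 0.5*C.
scores = [5 + 2 * a - 1 * b + 0.5 * c for a, b, c in amounts]
coefs = fit_multiple(amounts, scores)
print([round(v, 3) for v in coefs])
```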
30
Multiple Linear Regression output: standard
errors are under the StDev column
31
8. Prediction accuracy in Multiple Regression

Follow the same procedure as for simple linear
regression
In Minitab, several explanatory variable names
have been entered in the Predictors box
Select Options, and in the Prediction intervals
for new observations box, enter the designated
explanatory variable values in the same order

32
Example
Effect of four explanatory variables on the
assessed quality of 38 wines
33
Results of multiple regression
34
Comments on standard errors

The standard errors for the coefficients of
intercept, Aroma, Body, and Oakiness are not
small.
A possible reason for the large standard error
of the intercept is that the intercept estimates
the average quality of a wine whose explanatory
variable values are all zero, which is well
outside the range of recorded values shown in the
Descriptive Statistics for the explanatory variables.

35
Can we predict and give a standard error for the
average quality of a wine with the
characteristics aroma = 5.5, body = 4.6,
flavour = 5.0 and oakiness = 4.5?

The explanatory variable names entered under
Predictors were aroma, body, flavour and
oakiness, in that order.
So under Options, enter the values 5.5 4.6 5.0
4.5 in that order in the box Prediction intervals
for new observations.

36
The predicted average quality is 12.9, with
standard error 0.25, a reasonably accurate
estimate
