Chapter 12: Multiple Regression and Model Building - PowerPoint PPT Presentation
(McClave: Statistics, 11th ed.)

Transcript and Presenter's Notes

Title: Chapter 12: Multiple Regression and Model Building


1
Chapter 12: Multiple Regression and Model Building
2
Where We've Been
  • Introduced the straight-line model relating a
    dependent variable y to an independent variable x
  • Estimated the parameters of the straight-line
    model using least squares
  • Assessed the model estimates
  • Used the model to estimate a value of y given x

3
Where We're Going
  • Introduce a multiple-regression model to relate a
    variable y to two or more x variables
  • Present multiple regression models with both
    quantitative and qualitative independent
    variables
  • Assess how well the multiple regression model
    fits the sample data
  • Show how analyzing the model residuals can help
    detect problems with the model and the necessary
    modifications

4
12.1 Multiple Regression Models
5
12.1 Multiple Regression Models
  • Analyzing a Multiple-Regression Model
  • Step 1: Hypothesize the deterministic portion of
    the model by choosing the independent variables
    x1, x2, …, xk.
  • Step 2: Estimate the unknown parameters β0, β1,
    β2, …, βk.
  • Step 3: Specify the probability distribution of ε
    and estimate the standard deviation σ of this
    distribution.

6
12.1 Multiple Regression Models
  • Analyzing a Multiple-Regression Model
  • Step 4: Check that the assumptions about ε are
    satisfied; if not, make the required modifications
    to the model.
  • Step 5: Statistically evaluate the usefulness of
    the model.
  • Step 6: If the model is useful, use it for
    prediction, estimation, and other purposes.

7
12.1 Multiple Regression Models
  • Assumptions about the Random Error ε
  • The mean is equal to 0.
  • The variance is equal to σ².
  • The probability distribution is a normal
    distribution.
  • Random errors are independent of one another.

8
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
  • A First-Order Model in Five Quantitative
    Independent Variables
  • E(y) = β0 + β1x1 + β2x2 + β3x3 + β4x4 + β5x5,
    where x1, x2, …, x5 are all quantitative
    variables that are not functions of other
    independent variables.

9
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
  • A First-Order Model in Five Quantitative
    Independent Variables
  • The parameters are estimated by finding the
    values of the βs that minimize SSE = Σ(yi − ŷi)².

10
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
  • A First-Order Model in Five Quantitative
    Independent Variables
  • The parameters are estimated by finding the
    values of the βs that minimize SSE = Σ(yi − ŷi)².

Only a truly talented mathematician (or geek)
would choose to solve the necessary system of
simultaneous linear equations by hand. In
practice, computers are left to do the
complicated calculations required by multiple
regression models.
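To make the idea concrete, here is a minimal sketch of the computation in Python with NumPy; the data values are hypothetical, invented purely for illustration:

    import numpy as np

    # Hypothetical data: y = auction price, x1 = clock age, x2 = number of bidders
    x1 = np.array([127.0, 115.0, 170.0, 182.0, 162.0, 184.0])
    x2 = np.array([13.0, 12.0, 14.0, 8.0, 11.0, 10.0])
    y = np.array([1235.0, 1080.0, 1845.0, 1522.0, 1434.0, 1698.0])

    # Design matrix with an intercept column; least squares picks the
    # coefficients that minimize SSE = sum((y - X @ beta)**2)
    X = np.column_stack([np.ones_like(x1), x1, x2])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(beta_hat)  # estimates of beta0, beta1, beta2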
11
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
  • A collector of antique clocks hypothesizes that
    the auction price (y) can be modeled as a function
    of the clock's age (x1) and the number of bidders (x2):
  • E(y) = β0 + β1x1 + β2x2

12
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
  • Based on the data in Table 12.1, the least
    squares prediction equation, the equation that
    minimizes SSE, is
  • ŷ = −1,339 + 12.74x1 + 85.95x2

13
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
  • Based on the data in Table 12.1, the least
    squares prediction equation, the equation that
    minimizes SSE, is
  • ŷ = −1,339 + 12.74x1 + 85.95x2

The estimate of β1 is interpreted as the
expected change in y given a one-unit change in
x1, holding x2 constant.
The estimate of β2 is interpreted as the
expected change in y given a one-unit change in
x2, holding x1 constant.
14
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
  • Based on the data in Table 12.1, the least
    squares prediction equation, the equation that
    minimizes SSE, is
  • ŷ = −1,339 + 12.74x1 + 85.95x2

Since it makes no sense to sell a clock of age 0
at an auction with no bidders, the intercept term
has no meaningful interpretation in this example.
15
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
Test of an Individual Parameter Coefficient in
the Multiple Regression Model
  • One-Tailed Test: H0: βi = 0 vs. Ha: βi < 0
    (or Ha: βi > 0)
  • Two-Tailed Test: H0: βi = 0 vs. Ha: βi ≠ 0
  • In either case the test statistic is t = β̂i/sβ̂i.
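As an illustration, the t-statistic and p-values can be computed directly in Python; the estimate, standard error, and degrees of freedom below are hypothetical placeholders, not values from the text:

    from scipy import stats

    # Hypothetical values: coefficient estimate, its standard error, and
    # residual degrees of freedom, df = n - (k + 1)
    beta_hat, se, df = 85.95, 8.73, 29

    t_stat = beta_hat / se                    # t = beta_hat / s(beta_hat)
    p_two = 2 * stats.t.sf(abs(t_stat), df)   # two-tailed p-value
    p_one = stats.t.sf(t_stat, df)            # one-tailed p-value for Ha: beta > 0
    print(t_stat, p_two, p_one)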

16
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
Test of the Parameter Coefficient on the Number
of Bidders
17
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
Test of the Parameter Coefficient on the Number
of Bidders
Since t > tα, reject the null hypothesis.
18
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
A 100(1 − α)% Confidence Interval for a β Parameter:
β̂i ± tα/2 · sβ̂i, where tα/2 is based on n − (k + 1)
degrees of freedom
19
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
A 100(1 − α)% Confidence Interval for β1
20
12.2 The First-Order Model: Estimating and
Making Inferences about the β Parameters
A 100(1 − α)% Confidence Interval for β1
Holding the number of bidders constant, the
result above tells us that we can be 90%
confident that the auction price will rise between
$11.20 and $14.28 for each 1-year increase in age.
21
12.3 Evaluating Overall Model Utility
  • Reject H0 for βi: there is evidence of a linear
    relationship between y and xi.
  • Do Not Reject H0 for βi: any of the following
    may be true:
  • There may be no relationship between y and xi.
  • A Type II error occurred.
  • The relationship between y and xi is more complex
    than a straight-line relationship.

22
12.3 Evaluating Overall Model Utility
  • The multiple coefficient of determination, R²,
    measures how much of the overall variation in y
    is explained by the least squares prediction
    equation: R² = 1 − SSE/SSyy.

23
12.3 Evaluating Overall Model Utility
  • High values of R² suggest a good model, but the
    usefulness of R² falls as the number of
    observations becomes close to the number of
    parameters estimated.

24
12.3 Evaluating Overall Model Utility
Ra², the adjusted coefficient of determination,
adjusts R² for the number of observations and
the number of β parameters estimated:
Ra² = 1 − [(n − 1)/(n − (k + 1))](1 − R²).
It will always have a value no greater than R².
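Both quantities are easy to script; this generic Python helper is illustrative, not taken from the text:

    import numpy as np

    def r_squared(y, y_hat, k):
        """R2 = 1 - SSE/SSyy, and adjusted R2 for k independent variables."""
        sse = np.sum((y - y_hat) ** 2)
        ss_yy = np.sum((y - np.mean(y)) ** 2)
        r2 = 1 - sse / ss_yy
        n = len(y)
        r2_adj = 1 - (n - 1) / (n - (k + 1)) * (1 - r2)
        return r2, r2_adj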
25
12.3 Evaluating Overall Model Utility
The global F-test checks H0: β1 = β2 = … = βk = 0
against Ha: at least one βj ≠ 0, using the test statistic
F = (R²/k) / [(1 − R²)/(n − (k + 1))],
with k numerator and n − (k + 1) denominator
degrees of freedom.
26
12.3 Evaluating Overall Model Utility
Rejecting the null hypothesis means that
something in your model helps explain variations
in y, but it may be that another model provides
more reliable estimates and predictions.
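As a sketch, the global F-statistic and its p-value can be computed from R²; this helper is illustrative, not from the text:

    from scipy import stats

    def global_f_test(r2, n, k):
        """Global F = (R2/k) / ((1 - R2)/(n - (k + 1))) and its p-value."""
        df1, df2 = k, n - (k + 1)
        f = (r2 / df1) / ((1 - r2) / df2)
        return f, stats.f.sf(f, df1, df2)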
27
12.3 Evaluating Overall Model Utility
  • A collector of antique clocks hypothesizes that
    the auction price can be modeled as
  • E(y) = β0 + β1x1 + β2x2

28
12.3 Evaluating Overall Model Utility
  • A collector of antique clocks hypothesizes that
    the auction price can be modeled as
  • E(y) = β0 + β1x1 + β2x2

Something in the model is useful, but the F-test
can't tell us which x-variables are individually
useful.
29
12.3 Evaluating Overall Model Utility
  • Checking the Utility of a Multiple-Regression
    Model
  • Use the F-test to conduct a test of the adequacy
    of the overall model.
  • Conduct t-tests on the most important β
    parameters.
  • Examine Ra² and 2s to evaluate how well the model
    fits the data.

30
12.4 Using the Model for Estimation and
Prediction
  • The model of antique clock prices can be used to
    predict sale prices for clocks of a certain age
    with a particular number of bidders.
  • What is the mean sale price for all 150-year-old
    clocks with 10 bidders?

31
12.4 Using the Model for Estimation and
Prediction
  • What is the sale price for a single
    150-year-old clock with 10 bidders?

The value of a single clock with these
characteristics can be predicted by using the
statistical software to generate a prediction
interval. (See Figure 12.7.) In this case, the
interval indicates that we can be 95% confident
that the price of a single 150-year-old clock
sold at auction with 10 bidders will be between
$1,154.10 and $1,709.30.
32
12.4 Using the Model for Estimation and
Prediction
33
12.4 Using the Model for Estimation and
Prediction
  • What is the predicted sale price for a single
    50-year-old clock with 2 bidders?

34
12.4 Using the Model for Estimation and
Prediction
  • What is the predicted sale price for a single
    50-year-old clock with 2 bidders?

Since an age of 50 years and 2 bidders are both
outside the range of values in our data set,
any prediction using these values would be
unreliable.
35
12.5 Model Building: Interaction Models
  • In some cases, the impact of an independent
    variable xi on y will depend on the value of some
    other independent variable xk.
  • Interaction models include the cross-products of
    independent variables as well as the first-order
    terms.
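As a quick sketch (hypothetical data again), an interaction model is fit by appending the cross-product column to the design matrix:

    import numpy as np

    # Hypothetical data: y = price, x1 = age, x2 = bidders
    x1 = np.array([127.0, 115.0, 170.0, 182.0, 162.0, 184.0])
    x2 = np.array([13.0, 12.0, 14.0, 8.0, 11.0, 10.0])
    y = np.array([1235.0, 1080.0, 1845.0, 1522.0, 1434.0, 1698.0])

    # Interaction model: E(y) = b0 + b1*x1 + b2*x2 + b3*(x1*x2)
    X = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(beta_hat)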

36
12.5 Model Building: Interaction Models
37
12.5 Model Building: Interaction Models
  • In the antique clock auction example, assume the
    collector has reason to believe that the impact
    of age (x1) on price (y) varies with the number
    of bidders (x2).
  • The model is now
  • y = β0 + β1x1 + β2x2 + β3x1x2 + ε.

38
12.5 Model Building: Interaction Models
39
12.5 Model Building: Interaction Models
  • In the antique clock auction example, assume the
    collector has reason to believe that the impact
    of age (x1) on price (y) varies with the number
    of bidders (x2).
  • The model is now
  • y = β0 + β1x1 + β2x2 + β3x1x2 + ε.

40
12.5 Model Building: Interaction Models
  • In the antique clock auction example, assume the
    collector has reason to believe that the impact
    of age (x1) on price (y) varies with the number
    of bidders (x2).
  • The model is now
  • y = β0 + β1x1 + β2x2 + β3x1x2 + ε.

The MINITAB results are reported in Figure 12.11
in the text.
41
12.5 Model Building: Interaction Models
  • In the antique clock auction example, assume the
    collector has reason to believe that the impact
    of age (x1) on price (y) varies with the number
    of bidders (x2).
  • The model is now
  • y = β0 + β1x1 + β2x2 + β3x1x2 + ε.

42
12.5 Model Building: Interaction Models
Once the interaction term has passed the t-test,
it is unnecessary to run t-tests on the individual
independent variables involved in the interaction.
43
12.6 Model Building: Quadratic and Other
Higher-Order Models
  • A quadratic (second-order) model includes the
    square of an independent variable:
  • y = β0 + β1x + β2x² + ε.
  • This allows more complex relationships to be
    modeled.
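A sketch of fitting such a curve in Python; the home-size and usage numbers are illustrative placeholders, not the text's data:

    import numpy as np

    # Illustrative data: x = home size (sq ft), y = electrical usage (kWh)
    x = np.array([1290.0, 1350.0, 1470.0, 1600.0, 1710.0,
                  1840.0, 1980.0, 2230.0, 2400.0, 2930.0])
    y = np.array([1182.0, 1172.0, 1264.0, 1493.0, 1571.0,
                  1711.0, 1804.0, 1840.0, 1956.0, 1954.0])

    # Quadratic model: E(y) = b0 + b1*x + b2*x**2
    X = np.column_stack([np.ones_like(x), x, x ** 2])
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(beta_hat)  # a negative b2 suggests usage rises at a decreasing rate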

44
12.6 Model Building: Quadratic and Other
Higher-Order Models
  • A quadratic (second-order) model includes the
    square of an independent variable:
  • y = β0 + β1x + β2x² + ε.
  • β1 is the shift parameter and
  • β2 is the rate of curvature.

45
12.6 Model Building: Quadratic and Other
Higher-Order Models
  • Example 12.7 considers whether home size (x)
    impacts electrical usage (y) in a positive but
    decreasing way.
  • The MINITAB results are shown in Figure 12.13.

46
12.6 Model Building: Quadratic and Other
Higher-Order Models
47
12.6 Model Building: Quadratic and Other
Higher-Order Models
  • According to the results, the equation that
    minimizes SSE for the 10 observations is given
    in Figure 12.13.

48
12.6 Model Building: Quadratic and Other
Higher-Order Models
49
12.6 Model Building: Quadratic and Other
Higher-Order Models
  • Since 0 is not in the range of the independent
    variable (a house of 0 ft²?), the estimated
    intercept is not meaningful.
  • The positive estimate on β1 indicates a positive
    relationship, although the slope is not constant
    (we've estimated a curve, not a straight line).
  • The negative value on β2 indicates that the rate of
    increase in power usage declines for larger homes.

50
12.6 Model Building: Quadratic and Other
Higher-Order Models
  • The Global F-Test
  • H0: β1 = β2 = 0
  • Ha: At least one of the coefficients ≠ 0
  • The test statistic is F = 189.71, p-value near 0.
  • Reject H0.

51
12.6 Model Building: Quadratic and Other
Higher-Order Models
  • t-Test of β2
  • H0: β2 = 0
  • Ha: β2 < 0
  • The test statistic is t = −7.62, p-value = .0001
    (two-tailed).
  • The one-tailed p-value is .0001/2 = .00005.
  • Reject the null hypothesis.

52
12.6 Model Building: Quadratic and Other
Higher-Order Models
  • Complete Second-Order Model with Two
    Quantitative Independent Variables
  • E(y) = β0 + β1x1 + β2x2 + β3x1x2 + β4x1² + β5x2²

β0 is the y-intercept.
β1 and β2: changing these values causes the surface
to shift along the x1 and x2 axes.
β3 controls the rotation of the surface.
β4 and β5: the signs and values of these parameters
control the type of surface and the rates of
curvature.
53
12.6 Model Building: Quadratic and Other
Higher-Order Models
54
12.7 Model Building: Qualitative (Dummy)
Variable Models
  • Qualitative variables can be included in
    regression models through the use of dummy
    variables.
  • Choose one category as the base level and create
    a 0-1 dummy variable for each of the other
    categories; an observation is coded 1 on the
    dummy for its category and 0 otherwise.
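A minimal sketch of the coding in Python, using hypothetical brand labels with Brand A as the base level:

    import numpy as np

    # Hypothetical brand labels for each observation; Brand A is the base level
    brands = np.array(["A", "B", "C", "D", "B", "A", "C", "D"])

    # One 0/1 dummy per non-base level: x1 = Brand B, x2 = Brand C, x3 = Brand D
    x1 = (brands == "B").astype(float)
    x2 = (brands == "C").astype(float)
    x3 = (brands == "D").astype(float)
    X = np.column_stack([np.ones(len(brands)), x1, x2, x3])
    print(X)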

55
12.7 Model Building: Qualitative (Dummy)
Variable Models
  • A Qualitative Independent Variable with k Levels:
  • E(y) = β0 + β1x1 + β2x2 + … + βk−1xk−1,
    where xi is the dummy variable for level i + 1 and
    xi = 1 if y is observed at level i + 1, 0 otherwise.

56
12.7 Model Building: Qualitative (Dummy)
Variable Models
  • For the golf ball example from Chapter 10, there
    were four levels (the brands). Testing differences
    in brands can be done with the model
  • E(y) = β0 + β1x1 + β2x2 + β3x3
57
12.7 Model Building: Qualitative (Dummy)
Variable Models
  • Brand A is the base level, so β0 represents the
    mean distance (μA) for Brand A, and
  • β1 = μB − μA
  • β2 = μC − μA
  • β3 = μD − μA

58
12.7 Model Building: Qualitative (Dummy)
Variable Models
  • Testing that the four means are equal is
    equivalent to testing the significance of the βs:
  • H0: β1 = β2 = β3 = 0
  • Ha: At least one of the βs ≠ 0

59
12.7 Model Building: Qualitative (Dummy)
Variable Models
  • Testing that the four means are equal is
    equivalent to testing the significance of the βs:
  • H0: β1 = β2 = β3 = 0
  • Ha: At least one of the βs ≠ 0

The test statistic is the F-statistic. Here F =
43.99, p-value ≈ .000. Hence we reject the null
hypothesis that the golf balls all have the same
mean driving distance.
60
12.7 Model Building: Qualitative (Dummy)
Variable Models
  • Testing that the four means are equal is
    equivalent to testing the significance of the βs:
  • H0: β1 = β2 = β3 = 0
  • Ha: At least one of the βs ≠ 0

The test statistic is the F-statistic. Here F =
43.99, p-value ≈ .000. Hence we reject the null
hypothesis that the golf balls all have the same
mean driving distance.
Remember that the maximum number of dummy
variables is one less than the number of levels
for the qualitative variable.
61
12.8 Model Building: Models with Both
Quantitative and Qualitative Variables
  • Suppose a first-order model is used to evaluate
    the impact on mean monthly sales of expenditures
    in three advertising media: television, radio, and
    newspaper.
  • Expenditure, x1, is a quantitative variable.
  • Type of media is qualitative, captured by the
    k − 1 = 2 dummy variables x2 and x3.

62
12.8 Model Building: Models with Both
Quantitative and Qualitative Variables
63
12.8 Model Building: Models with Both
Quantitative and Qualitative Variables
64
12.8 Model Building: Models with Both
Quantitative and Qualitative Variables
  • Suppose now a second-order model is used to
    evaluate the impact of expenditures in the three
    advertising media on sales.
  • The relationship between expenditures, x1, and
    sales, y, is assumed to be curvilinear.

65
12.8 Model Building: Models with Both
Quantitative and Qualitative Variables
  • In this model, each medium is assumed to have
    the same impact on sales.

66
12.8 Model Building: Models with Both
Quantitative and Qualitative Variables
In this model, the intercepts differ but the
shapes of the curves are the same.
67
12.8 Model Building: Models with Both
Quantitative and Qualitative Variables
In this model, the response curve for each media
type is different; that is, advertising
expenditure and media type interact at varying
rates.
68
12.9 Model Building: Comparing Nested Models
  • Two models are nested if one model contains all
    the terms of the second model and at least one
    additional term. The more complex of the two
    models is called the complete model and the
    simpler of the two is called the reduced model.

69
12.9 Model Building: Comparing Nested Models
  • Recall the interaction model relating the auction
    price (y) of antique clocks to age (x1) and
    bidders (x2):
  • E(y) = β0 + β1x1 + β2x2 + β3x1x2
70
12.9 Model Building: Comparing Nested Models
  • If the relationship is not constant, a
    second-order model should be considered:
  • E(y) = β0 + β1x1 + β2x2 + β3x1x2 + β4x1² + β5x2²
71
12.9 Model Building: Comparing Nested Models
  • If the complete model produces a better fit, then
    the βs on the quadratic terms should be
    significant:
  • H0: β4 = β5 = 0
  • Ha: At least one of β4 and β5 is nonzero

72
12.9 Model Building: Comparing Nested Models
  • F-Test for Comparing Nested Models
  • F = [(SSER − SSEC)/(k − g)] / MSEC
73
12.9 Model Building: Comparing Nested Models
  • F-Test for Comparing Nested Models
  • F = [(SSER − SSEC)/(k − g)] / [SSEC/(n − (k + 1))]
  • where
  • SSER = sum of squared errors for the reduced
    model
  • SSEC = sum of squared errors for the complete
    model
  • MSEC = mean square error (s²) for the complete
    model
  • k − g = number of β parameters specified in H0
  • k + 1 = number of β parameters in the complete
    model
  • n = sample size
  • Rejection region: F > Fα, with k − g numerator
    and n − (k + 1) denominator degrees of freedom.
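A small helper illustrating the computation (a sketch, not from the text):

    from scipy import stats

    def nested_f_test(sse_r, sse_c, n, k, g):
        """F-test for nested models; H0 says the k - g extra betas are all 0."""
        df_num = k - g                # number of betas specified in H0
        df_den = n - (k + 1)          # error df for the complete model
        f = ((sse_r - sse_c) / df_num) / (sse_c / df_den)
        return f, stats.f.sf(f, df_num, df_den)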

74
12.9 Model Building: Comparing Nested Models
  • The growth of carnations (y) is assumed to be a
    function of the temperature (x1) and the amount
    of fertilizer (x2).
  • The data are shown in Table 12.6 in the text.

75
12.9 Model Building: Comparing Nested Models
  • The growth of carnations (y) is assumed to be a
    function of the temperature (x1) and the amount
    of fertilizer (x2).

The complete second-order model is
E(y) = β0 + β1x1 + β2x2 + β3x1x2 + β4x1² + β5x2².
The least squares prediction equation, fit to the
data in Table 12.6, is given (rounded) in the text.
76
12.9 Model Building: Comparing Nested Models
  • The growth of carnations (y) is assumed to be a
    function of the temperature (x1) and the amount
    of fertilizer (x2).

To test the significance of the contribution of
the interaction and second-order terms, use
H0: β3 = β4 = β5 = 0
Ha: At least one of β3, β4, or β5 ≠ 0
This requires estimating the reduced model,
dropping the parameters in the null hypothesis.
Results are given in Figure 12.31.
77
12.9 Model Building: Comparing Nested Models
78
12.9 Model Building: Comparing Nested Models
Reject the null hypothesis; the complete model
seems to provide better predictions than the
reduced model.
79
12.9 Model Building: Comparing Nested Models
  • A parsimonious model is a general linear model
    with a small number of β parameters. In
    situations where two competing models have
    essentially the same predictive power (as
    determined by an F-test), choose the more
    parsimonious of the two.

80
12.9 Model Building: Comparing Nested Models
  • A parsimonious model is a general linear model
    with a small number of β parameters. In
    situations where two competing models have
    essentially the same predictive power (as
    determined by an F-test), choose the more
    parsimonious of the two.

If the models are not nested, the choice is more
subjective, based on Ra², s, and an understanding
of the theory behind the model.
81
12.10 Model Building: Stepwise Regression
  • It is often unclear which independent variables
    have a significant impact on y.
  • Screening variables in an attempt to identify the
    most important ones is known as stepwise
    regression; a simplified sketch of the idea follows.
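This toy forward-selection routine in Python is illustration only; real stepwise procedures in statistical packages decide entry and removal with t-tests rather than raw SSE:

    import numpy as np

    def forward_stepwise(X, y, max_terms):
        """Greedily add the x-variable that most reduces SSE at each step."""
        n, p = X.shape
        selected, remaining = [], list(range(p))
        while remaining and len(selected) < max_terms:
            def sse_with(j):
                A = np.column_stack([np.ones(n)] +
                                    [X[:, i] for i in selected + [j]])
                r = y - A @ np.linalg.lstsq(A, y, rcond=None)[0]
                return r @ r
            best = min(remaining, key=sse_with)
            selected.append(best)
            remaining.remove(best)
        return selected  # column indices in order of entry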

82
12.10 Model Building: Stepwise Regression
83
12.10 Model Building: Stepwise Regression
  • Stepwise regression must be used with caution:
  • Many t-tests are conducted, leading to high
    probabilities of Type I or Type II errors.
  • Usually, no interaction or higher-order terms are
    considered, and reality may not be that simple.

84
12.11 Residual Analysis: Checking the Regression
Assumptions
  • Regression analysis is based on the four
    assumptions about the random error ε considered
    earlier.
  • The mean is equal to 0.
  • The variance is equal to σ².
  • The probability distribution is a normal
    distribution.
  • Random errors are independent of one another.

85
12.11 Residual Analysis: Checking the Regression
Assumptions
  • If these assumptions are not valid, the results
    of the regression estimation are called into
    question.
  • Checking the validity of the assumptions involves
    analyzing the residuals of the regression.

86
12.11 Residual Analysis: Checking the Regression
Assumptions
  • A regression residual is defined as the
    difference between an observed y-value and its
    corresponding predicted value: ε̂ = y − ŷ.

87
12.11 Residual Analysis: Checking the Regression
Assumptions
  • Properties of the Regression Residuals
  • The mean of the residuals is equal to 0.
  • The standard deviation of the residuals is equal
    to the standard deviation s of the fitted
    regression model.
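Both properties are easy to verify numerically; this illustrative helper assumes a design matrix X whose first column is the intercept:

    import numpy as np

    def residual_summary(X, y):
        """Residuals e = y - y_hat and the model standard deviation s."""
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta_hat
        n, p = X.shape                        # p = k + 1 parameters
        s = np.sqrt(resid @ resid / (n - p))  # s**2 = SSE / (n - (k + 1))
        return resid, s                       # resid.mean() is essentially 0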

88
12.11 Residual Analysis: Checking the Regression
Assumptions
  • If the model is misspecified, the mean of ε will
    not equal 0.
  • Residual analysis may reveal this problem.
  • The home-size electricity usage example
    illustrates this.

89
12.11 Residual Analysis: Checking the Regression
Assumptions
  • The plot of the first-order model shows a
    curvilinear residual pattern
  • while the quadratic model shows a more random
    pattern.

90
12.11 Residual Analysis: Checking the Regression
Assumptions
  • A pattern in the residual plot may indicate a
    problem with the model.

91
12.11 Residual Analysis: Checking the Regression
Assumptions
  • A residual larger than 3s (in absolute value) is
    considered an outlier.
  • Outliers will have an undue influence on the
    estimates. Possible causes:
  • 1. Mistakenly recorded data
  • 2. An observation that is for some reason truly
    different from the others
  • 3. Random chance
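Flagging such residuals is a one-line check; a sketch that pairs with the residual helper above:

    import numpy as np

    def flag_outliers(resid, s):
        """Indices of residuals more than 3s from 0 (candidate outliers)."""
        return np.where(np.abs(resid) > 3 * s)[0]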

92
12.11 Residual Analysis: Checking the Regression
Assumptions
  • A residual larger than 3s (in absolute value) is
    considered an outlier.
  • Leaving an outlier that should be removed in the
    data set will produce misleading estimates and
    predictions (causes 1 and 2 above).
  • So will removing an outlier that actually belongs
    in the data set (cause 3 above).

93
12.11 Residual Analysis: Checking the Regression
Assumptions
  • Residual plots should be centered on 0 and within
    3s of 0.
  • Residual histograms should be relatively
    bell-shaped.
  • Residual normal probability plots should display
    straight lines.

94
12.11 Residual Analysis: Checking the Regression
Assumptions
Regression analysis is robust with respect to
(small) nonnormal errors.
  • Slight departures from normality will not
    seriously harm the validity of the estimates, but
    as the departure from normality grows, the
    validity falls.

95
12.11 Residual Analysis: Checking the Regression
Assumptions
  • If the variance of ε changes as y changes, the
    constant variance assumption is violated.

96
12.11 Residual Analysis: Checking the Regression
Assumptions
  • A first-order model is used to relate the
    salaries (y) of social workers to years of
    experience (x).

97
12.11 Residual Analysis: Checking the Regression
Assumptions
98
12.11 Residual Analysis: Checking the Regression
Assumptions
  • The model seems to provide good predictions, but
    the residual plot reveals a non-random pattern:
  • The spread of the residuals increases as the
    estimated mean salary increases, violating the
    constant variance assumption.

99
12.11 Residual Analysis: Checking the Regression
Assumptions
  • Transforming the dependent variable often
    stabilizes the residual variance.
  • Possible transformations of y:
  • Natural logarithm: ln(y)
  • Square root: √y
  • Arcsine square root: sin⁻¹(√y)
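A sketch of applying each transformation in Python (toy response values; the arcsine square root assumes y has been rescaled to a proportion in [0, 1]):

    import numpy as np

    y = np.array([12.5, 18.0, 25.3, 33.1, 41.9])   # toy response values

    y_ln = np.log(y)                 # natural logarithm
    y_sqrt = np.sqrt(y)              # square root
    p = y / 100.0                    # rescale to proportions in [0, 1]
    y_asin = np.arcsin(np.sqrt(p))   # arcsine square root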

100
12.11 Residual Analysis: Checking the Regression
Assumptions
101
12.11 Residual Analysis: Checking the Regression
Assumptions
102
12.11 Residual Analysis: Checking the Regression
Assumptions
103
12.11 Residual Analysis: Checking the Regression
Assumptions
104
12.11 Residual Analysis: Checking the Regression
Assumptions
105
12.11 Residual Analysis: Checking the Regression
Assumptions
106
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
107
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
108
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
Problem 1: Parameter Estimability
If x does not take on a sufficient number of
different values, no single unique line can be
estimated.
109
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
Problem 2: Multicollinearity
Multicollinearity exists when two or more of the
independent variables in a regression are
correlated.
If xi and xj move together in some way, finding
the impact on y of a one-unit change in either of
them holding the other constant will be difficult
or impossible.
110
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
Problem 2: Multicollinearity
Multicollinearity can be detected in various
ways. A simple check is to calculate the
correlation coefficients (rij) for each pair of
independent variables in the model. Any
significant rij may indicate a multicollinearity
problem.
  • If severe multicollinearity exists, the result
    may be:
  • Significant F-values but insignificant t-values
  • Signs on βs opposite to those expected
  • Errors in β estimates, standard errors, etc.
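The pairwise check is a single NumPy call; a sketch on a made-up data matrix whose columns stand in for x1, x2, x3:

    import numpy as np

    # Made-up data matrix; each column is an independent variable
    X = np.array([[14.1, 0.86, 0.985],
                  [16.0, 1.06, 1.094],
                  [ 8.0, 0.67, 1.000],
                  [ 4.1, 0.40, 0.946],
                  [15.0, 1.04, 0.888]])

    r = np.corrcoef(X, rowvar=False)   # 3x3 matrix of pairwise r_ij values
    print(np.round(r, 4))              # large |r_ij| flags multicollinearity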

111
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
  • The Federal Trade Commission (FTC) ranks
    cigarettes according to their tar (x1), nicotine
    (x2), weight in grams (x3), and carbon monoxide
    (y) content.
  • 25 data points (see Table 12.11) are used to
    estimate the model
  • E(y) = β0 + β1x1 + β2x2 + β3x3

112
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
113
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
  • F = 78.98, p-value < .0001
  • t for β1 = 3.97, p-value = .0007
  • t for β2 = −0.67, p-value = .5072
  • t for β3 = −0.03, p-value = .9735

114
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
The negative signs on two variables and the
insignificant t-values are suggestive of
multicollinearity.
  • F = 78.98, p-value < .0001
  • t for β1 = 3.97, p-value = .0007
  • t for β2 = −0.67, p-value = .5072
  • t for β3 = −0.03, p-value = .9735

115
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
  • The coefficients of correlation, rij, provide
    further evidence:
  • r(tar, nicotine) = .9766
  • r(tar, weight) = .4908
  • r(weight, nicotine) = .5002
  • Each rij is significantly different from 0 at the
    α = .05 level.

116
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
  • Possible Responses to Problems Created by
    Multicollinearity in Regression
  • Drop one or more correlated independent variables
    from the model.
  • If all the x's are retained:
  • Avoid making inferences about the individual β
    parameters from the t-tests.
  • Restrict inferences about E(y) and future y
    values to values of the x's that fall within the
    range of the sample data.

117
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
Problem 3: Extrapolation
The data used to estimate the model provide
information only on the range of values in the
data set. There is no reason to assume that the
dependent variable's response will be the same
over a different range of values.
118
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
Problem 3: Extrapolation
119
12.12 Some Pitfalls: Estimability,
Multicollinearity, and Extrapolation
Problem 4: Correlated Errors
If the error terms are not independent (a
frequent problem in time series), the model tests
and prediction intervals are invalid. Special
techniques are used to deal with time series
models.