Multicollinearity

1
Chapter 7
  • Multicollinearity

2
What is in this Chapter?
  • How do we detect this problem?
  • What are the consequences?
  • What are the solutions?
  • An example by Gauss

5
What is in this Chapter?
  • In Chapter 4 we stated that one of the
    assumptions in the basic regression model is that
    the explanatory variables are not exactly
    linearly related. If they are, then not all
    parameters are estimable
  • What we are concerned with in this chapter is the
    case where the individual parameters are not
    estimable with sufficient precision (because of
    high standard errors)
  • This often occurs if the explanatory variables
    are highly intercorrelated (although this
    condition is not necessary).

7
What is in this Chapter?
  • This chapter is very important, because
    multicollinearity is one of the most
    misunderstood problems in multiple regression
  • Several measures of multicollinearity have been
    suggested in the literature (variance inflation
    factors (VIF), condition numbers (CN), etc.)
  • This chapter argues that all these are useless
    and misleading
  • They all depend on the correlation structure of
    the explanatory variables only.

8
What is in this Chapter?
  • It is argued here that this is only one of
    several factors determining high standard errors
  • High intercorrelations among the explanatory
    variables are neither necessary nor sufficient to
    cause the multicollinearity problem
  • The best indicators of the problem are the
    t-ratios of the individual coefficients.

9
What is in this Chapter?
  • This chapter also discusses the solutions offered
    for the multicollinearity problem
  • principal component regression
  • dropping of variables
  • However, they are ad hoc and do not help
  • The only solutions are to get more data or to
    seek prior information

10
7.1 Introduction
  • Very often the data we use in multiple regression
    analysis cannot give decisive (significant)
    answers to the questions we pose.
  • This is because the standard errors are very high
    or the t-ratios are very low.
  • This sort of situation occurs when the
    explanatory variables display little variation
    and/or high intercorrelations.

11
7.1 Introduction
  • The situation where the explanatory variables are
    highly intercorrelated is referred to as
    multicollinearity
  • When the explanatory variables are highly
    intercorrelated, it becomes difficult to
    disentangle the separate effects of each of the
    explanatory variables on the explained variable

12
7.2 Some Illustrative Examples
  • Thus only (β1 + 2β2) would be estimable.
  • We cannot get estimates of β1 and β2 separately.
  • In this case we say that there is perfect
    multicollinearity, because x1 and x2 are
    perfectly correlated (r = 1).
  • In actual practice we encounter cases where r2 is
    not exactly 1 but is close to 1.

13
7.2 Some Illustrative Examples
  • As an illustration, consider the case where the
    sums of squares and cross products of x1 and x2
    take given values [not transcribed], so that the
    normal equations can be written down and solved.
  • The solution is [not transcribed].
  • Suppose that we drop an observation and the new
    values are [not transcribed].

14
7.2 Some Illustrative Examples
  • Now when we solve the new normal equations, we
    get quite different estimates [not transcribed].

15
7.2 Some Illustrative Examples
  • Thus very small changes in the variances and
    covariances produce drastic changes in the
    estimates of regression parameters.
  • It is easy to see that the squared correlation
    coefficient between the two explanatory variables
    is given by r12² = S12²/(S11 S22), which in this
    case is very high.

16
7.2 Some Illustrative Examples
  • In practice, addition or deletion of observations
    would produce changes in the variances and
    covariances
  • Thus one of the consequences of high correlation
    between x1 and x2 is that the parameter estimates
    would be very sensitive to the addition or
    deletion of observations
  • This aspect of multicollinearity can be checked
    in practice by deleting or adding some
    observations and examining the sensitivity of the
    estimates to such perturbations (a small numerical
    sketch of this check follows below).
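
A minimal numerical sketch of this sensitivity check, in Python. The cross-product values below are chosen purely for illustration of two almost perfectly correlated regressors; treat them as assumptions, not as data from the chapter.

    import numpy as np

    # Illustrative sums of squares and cross products for two regressors
    # that are almost perfectly correlated (assumed values).
    S_full = np.array([[200.0, 150.0],
                       [150.0, 113.0]])      # X'X with all observations
    Sxy_full = np.array([350.0, 263.0])      # X'y with all observations

    S_drop = np.array([[199.0, 149.0],
                       [149.0, 112.0]])      # X'X after deleting one observation
    Sxy_drop = np.array([347.5, 261.5])      # X'y after deleting one observation

    b_full = np.linalg.solve(S_full, Sxy_full)
    b_drop = np.linalg.solve(S_drop, Sxy_drop)

    r12_sq = S_full[0, 1] ** 2 / (S_full[0, 0] * S_full[1, 1])
    print("squared correlation of x1 and x2:", round(r12_sq, 4))   # about 0.996
    print("estimates, all observations:    ", b_full)              # [ 1.   1. ]
    print("estimates, one observation less:", b_drop)              # [-0.5  3. ]

Tiny changes in the cross products move the first estimate from 1 to -0.5, which is exactly the kind of instability described above.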

21
7.2 Some Illustrative Examples
22
7.2 Some Illustrative Examples
Thus the variance of the estimate of β1, which is
s2/[S11(1 - r12²)], will be high if
  1. s2 is high.
  2. S11 is low.
  3. r12² is high.
23
7.2 Some Illustrative Examples
  • Even if r12² is high, if s2 is low and S11 is
    high, we will not have the problem of high
    standard errors (a small numerical sketch follows
    below).
  • On the other hand, even if r12² is low, the
    standard errors can be high if s2 is high and
    S11 is low (i.e., there is not sufficient
    variation in x1).
  • What this suggests is that a high value of r12²
    does not by itself tell us whether we have a
    multicollinearity problem or not.
  • When we have more than two explanatory variables,
    the simple correlations among them become all the
    more meaningless.
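
A quick numeric sketch of the variance expression used above, s2/[S11(1 - r12²)]. The numbers are invented solely to show that a high r12² can be offset by a small s2 or a large S11, and vice versa.

    def var_beta1(s2, S11, r12_sq):
        """Variance of the first slope estimate in a two-regressor model:
        s2 / (S11 * (1 - r12^2))."""
        return s2 / (S11 * (1.0 - r12_sq))

    # Illustrative (assumed) numbers:
    print(var_beta1(s2=0.01, S11=1000.0, r12_sq=0.95))  # high r12^2, tiny variance anyway
    print(var_beta1(s2=25.0, S11=2.0,    r12_sq=0.10))  # low r12^2, yet a large variance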

24
7.2 Some Illustrative Examples
  • As an illustration, consider the following
    example with 20 observations on x1, x2, and x3
  • x1 = (1, 1, 1, 1, 1, 0, 0, 0, 0, 0, and 10
    zeros)
  • x2 = (0, 0, 0, 0, 0, 1, 1, 1, 1, 1, and 10
    zeros)
  • x3 = (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, and 10
    zeros)

25
7.2 Some Illustrative Examples
  • Obviously, x3 = x1 + x2, and we have perfect
    multicollinearity.
  • But we can see that r12² = 1/9 and r13² = r23² =
    1/3, so the simple correlations are not high (a
    short check appears below).
  • In the case of more than two explanatory
    variables, what we have to consider are multiple
    correlations of each of the explanatory variables
    with the other explanatory variables.
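
The 20-observation example above can be verified directly; the following sketch computes the pairwise correlations and shows that the centered data matrix is rank deficient even though none of those correlations is large.

    import numpy as np

    x1 = np.array([1]*5 + [0]*15, dtype=float)
    x2 = np.array([0]*5 + [1]*5 + [0]*10, dtype=float)
    x3 = x1 + x2                      # x3 = x1 + x2 exactly: perfect multicollinearity

    X = np.column_stack([x1, x2, x3])
    print(np.corrcoef(X, rowvar=False).round(3))
    # pairwise correlations: r12 = -1/3, r13 = r23 = 0.577 -- none of them "high"

    # Yet the centered data matrix has rank 2, so the three coefficients
    # are not separately estimable.
    Xc = X - X.mean(axis=0)
    print("rank of centered X:", np.linalg.matrix_rank(Xc))   # prints 2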

26
7.2 Some Illustrative Examples
  • Note that the standard error formulas
    corresponding to equations (7.1) and (7.2)
    generalize so that the variance of the estimate
    of βi is s2/[Sii(1 - Ri²)], where s2 and Sii are
    defined as before in the case of two explanatory
    variables and Ri² represents the squared multiple
    correlation coefficient between xi and the other
    explanatory variables.

27
7.3 Some Measures of Multicollinearity
  • It is important to be familiar with two measures
    that are often suggested in the discussion of
    multicollinearity: the variance inflation factor
    (VIF) and the condition number (CN).
  • The VIF is defined as VIFi = 1/(1 - Ri²), where
    Ri² is the squared multiple correlation
    coefficient between xi and the other explanatory
    variables (a computational sketch follows below).

28
7.3 Some Measures of Multicollinearity
  • A measure that considers the correlations of the
    explanatory variables with the explained variable
    is Theil's measure, which is defined as
    m = R2 - Σ (R2 - R2(i)), summing over i, where
  • R2 = squared multiple correlation from a
    regression of y on x1, x2, ..., xk
  • R2(i) = squared multiple correlation from a
    regression of y on x1, x2, ..., xk with xi
    omitted

29
7.3 Some Measures of Multicollinearity
  • The quantity (R2 - R2(i)) is termed the
    incremental contribution to the squared
    multiple correlation by Theil.
  • If x1, x2, ..., xk are mutually uncorrelated, then
    m will be 0, because the incremental contributions
    all add up to R2.
  • In other cases m can be negative as well as
    highly positive (a computational sketch follows
    below).
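
A sketch of Theil's measure as defined above: run the full regression, then k regressions each omitting one explanatory variable, and subtract the sum of the incremental contributions from R2. The data at the end are made up for illustration.

    import numpy as np

    def r_squared(X, y):
        """R^2 from an OLS regression of y on X (intercept added here)."""
        Z = np.column_stack([np.ones(len(y)), X])
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ coef
        return 1.0 - resid.var() / y.var()

    def theil_m(X, y):
        """Theil's measure: m = R^2 - sum_i (R^2 - R^2 with x_i omitted)."""
        X = np.asarray(X, dtype=float)
        full = r_squared(X, y)
        increments = [full - r_squared(np.delete(X, i, axis=1), y)
                      for i in range(X.shape[1])]
        return full - sum(increments)

    # Illustrative data with nearly uncorrelated regressors:
    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 3))
    y = X @ np.array([1.0, 0.5, -0.2]) + rng.normal(size=50)
    print(theil_m(X, y))   # close to 0 when the regressors are nearly uncorrelated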

30
7.4 Problems with Measuring Multicollinearity
  • Let us define
  • C = real consumption per capita
  • Y = real per capita current income
  • Yp = real per capita permanent income
  • YT = real per capita transitory income
  • Y = Yp + YT, and Yp and YT are uncorrelated

31
7.4 Problems with Measuring Multicollinearity
  • Suppose that we formulate the consumption
    function as
  • C = αY + βYp + u                 (7.5)
  • C = αYT + (α + β)Yp + u          (7.6)
  • C = (α + β)Y - βYT + u           (7.7)
  • All these equations are equivalent. However, the
    correlations between the explanatory variables
    will be different depending on which of the three
    equations is considered.
  • In equation (7.5), since Y and Yp are often
    highly correlated, we would say that there is
    high multicollinearity.

32
7.4 Problems with Measuring Multicollinearity
  • In equation (7.6), since YT and Yp are
    uncorrelated, we would say that there is no
    multicollinearity.
  • However, the two equations are essentially the
    same.
  • What we should be talking about is the precision
    with which α and β or (α + β) are estimable (a
    simulation sketch of this point follows below).
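
The dependence of "measured" multicollinearity on an arbitrary parameterization can be checked with a small simulation. Everything below is invented (sample size, coefficient values, variances); the point is only that the (Y, Yp) and (YT, Yp) forms have very different regressor correlations yet deliver identical estimates of α and of (α + β).

    import numpy as np

    rng = np.random.default_rng(2)
    n = 200
    Yp = 100 + 10 * rng.normal(size=n)     # "permanent income" (made up)
    YT = 2 * rng.normal(size=n)            # "transitory income", small variance
    Y = Yp + YT                            # current income
    alpha, beta = 0.3, 0.6                 # assumed true coefficients
    C = alpha * Y + beta * Yp + rng.normal(size=n)

    def ols(y, *cols):
        Z = np.column_stack([np.ones(len(y))] + list(cols))
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        return coef

    b_75 = ols(C, Y, Yp)     # parameterization (7.5): regressors Y and Yp
    b_76 = ols(C, YT, Yp)    # parameterization (7.6): regressors YT and Yp

    print("corr(Y, Yp) :", np.corrcoef(Y, Yp)[0, 1])    # close to 1
    print("corr(YT, Yp):", np.corrcoef(YT, Yp)[0, 1])   # close to 0
    print("alpha-hat from (7.5) and (7.6):", b_75[1], b_76[1])       # identical
    print("(alpha+beta)-hat:", b_75[1] + b_75[2], "=", b_76[2])      # identical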

33
7.4 Problems with Measuring Multicollinearity
Consider, for instance, the following data
34
7.4 Problems with Measuring Multicollinearity
  • For these data the estimation of equation (7.5)
    gives the estimates shown on the slide (figures
    in parentheses are standard errors) [not
    transcribed].
  • One reason for the imprecision in the estimates
    is that Y and Yp are highly correlated (the
    correlation coefficient is 0.95).

35
7.4 Problems with Measuring Multicollinearity
  • For equation (7.6) the correlation between the
    explanatory variables is zero and for equation
    (7.7) it is 0.32.
  • The least squares estimates of α and β are no
    more precise in equation (7.6) or (7.7).
  • Let us consider the estimation of equation (7.6);
    the resulting estimates are discussed next.

36
7.4 Problems with Measuring Multicollinearity
  • The estimate of (α + β) is thus 0.89 and the
    standard error is 0.11.
  • Thus (α + β) is indeed more precisely estimated
    than either α or β.
  • As for α, it is not precisely estimated even
    though the explanatory variables in this equation
    are uncorrelated.
  • The reason is that the variance of YT is very low
    [see formula (7.1)].

37
7.4 Problems with Measuring Multicollinearity
  • In Table 7.1 we present data on C, Y, and L for
    the period from the first quarter of 1952 to the
    second quarter of 1961.
  • C is consumption expenditures, Y is disposable
    income, and L is liquid assets at the end of the
    previous quarter.
  • All figures are in billions of 1954 dollars.
  • Using the 38 observations we get the following
    regression equations.

38
7.4 Problems with Measuring Multicollinearity
39
7.4 Problems with Measuring Multicollinearity
  • Equation (7.10) shows that L and Y are very
    highly correlated.
  • In fact, substituting the value of L in terms of
    Y from (7.10) into equation (7.9) and simplifying,
    we get equation (7.8) correct to four decimal
    places!
  • However, looking at the t-ratios in equation
    (7.9), we might conclude that multicollinearity is
    not a problem.

40
7.4 Problems with Measuring Multicollinearity
  • Are we justified in this conclusion?
  • Let us consider the stability of the coefficients
    with deletion of some observations.
  • Using only the first 36 observations we get the
    following results

41
7.4 Problems with Measuring Multicollinearity
  • Comparing equation (7.11) with (7.8) and equation
    (7.12) with (7.9), we see that the coefficients in
    the latter pair show far greater changes than
    those in the former pair.
  • Of course, if one applies the tests for stability
    discussed in Section 4.11, one might conclude
    that the differences are not statistically
    significant at the 5% level.
  • Note that the test for stability used here is the
    predictive test for stability.

42
7.4 Problems with Measuring Multicollinearity
  • Finally, we might consider predicting C for the
    first two quarters of 1961 using equations (7.11)
    and (7.12).
  • The predictions are [not transcribed].

43
7.4 Problems with Measuring Multicollinearity
  • Thus the predictions from the equation including
    L are further off from the true values than the
    predictions from the equation excluding L.
  • Thus, if prediction were the sole criterion, one
    might as well drop the variable L.

44
7.4 Problems with Measuring Multicollinearity
  • The example above illustrates four different ways
    of looking at the multicollinearity problem
  • 1. Correlation between the explanatory variables
    L and Y, which is high.
  • 2. Standard errors or t-ratios for the estimated
    coefficients
  • In this example the t-ratios are significant,
    suggesting that multicollinearity might not be
    serious.

45
7.4 Problems with Measuring Multicollinearity
  • 3. Stability of the estimated coefficients when
    some observations are deleted.
  • 4. Examining the predictions from the model
  • If multicollinearity is a serious problem, the
    predictions from the model would be worse than
    those from a model that includes only a subset of
    the set of explanatory variables.

46
7.4 Problems with Measuring Multicollinearity
  • The last criterion should be applied if
    prediction is the object of the analysis (a
    mechanical sketch of such a check follows below).
    Otherwise, it would be advisable to consider the
    second and third criteria.
  • The first criterion is not useful, as we have so
    frequently emphasized.
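
The prediction criterion can be applied mechanically: fit the full model and the model without the suspect variable on all but the last few observations, then compare the hold-out prediction errors. The sketch below uses invented data standing in for the consumption example (the series and numbers are assumptions, not Table 7.1).

    import numpy as np

    def ols_fit(Z, y):
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        return coef

    rng = np.random.default_rng(3)
    n = 40
    income = np.linspace(220, 350, n) + rng.normal(scale=2, size=n)
    liquid = 0.7 * income + rng.normal(scale=1, size=n)   # highly collinear with income
    y = 10 + 0.6 * income + rng.normal(scale=2, size=n)   # "consumption" (made up)

    train, test = slice(0, n - 2), slice(n - 2, n)        # hold out the last two quarters
    Z_full = np.column_stack([np.ones(n), income, liquid])
    Z_red = np.column_stack([np.ones(n), income])

    for name, Z in [("with L", Z_full), ("without L", Z_red)]:
        b = ols_fit(Z[train], y[train])
        print(name, "hold-out errors:", (y[test] - Z[test] @ b).round(3))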

47
7.6 Principal Component Regression
  • Another solution that is often suggested for the
    multicollinearity problem is the principal
    component regression.
  • Suppose that we have k explanatory variables.
  • Then we can consider linear functions of these
    variables, such as z1 = a1x1 + a2x2 + ... + akxk.

48
7.6 Principal Component Regression
  • Suppose we choose the a's so that the variance of
    z1 is maximized subject to the condition that
    a1² + a2² + ... + ak² = 1.
  • This is called the normalization condition.
  • z1 is then said to be the first principal
    component.
  • It is the linear function of the x's that has the
    highest variance (subject to the normalization
    condition).

49
7.6 Principal Component Regression
  • The process of maximizing the variance of the
    linear function z, subject to the condition that
    the sum of squares of the coefficients of the x's
    is equal to 1, produces k solutions.
  • Corresponding to these we construct k linear
    functions z1, z2, ..., zk. These are called the
    principal components of the x's.

50
7.6 Principal Component Regression
  • They can be ordered so that
    var(z1) > var(z2) > ... > var(zk).
  • z1, the one with the highest variance, is called
    the first principal component.
  • z2, with the next highest variance, is called the
    second principal component, and so on.
  • Unlike the x's, which are correlated, the z's are
    orthogonal, i.e., uncorrelated (a computational
    sketch follows below).
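
A short sketch of how such components can be computed: take the eigenvectors of the correlation matrix of the normalized x's; the components are the corresponding linear functions, their variances are the eigenvalues in descending order, and they are mutually uncorrelated. The data here are invented (not Table 7.3).

    import numpy as np

    rng = np.random.default_rng(4)
    x1 = rng.normal(size=100)
    x2 = rng.normal(size=100)
    x3 = x1 + x2 + 0.05 * rng.normal(size=100)   # nearly a linear function of x1, x2

    X = np.column_stack([x1, x2, x3])
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)    # normalized X's: mean 0, variance 1

    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Xs, rowvar=False))
    order = np.argsort(eigvals)[::-1]            # sort so that var(z1) > var(z2) > ...
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    Z = Xs @ eigvecs                             # the principal components z1, z2, z3
    print("variances of the z's:", Z.var(axis=0).round(3))          # the eigenvalues
    print("correlations among the z's:")
    print(np.corrcoef(Z, rowvar=False).round(3))                    # identity matrix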

51
7.6 Principal Component Regression
  • The data are presented in Table 7.3.
  • First let us estimate an import demand function.
  • The regression of y on x1, x2, x3 gives the
    following results

52
7.6 Principal Component Regression
53
7.6 Principal Component Regression
  • The R2 is very high and the F-ratio is highly
    significant but the individual t-ratios are all
    insignificant.
  • This is evidence of the multicollinearity
    problem.
  • Chatterjee and Price argue that before any
    further analysis is made, we should look at the
    residuals from this equation.

54
7.6 Principal Component Regression
  • They find (we are omitting the residual plot
    here) a distinctive pattern: the residuals
    decline until 1960 and then rise.
  • Chatterjee and Price argue that the difficulty
    with the model is that the European Common Market
    began operations in 1960, causing a change in
    import-export relationships.
  • Hence they drop the years after 1959 and consider
    only the 11 years 1949-1959. The regression
    results are given below.

55
7.6 Principal Component Regression
56
7.6 Principal Component Regression
  • The residual plot (not shown here) is now
    satisfactory (there are no systematic patterns),
    so we can proceed.
  • Even though the R2 is very high, the coefficient
    of x1 is not significant.
  • There is thus a multicollinearity problem.

57
7.6 Principal Component Regression
  • To see what should be done about it, we first
    look at the simple correlations among the
    explanatory variables.
  • These are [values not transcribed].
  • We suspect that the high correlation between x1
    and x3 could be the source of the trouble.

58
7.6 Principal Component Regression
  • Does principal component analysis help us? First,
    the principal components (obtained from a
    principal components program) are [coefficients
    not transcribed].
  • X1, X2, X3 are the normalized values of x1,
    x2, x3.

59
7.6 Principal Component Regression
  • That is, Xi = (xi - mi)/si, where m1, m2, m3 are
    the means and s1, s2, s3 are the standard
    deviations of x1, x2, x3, respectively.
  • Hence var(X1) = var(X2) = var(X3) = 1.
  • The variances of the principal components are
  • var(z1) = 1.999, var(z2) = 0.998,
    var(z3) = 0.003

60
7.6 Principal Component Regression
  • Note that var(z1) + var(z2) + var(z3) = 3 =
    var(X1) + var(X2) + var(X3).
  • The fact that var(z3) is close to 0 identifies
    that linear function as the source of the
    multicollinearity.
  • In this example there is only one such linear
    function. In some examples there could be more.
  • Since E(X1) = E(X2) = E(X3) = 0 because of the
    normalization, the z's have mean zero.
  • Thus z3 has mean zero and its variance is also
    close to zero, so we can say that z3 is
    approximately zero (a short sketch of this
    diagnostic follows below).
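
A sketch of this diagnostic: the eigenvector attached to the near-zero eigenvalue of the correlation matrix gives the coefficients of the (almost) collinear combination of the normalized X's. The data are again invented, not Table 7.3.

    import numpy as np

    rng = np.random.default_rng(5)
    x1, x2 = rng.normal(size=100), rng.normal(size=100)
    x3 = x1 + x2 + 0.05 * rng.normal(size=100)   # x3 is nearly x1 + x2 (assumed)

    X = np.column_stack([x1, x2, x3])
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)    # normalized X's

    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Xs, rowvar=False))
    print("component variances:", np.sort(eigvals)[::-1].round(3))  # they sum to 3

    a = eigvecs[:, np.argmin(eigvals)]           # coefficients of the low-variance z
    print("coefficients of the near-zero component:", a.round(3))
    # Setting a[0]*X1 + a[1]*X2 + a[2]*X3 close to 0 (and ignoring any tiny
    # coefficients) exposes the near-linear relation among the regressors.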

61
7.6 Principal Component Regression
  • Looking at the coefficients of the X's, we can
    say that (ignoring the coefficients that are very
    small) z3 essentially involves only X1 and X3
    [the exact expression is not transcribed].
62
7.6 Principal Component Regression
  • In terms of the original (nonnormalized)
    variables, the regression of x3 on x1 is
    [coefficients not transcribed; the figure in
    parentheses on the slide is the standard error].
63
7.6 Principal Component Regression
  • Then, substituting for x3 in terms of x1, we get
    an equation in x1 and x2 only.
  • This gives the linear functions of the βs that
    are estimable.
  • They are (β0 + 6.258β3), (β1 + 0.686β3), and β2,
    where β0 denotes the intercept.
  • The regression of y on x1 and x2 gave the
    following results

64
7.6 Principal Component Regression
65
7.6 Principal Component Regression
  • Of course, we can instead estimate a regression
    of x1 on x3.
  • The regression coefficient is 1.451.
  • We now substitute for x1 and estimate a
    regression of y on x2 and x3.
  • The results we get are slightly better (we get a
    higher R2).

66
7.6 Principal Component Regression
  • The results are [not transcribed].
  • The coefficient of x3 now is [not transcribed];
    it estimates (β3 + 1.451β1).
67
7.6 Principal Component Regression
  • Suppose that we consider regressing y on the
    components z1 and z2 (z3 is omitted because it is
    almost zero).
  • We saw earlier what z1 and z2 are as linear
    functions of the normalized X's [expressions not
    transcribed].
  • We have to transform these to the original
    variables.

68
7.6 Principal Component Regression
  • We get [expressions not transcribed].

69
7.6 Principal Component Regression
  • Thus, using z2 as a regressor is equivalent to
    using x2, and using z1 is equivalent to using a
    particular linear combination of the x's [not
    transcribed].
  • Thus the principal component regression amounts
    to regressing y on x2 and that linear combination.
  • In our example [the estimated combination is not
    transcribed].

70
7.6 Principal Component Regression
  • The results are [not transcribed].
  • This is the regression equation we would have
    estimated if we assumed a particular linear
    restriction on the coefficients [not transcribed].
  • Thus the principal component regression amounts,
    in this example, to the use of that prior
    information (a sketch of the mechanics follows
    below).
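
A hedged sketch of the mechanics described in this section: regress y on the retained high-variance components only, then map the coefficients back to the normalized X's. Dropping the near-zero component is what implicitly imposes the prior restriction. All data and dimensions below are placeholders.

    import numpy as np

    rng = np.random.default_rng(6)
    n = 60
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    x3 = x1 + x2 + 0.05 * rng.normal(size=n)          # near-collinear regressor (made up)
    y = 1.0 + 2.0 * x1 - 1.0 * x2 + 0.5 * x3 + rng.normal(size=n)

    X = np.column_stack([x1, x2, x3])
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)

    eigvals, V = np.linalg.eigh(np.cov(Xs, rowvar=False, bias=True))
    order = np.argsort(eigvals)[::-1]
    eigvals, V = eigvals[order], V[:, order]

    keep = eigvals > 0.01                             # drop components with ~zero variance
    Z = Xs @ V[:, keep]                               # retained principal components

    G = np.column_stack([np.ones(n), Z])              # OLS of y on the retained z's
    g, *_ = np.linalg.lstsq(G, y, rcond=None)

    # Map the component coefficients back to coefficients on the normalized X's;
    # this is the principal component regression estimate.
    beta_std = V[:, keep] @ g[1:]
    print("implied coefficients on the normalized X's:", beta_std.round(3))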

71
7.7 Dropping Variables
  • Consider the model
    y = β1x1 + β2x2 + u              (7.15)
  • If our main interest is in β1, we drop x2 and
    estimate equation (7.16).

72
7.7 Dropping Variables
  • Then we drop x2 and estimate the equation
    y = β1x1 + v                     (7.16)
  • Let the estimator of β1 from the complete model
    (7.15) be denoted by β1(OLS) and the estimator of
    β1 from the omitted-variable model (7.16) be
    denoted by β1(OV).
  • β1(OLS) is the ordinary least squares estimator
    and β1(OV) is the OV (omitted variable) estimator.

73
7.7 Dropping Variables
  • As an estimator of β1, we use the conditional
    omitted variable (COV) estimator, which chooses
    between β1(OLS) and β1(OV) [the formal definition
    is not transcribed; a hedged sketch follows
    below].
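
The transcript omits the formal definition, but the COV estimator is commonly described as a pretest rule: keep the full-model OLS estimate of β1 when the t-ratio of the coefficient of x2 is large in absolute value, and switch to the omitted-variable estimate otherwise. The sketch below assumes that form with a threshold of |t2| <= 1; both the rule and the threshold are assumptions here, not something stated in the transcript.

    import numpy as np

    def ols_with_t(Z, y):
        """OLS coefficients and t-ratios for a design matrix Z (intercept
        already included by the caller)."""
        n, k = Z.shape
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ coef
        s2 = resid @ resid / (n - k)
        se = np.sqrt(s2 * np.diag(np.linalg.inv(Z.T @ Z)))
        return coef, coef / se

    def cov_estimator(y, x1, x2, threshold=1.0):
        """Assumed form of the conditional omitted variable (COV) estimator of
        the coefficient on x1: use the omitted-variable estimate when the
        t-ratio of x2 in the full model is small, else the full-model OLS."""
        n = len(y)
        full = np.column_stack([np.ones(n), x1, x2])
        coef_full, t_full = ols_with_t(full, y)
        if abs(t_full[2]) <= threshold:
            reduced = np.column_stack([np.ones(n), x1])
            coef_red, _ = ols_with_t(reduced, y)
            return coef_red[1]          # OV estimate of beta1
        return coef_full[1]             # OLS estimate of beta1

    # Illustrative use with made-up, collinear data:
    rng = np.random.default_rng(7)
    x1 = rng.normal(size=80)
    x2 = x1 + 0.1 * rng.normal(size=80)
    y = 1.0 + 2.0 * x1 + rng.normal(size=80)
    print(cov_estimator(y, x1, x2))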

74
7.7 Dropping Variables
  • Also, instead of using β1(OLS) or β1(OV) alone,
    we can consider a linear combination of both, with
    weights that depend on t2 [the formula is not
    transcribed].
  • This is called the weighted (WTD) estimator, and
    it has minimum mean-square error for a particular
    value of the weight [condition not transcribed].
  • Again, t2 is not known and we have to use its
    estimated value.

75
7.8 Miscellaneous Other Solutions
  • Using Ratios or First Differences
  • Getting More Data