Heteroskedasticity

Author: Dr. Li, Sung Ko. Last modified by: ITSC. Created: 1/28/2002 2:05:31 AM. Presentation format: On-screen Show.


1
Heteroskedasticity
Objectives
• What is heteroskedasticity?
• What are the consequences?
• How is heteroskedasticity identified?
• How is heteroskedasticity corrected?

2
Main empirical model for Unit 10: $foodexp_i = \beta_0 + \beta_1 income_i + \varepsilon_i$.
foodexp: family food expenditure
income: family income
Least squares estimates, US data (UE_Tab0301)
Is this the best estimated equation?
3
1. The Nature of Heteroskedasticity
• In a regression about firms, for the same mistake,

4
• Heteroskedasticity is a problem that occurs when
the error term does not have a constant variance.

CLRM: Each error term comes from the same probability distribution.
Assumption CLRM.5 is violated!
5
Regression Model
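$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
Zero mean: $E(\varepsilon_i \mid X_{1i}, X_{2i}) = 0$
Homoskedasticity: $\mathrm{var}(\varepsilon_i \mid X_{1i}, X_{2i}) = \sigma^2$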
6
[Figure: identical distributions for observations i and j]
7
Homoskedasticity: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
$\mathrm{var}(\varepsilon_i \mid X_i) = \sigma^2$ for all i
[Figure: conditional distribution]
8
Heteroskedasticity: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
$\mathrm{var}(\varepsilon_i \mid X_i) = \sigma_i^2$ for all i
[Figure: conditional distribution]
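The contrast between slides 7 and 8 can be illustrated with simulated errors. This is a minimal sketch, not part of the original deck: the sample size, the range of X, and the assumption that the error standard deviation equals 0.5 X in the heteroskedastic case are all chosen only for illustration, and `variance_by_half` is a hypothetical helper name.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(1.0, 10.0, size=n)

# Homoskedastic errors: var(eps_i | X_i) = sigma^2 = 4 for every i.
eps_homo = rng.normal(0.0, 2.0, size=n)

# Heteroskedastic errors (assumed form): sd grows with X_i,
# so var(eps_i | X_i) = (0.5 * X_i)^2 differs across observations.
eps_hetero = rng.normal(0.0, 0.5 * x)

def variance_by_half(x, eps):
    """Sample variance of eps for the low-X half and the high-X half."""
    low = x <= np.median(x)
    return eps[low].var(), eps[~low].var()

homo_low, homo_high = variance_by_half(x, eps_homo)
het_low, het_high = variance_by_half(x, eps_hetero)

print("homoskedastic   low/high variance:", homo_low, homo_high)
print("heteroskedastic low/high variance:", het_low, het_high)
```

Under homoskedasticity the two halves show roughly the same variance; under the assumed heteroskedastic form the high-X half is several times more variable.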
11
Pure heteroskedasticity: Different variances of the error term in a correctly specified population regression function (PRF).
Impure heteroskedasticity: Different variances of the error term caused by a specification error.
12
2. Detecting Heteroskedasticity
2.1 Graphical Method
Plot foodexp against income (for one regressor).
Example 1: Food expenditure, US data (UE_Tab0301)
13
Example 1: Food expenditure, US data (UE_Tab0301)
Plot the residuals $e$ against income.
Plot the squared residuals $e^2$ against income.
14
Example 2: Textbook data (Woody3)
15
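2.2 Park Test
Model: $Y_i = \beta_0 + \beta_1 X_{1i} + \dots + \beta_K X_{Ki} + \varepsilon_i$, $i = 1, \dots, N$ (*)
Suppose it is suspected that $\mathrm{var}(\varepsilon_i)$ depends on $Z_i$ in the form
$\mathrm{var}(\varepsilon_i) = \sigma_i^2 = \sigma^2 Z_i^{\gamma_1} e^{v_i}$, that is, $\ln \sigma_i^2 = \ln \sigma^2 + \gamma_1 \ln Z_i + v_i$.
H0: $\gamma_1 = 0$ (homoskedastic errors)
HA: $\gamma_1 \neq 0$ (heteroskedastic errors)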
16
Step 1: Estimate equation (*) with OLS and obtain the residuals $e_i$.
Step 2: Regress the natural log of the squared residuals on the natural log of a possible proportionality factor $Z_i$:
$\ln(e_i^2) = \gamma_0 + \gamma_1 \ln Z_i + v_i$,
where $v_i$ is an error term satisfying all classical assumptions.
17
Step 3: If the coefficient of $\ln Z$ is significantly different from zero, this suggests a heteroskedastic pattern in the residuals with respect to Z. Otherwise, the hypothesis of homoskedastic errors cannot be rejected.
Example 3: Park test, US data (UE_Tab0301)
$\ln(e^2) = -7.46 + 2.07 \ln(income)$
Slope coefficient: t = 2.28, p-value = 0.0284
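The three steps of the Park test can be sketched in Python with NumPy. This is an illustrative sketch only: the data are simulated (the UE_Tab0301 file is not reproduced here), the error standard deviation is assumed proportional to income, and `ols` and `park_test` are hypothetical helper names rather than functions from any library.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients, conventional standard errors and residuals.
    X must already contain a column of ones for the intercept."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se, resid

def park_test(y, x, z):
    """Return the slope on ln(Z) and its t-statistic."""
    # Step 1: estimate the original equation and keep the residuals.
    X = np.column_stack([np.ones_like(x), x])
    _, _, e = ols(X, y)
    # Step 2: regress ln(e^2) on a constant and ln(Z).
    Z = np.column_stack([np.ones_like(z), np.log(z)])
    coef, se, _ = ols(Z, np.log(e ** 2))
    # Step 3: the caller compares the t-statistic with a critical value.
    return coef[1], coef[1] / se[1]

# Simulated data (assumption): sd of the error is 5% of income.
rng = np.random.default_rng(1)
n = 400
income = rng.uniform(200.0, 1200.0, size=n)
foodexp = 40.0 + 0.13 * income + rng.normal(0.0, 0.05 * income)

slope, t_stat = park_test(foodexp, income, income)
print("slope on ln(income):", slope, " t-statistic:", t_stat)
```

With the error variance proportional to income squared, the slope on ln(income) should be near 2, and a large t-statistic leads to rejecting homoskedasticity.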
18
• Advantages of the Park test
• The test is simple.
• It provides information about the variance
structure.
• Limitations of the Park test
• The distribution of the dependent variable is
problematic.
• It assumes a specific functional form.
• It does not work when the variance depends on two
or more variables.
• The correct variable with which to order the
observations must be identified first.
• It cannot handle partitioned data.

19
2.3 White's Test
Model: $Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$, $i = 1, \dots, N$ (*)
Suppose it is suspected that there may be heteroskedasticity, but we are not sure of its functional form.
H0: The conditional variance of $\varepsilon_i$ is constant.
HA: The conditional variance of $\varepsilon_i$ is not constant.
20
Step 1: Estimate equation (*) with OLS and obtain the residuals $e_i$.
Step 2: Regress the squared residuals on all explanatory variables, all cross-product terms and the square of each explanatory variable:
$e_i^2 = \gamma_0 + \gamma_1 X_{1i} + \gamma_2 X_{2i} + \gamma_3 X_{1i}^2 + \gamma_4 X_{2i}^2 + \gamma_5 X_{1i} X_{2i} + v_i$
21
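Step 3: Test the overall significance of the equation in Step 2 (df = number of regressors in Step 2, excluding the constant).
Statistic: $N R^2 \sim \chi^2_{df}$ under H0. Critical value (cv): $\chi^2_{df,\alpha}$.
Reject the hypothesis of homoskedasticity if $N R^2 > cv$.
Example 4: White test, US data (UE_Tab0301)
$e^2 = 1924 + 7.4\, income + 0.0088\, income^2$
$R^2 = 0.3646$, $N = 40$,
$N \cdot R^2 = 14.58 > cv = \chi^2_{2,\,0.01} = 9.21$, so homoskedasticity is rejected at the 1% level.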
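The White test with a single regressor (as in Example 4, where the auxiliary regression uses income and income squared, so df = 2) can be sketched as follows. The data are simulated under an assumed variance structure, income is measured in hundreds purely for numerical convenience, and `white_test_one_regressor` is a hypothetical name. The critical value 9.21 is taken from the slide; the p-value line relies on the fact that a chi-square variable with 2 degrees of freedom has survival function exp(-x/2), so it does not carry over to other df.

```python
import numpy as np

def white_test_one_regressor(y, x):
    """Return (N * R^2, R^2) from regressing squared OLS residuals
    on a constant, x and x^2 (df = 2)."""
    # Step 1: original regression and squared residuals.
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e2 = (y - X @ beta) ** 2
    # Step 2: auxiliary regression (no cross product with one regressor).
    A = np.column_stack([np.ones_like(x), x, x ** 2])
    gamma, *_ = np.linalg.lstsq(A, e2, rcond=None)
    ss_res = np.sum((e2 - A @ gamma) ** 2)
    ss_tot = np.sum((e2 - e2.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    # Step 3: the test statistic is N * R^2.
    return len(y) * r2, r2

CV_CHI2_DF2_1PCT = 9.21          # critical value quoted on the slide

# Arithmetic of Example 4: N = 40, R^2 = 0.3646.
example4_stat = 40 * 0.3646      # about 14.58, which exceeds 9.21

# Simulated data (assumption): error sd proportional to income.
rng = np.random.default_rng(2)
income = rng.uniform(2.0, 12.0, size=400)
foodexp = 40.0 + 13.0 * income + rng.normal(0.0, 5.0 * income)

stat, r2 = white_test_one_regressor(foodexp, income)
p_value = np.exp(-stat / 2.0)    # valid for df = 2 only
reject = stat > CV_CHI2_DF2_1PCT
print("N*R^2:", stat, " p-value:", p_value, " reject H0:", reject)
```

With two regressors the auxiliary regression would have the five terms of Step 2 and df = 5, and the critical value would have to be looked up accordingly.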
22
• Advantages of the White test
• It does not assume a specific functional form.
• It is applicable when the variance depends on two or more variables.
• Limitations of the White test
• It is a large-sample test.
• It provides no information about the variance
structure.
• It loses many degrees of freedom when there are
many regressors.
• It cannot handle partitioned data.
• It also captures specification errors.

23
3. Consequences of Heteroskedasticity
If heteroskedasticity is present but OLS is used for estimation, how are the OLS estimates affected?
Unaffected: OLS estimators are still linear and unbiased because, on average, overestimates are as likely as underestimates.
24
3.1 OLS estimators are inefficient.
Some fluctuations of the error term are
attributed to the variation in independent
variables.
There are other linear and unbiased estimators
that have smaller variances than the OLS
estimator.
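A small Monte Carlo experiment can illustrate both claims: OLS stays unbiased, but another linear unbiased estimator has a smaller variance. The sketch below is an illustration under stated assumptions, not the author's example: the error standard deviation is assumed to be 0.5 X (known), the alternative estimator is least squares after dividing every term by X, and the sample size and number of replications are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 100, 2000
x = rng.uniform(1.0, 10.0, size=n)        # regressor held fixed across replications
X = np.column_stack([np.ones(n), x])
Xw = X / x[:, None]                       # every term divided by Z_i = X_i

ols_slopes = np.empty(reps)
wls_slopes = np.empty(reps)
for r in range(reps):
    # True model: Y = 1 + 2 X + eps, with sd(eps_i) = 0.5 * X_i (assumed).
    y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)
    ols_slopes[r] = np.linalg.lstsq(X, y, rcond=None)[0][1]
    # Columns of Xw are [1/X, 1]; the coefficient order is still [b0, b1].
    wls_slopes[r] = np.linalg.lstsq(Xw, y / x, rcond=None)[0][1]

print("mean of OLS slopes:", ols_slopes.mean(), " variance:", ols_slopes.var())
print("mean of WLS slopes:", wls_slopes.mean(), " variance:", wls_slopes.var())
```

Both averages should sit close to the true slope of 2, while the sampling variance of the weighted estimator is clearly smaller than that of OLS.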
25
3.2 Unreliable Hypothesis Testing
The OLS standard errors are biased ⇒ unreliable testing conclusions
26
4. Remedies
4.1 Heteroskedasticity-Corrected Standard Errors
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
Heteroskedasticity: $\mathrm{var}(\varepsilon_i) = \sigma_i^2$
OLS estimators are unbiased. The standard errors
of OLS are biased.
27
A heteroskedasticity-consistent (HC) standard error of an estimated coefficient is a standard error that has been adjusted for heteroskedasticity.
a. HC standard errors are consistent for any type of heteroskedasticity.
b. Hypothesis tests are valid with HC standard errors in large samples.
c. Typically, HC se > OLS se.
28
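Example 5: $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, $\mathrm{var}(\varepsilon_i \mid X_i) = \sigma_i^2$.
[Formulas not transcribed: incorrect variance formula, correct variance formula]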
29
HC estimator of the variance of the slope
coefficient in a simple regression model
Example 6: HC standard errors, US data (UE_Tab0301)
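The slide formulas were not captured in the transcript, so the following is a sketch of one common variant, often called HC0 or White's estimator, and may differ from the exact formula on the slide (software frequently applies small-sample corrections such as HC1). For the slope in a simple regression it is $\widehat{\mathrm{var}}(\hat\beta_1) = \sum (X_i - \bar X)^2 e_i^2 / [\sum (X_i - \bar X)^2]^2$, while the conventional formula is $s^2 / \sum (X_i - \bar X)^2$. The data are simulated and `ols_with_hc0` is a hypothetical helper name.

```python
import numpy as np

def ols_with_hc0(X, y):
    """OLS coefficients with conventional and HC0 standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    # Conventional covariance: valid only under homoskedasticity.
    cov_ols = (e @ e / (n - k)) * XtX_inv
    # HC0 sandwich: (X'X)^-1 X' diag(e_i^2) X (X'X)^-1
    meat = X.T @ (X * (e ** 2)[:, None])
    cov_hc0 = XtX_inv @ meat @ XtX_inv
    return beta, np.sqrt(np.diag(cov_ols)), np.sqrt(np.diag(cov_hc0)), e

# Simulated data (assumption): error sd proportional to X.
rng = np.random.default_rng(4)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.5 * x)
X = np.column_stack([np.ones(n), x])

beta, se_ols, se_hc0, e = ols_with_hc0(X, y)

# Scalar form of the same HC0 formula for the slope.
dx = x - x.mean()
se_slope_scalar = np.sqrt(np.sum(dx ** 2 * e ** 2)) / np.sum(dx ** 2)

print("slope:", beta[1])
print("OLS se:", se_ols[1], " HC0 se:", se_hc0[1], " scalar HC0 se:", se_slope_scalar)
```

The matrix and scalar versions agree for the slope; the coefficient estimates themselves are the OLS estimates, only the standard errors change.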
30
4.2 Weighted Least Squares
$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i$
The variance is assumed to be proportional to $Z_i^2$:
$\sigma_i^2 = c Z_i^2$
31
Step 1: Identify the variable $Z_i$ to which the heteroskedasticity is proportional.
Step 2: Divide all terms in the original model by that variable (divide by $Z_i$).
32
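Step 3: Run least squares on the transformed model, which has new variables. Note that the transformed model has an intercept only if Z is one of the explanatory variables.
For example, if $Z_i = X_{2i}$, then
$Y_i / X_{2i} = \beta_0 (1 / X_{2i}) + \beta_1 (X_{1i} / X_{2i}) + \beta_2 + u_i$, where $u_i = \varepsilon_i / X_{2i}$ has constant variance $c$.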
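Steps 1 to 3 can be sketched in Python with NumPy for the case $Z_i = X_{2i}$. The data are simulated, the error standard deviation is assumed to be 0.5 X2, and `wls_by_division` is a hypothetical helper name; this is one way to carry out the transformation, not necessarily how the examples in the deck were computed.

```python
import numpy as np

def wls_by_division(y, X, z):
    """WLS under var(eps_i) = c * z_i^2: divide every term by z_i, then run OLS.
    X must contain the column of ones; coefficients keep their original order."""
    Xt = X / z[:, None]            # the ones column becomes 1/z; the z column becomes 1
    yt = y / z
    beta, *_ = np.linalg.lstsq(Xt, yt, rcond=None)
    u = yt - Xt @ beta
    n, k = Xt.shape
    cov = (u @ u / (n - k)) * np.linalg.inv(Xt.T @ Xt)
    return beta, np.sqrt(np.diag(cov))

# Simulated data (assumption): sd of the error is 0.5 * X2.
rng = np.random.default_rng(5)
n = 2000
x1 = rng.uniform(1.0, 10.0, size=n)
x2 = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x1, x2])
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(0.0, 0.5 * x2)

beta_wls, se_wls = wls_by_division(y, X, x2)
print("WLS estimates of (b0, b1, b2):", beta_wls)
print("WLS standard errors:", se_wls)
```

Because $X_{2i}/Z_i = 1$, the coefficient $\beta_2$ plays the role of the intercept in the transformed regression, which matches the note in Step 3. The estimates are still estimates of the original $\beta_0, \beta_1, \beta_2$.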
33
Example 7: WLS, US data (UE_Tab0301)
What are the values of the estimated coefficients of the original model?
Has the problem of heteroskedasticity been solved?
34
Comparing different estimates, US data (UE_Tab0301)
                $\beta_0$   $\beta_1$
OLS estimate    40.77       0.128
OLS se          22.14       0.031
HC se           24.32       0.039
WLS estimate    21.28       0.158
WLS se          14.03       0.023
The WLS estimates have improved upon those of OLS.
35
Other possibilities
• $\mathrm{var}(\varepsilon_i) = c Z_i$
• $\mathrm{var}(\varepsilon_i) = c Z_i^{\delta}$
• $\mathrm{var}(\varepsilon_i) = c (a_1 X_{1i} + a_2 X_{2i})$

36
In large samples, HC standard errors are consistent for any type of heteroskedasticity. Confidence intervals and t-tests are valid.
37
4.3 Re-specifying the Regression Model
The heteroskedasticity may be impure.
4.3.1 Use another functional form
E.g., double-log: less variation
Example 8: US data (UE_Tab0301)
The hypothesis of constant variance can be
rejected.
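Why a double-log form can show less variation is easiest to see when the errors are multiplicative. The sketch below is an illustration under that assumption only: data are generated as $Y = 3 X^{0.8} e^{u}$ with homoskedastic $u$, so the spread of Y in levels grows with X, while $\ln Y$ has constant error variance. It does not reproduce Example 8, and `high_low_variance_ratio` is a hypothetical helper.

```python
import numpy as np

def high_low_variance_ratio(x, y):
    """Fit y on a constant and x by OLS, then return the residual variance
    in the high-x half divided by that in the low-x half."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    high = x > np.median(x)
    return e[high].var() / e[~high].var()

# Simulated data (assumption): multiplicative error with constant variance in logs.
rng = np.random.default_rng(6)
n = 2000
x = rng.uniform(1.0, 10.0, size=n)
y = 3.0 * x ** 0.8 * np.exp(rng.normal(0.0, 0.3, size=n))

ratio_levels = high_low_variance_ratio(x, y)
ratio_logs = high_low_variance_ratio(np.log(x), np.log(y))
print("variance ratio, linear model:    ", ratio_levels)
print("variance ratio, double-log model:", ratio_logs)
```

A ratio far above 1 in levels and close to 1 in logs is the pattern one would hope to see when re-specification removes the heteroskedasticity; with real data the outcome still has to be checked with the Park and White tests.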
38
Example 9: India data (Food_India55)
Empirical model: $foodexp_i = \beta_0 + \beta_1 totexp_i + \varepsilon_i$.
The hypothesis of homoskedasticity can be
rejected by the Park and White tests.
39
Which model is the best?
Double-log
HC
WLS
40
4.3.2 Other reformulations
E.g., take averages of variables related to the size of the observed units.
Example 10: Data set Concert, the concert tour of a singer in the US
42
• Remarks
• The variable Z is difficult to identify, and the functional relationship between the error variance and Z is not known. Use WLS as a last resort.
• With correct WLS, we expect the standard errors of the regression coefficients to be smaller than their OLS counterparts.
• A log transformation usually reduces the degree
of heteroskedasticity.
• The hypothesis of homoskedasticity should not be
rejected in the new model.

43
5. A Complete Example
Sources: Section 8.2.2 (pp. 255-256), Section 10.5 (pp. 369-376)
Empirical regression model:
$pcon_i = \beta_0 + \beta_1 reg_i + \beta_2 tax_i + \beta_3 uhm_i + \varepsilon_i$.
pcon_i: petroleum consumption in the ith state
reg_i: motor vehicle registrations in the ith state (thousands)
tax_i: the gasoline tax rate in the ith state (cents per gallon)
uhm_i: urban highway miles within the ith state
44
Equation 1
Equation 2
45
Graphical investigation
46
Park test
White test
Checking for other specifications Double
48
Selected Exercises
Ch. 10 Q. 1, 3, 4, 5, 8, 10, 12, 14