REGRESSION DIAGNOSTICS: Detecting problems in regression models; Treating them to obtain unbiased results
1
REGRESSION DIAGNOSTICS
Detecting problems in regression models
Treating them to obtain unbiased results
2
Assumptions of OLS Estimator
  • E(ei) = 0 (zero-mean errors, needed for unbiasedness)
  • Var(ei) is constant (homoscedasticity)
  • Cov(ei, ej) = 0 (independent error terms)
  • Cov(ei, Xi) = 0 (error terms unrelated to the Xs)
  • ei ~ iid(0, σ²)
  • Gauss-Markov Theorem: if these conditions hold,
    OLS is the best linear unbiased estimator (BLUE).
  • Additional assumption: the ei are normally
    distributed.

3
3 illnesses in Regression
  1. Multicollinearity: a strong relationship among
    the explanatory variables.
  2. Heteroscedasticity: changing error variance.
  3. Autocorrelated error terms: a symptom of
    specification error.

4
Multicollinearity (strong relationship among
explanatory variables themselves)
  • Variances of regression coefficients are
    inflated.
  • Regression coefficients may be different from
    their true values, even signs.
  • Adding or removing variables produces large
    changes in coefficients.
  • Removing a data point may cause large changes in
    coefficient estimates or signs. (inconsistency)
  • In some cases the F ratio may be significant and
    R2 very high even though all the individual t
    ratios are insignificant (each suggesting no
    significant relationship).
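Multicollinearity of this kind can be quantified with variance inflation factors (VIFs). The slides do not prescribe a tool, so the following is a minimal Python/numpy sketch on simulated data; the variable names, seed, and the common rule-of-thumb cutoff (VIF > 10) are illustrative assumptions.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (no intercept column).
    VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing column j
    on the remaining columns plus an intercept."""
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        Z = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

# Simulated illustration: x2 is nearly a copy of x1, so both get large VIFs.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.05, size=200)   # almost collinear with x1
x3 = rng.normal(size=200)                    # independent regressor
X = np.column_stack([x1, x2, x3])
v = vif(X)   # VIFs for x1 and x2 far above 10; x3 near 1
```

A VIF far above 10 for a coefficient signals that its variance is being inflated by collinearity, matching the symptoms listed above.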

5
Solutions to the Multicollinearity Problem
  • Drop a collinear variable from the regression
  • Combine collinear variables (e.g. use their sum
    as one variable)

6
Heteroscedasticity
  • The variance of the error terms is used in
    computing t-tests of the β coefficients. If this
    variance is not constant, the t-tests are
    unreliable (the estimates are inefficient, i.e.
    the probability of a type II error is higher).
  • However, the coefficients remain unbiased, so
    heteroscedasticity is not a fatal illness.
  • Check with the White test or similar tests.
  • Use heteroscedasticity-adjusted t-statistics and
    p-values.
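The heteroscedasticity-adjusted statistics mentioned above are usually built from White's sandwich (HC0) covariance estimator. A minimal numpy sketch, assuming simulated data in which the error standard deviation grows with the regressor:

```python
import numpy as np

def ols_with_hc0(X, y):
    """OLS coefficients with classical and White (HC0) standard errors.
    HC0 sandwich: (X'X)^-1 X' diag(e_i^2) X (X'X)^-1."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    e = y - X @ beta
    s2 = e @ e / (n - k)
    se_classic = np.sqrt(np.diag(s2 * XtX_inv))
    meat = X.T @ (X * e[:, None] ** 2)
    se_hc0 = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
    return beta, se_classic, se_hc0

# Simulated heteroscedastic data: error sd equals x, so variance changes.
rng = np.random.default_rng(1)
x = rng.uniform(1, 10, size=500)
y = 2 + 3 * x + rng.normal(scale=x)
X = np.column_stack([np.ones_like(x), x])
beta, se_c, se_w = ols_with_hc0(X, y)
```

The slope estimate stays unbiased either way (the point made above); only the standard errors, and hence the t-statistics, change between the classical and robust versions.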

7
Autocorrelation in Error Terms
  • This is a fatal illness, because it indicates a
    specification error (a missing variable, variables
    used in an inappropriate form, etc.).
  • Under the current incorrect specification you
    cannot see the true coefficients, which you would
    see if you were estimating the correct model.
    Hence this is a serious problem.
  • Check: Durbin-Watson statistic, graph of the error terms.
  • DW ≈ 2(1 - ρ), where et = ρ·et-1 + vt
  • Limitation of the DW test statistic: it only checks
    for first-order serial correlation in the residuals.
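The DW statistic itself is a one-liner. A sketch with simulated residual series (an illustrative assumption, not the output of a real model): white-noise residuals give DW near 2, while AR(1) residuals with ρ = 0.8 give DW near 2(1 - 0.8) = 0.4.

```python
import numpy as np

def durbin_watson(e):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2), approximately 2(1 - rho)."""
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(2)
white = rng.normal(size=1000)              # no serial correlation -> DW near 2
ar = np.empty(1000)
ar[0] = 0.0
for t in range(1, 1000):
    ar[t] = 0.8 * ar[t - 1] + rng.normal() # rho = 0.8 -> DW well below 2
```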

8
  • Breusch-Godfrey Test
  • checks for higher-order autocorrelation AR(q) in
    the residuals
  • H0: no serial correlation
  • Solutions to the problem of serial correlation in
    et:
  • Find the correct specification.
  • In time series, use first differences.
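A sketch of the Breusch-Godfrey LM statistic in plain numpy (statistics libraries ship ready-made versions); the regression with AR(1) errors is a simulated, illustrative assumption:

```python
import numpy as np

def breusch_godfrey_lm(X, e, q):
    """LM test for AR(q) serial correlation: regress the OLS residuals e_t
    on the original regressors and e_{t-1},...,e_{t-q} (missing lags set
    to 0). LM = (n - q) * R^2, asymptotically chi-squared(q) under H0."""
    n = len(e)
    lags = np.column_stack([np.concatenate([np.zeros(j), e[:-j]])
                            for j in range(1, q + 1)])
    Z = np.column_stack([X, lags])
    beta, *_ = np.linalg.lstsq(Z, e, rcond=None)
    u = e - Z @ beta
    r2 = 1.0 - (u @ u) / (e @ e)   # e sums to ~0 since X has an intercept
    return (n - q) * r2

# Simulated regression whose errors follow an AR(1) with rho = 0.7:
rng = np.random.default_rng(3)
n = 500
x = rng.normal(size=n)
u = np.empty(n)
u[0] = rng.normal()
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + rng.normal()
y = 1 + 2 * x + u
X = np.column_stack([np.ones(n), x])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ b
lm = breusch_godfrey_lm(X, e, q=2)  # far above the chi2(2) 5% cutoff of 5.99
```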

9
Time Series Regressions
  • Lagged variable: Yt = β0 + β1Xt + β2Xt-1 + ut
  • Autoregressive model (AR(2)): Xt = β1Xt-1 + β2Xt-2 + ut
  • Time trend: Yt = β0 + β1Xt + β2Tt + ut

10
Spurious Regressions
  • As a general and very strict rule:
  • All variables in a time-series regression must be
    stationary.
  • Never run a regression with nonstationary
    variables!
  • The DW statistic will warn you; usually DW << 2.
  • Since most economic time series grow over time,
    running a regression with non-stationary variables
    will produce spurious positive relationships.
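A quick simulation of the warning described above: two independent random walks regressed on each other (the series, seed, and length are illustrative assumptions). Even though x and y are unrelated, the slope tends to look "significant" while DW collapses far below 2.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
y = np.cumsum(rng.normal(size=n))   # random walk 1
x = np.cumsum(rng.normal(size=n))   # random walk 2, independent of y

X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
e = y - X @ beta
# Highly persistent residuals drive the Durbin-Watson statistic toward 0:
dw = np.sum(np.diff(e) ** 2) / np.sum(e ** 2)
```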

11
STATIONARITY
  • A variable is called stationary if it displays
    mean-reverting behavior (i.e., its mean and
    variance remain constant over time).
  • Any regression with nonstationary variables is
    invalid.
  • Hence, any time-series application must start
    with two preliminary steps:
  • Test the stationarity of the variables.
  • If they are not stationary, convert them into a
    stationary form.

12
  • A regression with nonstationary variables will
    typically reveal the problem through a
    Durbin-Watson (DW) statistic significantly
    smaller than 2.
  • The DW statistic measures first-order
    autocorrelation in the error term; DW << 2
    implies positive autocorrelation in the error
    term.
  • --------------------
  • Financial markets application: price series are
    typically nonstationary, therefore we use
    returns.
  • Rt = ln(Pt / Pt-1)
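This transformation is a one-liner; the price list below is an illustrative assumption:

```python
import numpy as np

def log_returns(prices):
    """R_t = ln(P_t / P_{t-1}): the stationary series used instead of prices."""
    p = np.asarray(prices, dtype=float)
    return np.log(p[1:] / p[:-1])

r = log_returns([100.0, 110.0, 99.0])
print(r)   # approximately [0.0953, -0.1054]
```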

13
TESTING STATIONARITY: UNIT ROOT TESTS
  • ADF Test
  • H0: the series is non-stationary
    (i.e., it has a unit root)
  • ADF test statistics must be compared with
    MacKinnon critical values. If H0 can be rejected
    (the test statistic is more negative than the
    critical value), the variable can be used in
    regression.

14
  • ADF (Augmented Dickey-Fuller) Test
  • Δyt = a + βt + γyt-1 + δ1Δyt-1 + ... + δpΔyt-p + εt
  • H0: γ = 0    HA: γ < 0
  • (The PP test makes a nonparametric adjustment for
    lagged changes.)
  • Test equation derived from the primitive form:
    Yt = βYt-1 + et
  • β < 1 → stationary; β = 1 → non-stationary;
    β > 1 → explosive
  • KPSS Test: H0: the series is stationary;
    HA: the series is non-stationary
    (use when the sample size is small)
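To make the mechanics concrete, here is a simplified Dickey-Fuller regression (no augmentation lags) in numpy. In practice one uses a full ADF implementation with MacKinnon critical values; the two series below are simulated, illustrative assumptions.

```python
import numpy as np

def df_tstat(y):
    """t-statistic on gamma in: dy_t = a + gamma * y_{t-1} + e_t.
    Simplified Dickey-Fuller (no lagged differences); compare against
    MacKinnon critical values (about -2.87 at 5% with a constant)."""
    dy = np.diff(y)
    Z = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(Z, dy, rcond=None)
    u = dy - Z @ beta
    s2 = u @ u / (len(dy) - 2)
    cov = s2 * np.linalg.inv(Z.T @ Z)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(5)
stationary = rng.normal(size=300)        # white noise: clearly stationary
walk = np.cumsum(rng.normal(size=300))   # random walk: has a unit root
```

For the stationary series the t-statistic is strongly negative (reject H0); for the random walk it stays near the critical region or above (fail to reject).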

15
Treating Non-stationary Variables
  • Before using a non-stationary series in any
    regression, we first have to treat it.
  • Possible remedies:
  • 1) First-difference it: Δyt = yt - yt-1
  • A series is
  • I(0) if it is stationary
  • I(1) if it becomes stationary when differenced
    once
  • I(2) if it becomes stationary when differenced
    twice
  • 2) Adjust for trend
  • Sometimes a series becomes stationary after
    de-trending; it is then called trend-stationary.
  • 3) Field-specific treatments: use
    inflation-adjusted series; in financial
    time series, use log returns.
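The first-difference remedy in code: by construction, differencing an I(1) cumulative sum recovers its stationary I(0) increments (the simulated series is an illustrative assumption).

```python
import numpy as np

rng = np.random.default_rng(6)
eps = rng.normal(size=400)    # stationary I(0) increments
walk = np.cumsum(eps)         # integrating once makes the series I(1)

d1 = np.diff(walk)            # Delta y_t = y_t - y_(t-1)
# Differencing undoes the integration: d1 equals eps[1:] exactly.
```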

16
  • Sometimes a variable may be stationary but have
    strong persistence. To check this, obtain the
    ACF and PACF.
  • A variable yt is called white noise if
    yt ~ i.i.d.(0, σ²)
  • When Yt and Xt are both white noise, a regression
    of the form Yt = β0 + β1Xt + et is adequate;
    otherwise the problem of serial correlation in
    the residuals will arise.
  • Portmanteau Test
  • H0: Xt is white noise
  • HA: Xt is not white noise
  • Solution: include lags of the persistent
    dependent variable.
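The portmanteau test referred to above is commonly the Ljung-Box Q statistic. A numpy sketch on simulated series (white noise versus a persistent AR(1); both series and the lag count m = 10 are illustrative assumptions):

```python
import numpy as np

def ljung_box_q(x, m=10):
    """Ljung-Box portmanteau statistic:
    Q = n(n+2) * sum_{k=1..m} rho_k^2 / (n - k),
    asymptotically chi-squared(m) under H0: white noise."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = xc @ xc
    q = 0.0
    for k in range(1, m + 1):
        rho_k = (xc[k:] @ xc[:-k]) / denom   # lag-k sample autocorrelation
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q

rng = np.random.default_rng(7)
wn = rng.normal(size=400)                   # white noise
ar = np.empty(400)
ar[0] = 0.0
for t in range(1, 400):
    ar[t] = 0.9 * ar[t - 1] + rng.normal()  # persistent, not white noise
```

A large Q rejects H0: the persistent AR(1) series yields a Q far beyond any chi-squared(10) critical value, while the white-noise series does not.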

17
Summary
  • Serial correlation in et may result from various
    causes, each signaling that the econometrician
    is doing something wrong:
  • 1) a missing variable (omitted variable bias)
  • 2) an incorrect functional form
  • 3) using nonstationary variables
  • 4) persistence in the variables
  • 5) a (linear deterministic) trend
  • These causes are interrelated (e.g., to correct
    for persistence in Xt, you add Xt-1, which was a
    missing variable in the original specification).

18
Diagnosis
  • 1) Always look at the time-series plots of all
    variables before you run any regression. Check
    for stationarity, persistence, trend, seasonality
    and outliers. Any unexpected result should remind
    you to follow all the steps on this page.
  • 2) Formal tests: before using any variable in a
    regression, always perform unit root tests and
    check persistence (autocorrelations).
  • 3) Post-regression: always perform the
    Durbin-Watson and Breusch-Godfrey tests for
    serial correlation in the residuals.

19
Treatment
  • 1) Try to find the reason. If it is due to a
    distortion (e.g., a missing variable, an
    inaccurate model specification, using nominal
    variables, a trend term, persistence), the
    first-best solution is to remove the distortion
    (find all relevant variables and the most
    appropriate model specification by reviewing the
    theory; use real variables; adjust for trend;
    add lags).
  • 2) First-differencing is the ultimate solution,
    especially in the case of nonstationary variables.
  • Take first differences of logs:
    Δln(yt) = ln(yt) - ln(yt-1)