CHAPTER 10

- MULTIPLE REGRESSION OF TIME SERIES
- LINEAR MULTIPLE REGRESSION MODEL
- General Multiple Regression Model

BIG CITY BOOKSTORE EXAMPLE

- Multiple Regression Model
- Coefficient of Multiple Determination-R2
- Partial Regression Coefficients
- Describing the Regression Plane

THE MULTIPLE REGRESSION MODELING PROCESS

- MULTICOLLINEARITY
- Collinearity Among Variables
- Solutions to Multicollinearity Problems
- An Example Solution
- PARTIAL F-TEST FOR INCLUDING VARIABLES

SERIAL CORRELATION PROBLEMS

- Forecasting with Serially Correlated Errors

ANALYSIS OF STOCK INDEXES USING COILS

- ELASTICITIES AND LOGARITHMIC RELATIONSHIPS
- HETEROSCEDASTICITY
- Wrong Functional Form: UK and US Stock Indexes
- Goldfeld-Quandt Test
- Interpretation of Elasticities

- WEIGHTED LEAST SQUARES
- GENERALIZED LEAST SQUARES
- BETA COEFFICIENTS
- DICHOTOMOUS (DUMMY) VARS. FOR MODELING EVENTS
- CONSTRUCTING CONFIDENCE AND PREDICTION INTERVALS
- PARSIMONY AND REGRESSION ANALYSIS
- AUTOMATED REGRESSION METHODS

CHAPTER 10 MULTIPLE REGRESSION OF TIME SERIES

- "Theorize longer, analyze shorter. Don't be in a rush to run the program. Think about the model from every angle, hypothesize how different variables affect each other. When you have a theory, then try it. Impatience is the enemy of valid models. Contemplation is productive work." The Author
- "Measure twice, cut once." The Carpenter's Rule

GENERAL LINEAR MULTIPLE REGRESSION

- General Multiple-Regression Model: $Y = a + b_1X_1 + b_2X_2 + \dots + b_nX_n + e$ (10-1)
- "n" is rarely above 6 to 7.

- BIG CITY BOOKSTORE EXAMPLE
- Table 10-1. Big City Bookstore

Year   Sales (Y)   Advertising (X1)   Competition (X2)
       (1000)      (1000)             (1000 sq. ft.)
1      27          20                 10
2      23          20                 15
3      31          25                 15
4      45          28                 15
5      47          29                 20
6      42          28                 25
7      39          31                 35
8      45          34                 35
9      57          35                 20
10     59          36                 30
11     73          41                 20
12     84          45                 20

Multiple Regression Model

- Correlation Matrix

              Sales   Advertising   Competition
Sales         1       .964          .221
Advertising   .964    1             .426
Competition   .221    .426          1

- SALES = f(ADV.), IGNORING COMPETITION

$\hat{Y} = -23.02 + 2.280X_1$ (10-2)
t-values: (-3.64) (11.5), where $X_1$ = advertising
$S_{yx} = 5.039$, adjusted $R^2 = .923$, $n = 12$, $F = 132.26$, DW = 1.13676

- SALES = f(COMP.), IGNORING ADVERTISING

$\hat{Y} = 37.34 + .477X_2$ (10-3)
t-values: (2.339) (.687), where $X_2$ = competition
$S_{yx} = 18.574$, adjusted $R^2 = -.050$, $n = 12$, $F = .472$ (significance .507), DW = .3767

- SALES = f(ADV., COMP.) SIMULTANEOUSLY

$\hat{Y} = -18.80 + 2.525X_1 - .545X_2$ (10-4)
t-values: (-4.879) (19.50) (-4.432)
$S_{yx} = 2.978$, adjusted $R^2 = .973$, $n = 12$, $F = 199.21$, DW = 1.7705

- Table 10-2. Simple and Multiple Regression for Big City Bookstore

a) Linear Regression, Sales = f(Advertising), Eq. 10-2

Dependent Variable                 SALES
Usable Observations                12        Degrees of Freedom   10
R Bar²                             0.9227
Std Error of Dependent Variable    18.1225
Standard Error of Estimate         5.0394
Sum of Squared Residuals           253.9530
Regression F(1,10)                 132.26
Significance Level of F            0.00000044
Durbin-Watson Statistic            1.137

Variable    Coeff     Std Error   T-Stat    Signif
Constant    -23.02    6.316       -3.644    0.0045
ADVERT      2.28      0.198       11.500    0.0000

b) Linear Regression, Sales = f(Competition), Eq. 10-3

Dependent Variable                 SALES
Usable Observations                12        Degrees of Freedom   10
R Bar²                             -0.050
Std Deviation of Dependent Variable 18.123
Standard Error of Estimate         18.574
Sum of Squared Residuals           3449.780
Regression F(1,10)                 0.472
Significance Level of F            0.5076
Durbin-Watson Statistic            0.377

Variable    Coeff     Std Error   T-Stat    Signif
Constant    37.3372   15.960      2.339     0.0414
COMP        0.4767    0.694       0.687     0.5076

c) Multiple Regression, Sales = f(Advertising, Competition), Eq. 10-4

Dependent Variable                 SALES
Usable Observations                12        Degrees of Freedom   9
R Bar²                             0.9730
Std Deviation of Dependent Variable 18.1225
Standard Error of Estimate         2.978
Sum of Squared Residuals           79.803
Regression F(2,9)                  199.2155
Significance Level of F            0.00000004
Durbin-Watson Statistic            1.771

Variable    Coeff      Std Error   T-Stat    Signif
Constant    -18.7958   3.8520      -4.880    0.0009
ADVERT      2.5248     0.1295      19.495    0.0000
COMP        -0.5449    0.1230      -4.432    0.0016
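The three fits of Table 10-2 can be checked directly from the Table 10-1 data. The sketch below is not the program used for the slides; it assumes NumPy is available, and the helper name `fit_ols` is made up for illustration. It computes the coefficients, the sum of squared residuals, and the adjusted R² by ordinary least squares.

```python
import numpy as np

# Table 10-1, Big City Bookstore (years 1-12)
sales = np.array([27, 23, 31, 45, 47, 42, 39, 45, 57, 59, 73, 84], dtype=float)
advert = np.array([20, 20, 25, 28, 29, 28, 31, 34, 35, 36, 41, 45], dtype=float)
comp = np.array([10, 15, 15, 15, 20, 25, 35, 35, 20, 30, 20, 20], dtype=float)


def fit_ols(y, *regressors):
    """Hypothetical helper: OLS with a constant; returns (coefficients, SSE, adjusted R^2)."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sse = float(resid @ resid)
    n, p = X.shape  # p counts the constant as well
    sst = float(((y - y.mean()) ** 2).sum())
    adj_r2 = 1.0 - (sse / (n - p)) / (sst / (n - 1))
    return coef, sse, adj_r2


coef_adv, sse_adv, r2_adv = fit_ols(sales, advert)           # Eq. 10-2
coef_comp, sse_comp, r2_comp = fit_ols(sales, comp)          # Eq. 10-3
coef_full, sse_full, r2_full = fit_ols(sales, advert, comp)  # Eq. 10-4

for name, coef, sse, r2 in [
    ("Sales = f(Adv)", coef_adv, sse_adv, r2_adv),
    ("Sales = f(Comp)", coef_comp, sse_comp, r2_comp),
    ("Sales = f(Adv, Comp)", coef_full, sse_full, r2_full),
]:
    print(name, np.round(coef, 4), "SSE =", round(sse, 3), "adj R2 =", round(r2, 4))
```

Note how the competition coefficient changes sign between the simple and the multiple regression once advertising is held constant.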

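- Multiple Coefficient of Determination, R²

$$R^2 = \frac{\text{Explained Variance}}{\text{Total Variance}} = 1 - \frac{\text{Unexplained Variance}}{\text{Total Variance}} = 1 - \frac{S_{yx}^2}{S_y^2}$$

- Partial Regression Coefficients: holding advertising fixed at $X_1 = 30$,

$$Y = -18.80 + 2.525(30) - .545X_2 + e = -18.80 + 75.75 - .545X_2 + e = 56.95 - .545X_2 + e \quad (10\text{-}5)$$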

- Figure 10-2 Here: Deviations About a Plane or Hyperspace

Describing The Regression Plane

- Figure 10-3 Here: Regression Plane for Equation 10-3.
- Figure 10-4 Here: Several Regression Lines on the Regression Plane
- Figure 10-5: The Multiple Regression Modeling Process

MULTIPLE REGRESSION MODELING-PLOTS

- Residuals vs. included independent variables: detect heteroscedasticity, misspecification (nonlinearity)
- Residuals vs. excluded independent variables: detect a variable to be included, misspecification (nonlinearity)
- Residuals vs. Y: detect serial correlation, heteroscedasticity, misspecifications
- Residuals(t) vs. Residuals(t-1): detect serial correlation, out-of-sample projections, unreasonable forecasts.

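MULTICOLLINEARITY

- Highly related independent variables may or may not be a problem. When they are a problem, a coefficient may be wrong, and so may its standard error:

$$S_{b_1} = \frac{S_{yx}}{\sqrt{\sum x_1^2 \,(1 - r_{12}^2)}} \quad (10\text{-}6)$$

- Here $\sum x_1^2$ is the sum of squared deviations of $X_1$ about its mean and $r_{12}$ is the correlation between $X_1$ and $X_2$. With $r_{12} = 1$, it is impossible to fit a unique model because of their redundancy.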
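To see how quickly the standard error in Eq. 10-6 grows with collinearity, the sketch below evaluates only the factor 1/sqrt(1 - r12²) that multiplies the standard error relative to the uncorrelated case. The function name `se_inflation` and the illustrative correlations 0.9 and 0.9998 are assumptions made for this example; 0.426 is the advertising-competition correlation from the bookstore example.

```python
import math


def se_inflation(r12):
    """Factor by which S_b1 in Eq. 10-6 grows relative to r12 = 0 (illustrative helper)."""
    if abs(r12) >= 1.0:
        # With r12 = 1 the denominator of Eq. 10-6 is zero: no unique model exists.
        raise ValueError("perfect collinearity: standard error is undefined")
    return 1.0 / math.sqrt(1.0 - r12 ** 2)


for r12 in (0.0, 0.426, 0.9, 0.9998):
    print(f"r12 = {r12:<6}  S_b1 multiplied by {se_inflation(r12):.3f}")
```

A moderate correlation such as 0.426 barely changes the standard error, while a near-perfect correlation inflates it many times over, which is why t-values can collapse even when the overall fit is good.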

- Multicollinearity Problems (MCP)
- May not be evident to the analyst
- Yields the wrong sign or insignificant t-values
- Avoid MCPs By
- Good theory
- Large sample sizes
- Good diagnostic procedures
- Sometimes MCP are simply an artifact of the sample

DETECTING

- Insignificant/incorrect regression coefficients
- Some strange regression results from MCP:
- Assume the correlation between X1 and X2 is high
- Each is highly correlated with Y
- Often one regression coefficient is negative despite the positive relationship
- Often one variable is highly significant and the other is not
- Often the sum of the regression coefficients equals the true, single-variable regression coefficient.

COLLINEARITY AMONG MORE THAN TWO VARIABLES

- What is the problem? Consider that some form of linear transformation of X2 and X3 perfectly defines X1:
- $X_1 = a + b_2X_2 + b_3X_3$
- with $S_{yx} = 0$, $R^2 = 1$, and $r_{1.23} = 1$. Thus, when an attempt is made to fit the following relationship, a solution is not possible:
- $Y = a + b_1X_1 + b_2X_2 + b_3X_3 + e$
- With perfect collinearity the estimation procedure aborts. This often happens with dichotomous variables:

t    Yt    d1   d2   d3   d4
1    10    1    0    0    0
2    20    0    1    0    0
3    30    0    0    1    0
4    5     0    0    0    1

- Thus, $d_1 = 1 - d_2 - d_3 - d_4$
- Solution is impossible.
- Avoid this by always defining one less dummy variable than there are categories.
- The last variable is part of the constant.
- Dummy variables are studied later.
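The dummy-variable case above can be checked numerically. This is a minimal sketch, assuming NumPy; the variable names are illustrative. It builds the design matrix with a constant and all four dummies from the small table, then with one dummy dropped, and compares the rank with the number of columns.

```python
import numpy as np

# d1..d4 for t = 1..4, exactly as in the small table above
dummies = np.eye(4)
constant = np.ones((4, 1))

# Constant plus all four dummies: 5 columns, but d1 + d2 + d3 + d4 = constant
X_all = np.hstack([constant, dummies])
# Constant plus only d1..d3: the omitted category is absorbed by the constant
X_dropped = np.hstack([constant, dummies[:, :3]])

rank_all = np.linalg.matrix_rank(X_all)
rank_dropped = np.linalg.matrix_rank(X_dropped)

print("all dummies :", X_all.shape[1], "columns, rank", rank_all)
print("one dropped :", X_dropped.shape[1], "columns, rank", rank_dropped)
```

When the rank is below the number of columns, the normal equations have no unique solution, which is the situation in which the estimation procedure aborts.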

Solutions to Multicollinearity Problems

- With redundant measures, delete the redundant variable. Good theory precludes most redundant variables.
- Some MCP are an artifact of a specific sample. Then additional observations may eliminate the problem.
- MCP can arise from flawed theories. When variables represent different dimensions of an influence, they might be combined using factor analysis.
- When MCP is caused by a unique sample, use ridge regression.
- When theory dictates that both variables should be included, then include them. While MCP affects regression coefficients and their interpretability, it might not alter the predictive power of the regression model. That is, the overall relationship may still be useful for prediction, as confirmed by a low standard error of estimate and a high F-value. To better understand this, consider the example below.

- An Example of MCPs (MULT.DAT)
- Table 10-3. Correlation Matrix

       X1        X2        X3        X4        Y
X1     1.0000    -0.1067   0.1821    0.9998    0.4622
X2     -0.1067   1.0000    0.1031    -0.1053   0.7479
X3     0.1821    0.1031    1.0000    0.1830    0.5334
X4     0.9998    -0.1053   0.1830    1.0000    0.4638
Y      0.4622    0.7479    0.5334    0.4638    1.0000

- Table 10-4. Models Illustrating Multicollinearity Problems (t-values under each coefficient, significance under each F-value)

Model   X1        X2        X3        X4        R2      F-value   Syx
M1                9.16      6.06                76.51   162.2     2166.5
                  (14.3)    (9.4)                       (.0000)
M2      14.82     9.96      4.84                98.51   2188.9    544.9
        (37.9)    (61.26)   (29.33)                     (.0000)
M3                9.95      4.83      14.83     98.53   2217.0    541.4
                  (61.61)   (29.47)   (38.17)           (.0000)
M4      -7.07     9.94      4.83      21.89     98.52   1648.3    543.8
        (-.38)    (61.18)   (29.30)   (1.17)            (.0000)

Partial F-test for Including Variables

- DETERMINING IF VARIABLES SHOULD BE IN A RELATIONSHIP: TEST WHETHER m VARIABLES SHOULD BE INCLUDED

$$F_{calculated} = \frac{(SSE_R - SSE_U)/m}{SSE_U/(n-k-1)} \quad (10\text{-}7)$$

- where
- $SSE_U$ = sum of squared errors with all variables in the relationship, called the unrestricted SSE
- $SSE_R$ = sum of squared errors with m variables excluded, called the restricted SSE
- k = total number of independent variables in the unrestricted model
- m = number of restricted (excluded) independent variables
- n = number of observations
- α = chosen level of significance, typically .01 or .05

- This test is used as follows:
- Estimate a full, unrestricted k-variable model. Capture the $SSE_U$.
- Estimate a partial, restricted model with k-m variables. Capture the $SSE_R$.
- Calculate F using Eq. 10-7. Compare it to the F-table value with degrees of freedom (m, n-k-1) and the chosen α, that is, $F_{m,\,n-k-1,\,\alpha}$.
- If F-calculated > F-table, then $SSE_R$ is significantly greater than $SSE_U$. The unexplained variance of the restricted model exceeds that of the unrestricted model, so there is additional explained variance from the unrestricted model.
- If F-calculated < F-table, then $SSE_R$ is not significantly greater than $SSE_U$, thus there is no significant additional explained variance from the unrestricted model.
- Consider Big City Bookstore with and without Competition:

$$F_{calculated} = \frac{(SSE_R - SSE_U)/m}{SSE_U/(n-k-1)} = \frac{(253.9 - 79.8)/1}{79.8/(12-2-1)} = 19.64 \quad (10\text{-}7a)$$

- F-calculated = 19.64 > F-table $F_{1,9,\alpha=.05}$ = 5.12
- F-calculated = 19.64 > F-table $F_{1,9,\alpha=.01}$ = 10.56
- We infer that COMP should be included. This is a powerful test.
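The bookstore calculation in Eq. 10-7a is short enough to script. The sketch below is illustrative only; the function name `partial_f` is an assumption, the SSE values come from Table 10-2 a) and c), and the critical values 5.12 and 10.56 are the F-table values quoted above rather than values computed here.

```python
def partial_f(sse_restricted, sse_unrestricted, m, n, k):
    """Partial F statistic of Eq. 10-7.

    m: number of excluded variables, n: observations,
    k: independent variables in the unrestricted model.
    """
    numerator = (sse_restricted - sse_unrestricted) / m
    denominator = sse_unrestricted / (n - k - 1)
    return numerator / denominator


# Restricted model: Sales = f(Adv); unrestricted model: Sales = f(Adv, Comp)
f_calc = partial_f(sse_restricted=253.953, sse_unrestricted=79.803, m=1, n=12, k=2)

F_TABLE_05 = 5.12   # F(1, 9) at alpha = .05, as quoted in the slides
F_TABLE_01 = 10.56  # F(1, 9) at alpha = .01, as quoted in the slides

include_comp = f_calc > F_TABLE_01
print(f"F-calculated = {f_calc:.2f}, include COMP: {include_comp}")
```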

SERIAL CORRELATION PROBLEMS

- An assumption of OLS: residuals are independent. That is, ACF(k) = 0 for all k > 0.
- When the $e_t$ have ACF(k) ≠ 0, there may be a deficiency in the model or its estimation. Consider Table 10-2 a), b), and c).
- Serial correlation denotes that the following may be incorrect: $R^2$, $S_{yx}$, $S_b$, b, and t-values
- First-order serial correlation denotes

$$Y_t = a + bX_t + \rho e_{t-1} + e_t \quad (10\text{-}8)$$

where ρ is rho, the first-order coefficient. In ARIMA terms ρ is actually $\theta_1$.

How to estimate ρ?

- One of several iterative processes, including Cochrane-Orcutt Iterative Least Squares (COILS), the Hildreth-Lu method, and the Prais-Winsten method. We illustrate the COILS method.

COILS: Given

$$Y_t = a + bX_t + \rho e_{t-1} + e_t \quad (10\text{-}9)$$

- Since the lagged error is $e_{t-1} = Y_{t-1} - \hat{Y}_{t-1}$ with $\hat{Y}_{t-1} = a + bX_{t-1}$,

$$\rho e_{t-1} = \rho(Y_{t-1} - \hat{Y}_{t-1}) = \rho\big(Y_{t-1} - (a + bX_{t-1})\big) \quad (10\text{-}10)$$

- Substituting equation 10-10 into 10-9 yields

$$Y_t = a + bX_t + \rho\big(Y_{t-1} - (a + bX_{t-1})\big) + e_t$$

- Expanding and combining the a's into a new term $a^* = (1 - \rho)a$:

$$Y_t = a + bX_t + \rho Y_{t-1} - \rho a - \rho bX_{t-1} + e_t$$

$$Y_t - \rho Y_{t-1} = a^* + bX_t - \rho bX_{t-1} + e_t \quad (10\text{-}11)$$

- Reintroducing the backshift operator, $(1 - B)Y_t = Y_t - Y_{t-1}$, and therefore $(1 - \rho B)Y_t = Y_t - \rho Y_{t-1}$. Equation 10-11 can therefore be simplified to

$$(1 - \rho B)Y_t = a^* + b(1 - \rho B)X_t + e_t \quad (10\text{-}12)$$

- This is estimated iteratively by trial and error using different values of ρ, called the Cochrane-Orcutt Iterative Least Squares (COILS) procedure.

- COILS can be used with OLS software:
- Run OLS to determine the first ρ (i.e., ACF(1) of $e_t$).
- Using ρ, transform $Y_t$ and $X_t$ to $Y_t^*$ and $X_t^*$:

$$Y_t^* = Y_t - \rho Y_{t-1} = (1 - \rho B)Y_t \qquad X_t^* = X_t - \rho X_{t-1} = (1 - \rho B)X_t$$

- Save these new variables for use in

$$Y_t^* = a^* + bX_t^* + e_t \quad (10\text{-}13)$$

(We lose one observation in backshifting. The Prais-Winsten method does not.)

- Estimate $a^*$ and b using OLS in Eq. 10-13.
- Iteratively search for the ρ with MIN(SSE).
- Using this ρ, use the coefficients of Eq. 10-13 in Eq. 10-8. However, remember that $a^* = (1 - \rho)a$.
- Figures 10-5 and 10-6 illustrate $X_t$ and $Y_t$.
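The manual search described in these steps can be sketched as a grid search over ρ. This is a minimal illustration, not the slides' software: it assumes NumPy, uses synthetic data generated here (the AR1DAT.DAT series is not available), and the names `ols`, `coils_grid`, and the true values 10, 0.5, and 0.7 are made up for the example.

```python
import numpy as np


def ols(x, y):
    """Simple regression of y on x; returns (intercept, slope, SSE)."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef[0], coef[1], float(resid @ resid)


def coils_grid(x, y, rhos):
    """Try each rho, regress the transformed series, keep the rho with minimum SSE."""
    best = None
    for rho in rhos:
        y_star = y[1:] - rho * y[:-1]  # Y*_t = Y_t - rho * Y_{t-1}; one observation is lost
        x_star = x[1:] - rho * x[:-1]  # X*_t = X_t - rho * X_{t-1}
        a_star, b, sse = ols(x_star, y_star)
        if best is None or sse < best["sse"]:
            # a* = (1 - rho) * a, so recover the original intercept a
            best = {"rho": float(rho), "a": a_star / (1.0 - rho), "b": b, "sse": sse}
    return best


# Synthetic data (an assumption for illustration): Y = 10 + 0.5 X + u, u_t = 0.7 u_{t-1} + shock
rng = np.random.default_rng(0)
n = 500
x = rng.normal(50.0, 2.0, n)
shocks = rng.normal(0.0, 1.0, n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.7 * u[t - 1] + shocks[t]
y = 10.0 + 0.5 * x + u

best = coils_grid(x, y, np.round(np.arange(0.0, 1.0, 0.05), 2))
print(best)
```

A grid step of 0.05 mirrors the coarse manual search; packaged routines typically iterate to a finer value of ρ.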

- Table 10-5. OLS Between Y and X, AR1DAT.DAT

Usable Observations                100       Degrees of Freedom   98
R Bar²                             0.5924
Std Error of Dependent Variable    2.898
Standard Error of Estimate         1.850
Sum of Squared Residuals           335.476
Regression F(1,98)                 144.895
Significance Level of F            0.00000000
Durbin-Watson Statistic            0.905
Q(25)                              61.009
Significance Level of Q            0.00008

Variable       Coeff     Std Error   T-Stat    Signif
1. Constant    79.894    9.899       8.071     0.00000000
2. X           0.681     0.057       12.037    0.00000000

- Table 10-6. ACFs of $e_t$ for OLS of Table 10-5

Lags 1-6:     0.547    0.239    0.242    0.147    0.084    0.049
Lags 7-12:    0.005   -0.003   -0.025   -0.141   -0.103    0.001

Approx. 2Se(ACF) = $2/\sqrt{100}$ = .20

- Using the ACF(1) of .547 yields $Y_t^* = Y_t - .55Y_{t-1}$ and $X_t^* = X_t - .55X_{t-1}$. Regressing these two variables yields Table 10-7.
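The ACF check behind Table 10-6 can be sketched as follows. The `acf` function is a generic sample autocorrelation written for illustration (it is an assumption, not the slides' software), the alternating series is a made-up test input, and the listed autocorrelations are the Table 10-6 values compared against the approximate two-standard-error band of 2/sqrt(100).

```python
import math
import numpy as np


def acf(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a series about its mean."""
    e = np.asarray(series, dtype=float)
    e = e - e.mean()
    denom = float(e @ e)
    return [float(e[k:] @ e[:-k]) / denom for k in range(1, max_lag + 1)]


# Residual ACFs reported in Table 10-6 (lags 1 to 12)
table_10_6 = [0.547, 0.239, 0.242, 0.147, 0.084, 0.049,
              0.005, -0.003, -0.025, -0.141, -0.103, 0.001]

band = 2 / math.sqrt(100)  # approximate 2 standard errors for n = 100
significant_lags = [lag for lag, r in enumerate(table_10_6, start=1) if abs(r) > band]
print("band =", band, "significant lags:", significant_lags)

# Sanity check of acf() on a made-up, strongly negatively correlated series
alternating = [1.0, -1.0] * 50
r1 = acf(alternating, 1)[0]
print("lag-1 ACF of alternating series:", r1)
```

Applied to the residuals of an OLS fit, `acf(resid, 1)[0]` gives the starting value of ρ used in the first COILS step.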

- Table 10-7. Y* = f(X*) for ρ = .55

Dependent Variable Y*, Estimation by Least Squares
Usable Observations                99        Degrees of Freedom   97
R Bar²                             0.456
Std Error of Dependent Variable    2.0089
Standard Error of Estimate         1.4823
Sum of Squared Residuals           213.145
Regression F(1,97)                 82.99
Significance Level of F            0.00000000
Durbin-Watson Statistic            1.637
Q(24)                              22.958
Significance Level of Q            0.5223

Variable       Coeff     Std Error   T-Stat    Signif
1. Constant    49.731    4.3745      11.368    0.00000000
2. X*          0.506     0.0555      9.110     0.00000000

- Now, let's try ρ = .45 and ρ = .65. First, $Y_t^* = Y_t - .45Y_{t-1}$ and $X_t^* = X_t - .45X_{t-1}$.
- Table 10-8. Y* = f(X*) for ρ = .45

Dependent Variable Y*, Estimation by Least Squares
Usable Observations                99        Degrees of Freedom   97
R Bar²                             0.483
Standard Error of Estimate         1.519
Sum of Squared Residuals           223.879
Regression F(1,97)                 92.44
Significance Level of F            0.00000000
Durbin-Watson Statistic            1.478
Q(24)                              26.128
Significance Level of Q            0.34668

Variable       Coeff     Std Error   T-Stat    Signif
1. Constant    57.403    5.4164      10.598    0.00000000
2. X*          0.541     0.0563      9.615     0.00000000

- This is worse than ρ = .55: the sum of squared residuals is higher (223.879 vs. 213.145).

- Consider ρ in the opposite direction, ρ = .65: $Y_t^* = Y_t - .65Y_{t-1}$ and $X_t^* = X_t - .65X_{t-1}$.
- Table 10-9. Y* = f(X*) for ρ = .65

Dependent Variable Y*, Estimation by Least Squares
Usable Observations                99        Degrees of Freedom   97
R Bar²                             0.432
Standard Error of Estimate         1.465
Sum of Squared Residuals           208.201
Regression F(1,97)                 75.52
Significance Level of F            0.00000000
Durbin-Watson Statistic            1.795
Q(24)                              23.328
Significance Level of Q            0.5005

Variable       Coeff     Std Error   T-Stat    Signif
1. Constant    40.549    3.353       12.09     0.00000000
2. X*          0.476     0.055       8.69      0.00000000

- Table 10-10. Iterations of ρ to Minimum SSE

ρ       SSE        D-W Statistic
.00     335.5      .905
.45     223.88     1.478
.55     213.145    1.637
.65*    208.20     1.795
.75     209.79     1.933
.85     218.626    2.033
.95     235.22     2.082

* denotes the optimal value of ρ in the manual search.

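Forecasting With Serially Correlated Errors

- $Y_t = a + bX_t + \rho e_{t-1} + e_t$

$$Y_t = 40.55/(1 - .65) + .476X_t + .65e_{t-1} + e_t$$

$$Y_t = 115.85 + .476X_t + .65e_{t-1} + e_t \quad (10\text{-}14)$$

- Forecast of $Y_t$ made at the end of period t-1: $\hat{Y}_t = 115.85 + .476X_t + .65e_{t-1}$
- Forecast of $Y_{t+1}$ made at the end of period t-1: $\hat{Y}_{t+1} = 115.85 + .476X_{t+1} + .65(0)$ (10-15), where the lagged error needed for period t+1 is still unknown at the end of period t-1, so its expected value of 0 is used.
- Cochrane-Orcutt Iterative Least Squares (COILS)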

- Table 10-11. Y = f(X), COILS

Usable Observations                99        Degrees of Freedom   96
R Bar²                             0.744
Std Error of Dependent Variable    2.910
Standard Error of Estimate         1.472
Sum of Squared Residuals           207.954
Durbin-Watson Statistic            1.835
Q(24)                              24.037
Significance Level of Q            0.4018

Variable       Coeff      Std Error   T-Stat    Signif
1. Constant    117.105    9.592       12.208    0.00000000
2. Xt          0.468      0.055       8.550     0.00000000
3. RHO (ρ)     0.677      0.077       8.815     0.00000000

- Because ρ is so high, only a fraction of the explained variance is attributed to $X_t$.
- This R² and RSE are indicative of a one-period forecast. After one period, the influence of ρ declines to zero.
- The RSE (Standard Error of Estimate) for $Y_{t+K}$ for K > 1
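The forecasting rule of Eqs. 10-14 and 10-15 can be sketched as below. The coefficients are those of the ρ = .65 manual fit (Table 10-9); the function name `forecast` and the inputs X = 100 and a last residual of 2.0 are hypothetical values chosen only to show the arithmetic.

```python
RHO = 0.65      # optimal rho from the manual search
A_STAR = 40.55  # constant of the transformed regression (Table 10-9)
B = 0.476       # slope of the transformed regression (Table 10-9)

# a* = (1 - rho) * a, so the intercept for the original series is a*/(1 - rho)
A = A_STAR / (1.0 - RHO)


def forecast(x_future, last_residual, steps_ahead):
    """One-step forecasts use rho * e(t-1); beyond one step the error term is set to 0,
    following Eq. 10-15."""
    error_term = RHO * last_residual if steps_ahead == 1 else 0.0
    return A + B * x_future + error_term


# Hypothetical inputs for illustration only
one_step = forecast(x_future=100.0, last_residual=2.0, steps_ahead=1)
two_step = forecast(x_future=100.0, last_residual=2.0, steps_ahead=2)
print(round(A, 2), round(one_step, 2), round(two_step, 2))
```

The difference between the two forecasts is exactly the ρ-weighted residual, which is the part of the one-period accuracy that is lost at longer horizons.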

REVIEWING COILS

- OLS: $Y_t = 79.894 + .681X_t + e_t$, RSE = 1.850, DW = 0.905
- Correct coefficients from COILS: $Y_t = 117.105 + .468X_t + e_t$, RSE = 1.472, DW = 1.835
- Figures 10-7 and 10-8 Here

- Table 10-12. Y = f(X) by OLS, ARDAT.DAT

Usable Observations                100       Degrees of Freedom   98
R Bar²                             0.368
Std Error of Dependent Variable    1.644
Standard Error of Estimate         1.306
Sum of Squared Residuals           167.243
Regression F(1,98)                 58.70
Significance Level of F            0.00000000
Durbin-Watson Statistic            0.211
Q(25)                              310.561
Significance Level of Q            0.00000000

Variable       Coeff      Std Error   T-Stat    Signif
1. Constant    192.602    12.077      15.948    0.00000000
2. Xt          -0.907     0.118       -7.661    0.00000000

- Table 10-13. Y = f(X), Estimation by COILS

Usable Observations                99        Degrees of Freedom   96
R Bar²                             0.910
Std Error of Dependent Variable    1.652
Standard Error of Estimate         0.497
Sum of Squared Residuals           23.6708
Durbin-Watson Statistic            2.012
Q(24)                              25.823
Significance Level of Q            0.30927

Variable       Coeff      Std Error   T-Stat    Signif
1. Constant    114.298    12.331      9.269     0.00000000
2. Xt          -0.136     0.120       -1.133    0.25960000
3. RHO (ρ)     0.952      0.032       29.582    0.00000000

- This was generated using a random number generator:

$$X_0 = 100, \qquad Y_0 = 100$$

$$X_t = X_{t-1} + (1.5 - (RAN1 + RAN2 + RAN3))$$

$$Y_t = Y_{t-1} + (1.5 - (RAN4 + RAN5 + RAN6))$$

ANALYSIS OF STOCK INDEXES USING COILS