Ka-fu Wong University of Hong Kong - PowerPoint PPT Presentation

1
Ka-fu Wong, University of Hong Kong
Modeling and Forecasting Trends
2
Background
  • The unobserved components approach to modeling
    and forecasting economic time series assumes that
    the typical economic time series, $y_t$, is made up
    of the sum of three independent components:
  • a time trend component,
  • a seasonal component, and
  • an irregular or cyclical component.
  • $y_t$ = time trend + seasonal + cyclical
    = $T_t + S_t + C_t$
  • The time trend refers to the long-run average
    behavior of the series.
  • The seasonal refers to the annual predictable
    cyclical behavior of the series associated with
    weather patterns, holiday patterns, etc.
  • The cyclical component refers to the remainder of
    the series after the trend and seasonal have been
    accounted for.

3
Background
  • The assumption that these components are
    determined independently means that each
    component is determined and influenced by its own
    set of forces and, consequently, each component
    can be studied separately.
  • The approach is called an unobserved components
    approach because we do not directly observe each
    of the three components; we only get to observe
    their sum. Our job will be to model and estimate
    the various components and use these estimates as
    the basis for forecasting the components and
    their sum.

4
Background
  • Whether the assumption underlying the unobserved
    components approach, that the trend, seasonal,
    and cyclical components are determined
    independently, is plausible or not is debatable
    and is, in fact, an issue of some controversy
    among economists.
  • For example, many macroeconomists argue that
    economic growth (trend) and the business cycle
    (cyclical) are determined by a common set of
    forces.

5
U.S. Female Labor Force Participation Rate
6
U.S. Male Labor Force Participation Rate
7
Hong Kong labor force participation rates (male
and female)
8
China's per Capita Real GDP
9
Modeling the Trend
  • If we look at China's per capita real GDP time
    series or any one of your time series, the first
    thing that stands out to us is the obvious tendency
    of the series to grow (or, in some cases, to
    fall) over time.
  • That is, it is immediately apparent from the time
    series plot that the average change in the series
    is positive (or, in some cases, negative). This
    tendency is the series' trend.
  • The simplest model of the time trend is the
    linear trend model
  • $T_t = \beta_0 + \beta_1 t$, $t = 1, \ldots, T$

10
Modeling the Trend
  • The simplest model of the time trend is the
    linear trend model
  • $T_t = \beta_0 + \beta_1 t$, $t = 1, \ldots, T$
  • The trend component is a straight line with
    intercept $\beta_0$ and slope $\beta_1$. And, $T_1 = \beta_0 + \beta_1$,
    $T_2 = \beta_0 + 2\beta_1$, ..., $T_T = \beta_0 + T\beta_1$.
  • Note that $\beta_1 = dT_t/dt$ and $\beta_1 = T_t - T_{t-1}$. So,
  • $\beta_1 > 0$ if y has a positive trend and
  • $\beta_1 < 0$ if y has a negative trend.
  • The intercept, as is often the case in
    econometric models, does not have a meaningful
    interpretation and its sign can be positive or
    negative, regardless of the trend's sign.
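
The constant period-to-period change of a linear trend can be checked numerically. The following is a minimal Python sketch; the function name and the parameter values ($\beta_0 = 10$, $\beta_1 = 0.5$) are illustrative assumptions, not taken from the slides.

```python
def linear_trend(beta0, beta1, T):
    """Return [T_1, ..., T_T] with T_t = beta0 + beta1 * t."""
    return [beta0 + beta1 * t for t in range(1, T + 1)]

# Hypothetical parameter values, chosen only for illustration.
trend = linear_trend(beta0=10.0, beta1=0.5, T=5)

# First differences T_t - T_{t-1}; each one equals the slope beta1.
diffs = [b - a for a, b in zip(trend, trend[1:])]

print(trend)  # [10.5, 11.0, 11.5, 12.0, 12.5]
print(diffs)  # [0.5, 0.5, 0.5, 0.5]
```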

11
Graphical view of linear trend
A downward trend
An upward trend
12
Polynomial trend model
  • In some cases, a linear trend is inadequate to
    capture the trend of a time series. A natural
    generalization of the linear trend model is the
    polynomial trend model
  • $T_t = \beta_0 + \beta_1 t + \beta_2 t^2 + \cdots + \beta_p t^p$, where p is a
    positive integer.
  • Note that the linear trend model is a special
    case of the polynomial trend model (p = 1).
  • For economic time series we almost never require
    p > 2. That is, if the linear trend model is not
    adequate, the quadratic trend model will usually
    work:
  • $T_t = \beta_0 + \beta_1 t + \beta_2 t^2$
  • In the quadratic model, $dT_t/dt = \beta_1 + 2\beta_2 t$.
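
In discrete time the same point shows up in the differences: for a quadratic trend the first difference is $T_t - T_{t-1} = \beta_1 + \beta_2(2t - 1)$, which is linear in t, and the second difference is the constant $2\beta_2$. A small Python sketch, with made-up coefficients used purely as an illustration:

```python
def quadratic_trend(beta0, beta1, beta2, T):
    """Return [T_1, ..., T_T] with T_t = beta0 + beta1*t + beta2*t^2."""
    return [beta0 + beta1 * t + beta2 * t ** 2 for t in range(1, T + 1)]

def differences(series):
    """Period-to-period changes of a series."""
    return [b - a for a, b in zip(series, series[1:])]

# Hypothetical coefficients: beta0 = 1, beta1 = 2, beta2 = 0.5.
trend = quadratic_trend(1.0, 2.0, 0.5, T=6)
first_diffs = differences(trend)         # linear in t
second_diffs = differences(first_diffs)  # constant, equal to 2 * beta2

print(trend)         # [3.5, 7.0, 11.5, 17.0, 23.5, 31.0]
print(first_diffs)   # [3.5, 4.5, 5.5, 6.5, 7.5]
print(second_diffs)  # [1.0, 1.0, 1.0, 1.0]
```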

13
Graphical view of Quadratic Trends
14
Graphical view of Quadratic Trends
15
The Log Linear Trend Model
  • Another alternative to the linear trend model is
    the log linear trend model, which is also called
    the exponential trend model
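  • $T_t = \beta_0 \exp(\beta_1 t)$
  • or, taking natural logs on both sides,
  • $\log(T_t) = \log(\beta_0) + \beta_1 t$
  • so that the log of the trend component is
    linear.
  • Note that for the log linear trend model
  • $\beta_1 = \log(T_t) - \log(T_{t-1}) \approx$ the percentage change in T
    (expressed as a proportion)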
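
To see why $\beta_1$ is read as a growth rate, one can compare the log difference with the exact proportional change $T_t/T_{t-1} - 1 = \exp(\beta_1) - 1$. The sketch below uses assumed values ($\beta_0 = 100$, $\beta_1 = 0.03$) that do not come from the slides.

```python
import math

def exponential_trend(beta0, beta1, T):
    """Return [T_1, ..., T_T] with T_t = beta0 * exp(beta1 * t)."""
    return [beta0 * math.exp(beta1 * t) for t in range(1, T + 1)]

# Hypothetical parameter values for illustration.
trend = exponential_trend(beta0=100.0, beta1=0.03, T=5)

# Log differences equal beta1 in every period (up to rounding).
log_diffs = [math.log(b) - math.log(a) for a, b in zip(trend, trend[1:])]

# Exact proportional change per period: exp(beta1) - 1, slightly above beta1.
growth = [b / a - 1 for a, b in zip(trend, trend[1:])]

print(log_diffs)
print(growth)
```

For small $\beta_1$ the two measures are close, which is why the log difference is commonly used as the growth rate.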

16
Graphical view of exponential trends
17
Graphical view of exponential trends
18
Which trend model to use?
  • Knowing the differences among these models can
    help us decide whether the linear, quadratic or
    log linear trend model is more appropriate for
    our data.
  • In the linear trend model the change in T is
    constant over time.
  • In the quadratic trend model the change in T has
    a linear trend.
  • In the log linear trend model the growth rate
    of T is constant over time.
  • However, in practice, it is not always obvious by
    simply looking at the time series plot which form
    the trend model should take: linear, log linear,
    quadratic, or something else?
  • Practice and experience are the most helpful.

19
All Deterministic Trend Models
  • Note that in all of these models, the trend is
    deterministic, i.e., perfectly forecastable. For
    instance, in the linear trend model, the forecast
    of $T_{T+h}$ made at time T is
  • $T_{T+h} = \beta_0 + \beta_1 (T+h)$
  • (Later in the course we will talk about
    stochastic trend models, in which the trend of
    the series is not perfectly forecastable.)
  • However, even if we correctly specify the shape
    of the trend (linear, quadratic, exponential, ...),
    the parameters of the trend model are unknown.
    So, in practice, we will have to estimate these
    parameters, which will introduce errors (called
    sampling or estimation error) into our trend
    forecasts.

20
Estimating the Trend Model
  • Our assumption at this point is that our time
    series, $y_t$, can be modeled as
  • $y_t = T_t(\theta) + \varepsilon_t$
  • where
  • $T_t$ is one of the trend models we discussed
    earlier,
  • $\theta$ is the set of parameters, e.g., $\theta = (\beta_0, \beta_1)$ in a
    linear trend model,
  • $\varepsilon_t$ denotes the other factors (i.e., the seasonal
    and cyclical components) that determine $y_t$.
  • Since $\theta$ is unknown, it is natural to estimate the
    trend model via the least squares approach:
  • $\hat\theta = \arg\min_{\theta} \sum_{t=1}^{T} (y_t - T_t(\theta))^2$

Quadratic loss: $\hat\theta$ is the choice of $\theta$ that minimizes the
objective function.
21
Estimating the Trend Model via the Least Squares
approach
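  • For the linear trend model:
    $\min_{\beta_0, \beta_1} \sum_{t=1}^{T} (y_t - \beta_0 - \beta_1 t)^2$
    Can use OLS.
  • For the quadratic trend model:
    $\min_{\beta_0, \beta_1, \beta_2} \sum_{t=1}^{T} (y_t - \beta_0 - \beta_1 t - \beta_2 t^2)^2$
    Can use OLS.
  • For the exponential trend model:
    $\min_{\beta_0, \beta_1} \sum_{t=1}^{T} (y_t - \beta_0 \exp(\beta_1 t))^2$
    Nonlinear, has to be estimated numerically.
    or
    $\min_{\beta_0, \beta_1} \sum_{t=1}^{T} (\log y_t - \log \beta_0 - \beta_1 t)^2$
    Can use OLS.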
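
As a rough illustration of the OLS cases, the sketch below fits a linear trend and a log linear trend (OLS on logs) using the closed-form formulas for a regression on a constant and t. The helper names and the data are made up for this example; the quadratic case needs a multiple regression routine and is not shown. Note that OLS on logs minimizes squared errors in logs, which is not the same objective as nonlinear least squares in levels.

```python
import math

def ols_line(x, y):
    """Closed-form OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

def fit_linear_trend(y):
    """Estimate (beta0, beta1) in T_t = beta0 + beta1 * t, t = 1..T."""
    t = list(range(1, len(y) + 1))
    return ols_line(t, y)

def fit_log_linear_trend(y):
    """Estimate (beta0, beta1) in T_t = beta0 * exp(beta1 * t) by OLS of
    log(y_t) on t. Requires y_t > 0 for all t."""
    t = list(range(1, len(y) + 1))
    log_beta0, beta1 = ols_line(t, [math.log(v) for v in y])
    return math.exp(log_beta0), beta1

# Made-up data for illustration only.
y = [102.0, 105.1, 108.0, 111.6, 114.9, 118.2, 122.1, 125.8]
print(fit_linear_trend(y))
print(fit_log_linear_trend(y))
```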
22
Property of the Ordinary Least Squares Estimators
  • Under the assumptions of the unobserved
    components model, the OLS estimator of the linear
    and quadratic trend models is
  • unbiased,
  • consistent, and
  • asymptotically efficient.
  • Standard regression procedures can be applied to
    test hypotheses about the $\beta$s and construct
    interval estimates. This is true even though the
    regression errors will generally be serially
    correlated and heteroskedastic.

23
Forecasting the Trend
  • Once we have specified a trend model, our forecast
    of the h-step-ahead trend component of y will
    simply be
  • $T_{t+h}(\theta)$
  • When $\theta$ is unknown, we can estimate it as
    discussed earlier and substitute the estimate
    into the function above.

24
Forecasting the Trend
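We would like to forecast $y_{T+h}$ based on all
information available at time T.
  • Assume that the trend is linear:
    $y_{T+h} = \beta_0 + \beta_1 \mathrm{TIME}_{T+h} + \varepsilon_{T+h}$

If we know the true parameters, the part
$\beta_0 + \beta_1 \mathrm{TIME}_{T+h}$ can be forecasted perfectly.
Can we forecast $\varepsilon_{T+h}$? Sometimes YES. Sometimes
NO.
NO when $\varepsilon_t$ is known to be an independent
zero-mean random noise.
If $\varepsilon_t$ is an i.i.d. sequence with zero mean then
$E(\varepsilon_{T+h} \mid \text{information available at time } T) = E(\varepsilon_{T+h}) = E(\varepsilon_t) = 0$,
where the first equality uses independence, the second
uses identical distribution, and the third uses the zero mean.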
25
Forecasting the Trend
Assume $\varepsilon_t$ is known to be an independent zero-mean
random noise.
Forecast when parameters are known:
Emphasize that the forecast is made at time T,
utilizing all information that is available at
time T (usually all past information).
Forecast error:
Fundamental uncertainty! Unavoidable!
Forecast when parameters are unknown:
Substitute in the estimates from the OLS
regression.
Forecast error:
Due to parameter uncertainty (increases with h).
Note: $\mathrm{TIME}_{T+h} = T + h$.
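
The equations on this slide did not survive in the transcript. Under the slide's stated assumptions (linear trend, independent zero-mean noise), the two forecasts and their errors would take roughly the following form. This is a reconstruction, with hats denoting OLS estimates and $y_{T+h,T}$ denoting the forecast of $y_{T+h}$ made at time T; the notation is assumed rather than confirmed by the source.

```latex
% Reconstruction under the linear trend assumption; notation is assumed.
\begin{align*}
y_{T+h} &= \beta_0 + \beta_1\,\mathrm{TIME}_{T+h} + \varepsilon_{T+h} \\[4pt]
% Known parameters: only fundamental uncertainty remains.
y_{T+h,T} &= \beta_0 + \beta_1\,\mathrm{TIME}_{T+h} \\
e_{T+h,T} &= y_{T+h} - y_{T+h,T} = \varepsilon_{T+h} \\[4pt]
% Unknown parameters: plug in OLS estimates.
\hat{y}_{T+h,T} &= \hat\beta_0 + \hat\beta_1\,\mathrm{TIME}_{T+h} \\
\hat{e}_{T+h,T} &= y_{T+h} - \hat{y}_{T+h,T}
  = (\beta_0 - \hat\beta_0) + (\beta_1 - \hat\beta_1)\,\mathrm{TIME}_{T+h} + \varepsilon_{T+h}
\end{align*}
```

The term $(\beta_1 - \hat\beta_1)\,\mathrm{TIME}_{T+h}$ is multiplied by $T + h$, which is why the parameter-uncertainty part of the error grows with the horizon h.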
26
Density forecast
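  • Suppose we have no parameter uncertainty. Then we have
    $y_{T+h} = \beta_0 + \beta_1 \mathrm{TIME}_{T+h} + \varepsilon_{T+h}$
    and
    $y_{T+h,T} = \beta_0 + \beta_1 \mathrm{TIME}_{T+h}$.
    The forecast error is $e_{T+h,T} = y_{T+h} - y_{T+h,T} = \varepsilon_{T+h}$.
  • Then the distribution of the forecast error will
    simply be the distribution of $\varepsilon_{T+h}$. That is, for
    any real number c,
    $P(e_{T+h,T} \le c) = P(\varepsilon_{T+h} \le c)$.

$E(e_{T+h,T}) = 0$, $\mathrm{Var}(e_{T+h,T}) = \sigma^2$, where $\sigma^2 = \mathrm{var}(\varepsilon_t)$.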
27
Density forecast
  • Further assume the $\varepsilon$s are i.i.d. $N(0, \sigma^2)$, while
    continuing to ignore parameter uncertainty. Then
    the density forecast will be that
    $y_{T+h} \sim N(y_{T+h,T}, \sigma^2)$.
  • Note that this density forecast depends on the
    unknown parameter $\sigma^2$. To make the density
    forecast operational, we can replace $\sigma^2$ with an
    unbiased and consistent estimator, $\hat\sigma^2$.

28
Density forecast
  • Now consider the case with parameter uncertainty.
  • Under usual assumptions, the forecast error due
    to parameter uncertainty is asymptotically normal.
  • Thus, $e_{T+h,T}$ will be asymptotically normal.
  • The unknown variance may be estimated as

Because we assume a linear trend.
29
(No Transcript)
30
Density forecast
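Then we act as though $y_{T+h}$ is distributed as
$N(\hat y_{T+h,T}, \hat\sigma^2)$,
where $\hat y_{T+h,T}$ is the point forecast and $\hat\sigma^2$ is the
estimated forecast error variance, or, equivalently,
$(y_{T+h} - \hat y_{T+h,T}) / \hat\sigma \sim N(0, 1)$.
So, for example,
$P(y_{T+h} \le c) = P(Z \le (c - \hat y_{T+h,T}) / \hat\sigma)$,
where Z is an N(0,1) random variable.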
31
Density forecast
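  • Further, we can construct interval forecasts of
    $y_{T+h}$:

$\hat y_{T+h,T} \pm z_{1-\alpha/2}\, \hat\sigma$
is a $(1-\alpha) \times 100$ percent forecast interval for $y_{T+h}$,
where $z_{1-\alpha/2}$ is the $(1-\alpha/2) \times 100$th percentile
of the N(0,1) distribution and $\hat\sigma$ is the estimated
standard deviation of the forecast error.
  • For example, if $\alpha = 0.05$ then we obtain a
    95-percent forecast interval for $y_{T+h}$,

$\hat y_{T+h,T} \pm 1.96\, \hat\sigma$,
since 1.96 is the 97.5th percentile of the N(0,1).
  • Recall the interpretation of this kind of
    interval: 95 percent of the time, this procedure will
    produce an interval that will turn out to include
    the actual value of $y_{T+h}$.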
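
A minimal Python sketch of such an interval forecast for a linear trend follows. It is a simplification under stated assumptions: parameter uncertainty is ignored, the error variance is estimated by the residual sum of squares divided by T - 2, and the function names and data are invented for this example.

```python
from statistics import NormalDist

def fit_linear_trend(y):
    """OLS of y_t on a constant and t = 1..T.
    Returns (b0, b1, s2), where s2 = SSR / (T - 2)."""
    T = len(y)
    t = range(1, T + 1)
    tbar, ybar = (T + 1) / 2, sum(y) / T
    b1 = (sum((ti - tbar) * (yi - ybar) for ti, yi in zip(t, y))
          / sum((ti - tbar) ** 2 for ti in t))
    b0 = ybar - b1 * tbar
    ssr = sum((yi - (b0 + b1 * ti)) ** 2 for ti, yi in zip(t, y))
    return b0, b1, ssr / (T - 2)

def interval_forecast(y, h, alpha=0.05):
    """(lower, point, upper) forecast of y_{T+h}, treating the estimated
    parameters as if they were known (no parameter uncertainty)."""
    b0, b1, s2 = fit_linear_trend(y)
    point = b0 + b1 * (len(y) + h)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    half_width = z * s2 ** 0.5
    return point - half_width, point, point + half_width

# Made-up data for illustration only.
y = [5.2, 5.9, 6.1, 7.4, 7.0, 8.3, 8.1, 9.6, 9.2, 10.8]
print(interval_forecast(y, h=1))
```

Because parameter uncertainty is left out, the interval has the same width at every horizon h; an interval that accounts for it would widen as h grows.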

32
Selecting Forecasting Models: R-square as a
criterion
  • Consider the mean squared error (MSE)

$\mathrm{MSE} = \frac{1}{T} \sum_{t=1}^{T} e_t^2$,
where T is the sample size and $e_t = y_t - \hat y_t$ is the in-sample residual.
  • Note that the model with the smallest MSE is also the
    model with the smallest sum of squared residuals,
    because scaling the sum of squared residuals by
    a constant (1/T) will not change the ranking.

33
Selecting Forecasting Models: R-square as a
criterion
  • Consider the R-square ($R^2$)

$R^2 = 1 - \frac{\sum_{t=1}^{T} e_t^2}{\sum_{t=1}^{T} (y_t - \bar y)^2}$
The denominator depends only on the data, not on the model.
  • Thus, the model with the largest R-square is also
    the model with the smallest MSE, and also the
    model with the smallest sum of squared residuals,
    because scaling the sum of squared residuals by a
    model-independent quantity will not change the
    ranking.

34
Selecting Forecasting Models: R-square as a
criterion
  • The R-square ($R^2$) may be a good measure of
    in-sample fit but a bad measure of out-of-sample
    fit.
  • If we add an additional regressor to the model, we will
    always obtain an R-square ($R^2$) no less than the
    one with fewer regressors. That is, a polynomial
    trend model with a larger p will almost always
    result in a smaller MSE and hence a larger
    R-square.
  • In fact, give me a time series and specify an $R^2$,
    and, subject to data availability, I can almost always
    produce a trend model that will attain the
    specified $R^2$.
  • This effect is called in-sample overfitting or
    data mining.
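
The in-sample effect can be seen by fitting polynomial trends of increasing degree to the same series. The sketch below is one way to do this in plain Python: it solves the normal equations by Gauss-Jordan elimination and rescales time to t/T for numerical stability (which leaves the fitted values unchanged). The function name and the data are made up for illustration.

```python
def poly_trend_ssr(y, p):
    """Sum of squared residuals from OLS of y_t on 1, x, ..., x^p,
    where x = t / T (rescaled time, t = 1..T)."""
    T, k = len(y), p + 1
    X = [[(t / T) ** j for j in range(k)] for t in range(1, T + 1)]
    # Augmented normal equations [X'X | X'y].
    A = [[sum(row[r] * row[c] for row in X) for c in range(k)]
         + [sum(row[r] * yi for row, yi in zip(X, y))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

# Made-up data for illustration only.
y = [5.2, 5.9, 6.1, 7.4, 7.0, 8.3, 8.1, 9.6, 9.2, 10.8, 10.1, 11.9]
ybar = sum(y) / len(y)
tss = sum((yi - ybar) ** 2 for yi in y)

# R-square never falls as the polynomial degree p increases.
r2_by_degree = {p: 1 - poly_trend_ssr(y, p) / tss for p in range(1, 5)}
for p, r2 in r2_by_degree.items():
    print(f"p = {p}: R^2 = {r2:.4f}")
```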

35
Selecting Forecasting Models: R-square as a
criterion
  • In short, the MSE is a biased estimator of
    out-of-sample h-step-ahead prediction error
    variance,
  • because the forecast error consists of two
    parts:
  • Fundamental uncertainty (unavoidable even if we
    know the parameters)
  • Parameter uncertainty (increases with the number
    of parameters in the model)
  • To reduce the bias associated with MSE and
    R-square, we need to penalize for the number of
    parameters included in the model (or the degrees
    of freedom).

36
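Selecting Forecasting Models: Adjusted R-square as
a criterion
  • Adjusted R-square

$\bar R^2 = 1 - \frac{\sum_{t=1}^{T} e_t^2 / (T-k)}{\sum_{t=1}^{T} (y_t - \bar y)^2 / (T-1)} = 1 - \frac{s^2}{\sum_{t=1}^{T} (y_t - \bar y)^2 / (T-1)}$,
where k is the number of parameters (degrees of freedom used) and
$s^2 = \sum_{t=1}^{T} e_t^2 / (T-k)$.
  • Maximizing adjusted R-square is like minimizing
    $s^2$.

For a given sum of squared residuals, $s^2$ increases with the number of parameters.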
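
A small numerical sketch of how the degrees-of-freedom penalty can reverse a ranking. The sums of squares below are hypothetical numbers chosen for illustration, not estimates from any data set; k counts all estimated parameters including the intercept.

```python
def r_squared(ssr, tss):
    """R^2 = 1 - SSR / TSS."""
    return 1 - ssr / tss

def adjusted_r_squared(ssr, tss, T, k):
    """Adjusted R^2 = 1 - s^2 / (TSS / (T - 1)), with s^2 = SSR / (T - k)."""
    s2 = ssr / (T - k)
    return 1 - s2 / (tss / (T - 1))

# Hypothetical values: T = 40 observations, total sum of squares 400.
T, tss = 40, 400.0
linear = {"ssr": 50.0, "k": 2}  # linear trend: 2 parameters
cubic = {"ssr": 48.0, "k": 4}   # cubic trend: 4 parameters, slightly lower SSR

for name, m in (("linear", linear), ("cubic", cubic)):
    print(name,
          round(r_squared(m["ssr"], tss), 4),
          round(adjusted_r_squared(m["ssr"], tss, T, m["k"]), 4))
```

With these numbers the cubic model has the higher R-square, but the linear model has the higher adjusted R-square, because the small drop in the sum of squared residuals does not offset the two extra parameters.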
37
Selecting Forecasting Models: Criteria that
penalize the number of model parameters
  • Akaike information criterion (AIC)
  • Schwarz information criterion (SIC)

38
The variation of criteria with k/T
39
Use the consistent model selection criteria
  • A model selection criterion is consistent if the
    following conditions are met
  • When the true model (i.e., the data-generating
    process, DGP) is among the models considered,
    the probability of selecting the true DGP
    approaches 1 as the sample size gets large.
  • When the true model is not among those
    considered, so that it is impossible to select
    the true DGP, the probability of selecting the
    best approximation to the true DGP approaches 1
    as the sample size gets large.
  • SIC is consistent but AIC is not.

40
Use the asymptotically efficient model selection
criteria
  • An asymptotically efficient model selection
    criterion chooses a sequence of models, as the
    sample size gets large, whose 1-step-ahead
    forecast error variances approach the one that
    would be obtained using the true model with known
    parameters at a rate at least as fast as that of
    any other model selection criterion.
  • AIC is asymptotically efficient but SIC is not.

41
AIC or SIC
  • Usually AIC and SIC suggest the same model.
  • When AIC and SIC suggest different models, we
    usually choose the model selected by SIC because
    the SIC often suggests a more parsimonious model
    (i.e., smaller number of parameters).

42
AIC and SIC reported across software packages
  • $\ln(\mathrm{AIC}) = \ln(\mathrm{MSE}) + 2k/T$
  • $\ln(\mathrm{SIC}) = \ln(\mathrm{MSE}) + k \ln(T)/T$
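
The two log formulas above are equivalent to AIC = exp(2k/T) × MSE and SIC = T^(k/T) × MSE, so both criteria are the MSE inflated by a penalty factor. A sketch that computes them from a sum of squared residuals; the function name and the numbers are hypothetical, and software packages may report differently scaled versions.

```python
import math

def selection_criteria(ssr, T, k):
    """MSE, s^2, AIC and SIC from the sum of squared residuals (ssr),
    sample size T, and number of estimated parameters k."""
    mse = ssr / T
    return {
        "MSE": mse,
        "s2": ssr / (T - k),
        "AIC": math.exp(2 * k / T) * mse,  # ln(AIC) = ln(MSE) + 2k/T
        "SIC": T ** (k / T) * mse,         # ln(SIC) = ln(MSE) + k ln(T)/T
    }

# Hypothetical comparison with T = 40: the cubic trend has the lower SSR
# but uses two more parameters than the linear trend.
linear = selection_criteria(ssr=50.0, T=40, k=2)
cubic = selection_criteria(ssr=48.0, T=40, k=4)

for name in ("MSE", "s2", "AIC", "SIC"):
    best = "linear" if linear[name] < cubic[name] else "cubic"
    print(f"{name}: linear = {linear[name]:.4f}, cubic = {cubic[name]:.4f}, prefers {best}")
```

In this example the MSE prefers the cubic model while s², AIC, and SIC all prefer the linear one. Since ln(T) > 2 whenever T ≥ 8, the SIC penalty per parameter is the heavier of the two in samples of that size or larger.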

43
Out-of-sample fitting
  • The AIC and SIC are in-sample fit criteria,
    although they account for the costs of
    overfitting through the inclusion of a penalty
    term.
  • What we are really interested in is the question:
  • Having fit the model over the sample period, how
    well does it forecast outside of that sample?
  • The in-sample fit criteria that we discussed do
    not directly answer this question.

44
Out-of-sample fitting
  • Suppose we have a data sample $y_1, \ldots, y_T$.
  • Break it up into two parts (where $n \ll T$):
  • $y_1, \ldots, y_{T-n}$ (first T-n observations)
  • $y_{T-n+1}, \ldots, y_T$ (last n observations)

[Timeline figure: observations 1 to T-n are used to estimate the model; observations T-n+1 to T are saved for checking the out-of-sample fit.]
45
Out-of-sample fitting
  • Break it up into two parts (where $n \ll T$):
  • $y_1, \ldots, y_{T-n}$ (first T-n observations)
  • $y_{T-n+1}, \ldots, y_T$ (last n observations)
  • Fit the shortened sample, $y_1, \ldots, y_{T-n}$, to various
    trend models that may seem like plausible choices
    based on time series plots and in-sample fit
    criteria: linear, quadratic, the one selected
    by AIC/SIC, log linear, ...
  • For each estimated trend model, forecast
    $y_{T-n+1}, \ldots, y_T$ and compute the forecast errors
    $e_1, \ldots, e_n$
  • Compare the errors across the various models
  • time series plots (of the forecasts and actual
    values of $y_{T-n+1}, \ldots, y_T$, and of the forecast errors)
  • tables of the forecasts, actuals, and errors
  • mean squared prediction errors (MSPE)
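
The procedure above can be sketched in Python for two candidate models, a linear and a log linear trend. The helper names and the synthetic data are assumptions made for this illustration; the log linear forecast uses a simple exp() back-transformation to levels.

```python
import math

def ols_line(x, y):
    """Closed-form OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    slope = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
             / sum((xi - xbar) ** 2 for xi in x))
    return ybar - slope * xbar, slope

def holdout_mspe(y, n, log_linear=False):
    """Fit a trend on y_1..y_{T-n}, forecast y_{T-n+1}..y_T, and return
    the mean squared prediction error over the n held-out observations."""
    T = len(y)
    t_fit = list(range(1, T - n + 1))
    y_fit = [math.log(v) for v in y[:T - n]] if log_linear else y[:T - n]
    b0, b1 = ols_line(t_fit, y_fit)
    squared_errors = []
    for t in range(T - n + 1, T + 1):
        forecast = b0 + b1 * t
        if log_linear:
            forecast = math.exp(forecast)  # back-transform to levels
        squared_errors.append((y[t - 1] - forecast) ** 2)
    return sum(squared_errors) / n

# Synthetic series that grows exponentially, for illustration only.
y = [100 * math.exp(0.05 * t) for t in range(1, 25)]
print("linear MSPE:    ", holdout_mspe(y, n=4))
print("log linear MSPE:", holdout_mspe(y, n=4, log_linear=True))
```

On this synthetic series the log linear trend has the smaller MSPE by construction. With real data the ranking can depend on the choice of n and on the particular hold-out period.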

46
Out-of-sample fitting
  • The advantage of this approach is that we are
    actually comparing the trend models in terms of
    their out-of-sample forecasting performance.
  • A disadvantage is that the comparison is based on
    models fit over T-n observations rather than the
    T observations we have available. (Note that if
    you do use this approach and, for example, settle
    on the quadratic model, then when you proceed to
    construct your forecasts for T+1 and beyond, you should
    use the quadratic model fit to the full T
    observations in your sample.)
  • Will the fact that, for example, the quadratic
    trend model outperformed other models in
    forecasting out of sample based on the short
    sample mean that it will perform best in
    forecasting beyond the full sample? No.

47
End