Multiple Regression Model: Hypotheses Testing
1
Multiple Regression Model: Hypotheses Testing
  • y = b0 + b1x1 + b2x2 + ... + bkxk + u

2
Lecture 4: THE MULTIPLE REGRESSION MODEL:
HYPOTHESES TESTING. Professor Victor Aguirregabiria
  • OUTLINE
  • Introduction and Examples
  • Sampling Distribution of OLS
  • Hypotheses on one parameter: t-test
  • Hypotheses on a linear combination of parameters:
    t-test
  • Hypotheses on multiple linear restrictions: F-test

3
1. Introduction and Examples
  • Typically, an estimated econometric model is used
    to make predictions and to make decisions.
  • In both cases, it is important to know how much
    uncertainty (or how much confidence) we have in
    the predictions and decisions we made based on
    the estimated model.
  • There are different sources of uncertainty (e.g., is
    the model the right one?). Here we concentrate on the
    uncertainty generated by the fact that we have
    estimates, and not the true parameters.

4
Example 1: Demand Elasticity and Pricing
  • Consider the example (in Lecture 2) of the
    manager interested in the estimation of the
    elasticity of demand.
  • The demand model is
  • where ε is the elasticity of demand.

5
Example 1: Demand Elasticity and Pricing
  • Given the estimated model, the manager will use
    it to decide her profit-maximizing price.
  • The manager may decide to take this estimate at
    face value to choose her optimal price.

6
Example 1: Demand Elasticity and Pricing
  • However, this decision ignores the uncertainty
    associated with the estimate, and therefore the
    uncertainty in profits.
  • The profit function is
  • But the true ε is unknown.

7
Example 1: Demand Elasticity and Pricing
  • Note that, from the point of view of the manager,
    the estimated elasticity is a known value (say
    3.45) and the true value of ε is a random
    variable (with mean 3.45).
  • Is this uncertainty important for the choice of
    the optimal price?
  • Yes: the optimal price that accounts for
    uncertainty is different from the price that
    ignores it.

8
Example 1: Demand Elasticity and Pricing
  • How can the manager take account of this
    uncertainty when deciding prices?
  • There are different possible approaches:
  • Treat profits as a random variable (conditional
    on the estimate of ε), and maximize expected
    profits.
  • Construct confidence intervals for the true ε and
    use them to construct confidence intervals on
    optimal prices.

9
Example 2: Credit Scores and Mortgages
  • Consider a bank deciding whether to give a
    mortgage loan to a person.
  • To make this decision, the bank uses a MRM
    (multiple regression model) to predict the
    probability of repayment.
  • The dependent variable Y is an indicator, 0 or 1,
    of the repayment of a loan. The explanatory
    variables X are socio-economic characteristics of
    the borrower, the aggregate economy, and the
    local economy.

10
Example 2: Credit Scores and Mortgages
  • Using data on previous mortgages, the bank
    estimates this MRM.
  • The fitted values of this model can be
    interpreted as predictions of the probability of
    repayment of a mortgage.
  • Given the estimated model, the bank has to decide
    whether to approve a mortgage for a new applicant.

11
Example 2: Credit Scores and Mortgages
  • Given the values of the X regressors for this
    applicant, the fitted value (credit score) of
    this applicant is 0.76.
  • What does the bank do with this credit score?
  • The bank has to choose either Y = 1 (approval) or
    Y = 0 (denial).
  • There is a possible error associated with each
    decision.

12
Example 2: Credit Scores and Mortgages
  • Suppose that the bank considers the scenario
    (hypothesis) that the person will repay the
    mortgage (Y = 1).
  • If the bank rejects the hypothesis when it is
    true, it makes a TYPE I ERROR.
  • If the bank accepts the hypothesis when it is
    false, it makes a TYPE II ERROR.
  • Each error has a probability.

13
2. SAMPLING DISTRIBUTION OF THE OLS ESTIMATOR
  • In order to do hypothesis testing, we need to
    know the sampling distribution of the OLS
    estimator (not just its mean and variance).
  • We add another assumption (beyond the
    Gauss-Markov assumptions).
  • Assumption 6 (Normality): u is independent of
    x1, x2, ..., xk, and u is normally distributed
    with zero mean and variance σ²:
  • u ~ Normal(0, σ²)

14
COMMENTS ON THE NORMALITY ASSUMPTION
  • Assumptions 1 to 6 are called the Classical
    Assumptions.
  • Under the Classical Assumptions, OLS is not only
    BLUE, but it is the minimum variance unbiased
    estimator.
  • Normality of u implies that, conditional on x:
    y|x ~ Normal(b0 + b1x1 + ... + bkxk, σ²)
  • Under the Classical Assumptions, the OLS estimator
    is normally distributed.

15
The homoskedastic normal distribution with a
single explanatory variable
[Figure: the conditional density f(y|x) is normal at
each x, centered on the line E(y|x) = b0 + b1x, with
the same variance at the two values x1 and x2.]
16
COMMENTS ON THE NORMALITY ASSUMPTION
  • Do we really need to assume that u is normally
    distributed for the OLS estimator to be normal?
  • Not really.
  • When we have large samples (and under Assumptions
    1 to 5), the Central Limit Theorem implies that
    the OLS estimator is close to normally
    distributed.
  • We say that the OLS estimator is asymptotically
    normal. This means that the normal distribution
    is a good approximation when the sample is large,
    even when u is not normal.
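A small Monte Carlo sketch of this point (the simulation setup, sample sizes, and coefficient values below are all hypothetical): even when the errors u are strongly skewed and non-normal, the sampling distribution of the OLS slope is centered on the true value and close to normal in moderately large samples.

```python
import numpy as np

rng = np.random.default_rng(1)
n, reps = 200, 2000          # sample size and number of simulated samples (assumed)
true_b1 = 0.5                # assumed true slope
slopes = np.empty(reps)

for r in range(reps):
    x = rng.normal(size=n)
    u = rng.exponential(1.0, size=n) - 1.0   # mean-zero but strongly skewed errors
    y = 2.0 + true_b1 * x + u
    # OLS slope in the simple regression of y on x
    slopes[r] = np.cov(x, y, bias=True)[0, 1] / np.var(x)

mean_slope = slopes.mean()   # centered very near 0.5, as asymptotic normality suggests
```

Plotting a histogram of `slopes` would show a roughly symmetric, bell-shaped distribution despite the non-normal u.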

17
3. Hypotheses about one parameter: t-test
18
3. Hypotheses about one parameter: t-test
  • Knowing the sampling distribution of the
    standardized estimator allows us to carry out
    hypothesis tests.
  • Start with a null hypothesis.
  • For example, H0: bj = 0
  • If we accept the null, we accept that xj has no
    effect on y, controlling for the other x's.

19
3. Hypotheses about one parameter: t-test
20
3. Hypotheses about one parameter: t-test
  • Besides our null, H0, we need an alternative
    hypothesis, H1, and a significance level.
  • H1 may be one-sided or two-sided:
  • H1: bj > 0 is one-sided
  • H1: bj < 0 is one-sided
  • H1: bj ≠ 0 is a two-sided alternative
  • If we want to have only a 5% probability of
    rejecting H0 when it is really true, then we say
    our significance level is 5%.

21
3. Hypotheses about one parameter: t-test
  • Having picked a significance level α, we look
    up the (1 - α)th percentile in a t distribution
    with n - k - 1 degrees of freedom and call this
    c, the critical value.
  • We can reject the null hypothesis if the t
    statistic is greater than the critical value.
  • If the t statistic is less than the critical
    value, then we fail to reject the null.
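The steps above can be sketched numerically. The estimate, standard error, and degrees of freedom below are hypothetical stand-ins for the output of a fitted regression; `scipy` supplies the t distribution percentile.

```python
from scipy import stats

# Hypothetical regression output: coefficient estimate, its standard
# error, and the degrees of freedom n - k - 1.
beta_hat, se_beta, df = 0.52, 0.21, 25

t_stat = beta_hat / se_beta          # t statistic for H0: b_j = 0
alpha = 0.05
c = stats.t.ppf(1 - alpha, df)       # one-sided 5% critical value (about 1.71)

reject = t_stat > c                  # here t is about 2.48, so we reject H0
```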

22
One-Sided Alternatives (cont)
yi = b0 + b1xi1 + ... + bkxik + ui;  H0: bj = 0;
H1: bj > 0
[Figure: density of the t statistic under H0. The area
(1 - α) to the left of the critical value c is the
fail-to-reject region; the upper tail of area α beyond
c is the rejection region.]
23
t-test: One-sided alternatives (cont)
  • Because the t distribution is symmetric, testing
    H1: bj < 0 is straightforward. The critical
    value is just the negative of the one before.
  • We can reject the null if the t statistic < -c,
    and if the t statistic > -c then we fail to
    reject the null.
  • For a two-sided test, we set the critical value
    based on α/2 and reject H0 in favor of H1: bj ≠ 0
    if the absolute value of the t statistic > c.

24
t-test: Two-sided alternatives
  • Because the t distribution is symmetric, for a
    two-sided test we set the critical value based
    on α/2 and reject H0 in favor of H1: bj ≠ 0 if
    the absolute value of the t statistic > c.
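In the two-sided case the significance level α is split between the two tails. A minimal sketch with a hypothetical t statistic and degrees of freedom:

```python
from scipy import stats

# Hypothetical t statistic and degrees of freedom n - k - 1.
t_stat, df, alpha = -2.1, 40, 0.05

c = stats.t.ppf(1 - alpha / 2, df)   # two-sided critical value (about 2.02)
reject = abs(t_stat) > c             # |t| = 2.1 > c, so H0 is rejected
```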

25
Two-Sided Alternatives
yi = b0 + b1xi1 + ... + bkxik + ui;  H0: bj = 0;
H1: bj ≠ 0
[Figure: density of the t statistic under H0. The
central area (1 - α) between -c and c is the
fail-to-reject region; the two tails of area α/2
each, beyond -c and c, are the rejection regions.]
26
Summary for H0: bj = 0
  • Unless otherwise stated, the alternative is
    assumed to be two-sided.
  • If we reject the null, we typically say "xj is
    statistically significant at the α level".
  • If we fail to reject the null, we typically say
    "xj is statistically insignificant at the α
    level".

27
Testing other hypotheses
  • A more general form of the t statistic recognizes
    that we may want to test something like
  • H0: bj = aj
  • In this case, the appropriate t statistic is
    t = (b̂j - aj) / se(b̂j)
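For instance, with hypothetical numbers, testing H0: bj = 1 (say, a unit demand elasticity) instead of H0: bj = 0 just shifts the numerator:

```python
# Hypothetical regression output: estimate and standard error of b_j,
# and the hypothesized value a_j = 1.
beta_hat, se_beta, a_j = 0.86, 0.11, 1.0

t_stat = (beta_hat - a_j) / se_beta   # about -1.27, small in absolute value
```

Here |t| is well below typical critical values, so H0: bj = 1 would not be rejected even though the same estimate would reject H0: bj = 0 easily.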

28
Confidence Intervals
  • We can construct a confidence interval using the
    same critical value as was used for a two-sided
    test.
  • A (1 - α) confidence interval is defined as
    b̂j ± c · se(b̂j)
    where c is the (1 - α/2) percentile of the t
    distribution with n - k - 1 df.
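A sketch of the interval with hypothetical regression output:

```python
from scipy import stats

# Hypothetical estimate, standard error, and degrees of freedom.
beta_hat, se_beta, df, alpha = 0.86, 0.11, 60, 0.05

c = stats.t.ppf(1 - alpha / 2, df)                     # about 2.00
ci = (beta_hat - c * se_beta, beta_hat + c * se_beta)  # roughly (0.64, 1.08)
```

Because this interval contains 1.0, the two-sided test of H0: bj = 1 at the 5% level would not reject, illustrating the duality between confidence intervals and two-sided tests.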

29
Computing p-values for t tests
  • An alternative to the classical approach is to
    ask: what is the smallest significance level at
    which the null would be rejected?
  • So, compute the t statistic, and then look up
    what percentile it is in the appropriate t
    distribution: this is the p-value.
  • The smaller the p-value, the stronger the
    evidence against the null hypothesis.
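With a hypothetical t statistic, the two-sided p-value is twice the upper-tail probability:

```python
from scipy import stats

# Hypothetical t statistic and degrees of freedom.
t_stat, df = 2.41, 30

p_two_sided = 2 * stats.t.sf(abs(t_stat), df)   # roughly 0.02
```

Here H0 would be rejected at the 5% significance level but not at the 1% level, which is exactly what the p-value summarizes in one number.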

30
  • 4. TESTING A LINEAR COMBINATION OF PARAMETERS
  • Suppose instead of testing whether b1 is equal to
    a constant, you want to test whether it is equal
    to another parameter, that is, H0: b1 = b2
  • Use the same basic procedure for forming a t
    statistic

31
Testing Linear Combo (cont)
32
Testing a Linear Combo (cont)
  • So, to use the formula, we need s12 (the
    covariance between the two estimated
    coefficients), which standard output does not
    report
  • Many packages will have an option to get it, or
    will just perform the test for you
  • In Stata, after reg y x1 x2 ... xk you would type
    test x1 = x2 to get a p-value for the test
  • More generally, you can always restate the
    problem to get the test you want
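A sketch of why the covariance term s12 is needed, using simulated data and a hand-rolled OLS fit (the data-generating values and sample size are hypothetical; the data are built so that b1 = b2 holds by construction):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 1.0 * x1 + 1.0 * x2 + rng.normal(size=n)   # true b1 = b2 = 1

X = np.column_stack([np.ones(n), x1, x2])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)          # OLS estimates
resid = y - X @ beta
s2 = resid @ resid / (n - 3)                          # sigma^2 estimate, df = n - k - 1
V = s2 * np.linalg.inv(X.T @ X)                       # covariance matrix of the estimates

# se of (b1_hat - b2_hat) needs the covariance V[1, 2], not just the variances:
se_diff = np.sqrt(V[1, 1] + V[2, 2] - 2 * V[1, 2])
t_stat = (beta[1] - beta[2]) / se_diff                # small, since H0 is true here
```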

33
Multiple Linear Restrictions
  • Everything we've done so far has involved
    testing a single linear restriction (e.g.,
    b1 = 0 or b1 = b2)
  • However, we may want to jointly test multiple
    hypotheses about our parameters
  • A typical example is testing exclusion
    restrictions: we want to know if a group of
    parameters are all equal to zero

34
Testing Exclusion Restrictions
  • Now the null hypothesis might be something like
    H0: bk-q+1 = 0, ..., bk = 0
  • The alternative is just H1: H0 is not true
  • We can't just check each t statistic separately,
    because we want to know if the q parameters are
    jointly significant at a given level; it is
    possible for none of them to be individually
    significant at that level

35
Exclusion Restrictions (cont)
  • To do the test we need to estimate the
    restricted model, without xk-q+1, ..., xk
    included, as well as the unrestricted model,
    with all the x's included
  • Intuitively, we want to know if the change in
    SSR is big enough to warrant inclusion of
    xk-q+1, ..., xk

36
The F statistic
  • The F statistic is
    F = [(SSRr - SSRur)/q] / [SSRur/(n - k - 1)]
  • The F statistic is always positive, since the
    SSR from the restricted model can't be less than
    the SSR from the unrestricted
  • Essentially the F statistic measures the
    relative increase in SSR when moving from the
    unrestricted to the restricted model
  • q = number of restrictions, or dfr - dfur
  • n - k - 1 = dfur
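With hypothetical SSRs from the two fits, the computation is direct:

```python
from scipy import stats

# Hypothetical sums of squared residuals and dimensions.
ssr_r, ssr_ur = 183.6, 165.4   # restricted and unrestricted SSR (assumed)
q, df_ur = 3, 60               # number of restrictions and n - k - 1

F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)   # about 2.20
c = stats.f.ppf(0.95, q, df_ur)                 # 5% critical value, about 2.76
reject = F > c                                  # False here: fail to reject H0
```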

37
The F statistic (cont)
  • To decide if the increase in SSR when we move to
    a restricted model is big enough to reject the
    exclusions, we need to know the sampling
    distribution of our F statistic
  • Not surprisingly, F ~ F(q, n - k - 1), where q is
    referred to as the numerator degrees of freedom
    and n - k - 1 as the denominator degrees of
    freedom

38
The F statistic (cont)
  • Reject H0 at the α significance level if F > c
[Figure: density f(F) of the F distribution under H0.
The area (1 - α) below the critical value c is the
fail-to-reject region; the upper tail of area α
beyond c is the rejection region.]
39
The R² form of the F statistic
  • Because the SSRs may be large and unwieldy, an
    alternative form of the formula is useful
  • We use the fact that SSR = SST(1 - R²) for any
    regression, so we can substitute for SSRr and
    SSRur:
    F = [(R²ur - R²r)/q] / [(1 - R²ur)/(n - k - 1)]
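A sketch with hypothetical R² values from the restricted and unrestricted fits (both on the same dependent variable, as the formula requires):

```python
# Hypothetical R-squared values and dimensions.
r2_ur, r2_r = 0.42, 0.36   # unrestricted and restricted R² (assumed)
q, df_ur = 2, 100          # number of restrictions and n - k - 1

F = ((r2_ur - r2_r) / q) / ((1 - r2_ur) / df_ur)   # about 5.17
```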

40
Overall Significance
  • A special case of exclusion restrictions is to
    test H0: b1 = b2 = ... = bk = 0
  • Since the R² from a model with only an intercept
    will be zero, the F statistic is simply
    F = [R²/k] / [(1 - R²)/(n - k - 1)]
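So the overall-significance F statistic needs only the regression's R², k, and n (hypothetical values below):

```python
# Hypothetical regression summary: R², number of regressors k, sample size n.
r2, k, n = 0.28, 4, 105

F = (r2 / k) / ((1 - r2) / (n - k - 1))   # about 9.72
```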

41
General Linear Restrictions
  • The basic form of the F statistic will work for
    any set of linear restrictions
  • First estimate the unrestricted model and then
    estimate the restricted model
  • In each case, make note of the SSR
  • Imposing the restrictions can be tricky; you
    will likely have to redefine variables

42
F Statistic Summary
  • Just as with t statistics, p-values can be
    calculated by looking up the percentile in the
    appropriate F distribution
  • Stata will do this if you enter display fprob(q,
    n - k - 1, F), where the appropriate values of F,
    q, and n - k - 1 are used
  • If only one exclusion is being tested, then
    F = t², and the p-values will be the same
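This last equivalence is easy to check numerically: the square of a t(df) variable follows an F(1, df) distribution, so the two-sided t p-value and the F p-value coincide (the t statistic and df below are hypothetical):

```python
from scipy import stats

# Hypothetical t statistic and degrees of freedom.
t_stat, df = 2.0, 50

p_t = 2 * stats.t.sf(abs(t_stat), df)   # two-sided t p-value
p_F = stats.f.sf(t_stat ** 2, 1, df)    # F(1, df) p-value for F = t²
```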