Title: Multiple Regression Model: Hypotheses Testing
Multiple Regression Model: Hypotheses Testing
- y = b0 + b1 x1 + b2 x2 + ... + bk xk + u
Lecture 4: The Multiple Regression Model: Hypotheses Testing
Professor Victor Aguirregabiria
OUTLINE
- Introduction and Examples
- Sampling Distribution of OLS
- Hypotheses on One Parameter: t-test
- Hypotheses on a Linear Combination of Parameters: t-test
- Hypotheses on Multiple Linear Restrictions: F-test
1. Introduction and Examples
- Typically, an estimated econometric model is used to make predictions and to make decisions.
- In both cases, it is important to know how much uncertainty (or how much confidence) we have in the predictions and decisions based on the estimated model.
- There are different sources of uncertainty (e.g., is the model the right one?). Here we concentrate on the uncertainty generated by the fact that we have estimates, not the true parameters.
Example 1: Demand Elasticity and Pricing
- Consider the example (from Lecture 2) of a manager interested in the estimation of the elasticity of demand.
- The demand model is a log-log specification, ln(q) = b0 - ε ln(p) + u, where ε is the elasticity of demand.
Example 1: Demand Elasticity and Pricing (cont.)
- Given the estimated model, the manager will use it to decide on her profit-maximizing price.
- The manager may decide to take the point estimate of ε as the truth and choose her optimal price accordingly.
Example 1: Demand Elasticity and Pricing (cont.)
- However, this decision ignores that there is uncertainty associated with the estimate, and therefore uncertainty in the profit.
- The profit function is π(p) = (p - c) q(p), which depends on the elasticity through the demand q(p).
- But the true ε is unknown.
Example 1: Demand Elasticity and Pricing (cont.)
- Note that, from the point of view of the manager, the estimated elasticity is a known value (say 3.45) and the true value of ε is a random variable (with mean 3.45).
- Is this uncertainty important for the choice of the optimal price?
- Yes: the optimal price that accounts for uncertainty is different from the price that ignores it.
Example 1: Demand Elasticity and Pricing (cont.)
- How can the manager take this uncertainty into account when deciding prices? There are different possible approaches:
- Treat profit as a random variable (conditional on the estimate of ε) and maximize expected profit.
- Construct a confidence interval for the true ε and use it to construct a confidence interval for the optimal price.
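The expected-profit approach can be sketched numerically. Everything below is a hypothetical illustration, not from the lecture: it assumes a constant-elasticity demand q(p) = A p^(-ε), a marginal cost c, and made-up values for the point estimate (3.45) and its standard error.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup (not from the lecture): constant-elasticity demand
# q(p) = A * p**(-eps), marginal cost c, point estimate eps_hat with
# standard error se.
A, c = 100.0, 1.0
eps_hat, se = 3.45, 0.40

# Price that ignores estimation uncertainty: p = c * eps / (eps - 1).
p_ignore = c * eps_hat / (eps_hat - 1.0)

# Treat the true elasticity as a random variable with mean 3.45 and
# maximize expected profit E[(p - c) * A * p**(-eps)] over a price grid.
eps_draws = rng.normal(eps_hat, se, size=100_000)
prices = np.linspace(1.05, 3.0, 400)
exp_profit = [(p - c) * A * np.mean(p ** (-eps_draws)) for p in prices]
p_expected = prices[int(np.argmax(exp_profit))]

print(round(p_ignore, 3), round(p_expected, 3))
```

The two prices differ, which is the point of the slide: accounting for the uncertainty in ε changes the optimal decision.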
Example 2: Credit Scores and Mortgages
- Consider a bank deciding whether to give a mortgage loan to a person.
- To make this decision, the bank uses a MRM to predict the probability of repayment.
- The dependent variable Y is an indicator, 0 or 1, of the repayment of a loan. The explanatory variables X are socio-economic characteristics of the borrower, the aggregate economy, and the local economy.
Example 2: Credit Scores and Mortgages (cont.)
- Using data on previous mortgages, the bank estimates this MRM.
- The fitted values of this model can be interpreted as predictions of the probability of repayment of a mortgage.
- Given the estimated model, the bank has to decide whether to approve a mortgage for a new applicant.
Example 2: Credit Scores and Mortgages (cont.)
- Given the values of the X regressors for this applicant, suppose the fitted value (credit score) is 0.76.
- What does the bank do with this credit score?
- The bank has to choose either Y = 1 (approve) or Y = 0 (not approve).
- There is a possible error associated with each decision.
Example 2: Credit Scores and Mortgages (cont.)
- Suppose that the bank considers the scenario (hypothesis) that the person will repay the mortgage (Y = 1).
- If the bank rejects this hypothesis when it is true, it makes a TYPE I ERROR.
- If the bank accepts this hypothesis when it is false, it makes a TYPE II ERROR.
- Each error has a probability.
2. Sampling Distribution of the OLS Estimator
- In order to do hypothesis testing, we need to know the sampling distribution of the OLS estimator (not just its mean and variance).
- We add another assumption (beyond the Gauss-Markov assumptions).
- Assumption 6 (Normality): u is independent of x1, x2, ..., xk, and u is normally distributed with zero mean and variance σ²: u ~ Normal(0, σ²).
Comments on the Normality Assumption
- Assumptions 1 to 6 are called the Classical Assumptions.
- Under the Classical Assumptions, OLS is not only BLUE but also the minimum variance unbiased estimator.
- Normality of u implies that, conditional on x: y|x ~ Normal(b0 + b1 x1 + ... + bk xk, σ²).
- Under the Classical Assumptions, the OLS estimator is normally distributed.
[Figure: The homoskedastic normal distribution with a single explanatory variable. Normal densities f(y|x) with equal variance are centered on the regression line E(y|x) = b0 + b1 x at the values x1 and x2.]
Comments on the Normality Assumption (cont.)
- Do we really need to assume that u is normally distributed for the OLS estimator to be normal? Not really.
- When we have a large sample (and under Assumptions 1 to 5), the Central Limit Theorem implies that the OLS estimator is close to normally distributed.
- We say that the OLS estimator is asymptotically normal. This means that the normal distribution is a good approximation when the sample is large, even when u is not normal.
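The asymptotic-normality claim can be illustrated with a small simulation (a sketch, not part of the lecture): even with skewed, non-normal errors, the simulated OLS slopes center tightly on the true slope, as the CLT argument suggests.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulation sketch: draw skewed, non-normal errors (demeaned exponential)
# and check that the OLS slope estimates still center on the true slope.
n, n_sims = 500, 2000
beta0, beta1 = 1.0, 2.0
slopes = np.empty(n_sims)
for s in range(n_sims):
    x = rng.uniform(0.0, 10.0, n)
    u = rng.exponential(1.0, n) - 1.0   # mean-zero but skewed errors
    y = beta0 + beta1 * x + u
    # OLS slope in the simple regression: cov(x, y) / var(x)
    slopes[s] = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print(round(slopes.mean(), 3))
```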
3. Hypotheses about One Parameter: t-test
- Under the Classical Assumptions, the standardized estimator (bj_hat - bj) / se(bj_hat) follows a t distribution with n - k - 1 degrees of freedom.
- Knowing the sampling distribution of the standardized estimator allows us to carry out hypothesis tests.
- Start with a null hypothesis. For example, H0: bj = 0.
- If we accept the null, then we accept that xj has no effect on y, controlling for the other x's.
3. Hypotheses about One Parameter: t-test (cont.)
- Besides our null, H0, we need an alternative hypothesis, H1, and a significance level.
- H1 may be one-sided or two-sided:
- H1: bj > 0 is one-sided.
- H1: bj < 0 is one-sided.
- H1: bj ≠ 0 is a two-sided alternative.
- If we want to have only a 5% probability of rejecting H0 when it is really true, then we say our significance level is 5%.
3. Hypotheses about One Parameter: t-test (cont.)
- Having picked a significance level α, we look up the (1 - α)th percentile of a t distribution with n - k - 1 df and call this c, the critical value.
- We reject the null hypothesis if the t statistic is greater than the critical value.
- If the t statistic is less than the critical value, then we fail to reject the null.
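The look-up-and-compare steps above can be sketched in a few lines. The estimate, standard error, and degrees of freedom below are made-up numbers for illustration.

```python
from scipy import stats

# Hypothetical numbers: an estimate, its standard error,
# and n - k - 1 = 60 degrees of freedom.
b_hat, se_b = 0.042, 0.015
df, alpha = 60, 0.05

t_stat = b_hat / se_b                     # t statistic for H0: bj = 0

c_one = stats.t.ppf(1 - alpha, df)        # one-sided critical value
c_two = stats.t.ppf(1 - alpha / 2, df)    # two-sided critical value

reject_two_sided = abs(t_stat) > c_two
print(round(t_stat, 2), round(c_one, 3), round(c_two, 3), reject_two_sided)
```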
One-Sided Alternatives (cont.)
yi = b0 + b1 xi1 + ... + bk xik + ui, with H0: bj = 0 and H1: bj > 0.
[Figure: t distribution with the rejection region (area α) to the right of the critical value c, and the fail-to-reject region (area 1 - α) to its left.]
t-test: One-Sided Alternatives (cont.)
- Because the t distribution is symmetric, testing H1: bj < 0 is straightforward: the critical value is just the negative of before.
- We reject the null if the t statistic < -c, and if the t statistic > -c then we fail to reject the null.
- For a two-sided test, we set the critical value based on α/2 and reject H0 in favor of H1: bj ≠ 0 if the absolute value of the t statistic > c.
Two-Sided Alternatives
yi = b0 + b1 Xi1 + ... + bk Xik + ui, with H0: bj = 0 and H1: bj ≠ 0.
[Figure: t distribution with rejection regions of area α/2 in each tail, beyond -c and c, and the fail-to-reject region (area 1 - α) between them.]
Summary for H0: bj = 0
- Unless otherwise stated, the alternative is assumed to be two-sided.
- If we reject the null, we typically say "xj is statistically significant at the α level."
- If we fail to reject the null, we typically say "xj is statistically insignificant at the α level."
Testing Other Hypotheses
- A more general form of the t statistic recognizes that we may want to test something like H0: bj = aj.
- In this case, the appropriate t statistic is t = (bj_hat - aj) / se(bj_hat).
Confidence Intervals
- We can construct a confidence interval using the same critical value as was used for a two-sided test.
- A (1 - α) confidence interval is defined as bj_hat ± c · se(bj_hat), where c is the (1 - α/2)th percentile of the t distribution with n - k - 1 df.
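A confidence interval of this form can be computed directly. The coefficient, standard error, and degrees of freedom below are made-up numbers for illustration.

```python
from scipy import stats

# Hypothetical numbers: a coefficient estimate, its standard error,
# and n - k - 1 = 85 degrees of freedom.
b_hat, se_b, df, alpha = 0.56, 0.21, 85, 0.05

c = stats.t.ppf(1 - alpha / 2, df)        # two-sided critical value
lo, hi = b_hat - c * se_b, b_hat + c * se_b
print(round(lo, 3), round(hi, 3))         # the 95% confidence interval
```

Since the interval excludes zero, the two-sided test at the 5% level would reject H0: bj = 0 for these numbers.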
Computing p-values for t Tests
- An alternative to the classical approach is to ask: what is the smallest significance level at which the null would be rejected?
- So, compute the t statistic, and then look up what percentile it is in the appropriate t distribution; this is the p-value.
- The smaller the p-value, the stronger the evidence against the null hypothesis.
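The two-sided p-value is 2 · P(T > |t|) for T distributed t with n - k - 1 df. A minimal sketch with a made-up t statistic and df:

```python
from scipy import stats

# Two-sided p-value with made-up numbers: p = 2 * P(T > |t|),
# where T follows a t distribution with n - k - 1 degrees of freedom.
t_stat, df = 1.85, 40
p_two = 2 * stats.t.sf(abs(t_stat), df)   # sf(x) = 1 - cdf(x)
print(round(p_two, 4))
```

For these numbers the null would be rejected at the 10% level but not at the 5% level.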
4. Testing a Linear Combination of Parameters
- Suppose that, instead of testing whether b1 is equal to a constant, you want to test whether it is equal to another parameter; that is, H0: b1 = b2.
- Use the same basic procedure for forming a t statistic: t = (b1_hat - b2_hat) / se(b1_hat - b2_hat).
Testing a Linear Combination (cont.)
- The standard error of the difference is se(b1_hat - b2_hat) = [Var(b1_hat) + Var(b2_hat) - 2 Cov(b1_hat, b2_hat)]^(1/2).
Testing a Linear Combination (cont.)
- So, to use this formula, we need s12, an estimate of Cov(b1_hat, b2_hat), which standard output does not report.
- Many packages have an option to get it, or will just perform the test for you.
- In Stata, after "reg y x1 x2 ... xk" you would type "test x1 = x2" to get a p-value for the test.
- More generally, you can always restate the problem to get the test you want.
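Given the pieces of OLS output named above, the test is a short computation. All numbers below (estimates, variances, covariance, df) are made up for illustration.

```python
import math
from scipy import stats

# Sketch with made-up OLS output: test H0: b1 = b2 using
# t = (b1_hat - b2_hat) / se(b1_hat - b2_hat), where
# se^2 = Var(b1_hat) + Var(b2_hat) - 2 * Cov(b1_hat, b2_hat).
b1_hat, b2_hat = 0.30, 0.18
var_b1, var_b2, cov_b12 = 0.0016, 0.0025, 0.0006
df = 120

se_diff = math.sqrt(var_b1 + var_b2 - 2 * cov_b12)
t_stat = (b1_hat - b2_hat) / se_diff
p_val = 2 * stats.t.sf(abs(t_stat), df)
print(round(t_stat, 2), round(p_val, 4))
```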
Multiple Linear Restrictions
- Everything we've done so far has involved testing a single linear restriction (e.g., b1 = 0 or b1 = b2).
- However, we may want to jointly test multiple hypotheses about our parameters.
- A typical example is testing exclusion restrictions: we want to know if a group of parameters are all equal to zero.
Testing Exclusion Restrictions
- Now the null hypothesis might be something like H0: b(k-q+1) = 0, ..., bk = 0.
- The alternative is just H1: H0 is not true.
- We can't just check each t statistic separately, because we want to know if the q parameters are jointly significant at a given level; it is possible for none of them to be individually significant at that level.
Exclusion Restrictions (cont.)
- To do the test, we need to estimate the "restricted model" without x(k-q+1), ..., xk included, as well as the "unrestricted model" with all the x's included.
- Intuitively, we want to know if the change in SSR is big enough to warrant inclusion of x(k-q+1), ..., xk.
The F Statistic
- F = [(SSRr - SSRur)/q] / [SSRur/(n - k - 1)], where SSRr and SSRur are the SSRs of the restricted and unrestricted models.
- The F statistic is always positive, since the SSR from the restricted model can't be less than the SSR from the unrestricted model.
- Essentially, the F statistic measures the relative increase in SSR when moving from the unrestricted to the restricted model.
- q = number of restrictions, or dfr - dfur.
- n - k - 1 = dfur.
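The SSR comparison can be turned into numbers directly. The SSRs and sample dimensions below are made up for illustration.

```python
from scipy import stats

# Sketch with made-up SSRs:
# F = ((SSRr - SSRur) / q) / (SSRur / (n - k - 1))
# for q exclusion restrictions in a model with k regressors.
ssr_r, ssr_ur = 183.2, 165.4
n, k, q = 200, 5, 2
df_ur = n - k - 1

F = ((ssr_r - ssr_ur) / q) / (ssr_ur / df_ur)
p_val = stats.f.sf(F, q, df_ur)          # upper tail of F(q, n - k - 1)
print(round(F, 2), round(p_val, 4))
```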
The F Statistic (cont.)
- To decide if the increase in SSR when we move to the restricted model is big enough to reject the exclusions, we need to know the sampling distribution of our F statistic.
- Not surprisingly, F ~ F(q, n - k - 1), where q is referred to as the numerator degrees of freedom and n - k - 1 as the denominator degrees of freedom.
The F Statistic (cont.)
[Figure: density f(F) of the F distribution. Reject H0 at the α significance level if F > c: the rejection region (area α) lies to the right of the critical value c, and the fail-to-reject region (area 1 - α) to its left.]
The R² Form of the F Statistic
- Because the SSRs may be large and unwieldy, an alternative form of the formula is useful.
- We use the fact that SSR = SST(1 - R²) for any regression, so we can substitute in for SSRr and SSRur:
- F = [(R²ur - R²r)/q] / [(1 - R²ur)/(n - k - 1)].
Overall Significance
- A special case of exclusion restrictions is to test H0: b1 = b2 = ... = bk = 0.
- Since the R² from a model with only an intercept will be zero, the F statistic is simply F = (R²/k) / [(1 - R²)/(n - k - 1)].
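This overall-significance statistic needs only the R² of the full regression. The R², n, and k below are made-up numbers.

```python
from scipy import stats

# Overall-significance sketch with a made-up R^2:
# F = (R^2 / k) / ((1 - R^2) / (n - k - 1)), tested against F(k, n - k - 1).
r2, n, k = 0.19, 150, 4
df2 = n - k - 1

F = (r2 / k) / ((1.0 - r2) / df2)
p_val = stats.f.sf(F, k, df2)
print(round(F, 2), round(p_val, 6))
```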
General Linear Restrictions
- The basic form of the F statistic will work for any set of linear restrictions.
- First estimate the unrestricted model and then estimate the restricted model.
- In each case, make note of the SSR.
- Imposing the restrictions can be tricky; you will likely have to redefine variables again.
F Statistic Summary
- Just as with t statistics, p-values can be calculated by looking up the percentile in the appropriate F distribution.
- Stata will do this: enter "display fprob(q, n - k - 1, F)", where the appropriate values of F, q, and n - k - 1 are used.
- If only one exclusion is being tested, then F = t², and the p-values will be the same.
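The F = t² equivalence can be checked numerically: the two-sided t p-value coincides with the F(1, n - k - 1) p-value of the squared statistic. The t statistic and df below are made-up numbers.

```python
import numpy as np
from scipy import stats

# Check the single-restriction equivalence F = t^2: the two-sided t
# p-value equals the upper-tail F(1, df) p-value of t^2.
t_stat, df = 2.1, 60
p_t = 2 * stats.t.sf(abs(t_stat), df)
p_f = stats.f.sf(t_stat ** 2, 1, df)
print(np.isclose(p_t, p_f))
```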