# The Power of Regression - PowerPoint PPT Presentation

Title:

## The Power of Regression

Description:

### Var(ei) xi 2 (i.e., not heteroskedasticity) ... Omitted variable bias. Nonlinear rather than linear relationship ... If this declines significantly, then reject H0 ... – PowerPoint PPT presentation

Slides: 62
Provided by: johnw122
Transcript and Presenter's Notes

Title: The Power of Regression

1
The Power of Regression
• Previous Research Literature Claim
• Foreign-owned manufacturing plants have greater
levels of strike activity than domestic plants
• In Canada, strike rates of 25.5 versus 20.3
• Budd's Claim
• Foreign-owned plants are larger and located in
strike-prone industries
• Need multivariate regression analysis!

2
The Power of Regression
Dependent Variable: Strike Incidence

| | (1) | (2) | (3) |
|---|---|---|---|
| U.S. Corporate Parent (Canadian Parent omitted) | 0.230 (0.117) | 0.201 (0.119) | 0.065 (0.132) |
| Number of Employees (1000s) | --- | 0.177 (0.019) | 0.094 (0.020) |
| Industry Effects? | No | No | Yes |
| Sample Size | 2,170 | 2,170 | 2,170 |

Statistical significance is reported at the 0.10 level and at the 0.05 level (two-tailed tests).
3
Important Regression Topics
• Prediction
• Various confidence and prediction intervals
• Diagnostics
• Are assumptions for estimation and testing
fulfilled?
• Specifications
• Quadratic terms? Logarithmic dep. vars.?
• Partial F tests
• Dummy dependent variables
• Probit and logit models

4
Confidence Intervals
• The true population whatever is within the
following interval $(1-\alpha)$ of the time
• $\text{Estimate} \pm t_{\alpha/2} \times \text{Standard Error}_{\text{Estimate}}$
• Just need
• Estimate
• Standard Error
• Shape / Distribution (including degrees of
freedom)
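The recipe above (estimate, standard error, distribution shape) can be sketched in a few lines of Python. This is only an illustration: the function name `confidence_interval` is made up here, the numbers are the slope on hours from the hours-of-study output that appears later in this deck, and the critical value $t_{0.025,\,8} = 2.306$ is taken from a standard t table rather than from the slides.

```python
def confidence_interval(estimate, std_error, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin

# Slope on "hours" from the hours-of-study output later in this deck:
# estimate 2.122, standard error 0.621, n - k - 1 = 10 - 1 - 1 = 8 d.f.
T_CRIT_8DF_95 = 2.306  # t_{0.025, 8}; assumed from a standard t table

lower, upper = confidence_interval(2.122, 0.621, T_CRIT_8DF_95)
print(f"95% CI for the hours coefficient: ({lower:.3f}, {upper:.3f})")
```

The result should land very close to the Lower 95 and Upper 95 columns that Excel reports for hours (0.691 and 3.554); small differences come from rounding in the inputs.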

5
Prediction Interval for New Observation at xp
• 1. Point Estimate
• 2. Standard Error
• 3. Shape
• t distribution with n-k-1 d.f
• 4. So prediction interval for a new observation
is

Siegel, p. 481
6
Prediction Interval for Mean Observations at xp
• 1. Point Estimate
• 2. Standard Error
• 3. Shape
• t distribution with n-k-1 d.f
• 4. So the interval for the mean of observations at xp
is

Siegel, p. 483
7
Earlier Example
Hours of Study (x) and Exam Score (y) Example
1. Find 95% CI for Joe's exam score (studies for 20
hours)
2. Find 95% CI for mean score for those who studied
for 20 hours

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.770 |
| R Squared | 0.594 |
| Standard Error | 10.710 |
| Obs. | 10 |

ANOVA

| | df | SS | MS | F | Significance |
|---|---|---|---|---|---|
| Regression | 1 | 1340.452 | 1340.452 | 11.686 | 0.009 |
| Residual | 8 | 917.648 | 114.706 | | |
| Total | 9 | 2258.100 | | | |

| | Coeff. | Std. Error | t stat | p value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 39.401 | 12.153 | 3.242 | 0.012 | 11.375 | 67.426 |
| hours | 2.122 | 0.621 | 3.418 | 0.009 | 0.691 | 3.554 |

$\bar{x} = 18.80$
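The two questions in this example can be worked through with a short Python sketch. The interval formulas on the preceding slides did not survive in the transcript, so this assumes the standard simple-regression standard errors: $s\sqrt{1 + 1/n + (x_p - \bar{x})^2/S_{xx}}$ for a new observation and $s\sqrt{1/n + (x_p - \bar{x})^2/S_{xx}}$ for the mean. Because the raw data are not shown, $S_{xx}$ is backed out from the slope's standard error, and $t_{0.025,\,8} = 2.306$ is taken from a standard t table. The function name `interval` is illustrative.

```python
import math

# Values read from the regression output above (simple regression, k = 1).
b0, b1 = 39.401, 2.122   # intercept and slope on hours
s = 10.710               # standard error of the regression
se_b1 = 0.621            # standard error of the slope
n, x_bar = 10, 18.80
t_crit = 2.306           # t_{0.025, 8}; assumed from a standard t table

# Assumption: recover Sxx = sum((x - x_bar)^2) from se_b1 = s / sqrt(Sxx),
# because the raw data are not in the transcript.
sxx = (s / se_b1) ** 2

def interval(x_p, new_observation):
    """Return (point, lower, upper) at x_p for a new observation or the mean."""
    point = b0 + b1 * x_p
    extra = 1.0 if new_observation else 0.0
    se = s * math.sqrt(extra + 1.0 / n + (x_p - x_bar) ** 2 / sxx)
    return point, point - t_crit * se, point + t_crit * se

joe = interval(20, new_observation=True)       # question 1: Joe's own score
mean_20 = interval(20, new_observation=False)  # question 2: mean score at 20 hours

print("Joe (new observation): %.1f, interval %.1f to %.1f" % joe)
print("Mean at 20 hours:      %.1f, interval %.1f to %.1f" % mean_20)
```

Both intervals share the same point estimate; the interval for Joe is much wider because it also has to cover the scatter of an individual score around the regression line.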
8
Diagnostics / Misspecification
• For estimation and testing to be valid
• $y = b_0 + b_1x_1 + b_2x_2 + \dots + b_kx_k + e$ makes sense
• Errors (ei) are independent
• of each other
• of the independent variables
• Homoskedasticity
• Error variance independent
• of the independent variables
• $\sigma_e^2$ is a constant
• $\mathrm{Var}(e_i) \neq x_i\sigma^2$ (i.e., not heteroskedasticity)

Violations render our inferences invalid and
9
Common Problems
• Misspecification
• Omitted variable bias
• Nonlinear rather than linear relationship
• Levels, logs, or percent changes?
• Data Problems
• Skewed variables and outliers
• Multicollinearity
• Sample selection (non-random data)
• Missing data
• Problems with residuals (error terms)
• Non-independent errors
• Heteroskedasticity

10
Omitted Variable Bias
• Question 3 from Sample Exam B
• wage = 9.05 + 1.39 union
• (1.65) (0.66)
• wage = 9.56 + 1.42 union + 3.87 ability
• (1.49) (0.56) (1.56)
• wage = -3.03 + 0.60 union + 0.25 revenue
• (0.70) (0.45) (0.08)
• H. Farber thinks the average union wage is
different from average nonunion wage because
unionized employers are more selective and hire
individuals with higher ability.
• M. Friedman thinks the average union wage is
different from the average nonunion wage because
unionized employers have different levels of
revenue per employee.
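One way to weigh the two explanations is to track what happens to the union coefficient when each candidate omitted variable is added. The sketch below does that arithmetic. It assumes the values in parentheses are standard errors, which the transcript does not state explicitly, and the dictionary labels are made up for illustration.

```python
# Union coefficient and the parenthesized value beneath it, per specification.
# Assumption: the parenthesized values are standard errors.
specs = {
    "union only": (1.39, 0.66),
    "union + ability": (1.42, 0.56),
    "union + revenue": (0.60, 0.45),
}

# t statistic for H0: union coefficient = 0 in each specification.
t_stats = {name: coef / se for name, (coef, se) in specs.items()}

for name, t in t_stats.items():
    coef = specs[name][0]
    print(f"{name:16s} union coef = {coef:.2f}, t = {t:.2f}")
```

Controlling for ability leaves the union coefficient essentially unchanged, while controlling for revenue cuts it by more than half and pulls its t statistic well below 2. That pattern appears more consistent with revenue, rather than ability, being the omitted variable behind the original estimate.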

11
Checking the Assumptions
• How to check the validity of the assumptions?
• Cynicism, Realism, and Theory
• Robustness Checks
• Check different specifications
• But don't just choose the best one!
• Automated Variable Selection Methods
• e.g., Stepwise regression (Siegel, p. 547)
• Misspecification and Other Tests
• Examine Diagnostic Plots

12
Diagnostic Plots
heteroskedasticity. Try transformations or
weighted least squares.
13
Diagnostic Plots
Tilt from outliers might indicate skewness.
Try log transformation
14
Problematic Outliers
Stock Performance and CEO Golf Handicaps (New
York Times, 5-31-98)
Without 7 Outliers: Number of obs = 44, R-squared = 0.1718

| stockrating | Coef. | Std. Err. | t | P>\|t\| |
|---|---|---|---|---|
| handicap | -1.711 | .580 | -2.95 | 0.005 |
| _cons | 73.234 | 8.992 | 8.14 | 0.000 |

With the 7 Outliers: Number of obs = 51, R-squared = 0.0017

| stockrating | Coef. | Std. Err. | t | P>\|t\| |
|---|---|---|---|---|
| handicap | -.173 | .593 | -0.29 | 0.771 |
| _cons | 55.137 | 9.790 | 5.63 | 0.000 |
15
Are They Really Outliers??
Diagnostic Plot is OK
BE CAREFUL!
Stock Performance and CEO Golf Handicaps (New
York Times, 5-31-98)
16
Diagnostic Plots
Curvature might indicate nonlinearity. Try
17
Diagnostic Plots
Good diagnostic plot. Lacks obvious indications
of other problems.
18
Job Performance regression on Salary (in 1,000s)
(Egg Data)
Number of obs = 576, F(2, 573) = 122.42, Prob > F = 0.0000, Adj R-squared = 0.2969, Root MSE = 1.0218

| Source | SS | df | MS |
|---|---|---|---|
| Model | 255.61 | 2 | 127.8 |
| Residual | 598.22 | 573 | 1.044 |
| Total | 853.83 | 575 | 1.485 |

| job performance | Coef. | Std. Err. | t | P>\|t\| |
|---|---|---|---|---|
| salary | .0980844 | .0260215 | 3.77 | 0.000 |
| salary squared | -.000337 | .0001905 | -1.77 | 0.077 |
| _cons | -1.720966 | .8720358 | -1.97 | 0.049 |

Salary Squared = Salary² (salary^2 in Excel)
19
Job perf = -1.72 + 0.098 salary - 0.00034 salary
squared
20
Job perf = -1.72 + 0.098 salary - 0.00034 salary
squared
Effect of salary will eventually turn negative
But where?
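The "But where?" question has a direct answer: the fitted curve peaks where its slope is zero. A minimal sketch, using the unrounded coefficients from the Stata output and the derivative of a quadratic (the function name `marginal_effect` is illustrative):

```python
# Coefficients from the job performance regression on salary (in $1,000s).
b_salary = 0.0980844
b_salary_sq = -0.000337

def marginal_effect(salary):
    """Slope of predicted job performance with respect to salary."""
    return b_salary + 2 * b_salary_sq * salary

# The slope is zero where b_salary + 2 * b_salary_sq * salary = 0.
turning_point = -b_salary / (2 * b_salary_sq)

print(f"Turning point: salary of about {turning_point:.1f} (in $1,000s)")
print(f"Slope at 100: {marginal_effect(100):+.4f}")
print(f"Slope at 200: {marginal_effect(200):+.4f}")
```

Since the squared term is only marginally significant (p = 0.077), the location of the turning point is itself imprecisely estimated and is best read as a rough guide.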
21
Another Specification Possibility
• If data are very skewed, can try a log
specification
• Can use logs instead of levels for independent
and/or dependent variables
• Note that the interpretation of the coefficients
will change
• Re-familiarize yourself with Siegel, pp. 68-69

22
Quick Note on Logs
• a is the natural logarithm of x if
• $2.71828^a = x$
• or, $e^a = x$
• The natural logarithm is abbreviated ln
• $\ln(x) = a$
• In Excel, use ln function
• We call this the log but don't use the log
function!
• Usefulness: spreads out small values and narrows
large values, which can reduce skewness
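The skewness point can be seen on a small example. The earnings figures below are hypothetical, chosen only to be right-skewed; they are not the CPS data used on the following slides. The `skewness` helper is a plain moment-based measure written for this sketch.

```python
import math

def skewness(values):
    """Moment-based skewness: mean cubed deviation over the cubed std. deviation."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed weekly earnings (illustrative only, not CPS data).
earnings = [250, 300, 350, 400, 450, 500, 600, 800, 1500, 4000]

# math.log with one argument is the natural log (ln), not log base 10.
log_earnings = [math.log(v) for v in earnings]

skew_levels = skewness(earnings)
skew_logs = skewness(log_earnings)
print(f"Skewness in levels: {skew_levels:.2f}")
print(f"Skewness in logs:   {skew_logs:.2f}")
```

As in Excel, it matters which function is used: Python's `math.log` is the natural log, while `math.log10` is the base-10 log.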

23
Earnings Distribution
Skewed to the right
Weekly Earnings from the March 2002 CPS, n = 15,000
24
Residuals from Levels Regression
Skewed to the right; use of t distribution is
suspect
Residuals from a regression of Weekly Earnings on
demographic characteristics
25
Log Earnings Distribution
Not perfectly symmetrical, but better
Natural Logarithm of Weekly Earnings from the
March 2002 CPS, i.e., ln(weekly earnings)
26
Residuals from Log Regression
Almost symmetrical; use of t distribution is
probably OK
Residuals from a regression of Log Weekly
Earnings on demographic characteristics
27
Hypothesis Tests
• We've been doing hypothesis tests for single
coefficients
• $H_0: \beta = 0$; reject if $|t| > t_{\alpha/2,\,n-k-1}$
• $H_A: \beta \neq 0$
• What about testing more than one coefficient at
the same time?
• e.g., want to see if an entire group of 10 dummy
variables for 10 industries should be in the
model
• Joint tests can be conducted using partial F tests

28
Partial F Tests
• $H_0: \beta_1 = \beta_2 = \beta_3 = \dots = \beta_C = 0$
• $H_A$: at least one $\beta_i \neq 0$
• How to test this?
• Consider two regressions
• One as if H0 is true
• i.e., $\beta_1 = \beta_2 = \beta_3 = \dots = \beta_C = 0$
• This is a restricted (or constrained) model
• Plus a full (or unconstrained) model in which
the computer can estimate what it wants for each
coefficient

29
Partial F Tests
• Statistically, need to distinguish between
• Full regression no better than the restricted
regression
• versus
• Full regression is significantly better than
the restricted regression
• To do this, look at variance of prediction errors
• If this declines significantly, then reject H0
• From ANOVA, we know ratio of two variances has an
F distribution
• So use F test

30
Partial F Tests
• SSresidual = Sum of Squares Residual
• C = number of constraints
• The partial F statistic has C, n-k-1 degrees of
freedom
• Reject H0 if $F > F_{\alpha,\,C,\,n-k-1}$
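The statistic itself did not survive in this transcript. In its standard textbook form, which is an assumption here but does reproduce the values 142.406 and 3.950 reported on later slides, it compares the residual sum of squares of the restricted and full models:

$$
F = \frac{\left(SS_{\text{residual}}^{\text{restricted}} - SS_{\text{residual}}^{\text{full}}\right)/C}{SS_{\text{residual}}^{\text{full}}/(n-k-1)}
$$

Here $k$ is the number of independent variables in the full model, so $n-k-1$ is the full model's residual degrees of freedom.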

31
Coal Mining Example (Again)
Regression Statistics

| Statistic | Value |
|---|---|
| R Squared | 0.955 |
| Standard Error | 108.052 |
| Obs. | 47 |

ANOVA

| | df | SS | MS | F | Significance |
|---|---|---|---|---|---|
| Regression | 6 | 9975694.933 | 1662615.822 | 142.406 | 0.000 |
| Residual | 40 | 467007.875 | 11675.197 | | |
| Total | 46 | 10442702.809 | | | |

| | Coeff. | Std. Error | t stat | p value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | -168.510 | 258.819 | -0.651 | 0.519 | -691.603 | 354.583 |
| hours | 1.244 | 0.186 | 6.565 | 0.000 | 0.001 | 0.002 |
| tons | 0.048 | 0.403 | 0.119 | 0.906 | -0.001 | 0.001 |
| unemp | 19.618 | 5.660 | 3.466 | 0.001 | 8.178 | 31.058 |
| WWII | 159.851 | 78.218 | 2.044 | 0.048 | 1.766 | 317.935 |
| Act1952 | -9.839 | 100.045 | -0.098 | 0.922 | -212.038 | 192.360 |
| Act1969 | -203.010 | 111.535 | -1.820 | 0.076 | -428.431 | 22.411 |
32
Minitab Output
| Predictor | Coef | StDev | T | P |
|---|---|---|---|---|
| Constant | -168.5 | 258.8 | -0.65 | 0.519 |
| hours | 1.2235 | 0.186 | 6.56 | 0.000 |
| tons | 0.0478 | 0.403 | 0.12 | 0.906 |
| unemp | 19.618 | 5.660 | 3.47 | 0.001 |
| WWII | 159.85 | 78.22 | 2.04 | 0.048 |
| Act1952 | -9.8 | 100.0 | -0.10 | 0.922 |
| Act1969 | -203.0 | 111.5 | -1.82 | 0.076 |

S = 108.1, R-Sq = 95.5%, R-Sq(adj) = 94.9%

Analysis of Variance

| Source | DF | SS | MS | F | P |
|---|---|---|---|---|---|
| Regression | 6 | 9975695 | 1662616 | 142.41 | 0.000 |
| Error | 40 | 467008 | 11675 | | |
| Total | 46 | 10442703 | | | |

33
Is the Overall Model Significant?
• $H_0: \beta_1 = \beta_2 = \beta_3 = \dots = \beta_6 = 0$
• $H_A$: at least one $\beta_i \neq 0$
• Note for testing the overall model, C = k
• i.e., testing all coefficients together
• From the previous slides, we have SSresidual for
the full (or unconstrained) model
• SSresidual = 467,007.875
• But what about for the restricted (H0 true)
regression?
• Estimate a constant only regression

34
Constant-Only Model
Regression Statistics

| Statistic | Value |
|---|---|
| R Squared | 0 |
| Standard Error | 476.461 |
| Obs. | 47 |

ANOVA

| | df | SS | MS | F | Significance |
|---|---|---|---|---|---|
| Regression | 0 | 0 | 0 | . | . |
| Residual | 46 | 10442702.809 | 227015.278 | | |
| Total | 46 | 10442702.809 | | | |

| | Coeff. | Std. Error | t stat | p value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 671.937 | 69.499 | 9.668 | 0.0000 | 532.042 | 811.830 |
35
Partial F Tests
F = 142.406
• $H_0: \beta_1 = \beta_2 = \beta_3 = \dots = \beta_6 = 0$
• $H_A$: at least one $\beta_i \neq 0$
• Reject H0 if $F > F_{\alpha,\,C,\,n-k-1} = F_{0.05,\,6,\,40} = 2.34$
• 142.406 > 2.34 so reject H0. Yes, overall model
is significant

36
Select F Distribution 5% Critical Values

Columns: Numerator Degrees of Freedom. Rows: Denominator Degrees of Freedom.

| Denominator d.f. | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | 161 | 199 | 216 | 225 | 230 | 234 |
| 2 | 18.5 | 19.0 | 19.2 | 19.2 | 19.3 | 19.3 |
| 3 | 10.1 | 9.55 | 9.28 | 9.12 | 9.01 | 8.94 |
| 8 | 5.32 | 4.46 | 4.07 | 3.84 | 3.69 | 3.58 |
| 10 | 4.96 | 4.10 | 3.71 | 3.48 | 3.33 | 3.22 |
| 11 | 4.84 | 3.98 | 3.59 | 3.36 | 3.20 | 3.09 |
| 12 | 4.75 | 3.89 | 3.49 | 3.26 | 3.11 | 3.00 |
| 18 | 4.41 | 3.55 | 3.16 | 2.93 | 2.77 | 2.66 |
| 40 | 4.08 | 3.23 | 2.84 | 2.61 | 2.45 | 2.34 |
| 1000 | 3.85 | 3.00 | 2.61 | 2.38 | 2.22 | 2.11 |
37
A Small Shortcut
(Regression output repeated from the Coal Mining Example slide.)
For constant only model, SSresidual = 10,442,702.809
So to test overall model, you don't need to run a
constant-only model
38
An Even Better Shortcut
(Regression output repeated from the Coal Mining Example slide.)
In fact, the ANOVA table F test is exactly the
test for the overall model being
significant (recall Unit 8)
39
Testing Any Subset
(Regression output repeated from the Coal Mining Example slide.)
Partial F test can be used to test any subset of
variables
For example, $H_0: \beta_{WWII} = \beta_{Act1952} = \beta_{Act1969} = 0$
$H_A$: at least one $\beta_i \neq 0$
40
Restricted Model
Restricted regression with $\beta_{WWII} = \beta_{Act1952} = \beta_{Act1969} = 0$

Regression Statistics

| Statistic | Value |
|---|---|
| R Squared | 0.942 |
| Standard Error | 118.651 |
| Obs. | 47 |

ANOVA

| | df | SS | MS | F | Significance |
|---|---|---|---|---|---|
| Regression | 3 | 9837344.76 | 3279114.920 | 232.923 | 0.000 |
| Residual | 43 | 605358.049 | 14078.094 | | |
| Total | 46 | 10442702.809 | | | |

| | Coeff. | Std. Error | t stat | p value |
|---|---|---|---|---|
| Intercept | 147.821 | 166.406 | 0.888 | 0.379 |
| hours | 0.0015 | 0.0001 | 20.522 | 0.000 |
| tons | -0.0008 | 0.0003 | -2.536 | 0.015 |
| unemp | 7.298 | 4.386 | 1.664 | 0.103 |
41
Partial F Tests
F = 3.950
• $H_0: \beta_{WWII} = \beta_{Act1952} = \beta_{Act1969} = 0$
• $H_A$: at least one $\beta_i \neq 0$
• Reject H0 if $F > F_{\alpha,\,C,\,n-k-1} = F_{0.05,\,3,\,40} = 2.84$
• 3.95 > 2.84 so reject H0. Yes, the subset of three
coefficients is jointly significant
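Both coal mining tests can be reproduced from the residual sums of squares alone. The sketch below assumes the standard partial F formula (the formula slide did not survive in the transcript) and uses the critical values quoted on the slides; the function name `partial_f` is illustrative.

```python
def partial_f(ss_res_restricted, ss_res_full, num_constraints, df_full):
    """Partial F statistic comparing a restricted model with the full model."""
    numerator = (ss_res_restricted - ss_res_full) / num_constraints
    denominator = ss_res_full / df_full
    return numerator / denominator

SS_RES_FULL = 467007.875   # full coal mining model
DF_FULL = 40               # n - k - 1 = 47 - 6 - 1

# Overall test: the restricted model is the constant-only model (C = 6).
f_overall = partial_f(10442702.809, SS_RES_FULL, 6, DF_FULL)

# Subset test: WWII, Act1952 and Act1969 are dropped (C = 3).
f_subset = partial_f(605358.049, SS_RES_FULL, 3, DF_FULL)

# Critical values as quoted on the slides: F(0.05; 6, 40) and F(0.05; 3, 40).
print(f"overall F = {f_overall:.3f}, reject H0: {f_overall > 2.34}")
print(f"subset  F = {f_subset:.3f}, reject H0: {f_subset > 2.84}")
```

The same function covers any subset of coefficients: only the restricted model's SSresidual and the number of constraints change, while the denominator always comes from the full model.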

42
Regression and Two-Way ANOVA
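| Blocks | Treatment A | Treatment B | Treatment C |
|---|---|---|---|
| 1 | 10 | 9 | 8 |
| 2 | 12 | 6 | 5 |
| 3 | 18 | 15 | 14 |
| 4 | 20 | 18 | 18 |
| 5 | 8 | 7 | 8 |

Stack data using dummy variables

| A | B | C | B2 | B3 | B4 | B5 | Value |
|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 10 |
| 1 | 0 | 0 | 1 | 0 | 0 | 0 | 12 |
| 1 | 0 | 0 | 0 | 1 | 0 | 0 | 18 |
| 1 | 0 | 0 | 0 | 0 | 1 | 0 | 20 |
| 1 | 0 | 0 | 0 | 0 | 0 | 1 | 8 |
| 0 | 1 | 0 | 0 | 0 | 0 | 0 | 9 |
| 0 | 1 | 0 | 1 | 0 | 0 | 0 | 6 |
| 0 | 1 | 0 | 0 | 1 | 0 | 0 | 15 |
| 0 | 1 | 0 | 0 | 0 | 1 | 0 | 18 |
| 0 | 1 | 0 | 0 | 0 | 0 | 1 | 7 |
| 0 | 0 | 1 | 0 | 0 | 0 | 0 | 8 |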
43
Recall Two-Way Results
ANOVA: Two-Factor Without Replication

| Source of Variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Blocks | 312.267 | 4 | 78.067 | 38.711 | 0.000 | 3.84 |
| Treatment | 26.533 | 2 | 13.267 | 6.579 | 0.020 | 4.46 |
| Error | 16.133 | 8 | 2.017 | | | |
| Total | 354.933 | 14 | | | | |
44
Regression and Two-Way ANOVA
Number of obs = 15, F(6, 8) = 28.00, Prob > F = 0.0001, R-squared = 0.9545, Adj R-squared = 0.9205, Root MSE = 1.4201

| Source | SS | df | MS |
|---|---|---|---|
| Model | 338.800 | 6 | 56.467 |
| Residual | 16.133 | 8 | 2.017 |
| Total | 354.933 | 14 | 25.352 |

| treatment | Coef. | Std. Err. | t | P>\|t\| | 95% Conf. Int. (lower) | 95% Conf. Int. (upper) |
|---|---|---|---|---|---|---|
| b | -2.600 | .898 | -2.89 | 0.020 | -4.671 | -.529 |
| c | -3.000 | .898 | -3.34 | 0.010 | -5.071 | -.929 |
| b2 | -1.333 | 1.160 | -1.15 | 0.283 | -4.007 | 1.340 |
| b3 | 6.667 | 1.160 | 5.75 | 0.000 | 3.993 | 9.340 |
| b4 | 9.667 | 1.160 | 8.34 | 0.000 | 6.993 | 12.340 |
| b5 | -1.333 | 1.160 | -1.15 | 0.283 | -4.007 | 1.340 |
| _cons | 10.867 | .970 | 11.20 | 0.000 | 8.630 | 13.104 |

45
Regression and Two-Way ANOVA
Regression Excerpt for Full Model

| Source | SS | df | MS |
|---|---|---|---|
| Model | 338.800 | 6 | 56.467 |
| Residual | 16.133 | 8 | 2.017 |
| Total | 354.933 | 14 | 25.352 |

Use these SSresidual values to do partial F tests
and you will get exactly the same answers as the
Two-Way ANOVA tests

Regression Excerpt for $\beta_{b2} = \beta_{b3} = \dots = 0$

| Source | SS | df | MS |
|---|---|---|---|
| Model | 26.533 | 2 | 13.267 |
| Residual | 328.40 | 12 | 27.367 |
| Total | 354.933 | 14 | 25.352 |

Regression Excerpt for $\beta_{b} = \beta_{c} = 0$

| Source | SS | df | MS |
|---|---|---|---|
| Model | 312.267 | 4 | 78.067 |
| Residual | 42.667 | 10 | 4.267 |
| Total | 354.933 | 14 | 25.352 |
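The equivalence claimed here can be checked from the raw block-by-treatment table. This sketch computes the two-way sums of squares directly and then feeds the implied SSresidual values into a partial F calculation. It relies on the design being balanced, so that dropping one set of dummies simply moves that factor's sum of squares into the residual, which matches the excerpts above (16.133 + 312.267 = 328.40). Variable and function names are illustrative.

```python
# Treatment columns (A, B, C), each listed by block 1-5, from the two-way table.
data = {
    "A": [10, 12, 18, 20, 8],
    "B": [9, 6, 15, 18, 7],
    "C": [8, 5, 14, 18, 8],
}
n_blocks, n_treat = 5, 3
values = [v for col in data.values() for v in col]
grand_mean = sum(values) / len(values)

ss_total = sum((v - grand_mean) ** 2 for v in values)
ss_treat = n_blocks * sum((sum(col) / n_blocks - grand_mean) ** 2
                          for col in data.values())
block_means = [sum(col[i] for col in data.values()) / n_treat
               for i in range(n_blocks)]
ss_block = n_treat * sum((m - grand_mean) ** 2 for m in block_means)
ss_error = ss_total - ss_treat - ss_block   # SSresidual of the full regression

def partial_f(ss_res_restricted, ss_res_full, num_constraints, df_full):
    """Partial F statistic comparing a restricted model with the full model."""
    return ((ss_res_restricted - ss_res_full) / num_constraints) / (ss_res_full / df_full)

df_error = (n_blocks - 1) * (n_treat - 1)   # 8 residual d.f. in the full model

# Balanced design: dropping a set of dummies adds that factor's SS to the residual.
f_blocks = partial_f(ss_error + ss_block, ss_error, n_blocks - 1, df_error)
f_treat = partial_f(ss_error + ss_treat, ss_error, n_treat - 1, df_error)

print(f"SS blocks = {ss_block:.3f}, SS treatment = {ss_treat:.3f}, SS error = {ss_error:.3f}")
print(f"F blocks = {f_blocks:.3f}, F treatment = {f_treat:.3f}")
```

The two F statistics should agree with the Blocks and Treatment rows of the two-way ANOVA table up to rounding.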
46
Select F Distribution 5% Critical Values

Columns: Numerator Degrees of Freedom. Rows: Denominator Degrees of Freedom.

| Denominator d.f. | 1 | 2 | 3 | 4 | 5 | 6 | 9 |
|---|---|---|---|---|---|---|---|
| 1 | 161 | 199 | 216 | 225 | 230 | 234 | 241 |
| 2 | 18.5 | 19.0 | 19.2 | 19.2 | 19.3 | 19.3 | 19.4 |
| 3 | 10.1 | 9.55 | 9.28 | 9.12 | 9.01 | 8.94 | 8.81 |
| 8 | 5.32 | 4.46 | 4.07 | 3.84 | 3.69 | 3.58 | 3.39 |
| 10 | 4.96 | 4.10 | 3.71 | 3.48 | 3.33 | 3.22 | 3.02 |
| 11 | 4.84 | 3.98 | 3.59 | 3.36 | 3.20 | 3.09 | 2.90 |
| 12 | 4.75 | 3.89 | 3.49 | 3.26 | 3.11 | 3.00 | 2.80 |
| 18 | 4.41 | 3.55 | 3.16 | 2.93 | 2.77 | 2.66 | 2.46 |
| 40 | 4.08 | 3.23 | 2.84 | 2.61 | 2.45 | 2.34 | 2.12 |
| 1000 | 3.85 | 3.00 | 2.61 | 2.38 | 2.22 | 2.11 | 1.89 |
| ∞ | 3.84 | 3.00 | 2.60 | 2.37 | 2.21 | 2.10 | 1.88 |
47
3 Seconds of Calculus
48
Regression Coefficients
• $y = b_0 + b_1x$ (linear form): 1 unit change in x changes y by $b_1$
• $\log(y) = b_0 + b_1x$ (semi-log form): 1 unit change in x changes y by $b_1$ (×100) percent
• $\log(y) = b_0 + b_1\log(x)$ (double-log form): 1 percent change in x changes y by $b_1$ percent
49
Log Regression Coefficients
• wage = 9.05 + 1.39 union
• Predicted wage is 1.39 higher for unionized
workers (on average)
• log(wage) = 2.20 + 0.15 union
• Semi-elasticity
• Predicted wage is approximately 15% higher for
unionized workers (on average)
• log(wage) = 1.61 + 0.30 log(profits)
• Elasticity
• A one percent increase in profits increases
predicted wages by approximately 0.3 percent
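The word "approximately" in these interpretations can be made concrete. Reading the coefficient directly as a percent is a first-order approximation; the exact changes follow from exponentiating. A short sketch, with illustrative function names:

```python
import math

def semi_log_percent_change(b, delta_x=1.0):
    """Exact percent change in y when x rises by delta_x in log(y) = b0 + b*x."""
    return (math.exp(b * delta_x) - 1.0) * 100.0

def double_log_percent_change(b, pct_change_x=1.0):
    """Exact percent change in y for a pct_change_x percent rise in x
    in log(y) = b0 + b*log(x)."""
    return ((1.0 + pct_change_x / 100.0) ** b - 1.0) * 100.0

union_effect = semi_log_percent_change(0.15)      # slide approximation: 15 percent
profit_effect = double_log_percent_change(0.30)   # slide approximation: 0.3 percent

print(f"Union (semi-log):     {union_effect:.2f} percent")
print(f"Profits (double-log): {profit_effect:.3f} percent")
```

The approximation is close for small coefficients and drifts as they grow, so for a large semi-log coefficient the exact version is the safer one to report.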

50
Multicollinearity
Auto repair records, weight, and engine size
Number of obs = 69, F(2, 66) = 6.84, Prob > F = 0.0020, R-squared = 0.1718, Adj R-squared = 0.1467, Root MSE = .91445

| repair | Coef. | Std. Err. | t | P>\|t\| |
|---|---|---|---|---|
| weight | -.00017 | .00038 | -0.41 | 0.685 |
| engine | -.00313 | .00328 | -0.96 | 0.342 |
| _cons | 4.50161 | .61987 | 7.26 | 0.000 |
51
Multicollinearity
• Two (or more) independent variables are so highly
correlated that a multiple regression can't
disentangle the unique contributions of each
• Large standard errors and lack of statistical
significance for individual coefficients
• But joint significance
• Identifying multicollinearity
• Some say rule of thumb r > 0.70 (or 0.80)
• But better to look at results
• OK for prediction; bad for assessing theory

52
Prediction With Multicollinearity
• Prediction at the Mean (weight = 3019 and
engine = 197)

| Model for prediction | Predicted Repair | Lower 95% Limit (Mean) | Upper 95% Limit (Mean) |
|---|---|---|---|
| Multiple Regression | 3.411 | 3.191 | 3.631 |
| Weight Only | 3.412 | 3.193 | 3.632 |
| Engine Only | 3.410 | 3.192 | 3.629 |
53
Dummy Dependent Variables
• Dummy dependent variables
• $y = b_0 + b_1x_1 + \dots + b_kx_k + e$
• Where y is a 0,1 indicator variable
• Examples
• Do you intend to quit? yes / no
• Did the worker receive training? yes/no
• Do you think the President is doing a good job?
yes/no
• Was there a strike? yes / no
• Did the company go bankrupt? yes/no

54
Linear Probability Model
• Mathematically / computationally, can estimate a
regression as usual (the monkeys won't know the
difference)
• This is called a linear probability model
• Right-hand side is linear
• And is estimating probabilities
• $P(y=1) = b_0 + b_1x_1 + \dots + b_kx_k$
• $b_1 = 0.15$ (for example) means that a one unit
change in $x_1$ increases probability that y = 1 by
0.15 (fifteen percentage points)

55
Linear Probability Model
• Excel won't know the difference, but perhaps it
should
• Linear probability model problems
• $\sigma_e^2 = P(y=1)\,[1-P(y=1)]$
• But $P(y=1) = b_0 + b_1x_1 + \dots + b_kx_k$
• So $\sigma_e^2$ is not constant (heteroskedasticity)
• Predicted probabilities are not bounded by 0,1
• $R^2$ is not an accurate measure of predictive
ability
• Can use a pseudo-$R^2$ measure
• Such as percent correctly predicted

56
Logit Model & Probit Model
• Solution to these problems is to use nonlinear
functional forms that bound P(y=1) between 0,1
• Logit Model (logistic regression)
• Probit Model
• Where $\Phi$ is the normal cumulative distribution
function

Recall, $\ln(x) = a$ when $e^a = x$
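The model equations on this slide did not survive in the transcript. The sketch below assumes the standard forms: for the logit, $\ln\!\left(\frac{P}{1-P}\right) = b_0 + b_1x_1 + \dots + b_kx_k$, and for the probit, $P(y=1) = \Phi(b_0 + b_1x_1 + \dots + b_kx_k)$. It compares both with the linear probability model over a range of index values to show the bounding. The function names and index values are illustrative.

```python
import math

def logit_probability(index):
    """Logistic CDF: P(y=1) when ln(P / (1 - P)) = index."""
    return 1.0 / (1.0 + math.exp(-index))

def probit_probability(index):
    """Standard normal CDF Phi(index), computed via the error function."""
    return 0.5 * (1.0 + math.erf(index / math.sqrt(2.0)))

def linear_probability(index):
    """Linear probability model: the index itself is the predicted probability."""
    return index

# index stands for b0 + b1*x1 + ... + bk*xk evaluated at some observation.
for index in (-3.0, 0.0, 0.5, 1.5, 3.0):
    print(f"index = {index:+.1f}  LPM = {linear_probability(index):+.3f}  "
          f"logit = {logit_probability(index):.3f}  "
          f"probit = {probit_probability(index):.3f}")
```

Whatever the index, the logit and probit columns stay strictly between 0 and 1, while the linear probability column can fall below 0 or exceed 1. Note that the same index value means different things in each model, since the coefficients are on different scales.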
57
Logit Model & Probit Model
• Nonlinear so need statistical package to do the
calculations
• Can do individual (z-tests, not t-tests) and
joint statistical testing as with other
regressions
• Also confidence intervals
• Need to convert coefficients to marginal effects
for interpretation
• Should be aware of these models
• Though in many cases, a linear probability model
works just fine

58
Example
• Dep. Var = 1 if you know of the FMLA, 0 otherwise

Probit estimates: Number of obs = 1189, LR chi2(14) = 232.39, Prob > chi2 = 0.0000, Log likelihood = -707.94377, Pseudo R2 = 0.1410

| FMLAknow | Coef. | Std. Err. | z | P>\|z\| | 95% Conf. Int. (lower) | 95% Conf. Int. (upper) |
|---|---|---|---|---|---|---|
| union | .238 | .101 | 2.35 | 0.019 | .039 | .436 |
| age | -.002 | .018 | -0.13 | 0.897 | -.038 | .033 |
| agesq | .135 | .219 | 0.62 | 0.536 | -.293 | .564 |
| nonwhite | -.571 | .098 | -5.80 | 0.000 | -.764 | -.378 |
| income | 1.465 | .393 | 3.73 | 0.000 | .696 | 2.235 |
| incomesq | -5.854 | 2.853 | -2.05 | 0.040 | -11.45 | -.262 |
| other controls omitted | | | | | | |
| _cons | -1.188 | .328 | -3.62 | 0.000 | -1.831 | -.545 |
59
Marginal Effects
• For numerical interpretation / prediction, need
to convert coefficients to marginal effects
• Example: Logit Model
• So $b_1$ gives effect on Log(·), not P(y=1)
• Probit is similar
• Can re-arrange to find out effect on P(y=1)
• Usually do this at the sample means

60
Marginal Effects
Probit estimates: Number of obs = 1189, LR chi2(14) = 232.39, Prob > chi2 = 0.0000, Log likelihood = -707.94377, Pseudo R2 = 0.1410

| FMLAknow | dF/dx | Std. Err. | z | P>\|z\| | 95% Conf. Int. (lower) | 95% Conf. Int. (upper) |
|---|---|---|---|---|---|---|
| union | .095 | .040 | 2.35 | 0.019 | .017 | .173 |
| age | -.001 | .007 | -0.13 | 0.897 | -.015 | .013 |
| agesq | .054 | .087 | 0.62 | 0.536 | -.117 | .225 |
| nonwhite | -.222 | .036 | -5.80 | 0.000 | -.293 | -.151 |
| income | .585 | .157 | 3.73 | 0.000 | .278 | .891 |
| incomesq | -2.335 | 1.138 | -2.05 | 0.040 | -4.566 | -.105 |
| other controls omitted | | | | | | |
For numerical interpretation / prediction, need
to convert coefficients to marginal effects
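The conversion from coefficients to marginal effects can be sketched as follows. For a continuous regressor, the standard probit marginal effect is $\phi(\text{index}) \times b$ and the logit one is $P(1-P) \times b$, where the index is $b_0 + b_1\bar{x}_1 + \dots + b_k\bar{x}_k$ at the sample means. The transcript does not report the index at the means, so the value of 0 used below is a loud assumption, chosen because it comes close to reproducing the dF/dx column. The function names are illustrative.

```python
import math

def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def probit_marginal_effect(coef, index):
    """dP/dx = phi(index) * coef for a continuous regressor in a probit."""
    return normal_pdf(index) * coef

def logit_marginal_effect(coef, index):
    """dP/dx = P * (1 - P) * coef for a continuous regressor in a logit."""
    p = 1.0 / (1.0 + math.exp(-index))
    return p * (1.0 - p) * coef

# ASSUMPTION: index at the sample means is 0 (predicted P = 0.5).
# This value is not reported in the source output.
INDEX_AT_MEANS = 0.0

# Probit coefficients from the FMLA example.
probit_coefs = {"union": 0.238, "income": 1.465, "incomesq": -5.854}
effects = {name: probit_marginal_effect(b, INDEX_AT_MEANS)
           for name, b in probit_coefs.items()}

for name, effect in effects.items():
    print(f"{name:9s} coef = {probit_coefs[name]:+.3f}  marginal effect = {effect:+.3f}")
```

For dummy variables such as union, statistical packages typically report the change in probability from switching the dummy from 0 to 1 rather than this derivative, so small differences from the reported dF/dx are to be expected.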
61
But Linear Probability Model is OK, Too
| | Probit Coeff. | Probit Marginal | Regression |
|---|---|---|---|
| Union | 0.238 (0.101) | 0.095 (0.040) | 0.084 (0.035) |
| Nonwhite | -0.571 (0.098) | -0.222 (0.037) | -0.192 (0.033) |
| Income | 1.465 (0.393) | 0.585 (0.157) | 0.442 (0.091) |
| Income Squared | -5.854 (2.853) | -2.335 (1.138) | -1.354 (0.316) |

So regression is usually OK, but should still be
familiar with logit and probit methods