Multiple Linear Regression - PowerPoint PPT Presentation

Slides: 47
Provided by: course1Wi
1
Multiple Linear Regression
2
Multiple Regression
  • In multiple regression we have multiple
    predictors $X_1, X_2, \ldots, X_p$ and we are interested in
    modeling the mean of the response $Y$ as a function
    of these predictors, i.e. we wish to estimate
    $E(Y \mid X_1, X_2, \ldots, X_p)$, or $E(Y \mid X)$ for short. In linear
    regression we use a function that is linear in the
    model parameters, e.g.
  • $E(Y \mid X_1, X_2) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_{12} X_1 X_2$
  • $E(Y \mid X_1, X_2, X_3) = \beta_0 + \beta_1 \ln(X_1) + \beta_2 X_2^2 + \beta_3 X_3$
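To make "linear in the model parameters" concrete, here is a minimal Python sketch of the second example model. The function name `mean_response` and the coefficient values are hypothetical, chosen only for illustration; they are not estimates from any data in this presentation.

```python
from math import log

def mean_response(x1, x2, x3, beta):
    """E(Y | X1, X2, X3) = b0 + b1*ln(X1) + b2*X2^2 + b3*X3."""
    # The predictors are transformed nonlinearly, but the mean is a plain
    # weighted sum of the transformed terms, i.e. linear in beta.
    terms = [1.0, log(x1), x2 ** 2, x3]
    return sum(b * t for b, t in zip(beta, terms))

beta = [1.0, 2.0, 0.5, -0.5]  # hypothetical coefficients
base = mean_response(1.0, 2.0, 3.0, beta)
# Doubling every coefficient doubles the mean: linearity in the parameters.
doubled = mean_response(1.0, 2.0, 3.0, [2 * b for b in beta])
```

Because the model is a weighted sum of fixed terms, ordinary least squares applies even though $\ln(X_1)$ and $X_2^2$ are nonlinear in the predictors.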

3
Example 1: NC Birth Weight Data
  • $Y$ = birth weight of infant (g)
  • Consider the following potential predictors:
  • $X_1$ = mother's age (yrs.)
  • $X_2$ = father's age (yrs.)
  • $X_3$ = mother's education (yrs.)
  • $X_4$ = father's education (yrs.)
  • $X_5$ = mother's smoking status (1 = yes, 0 = no)
  • $X_6$ = weight gained during pregnancy (lbs.)
  • $X_7$ = gestational age (weeks)
  • $X_8$ = number of prenatal visits
  • $X_9$ = race of child (White, Black, Other)

4
Dichotomous Categorical Predictors
  • In this study smoking status ($X_5$) is an example
    of a dichotomous (2-level) categorical predictor.
    How do we use a predictor like this in a regression
    model?
  • There are two common approaches. One
    approach is to code smoking status as 0 or 1 and
    treat it as a numeric predictor (this is called
    0-1 coding).
  • The other is to code smoking status as -1 or 1
    and treat it as a numeric predictor (this is
    called contrast coding).

5
Example 1: NC Birth Weight Data
  • We first consider 0-1 coding
  • and fit the model $E(Y \mid X_5) = \beta_0 + \beta_5 X_5$
  • $E(Y \mid \text{Smoker}) = 3287.66 - 214.85(1) \approx 3072.80$ g
  • $E(Y \mid \text{Non-smoker}) = 3287.66 - 214.85(0) = 3287.66$ g
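The way 0-1 coding turns the intercept into the non-smoker mean and the slope into the smoker-minus-non-smoker difference can be sketched with a tiny least-squares fit. The birth weights below are made-up toy values, not the NC data, and `simple_ols` is a helper name introduced here for illustration.

```python
def simple_ols(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical birth weights in grams (toy data).
nonsmokers = [3300.0, 3400.0, 3150.0, 3250.0]
smokers = [3000.0, 3100.0, 3200.0]

x = [0] * len(nonsmokers) + [1] * len(smokers)  # 0-1 coding
y = nonsmokers + smokers
b0, b5 = simple_ols(x, y)
# b0 is the non-smoker sample mean; b0 + b5 is the smoker sample mean.
```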

6
Example 1: NC Birth Weight Data
Punchline: the two-sample t-test is equivalent to
regression!
  • Compare to a pooled t-test

Regression Output (0-1 coding)
$E(Y \mid \text{Smoker}) = 3072.80$ g, $E(Y \mid \text{Non-smoker}) = 3287.66$ g
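The equivalence can be checked numerically: the pooled two-sample t statistic equals the t statistic for the slope in a regression on the 0-1 coded group indicator. This is a sketch on made-up toy weights (not the NC data); the function names are illustrative.

```python
from math import sqrt

def pooled_t(group0, group1):
    """Pooled-variance two-sample t statistic for mean(group1) - mean(group0)."""
    n0, n1 = len(group0), len(group1)
    m0, m1 = sum(group0) / n0, sum(group1) / n1
    ss0 = sum((v - m0) ** 2 for v in group0)
    ss1 = sum((v - m1) ** 2 for v in group1)
    pooled_var = (ss0 + ss1) / (n0 + n1 - 2)
    return (m1 - m0) / sqrt(pooled_var * (1 / n0 + 1 / n1))

def slope_t(x, y):
    """t statistic for the slope in simple linear regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return b1 / sqrt(sse / (n - 2) / sxx)

nonsmokers = [3300.0, 3400.0, 3150.0, 3250.0]  # hypothetical, grams
smokers = [3000.0, 3100.0, 3200.0]             # hypothetical, grams

t_pooled = pooled_t(nonsmokers, smokers)
x = [0] * len(nonsmokers) + [1] * len(smokers)
t_slope = slope_t(x, nonsmokers + smokers)
```

Both statistics share the same degrees of freedom (n - 2), so the p-values agree as well.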
7
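Example 1: NC Birth Weight Data
  • Now consider -1/1 coding (smoker = -1, non-smoker = 1)
  • and fit the model $E(Y \mid X_5) = \beta_0 + \beta_5 X_5$
  • $E(Y \mid \text{Smoker}) = 3180.18 + 107.38(-1) = 3072.80$ g
  • $E(Y \mid \text{Non-smoker}) = 3180.18 + 107.38(1) \approx 3287.66$ g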
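Under -1/1 coding the intercept becomes the midpoint of the two group means and the slope becomes half their difference, which is why the smoking effect is twice the coefficient. A minimal sketch on made-up toy weights (not the NC data), with smokers coded -1 as on this slide:

```python
def simple_ols(x, y):
    """Least-squares intercept and slope for one predictor."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

nonsmokers = [3300.0, 3400.0, 3150.0, 3250.0]  # hypothetical, grams
smokers = [3000.0, 3100.0, 3200.0]             # hypothetical, grams

x = [1] * len(nonsmokers) + [-1] * len(smokers)  # -1/1 coding
b0, b5 = simple_ols(x, nonsmokers + smokers)

mean_non = sum(nonsmokers) / len(nonsmokers)
mean_smk = sum(smokers) / len(smokers)
# b0 is the midpoint of the two group means, b5 is half their difference.
```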

8
Example 1: NC Birth Weight Data
Punchline: the two-sample t-test is equivalent to
regression!
  • Compare to a pooled t-test

Regression Output (-1/1 coding)
$E(Y \mid \text{Smoker}) = 3072.80$ g, $E(Y \mid \text{Non-smoker}) = 3287.66$ g
$2 \times (95\%\ \text{CI for } \beta_5) = 2 \times (107.38 \pm 1.96 \times 28.90) \approx (101.34, 328.36)$
9
Factors with more than two levels
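  • Consider race of the child coded as W = White,
    B = Black, O = Other
  • $E(\text{Birth Weight} \mid \text{Race}) = \;?$
  • $E(\text{Birth Weight} \mid \text{White}) = 3226.33 - 159.52(-1) + 56.74(-1) = 3329.11$ g
  • $E(\text{Birth Weight} \mid \text{Black}) = 3226.33 - 159.52(1) = 3066.81$ g
  • $E(\text{Birth Weight} \mid \text{Other}) = 3226.33 + 56.74(1) \approx 3283.08$ g

The level that comes last alphabetically (White) is the reference
group and is coded -1 on both indicators. Each of the other groups
is coded 1 on its own indicator and 0 on the other.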
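The three group means can be reproduced from the coefficients shown on this slide. The sketch below assumes the coding implied by the equations (Black and Other each get their own indicator, White is -1 on both); the names `CODES` and `mean_birth_weight` are illustrative, and results agree with the slide only up to rounding of the displayed coefficients.

```python
# Coefficients as displayed on the slide (grams).
B0, B_BLACK, B_OTHER = 3226.33, -159.52, 56.74

# Assumed -1/1 coding: (x_black, x_other) for each race level.
CODES = {"Black": (1, 0), "Other": (0, 1), "White": (-1, -1)}

def mean_birth_weight(race):
    """Estimated mean birth weight for a race level under -1/1 coding."""
    x_black, x_other = CODES[race]
    return B0 + B_BLACK * x_black + B_OTHER * x_other

means = {race: mean_birth_weight(race) for race in CODES}
# With this coding the intercept is the unweighted average of the group means.
average_of_means = sum(means.values()) / len(means)
```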
10
Factors with more than two levels
$E(\text{Birth Weight} \mid \text{White}) = 3329.11$ g
$E(\text{Birth Weight} \mid \text{Black}) = 3088.62$ g
$E(\text{Birth Weight} \mid \text{Other}) = 3283.08$ g
11
Tukey's and Regression
With white infants as the reference group, the mean birth weight of
Black infants differs significantly from that of white infants
(p < .0001).
However, non-Black minority infants do not
differ significantly from white infants in
mean birth weight (p = .2729).
Black infants have a significantly lower mean
birth weight than both white and non-Black
minority infants.
12
ANOVA = Regression!
  • One-way ANOVA is equivalent to regression on
    the -1/1 coded levels of the factor, with one
    of the k populations to be compared viewed
    as the reference group.

13
Example: NC Birth Weights
We have evidence that the mean birth weight of
infants born to the population of smoking mothers
is between 102.5 g and 327.06 g less than the mean
birth weight of infants born to non-smokers.
Does this mean that, if we compared only
full-term babies, the mean
birth weight of babies born to smokers would still be
lower than that of babies born to non-smokers?
Not necessarily: maybe smoking leads to earlier
births, and that is the reason for the overall
difference above.
14
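Example: NC Birth Weights
  • One way to explore this possibility is to add
    gestational age as a covariate to a regression
    model already containing smoking status, i.e.
    $E(Y \mid X_5, X_7) = \beta_0 + \beta_5 X_5 + \beta_7 X_7$
  • where $X_5$ is smoking status (-1/1 coded) and $X_7$ is gestational age (weeks).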

15
Example: NC Birth Weights
  • The estimated equation is
  • thus for smokers and non-smokers we have
  • The difference between smokers and
    non-smokers is $2\hat{\beta}_5 = 2(89.13) = 178.26$ g,
    holding gestational age constant.

16
Example: NC Birth Weights
  • The 95% CI for the smoking effect for infants
    with a given gestational age is
    $2 \times (89.13 \pm 1.96 \times 24.12)$
  • $= 2 \times (41.85, 136.41) = (83.70\ \text{g}, 272.82\ \text{g})$
  • Thus, adjusting for gestational age, we estimate
    that the mean birth weight of infants born to
    smoking mothers is between 83.70 g and 272.82 g
    lower than the mean birth weight of infants born
    to non-smoking mothers.
  • Q: What if the effect of gestational age is
    different for smokers and non-smokers? For
    example, maybe for smokers an additional week of
    gestational age does not translate to the same
    increase in birth weight as it does for
    non-smokers. What should we do?
  • A: Add a smoking by gestational age interaction
    term, Smoking × Gest. Age, which will allow the
    lines for smokers and non-smokers to have different
    slopes.
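The interval arithmetic in the first two bullets of this slide can be reproduced in a few lines. This is only a sketch of the displayed calculation: `doubled_ci` is a name introduced here, and it rounds the endpoints to two decimals before doubling because that is how the slide presents the steps.

```python
def doubled_ci(estimate, se, z=1.96):
    """CI for 2*beta: form estimate +/- z*se, round, then double both ends."""
    half_width = z * se
    lower = round(estimate - half_width, 2)
    upper = round(estimate + half_width, 2)
    return 2 * lower, 2 * upper

# Coefficient and standard error as shown on the slide.
low, high = doubled_ci(89.13, 24.12)
```

The doubling is needed because, with -1/1 coding, smokers and non-smokers are two units apart on the coded predictor.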

17
Example: NC Birth Weights
The interaction is not statistically significant
(p = .9564), so the parallel lines model is
sufficient.
The fitted lines look very nearly parallel, so there is
little evidence of an interaction in
the form of different slopes.
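Written out, the interaction model makes the "parallel lines" question explicit. The symbol $\beta_{57}$ for the interaction coefficient is notation introduced here, and the coding (smoker = -1, non-smoker = 1) is assumed to match the earlier -1/1 slides.

```latex
\begin{align*}
E(Y \mid X_5, X_7) &= \beta_0 + \beta_5 X_5 + \beta_7 X_7 + \beta_{57} X_5 X_7 \\
\text{Smokers } (X_5 = -1):\quad
E(Y \mid X_7) &= (\beta_0 - \beta_5) + (\beta_7 - \beta_{57}) X_7 \\
\text{Non-smokers } (X_5 = 1):\quad
E(Y \mid X_7) &= (\beta_0 + \beta_5) + (\beta_7 + \beta_{57}) X_7
\end{align*}
```

The two slopes differ by $2\beta_{57}$, so testing $\beta_{57} = 0$ is a test of whether the lines are parallel.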
18
Example 2: Birth Weight, Gestational Age & Hospital
  • Study of premature infants born at three
    hospitals.
  • Variables are:
  • Birth weight (g)
  • Gest. age (wks.)
  • Hospital (A, B, C)
19
Example 2: Birth Weight, Gestational Age & Hospital
Do the mean birth weights differ significantly
across the three hospitals in this study? Using
one-way ANOVA we find that the means
differ significantly (p = .0022).
We conclude that the mean birth weight of infants born
at Hospital A is significantly lower than the
mean birth weight of infants born at Hospital B; we
estimate it is between 128.1 g and 611.0 g lower.
20
Example 2: Birth Weight, Gestational Age & Hospital
  • What role does gestational age play in these
    differences? Perhaps gestational age differs
    across hospitals and that helps explain the
    birth weight differences.

One-way ANOVA yields p = .1817 for comparing the
mean gestational ages of infants born at the
three hospitals.
21
Example 2: Birth Weight, Gestational Age & Hospital
This is a scatter plot of birth weight vs.
gestational age with the points color-coded by
hospital. Is there evidence that the weight gain
per week differs between the hospitals? The fitted lines
suggest that the weight gain per week
does differ across the hospitals.
22
Example 2: Birth Weight, Gestational Age & Hospital
23
Example 2: Birth Weight, Gestational Age & Hospital
The intercepts are meaningless for these data.
The estimated weight gain for premature babies is
48.76 g/week at hospital A, 108.52 g/week
at hospital B, and 76.49 g/week at hospital C.
As a result, the differences between the mean
birth weights as a function of gestational age are larger for
infants that are closer to full term.
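Because the slopes differ, the gap between two hospitals' mean birth weights changes with gestational age. The sketch below uses only the three slopes reported above (the intercepts are not given in this transcript, so only the change in the gap is computed); the helper name `gap_growth` is illustrative.

```python
# Estimated weight gain per week of gestational age (g/week), from the slide.
weekly_gain = {"A": 48.76, "B": 108.52, "C": 76.49}

def gap_growth(first, second, weeks):
    """Change in (second - first) mean birth weight over `weeks` extra weeks."""
    return (weekly_gain[second] - weekly_gain[first]) * weeks

per_week = gap_growth("A", "B", 1)    # B pulls ahead of A by this much each week
four_weeks = gap_growth("A", "B", 4)
```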
24
Analysis of Covariance (ANCOVA)
  • These two examples are analysis of covariance
    models, where we were primarily interested in
    potential differences between populations defined
    by a nominal variable (e.g. smoking status) and
    we adjust that comparison for
    other factors such as gestational age. The
    variables that we are adjusting for are called
    covariates.

25
Example 1: NC Birth Data (cont'd)
  • We now consider comparing smoking and non-smoking
    mothers, adjusting for the full set of potential
    confounding factors.

  • $X_1$ = mother's age (yrs.)
  • $X_2$ = father's age (yrs.)
  • $X_3$ = mother's education (yrs.)
  • $X_4$ = father's education (yrs.)
  • $X_5$ = mother's smoking status (1 = yes, 0 = no)
  • $X_6$ = weight gained during pregnancy (lbs.)
  • $X_7$ = gestational age (weeks)
  • $X_8$ = number of prenatal visits
  • $X_9$ = race of child (White, Black, Other)
26
Example 1: NC Birth Data (cont'd)
Covariates
27
Example 1: NC Birth Data (cont'd)
  • Effect Tests

These covariates are not significant, but they are also
fairly correlated and thus contain much the
same information. We might consider removing
some or potentially all of these predictors from
the model.
28
Example 1: NC Birth Data (cont'd)
The ages of the mother and father are quite correlated
(r = .7539), so it is unlikely that both of these
pieces of information would be needed in the same
regression model. When this happens we say there
is multicollinearity amongst the predictors.
Also, when building regression models we want
them to be parsimonious, i.e. simple but
effective.
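A correlation like the one quoted above is the Pearson coefficient between the two predictors. The sketch below computes it from scratch on made-up ages (not the NC data) and also squares the reported r = .7539 to show how much variance the two ages share linearly.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

mother_age = [22, 25, 30, 34, 28]  # hypothetical ages (yrs.)
father_age = [24, 29, 31, 38, 27]  # hypothetical ages (yrs.)
r_toy = pearson_r(mother_age, father_age)

# Squared correlation for the value reported on the slide.
shared_variance = 0.7539 ** 2
```

With r = .7539, roughly 57% of the variation in one parent's age is linearly explained by the other's, which is why the two predictors carry overlapping information.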
29
Stepwise Model Selection
  • When building regression models, one of the
    simplest strategies is stepwise model
    selection. There are two main types of stepwise
    methods: forward selection and backward
    elimination.
  • Forward Selection
  • Fit the model with intercept only, $E(Y \mid X) = \beta_0$.
  • Fit the model adding the best predictor amongst
    those available. This could be done by choosing
    the one that gives the maximum $R^2$, for example.
  • Continue adding predictors one at a time,
    maximizing $R^2$ at each step, until no more
    predictors can be added that have p-values < $\alpha$.
    Generally $\alpha$ is chosen to be .10 or potentially
    higher.
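The forward selection steps above can be sketched in plain Python. Note one deliberate substitution: to stay dependency-free, this sketch stops when the gain in $R^2$ falls below a threshold (`min_gain`, a hypothetical parameter) instead of using the p-value < $\alpha$ rule from the slide. The data and all function names are made up for illustration.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def r_squared(columns, y):
    """R^2 of the least-squares fit of y on an intercept plus `columns`."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    y_bar = sum(y) / n
    ss_res = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
                 for r, yi in zip(rows, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def forward_select(data, y, min_gain=0.01):
    """Add the predictor with the largest R^2 until the gain is < min_gain."""
    chosen, best_r2 = [], 0.0
    remaining = sorted(data)
    while remaining:
        scores = {name: r_squared([data[c] for c in chosen + [name]], y)
                  for name in remaining}
        best = max(remaining, key=scores.get)
        if scores[best] - best_r2 < min_gain:
            break
        chosen.append(best)
        remaining.remove(best)
        best_r2 = scores[best]
    return chosen, best_r2

# Toy data: y depends strongly on x1 and hardly at all on x2.
data = {"x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
        "x2": [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]}
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
chosen, r2 = forward_select(data, y)
```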

30
Stepwise Model Selection
  • When building regression models, one of the
    simplest strategies is stepwise model
    selection. There are two main types of stepwise
    methods: forward selection and backward
    elimination.
  • Backward Elimination
  • Fit the model with all potential predictors included.
  • Remove the worst predictor, usually judged by the
    highest p-value.
  • Continue removing predictors one at a time until
    all p-values for included predictors are < $\alpha$.
    Again, $\alpha$ is generally chosen to be .10 or
    potentially higher.

This is the approach I usually take.
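The backward elimination loop can be sketched as follows. The refitting step is abstracted into a caller-supplied function (`pvalues_for`, a hypothetical name) that returns a p-value for each predictor currently in the model; in practice that function would refit the regression each time. The predictor names and p-values below are invented, and the toy stub keeps them fixed between steps, which a real refit would not.

```python
def backward_eliminate(predictors, pvalues_for, alpha=0.10):
    """Drop the highest-p-value predictor until all p-values are < alpha."""
    current = list(predictors)
    while current:
        pvals = pvalues_for(current)  # refit with the current set
        worst = max(current, key=lambda name: pvals[name])
        if pvals[worst] < alpha:
            break                     # every remaining predictor is kept
        current.remove(worst)
    return current

# Hypothetical p-values, for illustration only.
toy_p = {"a": 0.001, "b": 0.45, "c": 0.20, "d": 0.03}

def toy_pvalues(current):
    """Stand-in for refitting the model and reading off p-values."""
    return {name: toy_p[name] for name in current}

kept = backward_eliminate(["a", "b", "c", "d"], toy_pvalues)
```

Here "b" is dropped first, then "c", and the loop stops once the largest remaining p-value (0.03) is below 0.10.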
31
Example 1: NC Birth Data, Backward Elimination
Step 1: Remove Father's Education
Step 2: Remove Father's Age
Step 3: Stop, no p-values > .10.
32
Example 1: NC Birth Data (cont'd)
$R^2$ = 35.62%, i.e. 35.62% of the variation in birth weight is
explained by our model.
Fitted Model
Interpretation of Smoking Status: Adjusting for
mother's age and education, weight gain during
pregnancy, gestational age, race of the infant,
and number of prenatal visits, we find that infants born to
smoking mothers have a mean birth weight that is
$2 \times 85.87 = 171.74$ g less than that of infants born to mothers who
do not smoke during pregnancy.
33
95% CI for Difference in Means
After adjusting for mother's age and years of
education, weight gain during pregnancy,
gestational age, race of the infant, and number
of prenatal visits, we estimate that the mean
birth weight of infants born to women who smoke
during pregnancy is between 77 g and 266 g less
than that of infants born to women who do not smoke during
pregnancy.
This can also be obtained directly from the parameter
estimates.
34
Checking Assumptions
  • Assumptions
    1. The specified functional form for $E(Y \mid X)$ is
       adequate.
    2. $\mathrm{Var}(Y \mid X)$, or equivalently $\mathrm{SD}(Y \mid X)$, is constant.
    3. Random errors are normally distributed.
    4. Errors are independent.
  • Basic plots
    • Residuals vs. Fitted Values (checks 1, 2, 4)
    • Normal Quantile Plot of Residuals (checks 3)
  • Note: These are the same plots used in simple
    linear regression to check model assumptions.

35
Checking Assumptions
With the exception of a few mild outliers and
one fairly extreme outlier, there are no obvious
violations of model assumptions: there is no
evidence of curvature and the variation looks
constant.
Residuals are approximately normally distributed
with the exception of a few extreme outliers on
the low end.
36
Example 3: Factors Related to Job Performance of Nurses
  • A nursing director would like to use nurses'
    personal characteristics to develop a regression
    model for predicting job performance (JOBPER).
    The following potential predictors are available:
  • $X_1$ = assertiveness (ASSERT)
  • $X_2$ = enthusiasm (ENTHUS)
  • $X_3$ = ambition (AMBITION)
  • $X_4$ = communication skills (COMM)
  • $X_5$ = problem-solving skills (PROB)
  • $X_6$ = initiative (INITIATIVE)
  • $Y$ = job performance (JOBPER)

37
Example 3: Factors Related to Job Performance of Nurses
38
Example 3: Factors Related to Job Performance of Nurses
  • Correlations and Scatter Plot Matrix

We can see that ambition has the strongest
correlation with performance (r = .8787, p <
.0001) and problem-solving skills the weakest (r
= .1555, p = .4118). It is also interesting to note
that initiative has a negative correlation with
performance (r = -.5777, p = .0008).
What we would really like to see is the correlation
between job performance and each variable
adjusting for the other variables, because we can
clearly see that the predictors themselves are
related.
39
Partial Correlations
  • The partial correlation between a
    response/dependent variable (Y) and
    predictor/independent variable (Xi) is a measure
    of the strength of linear association between Y
    and Xi adjusted for the other independent
    variables being considered.

Taking the other variables into account, we see that
ambition (partial corr. = .8023) and initiative
(partial corr. = -.4043) have the strongest
adjusted relationship with job performance. We
would therefore expect these variables to be in a
final regression model for job performance.
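For intuition, the partial correlation has a closed form when adjusting for a single other variable Z. The sketch below implements only that one-covariate special case, with made-up input correlations. Adjusting for several predictors at once, as in this example, is usually done by correlating the residuals of Y and Xi after regressing each on the remaining predictors.

```python
from math import sqrt

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of X and Y adjusted for one other variable Z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations, for illustration only.
adjusted = partial_corr(0.5, 0.5, 0.5)
# If Z is uncorrelated with both X and Y, adjusting changes nothing.
unchanged = partial_corr(0.6, 0.0, 0.0)
```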
40
Example 3: Factors Related to Job Performance of Nurses
$R^2$ = 84.8% of the variation in job performance is
explained by the model. The adjusted $R^2$
penalizes for having too many predictors in the
model. Every predictor added to a model will
increase $R^2$ (or at least never decrease it); however, we generally reach
a point of diminishing returns as we continue to
add predictors. Here the adjusted $R^2$ = 80.9%.
Several predictors appear to be unimportant and
could be removed from the model; we will again
use backward elimination to do this.
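The penalty works through the usual adjusted $R^2$ formula, $1 - (1 - R^2)(n - 1)/(n - p - 1)$, where n is the sample size and p the number of predictors. The sample size for the nursing data is not stated in this transcript, so the sketch below uses clearly hypothetical numbers.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R^2 for a model with p predictors fit to n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Hypothetical values: same R^2, one extra (useless) predictor.
two_predictors = adjusted_r2(0.50, n=11, p=2)
three_predictors = adjusted_r2(0.50, n=11, p=3)
```

Holding $R^2$ fixed, the extra predictor lowers the adjusted value, which is the sense in which unhelpful predictors are penalized.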
41
Added Variable (Leverage) Plots
Ambition and Initiative exhibit the strongest
adjusted relationship with job performance.
These plots are a visualization of the partial
correlation. They show the relationship between
the response Y and each of the predictors
adjusted for the other predictors. The
correlation exhibited in each is the partial
correlation.
42
Example 3: Factors Related to Job Performance of Nurses
  • Using backward elimination

Step 1: Drop Problem-Solving
Step 2: Drop Communication
Step 3: Drop Enthusiasm
Step 4: Drop Assertiveness
$R^2$ = 80.7% of the variation in job performance is
explained by the regression on ambition and
initiative. Notice this is not much different
from the adjusted $R^2$ for the full model.
43
Checking Assumptions
No problems here.
Or here
Final Regression Model
44
Summary
  • Two-sample t-tests and one-way and two-way ANOVA
    are all really just regression models with
    nominal predictors.
  • Analysis of Covariance (ANCOVA) is also just
    regression where we are interested in making
    population/treatment comparisons adjusting for
    the potential effects of other factors/covariates.
  • Multiple regression in general is the process of
    estimating the mean response of a variable ($Y$)
    using multiple predictors/independent variables,
    $E(Y \mid X_1, \ldots, X_p)$.

45
Summary
  • Partial correlation and added variable or
    leverage plots help understand the relationship
    between the response and an individual
    independent variable adjusting for the other
    independent variables being considered.
  • Assumption checking is basically the same as it
    was for simple linear regression.

46
Summary
  • When problems are evident, general remedies
    include:
  • Transforming the response ($Y$)
  • Transforming the predictors
  • Adding nonlinear terms to the model, like squared
    terms ($X_i^2$), or including interaction terms.
  • We still need to be aware of strange observations,
    i.e. outliers and influential points.