Title: MGT 511: Hypothesis Testing and Regression, Lecture 8: Framework for Multiple Regression Analysis
2. Recall
- Simple Regression
- t-test of slope coefficients, R-squared
- Forecasts, Prediction and Confidence Intervals
- Transformations for nonlinearity and non-constant variance
- Multiple Regression
- Partial Slopes, tradeoff between bias and precision
- ANOVA, F-test
- Dummy Variables and Interaction Variables
- Residual Analysis and Outliers
3. Framework for Multiple Regression
- Use theory and knowledge to build the initial model
- Residual Analysis and Refinement of model
- Perform F-test; if F-test rejects null, perform t-tests
- Possible Reasons for Insignificance of Individual Slope Coefficients
- Refine the model
4. Step 1: Using knowledge and theory to specify the initial model
- What is the dependent variable? What are the potential predictor variables?
- Should you use:
- Transformations to accommodate nonlinear effects
- Normalization of the y or x variables (per capita, constant dollars, etc.)
- Dummy variables
- Interaction variables if slope effects can be different
- Collect data, estimate the model
- Are the results plausible? For example, how is the prediction at extreme values?
- If not, refine the model.
5. What should be the Y and X variables?
- Y: Sales of personal printers in different sales districts
- What are appropriate X variables?
- Knowledge suggests several segments
- College students, home users, small businesses, computer network workstations
- Appropriate X variables
- College freshmen, household income, small business starts, new network installations
6. Potential X variables: Tradeoffs
- Omitting important variables can bias results or reduce explanatory power
- Using too many variables can make all variables insignificant
- Prioritize the variables, based on which you consider most important
7. Transformations
- Is the relationship nonlinear?
- Sales-Advertising relationship
- Experience Curve effect
8. Normalization of the Variables
- Normalizing the Y variable: Example
- Y: Unit Sales in different cities (Problem?)
- X: Price and Feature Advertising
- Solution?
- Normalizing the X variable: Example
- Y: Total Market Value of Firm
- X: Value of Assets, Number of Employees (Problem?)
- Solution?
9. Interaction Effects
- Y: Sales; X: Price, Feature
- Y: Sales; X: Price, Holiday
- Y: Salary; X: Gender, Experience
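For the salary example, one way to write the interaction model is sketched below. The coding of Gender as a 0/1 dummy and the β symbols are assumptions made for illustration; the slides do not give the equation.

```latex
\begin{align*}
\text{Salary} &= \beta_0 + \beta_1\,\text{Gender} + \beta_2\,\text{Experience}
  + \beta_3\,(\text{Gender} \times \text{Experience}) + \varepsilon \\
\text{Gender} = 0:\quad \text{Salary} &= \beta_0 + \beta_2\,\text{Experience} + \varepsilon \\
\text{Gender} = 1:\quad \text{Salary} &= (\beta_0 + \beta_1) + (\beta_2 + \beta_3)\,\text{Experience} + \varepsilon
\end{align*}
```

The dummy alone shifts the intercept by β1; the interaction term lets the experience slope differ by β3 between the two groups. A t-test on β3 checks whether the slopes differ.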
10. Plausibility of Results
- Will results make sense at extreme values?
- Usually alerts to nonlinearity issues
- Examples
- What will sales be at very high prices, very high advertising?
- What will cost be at high levels of experience?
11. Step 2: Residual Analysis
- Check the residuals and refine the model
- Accommodating Nonlinear Effects
- Accounting for non-constant variance
- Accounting for outliers
- Keep refining and re-estimating the model until the residuals are satisfactory
- Remember that residuals will not perfectly follow the rules due to randomness; minor deviations will not affect regression results
12. Step 3: Performing F-tests and t-tests
- If estimated equation and residual analysis are OK, conduct F-test for the model as a whole
- If we reject the null using the F-test, conduct t-tests for individual slopes
- Question: What to do if one or more individual slope coefficients are insignificant?
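As a reminder of what the overall F-test computes, here is a minimal sketch based on the standard formula F = (R² / k) / ((1 - R²) / (n - k - 1)), where k is the number of slope coefficients and n is the number of observations. The function name and the example numbers are hypothetical; in practice, regression software reports F and its p-value directly.

```python
def overall_f_statistic(r_squared: float, n_obs: int, n_slopes: int) -> float:
    """F statistic for H0: all slope coefficients are zero."""
    if not 0 <= r_squared < 1:
        raise ValueError("r_squared must be in [0, 1)")
    df_model = n_slopes                # numerator degrees of freedom
    df_resid = n_obs - n_slopes - 1    # denominator degrees of freedom
    if df_model <= 0 or df_resid <= 0:
        raise ValueError("need at least one slope and more observations than coefficients")
    return (r_squared / df_model) / ((1 - r_squared) / df_resid)


# Hypothetical example: R-squared of 0.5 with 23 observations and 2 predictors
f_value = overall_f_statistic(0.5, 23, 2)
print(f"F = {f_value:.2f} on (2, 20) degrees of freedom")
```

The F value is then compared with the critical value of the F distribution for those degrees of freedom; only if the null is rejected do the individual t-tests follow.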
13. Possible Reasons for Insignificance of Individual Slope Coefficients
- Omitted Variable Bias
- Nonlinearity not appropriately taken care of
- Multicollinearity
- True effect is non-zero, but small
- True effect is zero
14. Omitted Variable Bias
- One or more relevant predictor variables are missing
- Action: add the variables to the model
- Example 1
- Y: Sales
- X: Price
- Omitted X variable: Advertising
- Example 2
- Y: Salary
- X: Schooling
- Omitted X variable: Job Experience
15. Regression of Salary against Schooling and Experience
Explain this phenomenon
16. Nonlinearity not taken care of
- The X variable affects the Y variable differently than assumed in the model
- Action: use a different transformation
- Example: Recall HW Problem
- Y: Yield
- X: Temperature
- Solution: Add Temperature² (Temperature squared)
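Written out, the refined model for the yield example might look like the sketch below. The β symbols and the turning-point formula are standard algebra for a quadratic and are not taken from the homework problem itself.

```latex
\begin{align*}
\text{Yield} &= \beta_0 + \beta_1\,\text{Temperature} + \beta_2\,\text{Temperature}^2 + \varepsilon \\
\frac{\partial\,E[\text{Yield}]}{\partial\,\text{Temperature}} &= \beta_1 + 2\beta_2\,\text{Temperature} \\
\text{Temperature}^{*} &= -\frac{\beta_1}{2\beta_2} \quad (\beta_2 \neq 0)
\end{align*}
```

If yield first rises and then falls with temperature, a straight line can show a slope near zero even though temperature matters. The squared term captures the curvature, with β2 < 0 when there is a peak at Temperature*.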
17. Multicollinearity
- Highly correlated X variables reduce the significance of those variables
- Action 1: reformulate the model (e.g. per capita, constant dollars)
- Action 2: obtain more data
- Action 3: delete this predictor variable
18. True Effect is Small or Zero
- True effect of X is small, but non-zero
- Action 1: obtain more data (or)
- Action 2: delete this variable
- True effect of X is zero
- Action: delete this variable
19. Possible Reasons for Insignificance of Individual Slope Coefficients
- Omitted Variable Bias
- Nonlinearity not appropriately taken care of
- Multicollinearity
- True effect is non-zero, but small
- True effect is zero
20. Summary
- For multiple regression to provide valid and meaningful results, it is critical that the proposed model is well specified
- Before we can justify statistical inference (about the model, about slope parameters or for predictions), the plausibility of the estimated equation should be checked and the residuals should be examined
- Variables should be transformed to accommodate nonlinear effects of the original variables (e.g. resulting in linear effects for the transformed variables)
- There are many possible reasons for the occurrence of insignificant slope coefficients (and it is not easy to distinguish between these reasons)