# Chapter 9 Variable Selection and Model building - PowerPoint PPT Presentation

## Chapter 9 Variable Selection and Model building


1
Chapter 9 Variable Selection and Model building
• Ray-Bing Chen
• Institute of Statistics
• National University of Kaohsiung

2
9.1 Introduction
• 9.1.1 The Model-Building Problem
• Ensure that the functional form of the model is
correct and that the underlying assumptions are
not violated.
• A pool of candidate regressors
• Variable selection problem
• Two conflicting objectives:
• Include as many regressors as possible, so that
the information content in these factors can
influence the predicted value of y.

3
• Include as few regressors as possible, because
the variance of the prediction increases as the
number of regressors increases.
• Which regression equation is "best"?
• Several algorithms can be used for variable
selection, but these procedures frequently
specify different subsets of the candidate
regressors as best.
• An idealized setting
• The correct functional forms of regressors are
known.
• No outliers or influential observations

4
• Residual analysis
• Iterative approach
• A variable selection strategy
• Check the correct functional forms, outliers and
influential observations
• None of the variable selection procedures are
guaranteed to produce the best regression
equation for a given data set.

5
• 9.1.2 Consequences of Model Misspecification
• The full model
• The subset model
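The equations for these two models did not survive the transcript. In the usual partitioned notation (assumed here rather than recovered from the slides), with p retained terms and r = K + 1 - p deleted terms:

```latex
% Full model: all K + 1 = p + r candidate terms
y = X\beta + \varepsilon = X_p \beta_p + X_r \beta_r + \varepsilon

% Subset model: only the p retained terms (intercept included in X_p)
y = X_p \beta_p + \varepsilon
```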

9
• Motivation for variable selection
• Deleting variables from the model can improve the
precision of the parameter estimates; the same is
true for the variance of the predicted response.
• Deleting variables from the model, however,
introduces bias.
• If the deleted variables have small effects, the
MSE of the biased estimates can nevertheless be
less than the variance of the unbiased estimates.
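The trade-off in the last bullet is the standard mean-square-error decomposition (notation assumed: the starred estimator is the subset-model estimator of a coefficient):

```latex
\mathrm{MSE}(\hat\beta^{*})
  = E\bigl[(\hat\beta^{*}-\beta)^{2}\bigr]
  = \mathrm{Var}(\hat\beta^{*}) + \bigl[E(\hat\beta^{*})-\beta\bigr]^{2}
```

Deleting variables shrinks the variance term but makes the squared-bias term nonzero, so the subset estimator has smaller MSE exactly when the variance reduction exceeds the squared bias.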

10
• 9.1.3 Criteria for Evaluating Subset Regression
Models
• Coefficient of Multiple Determination
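For a subset model containing p terms, the criterion is the usual definition (standard notation, assumed rather than recovered from the slide):

```latex
R_p^2 = \frac{SS_R(p)}{SS_T} = 1 - \frac{SS_{Res}(p)}{SS_T}
```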

11
• Aitkin (1974): an R²-adequate subset is one whose
regressor variables produce R² > R₀²
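The threshold R₀² is defined from the full (K + 1 term) model. The form below is the statement usually attributed to Aitkin (1974); the original slide did not survive, so treat the exact expression as an assumption:

```latex
R_0^2 = 1 - \bigl(1 - R_{K+1}^2\bigr)\bigl(1 + d_{\alpha,n,K}\bigr),
\qquad
d_{\alpha,n,K} = \frac{K\,F_{\alpha,K,n-K-1}}{n-K-1}
```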

17
• Uses of Regression and Model Evaluation Criteria
• Data description: minimize SSRes with as few
regressors as possible.
• Prediction and estimation: minimize the mean
square error of prediction; use the PRESS statistic.
• Parameter estimation: see Chapter 10.
• Control: minimize the standard errors of the
regression coefficients.
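The PRESS statistic mentioned above can be computed without refitting the model n times, using the hat-matrix identity e(i) = e_i / (1 - h_ii). A minimal NumPy sketch (function and data names are illustrative, not from the chapter):

```python
import numpy as np

def press_statistic(X, y):
    """PRESS: sum of squared leave-one-out prediction errors."""
    # Hat matrix H = X (X'X)^{-1} X'; solve() avoids an explicit inverse.
    H = X @ np.linalg.solve(X.T @ X, X.T)
    resid = y - H @ y                       # ordinary residuals e_i
    h = np.diag(H)                          # leverages h_ii
    return np.sum((resid / (1.0 - h)) ** 2)

# Synthetic demonstration data (intercept column plus one regressor)
rng = np.random.default_rng(1)
n = 25
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = 1.0 + 0.5 * X[:, 1] + rng.normal(scale=0.2, size=n)
press = press_statistic(X, y)
```

The identity makes PRESS a cheap by-product of a single fit, which is why it is a convenient prediction-oriented criterion.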

18
9.2 Computational Techniques for Variable
Selection
• 9.2.1 All Possible Regressions
• Fit all possible regression equations, and then
select the best one by some suitable criterion.
• Assume the model includes the intercept term.
• If there are K candidate regressors, there are 2^K
total equations to be estimated and examined.
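A brute-force sketch of the procedure (NumPy only; names are illustrative): enumerate every subset of the K candidates, always keep the intercept, and record R² and SSRes for each of the 2^K fits.

```python
from itertools import combinations

import numpy as np

def all_possible_regressions(X, y):
    """Fit every subset of the K candidate regressors (intercept always
    kept) and return (subset, R^2, SS_Res) for each of the 2**K fits."""
    n, K = X.shape
    sst = np.sum((y - y.mean()) ** 2)            # total sum of squares
    results = []
    for p in range(K + 1):
        for subset in combinations(range(K), p):
            Xd = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
            beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
            e = y - Xd @ beta
            ss_res = e @ e
            results.append((subset, 1.0 - ss_res / sst, ss_res))
    return results

# Synthetic stand-in for a small candidate pool (K = 3, so 8 equations)
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = 2.0 + 1.5 * X[:, 0] - 0.8 * X[:, 2] + rng.normal(scale=0.1, size=30)
fits = all_possible_regressions(X, y)
best = max(fits, key=lambda t: t[1])             # full model always wins on R^2
```

Because R² never decreases when a regressor is added, picking the maximum-R² subset always returns the full model; in practice one scans the criterion values across subset sizes instead, as the Hald data example does.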

19
• Example 9.1 The Hald Cement Data

21
• R2p criterion

29
• 9.2.2 Stepwise Regression Methods
• Forward selection
• Backward elimination
• Stepwise regression

31
• Backward elimination begins with a model
containing all K candidate regressors.
• The partial F-statistic is computed for each
regressor; the regressor with the smallest
F-statistic is dropped if that statistic is less
than FOUT.
• Stop when all partial F-statistics are greater
than FOUT.
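A NumPy sketch of the procedure above (illustrative names; the default FOUT = 4.0 is a common rule of thumb, not from the slide). Column 0 of X is assumed to be the intercept and is never dropped.

```python
import numpy as np

def _ss_res(X, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    return e @ e

def backward_elimination(X, y, f_out=4.0):
    n = X.shape[0]
    active = list(range(1, X.shape[1]))          # all candidate regressors
    while active:
        cols = [0] + active
        ss_full = _ss_res(X[:, cols], y)
        df_res = n - len(cols)
        # Partial F for each regressor: extra sum of squares when it is
        # the last term added, divided by the full-model mean square.
        f_stats = {}
        for j in active:
            reduced = [0] + [k for k in active if k != j]
            f_stats[j] = (_ss_res(X[:, reduced], y) - ss_full) / (ss_full / df_res)
        weakest = min(f_stats, key=f_stats.get)
        if f_stats[weakest] >= f_out:
            break                                # everyone clears FOUT: stop
        active.remove(weakest)
    return active

# Demonstration: y depends on the first two of three candidates
rng = np.random.default_rng(2)
n = 60
Z = rng.normal(size=(n, 3))
X = np.column_stack([np.ones(n), Z])
y = 1.0 + 2.0 * Z[:, 0] + 1.2 * Z[:, 1] + rng.normal(scale=0.5, size=n)
kept = backward_elimination(X, y)
```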

32
• Stepwise Regression
• A modification of forward selection.
• A regressor added at an earlier step may become
redundant; such a variable should then be dropped
from the model.
• Two cutoff values: FOUT and FIN.
• Usually choose FIN > FOUT: it is more difficult
to add a regressor than to delete one.
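A minimal stepwise sketch combining the two rules (illustrative names; FIN = 4.0 and FOUT = 3.9 are chosen only to satisfy FIN > FOUT): each forward step adds the candidate with the largest partial F if it exceeds FIN, then a backward pass drops any regressor whose partial F has fallen below FOUT. Column 0 of X is assumed to be the intercept.

```python
import numpy as np

def _ss_res(X, y):
    """Residual sum of squares of an ordinary least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    return e @ e

def stepwise(X, y, f_in=4.0, f_out=3.9):
    n = X.shape[0]
    active = []
    candidates = list(range(1, X.shape[1]))
    while True:
        # Forward step: best candidate enters only if its partial F > f_in.
        ss_cur = _ss_res(X[:, [0] + active], y)
        best_j, best_f = None, f_in
        for j in candidates:
            cols = [0] + active + [j]
            ss_new = _ss_res(X[:, cols], y)
            f = (ss_cur - ss_new) / (ss_new / (n - len(cols)))
            if f > best_f:
                best_j, best_f = j, f
        if best_j is None:
            return sorted(active)                # nothing qualifies: done
        active.append(best_j)
        candidates.remove(best_j)
        # Backward pass: re-examine regressors already in the model.
        dropped = True
        while dropped and len(active) > 1:
            dropped = False
            ss_full = _ss_res(X[:, [0] + active], y)
            df_res = n - 1 - len(active)
            f_stats = {}
            for j in active:
                reduced = [0] + [k for k in active if k != j]
                f_stats[j] = (_ss_res(X[:, reduced], y) - ss_full) / (ss_full / df_res)
            weakest = min(f_stats, key=f_stats.get)
            if f_stats[weakest] < f_out:
                active.remove(weakest)           # now redundant: send it back
                candidates.append(weakest)
                dropped = True

# Demonstration: y depends on candidates 1 and 3 out of four
rng = np.random.default_rng(3)
n = 80
Z = rng.normal(size=(n, 4))
X = np.column_stack([np.ones(n), Z])
y = 0.5 + 2.0 * Z[:, 0] - 1.5 * Z[:, 2] + rng.normal(scale=0.4, size=n)
chosen = stepwise(X, y)
```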