Transcript and Presenter's Notes



1
Chapter 9 Variable Selection and Model Building
  • Ray-Bing Chen
  • Institute of Statistics
  • National University of Kaohsiung

2
9.1 Introduction
  • 9.1.1 The Model-Building Problem
  • Ensure that the functional form of the model is
    correct and that the underlying assumptions are
    not violated.
  • Start from a pool of candidate regressors.
  • The variable selection problem: which subset of
    the candidates should enter the model?
  • Two conflicting objectives:
  • Include as many regressors as possible, so that
    the information content in these factors can
    influence the predicted value of y.

3
  • Include as few regressors as possible, because
    the variance of the prediction increases as the
    number of regressors increases.
  • What is the "best" regression equation?
  • Several algorithms can be used for variable
    selection, but these procedures frequently
    specify different subsets of the candidate
    regressors as best.
  • An idealized setting:
  • The correct functional forms of the regressors
    are known.
  • There are no outliers or influential observations.

4
  • Residual analysis
  • An iterative approach:
  • Apply a variable selection strategy.
  • Check the functional forms, outliers, and
    influential observations.
  • None of the variable selection procedures is
    guaranteed to produce the best regression
    equation for a given data set.

5
  • 9.1.2 Consequences of Model Misspecification
  • The full model
  • The subset model

9
  • Motivation for variable selection:
  • Deleting variables from the model can improve the
    precision of the parameter estimates. The same is
    true for the variance of the predicted response.
  • Deleting variables from the model introduces bias.
  • However, if the deleted variables have small
    effects, the MSE of the biased estimates will be
    less than the variance of the unbiased estimates.
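The bias-variance trade-off above can be illustrated with a small simulation (a sketch with synthetic data and made-up coefficients, not from the slides): when a deleted regressor has a small coefficient and is correlated with a retained one, the subset estimate of the retained coefficient can have a smaller MSE than the full-model estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
beta1, beta2 = 2.0, 0.05          # x2 has only a small effect
n, reps = 30, 2000

full_est, sub_est = [], []
for _ in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.9 * x1 + 0.4 * rng.normal(size=n)   # x2 correlated with x1
    y = beta1 * x1 + beta2 * x2 + rng.normal(size=n)
    Xf = np.column_stack([x1, x2])
    # Estimate of beta1 from the full model vs. the subset model (x1 only)
    full_est.append(np.linalg.lstsq(Xf, y, rcond=None)[0][0])
    sub_est.append(np.linalg.lstsq(x1[:, None], y, rcond=None)[0][0])

mse = lambda est: np.mean((np.asarray(est) - beta1) ** 2)
print(f"MSE of beta1-hat, full model:   {mse(full_est):.4f}")
print(f"MSE of beta1-hat, subset model: {mse(sub_est):.4f}")
# subset MSE is typically smaller here: the bias^2 introduced by dropping
# x2 is outweighed by the variance saved by removing the collinearity
```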

10
  • 9.1.3 Criteria for Evaluating Subset Regression
    Models
  • Coefficient of Multiple Determination

11
  • Aitkin (1974): an R2-adequate subset is one whose
    regressor variables produce R2 > R20, where R20 is
    a threshold computed from the full-model R2.

17
  • Uses of Regression and Model Evaluation Criteria
  • Data description: minimize SSRes with as few
    regressors as possible.
  • Prediction and estimation: minimize the mean
    square error of prediction; use the PRESS
    statistic.
  • Parameter estimation: Chapter 10.
  • Control: minimize the standard errors of the
    regression coefficients.
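As a sketch of the PRESS statistic mentioned above (the helper name `press_statistic` and the synthetic data are illustrative, not from the slides), PRESS can be computed from a single fit via the leverages, PRESS = sum of (e_i / (1 - h_ii))^2, and agrees with brute-force leave-one-out refitting:

```python
import numpy as np

def press_statistic(X, y):
    """PRESS = sum of squared leave-one-out prediction errors,
    computed from the ordinary residuals and hat-matrix leverages."""
    X = np.column_stack([np.ones(len(y)), X])   # add intercept
    H = X @ np.linalg.inv(X.T @ X) @ X.T        # hat matrix
    e = y - H @ y                               # ordinary residuals
    h = np.diag(H)                              # leverages h_ii
    return np.sum((e / (1 - h)) ** 2)

# Sanity check against brute-force leave-one-out refitting
rng = np.random.default_rng(1)
X = rng.normal(size=(20, 2))
y = 1.0 + X @ np.array([2.0, -1.0]) + rng.normal(size=20)

loo = 0.0
for i in range(len(y)):
    keep = np.arange(len(y)) != i
    Xi = np.column_stack([np.ones(keep.sum()), X[keep]])
    b = np.linalg.lstsq(Xi, y[keep], rcond=None)[0]
    pred = np.concatenate([[1.0], X[i]]) @ b
    loo += (y[i] - pred) ** 2

print(np.isclose(press_statistic(X, y), loo))  # True
```

The leverage form avoids refitting the model n times, which matters when PRESS is evaluated for many candidate subsets.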

18
9.2 Computational Techniques for Variable
Selection
  • 9.2.1 All Possible Regressions
  • Fit all possible regression equations, and then
    select the best one by some suitable criterion.
  • Assume the model includes the intercept term.
  • If there are K candidate regressors, there are
    2^K total equations to be estimated and examined.
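A minimal sketch of all possible regressions (the function name and synthetic data are illustrative): enumerate every subset of the K candidates, fit each with an intercept, and record R2; with K = 3 this gives 2^3 = 8 models.

```python
import itertools
import numpy as np

def all_possible_regressions(X, y, names):
    """Fit every subset of the K candidate regressors (intercept
    always included) and return (subset, R^2) pairs: 2^K models."""
    n, K = X.shape
    sst = np.sum((y - y.mean()) ** 2)
    results = []
    for k in range(K + 1):
        for subset in itertools.combinations(range(K), k):
            Xs = np.column_stack([np.ones(n)] + [X[:, j] for j in subset])
            b, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            ss_res = np.sum((y - Xs @ b) ** 2)
            results.append((tuple(names[j] for j in subset), 1 - ss_res / sst))
    return results

rng = np.random.default_rng(2)
X = rng.normal(size=(40, 3))
y = 3 * X[:, 0] - 2 * X[:, 1] + rng.normal(size=40)   # x3 is pure noise
models = all_possible_regressions(X, y, ["x1", "x2", "x3"])
print(len(models))   # 2^3 = 8 models
```

Note that R2 never decreases as regressors are added, which is why the slides pair it with penalized criteria (adjusted R2, Cp, PRESS) when comparing subsets of different sizes.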

19
  • Example 9.1 The Hald Cement Data

21
  • R2p criterion

29
  • 9.2.2 Stepwise Regression Methods
  • Three broad categories:
  • Forward selection
  • Backward elimination
  • Stepwise regression

31
  • Backward elimination
  • Start with a model containing all K candidate
    regressors.
  • Compute the partial F-statistic for each
    regressor, and drop the regressor with the
    smallest partial F-statistic if that value is
    less than FOUT.
  • Stop when all partial F-statistics are greater
    than FOUT.
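The backward-elimination loop above might be sketched as follows (the FOUT = 4.0 cutoff and the synthetic data are illustrative choices, not from the slides); each regressor's partial F is its extra sum of squares when added last, divided by the current model's MSRes:

```python
import numpy as np

def ss_res(X, y):
    """Residual sum of squares from a least-squares fit of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ b) ** 2)

def backward_elimination(X, y, f_out=4.0):
    """Repeatedly drop the regressor with the smallest partial
    F-statistic, as long as that statistic is below FOUT."""
    n = len(y)
    active = list(range(X.shape[1]))
    while active:
        Xc = np.column_stack([np.ones(n)] + [X[:, j] for j in active])
        ms_res = ss_res(Xc, y) / (n - len(active) - 1)
        f_stats = []
        for j in active:
            reduced = [k for k in active if k != j]
            Xr = np.column_stack([np.ones(n)] + [X[:, k] for k in reduced])
            f_stats.append((ss_res(Xr, y) - ss_res(Xc, y)) / ms_res)
        worst = int(np.argmin(f_stats))
        if f_stats[worst] >= f_out:
            break            # every remaining regressor passes FOUT
        active.pop(worst)    # drop the weakest regressor
    return active

rng = np.random.default_rng(3)
X = rng.normal(size=(50, 3))
y = 2 * X[:, 0] + 3 * X[:, 1] + rng.normal(size=50)   # x3 (index 2) is noise
print(backward_elimination(X, y))
```

With these data the two informative regressors have very large partial F values and survive, while the noise column is usually eliminated.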

32
  • Stepwise Regression
  • A modification of forward selection.
  • A regressor added at an earlier step may become
    redundant; such a variable should be dropped
    from the model.
  • Two cutoff values: FIN and FOUT.
  • Usually choose FIN > FOUT, making it more
    difficult to add a regressor than to delete one.
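The stepwise procedure can be sketched as follows (the cutoffs FIN = 4.0 and FOUT = 3.9 and the synthetic collinear data are illustrative choices): each forward addition is followed by a backward pass that re-tests the regressors already in the model.

```python
import numpy as np

def ss_res(X, y, cols):
    """Residual SS for a model with intercept plus columns `cols`."""
    Xc = np.column_stack([np.ones(len(y))] + [X[:, j] for j in cols])
    b, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    return np.sum((y - Xc @ b) ** 2)

def partial_f(X, y, cols, j):
    """Partial F for regressor j given the other columns in `cols`."""
    full = ss_res(X, y, cols)
    reduced = ss_res(X, y, [k for k in cols if k != j])
    ms_res = full / (len(y) - len(cols) - 1)
    return (reduced - full) / ms_res

def stepwise(X, y, f_in=4.0, f_out=3.9):
    """Forward selection with a backward look; FIN > FOUT makes it
    harder to add a regressor than to delete one."""
    active, K = [], X.shape[1]
    while True:
        # Forward step: add the candidate with the largest partial F, if > FIN.
        candidates = [j for j in range(K) if j not in active]
        if candidates:
            fs = [partial_f(X, y, active + [j], j) for j in candidates]
            best = int(np.argmax(fs))
            if fs[best] > f_in:
                active.append(candidates[best])
                # Backward step: drop regressors whose partial F fell below FOUT.
                while len(active) > 1:
                    fs_in = [partial_f(X, y, active, j) for j in active]
                    worst = int(np.argmin(fs_in))
                    if fs_in[worst] >= f_out:
                        break
                    active.pop(worst)
                continue
        return sorted(active)

rng = np.random.default_rng(4)
x1 = rng.normal(size=60)
x2 = x1 + 0.3 * rng.normal(size=60)   # x2 nearly duplicates x1
x3 = rng.normal(size=60)
y = 4 * x1 + 2 * x3 + rng.normal(size=60)
print(stepwise(np.column_stack([x1, x2, x3]), y))
```

Because x2 is nearly collinear with x1, its partial F given x1 is small, so the procedure typically retains only x1 and x3.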