Title: Searching for important factors among a large number of factors in DOE
Slide 1: Analyzing Supersaturated Designs Using Biased Estimation
QPRC 2003, by Adnan Bashir and James Simpson
May 23, 2003
Slide 2: Outline
- Introduction
- Motivating example
- Research objectives
- Proposed analysis method
- Multicollinearity and ridge regression
- Best subset model selection
- Simulated case studies
- Example
- Results
- Conclusions and recommendations
- Future research
Slide 3: Introduction
- Many studies and experiments contain a large number of variables
- Only a few of those variables are significant
- Which are those few factors? How do we find them?
- Screening experiments (design and analysis) are used to find those important factors
- Several methods and techniques are available for screening
Slide 4: Motivating example: composites production
Process diagram: raw materials and noise enter the process, which turns the input factors into the output responses.
Inputs (factors): resin flow rate (x1), type of resin (x2), gate location (x3), fiber weave (x4), mold complexity (x5), fiber weight (x6), curing type (x7), pressure (x8)
Outputs (responses): fiber permeability, product quality, tensile strength
Slide 5: Motivating example (continued)
Response y: tensile strength
Each experiment costs $500 and requires 8 hours; the budget is $3,000 (6 experiments)
Coding: +1 = high level, -1 = low level
- Supersaturated designs: number of factors m > number of runs n
- Columns are not orthogonal
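The supersaturated setting can be illustrated with a short sketch (the design below is randomly generated for illustration, not the actual design from the study): with more factors than runs, the design columns cannot all be mutually orthogonal.

```python
import numpy as np

# A hypothetical supersaturated design: n = 6 runs, m = 8 two-level factors.
rng = np.random.default_rng(0)
X = rng.choice([-1, 1], size=(6, 8))

n, m = X.shape
assert m > n  # supersaturated: more factors than runs

# Off-diagonal entries of X'X measure non-orthogonality between columns.
# Since 8 mutually orthogonal columns cannot exist in 6-dimensional space,
# at least one pair of columns must have a nonzero inner product.
XtX = X.T @ X
off_diag = XtX - np.diag(np.diag(XtX))
print("max |column inner product|:", np.abs(off_diag).max())
```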
Slide 6: Research objectives
- Propose an efficient technique to screen the important factors in an experiment with a small number of runs
- Construct improved supersaturated designs
- Develop an accurate, reliable, and efficient technique to analyze supersaturated designs
Slide 7: Analysis of SSDs: current methods
- Stepwise regression, the most commonly used: Lin (1993, 1995), Wang (1995), Nguyen (1996)
- All possible regressions: Abraham, Chipman, and Vijayan (1999)
- Bayesian method: Box and Meyer (1993)
- Other techniques investigated: principal components, partial least squares, and flexible regression methods (MARS, CART)
Slide 8: Analysis of SSDs: proposed method
- Modified best subset via ridge regression (MBS-RR)
- Ridge regression to handle multicollinearity
- Best subset for variable selection at each model size
- Criterion-based selection to identify the best model
Slide 9: Ridge regression: motivation
- Ordinary least squares: b = (X'X)^(-1) X'y
- Ridge regression: consider adding k > 0 to each diagonal element of X'X, say k = 0.1
- Consider a centered, scaled matrix X
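A minimal numeric sketch of the idea, using hypothetical collinear data (not the study's data): adding k = 0.1 to the diagonal of X'X sharply reduces its condition number.

```python
import numpy as np

# Two nearly collinear regressors (hypothetical data, for illustration).
rng = np.random.default_rng(1)
x1 = rng.normal(size=20)
x2 = x1 + 0.01 * rng.normal(size=20)        # almost a copy of x1
X = np.column_stack([x1, x2])
X = (X - X.mean(axis=0)) / X.std(axis=0)    # center and scale

XtX = X.T @ X
k = 0.1
cond_ols = np.linalg.cond(XtX)                      # huge: X'X nearly singular
cond_ridge = np.linalg.cond(XtX + k * np.eye(2))    # far better conditioned
print("cond(X'X)      =", cond_ols)
print("cond(X'X + kI) =", cond_ridge)
```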
Slide 10: Ridge regression
- Ridge regression estimates: b_R = (X'X + kI)^(-1) X'y
- where k >= 0 is referred to as a shrinkage parameter
- Thus b_R = [I + k(X'X)^(-1)]^(-1) b, a shrunken version of the least squares estimate b
- Figure: geometric interpretation of ridge regression
Slide 11: Ridge regression (continued): shrinkage parameter
- Hoerl and Kennard (1975) suggest k = p s^2 / (b'b)
- where p is the number of parameters
- s^2 (the residual variance estimate) and b are found from the least squares solution
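The ridge estimator with the Hoerl-Kennard choice of k can be sketched as follows, on hypothetical simulated data (variable names and data are illustrative only):

```python
import numpy as np

# Ridge estimates with the Hoerl-Kennard (1975) shrinkage parameter
# k = p * s2 / (b'b), computed from the OLS fit.
rng = np.random.default_rng(2)
n, p = 30, 3
X = rng.normal(size=(n, p))
X = (X - X.mean(axis=0)) / X.std(axis=0)   # centered, scaled regressors
beta_true = np.array([2.0, 0.0, -1.5])     # hypothetical true effects
y = X @ beta_true + rng.normal(scale=0.5, size=n)
y = y - y.mean()                           # centered response

# OLS solution and residual variance estimate
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
s2 = np.sum((y - X @ b_ols) ** 2) / (n - p)

# Hoerl-Kennard shrinkage parameter and ridge estimator
k = p * s2 / (b_ols @ b_ols)
b_ridge = np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

print("k =", k)
print("OLS:  ", b_ols)
print("ridge:", b_ridge)    # shrunk toward zero relative to OLS
```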
Slide 12: Shrinkage parameter: ridge trace
Figure: ridge trace for nine regressors (adapted from Montgomery, Peck, and Vining, 2001)
Slide 13: Proposed analysis method (flowchart)
1. Read X, y
2. Select the best 1-factor model by OLS (k = 0)
3. Calculate k, and find the best 2-factor model by all possible subsets
4. Add one factor at a time to the best 2-factor model, from the remaining variables, to get the best 3-factor model
(continued on the next slide)
Slide 14: Proposed analysis method (continued)
5. If the stopping rule is satisfied, stop; otherwise add one factor at a time to the best 3-factor model, from the remaining variables, to get the best 4-factor model
6. Continue in the same way, checking the stopping rule at each step, up to adding one factor at a time to the best 7-factor model to get the best 8-factor model
7. The final model is the one with minimum Cp
Slide 15: Selecting the best model
- Candidate models are compared by Mallows' Cp
- Stopping rule: based on the change in Cp, where diff is a user-defined tolerance
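The search on the preceding slides can be sketched as a simplified forward best-subset routine. This is an illustration under simplifying assumptions, not the authors' MBS-RR code: k is held fixed after the first step, and the Cp-based stopping rule is replaced by a fixed maximum model size.

```python
import numpy as np
from itertools import combinations

def ridge_sse(X, y, cols, k):
    """Residual sum of squares of a ridge fit on the given columns."""
    Xs = X[:, cols]
    b = np.linalg.solve(Xs.T @ Xs + k * np.eye(len(cols)), Xs.T @ y)
    r = y - Xs @ b
    return float(r @ r)

def mbs_rr_sketch(X, y, max_size=4, k=0.1):
    """Simplified sketch of the proposed search: best 1-factor model by
    OLS (k = 0), best 2-factor model by all possible pairs with ridge,
    then one factor added at a time to the current best model."""
    n, m = X.shape
    best = min(([j] for j in range(m)), key=lambda c: ridge_sse(X, y, c, 0.0))
    path = [best]
    best = min((list(c) for c in combinations(range(m), 2)),
               key=lambda c: ridge_sse(X, y, c, k))
    path.append(best)
    while len(best) < max_size:
        remaining = [j for j in range(m) if j not in best]
        best = min((best + [j] for j in remaining),
                   key=lambda c: ridge_sse(X, y, c, k))
        path.append(best)
    return path

# Hypothetical supersaturated example: 12 runs, 10 factors, two active.
rng = np.random.default_rng(3)
X = rng.choice([-1.0, 1.0], size=(12, 10))
y = 3.0 * X[:, 1] - 2.0 * X[:, 6] + rng.normal(scale=0.3, size=12)
path = mbs_rr_sketch(X, y)
print("best subsets by size:", path)
```

In the published method the final model would then be chosen among the subsets on this path by minimum Cp, subject to the user-defined tolerance diff.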
Slide 16: Method comparison: Monte Carlo simulation, design of experiments
- Factors considered in the simulation study (table)
- Resolution III fractional factorial design matrix
Slide 17: Analysis method comparison
- Performance measures: Type I and Type II errors
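These performance measures can be computed from the true active set and the set a screening method selects; the sets below are hypothetical, for illustration only.

```python
def error_rates(true_active, selected, m):
    """Type I: inactive factors wrongly declared active, as a fraction
    of the inactive factors.  Type II: active factors missed, as a
    fraction of the active factors."""
    true_active, selected = set(true_active), set(selected)
    n_inactive = m - len(true_active)
    type1 = len(selected - true_active) / n_inactive
    type2 = len(true_active - selected) / len(true_active)
    return type1, type2

# 10 factors; factors {1, 6} truly active; the method selected {1, 4}.
t1, t2 = error_rates({1, 6}, {1, 4}, m=10)
print(t1, t2)  # 0.125 0.5  (1 of 8 inactive selected; 1 of 2 active missed)
```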
Slide 18: Case studies with corresponding models
Slide 19: Method comparison results: Type I errors
Slide 20: Method comparison results: Type II errors
Slide 21: Factors contributing to method performance: Type II errors
Figure: Type II errors by contributing factor, stepwise method
Slide 22: Factors contributing to method performance: Type II errors
Figure: Type II errors by contributing factor, proposed method
Slide 23: Summary results
Factor key: A = number of runs, B = number of factors, C = multicollinearity, D = error variance, E = number of significant factors
Slide 24: Conclusions and recommendations
- For SSD analysis: best subset selection with ridge regression
- Use ridge regression estimation
- The best subset variable selection method outperforms stepwise regression
Slide 25: Future research
- Analyzing SSDs:
- Multiple criteria for selecting the best model
- All possible subsets for the 3-factor model
- Streamlined program code
- Real-life case studies
- Genetic algorithm for variable selection
Slide 26: Acknowledgements
- Dr. Carroll Croarkin, chair of the selection committee for the Mary G. Natrella Scholarship
- The Mary G. Natrella Scholarship selection committee
- Dr. Simpson, supervisor