Title: Chapter 3 Experiments with a Single Factor: The Analysis of Variance
3.1 An Example
- Chapter 2: a single-factor experiment with two levels of the factor
- Consider single-factor experiments with a levels of the factor, $a \geq 2$
- Example
- The tensile strength of a new synthetic fiber.
- The weight percent of cotton
- Five levels: 15, 20, 25, 30, 35
- a = 5 and n = 5
- Does changing the cotton weight percent change the mean tensile strength?
- Is there an optimum level for cotton content?
3.2 The Analysis of Variance
- a levels (treatments) of a factor and n replicates for each level.
- $y_{ij}$: the jth observation taken under factor level or treatment i.
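- Models for the Data
- Means model: $y_{ij} = \mu_i + \epsilon_{ij}$, $i = 1, \ldots, a$, $j = 1, \ldots, n$
- $y_{ij}$ is the ijth observation,
- $\mu_i$ is the mean of the ith factor level
- $\epsilon_{ij}$ is a random error with mean zero
- Let $\mu_i = \mu + \tau_i$, where $\mu$ is the overall mean and $\tau_i$ is the ith treatment effect
- Effects model: $y_{ij} = \mu + \tau_i + \epsilon_{ij}$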
- Linear statistical model
- One-way or single-factor analysis of variance model
- Completely randomized design: the experiments are performed in random order so that the environment in which the treatments are applied is as uniform as possible.
- For hypothesis testing, the model errors are assumed to be normally and independently distributed random variables with mean zero and variance $\sigma^2$, i.e. $y_{ij} \sim N(\mu + \tau_i, \sigma^2)$
- Fixed effects model: the a levels have been specifically chosen by the experimenter.
3.3 Analysis of the Fixed Effects Model
- Interested in testing the equality of the a treatment means, where $E(y_{ij}) = \mu_i = \mu + \tau_i$, $i = 1, 2, \ldots, a$
- $H_0: \mu_1 = \cdots = \mu_a$ vs. $H_1: \mu_i \neq \mu_j$ for at least one pair (i, j)
- Constraint: $\sum_{i=1}^{a} \tau_i = 0$
- $H_0: \tau_1 = \cdots = \tau_a = 0$ vs. $H_1: \tau_i \neq 0$ for at least one i
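- Notation: $y_{i.} = \sum_{j=1}^{n} y_{ij}$, $\bar{y}_{i.} = y_{i.}/n$, $y_{..} = \sum_{i=1}^{a} \sum_{j=1}^{n} y_{ij}$, $\bar{y}_{..} = y_{..}/N$, where $N = an$
- 3.3.1 Decomposition of the Total Sum of Squares
- Partition the total variability into its component parts.
- The total sum of squares (a measure of overall variability in the data): $SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{..})^2$
- Degrees of freedom: $an - 1 = N - 1$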
- $SS_{Treatments} = n \sum_{i=1}^{a} (\bar{y}_{i.} - \bar{y}_{..})^2$: sum of squares of the differences between the treatment averages and the grand average (sum of squares due to treatments), with $a - 1$ degrees of freedom
- $SS_E = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \bar{y}_{i.})^2$: sum of squares of the differences of observations within treatments from the treatment average (sum of squares due to error), with $N - a$ degrees of freedom.
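- A large value of $SS_{Treatments}$ reflects large differences in treatment means
- A small value of $SS_{Treatments}$ likely indicates no differences in treatment means
- $df_{Total} = df_{Treatments} + df_{Error}$, i.e. $N - 1 = (a - 1) + (N - a)$
- If there are no differences between the a treatment means, $SS_{Treatments}/(a - 1)$ also estimates $\sigma^2$, as $SS_E/(N - a)$ always does.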
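The decomposition can be checked numerically. The following is a minimal sketch; the 25 tensile strength observations are an assumption (recalled from the textbook example, they are not listed in these slides), and `ss_decomposition` is a hypothetical helper name.

```python
# Tensile strength (psi) by cotton weight percent.
# ASSUMPTION: values recalled from the textbook example, not given in the slides.
data = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}

def ss_decomposition(groups):
    """Return (SS_T, SS_Treatments, SS_E) for a dict of level -> observations."""
    all_obs = [y for ys in groups.values() for y in ys]
    grand_mean = sum(all_obs) / len(all_obs)
    ss_total = sum((y - grand_mean) ** 2 for y in all_obs)
    ss_treat = 0.0
    ss_error = 0.0
    for ys in groups.values():
        level_mean = sum(ys) / len(ys)
        ss_treat += len(ys) * (level_mean - grand_mean) ** 2   # between treatments
        ss_error += sum((y - level_mean) ** 2 for y in ys)     # within treatments
    return ss_total, ss_treat, ss_error

ss_t, ss_treat, ss_e = ss_decomposition(data)
print(f"SS_T = {ss_t:.2f}, SS_Treatments = {ss_treat:.2f}, SS_E = {ss_e:.2f}")
```

With these observations the three sums of squares agree with the ANOVA table shown later in the chapter (636.96 = 475.76 + 161.20).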
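- Mean squares: $MS_{Treatments} = SS_{Treatments}/(a - 1)$, $MS_E = SS_E/(N - a)$
- 3.3.2 Statistical Analysis
- Assumption: $\epsilon_{ij}$ are normally and independently distributed with mean zero and variance $\sigma^2$
- $SS_E/\sigma^2 \sim \chi^2_{N-a}$; under $H_0$, $SS_T/\sigma^2 \sim \chi^2_{N-1}$ and $SS_{Treatments}/\sigma^2 \sim \chi^2_{a-1}$, and $SS_E/\sigma^2$ and $SS_{Treatments}/\sigma^2$ are independent (Theorem 3.1)
- $H_0: \tau_1 = \cdots = \tau_a = 0$ vs. $H_1: \tau_i \neq 0$ for at least one i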
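- Reject $H_0$ if $F_0 = MS_{Treatments}/MS_E > F_{\alpha, a-1, N-a}$
- Rewrite the sums of squares in computing form: $SS_T = \sum_i \sum_j y_{ij}^2 - y_{..}^2/N$, $SS_{Treatments} = \frac{1}{n} \sum_i y_{i.}^2 - y_{..}^2/N$, $SS_E = SS_T - SS_{Treatments}$
- See page 71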
Response: Strength
ANOVA for Selected Factorial Model
Analysis of variance table [Partial sum of squares]

Source        Sum of Squares   DF   Mean Square   F Value   Prob > F
Model                 475.76    4        118.94     14.76   < 0.0001
A                     475.76    4        118.94     14.76   < 0.0001
Pure Error            161.20   20          8.06
Cor Total             636.96   24

Std. Dev.     2.84     R-Squared         0.7469
Mean         15.04     Adj R-Squared     0.6963
C.V.         18.88     Pred R-Squared    0.6046
PRESS       251.88     Adeq Precision    9.294
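Most of the summary statistics in this output follow directly from the sums of squares and degrees of freedom. The sketch below recomputes them from the table values only; the PRESS line assumes a balanced one-way layout, where every observation has leverage 1/n, and the variable names are illustrative.

```python
import math

# Values read from the ANOVA table.
ss_treat, ss_error, ss_total = 475.76, 161.20, 636.96
df_treat, df_error, df_total = 4, 20, 24
grand_mean = 15.04
n = 5  # replicates per level (a = 5, n = 5)

ms_treat = ss_treat / df_treat
ms_error = ss_error / df_error
f0 = ms_treat / ms_error                      # F Value
std_dev = math.sqrt(ms_error)                 # Std. Dev.
r2 = ss_treat / ss_total                      # R-Squared
adj_r2 = 1 - (ss_error / df_error) / (ss_total / df_total)
cv = 100 * std_dev / grand_mean               # C.V. in percent
# ASSUMPTION: balanced one-way layout, so each leverage is 1/n and
# PRESS = sum (e_ij / (1 - 1/n))^2 = SS_E / (1 - 1/n)^2.
press = ss_error / (1 - 1 / n) ** 2
pred_r2 = 1 - press / ss_total

print(f"F0={f0:.2f}  SD={std_dev:.2f}  R2={r2:.4f}  AdjR2={adj_r2:.4f}")
print(f"CV={cv:.2f}  PRESS={press:.2f}  PredR2={pred_r2:.4f}")
```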
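- 3.3.3 Estimation of the Model Parameters
- Model: $y_{ij} = \mu + \tau_i + \epsilon_{ij}$
- Estimators: $\hat{\mu} = \bar{y}_{..}$, $\hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}$, $\hat{\mu}_i = \hat{\mu} + \hat{\tau}_i = \bar{y}_{i.}$
- Confidence intervals: $\bar{y}_{i.} \pm t_{\alpha/2, N-a} \sqrt{MS_E/n}$ for $\mu_i$, and $\bar{y}_{i.} - \bar{y}_{j.} \pm t_{\alpha/2, N-a} \sqrt{2 MS_E/n}$ for $\mu_i - \mu_j$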
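As an illustration of the interval estimates, here is a small sketch using $MS_E = 8.06$ from the ANOVA table. The critical value and the two treatment averages are assumptions: 2.086 is the tabulated $t_{0.025, 20}$, and 21.6 and 9.8 are the averages at 30% and 15% cotton as recalled from the textbook data.

```python
import math

ms_error = 8.06   # MS_E from the ANOVA table
n = 5             # replicates per treatment
t_crit = 2.086    # ASSUMPTION: t_{0.025, 20} read from a t table

def mean_ci(ybar):
    """95% confidence interval on a single treatment mean."""
    half = t_crit * math.sqrt(ms_error / n)
    return ybar - half, ybar + half

def difference_ci(ybar_i, ybar_j):
    """95% confidence interval on the difference of two treatment means."""
    half = t_crit * math.sqrt(2 * ms_error / n)
    diff = ybar_i - ybar_j
    return diff - half, diff + half

# ASSUMPTION: treatment averages at 30% and 15% cotton, not given in the slides.
lo, hi = mean_ci(21.6)
dlo, dhi = difference_ci(21.6, 9.8)
print(f"mu_4: ({lo:.2f}, {hi:.2f})")
print(f"mu_4 - mu_1: ({dlo:.2f}, {dhi:.2f})")
```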
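- Example 3.3 (page 75)
- Simultaneous confidence intervals (Bonferroni method): construct a set of r simultaneous confidence intervals on treatment means with overall confidence level at least $100(1 - \alpha)\%$ by making each one a $100(1 - \alpha/r)\%$ C.I.
- 3.3.4 Unbalanced Data
- Let $n_i$ observations be taken under treatment i, $i = 1, 2, \ldots, a$, and $N = \sum_i n_i$; then $SS_T = \sum_i \sum_j y_{ij}^2 - y_{..}^2/N$ and $SS_{Treatments} = \sum_i y_{i.}^2/n_i - y_{..}^2/N$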
- Two advantages of a balanced design:
- 1. The test statistic is relatively insensitive to small departures from the assumption of equal variances for the a treatments if the sample sizes are equal.
- 2. The power of the test is maximized if the samples are of equal size.
3.4 Model Adequacy Checking
- Assumptions: $y_{ij} \sim N(\mu + \tau_i, \sigma^2)$
- The examination of residuals
- Definition of residual: $e_{ij} = y_{ij} - \hat{y}_{ij}$, where $\hat{y}_{ij} = \bar{y}_{i.}$
- The residuals should be structureless.
- 3.4.1 The Normality Assumption
- Plot a histogram of the residuals
- Plot a normal probability plot of the residuals
- See Table 3-6
- The error distribution may be
- Slightly skewed (right tail is longer than left tail)
- Light-tailed (the left tail of the error distribution is thinner than the corresponding tail of the standard normal)
- Outliers
- Possible causes of outliers: errors in calculations, data coding, copying, etc.
- Sometimes outliers are more informative than the rest of the data.
- Detecting outliers: examine the standardized residuals, $d_{ij} = e_{ij}/\sqrt{MS_E}$
- 3.4.2 Plot of Residuals in Time Sequence
- Plotting the residuals in time order of data collection is helpful in detecting correlation between the residuals.
- This checks the independence assumption
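The residual checks start from the residuals themselves. A minimal sketch of computing residuals and standardized residuals for the tensile strength example follows; the observations are an assumption (recalled from the textbook, not listed in the slides), and the cutoff of 3 is a common rule of thumb rather than a fixed rule.

```python
import math

# ASSUMPTION: tensile strength data recalled from the textbook example.
data = {
    15: [7, 7, 15, 11, 9],
    20: [12, 17, 12, 18, 18],
    25: [14, 18, 18, 19, 19],
    30: [19, 25, 22, 19, 23],
    35: [7, 10, 11, 15, 11],
}
a = len(data)
N = sum(len(ys) for ys in data.values())

# Residual = observation minus its treatment average (the fitted value).
residuals = {}
for level, ys in data.items():
    ybar = sum(ys) / len(ys)
    residuals[level] = [y - ybar for y in ys]

ms_error = sum(e ** 2 for es in residuals.values() for e in es) / (N - a)
standardized = {
    level: [e / math.sqrt(ms_error) for e in es]
    for level, es in residuals.items()
}

largest = max(abs(d) for ds in standardized.values() for d in ds)
# Rule of thumb: |d_ij| above about 3 marks a potential outlier.
flagged = [(level, d) for level, ds in standardized.items() for d in ds if abs(d) > 3]
print(f"largest |d_ij| = {largest:.2f}, flagged = {flagged}")
```

Plotting these residuals against run order or fitted values would need the actual run sequence, which the slides do not give.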
- 3.4.3 Plot of Residuals Versus Fitted Values
- Plot the residuals versus the fitted values
- The plot should be structureless
- Nonconstant variance: the variance of the observations increases as the magnitude of the observation increases, i.e. $\sigma^2$ grows with $y_{ij}$
- If the factor levels having the larger variances also have the smaller sample sizes, the actual type I error rate is larger than anticipated.
- Variance-stabilizing transformation
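- Statistical Tests for Equality of Variance: $H_0: \sigma_1^2 = \cdots = \sigma_a^2$ vs. $H_1$: not all $\sigma_i^2$ are equal
- Bartlett's test: $\chi_0^2 = 2.3026\, q/c$, where $q = (N - a) \log_{10} S_p^2 - \sum_i (n_i - 1) \log_{10} S_i^2$, $c = 1 + \frac{1}{3(a-1)} \left( \sum_i (n_i - 1)^{-1} - (N - a)^{-1} \right)$, and $S_p^2 = \sum_i (n_i - 1) S_i^2/(N - a)$
- Reject the null hypothesis if $\chi_0^2 > \chi^2_{\alpha, a-1}$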
- Example 3.4: computation of the test statistic $\chi_0^2$
- Bartlett's test is sensitive to the normality assumption
- The modified Levene test
- Use the absolute deviation of each observation from its treatment median, $d_{ij} = |y_{ij} - \tilde{y}_i|$.
- If the mean deviations are equal, the variances of the observations in all treatments will be the same.
- The test statistic for Levene's test is the usual ANOVA F statistic for testing equality of means, applied to the $d_{ij}$.
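Both variance tests are easy to compute by hand. The sketch below implements Bartlett's statistic (using natural logarithms, which is equivalent to the $2.3026 \log_{10}$ form) and the modified Levene F statistic. The data are an assumption (the tensile strength observations as recalled from the textbook), and the function names are hypothetical.

```python
import math
from statistics import mean, median, variance

# ASSUMPTION: tensile strength data recalled from the textbook example.
groups = [
    [7, 7, 15, 11, 9],
    [12, 17, 12, 18, 18],
    [14, 18, 18, 19, 19],
    [19, 25, 22, 19, 23],
    [7, 10, 11, 15, 11],
]

def bartlett_statistic(groups):
    """Bartlett's chi-square statistic; compare with chi^2_{alpha, a-1}."""
    a = len(groups)
    N = sum(len(g) for g in groups)
    s2 = [variance(g) for g in groups]                     # sample variances S_i^2
    sp2 = sum((len(g) - 1) * v for g, v in zip(groups, s2)) / (N - a)
    q = (N - a) * math.log(sp2) - sum(
        (len(g) - 1) * math.log(v) for g, v in zip(groups, s2)
    )
    c = 1 + (sum(1 / (len(g) - 1) for g in groups) - 1 / (N - a)) / (3 * (a - 1))
    return q / c

def one_way_f(groups):
    """ANOVA F statistic for equality of the group means."""
    a = len(groups)
    N = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / N
    ss_treat = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_error = sum((y - mean(g)) ** 2 for g in groups for y in g)
    return (ss_treat / (a - 1)) / (ss_error / (N - a))

def modified_levene_f(groups):
    """F statistic on absolute deviations from each treatment median."""
    deviations = [[abs(y - median(g)) for y in g] for g in groups]
    return one_way_f(deviations)

chi2_0 = bartlett_statistic(groups)
levene_f = modified_levene_f(groups)
print(f"Bartlett chi2_0 = {chi2_0:.2f}, modified Levene F0 = {levene_f:.2f}")
```

For these data both statistics are small, so neither test gives evidence against equal variances.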
- Example 3.5
- Four procedures for estimating flood flow frequency (see Table 3.7)
- ANOVA table (Table 3.8)
- The plot of residuals vs. fitted values (Figure 3.7)
- Modified Levene's test: $F_0 = 4.55$ with P-value 0.0137. Reject the null hypothesis of equal variances.
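- Let $E(y) = \mu$ and $\sigma_y \propto \mu^{\alpha}$
- Find $y^* = y^{\lambda}$ that yields a constant variance.
- $\sigma_{y^*} \propto \mu^{\lambda + \alpha - 1}$, so choosing $\lambda = 1 - \alpha$ makes the variance of $y^*$ constant
- Variance-Stabilizing Transformations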
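- How to find $\alpha$?
- Use $\log \sigma_{y_i} = \log \theta + \alpha \log \mu_i$: plot $\log S_i$ against $\log \bar{y}_{i.}$; the slope of the line estimates $\alpha$
- See Figure 3.8, Table 3.10 and Figure 3.9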
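The slope estimate can be sketched as a simple least-squares fit on the log scale. The treatment averages and standard deviations below are hypothetical, constructed so that $\sigma = 0.5\,\mu^{0.5}$ exactly; they are not the values in Table 3.10.

```python
import math

def ls_slope(xs, ys):
    """Least-squares slope of y on x."""
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx

# HYPOTHETICAL treatment averages and standard deviations, built so that
# the standard deviation is proportional to the square root of the mean.
means = [4.0, 9.0, 16.0, 25.0]
sds = [0.5 * m ** 0.5 for m in means]

alpha_hat = ls_slope([math.log(m) for m in means], [math.log(s) for s in sds])
lam = 1 - alpha_hat   # power for the transformation y* = y**lam
print(f"alpha = {alpha_hat:.2f}, lambda = {lam:.2f}")
```

Here the estimated slope is 0.5, which points to the square root transformation. With real data the points will not fall exactly on a line, and the estimate is usually rounded to a convenient power.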
3.5 Practical Interpretation of Results
- Conduct the experiment → perform the statistical analysis → investigate the underlying assumptions → draw practical conclusions
- 3.5.1 A Regression Model
- Qualitative factor: compare the differences between the levels of the factor.
- Quantitative factor: develop an interpolation equation for the response variable.
- Regression analysis
- See Figure 3.1
Final Equation in Terms of Actual Factors:
Strength = 62.61143 - 9.01143 * Cotton Weight + 0.48143 * Cotton Weight^2 - 7.60000E-003 * Cotton Weight^3
This is an empirical model of the experimental results
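The fitted cubic can be evaluated directly to interpolate between the tested levels. In this sketch the plus sign on the quadratic term is an assumption (the operators were lost in the transcript; a positive coefficient gives fitted values in the range of the data), and the grid search over 15 to 35 percent is only an illustration of looking for an optimum cotton content.

```python
def strength(x):
    """Empirical cubic model for tensile strength at cotton weight percent x.

    ASSUMPTION: the quadratic coefficient is positive; signs were lost in
    the transcript of the software output.
    """
    return 62.61143 - 9.01143 * x + 0.48143 * x ** 2 - 7.60000e-3 * x ** 3

levels = [15, 20, 25, 30, 35]
fitted = {x: strength(x) for x in levels}
for x, y in fitted.items():
    print(f"{x}% cotton: predicted strength {y:.2f}")

# Coarse grid search for the cotton content that maximizes predicted strength.
grid = [15 + 0.1 * k for k in range(201)]   # 15.0, 15.1, ..., 35.0
best_x = max(grid, key=strength)
print(f"predicted maximum near {best_x:.1f}% cotton")
```

The model is only trustworthy inside the range of levels that were actually run.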
- 3.5.2 Comparisons Among Treatment Means
- If the hypothesis of equal treatment means is rejected, we don't know which specific means are different
- Determining which specific means differ following an ANOVA is called the multiple comparisons problem
- 3.5.3 Graphical Comparisons of Means
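- 3.5.4 Contrasts
- A contrast: a linear combination of the parameters of the form $\Gamma = \sum_{i=1}^{a} c_i \mu_i$, where $\sum_i c_i = 0$
- $H_0: \Gamma = 0$ vs. $H_1: \Gamma \neq 0$
- Two methods for this test: a t-test, $t_0 = \sum_i c_i \bar{y}_{i.} / \sqrt{(MS_E/n) \sum_i c_i^2}$, or the equivalent F-test, $F_0 = t_0^2 = SS_C/MS_E$ with $SS_C = (\sum_i c_i y_{i.})^2 / (n \sum_i c_i^2)$
- The C.I. for a contrast $\Gamma$: $\sum_i c_i \bar{y}_{i.} \pm t_{\alpha/2, N-a} \sqrt{(MS_E/n) \sum_i c_i^2}$
- Unequal sample sizes: require $\sum_i n_i c_i = 0$ and use $t_0 = \sum_i c_i \bar{y}_{i.} / \sqrt{MS_E \sum_i c_i^2/n_i}$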
- 3.5.5 Orthogonal Contrasts
- Two contrasts with coefficients $\{c_i\}$ and $\{d_i\}$ are orthogonal if $\sum_i c_i d_i = 0$
- For a treatments, a set of $a - 1$ orthogonal contrasts partitions the sum of squares due to treatments into $a - 1$ independent single-degree-of-freedom components. Thus, tests performed on orthogonal contrasts are independent.
- See Example 3.6 (Page 94)
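The partition property can be verified numerically. The sketch below uses one set of four mutually orthogonal contrasts chosen for illustration; the treatment totals are an assumption (computed from the tensile strength data as recalled from the textbook), and each contrast sum of squares uses $SS_C = (\sum_i c_i y_{i.})^2 / (n \sum_i c_i^2)$.

```python
from itertools import combinations

# ASSUMPTION: treatment totals y_i. for 15, 20, 25, 30, 35 percent cotton.
totals = [49, 77, 88, 108, 54]
n = 5  # replicates per treatment

# One illustrative set of a - 1 = 4 mutually orthogonal contrasts.
contrasts = [
    [0, 0, 0, -1, 1],
    [1, 0, 1, -1, -1],
    [1, 0, -1, 0, 0],
    [-1, 4, -1, -1, -1],
]

def is_orthogonal(c, d):
    return sum(ci * di for ci, di in zip(c, d)) == 0

def contrast_ss(c):
    """Single-degree-of-freedom sum of squares for contrast c."""
    value = sum(ci * yi for ci, yi in zip(c, totals))
    return value ** 2 / (n * sum(ci ** 2 for ci in c))

all_orthogonal = all(is_orthogonal(c, d) for c, d in combinations(contrasts, 2))
ss_parts = [contrast_ss(c) for c in contrasts]
print("orthogonal:", all_orthogonal)
print("SS_C:", [round(s, 2) for s in ss_parts], "sum =", round(sum(ss_parts), 2))
```

The four components add up to the treatment sum of squares, 475.76, in the ANOVA table.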
- 3.5.6 Scheffé's Method for Comparing All Contrasts
- Scheffé (1953) proposed a method for comparing any and all possible contrasts between treatment means.
- See Page 95 and 96
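- 3.5.7 Comparing Pairs of Treatment Means
- Compare all pairs of the a treatment means
- Tukey's Test: two means differ significantly if $|\bar{y}_{i.} - \bar{y}_{j.}| > T_\alpha = q_\alpha(a, f) \sqrt{MS_E/n}$, where f is the error degrees of freedom
- The studentized range statistic: $q = (\bar{y}_{max} - \bar{y}_{min}) / \sqrt{MS_E/n}$
- See Example 3.7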
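Tukey's procedure reduces to comparing every pairwise difference with one critical value, $T_\alpha = q_\alpha(a, f)\sqrt{MS_E/n}$. A minimal sketch follows; the treatment averages and the tabulated value $q_{0.05}(5, 20) = 4.23$ are assumptions (recalled from the textbook data and a studentized range table), while $MS_E = 8.06$ comes from the ANOVA table.

```python
import math
from itertools import combinations

# ASSUMPTION: treatment averages by cotton weight percent.
means = {15: 9.8, 20: 15.4, 25: 17.6, 30: 21.6, 35: 10.8}
ms_error, n = 8.06, 5   # from the ANOVA table
q_crit = 4.23           # ASSUMPTION: q_{0.05}(5, 20) from a studentized range table

t_alpha = q_crit * math.sqrt(ms_error / n)
significant = [
    (i, j) for i, j in combinations(means, 2)
    if abs(means[i] - means[j]) > t_alpha
]
print(f"T_0.05 = {t_alpha:.2f}")
print("pairs declared different:", significant)
```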
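- Sometimes the overall F test from the ANOVA is significant, but the pairwise comparison of means fails to reveal any significant differences.
- The F test simultaneously considers all possible contrasts involving the treatment means, not just pairwise comparisons.
- The Fisher Least Significant Difference (LSD) Method
- For $H_0: \mu_i = \mu_j$, use $t_0 = (\bar{y}_{i.} - \bar{y}_{j.}) / \sqrt{MS_E (1/n_i + 1/n_j)}$
- The least significant difference: $LSD = t_{\alpha/2, N-a} \sqrt{MS_E (1/n_i + 1/n_j)}$; declare $\mu_i$ and $\mu_j$ different if $|\bar{y}_{i.} - \bar{y}_{j.}| > LSD$
- See Example 3.8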
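The LSD method applies the same pairwise comparison with a t-based critical value, $LSD = t_{\alpha/2, N-a}\sqrt{2 MS_E/n}$ for equal sample sizes. In this sketch the treatment averages and $t_{0.025, 20} = 2.086$ are assumptions (recalled from the textbook data and a t table); $MS_E$ is taken from the ANOVA table.

```python
import math
from itertools import combinations

# ASSUMPTION: treatment averages by cotton weight percent.
means = {15: 9.8, 20: 15.4, 25: 17.6, 30: 21.6, 35: 10.8}
ms_error, n = 8.06, 5   # from the ANOVA table
t_crit = 2.086          # ASSUMPTION: t_{0.025, 20} from a t table

lsd = t_crit * math.sqrt(2 * ms_error / n)
not_different = [
    (i, j) for i, j in combinations(means, 2)
    if abs(means[i] - means[j]) <= lsd
]
print(f"LSD = {lsd:.2f}")
print("pairs NOT declared different:", not_different)
```

Because the LSD uses a per-comparison error rate, it typically flags more pairs than a method that controls the experimentwise error rate.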
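- Duncan's Multiple Range Test
- The a treatment averages are arranged in ascending order, and the standard error of each average is determined as $S_{\bar{y}_{i.}} = \sqrt{MS_E/n}$
- Assuming equal sample sizes, the significant ranges are $R_p = r_\alpha(p, f)\, S_{\bar{y}_{i.}}$, $p = 2, 3, \ldots, a$
- A total of $a(a-1)/2$ pairs
- Example 3.9
- The Newman-Keuls Test
- Similar to Duncan's multiple range test
- The critical values: $K_p = q_\alpha(p, f)\, S_{\bar{y}_{i.}}$, $p = 2, 3, \ldots, a$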
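- 3.5.8 Comparing Treatment Means with a Control
- Assume one of the treatments is a control, and the analyst is interested in comparing each of the other $a - 1$ treatment means with the control.
- Test $H_0: \mu_i = \mu_a$ vs. $H_1: \mu_i \neq \mu_a$, $i = 1, 2, \ldots, a - 1$
- Dunnett (1964)
- Compute $|\bar{y}_{i.} - \bar{y}_{a.}|$, $i = 1, 2, \ldots, a - 1$
- Reject $H_0$ if $|\bar{y}_{i.} - \bar{y}_{a.}| > d_\alpha(a - 1, f) \sqrt{MS_E (1/n_i + 1/n_a)}$
- Example 3.10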
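3.7 Determining Sample Size
- Determine the number of replicates to run
- 3.7.1 Operating Characteristic Curves (OC Curves)
- OC curves: a plot of the type II error probability of a statistical test, $\beta = 1 - P(\text{reject } H_0 \mid H_0 \text{ is false})$, for a particular sample size against a parameter that reflects how far $H_0$ is from being true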
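- If $H_0$ is false, then
- $F_0 = MS_{Treatments}/MS_E$ follows a noncentral F distribution
- with $a - 1$ and $N - a$ degrees of freedom and noncentrality parameter $\delta$
- Chart V of the Appendix
- Determine $\Phi^2 = n \sum_{i=1}^{a} \tau_i^2 / (a \sigma^2)$
- Let $\mu_i$ be the specified treatment means. Then the estimates of $\tau_i$ are $\tau_i = \mu_i - \bar{\mu}$, where $\bar{\mu} = (1/a) \sum_i \mu_i$
- For $\sigma^2$, use prior experience, a previous experiment, a preliminary test, or a judgment estimate.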
- Example 3.11
- Difficulty: how to select a set of treatment means on which the sample size decision should be based.
- Another approach: select a sample size such that if the difference between any two treatment means exceeds a specified value D, the null hypothesis should be rejected. The minimum value of $\Phi^2$ is then $nD^2/(2a\sigma^2)$.
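Both ways of setting up the OC-curve parameter are short calculations. The sketch below computes $\Phi^2 = n\sum_i \tau_i^2/(a\sigma^2)$ from a set of specified treatment means and, for the second approach, the minimum value $\Phi^2 = nD^2/(2a\sigma^2)$. The means, $\sigma$, and $D$ used here are hypothetical planning values, and reading $\beta$ itself still requires the OC curves (or noncentral F software).

```python
import math

def phi_squared(means, sigma, n):
    """Phi^2 = n * sum(tau_i^2) / (a * sigma^2) for specified treatment means."""
    a = len(means)
    mu_bar = sum(means) / a
    return n * sum((m - mu_bar) ** 2 for m in means) / (a * sigma ** 2)

def phi_squared_min(diff, sigma, n, a):
    """Minimum Phi^2 when any two treatment means differ by at least diff."""
    return n * diff ** 2 / (2 * a * sigma ** 2)

# HYPOTHETICAL planning values: five treatment means and sigma = 3 psi.
means = [11, 12, 15, 18, 19]
sigma = 3.0
for n in (4, 5, 6):
    p2 = phi_squared(means, sigma, n)
    print(f"n={n}: Phi^2={p2:.2f}, Phi={math.sqrt(p2):.2f}, error df={len(means) * (n - 1)}")

# HYPOTHETICAL minimum difference of interest, D = 10 psi.
print(f"D=10, n=4: Phi^2 >= {phi_squared_min(10, sigma, 4, len(means)):.2f}")
```

Each candidate n gives a value of $\Phi$ and the error degrees of freedom $a(n-1)$, which together locate the point to read on the chart.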
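- 3.7.2 Specifying a Standard Deviation Increase
- Let P be the percentage increase in the standard deviation of an observation. Then $\sqrt{\sigma^2 + \sum_i \tau_i^2/a}\,/\,\sigma = 1 + 0.01P$, so $\Phi = \sqrt{(1 + 0.01P)^2 - 1}\,\sqrt{n}$
- For example (Page 110): if P = 20, then $\Phi = \sqrt{1.2^2 - 1}\,\sqrt{n} = 0.66\sqrt{n}$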
- 3.7.3 Confidence Interval Estimation Method
- Use confidence intervals: choose n so that the interval has a specified half-width.
- For example, we want the 95% C.I. on the difference in mean tensile strength for any two cotton weight percentages to be $\pm 5$ psi, with $\sigma = 3$. See Page 110.
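The confidence interval method amounts to trying successive values of n until the half-width $t_{\alpha/2, a(n-1)}\sqrt{2\sigma^2/n}$ drops to the target. A minimal sketch for the example above follows; the small table of $t_{0.025, df}$ values is taken from a standard t table and is the only input beyond $a = 5$, $\sigma = 3$ and the $\pm 5$ psi target.

```python
import math

# t_{0.025, df} for the error degrees of freedom a*(n-1) with a = 5, n = 2..6.
# ASSUMPTION: values read from a standard t table.
T_025 = {5: 2.571, 10: 2.228, 15: 2.131, 20: 2.086, 25: 2.060}

a = 5          # number of treatments
sigma = 3.0    # planning value for the standard deviation (psi)
target = 5.0   # desired half-width of the 95% C.I. (psi)

def half_width(n):
    """Half-width of the 95% C.I. on a difference of two treatment means."""
    df = a * (n - 1)
    return T_025[df] * math.sqrt(2 * sigma ** 2 / n)

for n in range(2, 7):
    print(f"n={n}: half-width = {half_width(n):.2f}")

n_needed = next(n for n in range(2, 7) if half_width(n) <= target)
print("smallest n meeting the target:", n_needed)
```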
3.9 The Regression Approach to the Analysis of Variance
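- The normal equations: $N\hat{\mu} + n \sum_{i=1}^{a} \hat{\tau}_i = y_{..}$ and $n\hat{\mu} + n\hat{\tau}_i = y_{i.}$, $i = 1, 2, \ldots, a$
- Apply the constraint $\sum_{i=1}^{a} \hat{\tau}_i = 0$
- Then the estimates are $\hat{\mu} = \bar{y}_{..}$ and $\hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}$
- Regression sum of squares (the reduction due to fitting the full model): $R(\mu, \tau) = \hat{\mu} y_{..} + \sum_{i=1}^{a} \hat{\tau}_i y_{i.} = \sum_{i=1}^{a} y_{i.}^2 / n$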
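- The error sum of squares: $SS_E = \sum_i \sum_j y_{ij}^2 - R(\mu, \tau)$
- Find the sum of squares resulting from the treatment effects: $R(\tau \mid \mu) = R(\mu, \tau) - R(\mu) = \sum_i y_{i.}^2/n - y_{..}^2/N$
- The test statistic for $H_0: \tau_1 = \cdots = \tau_a = 0$ is $F_0 = \dfrac{R(\tau \mid \mu)/(a - 1)}{SS_E/(N - a)}$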
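The regression route can be traced step by step: solve the normal equations under the constraint, form the reductions in sum of squares, and take the ratio of mean squares. The sketch below does this from summary quantities; the treatment totals and the uncorrected sum of squares are assumptions (computed from the tensile strength data as recalled from the textbook).

```python
# ASSUMPTION: summary quantities for the tensile strength example.
totals = [49, 77, 88, 108, 54]   # treatment totals y_i.
sum_y_squared = 6292             # sum of all y_ij^2
n = 5                            # replicates per treatment
a = len(totals)
N = a * n

grand_total = sum(totals)
# Solution of the normal equations under the constraint sum(tau_hat) = 0.
mu_hat = grand_total / N
tau_hat = [t / n - mu_hat for t in totals]

# Reduction in sum of squares from fitting the full model (mu and tau).
r_full = mu_hat * grand_total + sum(th * t for th, t in zip(tau_hat, totals))
# Reduction from fitting mu alone.
r_mu = grand_total ** 2 / N

ss_error = sum_y_squared - r_full
ss_treat = r_full - r_mu         # R(tau | mu)
f0 = (ss_treat / (a - 1)) / (ss_error / (N - a))
print(f"R(mu,tau)={r_full:.2f}  R(tau|mu)={ss_treat:.2f}  SS_E={ss_error:.2f}  F0={f0:.2f}")
```

The result matches the usual ANOVA computation, which is the point of the regression approach: $R(\tau \mid \mu)$ is the treatment sum of squares.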