Title: Accuracy and power of randomization tests in multivariate analysis of variance with vegetation data
1Accuracy and power of randomization tests in
multivariate analysis of variance with vegetation
data
- Valério De Patta Pillar
- Departamento de Ecologia
- Universidade Federal do Rio Grande do Sul
- Porto Alegre, Brazil
- vpillar_at_ecologia.ufrgs.br
- http://ecoqua.ecologia.ufrgs.br
2- Randomization testing
- Became practical with fast microcomputers.
- Applicable to most cases analyzed by classical
methods.
- Applicable to cases not covered by classical
methods.
3How good is randomization testing?
- Is it accurate?
- Is it powerful enough?
4Group comparison by randomization testing
Choose a test criterion (λ) to compare the groups.
Permute the data according to the conditions
stated by the null hypothesis (Ho) that the
groups do not differ.
Calculate the test criterion (λo) in the random data
and compare it to the value (λ) found in the observed
data.
After many iterations, the probability P(λo ≥ λ)
will be the number of iterations with λo ≥ λ
divided by the total number of iterations.
Reject Ho if P(λo ≥ λ) is smaller than a
threshold (α).
Manly, B. F. J. 1997. Randomization, Bootstrap
and Monte Carlo Methods in Biology. 2 ed. Chapman
and Hall.
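The procedure above can be sketched in Python. This is a minimal illustration, not the MULTIV implementation: it assumes Euclidean sums of squares, uses the between-groups sum of squares (Qb = Qt - Qw) as the test criterion, and runs on a small made-up data set. The function names and the data are hypothetical.

```python
import random

def sum_of_squares(rows):
    """Sum of squared Euclidean deviations of the rows from their centroid."""
    n, p = len(rows), len(rows[0])
    centroid = [sum(row[k] for row in rows) / n for k in range(p)]
    return sum((row[k] - centroid[k]) ** 2 for row in rows for k in range(p))

def between_groups_ss(data, groups):
    """Qb = Qt - Qw, with Qw pooled over the groups."""
    qt = sum_of_squares(data)
    qw = sum(
        sum_of_squares([row for row, g in zip(data, groups) if g == level])
        for level in sorted(set(groups))
    )
    return qt - qw

def randomization_test(data, groups, iterations=1000, seed=0):
    """Return the observed Qb and the proportion of permutations with Qbo >= Qb."""
    rng = random.Random(seed)
    observed = between_groups_ss(data, groups)
    labels = list(groups)
    hits = 0
    for _ in range(iterations):
        # Shuffling the group labels is equivalent to reallocating
        # whole observation vectors to groups at random.
        rng.shuffle(labels)
        # Small tolerance so that ties are not lost to rounding error.
        if between_groups_ss(data, labels) >= observed - 1e-9:
            hits += 1
    return observed, hits / iterations

# Hypothetical data: 6 sampling units described by 2 variables, 2 groups.
data = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.2], [5.0, 6.0], [5.5, 5.8], [5.2, 6.2]]
groups = ["a", "a", "a", "b", "b", "b"]
qb, p_value = randomization_test(data, groups)
print(f"Qb = {qb:.3f}, P(Qbo >= Qb) = {p_value:.3f}")
```

With only six units there are few distinct allocations, so the probability cannot become very small no matter how strong the group difference is.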
5Randomization test criteria for multivariate
comparisons of any number of groups
Pillar, V. D. & Orlóci, L. 1996. J. Veg. Sci.
7: 585-592.
6An example
Is there a significant effect of N on vegetation
composition as defined by these two PFTs?
Sum of squares between groups: Qb = Qt - Qw = 60.088 - 10.02 = 50.068,
where Qt is the total and Qw the within-groups sum of squares.
How common is Qb ≥ 50.068 if Ho were true (that is, if
the composition is unrelated to group)?
7Reference set under Ho
If Ho is true, the observation vector in a given
sampling unit is independent of the group to
which the unit belongs.
8A random permutation and corresponding statistics
Sum of squares between groups: Qbo = 60.088 - 28.35 = 31.738.
Since 31.738 < 50.068 (Qbo < Qb), this
iteration adds zero to the frequency of cases in
which Qbo ≥ Qb.
9After 10000 random permutations
10Two-factor designs
- Test criterion
- Qb = Qt - Qw is based on the groups defined by
the joint states of the factors.
- Qb is partitioned as
- Qb = QbA + QbB + QbAB
- where
- QbA = sum of squares between lA groups according
to factor A, disregarding factor B
- QbB = sum of squares between lB groups according
to factor B, disregarding factor A
- QbAB = sum of squares of the interaction AB,
obtained by difference.
- F-ratio = Qb/Qw
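The partition Qb = QbA + QbB + QbAB can be illustrated with a short Python sketch. It assumes Euclidean sums of squares and a balanced design. The data are made up so that the two factors act additively, in which case the interaction term obtained by difference is zero. All names are hypothetical.

```python
def sum_of_squares(rows):
    """Sum of squared Euclidean deviations of the rows from their centroid."""
    n, p = len(rows), len(rows[0])
    centroid = [sum(row[k] for row in rows) / n for k in range(p)]
    return sum((row[k] - centroid[k]) ** 2 for row in rows for k in range(p))

def between_groups_ss(data, labels):
    """Qt - Qw for the groups defined by labels."""
    qw = sum(
        sum_of_squares([row for row, g in zip(data, labels) if g == level])
        for level in sorted(set(labels))
    )
    return sum_of_squares(data) - qw

def partition(data, factor_a, factor_b):
    """Partition the total sum of squares in a two-factor design."""
    joint = list(zip(factor_a, factor_b))  # groups = joint states of A and B
    qt = sum_of_squares(data)
    qb = between_groups_ss(data, joint)
    qb_a = between_groups_ss(data, factor_a)  # factor A, disregarding B
    qb_b = between_groups_ss(data, factor_b)  # factor B, disregarding A
    return {
        "Qt": qt,
        "Qw": qt - qb,
        "Qb": qb,
        "QbA": qb_a,
        "QbB": qb_b,
        "QbAB": qb - qb_a - qb_b,  # interaction, obtained by difference
    }

# Hypothetical balanced 2 x 2 design, 2 units per cell, 2 variables.
x = [-0.1, 0.1, 1.9, 2.1, 0.9, 1.1, 2.9, 3.1]
data = [[v, 2.0 * v] for v in x]
factor_a = [0, 0, 0, 0, 1, 1, 1, 1]
factor_b = [0, 0, 1, 1, 0, 0, 1, 1]
parts = partition(data, factor_a, factor_b)
for name, value in parts.items():
    print(f"{name} = {value:.3f}")
```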
11Unrestricted permutation in two-factor design
12Two-factor Multivariate Analysis of Variance
One random permutation
Observed
Data: species (57) composition in 8 vegetation
units surveyed in two landscape positions (factor
A) and two grazing levels (factor B).
13After 10000 random permutations
Data: species (57) composition in 8 vegetation
units surveyed in two landscape positions (factor
A) and two grazing levels (factor B).
Unrestricted random permutations. Test criterion:
F-ratio = Qb/Qw.
14Restricted permutations
- In two-factor (not nested) designs, for testing
one factor, permutations may be restricted to
occur within the levels of the other factor
(Edgington 1987).
- Restricted permutation within the levels of
factor A (for testing factor B)
Edgington, E. S. 1987. Randomization Tests.
Marcel Dekker, New York.
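A restricted permutation of this kind can be sketched as follows. This is an illustrative Python fragment and not the MULTIV code: units are shuffled only among units that share the same level of factor A, so any allocation it produces leaves factor A untouched while randomizing factor B. The function name and the factor labels are hypothetical.

```python
import random

def restricted_permutation(block_labels, rng):
    """Return a permutation of unit indices that only exchanges
    units sharing the same block label (level of the other factor)."""
    order = list(range(len(block_labels)))
    for level in sorted(set(block_labels)):
        positions = [i for i, lab in enumerate(block_labels) if lab == level]
        shuffled = positions[:]
        rng.shuffle(shuffled)
        for src, dst in zip(positions, shuffled):
            order[src] = dst
    return order

# Hypothetical design: 8 units, factor A = landscape position,
# factor B = grazing level.
factor_a = ["top"] * 4 + ["low"] * 4
factor_b = ["grazed", "grazed", "ungrazed", "ungrazed"] * 2
rng = random.Random(1)
perm = restricted_permutation(factor_a, rng)

# Factor B labels after the permutation; within each level of A the
# labels are the same set as before, only their allocation changes.
permuted_b = [factor_b[j] for j in perm]
print(perm)
print(permuted_b)
```

To test factor B, QbB would be recomputed for each such permutation and compared with the observed value.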
15Permutations of residuals instead of raw data
16Two-factor multivariate analysis of variance by
randomization testing for the effects of
landscape position and grazing level in natural
grassland, southern Brazil (data from Pillar
1986). The data set contains 16 pooled community
stands by 60 species. Restricted random
permutations were used for testing the factors
landscape and grazing; residuals, after removing both
factors, were permuted for testing the interaction.
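The slides do not give the formula for the residuals, so the following Python sketch rests on an assumption: for a balanced design, the residual of each observation is taken as the value minus the factor A level mean, minus the factor B level mean, plus the grand mean, variable by variable. The function names and data are hypothetical.

```python
def column_means(rows):
    """Mean of each variable (column) over the given rows."""
    n = len(rows)
    return [sum(row[k] for row in rows) / n for k in range(len(rows[0]))]

def residuals_removing_main_effects(data, factor_a, factor_b):
    """Residuals after removing both main effects (balanced design assumed)."""
    grand = column_means(data)
    mean_a = {
        lvl: column_means([r for r, a in zip(data, factor_a) if a == lvl])
        for lvl in set(factor_a)
    }
    mean_b = {
        lvl: column_means([r for r, b in zip(data, factor_b) if b == lvl])
        for lvl in set(factor_b)
    }
    return [
        [x - mean_a[a][k] - mean_b[b][k] + grand[k] for k, x in enumerate(row)]
        for row, a, b in zip(data, factor_a, factor_b)
    ]

# Hypothetical balanced 2 x 2 design with purely additive factor effects.
x = [-0.1, 0.1, 1.9, 2.1, 0.9, 1.1, 2.9, 3.1]
data = [[v, 2.0 * v] for v in x]
factor_a = [0, 0, 0, 0, 1, 1, 1, 1]
factor_b = [0, 0, 1, 1, 0, 0, 1, 1]
resid = residuals_removing_main_effects(data, factor_a, factor_b)
for row in resid:
    print([round(v, 3) for v in row])
```

In a test for the interaction, the residual vectors (rather than the raw observation vectors) would then be permuted freely among the units and the interaction criterion recomputed at each iteration.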
17How good is randomization testing in two-factor
multivariate analysis of variance?
18Simulation of interaction
- For each case, 1000 data sets were generated
with the distributional properties of real vegetation
data and subjected to multivariate analysis of
variance with randomization testing.
- When a factor or interaction effect is set to zero,
the proportion of Ho rejections at a given α
threshold estimates the Type I error, the probability
of wrongly rejecting Ho when it is true.
- If the Type I error is equal to α, the test is exact.
- When a factor or interaction effect is > 0, the
proportion of Ho rejections estimates the power of
the test, which is the complement of the Type II
error, the probability of not rejecting Ho when
it is false.
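The logic of estimating Type I error and power from the proportion of Ho rejections can be shown with a deliberately small Python sketch. It is not the simulation used in the talk: it assumes one normally distributed variable, two groups of five units, and far fewer data sets and permutations, so that it runs quickly. All names and settings are hypothetical.

```python
import random

def qb_univariate(values, groups):
    """Between-groups sum of squares (Qt - Qw) for a single variable."""
    def ss(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs)
    qw = sum(
        ss([v for v, g in zip(values, groups) if g == lvl])
        for lvl in sorted(set(groups))
    )
    return ss(values) - qw

def p_value(values, groups, iterations, rng):
    """Proportion of random permutations with Qbo >= Qb."""
    observed = qb_univariate(values, groups)
    labels = list(groups)
    hits = 0
    for _ in range(iterations):
        rng.shuffle(labels)
        if qb_univariate(values, labels) >= observed - 1e-9:
            hits += 1
    return hits / iterations

def rejection_rate(effect, n_datasets=200, iterations=99, alpha=0.05, seed=1):
    """Proportion of simulated data sets in which Ho is rejected."""
    rng = random.Random(seed)
    groups = [0] * 5 + [1] * 5
    rejections = 0
    for _ in range(n_datasets):
        # The second group is shifted by `effect` standard deviations.
        values = [rng.gauss(0.0, 1.0) + effect * g for g in groups]
        if p_value(values, groups, iterations, rng) < alpha:
            rejections += 1
    return rejections / n_datasets

type_i_error = rejection_rate(effect=0.0)  # Ho true: estimates Type I error
power = rejection_rate(effect=5.0)         # Ho false: estimates power
print(f"Type I error ~ {type_i_error:.3f}, power ~ {power:.3f}")
```

With an exact test, the first estimate should fluctuate around α; the second should approach 1 as the effect grows.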
19Simulated data generated with distributional
properties of real data
- Data set: 16 grassland units described by cover
of 60 species.
- Two factors: landscape position (top-convex,
concave-lowland) and grazing level (grazed,
ungrazed).
- Procedure described by Peres-Neto & Olden (2000):
- (1) Calculate the mean (x̄ij) and the standard
deviation (σij) for each species vector i within
each group j defined by the four factor level
combinations.
- (2) Standardize these vectors to mean 0 and
standard deviation 1: thij = (xhij - x̄ij)/σij.
- (3) Randomly permute whole stand vectors across
groups.
- (4) Restore the original dispersion within each group
by computing new observations shij = thij σij,
defining in this way a data set with the
conditions specified by Ho.
- (5) Apply to the species vectors the corresponding
group differences for factor and interaction
effects.
- (6) Perform the randomization tests using 1000 random
permutations.
- (7) Repeat steps (3) to (6) 1000 times, recording
the proportion of Ho rejections.
Peres-Neto, P.R. & Olden, J.D. 2000. Animal
Behaviour 61: 79-86.
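The data-generation part of the procedure (standardize within groups, permute whole stand vectors, restore the within-group dispersion, then add effects) might look like the Python sketch below. Several details are assumptions rather than statements from the slides: the sample standard deviation (n - 1 denominator) is used, a species with zero standard deviation within a group is given standardized values of 0, and effects are added as fixed offsets per group. The function names, group codes and data are hypothetical.

```python
import random
import statistics

def null_data(data, groups, rng):
    """Generate one data set under Ho from real data (rows = stands)."""
    n, p = len(data), len(data[0])
    mean, sd = {}, {}
    for g in sorted(set(groups)):
        rows = [data[h] for h in range(n) if groups[h] == g]
        # Mean and standard deviation of each species within group g.
        mean[g] = [statistics.mean([r[i] for r in rows]) for i in range(p)]
        sd[g] = [statistics.stdev([r[i] for r in rows]) for i in range(p)]
    # Standardize within groups (assumption: 0 when sd is 0).
    t = [
        [
            (data[h][i] - mean[groups[h]][i]) / sd[groups[h]][i]
            if sd[groups[h]][i] > 0 else 0.0
            for i in range(p)
        ]
        for h in range(n)
    ]
    # Permute whole standardized stand vectors across groups.
    order = list(range(n))
    rng.shuffle(order)
    # Restore the dispersion of the group each vector now belongs to.
    return [[t[order[h]][i] * sd[groups[h]][i] for i in range(p)] for h in range(n)]

def add_effects(data, groups, effects):
    """Add a per-group offset to each species (hypothetical effect model)."""
    return [[x + effects[g][i] for i, x in enumerate(row)] for row, g in zip(data, groups)]

# Hypothetical data: 8 stands, 2 species, 4 factor-level combinations.
data = [[1.0, 9.0], [3.0, 11.0], [3.0, 7.0], [5.0, 9.0],
        [5.0, 5.0], [7.0, 7.0], [7.0, 3.0], [9.0, 5.0]]
groups = ["11", "11", "12", "12", "21", "21", "22", "22"]
rng = random.Random(42)
simulated = null_data(data, groups, rng)
# Effect of the first factor only, on the first species only.
effects = {"11": [0.0, 0.0], "12": [0.0, 0.0], "21": [2.0, 0.0], "22": [2.0, 0.0]}
shifted = add_effects(simulated, groups, effects)
print(simulated)
print(shifted)
```

Each simulated data set would then be submitted to the randomization tests, and the whole cycle repeated to obtain the proportion of Ho rejections.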
20Results of power evaluation by data simulation in
two-factor MANOVA. The proportion of Ho rejections
at α = 0.05 was obtained for 1000 simulated data
sets generated on the basis of plant community
data with 16 units and 60 species, with
increasing difference between the two groups for
factor 1, with no interaction. Each factor
combination had equal number of units. For each
data set a randomization test was run with 1000
iterations.
21As the effects of both factors increase, type I
error for the interaction is underestimated with
unrestricted permutations using Qb or the F-ratio,
but not with permutation of residuals.
22As the effect of interaction increases, type I
error for both factors is underestimated with Qb
and the F-ratio, under both unrestricted and
restricted permutations.
However, main factors should not be considered at all
when interaction is present!
24Results of power evaluation by data simulation in
two-way designs. The proportion of Ho rejections
at α = 0.05 was obtained for 1000 simulated data
sets generated on the basis of plant community
data with 60 units and 60 species, with
increasing relative difference between the
four-group factor combinations. Factor combinations
had unequal numbers of units (11: 31, 12: 9, 21: 9,
22: 11). For each data set a randomization
test was run with 1000 iterations.
25References
- Anderson, M.J. & ter Braak, C. 2003. Permutation tests for multi-factorial analysis of variance. Journal of Statistical Computation and Simulation 73: 85-113.
- Edgington, E. S. 1987. Randomization Tests. Marcel Dekker, New York.
- Manly, B. F. J. 1997. Randomization, Bootstrap and Monte Carlo Methods in Biology. 2 ed. Chapman and Hall.
- Peres-Neto, P.R. & Olden, J.D. 2000. Assessing the robustness of randomization tests: examples from behavioural studies. Animal Behaviour 61: 79-86.
- Pillar, V. D. & Orlóci, L. 1996. On randomization testing in vegetation science: multifactor comparisons of relevé groups. Journal of Vegetation Science 7: 585-592.
- Pillar, V. D. 1994-2004. MULTIV: software for multivariate analysis, randomization tests and bootstrapping. Available (minor version, manual included) at http://ecoqua.ecologia.ufrgs.br