Two Factor ANOVA and the BACI sampling design

Slides: 33
Provided by: llye

Transcript and Presenter's Notes


1
  • Two Factor ANOVA and the BACI sampling design
  • Non-parametric two-factor tests
  • Resampling method for two-factor tests.

2
Two Factor Designs
  • Consider studying the impact of two factors on
    the yield (response)
  • Note: The 1 and 2, etc., mean Level 1, Level
    2, etc., NOT metric values.
  • Here we have R = 3 rows (levels of the row
    factor), C = 4 columns (levels of the column
    factor), and n = 2 replicates per cell (n_ij
    for cell (i, j) if not all equal)

3
Model
  • i = 1, ..., R
  • j = 1, ..., C
  • k = 1, ..., n
  • In general, n observations per cell, R × C cells
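
The model equation on this slide did not survive in the transcript. As a sketch only, the usual two-factor model with these indices can be written as below; the symbols ρ (row effect), τ (column effect) and I (interaction) are assumed notation, not taken from the slides.

```latex
Y_{ijk} = \mu + \rho_i + \tau_j + I_{ij} + \varepsilon_{ijk},
\qquad i = 1,\dots,R,\quad j = 1,\dots,C,\quad k = 1,\dots,n
```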

4
  • Where

5
  • ALL the terms are somewhat intuitive, except
    for the interaction term
  • That term is more intuitively written in terms
    of the following pieces

Adjustment for row membership
Adjustment for column membership
How a cell differs from grand mean
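
The three labels above match the standard rewriting of the estimated interaction term. A sketch, assuming the usual dot notation for means (cell mean, row mean, column mean and grand mean), which is not shown in the transcript:

```latex
\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot}
= \underbrace{(\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot})}_{\text{cell vs. grand mean}}
- \underbrace{(\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})}_{\text{row adjustment}}
- \underbrace{(\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})}_{\text{column adjustment}}
```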
6
  • We can, without loss of generality, assume (for a
    moment) that there is no error. Why then might
    the above expression be non-zero?
  • Answer: INTERACTION
  • Two basic ways to look at interaction

1)
Cell yields: ALBL = 5, ALBH = 8, AHBL = 10.
- When B goes from BL → BH, yield goes up by 3 (5 → 8).
- When A goes from AL → AH, yield goes up by 5 (5 → 10).
- When both changes of level occur, does yield go up by the
sum of 3 and 5?
If AHBH = 13, no interaction. If AHBH ≠ 13, interaction.
7
  • Interaction = degree of difference from the sum
    of separate effects
  • Holding BL, what happens as A goes from AL → AH?
    5
  • Holding BH, what happens as A goes from AL → AH?
    9
  • If the effect of one factor (i.e., the impact of
    changing its level) is DIFFERENT at different
    levels of another factor, then INTERACTION exists
    between the two factors

2)
NOTE - Holding AL, BL → BH has impact 3
- Holding AH, BL → BH has impact 7
(AB) = (BA), or (9 - 5) = (7 - 3)
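
The arithmetic of the two views can be checked with a small sketch. The cell values 5, 8 and 10 come from the slides; the value 17 for the AH/BH cell is inferred from the stated effects (8 + 9 = 10 + 7), and the variable and function names are illustrative only.

```python
# Cell yields for the 2 x 2 example; AH/BH = 17 is inferred, not quoted.
yields = {
    ("AL", "BL"): 5,
    ("AL", "BH"): 8,
    ("AH", "BL"): 10,
    ("AH", "BH"): 17,
}

def effect_of_a(b_level):
    """Change in yield as A goes AL -> AH, holding B fixed."""
    return yields[("AH", b_level)] - yields[("AL", b_level)]

def effect_of_b(a_level):
    """Change in yield as B goes BL -> BH, holding A fixed."""
    return yields[(a_level, "BH")] - yields[(a_level, "BL")]

# View 1: compare the AH/BH cell with the purely additive prediction.
additive_prediction = yields[("AL", "BL")] + effect_of_a("BL") + effect_of_b("AL")
departure = yields[("AH", "BH")] - additive_prediction

# View 2: does the effect of one factor change with the level of the other?
ab = effect_of_a("BH") - effect_of_a("BL")  # 9 - 5
ba = effect_of_b("AH") - effect_of_b("AL")  # 7 - 3

print(additive_prediction, departure, ab, ba)
```

Both views give the same number, which is why the interaction can be described from either factor's side.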
8
Means in a 2-factor ANOVA, with various effects
of the factors and the interaction.
  • a) No effect of factor A, small effect of factor
    B.
  • b) Large effect of factor A, small effect of
    factor B, and no interaction
  • c) No effect of factor A, small effect of factor
    B, and no interaction
  • d) Large effect of factor A, large effect of B,
    and no interaction

9
[Figure: panels (e)-(h), each plotting the means for B1 and B2 at A1 and A2]
  • e) No effect of A, no effect of B, but
    interaction between A and B
  • f) Large effect of A, but no effect of B, with
    slight interaction
  • g) No effect of A, large effect of B, with large
    interaction
  • h) Effect of A, effect of B, with large
    interaction

10
  • Going back to the (model) equation and bringing
    the grand mean to the other side of the equation,
    we get
  • if we then square both sides, triple sum both
    sides over i, j, and k, we get, (after noting
    that all cross-product terms cancel)
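
The equations for this step are missing from the transcript. A sketch of the standard identity and its squared, summed form for a balanced design, using assumed dot notation for the means:

```latex
\begin{align*}
Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot}
 &= (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})
  + (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})
  + (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})
  + (Y_{ijk} - \bar{Y}_{ij\cdot}) \\
\sum_{i}\sum_{j}\sum_{k} (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2
 &= Cn \sum_{i} (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2
  + Rn \sum_{j} (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 \\
 &\quad + n \sum_{i}\sum_{j} (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})^2
  + \sum_{i}\sum_{j}\sum_{k} (Y_{ijk} - \bar{Y}_{ij\cdot})^2
\end{align*}
```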

11
  • Or,
  • And in terms of degrees of freedom,
  • In our example
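
The formulas for these three bullets were lost. A sketch of what they would standardly be, where SSR and SSC are assumed labels for the row and column sums of squares (MSI, MSW and TSS do appear elsewhere in the deck). The example line uses R = 3, C = 4, n = 2 from the design slide and agrees with the degrees of freedom in the tabled F values (2,12), (3,12) and (6,12).

```latex
\begin{align*}
\mathrm{TSS} &= \mathrm{SSR} + \mathrm{SSC} + \mathrm{SSI} + \mathrm{SSW} \\
RCn - 1 &= (R-1) + (C-1) + (R-1)(C-1) + RC(n-1) \\
23 &= 2 + 3 + 6 + 12
\end{align*}
```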

12
(No Transcript)
13
  • 1)
  • 2)

ANOVA
H0: All row means equal. H1: Not all row means equal.
FTV(2,12) = 3.88. Reject H0.
H0: All column means equal. H1: Not all column means equal.
FTV(3,12) = 3.49. Accept H0.
H0: No int'n between factors. H1: There is int'n between factors.
FTV(6,12) = 3.00. Accept H0.
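
The three tests can be sketched in plain Python for a balanced design. The data below are hypothetical (the deck's own data table is not in the transcript), so the decisions printed need not match the slide; only the tabled values 3.88, 3.49 and 3.00 are taken from the slide, and they apply only to a 3 × 4 design with 2 replicates. The function name and return layout are illustrative.

```python
def two_way_anova(data):
    """Balanced two-factor ANOVA; data[i][j] lists the n replicates of cell (i, j)."""
    R, C, n = len(data), len(data[0]), len(data[0][0])
    cells = [(i, j) for i in range(R) for j in range(C)]
    grand = sum(sum(data[i][j]) for i, j in cells) / (R * C * n)
    row = [sum(sum(data[i][j]) for j in range(C)) / (C * n) for i in range(R)]
    col = [sum(sum(data[i][j]) for i in range(R)) / (R * n) for j in range(C)]
    cell = {(i, j): sum(data[i][j]) / n for i, j in cells}

    ss = {
        "rows": C * n * sum((m - grand) ** 2 for m in row),
        "cols": R * n * sum((m - grand) ** 2 for m in col),
        "interaction": n * sum(
            (cell[i, j] - row[i] - col[j] + grand) ** 2 for i, j in cells
        ),
        "within": sum((y - cell[i, j]) ** 2 for i, j in cells for y in data[i][j]),
    }
    df = {
        "rows": R - 1,
        "cols": C - 1,
        "interaction": (R - 1) * (C - 1),
        "within": R * C * (n - 1),
    }
    ms = {k: ss[k] / df[k] for k in ss}
    f = {k: ms[k] / ms["within"] for k in ("rows", "cols", "interaction")}
    tss = sum((y - grand) ** 2 for i, j in cells for y in data[i][j])
    return ss, df, f, tss

# Hypothetical 3 x 4 layout with 2 replicates per cell.
data = [
    [[12, 14], [15, 13], [16, 18], [14, 15]],
    [[10, 11], [13, 12], [15, 14], [12, 10]],
    [[9, 12], [11, 10], [13, 15], [10, 11]],
]
ftv = {"rows": 3.88, "cols": 3.49, "interaction": 3.00}  # tabled values from the slide

ss, df, f, tss = two_way_anova(data)
for source, f_calc in f.items():
    decision = "Reject H0" if f_calc > ftv[source] else "Accept H0"
    print(source, round(f_calc, 2), decision)
```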
14
  • An issue to think about:
  • Since Vint'n cannot be negative, and
    MSI = 1.83 ... strong evidence that Vint'n is not 0.
  • If this is true, E(MSI) = σ², and we should
    combine the MSI and MSW estimates (i.e., pool them).
    This gives

We have
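
The pooled result itself is missing from the transcript. As a sketch, pooling the interaction and within-cell estimates usually means adding their sums of squares and their degrees of freedom; in the 3 × 4 example with 2 replicates the pooled estimate would have 6 + 12 = 18 degrees of freedom. The label MS_pooled is an assumption.

```latex
MS_{\text{pooled}} = \frac{\mathrm{SSI} + \mathrm{SSW}}{(R-1)(C-1) + RC(n-1)}
```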
15
Another Issue
  • The table of 4 pages ago assumes what is called a
    Fixed Model. There is also what is called a
    Random Model (and a Mixed Model).

Column fixed row random
16
  • Fixed: Specific levels chosen by the experimenter.
  • Random: Levels chosen randomly from a large number
    of possibilities.
  • Fixed: All levels about which inferences are to be
    made are included in the experiment.
  • Random: Levels are some of a large number possible.
  • Fixed: A definite number of qualitatively
    distinguishable levels, and we plan to study them
    all. Or a continuous set of quantitative
    settings, but we choose a suitable, definite
    subset in a limited region and confine inferences
    to that subset.
  • Random: Levels are a random sample from an infinite
    population.
17
  • In a great number of cases the investigator may
    argue either way, depending on his mood and his
    handling of the subject matter. In other words,
    it is more a matter of assumption than of
    reality.
  • Some authors say that, if in doubt, assume a fixed
    model. Others say things like "I think in most
    experimental situations the random model is
    applicable." The latter quote is from a person
    whose experiments are in the field of biology.

18
Two Factors with No Replication, No Interaction
  • When there's no replication, there is no pure
    way to estimate ERROR.
  • Error is measured by considering more than one
    observation (i.e. replication) at the same
    treatment combination (i.e. experimental
    conditions).

19
  • Our model for analysis is technically
  • We can write
  • After bringing the grand mean to the other side
    of the equation, squaring both sides, and double
    summing over i and j, we find

20
  • Degrees of freedom
  • We know,
  • If we assume
  • and we can call
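
None of the formulas for the no-replication case survived in the transcript. A sketch of the standard versions with one observation per cell, in assumed notation (ρ, τ, I for row, column and interaction effects):

```latex
\begin{align*}
Y_{ij} &= \mu + \rho_i + \tau_j + I_{ij} + \varepsilon_{ij} \\
\sum_{i}\sum_{j} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2
 &= C \sum_{i} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2
  + R \sum_{j} (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot})^2
  + \sum_{i}\sum_{j} (Y_{ij} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot j} + \bar{Y}_{\cdot\cdot})^2 \\
RC - 1 &= (R-1) + (C-1) + (R-1)(C-1)
\end{align*}
```

The last sum of squares mixes interaction and error. If the interaction is assumed to be absent, it can serve as the error term, with (R-1)(C-1) degrees of freedom. The tabled values FTV(3,6) and FTV(2,6) quoted in the deck's example are consistent with one factor at 4 levels and the other at 3, since (4-1)(3-1) = 6.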

21
  • And our may be rewritten
  • and the labels would become
  • in our problem

22
  • And
  • What if we're wrong about there being no
    interaction?

ANOVA
At α = .01
FTV(3,6) = 9.78
FTV(2,6) = 10.93
TSS 62 11
23
  • If we think our ratio is, in Expectation,
  • (Say, for ROWS)
  • and it really is (because there's interaction)
  • being wrong can lead only to giving us an
    underestimated Fcalc.
  • Thus, if we've REJECTED H0, we can feel confident
    of our conclusion, even if there's interaction.
  • If we've ACCEPTED H0, only then could the no
    interaction assumption be CRITICAL.
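
The two expected ratios are missing from the transcript. A sketch of what they plausibly look like for the row test, assuming a fixed model with one observation per cell and using V for the variance-like components in the deck's style (the subscripts are assumed):

```latex
\begin{align*}
\text{assumed (no interaction):}\quad
\frac{E(MS_{\text{rows}})}{E(MS_{\text{error}})}
 &= \frac{\sigma^2 + C\,V_{\text{rows}}}{\sigma^2} \\
\text{actual (interaction present):}\quad
\frac{E(MS_{\text{rows}})}{E(MS_{\text{error}})}
 &= \frac{\sigma^2 + C\,V_{\text{rows}}}{\sigma^2 + V_{\text{int'n}}}
\end{align*}
```

The second denominator is at least as large as the first, which is why an unnoticed interaction can only shrink Fcalc.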

24
Non-parametric 2 Factor ANOVA with replications
  • If assumptions of normality and constant variance
    are not met by the data, rank the data, then use
    the usual parametric ANOVA on the ranked data.
  • Using ranks is more robust than finding a
    transformation that works.
  • If there is no interaction, you can use the
    2-factor with replication procedure given in
    Conover (1980).
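
The ranking step can be sketched as follows. This is a minimal illustration that ranks all observations jointly and gives tied values their average rank; the function name is illustrative, and the ranked values would then be fed to whatever parametric two-factor ANOVA routine is in use.

```python
def average_ranks(values):
    """Return 1-based ranks of values; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of values tied with the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1  # mean of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks

# Hypothetical responses, flattened across all cells of the design.
responses = [10, 20, 20, 5]
print(average_ranks(responses))
```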

25
Non-parametric alternative to 2-Factor ANOVA
without replication: Friedman's Test
  • Example: TSS at 9 sites during 4 seasons.

H0: MA = MB = MC = MD   HA: Not all median TSS values
are equal during the 4 seasons
26
Convert to ranks within each row
27
  • Test Statistic
  • For the given problem, R = 9, C = 4,
  • Under the null hypothesis, FR may be approximated
    by a Chi-Square distribution with (C-1) d.f.
  • For our problem with 3 d.f., the critical value of
    the Chi-Square distribution at α = 5% is 7.815.
  • Since 20.03 > 7.815, we reject the null hypothesis
    and conclude that there are differences among the
    seasons with respect to TSS.
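
The test statistic formula is not in the transcript. The sketch below assumes FR is the usual Friedman statistic, 12 / (R C (C+1)) times the sum of squared column rank totals, minus 3 R (C+1), with ranks taken within each row. The table is hypothetical (the TSS data are not in the transcript), and the function names are illustrative; only the critical value 7.815 for 3 d.f. comes from the slide.

```python
def row_ranks(row):
    """Ranks 1..C within one row; tied values share the average rank."""
    ordered = sorted(row)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in row]

def friedman_statistic(table):
    """Friedman FR for a table with R rows (blocks) and C columns (treatments)."""
    R, C = len(table), len(table[0])
    rank_sums = [0.0] * C
    for row in table:
        for j, r in enumerate(row_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (R * C * (C + 1)) * sum(t * t for t in rank_sums) - 3 * R * (C + 1)

# Hypothetical table: 5 sites (rows) by 4 seasons (columns).
table = [
    [12.0, 18.5, 25.1, 14.2],
    [10.3, 17.0, 22.4, 16.8],
    [11.1, 15.9, 27.3, 13.0],
    [13.4, 19.2, 21.8, 12.5],
    [9.8, 16.4, 24.0, 15.1],
]
fr = friedman_statistic(table)
critical = 7.815  # Chi-Square critical value for 3 d.f. quoted on the slide
print(fr, "Reject H0" if fr > critical else "Accept H0")
```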

28
Stratified Shuffling
  • Shuffling or randomization in its simplest form
    is used to test the generic null hypothesis that
    one variable (or groups of variables) is
    unrelated to another variable (or groups of
    variables).
  • Significance is assessed by shuffling one
    variable (or set of variables) relative to
    another variable (or sets of variables).
    Shuffling ensures that there is in fact no
    relationship between the variables.
  • If the variables are related, then the original
    unshuffled data should be unusual relative to the
    values of the test statistic obtained by
    shuffling. When a blocking factor is present,
    the shuffling must respect it. Hence each
    block must be treated as a stratum and shuffled
    separately.

29
  • Consider the case of 2 blocks (nests) and 3
    Treatments (Distance) with R.V. (changes in
    exposure times in seconds) as given below

30
  • To test whether distance has an effect, we can
    use the test statistic given by the pairwise sum
    of squared differences of the mean exposure times
    at each distance. That is
  • The observed SSD for the example above is

SSD = (mean @ 0.75 - mean @ 1.25)² + (mean @ 0.75
- mean @ 2.5)² + (mean @ 1.25 - mean @ 2.5)²
31
  • The question now is: how likely is it that the
    observed value of 12.48 is equaled or exceeded by
    chance alone, i.e., if in fact there is no distance
    effect? If the probability is very low (less
    than 0.05), we say that it is unlikely that there
    is no distance effect. Hence the hypothesis of
    no distance effect is rejected.
  • If there is no distance effect, we can in fact
    combine the data within each stratum (nest) and
    shuffle them.
  • For example, for nest 1, the combined data are
    2, -1, 6, 5, 7, 0, and 8. If we shuffle them
    once (randomly rearrange them), we may get 0, 7,
    6, 2, 8, -1, 5. Hence, the values 0, 7, 6 could
    have been at the 0.75nm distance, 2 and 8 could
    have been at the 1.25nm distance, and -1, and 5
    could have been at the 2.5nm distance.

32
  • Similarly, we do the same for the nest 2 data.
    After one cycle of shuffling, we get one
    value of SSD. Repeating this, say, 10,000 times
    gives 10,000 values of SSD, i.e., a sampling
    distribution of SSD.
  • An estimate of the p-value is obtained by
    counting the proportion of SSDs greater than or
    equal to the observed SSD.
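
The whole procedure can be sketched in a few lines of Python. The nest 1 values are the seven quoted in the text, but their assignment to distances (3, 2 and 2 observations) is an assumption, and the nest 2 values are entirely hypothetical, so the result will not reproduce the observed SSD of 12.48. Function names, the seed and the small tolerance are illustrative choices.

```python
import random
from itertools import combinations

def ssd(means):
    """Pairwise sum of squared differences between group means."""
    return sum((a - b) ** 2 for a, b in combinations(means, 2))

def distance_means(strata):
    """Mean response at each distance, pooling observations over all nests."""
    groups = {}
    for labels, values in strata:
        for d, v in zip(labels, values):
            groups.setdefault(d, []).append(v)
    return [sum(vs) / len(vs) for _, vs in sorted(groups.items())]

def stratified_shuffle_pvalue(strata, n_iter=10_000, seed=0):
    """Shuffle responses within each stratum; return (observed SSD, p-value)."""
    rng = random.Random(seed)
    observed = ssd(distance_means(strata))
    hits = 0
    for _ in range(n_iter):
        shuffled = []
        for labels, values in strata:
            perm = values[:]   # copy so the original data stay untouched
            rng.shuffle(perm)  # rearrange within this nest only
            shuffled.append((labels, perm))
        if ssd(distance_means(shuffled)) >= observed - 1e-12:
            hits += 1
    return observed, hits / n_iter

# Each stratum: (distance labels, responses). Assignment and nest 2 are assumed.
strata = [
    ([0.75, 0.75, 0.75, 1.25, 1.25, 2.5, 2.5], [2, -1, 6, 5, 7, 0, 8]),
    ([0.75, 0.75, 1.25, 1.25, 2.5, 2.5], [1, 3, 4, 6, 2, 9]),
]
observed, p_value = stratified_shuffle_pvalue(strata, n_iter=10_000, seed=1)
print(observed, p_value)
```

Shuffling within each nest keeps the nest-to-nest differences intact, so only the link between distance and response is broken.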