Research Methods and Statistics in Psychology
Lecture 10: Analysis of Variance

Transcript and Presenter's Notes
Slide 1
Research Methods and Statistics in Psychology
Lecture 10: Analysis of Variance
  • Overview of lecture
  • 1. Beyond multiple t-tests
  • 2. Analysing variances
  • 3. Using ANOVA to compare means
  • 4. Using ANOVA to compare multiple means
  • Reading for this lecture
  • Chapter 10 in HM.
  • Appendix A.4 in HM also contains a worked
    example
  • Examples of computer-based ANOVA are provided at
    www.sagepub.com/haslam

Slide 2
  • 1. Beyond multiple t-tests
  • As we have seen, the t-test is a useful procedure
    for comparing two means. But what do we do when
    we need to compare more than two means?
  • One answer might be to do lots and lots of
    t-tests, comparing every pair of means. This is
    messy, but there is another problem too.
  • As we noted at the very end of Lecture 8,
    performing lots of tests increases the chance
    that one or more of them will yield a
    statistically significant result by chance alone
    (i.e., not because there is a real difference).
  • And the more tests you perform, the more likely
    this is to happen.
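This growing risk can be sketched numerically. A minimal Python sketch, under the simplifying assumption that the tests are independent (pairwise t-tests on the same groups are not, so treat this as an idealisation):

```python
# Familywise error rate: probability of at least one Type I error across
# m independent tests, each run at alpha = .05.
# NOTE: independence is an idealisation; pairwise t-tests on shared data
# are correlated, but the qualitative pattern is the same.
alpha = 0.05

for m in (1, 3, 6, 10):
    p_at_least_one = 1 - (1 - alpha) ** m
    print(m, round(p_at_least_one, 3))
```

With six pairwise comparisons (all pairs of four groups), the chance of at least one spurious "significant" result is already about 26%, and with ten it passes 40%.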

Slide 3
  • 2. Analysing variances
  • Accordingly, it would be convenient if a single
    test could tell us whether there were differences
    between the means overall.
  • The good news is that there is one, and it
    involves performing analysis of variance (ANOVA).
  • You might well ask yourselves here: if we are
    comparing means, why is it called analysis of
    variance? Shouldn't it be called analysis of
    differences, or analysis of means?
  • The answer is that it's called ANOVA because
    (rather cunningly) the procedure uses variances
    to compare means.
  • However, we can also use the idea behind ANOVA to
    compare variances themselves and conduct a
    statistical test to see if they are different.
  • So let's start by doing that.

Slide 4
  • 2. Analysing variances
  • In Lecture 8 we saw that one of the assumptions
    of the between-subjects t-test is that the
    variances of the two groups should not be
    different (the assumption of equal variance).
  • Well, now we can explicitly compare variances and
    conduct a statistical test to see whether or not
    they are different.
  • We do this by forming a ratio of the variances of
    the two groups. If the ratio of the larger
    variance to the smaller one is about 1:1 then we
    cannot conclude that they are different (i.e.,
    we cannot reject the null hypothesis that they
    are the same).
  • However, if the ratio is much bigger than 1:1
    then we can reject the null hypothesis, providing
    we know how such a ratio of variances behaves.

Slide 5
  • 2. Analysing variances
  • Fortunately, we do have a pretty good idea. The
    ratio of two variances drawn from the same
    population tends to follow the F-distribution.
    This is a theoretical (sampling) distribution of
    ratios of variances.
  • The F-distribution looks like this:
  • Note that the distribution
  • (a) is non-symmetrical,
  • (b) only has positive values,
  • (c) has a positive skew, and
  • (d) has a long tail

Slide 6
  • 2. Analysing variances
  • The tricky thing about the F-distribution is
    that, like the t-distribution, it varies with the
    degrees of freedom; but here the shape depends on
    both the df of the numerator and the df of the
    denominator.
  • To see this in operation, let's compare the
    variances of the two groups of 10 people that we
    looked at in Lecture 8 (where we conducted a
    between-subjects t-test).

[Table: scores by participant number for the two groups of 10 from Lecture 8]
Slide 7
  • 2. Analysing variances
  • Here the degrees of freedom for the numerator and
    denominator are both equal to the number of
    people in the group minus 1
    (i.e., df = n − 1). The particular F-distribution
    we need in this case has 9 degrees of freedom in
    the numerator and 9 in the denominator.
  • A large number of F-distributions for many
    combinations of degrees of freedom are given in
    Table C.4 in HM (pp. 485-491).
  • Once we have calculated a value for F we can
    compare this to the distribution and find what
    proportion of the F-distribution this cuts off.

Slide 8
  • 2. Analysing variances
  • In this example the variance of the experimental
    group was 2.71 and the variance of the control
    group was 2.04. We can turn these two variances
    into a ratio and obtain the F-value of 1.33.
  • We can interpret this by referring to F-values
    that are tabulated (and reported) in the form
    F(df1, df2).
  • So here F(9, 9) = 1.33.
  • If we assume that these two samples are drawn
    from normal populations, what we now need to do
    is check what proportion of the F-distribution is
    cut off by this value.
  • We can see from Table C.4 that this value is
    quite a bit smaller than the critical value given
    for F(9, 9) with α = .05 of 3.18.
  • This suggests that this ratio would not be a
    particularly unusual one to obtain if the samples
    really were drawn from populations with the same
    variance.
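The slide's arithmetic can be checked in a few lines of Python. The variances (2.71 and 2.04) and the critical value (3.18) are taken from the lecture; this is a sketch of this one worked example, not a general routine:

```python
# Variance-ratio (F) test from the lecture's worked example.
var_experimental = 2.71   # variance of the experimental group (n = 10)
var_control = 2.04        # variance of the control group (n = 10)

# Larger variance over smaller, as on the slide.
f_ratio = max(var_experimental, var_control) / min(var_experimental, var_control)

# Critical value for F(9, 9) at alpha = .05, read from Table C.4 in HM.
f_critical = 3.18

print(round(f_ratio, 2))      # 1.33
print(f_ratio < f_critical)   # True: cannot reject equal population variances
```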

Slide 9
  • 3. Using ANOVA to compare means
  • Comparing variances might be useful if we want to
    know whether the assumptions of a statistical
    test hold, but how (you may well ask) can we use
    ANOVA to compare means?
  • What we do is use a ratio (of the form used in
    all test statistics, i.e., a ratio of
    information to error) to compare two estimates of
    variance.
  • One estimate is based on variation between groups
    (information).
  • The other estimate is based on the variation
    within groups (error). It is simply the pooled
    variance for our groups.

Slide 10
  • 3. Using ANOVA to compare means
  • If both of these estimates turn out to be fairly
    similar then we cannot reject the null hypothesis
    of no difference between groups, and we cannot
    conclude that the means are drawn from different
    populations.
  • If there is a big difference between the
    estimates such that the variation between groups
    (information) is much greater than the variation
    within them (error) then one will be much bigger
    than the other and we will get a large F-ratio.
  • If it is large enough then we reject the null
    hypothesis that they are drawn from the same
    population (i.e., that there are no differences
    between means).

Slide 11
  • 3. Using ANOVA to compare means
  • We wont go through all the details of
    calculation (see HM for details) but we can use
    the above example to see how this works in
    principle.
  • In order to work out the information term for the
    above test statistic, the first thing we need is
    an estimate of variance based on the means. We
    can construct such an estimate from the squared
    deviations between the overall grand mean and the
    cell means.
  • The unbiased estimate of the variance of the
    means (when we have equal cell sizes) is
  • Σ(X̄i − X̄)² / (k − 1)
  • where k is the number of cells, X̄i is the mean
    for each group, and X̄ is the grand mean.

Slide 12
  • 3. Using ANOVA to compare means
  • To correct for bias (associated with the law of
    large numbers see HM) we need to multiply this
    variance by the cell size.
  • These variance estimates are called mean squares
    or MS because the sums of squares are averaged.
  • The mean square we calculate here is called the
    between-cells mean square (the shorthand notation
    is MSB):
  • MSB = Σ n(X̄i − X̄)² / (k − 1)
  • In this case, the mean for the control group is
    4.40 and the mean for the experimental group is
    6.40. Here we can work out that the grand mean
    pooled across the two groups is 5.40 (because the
    cell sizes are the same, the grand mean is just
    the average of the cell means).

Slide 13
  • 3. Using ANOVA to compare means
  • So here:
  • MSB = (10 × (4.40 − 5.40)² + 10 × (6.40 − 5.40)²) / 1
  •          = (10 × 1² + 10 × 1²) / 1
  •          = 20
  • This value is associated with dfB degrees of
    freedom,
  • where dfB = k − 1 and k is the number of cells.
  • As we noted above, the bottom line of our test
    statistic is an error term, involving the average
    difference within groups. We can find this by
    just pooling the variances we obtained
    previously. In Lecture 8 we saw that this value
    was 2.38.
  • This value is associated with dfW degrees of
    freedom,
  • where dfW = N − k (the total number of
    participants minus the number of cells).
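As a cross-check on the arithmetic above, here is a minimal Python sketch using the cell means (4.40 and 6.40) and cell size (10) from the lecture:

```python
# Between-cells mean square (MSB) for the lecture's worked example.
means = [4.40, 6.40]   # control and experimental group means
n = 10                 # participants per cell (equal cell sizes)
k = len(means)         # number of cells

# With equal cell sizes, the grand mean is just the average of the cell means.
grand_mean = sum(means) / k

# MSB = sum over cells of n * (cell mean - grand mean)^2, divided by k - 1.
ms_between = sum(n * (m - grand_mean) ** 2 for m in means) / (k - 1)
df_between = k - 1

print(round(grand_mean, 2))   # 5.4
print(round(ms_between, 2))   # 20.0
print(df_between)             # 1
```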

Slide 14
  • 3. Using ANOVA to compare means
  • Here:
  • dfW = 20 − 2 = 18
  • Now the F-ratio can be calculated by dividing MSB
    by MSW:
  • F = 20 / 2.38 = 8.41
  • In order to evaluate this F-ratio we can compare
    it to the F-distribution with 1 and 18 degrees of
    freedom.
  • From Table C.4 in HM we can see that this value
    is larger than the tabled value for F(1, 18) with
    α = .05 of 4.41.
  • Indeed, it is larger than the value with α = .01.
    In other words, if there were no difference
    between the groups then we would expect to find a
    difference that big on less than 1 in 100 random
    selections of two samples of that size.
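A quick sketch of the F computation, using MSB = 20 from the previous slide and the pooled within-groups variance (2.38) from Lecture 8. Note that 20 / 2.38 comes out at about 8.40; the slide's 8.41 reflects rounding in the intermediate values:

```python
# F-ratio for the worked example: information / error.
ms_between = 20.0    # between-cells mean square, from the previous slide
ms_within = 2.38     # pooled within-groups variance, from Lecture 8

f_ratio = ms_between / ms_within
df_between, df_within = 1, 18   # k - 1 and N - k for two groups of 10

# Critical value for F(1, 18) at alpha = .05, from Table C.4 in HM.
f_crit_05 = 4.41

print(round(f_ratio, 2))    # 8.4 (the slide reports 8.41 after rounding)
print(f_ratio > f_crit_05)  # True: reject the null hypothesis at alpha = .05
```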

Slide 15
  • 3. Using ANOVA to compare means
  • Referring back to Chapter 8, we see this gives us
    a similar answer to the between-subjects t-test.
  • Indeed, providing that there are only two groups
    to be compared (i.e., the numerator has one
    degree of freedom) there is an exact relationship
    between t and F, in that F = t².
  • So we can see here that our F-ratio of 8.41 is
    the same as the square of the t-value (2.90) that
    we obtained previously
  • (i.e., 2.90² = 8.41).
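The F = t² identity for two groups is easy to verify numerically (t = 2.90 is the value reported in Lecture 8):

```python
# With two groups, the F-ratio equals the square of the t-statistic.
t_value = 2.90                 # between-subjects t from Lecture 8
print(round(t_value ** 2, 2))  # 8.41, matching the F-ratio above
```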

Slide 16
  • 4. Using ANOVA to compare multiple means
  • Although the above example contains most of the
    key ingredients of ANOVA, it in fact represents
    the simplest form of the procedure that it is
    possible to conduct.
  • Indeed, a researcher wouldn't normally use ANOVA
    to analyse these data; they would simply use a
    t-test.
  • However, ANOVA and the F-ratio really come into
    their own when we have to compare results from
    more than two groups, and this is precisely why
    they are so useful.
  • Moreover, ANOVA can be used to analyse data from
    a study that has more than one factor, i.e.,
    where a study has more than one independent
    variable.

Slide 17
  • 4. Using ANOVA to compare multiple means
  • The procedures for conducting and interpreting
    the results from studies with more than two
    groups are very complex, and for this reason many
    people only ever perform them using a relevant
    statistical package (see HM support materials on
    the Web).
  • We won't go into these here, but you should read
    Chapter 10 in HM (pp. 281-331) to get an
    understanding of these issues.
  • What we will discuss, however, is the very
    important concept of statistical interaction (see
    HM pp. 301-303).
  • Interactions occur when the effect of one
    variable depends on the level of another
    variable.

Slide 18
  • 4. Using ANOVA to compare multiple means
  • As an example, people might eat a lot of biscuits
    only when they are both hungry and the biscuits
    are chocolate-coated (which is very different
    from eating a lot of biscuits whenever they are
    hungry, or whenever the biscuits are
    chocolate-coated).
  • The easiest way to understand such a statement is
    to represent it graphically, as follows:

[Graph: Likelihood of eating a biscuit (low to high) plotted against State of Hunger (not hungry, hungry), with separate lines for chocolate-coated and plain biscuits]
  • The key thing to note here is that the presence
    of an interaction is indicated by the fact that
    the lines are not parallel.
Slide 19
  • 4. Using ANOVA to compare multiple means
  • Interactions can take a number of different forms
    but, when plotted, all will involve non-parallel
    lines.
  • It is a very useful analytical skill to be able
    to make sense of the different forms that
    interactions take. For example, what is going on
    in the graph below?

[Graph: Likelihood of eating a biscuit (low to high) against State of Hunger (not hungry, hungry), with non-parallel lines for chocolate-coated and plain biscuits]
  • This interaction suggests that people are likely
    to eat chocolate biscuits whether they're hungry
    or not, but only eat plain biscuits if they're
    hungry.
Slide 20
  • 4. Using ANOVA to compare multiple means
  • The interaction below is called a cross-over
    interaction. This type of interaction is
    particularly important because if you looked for
    effects of biscuit coating and hunger separately
    (what are known as the main effects of these
    variables) you wouldn't find any.
  • In other words, you would wrongly conclude that
    coating and hunger don't affect biscuit eating
    when in fact they do, but only in interaction
    with each other.

[Graph: Likelihood of eating a biscuit (low to high) against State of Hunger (not hungry, hungry), with the chocolate-coated and plain lines crossing over]
  • This interaction suggests that people eat plain
    biscuits more when hungry than not hungry, but
    chocolate biscuits more when not hungry than when
    hungry.
  • It also suggests that when they are not hungry
    people are more likely to eat a chocolate biscuit
    than a plain one, but that when they are hungry
    the opposite is true.
Slide 21
  • 4. Using ANOVA to compare multiple means
  • And what is going on in the graph below?
  • This is a trick question, because there is no
    interaction here, just two main effects that
    have an additive effect.
  • One main effect is associated with coating
    (people are more likely to eat chocolate biscuits
    than plain ones); the other is associated with
    hunger (people are more likely to eat biscuits
    when hungry than not hungry).

[Graph: Likelihood of eating a biscuit (low to high) against State of Hunger (not hungry, hungry), with parallel lines for chocolate-coated and plain biscuits]
  • The absence of an interaction is indicated by the
    fact that the two lines are parallel.
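The parallel-lines rule can also be expressed arithmetically: in a 2 × 2 design there is no interaction when the effect of one factor is the same at both levels of the other. The cell means below are invented purely to illustrate the three patterns described on these slides:

```python
# For a 2 x 2 design, the interaction contrast is the difference of the
# two simple effects; zero means the plotted lines are exactly parallel.
def interaction_contrast(plain, chocolate):
    """Each argument is (mean when not hungry, mean when hungry)."""
    plain_effect = plain[1] - plain[0]             # effect of hunger, plain
    chocolate_effect = chocolate[1] - chocolate[0] # effect of hunger, chocolate
    return plain_effect - chocolate_effect

# Hypothetical cell means on a 0-10 "likelihood of eating" scale.
additive   = interaction_contrast(plain=(2, 6), chocolate=(5, 9))  # parallel
ordinal    = interaction_contrast(plain=(2, 8), chocolate=(8, 8))  # slide 19
cross_over = interaction_contrast(plain=(2, 8), chocolate=(8, 2))  # slide 20

print(additive)    # 0: no interaction, just two additive main effects
print(ordinal)     # 6: chocolate eaten regardless, plain only when hungry
print(cross_over)  # 12: cross-over interaction
```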
Slide 22
  • 4. Using ANOVA to compare multiple means
  • The statistical operations involved in analysing
    and interpreting interactions are complicated but
    important (again, see HM).
  • Indeed, ANOVA can become mind-numbingly complex,
    so it is worth alerting you to the main forms of
    this complexity:
  • 1. Designs can involve many more than two cells
    and there can be more than two factors in a given
    design.
  • 2. ANOVA can include within-subjects factors as
    well as between-subjects factors.
  • 3. ANOVA can involve more than one dependent
    variable (this is called Multivariate-ANOVA or
    MANOVA).
  • 4. The procedures for making comparisons between
    specific conditions within ANOVA (which one
    almost always needs to do) are enormously
    diverse.

Slide 23
  • 4. Using ANOVA to compare multiple means
  • 5. There are many ways to analyse variance
    depending on the features of the design and the
    cell sizes (especially where these are unequal).
  • 6. There are different procedures that need to be
    applied when the levels of the factors are not
    determined experimentally (i.e., where there is
    not random assignment to conditions).
  • Even without these complexities there are some
    other points to be wary about.
  • In particular, note that ANOVA is subject to the
    same assumptions as is the t-test (homogeneity of
    variance, normality and independence).
  • So, while it is a very powerful tool, it is one
    that is important to use carefully and
    prudently.