Analysis of Variance


1
Chapter 15
  • Analysis of Variance
  • (ANOVA)

2
Analysis of Variance
  • Analysis of variance is a technique that allows
    us to compare two or more populations of interval
    data.
  • Analysis of variance is
  • an extremely powerful and widely used procedure,
  • a procedure that determines whether differences exist between population means, and
  • a procedure that works by analyzing sample variance.

3
One-Way Analysis of Variance
  • Independent samples are drawn from k populations
  • Note These populations are referred to as
    treatments.
  • It is not a requirement that n1 = n2 = … = nk.

4

Table 15.01 Notation for the One-Way Analysis of
Variance

5

Notation
Independent samples are drawn from k populations (treatments):

  Sample 1: x11, x21, …, xn1,1
  Sample 2: x12, x22, …, xn2,2
  …
  Sample k: x1k, x2k, …, xnk,k

Each sample has a sample size nj and a sample mean x̄j. X is the response variable; its values are called responses.
6
One Way Analysis of Variance
  • New Terminology
  • x is the response variable, and its values are
    responses.
  • xij refers to the i th observation in the j th
    sample.
  • E.g. x35 is the third observation of the fifth
    sample.

x̄j = ( Σi=1..nj xij ) / nj = mean of the jth sample

nj = number of observations in the sample taken from the jth population
7
One Way Analysis of Variance

The grand mean, x̿, is the mean of all the observations:

  x̿ = ( Σj=1..k Σi=1..nj xij ) / n

where n = n1 + n2 + … + nk and k is the number of populations.
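As a quick illustration of this notation (a sketch added here, not part of the original slides; the arrays are just the first few observations from the apple juice data, truncated to make the samples unequal in size):

```python
import numpy as np

# Unbalanced samples from k = 3 treatments (sizes need not match).
samples = [
    np.array([529.0, 658.0, 793.0, 514.0]),          # treatment 1
    np.array([804.0, 630.0, 774.0]),                 # treatment 2
    np.array([672.0, 531.0, 443.0, 596.0, 602.0]),   # treatment 3
]

n_j = [len(s) for s in samples]                  # sample sizes n1, ..., nk
xbar_j = [s.mean() for s in samples]             # sample means
n = sum(n_j)                                     # n = n1 + n2 + ... + nk
grand_mean = sum(s.sum() for s in samples) / n   # mean of all n observations

print(n_j, xbar_j, grand_mean)
```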
8
One Way Analysis of Variance
  • More New Terminology
  • The unit that we measure is called an
    experimental unit.
  • Population classification criterion is called a
    factor.
  • Each population is a factor level.

9
Example 15-1
  • An apple juice company has a new product
    featuring
  • more convenience,
  • similar or better quality, and
  • lower price
  • when compared with existing juice products.
  • Which factor should an advertising campaign focus
    on?
  • Before going national, test markets are set up in three cities, each with its own campaign, and data are recorded.
  • Do differences in sales exist between the test
    markets?

10

City 1 (Convenience)   City 2 (Quality)   City 3 (Price)
529                    804                672
658                    630                531
793                    774                443
514                    717                596
663                    679                602
719                    604                502
711                    620                659
606                    697                689
461                    706                675
529                    615                512
498                    492                691
663                    719                733
604                    787                698
495                    699                776
485                    572                561
557                    523                572
353                    584                469
557                    634                581
542                    580                679
614                    624                532

(Weekly sales of apple juice packages over 20 weeks in each test market.)

Data
Xm15-01
11
Example 15.1
Terminology
  • x is the response variable, and its values are
    responses.
  • weekly sales is the response variable
  • the actual sales figures are the responses in
    this example.
  • xij refers to the ith observation in the jth sample.
  • E.g. x42 is the fourth week's sales in city 2: 717 pkgs.
  • x20,3 is the last week of sales for city 3: 532 pkgs.

(comma added for clarity)
12
Example 15.1
Terminology
  • The unit that we measure is called an
    experimental unit.
  • The response variable is weekly sales
  • Population classification criterion is called a
    factor.
  • The advertising strategy is the factor we're interested in. This is the only factor under consideration (hence the term one-way analysis of variance).
  • Each population is a factor level.
  • In this example, there are three factor levels
    convenience, quality, and price.

13

Terminology
In the context of this problem:
Response variable: weekly sales.
Responses: actual sale values.
Experimental unit: the weeks in the three cities when we record sales figures.
Factor: the criterion by which we classify the populations (the treatments). In this problem the factor is the marketing strategy.
Factor levels: the population (treatment) names. In this problem the factor levels are the marketing strategies.

14
Example 15.1
IDENTIFY
  • The null hypothesis in this case is
  • H0: µ1 = µ2 = µ3
  • i.e. there are no differences between population means.
  • Our alternative hypothesis becomes
  • H1: at least two means differ
  • OK. Now we need some test statistics.

15

The rationale of the test statistic
Two types of variability are employed when
testing for the equality of the population means

16
Graphical demonstration Employing two types of
variability
17
(Two charts for Treatments 1, 2, and 3: the same three sample means, one chart with small and one with large within-sample variability.)
The sample means are the same in both cases, but the larger within-sample variability makes it harder to draw a conclusion about the population means.
A small variability within the samples makes it easier to draw a conclusion about the population means.
18
The rationale behind the test statistic I
  • If the null hypothesis is true, we would expect
    all the sample means to be close to one another
    (and as a result, close to the grand mean).
  • If the alternative hypothesis is true, at least
    some of the sample means would differ.
  • Thus, we measure variability between sample
    means.

19
Variability between sample means
  • The variability between the sample means is
    measured as the sum of squared distances between
    each mean and the grand mean.
  • This sum is called the
  • Sum of Squares for Treatments
  • SST

In our example treatments are represented by the
different advertising strategies.
20
Sum of squares for treatments (SST)

  SST = Σj=1..k nj (x̄j − x̿)²

where there are k treatments, x̄j is the mean of sample j, and nj is the size of sample j.
Note: when the sample means are close to one another, their distance from the grand mean is small, leading to a small SST. Thus, a large SST indicates large variation between sample means, which supports H1.
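A minimal Python sketch of this formula (an added illustration, assuming each sample is a NumPy array as in the earlier sketch):

```python
import numpy as np

def sst(samples):
    """Between-treatments variation: SST = sum_j n_j * (xbar_j - grand_mean)**2."""
    n = sum(len(s) for s in samples)
    grand_mean = sum(s.sum() for s in samples) / n
    return sum(len(s) * (s.mean() - grand_mean) ** 2 for s in samples)
```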
21
Test Statistics
  • Since the equality of µ1, µ2, and µ3 is of interest to us, a statistic that measures the proximity of the sample means to each other would also be of interest.
  • Such a statistic exists, and is called the between-treatments variation. It is denoted SST, short for sum of squares for treatments. It is calculated as

    SST = Σj=1..k nj (x̄j − x̿)²

where x̿ is the grand mean and the sum runs across the k treatments. A large SST indicates large variation between sample means, which supports H1.
22
Example 15.1
COMPUTE
  • If it were the case that x̄1 = x̄2 = x̄3, then SST = 0 and our null hypothesis, H0: µ1 = µ2 = µ3, would be supported.
  • More generally, a small value of SST supports the null hypothesis. The question is, how small is small enough?

23
Example 15.1
COMPUTE
  • The following sample statistics and grand mean were computed: x̄1 = 577.55, x̄2 = 653.00, x̄3 = 608.65, and x̿ = 613.07.
  • Hence, the between-treatments variation, sum of squares for treatments, is SST = 20(577.55 − 613.07)² + 20(653.00 − 613.07)² + 20(608.65 − 613.07)² = 57,512.23.
  • Is SST = 57,512.23 large enough to indicate that the population means differ?

24

The rationale behind test statistic II
  • Large variability within the samples weakens the
    ability of the sample means to represent their
    corresponding population means.
  • Therefore, even though sample means may markedly
    differ from one another, SST must be judged
    relative to the within samples variability.

25

Within samples variability
  • The variability within samples is measured by
    adding all the squared distances between
    observations and their sample means.
  • This sum is called the
  • Sum of Squares for Error
  • SSE

In our example this is the sum of all squared differences between sales in city j and the sample mean of city j (over all three cities).
26
Test Statistics
  • SST gave us the between-treatments variation. A second statistic, SSE (Sum of Squares for Error), measures the within-treatments variation.
  • SSE is given by

    SSE = Σj=1..k Σi=1..nj (xij − x̄j)²   or   SSE = (n1 − 1)s1² + (n2 − 1)s2² + … + (nk − 1)sk²

  • In the second formulation, it is easier to see that it provides a measure of the amount of variation we can expect from the random variable we've observed.
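A matching sketch for SSE using the second formulation (again an added illustration, not from the slides):

```python
import numpy as np

def sse(samples):
    """Within-treatments variation: SSE = (n1-1)*s1^2 + ... + (nk-1)*sk^2.

    ddof=1 gives the sample variance s_j^2 with the n_j - 1 divisor.
    """
    return sum((len(s) - 1) * s.var(ddof=1) for s in samples)
```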

27
Example 15.1
COMPUTE
  • We calculate the three sample variances s1², s2², and s3²,

and from these calculate the within-treatments variation (sum of squares for error) as

  SSE = (n1 − 1)s1² + (n2 − 1)s2² + (n3 − 1)s3² = 506,983.50
28

Sum of squares for errors (SSE)
Is SST = 57,512.23 large enough relative to SSE = 506,983.50 to reject the null hypothesis that all the means are equal? We still need a couple more quantities in order to relate SST and SSE in a meaningful way.

29
Mean Squares
  • The mean square for treatments (MST) is given by MST = SST/(k − 1).
  • The mean square for errors (MSE) is given by MSE = SSE/(n − k).
  • The test statistic F = MST/MSE is F-distributed with k − 1 and n − k degrees of freedom.

Here ν1 = 3 − 1 = 2 and ν2 = 60 − 3 = 57.
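Putting the pieces together, a self-contained sketch of the F test (added for illustration; scipy.stats.f is SciPy's F distribution):

```python
import numpy as np
from scipy import stats

def one_way_f(samples, alpha=0.05):
    """One-way ANOVA F statistic, critical value, and p-value."""
    k = len(samples)
    n = sum(len(s) for s in samples)
    grand = sum(s.sum() for s in samples) / n
    sst = sum(len(s) * (s.mean() - grand) ** 2 for s in samples)
    sse = sum((len(s) - 1) * s.var(ddof=1) for s in samples)
    mst = sst / (k - 1)                            # mean square for treatments
    mse = sse / (n - k)                            # mean square for error
    f = mst / mse
    f_crit = stats.f.ppf(1 - alpha, k - 1, n - k)  # rejection region: F > f_crit
    p_value = stats.f.sf(f, k - 1, n - k)          # P(F > f)
    return f, f_crit, p_value
```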
30
Example 15.1
COMPUTE
  • We can calculate the mean square for treatments and the mean square for error as MST = SST/(k − 1) = 57,512.23/2 = 28,756.12 and MSE = SSE/(n − k) = 506,983.50/57 = 8,894.45.
31

Example 15.1
COMPUTE
Giving us our F statistic: F = MST/MSE = 28,756.12/8,894.45 = 3.23.
Does F = 3.23 fall into a rejection region or not? How does it compare to a critical value of F?
Note these required conditions:
1. The populations tested are normally distributed.
2. The variances of all the populations tested are equal.
32
Example 15.1
INTERPRET
  • Since the purpose of calculating the F statistic is to determine whether the value of SST is large enough to reject the null hypothesis, if SST is large, F will be large.
  • Hence our rejection region is F > Fα,k−1,n−k.
  • Our value for FCritical is F.05,2,57 ≈ 3.15.

33
Example 15.1
INTERPRET
  • Since F = 3.23 is greater than FCritical = 3.15, we reject the null hypothesis (H0: µ1 = µ2 = µ3) in favor of the alternative hypothesis (H1: at least two population means differ).
  • That is, there is enough evidence to infer that the mean weekly sales differ between the three cities.
  • Stated another way, we are quite confident that the strategy used to advertise the product will produce different sales figures.

34

35
Summary of Techniques (so far)
36
ANOVA Table
  • The results of analysis of variance are usually
    reported in an ANOVA table

Source of Variation   degrees of freedom   Sum of Squares   Mean Square
Treatments            k − 1                SST              MST = SST/(k − 1)
Error                 n − k                SSE              MSE = SSE/(n − k)
Total                 n − 1                SS(Total)

F statistic = MST/MSE
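In practice the whole table can be produced by a library call. SciPy's f_oneway performs exactly this test; the arrays below are transcribed from the Example 15.1 data, so this should reproduce F ≈ 3.23 (an added sketch, not the SPSS output shown on the slides):

```python
from scipy import stats

city1 = [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
         498, 663, 604, 495, 485, 557, 353, 557, 542, 614]  # convenience
city2 = [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
         492, 719, 787, 699, 572, 523, 584, 634, 580, 624]  # quality
city3 = [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
         691, 733, 698, 776, 561, 572, 469, 581, 679, 532]  # price

f_stat, p_value = stats.f_oneway(city1, city2, city3)
print(f_stat, p_value)  # expect F ≈ 3.23, p just under .05
```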
37

Table 15.2 ANOVA Table for the One-Way Analysis
of Variance

38

Table 15.3 ANOVA Table for Example 15.1

39

SPSS Output



40

41

Checking required conditions
Figure 15.3a Histogram of Sales, City 1
(Convenience)

42

Figure 15.3b Histogram of Sales, City 2 (Quality)

43

Figure 15.3c Histogram of Sales, City 3 (Price)

44
Can We Use a t-Test Instead of ANOVA?
  • We can't, for two reasons:
  • We would need to perform many more calculations. With six populations we would have to test C(6,2) = (6 × 5)/2 = 15 pairs.
  • It would increase the probability of making a Type I error from 5% to about 54%.
45
Relationship Between t and F Statistics
When testing two populations (k = 2), the F statistic equals the square of the t statistic:
  • F = t²

Hence we will draw exactly the same conclusion using analysis of variance as we did when we applied the t-test of µ1 − µ2.
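A quick check of this relationship (an added sketch with simulated data):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(0.0, 1.0, 20)
b = rng.normal(0.5, 1.0, 20)

t_stat, _ = stats.ttest_ind(a, b)   # pooled-variance t test (equal_var=True default)
f_stat, _ = stats.f_oneway(a, b)
print(t_stat**2, f_stat)            # the two agree: F = t^2
```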
46
Identifying Factors
  • Factors that Identify the One-Way Analysis of
    Variance

47
Analysis of Variance Experimental Designs
  • Experimental design is one of the factors that
    determines which technique we use.
  • In the previous example we compared three
    populations on the basis of one factor
    advertising strategy.
  • One-way analysis of variance is only one of many
    different experimental designs of the analysis of
    variance.

48
Analysis of Variance Experimental Designs
  • A multifactor experiment is one where there are
    two or more factors that define the treatments.
  • For example, if instead of just varying the advertising strategy for our new apple juice product we also vary the advertising medium (e.g. television or newspaper), then we have a two-factor analysis of variance situation.
  • The first factor, advertising strategy, still has
    three levels (convenience, quality, and price)
    while the second factor, advertising medium, has
    two levels (TV or print).

49

(Diagram: in one-way ANOVA a single factor's levels are the treatments; in two-way ANOVA the response is classified by two factors, A and B, each with its own levels.)
50
Independent Samples and Blocks
  • Similar to the matched pairs experiment, a
    randomized block design experiment reduces the
    variation within the samples, making it easier to
    detect differences between populations.
  • The term block refers to a matched group of
    observations from each population.
  • We can also perform a blocked experiment by using
    the same subject for each treatment in a
    repeated measures experiment.

51
Independent Samples and Blocks
  • The randomized block experiment is also called the two-way analysis of variance, not to be confused with the two-factor analysis of variance. To illustrate where we're headed,

we'll do this first.
52
Models of Fixed and Random Effects
  • Fixed effects
  • If all possible levels of a factor are included
    in our analysis we have a fixed effect ANOVA.
  • The conclusion of a fixed effect ANOVA applies
    only to the levels studied.
  • Random effects
  • If the levels included in our analysis represent
    a random sample of all the possible levels, we
    have a random-effect ANOVA.
  • The conclusion of the random-effect ANOVA applies
    to all the levels (not only those studied).

53
Models of Fixed and Random Effects.
  • In some ANOVA models the test statistic of the
    fixed effects case may differ from the test
    statistic of the random effect case.
  • Fixed and random effects examples:
  • Fixed effects: the advertisement example (15.1). All the levels of the marketing strategies were included.
  • Random effects: to determine if there is a difference in the production rate of 50 machines, four machines are randomly selected and their production recorded.

54
Randomized Block Analysis of Variance
  • The purpose of designing a randomized block
    experiment is to reduce the within-treatments
    variation to more easily detect differences
    between the treatment means.
  • In this design, we partition the total variation into three sources of variation:
  • SS(Total) = SST + SSB + SSE
  • where SSB, the sum of squares for blocks, measures the variation between the blocks.

55

Randomized Blocks

(Diagram: observations arranged in a treatments-by-blocks grid. A block groups all the observations with some commonality across treatments.)
56
Randomized Blocks
  • In addition to k treatments, we introduce notation for b blocks in our experimental design:

x̄[B]i = mean of the observations in the ith block (e.g. x̄[B]1 for the 1st block)
x̄[T]j = mean of the observations in the jth treatment (e.g. x̄[T]2 for the 2nd treatment)
57
Sum of Squares: Randomized Block
  • Squaring the distance from the grand mean leads to the following set of formulae:

    SST = b Σj=1..k (x̄[T]j − x̿)²
    SSB = k Σi=1..b (x̄[B]i − x̿)²
    SSE = Σj=1..k Σi=1..b (xij − x̄[T]j − x̄[B]i + x̿)²

The test statistic for treatments is F = MST/MSE; the test statistic for blocks is F = MSB/MSE.
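A sketch of this partition in code (added for illustration; x is assumed to be a b-by-k array with one row per block and one column per treatment):

```python
import numpy as np

def randomized_block_ss(x):
    """Partition SS(Total) = SST + SSB + SSE for a randomized block design."""
    b, k = x.shape
    grand = x.mean()
    treat_means = x.mean(axis=0)   # one mean per treatment
    block_means = x.mean(axis=1)   # one mean per block
    sst = b * ((treat_means - grand) ** 2).sum()
    ssb = k * ((block_means - grand) ** 2).sum()
    ss_total = ((x - grand) ** 2).sum()
    sse = ss_total - sst - ssb     # equivalently the double sum given above
    return sst, ssb, sse
```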
58
ANOVA Table
  • We can summarize this new information in an
    analysis of variance (ANOVA) table for the
    randomized block analysis of variance as follows

Source of Variation   d.f.            Sum of Squares   Mean Square                 F Statistic
Treatments            k − 1           SST              MST = SST/(k − 1)           F = MST/MSE
Blocks                b − 1           SSB              MSB = SSB/(b − 1)           F = MSB/MSE
Error                 n − k − b + 1   SSE              MSE = SSE/(n − k − b + 1)
Total                 n − 1           SS(Total)
59
Test Statistics Rejection Regions
60
Example 15.2
IDENTIFY
  • Are there differences in the effectiveness of four new cholesterol drugs? 25 groups of men were matched according to age and weight, and the results were recorded.
  • The hypotheses to test in this case are:
  • H0: µ1 = µ2 = µ3 = µ4
  • H1: At least two means differ

61

(Raw data: cholesterol-reduction scores for the four drugs across the 25 matched groups of men; the same data appear, cleanly laid out, in the Example 15.2 table below.)

62
Example 15.2
IDENTIFY
  • Each of the four drugs can be considered a treatment.
  • Each group can be treated as a block, because the men are matched by age and weight.
  • By setting up the experiment this way, we eliminate the variability in cholesterol reduction related to different combinations of age and weight. This helps detect differences in the mean cholesterol reduction attributed to the different drugs.

63

Example 15.2
The Data
Group Drug 1 Drug 2 Drug 3 Drug 4 Group Drug 1 Drug 2 Drug 3 Drug 4
1 6.6 12.6 2.7 8.7 14 19.6 17.0 19.2 21.9
2 7.1 3.5 2.4 9.3 15 20.7 21.0 18.7 22.1
3 7.5 4.4 6.5 10.0 16 18.4 27.2 18.9 19.4
4 9.9 7.5 16.2 12.6 17 21.5 26.8 7.9 25.4
5 13.8 6.4 8.3 10.6 18 20.4 28.0 23.8 26.5
6 13.9 13.5 5.4 15.4 19 21.9 31.7 8.8 22.2
7 15.9 16.9 15.4 16.3 20 22.5 11.9 26.7 23.5
8 14.3 11.4 17.1 18.9 21 21.5 28.7 25.2 19.6
9 16.0 16.9 7.7 13.7 22 25.2 29.5 27.3 30.1
10 16.3 14.8 16.1 19.4 23 23.0 22.2 17.6 26.6
11 14.6 18.6 9.0 18.5 24 23.7 19.5 25.6 24.5
12 18.7 21.2 24.3 21.1 25 28.4 31.2 26.1 27.4
13 17.3 10.0 9.3 19.3

64

65

SPSS Output

(Annotated output: the Blocks row has b − 1 degrees of freedom with mean square MSB; the Treatments row has k − 1 degrees of freedom with mean square MST.)
66

The p-value for determining whether differences exist between the four drugs (treatments) is .009. Thus we reject H0 in favor of the research hypothesis: at least two means differ. The p-value for groups (approximately 0) indicates that there are differences between the groups of men (blocks), that is, age and weight have an impact, but our experimental design accounts for that.

67

68

69
Identifying Factors
  • Factors that Identify the Randomized Block of the
    Analysis of Variance

70
Two-Factor Analysis of Variance
  • The original set-up for Example 15.1 examined one
    factor, namely the effects of the marketing
    strategy on sales.
  • Emphasis on convenience,
  • Emphasis on quality, or
  • Emphasis on price.
  • Suppose we introduce a second factor, that being
    the effects of the selected media on sales, that
    is
  • Advertise on television, or
  • Advertise in newspapers.
  • To which factor(s) or the interaction of factors
    can we attribute any differences in mean sales of
    apple juice?

71
More Terminology
  • A complete factorial experiment is an experiment
    in which the data for all possible combinations
    of the levels of the factors are gathered. This
    is also known as a two-way classification.
  • The two factors are usually labeled A and B, with the number of levels of each factor denoted by a and b respectively.
  • The number of observations for each combination is called a replicate, and is denoted by r. For our purposes, the number of replicates will be the same for each treatment, that is, they are balanced.

72
Example 15.3 Test Marketing of Advertising Strategies and Advertising Media
  • Strategy: Convenience, Quality, or Price
  • Media: Television or Newspaper
  • City 1: Convenience - Television
  • City 2: Convenience - Newspaper
  • City 3: Quality - Television
  • City 4: Quality - Newspaper
  • City 5: Price - Television
  • City 6: Price - Newspaper

73

Sales Data

C-1   C-2   C-3   C-4   C-5   C-6
491   464   677   689   575   803
712   559   627   650   614   584
558   759   590   704   706   525
447   557   632   652   484   498
479   528   683   576   478   812
624   670   760   836   650   565
546   534   690   628   583   708
444   657   548   798   536   546
582   557   579   497   579   616
672   474   644   841   795   587
74

The Data

Factor A, Strategy: Convenience, Quality, Price
Factor B, Medium: Television, Newspaper

Medium       Convenience   Quality   Price
Television   491           677       575
Television   712           627       614
Television   558           590       706
Television   447           632       484
Television   479           683       478
Television   624           760       650
Television   546           690       583
Television   444           548       536
Television   582           579       579
Television   672           644       795
Newspaper    464           689       803
Newspaper    559           650       584
Newspaper    759           704       525
Newspaper    557           652       498
Newspaper    528           576       812
Newspaper    670           836       565
Newspaper    534           628       708
Newspaper    657           798       546
Newspaper    557           497       616
Newspaper    474           841       587
75

Example 15.3
The Data
Factor A: Strategy. Factor B: Medium.
There are a = 3 levels of factor A and b = 2 levels of factor B, yielding 3 × 2 = 6 treatments; each treatment has r = 10 observations (replicates).
76
Possible Outcomes
Fig. 15.5
  • This figure illustrates the case where there are differences between the levels of A, but no difference between the levels of B and no interaction between A and B.

77
Possible Outcomes
Fig. 15.6
  • This figure illustrates the case where there are differences between the levels of B, but no differences between the levels of A and no interaction between A and B.

78
Possible Outcomes
Fig. 15.4
  • This figure illustrates the case where there are differences between the levels of A, and there are differences between the levels of B, but no interaction between A and B
  • (i.e. the factors affect sales independently, which means there is no interaction)

79
Possible Outcomes
Fig. 15.7
  • This figure shows the levels of A and B interacting.

80
ANOVA Table
Table 15.8
Source of Variation   d.f.             Sum of Squares   Mean Square                        F Statistic
Factor A              a − 1            SS(A)            MS(A) = SS(A)/(a − 1)              F = MS(A)/MSE
Factor B              b − 1            SS(B)            MS(B) = SS(B)/(b − 1)              F = MS(B)/MSE
Interaction           (a − 1)(b − 1)   SS(AB)           MS(AB) = SS(AB)/[(a − 1)(b − 1)]   F = MS(AB)/MSE
Error                 n − ab           SSE              MSE = SSE/(n − ab)
Total                 n − 1            SS(Total)
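Software can produce this table directly. A hedged sketch using the statsmodels formula API (the data frame below is a tiny made-up subset with r = 2 replicates per cell, taken from the first two television and newspaper rows above; the real design has r = 10, and the column names are assumptions):

```python
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

df = pd.DataFrame({
    "sales":    [491, 712, 677, 627, 575, 614,     # television rows 1-2
                 464, 559, 689, 650, 803, 584],    # newspaper rows 1-2
    "strategy": ["conv", "conv", "qual", "qual", "price", "price"] * 2,
    "medium":   ["tv"] * 6 + ["paper"] * 6,
})

model = smf.ols("sales ~ C(strategy) * C(medium)", data=df).fit()
print(anova_lm(model, typ=2))   # rows for factor A, factor B, A:B, and residual
```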
81
Two Factor ANOVA
  • Test for the differences between the levels of Factor A:
  • H0: The means of the a levels of Factor A are equal
  • H1: At least two means differ
  • Test statistic: F = MS(A)/MSE
  • Example 15.3: Are there differences in the mean sales caused by different marketing strategies?
  • H0: µconvenience = µquality = µprice
  • H1: At least two means differ
82
Two Factor ANOVA
  • Test for the differences between the levels of Factor B:
  • H0: The means of the b levels of Factor B are equal
  • H1: At least two means differ
  • Test statistic: F = MS(B)/MSE
  • Example 15.3: Are there differences in the mean sales caused by different advertising media?
  • H0: µtelevision = µnewspaper
  • H1: At least two means differ

83
Two Factor ANOVA
  • Test for interaction between Factors A and B:
  • H0: Factors A and B do not interact to affect the mean responses.
  • H1: Factors A and B do interact to affect the mean responses.
  • Test statistic: F = MS(AB)/MSE
  • Example 15.3: Are there differences in the mean sales caused by interaction between marketing strategy and advertising medium?
  • H0: µconvenience,television = µquality,television = … = µprice,newspaper
  • H1: At least two means differ

84

COMPUTE
SPSS Output
Factor B - Media
Factor A - Mktg Strategy
Interaction of AB
Error
85
Example 15.3
INTERPRET
There is evidence at the 5% significance level to infer that differences in weekly sales exist between the different marketing strategies (Factor A).
86
Example 15.3
INTERPRET
There is insufficient evidence at the 5% significance level to infer that differences in weekly sales exist between television and newspaper advertising (Factor B).
87
Example 15.3
INTERPRET
There is not enough evidence to conclude that there is an interaction between marketing strategy and advertising medium that affects mean weekly sales (interaction of Factor A and Factor B).
88

89
See for yourself
  • There are differences between the levels of
    factor A, no difference between the levels of
    factor B, and no interaction is apparent.

90
See for yourself
INTERPRET
  • These results indicate that emphasizing quality
    produces the highest sales and that television
    and newspapers are equally effective.

91

92
Identifying Factors
  • Independent Samples Two-Factor Analysis of
    Variance

93
Multiple Comparisons
  • When we conclude from the one-way analysis of variance that at least two treatment means differ (i.e. we reject the null hypothesis H0: µ1 = µ2 = … = µk), we often need to know which treatment means are responsible for these differences.
  • We will examine three statistical inference procedures that allow us to determine which population means differ:
  • Fisher's least significant difference (LSD) method,
  • the Bonferroni adjustment, and
  • Tukey's multiple comparison method.

94
Multiple Comparisons
  • Two means are considered different if the difference between the corresponding sample means is larger than a critical number. The general case for this is:
  • IF |x̄i − x̄j| > NCritical
  • THEN we conclude µi and µj differ.
  • The larger sample mean is then believed to be associated with a larger population mean.

95
Fisher's Least Significant Difference
  • What is this critical number, NCritical?
  • One measure is the Least Significant Difference, given by

    LSD = tα/2, n−k √( MSE (1/ni + 1/nj) )

  • LSD will be the same for all pairs of means if all k sample sizes are equal. If some sample sizes differ, LSD must be calculated for each combination.
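A sketch of the LSD computation (added; scipy.stats.t.ppf gives the t critical value):

```python
import math
from scipy import stats

def fisher_lsd(mse, n_i, n_j, df_error, alpha=0.05):
    """LSD = t_{alpha/2, n-k} * sqrt(MSE * (1/n_i + 1/n_j))."""
    t_crit = stats.t.ppf(1 - alpha / 2, df_error)
    return t_crit * math.sqrt(mse * (1 / n_i + 1 / n_j))

# Example 15.1: MSE = 8,894.45, equal sample sizes of 20, n - k = 57
print(fisher_lsd(8894.45, 20, 20, 57))  # about 59.7
```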

96
Back to Example 15.1
  • With k = 3 treatments (marketing strategy based on convenience, quality, or price), we will perform three comparisons based on the sample means.

We compare these differences to the Least Significant Difference, which (at 5% significance, with MSE = 8,894.45 and nj = 20) works out to roughly 59.7.
97
Example 15.1: Fisher's LSD
We conclude that only the means for convenience and quality differ.
98
Bonferroni Adjustment to LSD Method
  • Fisher's method may result in an increased probability of committing a Type I error.
  • We can adjust Fisher's LSD calculation by using the Bonferroni adjustment.
  • Where we used alpha (α), say .05, previously, we now use an adjusted value α* = α/C,
  • where C = k(k − 1)/2 is the number of pairwise comparisons.
99
Example 15.1: Bonferroni's Adjustment
  • Since we have k = 3 treatments, C = k(k − 1)/2 = 3(2)/2 = 3; hence we set our new alpha value to α* = .05/3 = .0167.
  • Thus, instead of using t.05/2 in our LSD calculation, we are going to use t.0167/2.
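In code the adjustment is one line (an added sketch):

```python
from scipy import stats

k = 3
C = k * (k - 1) // 2          # number of pairwise comparisons = 3
alpha_adj = 0.05 / C          # 0.0167
t_crit = stats.t.ppf(1 - alpha_adj / 2, 57)   # replaces t_{.05/2} in the LSD formula
```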

100

Bonferroni

Similar result as before but different Std.error
and Sig
101
Tukey's Multiple Comparison Method
  • As before, we are looking for a critical number to compare the differences of the sample means against. In this case:

    ω = qα(k, ν) √( MSE / ng )

where ng is the harmonic mean of the sample sizes and qα(k, ν) is the critical value of the Studentized range with ν = n − k degrees of freedom (Table 7, Appendix B).
  • Note: ω is a lowercase omega, not a w.
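A sketch of ω in code (added; scipy.stats.studentized_range is available in SciPy 1.7 and later):

```python
import math
from scipy import stats

def tukey_omega(k, df_error, mse, n_g, alpha=0.05):
    """omega = q_alpha(k, nu) * sqrt(MSE / n_g), n_g = harmonic mean sample size."""
    q = stats.studentized_range.ppf(1 - alpha, k, df_error)
    return q * math.sqrt(mse / n_g)

# Example 15.1: k = 3 treatments, nu = 57, MSE = 8,894.45, n_g = 20
print(tukey_omega(3, 57, 8894.45, 20))
```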
102
Example 15.1: Tukey's Method

Similar result as before but different Std.error
and Sig
103
Which method to use?
  • In example 15.1, all three multiple comparison
    methods yielded the same results. This will not
    always be the case! Generally speaking
  • If you have identified two or three pairwise
    comparisons that you wish to make before
    conducting the analysis of variance, use the
    Bonferroni method.
  • If you plan to compare all possible combinations, use Tukey's comparison method.

104
Nonparametric Tests for Two or More Populations

105
Kruskal-Wallis Test
  • So far we've been comparing the locations of two populations; now we'll look at comparing two or more populations.
  • The Kruskal-Wallis test is applied to problems where we want to compare two or more populations of ordinal or interval (but nonnormal) data from independent samples.
  • Our hypotheses will be:
  • H0: The locations of all k populations are the same.
  • H1: At least two population locations differ.

106
Test Statistic
  • In order to calculate the Kruskal-Wallis test statistic, we need to:
  • Rank all the observations from smallest (1) to largest (n), and average the ranks in the case of ties.
  • Calculate the rank sums for each sample: T1, T2, …, Tk.
  • Lastly, calculate the test statistic (denoted H):

    H = [ 12 / (n(n + 1)) × Σj=1..k Tj²/nj ] − 3(n + 1)
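SciPy implements this test directly; the sketch below (added here) uses the shift-rating data from Example 21.5 further down. Note that scipy.stats.kruskal applies a tie correction, so its H can differ slightly from a hand computation that ignores ties:

```python
from scipy import stats

shift1 = [4, 4, 3, 4, 3, 3, 3, 3, 2, 3]   # 4:00 P.M. to midnight
shift2 = [3, 4, 2, 2, 3, 4, 3, 3, 2, 3]   # midnight to 8:00 A.M.
shift3 = [3, 1, 3, 2, 1, 3, 4, 2, 4, 1]   # 8:00 A.M. to 4:00 P.M.

h, p = stats.kruskal(shift1, shift2, shift3)  # ranks all 30 values, averages ties
print(h, p)
```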

107
Sampling Distribution of the Test Statistic
  • For sample sizes greater than or equal to 5, the test statistic H is approximately chi-squared distributed with k − 1 degrees of freedom.
  • Our rejection region is H > χ²α,k−1
  • And our p-value is P(χ² > H)

108

Figure 21.10 Sampling Distribution of H
109
Example 21.5
IDENTIFY
  • Can we compare customer ratings (4 = good, 1 = poor) for speed of service across three shifts in a fast food restaurant? Our hypotheses will be:
  • H0: The locations of all 3 populations are the same
  • (that is, there is no difference in service between shifts), and
  • H1: At least two population locations differ.
  • Customer ratings for service were recorded.

110
Example 21.5
  • 10 customers were selected at random from each
    shift

4:00 P.M. to midnight:    4 4 3 4 3 3 3 3 2 3
Midnight to 8:00 A.M.:    3 4 2 2 3 4 3 3 2 3
8:00 A.M. to 4:00 P.M.:   3 1 3 2 1 3 4 2 4 1
111
Example 21.5
COMPUTE
  • One way to solve the problem is to take the
    original data,
  • stack it, and then
  • sort by customer response
  • rank bottom to top

sorted by response
112
Example 21.5
COMPUTE
  • Once it's in stacked format, put in straight rankings from 1 to 30, average the rankings for the same response, then parse them out by shift to come up with rank sum totals.
113
Example 21.5
COMPUTE

The test statistic works out to H = 2.64.
Our critical value of chi-squared (5% significance and k − 1 = 2 degrees of freedom) is 5.99147; hence there is not enough evidence to reject H0.
114
Example 21.5
  • There is not enough evidence to infer that a
    difference in speed of service exists between the
    three shifts, i.e. all three of the shifts are
    equally rated, and any action to improve service
    should be applied to all three shifts

INTERPRET
115

Example 21.5
COMPUTE
  • (Same conclusion as above; compare the p-value in the output to α.)
116

SPSS Output

(The output leads to the same conclusion as above.)
117
Identifying Factors
  • Factors that Identify the Kruskal-Wallis Test

118
Friedman Test
  • The Friedman test is a technique used to compare two or more populations of ordinal or interval (nonnormal) data that are generated from a matched pairs experiment.
  • The hypotheses are the same as before:
  • H0: The locations of all k populations are the same.
  • H1: At least two population locations differ.

119
Friedman Test Test Statistic
  • Since this is a matched pairs experiment, we first rank each observation within each of b blocks from smallest to largest (i.e. from 1 to k), averaging any ties. We then compute the rank sums T1, T2, …, Tk. Then we calculate our test statistic:

    Fr = [ 12 / (b k (k + 1)) × Σj=1..k Tj² ] − 3b(k + 1)
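SciPy's friedmanchisquare computes Fr from the raw blocked data (an added sketch; the scores below are invented except for applicants 1, 2, and 7, whose patterns are described in Example 21.6 below):

```python
from scipy import stats

# One list per manager (treatment), aligned by applicant (block); 1 = good, 5 = poor.
mgr_u = [2, 4, 2, 5, 1, 3, 5, 2]
mgr_v = [1, 2, 1, 2, 1, 1, 1, 1]
mgr_w = [2, 3, 3, 4, 2, 4, 4, 3]
mgr_x = [2, 2, 2, 3, 2, 3, 5, 2]

fr, p = stats.friedmanchisquare(mgr_u, mgr_v, mgr_w, mgr_x)
print(fr, p)
```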

120
Friedman Test Test Statistic
  • This test statistic is approximately chi-squared distributed with k − 1 degrees of freedom (provided either k or b is greater than or equal to 5). Our rejection region is Fr > χ²α,k−1, and our p-value is P(χ² > Fr).
121

Sampling Distribution of the Test Statistic
The test statistic is approximately chi-squared distributed with k − 1 degrees of freedom, provided either k or b is greater than or equal to 5. The rejection region is Fr > χ²α,k−1 and the p-value is P(χ² > Fr). The figure on the next slide depicts the sampling distribution and p-value.

122

Figure 21.11 Sampling Distribution of Fr
123
Example 21.6
IDENTIFY
  • Four managers evaluate and score job applicants on a scale from 1 (good) to 5 (not so good). There have been complaints that the process isn't fair. Is it the case that all managers score the candidates equally or not? That is:

124

Example 21.6
IDENTIFY
  • H0: The locations of all 4 populations are the same.
  • (i.e. all managers score candidates alike)
  • H1: At least two population locations differ.
  • (i.e. there is some disagreement between managers on scores)
  • The rejection region is
  • Fr > χ²α,k−1 = χ².05,3 = 7.81473

125
Example 21.6
COMPUTE
  • The data look like this:

There are k = 4 populations (managers) and b = 8 blocks (applicants) in this set-up.
126

Example 21.6
COMPUTE
Applicant 1, for example, received a top score from manager v and next-to-top scores from the other three. Applicant 7 received a top score from manager v as well, but the other three scored this candidate very low.

127
Example 21.6
COMPUTE
  • Rank each observation within each block from smallest to largest (i.e. from 1 to k), averaging any ties. For example, consider the case of candidate 2:

                    Manager u   Manager v   Manager w   Manager x   checksum
Original scores         4           2           3           2
Straight ranking        4           1           3           2          10
Averaged ranking        4          1.5          3          1.5         10

(checksum: the ranks in each block must sum to 1 + 2 + … + k = 10 for k = 4)
128
Example 21.6
COMPUTE
  • Compute the rank sums T1, T2, …, Tk and our test statistic:

129

Example 21.6
COMPUTE
The rejection region is Fr > χ²α,k−1 = χ².05,3 = 7.81473


130
Example 21.6
INTERPRET
  • The value of our Friedman test statistic is Fr = 10.61, compared to a critical value of chi-squared (at 5% significance and 3 d.f.) of 7.81473.
  • Thus, there is sufficient evidence to reject H0 in favor of H1.

It appears that the managers' evaluations of applicants do indeed differ.
131

SPSS Output

132
Identifying Factors
  • Factors that Identify the Friedman Test