NonParametric Tests - PowerPoint PPT Presentation

Slides: 85
Provided by: iiMet

1
Chapter_13
  • Non-Parametric Tests
  • Field_2005

2
What are non-parametric tests?
  • They do not make any parametric assumptions about
    the data such as normality, homogeneity of
    variances, etc.
  • They are therefore also called 'assumption-free'
    tests
  • They work on the principle of ranking data. The
    lowest score receives the rank 1, the next
    highest score the rank 2, etc., without implying
    that the intervals between the ranks are equal.
    Low scores will be represented as low ranks, high
    scores as high ranks. The analysis is then
    carried out on the ranks and not on the original
    scores.
  • 4 tests will be considered here:
  • - Wilcoxon rank-sum test (Mann-Whitney test)
  • - Wilcoxon signed-rank test
  • - Kruskal-Wallis test
  • - Friedman's test

3
Outlook Terminology
4
Wilcoxon rank-sum test and Mann-Whitney test
  • With these two tests you can compare 2
    independent conditions.
  • They are the non-parametric equivalent of the
    independent t-test.
  • Example: The effect of Ecstasy vs. Alcohol is
    measured using the Beck Depression Inventory
    (BDI).

5
The data: effect of Ecstasy vs. Alcohol
Depression scores were obtained one day after
taking the drug (Sunday) and three days later
(Wednesday) to find out whether depression
develops over time.
6
The theory
  • The scores are translated into ranks.
  • The lowest score gets the lowest rank, the next
    higher score the next higher rank up to the
    highest rank.
  • If there is no difference in the depression level
    between Ecstasy and Alcohol, a similar number of
    low and high ranks should be found in each group.
    If we add up the ranks, the summed total of ranks
    in each group should be about the same.
  • If there is a difference between the two groups,
    e.g., Ecstasy produces higher levels of
    depression, one would find higher ranks in the
    Ecstasy group and lower ranks in the Alcohol
    group.

7
Ranking of the data (Sunday and Wednesday)
Identical scores share 'tied ranks'. The actual value
of a tied rank is the average of the ranks that
constitute it. E.g., the 2 occurrences of the score
6 in the Wednesday data occupy ranks 3 and 4, so
each receives the rank 3.5. The ranks are summed up
for each group and day (Alcohol Wednesday = 59,
Ecstasy Wednesday = 151, Alcohol Sunday = 90.5,
Ecstasy Sunday = 119.5). The lower of the two sums
serves as the test statistic $W_s$. For Wednesday
this is $W_s = 59$. For Sunday, it is $W_s = 90.5$.
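The ranking rule above can be sketched in a few lines of Python. This is only an illustration: the raw Wednesday scores are not listed in the transcript, so the two lists below are an assumption, chosen so that they reproduce the rank sums quoted on the slide (59 and 151). The helper name `average_ranks` is hypothetical.

```python
def average_ranks(values):
    """Rank values from lowest (rank 1) upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + 1 + end + 1) / 2  # mean of the 1-based positions in the tie block
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks

# ASSUMED Wednesday BDI scores (not given in the transcript).
alcohol_wed = [5, 6, 30, 8, 9, 7, 6, 17, 3, 10]
ecstasy_wed = [28, 35, 35, 24, 39, 32, 27, 29, 36, 35]

# Rank all 20 scores together, then sum the ranks per group.
ranks = average_ranks(alcohol_wed + ecstasy_wed)
rank_sum_alcohol = sum(ranks[:len(alcohol_wed)])
rank_sum_ecstasy = sum(ranks[len(alcohol_wed):])
w_s = min(rank_sum_alcohol, rank_sum_ecstasy)  # the lower sum is the test statistic
print(rank_sum_alcohol, rank_sum_ecstasy, w_s)
```

With these assumed scores the two 6s share rank 3.5, as described on the slide.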
8
The test statistic, its mean and SE (Wilcoxon
rank-sum test)
  • Lower sum of ranks for Wednesday: $W_s = 59$
    ($W_s$ = Wilcoxon sum)
  • Lower sum of ranks for Sunday: $W_s = 90.5$
  • Mean of the test statistic (mean of the Wilcoxon
    sum, $\bar{W}_s$):
  • $\bar{W}_s = \frac{n_1(n_1+n_2+1)}{2} = \frac{10(10+10+1)}{2} = 105$
  • SE of the test statistic (SE of $\bar{W}_s$):
  • $SE_{\bar{W}_s} = \sqrt{\frac{n_1 n_2 (n_1+n_2+1)}{12}} = \sqrt{\frac{(10 \times 10)(10+10+1)}{12}} = 13.23$

Why 12?
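One way to answer the question is the following short derivation. It is a sketch that assumes no tied ranks; the symbol $N = n_1 + n_2$ is introduced here and does not appear on the slide. The 12 comes from the variance of the integers 1 to N.

```latex
\begin{align*}
% Variance of a single rank drawn from 1, 2, ..., N (discrete uniform):
\sigma^2_{\text{rank}} &= \frac{N^2 - 1}{12} \\
% W_s is the sum of n_1 ranks drawn without replacement from the N ranks:
\operatorname{Var}(W_s) &= n_1 \, \sigma^2_{\text{rank}} \, \frac{N - n_1}{N - 1}
  = n_1 \, \frac{(N-1)(N+1)}{12} \, \frac{n_2}{N-1}
  = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}
\end{align*}
```

Taking the square root gives the SE formula used on the slide.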
9
Test statistic as z-score, significance
  • $z = \frac{X - \bar{X}}{s} = \frac{W_s - \bar{W}_s}{SE_{\bar{W}_s}}$
  • $z_{Sunday} = \frac{90.5 - 105}{13.23} = -1.10$ (ns)
  • $z_{Wednesday} = \frac{59 - 105}{13.23} = -3.48$
  • If the absolute value of a z-score is greater
    than 1.96 (irrespective of its + or - sign),
    the test is significant at p < .05.
  • → The group difference for Sunday is n.s.,
    whereas
  • → the group difference for Wednesday is
    significant
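The calculation on the last two slides can be checked with a small Python sketch. It uses only the rank sums and group sizes given above. The function name `rank_sum_z` is hypothetical, and the p-value comes from the normal approximation without any tie correction, so it may differ slightly from what SPSS prints.

```python
from math import sqrt
from statistics import NormalDist

def rank_sum_z(w_s, n1, n2):
    """Return (mean, SE, z, two-sided p) for a Wilcoxon rank sum w_s.

    n1 is the size of the group whose rank sum is w_s, n2 the size of the other group.
    """
    mean_w = n1 * (n1 + n2 + 1) / 2
    se_w = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w_s - mean_w) / se_w
    p_two_sided = 2 * NormalDist().cdf(-abs(z))  # normal approximation
    return mean_w, se_w, z, p_two_sided

for day, w_s in [("Sunday", 90.5), ("Wednesday", 59.0)]:
    mean_w, se_w, z, p = rank_sum_z(w_s, 10, 10)
    print(f"{day}: mean={mean_w}, SE={se_w:.2f}, z={z:.2f}, p={p:.4f}")
```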

10
Mann-Whitney (U) test
  • The Mann-Whitney test is similar to the Wilcoxon
    rank-sum test but uses the U test statistic.
  • $U = N_1 N_2 + \frac{N_1(N_1+1)}{2} - R_1$
  • $U_{Sunday} = (10 \times 10) + \frac{10(11)}{2} - 119.5 = 35.50$
  • $U_{Wednesday} = (10 \times 10) + \frac{10(11)}{2} - 151.0 = 4.00$
  • SPSS produces both statistics. Because they are
    related, they always lead to the same
    conclusion, so you can report either.

$R_1$ = sum of ranks of Group 1, here Ecstasy:
119.5 for Sunday and 151 for Wednesday.
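A minimal sketch of the U formula, using the rank sums quoted above. The function name `mann_whitney_u` is hypothetical. The second half illustrates why U and the Wilcoxon rank sum "always say the same": they differ only by a constant.

```python
def mann_whitney_u(rank_sum, n_group, n_other):
    """U computed from the rank sum of one group, following the slide's formula."""
    return n_group * n_other + n_group * (n_group + 1) / 2 - rank_sum

n = 10  # participants per group

u_sunday = mann_whitney_u(119.5, n, n)     # from the Ecstasy rank sum, Sunday
u_wednesday = mann_whitney_u(151.0, n, n)  # from the Ecstasy rank sum, Wednesday
print(u_sunday, u_wednesday)

# U from the Ecstasy rank sum equals the Alcohol rank sum minus n(n+1)/2.
print(90.5 - n * (n + 1) / 2, 59.0 - n * (n + 1) / 2)

# The U values obtained from the two groups always add up to n1 * n2.
print(u_sunday + mann_whitney_u(90.5, n, n))
```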
11
Data input (Ecstasy_Alc) and provisional analysis
  • For a between-subjects test, we need a coding
    variable (as in a between-subjects t-test), e.g.
  • 'drug': 1 = ecstasy, 2 = alcohol
  • We then have one column for the dependent variable
    BDI on Sunday (sunbdi) and one for BDI on
    Wednesday (wedbdi).

12
Before running the analysis: run a test of
normality (Analyze → Descriptive Statistics →
Explore)
  • sunbdi and wedbdi go to the dependent list
  • 'drug administered' goes to the factor list
  • In 'Plots', tick 'Normality plots with tests' to
    request the tests of normality

13
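Test of Normality
[SPSS output table 'Tests of Normality': Sunday BDI
is non-normal for Ecstasy and normal (ns) for
Alcohol; Wednesday BDI is normal (ns) for Ecstasy
and non-normal for Alcohol.]
  • Both the K-S test and the Shapiro-Wilk test tell
    us that the distributions for Ecstasy-sunbdi and
    Alcohol-wedbdi are not normal.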

14
Decision for a non-parametric test
  • As we have seen, some of the distributions are
    non-normal. What can you do?
  • Transform the data (z-, logarithmic, etc.)
  • Choose a non-parametric test

15
Homogeneity of variances
You can request the homogeneity test in the
'Options' of a simple one-way ANOVA. It is also
produced automatically if you run a t-test for
independent samples.
  • Levene's test is n.s. → the variances of the two
    groups can be treated as equal, for both the
    Sunday and the Wednesday data

16
Further descriptives: Analyze → Descriptive
Statistics → Frequencies (or → Descriptives)
  • Request basic descriptive statistics such as the
    mean, median, SD, variance.
  • Note that for a non-parametric test, the median
    is a better indicator of the central tendency
    than the mean.

17
Running the analysis (using your own
Ecstasy_Alc.sav)
  • Analyze → Nonparametric Tests → 2 Independent
    Samples

Transfer sunbdi and wedbdi to the 'Test Variable
List' and 'drug' to the 'Grouping Variable'
window.
18
Specifying the dialog boxes
Define the two levels of the grouping variable
Request 'Descriptives' in the 'Options' box
If you have installed 'Exact Tests', you can
request such an 'Exact Test'
19
Exact Test
  • You may or may not have 'Exact...' in your main
    dialog box (I haven't). The 'Exact Test' is an
    extra module of SPSS which needs to be installed.
  • It enables an exact test of the significance of
    the test statistic, which is a good thing to
    have for small samples. However, it is a very
    time-demanding procedure (can take really
    long...).
  • Instead of an 'Exact Test', a less intensive test
    based on the 'Monte Carlo' method can be
    requested.
  • In the Monte Carlo method, a distribution similar
    to that of the sample is created and then many
    samples (up to 10,000) are drawn from it, for
    which the mean significance value and confidence
    intervals are computed.
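To give a feel for the Monte Carlo idea, here is a generic permutation-style sketch in Python. It is not SPSS's algorithm, which the transcript does not describe. It simply reshuffles the group labels many times and counts how often a reshuffled rank sum lies at least as far from its expected value as the observed one. The scores are an assumption (they are not listed in the transcript), and the function names are hypothetical.

```python
import random

def rank_sum_first_group(first, second):
    """Sum of the (tie-averaged) ranks of `first` within the pooled sample."""
    pooled = sorted(first + second)

    def avg_rank(x):
        # First 1-based position of x plus half the extra tied positions.
        return pooled.index(x) + 1 + (pooled.count(x) - 1) / 2

    return sum(avg_rank(x) for x in first)

def monte_carlo_p(first, second, n_resamples=10_000, seed=0):
    """Two-sided Monte Carlo p-value for the rank sum of `first`."""
    rng = random.Random(seed)
    n1 = len(first)
    pooled = first + second  # new list, so the inputs are not modified
    expected = n1 * (len(pooled) + 1) / 2
    observed = abs(rank_sum_first_group(first, second) - expected)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # random reassignment of scores to groups
        resampled = abs(rank_sum_first_group(pooled[:n1], pooled[n1:]) - expected)
        if resampled >= observed:
            hits += 1
    return (hits + 1) / (n_resamples + 1)

# ASSUMED Wednesday BDI scores (not given in the transcript).
alcohol_wed = [5, 6, 30, 8, 9, 7, 6, 17, 3, 10]
ecstasy_wed = [28, 35, 35, 24, 39, 32, 27, 29, 36, 35]
print(monte_carlo_p(alcohol_wed, ecstasy_wed))
```

The estimate depends on the seed and the number of resamples, which is why such procedures are usually reported together with a confidence interval.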

20
Other options for the Mann-Whitney test
Do not confuse the K-S test for normality with
the K-S Z-test!
  • Kolmogorov-Smirnov Z: The K-S Z-test tests whether
    two samples have been drawn from the same
    population. In this respect, it does the same as
    the M-W test. The K-S Z-test has even better power for
    small samples (n
  • Moses Extreme Reactions: This test compares the
    variability of scores in the two groups; hence it
    is like a non-parametric Levene test.
  • Wald-Wolfowitz runs: This is a variant of the
    M-W test which looks for 'runs' of scores in a
    row from the same group, e.g.
  • AAAAAAAAAAAAEEEEAEEEEEEEE
  • If the groups are different, runs or ranks for
    each group should cluster at different ends of
    the distribution.

21
Output from Mann-Whitney Test
Mean Rank = Sum of Ranks / n
Sum of Ranks = all ranks of a group summed up
22
Test statistics of the M-W test
Wilcoxon W = the lower $W_s$ for sunbdi and wedbdi.
For the Sunday BDI, there are no differences
between the two groups Ecstasy vs. Alcohol. For
the Wednesday BDI, there are significant
differences: the average rank is higher
in the Ecstasy users (15.1) than in the alcohol
users (5.9).
23
Comparison with t-test for independent samples
Levene's tests for homogeneity of variances: OK
T-test statistics
  • → Wilcoxon rank-sum test and t-test lead to the
    same conclusions
24
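Calculating the effect sizes
  • The effect size r can easily be calculated from
    the z-scores, with N = 20 (the total number of
    observations):
  • $r = \frac{z}{\sqrt{N}}$
  • $r_{Sunday} = \frac{-1.11}{\sqrt{20}} = -.25$ (medium effect)
  • $r_{Wednesday} = \frac{-3.48}{\sqrt{20}} = -.78$ (huge effect)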
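The conversion from z to r is a one-liner. The sketch below uses the z-scores quoted on the slide and N = 20 observations; the function name `effect_size_r` is hypothetical.

```python
from math import sqrt

def effect_size_r(z, n_observations):
    """Effect size r = z / sqrt(N), where N is the total number of observations."""
    return z / sqrt(n_observations)

print(round(effect_size_r(-1.11, 20), 2))  # Sunday
print(round(effect_size_r(-3.48, 20), 2))  # Wednesday
```

The same helper applies to the later effect-size slides, as long as N is the number of observations that entered the test.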
25
Reporting the results (Field_2005_532)
  • Ecstasy users (Mdn = 17.5) didn't seem to differ in
    depression levels from alcohol users (Mdn = 16) the
    day after the drugs were taken, U = 35.5, ns,
    r = -.25. However, by Wednesday, ecstasy users
    (Mdn = 33.5) were significantly more depressed than
    alcohol users (Mdn = 7.5), U = 4, p

Mdn = Median
26
Non-parametric tests and statistical power
  • With a non-parametric test we avoid the
    assumptions of a parametric test, esp. normality.
    However, by ranking the scores rather than
    computing the scores directly, we lose
    information about the magnitude of the difference
    between the scores (remember, two ranks do not
    tell you anymore how far, numerically, the two
    original scores were apart). Therefore, we may
    lose statistical power, i.e., we may not detect
    an effect which is genuinely there.
  • However, non-parametric tests are only less
    powerful if parametric assumptions are met. Thus,
    if you run a parametric and a non-parametric test
    over normally-distributed data, then the
    non-parametric test may be weaker.

27
Non-parametric tests and statistical power
  • The problem
  • For normally-distributed data, the Type I error rate
    is 5%.
  • For non-normally-distributed data, we would not
    know where the 5% cut-off of the distribution lies.
    It depends on the shape of the distribution.

28
Terminology
29
Comparing two related conditions: The Wilcoxon
signed-rank test
  • The Wilcoxon signed-rank test is used when you
    want to compare 2 conditions within the same
    subjects.
  • It is the non-parametric equivalent of the
    dependent t-test.
  • Example: Measuring the differences between the
    depression scores on Sunday and Wednesday, from
    the previous example.
  • Note: before, we had only tested the difference
    between the two groups of Ecstasy and Alcohol
    users (between-subjects design).

30
The theory
  • First, the differences between the scores in the
    two conditions are obtained, then they are
    ranked.
  • Additionally, the sign (positive/negative) of the
    difference is assigned to the rank.

31
Ranking the data
The test statistic T is the smaller of the two
summed ranks (positive vs. negative differences):
T = 0 for Ecstasy and T = 8 for Alcohol
32
Calculating significance
  • General formulas
  • $\bar{T} = \frac{n(n+1)}{4}$ (mean of the test statistic)
  • $SE_{\bar{T}} = \sqrt{\frac{n(n+1)(2n+1)}{24}}$ (SE of the test statistic)

33
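Calculating significance
  • $\bar{T} = \frac{n(n+1)}{4}$ (mean of the test statistic)
  • $SE_{\bar{T}} = \sqrt{\frac{n(n+1)(2n+1)}{24}}$ (SE of the test statistic)
  • $\bar{T}_{Ecstasy} = \frac{8(8+1)}{4} = 18$
  • $SE_{\bar{T},Ecstasy} = \sqrt{\frac{8(8+1)(16+1)}{24}} = 7.14$
  • $\bar{T}_{Alcohol} = \frac{10(10+1)}{4} = 27.5$
  • $SE_{\bar{T},Alcohol} = \sqrt{\frac{10(10+1)(20+1)}{24}} = 9.81$

Note that there are only n = 8 in the Ecstasy group
now, because pairs with a zero difference are
excluded.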
34
z-scores
Ecstasy: $T = 0$, $\bar{T} = 18$, $SE_{\bar{T}} = 7.14$
Alcohol: $T = 8$, $\bar{T} = 27.5$, $SE_{\bar{T}} = 9.81$
  • $z = \frac{X - \bar{X}}{s} = \frac{T - \bar{T}}{SE_{\bar{T}}}$
  • $z_{Ecstasy} = \frac{0 - 18}{7.14} = -2.52$
  • $z_{Alcohol} = \frac{8 - 27.5}{9.81} = -1.99$
  • → Both values are greater than 1.96 in absolute
    value, so there is a significant difference in
    depression scores between Sunday and Wednesday
    for both drugs, Ecstasy and Alcohol
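The whole signed-rank procedure (differences, ranking, T, mean, SE, z) can be sketched as follows. The paired scores are an assumption: they are not listed in the transcript and were chosen to reproduce the quoted values (T = 0 with n = 8 for Ecstasy, T = 8 with n = 10 for Alcohol). The function name `signed_rank_z` is hypothetical, and no tie correction is applied, as on the slides.

```python
from math import sqrt

def signed_rank_z(first, second):
    """Return (T, n, z) for paired scores; differences are second - first."""
    # Pairs with a zero difference are dropped before ranking.
    diffs = [b - a for a, b in zip(first, second) if b != a]
    n = len(diffs)
    abs_sorted = sorted(abs(d) for d in diffs)

    def avg_rank(x):
        # Tie-averaged rank of an absolute difference.
        return abs_sorted.index(x) + 1 + (abs_sorted.count(x) - 1) / 2

    positive = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    negative = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    t = min(positive, negative)  # the smaller sum of ranks is the test statistic
    mean_t = n * (n + 1) / 4
    se_t = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return t, n, (t - mean_t) / se_t

# ASSUMED paired BDI scores (same participant order on both days).
ecstasy_sun = [15, 35, 16, 18, 19, 17, 27, 16, 13, 20]
ecstasy_wed = [28, 35, 35, 24, 39, 32, 27, 29, 36, 35]
alcohol_sun = [16, 15, 20, 15, 16, 13, 14, 19, 18, 18]
alcohol_wed = [5, 6, 30, 8, 9, 7, 6, 17, 3, 10]

for drug, sun, wed in [("Ecstasy", ecstasy_sun, ecstasy_wed),
                       ("Alcohol", alcohol_sun, alcohol_wed)]:
    t, n, z = signed_rank_z(sun, wed)
    print(f"{drug}: T={t}, n={n}, z={z:.2f}")
```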

35
Before running the analysis (using your own
Ecstasy_Alc.sav)
  • Before running the analysis, you have to split
    the file into the Ecstasy and the Alcohol group
  • Data → Split File → Organize output by groups

36
Running Wilcoxon signed-rank test (using your own
Ecstasy_Alc.sav)
  • Analyze → Nonparametric Tests → 2 Related Samples

If you have 'Exact', choose 'Asymptotic only'
37
Alternatives for Wilcoxon signed-rank test
  • In the main dialog window, there are 3
    alternative tests which you may choose instead of
    Wilcoxon:
  • 1. Sign: It only considers the direction of the
    differences (pos or neg), irrespective of the
    magnitude of change. Therefore, it loses power.
  • 2. McNemar: Good for nominal (not ordinal) data,
    i.e., two related dichotomous variables.
  • 3. Marginal Homogeneity: Extension of the McNemar
    test for ordinal data. Equivalent to Wilcoxon.
  • (My version of SPSS does not have this option)

38
Aside: Request descriptive statistics (Analyze →
Descriptive Statistics → Frequencies). Before that,
split the file according to 'kind of drug'!
Later, when you report the results, you will
especially need the median (Mdn), which is a
better measure of central tendency than the mean
for non-parametric data. These are the
outputs for the split data (Ecstasy and Alcohol
separately).
39
You can also request the medians from the
descriptive statistics of the signed-rank test by
clicking on quartiles in the options
Ecstasy
Alcohol
40
Output for Ecstasy
  • SPSS first gives the results for the Ecstasy group

There were no negative differences (Wed < Sun).
There were 8 positive differences (Wed > Sun).
There were 2 ties (same values), which are
excluded.
→ All included differences (8 out of 10, since
the 2 ties were excluded) were positive, i.e.,
depression scores were always higher on Wednesday
than they were on Sunday, for Ecstasy.
The difference (z-score) between Sunday and
Wednesday is significant! The z-score is based on
the negative ranks since their sum is the smaller
one.
41
Output for Alcohol
  • SPSS then gives the results for the Alcohol group

There were 9 negative differences (Wed < Sun).
There was 1 positive difference (Wed > Sun).
There were 0 ties (same values).
→ 9 out of 10 differences were negative, and 1
was positive, i.e., depression scores were mostly
lower on Wednesday than they were on Sunday, for
Alcohol.
The difference (z-score) between Sunday and
Wednesday is significant! The z-score is based on
the positive ranks since their sum is the smaller
one.
42
Effects of Ecstasy vs. Alcohol
  • For Ecstasy, depression increases from Sunday to
    Wednesday.
  • For Alcohol, depression decreases from Sunday to
    Wednesday.
  • This reverse effect is an interaction!

43
Calculating the effect sizes
  • Effect sizes for the Wilcoxon signed-rank test
    can be calculated from the z-scores:
  • $r = \frac{z}{\sqrt{N}}$
  • $r_{Ecstasy} = \frac{-2.53}{\sqrt{20}} = -.57$
  • $r_{Alcohol} = \frac{-1.99}{\sqrt{20}} = -.44$

Note: although the 2 ties in the Ecstasy group
were excluded from the test, all 10 subjects
(N = 20 observations) are included here in the
calculation of the effect size.
Medium to large effects
44
Reporting the results (Field_2005_541)
  • For Ecstasy users, depression levels were
    significantly higher on Wednesday (Mdn = 33.5) than
    on Sunday (Mdn = 17.50), T = 0, p
  • For Alcohol users, the opposite was true:
    depression levels were significantly lower on
    Wednesday (Mdn = 7.5) than on Sunday (Mdn = 16), T = 8,
    p

45
Terminology
46
Differences between several independent groups:
The Kruskal-Wallis Test
  • The Kruskal-Wallis test is the non-parametric
    equivalent of a simple one-way independent ANOVA.
  • Example
  • Background: It has been claimed that the chemical
    'genistein', which occurs naturally in soya
    products, decreases the sperm count in males.
  • Research question: Do groups of male subjects who
    eat various amounts of soya meals per week have
    different sperm counts after one year?

47
The variables
  • Independent variable: number of soya meals
  • (1) no soya meals (control condition) - 0 per
    year
  • (2) 1 soya meal per week - 52 per year
  • (3) 4 soya meals per week - 208 per year
  • (4) 7 soya meals per week - 364 per year
  • Each group consisted of 20 different male
    individuals.
  • Dependent variable: sperm count

48
The Theory of the Kruskal-Wallis Test
  • Like the other non-parametric tests, the K-W test
    is based on ranked data.
  • First, the scores are ranked, irrespective of
    group membership.
  • Then, for each group, the ranks are added. The
    sum of ranks for each group is $R_i$.

49
Ranked data for the soya experiment
50
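The Test Statistic H
  • $H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$
  • $H = \frac{12}{80(81)} \left( \frac{927^2}{20} + \frac{883^2}{20} + \frac{883^2}{20} + \frac{547^2}{20} \right) - 3(81)$
  • $= \frac{12}{6480} (42{,}966.45 + 38{,}984.45 + 38{,}984.45 + 14{,}960.45) - 243$
  • $= 0.00185 \times 135{,}895.8 - 243$
  • $= 251.66 - 243 = 8.659$

H has a $\chi^2$ distribution. Its df = k - 1, where k is
the number of groups, hence df = 4 - 1 = 3.
H = 8.659 is greater than the critical value
$\chi^2_{crit}(3) = 7.81$, so p < .05 ($\chi^2$ distribution).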
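The H calculation above can be reproduced directly from the four rank sums. This is a minimal sketch: the function name `kruskal_wallis_h` is hypothetical, no tie correction is applied, and the critical value 7.81 is simply taken from the slide.

```python
def kruskal_wallis_h(rank_sums, group_sizes):
    """Kruskal-Wallis H from the per-group rank sums R_i and group sizes n_i."""
    n_total = sum(group_sizes)
    weighted = sum(r ** 2 / n for r, n in zip(rank_sums, group_sizes))
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

rank_sums = [927, 883, 883, 547]  # R_i from the slide
group_sizes = [20, 20, 20, 20]

# Sanity check: all ranks 1..N together must add up to N(N+1)/2.
assert sum(rank_sums) == 80 * 81 / 2

h = kruskal_wallis_h(rank_sums, group_sizes)
print(round(h, 3), h > 7.81)  # 7.81 = critical chi-square value for df = 3 (from the slide)
```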
51
Data input
  • As for a One-Way ANOVA, we code the different
    groups with a dummy coding variable 'Soya' in the
    1st column
  • (1) no soya
  • (2) 1 soya meal
  • (3) 4 soya meals
  • (4) 7 soya meals
  • The dependent variable 'sperm' goes in the 2nd
    column

52
The data in SPSS (Soya.sav)
Dummy coding 1,2,3,4
Dep Var 'sperm'
ranks
53
Exploratory analyses
  • Analyze → Descriptive Statistics → Explore
  • tick 'Normality plots with tests' in 'Plots'

Most groups show non-normal data distributions.
Analyze → General Linear Model → Univariate, tick
'Homogeneity tests' in 'Options'.
Levene's test is significant → heterogeneous
variances.
The data violate two parametric assumptions:
normal distribution of the data and homogeneous
variances. Therefore, a non-parametric test is
advised.
54
Running the Kruskal-Wallis test
  • Analyze → Nonparametric Tests →
  • K Independent Samples...

Define the range of the independent variable: 1-4
levels of soya meals.
If you have 'Exact...', choose 'Monte Carlo'.
If you have it, select 'Jonckheere-Terpstra'. This
tests for a linear trend in the data.
55
Output Kruskal-Wallis Test
Mean ranks for all levels of 'number of soya meals'.
Main test statistic: H (here called
Chi-Square). H is significant. If you have
requested 'Monte Carlo', the result will also be
displayed here.
  • → Number of soya meals has a significant effect
    on sperm count, overall.
  • However, we do not know where exactly the
    difference is located.

56
Boxplots for the 4 groups (Graphs → Boxplots)
  • Visual inspection: the medians for groups 1-3
    seem rather similar; however, the median for
    group 4 seems somewhat lower

How can we know which particular difference(s)
brought about the overall difference?
57
1. Posthoc Tests for Kruskal-Wallis
  • 1. Posthoc tests in nonparametric tests can be
    done with the Mann-Whitney test (for pairs of
    unrelated samples).
  • If we want to do posthoc tests, we risk inflating
    the Type I error.
  • In order to correct for family-wise error
    inflation, we may use the Bonferroni correction.
    However, we then lose power.
  • 2. Posthoc tests in nonparametric tests can be
    done by hand

58
1. Posthoc Tests for Kruskal-Wallis
  • → Compromise: do only a few promising
    comparisons, e.g. each level against the control
    condition (as in 'simple' contrasts)
  • Test 1: no soya vs. 1 soya meal
  • Test 2: no soya vs. 4 soya meals
  • Test 3: no soya vs. 7 soya meals
  • With 3 tests, we have to divide our α-level by 3:
  • .05/3 = .0167
  • So we are doing our posthoc tests at this more
    rigorous level.

59
1. Single Mann-Whitney tests for the three
comparisons
Analyze → Nonparametric Tests → 2 Independent
Samples; define the groups in the grouping
variable window: group 1 vs 2, 1 vs 3, 1 vs 4.
(Here, the contrast 1 vs 4 is requested.) Each
Mann-Whitney test is carried out independently.
60
1. Output of the single Mann-Whitney tests for
the three comparisons
Group 1 vs 2 (no vs 1 soya meal): n.s.
Group 1 vs 3 (no vs 4 soya meals): n.s.
Group 1 vs 4 (no vs 7 soya meals): significant
→ Eating only 1 to 4 soya meals a week does not
affect sperm count compared to not eating
soya meals at all. However, eating 7 soya meals a
week significantly reduces sperm count.
61
2. Posthoc Tests in nonparametric tests (for
nerds)
  • You can also calculate the differences for all
    pairs of contrasts by hand.
  • You take the difference between the mean ranks of
    the different groups and compare it to a value
    based on the value of z (corrected for the number
    of comparisons you make) and a constant based on
    the total sample size and the sample size in the
    2 groups being compared.
  • $|\bar{R}_u - \bar{R}_v| \geq z_{\alpha/k(k-1)} \sqrt{\frac{N(N+1)}{12} \left( \frac{1}{n_u} + \frac{1}{n_v} \right)}$

k = number of groups (4); N = total sample size
(80); $n_u$ = number of subjects in 1st group (20);
$n_v$ = number of subjects in 2nd group (20)
  • $|\bar{R}_u - \bar{R}_v|$ is the difference between
    the mean ranks of the 2 groups, ignoring the +/-
    sign: only the absolute value is considered

62
Determining the critical difference for z
  • $|\bar{R}_u - \bar{R}_v| \geq z_{\alpha/k(k-1)} \sqrt{\frac{N(N+1)}{12} \left( \frac{1}{n_u} + \frac{1}{n_v} \right)}$
  • In order to know the value of $z_{\alpha/k(k-1)}$, we
    need to determine the α-level. Normally, it is
    .05. This level needs to be divided by k(k-1),
    where k is the number of groups, that
    is, 4 × 3 = 12. The corrected α-level therefore is
    .05/12 = .00417. Now, $z_{\alpha/k(k-1)}$ means 'the
    value of z for which only a proportion of .00417
    of the other values of z are bigger'.
  • Looking up in Appendix A.1 (normal
    z-distribution) the smaller portion closest to
    .00417 (actually, .00415), we find the value
    z = 2.64. This is the critical value.

63
Determining the critical difference for z
  • $|\bar{R}_u - \bar{R}_v| \geq z_{\alpha/k(k-1)} \sqrt{\frac{N(N+1)}{12} \left( \frac{1}{n_u} + \frac{1}{n_v} \right)}$
  • crit. diff = $2.64 \sqrt{\frac{80(80+1)}{12} \left( \frac{1}{20} + \frac{1}{20} \right)}$
  • crit. diff = $2.64 \sqrt{540 \times 0.1}$
  • crit. diff = $2.64 \sqrt{54}$
  • crit. diff = 19.4
  • Since sample sizes for all groups are identical,
    this value holds for all comparisons.
  • We can now test the actual differences in mean
    ranks for all comparisons against this critical
    difference. If a value is bigger, then the
    comparison is significant.
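The critical difference can also be computed without a z-table. The sketch below uses `statistics.NormalDist.inv_cdf` in place of Appendix A.1, so the critical z is about 2.638 rather than the rounded 2.64. The function name is hypothetical, and the assignment of the four rank sums to the groups (in the order they appear in the H formula) is an assumption.

```python
from math import sqrt
from statistics import NormalDist

def kw_critical_difference(alpha, k, n_total, n_u, n_v):
    """Critical difference in mean ranks for one pairwise comparison."""
    z_crit = NormalDist().inv_cdf(1 - alpha / (k * (k - 1)))
    return z_crit * sqrt(n_total * (n_total + 1) / 12 * (1 / n_u + 1 / n_v))

# ASSUMED mapping of the rank sums to the groups.
rank_sums = {"no soya": 927, "1 meal": 883, "4 meals": 883, "7 meals": 547}
mean_ranks = {group: r / 20 for group, r in rank_sums.items()}

crit = kw_critical_difference(alpha=0.05, k=4, n_total=80, n_u=20, n_v=20)
print(f"critical difference = {crit:.2f}")

groups = list(mean_ranks)
for i, u in enumerate(groups):
    for v in groups[i + 1:]:
        diff = abs(mean_ranks[u] - mean_ranks[v])
        print(f"{u} vs {v}: |difference| = {diff:.2f}, significant = {diff >= crit}")
```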

64
Testing individual differences in mean rank
against the critical difference (19.4)
  • According to this calculation, none of the
    differences is significant! However, in the
    previous Mann-Whitney tests the 'no meals vs. 7
    meals' comparison had been significant. How come?

65
Significant or ns comparisons?
  • In the old calculation we had to divide our
    overall α-level into 3 portions.
  • In the old M-W tests we had only conducted 3
    comparisons, which yields a corrected α of
    .05/3 = .0167. The p-value of the 'no vs 7 meals'
    comparison had been .009, which is smaller than
    .0167.
  • In the new comparison, however, we have an α of
    .05/6 = .0083 (for all 6 comparisons). Now .009 >
    .0083, hence the comparison is n.s.
  • → Better carry out only a few reasonable
    comparisons

66
Testing for trends: the Jonckheere-Terpstra test
  • This test looks at the differences between the
    medians of the groups, just as the
    Kruskal-Wallis test does.
  • Additionally, it includes information about
    whether the medians are ordered.
  • In our example, we do indeed predict an order for
    the sperm counts in the 4 groups:
  • no meal > 1 meal > 4 meals > 7 meals
  • In the coding variable, we have already encoded
    the order which we expect (1 > 2 > 3 > 4)

67
Output of the J-T test
If you have J-T in your version of SPSS,
it would look like this:
z-score = (912 - 1200)/116.33 = -2.476
The J-T test should always be 1-tailed (since we have
a directed hypothesis!). We compare |-2.47| against
1.65, which is the z-value for an α-level of 5% for a
1-tailed test. Since 2.47 > 1.65 the result is
significant. The negative sign means that the medians
are in descending order (a positive sign would
have meant ascending order).
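The slide does not say where the values 1200 and 116.33 come from. They are consistent with the usual large-sample formulas for the mean and standard deviation of the J-T statistic without a tie correction, which the following sketch assumes. The function name `jonckheere_z` is hypothetical, and the observed statistic 912 is taken from the slide.

```python
from math import sqrt
from statistics import NormalDist

def jonckheere_z(j_observed, group_sizes):
    """Return (mean, SD, z) for a J-T statistic, large-sample formulas without tie correction."""
    n_total = sum(group_sizes)
    mean_j = (n_total ** 2 - sum(n ** 2 for n in group_sizes)) / 4
    var_j = (n_total ** 2 * (2 * n_total + 3)
             - sum(n ** 2 * (2 * n + 3) for n in group_sizes)) / 72
    sd_j = sqrt(var_j)
    return mean_j, sd_j, (j_observed - mean_j) / sd_j

mean_j, sd_j, z = jonckheere_z(912, [20, 20, 20, 20])
p_one_tailed = NormalDist().cdf(z)  # lower tail, because a descending order was predicted
print(mean_j, round(sd_j, 2), round(z, 3), p_one_tailed < 0.05)
```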
68
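Calculating effect sizes
  • Calculate effect sizes only for single focused
    comparisons
  • $r = \frac{z}{\sqrt{2n}}$
  • $r_{\text{NoSoya - 1 meal}} = \frac{-0.243}{\sqrt{40}} = -.04$ (negligible effect)
  • $r_{\text{NoSoya - 4 meals}} = \frac{-0.325}{\sqrt{40}} = -.05$ (negligible effect)
  • $r_{\text{NoSoya - 7 meals}} = \frac{-2.597}{\sqrt{40}} = -.41$ (medium effect)
  • $r_{\text{Jonckheere}} = \frac{-2.47}{\sqrt{80}} = -.28$ (medium effect)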
69
Reporting the results of the Kruskal-Wallis Test
(Field_2005_556)
  • Sperm counts were significantly affected by
    eating soya meals (H(3) = 8.66, p < .05).
    Mann-Whitney tests were used to follow up this
    finding. A Bonferroni correction was applied and
    so all effects are reported at a .0167 level of
    significance. It appeared that sperm counts were
    no different when one soya meal (U = 191, r = -.04)
    or four soya meals (U = 188, r = -.05) were eaten
    per week compared to none. However, when seven
    soya meals were eaten per week, sperm counts were
    significantly lower than when no soya was eaten
    (U = 104, r = -.41).

70
Terminology
71
Differences between several related groups:
Friedman's ANOVA
  • Friedman's ANOVA is the non-parametric analogue
    to a repeated-measures ANOVA (see chapter 11)
    where the same subjects have been subjected to
    various conditions.
  • Example here: Testing the effect of a new diet
    called 'Andikins diet' on n = 10 women. Their
    weight (in kg) was measured 3 times:
  • Start
  • Month 1
  • Month 2
  • Would they lose weight in the course of the diet?

http://goc-frankfurt.de/images/dualit/personenwaage_450.jpg
72
Theory of Friedman's ANOVA
  • Each subject's weight on each of the 3 dates is
    listed in a separate column. Then ranks for the 3
    dates are determined and listed in separate columns.
  • Then, the ranks are summed up for each condition
    ($R_i$)

For each subject, the 3 scores are compared: the
smallest one gets rank 1, the next rank 2, and the
biggest one rank 3.
Diet data with ranks
73
The Test statistic Fr
  • From the sum of ranks for each group, the test
    statistic $F_r$ is derived:
  • $F_r = \left[ \frac{12}{Nk(k+1)} \sum_{i=1}^{k} R_i^2 \right] - 3N(k+1)$
  • $= \frac{12}{(10 \times 3)(3+1)} (19^2 + 20^2 + 21^2) - (3 \times 10)(3+1)$
  • $= \frac{12}{120} (361 + 400 + 441) - 120$
  • $= 0.1 \times 1202 - 120$
  • $= 120.2 - 120 = 0.2$
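The same arithmetic as a short Python sketch, using the three rank sums from the slide. The function name `friedman_fr` is hypothetical, and no tie correction is applied.

```python
def friedman_fr(rank_sums, n_subjects):
    """Friedman's F_r from the rank sums R_i of the k conditions."""
    k = len(rank_sums)
    sum_of_squares = sum(r ** 2 for r in rank_sums)
    return 12 / (n_subjects * k * (k + 1)) * sum_of_squares - 3 * n_subjects * (k + 1)

fr = friedman_fr([19, 20, 21], n_subjects=10)
print(round(fr, 2))
```

Because every subject contributes the ranks 1, 2 and 3, the rank sums must add up to N × k(k+1)/2 = 60, which is a quick check on the ranking.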

74
Data input and provisional analysis (using
diet.sav)
  • First, test for normality
  • Analyze → Descriptive Statistics → Explore, tick
    'Normality plots with tests' in the 'Plots' window

Data sheet
In the Shapiro-Wilk test (which is more accurate
than the K-S test), two groups (Start, 1 month)
show non-normal distributions. This violation of
a parametric assumption justifies the choice of a
non-parametric test.
75
Running Friedman's ANOVA
  • Analyze → Nonparametric Tests → K Related
    Samples...

If you have 'Exact...', tick 'Exact' and limit the
calculation time to 5 minutes.
Other options: request everything there is - it is
not much...
76
Other options
  • Kendall's W: Similar to Friedman's ANOVA, but
    looks specifically at agreement between raters,
    for example to what extent (from 0-1) women agree
    in rating Justin Timberlake, David Beckham, and
    Tony Blair on their attractiveness. This is like a
    correlation coefficient.
  • Cochran's Q: This is an extension of McNemar's
    test. It is like a Friedman's test for
    dichotomous data, for example, if women had to
    judge whether they would like to kiss Justin
    Timberlake, David Beckham, or Tony Blair and they
    could only answer Yes or No.

77
Output from Friedman's ANOVA
The $F_r$ statistic is called Chi-Square here. It
has df = 2 (k - 1, where k is the number of groups).
The statistic is n.s.
78
Posthoc tests for Friedman's ANOVA: Wilcoxon
signed-rank tests, but correcting for the number
of tests we do; here α = .05/3 = .0167.
Analyze → Nonparametric Tests → 2 Related Samples,
tick 'Wilcoxon', specify the 3 pairs of groups.
Mean ranks and sum of ranks for all 3 comparisons.
So, actually, we do not have to calculate any
further...
All comparisons are ns, as expected from the
overall ns effect.
79
Posthoc tests for Friedman's ANOVA - calculation
by hand
  • We take the difference between the mean ranks of
    the different groups and compare it to a value
    based on the value of z (corrected for the number
    of comparisons) and a constant based on the total
    sample size (N = 10) and the number of conditions
    (k = 3)
  • $|\bar{R}_u - \bar{R}_v| \geq z_{\alpha/k(k-1)} \sqrt{\frac{k(k+1)}{6N}}$
  • $\alpha/k(k-1) = .05/(3(3-1)) = .05/6 = .00833$
  • If the difference is significant, it should have
    a higher value than the value of z for which only
    a proportion of .00833 of the other values of z
    are bigger. As before, we look in Appendix A.1
    under the column 'Smaller Portion'. The number
    corresponding to .00833 is the critical value; it
    is between 2.39 and 2.40.

k(k-1) = 3(3-1) = 6
80
Calculating the critical differences
  • Critical difference = $z_{\alpha/k(k-1)} \sqrt{\frac{k(k+1)}{6N}}$
  • crit. diff = $2.4 \sqrt{\frac{3(3+1)}{6 \times 10}}$
  • crit. diff = $2.4 \sqrt{12/60}$
  • crit. diff = $2.4 \sqrt{0.2}$
  • crit. diff = 1.07
  • → If the difference between mean ranks is ≥
    the critical difference 1.07, then that
    difference is significant.
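A sketch of the same calculation in Python, with `statistics.NormalDist.inv_cdf` standing in for the z-table (it gives a critical z of about 2.394, between the 2.39 and 2.40 mentioned above). The function name is hypothetical. The mean ranks are derived from the rank sums 19, 20 and 21 on the earlier slide; which condition each sum belongs to is not stated in the transcript, so only the largest possible difference is checked.

```python
from math import sqrt
from statistics import NormalDist

def friedman_critical_difference(alpha, k, n_subjects):
    """Critical difference in mean ranks for pairwise comparisons after Friedman's ANOVA."""
    z_crit = NormalDist().inv_cdf(1 - alpha / (k * (k - 1)))
    return z_crit * sqrt(k * (k + 1) / (6 * n_subjects))

crit = friedman_critical_difference(alpha=0.05, k=3, n_subjects=10)

# Mean rank = rank sum / N, using the rank sums from the F_r calculation.
mean_ranks = [19 / 10, 20 / 10, 21 / 10]
largest_diff = max(mean_ranks) - min(mean_ranks)
print(round(crit, 2), round(largest_diff, 2), largest_diff >= crit)
```

If even the largest difference in mean ranks stays below the critical difference, no pairwise comparison can be significant.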

81
Calculating the differences between mean ranks
for diet data
  • → None of the differences is ≥ the critical
    difference 1.07, hence none of the comparisons is
    significant.

82
Calculating the effect size
  • Again, we will only calculate the effect sizes
    for single comparisons
  • $r = \frac{z}{\sqrt{2n}}$
  • $r_{\text{Start - 1 month}} = \frac{-0.051}{\sqrt{20}} = -.01$ (tiny effect)
  • $r_{\text{Start - 2 months}} = \frac{-0.255}{\sqrt{20}} = -.06$ (tiny effect)
  • $r_{\text{1 month - 2 months}} = \frac{-0.153}{\sqrt{20}} = -.03$ (tiny effect)
83
Reporting the results of Friedman's ANOVA
(Field_2005_566)
  • The weight of participants did not significantly
    change over the 2 months of the diet ($\chi^2(2)$ =
    0.20, p > .05). Wilcoxon tests were used to
    follow up on this finding. A Bonferroni
    correction was applied and so all effects are
    reported at a .0167 level of significance. It
    appeared that weight didn't significantly change
    from the start of the diet to 1 month, T = 27,
    r = -.01, from the start of the diet to 2 months,
    T = 25, r = -.06, or from 1 month to 2 months,
    T = 26, r = -.03. We can conclude that the Andikins
    diet (...) is a complete failure.

84
Summary Terminology