NonParametric Tests


Slides: 85

Provided by: iiMet

Transcript and Presenter's Notes


1
Chapter 13

Non-Parametric Tests
Field (2005)

2
What are non-parametric tests?

They do not make any parametric assumptions about
the data such as normality, homogeneity of
variances, etc.
They are therefore also called 'assumption-free'
tests
They work on the principle of ranking data. The
lowest score receives the rank 1, the next
highest score the rank 2, etc., without implying
that the intervals between the ranks are equal.
Low scores will be represented as low ranks, high
scores as high ranks. The analysis is then
carried out on the ranks and not on the original
scores.
4 tests will be considered here:
- Wilcoxon rank-sum test (Mann-Whitney test)
- Wilcoxon signed-rank test
- Kruskal-Wallis test
- Friedman's test

3
Outlook Terminology
4
Wilcoxon rank-sum test and Mann-Whitney Test

With these two tests you can compare 2
independent conditions.
They are the non-parametric equivalent of the independent t-test.
Example: The effect of Ecstasy vs. Alcohol on depression is to
be measured, using the Beck Depression Inventory
(BDI).

5
The data: effect of Ecstasy vs. Alcohol
Depression scores were obtained one day after
taking the drug (Sunday) and three days later
(Wednesday) to find out if there is a
development of depression over time
6
The theory

The scores are translated into ranks.
The lowest score gets the lowest rank, the next
higher score the next higher rank up to the
highest rank.
If there is no difference in the depression level
between Ecstasy and Alcohol, a similar number of
low and high ranks should be found in each group.
If we add up the ranks, the summed total of ranks
in each group should be about the same.
If there is a difference between the two groups,
e.g., Ecstasy produces higher levels of
depression, one would find higher ranks in the
Ecstasy group and lower ranks in the Alcohol
group.

7
Ranking of the data (Sunday and Wednesday)
Identical scores share 'tied ranks'. The actual value
of a tied rank is the average of the ranks that
constitute it. E.g., the 2 occurrences of the score
6 in the Wednesday data occupy the ranks 3 and 4, so
each receives the rank 3.5. The ranks are
summed up for each group and day (Alcohol Wednesday = 59,
Ecstasy Wednesday = 151, Alcohol Sunday = 90.5,
Ecstasy Sunday = 119.5). The lower of the two sums serves
as the test statistic. For Wednesday this is $W_s = 59$.
For Sunday, it is $W_s = 90.5$.
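
The tied-rank rule can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure: the function name `average_ranks`, the group labels and the six scores are made up for the example and are not the BDI data from the slides.

```python
def average_ranks(scores):
    """Rank scores from lowest (rank 1) upward; tied scores share the mean of their ranks."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next score equals the current one
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # mean of the ranks pos+1 .. end+1
        for j in range(pos, end + 1):
            ranks[order[j]] = shared
        pos = end + 1
    return ranks

# Hypothetical scores (not the slide data); the two 6s occupy ranks 3 and 4
scores = [3, 5, 6, 6, 9, 12]
groups = ["A", "E", "A", "E", "A", "E"]
ranks = average_ranks(scores)
rank_sums = {g: sum(r for r, gg in zip(ranks, groups) if gg == g) for g in sorted(set(groups))}
print(ranks)      # [1.0, 2.0, 3.5, 3.5, 5.0, 6.0]
print(rank_sums)  # {'A': 9.5, 'E': 11.5}
```

The lower of the two rank sums would then serve as the test statistic.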
8
The test statistic, its mean and SE (Wilcoxon
rank-sum test)

Lower sum of ranks for Wednesday: $W_s = 59$
($W_s$ = Wilcoxon sum)
Lower sum of ranks for Sunday: $W_s = 90.5$
Mean of the test statistic (mean of the Wilcoxon
sum, $W_s$):
$$\bar{W}_s = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{10(10 + 10 + 1)}{2} = 105$$
SE of the test statistic (SE of $\bar{W}_s$):
$$SE_{\bar{W}_s} = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(10 \times 10)(10 + 10 + 1)}{12}} = 13.23$$

Why 12?
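
One way to see where the 12 comes from (a sketch that is not taken from the slides and assumes no tied ranks, with $N = n_1 + n_2$): the ranks 1, ..., N have variance $(N^2 - 1)/12$, and $W_s$ is the sum of $n_1$ ranks drawn without replacement from them.

```latex
\sigma^2_{\text{rank}} = \frac{1}{N}\sum_{i=1}^{N}\left(i - \frac{N+1}{2}\right)^2 = \frac{N^2 - 1}{12},
\qquad
\operatorname{Var}(W_s) = n_1 \, \sigma^2_{\text{rank}} \, \frac{N - n_1}{N - 1}
= \frac{n_1 n_2 (N + 1)}{12}
```

Taking the square root gives the SE formula, so the 12 is inherited from the variance of a set of consecutive ranks.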
9
Test statistic as z-score, significance

$$z = \frac{X - \bar{X}}{s} = \frac{W_s - \bar{W}_s}{SE_{\bar{W}_s}}$$
$$z_{Sunday} = \frac{90.5 - 105}{13.23} = -1.10 \quad \text{(ns)}$$
$$z_{Wednesday} = \frac{59 - 105}{13.23} = -3.48$$
If the absolute value of the z-score is > 1.96 (irrespective of + or
-), then the test is significant.
→ The group difference for Sunday is n.s.,
whereas
→ The group difference for Wednesday is
significant
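
The hand calculation for the rank-sum test can be reproduced as a small sketch. The function name `rank_sum_z` is an assumption for illustration; it uses the normal approximation without a tie correction, so it matches the hand calculation rather than SPSS output exactly.

```python
import math

def rank_sum_z(w_s, n1, n2):
    """Normal approximation for the Wilcoxon rank-sum statistic (no tie correction)."""
    mean_w = n1 * (n1 + n2 + 1) / 2
    se_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w_s - mean_w) / se_w
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return mean_w, se_w, z, p_two_tailed

for day, w_s in [("Sunday", 90.5), ("Wednesday", 59)]:
    mean_w, se_w, z, p = rank_sum_z(w_s, 10, 10)
    print(day, mean_w, round(se_w, 2), round(z, 2), p < 0.05)
# Sunday 105.0 13.23 -1.1 False
# Wednesday 105.0 13.23 -3.48 True
```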

10
Mann-Whitney (U) test

The Mann-Whitney test is similar to the Wilcoxon
rank-sum test but uses the U test statistic.
$$U = N_1 N_2 + \frac{N_1(N_1 + 1)}{2} - R_1$$
$$U_{Sunday} = (10 \times 10) + \frac{10(11)}{2} - 119.5 = 35.50$$
$$U_{Wednesday} = (10 \times 10) + \frac{10(11)}{2} - 151.0 = 4.00$$
$R_1$ = sum of ranks of Group 1, here Ecstasy: 119.5 on Sunday, 151 on Wednesday
SPSS produces both statistics. Since they are
related, they always lead to the same conclusion. Choose for yourself!
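
As a quick check of the U formula, a minimal sketch using the rank sums quoted on the slides (the function name `mann_whitney_u` is an assumption for illustration):

```python
def mann_whitney_u(n1, n2, r1):
    """U computed from the sum of ranks R1 of group 1."""
    return n1 * n2 + n1 * (n1 + 1) / 2 - r1

u_sunday = mann_whitney_u(10, 10, 119.5)
u_wednesday = mann_whitney_u(10, 10, 151.0)
print(u_sunday, u_wednesday)  # 35.5 4.0

# With 10 subjects per group, U and the lower rank sum differ only by the constant 10 * 11 / 2 = 55
print(u_sunday + 55, u_wednesday + 55)  # 90.5 59.0
```

The second line of output shows why the two statistics cannot disagree: each is a fixed offset of the other.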
11
Data input (Ecstasy_Alc.sav) and provisional analysis

For a between-subjects test, we need a coding
variable (as in a between-subjects t-test), e.g.
'drug': 1 = ecstasy, 2 = alcohol
We then have a column for the dependent variable
BDI on Sunday (sunbdi) and one for BDI on
Wednesday (wedbdi).

12
Before running the analysis: run a test of
normality
Analyze → Descriptive Statistics → Explore

Sunbdi and wedbdi go to the dependent list
'drug administered' goes to the factor list
In 'Plots', tick 'Normality plots with tests' to
request the tests of normality

13
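Test of Normality
[SPSS output: Tests of Normality table. Sunday BDI: Ecstasy non-normal, Alcohol normal (ns). Wednesday BDI: Ecstasy normal (ns), Alcohol non-normal.]

Both the K-S test and the Shapiro-Wilk test tell
us that the distributions for Ecstasy-sunbdi and
Alcohol-wedbdi are not normal.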

14
Decision for a non-parametric test

As we have seen, some of the distributions are
non-normal. What can you do?
- Transform the data (z-, logarithmic, etc.)
- Choose a non-parametric test

15
Homogeneity of variances
You can request the homogeneity test in the
'Options' of a simple One-way ANOVA. It also
comes automatically if you run a t-test for
independent samples.

Levene's test is n.s. → the variances of the two groups
are homogeneous for both the Sunday and the Wednesday data

16
Further descriptives
Analyze → Descriptive Statistics → Frequencies
or → Descriptives

Request basic descriptive statistics such as the
mean, median, SD, variance.
Note that for a non-parametric test, the median
is a better indicator of the central tendency
than the mean.

17
Running the analysis (using your own
Ecstasy_Alc.sav)

Analyze → Nonparametric Tests → 2 Independent
Samples

Transfer Sunbdi and Wedbdi to the 'Test Variable
List' and 'drug' to the 'Grouping Variable'
window.
18
Specifying the dialog boxes
Define the two levels of the grouping variable
Request 'Descriptives' in the 'Options' box
If you have installed 'Exact Tests', you can
request such an 'Exact Test'
19
Exact Test

You may or may not have 'Exact...' in your main
dialog box (I haven't). 'Exact Tests' is an
extra module of SPSS which needs to be installed.
It enables an exact test of the significance of
the test statistic, which is a good thing to
have for small samples. However, it is a very
time-demanding procedure (it can take really
long...).
Instead of an 'Exact Test', a less intensive test
can be requested, based on the 'Monte Carlo'
method.
In the Monte Carlo method, a distribution similar
to the sample is found and then many samples (up
to 10,000) are created, for which the mean
significance value and confidence intervals are
computed.
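
The general idea of a Monte Carlo significance test can be sketched as a random-permutation test on the rank sum. This illustrates the principle only: it is an assumption that it resembles what the SPSS module does internally, and the ranks, the seed and the function name `monte_carlo_rank_sum_p` are hypothetical.

```python
import random

def monte_carlo_rank_sum_p(ranks_a, ranks_b, n_samples=10_000, seed=1):
    """Estimate a two-sided p-value by randomly reshuffling the group labels."""
    rng = random.Random(seed)
    pooled = list(ranks_a) + list(ranks_b)
    n_a = len(ranks_a)
    expected = n_a * sum(pooled) / len(pooled)  # mean rank sum of group A under H0
    observed = abs(sum(ranks_a) - expected)
    hits = 0
    for _ in range(n_samples):
        rng.shuffle(pooled)
        # Count reshuffles that are at least as extreme as the observed split
        if abs(sum(pooled[:n_a]) - expected) >= observed:
            hits += 1
    return hits / n_samples

# Hypothetical ranks for two groups of five (not the slide data)
p = monte_carlo_rank_sum_p([1, 2, 3, 4, 6], [5, 7, 8, 9, 10])
print(p)
```

With more random samples the estimate approaches the exact permutation p-value, at far lower cost than enumerating every possible split in large data sets.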

20
Other options for the Mann-Whitney test
Do not confuse the K-S test for normality with
the K-S Z-test!

Kolmogorov-Smirnov Z: The K-S Z-test tests whether
two samples have been drawn from the same
population. In this respect, it does the same as the M-W
test. The K-S Z-test has even better power for
small samples (n
Moses Extreme Reactions: This test compares the
variability of scores in the two groups, hence it is
like a non-parametric Levene test.
Wald-Wolfowitz runs: This is a variant of the
M-W test which looks for 'runs' of scores in a
row from the same group, e.g.
AAAAAAAAAAAAEEEEAEEEEEEEE
If the groups are different, the runs for
each group should cluster at different ends of
the ranked distribution.

21
Output from Mann-Whitney Test
Mean Rank = Sum of Ranks / n
Sum of Ranks = all ranks of a group summed up
22
Test statistics of the M-W test
Wilcoxon W = the lower $W_s$ for sunbdi and wedbdi.
For the Sunday BDI, there is no significant difference
between the two groups, Ecstasy vs. Alcohol. For
the Wednesday BDI, there is a significant
difference: the average rank is higher
in the Ecstasy users (15.1) than in the alcohol
users (5.9).
23
Comparison with t-test for independent samples
Levene's tests for homogeneity of variances OK
T-test statistics

→ Wilcoxon rank-sum test and t-test lead to the
same conclusions

24
Calculating the effect sizes

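The effect size r can easily be calculated from
the z-scores (N = total number of observations, here 20).
$$r = \frac{z}{\sqrt{N}}$$
$$r_{Sunday} = \frac{-1.11}{\sqrt{20}} = -.25 \quad \text{(medium effect)}$$
$$r_{Wednesday} = \frac{-3.48}{\sqrt{20}} = -.78 \quad \text{(huge effect)}$$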
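
The conversion from z to r is a one-liner; the sketch below assumes N = 20 observations (10 per group), which is what the quoted results imply, and the function name `effect_size_r` is made up for illustration.

```python
import math

def effect_size_r(z, n_observations):
    """r = z / sqrt(N), with N the total number of observations."""
    return z / math.sqrt(n_observations)

r_sunday = effect_size_r(-1.11, 20)
r_wednesday = effect_size_r(-3.48, 20)
print(round(r_sunday, 2), round(r_wednesday, 2))  # -0.25 -0.78
```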
25
Reporting the results (Field, 2005, p. 532)

Ecstasy users (Mdn = 17.5) didn't seem to differ in
depression levels from alcohol users (Mdn = 16) the
day after the drugs were taken, U = 35.5, ns,
r = -.25. However, by Wednesday, ecstasy users
(Mdn = 33.5) were significantly more depressed than
alcohol users (Mdn = 7.5), U = 4, p ...

Mdn = Median
26
Non-parametric tests and statistical power

With a non-parametric test we avoid the
assumptions of a parametric test, esp. normality.
However, by ranking the scores rather than
computing the scores directly, we lose
information about the magnitude of the difference
between the scores (remember, two ranks do not
tell you anymore how far, numerically, the two
original scores were apart). Therefore, we may
lose statistical power, i.e., we may not detect
an effect which is genuinely there.
However, non-parametric tests are only less
powerful if parametric assumptions are met. Thus,
if you run a parametric and a non-parametric test
over normally-distributed data, then the
non-parametric test may be weaker.

27
Non-parametric tests and statistical power

The problem
For normally-distributed data, the Type I error rate
is 5%.
For non-normally-distributed data, we would not
know where the 5% of the non-normal distribution lie.
It depends on the shape of the distribution.

28
Terminology
29
Comparing two related conditions: the Wilcoxon
signed-rank test

The Wilcoxon signed-rank test is used when you
want to compare 2 conditions within the same
subjects.
It is the non-parametric equivalent of the
dependent t-test.
Example: Measuring the differences between the
depression scores on Sunday and Wednesday, from
the previous example.
Note: before, we had only tested the difference
between the two groups of Ecstasy and Alcohol
users (between-subjects design).

30
The theory

First, the differences between the scores in the
two conditions are obtained, then they are
ranked.
Additionally, the sign (positive/negative) of the
difference is assigned to the rank.
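
The signed-rank procedure can be sketched as follows. The paired scores are hypothetical (not the slide data) and the function name `signed_rank_sums` is an assumption; zero differences are dropped before ranking, and tied absolute differences share the average rank.

```python
def signed_rank_sums(before, after):
    """Return (T_plus, T_minus, n) for paired scores; zero differences are dropped."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    abs_sorted = sorted(abs(d) for d in diffs)

    def avg_rank(value):
        first = abs_sorted.index(value) + 1         # lowest rank held by this value
        last = first + abs_sorted.count(value) - 1  # highest rank held by this value
        return (first + last) / 2

    t_plus = sum(avg_rank(abs(d)) for d in diffs if d > 0)
    t_minus = sum(avg_rank(abs(d)) for d in diffs if d < 0)
    return t_plus, t_minus, len(diffs)

# Hypothetical paired scores (not the slide data)
before = [10, 12, 15, 9, 20, 14]
after = [13, 12, 19, 8, 26, 17]
t_plus, t_minus, n = signed_rank_sums(before, after)
print(t_plus, t_minus, n)    # 14.0 1.0 5
print(min(t_plus, t_minus))  # 1.0, the smaller sum is the test statistic T
```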

31
Ranking the data
The test statistic is the smaller of the two summed
ranks: T = 0 for Ecstasy and T = 8 for Alcohol
32
Calculating significance

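General formulas
$$\bar{T} = \frac{n(n + 1)}{4} \quad \text{(mean of the test statistic)}$$
$$SE_{\bar{T}} = \sqrt{\frac{n(n + 1)(2n + 1)}{24}} \quad \text{(SE of the test statistic)}$$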

33
Calculating significance

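$$\bar{T} = \frac{n(n + 1)}{4} \quad \text{(mean of the test statistic)}$$
$$SE_{\bar{T}} = \sqrt{\frac{n(n + 1)(2n + 1)}{24}} \quad \text{(SE of the test statistic)}$$
$$\bar{T}_{Ecstasy} = \frac{8(8 + 1)}{4} = 18$$
$$SE_{\bar{T}\,Ecstasy} = \sqrt{\frac{8(8 + 1)(16 + 1)}{24}} = 7.14$$
$$\bar{T}_{Alcohol} = \frac{10(10 + 1)}{4} = 27.5$$
$$SE_{\bar{T}\,Alcohol} = \sqrt{\frac{10(10 + 1)(20 + 1)}{24}} = 9.81$$

Note that there are only n = 8 in the Ecstasy group
now, because the 2 tied pairs (zero differences) are excluded.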
34
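z-scores
$T_{Ecstasy} = 0$, $\bar{T}_{Ecstasy} = 18$, $SE_{\bar{T}\,Ecstasy} = 7.14$
$T_{Alcohol} = 8$, $\bar{T}_{Alcohol} = 27.5$, $SE_{\bar{T}\,Alcohol} = 9.81$

$$z = \frac{X - \bar{X}}{s} = \frac{T - \bar{T}}{SE_{\bar{T}}}$$
$$z_{Ecstasy} = \frac{0 - 18}{7.14} = -2.52$$
$$z_{Alcohol} = \frac{8 - 27.5}{9.81} = -1.99$$
→ Both values exceed 1.96 in absolute size, so there is a significant difference in depression scores
between Sunday and Wednesday for both drugs,
Ecstasy and Alcohol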
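
The mean, SE and z-score of T can be checked with a short sketch. The function name `signed_rank_z` is an assumption; it implements the normal approximation from the formulas above, with n counting only the non-zero differences.

```python
import math

def signed_rank_z(t, n):
    """Normal approximation for the Wilcoxon signed-rank statistic T."""
    mean_t = n * (n + 1) / 4
    se_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return mean_t, se_t, (t - mean_t) / se_t

ecstasy = signed_rank_z(0, 8)    # T = 0, n = 8 after excluding 2 ties
alcohol = signed_rank_z(8, 10)   # T = 8, n = 10
print([round(v, 2) for v in ecstasy])  # [18.0, 7.14, -2.52]
print([round(v, 2) for v in alcohol])  # [27.5, 9.81, -1.99]
```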

35
Before running the analysis (using your own
Ecstasy_Alc.sav)

Before running the analysis, you have to split
the file by group (Ecstasy and Alcohol)
Data → Split File → Organize output by groups

36
Running the Wilcoxon signed-rank test (using your own
Ecstasy_Alc.sav)

Analyze → Nonparametric Tests → 2 Related Samples

If you have 'Exact...', choose 'Asymptotic only'
37
Alternatives for Wilcoxon signed-rank test

In the main dialog window, there are 3
alternative tests which you may choose instead of
Wilcoxon
1. Sign: It only considers the direction of the
differences (positive or negative), irrespective of the
magnitude of change. Therefore, it loses power.
2. McNemar: Good for nominal (not ordinal) data,
i.e., two related dichotomous variables.
3. Marginal Homogeneity: Extension of the McNemar test
for ordinal data. Equivalent to Wilcoxon.
(My version of SPSS does not have this option)

38
Aside: Request descriptive statistics
Analyze → Descriptive Statistics → Frequencies
Beforehand, split the file according to 'kind of drug'!
Later, when you report the results, you will
especially need the median (Mdn), which is a better
measure of central tendency than the mean
for non-parametric data.
These are the outputs for the split data (Ecstasy and Alcohol
separately)
39
You can also request the medians from the
descriptive statistics of the signed-rank test by
clicking on quartiles in the options
Ecstasy
Alcohol
40
Output for Ecstasy

SPSS first gives the results for the Ecstasy group

There were no negative differences (Wed < Sun).
There were 8 positive differences (Wed > Sun).
There were 2 ties (same values), which are
excluded.
→ All included differences (8 out of 10, since
the 2 ties were excluded) were positive, i.e.,
depression scores were always higher on Wednesday
than they were on Sunday, for Ecstasy.
The difference (z-score) between Sunday and Wednesday is
significant! The z-score is based on the negative
ranks since their sum is the smaller one
41
Output for Alcohol

SPSS then gives the results for the Alcohol group

There were 9 negative differences (Wed < Sun).
There was 1 positive difference (Wed > Sun).
There were 0 ties (same values).
→ 9 out of 10 differences were negative, and 1
was positive, i.e., depression scores were mostly lower
on Wednesday than they were on Sunday, for
Alcohol.
The difference (z-score) between Sunday and Wednesday is
significant! The z-score is based on the positive
ranks since their sum is the smaller one
42
Effects of Ecstasy vs. Alcohol

For Ecstasy, depression increases from Sunday to
Wednesday.
For Alcohol, depression decreases from Sunday to
Wednesday.
This reverse effect is an interaction!

43
Calculating the effect sizes

Effect sizes for the Wilcoxon signed-rank test
can be calculated from the z-scores
$$r = \frac{z}{\sqrt{N}}$$
$$r_{Ecstasy} = \frac{-2.53}{\sqrt{20}} = -.57$$
$$r_{Alcohol} = \frac{-1.99}{\sqrt{20}} = -.44$$

Note: although 2 ties in the Ecstasy group
were excluded, here all 10 subjects are included
in the calculation of the effect size ($\sqrt{20}$, i.e. 20 observations).
Medium to large effects
44
Reporting the results (Field, 2005, p. 541)

For Ecstasy users, depression levels were
significantly higher on Wednesday (Mdn = 33.5) than
on Sunday (Mdn = 17.50), T = 0, p ...
For Alcohol users, the opposite was true:
depression levels were significantly lower on
Wednesday (Mdn = 7.5) than on Sunday (Mdn = 16), T = 8,
p ...

45
Terminology
46
Differences between several independent groups:
the Kruskal-Wallis test

The Kruskal-Wallis test is the non-parametric
equivalent of a simple one-way independent ANOVA.
Example
Background: It has been claimed that the chemical
'genistein', which naturally occurs in soya
products, decreases the sperm count in males.
Research question: Do groups of male subjects who
eat different numbers of soya meals per week have
different sperm counts after a period of one year?

47
The variables

Independent variable: number of soya meals
(1) no soya meals (control condition): 0 per
year
(2) 1 soya meal per week: 52 per year
(3) 4 soya meals per week: 208 per year
(4) 7 soya meals per week: 364 per year
Each group consisted of 20 different male
individuals.
Dependent variable: sperm count

48
The Theory of the Kruskal-Wallis Test

Like the other non-parametric tests, the K-W test
is based on ranked data.
First, the scores are ranked, irrespective of
group membership.
Then, for each group, the ranks are added up. The
sum of ranks for each group is $R_i$.

49
Ranked data for the soya experiment
50
The Test Statistic H

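$$H = \frac{12}{N(N + 1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N + 1)$$
$$H = \frac{12}{80(81)} \left( \frac{927^2}{20} + \frac{883^2}{20} + \frac{883^2}{20} + \frac{547^2}{20} \right) - 3(81)$$
$$= \frac{12}{6480} (42{,}966.45 + 38{,}984.45 + 38{,}984.45 + 14{,}960.45) - 243$$
$$= 0.0019 \, (135{,}895.8) - 243$$
$$= 251.66 - 243 = 8.659$$

H has a $\chi^2$ distribution. Its df = k - 1, where k = number
of groups, hence df = 4 - 1 = 3.
H = 8.659 > H_critical(3) = 7.81, so the result is significant at p < .05 (critical value from the $\chi^2$ distribution)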
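
The arithmetic for H can be reproduced from the four rank sums given above. This is a minimal sketch without a tie correction; the function name `kruskal_wallis_h` is an assumption for illustration.

```python
def kruskal_wallis_h(rank_sums, group_sizes):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    n_total = sum(group_sizes)
    weighted = sum(r ** 2 / n for r, n in zip(rank_sums, group_sizes))
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

h = kruskal_wallis_h([927, 883, 883, 547], [20, 20, 20, 20])
print(round(h, 3))  # 8.659
print(h > 7.81)     # True: exceeds the critical chi-square value for df = 3
```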
51
Data input

As for a One-Way ANOVA, we code the different
groups with a dummy coding variable 'Soya' in the
1st column
(1) no soya
(2) 1 soya meal
(3) 4 soya meals
(4) 7 soya meals
The dependent variable 'sperm' goes in the 2nd
column

52
The data in SPSS (Soya.sav)
Dummy coding 1,2,3,4
Dep Var 'sperm'
ranks
53
Exploratory analyses

Analyze → Descriptive Statistics → Explore,
tick 'Normality plots with tests' in 'Plots'

Most groups show non-normal data distributions
Analyze → General Linear Model → Univariate, tick
'Homogeneity tests' in 'Options'
Levene's test is significant → heterogeneous
variances
The data violate two parametric assumptions:
normal distribution of the data and homogeneous
variances. Therefore, a non-parametric test is
advised.
54
Running the Kruskal-Wallis test

Analyze → Nonparametric Tests →
K Independent Samples...

Define the range of the independent variable: 1-4
(levels of soya meals)
If you have 'Exact...', choose 'Monte Carlo'
If you have it, select 'Jonckheere-Terpstra'. This
tests for a trend (an ordered pattern of medians) in the data
55
Output Kruskal-Wallis Test
Mean ranks for all levels of 'number of soya meals'
Main test statistic H (here called
Chi-Square): H is significant. If you have
requested 'Monte Carlo', the result will also be
displayed here.

→ Number of soya meals has a significant effect
on sperm count, overall.
However, we do not know where exactly the difference is
located.

56
Boxplots for the 4 groups
Graphs → Boxplots

Visual inspection:
The medians for groups 1-3
seem rather similar; however,
the median for group 4
seems somewhat lower

How can we know which particular difference(s)
brought about the overall difference?
57
1. Posthoc Tests for Kruskal-Wallis

1. Posthoc tests in nonparametric tests can be
done with the Mann-Whitney test (for pairs of
unrelated samples).
If we run several posthoc tests, we risk inflating
the Type I error.
In order to correct for family-wise error
inflation, we may use the Bonferroni correction.
However, we then lose power.
2. Posthoc tests in nonparametric tests can be
done by hand

58
1. Posthoc Tests for Kruskal-Wallis

→ Compromise: do only a few promising
comparisons, e.g. each level against the control
condition (as in 'simple' contrasts)
Test 1: no soya vs. 1 soya meal
Test 2: no soya vs. 4 soya meals
Test 3: no soya vs. 7 soya meals
With 3 tests, we have to divide our α-level by 3:
.05/3 = .0167
So we carry out our posthoc tests at this more
rigorous level.

59
1. Single Mann-Whitney tests for the three
comparisons
Analyze → Nonparametric Tests → 2 Independent
Samples; define the groups in the grouping
variable window: group 1 vs 2, 1 vs 3, 1 vs 4.
(Here, the contrast 1 vs 4 is requested.) Each
Mann-Whitney test is carried out independently
60
1. Output of the Single Mann-Whitney tests for
the three comparisons
Group 1 vs 2 (no vs 1 soya meal): n.s.
Group 1 vs 3 (no vs 4 soya meals): n.s.
Group 1 vs 4 (no vs 7 soya meals): significant
→ Eating only 1 to 4 soya meals a week does not
affect sperm count compared to not eating
soya meals at all. However, eating 7 soya meals a
week significantly reduces sperm count.
61
2. Posthoc tests in nonparametric tests (for
nerds)

You can also calculate the differences for all
pairs of contrasts by hand.
You take the difference between the mean ranks of
the different groups and compare it to a value
based on the value of z (corrected for the number
of comparisons you make) and a constant based on
the total sample size and the sample sizes of the
2 groups being compared.
$$|\bar{R}_u - \bar{R}_v| \geq z_{\alpha/k(k-1)} \sqrt{\frac{N(N + 1)}{12} \left( \frac{1}{n_u} + \frac{1}{n_v} \right)}$$

k = number of groups (4); N = total sample size
(80); $n_u$ = number of subjects in the 1st group (20); $n_v$ =
number of subjects in the 2nd group (20)

$|\bar{R}_u - \bar{R}_v|$: difference between the
mean ranks of the 2 groups,
ignoring the +/- sign; only the
absolute value is considered

62
Determining the critical difference for z

$$|\bar{R}_u - \bar{R}_v| \geq z_{\alpha/k(k-1)} \sqrt{\frac{N(N + 1)}{12} \left( \frac{1}{n_u} + \frac{1}{n_v} \right)}$$
In order to know the value of $z_{\alpha/k(k-1)}$, we
need to determine the α-level. Normally, it is
.05. This level needs to be divided by 12, which
is k(k-1) where k is the number of groups, that
is, 4 x 3 = 12. The α-level therefore is .05/12 =
.00417. Now, $z_{\alpha/k(k-1)}$ means 'the value of z for
which only a proportion of .00417 of the other values of z are bigger'.
Looking up in Appendix A.1 (normal
z-distribution) the smaller portion for .00417
(actually, the nearest tabled value, .00415), we find the value z = 2.64.
This is the critical value.

63
Determining the critical difference for z

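$$|\bar{R}_u - \bar{R}_v| \geq z_{\alpha/k(k-1)} \sqrt{\frac{N(N + 1)}{12} \left( \frac{1}{n_u} + \frac{1}{n_v} \right)}$$
$$\text{crit. diff} = 2.64 \sqrt{\frac{80(80 + 1)}{12} \left( \frac{1}{20} + \frac{1}{20} \right)}$$
$$\text{crit. diff} = 2.64 \sqrt{540 \, (0.1)}$$
$$\text{crit. diff} = 2.64 \sqrt{54}$$
$$\text{crit. diff} = 19.4$$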
Since sample sizes for all groups are identical,
this value holds for all comparisons.
We now can test the actual differences in mean
ranks for all comparisons against this critical
difference. If a value is bigger, then the
comparison is significant.
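
Instead of a table lookup, the critical z and the critical difference can be computed directly. This sketch uses `statistics.NormalDist` from the Python standard library in place of Appendix A.1; the function name `kw_critical_difference` is an assumption for illustration.

```python
import math
from statistics import NormalDist

def kw_critical_difference(alpha, k, n_total, n_u, n_v):
    """Critical z for alpha / (k (k - 1)) and the resulting critical mean-rank difference."""
    z_crit = NormalDist().inv_cdf(1 - alpha / (k * (k - 1)))
    diff = z_crit * math.sqrt(n_total * (n_total + 1) / 12 * (1 / n_u + 1 / n_v))
    return z_crit, diff

z_crit, crit_diff = kw_critical_difference(0.05, 4, 80, 20, 20)
print(round(z_crit, 2), round(crit_diff, 1))  # 2.64 19.4
```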

64
Testing individual differences in mean rank
against the critical difference (19.4)

According to this calculation, none of the
differences is significant! However, in the
previous Mann-Whitney tests the 'no meals vs 7
meals' comparison had been significant. How come?

65
Significant or ns comparisons?

In the earlier calculation we only had to divide our
overall α-level into 3 portions.
In the Mann-Whitney follow-up we had conducted only 3
comparisons, which yields a corrected α of
.05/3 = .0167. The p-value of the 'no vs 7 meals'
comparison had been .009, which is smaller than
.0167.
In the new calculation, however, we have an α of
.05/6 = .0083 (for all 6 pairwise comparisons; the .05/12 used to
look up z is the corresponding one-tailed area). Now .009 >
.0083, hence the comparison is n.s.
→ Better to carry out only a few reasonable
comparisons
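
The effect of the number of comparisons on the Bonferroni-corrected threshold can be shown in a few lines, using the p-value of .009 quoted for the 'no vs 7 meals' comparison (the function name `bonferroni_alpha` is an assumption for illustration):

```python
def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under a Bonferroni correction."""
    return alpha / n_comparisons

p_no_vs_7 = 0.009  # p-value of the 'no vs 7 meals' comparison
results = {m: p_no_vs_7 < bonferroni_alpha(0.05, m) for m in (3, 6)}
for m, significant in results.items():
    print(m, round(bonferroni_alpha(0.05, m), 4), significant)
# 3 0.0167 True
# 6 0.0083 False
```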

66
Testing for trends: the Jonckheere-Terpstra test

This test looks at the differences between the
medians of the groups, just as the
Kruskal-Wallis test does.
Additionally, it includes information about
whether the medians are ordered.
In our example, we do predict an order for the
sperm counts in the 4 groups:
no meal > 1 meal > 4 meals > 7 meals
In the coding variable, we have already encoded
the order which we expect (1, 2, 3, 4)

67
Output of the J-T test
If you have J-T in your version of SPSS,
the output would look like this
z-score = (912 - 1200)/116.33 = -2.476
The J-T test should always be 1-tailed (since we have
a directed hypothesis!). We compare |-2.47| against 1.65,
which is the z-value for an α-level of 5% for a
1-tailed test. Since 2.47 > 1.65, the result is
significant. The negative sign means that the medians
are in descending order (a positive sign would
have meant ascending order).
68
Calculating effect sizes

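Calculate effect sizes only for single focused
comparisons (n = 20 per group, so 2n = 40; for the J-T test, N = 80)
$$r = \frac{z}{\sqrt{2n}}$$
$$r_{\text{no soya vs 1 meal}} = \frac{-0.243}{\sqrt{40}} = -.04 \quad \text{(negligible effect)}$$
$$r_{\text{no soya vs 4 meals}} = \frac{-0.325}{\sqrt{40}} = -.05 \quad \text{(negligible effect)}$$
$$r_{\text{no soya vs 7 meals}} = \frac{-2.597}{\sqrt{40}} = -.41 \quad \text{(medium effect)}$$
$$r_{\text{Jonckheere}} = \frac{-2.47}{\sqrt{80}} = -.28 \quad \text{(medium effect)}$$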
69
Reporting the results of the Kruskal-Wallis test
(Field, 2005, p. 556)

Sperm counts were significantly affected by
eating soya meals (H(3) = 8.66, p ...). Mann-Whitney tests were used to follow up this
finding. A Bonferroni correction was applied and
so all effects are reported at a .0167 level of
significance. It appeared that sperm counts were
no different when one soya meal (U = 191, r = -.04)
or four soya meals (U = 188, r = -.05) were eaten
per week compared to none. However, when seven
soya meals were eaten per week, sperm counts were
significantly lower than when no soya was eaten
(U = 104, r = -.41).

70
Terminology
71
Differences between several related groups:
Friedman's ANOVA

Friedman's ANOVA is the non-parametric analogue
of a repeated-measures ANOVA (see chapter 11),
where the same subjects have been subjected to
various conditions.
Example here: Testing the effect of a new diet
called the 'Andikins diet' on n = 10 women. Their
weight (in kg) was measured 3 times:
Start
Month 1
Month 2
Would they lose weight in the course of the diet?

http//goc-frankfurt.de/images/dualit/personenwaag
e_450.jpg
72
Theory of Friedman's ANOVA

Each subject's weight on each of the 3 dates is listed
in a separate column. Then the ranks for the 3 dates
are determined and listed in separate columns.
Then, the ranks are summed up for each condition
($R_i$).

The 3 scores are always compared within each subject: the
smallest one gets rank 1, the next rank 2, and the
biggest one rank 3.
Diet data with ranks
73
The Test statistic Fr

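From the sum of ranks for each group, the test
statistic $F_r$ is derived
$$F_r = \left[ \frac{12}{Nk(k + 1)} \sum_{i=1}^{k} R_i^2 \right] - 3N(k + 1)$$
$$= \left[ \frac{12}{(10 \times 3)(3 + 1)} (19^2 + 20^2 + 21^2) \right] - (3 \times 10)(3 + 1)$$
$$= \frac{12}{120} (361 + 400 + 441) - 120$$
$$= 0.1 \, (1202) - 120$$
$$= 120.2 - 120 = 0.2$$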
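
The same calculation as a short sketch, using the three rank sums from the diet data (the function name `friedman_fr` is an assumption for illustration):

```python
def friedman_fr(rank_sums, n_subjects):
    """F_r = 12 / (N k (k + 1)) * sum(R_i^2) - 3 N (k + 1)."""
    k = len(rank_sums)
    sum_sq = sum(r ** 2 for r in rank_sums)
    return 12 / (n_subjects * k * (k + 1)) * sum_sq - 3 * n_subjects * (k + 1)

fr = friedman_fr([19, 20, 21], 10)
print(round(fr, 2))  # 0.2
```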

74
Data input and provisional analysis (using
diet.sav)

First, test for normality:
Analyze → Descriptive Statistics → Explore, tick
'Normality plots with tests' in the 'Plots' window

Data sheet
In the Shapiro-Wilk test (which is more accurate
than the K-S test), two groups (Start, 1 month)
show non-normal distributions. This violation of
a parametric assumption justifies the choice of a
non-parametric test.
75
Running Friedman's ANOVA

Analyze → Nonparametric Tests → K Related
Samples...

If you have 'Exact...', tick 'Exact' and limit the
calculation time to 5 minutes.
Other options
Request everything there is - it is not much...
76
Other options

Kendall's W: Similar to Friedman's ANOVA, but
looks specifically at agreement between raters.
For example, to what extent (from 0 to 1) women agree when they rate
Justin Timberlake, David Beckham, or Tony Blair
on their attractiveness. This is like a
correlation coefficient.
Cochran's Q: This is an extension of McNemar's
test. It is like a Friedman's test for
dichotomous data. For example, if women had to
judge whether they would like to kiss Justin
Timberlake, David Beckham, or Tony Blair and they
could only answer Yes or No.

77
Output from Friedman's ANOVA
The $F_r$ statistic is called Chi-Square here. It
has df = 2 (k - 1, where k is the number of groups). The
statistic is n.s.
78
Posthoc tests for Friedman's ANOVA
Wilcoxon signed-rank tests, but correcting for the number
of tests we do; here α = .05/3 = .0167.
Analyze → Nonparametric Tests → 2 Related Samples,
tick 'Wilcoxon', specify the 3 pairs of groups
Mean ranks and sums of ranks for all 3 comparisons
So, actually, we do not have to calculate any
further...
All comparisons are ns, as expected from the
overall ns effect.
79
Posthoc tests for Friedman's ANOVA: calculation
by hand

We take the difference between the mean ranks of
the different groups and compare it to a value
based on the value of z (corrected for the number of
comparisons) and a constant based on the total
sample size (N = 10) and the number of conditions (k = 3)
$$|\bar{R}_u - \bar{R}_v| \geq z_{\alpha/k(k-1)} \sqrt{\frac{k(k + 1)}{6N}}$$
$$\alpha/k(k-1) = .05/(3(3 - 1)) = .05/6 = .00833$$
The critical z is the value of z for which only
a proportion of .00833 of the other values of z are bigger. As before,
we look in Appendix A.1 under the column
Smaller Portion. The number corresponding to
.00833 is the critical value; it is between 2.39
and 2.4.
80
Calculating the critical differences

$$\text{Critical difference} = z_{\alpha/k(k-1)} \sqrt{\frac{k(k + 1)}{6N}}$$
$$\text{crit. diff} = 2.4 \sqrt{\frac{3(3 + 1)}{6 \times 10}}$$
$$\text{crit. diff} = 2.4 \sqrt{12/60}$$
$$\text{crit. diff} = 2.4 \sqrt{0.2}$$
$$\text{crit. diff} = 1.07$$
→ If a difference between mean ranks is ≥
the critical difference 1.07, then that
difference is significant.
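
A sketch of the same check in code, assuming the mean ranks are obtained by dividing the rank sums of the diet data (19, 20, 21) by the 10 subjects; `friedman_critical_difference` is a made-up name, and `statistics.NormalDist` replaces the table lookup.

```python
import math
from statistics import NormalDist

def friedman_critical_difference(alpha, k, n_subjects):
    """Critical z for alpha / (k (k - 1)) and the resulting critical mean-rank difference."""
    z_crit = NormalDist().inv_cdf(1 - alpha / (k * (k - 1)))
    return z_crit, z_crit * math.sqrt(k * (k + 1) / (6 * n_subjects))

z_crit, crit_diff = friedman_critical_difference(0.05, 3, 10)
print(round(z_crit, 2), round(crit_diff, 2))  # 2.39 1.07

# Mean ranks derived from the rank sums 19, 20, 21 with 10 subjects
mean_ranks = [r / 10 for r in (19, 20, 21)]
largest_diff = max(mean_ranks) - min(mean_ranks)
print(largest_diff >= crit_diff)  # False: even the largest difference is far below 1.07
```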

81
Calculating the differences between mean ranks
for diet data

→ None of the differences is ≥ the critical
difference 1.07, hence none of the comparisons is
significant.

82
Calculating the effect size

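Again, we will only calculate the effect sizes
for single comparisons (n = 10, so 2n = 20)

$$r = \frac{z}{\sqrt{2n}}$$
$$r_{\text{Start vs 1 month}} = \frac{-0.051}{\sqrt{20}} = -.01 \quad \text{(tiny effect)}$$
$$r_{\text{Start vs 2 months}} = \frac{-0.255}{\sqrt{20}} = -.06 \quad \text{(tiny effect)}$$
$$r_{\text{1 month vs 2 months}} = \frac{-0.153}{\sqrt{20}} = -.03 \quad \text{(tiny effect)}$$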
83
Reporting the results of Friedman's ANOVA
(Field, 2005, p. 566)

The weight of participants did not significantly
change over the 2 months of the diet ($\chi^2$(2) =
0.20, p > .05). Wilcoxon tests were used to
follow up on this finding. A Bonferroni
correction was applied and so all effects are
reported at a .0167 level of significance. It
appeared that weight didn't significantly change
from the start of the diet to 1 month, T = 27,
r = -.01, from the start of the diet to 2 months,
T = 25, r = -.06, or from 1 month to 2 months,
T = 26, r = -.03. We can conclude that the Andikins
diet (...) is a complete failure.

84
Summary Terminology
