Title: Measuring/testing for association in tables: Cramér's V and Odds Ratios
1 Measuring/testing for association in tables
Cramér's V and Odds Ratios
- Monday 28th November 2005
2 Strength of Association
- Chi-square tells us whether there is a significant association between two variables (or whether there exists an association that would be unlikely to be found by chance).
- However, it does not tell us in a clear way how strong this association is, since the size of chi-square depends on the sample size.
- Today we will look at:
  - Cramér's V. This is a PRE statistic that tells us the strength of associations in contingency tables (the meaning of PRE is explained below). It also enables us to compare different associations and decide which is stronger.
  - Odds ratios. These tell us the relative odds of an event occurring for different categories or groups of cases (or people). As such, they are a relatively easily understood way to discuss the strength of association in contingency tables.
3 PRE
- Strength of association is generally measured by PRE, or Proportional Reduction in Error.
- Another way of saying this is: how much better will our prediction of the dependent variable be if we know something about the independent variable?
- PRE statistics range from 0 to 1, where
  - values of 0 to 0.25 indicate a non-existent to weak association
  - values between 0.26 and 0.50 indicate a moderate association
  - values between 0.51 and 0.75 indicate a moderate to strong association
  - values between 0.76 and 1 indicate a strong to perfect association (only a value of 1 is a perfect association).
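As a rough illustration of the "reduction in error" idea, the sketch below computes Goodman and Kruskal's lambda, one standard PRE measure, using the boys' counts that appear in the chi-square working later in these slides. Lambda is an assumption made for illustration: the slides do not name it, and it is not the statistic they go on to use. The age-group names `group_1` to `group_3` are placeholders, since the original table labels are not reproduced here.

```python
# Observed counts for boys: (plays sport with a club, does not) per age group.
# Age-group names are placeholders (assumption); the counts come from the
# chi-square working in these slides.
boys = {
    "group_1": (62, 33),
    "group_2": (68, 57),
    "group_3": (34, 49),
}


def lambda_pre(table):
    """Goodman and Kruskal's lambda for predicting the column outcome
    from the row category: (E1 - E2) / E1."""
    n = sum(sum(row) for row in table.values())
    col_totals = [sum(col) for col in zip(*table.values())]
    # E1: errors made when always guessing the overall modal outcome.
    e1 = n - max(col_totals)
    # E2: errors made when guessing the modal outcome within each row.
    e2 = sum(sum(row) - max(row) for row in table.values())
    return (e1 - e2) / e1


lam = lambda_pre(boys)
print(f"lambda = {lam:.3f}")
```

Here knowing a boy's age group removes 15 of the 139 prediction errors made without it (about 11%), which falls in the "non-existent to weak" band.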
4 Example: Sex, Age and Sport
Data from the Young People's Social Attitudes Study 2003 (available on module website)
Boys (men)
Girls (women)
5 What can we say about these tables?
- It looks like boys play sport with clubs more than girls.
- And it looks like both boys and girls become less likely to play sport with clubs as they get older.
- So,
  - Does age significantly affect sports club membership for boys? For girls?
  - Is the effect of age stronger/weaker for boys than it is for girls?
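6 Question 1: Does age significantly affect sports club membership for boys?
To answer this question we work out chi-square, by calculating

$$
\begin{aligned}
\chi^2 &= \sum_{i,j} \frac{(\text{Observed}_{ij} - \text{Expected}_{ij})^2}{\text{Expected}_{ij}} \\
&= \frac{\left(62 - \frac{95 \times 164}{303}\right)^2}{\frac{95 \times 164}{303}}
 + \frac{\left(33 - \frac{95 \times 139}{303}\right)^2}{\frac{95 \times 139}{303}}
 + \frac{\left(68 - \frac{125 \times 164}{303}\right)^2}{\frac{125 \times 164}{303}} \\
&\quad + \frac{\left(57 - \frac{125 \times 139}{303}\right)^2}{\frac{125 \times 139}{303}}
 + \frac{\left(34 - \frac{83 \times 164}{303}\right)^2}{\frac{83 \times 164}{303}}
 + \frac{\left(49 - \frac{83 \times 139}{303}\right)^2}{\frac{83 \times 139}{303}} \\
&= \frac{(62-51.4)^2}{51.4} + \frac{(33-43.6)^2}{43.6} + \frac{(68-67.7)^2}{67.7}
 + \frac{(57-57.3)^2}{57.3} + \frac{(34-44.9)^2}{44.9} + \frac{(49-38.1)^2}{38.1} \\
&= 2.2 + 2.6 + 0 + 0 + 2.6 + 3.1 = 10.5
\end{aligned}
$$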
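The same working can be sketched in a few lines of Python. This is a minimal illustration rather than the module's own method: it builds each expected count as row total × column total / N and sums the six cell contributions. The function name `chi_square` and the list layout (one row per age group) are choices made for this example.

```python
# Observed counts for boys, one row per age group:
# (plays sport with a club, does not). Taken from the working above.
observed = [
    [62, 33],
    [68, 57],
    [34, 49],
]


def chi_square(table):
    """Pearson chi-square: sum of (O - E)^2 / E over all cells,
    with E = row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            total += (obs - expected) ** 2 / expected
    return total


chi2 = chi_square(observed)
print(f"chi-square = {chi2:.3f}")
```

Because the expected counts are not rounded to one decimal place along the way, the result comes out at about 10.54 rather than exactly 10.5.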
7 Question 1: Does age significantly affect sports club membership for boys?
- Chi-square = 10.5.
- df = (r-1)(c-1) = 1 × 2 = 2.
- If we look up the .05 value for 2 degrees of freedom it is 5.99, and the .01 value is 9.21. Therefore this is significant at p < 0.01.
- And the computer printout, below, confirms that in fact the p value is .005 (n.b. chi-square without rounding is 10.541).
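The table lookup and the printed p value can be checked with a short sketch. It relies on the fact that, for 2 degrees of freedom only, the upper-tail probability of chi-square has the closed form exp(-x/2); for other degrees of freedom a library routine (for example `scipy.stats.chi2.sf`) would be needed instead. The helper names and the 2-by-3 layout (sport by age group) are assumptions made for this example.

```python
import math

chi2 = 10.541          # unrounded chi-square for boys, as quoted above
n_rows, n_cols = 2, 3  # assumed layout: sport (2) by age group (3)

df = (n_rows - 1) * (n_cols - 1)


def chi2_sf_df2(x):
    """Upper-tail probability of a chi-square variable with 2 degrees of
    freedom. Valid for df = 2 only, where it equals exp(-x / 2)."""
    return math.exp(-x / 2)


def chi2_critical_df2(alpha):
    """Critical value for df = 2, from solving exp(-x / 2) = alpha."""
    return -2 * math.log(alpha)


p_value = chi2_sf_df2(chi2)
print(f"df = {df}")
print(f"critical values: {chi2_critical_df2(0.05):.2f} (.05), "
      f"{chi2_critical_df2(0.01):.2f} (.01)")
print(f"p = {p_value:.3f}")
```

The chi-square of 10.541 exceeds both critical values, so the result is significant at the .01 level, in line with the reported p value of .005.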
8 And girls?
- The similar chi-square test for girls shows that age has a significant effect on whether or not they play sport (p < .001).
9 Question 2: Is the effect of age stronger/weaker for boys than it is for girls?
- The chi-square statistic is bigger for girls than for boys; however, the sample of girls is also bigger (360 versus 303), so this may have affected it.
- To work out the strength of association we need to correct for both sample size and table shape (as this also affects the magnitude of chi-square statistics). A frequently used measure of association is Cramér's V:

$$V = \sqrt{\frac{\chi^2}{N(L-1)}}$$

  (where $\chi^2$ is chi-square, N is the sample size, and L is the lesser of the number of rows and number of columns).
- Note: in any table where either the number of rows or the number of columns is equal to 2, Cramér's V is equal to another measure of association, referred to as phi ($\phi$).
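To see why the correction for sample size matters, the sketch below compares the boys' table with a hypothetical table that has exactly the same proportions but twice as many cases. Chi-square doubles, while Cramér's V stays the same. The doubled table is invented purely for illustration, and the function names are choices made for this example.

```python
import math


def chi_square(table):
    """Pearson chi-square for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(table)
        for j, obs in enumerate(row)
    )


def cramers_v(table):
    """Cramér's V = sqrt(chi2 / (N * (L - 1))), L = min(rows, columns)."""
    n = sum(sum(row) for row in table)
    smaller_dim = min(len(table), len(table[0]))
    return math.sqrt(chi_square(table) / (n * (smaller_dim - 1)))


# Boys' counts from the chi-square working earlier in these slides.
boys = [[62, 33], [68, 57], [34, 49]]
# Hypothetical sample: identical proportions, twice as many cases.
boys_doubled = [[2 * count for count in row] for row in boys]

print(chi_square(boys), chi_square(boys_doubled))  # chi-square doubles
print(cramers_v(boys), cramers_v(boys_doubled))    # V is unchanged
```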
10 Comparing strength of association between age and involvement in sport for boys and girls
Boys (men): $\chi^2 = 10.541$
Girls (women): $\chi^2 = 23.394$
Cramér's V for the two tables is therefore

$$V_{\text{boys}} = \sqrt{\frac{10.5}{303\,(2-1)}} = 0.186 \quad \text{and} \quad V_{\text{girls}} = \sqrt{\frac{23.9}{360\,(2-1)}} = 0.258$$
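The two values can be reproduced with a small helper. This sketch plugs in the rounded chi-square figures exactly as they appear in the working (10.5 and 23.9); note that the girls' chi-square is listed as 23.394 on this slide, which would give a V of about 0.255 instead. The function signature and the 2-by-3 table shape are assumptions made for this example.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramér's V from a chi-square value, the sample size and the
    table shape: sqrt(chi2 / (N * (L - 1))), L = min(rows, columns)."""
    smaller_dim = min(n_rows, n_cols)
    return math.sqrt(chi2 / (n * (smaller_dim - 1)))


# Chi-square values as used in the working; table assumed to be 2 by 3.
v_boys = cramers_v(10.5, 303, 2, 3)
v_girls = cramers_v(23.9, 360, 2, 3)

print(f"boys:  V = {v_boys:.3f}")
print(f"girls: V = {v_girls:.3f}")
print("stronger association:", "girls" if v_girls > v_boys else "boys")
```

Because both tables have the same shape, the two V values can be compared directly even though the samples differ in size.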
11 Cramér's V
- We can also see Cramér's V in the SPSS printout (to the right: boys above, girls below).
12 What does it mean?
- We can say that age has a significant but relatively weak effect on boys' participation in sport.
- And that age has a significant and slightly stronger effect on girls' participation in sport.
- Thus, both girls and boys are likely to decrease their participation in sports clubs as they get older, but this effect is more pronounced among girls than among boys.
- And there is a (small) gender difference in the relationship between age and participation in sport.
13 Odds Ratios
The odds ratio is the odds of an outcome given membership of group a divided by the odds of that outcome given membership of group b. Or, looking at the table below, the odds of playing sport if you're male (or membership of group male), divided by the odds of playing sport if you're female (or membership of group female).
14 Working out the odds
- The odds of an event occurring can be worked out by dividing the number of times that it occurs by the number of times that it does not occur.
ODDS OF A MALE PLAYING SPORT IN A CLUB = 164/139 = 1.18
This means that a male is 1.18 times as likely to be a member of a sports club as not to be.
ODDS OF A FEMALE PLAYING SPORT IN A CLUB = 114/246 = 0.46
This means that a female is 0.46 times as likely to be a member of a sports club as not to be (or less than 50% as likely). N.B. it's often easier to talk about it the other way round (i.e. 246/114 = 2.16, therefore women are about twice as likely not to take part in sports clubs as they are to take part).
The ODDS RATIO is the odds of a male playing sport divided by the odds of a female playing sport = 1.18 / 0.46 = 2.57 (2.55 if the two odds are not rounded first).
Therefore the odds that a male is part of a sports club are over two and a half times as great as the odds of a female being part of a sports club.
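The arithmetic on this slide can be sketched as follows. The counts (164 and 139 for males, 114 and 246 for females) are those given above; the helper name `odds` is a choice made for this example.

```python
def odds(occurs, does_not_occur):
    """Odds of an event: times it occurs divided by times it does not."""
    return occurs / does_not_occur


male_odds = odds(164, 139)      # males: in a sports club vs not
female_odds = odds(114, 246)    # females: in a sports club vs not
female_odds_not = odds(246, 114)  # the same comparison the other way round
odds_ratio = male_odds / female_odds

print(f"male odds = {male_odds:.2f}")
print(f"female odds = {female_odds:.2f}")
print(f"female odds of not playing = {female_odds_not:.2f}")
print(f"odds ratio = {odds_ratio:.2f}")
```

Keeping full precision until the final step gives an odds ratio of about 2.55; dividing the already rounded odds (1.18 / 0.46) gives the slightly higher 2.57.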
15 Odds Ratios
- When odds ratios are equal to 1 the two groups are identical (the odds of the given outcome are the same for both groups).
- When odds ratios get close to 0 or to infinity the groups are totally different (the odds of the given outcome are very high (or certain) for one group and very low (or zero) for the other).
- We will come back to odds ratios when we look at logistic regression and log-linear analysis.
- Note: a mathematical description of the odds ratio, for a 2 × 2 table with cells a and b in the first row and c and d in the second, and with the independent variable in the columns, is

$$\text{OR} = \frac{a/c}{b/d} = \frac{ad}{bc}$$
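A quick numerical check of the two forms, using the club-sport counts from the earlier odds slide. The assignment of counts to a, b, c and d (columns male and female, rows plays and does not play) is an assumption based on the note that the independent variable is in the columns.

```python
import math


def odds_ratio(a, b, c, d):
    """Cross-product form ad / bc for a 2x2 table
        a  b
        c  d
    with the groups (independent variable) in the columns."""
    return (a * d) / (b * c)


# Columns: male, female. Rows: plays sport in a club, does not.
a, b = 164, 114
c, d = 139, 246

ratio_of_odds = (a / c) / (b / d)       # odds in column 1 / odds in column 2
cross_product = odds_ratio(a, b, c, d)  # the same quantity as ad / bc
print(ratio_of_odds, cross_product)

# Two groups with identical odds (10:30 and 20:60) give an odds ratio of 1.
print(odds_ratio(10, 20, 30, 60))
```

The cross-product form is convenient because it needs only one division, which avoids rounding the two odds separately.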