Making Inferences for Associations Between Categorical Variables: Chi Square Chapter 12 - PowerPoint PPT Presentation

Slides: 22
Provided by: setonhallu
Learn more at: http://pirate.shu.edu

1
Making Inferences for Associations Between
Categorical Variables: Chi Square, Chapter 12
  • Reading Assignment
  • pp. 463-482, 485

2
Elements of a test of hypotheses 3
  • Hypothesis testing: Process for finding out
    whether we can generalize about an association
    from a sample to a population
  • Null hypothesis (H_0): Represents the status quo
    to the party performing the sampling experiment,
    i.e., will be accepted unless the data provide
    convincing evidence that it is false
  • Research hypothesis (H_1) (aka alternative
    hypothesis): Will be accepted only if the data
    provide convincing evidence of its truth
  • Homework: Skills 1, p. 464

3
Process of Hypothesis Testing 5
  • Step 1: Specify a research hypothesis and a null
    hypothesis
  • Step 2: Compute the value of a test statistic for
    the relationship
  • Step 3: Calculate the degrees of freedom for the
    variables involved
  • Step 4: Look up the distribution for the test
    statistic to find its critical value at a
    specified level of probability (to determine the
    likelihood that a test statistic of a particular
    value could have occurred by chance alone)
  • Step 5: Decide whether to reject the null
    hypothesis

4
Null Hypothesis 3
  • Null hypothesis (H_0): speculates that there is no
    association between the two variables. Examples:
  • H_0: Men are no different from women in their
    political affiliations
  • H_0: There is no relationship between a
    respondent's educational level and that of his or
    her parents
  • H_0: Older people are no more likely to be happy
    than younger people
  • This is the only hypothesis that can actually be
    tested: we either reject or fail to reject the
    null hypothesis
  • Ex.: H_0: There is no association between age and
    happiness among American adults (homework: read p. 466)

5
2 Statistical Independence
  • Statistical independence: Two variables are
    statistically independent when changes in one
    variable (age of respondents) have nothing to do
    with changes in a second (happiness), i.e., they
    vary independently of one another
  • Conversely, when two variables are statistically
    dependent on one another, changes in one variable
    are associated with changes in a second
    variable, i.e., changes in age (older respondents)
    are associated with changes in levels of
    happiness (more happiness)

6
2 Statistical Independence and hypothesis testing
  • Ex.: Null hypothesis: Age is statistically
    independent of happiness, i.e., differences among
    respondents in age are unrelated to any
    differences in their levels of reported
    happiness
  • Hypothesis testing can assess the likelihood that
    the degree of statistical dependence found in the
    sample is due to chance
  • If we find that the dependence found in the
    sample is not likely to be due to chance, the
    null hypothesis is rejected
  • If it is likely due to chance, the null
    hypothesis is not rejected

7
3 Type I and Type II Errors
  • Mistakes arising because a given sample may or
    may not be representative of a population
  • If the null hypothesis assumes there is no
    association between two variables, and we reject
    it even though there is no association, that is a
    Type I error, i.e., we call someone a liar when he
    is telling the truth
  • If the null hypothesis assumes there is no
    association between two variables, and we fail to
    reject it even though there is an association,
    that is a Type II error, i.e., we say someone is
    truthful when he is lying

8
3 Type I and Type II Errors

                 True state of affairs
Conclusion       H_0 true            H_1 true
H_0 true         Correct decision    Type II error
H_1 true         Type I error        Correct decision

9
3 Elements of a Test of Hypothesis
  • Null hypothesis (H_0): A theory about one of the
    population parameters. The theory generally
    represents the status quo, which must be proven
    false
  • Research hypothesis (H_1): A theory that
    contradicts the null hypothesis. The theory
    generally represents the claim that will be
    accepted only if there is evidence for it
  • Test statistic: Sample statistic used to decide
    whether to reject the null hypothesis

10
3 Elements of a Test of Hypothesis (cont)
  • Rejection region: The numerical values of the
    test statistic for which the null hypothesis will
    be rejected. The rejection region is chosen so
    that the probability is α that it will contain
    the test statistic when the null hypothesis is
    true, thereby leading to a Type I error. The
    value of α chosen is usually small (e.g., 0.01,
    0.05, or 0.1), and is referred to as the level
    of significance of the test. A 0.05 (or 5%)
    level of significance indicates that there is a
    5% chance that we would reject the null
    hypothesis when it is in fact true; in other
    words, a true null hypothesis is correctly
    retained 95% of the time
  • Assumptions: Clear statement(s) of any
    assumptions made about the population(s) being
    sampled
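
As a rough illustration of how the rejection region fixes the Type I error rate, the sketch below computes exactly how often a fair coin tossed 200 times would land in the rejection region chi-square > 3.84 (the 0.05 cutoff for 1 degree of freedom quoted in the coin question later in these slides). The function names and the biased-coin probability 0.6 are illustrative assumptions, not part of the slides.

```python
from math import comb

N_TOSSES = 200          # number of tosses, as in the coin question
CRITICAL_VALUE = 3.84   # chi-square cutoff for alpha = 0.05, 1 degree of freedom


def chi_square_for_heads(heads, n=N_TOSSES):
    """Chi-square statistic for `heads` heads in n tosses, testing a fair coin."""
    expected = n / 2
    tails = n - heads
    return (heads - expected) ** 2 / expected + (tails - expected) ** 2 / expected


def rejection_probability(p_heads, n=N_TOSSES, critical=CRITICAL_VALUE):
    """Exact probability that the statistic lands in the rejection region
    when each toss comes up heads with probability p_heads."""
    total = 0.0
    for heads in range(n + 1):
        if chi_square_for_heads(heads, n) > critical:
            total += comb(n, heads) * p_heads ** heads * (1 - p_heads) ** (n - heads)
    return total


# H_0 true (fair coin): any rejection is a Type I error.
type_i_rate = rejection_probability(0.5)

# H_0 false (hypothetical coin with P(heads) = 0.6): failing to reject is a Type II error.
type_ii_rate = 1 - rejection_probability(0.6)

print(f"Type I error rate:  {type_i_rate:.3f}")
print(f"Type II error rate: {type_ii_rate:.3f}")
```

Because the number of heads is a whole number, the Type I rate comes out close to, but not exactly, the nominal 0.05. The Type II rate depends entirely on how biased the coin really is, which is why it cannot be stated in advance.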

11

Elements of a Test of Hypothesis (cont)
  • Experiment and calculation of test statistic
  • Conclusion
  • If the numerical value of the test statistic
    falls in the rejection region, we reject the null
    hypothesis and conclude that the research
    hypothesis is true. We know that hypothesis
    testing will lead to this conclusion incorrectly
    (Type I error) 100α% of the time when H_0 is
    true.
  • If the test statistic does not fall in the
    rejection region, we do not reject H_0. Thus we
    reserve judgment about which hypothesis is true.
    We do not conclude that the null hypothesis is
    true because we do not, in general, know the
    probability that our test procedure will lead to
    an incorrect failure to reject H_0 (Type II error)

12
5 Chi-Square
  • Formula 12.1
  • Observed vs. expected: Roll a die 6 times and get
    three 3s. Observed: three 3s; expected: one 3
  • pp. 469-71, Skills: Filling in the table of
    expected values
  • Skills 3, 4: Excel
  • Generally, the greater the value of chi-square,
    the stronger the evidence of statistical
    dependence between two variables
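
To make the observed-versus-expected idea concrete, here is a minimal sketch that fills in a table of expected values and computes chi-square for a two-way table. The 2 x 2 counts (age group by happiness) are invented for illustration and the function names are assumptions. The calculation uses the standard form of the statistic, the sum of (observed - expected)^2 / expected over all cells, which is presumably what Formula 12.1 in the textbook expresses.

```python
def expected_counts(observed):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square(observed):
    """Sum of (observed - expected)^2 / expected over every cell of the table."""
    expected = expected_counts(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical counts: rows = younger / older, columns = happy / not happy.
observed = [
    [30, 20],
    [20, 30],
]

print(expected_counts(observed))  # [[25.0, 25.0], [25.0, 25.0]]
print(chi_square(observed))       # 4.0
```

A table whose observed counts match the expected counts exactly gives a chi-square of 0; the further the cells drift from what independence predicts, the larger the statistic becomes.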

13
3 Chi-Square /degrees of freedom
  • We are using observations from a sample as well
    as certain population parameters. If these
    parameters are unknown, they must be estimated
    from the sample.
  • Degrees of freedom (ν): the number N of
    independent observations in the sample (i.e.,
    sample size) minus the number k of population
    parameters which must be estimated from sample
    observations
  • ν = N - k
  • When working with a contingency table,
    df = (r-1)(c-1), where r and c are the number of
    rows and columns (resp.) in the contingency table
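
The two degrees-of-freedom rules above can be written as a short sketch. The function names are assumptions made for illustration; the formulas are the ones on the slide.

```python
def general_df(n_observations, n_estimated_parameters):
    """General rule from the slide: nu = N - k."""
    return n_observations - n_estimated_parameters


def contingency_df(n_rows, n_cols):
    """Degrees of freedom for an r x c contingency table: (r - 1)(c - 1)."""
    if n_rows < 2 or n_cols < 2:
        raise ValueError("a contingency table needs at least 2 rows and 2 columns")
    return (n_rows - 1) * (n_cols - 1)


print(contingency_df(2, 2))  # 1
print(contingency_df(3, 4))  # 6
print(general_df(10, 1))     # 9
```

The last call mirrors the 10 - 1 = 9 used in the random digit example in these slides.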

15
Chi squared example: generate random digits 250
times
  • Question: Does the observed frequency differ from
    the expected distribution in a significant way?

Digit      0   1   2   3   4   5   6   7   8   9
Obs freq  17  31  29  18  14  20  35  30  20  36
Exp freq  25  25  25  25  25  25  25  25  25  25

16
3 Chi-Square random digit example
  • χ² = (17-25)²/25 + (31-25)²/25 + (29-25)²/25 + ...
    + (36-25)²/25 (computed in Excel)
  • χ² = 23.3
  • Degrees of freedom: 10 - 1 = 9
  • Table, p. 545
  • χ² at .99 is 21.7; 23.3 > 21.7, so the observed
    frequencies differ from the expected frequencies
    at the 0.01 level of significance, so the table
    of random numbers is somewhat doubtful
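
The random digit calculation can be checked with a few lines of Python instead of Excel. This is only a sketch of the arithmetic: the frequencies and the critical value 21.7 are taken from the slides, while the variable names are assumptions.

```python
observed = [17, 31, 29, 18, 14, 20, 35, 30, 20, 36]   # counts for digits 0..9
expected = [25] * 10                                   # 250 draws / 10 digits

CRITICAL_99 = 21.7   # table value quoted in the slide for 9 degrees of freedom

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1

print(round(chi_square, 2))        # 23.28
print(degrees_of_freedom)          # 9
print(chi_square > CRITICAL_99)    # True
```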

17
3 Chi-Square question
  • 200 tosses of a fair coin, 115 heads, 85 tails.
    Test the hypothesis that the coin is fair using
    (a) 0.05, (b) 0.01 levels of significance
  • Answer:
  • df = 2 - 1 = 1 (2 categories: H, T)
  • O_1 = 115, O_2 = 85; E_1 = E_2 = 100
  • χ² = (115-100)²/100 + (85-100)²/100 = 4.5
  • (a) χ² table for .95 is 3.84; 4.5 > 3.84, so reject
    the hypothesis that the coin is fair at the 0.05
    level of significance
  • (b) χ² table for .99 is 6.63; 4.5 < 6.63, so cannot
    reject the hypothesis that the coin is fair at
    the 0.01 level of significance
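
The coin question can be worked the same way. The sketch below reproduces the statistic and both decisions using the critical values quoted above, and additionally computes an exact p value. That last step is not on the slide: it relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal, so its upper-tail probability is erfc(sqrt(x / 2)). Variable names are assumptions.

```python
from math import erfc, sqrt

observed = {"heads": 115, "tails": 85}
expected = {"heads": 100, "tails": 100}   # fair coin, 200 tosses

chi_square = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in observed)

# Critical values quoted in the slide for 1 degree of freedom.
critical_values = {0.05: 3.84, 0.01: 6.63}
decisions = {alpha: chi_square > cutoff for alpha, cutoff in critical_values.items()}

# For 1 degree of freedom, chi-square is the square of a standard normal,
# so P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = erfc(sqrt(chi_square / 2))

print(chi_square)          # 4.5
print(decisions)           # {0.05: True, 0.01: False}
print(round(p_value, 4))   # 0.0339
```

The p value falls between 0.01 and 0.05, which is consistent with rejecting at the 0.05 level but not at the 0.01 level.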

18
Interpreting Chi Square 4
  • When hypothesizing about an association between
    two variables, chi-square tells the likelihood
    that the degree of statistical dependence
    observed is simply the luck of the draw
  • A p value of 0.05 means that, if the variables
    were truly independent, there would be only 5
    chances in 100 of seeing statistical dependence
    this strong by chance alone. Because that is
    unlikely, the null hypothesis, i.e., no
    association between the variables, is rejected
  • The lower the p value at which we reject, the
    less likely we are to be making a Type I error

20
Interpreting Chi Square 4
  • pp. 480-81: Table 12.4 (p. 472) has χ² = 15.487,
    ν = 6
  • The higher the χ² value, the less likely it is
    that the value obtained is due to chance (read
    Table 12.9, p. 481)
  • Rule of thumb: reject the null hypothesis when
    the p value for χ² reaches 0.05 or below, i.e.,
    when a χ² this large would arise by chance only
    5 times in 100 if the variables were independent
  • Skills 7, p. 481
  • Skills 8, p. 485 (following their example, p.
    484)
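
As a hedged illustration of the rule of thumb, the sketch below converts the chi-square value quoted for Table 12.4 into a p value. It assumes the garbled "6" on the slide is the degrees of freedom, and it uses a closed form that holds only for an even number of degrees of freedom: P(chi-square > x) = exp(-x/2) times the sum of (x/2)^i / i! for i below df/2. The function name is an assumption.

```python
from math import exp, factorial


def chi_square_p_value_even_df(statistic, df):
    """P(chi-square > statistic) for an even number of degrees of freedom.

    Uses the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only covers positive even df")
    half = statistic / 2
    return exp(-half) * sum(half ** i / factorial(i) for i in range(df // 2))


# Value quoted for Table 12.4; df = 6 is an assumption about the garbled slide text.
p_value = chi_square_p_value_even_df(15.487, 6)

print(f"p = {p_value:.3f}")
print("reject H_0 at 0.05:", p_value <= 0.05)
```

Under that assumption the p value comes out at roughly 0.017, below the 0.05 rule of thumb, so the null hypothesis of no association would be rejected.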

21
4
  • Homework/ p. 492/ 1,3
  • P 494/ spss 1,2