Chi-squared Tests

We want to test the goodness of fit of a particular theoretical distribution to an observed distribution. The procedure is:

1. Set up the null and alternative hypotheses and select the significance level.
2. Draw a random sample of observations from a population or process.
3. Derive expected frequencies under the assumption that the null hypothesis is true.
4. Compare the observed frequencies and the expected frequencies.
5. If the discrepancy between the observed and expected frequencies is too great to attribute to chance fluctuations at the selected significance level, reject the null hypothesis.

Example 1: Five brands of coffee are taste-tested by 1000 people, with the results below. Test at the 5% level the hypothesis that, in the general population, there is no difference in the proportions preferring each brand (i.e., H0: pA = pB = pC = pD = pE versus H1: not all the proportions are the same).

Brand        Observed       Theoretical    fo-ft    (fo-ft)²    (fo-ft)²/ft
preference   frequency fo   frequency ft
A             210
B             312
C             170
D              85
E             223
Total        1000

If all the proportions were the same, we'd expect about 200 people in each group, since we have a total of 1000 people and 5 brands.

Brand        Observed       Theoretical    fo-ft    (fo-ft)²    (fo-ft)²/ft
preference   frequency fo   frequency ft
A             210            200
B             312            200
C             170            200
D              85            200
E             223            200
Total        1000           1000

We next compute the differences in the observed

and theoretical frequencies.

Brand        Observed       Theoretical    fo-ft    (fo-ft)²    (fo-ft)²/ft
preference   frequency fo   frequency ft
A             210            200             10
B             312            200            112
C             170            200            -30
D              85            200           -115
E             223            200             23
Total        1000           1000

Then we square each of those differences.

Brand        Observed       Theoretical    fo-ft    (fo-ft)²    (fo-ft)²/ft
preference   frequency fo   frequency ft
A             210            200             10        100
B             312            200            112      12544
C             170            200            -30        900
D              85            200           -115      13225
E             223            200             23        529
Total        1000           1000

Then we divide each of the squares by the expected frequency and add the quotients. When the null hypothesis is true, the resulting statistic has approximately a chi-squared (χ²) distribution.

Brand        Observed       Theoretical    fo-ft    (fo-ft)²    (fo-ft)²/ft
preference   frequency fo   frequency ft
A             210            200             10        100        0.500
B             312            200            112      12544       62.720
C             170            200            -30        900        4.500
D              85            200           -115      13225       66.125
E             223            200             23        529        2.645
Total        1000           1000                                136.49
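The arithmetic in the table can be checked with a few lines of Python. This is only a sketch of the calculation for Example 1; the function name `chi_squared_statistic` and the variable names are illustrative choices, not part of the original slides.

```python
# Observed counts from Example 1 (coffee brands A to E).
observed = {"A": 210, "B": 312, "C": 170, "D": 85, "E": 223}


def chi_squared_statistic(fo, ft):
    """Return the sum of (fo - ft)^2 / ft over all categories."""
    return sum((o - t) ** 2 / t for o, t in zip(fo, ft))


n = sum(observed.values())                      # 1000 tasters in total
expected = [n / len(observed)] * len(observed)  # 200 per brand under H0
chi2 = chi_squared_statistic(observed.values(), expected)

print(round(chi2, 2))  # 136.49
```

The same function works for any goodness-of-fit table once the expected frequencies are known.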

The chi-squared (χ²) distribution

The chi-squared distribution is skewed to the

right. (i.e. It has the bump on the left and

the tail on the right.)

In these goodness of fit problems, the number of degrees of freedom is

dof = (number of categories) - (number of restrictions) - (number of parameters estimated)

In the current problem, we have 5 categories (the 5 brands). We have 1 restriction: when we determined our expected frequencies, we restricted our numbers so that the total would be the same as the total for the observed frequencies (1000). We didn't estimate any parameters in this particular problem. So dof = 5 - 1 - 0 = 4.

Large values of the χ² statistic indicate big discrepancies between the observed and theoretical frequencies.

So when the χ² statistic is large, we reject the hypothesis that the theoretical distribution is a good fit. That means the critical region consists of the large values, the right tail.

[Figure: χ² density curve, with the acceptance region on the left and the critical region in the right tail.]

From the χ² table, we see that for a 5% test with 4 degrees of freedom, the cut-off point is 9.488. In the current problem, our χ² statistic had a value of 136.49. So we reject the null hypothesis and conclude that the proportions preferring each brand were not the same.
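The degrees-of-freedom count and the right-tailed decision rule can be written down as two small helpers. This is a minimal sketch: the function names are hypothetical, and the critical value 9.488 is simply the table value quoted above rather than something the code computes.

```python
def degrees_of_freedom(categories, restrictions=1, estimated_parameters=0):
    """Goodness-of-fit dof: categories - restrictions - estimated parameters."""
    return categories - restrictions - estimated_parameters


def reject_null(statistic, critical_value):
    """Right-tailed test: reject H0 when the statistic exceeds the cut-off."""
    return statistic > critical_value


dof = degrees_of_freedom(categories=5)  # 5 brands, 1 restriction, 0 parameters
critical_value = 9.488                  # 5% cut-off for 4 dof, from the table
print(dof, reject_null(136.49, critical_value))  # 4 True
```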

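[Figure: χ² density f(χ²) with 4 degrees of freedom. The critical region, with area 0.05, lies to the right of 9.488; the statistic 136.49 falls far inside it.]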

Example 2: A diagnostic test of mathematics is given to a group of 1000 students. The administrator analyzing the results wants to know if the scores of this group differ significantly from those of the past. Test at the 10% level.

Grade     Historical    Expected        Current         fo-ft    (fo-ft)²    (fo-ft)²/ft
          rel. freq.    abs. freq. ft   obs. freq. fo
90-100    0.10                            50
80-89     0.20                           100
70-79     0.40                           500
60-69     0.20                           200
<60       0.10                           150
Total                                   1000

Based on the historical relative frequency, we

determine the expected absolute frequency,

restricting the total to the total for the

current observed frequency.

Grade     Historical    Expected        Current         fo-ft    (fo-ft)²    (fo-ft)²/ft
          rel. freq.    abs. freq. ft   obs. freq. fo
90-100    0.10           100              50
80-89     0.20           200             100
70-79     0.40           400             500
60-69     0.20           200             200
<60       0.10           100             150
Total                   1000            1000

We subtract the theoretical frequency from the

observed frequency.

Grade     Historical    Expected        Current         fo-ft    (fo-ft)²    (fo-ft)²/ft
          rel. freq.    abs. freq. ft   obs. freq. fo
90-100    0.10           100              50             -50
80-89     0.20           200             100            -100
70-79     0.40           400             500             100
60-69     0.20           200             200               0
<60       0.10           100             150              50
Total                   1000            1000

We square those differences.

Grade     Historical    Expected        Current         fo-ft    (fo-ft)²    (fo-ft)²/ft
          rel. freq.    abs. freq. ft   obs. freq. fo
90-100    0.10           100              50             -50       2,500
80-89     0.20           200             100            -100      10,000
70-79     0.40           400             500             100      10,000
60-69     0.20           200             200               0           0
<60       0.10           100             150              50       2,500
Total                   1000            1000

We divide the square by the theoretical frequency

and sum up.

Grade     Historical    Expected        Current         fo-ft    (fo-ft)²    (fo-ft)²/ft
          rel. freq.    abs. freq. ft   obs. freq. fo
90-100    0.10           100              50             -50       2,500       25
80-89     0.20           200             100            -100      10,000       50
70-79     0.40           400             500             100      10,000       25
60-69     0.20           200             200               0           0        0
<60       0.10           100             150              50       2,500       25
Total                   1000            1000                                  125
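For Example 2 the expected frequencies come from the historical relative frequencies rather than from equal shares. The sketch below repeats the table's steps in Python; the variable names are illustrative only.

```python
# Historical relative frequencies and current observed counts,
# in the table's order: 90-100, 80-89, 70-79, 60-69, below 60.
historical_rel_freq = [0.10, 0.20, 0.40, 0.20, 0.10]
observed = [50, 100, 500, 200, 150]

n = sum(observed)  # restrict the expected total to the observed total (1000)
expected = [p * n for p in historical_rel_freq]

# One (fo - ft)^2 / ft term per grade group, then the total.
terms = [(fo - ft) ** 2 / ft for fo, ft in zip(observed, expected)]
chi2 = sum(terms)

print([round(t, 2) for t in terms])
print(round(chi2, 2))  # 125.0, matching the table total
```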

We have 5 categories (the 5 grade groups). We have 1 restriction: we restricted our expected frequencies so that the total would be the same as the total for the observed frequencies (1000). We didn't estimate any parameters in this particular problem. So dof = 5 - 1 - 0 = 4.

From the χ² table, we see that for a 10% test with 4 degrees of freedom, the cut-off point is 7.779. In the current problem, our χ² statistic had a value of 125. So we reject the null hypothesis and conclude that the grade distribution is NOT the same as it was historically.

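[Figure: χ² density f(χ²) with 4 degrees of freedom. The critical region, with area 0.10, lies to the right of 7.779; the statistic 125 falls far inside it.]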

Example 3: Test at the 5% level whether the demand for a particular product, as listed below, has a Poisson distribution.

# of units    Observed        x·fo    prob.    Expected        fo-ft    (fo-ft)²    (fo-ft)²/ft
per day x     # of days fo            f(x)     # of days ft
0              11
1              28
2              43
3              47
4              32
5              28
6               7
7               0
8               2
9               1
10              1
Total         200

Multiplying the number of days on which each

amount was sold by the amount sold on that day,

and then adding those products, we find that the

total number of units sold on the 200 days is

600. So the mean number of units sold per day is

3.

# of units    Observed        x·fo    prob.    Expected        fo-ft    (fo-ft)²    (fo-ft)²/ft
per day x     # of days fo            f(x)     # of days ft
0              11               0
1              28              28
2              43              86
3              47             141
4              32             128
5              28             140
6               7              42
7               0               0
8               2              16
9               1               9
10              1              10
Total         200             600

We use 3 as the estimated mean for the Poisson distribution. Then, using the Poisson table, we determine the probabilities for each x value.

# of units    Observed        x·fo    prob.    Expected        fo-ft    (fo-ft)²    (fo-ft)²/ft
per day x     # of days fo            f(x)     # of days ft
0              11               0     0.050
1              28              28     0.149
2              43              86     0.224
3              47             141     0.224
4              32             128     0.168
5              28             140     0.101
6               7              42     0.050
7               0               0     0.022
8               2              16     0.008
9               1               9     0.003
10              1              10     0.001
Total         200             600     1.000

Then we multiply the probabilities by 200 to

compute ft, the expected number of days on which

each number of units would be sold. By

multiplying by 200, we restrict the ft total to

be the same as the fo total.

# of units    Observed        x·fo    prob.    Expected        fo-ft    (fo-ft)²    (fo-ft)²/ft
per day x     # of days fo            f(x)     # of days ft
0              11               0     0.050     10.0
1              28              28     0.149     29.8
2              43              86     0.224     44.8
3              47             141     0.224     44.8
4              32             128     0.168     33.6
5              28             140     0.101     20.2
6               7              42     0.050     10.0
7               0               0     0.022      4.4
8               2              16     0.008      1.6
9               1               9     0.003      0.6
10              1              10     0.001      0.2
Total         200             600     1.000    200
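The mean, the Poisson probabilities, and the expected numbers of days can also be computed directly instead of being read from a Poisson table. The sketch below assumes the standard Poisson formula f(x) = e^(-μ) μ^x / x!; the helper name `poisson_pmf` is an illustrative choice. Because the table rounds each probability to three places first, its ft values (for example 20.2 for x = 5) differ slightly from the unrounded ones computed here.

```python
import math

demand = list(range(11))                       # x = 0, 1, ..., 10 units per day
observed_days = [11, 28, 43, 47, 32, 28, 7, 0, 2, 1, 1]

total_days = sum(observed_days)                # 200 days
mean = sum(x * fo for x, fo in zip(demand, observed_days)) / total_days


def poisson_pmf(x, mu):
    """Poisson probability of exactly x events when the mean is mu."""
    return math.exp(-mu) * mu ** x / math.factorial(x)


probs = [poisson_pmf(x, mean) for x in demand]
expected_days = [total_days * p for p in probs]

print(mean)                                    # 3.0
print([round(p, 3) for p in probs])            # matches the prob. f(x) column
print([round(e, 1) for e in expected_days])
```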

When the ft values are small (less than 5), the test is not reliable. So we group categories with small ft values. In this example, we group the last 4 categories (x = 7, 8, 9, and 10) into one.

# of units    Observed # of days fo    x·fo    prob.    Expected # of days ft    fo-ft    (fo-ft)²    (fo-ft)²/ft
per day x     original    grouped              f(x)     original    grouped
0              11          11            0     0.050     10.0        10.0
1              28          28           28     0.149     29.8        29.8
2              43          43           86     0.224     44.8        44.8
3              47          47          141     0.224     44.8        44.8
4              32          32          128     0.168     33.6        33.6
5              28          28          140     0.101     20.2        20.2
6               7           7           42     0.050     10.0        10.0
7               0           4            0     0.022      4.4         6.8
8               2                       16     0.008      1.6
9               1                        9     0.003      0.6
10              1                       10     0.001      0.2
Total         200         200          600              200         200

Rows 7 to 10 are pooled into a single category with fo = 4 and ft = 6.8.

Next we subtract the theoretical frequencies ft

from the observed frequencies fo.

# of units    Observed # of days fo    x·fo    prob.    Expected # of days ft    fo-ft    (fo-ft)²    (fo-ft)²/ft
per day x     original    grouped              f(x)     original    grouped
0              11          11            0     0.050     10.0        10.0          1.0
1              28          28           28     0.149     29.8        29.8         -1.8
2              43          43           86     0.224     44.8        44.8         -1.8
3              47          47          141     0.224     44.8        44.8          2.2
4              32          32          128     0.168     33.6        33.6         -1.6
5              28          28          140     0.101     20.2        20.2          7.8
6               7           7           42     0.050     10.0        10.0         -3.0
7               0           4            0     0.022      4.4         6.8         -2.8
8               2                       16     0.008      1.6
9               1                        9     0.003      0.6
10              1                       10     0.001      0.2
Total         200         200          600              200         200

Then we square the differences,

# of units    Observed # of days fo    x·fo    prob.    Expected # of days ft    fo-ft    (fo-ft)²    (fo-ft)²/ft
per day x     original    grouped              f(x)     original    grouped
0              11          11            0     0.050     10.0        10.0          1.0      1.00
1              28          28           28     0.149     29.8        29.8         -1.8      3.24
2              43          43           86     0.224     44.8        44.8         -1.8      3.24
3              47          47          141     0.224     44.8        44.8          2.2      4.84
4              32          32          128     0.168     33.6        33.6         -1.6      2.56
5              28          28          140     0.101     20.2        20.2          7.8     60.84
6               7           7           42     0.050     10.0        10.0         -3.0      9.00
7               0           4            0     0.022      4.4         6.8         -2.8      7.84
8               2                       16     0.008      1.6
9               1                        9     0.003      0.6
10              1                       10     0.001      0.2
Total         200         200          600              200         200

divide by the theoretical frequencies, and sum

up.

# of units    Observed # of days fo    x·fo    prob.    Expected # of days ft    fo-ft    (fo-ft)²    (fo-ft)²/ft
per day x     original    grouped              f(x)     original    grouped
0              11          11            0     0.050     10.0        10.0          1.0      1.00       0.10
1              28          28           28     0.149     29.8        29.8         -1.8      3.24       0.11
2              43          43           86     0.224     44.8        44.8         -1.8      3.24       0.07
3              47          47          141     0.224     44.8        44.8          2.2      4.84       0.11
4              32          32          128     0.168     33.6        33.6         -1.6      2.56       0.08
5              28          28          140     0.101     20.2        20.2          7.8     60.84       3.01
6               7           7           42     0.050     10.0        10.0         -3.0      9.00       0.90
7               0           4            0     0.022      4.4         6.8         -2.8      7.84       1.15
8               2                       16     0.008      1.6
9               1                        9     0.003      0.6
10              1                       10     0.001      0.2
Total         200         200          600              200         200                                5.53

We have 8 categories (after grouping the small ones). We have 1 restriction: we restricted our expected frequencies so that the total would be the same as the total for the observed frequencies (200). We estimated 1 parameter, the mean for the Poisson distribution. So dof = 8 - 1 - 1 = 6.

From the χ² table, we see that for a 5% test with 6 degrees of freedom, the cut-off point is 12.592. In the current problem, our χ² statistic had a value of 5.53. So we accept the null hypothesis that the Poisson distribution is a reasonable fit for the product demand.

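[Figure: χ² density f(χ²) with 6 degrees of freedom. The critical region, with area 0.05, lies to the right of 12.592; the statistic 5.53 falls in the acceptance region.]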

Example 4: Test at the 10% level whether the following exam grades are from a normal distribution. Note: This is a very long problem.

Grade interval    Midpoint X    fo     X·fo
[50, 60)                         14
[60, 70)                         18
[70, 80)                         36
[80, 90)                         18
[90, 100]                        14
Total                           100

If the distribution is normal, we need to

estimate its mean and standard deviation.

To estimate the mean, we first determine the

midpoints of the grade intervals.

Grade interval    Midpoint X    fo     X·fo
[50, 60)           55            14
[60, 70)           65            18
[70, 80)           75            36
[80, 90)           85            18
[90, 100]          95            14
Total                           100

We then multiply these midpoints by the observed frequencies of the intervals, add the products, and divide the sum by the number of observations. The resulting mean is 7500/100 = 75.

Grade interval    Midpoint X    fo     X·fo
[50, 60)           55            14     770
[60, 70)           65            18    1170
[70, 80)           75            36    2700
[80, 90)           85            18    1530
[90, 100]          95            14    1330
Total                           100    7500

Next we need to calculate the standard deviation. We begin by subtracting the mean of 75 from each midpoint and squaring the differences.

Grade interval    Midpoint X    fo     X·fo    X-75    (X-75)²
[50, 60)           55            14     770     -20      400
[60, 70)           65            18    1170     -10      100
[70, 80)           75            36    2700       0        0
[80, 90)           85            18    1530      10      100
[90, 100]          95            14    1330      20      400
Total                           100    7500

We multiply the squared differences by the observed frequencies and sum up. Dividing by n - 1 = 99 gives the sample variance s² = 14,800/99 = 149.49495. The square root is the sample standard deviation s = 12.2268.

Grade interval    Midpoint X    fo     X·fo    X-75    (X-75)²    fo(X-75)²
[50, 60)           55            14     770     -20      400        5600
[60, 70)           65            18    1170     -10      100        1800
[70, 80)           75            36    2700       0        0           0
[80, 90)           85            18    1530      10      100        1800
[90, 100]          95            14    1330      20      400        5600
Total                           100    7500                       14,800
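The grouped-data mean and standard deviation from the table can be reproduced as follows. This is a short sketch using the interval midpoints as stand-ins for the individual grades, exactly as the table does; the variable names are illustrative.

```python
import math

midpoints = [55, 65, 75, 85, 95]   # midpoints of the five grade intervals
observed = [14, 18, 36, 18, 14]    # observed frequencies fo

n = sum(observed)                                          # 100 grades
mean = sum(x * fo for x, fo in zip(midpoints, observed)) / n

# Sum of fo * (X - mean)^2, then divide by n - 1 for the sample variance.
sum_sq = sum(fo * (x - mean) ** 2 for x, fo in zip(midpoints, observed))
variance = sum_sq / (n - 1)
std_dev = math.sqrt(variance)

print(mean, sum_sq)                         # 75.0 14800.0
print(round(variance, 5), round(std_dev, 4))  # 149.49495 12.2268
```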

We will use 75 and 12.2268 as the mean μ and the standard deviation σ of our proposed normal distribution. We now need to determine what the expected frequencies would be if the grades were from that normal distribution.

Start with our lowest grade category, under 60.

Pr(X < 60) = Pr(Z < (60 - 75)/12.2268) = Pr(Z < -1.23) = 0.1093

We then expect that 10.93% of our 100 observations, or about 11 grades, would be in the lowest grade category. So 11 will be one of our ft values. We need to do similar calculations for our other grade categories.

The next grade category is [60, 70).

Pr(60 ≤ X < 70) = Pr(-1.23 ≤ Z < -0.41) = 0.3409 - 0.1093 = 0.2316

So 23.16% of our 100 observations, or about 23 grades, are expected to be in that grade category.

The next grade category is [70, 80).

Pr(70 ≤ X < 80) = Pr(-0.41 ≤ Z < 0.41) = 0.6591 - 0.3409 = 0.3182

So 31.82% of our 100 observations, or about 32 grades, are expected to be in that grade category.

The next grade category is [80, 90).

Pr(80 ≤ X < 90) = Pr(0.41 ≤ Z < 1.23) = 0.8907 - 0.6591 = 0.2316

So 23.16% of our 100 observations, or about 23 grades, are expected to be in that grade category.

The highest grade category is 90 and over.

Pr(X ≥ 90) = Pr(Z ≥ 1.23) = 1 - 0.8907 = 0.1093

So 10.93% of our 100 observations, or about 11 grades, are expected to be in that grade category.
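The five normal probabilities can be computed without a Z table by using the error function from Python's standard library. This is a sketch under the assumption that the lowest and highest categories are open-ended (under 60, and 90 and over), as in the text; the helper name `normal_cdf` is illustrative. Because the code does not round Z to two decimals, its probabilities differ from the table-based values in the third or fourth decimal place, but the rounded expected frequencies are the same.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative probability Pr(X < x) for a normal(mu, sigma) variable."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


mu, sigma, n = 75.0, 12.2268, 100
cut_points = [60, 70, 80, 90]      # boundaries between the five categories

# Cumulative probabilities at -infinity, each cut point, and +infinity.
cdf = [0.0] + [normal_cdf(c, mu, sigma) for c in cut_points] + [1.0]
probs = [upper - lower for lower, upper in zip(cdf, cdf[1:])]
expected = [n * p for p in probs]

print([round(e) for e in expected])  # [11, 23, 32, 23, 11]
```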

Now we can finally compute our χ² statistic. We put in the observed frequencies that we were given and the theoretical frequencies that we just calculated.

Grade category    fo    ft    (fo-ft)²/ft
under 60          14    11
[60, 70)          18    23
[70, 80)          36    32
[80, 90)          18    23
90 and up         14    11

We subtract the theoretical frequencies from the observed frequencies, square the differences, divide by the theoretical frequencies, and sum up. The resulting χ² statistic is 4.3104.

Grade category    fo    ft    (fo-ft)²/ft
under 60          14    11      0.8182
[60, 70)          18    23      1.0870
[70, 80)          36    32      0.5000
[80, 90)          18    23      1.0870
90 and up         14    11      0.8182
Total                           4.3104

We have 5 categories (the 5 grade groups). We have 1 restriction: we restricted our expected frequencies so that the total would be the same as the total for the observed frequencies (100). We estimated two parameters, the mean and the standard deviation. So dof = 5 - 1 - 2 = 2.

From the χ² table, we see that for a 10% test with 2 degrees of freedom, the cut-off point is 4.605. In the current problem, our χ² statistic had a value of 4.31. So we accept the null hypothesis that the normal distribution is a reasonable fit for the grades.

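[Figure: χ² density f(χ²) with 2 degrees of freedom. The critical region, with area 0.10, lies to the right of 4.605; the statistic 4.31 falls just inside the acceptance region.]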

We can also use the χ² statistic to test whether two variables are independent of each other.

Example 5: Given the following frequencies for a sample of 10,000 households, test at the 1% level whether the number of phones and the number of cars for a household are independent of each other.

                          # of cars
# of phones        0        1        2
0              1,000      900      100
1              1,500    2,600      500
2 or more        500    2,500      400
Total                                     10,000

We first compute the row and column totals,

                          # of cars
# of phones        0        1        2    row total
0              1,000      900      100        2,000
1              1,500    2,600      500        4,600
2 or more        500    2,500      400        3,400
column total   3,000    6,000    1,000       10,000

and the row and column percentages (marginal

probabilities).

                          # of cars
# of phones        0        1        2    row total    row proportion
0              1,000      900      100        2,000        0.20
1              1,500    2,600      500        4,600        0.46
2 or more        500    2,500      400        3,400        0.34
column total   3,000    6,000    1,000       10,000        1.00
column prop.    0.30     0.60     0.10         1.00

Recall that if 2 variables X and Y are independent of each other, then

Pr(X = x and Y = y) = Pr(X = x) · Pr(Y = y)

We can use our row and column percentages as

marginal probabilities, and multiply to determine

the probabilities and numbers of households we

would expect to see in the center of the table if

the numbers of phones and cars were independent

of each other.

                          # of cars
# of phones        0        1        2    row total    row proportion
0                                                          0.20
1                                                          0.46
2 or more                                                  0.34
column total                                               1.00
column prop.    0.30     0.60     0.10         1.00

First calculate the expected probability. For example, Pr(0 phones and 0 cars) = Pr(0 phones) · Pr(0 cars) = (0.20)(0.30) = 0.06. So we expect 6% of our 10,000 households, or 600 households, to have 0 phones and 0 cars.

                          # of cars
# of phones        0        1        2    row total    row proportion
0                600                                       0.20
1                                                          0.46
2 or more                                                  0.34
column total                                 10,000        1.00
column prop.    0.30     0.60     0.10         1.00

Pr(0 phones and 1 car) = Pr(0 phones) · Pr(1 car) = (0.20)(0.60) = 0.12. So we expect 12% of our 10,000 households, or 1200 households, to have 0 phones and 1 car.

                          # of cars
# of phones        0        1        2    row total    row proportion
0                600    1,200                              0.20
1                                                          0.46
2 or more                                                  0.34
column total                                 10,000        1.00
column prop.    0.30     0.60     0.10         1.00

Pr(0 phones and 2 cars) = Pr(0 phones) · Pr(2 cars) = (0.20)(0.10) = 0.02. So we expect 2% of our 10,000 households, or 200 households, to have 0 phones and 2 cars.

                          # of cars
# of phones        0        1        2    row total    row proportion
0                600    1,200      200                     0.20
1                                                          0.46
2 or more                                                  0.34
column total                                 10,000        1.00
column prop.    0.30     0.60     0.10         1.00

Notice that when we add the 3 numbers that we

just calculated, we get the same total for the

row (2,000) that we had observed. The row and

column totals should be the same for the observed

and expected tables.

                          # of cars
# of phones        0        1        2    row total    row proportion
0                600    1,200      200        2,000        0.20
1                                                          0.46
2 or more                                                  0.34
column total                                 10,000        1.00
column prop.    0.30     0.60     0.10         1.00

Continuing, we get the following numbers for the

2nd and 3rd rows.

                          # of cars
# of phones        0        1        2    row total    row proportion
0                600    1,200      200        2,000        0.20
1              1,380    2,760      460        4,600        0.46
2 or more      1,020    2,040      340        3,400        0.34
column total                                 10,000        1.00
column prop.    0.30     0.60     0.10         1.00

The column totals are the same as for the

observed table.

                          # of cars
# of phones        0        1        2    row total    row proportion
0                600    1,200      200        2,000        0.20
1              1,380    2,760      460        4,600        0.46
2 or more      1,020    2,040      340        3,400        0.34
column total   3,000    6,000    1,000       10,000        1.00
column prop.    0.30     0.60     0.10         1.00
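The whole expected table can be built in one step, since (row proportion)(column proportion)(n) equals (row total)(column total)/n. The sketch below uses plain nested lists; the variable names are illustrative.

```python
# Observed households: rows are 0, 1, 2+ phones; columns are 0, 1, 2 cars.
observed = [
    [1000, 900, 100],
    [1500, 2600, 500],
    [500, 2500, 400],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total)(column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

for row in expected:
    print(row)
# [600.0, 1200.0, 200.0]
# [1380.0, 2760.0, 460.0]
# [1020.0, 2040.0, 340.0]
```

The row and column totals of `expected` agree with those of `observed`, as noted above.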

Now we set up the same type of table that we did for our earlier χ² goodness-of-fit tests. We put in the fo column the observed frequencies and in the ft column the expected frequencies that we calculated.

# of cars    # of phones    fo      ft      (fo-ft)²/ft
0            0              1000     600
0            1              1500    1380
0            2 or more       500    1020
1            0               900    1200
1            1              2600    2760
1            2 or more      2500    2040
2            0               100     200
2            1               500     460
2            2 or more       400     340

Then we subtract the theoretical frequencies from the observed frequencies, square the differences, divide by the theoretical frequencies, and sum to get our χ² statistic.

# of cars    # of phones    fo      ft      (fo-ft)²/ft
0            0              1000     600      266.67
0            1              1500    1380       10.43
0            2 or more       500    1020      265.10
1            0               900    1200       75.00
1            1              2600    2760        9.28
1            2 or more      2500    2040      103.73
2            0               100     200       50.00
2            1               500     460        3.48
2            2 or more       400     340       10.59
Total                                         794.28
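The statistic and its degrees of freedom can be computed from the observed table alone. This is a minimal sketch with illustrative names. Note that summing the unrounded terms gives about 794.27; the table's 794.28 comes from adding terms that were each rounded to two decimals first, and the difference has no effect on the conclusion.

```python
observed = [
    [1000, 900, 100],    # 0 phones: 0, 1, 2 cars
    [1500, 2600, 500],   # 1 phone
    [500, 2500, 400],    # 2 or more phones
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, fo in enumerate(row):
        ft = row_totals[i] * col_totals[j] / n   # expected under independence
        chi2 += (fo - ft) ** 2 / ft

dof = (len(row_totals) - 1) * (len(col_totals) - 1)
print(round(chi2, 2), dof)  # 794.27 4
```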

In these tests of independence, the number of degrees of freedom is

dof = (number of rows - 1)(number of columns - 1)

In our example, we have 3 rows and 3 columns. So dof = (3 - 1)(3 - 1) = (2)(2) = 4.

From the χ² table, we see that for a 1% test with 4 degrees of freedom, the cut-off point is 13.277. In the current problem, our χ² statistic had a value of 794.28. So we reject the null hypothesis and conclude that the number of phones and the number of cars in a household are not independent.

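[Figure: χ² density f(χ²) with 4 degrees of freedom. The critical region, with area 0.01, lies to the right of 13.277; the statistic 794.28 falls far inside it.]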

Yates' Correction

In testing for independence in 2x2 tables, the chi-square statistic has only (r - 1)(c - 1) = 1 degree of freedom. In these cases, it is often recommended that the value of the statistic be corrected so that its discrete distribution will be better approximated by the continuous chi-square distribution.
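The slides do not give the corrected formula. In its usual textbook form, Yates' correction subtracts 0.5 from each absolute difference |fo - ft| before squaring. The sketch below applies that form to a made-up 2x2 table; the counts and the function name are hypothetical and serve only to show the mechanics.

```python
def chi_squared_2x2(observed, yates=False):
    """Chi-squared statistic for a 2x2 table, optionally Yates-corrected."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            ft = row_totals[i] * col_totals[j] / n
            diff = abs(fo - ft)
            if yates:
                diff = max(diff - 0.5, 0.0)  # continuity correction
            statistic += diff ** 2 / ft
    return statistic


# Hypothetical counts, not taken from the presentation.
table = [[20, 30], [30, 20]]
uncorrected = chi_squared_2x2(table)
corrected = chi_squared_2x2(table, yates=True)
print(round(uncorrected, 2), round(corrected, 2))  # 4.0 3.24
```

The corrected value is always the smaller of the two, which makes the test slightly more conservative.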

The Hypothesis Test for the Variance or Standard

Deviation

This test is another one that uses the

chi-squared distribution.

Sometimes it is important to know the variance or

standard deviation of a variable.

For example, medication often needs to be

extremely close to the specified dosage. If the

dosage is too low, the medication may be

ineffective and a patient may die from inadequate

treatment. If the dosage is too high, the

patient may die from an overdose. So you may

want to make sure that the variance is a very

small amount.

If the data are normally distributed, the

chi-squared test for the variance or standard

deviation is appropriate.

The statistic is

χ² = (n - 1) s² / σ²

where n is the sample size, s² is the sample variance, and σ² is the hypothesized population variance. The number of degrees of freedom is n - 1.

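Example: Suppose you want to test at the 5% level whether the population standard deviation for a particular medication is 0.5 mg. Based on a sample of 25 capsules, you determine the sample standard deviation to be 0.6 mg. Perform the test.

Here H0: σ = 0.5 versus H1: σ ≠ 0.5, and the test statistic is χ² = (25 - 1)(0.6)²/(0.5)² = 34.56, with 24 degrees of freedom.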

Now we need to determine the critical region for

the test.

Because the chi-squared distribution is not

symmetric, you need to look up the two critical

values for a two-tailed test separately.

You can find the two numbers either by looking under Cumulative Probabilities 0.025 and 1 - 0.025 = 0.975 or under Upper-Tail Areas 0.975 and 0.025.

Recall that the value of the test statistic was 34.56, which is in the acceptance region. So we cannot reject the null hypothesis: the sample is consistent with a population standard deviation of 0.5 mg.
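The test for the medication example can be laid out in a few lines. This is a sketch only: the two critical values, 12.401 and 39.364, are the standard χ² table entries for 24 degrees of freedom with 0.025 in each tail; they are supplied here as an assumption because the text does not state them.

```python
n = 25          # sample size (capsules)
s = 0.6         # sample standard deviation, mg
sigma0 = 0.5    # hypothesized population standard deviation, mg

# Test statistic: (n - 1) s^2 / sigma0^2, with n - 1 degrees of freedom.
chi2 = (n - 1) * s ** 2 / sigma0 ** 2
dof = n - 1

# Two-tailed 5% test: table values for 24 dof (assumed, not from the slides).
lower_critical = 12.401
upper_critical = 39.364

reject = chi2 < lower_critical or chi2 > upper_critical
print(round(chi2, 2), dof, reject)  # 34.56 24 False
```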