Title: Math 3680
1Math 3680 Lecture 6 Introduction to
Hypothesis Testing
2- We illustrate the method of hypothesis testing
with an example. Throughout this example, the
lingo of hypothesis testing will be introduced.
We will use this terminology throughout the rest
of the chapter.
- Example: Historically, an average of 1,808
people have entered a certain store each day. In
hopes of increasing the number of customers, a
designer is hired to refurbish the store's
entrance. After the redesign, a simple random
sample of 45 days is taken. In this sample, an
average of 1,895 people enter the store daily,
with a sample SD of 187. Does it appear that the
redesign of the store's entrance was effective?
3- Solution: There are two possibilities:
- The increase in the average number of customers
may be plausibly attributed to a run of good
luck, or
- The average increased after the redesign.
- We cannot give a decisive answer to this
question, since both of these explanations are
possible. The first option, which attributes the
increase in the number of customers to mere
chance, is called the null hypothesis. The other
option is called the alternative hypothesis. We
need to assess which option is more plausible.
4- Under certain extremes, the choice is fairly
obvious.
- If 10,000 people were now entering the store
daily, then it can be confidently concluded that,
after the redesign, the store had more customers.
In this case, we would reject the null
hypothesis.
- On the other hand, if the store averaged only
1,809 people entering daily, then such an
increase is reasonably attributable to chance. In
this case, we would retain (or fail to reject)
the null hypothesis.
5- Somewhere between these two extremes will be a
certain cut-off point called the critical value.
Above this critical value, it will be more
plausible to think that the average number of
customers increased after the redesign. Below
this critical value, it will be more plausible to
think that the increase is simply attributable to
chance.
[Figure: number line marking 1,808, 1,809, and 10,000, with the critical value lying between the two extremes.]
6- Therefore, the question may be rephrased as
follows:
- Is the sample average of 1,895 close enough to
1,808 to be consistent with the assumption that
the population mean is still 1,808?
- This question leads to two hypotheses:
- H0: The average is still 1,808.
- Ha: The average is greater than 1,808.
- (Why greater?)
7- H0: The average is still 1,808.
- Ha: The average is greater than 1,808.
[Figure: number line beginning at 1,808 with the critical value marked. Decision: retain H0 below the critical value, reject H0 above it.]
8- Example: A dime is flipped 100 times to
determine if it is fair.
- State the null and alternative hypotheses.
9[Figure: number line centered at p = 0.5 with a critical value on each side. Decision: retain H0 between the two critical values, reject H0 outside them.]
10- Essential difference between probability and
statistics:
- Probability: A fair dime is flipped 100 times.
Find the probability that it lands heads 55
times.
- Statistics: A dime is flipped 100 times to
determine if it is fair.
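[Figure: box-model diagrams contrasting the two settings. Probability: a known box (tickets 0 and 1, 50 of each) with unknown draws. Statistics: a box with unknown proportions 1 - p and p, with observed draws.]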
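The probability half of this comparison can be worked out directly from the binomial formula. The sketch below is only an illustration: it assumes that "lands heads 55 times" means exactly 55 heads, and the helper name binom_pmf is ours, not the lecture's.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(exactly k successes in n independent trials, each with success probability p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Probability question: a fair dime (p = 0.5) is flipped 100 times.
# Assumption: "heads 55 times" is read as exactly 55 heads.
prob_55_heads = binom_pmf(55, 100, 0.5)
print(f"P(exactly 55 heads in 100 fair flips) = {prob_55_heads:.4f}")
```

The statistics question runs the other way: the 100 flips are observed, and p is the unknown that we try to assess.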
11- Example: Many staff members in teaching hospitals
are hired in July. Because of these inexperienced
staff members, is it more dangerous to be
admitted to a teaching hospital between July and
September than it is during the rest of the year?
- State the null and alternative hypotheses.
12- Example: A dime is suspected of landing heads
more often than tails. It is flipped 100 times to
determine if it is fair.
- State the null and alternative hypotheses, and
draw the corresponding figure.
13- No matter where we set the critical value, we
will occasionally make a mistake. By chance, a
fair coin may land heads more times than the
critical value.
- This situation is called a Type I error, meaning
that we would decide to reject the null
hypothesis even though the coin was fair.
- The probability of a Type I error is denoted by
α.
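To make the Type I error concrete, here is a minimal sketch that computes α for the one-sided dime example. The lecture does not fix a critical value for this example, so the cut-off of 59 heads below is purely hypothetical, as is the helper name binom_tail.

```python
from math import comb


def binom_tail(c: int, n: int, p: float) -> float:
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


N_FLIPS = 100
CRITICAL_VALUE = 59  # hypothetical cut-off: reject H0 if heads >= 59 (not from the lecture)

# alpha = P(reject H0 | the coin is fair)
alpha = binom_tail(CRITICAL_VALUE, N_FLIPS, 0.5)
print(f"alpha = P(heads >= {CRITICAL_VALUE} | p = 0.5) = {alpha:.4f}")
```

Raising the hypothetical cut-off would make this probability smaller, since a fair coin is then even less likely to reach it.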
14- Another type of error could occur if a coin
that favors heads actually lands heads fewer
times than the critical value.
- This is called a Type II error, meaning that we
fail to reject the null hypothesis even though
the coin is imbalanced.
- The probability of a Type II error is denoted by
β.
- We define the power of the test to be 1 - β.
This is the probability of correctly rejecting
the null hypothesis when the null hypothesis is
false.
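A Type II error probability can only be computed once a specific imbalanced coin is named. The sketch below assumes a hypothetical coin with P(heads) = 0.6 and a hypothetical rule "reject H0 if heads >= 59 out of 100"; neither number comes from the lecture.

```python
from math import comb


def binom_cdf(c: int, n: int, p: float) -> float:
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, c + 1))


N_FLIPS = 100
CRITICAL_VALUE = 59  # hypothetical cut-off: reject H0 if heads >= 59
TRUE_P = 0.6         # hypothetical imbalanced coin

# beta = P(retain H0 | p = TRUE_P) = P(heads <= CRITICAL_VALUE - 1)
beta = binom_cdf(CRITICAL_VALUE - 1, N_FLIPS, TRUE_P)
power = 1 - beta
print(f"beta  = {beta:.4f}")
print(f"power = {power:.4f}")
```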
15- Type I Error
- The null hypothesis is true but we reject it.
- P(reject H0 | H0 is true) = α
- Type II Error
- The null hypothesis is false but we retain it.
- P(retain H0 | H0 is false) = β
- Power
- P(reject H0 | H0 is false) = 1 - β
16
                          THE WAY IT IS
THE WAY WE THINK IT IS    Ha true          H0 true
Decide to retain H0       Type II Error    (correct)
Decide to reject H0       (correct)        Type I Error
17- Example: Many staff members in teaching hospitals
are hired in July. Because of these inexperienced
staff members, is it more dangerous to be
admitted to a teaching hospital between July and
September than it is during the rest of the year?
- State in words what a Type I error means. Repeat
for a Type II error.
18- Example: Historically, an average of 1,808
people have entered a certain store each day. In
the hopes of increasing the number of customers,
a designer is hired to refurbish the store's
entrance. After the redesign, a simple random
sample of 45 days is taken. In this sample, an
average of 1,895 people enter the store daily,
with a sample standard deviation of 187.
- State in words what a Type I error means. Repeat
for a Type II error.
19- Basic question: How do we pick the critical
value? Typically, the value of α is
prescribed and the critical values are determined
by α. We will discuss this in more depth later.
- For now, to develop our intuition, we'll take the
critical values as prescribed and compute α and
β from these critical values.
20- Example: A scientific article compared two
different methods for the analysis of ampicillin
dosages. It was thought that the first method
generally returned a higher amount than the
second method. In one series of experiments,
pairs of tablets were analyzed by the two
methods. The data below give the percentages of
claimed amount of ampicillin found by the two
methods in 15 pairs of tablets. State the null
and alternative hypotheses.
Each row lists three pairs (first method, second method):
 97.2   97.2     79.2   74.0     96.8   95.8
105.8   97.8     76.0   75.0     99.2   98.0
 99.5   96.2     69.5   67.5     99.2   99.0
100.0  101.8     23.5   21.2     91.0  100.2
 93.8   88.0     95.2   94.8     72.0   67.5
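Since the hypotheses concern how often the first method returns the higher amount, it helps to tally the pairs. The sketch below assumes that each pair is listed as (first method, second method) and that the pairs are read off three to a row; the variable names are ours.

```python
# The 15 pairs, read left to right, three pairs per row of the data listing.
# Assumption: each pair is (first method, second method).
pairs = [
    (97.2, 97.2), (79.2, 74.0), (96.8, 95.8),
    (105.8, 97.8), (76.0, 75.0), (99.2, 98.0),
    (99.5, 96.2), (69.5, 67.5), (99.2, 99.0),
    (100.0, 101.8), (23.5, 21.2), (91.0, 100.2),
    (93.8, 88.0), (95.2, 94.8), (72.0, 67.5),
]

n_pairs = len(pairs)
# Pairs in which the two methods disagree (ties carry no information about direction).
n_with_difference = sum(1 for first, second in pairs if first != second)
# Pairs in which the first method returned the higher amount.
n_first_higher = sum(1 for first, second in pairs if first > second)

print(f"pairs: {n_pairs}, with a difference: {n_with_difference}, "
      f"first method higher: {n_first_higher}")
```

Under this reading there is one tied pair, so only the remaining pairs bear on the question of which method tends to return more.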
21- Solution 1 (in words)
- H0: If there is a difference, the probability
that the first method returns a higher amount is
50%.
- Ha: If there is a difference, the probability
that the first method returns a higher amount is
greater than 50%.
- Solution 2 (more precise): Let p be the
population proportion of pairs (among those with
a difference) in which the first method returns
a higher amount.
- H0: p = 0.5 (simple hypothesis)
- Ha: p > 0.5 (compound hypothesis)
22- Example: For this particular data set, there are
14 pairs in which there is a measurable
difference. Let K denote the number of pairs in
which the first method returned a higher amount.
Suppose we make the decision rule:
- Retain H0 if K ≤ 9, and
- reject H0 if K ≥ 10.
- Compute α.
23- Solution: This is the risk of an error in the case
that H0 is true.
- α = P(H0 is rejected | H0 is true)
- = P(K ≥ 10 | p = 0.5)
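The probability P(K ≥ 10 | p = 0.5) can be evaluated by summing binomial terms for n = 14 pairs. This is a minimal sketch using only the standard library; when p = 0.5 every outcome has probability 1/2^14, so counting combinations is enough.

```python
from math import comb

N_PAIRS = 14   # pairs with a measurable difference
CUTOFF = 10    # reject H0 if K >= 10

# Under H0 (p = 0.5), P(K = k) = C(14, k) / 2**14.
favorable = sum(comb(N_PAIRS, k) for k in range(CUTOFF, N_PAIRS + 1))
alpha = favorable / 2**N_PAIRS

print(f"alpha = {favorable}/{2**N_PAIRS} = {alpha:.4f}")
```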
24- Example: Suppose we make the decision rule
- retain H0 if K ≤ 9
- reject H0 if K ≥ 10
- Compute β if
- (a) p = 0.7
- (b) p = 0.9
25- Solution (a): This is the risk of an error in the
case that p = 0.7, so that H0 is false.
- β = β(0.7)
- = P(H0 is retained | p = 0.7)
- = P(K ≤ 9 | p = 0.7)
26- Solution (b): This is the risk of an error in the
case that p = 0.9, so that H0 is false.
- β = β(0.9)
- = P(H0 is retained | p = 0.9)
- = P(K ≤ 9 | p = 0.9)
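Both values of β are cumulative binomial probabilities with n = 14, so they can be evaluated with one small helper. The function name binom_cdf below is our own; it plays the same role as a cumulative binomial table.

```python
from math import comb


def binom_cdf(c: int, n: int, p: float) -> float:
    """P(K <= c) for K ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, c + 1))


N_PAIRS = 14

# beta(p) = P(H0 is retained | p) = P(K <= 9 | p)
beta_07 = binom_cdf(9, N_PAIRS, 0.7)
beta_09 = binom_cdf(9, N_PAIRS, 0.9)

print(f"beta(0.7) = {beta_07:.4f}")
print(f"beta(0.9) = {beta_09:.4f}")
```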
27- Notice that β is a function of p.
- (Why is this function decreasing?)
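One way to see this dependence is to tabulate β(p) = P(K ≤ 9 | p) over a grid of alternatives. The grid of p values below is an arbitrary choice for illustration. Intuitively, a larger p makes large values of K more likely, so the event K ≤ 9 becomes less likely.

```python
from math import comb

N_PAIRS = 14


def beta(p: float) -> float:
    """Type II error probability for the rule 'retain H0 if K <= 9'."""
    return sum(comb(N_PAIRS, k) * p**k * (1 - p) ** (N_PAIRS - k) for k in range(0, 10))


# Arbitrary illustrative grid of alternatives p > 0.5.
grid = [0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90, 0.95]
table = [(p, beta(p)) for p in grid]

for p, b in table:
    print(f"p = {p:.2f}   beta = {b:.4f}   power = {1 - b:.4f}")
```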
28 29Note. This lengthy calculation may be facilitated
by using page 505 at the back of the book.
It may also be facilitated with
Excel: BINOMDIST(9,14,0.7,1)
TI-83: binomcdf(14,0.7,9)
30- Example: For this particular data set, there are
14 pairs in which there is a measurable
difference. Let K denote the number of pairs in
which the first method returned a higher amount.
Suppose we make the decision rule:
- Retain H0 if K ≤ 10, and
- reject H0 if K ≥ 11.
- Compute α.
31- Solution: This is the risk of an error in the case
that H0 is true.
- α = P(H0 is rejected | H0 is true)
- = P(K ≥ 11 | p = 0.5)
- Practically, why did this decrease?
32Note. This lengthy calculation may be facilitated
by using page 505 at the back of the book.
From the back of the book: 1 - 0.9713 = 0.0287
From Excel: 1 - BINOMDIST(10,14,0.5,1)
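The same complement calculation can be reproduced without a table. The sketch below computes the cumulative probability P(K ≤ 10 | p = 0.5) for n = 14 and subtracts it from 1; the variable names are ours.

```python
from math import comb

N_PAIRS = 14

# Cumulative probability P(K <= 10 | p = 0.5): the table entry quoted as 0.9713.
cdf_10 = sum(comb(N_PAIRS, k) for k in range(0, 11)) / 2**N_PAIRS

# alpha = P(K >= 11 | p = 0.5) = 1 - P(K <= 10 | p = 0.5)
alpha = 1 - cdf_10

print(f"P(K <= 10 | p = 0.5) = {cdf_10:.4f}")
print(f"alpha = {alpha:.4f}")
```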
33- Unfortunately, decreasing α also has the effect
of increasing β. (Why?)
34- In statistical parlance, the test has become less
powerful.
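The trade-off can be seen numerically by comparing the two decision rules (reject if K ≥ 10 versus reject if K ≥ 11) side by side. The sketch below evaluates β at p = 0.7, one of the alternatives used earlier; any other alternative p > 0.5 could be substituted.

```python
from math import comb

N_PAIRS = 14


def prob_reject(cutoff: int, p: float) -> float:
    """P(K >= cutoff) for K ~ Binomial(N_PAIRS, p)."""
    return sum(
        comb(N_PAIRS, k) * p**k * (1 - p) ** (N_PAIRS - k)
        for k in range(cutoff, N_PAIRS + 1)
    )


results = {}
for cutoff in (10, 11):
    alpha = prob_reject(cutoff, 0.5)   # Type I error probability under H0
    power = prob_reject(cutoff, 0.7)   # probability of rejecting when p = 0.7
    beta = 1 - power                   # Type II error probability when p = 0.7
    results[cutoff] = (alpha, beta, power)
    print(f"reject if K >= {cutoff}: alpha = {alpha:.4f}, "
          f"beta(0.7) = {beta:.4f}, power = {power:.4f}")
```

Moving the cut-off from 10 to 11 shrinks the rejection region, so rejection becomes less likely whether H0 is true (smaller α) or false (smaller power, larger β).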
35- Note: In a perfect world, we would have α = β =
0, but this can only happen in trivial cases. For
any realistic scenario of hypothesis testing
(with a fixed sample size), decreasing α will
increase β, and vice versa.
- In practice, we set the significance level α in
advance, usually at a fairly small number (α =
0.05 is typical). We then compute β for this
level of α.
- (Ideally, we would like to construct a test that
makes β as small as possible. This topic will
be considered in future statistics courses.)
36- In these problems, the critical regions were
prescribed and α was computed.
- In practice, α is chosen and the critical value
is computed to match this value of α.
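As a rough sketch of going in this direction, the code below searches for the smallest cut-off c such that P(K ≥ c | p = 0.5) does not exceed a chosen α, for the n = 14 setting of the ampicillin example. Because K is discrete, the attained level is usually below the chosen α rather than equal to it. The function name critical_value is ours.

```python
from math import comb


def binom_tail(c: int, n: int, p: float) -> float:
    """P(K >= c) for K ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


def critical_value(n: int, alpha_max: float, p0: float = 0.5) -> int:
    """Smallest c such that the rule 'reject H0 if K >= c' has level <= alpha_max."""
    for c in range(0, n + 2):
        # At c = n + 1 the tail is empty (probability 0), so the loop always returns.
        if binom_tail(c, n, p0) <= alpha_max:
            return c
    raise ValueError("unreachable for 0 <= alpha_max")


c = critical_value(14, 0.05)
print(f"reject H0 if K >= {c}; attained alpha = {binom_tail(c, 14, 0.5):.4f}")
```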