STA 291 Lecture 24

- 11 Hypothesis Testing
- 11.1 Concepts of Hypothesis Testing
- 11.2 Testing the population mean
- 12.3 Testing the population proportion

- Bonus Homework, due in the lab April 16-18
- Essay: How do you prove or disprove the hot hand theory? (400-600 words / approximately one typed page)

Essay should (at least) include

- Background: conflicting theories of hot hand or no hot hand.
- Hypothesis to be tested
- What data to collect? (I suggest considering only free throws.) How much data?
- If data were available, what calculation would you perform? (as specific as possible)
- How does this calculation lead to your affirmation or rejection of the hot hand theory? (as specific as possible)

- Instead of estimating how much the new drug improves survival (which is harder to answer), we ask: does it help?
- Null hypothesis: no difference
- Alternative hypothesis: some improvement
- Leave the "how much improvement" question for later.

Significance Test

- A significance test is a way of statistically testing a hypothesis by comparing the data to values predicted by the hypothesis
- Data that fall far from the predicted values provide evidence against the hypothesis
- "Significantly different"

Statistically Significant

- A significant result is usually called "statistically significant"
- You may want to follow up by estimating how large the difference is (caution: the difference may be small)
- For example, SAT scores of 720 and 710 will sometimes be statistically significantly different, but for all practical purposes they are just as good
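The gap between statistical and practical significance can be illustrated with a short calculation. This is only a sketch: the standard deviation of 100 points and the two sample sizes are hypothetical (the lecture gives neither), and the test used is the large-sample z-test for a mean from Section 11.2, applied to a sample mean of 720 against a hypothesized mean of 710.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(sample_mean, null_mean, sd, n):
    """Two-sided P-value of a large-sample z-test for a mean."""
    z = (sample_mean - null_mean) / (sd / sqrt(n))
    return 2 * NormalDist().cdf(-abs(z))

# Hypothetical numbers: sd = 100 points; only the sample size changes.
p_small_sample = two_sided_p_value(720, 710, sd=100, n=100)    # z = 1
p_large_sample = two_sided_p_value(720, 710, sd=100, n=2500)   # z = 5

print(f"n = 100:  P-value = {p_small_sample:.3f}")
print(f"n = 2500: P-value = {p_large_sample:.2g}")
```

The same 10-point difference is far from significant with 100 students and highly significant with 2500, although its practical importance has not changed.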

Logical Procedure

1. State a hypothesis that you would like to find evidence against (null hypothesis, H0)
2. Get data and calculate a statistic (for example, the sample mean)
3. The hypothesis often determines the sampling distribution of our statistic
4. If the calculated value in 2. is very unreasonable given 3., then we conclude that the hypothesis was wrong (the sampling result is significantly different from what we expect under H0)

Elements of a Significance Test

- Assumptions
  - Type of data, type of population distribution
- Hypotheses
  - Null and alternative hypotheses, H0 and Ha (usually about the parameter(s) of the population distribution)
- Test Statistic
  - Usually compares the point estimate to the parameter value under the null hypothesis
- P-value
  - Uses the sampling distribution to quantify evidence against the null hypothesis
  - Small P is more contradictory
- Conclusion
  - Report P-value
  - Make formal rejection decision (optional)

p-Value

- How likely is the observed test statistic value when the null hypothesis is assumed true?
- The p-value is the probability, assuming that H0 is true, that the test statistic takes values at least as contradictory to H0 as the value actually observed.
- The smaller the p-value, the more strongly the data contradict H0

Example: Study design

- In a study comparing two painkillers (Tylenol vs. Advil, etc.), 215 volunteers are given both, one kind each week (disguised as just brand A and brand B)
- After they have used both, they state a preference: either A is better or B is better
- Hypothesis: if there were no difference, then the preference for A should be 50%

Example, cont.

- Let p = population proportion that prefers A over B
- H0: $p = 0.5$
- Ha: $p \neq 0.5$, since the preference can go either way
- Computation of the P-value (after the study was done)
- Conclusion

Example, cont.

- Suppose that among the 215 volunteers, 130 prefer brand A. How strong is the evidence?
- P-value = 0.002611 (by web)
- Conclusion: since the P-value is so small (smaller than 1%, smaller than 5%), we reject the null hypothesis of p = 0.5
- We also say the result is statistically significant at the 1% level, etc. (this just means the P-value is less than 1%)
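The lecture does not say how the web calculator arrives at its P-value. One plausible route is an exact binomial calculation, sketched below. The function name `exact_two_sided_p_value` is made up for this illustration, and "two-sided" here follows one common convention: add up the probabilities, under H0, of every count at least as far from the expected count as the observed one.

```python
from math import comb

def exact_two_sided_p_value(successes, n, p0=0.5):
    """Exact binomial P-value: total probability, under H0, of all counts
    at least as far from the expected count n * p0 as the observed count."""
    expected = n * p0
    observed_distance = abs(successes - expected)
    total = 0.0
    for k in range(n + 1):
        if abs(k - expected) >= observed_distance:
            total += comb(n, k) * p0**k * (1 - p0)**(n - k)
    return total

p_value = exact_two_sided_p_value(130, 215)
print(f"P-value = {p_value:.6f}")
```

The printed value can be compared with the one reported on the slide. Other two-sided conventions exist and can give slightly different answers when p0 is not 0.5.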

Alternative and p-value computation

- We may also try to compute the P-value by hand (table, calculator, paper/pencil)
- $\hat{p} = 130/215 = 0.6046$
- $0.6046 - 0.5 = 0.1046$
- $\sqrt{0.5(1-0.5)/215} = 0.0341$
- $Z_{\text{obs}} = 0.1046/0.0341 = 3.067$
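The hand calculation above can be checked with a few lines of code. This sketch assumes the usual large-sample normal approximation, with the standard error computed under the null value p = 0.5, and takes the two-sided tail area from Python's standard normal distribution instead of a table.

```python
from math import sqrt
from statistics import NormalDist

n, preferred_a, p0 = 215, 130, 0.5

p_hat = preferred_a / n                       # sample proportion
standard_error = sqrt(p0 * (1 - p0) / n)      # standard error under H0
z_obs = (p_hat - p0) / standard_error         # observed z-score
p_value = 2 * NormalDist().cdf(-abs(z_obs))   # two-sided normal tail area

print(f"p_hat = {p_hat:.4f}, SE = {standard_error:.4f}")
print(f"z = {z_obs:.3f}, two-sided P-value = {p_value:.4f}")
```

Carrying full precision gives a z-score that differs from the hand value in the third decimal place, because the hand calculation rounds the intermediate steps.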

Example

- Somebody makes the claim that 50% of all UK students wear sandals to class in the month of September
- You don't believe it, so on one of those days you take a random sample of 10 students and find that only 2 of these 10 students actually wear sandals
- How (un)likely is this under the hypothesis?
- The sampling distribution helps us quantify the (un)likeliness in terms of a probability (p-value)
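One way to see what "the sampling distribution under the hypothesis" means is to simulate it. The sketch below draws many samples of 10 students from a population in which exactly 50% wear sandals and counts how often a sample is at least as far from the expected 5 as the observed 2 (in either direction, matching the two-sided alternative used later in the lecture). The seed and the number of repetitions are arbitrary choices.

```python
import random

random.seed(291)  # arbitrary seed so the run is reproducible

n, p0, observed = 10, 0.5, 2
repetitions = 100_000

# Simulate the sampling distribution of the sandal count under H0: p = 0.5
counts = [sum(random.random() < p0 for _ in range(n)) for _ in range(repetitions)]

# Share of simulated samples at least as far from the expected count (5) as 2 is
as_extreme = sum(abs(c - n * p0) >= abs(observed - n * p0) for c in counts)
simulated_p_value = as_extreme / repetitions
print(f"simulated two-sided P-value: {simulated_p_value:.3f}")
```

A simulation only approximates the probability. More repetitions make the estimate more stable.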

Assumptions in the Example

- What type of data do we have?
  - Qualitative with two categories
  - Either wearing sandals or not wearing sandals
- What is the population distribution?
  - It is Bernoulli type. It is definitely not normal since it can only take two values
- Which sampling method has been used?
  - We assume simple random sampling
- What is the sample size?
  - n = 10

Hypotheses in the Example

- Null hypothesis (H0)
  - 50% of all UK students wear sandals to class
  - H0: Population proportion = 0.5
- Alternative hypothesis (H1)
  - The proportion of UK students wearing sandals is different from 0.5 (two-sided)

Conclusion

- Sometimes, in addition to reporting the p-value, a formal decision is made about rejecting or not rejecting the null hypothesis
- Most studies require small p-values like p < .05 or p < .01 as significant evidence against the null hypothesis
- Decision: The results are significant/not significant at the 5% level

Example, cont.

- The calculation of the P-value for this particular example is a topic our book does not cover (it only covers sample sizes > 30)
- But let's suppose we had used software and it reported a P-value of 0.109
- (look at the bottom of the syllabus page)
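For readers curious where a value like 0.109 can come from, here is a minimal sketch of an exact binomial calculation for this small sample. It relies on the symmetry of the binomial distribution when p = 0.5, so the two-sided P-value is simply twice the lower tail. Whether the software mentioned in the lecture works this way is an assumption.

```python
from math import comb

n, observed = 10, 2

# Under H0 (p = 0.5) every sequence of 10 students has probability 1 / 2**10,
# so P(X = k) = C(10, k) / 2**10.
lower_tail = sum(comb(n, k) for k in range(observed + 1)) / 2**n  # P(X <= 2)

# By symmetry of the p = 0.5 binomial, P(X >= 8) equals P(X <= 2).
p_value = 2 * lower_tail
print(f"P(X <= 2) = {lower_tail:.4f}, two-sided P-value = {p_value:.3f}")
# prints: P(X <= 2) = 0.0547, two-sided P-value = 0.109
```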

Conclusion in the Example

- We have calculated a P-value of 0.109
- This is not significant at the 5% level
- So, we cannot reject the null hypothesis (at the 5% level)
- So, do we have enough evidence to refute the claim that the proportion of UK students wearing sandals is truly 50%?
- (not yet)

p-Values and Their Significance

- p-Value < 0.01
  - Highly Significant / Overwhelming Evidence
- 0.01 < p-Value < 0.05
  - Significant / Strong Evidence
- 0.05 < p-Value < 0.1
  - Not Significant / Weak Evidence
- p-Value > 0.1
  - Not Significant / No Evidence
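The scale above can be written as a small lookup function. This is an illustrative sketch only: the function name is made up, and because the slide does not say which category a P-value of exactly 0.01, 0.05, or 0.1 belongs to, the code arbitrarily assigns each boundary value to the weaker category.

```python
def describe_evidence(p_value):
    """Map a P-value to the verbal scale in the table above."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a P-value must lie between 0 and 1")
    if p_value < 0.01:
        return "Highly Significant / Overwhelming Evidence"
    if p_value < 0.05:
        return "Significant / Strong Evidence"
    if p_value < 0.1:
        return "Not Significant / Weak Evidence"
    return "Not Significant / No Evidence"

# P-values that appear in this lecture's examples
for p in (0.002611, 0.109, 0.16):
    print(p, "->", describe_evidence(p))
```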

- Failing to reject H0 can be due to one of two reasons (sometimes both)
  - (1) The sample size is too small, so you can hardly reject anything (not enough information). This is the case in the example.
  - (2) There is truly no difference, even when the sample size is big enough.

Decisions and Types of Errors in Tests of Hypotheses

- Terminology
  - The alpha-level (significance level) is a number such that one rejects the null hypothesis if the p-value is less than or equal to it. The most common alpha-levels are .05 and .01
  - The choice of the alpha-level reflects how cautious the researcher wants to be

Type I and Type II Errors

- Type I Error: The null hypothesis is rejected, even though it is true.
- Type II Error: The null hypothesis is not rejected, even though it is false.

Type I and Type II Errors

- Terminology
  - Alpha = Probability of a Type I error
  - Beta = Probability of a Type II error
  - Power = 1 - Probability of a Type II error
- For a given data set, the smaller the probability of Type I error, the larger the probability of Type II error and the smaller the power
- If you need very strong evidence to reject the null hypothesis (set alpha small), it is more likely that you fail to detect a real difference (larger Beta).
- When the sample size increases, both error probabilities can be made to decrease
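The trade-off between alpha, beta, and sample size can be made concrete with a small power calculation. The sketch below is for a two-sided z-test for a mean and rests on assumptions the lecture does not make: the standard deviation is treated as known, the sampling distribution is taken to be exactly normal, and the numbers (a true shift of 2 with standard deviation 9) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided_z(true_shift, sd, n, alpha):
    """Power of a two-sided z-test for a mean when the true mean differs
    from the null value by true_shift (sd treated as known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # rejection cutoff for |z|
    shift_in_se = true_shift / (sd / sqrt(n))    # true shift in standard errors
    # Probability that z falls above z_crit or below -z_crit
    return std_normal.cdf(shift_in_se - z_crit) + std_normal.cdf(-shift_in_se - z_crit)

# Hypothetical setting: true shift of 2, standard deviation 9
for n, alpha in [(40, 0.05), (40, 0.01), (160, 0.05), (160, 0.01)]:
    power = power_two_sided_z(2, 9, n, alpha)
    print(f"n = {n:3d}, alpha = {alpha}: power = {power:.3f}, beta = {1 - power:.3f}")
```

For a fixed sample size, lowering alpha from 0.05 to 0.01 lowers the power (raises beta). Raising the sample size from 40 to 160 raises the power at either alpha.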

Type I and Type II Errors

- In practice, alpha is specified, and the probability of Type II error could be calculated, but the calculations are usually difficult
- How to choose alpha?
  - If the consequences of a Type I error are very serious, then choose a small alpha, like 0.01.
    - For example, you want to find evidence that someone is guilty of a crime.
  - In exploratory research, often a larger probability of Type I error is acceptable (like 0.05 or even 0.1)

11.2 Significance Test for a Mean

- Example
  - The mean age at first marriage for married men in a New England community was 28 years in 1790.
  - For a random sample of 40 married men in that community in 1990, the sample mean and standard deviation of age at first marriage were 26 and 9, respectively
  - Q: Has the mean changed significantly?

Significance Test for a Mean

- Assumptions
  - What type of data?
    - Quantitative, continuous
  - What is the population distribution?
    - No special assumptions. The hypothesis refers to the population mean of the quantitative variable.
  - Which sampling method has been used?
    - Simple random sampling
  - What is the sample size?
    - Minimum sample size of n = 30 to use the Central Limit Theorem for the sample mean

- Because the hypothesis is about the (population) mean, we should study the sample mean, or a test statistic constructed from it.
- Also, the Central Limit Theorem says the sample mean will be approximately normally distributed for large sample sizes.

Significance Test for a Mean

- Hypotheses
  - The null hypothesis has the form $H_0: \mu = \mu_0$
  - where $\mu_0$ is an a priori (before taking the sample) specified number like 28 (years), or 0, or 5.3, etc.
  - The most common alternative hypothesis is $H_a: \mu \neq \mu_0$
  - This is called a two-sided hypothesis, since it includes values falling above and below the null hypothesis

Significance Test for a Mean

- Test Statistic
  - The hypothesis is about the population mean
  - So, a natural test statistic would be the sample mean
  - The sample mean has, for a sample size of at least n = 30, an approximately normal sampling distribution
  - The parameters of the sampling distribution are, under the null hypothesis,
    - Mean $= \mu_0$ (that is, the sampling distribution is centered around the hypothesized mean)
    - Standard error $= \sigma/\sqrt{n}$, estimated by $s/\sqrt{n}$

Significance Test for a Mean

- Test Statistic
  - Then, the z-score $z = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ has an approximately standard normal distribution
  - The z-score measures how many estimated standard errors the sample mean falls from the hypothesized population mean
  - The farther the sample mean falls from $\mu_0$, the larger the absolute value of the z test statistic, and the stronger the evidence against the null hypothesis

Significance Test for a Mean

- p-Value
  - The p-value has the advantage that different test results from different tests can be compared: the p-value is always a number between 0 and 1
  - The p-value can be obtained from Table B3: it is the probability that a standard normal distribution takes values more extreme than the observed z score
  - The smaller the p-value is, the stronger is the evidence against the null hypothesis and in favor of the alternative hypothesis

Significance Test for a Mean

- Example again
  - The mean age at first marriage for married men in a New England community was 28 years in 1790.
  - For a random sample of 40 married men in that community in 1990, the sample mean and standard deviation of age at first marriage were 26 and 9, respectively
  - State the hypotheses, find the test statistic and P-value for testing whether the mean has changed. Interpret.
  - Make a decision, using a significance level of 5%
- (2-sided) P-value = 2 × 0.08 = 0.16
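The numbers in this example can be worked through step by step. The sketch below assumes the large-sample z-test for a mean described in this section, with the sample standard deviation standing in for the population value, and uses Python's standard normal distribution in place of the table.

```python
from math import sqrt
from statistics import NormalDist

null_mean = 28                      # H0: mean age at first marriage is still 28
sample_mean, sample_sd, n = 26, 9, 40

standard_error = sample_sd / sqrt(n)             # estimated standard error
z = (sample_mean - null_mean) / standard_error   # z test statistic
one_tail = NormalDist().cdf(-abs(z))             # tail area beyond |z|
p_value = 2 * one_tail                           # two-sided P-value

print(f"z = {z:.3f}, tail = {one_tail:.2f}, P-value = {p_value:.2f}")
alpha = 0.05
print("reject H0" if p_value <= alpha else "do not reject H0")
```

The tail area of about 0.08 doubles to the two-sided P-value of 0.16. Since 0.16 is larger than 0.05, the null hypothesis is not rejected at the 5% level.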

One-Sided Versus Two-Sided Test

- Two-sided tests are more common
- Look for formulations like
- test whether the mean has changed
- test whether the mean has increased
- test whether the mean is the same
- test whether the mean has decreased
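How the wording of the question changes the P-value can be sketched for a single observed z-score. The function name is hypothetical, and the z value of about -1.405 is the one implied by the marriage-age example (sample mean 26 against 28, s = 9, n = 40).

```python
from statistics import NormalDist

def p_values_for_z(z):
    """P-values for the three possible alternatives, given an observed z."""
    lower = NormalDist().cdf(z)          # Ha: mean has decreased
    upper = 1 - lower                    # Ha: mean has increased
    two_sided = 2 * min(lower, upper)    # Ha: mean has changed
    return {"decreased": lower, "increased": upper, "changed": two_sided}

# z of about -1.405 comes from the marriage-age example
for alternative, p in p_values_for_z(-1.405).items():
    print(f"mean has {alternative}: P-value = {p:.2f}")
```

The one-sided P-value in the direction the data point is half the two-sided one, which is why the alternative should be chosen before looking at the data.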

Summary: Large Sample Significance Test for a Mean

Attendance Survey Question 24

- On a 4x6 index card
- Please write down your name and section number
- Today's Question

