Chapter 8 Statistical Inference: Significance Tests About Hypotheses

Learn:

- To use an inferential method called a significance test
- To analyze evidence that data provide
- To make decisions based on data

Two Major Methods for Making Statistical Inferences about a Population

- Confidence Interval
- Significance Test

Questions that Significance Tests Attempt to Answer

- Does a proposed diet truly result in weight loss, on the average?
- Is there evidence of discrimination against women in promotion decisions?
- Does one advertising method result in better sales, on the average, than another advertising method?

Section 8.1

- What Are the Steps for Performing a Significance Test?

Hypothesis

- A hypothesis is a statement about a population, usually of the form that a certain parameter takes a particular numerical value or falls in a certain range of values
- The main goal in many research studies is to check whether the data support certain hypotheses

Significance Test

- A significance test is a method of using data to summarize the evidence about a hypothesis
- A significance test about a hypothesis has five steps

Step 1: Assumptions

- A (significance) test assumes that the data production used randomization
- Other assumptions may include:
- Assumptions about the sample size
- Assumptions about the shape of the population distribution

Step 2: Hypotheses

- Each significance test has two hypotheses
- The null hypothesis is a statement that the parameter takes a particular value
- The alternative hypothesis states that the parameter falls in some alternative range of values

Null and Alternative Hypotheses

- The value in the null hypothesis usually represents no effect
- The symbol H0 denotes the null hypothesis
- The value in the alternative hypothesis usually represents an effect of some type
- The symbol Ha denotes the alternative hypothesis

Null and Alternative Hypotheses

- A null hypothesis has a single parameter value, such as H0: p = 1/3
- An alternative hypothesis has a range of values that are alternatives to the one in H0, such as
- Ha: p ≠ 1/3 or
- Ha: p > 1/3 or
- Ha: p < 1/3

Step 3: Test Statistic

- The parameter to which the hypotheses refer has a point estimate: the sample statistic
- A test statistic describes how far that estimate (the sample statistic) falls from the parameter value given in the null hypothesis

Step 4: P-value

- To interpret a test statistic value, we use a probability summary of the evidence against the null hypothesis, H0
- First, we presume that H0 is true
- Next, we consider the sampling distribution from which the test statistic comes
- We summarize how far out in the tail of this sampling distribution the test statistic falls

Step 4: P-value

- We summarize how far out in the tail the test statistic falls by the tail probability of that value and values even more extreme
- This probability is called a P-value
- The smaller the P-value, the stronger the evidence is against H0

Step 4: P-value

- The P-value is the probability that the test statistic equals the observed value or a value even more extreme
- It is calculated by presuming that the null hypothesis H0 is true

Step 5: Conclusion

- The conclusion of a significance test reports the P-value and interprets what it says about the question that motivated the test

Summary: The Five Steps of a Significance Test

- Assumptions
- Hypotheses
- Test Statistic
- P-value
- Conclusion

Is the Statement a Null Hypothesis or an Alternative Hypothesis?

- In Canada, the proportion of adults who favor legalized gambling is 0.50.
- Null Hypothesis
- Alternative Hypothesis

Is the Statement a Null Hypothesis or an Alternative Hypothesis?

- The proportion of all Canadian college students who are regular smokers is less than 0.24, the value it was ten years ago.
- Null Hypothesis
- Alternative Hypothesis

Section 8.2

- Significance Tests About Proportions

Example: Are Astrologers' Predictions Better Than Guessing?

- Scientific test of astrology experiment
- For each of 116 adult volunteers, an astrologer prepared a horoscope based on the positions of the planets and the moon at the moment of the person's birth
- Each adult subject also filled out a California Personality Index Survey

Example: Are Astrologers' Predictions Better Than Guessing?

- For a given adult, his or her birth data and horoscope were shown to an astrologer together with the results of the personality survey for that adult and for two other adults randomly selected from the group
- The astrologer was asked which personality chart of the 3 subjects was the correct one for that adult, based on his or her horoscope

Example: Are Astrologers' Predictions Better Than Guessing?

- 28 astrologers were randomly chosen to take part in the experiment
- The National Council for Geocosmic Research claimed that the probability of a correct guess on any given trial in the experiment was larger than 1/3, the value for random guessing

Example: Are Astrologers' Predictions Better Than Guessing?

- Put this investigation in the context of a significance test by stating null and alternative hypotheses

Example: Are Astrologers' Predictions Better Than Guessing?

- With random guessing, p = 1/3
- The astrologers claim p > 1/3
- The hypotheses for this test:
- H0: p = 1/3
- Ha: p > 1/3

What Are the Steps of a Significance Test about a Population Proportion?

- Step 1: Assumptions
- The variable is categorical
- The data are obtained using randomization
- The sample size is sufficiently large that the sampling distribution of the sample proportion is approximately normal
- np0 ≥ 15 and n(1 - p0) ≥ 15

What Are the Steps of a Significance Test about a Population Proportion?

- Step 2: Hypotheses
- The null hypothesis has the form
- H0: p = p0
- The alternative hypothesis has the form
- Ha: p > p0 (one-sided test) or
- Ha: p < p0 (one-sided test) or
- Ha: p ≠ p0 (two-sided test)

What Are the Steps of a Significance Test about a Population Proportion?

- Step 3: Test Statistic
- The test statistic measures how far the sample proportion p̂ falls from the null hypothesis value, p0, relative to what we'd expect if H0 were true
- The test statistic is z = (p̂ - p0) / se0, where se0 = sqrt(p0(1 - p0)/n) is the standard error computed under H0

What Are the Steps of a Significance Test about a Population Proportion?

- Step 4: P-value
- The P-value summarizes the evidence
- It describes how unusual the data would be if H0 were true

What Are the Steps of a Significance Test about a Population Proportion?

- Step 5: Conclusion
- We summarize the test by reporting and interpreting the P-value

Example: Are Astrologers' Predictions Better Than Guessing?

- Step 1: Assumptions
- The data are categorical: each prediction falls in the category "correct" or "incorrect" prediction
- Each subject was identified by a random number. Subjects were randomly selected for each experiment.
- np0 = 116(1/3) > 15
- n(1 - p0) = 116(2/3) > 15

Example: Are Astrologers' Predictions Better Than Guessing?

- Step 2: Hypotheses
- H0: p = 1/3
- Ha: p > 1/3

Example: Are Astrologers' Predictions Better Than Guessing?

- Step 3: Test Statistic
- In the actual experiment, the astrologers were correct with 40 of their 116 predictions (a success rate of 0.345)

Example: Are Astrologers' Predictions Better Than Guessing?

- Step 4: P-value
- The P-value is 0.40
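
The z statistic and right-tail P-value for the astrology data can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function name `one_proportion_z` is an assumption made for this sketch, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0):
    """Return (sample proportion, z statistic) for a test of H0: p = p0."""
    p_hat = successes / n
    se0 = sqrt(p0 * (1 - p0) / n)  # standard error computed under H0
    return p_hat, (p_hat - p0) / se0


# Astrology experiment: 40 correct predictions out of 116, H0: p = 1/3
p_hat, z = one_proportion_z(40, 116, 1 / 3)

# Ha: p > 1/3, so the P-value is the right-tail probability beyond z
p_value = 1 - NormalDist().cdf(z)

print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, P-value = {p_value:.2f}")
```

The result should agree with the P-value of 0.40 reported on the slide, up to rounding.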

Example: Are Astrologers' Predictions Better Than Guessing?

- Step 5: Conclusion
- The P-value of 0.40 is not especially small
- It does not provide strong evidence against H0: p = 1/3
- There is not strong evidence that astrologers have special predictive powers

How Do We Interpret the P-value?

- A significance test analyzes the strength of the evidence against the null hypothesis
- We start by presuming that H0 is true
- The burden of proof is on Ha

How Do We Interpret the P-value?

- The approach used in hypothesis testing is called a proof by contradiction
- To convince ourselves that Ha is true, we must show that the data contradict H0
- If the P-value is small, the data contradict H0 and support Ha

Two-Sided Significance Tests

- A two-sided alternative hypothesis has the form Ha: p ≠ p0
- The P-value is the two-tail probability under the standard normal curve
- We calculate this by finding the tail probability in a single tail and then doubling it

Example: Dr. Dog: Can Dogs Detect Cancer by Smell?

- Study: investigate whether dogs can be trained to distinguish a patient with bladder cancer by smelling compounds released in the patient's urine

Example: Dr. Dog: Can Dogs Detect Cancer by Smell?

- Experiment
- Each of 6 dogs was tested with 9 trials
- In each trial, one urine sample from a bladder cancer patient was randomly placed among 6 control urine samples

Example: Dr. Dog: Can Dogs Detect Cancer by Smell?

- Results
- In a total of 54 trials with the six dogs, the dogs made the correct selection 22 times (a success rate of 0.407)

Example: Dr. Dog: Can Dogs Detect Cancer by Smell?

- Does this study provide strong evidence that the dogs' predictions were better or worse than with random guessing?

Example: Dr. Dog: Can Dogs Detect Cancer by Smell?

- Step 1: Check the sample size requirement
- Is the sample size sufficiently large to use the hypothesis test for a population proportion?
- Is np0 ≥ 15 and n(1 - p0) ≥ 15?
- 54(1/7) = 7.7 and 54(6/7) = 46.3
- The first, np0, is not large enough
- We will see that the two-sided test is robust when this assumption is not satisfied

Example: Dr. Dog: Can Dogs Detect Cancer by Smell?

- Step 2: Hypotheses
- H0: p = 1/7
- Ha: p ≠ 1/7

Example: Dr. Dog: Can Dogs Detect Cancer by Smell?

- Step 3: Test Statistic

Example: Dr. Dog: Can Dogs Detect Cancer by Smell?

- Step 4: P-value
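
The numerical content of Steps 3 and 4 did not survive in this transcript. The sketch below recomputes them from the figures given earlier (22 correct selections in 54 trials, H0: p = 1/7, two-sided Ha), using the large-sample z test for a proportion. It is an illustrative calculation, not a copy of the original slide.

```python
from math import sqrt
from statistics import NormalDist

n, successes, p0 = 54, 22, 1 / 7

# Sample size check: both expected counts should be at least 15
expected_successes = n * p0        # falls short of 15 here
expected_failures = n * (1 - p0)

p_hat = successes / n
se0 = sqrt(p0 * (1 - p0) / n)      # standard error computed under H0
z = (p_hat - p0) / se0

# Two-sided alternative: double the single-tail probability
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"n*p0 = {expected_successes:.1f}, n*(1-p0) = {expected_failures:.1f}")
print(f"p_hat = {p_hat:.3f}, z = {z:.2f}, two-sided P-value = {p_value:.1e}")
```

A z statistic this far into the tail gives a P-value far below any conventional significance level, which is what the Step 5 conclusion relies on.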

Example: Dr. Dog: Can Dogs Detect Cancer by Smell?

- Step 5: Conclusion
- Since the P-value is very small and the sample proportion is greater than 1/7, the evidence strongly suggests that the dogs' selections are better than random guessing

Example: Dr. Dog: Can Dogs Detect Cancer by Smell?

- Insight
- In this study, the subjects were a convenience sample rather than a random sample from some population
- Also, the dogs were not randomly selected
- Any inferential predictions are highly tentative
- The predictions become more conclusive if similar results occur in other studies

Summary of P-values for Different Alternative Hypotheses

The Significance Level Tells Us How Strong the Evidence Must Be

- Sometimes we need to make a decision about whether the data provide sufficient evidence to reject H0
- Before seeing the data, we decide how small the P-value would need to be to reject H0
- This cutoff point is called the significance level

The Significance Level Tells Us How Strong the Evidence Must Be

Significance Level

- The significance level is a number such that we reject H0 if the P-value is less than or equal to that number
- In practice, the most common significance level is 0.05
- When we reject H0 we say the results are statistically significant

Possible Decisions in a Test with Significance Level 0.05

Report the P-value

- Learning the actual P-value is more informative than learning only whether the test is statistically significant at the 0.05 level
- The P-values of 0.01 and 0.049 are both statistically significant in this sense, but the first P-value provides much stronger evidence against H0 than the second

"Do Not Reject H0" Is Not the Same as Saying "Accept H0"

- Analogy: Legal trial
- Null hypothesis: defendant is innocent
- Alternative hypothesis: defendant is guilty
- If the jury acquits the defendant, this does not mean that it accepts the defendant's claim of innocence
- Innocence is plausible, because guilt has not been established beyond a reasonable doubt

One-Sided vs Two-Sided Tests

- Things to consider in deciding on the alternative hypothesis:
- The context of the real problem
- In most research articles, significance tests use two-sided P-values
- Confidence intervals are two-sided

The Binomial Test for Small Samples

- The test about a proportion assumes normal sampling distributions for p̂ and the z test statistic.
- It is a large-sample test that requires that the expected numbers of successes and failures be at least 15. In practice, the large-sample z test still performs quite well for two-sided alternatives even for small samples.
- Warning: for one-sided tests, when p0 differs from 0.50, the large-sample test does not work well for small samples
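
When the large-sample condition fails, an exact tail probability can be computed directly from the binomial distribution. The sketch below is one way to do that with the standard library; it reuses the dog-study counts (22 successes in 54 trials, p0 = 1/7) purely as an illustration and assumes a one-sided alternative Ha: p > p0, which is a choice made for this example rather than the test used on the slides.

```python
from math import comb


def binom_upper_tail(k, n, p):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


# Illustration: 22 successes in 54 trials, testing H0: p = 1/7 against Ha: p > 1/7
exact_p_value = binom_upper_tail(22, 54, 1 / 7)

print(f"exact one-sided binomial P-value = {exact_p_value:.2e}")
```

Because this sums the exact binomial probabilities, it does not depend on the normal approximation and stays valid however small n is.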

For a test of H0: p = 0.50

- The z test statistic is 1.04. Find the P-value for Ha: p > 0.50.
- .15
- .20
- .175
- .222

For a test of H0: p = 0.50

- The z test statistic is 1.04. Find the P-value for Ha: p ≠ 0.50.
- .15
- .22
- .30
- .175

For a test of H0: p = 0.50

- The z test statistic is 1.04. Does the P-value for Ha: p ≠ 0.50 give strong evidence against H0?
- yes
- no

For a test of H0: p = 0.50

- The z test statistic is 2.50. Find the P-value for Ha: p > 0.50.
- .05
- .10
- .0062
- .0124

For a test of H0: p = 0.50

- The z test statistic is 2.50. Find the P-value for Ha: p ≠ 0.50.
- .05
- .10
- .0062
- .0124

For a test of H0: p = 0.50

- The z test statistic is 2.50. Does the P-value for Ha: p ≠ 0.50 give strong evidence against H0?
- yes
- no
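
The tail probabilities asked for in these exercises can be checked against the standard normal distribution. A minimal sketch follows; the helper name `z_p_values` is hypothetical.

```python
from statistics import NormalDist


def z_p_values(z):
    """Return (right-tail P-value, two-sided P-value) for a z statistic."""
    std_normal = NormalDist()
    right_tail = 1 - std_normal.cdf(z)
    two_sided = 2 * (1 - std_normal.cdf(abs(z)))
    return right_tail, two_sided


for z in (1.04, 2.50):
    right_tail, two_sided = z_p_values(z)
    print(f"z = {z:.2f}: Ha p > p0 gives {right_tail:.4f}, Ha p != p0 gives {two_sided:.4f}")
```

The two-sided value is always double the single-tail value, as described on the Two-Sided Significance Tests slide.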

Section 8.3

- Significance Tests about Means

What Are the Steps of a Significance Test about a Population Mean?

- Step 1: Assumptions
- The variable is quantitative
- The data are obtained using randomization
- The population distribution is approximately normal. This is most crucial when n is small and Ha is one-sided.

What Are the Steps of a Significance Test about a Population Mean?

- Step 2: Hypotheses
- The null hypothesis has the form
- H0: µ = µ0
- The alternative hypothesis has the form
- Ha: µ > µ0 (one-sided test) or
- Ha: µ < µ0 (one-sided test) or
- Ha: µ ≠ µ0 (two-sided test)

What Are the Steps of a Significance Test about a Population Mean?

- Step 3: Test Statistic
- The test statistic measures how far the sample mean x̄ falls from the null hypothesis value µ0 relative to what we'd expect if H0 were true
- The test statistic is t = (x̄ - µ0) / se, where se = s / sqrt(n)

What Are the Steps of a Significance Test about a Population Mean?

- Step 4: P-value
- The P-value summarizes the evidence
- It describes how unusual the data would be if H0 were true

What Are the Steps of a Significance Test about a Population Mean?

- Step 5: Conclusion
- We summarize the test by reporting and interpreting the P-value

Summary of P-values for Different Alternative Hypotheses

Example: Mean Weight Change in Anorexic Girls

- A study compared different psychological therapies for teenage girls suffering from anorexia
- The variable of interest was each girl's weight change = weight at the end of the study - weight at the beginning of the study

Example: Mean Weight Change in Anorexic Girls

- One of the therapies was cognitive therapy
- In this study, 29 girls received the therapeutic treatment
- The weight changes for the 29 girls had a sample mean of 3.00 pounds and standard deviation of 7.32 pounds

Example: Mean Weight Change in Anorexic Girls

- How can we frame this investigation in the context of a significance test that can detect a positive or negative effect of the therapy?
- Null hypothesis: no effect
- Alternative hypothesis: therapy has some effect

Example: Mean Weight Change in Anorexic Girls

- Step 1: Assumptions
- The variable (weight change) is quantitative
- The subjects were a convenience sample, rather than a random sample. The question is whether these girls are a good representation of all girls with anorexia.
- The population distribution is approximately normal

Example: Mean Weight Change in Anorexic Girls

- Step 2: Hypotheses
- H0: µ = 0
- Ha: µ ≠ 0

Example: Mean Weight Change in Anorexic Girls

- Step 3: Test Statistic

Example: Mean Weight Change in Anorexic Girls

- Step 4: P-value
- Minitab output:

Test of mu = 0 vs not = 0

Variable   N    Mean   StDev  SE Mean  95% CI                 T      P
wt_chg    29   3.000  7.3204   1.3594  (0.21546, 5.78454)  2.21  0.036
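
The Minitab figures can be approximated from the summary statistics alone. The sketch below computes the standard error, the t statistic, and a two-sided P-value by numerically integrating the t density with Simpson's rule, since the Python standard library has no t distribution. The function names are illustrative, and the critical value 2.048 (t table, df = 28, 95% confidence) is supplied by hand as an assumption.

```python
from math import gamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def t_two_sided_p(t, df, steps=2000):
    """Two-sided P-value: 1 minus the area between -|t| and |t| (Simpson's rule)."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3
    return 1 - central_area


# Summary statistics from the slides; H0: mu = 0
n, mean, sd, mu0 = 29, 3.00, 7.32, 0.0
se = sd / sqrt(n)
t_stat = (mean - mu0) / se
p_value = t_two_sided_p(t_stat, df=n - 1)

# 95% confidence interval, using the tabulated critical value for df = 28 (assumed)
t_crit = 2.048
ci = (mean - t_crit * se, mean + t_crit * se)

print(f"se = {se:.4f}, t = {t_stat:.2f}, two-sided P-value = {p_value:.3f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

Small differences from the Minitab output are expected, because the slides round the standard deviation to 7.32 and the critical value is taken from a table.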

Example: Mean Weight Change in Anorexic Girls

- Step 5: Conclusion
- The small P-value of 0.036 provides considerable evidence against the null hypothesis (the hypothesis that the therapy had no effect)

Example: Mean Weight Change in Anorexic Girls

- The therapy had a statistically significant positive effect on weight (mean change = 3 pounds, n = 29, t = 2.21, P-value = 0.04)
- The effect, however, may be small in practical terms
- 95% CI for µ: (0.2, 5.8) pounds

Results of Two-Sided Tests and Results of Confidence Intervals Agree

- Conclusions about means using two-sided significance tests are consistent with conclusions using confidence intervals
- If P-value ≤ 0.05 in a two-sided test, a 95% confidence interval does not contain the H0 value
- If P-value > 0.05 in a two-sided test, a 95% confidence interval does contain the H0 value

What If the Population Does Not Satisfy the Normality Assumption?

- For large samples (roughly about 30 or more) this assumption is usually not important
- The sampling distribution of x̄ is approximately normal regardless of the population distribution

What If the Population Does Not Satisfy the Normality Assumption?

- In the case of small samples, we cannot assume that the sampling distribution of x̄ is approximately normal
- Two-sided inferences using the t distribution are robust against violations of the normal population assumption
- They still usually work well if the actual population distribution is not normal

Regardless of Robustness, Look at the Data

- Whether n is small or large, you should look at the data to check for severe skew or for severe outliers
- In these cases, the sample mean could be a misleading measure

A study has a random sample of 20 subjects. The test statistic for testing H0: µ = 100 is t = 2.40.

- Find the approximate P-value for the alternative, Ha: µ > 100.
- between .100 and .050
- between .050 and .025
- between .025 and .010
- between .010 and .005

A study has a random sample of 20 subjects. The test statistic for testing H0: µ = 100 is t = 2.40.

- Find the approximate P-value for the alternative, Ha: µ ≠ 100.
- between .100 and .050
- between .050 and .020
- between .025 and .010
- between .020 and .010
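
Exercises like these are usually answered by bracketing the t statistic between adjacent entries of a t table. The sketch below mimics that lookup for df = n - 1 = 19. The critical values are the standard t-table row for 19 degrees of freedom, typed in by hand as an assumption, and the helper name is hypothetical.

```python
# Right-tail probabilities and critical values from a t table, df = 19 (assumed values)
T_TABLE_DF19 = [
    (0.100, 1.328),
    (0.050, 1.729),
    (0.025, 2.093),
    (0.010, 2.539),
    (0.005, 2.861),
]


def bracket_right_tail(t, table):
    """Return (upper bound, lower bound) on P(T > t) from adjacent table entries."""
    for (p_upper, t_low), (p_lower, t_high) in zip(table, table[1:]):
        if t_low <= t < t_high:
            return p_upper, p_lower
    raise ValueError("t is outside the tabulated range")


one_sided = bracket_right_tail(2.40, T_TABLE_DF19)   # for Ha: mu > 100
two_sided = tuple(2 * p for p in one_sided)          # for Ha: mu != 100

print(f"one-sided P-value is between {one_sided[0]} and {one_sided[1]}")
print(f"two-sided P-value is between {two_sided[0]} and {two_sided[1]}")
```

Doubling both bounds gives the two-sided bracket, mirroring the rule that a two-sided P-value is twice the single-tail probability.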

Section 8.4

- Decisions and Types of Errors in Significance Tests

Type I and Type II Errors

- When H0 is true, a Type I Error occurs when H0 is rejected
- When H0 is false, a Type II Error occurs when H0 is not rejected

Significance Test Results

An Analogy: Decision Errors in a Legal Trial

P(Type I Error) = Significance Level α

- Suppose H0 is true. The probability of rejecting H0, thereby committing a Type I error, equals the significance level, α, for the test.
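
One way to see this relationship is to compute the Type I error rate of a specific test exactly. The sketch below borrows the astrology setup (n = 116, H0: p = 1/3, one-sided test at the 0.05 level, rejecting when z ≥ 1.645) and sums the binomial probabilities of every outcome that leads to rejection when H0 is true. Applying it to that setup is an illustrative choice, not something shown on the slides.

```python
from math import comb, sqrt

n, p0, z_crit = 116, 1 / 3, 1.645
se0 = sqrt(p0 * (1 - p0) / n)  # standard error under H0


def rejects_h0(x):
    """True if x successes out of n gives z >= z_crit (one-sided test)."""
    return (x / n - p0) / se0 >= z_crit


# Probability, computed under H0, of landing in the rejection region
type_i_rate = sum(
    comb(n, x) * p0**x * (1 - p0) ** (n - x)
    for x in range(n + 1)
    if rejects_h0(x)
)

print(f"exact P(Type I error) = {type_i_rate:.3f}")
```

Because the count of successes is discrete, the exact rate for the large-sample z test is close to, but not exactly, the nominal 0.05.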

P(Type I Error)

- We can control the probability of a Type I error by our choice of the significance level
- The more serious the consequences of a Type I error, the smaller α should be

Type I and Type II Errors

- As P(Type I Error) goes down, P(Type II Error) goes up
- The two probabilities are inversely related

A significance test about a proportion is conducted using a significance level of 0.05.

- The test statistic is 2.58. The P-value is 0.01. If H0 is true, for what probability of a Type I error was the test designed?
- .01
- .05
- 2.58
- .02

A significance test about a proportion is conducted using a significance level of 0.05.

- The test statistic is 2.58. The P-value is 0.01. If this test resulted in a decision error, what type of error was it?
- Type I
- Type II

Section 8.5

- Limitations of Significance Tests

Statistical Significance Does Not Mean Practical Significance

- When we conduct a significance test, its main relevance is studying whether the true parameter value is:
- Above, or below, the value in H0 and
- Sufficiently different from the value in H0 to be of practical importance

What the Significance Test Tells Us

- The test gives us information about whether the parameter differs from the H0 value and its direction from that value

What the Significance Test Does Not Tell Us

- It does not tell us about the practical importance of the results

Statistical Significance vs. Practical Significance

- A small P-value, such as 0.001, is highly statistically significant, but it does not imply an important finding in any practical sense
- In particular, whenever the sample size is large, small P-values can occur when the point estimate is near the parameter value in H0
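
A small numerical illustration of this point: the same sample proportion, barely different from the null value, can be non-significant in a small sample and overwhelmingly significant in a very large one. The numbers below (p̂ = 0.51, p0 = 0.50, n = 100 and n = 100,000) are hypothetical values chosen only for this sketch.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(p_hat, p0, n):
    """Two-sided P-value of the large-sample z test for a proportion."""
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical data: the estimate differs from H0 by only 0.01 in both cases
p_small_n = two_sided_p(0.51, 0.50, n=100)
p_large_n = two_sided_p(0.51, 0.50, n=100_000)

print(f"n = 100:     P-value = {p_small_n:.3f}")
print(f"n = 100,000: P-value = {p_large_n:.1e}")
```

The estimate is equally close to 0.50 in both cases, so the tiny P-value in the second case reflects the sample size, not a practically larger effect.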

Significance Tests Are Less Useful Than Confidence Intervals

- A significance test merely indicates whether the particular parameter value in H0 is plausible
- When a P-value is small, the significance test indicates that the hypothesized value is not plausible, but it tells us little about which potential parameter values are plausible

Significance Tests Are Less Useful Than Confidence Intervals

- A confidence interval is more informative, because it displays the entire set of believable values

Misinterpretations of Results of Significance Tests

- "Do Not Reject H0" does not mean "Accept H0"
- A P-value above 0.05, when the significance level is 0.05, does not mean that H0 is correct
- A test merely indicates whether a particular parameter value is plausible

Misinterpretations of Results of Significance Tests

- Statistical significance does not mean practical significance
- A small P-value does not tell us whether the parameter value differs by much in practical terms from the value in H0

Misinterpretations of Results of Significance Tests

- The P-value cannot be interpreted as the probability that H0 is true

Misinterpretations of Results of Significance Tests

- It is misleading to report results only if they are statistically significant

Misinterpretations of Results of Significance Tests

- Some tests may be statistically significant just by chance

Misinterpretations of Results of Significance Tests

- True effects may not be as large as initial estimates reported by the media

Section 8.6

- How Likely is a Type II Error?

Type II Error

- A Type II error occurs in a hypothesis test when we fail to reject H0 even though it is actually false

Calculating the Probability of a Type II Error

- To calculate the probability of a Type II error, we must do a separate calculation for various values of the parameter of interest

Example: Reconsider the Experiment to Test Astrologers' Predictions

- Scientific test of astrology experiment
- For each of 116 adult volunteers, an astrologer prepared a horoscope based on the positions of the planets and the moon at the moment of the person's birth
- Each adult subject also filled out a California Personality Index Survey

Example: Reconsider the Experiment to Test Astrologers' Predictions

- For a given adult, his or her birth data and horoscope were shown to an astrologer together with the results of the personality survey for that adult and for two other adults randomly selected from the group
- The astrologer was asked which personality chart of the 3 subjects was the correct one for that adult, based on his or her horoscope

Example: Reconsider the Experiment to Test Astrologers' Predictions

- 28 astrologers were randomly chosen to take part in the experiment
- The National Council for Geocosmic Research claimed that the probability of a correct guess on any given trial in the experiment was larger than 1/3, the value for random guessing

Example: Reconsider the Experiment to Test Astrologers' Predictions

- With random guessing, p = 1/3
- The astrologers claim p > 1/3
- The hypotheses for this test:
- H0: p = 1/3
- Ha: p > 1/3
- The significance level used for the test is 0.05

Example: Reconsider the Experiment to Test Astrologers' Predictions

- For what values of the sample proportion can we reject H0?
- A test statistic of z = 1.645 has a P-value of 0.05. So, we reject H0 for z ≥ 1.645 and we fail to reject H0 for z < 1.645

Example: Reconsider the Experiment to Test Astrologers' Predictions

- Find the value of the sample proportion that would give us a z of 1.645
- p̂ = 1/3 + 1.645 sqrt((1/3)(2/3)/116) = 0.405

Example: Reconsider the Experiment to Test Astrologers' Predictions

- So, we fail to reject H0 if p̂ < 0.405
- Suppose that in reality astrologers can make the correct prediction 50% of the time (that is, p = 0.50)
- In this case (p = 0.50), we can now calculate the probability of a Type II error

Example: Reconsider the Experiment to Test Astrologers' Predictions

- We calculate the probability of a sample proportion p̂ < 0.405 when the true proportion is 0.50
- z = (0.405 - 0.50) / sqrt(0.50(0.50)/116) = -2.04

Example: Reconsider the Experiment to Test Astrologers' Predictions

- The area to the left of -2.04 in the standard normal table is 0.02
- The probability of making a Type II error and failing to reject H0: p = 1/3 is only 0.02 in the case in which the true proportion is 0.50
- This is only a small chance of making a Type II error
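
The Type II error calculation for the astrology test can be reproduced step by step: find the cutoff for the sample proportion under H0, then find the probability of falling below that cutoff when the true proportion is 0.50. This is a minimal sketch with the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, p0, p_true, alpha = 116, 1 / 3, 0.50, 0.05

# Critical z for a one-sided test at level alpha (about 1.645)
z_crit = NormalDist().inv_cdf(1 - alpha)

# Smallest sample proportion that leads to rejecting H0: p = 1/3
cutoff = p0 + z_crit * sqrt(p0 * (1 - p0) / n)

# Sampling distribution of the sample proportion when the true p is 0.50
se_true = sqrt(p_true * (1 - p_true) / n)
sampling_dist = NormalDist(mu=p_true, sigma=se_true)

# Type II error: failing to reject H0 (sample proportion below the cutoff)
type_ii = sampling_dist.cdf(cutoff)
power = 1 - type_ii

print(f"cutoff = {cutoff:.3f}, P(Type II error) = {type_ii:.2f}, power = {power:.2f}")
```

Changing `p_true` shows why a separate calculation is needed for each parameter value: the closer the true proportion is to 1/3, the larger the Type II error probability becomes.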

Power of a Test

- Power = 1 - P(Type II error)
- The higher the power, the better
- In practice, it is ideal for studies to have high power while using a relatively small significance level

Example: Reconsider the Experiment to Test Astrologers' Predictions

- In this example, the power of the test at p = 0.50 is 1 - 0.02 = 0.98
- Since the higher the power the better, a test power of 0.98 is quite good