Chapter 8 Introduction to Statistical Inferences - PowerPoint PPT Presentation

Slides: 68
Provided by: debb115
1
Chapter 8 Introduction to Statistical
Inferences
2
Chapter Goals
  • Learn the basic concepts of estimation and
    hypothesis testing
  • Consider questions about a population mean using
    two methods that assume the population standard
    deviation is known
  • Consider: what value or interval of values can we
    use to estimate a population mean?
  • Consider: is there evidence to suggest the
    hypothesized mean is incorrect?

3
Important Definitions
  • Interval Estimate: An interval bounded by two
    values and used to estimate the value of a
    population parameter. The values that bound this
    interval are statistics calculated from the
    sample that is being used as the basis for the
    estimation.

Level of Confidence, (1 - α)100%: The
probability that the sample to be selected yields
an interval that includes the parameter being
estimated
Confidence Interval: An interval estimate with a
specified level of confidence
4
8.3 Estimation of Mean μ (σ Known)
  • Formalize the interval estimation process as it
    applies to estimating the population mean μ based
    on a random sample
  • Assume the population standard deviation σ is
    known
  • The assumptions are the conditions that need to
    exist in order to correctly apply a statistical
    procedure

5
The Assumption...
Assumption satisfied by:
1. Knowing that the sampled population is normally
distributed, or
2. Using a large enough random sample (CLT)
Note: The CLT may be applied to smaller samples
(for example, n = 15) when there is evidence to
suggest a unimodal distribution that is
approximately symmetric. If there is evidence of
skewness, the sample size needs to be much larger.
6
The 1 - α Confidence Interval of μ
  • A (1 - α)100% confidence interval for μ is found by
    x̄ ± z(α/2)·σ/√n

2. z(α/2): z-score corresponding to a level of
confidence 1 - α, so that P(Z > z(α/2)) = α/2
7
Notes Continued
  • 3. σ/√n: standard error of the mean
  • The standard deviation of the distribution of x̄
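
The two ingredients named in these notes, the confidence coefficient z(α/2) and the standard error of the mean, can be computed directly instead of being read from a table. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper names `confidence_coefficient` and `standard_error` are hypothetical and do not come from the slides.

```python
from math import sqrt
from statistics import NormalDist

def confidence_coefficient(confidence: float) -> float:
    """Return z(α/2): the z-score with area α/2 to its right, where α = 1 - confidence."""
    alpha = 1.0 - confidence
    # P(Z > z(α/2)) = α/2, so z(α/2) is the (1 - α/2) quantile of the standard normal.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

def standard_error(sigma: float, n: int) -> float:
    """Return σ/√n, the standard deviation of the sampling distribution of the sample mean."""
    return sigma / sqrt(n)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z(α/2) = {confidence_coefficient(level):.3f}")
print(f"standard error for σ = 10.5, n = 100: {standard_error(10.5, 100):.2f}")
```

The three levels give roughly 1.645, 1.960, and 2.576, the values usually quoted from a standard normal table.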

8
The Confidence Interval
  • A Five-Step Model

1. Describe the population parameter of concern
2. Specify the confidence interval criteria
   a. Check the assumptions
   b. Identify the probability distribution and the
      formula to be used
   c. Determine the level of confidence, 1 - α
3. Collect and present sample information
4. Determine the confidence interval
   a. Determine the confidence coefficient
   b. Find the maximum error of estimate
   c. Find the lower and upper confidence limits
5. State the confidence interval
9
Example
  • Example: The weights of full boxes of a certain
    kind of cereal are normally distributed with a
    standard deviation of 0.27 oz. A sample of 18
    randomly selected boxes produced a mean weight
    of 9.87 oz. Find a 95% confidence interval for
    the true mean weight of a box of this cereal.
  • Solution
  • 1. Describe the population parameter of
    concern: The mean weight, μ, of all boxes of
    this cereal

10
Solution Continued
  • 3. Collect and present information: The sample
    information is given in the statement of the
    problem
  • Given: n = 18, x̄ = 9.87

11
Solution Continued
  • b. Find the maximum error of estimate: Use the
    maximum error part of the formula for a CI,
    E = z(α/2)·σ/√n

5. State the confidence interval: 9.75 to 10.00
is a 95% confidence interval for the true mean
weight, μ, of cereal boxes
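
The arithmetic behind this interval can be checked with a short script. This is only a sketch using the numbers stated in the example (σ = 0.27, n = 18, sample mean 9.87, 95% confidence); the variable names are illustrative, and `statistics.NormalDist` stands in for the z table.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, x_bar, confidence = 0.27, 18, 9.87, 0.95

z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # confidence coefficient z(α/2)
max_error = z * sigma / sqrt(n)                       # maximum error of estimate E
lower, upper = x_bar - max_error, x_bar + max_error   # lower and upper confidence limits

print(f"z(α/2) = {z:.2f}, E = {max_error:.3f}")
print(f"{confidence:.0%} CI: {lower:.3f} to {upper:.3f}")
```

Without intermediate rounding the limits are about 9.745 and 9.995. The slide's upper limit of 10.00 appears consistent with rounding E to 0.125 before adding it to 9.87.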
12
Example
  • Example: A random sample of the test scores of
    100 applicants for clerk-typist positions at a
    large insurance company showed a mean score of
    72.6. Determine a 99% confidence interval for
    the mean score of all applicants at the
    insurance company. Assume the standard deviation
    of test scores is 10.5.
  • Solution
  • 1. Parameter of concern: The mean test score, μ,
    of all applicants at the insurance company
  • 2. Confidence interval criteria
  • a. Assumptions: The distribution of the
    variable, test score, is not known. However,
    the sample size is large enough (n = 100) that
    the CLT applies
  • b. Probability distribution: standard normal
    variable z with σ = 10.5
  • c. The level of confidence: 99%, or 1 - α = 0.99

13
Solution Continued
  • 3. Sample information: Given n = 100 and
    x̄ = 72.6

5. Confidence interval: With 99% confidence we
say, "The mean test score is between 69.9 and
75.3," or "69.9 to 75.3 is a 99% confidence
interval for the true mean test score."
Note: 99% confidence means that if we conduct the
experiment over and over, and construct lots of
confidence intervals, then about 99% of the
confidence intervals will contain the true mean
value μ.
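
The repeated-sampling interpretation in the note can be illustrated by simulation. The sketch below rests on assumptions that are not in the slides: the population is taken to be normal, its true mean is set to a hypothetical 70, and 2000 repetitions are used. Only σ = 10.5, n = 100, and the 99% level come from the example.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

random.seed(8)                       # fixed seed so the run is repeatable
TRUE_MEAN = 70.0                     # hypothetical population mean (assumption)
SIGMA, N, CONFIDENCE = 10.5, 100, 0.99
TRIALS = 2000                        # number of repeated experiments (assumption)

z = NormalDist().inv_cdf(1 - (1 - CONFIDENCE) / 2)
max_error = z * SIGMA / sqrt(N)

hits = 0
for _ in range(TRIALS):
    # Draw one sample of size N and build its confidence interval.
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    x_bar = fmean(sample)
    if x_bar - max_error <= TRUE_MEAN <= x_bar + max_error:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.1%} of {TRIALS} intervals contain the true mean")
```

The printed share should land close to 99%, not exactly on it, because the coverage is itself estimated from a finite number of simulated samples.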
14
Sample Size
  • Problem: Find the sample size necessary in order
    to obtain a specified maximum error and level of
    confidence (assume the standard deviation is
    known)

15
Example
  • Example: Find the sample size necessary to
    estimate a population mean to within 0.5 with 95%
    confidence if the standard deviation is 6.2

Note: When solving for sample size n, always
round up to the next larger integer (Why?)
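
Solving the maximum-error formula E = z(α/2)·σ/√n for n gives n = (z(α/2)·σ/E)². A minimal sketch of that calculation for this example follows; the function name `required_sample_size` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma: float, max_error: float, confidence: float) -> int:
    """Smallest n for which z(α/2)·σ/√n does not exceed max_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Solve E = z·σ/√n for n, then round up so the error bound is still met.
    return ceil((z * sigma / max_error) ** 2)

n = required_sample_size(sigma=6.2, max_error=0.5, confidence=0.95)
print(n)  # 591
```

The unrounded value is about 590.7. Rounding down to 590 would leave the maximum error slightly above 0.5, which is why the result is always rounded up.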
16
8.4 The Nature of Hypothesis Testing
  • Formal process for making an inference
  • Consider many of the concepts of a hypothesis
    test and look at several decision-making
    situations
  • The entire process starts by identifying
    something of concern and then formulating two
    hypotheses about it

17
Hypothesis
  • Hypothesis: A statement that something is true

Statistical Hypothesis Test: A process by which
a decision is made between two opposing
hypotheses. The two opposing hypotheses are
formulated so that each hypothesis is the
negation of the other. (That way one of them is
always true, and the other one is always false.)
Then one hypothesis is tested in hopes that it
can be shown to be a very improbable occurrence,
thereby implying the other hypothesis is the
likely truth.
18
Null & Alternative Hypothesis
  • There are two hypotheses involved in making a
    decision

Null Hypothesis, Ho: The hypothesis that is
assumed to be true. Usually a statement that a
population parameter has a specific value. The
starting point for the investigation.
Alternative Hypothesis, Ha: A statement about
the same population parameter that is used in the
null hypothesis. Generally this is a statement
that specifies the population parameter has a
value different, in some way, from the value
given in the null hypothesis. The rejection of
the null hypothesis will imply the likely truth
of this alternative hypothesis. This statement is
the research hypothesis.
19
Notes
  • 1. Basic idea: proof by contradiction
  • Assume the null hypothesis is true and look for
    evidence to suggest that it is false

2. Null hypothesis: A statement about a
population parameter that is assumed to be true
3. Alternative hypothesis: also called the
research hypothesis. Generally, this is what you
are trying to prove. We hope experimental evidence
will suggest the alternative hypothesis is true
by showing the unlikeliness of the truth of the
null hypothesis
20
Example
  • Example: Suppose you are investigating the
    effects of a new pain reliever. You hope the
    new drug relieves minor muscle aches and pains
    longer than the leading pain reliever. State the
    null and alternative hypotheses.
  • Solutions
  • Ho: The new pain reliever is no better than the
    leading pain reliever
  • Ha: The new pain reliever lasts longer than the
    leading pain reliever

21
Example
  • Example: You are investigating the presence of
    radon in homes being built in a new development.
    If the mean level of radon is greater than 4 then
    send a warning to all home owners in the
    development. State the null and alternative
    hypotheses.
  • Solutions
  • Ho: The mean level of radon for homes in the
    development is 4 (or less)
  • Ha: The mean level of radon for homes in the
    development is greater than 4

22
Hypothesis Test Outcomes
  • Type A correct decision: Null hypothesis true,
    decide in its favor

Type B correct decision: Null hypothesis false,
decide in favor of alternative hypothesis
Type I error: Null hypothesis true, decide in
favor of alternative hypothesis
Type II error: Null hypothesis false, decide in
favor of null hypothesis
23
Example
  • Example: A calculator company has just received a
    large shipment of parts used to make the screens
    on graphing calculators. They consider the
    shipment acceptable if the proportion of
    defective parts is 0.01 (or less). If the
    proportion of defective parts is greater than
    0.01 the shipment is unacceptable and returned to
    the manufacturer. State the null and alternative
    hypotheses, and describe the four possible
    outcomes and the resulting actions that would
    occur for this test.
  • Solutions
  • Ho: The proportion of defective parts is 0.01 (or
    less)
  • Ha: The proportion of defective parts is greater
    than 0.01

24
Fail To Reject Ho
  • Null Hypothesis Is True
  • Type A correct decision
  • Truth of situation: The proportion of defective
    parts is 0.01 (or less)
  • Conclusion: It was determined that the proportion
    of defective parts is 0.01 (or less)
  • Action: The calculator company received parts
    with an acceptable proportion of defectives

  • Null Hypothesis Is False
  • Type II error
  • Truth of situation: The proportion of defective
    parts is greater than 0.01
  • Conclusion: It was determined that the proportion
    of defective parts is 0.01 (or less)
  • Action: The calculator company received parts
    with an unacceptable proportion of defectives
25
Reject Ho
  • Null hypothesis is true
  • Type I error
  • Truth of situation: The proportion of defectives
    is 0.01 (or less)
  • Conclusion: It was determined that the proportion
    of defectives is greater than 0.01
  • Action: Send the shipment back to the
    manufacturer. The proportion of defectives is
    acceptable

  • Null hypothesis is false
  • Type B correct decision
  • Truth of situation: The proportion of
    defectives is greater than 0.01
  • Conclusion: It was determined that the proportion
    of defectives is greater than 0.01
  • Action: Send the shipment back to the
    manufacturer. The proportion of defectives is
    unacceptable
26
Errors
  • Notes
  • Since we make a decision based on a sample, there
    is always the chance of making an error

Probability of a type I error = α
Probability of a type II error = β
27
Notes
  • 1. Would like α and β to be as small as possible

2. α and β are inversely related
3. Usually set α (and don't worry too much about
β. Why?)
4. Most common values for α and β are 0.01 and
0.05
5. 1 - β: the power of the statistical test. A
measure of the ability of a hypothesis test to
reject a false null hypothesis
6. Regardless of the outcome of a hypothesis
test, we never really know for sure if we have
made the correct decision
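
Notes 2 and 5 can be made concrete for one specific case. The sketch below computes β and the power 1 - β for a right-tailed z-test, in which Ho is rejected when the test statistic reaches the critical value z(α). Every number in it (μ0 = 30, a true mean of 32, σ = 7.5, n = 36) is a hypothetical value chosen for illustration, and the function name `type_ii_error` is not from the slides.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def type_ii_error(alpha: float, mu0: float, mu_true: float, sigma: float, n: int) -> float:
    """β for a right-tailed z-test of Ho: μ = mu0 when the true mean is mu_true (> mu0)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)        # reject Ho when the test statistic >= z_crit
    shift = (mu_true - mu0) / (sigma / sqrt(n))   # distance of the true mean from mu0, in standard errors
    # A type II error occurs when the statistic stays below z_crit even though Ho is false.
    return STD_NORMAL.cdf(z_crit - shift)

# Hypothetical values chosen only for illustration.
for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error(alpha, mu0=30, mu_true=32, sigma=7.5, n=36)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}  power = {1 - beta:.3f}")
```

With the sample size held fixed, lowering α from 0.10 to 0.01 raises β in this setup, which is the inverse relationship described in note 2.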
28
Level of Significance & Test Statistic
  • Level of Significance, α: The probability of
    committing a type I error

Test Statistic: A random variable whose value is
calculated from the sample data and is used in
making the decision "fail to reject Ho" or
"reject Ho"
  • Notes
  • The value of the test statistic is used in
    conjunction with a decision rule to determine
    "fail to reject Ho" or "reject Ho"
  • The decision rule is established prior to
    collecting the data and specifies how you will
    reach the decision

29
The Conclusion
  • a. If the decision is "reject Ho," then the
    conclusion should be worded something like,
    "There is sufficient evidence at the α level of
    significance to show that . . . (the meaning of
    the alternative hypothesis)"

b. If the decision is "fail to reject Ho," then
the conclusion should be worded something like,
"There is not sufficient evidence at the α level
of significance to show that . . . (the meaning
of the alternative hypothesis)"
  • Notes
  • The decision is about Ho
  • The conclusion is a statement about Ha
  • There is always the chance of making an error

30
8.5 Hypothesis Test of Mean μ (σ Known): A
p-Value Approach
Hypothesis test:
1. A well-organized, step-by-step procedure used
to make a decision
2. Probability-value approach (p-value approach):
a procedure that has gained popularity in recent
years. Organized into five steps.
31
The Probability-Value Hypothesis Test
  • A Five-Step Procedure

1. The Set-Up
   a. Describe the population parameter of concern
   b. State the null hypothesis (Ho) and the
      alternative hypothesis (Ha)
2. The Hypothesis Test Criteria
   a. Check the assumptions
   b. Identify the probability distribution and the
      test statistic formula to be used
   c. Determine the level of significance, α
3. The Sample Evidence
   a. Collect the sample information
   b. Calculate the value of the test statistic
4. The Probability Distribution
   a. Calculate the p-value for the test statistic
   b. Determine whether or not the p-value is
      smaller than α
5. The Results
   a. State the decision about Ho
   b. State a conclusion about Ha
32
Example
  • Example: A company advertises the net weight of
    its cereal is 24 ounces. A consumer group
    suspects the boxes are underfilled. They cannot
    check every box of cereal, so a sample of cereal
    boxes will be examined. A decision will be made
    about the true mean weight based on the sample
    mean. State the consumer group's null and
    alternative hypotheses. Assume σ = 0.2
  • Solution
  • 1. The Set-Up
  • a. Describe the population parameter of concern
  • The population parameter of interest is the
    mean μ, the mean weight of the cereal boxes

33
Solution Continued
Note: The trichotomy law from algebra states that
two numerical values must be related in exactly
one of three possible relationships: <, =, or >.
All three of these possibilities must be
accounted for between the two opposing hypotheses
in order for the hypotheses to be negations of
each other.
34
Possible Statements of Null & Alternative
Hypotheses
  • Notes
  • The null hypothesis will be written with just the
    equal sign (a value is assigned)
  • When equal is paired with less than or greater
    than, the combined symbol is written beside the
    null hypothesis as a reminder that all three
    signs have been accounted for in these two
    opposing statements.

35
Examples
  • Example: An automobile manufacturer claims a new
    model gets at least 27 miles per gallon. A
    consumer group disputes this claim and would
    like to show the mean miles per gallon is lower.
    State the null and alternative hypotheses.

Solution: Ho: μ = 27 (≥) and Ha: μ < 27
Solution: Ho: μ = 10 and Ha: μ ≠ 10
36
Common Phrases & Their Negations
37
Example Continued
  • Example Continued: Weight of cereal boxes

Recall: Ho: μ = 24 (≥) (at least 24)
Ha: μ < 24 (less than 24)
  • 2. The Hypothesis Test Criteria
  • a. Check the assumptions
  • The weight of cereal boxes is probably mound
    shaped. A sample size of 40 should be sufficient
    for the CLT to apply. The sampling distribution
    of the sample mean can be expected to be normal.

38
Solution Continued
c. Determine the level of significance: Let α =
0.05
4. The Probability Distribution
a. Calculate the p-value for the test statistic
39
p-Value
  • Probability-Value, or p-Value: The probability
    that the test statistic could be the value it is
    or a more extreme value (in the direction of the
    alternative hypothesis) when the null hypothesis
    is true. (Note: the symbol P will be used to
    represent the p-value, especially in algebraic
    situations.)

40
Solution Continued

b. Determine whether or not the p-value is
smaller than α: The p-value (0.0571) is greater
than α (0.05)
5. The Results
a. State the decision about Ho: Fail to reject Ho
b. Write a conclusion about Ha: There is not
sufficient evidence at the 0.05 level of
significance to show that the mean weight of
cereal boxes is less than 24 ounces
41
Notes
  • If we fail to reject Ho, there is no evidence to
    suggest the null hypothesis is false. This does
    not mean Ho is true.
  • The p-value is the area, under the curve of the
    probability distribution for the test statistic,
    that is more extreme than the calculated value of
    the test statistic.
  • There are 3 separate cases for p-values. The
    direction (or sign) of the alternative hypothesis
    is the key.

42
Finding p-Values
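
The three cases, one for each sign in the alternative hypothesis, can be sketched as a single function. The function name `p_value` and the labels "less", "greater", and "two-sided" are illustrative choices, not notation from the slides. The call at the end uses z* = -1.58, an assumed value picked because its left-tail area matches the 0.0571 reported in the cereal example.

```python
from statistics import NormalDist

def p_value(z_star: float, alternative: str) -> float:
    """p-value for a z test statistic; alternative is 'less', 'greater', or 'two-sided'."""
    std_normal = NormalDist()
    if alternative == "less":         # Ha contains <: area to the left of z*
        return std_normal.cdf(z_star)
    if alternative == "greater":      # Ha contains >: area to the right of z*
        return 1 - std_normal.cdf(z_star)
    if alternative == "two-sided":    # Ha contains ≠: area in both tails beyond |z*|
        return 2 * (1 - std_normal.cdf(abs(z_star)))
    raise ValueError(f"unknown alternative: {alternative!r}")

# z* = -1.58 is an assumption, not a value stated in the slides.
print(f"{p_value(-1.58, 'less'):.4f}")
```

For the same |z*|, the two-sided p-value is twice the one-sided one, which is why the direction of Ha has to be settled before the p-value is computed.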
43
Example
  • Example: The mean age of all shoppers at a local
    jewelry store is 37 years (with a standard
    deviation of 7 years). In an attempt to attract
    older adults with more disposable income, the
    store launched a new advertising campaign.
    Following the advertising, a random sample of 47
    shoppers showed a mean age of 39.3. Is there
    sufficient evidence to suggest the advertising
    campaign has succeeded in attracting older
    customers?
  • Solution
  • 1. The Set-Up
  • a. Parameter of concern: the mean age, μ, of all
    shoppers
  • b. The hypotheses
  • Ho: μ = 37 (≤)
  • Ha: μ > 37

44
Solution Continued
  • 2. The Hypothesis Test Criteria
  • a. The assumptions: The distribution of the age
    of shoppers is unknown. However, the sample size
    is large enough for the CLT to apply.
  • b. The test statistic: The test statistic will
    be z
  • c. The level of significance: none given. We
    will find a p-value

45
Solution Continued
5. The Results: Because the p-value is so small
(P < 0.05), there is evidence to suggest the mean
age of shoppers at the jewelry store is greater
than 37
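
The intermediate steps for this example, the test statistic and its p-value, can be reproduced from the numbers in the problem statement (μ0 = 37, σ = 7, n = 47, sample mean 39.3). This is a sketch using the usual z statistic for a mean with σ known; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, x_bar = 37, 7, 47, 39.3

z_star = (x_bar - mu0) / (sigma / sqrt(n))   # calculated test statistic
p = 1 - NormalDist().cdf(z_star)             # right-tail area, because Ha: μ > 37

print(f"z* = {z_star:.2f}, p-value = {p:.4f}")
```

The p-value comes out near 0.012, which is below 0.05 and agrees with the conclusion stated in step 5.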
46
p-Value
  • The idea of the p-value is to express the degree
    of belief in the null hypothesis

1. When the p-value is minuscule (like 0.0001),
the null hypothesis would be rejected by everyone
because the sample results are very unlikely for
a true Ho
2. When the p-value is fairly small (like 0.01),
the evidence against Ho is quite strong and Ho
will be rejected by many
3. When the p-value begins to get larger (say,
0.02 to 0.08), there is too much probability that
data like the sample involved could have occurred
even if Ho were true, and the rejection of Ho is
not an easy decision
4. When the p-value gets large (like 0.15 or
more), the data are not at all unlikely if Ho
is true, and no one will reject Ho
47
p-Value Advantages & Disadvantage
  • Advantages of p-value approach
  • 1. The results of the test procedure are
    expressed in terms of a continuous probability
    scale from 0.0 to 1.0, rather than simply on a
    reject or fail to reject basis
  • 2. A p-value can be reported and the user of the
    information can decide on the strength of the
    evidence as it applies to his/her own situation
  • 3. Computers can do all the calculations and
    report the p-value, thus eliminating the need for
    tables

Disadvantage
1. Tendency for people to put off determining the
level of significance
48
Example
  • Example: The active ingredient for a drug is
    manufactured using fermentation. The standard
    process yields a mean of 26.5 grams (assume
    σ = 3.2). A new mixing technique during
    fermentation is implemented. A random sample of
    32 batches showed a sample mean of 27.1. Is there
    any evidence to suggest the new mixing technique
    has changed the yield?
  • Solution
  • 1. The Set-Up
  • a. The parameter of interest is the mean yield
    of active ingredient, μ
  • b. The null and alternative hypotheses
  • Ho: μ = 26.5
  • Ha: μ ≠ 26.5

49
Solution Continued
2. The Hypothesis Test Criteria
a. Assumptions: A sample of size 32 is large
enough to satisfy the CLT
b. The test statistic: z
c. The level of significance: find a p-value
50
Solution Continued
5. The Results: Because the p-value is large (P =
0.2892), there is no evidence to suggest the new
mixing technique has changed the mean yield
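
Because Ha here contains ≠, the p-value counts both tails. A short sketch with the numbers from the problem statement (μ0 = 26.5, σ = 3.2, n = 32, sample mean 27.1) follows; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, x_bar = 26.5, 3.2, 32, 27.1

z_star = (x_bar - mu0) / (sigma / sqrt(n))        # calculated test statistic
p = 2 * (1 - NormalDist().cdf(abs(z_star)))       # two-tailed: area beyond |z*| in both tails

print(f"z* = {z_star:.2f}, p-value = {p:.4f}")
```

The unrounded result is about 0.289. The slide's 0.2892 is what a z table gives when z* is first rounded to 1.06.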
51
8.6 Hypothesis Test of Mean μ (σ Known): A
Classical Approach
52
The Classical Hypothesis Test
A Five-Step Procedure
1. The Set-Up
   a. Describe the population parameter of concern
   b. State the null hypothesis (Ho) and the
      alternative hypothesis (Ha)
2. The Hypothesis Test Criteria
   a. Check the assumptions
   b. Identify the probability distribution and the
      test statistic to be used
   c. Determine the level of significance, α
3. The Sample Evidence
   a. Collect the sample information
   b. Calculate the value of the test statistic
4. The Probability Distribution
   a. Determine the critical region(s) and critical
      value(s)
   b. Determine whether or not the calculated test
      statistic is in the critical region
5. The Results
   a. State the decision about Ho
   b. State the conclusion about Ha
53
Example Continued
  • Example (continued): A company advertises the
    net weight of its cereal is 24 ounces. A
    consumer group suspects the boxes are
    underfilled. They cannot check every box of
    cereal, so a sample of cereal boxes will be
    examined. A decision will be made about the
    true mean weight based on the sample mean. State
    the consumer group's null and alternative
    hypotheses. Assume σ = 0.2

54
Solution Continued
c. Determine the level of significance: Consider
the four possible outcomes and their
consequences. Let α = 0.05
4. The Probability Distribution
a. Determine the critical region(s) and critical
value(s)
55
Critical Region & Critical Value(s)
  • Critical Region: The set of values for the test
    statistic that will cause us to reject the null
    hypothesis. The set of values that are not in
    the critical region is called the noncritical
    region (sometimes called the acceptance region).

Critical Value(s): The first or boundary
value(s) of the critical region(s)
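
These definitions translate into a small lookup for a z test statistic. The sketch below treats the boundary value as part of the critical region; the function names and the labels "less", "greater", and "two-sided" are hypothetical. The value z* = -1.58 in the last line is an assumption used only to exercise the function.

```python
from statistics import NormalDist

def critical_values(alpha: float, alternative: str) -> tuple:
    """Critical z value(s) for an alternative of type 'less', 'greater', or 'two-sided'."""
    std_normal = NormalDist()
    if alternative == "less":          # critical region in the left tail
        return (std_normal.inv_cdf(alpha),)
    if alternative == "greater":       # critical region in the right tail
        return (std_normal.inv_cdf(1 - alpha),)
    if alternative == "two-sided":     # α/2 in each tail
        z = std_normal.inv_cdf(1 - alpha / 2)
        return (-z, z)
    raise ValueError(f"unknown alternative: {alternative!r}")

def in_critical_region(z_star: float, alpha: float, alternative: str) -> bool:
    """True when z* falls in the critical region; the boundary value counts as inside."""
    crit = critical_values(alpha, alternative)
    if alternative == "less":
        return z_star <= crit[0]
    if alternative == "greater":
        return z_star >= crit[0]
    return z_star <= crit[0] or z_star >= crit[1]

print(critical_values(0.05, "less"))            # boundary of the left-tail region at α = 0.05
print(in_critical_region(-1.58, 0.05, "less"))  # assumed z*, lies in the noncritical region
```

At α = 0.05 the left-tail critical value is about -1.645, so a test statistic of -1.58 is not in the critical region and the second line prints False.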
56
Critical Region & Critical Value(s)
  • Illustration

57
Solution Continued
5. The Results: We need a decision rule
58
Decision Rule
  • Decision Rule
  • a. If the test statistic falls within the
    critical region, we will reject Ho (the critical
    value is part of the critical region)
  • b. If the test statistic is in the noncritical
    region, we will fail to reject Ho

a. State the decision about Ho
Decision: Fail to reject Ho
b. State the conclusion about Ha
Conclusion: There is not enough evidence at the
0.05 level of significance to show that the mean
weight of cereal boxes is less than 24
59
Notes
  • 1. The null hypothesis specifies a particular
    value of a population parameter

2. The alternative hypothesis can take three
forms. Each form dictates a specific location of
the critical region(s)
4. Significance level: α
60
Example
  • Example: The mean water pressure in the main
    water pipe from a town well should be kept at 56
    psi. Anything less and several homes will have
    an insufficient supply, and anything greater
    could burst the pipe. Suppose the water
    pressure is checked at 47 random times. The
    sample mean is 57.1. (Assume σ = 7.) Is there
    any evidence to suggest the mean water pressure
    is different from 56? Use α = 0.01
  • Solution
  • 1. The Set-Up
  • a. Describe the parameter of concern
  • The mean water pressure in the main pipe
  • b. State the null and alternative hypotheses
  • Ho: μ = 56
  • Ha: μ ≠ 56

61
Solution Continued
2. The Hypothesis Test Criteria
a. Check the assumptions: A sample of n = 47 is
large enough for the CLT to apply
b. Identify the test statistic: The test
statistic is z
c. Determine the level of significance: α = 0.01
(given)
62
Solution Continued
4. The Probability Distribution a. Determine the
critical regions and the critical values
63
Solution Continued
5. The Results
a. State the decision about Ho: Fail to reject Ho
b. State the conclusion about Ha: There is no
evidence to suggest the water pressure is
different from 56 at the 0.01 level of
significance
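
The decision can be checked numerically with the values from the problem statement (μ0 = 56, σ = 7, n = 47, sample mean 57.1, α = 0.01). This is a sketch of the classical comparison for a two-tailed test; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, x_bar, alpha = 56, 7, 47, 57.1, 0.01

z_star = (x_bar - mu0) / (sigma / sqrt(n))      # calculated test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # two-tailed test: α/2 in each tail
reject = abs(z_star) >= z_crit                  # boundary value belongs to the critical region

print(f"z* = {z_star:.2f}, critical values = ±{z_crit:.2f}")
print("reject Ho" if reject else "fail to reject Ho")
```

The test statistic is about 1.08, well inside the noncritical region between roughly -2.58 and 2.58, so the script reports "fail to reject Ho".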
64
Example
  • Example: An elementary school principal claims
    students receive no more than 30 minutes of
    homework each night. A random sample of 36
    students showed a sample mean of 36.8 minutes
    spent doing homework (assume σ = 7.5). Is there
    any evidence to suggest the mean time spent on
    homework is greater than 30 minutes? Use α = 0.01
  • Solution
  • 1. The Set-Up
  • The parameter of concern: μ, the mean time spent
    doing homework each night
  • Ho: μ = 30 (≤)
  • Ha: μ > 30

65
Solution Continued
2. The Hypothesis Test Criteria
a. The sample size is n = 36; the CLT applies
b. The test statistic is z
c. The level of significance is given: α = 0.01
66
Solution Continued
4. The Probability Distribution
67
Solution Continued
5. The Results
Decision: Reject Ho
Conclusion: There is sufficient evidence at
the 0.01 level of significance to conclude the
mean time spent on homework by the elementary
students is more than 30 minutes
Note: Suppose we took repeated samples of size 36.
What would you expect to happen?
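
The test statistic and critical value for this example can be reproduced from the numbers given (μ0 = 30, σ = 7.5, n = 36, sample mean 36.8). The sketch checks α = 0.01, the level used in the solution steps, and also α = 0.05, to show that the decision does not depend on which of the two is used; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, x_bar = 30, 7.5, 36, 36.8

z_star = (x_bar - mu0) / (sigma / sqrt(n))     # calculated test statistic

for alpha in (0.05, 0.01):
    z_crit = NormalDist().inv_cdf(1 - alpha)   # right-tailed test: all of α in the upper tail
    decision = "reject Ho" if z_star >= z_crit else "fail to reject Ho"
    print(f"alpha = {alpha}: z* = {z_star:.2f}, critical value = {z_crit:.2f} -> {decision}")
```

A test statistic of 5.44 lies far beyond either critical value (about 1.64 and 2.33), so Ho is rejected at both levels.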