Title: ESTIMATES AND SAMPLE SIZES WITH ONE SAMPLE
2Overview
3Applications of Inferential Statistics
- Estimate the value of a population parameter.
- Test some claim (or hypothesis) about a
population.
4Estimating a Population Proportion
5Objective
- Given a sample statistic, estimate the value of
the population parameter (for this section, we
will use the sample proportion $\hat{p}$ to estimate the
value of the population proportion p).
6Requirements for Using a Normal Distribution as
an Approximation to a Binomial Distribution
- The sample is a random sample.
- The conditions for the binomial distribution are
satisfied. That is, there is a fixed number of
trials, the trials are independent, there are two
categories of outcomes, and the probabilities
remain constant for each trial.
- The normal distribution can be used to
approximate the distribution of sample
proportions because $np \ge 5$ and $nq \ge 5$
are both satisfied. Because p and q are unknown,
we use the sample proportion to estimate their
values.
7Notation for Proportions
- p = proportion of successes in the entire
population
- $\hat{p} = x/n$ = sample proportion of x successes in
a sample of size n
- $\hat{q} = 1 - \hat{p}$ = sample proportion of failures in
a sample of size n
8Definition
- A point estimate is a single value (or point)
used to approximate a population parameter.
- The sample proportion $\hat{p}$ is the best point
estimate of the population proportion p.
9Definition
- A confidence interval (or interval estimate) is a
range (or an interval) of values used to estimate
the true value of a population parameter. A
confidence interval is sometimes abbreviated as
CI.
10Definition
- The confidence level is the probability $1 - \alpha$ (often
expressed as the equivalent percentage value,
such as 95%) that is the proportion of times that
the confidence interval actually does contain the
population parameter, assuming that the
estimation process is repeated a large number of
times. (The confidence level is also called the
degree of confidence, or the confidence
coefficient.)
11Notation for Critical Value
- The critical value $z_{\alpha/2}$ is the positive z value
that is at the boundary separating an area of
$\alpha/2$ in the right tail of the standard normal
distribution. (The value of $-z_{\alpha/2}$ is at the
vertical boundary for the area of $\alpha/2$ in the
left tail.) The subscript $\alpha/2$ is simply a
reminder that the z score separates an area of
$\alpha/2$ in the right tail of the standard normal
distribution.
12Critical Values in the Standard Normal
Distribution
13Definition
- A critical value is the number on the borderline
separating sample statistics that are likely to
occur from those that are unlikely to occur. The
number $z_{\alpha/2}$ is a critical value that is a z
score with the property that it separates an area
of $\alpha/2$ in the right tail of the standard normal
distribution.
14Example
- Find the critical values for the following
confidence levels:
- 90%
- 95%
- 99%
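The critical values asked for here can also be computed rather than read from a table. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is an illustrative choice and does not come from the slides.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2} for a confidence level given as a fraction (e.g. 0.95)."""
    alpha = 1 - confidence
    # z_{alpha/2} leaves an area of alpha/2 in the right tail, so it is
    # the (1 - alpha/2) quantile of the standard normal distribution.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

Rounded to three decimals this gives 1.645, 1.960, and 2.576. A printed table may show a slightly different last digit for 99% because of table rounding.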
15Definition
- When data from a simple random sample are used to
estimate a population proportion p, the margin of
error, denoted by E, is the maximum likely (with
probability $1 - \alpha$) difference between the
observed sample proportion $\hat{p}$ and the true
value of the population proportion p. The margin
of error E is also called the maximum error of
the estimate and can be found as follows:
$E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$
16Confidence Interval (or Interval Estimate) for
the Population Proportion p
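- $\hat{p} - E < p < \hat{p} + E$, where
$E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$.
The confidence interval is often expressed in the
following equivalent formats: $\hat{p} \pm E$ or
$(\hat{p} - E, \hat{p} + E)$.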
17Round-Off Rule for Confidence Interval Estimates
of p
- Round the confidence interval limits for p to
three significant digits.
18Procedure for Constructing a Confidence Interval
for p
- Check that the requirements for this procedure
are satisfied. (For this procedure, check that
the sample is a simple random sample, the
conditions for the binomial distribution are
satisfied, and the normal distribution can be
used to approximate the distribution of sample
proportions because $np \ge 5$ and $nq \ge 5$
are both satisfied.)
- Refer to Table A-2 and find the critical
value $z_{\alpha/2}$ that corresponds to the desired
confidence level.
19Procedure for Constructing a Confidence Interval
for p (continued)
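- Evaluate the margin of error
$E = z_{\alpha/2} \sqrt{\frac{\hat{p}\hat{q}}{n}}$.
- Using the value of the calculated margin of error
E and the value of the sample proportion $\hat{p}$,
find the values of $\hat{p} - E$ and $\hat{p} + E$.
Substitute those values in the general format for
the confidence interval: $\hat{p} - E < p < \hat{p} + E$,
or $\hat{p} \pm E$, or $(\hat{p} - E, \hat{p} + E)$.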
20Procedure for Constructing a Confidence Interval
for p (continued)
- Round the resulting confidence interval limits to
three significant digits.
21Example
- As part of the National Health and Nutrition
Examination Survey, iron levels were checked for
a sample of 786 girls aged 12 to 15. Iron
deficiency was detected in 71 of those sampled.
Find a 95% confidence interval estimate of the
population proportion p.
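The procedure for constructing a confidence interval for p can be applied to this example in a few lines. This is a sketch under stated assumptions: it uses the standard-library `statistics.NormalDist` in place of Table A-2, and the function name `proportion_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, confidence):
    """Confidence interval for a population proportion from x successes in n trials."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # Requirement check: the normal approximation needs at least
    # 5 successes and 5 failures (np >= 5 and nq >= 5, with p-hat for p).
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("normal approximation requirement not satisfied")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z_{alpha/2}
    margin = z * sqrt(p_hat * q_hat / n)                # margin of error E
    return p_hat - margin, p_hat + margin

# Iron deficiency example: 71 of 786 girls, 95% confidence.
low, high = proportion_ci(71, 786, 0.95)
print(f"{low:.3g} < p < {high:.3g}")
```

With the round-off rule of three significant digits, the interval is approximately 0.0703 < p < 0.110.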
22Interpreting a Confidence Interval
- We are 95% (or 90% or 99%) confident that the
interval from $\hat{p} - E$ to $\hat{p} + E$ actually does contain
the true value of p.
23Determining Sample Size
- How large should our sample size be if we want
the margin of error to be less than some given
value?
24Sample Size for Estimating Proportion p
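- When an estimate $\hat{p}$ is known:
$n = \frac{[z_{\alpha/2}]^2 \hat{p}\hat{q}}{E^2}$
- When no estimate $\hat{p}$ is known:
$n = \frac{[z_{\alpha/2}]^2 \cdot 0.25}{E^2}$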
25Round-Off Rule for Determining Sample Size
- In order to ensure that the required size is at
least as large as it should be, if the computed
sample size is not a whole number, round it up to
the next higher whole number.
26Example
- What sample size would be needed to estimate the
proportion of girls aged 12 to 15 with iron
deficiency if the researcher wants 99% confidence
that the sample proportion is in error by no more
than 0.03? Use the sample proportion as a known
estimate.
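The sample size calculation for this example can be sketched as follows. It assumes the earlier iron-deficiency sample (71 of 786) supplies the known estimate and uses `statistics.NormalDist` for the critical value; the function name `sample_size_proportion` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(margin, confidence, p_hat=None):
    """Minimum n so that the margin of error for a proportion is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z_{alpha/2}
    # With no prior estimate, p-hat * q-hat is replaced by its maximum, 0.25.
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    # Round-off rule: always round up to the next whole number.
    return ceil(z ** 2 * pq / margin ** 2)

n_with_estimate = sample_size_proportion(0.03, 0.99, p_hat=71 / 786)
n_without_estimate = sample_size_proportion(0.03, 0.99)
print(n_with_estimate, n_without_estimate)
```

Using the known estimate gives n = 606. Dropping the estimate forces the conservative value 0.25 and a much larger sample.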
27Estimating a Population Mean: $\sigma$ Known
28Requirements for Estimating $\mu$ when $\sigma$ is Known
- The sample is a simple random sample.
- The value of the population standard deviation
$\sigma$ is known.
- Either or both of these conditions are satisfied:
- The population is normally distributed, or
- $n > 30$.
29Point Estimate of $\mu$
- The sample mean $\bar{x}$ is the best point estimate of
the population mean $\mu$.
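30Confidence Interval Estimate of the Population
Mean $\mu$ (With $\sigma$ Known)
- $\bar{x} - E < \mu < \bar{x} + E$, where
$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$, or
$\bar{x} \pm E$, or $(\bar{x} - E, \bar{x} + E)$.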
31Definition
- The two values $\bar{x} - E$ and $\bar{x} + E$ are called
confidence limits.
32Procedure for Constructing a Confidence Interval
for $\mu$ (with Known $\sigma$)
- Check that the requirements are satisfied.
(Requirements: We have a simple random sample,
$\sigma$ is known, and either the population appears to
be normally distributed or $n > 30$.)
- Refer to Table A-2 and find the critical
value $z_{\alpha/2}$ that corresponds to the desired
confidence level.
- Evaluate the margin of error
$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$.
33Procedure for Constructing a Confidence Interval
for $\mu$ (with Known $\sigma$) (continued)
- Using the value of the calculated margin of error
E and the value of the sample mean $\bar{x}$, find the
values of $\bar{x} - E$ and $\bar{x} + E$. Substitute those
values in the general format for the confidence
interval: $\bar{x} - E < \mu < \bar{x} + E$, or $\bar{x} \pm E$,
or $(\bar{x} - E, \bar{x} + E)$.
- Round the resulting values by using the following
round-off rule.
34Round-Off Rule for Confidence Intervals Used to
Estimate $\mu$
- When using the original set of data to construct
a confidence interval, round the confidence
interval limits to one more decimal place than is
used for the original set of data.
- When the original set of data is unknown and only
the summary statistics ($n$, $\bar{x}$, $s$) are used,
round the confidence interval limits to the same
number of decimal places used for the sample mean.
35Example
- The health of the bear population in Yellowstone
National Park is monitored by periodic
measurements taken from anesthetized bears. A
sample of 54 bears has a mean weight of 182.9 lb.
Assuming that $\sigma$ is known to be 121.8 lb, find
a 99% confidence interval estimate of the mean of
the population of all such bear weights.
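The bear-weight example can be worked through with the margin of error $E = z_{\alpha/2} \cdot \sigma / \sqrt{n}$. The sketch below assumes `statistics.NormalDist` as a stand-in for Table A-2; the function name `mean_ci_sigma_known` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_sigma_known(x_bar, sigma, n, confidence):
    """Confidence interval for the population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z_{alpha/2}
    margin = z * sigma / sqrt(n)                        # margin of error E
    return x_bar - margin, x_bar + margin

# 54 bears, sample mean 182.9 lb, sigma = 121.8 lb, 99% confidence.
# n > 30, so the requirement holds even without a normal population.
low, high = mean_ci_sigma_known(182.9, 121.8, 54, 0.99)
# Only summary statistics are given, so round to the same number of
# decimal places as the sample mean (one).
print(f"{low:.1f} lb < mu < {high:.1f} lb")
```

This yields an interval of roughly 140.2 lb to 225.6 lb, which is wide because the population standard deviation is large relative to the sample size.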
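36Sample Size for Estimating Mean $\mu$
- $n = \left[ \frac{z_{\alpha/2} \, \sigma}{E} \right]^2$,
where $z_{\alpha/2}$ = critical z score based on the
desired confidence level, E = desired
margin of error, and $\sigma$ = population standard
deviation.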
37Round-Off Rule for Sample Size n
- When finding the sample size n, if the use
of the formula for n does not result in a whole number, always
increase the value of n to the next larger whole
number.
38Dealing with Unknown $\sigma$ When Finding Sample Size
- Use the range rule of thumb to estimate the
standard deviation as follows: $\sigma \approx \text{range}/4$.
- Conduct a pilot study by starting the sampling
process. Based on the first collection of at
least 31 randomly selected sample values,
calculate the sample standard deviation s and use
it in place of $\sigma$. The estimated value can
then be improved as more sample data are
obtained.
- Estimate the value of $\sigma$ by using the results of
some other study that was done earlier.
39Example
- We are still very concerned about the health of
the bear population in Yellowstone National Park.
How large a sample is needed if we want to be 90%
confident that the sample mean weight is within 5
lb of the true population mean? Use the sample
standard deviation from the previous study.
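A short sketch of this sample-size calculation follows. It assumes that "the standard deviation from the previous study" means the 121.8 lb value used in the earlier bear example, and it uses `statistics.NormalDist` for the critical value; the function name `sample_size_mean` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean(sigma, margin, confidence):
    """Minimum n so that the sample mean is within `margin` of the population mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z_{alpha/2}
    # Round-off rule: increase n to the next larger whole number.
    return ceil((z * sigma / margin) ** 2)

# Assumed inputs: sigma = 121.8 lb (earlier study), E = 5 lb, 90% confidence.
n = sample_size_mean(121.8, 5, 0.90)
print(n)
```

Under these assumptions the required sample is 1606 bears, which illustrates how quickly n grows when a small margin of error is demanded for a highly variable population.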
40Estimating a Population Mean: $\sigma$ Not Known
41Requirements for Estimating $\mu$ when $\sigma$ is Unknown
- The sample is a simple random sample.
- Either the sample is from a normally distributed
population or $n > 30$.
42Point Estimate of $\mu$
- The sample mean $\bar{x}$ is the best point estimate of
the population mean $\mu$.
43Student t Distribution
- If the distribution of a population is
essentially normal (approximately bell-shaped),
then the distribution of
$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$ is essentially a
Student t distribution for all samples of size n.
The Student t distribution, often referred to
simply as the t distribution, is used to find
critical values denoted by $t_{\alpha/2}$.
44Important Properties of the Student t Distribution
- The Student t distribution is different for
different sample sizes.
- The Student t distribution has the same general
symmetric bell shape as the standard normal
distribution, but it reflects the greater
variability (with wider distributions) that is
expected with small samples.
- The Student t distribution has a mean of t = 0
(just as the standard normal distribution has a
mean of z = 0).
- The standard deviation of the Student t
distribution varies with the sample size, but it
is greater than 1 (unlike the standard normal
distribution, which has $\sigma = 1$).
- As the sample size n gets larger, the Student t
distribution gets closer to the standard normal
distribution.
45Definition
- The number of degrees of freedom for a collection
of sample data is the number of sample values
that can vary after certain restrictions have
been imposed on all data values.
degrees of freedom = n - 1
46Student t Distribution
47Example
- A simple random sample of size n = 20 is
selected from a normally distributed population.
Find the critical values $t_{\alpha/2}$ for the following
confidence levels:
- 90%
- 95%
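The Python standard library has no Student t quantile function, so the sketch below approximates the critical values numerically: it integrates the t density with Simpson's rule and inverts the result by bisection. This is an illustrative substitute for a table lookup, not the method used in the slides, and the names `t_pdf`, `t_cdf`, and `t_critical` are hypothetical.

```python
import math

def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(t * t / df))

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0: 0.5 (by symmetry) plus the area on [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence, df):
    """Approximate t_{alpha/2} by bisection on the numerical CDF."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0  # assumption: the critical value lies below 100
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

df = 20 - 1  # degrees of freedom = n - 1
for level in (0.90, 0.95):
    print(f"{level:.0%} confidence, df = {df}: t = {t_critical(level, df):.3f}")
```

For n = 20 this gives approximately 1.729 at 90% and 2.093 at 95%, both larger than the corresponding z critical values, as expected for a small sample.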
48Margin of Error E for the Estimate of $\mu$ (With
$\sigma$ Not Known)
- $E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$,
where $t_{\alpha/2}$ has n - 1 degrees of freedom.
49Confidence Interval for the Estimate of $\mu$
(With $\sigma$ Not Known)
- $\bar{x} - E < \mu < \bar{x} + E$, where
$E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$.
50Procedure for Constructing a Confidence Interval
for $\mu$ (with $\sigma$ Not Known)
- Check that the requirements are satisfied.
(Requirements: We have a simple random sample,
and either the population appears to be normally
distributed or $n > 30$.)
- Using n - 1 degrees of freedom, refer to Table
A-3 and find the critical value $t_{\alpha/2}$ that
corresponds to the desired confidence level.
- Evaluate the margin of error
$E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$.
51Procedure for Constructing a Confidence Interval
for $\mu$ (with $\sigma$ Not Known) (continued)
- Using the value of the calculated margin of error
E and the value of the sample mean $\bar{x}$, find the
values of $\bar{x} - E$ and $\bar{x} + E$. Substitute
those values in the general format for the
confidence interval: $\bar{x} - E < \mu < \bar{x} + E$,
or $\bar{x} \pm E$, or $(\bar{x} - E, \bar{x} + E)$.
- Round the resulting confidence interval limits.
52Example
- As part of a study on plant growth, a plant
physiologist grew 13 individually potted soybean
seedlings of the type called Wells II. She raised
the plants in a greenhouse under identical
environmental conditions (light, temperature,
soil, etc.). She measured the total stem length
(cm) for each plant after 16 days of growth.
Assuming that the distribution of lengths is
approximately normal, calculate a 95% confidence
interval for the mean stem length.
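The stem-length measurements themselves are not reproduced in this transcript, so the example cannot be completed here. The sketch below only shows the shape of the calculation: the function name `mean_ci_t` is hypothetical, the critical value 2.179 is the standard table entry for 12 degrees of freedom at 95% confidence, and the sample mean and standard deviation passed in are placeholder values, NOT the study's results.

```python
from math import sqrt

def mean_ci_t(x_bar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is the critical value t_{alpha/2} with n - 1 degrees of
    freedom, looked up by the caller (for example, in a t table).
    """
    margin = t_crit * s / sqrt(n)  # margin of error E
    return x_bar - margin, x_bar + margin

# n = 13 seedlings gives df = 12; t_{0.025} for df = 12 is 2.179.
# HYPOTHETICAL summary statistics, used only to exercise the function:
placeholder_mean_cm = 20.0
placeholder_sd_cm = 1.5
low, high = mean_ci_t(placeholder_mean_cm, placeholder_sd_cm, 13, 2.179)
print(f"{low:.2f} cm < mu < {high:.2f} cm  (placeholder inputs)")
```

Replacing the two placeholder values with the sample mean and standard deviation computed from the 13 measurements would give the interval the example asks for.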
53Choosing the Appropriate Distribution
54Choosing the Appropriate Distribution
55Estimating a Population Variance
56Requirements for Estimating $\sigma$ or $\sigma^2$
- The sample is a simple random sample.
- The population MUST have normally distributed
values (even if the sample is large).
57Chi-Square Distribution
- $\chi^2 = \frac{(n - 1) s^2}{\sigma^2}$, where
n = sample size,
$s^2$ = sample variance,
$\sigma^2$ = population variance, and
degrees of freedom = n - 1.
58Properties of the Distribution of the Chi-Square
Statistic
- The chi-square distribution is not symmetric,
unlike the normal and Student t distributions.
(As the number of degrees of freedom increases,
the distribution becomes more symmetric.)
- The values of chi-square can be zero or positive,
but they cannot be negative.
- The chi-square distribution is different for each
number of degrees of freedom, and the number of
degrees of freedom is given by df = n - 1 in this
section. As the number of degrees of freedom increases, the
chi-square distribution approaches a normal
distribution.
59Chi-Square Distribution
60Critical Values of the Chi-Square Distribution
- In Table A-4, each critical value of $\chi^2$ corresponds
to an area given in the top row of the table, and
that area represents the total region located to
the right of the critical value.
61Notation
- With the total area of $\alpha$ divided equally
between the two tails of a chi-square
distribution, $\chi^2_L$ denotes the left-tailed
critical value and $\chi^2_R$ denotes the right-tailed
critical value.
62Example
- A simple random sample of size n = 20 is
selected from a normally distributed population.
Find the critical values $\chi^2_L$ and $\chi^2_R$ for the
following confidence levels:
- 90%
- 95%
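The standard library has no chi-square quantile function either, so this sketch evaluates the chi-square CDF through the series for the regularized lower incomplete gamma function and inverts it by bisection. It is an illustrative alternative to Table A-4, and the names `chi2_cdf`, `chi2_quantile`, and `chi2_critical` are hypothetical.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable, via the incomplete gamma series."""
    if x <= 0:
        return 0.0
    a, h = df / 2, x / 2
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= h / (a + k)  # next term of sum h^k / (a (a+1) ... (a+k))
        total += term
        k += 1
    return total * math.exp(a * math.log(h) - h - math.lgamma(a))

def chi2_quantile(prob, df):
    """Value with cumulative (left-tail) area `prob`, found by bisection."""
    lo, hi = 0.0, df + 20 * math.sqrt(2 * df) + 20  # generous upper bracket
    for _ in range(80):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < prob:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def chi2_critical(confidence, df):
    """Return (chi2_L, chi2_R) with alpha/2 in each tail."""
    alpha = 1 - confidence
    return chi2_quantile(alpha / 2, df), chi2_quantile(1 - alpha / 2, df)

df = 20 - 1  # degrees of freedom = n - 1
for level in (0.90, 0.95):
    left, right = chi2_critical(level, df)
    print(f"{level:.0%}: chi2_L = {left:.3f}, chi2_R = {right:.3f}")
```

For df = 19 the results are about 10.117 and 30.144 at 90%, and about 8.907 and 32.852 at 95%. The two values are not symmetric about any center, which reflects the skewness of the distribution.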
63Estimators of $\sigma^2$ and $\sigma$
- The sample variance $s^2$ is used as the best point
estimate of the population variance $\sigma^2$.
- The sample standard deviation s is commonly used
as a point estimate of $\sigma$ (even though it is a
biased estimator).
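64Confidence Interval (or Interval Estimate) for
the Population Variance $\sigma^2$
- $\frac{(n - 1) s^2}{\chi^2_R} < \sigma^2 < \frac{(n - 1) s^2}{\chi^2_L}$
- To find the confidence interval for the
population standard deviation $\sigma$, use
$\sqrt{\frac{(n - 1) s^2}{\chi^2_R}} < \sigma < \sqrt{\frac{(n - 1) s^2}{\chi^2_L}}$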
65Procedure for Constructing a Confidence Interval
for $\sigma$ or $\sigma^2$
- Check that the requirements for the methods of
this section are satisfied. (Requirements: The
sample is a simple random sample and a histogram
or normal quantile plot suggests that the
population has a distribution that is very close
to a normal distribution.)
- Using n - 1 degrees of freedom, refer to Table
A-4 and find the critical values $\chi^2_L$ and $\chi^2_R$
that correspond to the desired confidence level.
- Evaluate the upper and lower confidence interval
limits using this format of the confidence
interval:
$\frac{(n - 1) s^2}{\chi^2_R} < \sigma^2 < \frac{(n - 1) s^2}{\chi^2_L}$
66Procedure for Constructing a Confidence Interval
for $\sigma$ or $\sigma^2$ (continued)
- If a confidence interval estimate of $\sigma$ is
desired, take the square root of the upper and
lower confidence interval limits and change
$\sigma^2$ to $\sigma$.
- Round the resulting confidence interval limits.
If using the original set of data, round to one
more decimal place than is used for the original
set of data. If using the sample standard
deviation or variance, round the confidence
interval limits to the same number of decimal
places.
67Example
- In a study of the effectiveness of a gluten-free
diet in first-degree relatives of patients with
Type I diabetes, researchers placed seven
subjects on a gluten-free diet for 12 months.
Prior to the diet, they took a baseline
measurement of the diabetes related insulin
autoantibody (IAA). The seven subjects had IAA
units of 9.7, 12.3, 11.2, 5.1, 24.8,
14.8, and 17.7. Assuming that the sample is a simple
random sample and that the sample data appear to
come from a population with a normal
distribution, find a 95% confidence interval
estimate of the population standard deviation.
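The IAA example can be worked directly from the seven listed values. In this sketch the critical values 1.237 and 14.449 are the standard chi-square table entries for 6 degrees of freedom with 0.025 in each tail; the variable names are illustrative choices.

```python
from math import sqrt
from statistics import variance

iaa = [9.7, 12.3, 11.2, 5.1, 24.8, 14.8, 17.7]
n = len(iaa)
s2 = variance(iaa)  # sample variance s^2 (divides by n - 1)

# Table values for df = n - 1 = 6 at 95% confidence.
chi2_left, chi2_right = 1.237, 14.449

# Confidence interval for the variance: dividing by the larger
# critical value gives the lower limit.
var_low = (n - 1) * s2 / chi2_right
var_high = (n - 1) * s2 / chi2_left

# Take square roots to get the interval for the standard deviation.
sd_low, sd_high = sqrt(var_low), sqrt(var_high)

# Original data have one decimal place, so report two.
print(f"s = {sqrt(s2):.2f}")
print(f"{sd_low:.2f} < sigma < {sd_high:.2f}")
```

The interval comes out to roughly 4.06 < σ < 13.89. It is wide and lopsided around s ≈ 6.31 because the sample has only seven values and the chi-square distribution is skewed.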
68Determining Sample Size
69Example
- Find the minimum sample size needed to be 99%
confident that the IAA sample standard deviation
is within 20% of $\sigma$.