Title: Confidence Intervals: Admitting that Estimates Are Not Exact
1Lesson 2
- Confidence Intervals: Admitting that Estimates Are Not Exact
2Chapter 8: Interval Estimation
- Population Mean: σ Unknown
- Determining the Sample Size
3Overview
- Confidence Interval
- Computed from data
- Has a known probability of including the unknown population parameter being estimated
- Statistical Inference
- An exact probability statement about the population, based on sample data
- Confidence Level (Confidence Coefficient)
- The probability of including the population parameter within the confidence interval
- 95% is the usual standard. Also used: 99%, 99.9%, and 90%
4Interval Estimation of a Population Mean: Large-Sample Case
- Sampling Error
- Probability Statements about the Sampling Error
- Constructing an Interval Estimate
- Large-Sample Case with σ Known
- Calculating an Interval Estimate
- Large-Sample Case with σ Unknown
5Sampling Error
- The absolute value of the difference between an unbiased point estimate and the population parameter it estimates is called the sampling error.
- For the case of a sample mean estimating a population mean, the sampling error is
- Sampling Error = \( |\bar{x} - \mu| \)
6Probability Statements About the Sampling Error
- Knowledge of the sampling distribution of \( \bar{x} \) enables us to make probability statements about the sampling error even though the population mean μ is not known.
- A probability statement about the sampling error is a precision statement.
- α is the level of significance.
7Margin of Error and the Interval Estimate
A point estimator cannot be expected to provide
the exact value of the population parameter.
An interval estimate can be computed by adding
and subtracting a margin of error to the point
estimate.
Point Estimate ± Margin of Error
The purpose of an interval estimate is to
provide information about how close the point
estimate is to the value of the parameter.
8Margin of Error and the Interval Estimate
The general form of an interval estimate of a population mean is
\( \bar{x} \pm \text{Margin of Error} \)
9Confidence Interval
- Estimate of the Parameter ± Margin of Error
10Confidence Interval
- Estimate of the Parameter ± (t or z) × (standard error of the estimate)
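11Probability Statements About the Sampling Error
- Precision Statement
- There is a 1 − α probability that the value of a sample mean will provide a sampling error of \( z_{\alpha/2}\,\sigma_{\bar{x}} \) or less.
[Figure: sampling distribution of \( \bar{x} \) centered at μ, with 1 − α of all values in the middle and an area of α/2 in each tail]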
12Interval Estimation of a Population Mean: σ Known
- With σ Known
- \( \bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} \)
- where \( \bar{x} \) is the sample mean
- 1 − α is the confidence coefficient
- \( z_{\alpha/2} \) is the z value providing an area of α/2 in the upper tail of the standard normal probability distribution
- σ is the population standard deviation
- n is the sample size
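13Interval Estimation of a Population Mean: σ Known
- There is a 1 − α probability that the value of a sample mean will provide a margin of error of \( z_{\alpha/2}\,\sigma/\sqrt{n} \) or less.
[Figure: sampling distribution of \( \bar{x} \) centered at μ, with an area of α/2 in each tail]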
14Interval Estimate of a Population Mean: σ Known
[Figure: sampling distribution of \( \bar{x} \) centered at μ, with an area of α/2 in each tail. Intervals built around three sample means are drawn beneath it: two include μ and one does not.]
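The repeated-sampling idea on this slide can be sketched as a small simulation. This is only an illustration: the population values (μ = 100, σ = 15), the sample size, and the function name `coverage` are assumptions chosen for the demo and do not come from the lesson.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def coverage(mu, sigma, n, conf=0.95, trials=2000, seed=1):
    """Fraction of simulated sigma-known intervals that contain mu."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{alpha/2}
    margin = z * sigma / sqrt(n)                   # same margin for every sample
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = fmean(sample)
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / trials

# Hypothetical population: mu = 100, sigma = 15, samples of size 36.
print(coverage(mu=100, sigma=15, n=36))
```

The printed fraction should land close to the confidence coefficient; most intervals include μ, and roughly a share α of them miss it.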
15Interval Estimate of a Population Mean: σ Known
- Adequate Sample Size
In most applications, a sample size of n ≥ 30 is adequate.
If the population distribution is highly skewed
or contains outliers, a sample size of 50 or
more is recommended.
16Interval Estimate of a Population Mean: σ Known
- Adequate Sample Size (continued)
If the population is not normally distributed
but is roughly symmetric, a sample size as small
as 15 will suffice.
If the population is believed to be at least
approximately normal, a sample size of less than
15 can be used.
17Example: National Discount, Inc.
- National Discount has 260 retail outlets throughout the United States. National evaluates each potential location for a new retail outlet in part on the mean annual income of the individuals in the marketing area of the new location.
- Sampling can be used to develop an interval estimate of the mean annual income for individuals in a potential marketing area for National Discount.
- A sample of size n = 36 was taken. The sample mean, \( \bar{x} \), is $21,100 and the population standard deviation, σ, is $4,500. We will use .95 as the confidence coefficient in our interval estimate.
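18Example: National Discount, Inc.
- Interval Estimate of the Population Mean: σ Known
- With z.025 = 1.96, the margin of error is \( z_{\alpha/2}\,\sigma/\sqrt{n} = 1.96\,(4{,}500/\sqrt{36}) = 1{,}470 \)
- Interval Estimate of μ is
- $21,100 ± $1,470
- or $19,630 to $22,570
- We are 95% confident that the interval contains the population mean.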
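The National Discount interval can be checked numerically. The sketch below assumes only the figures given in the example (sample mean 21,100, σ = 4,500, n = 36, confidence coefficient .95); the function name `z_interval` is illustrative, and the z value is computed with Python's standard library rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, conf=0.95):
    """Interval estimate of a population mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # upper-tail z for alpha/2
    margin = z * sigma / sqrt(n)                   # margin of error
    return xbar - margin, xbar + margin, margin

low, high, margin = z_interval(xbar=21_100, sigma=4_500, n=36)
print(round(margin), round(low), round(high))
```

Rounded to whole dollars, the margin and endpoints agree with the 1,470 and the 19,630 to 22,570 interval reported in the example.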
19Example: National Discount, Inc.
Precision Statement: There is a .95 probability that the value of a sample mean for National Discount will provide a sampling error of $1,470 or less. This is determined as follows: 95% of the sample means that can be observed are within \( \pm 1.96\,\sigma_{\bar{x}} \) of the population mean μ. If \( \sigma_{\bar{x}} = \sigma/\sqrt{n} = 4{,}500/\sqrt{36} = 750 \), then \( 1.96\,\sigma_{\bar{x}} = 1{,}470 \).
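20Example: Restaurant Survey
- n = 100 residents
- \( \bar{x} \) = $23.91 average expenditures
- σ = $11.49 variability of individuals
- \( \sigma_{\bar{x}} = \sigma/\sqrt{n} \) = $1.149 variability of the sample average
- z = 1.960 for 2-sided 95% confidence
- From \( \bar{x} - z\,\sigma_{\bar{x}} = 23.91 - 1.960(1.149) = 21.66 \)
- To \( \bar{x} + z\,\sigma_{\bar{x}} = 23.91 + 1.960(1.149) = 26.16 \)
- We are 95% sure that the unknown population mean expenditure μ is between $21.66 and $26.16 for all N = 77,386 residents in the population
- Even though we only observed 100 people!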
21MOST COMMONLY USED CONFIDENCE LEVELS FOR Z
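The z values for the confidence levels mentioned in the overview (90%, 95%, 99%, 99.9%) can be regenerated rather than looked up. This is a sketch using the standard library's `NormalDist`; the helper name `z_critical` is an assumption for illustration.

```python
from statistics import NormalDist

def z_critical(conf):
    """Two-sided z value: area (1 - conf) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)

for conf in (0.90, 0.95, 0.99, 0.999):
    print(f"{conf:.1%} confidence: z = {z_critical(conf):.3f}")
```

To three decimals these are 1.645, 1.960, 2.576, and 3.291.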
22Interval Estimation of a Population Mean: σ Unknown
- If an estimate of the population standard deviation σ cannot be developed prior to sampling, we use the sample standard deviation s to estimate σ.
- This is the σ unknown case.
- In this case, the interval estimate for μ is based on the t distribution.
- (We'll assume for now that the population is normally distributed.)
23t Distribution
- The t distribution is a family of similar probability distributions.
- A specific t distribution depends on a parameter known as the degrees of freedom.
- As the number of degrees of freedom increases, the difference between the t distribution and the standard normal probability distribution becomes smaller and smaller.
- A t distribution with more degrees of freedom has less dispersion.
- The mean of the t distribution is zero.
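Two of these properties can be made concrete with a short sketch. It assumes upper-tail .025 critical values copied from a standard t table (only the df = 15 value, 2.131, appears in this lesson) and uses the known variance of a t distribution, df / (df − 2) for df > 2. The names `T_025` and `t_variance` are illustrative.

```python
from statistics import NormalDist

# Assumed inputs: t.025 values from a standard t table, keyed by degrees of freedom.
T_025 = {5: 2.571, 15: 2.131, 30: 2.042, 120: 1.980}
z_025 = NormalDist().inv_cdf(0.975)   # standard normal counterpart, about 1.960

def t_variance(df):
    """Variance of a t distribution; defined only for df > 2."""
    if df <= 2:
        raise ValueError("variance is defined only for df > 2")
    return df / (df - 2)

for df, t in T_025.items():
    print(f"df={df:>3}  t.025={t:.3f}  gap to z={t - z_025:.3f}  "
          f"variance={t_variance(df):.3f}")
```

Both the gap to the z value and the variance shrink as the degrees of freedom grow, with the variance approaching 1, the variance of the standard normal distribution.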
24t Distribution
Standard normal z values
25Interval Estimation of a Population Mean with σ Unknown
- Interval Estimate
- \( \bar{x} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}} \)
- where 1 − α = the confidence coefficient
- \( t_{\alpha/2} \) = the t value providing an area of α/2 in the upper tail of a t distribution with n − 1 degrees of freedom
- s = the sample standard deviation
26Example: Apartment Rents
- Interval Estimation of a Population Mean with σ Unknown
- A reporter for a student newspaper is writing an article on the cost of off-campus housing. A sample of 16 one-bedroom units within a half-mile of campus resulted in a sample mean of $650 per month and a sample standard deviation of $55.
- Let us provide a 95% confidence interval estimate of the mean rent per month for the population of one-bedroom units within a half-mile of campus. We'll assume this population to be normally distributed.
27Example: Apartment Rents
- At 95% confidence, α = .05, and α/2 = .025.
t.025 is based on n − 1 = 16 − 1 = 15 degrees of freedom.
In the t distribution table we see that t.025 = 2.131.
28Example: Apartment Rents
- Interval Estimation of a Population Mean: Small-Sample Case (n < 30) with σ Unknown
- \( \bar{x} \pm t_{.025}\,\dfrac{s}{\sqrt{n}} = 650 \pm 2.131\,\dfrac{55}{\sqrt{16}} = 650 \pm 29.30 \)
- We are 95% confident that the mean rent per month for the population of one-bedroom units within a half-mile of campus is between $620.70 and $679.30.
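The apartment-rent interval can be reproduced in a few lines. This sketch takes the table value t.025 = 2.131 (15 degrees of freedom) as an input, since Python's standard library has no t distribution; the function name `t_interval` is illustrative.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Interval estimate of a population mean when sigma is unknown.

    t_crit is the table value t_{alpha/2} for n - 1 degrees of freedom.
    """
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin, margin

low, high, margin = t_interval(xbar=650, s=55, n=16, t_crit=2.131)
print(f"margin = {margin:.2f}, interval = {low:.2f} to {high:.2f}")
```

The margin works out to about 29.30, which gives the 620.70 to 679.30 interval stated in the example.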
29Excel Example
- Let's suppose we took a random sample of 40 students from a population of 44,000 students at UCF. One of the variables the professor is interested in is AGE. The following is the distribution of the random sample for the variable AGE.
30Descriptive Statistics to Summarize a Variable
- Variable Name
- Number of Observations
- Lowest Value
- Mean
- Median
- Standard Deviation
- Standard Error
- Maximum Value
- 1st Quartile
- 3rd Quartile.
32Select Tools from the menu bar.
Then select Data Analysis from the pull-down menu.
33From the Data Analysis box, select Descriptive Statistics and then click the OK button.
34Enter the data range in the Input Range box.
Check the Summary statistics and Confidence Level for Mean boxes.
35Excel
36Descriptive Statistics to Summarize a Variable
- Variable Name: AGE
- Number of Observations: 40
- Lowest Value: 17
- Mean: 24.475
- Median: 22.5
- Standard Deviation: 6.11
- Standard Error: 0.96
- Maximum Value: 45
- 1st Quartile: QUARTILE(A2:A41,1) = 19.75
- 3rd Quartile: QUARTILE(A2:A41,3) = 28
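The same summary can be sketched outside Excel. The 40 ages from the lesson are not reproduced in this transcript, so the list below is hypothetical data used only to show the calculation. The `method="inclusive"` option is chosen on the assumption that it matches the interpolation used by Excel's QUARTILE function.

```python
from math import sqrt
from statistics import mean, median, quantiles, stdev

def describe(values):
    """Summary statistics in the spirit of Excel's Descriptive Statistics tool."""
    n = len(values)
    s = stdev(values)                                   # sample standard deviation
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return {
        "n": n,
        "min": min(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": s,
        "std_error": s / sqrt(n),                       # standard error of the mean
        "max": max(values),
        "q1": q1,
        "q3": q3,
    }

ages = [17, 18, 19, 20, 22, 23, 25, 28, 30, 45]         # hypothetical sample
for name, value in describe(ages).items():
    print(f"{name:>9}: {value}")
```

The standard error is the standard deviation divided by the square root of n, which is how the 0.96 in the Excel output relates to 6.11 and n = 40.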
37Detecting Outliers Using IQR
- IQR = 3rd Quartile − 1st Quartile
- The lower limit is located 1.5(IQR) below Q1.
- The upper limit is located 1.5(IQR) above Q3.
- Data outside these limits are considered outliers.
38Example: Outliers in the AGE Variable
- IQR = 28 − 19.75 = 8.25
- Lower Limit = Q1 − 1.5(IQR) = 19.75 − 1.5(8.25) = 7.375
- Upper Limit = Q3 + 1.5(IQR) = 28 + 1.5(8.25) = 40.375
- There might be 1 outlier (a value greater than 40.375): AGE = 45.
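The outlier limits follow directly from the two quartiles. A minimal sketch, using Q1 = 19.75 and Q3 = 28 from the Excel output and checking only the reported minimum (17) and maximum (45), since the full sample is not listed; the function name `iqr_limits` is illustrative.

```python
def iqr_limits(q1, q3, k=1.5):
    """Lower and upper outlier limits located k * IQR beyond the quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

lower, upper = iqr_limits(q1=19.75, q3=28)
print(lower, upper)

# Only the extremes are known from the summary output.
for age in (17, 45):
    flag = "possible outlier" if age < lower or age > upper else "within limits"
    print(age, flag)
```

The limits are 7.375 and 40.375, so the minimum of 17 is within limits and the maximum of 45 is flagged.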
39Confidence Interval
- Excel is providing us with the information we need to build a 95% confidence interval estimate of the mean.
- Margin of Error = 1.954, so the interval estimate is 24.475 ± 1.954
- Therefore, we are 95% confident that the population mean for the variable AGE is between 22.521 and 26.429.
40Sample Size for an Interval Estimate of a Population Mean
- Let E = the maximum sampling error
- E is the amount added to and subtracted from the point estimate to obtain an interval estimate.
- E is often referred to as the margin of error.
- We have \( E = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} \)
- Solving for n we have \( n = \dfrac{(z_{\alpha/2})^2\,\sigma^2}{E^2} \)
41Continued
- If σ is unknown, we should use the sample variance \( s^2 \) in place of \( \sigma^2 \).
42Example: National Discount, Inc.
- Sample Size for an Interval Estimate of a Population Mean
- Suppose that National's management team wants an estimate of the population mean such that there is a .95 probability that the sampling error is $500 or less.
- How large a sample size is needed to meet the required precision?
43Example: National Discount, Inc.
- Sample Size for Interval Estimate of a Population Mean
- At 95% confidence, z.025 = 1.96.
- Recall that σ = 4,500.
- Solving for n we have \( n = \dfrac{(1.96)^2 (4{,}500)^2}{(500)^2} = 311.17 \), which rounds up to 312
- We need to sample 312 to reach a desired precision of $500 at 95% confidence.
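The sample-size calculation can be sketched as a small function. It assumes the values stated in the example (z = 1.96, σ = 4,500, E = 500) and rounds up, because a fractional observation cannot be sampled; the function name `sample_size_for_mean` is illustrative.

```python
from math import ceil

def sample_size_for_mean(z, sigma, E):
    """Smallest n whose margin of error z * sigma / sqrt(n) is at most E."""
    return ceil((z * sigma / E) ** 2)

print(sample_size_for_mean(z=1.96, sigma=4_500, E=500))
```

Rounding up rather than to the nearest integer guarantees the margin of error does not exceed E; here the raw value of about 311.17 becomes 312.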
44Interval Estimation of a Population Proportion
The general form of an interval estimate of a population proportion is
\( \bar{p} \pm \text{Margin of Error} \)
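45Interval Estimation of a Population Proportion
46Interval Estimation of a Population Proportion
[Figure: normal approximation of the sampling distribution of \( \bar{p} \), centered at p, with an area of α/2 in each tail]
47Interval Estimation of a Population Proportion
- Interval Estimate: \( \bar{p} \pm z_{\alpha/2}\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}} \)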
48Interval Estimation of a Population Proportion
- Example: Political Science, Inc.
- Political Science, Inc. (PSI) specializes in voter polls and surveys designed to keep political office seekers informed of their position in a race.
- Using telephone surveys, PSI interviewers ask registered voters who they would vote for if the election were held that day.
49Interval Estimation of a Population Proportion
- Example: Political Science, Inc.
In a current election campaign, PSI has just found that 220 registered voters, out of 500 contacted, favor a particular candidate. PSI wants to develop a 95% confidence interval estimate for the proportion of the population of registered voters that favor the candidate.
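50Interval Estimation of a Population Proportion
\( \bar{p} \pm z_{\alpha/2}\sqrt{\dfrac{\bar{p}(1-\bar{p})}{n}} \), where n = 500, \( \bar{p} \) = 220/500 = .44, and z.025 = 1.96
\( .44 \pm 1.96\sqrt{\dfrac{.44(1-.44)}{500}} = .44 \pm .0435 \)
PSI is 95% confident that the proportion of all voters that favor the candidate is between .3965 and .4835.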
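The PSI interval can be verified with a short sketch. It assumes the counts from the example (220 of 500) and the table value z = 1.96 for 95% confidence; the function name `proportion_interval` is illustrative.

```python
from math import sqrt

def proportion_interval(successes, n, z):
    """Interval estimate of a population proportion (normal approximation)."""
    p_bar = successes / n                          # sample proportion
    margin = z * sqrt(p_bar * (1 - p_bar) / n)     # margin of error
    return p_bar - margin, p_bar + margin

low, high = proportion_interval(successes=220, n=500, z=1.96)
print(f"{low:.4f} to {high:.4f}")
```

To four decimals the endpoints are .3965 and .4835, matching the interval stated for PSI.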
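51Sample Size for an Interval Estimate of a Population Proportion
Let E = the desired margin of error, so that \( E = z_{\alpha/2}\sqrt{\dfrac{p(1-p)}{n}} \)
Solving for the necessary sample size, we get
\( n = \dfrac{(z_{\alpha/2})^2\, p(1-p)}{E^2} \)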
52Sample Size for an Interval Estimate of a Population Proportion
The planning value p can be chosen by
1. Using the sample proportion from a previous sample of the same or similar units, or
2. Selecting a preliminary sample and using the sample proportion from this sample.
53Sample Size for an Interval Estimate of a Population Proportion
- Suppose that PSI would like a .99 probability that the sample proportion is within ±.03 of the population proportion.
- How large a sample size is needed to meet the required precision? (A previous sample of similar units yielded .44 for the sample proportion.)
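54Sample Size for an Interval Estimate of a Population Proportion
At 99% confidence, z.005 = 2.576.
\( n = \dfrac{(2.576)^2 (.44)(.56)}{(.03)^2} = 1816.7 \), which rounds up to 1817
A sample of size 1817 is needed to reach a desired precision of ±.03 at 99% confidence.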
55Sample Size for an Interval Estimate of a Population Proportion
Note: We used .44 as the best estimate of p in the preceding expression. If no information is available about p, then .5 is often assumed because it provides the highest possible sample size. If we had used p = .5, the formula would have given n ≈ 1843.3, so the recommended n would have been 1844 after rounding up.
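The effect of the planning value can be seen by evaluating the sample-size formula over several values of p. This sketch assumes z = 2.576 for 99% confidence (a standard table value) and E = .03 from the example; the function name `sample_size_for_proportion` is illustrative. The unrounded value is returned alongside the rounded-up one so the rounding step stays visible.

```python
from math import ceil

def sample_size_for_proportion(z, p, E):
    """Return (unrounded n, n rounded up) for planning value p and margin E."""
    raw = z ** 2 * p * (1 - p) / E ** 2
    return raw, ceil(raw)

for p in (0.30, 0.44, 0.50, 0.60):
    raw, n = sample_size_for_proportion(z=2.576, p=p, E=0.03)
    print(f"p = {p:.2f}: unrounded n = {raw:.2f}, rounded up n = {n}")
```

For p = .44 the rounded-up result is 1817. The unrounded value peaks at p = .5 (about 1843.27) because p(1 − p) is largest there, which is why .5 is the conservative choice when nothing is known about p.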
56End of Lesson 2