Title: Sampling Distributions for a Mean
1 Sampling Distributions for a Mean
- A sample mean x̄ is a statistic, so it has a (sampling) distribution.
- Parameters such as population means don't have a distribution.
2 Word Lengths: Gettysburg Address
- The mean length is μ = 4.295.
- The standard deviation is σ = 2.123.
- Not Normal. Right skewed. The standard deviation isn't helpful for finding probabilities.
3 Sampling Distribution, n = 5
- The sample mean has a different distribution. It's called the sampling distribution of the sample mean (for n = 5).
- The mean of these sample means is 4.165 (pretty close).
- The standard deviation of these sample means is 0.927 (less variable).
- It fills the number line more completely.
- The shape is unknown, but appears closer to Normal.
4 Sampling Distribution, n = 5
- The mean of these sample means is 4.302 (very close).
- The standard deviation of these sample means is 0.930.
- The shape is close to Normal (but not Normal: there's right skew).
5 Sampling Distribution
- Given: A quantitative population with
  - mean μ
  - standard deviation σ
- A random sample from the population, where the population is at least 20 times larger than the sample. (Independent trials.)
- Statistic: The sample mean x̄.
- This statistic is an unbiased estimate of the parameter μ.
6 Sampling Distribution Results
- The distribution of the sample mean has
  - mean μ (this is what "unbiased" means)
  - standard deviation σ/√n
  - shape closer to Normal (but not necessarily Normal)
The book calls the standard deviation σ/√n the standard error of the (sample) mean.
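A short derivation of the mean μ and standard deviation σ/√n of the sample mean, sketched under the assumption that the observations X_1, ..., X_n are independent draws from the population (the notation X_i is introduced here and is not from the slides):

```latex
\begin{align*}
E(\bar{x}) &= \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n}\cdot n\mu = \mu, \\
\operatorname{Var}(\bar{x}) &= \frac{1}{n^{2}}\sum_{i=1}^{n} \operatorname{Var}(X_i)
  = \frac{1}{n^{2}}\cdot n\sigma^{2} = \frac{\sigma^{2}}{n}, \\
\operatorname{SD}(\bar{x}) &= \sqrt{\frac{\sigma^{2}}{n}} = \frac{\sigma}{\sqrt{n}}.
\end{align*}
```

Independence is only needed for the variance line. This is why the slides ask for a population at least 20 times larger than the sample.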
7 Sampling Distribution, n = 5
- Example
  - Sample means from samples of size n = 5 have
    - mean 4.295
    - standard deviation 2.123/√5 = 0.949
    - shape closer to Normal (but not Normal: a bit right skewed)
8 Sampling Distribution, n = 5
- The mean of these sample means is 4.302 (very close to 4.295).
- The standard deviation of these sample means is 0.930 (very close to 0.949).
- The shape is close to Normal (but not Normal: there's right skew).
9 Sampling Distribution, n = 10
- Example
  - Sample means from samples of size n = 10 have
    - mean 4.295
    - standard deviation 2.123/√10 = 0.671
    - shape closer to Normal
10 Sampling Distribution, n = 10
- The mean of these sample means is 4.305 (very close).
- The standard deviation of these sample means is 0.658.
- The shape is very close to Normal (just a little right skew, not enough to fuss over).
11 Sampling Distribution
- Example
  - Sample means from samples of size n = 10 have
    - mean 4.295
    - standard deviation 2.123/√10 = 0.671
    - shape closer to Normal
    - very close: close enough that a Normal could be used for probabilities
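The standard errors for the word-length examples can be checked with a few lines of Python. This is a minimal sketch: the helper name `standard_error` is made up for illustration, and only σ = 2.123 comes from the slides.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


SIGMA = 2.123  # population s.d. of word lengths, from the slides

se_5 = standard_error(SIGMA, 5)
se_10 = standard_error(SIGMA, 10)

print(f"n = 5:  standard error = {se_5:.3f}")
print(f"n = 10: standard error = {se_10:.3f}")
```

The two values can be compared with the simulated standard deviations reported on the slides (0.930 for n = 5 and 0.658 for n = 10).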
12 Distribution of the Sample Mean
- Given
  - A variable with population that is not Normally distributed with mean μ and standard deviation σ.
  - A random sample of size n.
- Result
  - The sample mean has approximate Normal distribution with mean μ and standard deviation σ/√n.
Assume the population size is at least 20 times n.
13 Example
- Rolls of paper leave a factory with weights that are Normal with mean μ = 1493 lbs and standard deviation σ = 12 lbs.
14 Finding probabilities
- What is the probability a roll weighs over 1500 lbs?
- ANS: 0.2798
- (about 28% of rolls exceed 1500 lbs)
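As a rough check of the single-roll probability, here is a sketch using Python's standard-library `statistics.NormalDist`. The variable names are illustrative; the mean 1493 and standard deviation 12 are the values given for the rolls.

```python
from statistics import NormalDist

# Weight of a single roll: Normal with mean 1493 lbs and s.d. 12 lbs.
roll = NormalDist(mu=1493, sigma=12)

# P(weight > 1500) is the upper-tail area beyond 1500.
p_over_1500 = 1 - roll.cdf(1500)

print(f"P(single roll > 1500 lbs) = {p_over_1500:.4f}")
```

The result should agree with the slide's answer of about 0.28.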
15 New Question
- A truck transports 8 rolls at a time. The legal weight limit for the truck is 12,000 lbs. What is the probability 8 rolls have total weight exceeding this limit?
- Since 12000/8 = 1500, the question could also be phrased:
- What is the probability 8 rolls have (sample) mean weight exceeding 1500?
- The bad news: The answer is not 0.2798.
- The good news: It's not that tough.
16 Distribution of the Sample Mean
- Given
  - A variable with population that is Normally distributed with mean μ and standard deviation σ.
  - A random sample of size n. (N/n ≥ 20)
- Result
  - The sample mean has Normal distribution with mean μ and standard deviation σ/√n.
The standard deviation σ/√n is called the standard error of the sample mean.
17 Example - continued
- Rolls (single rolls) of paper leave a factory with weights that are Normal with mean μ = 1493 lbs and standard deviation σ = 12 lbs.
- If n = 8 rolls are randomly selected, what is the probability their sample mean weight exceeds 1500?
- The distribution of the sample mean is Normal, with mean 1493 and standard deviation 12/√8 = 4.243.
18 Finding probabilities
- Find the probability the sample mean is over 1500 lbs.
- Here we're using the same mean, but a standard deviation reduced to 12/√8 = 4.243.
- ANS: 0.0495
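The same calculation for the sample mean of 8 rolls can be sketched in Python. The only change from the single-roll case is that the standard deviation is replaced by the standard error σ/√n. Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1493, 12, 8  # values from the slides

# Standard error of the sample mean for n = 8 rolls.
se = sigma / math.sqrt(n)

# The sample mean is Normal with the same mean and the reduced s.d.
sample_mean = NormalDist(mu=mu, sigma=se)
p_mean_over_1500 = 1 - sample_mean.cdf(1500)

print(f"standard error = {se:.3f}")
print(f"P(sample mean > 1500 lbs) = {p_mean_over_1500:.4f}")
```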
19 Interpreting the Result
The probability the sample mean for 8 rolls exceeds 1500 lbs is 0.0495. For 4.95% of all possible samples of 8 rolls, the sample mean exceeds 1500 lbs.
Equivalent: There is a 0.0495 probability that the total weight will exceed 8 × 1500 = 12,000 lbs.
Again, we are working with a sampling distribution for a statistic. The statistic here is the sample mean. We're working towards using the sample mean as an estimate of the population mean.
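The equivalence between the total-weight question and the sample-mean question can be checked numerically. This sketch relies on a standard fact not spelled out on the slides: the total of n independent Normal weights is Normal with mean nμ and standard deviation σ√n. Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1493, 12, 8  # values from the slides

# Total weight of 8 independent rolls: mean n*mu, s.d. sigma*sqrt(n).
total = NormalDist(mu=n * mu, sigma=sigma * math.sqrt(n))
p_total_over_limit = 1 - total.cdf(12000)

# Sample mean of 8 rolls: mean mu, s.d. sigma/sqrt(n).
mean = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))
p_mean_over_1500 = 1 - mean.cdf(1500)

print(f"P(total > 12000) = {p_total_over_limit:.4f}")
print(f"P(mean  > 1500)  = {p_mean_over_1500:.4f}")
```

Both lines should print the same probability, because dividing the total by 8 rescales the mean and the standard deviation by the same factor.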
20 The Picture
[Figure: distribution of sample mean weights for samples of 8 rolls, alongside the distribution of weights of single rolls.]
21 Example
- Waiting times between customer arrivals at a service center have Exponential distribution with mean μ = 2 minutes.
- So σ = 2 minutes. (Formula sheet.)
- What can we say about the sample mean waiting time for n customers?
- As n gets larger, the distribution gets closer to Normal.
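A small simulation can illustrate this behaviour. The sketch below draws Exponential waiting times with mean 2 minutes and looks at sample means for n = 4, 16 and 64. The number of repetitions, the seed and the helper names are arbitrary choices made for illustration, not part of the original example.

```python
import math
import random
import statistics


def simulate_sample_means(n, reps=5000, mean_wait=2.0, seed=1):
    """Return `reps` sample means, each from n Exponential waiting times."""
    rng = random.Random(seed)
    rate = 1 / mean_wait  # expovariate takes the rate, i.e. 1 / mean
    return [
        statistics.fmean(rng.expovariate(rate) for _ in range(n))
        for _ in range(reps)
    ]


def skewness(values):
    """Average cubed z-score: near 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


results = {}
for n in (4, 16, 64):
    means = simulate_sample_means(n)
    results[n] = (
        statistics.fmean(means),   # should be near mu = 2
        statistics.pstdev(means),  # should be near sigma / sqrt(n)
        skewness(means),           # should shrink toward 0 as n grows
    )
    print(n, [round(x, 3) for x in results[n]], "theory s.d.:", round(2 / math.sqrt(n), 3))
```

The skewness column is one rough way to see the shape getting closer to Normal: it should fall steadily as n increases.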
22 The Picture
[Figure: distributions of single values and of the sample mean for n = 4, n = 16, and n = 64.]
23 The 20 Times Rule
The standard deviation σ/√n is not close enough when the population size is less than 20 times n: it is too large. (There is an adjustment.)
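The slide does not say which adjustment it has in mind. A common one, assumed here, is the finite population correction, which multiplies σ/√n by √((N − n)/(N − 1)) when sampling without replacement from a population of size N. The function name and the population sizes below are hypothetical.

```python
import math


def standard_error(sigma, n, population_size=None):
    """sigma / sqrt(n), optionally with the finite population correction."""
    se = sigma / math.sqrt(n)
    if population_size is not None:
        # Finite population correction (assumed to be the slide's "adjustment").
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


sigma, n = 2.123, 5  # word-length s.d. from the slides, sample size 5

for N in (20, 100, 1000):  # hypothetical population sizes
    print(N, round(standard_error(sigma, n, N), 3))
print("no correction:", round(standard_error(sigma, n), 3))
```

When N is at least 20 times n, the correction factor is close to 1, which is consistent with the slides ignoring it in that case.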
24 Distribution of the Sample Mean
- Given
  - A variable with population that is not Normally distributed with mean μ and standard deviation σ.
  - A random sample of size n.
- Result
  - The sample mean has approximate Normal distribution with mean μ and standard deviation σ/√n.
Assume the population size is at least 20 times n.
25 Distribution of the Sample Mean
- Given
  - A variable with population that is not Normally distributed with mean μ and standard deviation σ.
  - A random sample of size n.
- Result
  - The sample mean has generally unknown distribution with mean μ and standard deviation σ/√n.
26 Distribution of the Sample Mean
Central Limit Theorem (CLT)
- Given
  - A variable with population that is not Normally distributed with mean μ and standard deviation σ.
  - A random sample of size n, where n is sufficiently large.
- Result
  - The sample mean has approximate Normal distribution with mean μ and standard deviation σ/√n.
27 What is Sufficiently Large?
- Your book says generally n at least 30.
- If the population is fairly symmetric without outliers, considerably less than 30 will do the trick.
- If the population is highly skewed, or not unimodal, considerably more than 30 may be required.
- If the population is Normal, then sample size is not a concern: the sample mean is Normal.
- You may use the 30 rule if you recognize that it's not that black and white, and that for Normal populations, n = 1 is sufficiently large.
28 Example
- The Census Bureau reports the average age at death for female Americans is 79.7 years, with standard deviation 14.5 years.
- μ = 79.7 years, σ = 14.5 years
- I looked at a few recent obituaries in the Oswego Daily News (online):
- 79 70 48 99 85 71 45
29 Example
- Consider random samples of n = 7 U.S. women. What is the distribution of the sample mean?
30 Example
- μ = 79.7 years, σ = 14.5 years
- The distribution of the sample mean has mean 79.7 and standard deviation 14.5/√7 = 5.48.
- Our sample has x̄ = 71.00.
- How does this fit in?
- Z = (71.00 - 79.7) / 5.48 = -1.59
- This suggests 71.00 is somewhat (but not very) unusually low.
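The arithmetic for the seven obituaries can be reproduced as follows. This is a minimal sketch with illustrative variable names, using the seven ages and the Census Bureau figures quoted earlier.

```python
import math
import statistics

ages = [79, 70, 48, 99, 85, 71, 45]  # the seven obituaries
mu, sigma = 79.7, 14.5               # Census Bureau figures

n = len(ages)
xbar = statistics.fmean(ages)   # sample mean
se = sigma / math.sqrt(n)       # standard error of the sample mean
z = (xbar - mu) / se            # how many standard errors below mu

print(f"x-bar = {xbar:.2f}, standard error = {se:.2f}, Z = {z:.2f}")
```

A Z near -1.6 is somewhat low but not extreme. Turning it into a probability with the Normal curve is a separate step that needs the sample mean to be approximately Normal.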
36 Example
- The Normal shouldn't be used here (why not?)
- Distribution of longevity: μ ≈ 80, σ ≈ 15
- If Normal:
  - Within 1 s.d. (65, 95): ≈ 68%
  - Within 2 s.d.s (50, 110): ≈ 95%
  - Above 110: ≈ 2.5%
  - 1 in 40 ???
- No way! The distribution is not Normal.
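The "if Normal" percentages can be checked against an exact Normal model with mean 80 and standard deviation 15. This is a sketch with illustrative variable names. The slide's 68%, 95% and 2.5% are the usual rule-of-thumb values, so the exact figures differ slightly.

```python
from statistics import NormalDist

# Hypothetical model: age at death Normal with mean 80 and s.d. 15.
longevity = NormalDist(mu=80, sigma=15)

within_1_sd = longevity.cdf(95) - longevity.cdf(65)
within_2_sd = longevity.cdf(110) - longevity.cdf(50)
above_110 = 1 - longevity.cdf(110)

print(f"within 1 s.d. (65, 95):  {within_1_sd:.3f}")
print(f"within 2 s.d. (50, 110): {within_2_sd:.3f}")
print(f"above 110:               {above_110:.4f}  (about 1 in {1 / above_110:.0f})")
```

Either way, the model would put a few percent of deaths above age 110, which is not believable. That is the sense in which the distribution is not Normal.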
39 Example
- The Normal shouldn't be used here (why not?)
- The distribution of age at death is not Normal. It is quite left skewed.
- The sample size is not sufficiently large. (At least 30 by your book, although for this situation your instructor would probably buy into as low as 20.)
- So the Central Limit Theorem can't be applied.
- The sample mean doesn't have approximate Normal distribution.
40 Example
- I looked at 41 more recent obituaries (total of 48):
- 79 70 48 99 85 71 45
- more data:
- 87 75 90 95 51 99 69
- 71 49 93 80 89 77 72
- 101 69 92 92 86 78 92
- 89 91 81 74 68 89 92
- 64 71 50 81 88 42 91
- 44 51 85 81 92 93
42 Example
- What is the distribution of the sample mean of samples of size n = 48?
- Even though age at death is left skewed, with n = 48 (large enough) the Central Limit Theorem applies, and the sample mean has approximate Normal distribution.
43 Example
- The sample mean is approximately Normal, with mean 79.7 and standard deviation 14.5/√48 = 2.09.
- My data: x̄ = 77.52.
- Find the probability that a random sample of 48 U.S. women's deaths gives a sample mean of 77.52 or less.
- Z = (77.52 - 79.7) / 2.09 = -2.18 / 2.09 = -1.04
- Probability = 0.1492
- About 15% of all samples of 48 deaths give a sample mean of 77.52 or less.
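The calculation for all 48 obituaries can be reproduced from the listed ages. This is a sketch with illustrative variable names. It carries full precision rather than rounding Z to two decimals, so the probability may differ from the table-based answer in the last digit or two.

```python
import math
import statistics
from statistics import NormalDist

ages = [
    79, 70, 48, 99, 85, 71, 45,
    87, 75, 90, 95, 51, 99, 69,
    71, 49, 93, 80, 89, 77, 72,
    101, 69, 92, 92, 86, 78, 92,
    89, 91, 81, 74, 68, 89, 92,
    64, 71, 50, 81, 88, 42, 91,
    44, 51, 85, 81, 92, 93,
]
mu, sigma = 79.7, 14.5  # Census Bureau figures

n = len(ages)
xbar = statistics.fmean(ages)
se = sigma / math.sqrt(n)
z = (xbar - mu) / se

# With n = 48 the CLT justifies treating the sample mean as approx. Normal.
p_this_low_or_lower = NormalDist().cdf(z)

print(f"n = {n}, x-bar = {xbar:.2f}, standard error = {se:.2f}")
print(f"Z = {z:.2f}, P(sample mean <= x-bar) = {p_this_low_or_lower:.4f}")
```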
44 Example
- The Census Bureau reports the average age at death for Americans is 79.7 years. My data on deaths for this region gave a mean of 77.52.
- What most reasonably accounts for this difference of over 2 years?
- CHANCE
- There's a 15% probability of a result this low or lower for a sample drawn randomly from such a population. There's a 30% probability of a result as far or farther from 79.7 (in either direction).
45 Example
- The Census Bureau reports the average age at death for Americans is 79.7 years. My data on deaths for this region gave a mean of 77.52.
- What most reasonably accounts for this difference of over 2 years?
- H0: μ_Oswego = 79.7    H1: μ_Oswego ≠ 79.7
- There's a 15% probability of a result this low or lower for a sample drawn randomly from such a population. There's a 30% probability of a result as far or farther from 79.7 (in either direction).
P-value!
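The 15% and 30% figures correspond to one-sided and two-sided tail probabilities for the observed sample mean. The sketch below computes both, assuming the approximately Normal sampling distribution justified by n = 48. Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0, sigma, n = 79.7, 14.5, 48  # hypothesized mean, s.d., sample size
xbar = 77.52                    # observed sample mean from the slides

se = sigma / math.sqrt(n)
z = (xbar - mu0) / se

standard_normal = NormalDist()  # mean 0, s.d. 1

# "This low or lower": lower-tail area.
p_lower = standard_normal.cdf(z)

# "As far or farther in either direction": both tails.
p_two_sided = 2 * standard_normal.cdf(-abs(z))

print(f"Z = {z:.2f}")
print(f"one-sided probability = {p_lower:.3f}")
print(f"two-sided probability = {p_two_sided:.3f}")
```

The two-sided value is just double the one-sided value because the Normal curve is symmetric.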