Title: Describing Distributions with Numbers
1Chapter 12
- Describing Distributions with Numbers
2Thought Question 1
If you were to read the results of a study
showing that daily use of a certain exercise
machine resulted in an average 10-pound weight
loss, what more would you want to know about the
numbers in addition to the average? (Hint: Do you
think everyone who used the machine lost 10
pounds?)
3Thought Question 2
A recent newspaper article in California said
that the median price of single-family homes sold
in the past year in the local area was $136,000
and the average price was $149,160. How do you
think these values are computed? Which do you
think is more useful to someone considering the
purchase of a home, the median or the average?
4Thought Question 3
The Stanford-Binet IQ test is designed to have a
mean, or average, for the entire population, of
100. It is also said to have a standard
deviation of 16. What do you think is meant by
the term standard deviation?
5Turning Data Into Information
- Center of the data
- mean
- median
- mode
- Variation
- variance
- standard deviation
- range
- interquartile range
6Average or Mean
- Traditional measure of center
- Sum the values and divide by the number of values
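A minimal Python sketch of this recipe follows. The weight-loss figures are invented for illustration (they come from no study) and are chosen so that the average is 10 pounds even though nobody lost exactly 10.

```python
# Hypothetical weight losses in pounds for six users (made-up values).
losses = [0, 2, 5, 8, 15, 30]

def mean(values):
    """Sum the values and divide by the number of values."""
    return sum(values) / len(values)

print(mean(losses))  # 10.0
```

The average alone hides the fact that individual losses here range from 0 to 30 pounds.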
7Median (M)
- A resistant measure of the data's center
- At least half of the ordered values are less than
or equal to the median value
- At least half of the ordered values are greater
than or equal to the median value
- If n is odd, the median is the middle ordered
value
- If n is even, the median is the average of the
two middle ordered values
8Median
- Example 1 data: 2 4 6
- Median (M) = 4
- Example 2 data: 2 4 6 8
- Median = 5 (average of 4 and 6)
- Example 3 data: 6 2 4
- Median ≠ 2
- (order the values: 2 4 6, so Median = 4)
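The odd/even rule and the three examples can be sketched in Python as follows. The function name `median` is chosen here for illustration; the key point is that the values are ordered before the middle is located.

```python
def median(values):
    """Middle ordered value (n odd) or average of the two middle ordered values (n even)."""
    ordered = sorted(values)   # always order the values first
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([2, 4, 6]))     # 4
print(median([2, 4, 6, 8]))  # 5.0
print(median([6, 2, 4]))     # 4 (not 2, because the values are ordered first)
```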
9Case Study
Weight Data
STAT 208 Class Survey, Spring 1997
Virginia Commonwealth University
10Weight Data: Measures of Center
10 | 0166
11 | 009
12 | 0034578
13 | 00359
14 | 08
15 | 00257
16 | 555
17 | 000255
18 | 000055567
19 | 245
20 | 3
21 | 025
22 | 0
23 |
24 |
25 |
26 | 0
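Reading each stem as the hundreds and tens digits and each leaf as the units digit (an interpretation assumed here), the display can be expanded into individual weights and summarized. This is a sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

# Stem-and-leaf display transcribed from the slide (stem: hundreds and tens, leaf: units).
stemplot = {
    10: "0166", 11: "009", 12: "0034578", 13: "00359", 14: "08",
    15: "00257", 16: "555", 17: "000255", 18: "000055567", 19: "245",
    20: "3", 21: "025", 22: "0", 23: "", 24: "", 25: "", 26: "0",
}

# Rebuild each weight as stem * 10 + leaf, e.g. stem 10 with leaf 6 gives 106.
weights = [stem * 10 + int(leaf) for stem, leaves in stemplot.items() for leaf in leaves]

print(len(weights))             # 53
print(round(mean(weights), 2))  # 158.75
print(median(weights))          # 165
```

The mean sits below the median here, even with the single large value of 260 in the data.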
11Case Study
Airline fares
Appeared in the New York Times on November 5, 1995
- ...about 60% of airline passengers pay less
than the average fare for their specific
flight.
- How can this be?
13% of passengers pay more than 1.5 times the
average fare for their flight.
12Variance and Standard Deviation
- If all values are the same, then they are all
equal to the mean. There is no variation.
- Variation exists when some values are above or
below the mean.
- Each data value has an associated deviation from
the mean.
13Deviations
- What is a typical deviation from the mean?
- Small values of this typical deviation indicate
small variation in the data.
- Large values of this typical deviation indicate
large variation in the data.
14Variance
- Find the mean
- Find the deviation of each value from the mean
- Square the deviations
- Sum the squared deviations
- Divide the sum by n-1
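The five steps can be sketched in Python, one line per step. The function name `sample_variance` is illustrative, and the data are the four values 2, 4, 6, 8 from the median examples.

```python
def sample_variance(values):
    n = len(values)
    mean = sum(values) / n                   # 1. find the mean
    deviations = [x - mean for x in values]  # 2. deviation of each value from the mean
    squared = [d ** 2 for d in deviations]   # 3. square the deviations
    total = sum(squared)                     # 4. sum the squared deviations
    return total / (n - 1)                   # 5. divide the sum by n-1

data = [2, 4, 6, 8]
# mean = 5, deviations = -3, -1, 1, 3, squared = 9, 1, 1, 9, sum = 20
print(sample_variance(data))  # 20 / 3, about 6.67
```

The raw deviations always sum to zero (here -3 - 1 + 1 + 3 = 0), which is why they are squared before being summed.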
15Variance Formula
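Written out from the five steps (find the mean, take deviations, square, sum, divide by n-1), the sample variance is as follows. The notation is assumed here: x_i for the individual values, x-bar for their mean, n for the number of values, and s^2 for the variance.

```latex
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2
```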
16Standard Deviation Formula
typical deviation from the mean
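The standard deviation is the square root of the variance, which returns the measure to the original units of the data. In assumed notation (x_i for the values, x-bar for their mean, n for the number of values, s for the standard deviation):

```latex
s = \sqrt{s^2} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2}
```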
17Traditional Summary Statistics: Weight Data
- Mean = 158.75
- Standard deviation = 35.65
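These two figures can be checked against the stem-and-leaf display of the weight data. The sketch below assumes each stem gives the hundreds and tens digits and each leaf the units digit, and uses `statistics.stdev`, which divides by n-1.

```python
from statistics import mean, stdev

# Stem-and-leaf display of the weight data (stem: hundreds and tens, leaf: units).
stemplot = {
    10: "0166", 11: "009", 12: "0034578", 13: "00359", 14: "08",
    15: "00257", 16: "555", 17: "000255", 18: "000055567", 19: "245",
    20: "3", 21: "025", 22: "0", 26: "0",
}
weights = [stem * 10 + int(leaf) for stem, leaves in stemplot.items() for leaf in leaves]

print(round(mean(weights), 2))   # 158.75
print(round(stdev(weights), 2))  # 35.65
```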
18Quartiles
- Three numbers which divide the ordered data into
four equal-sized groups.
- Q1 has 25% of the data below it.
- Q2 has 50% of the data below it. (Median)
- Q3 has 75% of the data below it.
19Quartiles: Uniform Histogram
20Obtaining the Quartiles
- Order the data.
- For Q2, just find the median.
- For Q1, look at the lower half of the data
values (those to the left of the median) and find
the median of this lower half.
- For Q3, look at the upper half of the data
values (those to the right of the median) and find
the median of this upper half.
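This procedure can be sketched in Python as follows. The function names are illustrative, and the sketch assumes that when n is odd the middle value itself belongs to neither half. Statistical software often uses other quartile conventions, so results may differ slightly from library functions.

```python
def median_of_ordered(ordered):
    """Median of a list that is already in order."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    ordered = sorted(values)        # order the data
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]          # values to the left of the median
    upper = ordered[half + n % 2:]  # values to the right of the median
    return median_of_ordered(lower), median_of_ordered(ordered), median_of_ordered(upper)

print(quartiles([2, 4, 6, 8]))               # (3.0, 5.0, 7.0)
print(quartiles([2, 4, 6, 8, 10, 12, 14]))   # (4, 8, 12)
```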
21Weight Data Sorted
22Weight Data Quartiles
- Q1 = 127.5
- Q2 = 165 (Median)
- Q3 = 185
23Weight Data: Quartiles
10 | 0166
11 | 009
12 | 0034578
13 | 00359
14 | 08
15 | 00257
16 | 555
17 | 000255
18 | 000055567
19 | 245
20 | 3
21 | 025
22 | 0
23 |
24 |
25 |
26 | 0
24Five-Number Summary
- minimum = 100
- first quartile = 127.5
- second quartile = 165
- third quartile = 185
- maximum = 260
range = max - min = 260 - 100 = 160
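The summary can be reproduced from the stem-and-leaf display of the weight data. This sketch assumes each stem gives the hundreds and tens digits and each leaf the units digit, and that the quartiles are medians of the halves on either side of the overall median.

```python
def median_of_ordered(ordered):
    """Median of a list that is already in order."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

# Stem-and-leaf display of the weight data (stem: hundreds and tens, leaf: units).
stemplot = {
    10: "0166", 11: "009", 12: "0034578", 13: "00359", 14: "08",
    15: "00257", 16: "555", 17: "000255", 18: "000055567", 19: "245",
    20: "3", 21: "025", 22: "0", 26: "0",
}
weights = sorted(stem * 10 + int(leaf) for stem, leaves in stemplot.items() for leaf in leaves)

half = len(weights) // 2
lower = weights[:half]                      # 26 values below the median
upper = weights[half + len(weights) % 2:]   # 26 values above the median

summary = (weights[0], median_of_ordered(lower), median_of_ordered(weights),
           median_of_ordered(upper), weights[-1])
print(summary)                   # (100, 127.5, 165, 185.0, 260)
print(summary[-1] - summary[0])  # 160
```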
25Five-Number Summary Boxplot
26Outliers
- Affect value of the mean and standard deviation
- Median and five-number summary should be used to
describe center and spread when outliers are
present
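A short sketch of this effect, using invented values: adding one extreme value to a small data set moves the mean and standard deviation a great deal, while the median barely changes.

```python
from statistics import mean, median, stdev

# Hypothetical data (made-up values), without and with a single outlier of 99.
base = [1, 2, 3, 4, 5]
with_outlier = base + [99]

print(mean(base), median(base), round(stdev(base), 2))
# 3 3 1.58
print(mean(with_outlier), median(with_outlier), round(stdev(with_outlier), 2))
# 19 3.5 39.22
```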
27Number of Books Read for Pleasure Sorted
28Five-Number Summary Boxplot
- Median = 3
- interquartile range (IQR) = 5.5 - 1.0 = 4.5
- range = 99 - 0 = 99
Mean = 7.06, s.d. = 14.43
29Number of Books Read for Pleasure
30Key Concepts
- Numerical Summaries
- Center (mean, median)
- Variation (variance, std. dev., range, iqr)
- Five-number summary and boxplots
- Choosing mean versus median
- Choosing standard deviation versus five-number
summary