Measures of Central Tendency - PowerPoint PPT Presentation

About This Presentation
Title:

Measures of Central Tendency

Description:

Mean. Median. Mode. Centracidal Tendencies. Mean Girls. Mean most commonly used measure of center ... The mean is sensitive to extreme (very large or small) ... – PowerPoint PPT presentation

Slides: 29
Provided by: drjason5

Transcript and Presenter's Notes

Title: Measures of Central Tendency


1
Measures of Central Tendency
MARE 250 Dr. Jason Turner
2
Centracidal Tendencies
The measure of central tendency indicates where along the measurement scale the sample or population is located.
It can be determined via various measures.
The three most important are the mean, the median, and the mode.
3
Mean Girls
Mean: the most commonly used measure of center.
It is the sum of the observations divided by the number of observations.
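As a minimal sketch of that definition in Python (the sample values are made up for illustration and are not from the slides):

```python
# Hypothetical sample; the values are invented for illustration.
observations = [2, 4, 4, 5, 7, 9, 11]

# Mean = sum of the observations / number of observations.
mean = sum(observations) / len(observations)
print(mean)  # 6.0
```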
4
The Median
"As we were driving, we saw a sign that said
"Watch for Rocks." Martha said it should read
"Watch for Pretty Rocks." I told her she should
write in her suggestion to the highway
department, but she started saying it was a joke
- just to get out of writing a simple letter! And
I thought I was lazy!" - Jack Handey
The median is typically defined as the middle measurement in an ordered set of data.
It separates the bottom 50% of the data from the top 50%.
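A short Python sketch of finding the median with the standard-library statistics module. The samples are hypothetical. With an even number of observations there is no single middle value; the usual convention, which statistics.median follows, is to average the two middle values.

```python
import statistics

# Hypothetical samples; the function sorts the data itself.
odd_sample = [9, 2, 7, 4, 5]        # ordered: 2 4 5 7 9
even_sample = [9, 2, 7, 4, 5, 11]   # ordered: 2 4 5 7 9 11

median_odd = statistics.median(odd_sample)    # the middle value
median_even = statistics.median(even_sample)  # average of 5 and 7
print(median_odd)   # 5
print(median_even)  # 6.0
```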
5
The Mode
Oh, no way - where?  Holy crap, he's with a
girl! But he's the guy from Depeche Mode! 
That's impossible! Come on, he's in Depeche
Mode! - The Monarch
The mode is typically defined as the most frequently occurring measurement in a set of data.
The mode is useful if the distribution is skewed or bimodal (having two very pronounced values around which data are concentrated).
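One way to find the mode in Python is statistics.multimode (Python 3.8 and later), which returns every value tied for most frequent. The samples below are hypothetical; the second one has two equally common values, a simple stand-in for a bimodal data set.

```python
import statistics

# Hypothetical samples for illustration.
one_peak = [3, 5, 5, 6, 8, 9]
two_peaks = [3, 5, 5, 6, 8, 8, 9]

# multimode returns all values tied for the highest count,
# in the order they first appear in the data.
modes_one = statistics.multimode(one_peak)
modes_two = statistics.multimode(two_peaks)
print(modes_one)  # [5]
print(modes_two)  # [5, 8]
```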
6
You are so totally skewed!
The mean is sensitive to extreme (very large or small) observations and the median is not.
Therefore you can gauge how skewed your data are by looking at the relationship between the median and the mean.
Mean is greater than the median: typically right (positively) skewed
Mean and median are equal: typically symmetric
Mean is less than the median: typically left (negatively) skewed
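The comparison can be sketched in Python as below. The function name skew_direction and the three samples are hypothetical, and the mean-versus-median comparison is only a rule of thumb, not a formal test of skewness.

```python
import math
import statistics

def skew_direction(data):
    """Compare mean and median as a rough indicator of skew."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if math.isclose(mean, median):
        return "roughly symmetric"
    return "right-skewed" if mean > median else "left-skewed"

# Hypothetical samples, each with one extreme value or none.
right = skew_direction([1, 2, 3, 4, 50])    # mean 12, median 3
middle = skew_direction([1, 2, 3, 4, 5])    # mean 3, median 3
left = skew_direction([-40, 6, 7, 8, 9])    # mean -2, median 7
print(right, middle, left, sep=" | ")
```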
7
Resistance Measures
A resistance measure is not sensitive to the influence of a few extreme observations.
The median is a resistant measure of center; the mean is not.
The resistance of the mean can be improved by using trimmed means: a specified percentage of the smallest and largest observations is removed before computing the mean.
We will do something like this later when exploring the data and evaluating outliers (and their effects upon the mean).
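A minimal sketch of a trimmed mean in Python. It assumes the stated percentage is removed from each end of the ordered data and that the count to drop is rounded down; other software may use a different convention. The function name trimmed_mean and the sample are hypothetical.

```python
def trimmed_mean(data, percent):
    """Mean after dropping `percent` % of the observations from each end."""
    if not 0 <= percent < 50:
        raise ValueError("percent must be at least 0 and below 50")
    ordered = sorted(data)
    k = int(len(ordered) * percent / 100)  # observations dropped per tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical sample with one extreme observation (300).
sample = [2, 4, 4, 5, 7, 9, 11, 13, 15, 300]
plain = sum(sample) / len(sample)
trimmed = trimmed_mean(sample, 10)  # drops 2 and 300
print(plain)    # 37.0
print(trimmed)  # 8.5
```

The single extreme value pulls the ordinary mean far above most of the data, while the 10% trimmed mean stays near the bulk of the observations.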
8
How To on Computer
In Minitab, your data must be in a single
column. Go to the 'Stat' menu, and select 'Basic
stats', then 'Display descriptive stats'.
Select your data column in the 'variables' box.
The output will generally go to the session
window, or if you select 'graphical summary' in
the 'graphs' options, it will be given in a
separate window. This will give you a number of
basic descriptive stats, though not the mode.
9
Measures of Dispersion and Variability
MARE 250 Dr. Jason Turner
10
Please Disperse!
Alright everyone, disperse immediately. We are
prepared to use force a-- what, what? We're not
prepared, Eddie? Someone call 911! - Chief Wiggum
Measure of dispersion of the data: an indication of the spread of measurements around the center of the distribution.
Two of the most frequently used: range and standard deviation.
11
The Range
Range: the difference between the highest and lowest values in the observations.
This is useful, but may be misleading when the data have one or more outliers (single measurements that are exceptionally large or small relative to the other data).
It is not relative to the central location.
Range = Max - Min
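A small Python illustration of the range and of how a single outlier can inflate it. The function name value_range and the data are hypothetical.

```python
def value_range(data):
    """Range = Max - Min."""
    return max(data) - min(data)

# Hypothetical measurements, then the same data plus one outlier.
typical = [12, 15, 11, 14, 13]
with_outlier = typical + [95]

range_typical = value_range(typical)
range_outlier = value_range(with_outlier)
print(range_typical)  # 4
print(range_outlier)  # 84
```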
12
The Variance
Variance: the average of the squared deviations from the mean.
It is the most widely used measure of spread, and one that will be used often in various statistical applications.
13
The Variance
Degrees of freedom: the quantity (n - 1).
It is used instead of n to provide an unbiased estimate of the population variance.
As the sample size (n) increases (and n approaches N), the values of the population and sample variance become more similar.
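The two divisors can be compared directly in Python. This sketch computes the sum of squared deviations by hand for a hypothetical sample, then divides once by n and once by the degrees of freedom (n - 1); the standard-library functions statistics.pvariance and statistics.variance use the same two divisors.

```python
import statistics

# Hypothetical sample for illustration.
sample = [2, 4, 4, 5, 7, 9, 11]
n = len(sample)
mean = sum(sample) / n  # 6.0

# Sum of squared deviations from the mean.
ss = sum((x - mean) ** 2 for x in sample)  # 60.0

var_n = ss / n                 # divides by n
var_n_minus_1 = ss / (n - 1)   # divides by the degrees of freedom

print(var_n_minus_1)                # 10.0
print(statistics.variance(sample))  # same divisor, n - 1
print(round(var_n, 3))              # 8.571
```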
14
Standard Deviation
Standard deviation: the positive square root of the variance.
It indicates how far (on average) the observations in the sample are from the mean of the sample.
The more variation in a data set, the larger its standard deviation.
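A brief Python sketch with two hypothetical samples that share the same mean but differ in spread. It takes the square root of the sample variance for one and calls statistics.stdev, which does the same thing, for the other.

```python
import math
import statistics

# Hypothetical samples with the same mean (6) and different spread.
tight = [5, 6, 6, 7]
wide = [1, 6, 6, 11]

# Standard deviation = positive square root of the variance.
sd_tight = math.sqrt(statistics.variance(tight))
sd_wide = statistics.stdev(wide)

print(round(sd_tight, 3))  # 0.816
print(round(sd_wide, 3))   # 4.082
```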
15
Quartiles
The median divides the data into 2 equal parts: bottom 50%, top 50%.
Quartiles divide the data into quarters: 4 equal parts.
A dataset has 3 quartiles.
Q1 is the number that divides the bottom 25% from the top 75%.
Q2 is the median: it divides the bottom 50% from the top 50%.
Q3 is the number that divides the bottom 75% from the top 25%.
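In Python the three quartiles can be obtained with statistics.quantiles (Python 3.8 and later). Statistical packages use several slightly different quartile conventions, so treat this as one reasonable sketch: values from other software may differ a little on the same data. The sample is hypothetical.

```python
import statistics

# Hypothetical sample for illustration.
sample = [1, 3, 5, 7, 9, 11, 13]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = statistics.quantiles(sample, n=4)
print(q1, q2, q3)  # 3.0 7.0 11.0
```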
16
Quartiles
17
Interquartile Range
Interquartile range (IQR): the difference between the first and third quartiles.
IQR = Q3 - Q1
The IQR gives you the range of the middle 50% of the data.
18
Outlier, Outlier
Outliers: observations that fall well outside the overall pattern of the data.
They require special attention.
They may be the result of:
a measurement or recording error
an observation from a different population
an unusual extreme observation
19
Pants on Fire
You must deal with outliers (Yes, really!).
If an outlier is an error, you can delete it; otherwise it is a judgment call.
You can use quartiles and the IQR to identify potential outliers.
20
The Outer Limits
Lower and upper limits
The lower limit is the number that lies 1.5 IQRs below the first quartile: Lower Limit = Q1 - 1.5 × IQR
The upper limit is the number that lies 1.5 IQRs above the third quartile: Upper Limit = Q3 + 1.5 × IQR
21
The Outer Limits
If a value is outside the Outer Limits of a dataset, it is an outlier.
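The limits and the outlier check can be sketched in Python as follows. The function name outer_limits and the sample are hypothetical, and the quartiles come from statistics.quantiles, whose convention may differ slightly from other software.

```python
import statistics

def outer_limits(data):
    """Return (lower limit, upper limit) from Q1, Q3 and the IQR."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical sample with one suspicious value (40).
sample = [1, 3, 5, 7, 9, 11, 40]
lower, upper = outer_limits(sample)

# Any value outside the limits is flagged as a potential outlier.
outliers = [x for x in sample if x < lower or x > upper]
print(lower, upper)  # -9.0 23.0
print(outliers)      # [40]
```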
22
Five-Number Summary
5-number summary: Min, Q1, Q2, Q3, Max.
Written in increasing order.
It provides information on center and variation, and is used to construct boxplots.
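A compact Python sketch of the five-number summary. The function name five_number_summary and the sample are hypothetical; the quartile values depend on the convention used by statistics.quantiles.

```python
import statistics

def five_number_summary(data):
    """Return (Min, Q1, Q2, Q3, Max) in increasing order."""
    q1, q2, q3 = statistics.quantiles(data, n=4)
    return min(data), q1, q2, q3, max(data)

# Hypothetical sample for illustration.
summary = five_number_summary([1, 3, 5, 7, 9, 11, 13])
print(summary)  # (1, 3.0, 7.0, 11.0, 13)
```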
23
Boxplots
Boxplot (Box-and-Whisker-Design): based on the 5-number summary; provides a graphic display of the center and variation.
[Figure: boxplot on a 0 to 70 axis, labeled Min, Q1, Q2, Q3, Max]
24
Boxplots
Modified boxplot: includes outliers.
[Figure: modified boxplot on a 0 to 70 axis, with a potential outlier marked]
Note that Min and Max are determined after outliers are removed!
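One way to compute the pieces of a modified boxplot in Python is sketched below, assuming outliers are identified with the 1.5 IQR limits. The function name modified_boxplot_parts and the sample are hypothetical.

```python
import statistics

def modified_boxplot_parts(data):
    """Return (whisker min, whisker max, outliers) for a modified boxplot."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [x for x in data if lower <= x <= upper]
    outliers = [x for x in data if x < lower or x > upper]
    # Min and Max are taken only from the values that remain.
    return min(inside), max(inside), outliers

# Hypothetical sample with one extreme value (40).
whisker_min, whisker_max, outliers = modified_boxplot_parts(
    [1, 3, 5, 7, 9, 11, 40]
)
print(whisker_min, whisker_max)  # 1 11
print(outliers)                  # [40]
```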
25
Boxplots
26
Boxplots
Boxplots summarize information about the shape,
dispersion, and center of your data. They can
also help you spot outliers. The left edge of the
box represents the first quartile (Q1), while the
right edge represents the third quartile (Q3).
Thus the box portion of the plot represents the
interquartile range (IQR), or the middle 50% of
the observations.
[Figure: boxplot on a 0 to 70 axis, labeled Min, Q1, Q2, Q3, Max]
27
Boxplots
The line drawn through the box represents the median of the data.
The lines extending from the box are called whiskers.
The whiskers extend outward to indicate the lowest and highest values in the data set (excluding outliers).
Extreme values, or outliers, are represented by dots.
A value is considered an outlier if it is outside of the box (greater than Q3 or less than Q1) by more than 1.5 times the IQR.
[Figure: modified boxplot on a 0 to 70 axis, with a potential outlier marked]
28
Boxplots
Use the boxplot to assess the symmetry of the data.
If the data are fairly symmetric, the median line will be roughly in the middle of the IQR box and the whiskers will be similar in length.
If the data are skewed, the median may not fall in the middle of the IQR box, and one whisker will likely be noticeably longer than the other.
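The same symmetry check can be made numerically. This Python sketch measures the two halves of the box and the two whiskers; it assumes the data contain no outliers, so the whiskers run to the minimum and maximum. The function name boxplot_shape and both samples are hypothetical.

```python
import statistics

def boxplot_shape(data):
    """Return the two box halves and the two whisker lengths."""
    q1, q2, q3 = statistics.quantiles(data, n=4)
    return {
        "lower_box": q2 - q1,             # Q1 to the median line
        "upper_box": q3 - q2,             # median line to Q3
        "lower_whisker": q1 - min(data),  # assumes no low outliers
        "upper_whisker": max(data) - q3,  # assumes no high outliers
    }

# Hypothetical samples: one symmetric, one skewed to the right.
symmetric = boxplot_shape([1, 3, 5, 7, 9, 11, 13])
skewed = boxplot_shape([1, 2, 3, 4, 6, 9, 15])
print(symmetric)
print(skewed)
```

For the symmetric sample the two box halves match and so do the whiskers; for the skewed sample the upper half of the box and the upper whisker are both longer.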