Central Tendency and Dispersion of a Distribution (Tendencia central y dispersión de una distribución) - PowerPoint PPT Presentation

Provided by: Department676
1
Central Tendency and Dispersion of a Distribution
2
Review Topics
  • Measures of Central Tendency
  • Mean, Median, Mode
  • Quartile
  • Measures of Variation
  • Range, Variance, Standard Deviation,
    Coefficient of Variation
  • Shape
  • Symmetric, Skewed

3
Important Summary Measures
One-sample summary measures
  • Central Tendency: Mean, Median, Mode
  • Quartile
  • Variation: Range, Variance, Standard Deviation,
    Coefficient of Variation
4
Measures of Central Tendency
Central Tendency: Mean, Median, Mode
Data: You can access practice sample data on HMO
premiums here.
5
Measures of Central Location (Tendency)
  • Usually, we focus our attention on two aspects of
    measures of central location:
  • Measure of the central data point (the average).
  • Measure of dispersion of the data about the
    average.

With one data point, the central location is
clearly at the point itself.
With two data points, the central location
should fall in the middle between them (in order
to reflect the location of both of them).
If a third data point appears exactly in the
middle of the current range, the central location
should not change (because it already resides in
the middle).
But if the third data point appears on the
left-hand side of the midrange, it should pull the
central location to the left.
6
  • Arithmetic mean
  • This is the most popular and useful measure of
    central location

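Sample mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$, where n is the sample size.
Population mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$, where N is the population size.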
7
  • Example 4.1

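The mean of the sample of six measurements 7, 3,
9, -2, 4, 6 is given by
$\bar{x} = \frac{\sum_{i=1}^{6} x_i}{6} = \frac{7 + 3 + 9 + (-2) + 4 + 6}{6} = 4.5$
A second calculation on the slide, for a larger data set:
$\bar{x} = \frac{42.19 + 15.30 + \cdots + 53.21}{n} = 43.59$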
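The arithmetic in Example 4.1 can be checked with a few lines of Python. This is only a sketch; the function name `arithmetic_mean` is an illustrative choice and does not come from the slides.

```python
def arithmetic_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


sample = [7, 3, 9, -2, 4, 6]      # the six measurements of Example 4.1
print(arithmetic_mean(sample))    # 27 / 6 = 4.5
```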
8
  • The median
  • The median of a set of measurements is the value
    that falls in the middle when the measurements
    are arranged in order of magnitude.

Odd number of observations (seven salaries)
First, sort the salaries. Then, locate the value
in the middle.
26, 26, 28, 29, 30, 32, 60
The median is 29.
Even number of observations (an eighth salary, 31, is added)
First, sort the salaries. Then, locate the values
in the middle.
26, 26, 28, 29, 30, 31, 32, 60
There are two middle values! The median is their
average, (29 + 30) / 2 = 29.5.
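The sort-then-pick-the-middle procedure can be sketched in Python as follows. The function name `median` is an illustrative choice; the data are the salaries listed on the slide, first without and then with the extra value 31.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # one middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # two middle values


seven_salaries = [26, 26, 28, 29, 30, 32, 60]
eight_salaries = seven_salaries + [31]

print(median(seven_salaries))   # 29
print(median(eight_salaries))   # 29.5
```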
9
  • The mode
  • The mode of a set of measurements is the value
    that occurs most frequently.
  • A set of data may have one mode (or modal class),
    or two or more modes.

For large data sets the modal class is much more
relevant than a single-value mode.
[Figure: the modal class]
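A minimal Python sketch of finding the mode, allowing for more than one. The helper name `modes` and the letter-grade list are hypothetical; the salary list is the one used for the median.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, sorted."""
    if not values:
        raise ValueError("the mode of an empty data set is undefined")
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([26, 26, 28, 29, 30, 32, 60, 31]))   # [26]        one mode
print(modes(["A", "B", "B", "C", "C"]))          # ['B', 'C']  two modes (hypothetical grades)
```

Because the function only counts occurrences, it works for qualitative data such as letter grades as well as for numbers.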
10
  • Example 4.6

A professor of statistics wants to report the
results of a midterm exam, taken by 100
students. The data appear in file XM04-06. Find
the mean, median, and mode, and describe the
information they provide.
The mean provides information about the overall
performance level of the class.
The median indicates that half of the class
received a grade below 81, and half of the
class received a grade above 81.
The mode must be used when data are qualitative.
If marks are classified by letter grade, the
frequency of each grade can be calculated. Then,
the mode becomes a logical measure to compute.
[Figure: Excel results]
11
Relationship among Mean, Median, and Mode
  • If a distribution is symmetrical, the mean,
    median and mode coincide
  • If a distribution is nonsymmetrical, and skewed
    to the left or to the right, the three
    measures differ.

A positively skewed distribution (skewed to the
right): typically Mode < Median < Mean
12
  • If a distribution is symmetrical, the mean,
    median and mode coincide.
  • If a distribution is nonsymmetrical, and skewed
    to the left or to the right, the three measures
    differ.

A positively skewed distribution (skewed to the
right): typically Mode < Median < Mean
A negatively skewed distribution (skewed to the
left): typically Mean < Median < Mode
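As a small illustration of the right-skewed case, the eight salaries from the median slide contain one large value (60) that stretches the right tail. The sketch below uses Python's standard `statistics` module; treating these salaries as a skewed sample is an assumption made for illustration.

```python
import statistics

# Salaries from the median slide; the value 60 pulls the mean to the right.
salaries = [26, 26, 28, 29, 30, 32, 60, 31]

mode = statistics.mode(salaries)      # most frequent value
median = statistics.median(salaries)  # middle of the sorted values
mean = statistics.mean(salaries)      # arithmetic average

print(mode, median, mean)             # 26 29.5 32.75
```

The three measures differ and appear in the order mode, median, mean, which is the usual pattern for a positively skewed distribution.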
13
Measures of Variation
Variation
  • Range
  • Interquartile Range
  • Variance (Population Variance, Sample Variance)
  • Standard Deviation (Population Standard Deviation,
    Sample Standard Deviation)
  • Coefficient of Variation
14
Measures of variability (looking beyond the
average)
  • Measures of central location fail to tell the
    whole story about the distribution.
  • A question of interest still remains unanswered:

How typical is the average value of all the
measurements in the data set?
or
How spread out are the measurements about
the average value?
15
Observe two hypothetical data sets
Low variability data set:
The average value provides a good representation
of the values in the data set.
High variability data set:
The same average value does not provide as good a
representation of the values in the data set as
before.
16
  • The range
  • The range of a set of measurements is the
    difference between the largest and smallest
    measurements.
  • Its major advantage is the ease with which it can
    be computed.
  • Its major shortcoming is its failure to provide
    information on the dispersion of the values
    between the two end points.

But how do all the measurements spread out?
The range cannot assist in answering this question.
Range = Largest measurement - Smallest measurement
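The shortcoming of the range can be seen in a short Python sketch. The function name `value_range` and the second data set are hypothetical and chosen only to share end points with the first.

```python
def value_range(values):
    """Largest measurement minus smallest measurement."""
    return max(values) - min(values)


spread_out = [4, 7, 10, 13, 16]    # values spread evenly between the end points
clustered = [4, 10, 10, 10, 16]    # hypothetical: same end points, tight middle

print(value_range(spread_out))     # 12
print(value_range(clustered))      # 12
```

Both data sets have the same range even though the values between the end points are distributed very differently.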
17
  • The variance
  • This measure of dispersion reflects the values of
    all the measurements.
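  • The variance of a population of N measurements
    $x_1, x_2, \ldots, x_N$ having a mean $\mu$ is defined as
    $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
  • The variance of a sample of n measurements
    $x_1, x_2, \ldots, x_n$ having a mean $\bar{x}$ is defined as
    $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$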

18
Consider two small populations:
Population A: 8, 9, 10, 11, 12
Population B: 4, 7, 10, 13, 16
The mean of both populations is 10, but the
measurements in B are much more dispersed than
those in A. Thus, a measure of dispersion is needed
that agrees with this observation.
Let us start by calculating the sum of deviations.
A: (8 - 10) + (9 - 10) + (10 - 10) + (11 - 10) + (12 - 10) = -2 - 1 + 0 + 1 + 2 = 0
B: (4 - 10) + (7 - 10) + (10 - 10) + (13 - 10) + (16 - 10) = -6 - 3 + 0 + 3 + 6 = 0
The sum of deviations is zero in both
cases; therefore, another measure is needed.
19
The sum of deviations is zero in both
cases; therefore, another measure is needed.
The sum of squared deviations is used in
calculating the variance. See example next.
A: deviations of 8, 9, 10, 11, 12 from the mean 10 are -2, -1, 0, 1, 2
B: deviations of 4, 7, 10, 13, 16 from the mean 10 are -6, -3, 0, 3, 6
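The following Python sketch reproduces the deviation sums for populations A and B and shows what happens once the deviations are squared. The helper name `deviations_from_mean` is illustrative.

```python
population_a = [8, 9, 10, 11, 12]
population_b = [4, 7, 10, 13, 16]


def deviations_from_mean(values):
    """Signed distance of each value from the mean of the data."""
    mean = sum(values) / len(values)
    return [x - mean for x in values]


for name, data in (("A", population_a), ("B", population_b)):
    devs = deviations_from_mean(data)
    total = sum(devs)                          # positive and negative parts cancel
    total_squared = sum(d ** 2 for d in devs)  # squaring removes the signs
    print(name, total, total_squared)
# A 0.0 10.0
# B 0.0 90.0
```

The plain sums are zero for both populations, while the sums of squared deviations (10 versus 90) reflect the larger dispersion of B.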
20
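Let us calculate the variance of the two
populations.
$\sigma_A^2 = \frac{(8-10)^2 + (9-10)^2 + (10-10)^2 + (11-10)^2 + (12-10)^2}{5} = 2$
$\sigma_B^2 = \frac{(4-10)^2 + (7-10)^2 + (10-10)^2 + (13-10)^2 + (16-10)^2}{5} = 18$
Why is the variance defined as the average
squared deviation? Why not use the sum of squared
deviations as a measure of dispersion instead?
After all, the sum of squared deviations
increases in magnitude when the dispersion of a
data set increases!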
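One way to see why the average is preferred over the raw sum is to compare a population with a copy of itself that is twice as long. The sketch below is an illustration under that assumption; the "A repeated" data set and the function names are hypothetical.

```python
def sum_squared_deviations(values):
    """Sum of squared distances from the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def population_variance(values):
    """Average squared deviation (divide by N)."""
    return sum_squared_deviations(values) / len(values)


population_a = [8, 9, 10, 11, 12]
population_b = [4, 7, 10, 13, 16]
a_repeated = population_a * 2    # hypothetical: same spread, twice as many values

for name, data in (("A", population_a), ("B", population_b), ("A repeated", a_repeated)):
    print(name, sum_squared_deviations(data), population_variance(data))
# A 10.0 2.0
# B 90.0 18.0
# A repeated 20.0 2.0
```

The sum of squared deviations doubles merely because there are more observations, whereas the variance stays at 2 because the dispersion has not changed.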
21
  • Example 4.8
  • Find the mean and the variance of the following
    sample of measurements (in years).
  • 3.4, 2.5, 4.1, 1.2, 2.8, 3.7
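  • Solution

$\bar{x} = \frac{\sum_{i=1}^{6} x_i}{6} = \frac{3.4 + 2.5 + 4.1 + 1.2 + 2.8 + 3.7}{6} = \frac{17.7}{6} = 2.95$ years
$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} = \frac{(3.4 - 2.95)^2 + (2.5 - 2.95)^2 + \cdots + (3.7 - 2.95)^2}{6 - 1} = 1.075$ (years)$^2$
A shortcut formula
$s^2 = \frac{1}{n - 1}\left[\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}\right] = \frac{1}{6 - 1}\left[3.4^2 + 2.5^2 + \cdots + 3.7^2 - \frac{(17.7)^2}{6}\right] = 1.075$ (years)$^2$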
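Both routes to the sample variance in Example 4.8 can be checked numerically. This is a sketch; the variable names are illustrative, and results are rounded for display because floating-point arithmetic is not exact.

```python
data = [3.4, 2.5, 4.1, 1.2, 2.8, 3.7]   # measurements in years (Example 4.8)
n = len(data)
mean = sum(data) / n

# Definition: squared deviations from the mean, divided by n - 1.
s2_definition = sum((x - mean) ** 2 for x in data) / (n - 1)

# Shortcut: sum of squares minus (sum squared) / n, divided by n - 1.
s2_shortcut = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

print(round(mean, 4))            # 2.95
print(round(s2_definition, 4))   # 1.075
print(round(s2_shortcut, 4))     # 1.075
```

The shortcut avoids computing the mean first, which is convenient by hand; in floating-point code the definitional form is usually the more numerically stable of the two.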
22
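Sample Standard Deviation

$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
For the sample, use n - 1 in the denominator.
Data: 10 12 14 15 17 18 18 24
n = 8, Mean = 16
$s = \sqrt{\frac{(10 - 16)^2 + (12 - 16)^2 + \cdots + (24 - 16)^2}{8 - 1}} = \sqrt{\frac{130}{7}} \approx 4.3095$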
23
Interpreting Standard Deviation
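  • The standard deviation can be used to
  • compare the variability of several distributions
  • make a statement about the general shape of a
    distribution.
  • The empirical rule: If a sample of measurements
    has a mound-shaped distribution, the interval
  • $(\bar{x} - s, \bar{x} + s)$ contains approximately 68% of the measurements
  • $(\bar{x} - 2s, \bar{x} + 2s)$ contains approximately 95% of the measurements
  • $(\bar{x} - 3s, \bar{x} + 3s)$ contains approximately 99.7% of the measurements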
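The empirical rule (roughly 68%, 95% and 99.7% of a mound-shaped sample within one, two and three standard deviations of the mean) can be checked on simulated data. The sketch below assumes a normally distributed sample; the mean of 50, standard deviation of 10, sample size and seed are arbitrary illustrative choices.

```python
import random
import statistics

random.seed(0)                                    # fixed seed for repeatability
sample = [random.gauss(50, 10) for _ in range(10_000)]

mean = statistics.mean(sample)
s = statistics.stdev(sample)                      # sample standard deviation (n - 1)


def share_within(k):
    """Fraction of the sample lying within k standard deviations of the mean."""
    low, high = mean - k * s, mean + k * s
    return sum(low <= x <= high for x in sample) / len(sample)


for k in (1, 2, 3):
    print(k, round(share_within(k), 3))
```

The printed shares should land close to 0.68, 0.95 and 0.997; they will not match exactly because the sample is random.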

24
Comparing Standard Deviations
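Data: 10 12 14 15 17 18 18 24
N = 8, Mean = 16
Sample standard deviation: $s = \sqrt{130/7} \approx 4.3095$
Population standard deviation: $\sigma = \sqrt{130/8} \approx 4.0311$
The value of the standard deviation is larger when
the data are considered as a sample.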
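The effect of the denominator can be reproduced with Python's standard `statistics` module, where `stdev` divides by n - 1 and `pstdev` divides by N. For these eight values the sum of squared deviations from the mean of 16 is 130.

```python
import statistics

data = [10, 12, 14, 15, 17, 18, 18, 24]

sample_sd = statistics.stdev(data)        # sqrt(130 / 7), treats data as a sample
population_sd = statistics.pstdev(data)   # sqrt(130 / 8), treats data as a population

print(round(sample_sd, 4), round(population_sd, 4))   # 4.3095 4.0311
```

Dividing by the smaller number n - 1 always gives the larger result, so the sample figure exceeds the population figure for the same data.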
25
Comparing Standard Deviations
Data A: Mean = 15.5, s = 3.338
Data B: Mean = 15.5, s = 0.9258
Data C: Mean = 15.5, s = 4.57
[Figure: each data set plotted on a common axis running from 11 to 21]
26
Measures of Association
  • Two numerical measures are presented for
    describing the linear relationship between two
    variables depicted in a scatter diagram.
  • Covariance - is there any pattern to the way two
    variables move together?
  • Correlation coefficient - how strong is the
    linear relationship between two variables?

27
  • The covariance

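Population covariance: $\mathrm{COV}(X,Y) = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}$
Sample covariance: $\mathrm{cov}(X,Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
$\mu_x$ ($\mu_y$) is the population mean of the variable X
(Y). $\bar{x}$ ($\bar{y}$) is the sample mean of the variable X (Y).
N is the population size. n is the sample size.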
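A minimal Python sketch of the covariance calculation, in both its population form (divide by N) and sample form (divide by n - 1). The function names and the small paired data set are hypothetical.

```python
def _sum_cross_deviations(xs, ys):
    """Sum of (x - mean of x) * (y - mean of y) over paired observations."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))


def population_covariance(xs, ys):
    return _sum_cross_deviations(xs, ys) / len(xs)


def sample_covariance(xs, ys):
    return _sum_cross_deviations(xs, ys) / (len(xs) - 1)


xs = [1, 2, 3, 4, 5]      # hypothetical paired observations
ys = [2, 4, 6, 8, 10]

print(population_covariance(xs, ys))   # 20 / 5 = 4.0
print(sample_covariance(xs, ys))       # 20 / 4 = 5.0
```

A positive result indicates that the two variables tend to move in the same direction; the magnitude depends on the units of X and Y, which is why the correlation coefficient is used to judge strength.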
28
  • The coefficient of correlation
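  • This coefficient answers the question: How strong
    is the association between X and Y?

Population coefficient of correlation: $\rho = \frac{\mathrm{COV}(X,Y)}{\sigma_x \sigma_y}$
Sample coefficient of correlation: $r = \frac{\mathrm{cov}(X,Y)}{s_x s_y}$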

29
Strong positive linear relationship: COV(X,Y) > 0, $\rho$ or r close to +1
No linear relationship: COV(X,Y) = 0, $\rho$ or r = 0
Strong negative linear relationship: COV(X,Y) < 0, $\rho$ or r close to -1
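The three cases can be illustrated with a short Python sketch of the sample coefficient of correlation, computed as the sample covariance divided by the product of the two sample standard deviations. The function name and all three data sets are hypothetical and chosen to give clean extremes.

```python
import math


def sample_correlation(xs, ys):
    """Sample covariance divided by the product of the sample standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    return cov / (s_x * s_y)


xs = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]     # perfectly increasing line
falling = [10, 8, 6, 4, 2]    # perfectly decreasing line
u_shaped = [2, 1, 0, 1, 2]    # related to x, but not linearly

for name, ys in (("rising", rising), ("falling", falling), ("u-shaped", u_shaped)):
    print(name, round(sample_correlation(xs, ys), 4))
```

The rising and falling sets give values at the two ends of the scale, and the u-shaped set gives a value of essentially zero even though Y clearly depends on X, which shows that r measures only the linear part of a relationship.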