Measures of Central Tendency and Dispersion

Why Describe Central Tendency?

- Data often cluster around a central value that lies between the two extremes. This single number can summarize the scores in the entire data set.
- There are three measures of central tendency:
- 1) Mean
- 2) Median
- 3) Mode

The Mode

- The mode is the most frequently occurring number in a set of data.
- E.g., find the mode of the following numbers:
- 15, 20, 21, 23, 23, 23, 25, 27, 30
- The mode is 23, which occurs three times.
- If there are two modes, the data set is bimodal.
- If there are more than two modes, the data set is said to be multimodal.
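The counting rule above can be sketched in Python. This is a minimal illustration, not part of the original slides: the function name `describe_modes` and the label "unimodal" are assumptions, and the input is assumed to be non-empty. `statistics.multimode` (Python 3.8+) returns every value tied for the highest frequency.

```python
from statistics import multimode

def describe_modes(scores):
    """Return the mode(s) of a data set and a label for how many there are."""
    modes = multimode(scores)  # every value tied for the highest count
    if len(modes) == 1:
        label = "unimodal"     # label chosen for this sketch
    elif len(modes) == 2:
        label = "bimodal"
    else:
        label = "multimodal"
    return modes, label

scores = [15, 20, 21, 23, 23, 23, 25, 27, 30]
print(describe_modes(scores))
```

Note that if no score repeats, `multimode` returns every value, so the label is only meaningful when at least one score occurs more than once.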

The Median

- The median is the middle score when all scores in the data set are arranged in order.
- Half the scores lie above and half lie below the median.
- E.g., find the median of the following numbers: 10, 12, 14, 15, 17, 18, 20. The median is 15.
- When there is an even number of scores, take the average of the middle two scores.
- E.g., 10, 12, 14, 15, 17, 18: (14 + 15)/2 = 14.5.
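As a rough sketch of the odd and even cases just described, the following Python function sorts the scores and either picks the middle one or averages the middle two. The function name `median_of` is an assumption made for this example.

```python
def median_of(scores):
    """Middle score of the sorted data; mean of the two middle scores if N is even."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd N: single middle score
    return (ordered[mid - 1] + ordered[mid]) / 2  # even N: average the middle two

print(median_of([10, 12, 14, 15, 17, 18, 20]))
print(median_of([10, 12, 14, 15, 17, 18]))
```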

- The median can also be calculated from a frequency distribution.
- E.g., a stats class received the following marks out of 20 on their first exam.

X     freq   Cumulative freq
20    1      15
19    2      14
16    2      12
14    1      10
12    4      9
11    2      5
10    3      3

What is the median grade?

- Step 1 - Multiply 0.5 times (N + 1) to obtain the location of the middle score.
- 0.5(15 + 1) = 8
- Step 2 - Locate this score on your frequency distribution using the cumulative frequencies.
- The 8th score is 12, so the median is 12.

Example in the book on pp. 77-78.
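The two-step procedure for a frequency distribution might be sketched as follows. The helper names `score_at` and `median_from_frequencies` are hypothetical. When 0.5(N + 1) is not a whole number, this sketch averages the two neighbouring scores, which is one reasonable convention and an assumption on my part.

```python
import math

def score_at(freq, position):
    """Score at a 1-based position, counting up from the lowest score."""
    cumulative = 0
    for score in sorted(freq):
        cumulative += freq[score]      # running cumulative frequency
        if cumulative >= position:
            return score
    raise ValueError("position exceeds the number of scores")

def median_from_frequencies(freq):
    n = sum(freq.values())
    location = 0.5 * (n + 1)           # Step 1: location of the middle score
    lower = score_at(freq, math.floor(location))
    upper = score_at(freq, math.ceil(location))
    return (lower + upper) / 2         # Step 2: read the score at that location

marks = {20: 1, 19: 2, 16: 2, 14: 1, 12: 4, 11: 2, 10: 3}
print(median_from_frequencies(marks))
```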

The Mean

- The mean is the sum of all the scores in the data set divided by the number of scores in the set.
- E.g., what is the mean of the following test scores? 56, 65, 75, 83, 92
- X̄ = ΣX/N = 371/5 = 74.2

- The mean can also be calculated using a frequency distribution.
- The following scores were obtained on a stats exam marked out of 20.

X     freq
20    1
19    2
16    2
14    1
12    4
11    2
10    3

Find the mean of the exam scores.

- Multiply each score by its frequency, add the products together, and divide by N.

X     freq   fX
20    1      20
19    2      38
16    2      32
14    1      14
12    4      48
11    2      22
10    3      30

N = 15, ΣfX = 204

X̄ = ΣfX/N = 204/15 = 13.6
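Both ways of computing the mean can be sketched in a few lines of Python. The function names are assumptions made for illustration. The frequency distribution is stored as a dictionary mapping each score to its frequency.

```python
def mean_of(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)

def mean_from_frequencies(freq):
    """Sum of (score * frequency) divided by N, the total frequency."""
    n = sum(freq.values())
    sum_fx = sum(score * f for score, f in freq.items())
    return sum_fx / n

test_scores = [56, 65, 75, 83, 92]
exam = {20: 1, 19: 2, 16: 2, 14: 1, 12: 4, 11: 2, 10: 3}

print(round(mean_of(test_scores), 2))
print(round(mean_from_frequencies(exam), 2))
```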

Characteristics of the Mean

- Summed deviations about the mean equal 0.

Score   X - X̄
2       2 - 5 = -3
3       3 - 5 = -2
5       5 - 5 = 0
7       7 - 5 = 2
8       8 - 5 = 3

ΣX = 25, X̄ = 5
Σ(X - X̄) = 0

- The mean is sensitive to extreme scores.

Scores: 2, 3, 5, 7, 8      ΣX = 25, X̄ = 5
Scores: 2, 3, 5, 7, 33     ΣX = 50, X̄ = 10

Note that the median (5) remains the same in both cases.

- The sum of squared deviations is least about the mean (pp. 82-83).

Score   (X - X̄)²
2       (2 - 5)² = 9
3       (3 - 5)² = 4
5       (5 - 5)² = 0
7       (7 - 5)² = 4
8       (8 - 5)² = 9

ΣX = 25, X̄ = 5
Σ(X - X̄)² = 26
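The three characteristics can be checked numerically. This sketch uses the same five scores. The helper `sum_squared_deviations` and the comparison centers 4 and 6 are assumptions added for illustration, to show that moving away from the mean increases the sum of squares.

```python
from statistics import mean, median

def sum_squared_deviations(scores, center):
    """Sum of squared deviations of the scores about an arbitrary center."""
    return sum((x - center) ** 2 for x in scores)

scores = [2, 3, 5, 7, 8]
with_extreme = [2, 3, 5, 7, 33]   # same scores with 8 replaced by 33

m = mean(scores)
deviation_sum = sum(x - m for x in scores)
print(deviation_sum)                                # deviations about the mean cancel out

print(mean(with_extreme), median(with_extreme))     # mean shifts, median does not

print(sum_squared_deviations(scores, m))            # about the mean
print(sum_squared_deviations(scores, 4),
      sum_squared_deviations(scores, 6))            # about other values
```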

The Weighted Mean

- Used when a single mean must be calculated for two or more groups of different sizes.
- The different sizes of the groups are accounted for, or weighted.
- E.g., the means of five different stats classes are as follows. Calculate the weighted mean.

Sample mean   f
58            45
64            42
77            83
62            38
52            45

The larger the sample, the greater its weight in determining the overall mean.

- Step 1 - Multiply each sample mean by the N of that group.
- Step 2 - Add these products together.
- Step 3 - Divide by the overall N.

Sample mean   f     fX
58            45    2610
64            42    2688
77            83    6391
62            38    2356
52            45    2340

N_total = 253, ΣfX = 16385

X̄ = ΣfX/N_total = 16385/253 = 64.76
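A minimal Python sketch of the three steps, assuming each class is given as a (sample mean, group size) pair. The function name `weighted_mean` is an assumption made for this example.

```python
def weighted_mean(groups):
    """groups: iterable of (sample_mean, group_size) pairs."""
    groups = list(groups)
    sum_fx = sum(m * n for m, n in groups)   # Steps 1 and 2: products, then their sum
    n_total = sum(n for _, n in groups)      # overall N
    return sum_fx / n_total                  # Step 3: divide by the overall N

classes = [(58, 45), (64, 42), (77, 83), (62, 38), (52, 45)]
print(round(weighted_mean(classes), 2))

# For contrast, the simple average of the five class means ignores group size.
print(sum(m for m, _ in classes) / len(classes))
```

The weighted result is higher than the simple average of the means here because the largest class (N = 83) also has the highest mean.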

Comparison of the Mean, Median, and Mode

- The mode is the roughest measure of central tendency and is rarely used in behavioral statistics.
- Mean and median are generally more appropriate.
- If a distribution is skewed, the mean is pulled in the direction of the skew. In such cases, the median is a better measure of central tendency.

Skewness of Distribution

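- Comparing the mean and the median indicates the direction of skew: in a positively skewed distribution the mean is typically greater than the median, and in a negatively skewed distribution it is typically less.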

Why Measure Dispersion?

- Measures of dispersion tell us how spread out the scores in a data set are. Not all scores will be equal to the mean.
- There are four measures of dispersion we will look at:
- Range (crude range)
- Semi-Interquartile Range
- Variance
- Standard Deviation

The Range

- The simplest measure of variability: the highest score minus the lowest score.
- Limited by extreme scores or outliers.
- E.g., find the range of the following test scores: 100, 74, 68, 68, 57, 56
- Range = H - L = 100 - 56 = 44
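In code the range is a one-liner. The sketch below also shows how a single extreme score changes it. The function name `crude_range` and the altered data set (56 replaced by 6) are assumptions added for illustration.

```python
def crude_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)

test_scores = [100, 74, 68, 68, 57, 56]
print(crude_range(test_scores))

# One outlier is enough to change the range dramatically.
with_outlier = [100, 74, 68, 68, 57, 6]
print(crude_range(with_outlier))
```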

Semi-Interquartile Range

- A measure of variability obtained by subtracting the score at the 25th percentile (i.e., first quartile) from the score at the 75th percentile (i.e., third quartile) and dividing by 2.
- If the distribution is normal, the median plus and minus the semi-interquartile range cuts off the middle 50% of all cases.

[Figure: distribution marked at Q1, Q2, and Q3]

- Cuts off approximately 50% of all cases if the distribution is skewed (best measure in a skewed distribution).

- E.g., an organic chemistry class obtains the following distribution on their first exam. What is the semi-interquartile range?

X     f     cf
91    1     32
61    1     31
59    4     30
56    6     26
54    7     20
52    5     13
50    4     8
47    3     4
45    1     1

- Step 1 - 25th percentile
- 0.25(32) = 8
- The eighth score is at the 1st quartile.
- Q1 = 50
- Step 2 - 75th percentile
- 0.75(32) = 24
- The 24th score is at the 3rd quartile.
- Q3 = 56
- Step 3 - Calculate the median.
- 0.5(N + 1)
- 0.5(32 + 1) = 16.5 ≈ 17
- The median is the 17th score.
- Median = 54
- Step 4 - Subtract the 1st quartile from the 3rd quartile and divide by 2.
- (56 - 50)/2 = 3
- Step 5 - Add and subtract 3 from the median.
- 54 + 3 = 57
- 54 - 3 = 51
- About 50% of the scores lie between 51 and 57.
- 54 ± 3.
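The steps above can be sketched in Python using cumulative frequencies. The helper names are hypothetical. Rounding a fractional position up with `math.ceil` mirrors the rounding of 16.5 to 17 in the worked example and is an assumption of this sketch. Library quantile functions usually interpolate and may give slightly different quartiles.

```python
import math

def score_at(freq, position):
    """Score at a 1-based position, counting up from the lowest score."""
    cumulative = 0
    for score in sorted(freq):
        cumulative += freq[score]
        if cumulative >= position:
            return score
    raise ValueError("position exceeds the number of scores")

def semi_interquartile_range(freq):
    n = sum(freq.values())
    q1 = score_at(freq, math.ceil(0.25 * n))            # Step 1: 25th percentile
    q3 = score_at(freq, math.ceil(0.75 * n))            # Step 2: 75th percentile
    median = score_at(freq, math.ceil(0.5 * (n + 1)))   # Step 3: median location, rounded up
    siqr = (q3 - q1) / 2                                # Step 4
    return q1, median, q3, siqr

exam = {91: 1, 61: 1, 59: 4, 56: 6, 54: 7, 52: 5, 50: 4, 47: 3, 45: 1}
q1, median, q3, siqr = semi_interquartile_range(exam)
print(q1, median, q3, siqr)
print(median - siqr, median + siqr)                     # Step 5: median plus and minus the SIQR
```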

The Variance

- The variance is the sum of the squared deviations from the mean divided by N: s² = Σ(X - X̄)²/N.

Calculating Variance (Deviation Formula)

X     X - X̄   (X - X̄)²
12    3        9
11    2        4
10    1        1
9     0        0
9     0        0
9     0        0
8     -1       1
7     -2       4
6     -3       9

ΣX = 81, X̄ = 9
Σ(X - X̄) = 0
Σ(X - X̄)² = 28

s² = Σ(X - X̄)²/N = 28/9 = 3.11
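The deviation formula translates directly into Python. This is a sketch using the nine scores from the table. The function name `variance_deviation` is an assumption. Note that it divides by N, as the slides do, whereas `statistics.variance` divides by N - 1.

```python
def variance_deviation(scores):
    """Sum of squared deviations from the mean, divided by N."""
    n = len(scores)
    m = sum(scores) / n
    squared_deviations = [(x - m) ** 2 for x in scores]
    return sum(squared_deviations) / n

scores = [12, 11, 10, 9, 9, 9, 8, 7, 6]
print(round(variance_deviation(scores), 2))
```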

A Simpler Formula (Raw Score Method)

X     X²
12    144
11    121
10    100
9     81
9     81
9     81
8     64
7     49
6     36

ΣX = 81, ΣX² = 757

SS = ΣX² - (ΣX)²/N = 757 - (81)²/9 = 757 - 729 = 28

s² = SS/N = 28/9 = 3.11
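The raw score method needs only two running totals, ΣX and ΣX². A short sketch follows. The function names are assumptions made for this example.

```python
def sum_of_squares_raw(scores):
    """SS = sum(X^2) - (sum(X))^2 / N, computed without the mean."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x2 = sum(x * x for x in scores)
    return sum_x2 - sum_x ** 2 / n

def variance_raw(scores):
    """Variance as SS / N."""
    return sum_of_squares_raw(scores) / len(scores)

scores = [12, 11, 10, 9, 9, 9, 8, 7, 6]
print(sum_of_squares_raw(scores))
print(round(variance_raw(scores), 2))
```

This gives the same SS as summing the squared deviations directly. With floating-point data of large magnitude the subtraction can lose precision, so the deviation form is often the safer choice in software.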

Calculating Standard Deviation

- Simply calculate the square root of the variance.
- So if s² from the previous example was 3.11, the standard deviation (denoted by s) is √3.11 = 1.76.
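As a quick check, the standard library can reproduce this result. `statistics.pvariance` and `statistics.pstdev` divide by N, matching the formula used here, while `statistics.variance` and `statistics.stdev` divide by N - 1. The scores are taken from the variance example.

```python
import math
from statistics import pstdev, pvariance

scores = [12, 11, 10, 9, 9, 9, 8, 7, 6]

s_squared = pvariance(scores)   # variance with N in the denominator
s = math.sqrt(s_squared)        # standard deviation is its square root

print(round(s_squared, 2), round(s, 2))
print(round(pstdev(scores), 2))  # same value computed in one call
```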

