Summarizing Data - PowerPoint PPT Presentation

About This Presentation
Title:

Summarizing Data


Slides: 31
Provided by: richardu

Transcript and Presenter's Notes

Title: Summarizing Data


1
Summarizing Data
2
Overview
  • To frame our discussion, consider

3
Outline
  • Distribution
  • Measures of Central Tendency
  • Measures of Variability

4
Distribution
  • The distribution of a variable tells us the
    values it takes, and how often it takes the
    values. The distribution is a summary of the
    frequency of individual values or ranges of
    values for a variable.
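As a minimal sketch of the idea above, a frequency table can be built by counting how often each value occurs. The sample `data` is hypothetical and not taken from the presentation.

```python
from collections import Counter

# Hypothetical sample of measurements (an assumption for illustration).
data = [2, 3, 3, 5, 5, 5, 7]

# Count how often each distinct value occurs.
frequency = Counter(data)

# Print the distribution as value / frequency pairs, in ascending order.
for value in sorted(frequency):
    print(value, frequency[value])
```

For continuous data, the same counting would typically be applied to ranges of values (bins) rather than to individual values.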

5
Description
  • A distribution is described by its shape, its
    center, and its spread.

6
Histogram Shapes
symmetric unimodal
bimodal
positively skewed
negatively skewed
7
Measures of Central Tendency
  • Mode
  • Median
  • Mean

8
Mode
  • The mode of a set of measurements is the
    measurement that occurs most often, that is,
    the measurement with the highest frequency.
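One possible way to compute the mode is sketched below. The helper name `modes` and the sample values are assumptions for illustration; the function returns every value that ties for the highest frequency, so it also works for qualitative data.

```python
from collections import Counter

def modes(values):
    """Return every value that attains the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    # Counter keeps first-seen order, so ties come back in that order.
    return [v for v, c in counts.items() if c == top]

colors = ["red", "blue", "red", "green"]  # hypothetical qualitative data
print(modes(colors))
print(modes([1, 1, 2, 2, 3]))  # two values tie for the highest frequency
```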

9
Median
The sample median is the middle value in a
set of data that is arranged in ascending order.
For an even number of data points, the median is
the average of the middle two values.
Population median
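The rule for the sample median can be sketched as follows. The function name `sample_median` is hypothetical; Python's standard library offers `statistics.median` with the same behavior.

```python
def sample_median(values):
    """Middle value of the sorted data; average of the middle two if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(sample_median([5, 1, 3]))     # odd number of points
print(sample_median([4, 1, 3, 2]))  # even number of points
```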
10
Mean
  • The mean of a set of measurements is the sum of
    the measurements divided by the total number of
    measurements.

11
The Mean
The average (mean) of the n numbers $x_1, x_2, \ldots, x_n$ is
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
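A minimal sketch of the mean as the sum of the measurements divided by their count. The function name `mean` and the sample values are assumptions for illustration.

```python
def mean(values):
    """Sum of the measurements divided by the number of measurements."""
    return sum(values) / len(values)

print(mean([2, 4, 6, 8]))
```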
12
Three Different Shapes for a Population
Distribution
symmetric
positive skew
negative skew
13
Summary
  • Mode and median are not influenced by outliers.
  • Outliers influence the mean.
  • Mode is useful for qualitative and quantitative
    data.
  • Median and mean are useful for quantitative data
    only.
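The first two points of the summary can be checked with a small experiment. This is a sketch on hypothetical data: the largest value of a sample is replaced by an extreme one, and the three measures are compared before and after.

```python
from statistics import mean, median, mode

# Hypothetical sample, and the same sample with its largest value
# replaced by an extreme observation.
data = [10, 12, 12, 13, 15]
contaminated = [10, 12, 12, 13, 150]

for label, sample in (("original", data), ("with outlier", contaminated)):
    print(label, "mean:", mean(sample),
          "median:", median(sample), "mode:", mode(sample))
```

In this example only the mean moves; the median and mode are identical for both samples.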

14
Measures of Variability
  • Range
  • Percentile
  • Interquartile Range
  • Variance
  • Standard Deviation

15
Range
  • The range of a set of measurements is the
    difference between the largest and the smallest
    measurement.
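The range takes one line of code; the sketch below uses a hypothetical helper name `value_range` and made-up data.

```python
def value_range(values):
    """Difference between the largest and the smallest measurement."""
    return max(values) - min(values)

data = [4, 9, 1, 7]  # hypothetical measurements
print(value_range(data))
```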

16
Percentile
  • The pth percentile of a set of ordered
    measurements is the value that has p% of the
    measurements below it and (100-p)% above it.
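Several conventions exist for computing percentiles from a finite sample, and the slide does not say which one is intended. The sketch below assumes the nearest-rank convention; the function name and the data are hypothetical.

```python
import math

def percentile_nearest_rank(values, p):
    """Smallest ordered value with at least p% of the data at or below it."""
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(values)
    # 1-based rank of the percentile under the nearest-rank convention.
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[rank - 1]

data = [15, 20, 35, 40, 50]  # hypothetical measurements
print(percentile_nearest_rank(data, 30))
print(percentile_nearest_rank(data, 50))
```

Other conventions interpolate between neighboring values, so results on small samples can differ slightly from this one.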

17
Interquartile Range
  • The interquartile range of a set of measurements
    is the difference between the upper and lower
    quartiles.
  • IQR = 75th percentile - 25th percentile
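As a sketch, the interquartile range can be computed with `statistics.quantiles` from Python's standard library (Python 3.8 or later). The choice of `method="inclusive"` is an assumption; other quartile conventions can give slightly different values on small samples.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical measurements

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1
print(q1, q3, iqr)
```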

18
Upper and Lower Fourths
After the n observations in a data set are
ordered from smallest to largest, the lower
(upper) fourth is the median of the smallest
(largest) half of the data, where the median
is included in both halves if n is odd. A
measure of spread that is resistant to
outliers is the fourth spread
$f_s = \text{upper fourth} - \text{lower fourth}$ (the IQR).
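The definition of the fourths can be sketched directly: split the ordered data into a smallest and a largest half, include the median in both halves when n is odd, and take the median of each half. The function name `fourths` is hypothetical.

```python
from statistics import median

def fourths(values):
    """Return (lower fourth, upper fourth) of the data."""
    ordered = sorted(values)
    n = len(ordered)
    # Size of each half; for odd n this includes the median in both halves.
    half = (n + 1) // 2
    lower = median(ordered[:half])
    upper = median(ordered[n - half:])
    return lower, upper

lower, upper = fourths([1, 2, 3, 4, 5, 6, 7, 8, 9])
fourth_spread = upper - lower
print(lower, upper, fourth_spread)
```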
19
Outliers
Any observation farther than $1.5 f_s$ from the
closest fourth is an outlier. An outlier is
extreme if it is more than $3 f_s$ from the nearest
fourth, and it is mild otherwise.
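The outlier rule can be sketched as a small classifier. The function name `classify_outliers` and the data are assumptions for illustration; the fourths are computed as the medians of the smallest and largest halves, with the median included in both halves when n is odd.

```python
from statistics import median

def classify_outliers(values):
    """Return (mild, extreme) outliers using the 1.5 and 3 fourth-spread rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = (n + 1) // 2  # median belongs to both halves when n is odd
    lower = median(ordered[:half])
    upper = median(ordered[n - half:])
    fs = upper - lower  # fourth spread

    mild, extreme = [], []
    for x in ordered:
        # Distance beyond the nearest fourth (zero between the fourths).
        distance = max(lower - x, x - upper, 0)
        if distance > 3 * fs:
            extreme.append(x)
        elif distance > 1.5 * fs:
            mild.append(x)
    return mild, extreme

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 17, 30]  # hypothetical measurements
mild, extreme = classify_outliers(data)
print("mild:", mild, "extreme:", extreme)
```

Here the fourths are 3.5 and 8.5, so the fourth spread is 5; the value 17 lies 8.5 beyond the upper fourth (mild) and 30 lies 21.5 beyond it (extreme).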
20
Boxplots
[Figure: boxplot drawn against a scale, with the lower fourth, median, and upper fourth marked, and mild and extreme outliers plotted as individual points.]
21
Variance
  • The variance of a set of n measurements is the
    sum of the squared deviations from the mean,
    divided by either n or n-1.
  • The choice of n or n-1 depends on whether we are
    dealing with a population (divide by n) or a
    sample from that population (divide by n-1).
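Python's standard library exposes both divisors: `statistics.pvariance` divides by n and `statistics.variance` divides by n-1. A short sketch on hypothetical data:

```python
from statistics import pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements, mean 5

# Sum of squared deviations from the mean is 32 for this data.
population_var = pvariance(data)  # divides by n = 8
sample_var = variance(data)       # divides by n - 1 = 7
print(population_var, sample_var)
```

The sample version is always the larger of the two for the same data, and the gap shrinks as n grows.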

22
Sample Variance
Variance is a measure of the spread of the data.
The sample variance of the sample $x_1, x_2, \ldots, x_n$ of
n values of X is given by
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$
We refer to $s^2$ as being based on $n - 1$ degrees of
freedom.
23
Standard Deviation
Standard deviation is a measure of the spread of
the data using the same units as the data.
The sample standard deviation is the square root
of the sample variance: $s = \sqrt{s^2}$.
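A from-scratch sketch of the sample variance (divisor n-1) and the sample standard deviation as its square root. The function names and the data are assumptions for illustration.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values) / (n - 1)

def sample_std(values):
    """Positive square root of the sample variance; same units as the data."""
    return math.sqrt(sample_variance(values))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
print(sample_variance(data), sample_std(data))
```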
24
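Formula for $s^2$
An alternative expression for the numerator of $s^2$
is
$S_{xx} = \sum_{i=1}^{n}(x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2$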
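The expression on this slide did not survive in the transcript. Assuming it is the usual computational shortcut (sum of the squares minus the squared sum divided by n), the sketch below checks that it agrees with the definition on hypothetical data. Both function names are assumptions.

```python
import math

def sxx_definition(values):
    """Numerator of the sample variance: sum of squared deviations."""
    n = len(values)
    xbar = sum(values) / n
    return sum((x - xbar) ** 2 for x in values)

def sxx_shortcut(values):
    """Same quantity via sum(x^2) - (sum(x))^2 / n."""
    n = len(values)
    return sum(x * x for x in values) - sum(values) ** 2 / n

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical measurements
print(sxx_definition(data), sxx_shortcut(data))
print(math.isclose(sxx_definition(data), sxx_shortcut(data)))
```

The shortcut avoids computing the mean first, but it can lose precision in floating point when the values are large relative to their spread.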
25
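Properties of $s^2$
Let $x_1, x_2, \ldots, x_n$ be any sample and c be any
nonzero constant.
1. If $y_i = x_i + c$ for $i = 1, \ldots, n$, then $s_y^2 = s_x^2$.
2. If $y_i = c x_i$ for $i = 1, \ldots, n$, then $s_y^2 = c^2 s_x^2$ and $s_y = |c| s_x$,
where $s_x^2$ is the sample variance of the xs and
$s_y^2$ is the sample variance of the ys.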
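The property statements themselves are missing from the transcript. Assuming they are the standard shift and scale rules (adding c leaves the sample variance unchanged; multiplying by c multiplies it by c squared), a quick numerical check on hypothetical data looks like this:

```python
import math
from statistics import variance

xs = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample
c = 3                          # any nonzero constant

shifted = [x + c for x in xs]  # y_i = x_i + c
scaled = [c * x for x in xs]   # y_i = c * x_i

# Shifting every value leaves the sample variance unchanged.
print(math.isclose(variance(shifted), variance(xs)))
# Scaling every value by c multiplies the sample variance by c squared.
print(math.isclose(variance(scaled), c ** 2 * variance(xs)))
```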
26
Standard Deviation
  • The standard deviation of a set of measurements
    is the positive square root of the variance.

27
Example
28
Distribution
29
Distribution (3)
30
Distribution (4)