# Numeric Summaries and Descriptive Statistics

Slides: 24
Provided by: stanfordE46
Transcript and Presenter's Notes

1
Numeric Summaries and Descriptive Statistics
2
populations vs. samples
• we want to describe both samples and populations
• the latter is a matter of inference

3
outliers
• minority cases, so different from the majority
that they merit separate consideration
• are they errors?
• are they indicative of a different pattern?
• think about possible outliers with care, but
beware of mechanical treatments
• significance of outliers depends on your research
interests

4
(No Transcript)
5
summaries of distributions
• graphic vs. numeric
• graphic may be better for visualization
• numeric are better for statistical/inferential
purposes
• resistance to outliers is usually an advantage in
either case

6
general characteristics
peakedness
• kurtosis

leptokurtic ↔ platykurtic
7
• skew (skewness)
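
Both shape characteristics can be put into numbers. The sketch below is one common moment-based version, offered as an illustration rather than as the presenter's method. The helper names `skewness_m` and `excess_kurtosis_m` and the two data vectors are hypothetical; base R does not ship functions with these names.

```r
# Hypothetical helpers (not base R functions): moment-based shape measures.
skewness_m <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5      # third central moment, standardized
}

excess_kurtosis_m <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3    # 0 for a normal curve; > 0 leptokurtic, < 0 platykurtic
}

# Made-up batches for illustration only.
symmetric  <- c(1, 2, 3, 4, 5)
right_tail <- c(1, 1, 2, 2, 3, 10)

print(skewness_m(symmetric))         # symmetric batch: skew of zero
print(skewness_m(right_tail))        # long upper tail: positive skew
print(excess_kurtosis_m(symmetric))  # flat batch: negative excess kurtosis
```

Other definitions (with small-sample corrections) exist, so values from add-on packages may differ slightly from these.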

8
(No Transcript)
9
central tendency
• measures of central tendency
• provide a sense of the value expressed by
multiple cases, overall
• mean
• median
• mode

10
mean
• center of gravity
• evenly partitions the sum of all measurements
among all cases: the average of all measures

11
mean pro and con
• crucial for inferential statistics
• mean is not very resistant to outliers
• a trimmed mean may be better for descriptive
purposes

12
mean
R: mean(x)
13
trimmed mean
R: mean(x, trim = 0.1)
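
A minimal R sketch of how the plain and trimmed means respond to a single outlier. The vector `x` is made-up data (hypothetical measurements in mm), not the presenter's sample.

```r
# Hypothetical measurements (mm); the last value is a deliberate outlier.
x <- c(18, 20, 21, 22, 23, 24, 25, 26, 27, 90)

plain_mean   <- mean(x)              # pulled upward by the outlier
trimmed_mean <- mean(x, trim = 0.1)  # drops 10% of cases from each tail first

print(plain_mean)    # 29.6
print(trimmed_mean)  # 23.5
```

With ten cases, `trim = 0.1` removes one case from each end, so the outlier no longer contributes.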
14
median
• 50th percentile
• less useful for inferential purposes
• more resistant to effects of outliers
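
The resistance of the median can be checked by removing an outlier and seeing how far each measure moves. This is only a sketch on made-up data; `x` and `x_clean` are hypothetical.

```r
# Hypothetical measurements (mm) with one extreme case.
x       <- c(18, 20, 21, 22, 23, 24, 25, 26, 27, 90)
x_clean <- x[x < 90]                 # same batch without the extreme case

# How far does each measure move when the outlier is dropped?
shift_mean   <- mean(x) - mean(x_clean)
shift_median <- median(x) - median(x_clean)

print(c(median_with = median(x), median_without = median(x_clean)))
print(c(shift_mean = shift_mean, shift_median = shift_median))

# The median is the 50th percentile.
print(unname(quantile(x, 0.5)) == median(x))
```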

15
median
16
mode
• the most numerous category
• for ratio data, often implies that data have been
grouped in some way
• can be more or less created by the grouping
procedure
• for theoretical distributions, simply the location
of the peak on the frequency distribution
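
Base R has no function that returns the statistical mode (`mode()` reports an object's storage type), so a tabulation is the usual route. The sketch below uses made-up data to show a mode for categories, and how the modal class of ratio data shifts with the grouping. All vectors and bin widths are hypothetical.

```r
# Hypothetical site classifications (categorical data).
sites <- c("hamlet", "village", "hamlet", "isolated scatter",
           "hamlet", "regional center", "village", "hamlet")
counts      <- table(sites)
modal_class <- names(counts)[which.max(counts)]
print(modal_class)

# Hypothetical ratio-scale lengths (mm): the modal class depends on the bins.
len     <- c(11, 12, 13, 21, 22, 23, 24, 29, 31, 32)
bins_10 <- table(cut(len, breaks = seq(10, 40, by = 10), right = FALSE))
bins_15 <- table(cut(len, breaks = seq(10, 40, by = 15), right = FALSE))

print(names(which.max(bins_10)))  # modal class with 10 mm bins
print(names(which.max(bins_15)))  # a different modal class with 15 mm bins
```

Note that `which.max()` returns only the first maximum, so ties between classes need separate handling.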

17
modal class = hamlets
[Figure: site categories: isolated scatters, hamlets, villages, regional centers]
18
dispersion
• measures of dispersion
• summarize degree of clustering of cases, esp.
with respect to central tendency
• range
• variance
• standard deviation

19
range
R: range(x)
20
variance
R: var(x)
• analogous to average deviation of cases from mean
• in fact, based on the sum of squared deviations from
the mean (the sum of squares)
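
The sum-of-squares route can be followed step by step and compared with `var()`. A small sketch on made-up data; `x` is hypothetical. R's `var()` divides by n - 1, so the hand calculation does the same.

```r
# Hypothetical batch of five measurements.
x <- c(2, 4, 4, 6, 9)

dev <- x - mean(x)           # deviation of each case from the mean
ss  <- sum(dev^2)            # sum of squares
v   <- ss / (length(x) - 1)  # sample variance (n - 1 in the denominator)

print(ss)      # 28
print(v)       # 7
print(var(x))  # 7
```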

21
variance
• computational form
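
The equation on this slide did not survive in the transcript. The standard computational form of the sample variance, which is assumed here to be what the slide showed, avoids computing each deviation separately:

$$
s^2 = \frac{\sum x_i^2 - \dfrac{\left(\sum x_i\right)^2}{n}}{n - 1}
$$

It is algebraically identical to dividing the sum of squared deviations from the mean by $n - 1$.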

22
• note units of variance are squared
• this makes variance hard to interpret
• ex. projectile point sample
• mean = 22.6 mm
• variance = 38 mm²
• what does this mean???

23
standard deviation
• square root of variance

24
standard deviation
• units are in same units as base measurements
• ex. projectile point sample
• mean = 22.6 mm
• standard deviation = 6.2 mm
• mean ± sd (16.4 to 28.8 mm)
• should give at least some intuitive sense of
where most of the cases lie, barring major
effects of outliers
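
A short R sketch of the mean ± sd interval and the share of cases that fall inside it. The lengths in `x` are made up for illustration; they are not the projectile point sample from the slide.

```r
# Hypothetical lengths (mm).
x <- c(15, 18, 20, 22, 23, 24, 26, 30)

s        <- sd(x)                      # same as sqrt(var(x))
interval <- mean(x) + c(-1, 1) * s     # mean - sd, mean + sd
inside   <- x >= interval[1] & x <= interval[2]
share    <- mean(inside)               # proportion of cases within one sd

print(round(interval, 1))
print(share)
```

For this batch, six of the eight cases fall within one standard deviation of the mean.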

25
(No Transcript)
26
trimmed dispersion measures
• variance and sd are even more sensitive to
extreme values (outliers) than the mean
• why??
• you can calculate a trimmed version of the
variance simply by eliminating cases from the
tails, and calculating the variance in the normal
way
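
One way to see the answer to "why??": deviations are squared, so a single extreme case can dominate the sum of squares. The sketch below shows this and a simple trimmed variance. The helper `trimmed_var` and the data are hypothetical, and the 10% trim is an arbitrary choice.

```r
# Hypothetical helper: drop a fraction of cases from each tail, then use var().
trimmed_var <- function(x, trim = 0.1) {
  x    <- sort(x)
  k    <- floor(length(x) * trim)        # cases removed from each tail
  kept <- x[(k + 1):(length(x) - k)]
  var(kept)
}

# Made-up measurements (mm); the last value is an outlier.
x <- c(18, 20, 21, 22, 23, 24, 25, 26, 27, 90)

# Share of the total sum of squares contributed by the single outlier.
sq_dev        <- (x - mean(x))^2
outlier_share <- max(sq_dev) / sum(sq_dev)

print(var(x))
print(trimmed_var(x))
print(outlier_share)
```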

27
trimmed standard deviation
• trimmed sd is calculated differently
• s_T = trimmed standard deviation
• n = number of cases in untrimmed batch
• s²_W = variance of trimmed (winsorized) batch
• n_T = number of cases in the trimmed batch
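
The formula itself is missing from the transcript. One form consistent with the symbols defined on this slide is s_T = sqrt((n - 1) s²_W / (n_T - 1)); treat it as an assumption, not as a confirmed reading of the slide. The sketch below implements that assumed form. The helper `trimmed_sd` and the data are hypothetical, and winsorizing here means replacing each trimmed case with the nearest retained value.

```r
# Hypothetical helper implementing the ASSUMED formula
#   s_T = sqrt((n - 1) * s2_W / (n_T - 1))
trimmed_sd <- function(x, trim = 0.1) {
  x   <- sort(x)
  n   <- length(x)
  k   <- floor(n * trim)     # cases trimmed from each tail
  n_t <- n - 2 * k           # cases left in the trimmed batch

  # Winsorize: replace trimmed cases with the nearest retained value.
  w <- x
  if (k > 0) {
    w[1:k]           <- x[k + 1]
    w[(n - k + 1):n] <- x[n - k]
  }

  s2_w <- var(w)             # variance of the winsorized batch
  sqrt((n - 1) * s2_w / (n_t - 1))
}

# Made-up measurements (mm); the last value is an outlier.
x <- c(18, 20, 21, 22, 23, 24, 25, 26, 27, 90)

print(sd(x))          # untrimmed sd, inflated by the outlier
print(trimmed_sd(x))  # trimmed sd under the assumed formula
```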