Chapter 2: Exploring Data with Graphs and Numerical Summaries

Slides: 54
Provided by: katemcl

1
Chapter 2: Exploring Data with Graphs and Numerical Summaries
  • Learn:
  • The Different Types of Data
  • The Use of Graphs to Describe Data
  • The Numerical Methods of Summarizing Data

2
Section 2.1
  • What are the Types of Data?

3
In Every Statistical Study
  • Questions are posed
  • Characteristics are observed

4
Characteristics are Variables
  • A Variable is any characteristic that is
    recorded for subjects in the study

5
Variation in Data
  • The term "variable" highlights the fact that
    data values vary.

6
Example: Students in a Statistics Class
  • Variables
  • Age
  • GPA
  • Major
  • Smoking Status

7
Data values are called observations
  • Each observation can be
  • Quantitative
  • Categorical

8
Categorical Variable
  • Each observation belongs to one of a set of
    categories
  • Examples
  • Gender (Male or Female)
  • Religious Affiliation (Catholic, Jewish, ...)
  • Place of residence (Apt, Condo, ...)
  • Belief in Life After Death (Yes or No)

9
Quantitative Variable
  • Observations take numerical values
  • Examples
  • Age
  • Number of siblings
  • Annual Income
  • Number of years of education completed

10
Graphs and Numerical Summaries
  • Describe the main features of a variable
  • For quantitative variables, the key features are
    center and spread
  • For categorical variables, the key feature is the
    percentage of observations in each category

11
Quantitative Variables
  • Discrete Quantitative Variables
  • Continuous Quantitative Variables

12
Discrete
  • A quantitative variable is discrete if its
    possible values form a set of separate numbers
    such as 0, 1, 2, 3, ...

13
Examples of discrete variables
  • Number of pets in a household
  • Number of children in a family
  • Number of foreign languages spoken

14
Continuous
  • A quantitative variable is continuous if its
    possible values form an interval

15
Examples of Continuous Variables
  • Height
  • Weight
  • Age
  • Amount of time it takes to complete an assignment

16
Frequency Table
  • A method of organizing data
  • Lists all possible values for a variable along
    with the number of observations for each value
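A frequency table of this kind can be sketched in a few lines of Python. The list of majors below is hypothetical (it is not data from the slides) and the function name is chosen only for illustration. Each row holds the count, the proportion (count divided by the total number of observations), and the percentage (proportion times 100).

```python
from collections import Counter

# Hypothetical observations of a categorical variable (not from the slides).
majors = ["Biology", "Math", "Biology", "History", "Math", "Biology", "Art", "Math"]

def frequency_table(observations):
    """Return rows of (value, count, proportion, percentage), most frequent first."""
    counts = Counter(observations)
    total = len(observations)
    return [
        (value, count, count / total, 100 * count / total)
        for value, count in counts.most_common()
    ]

table = frequency_table(majors)
for value, count, proportion, percent in table:
    print(f"{value:<10}{count:>3}{proportion:>8.3f}{percent:>7.1f}%")
```

The proportions in the table always sum to 1, which is a quick check that no observation was dropped.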

17
Example: Shark Attacks
18
Example: Shark Attacks
  • What is the variable?
  • Is it categorical or quantitative?
  • How is the proportion for Florida calculated?
  • How is the percentage for Florida calculated?

19
Example: Shark Attacks
  • Insights: what the data tell us about shark
    attacks

20
Identify the following variable as categorical or
quantitative
  • Choice of diet
  • (vegetarian or non-vegetarian)
  • Categorical
  • Quantitative

21
Identify the following variable as categorical or
quantitative
  • Number of people you have known who have been
    elected to political office
  • Categorical
  • Quantitative

22
Identify the following variable as discrete or
continuous
  • The number of people in line at a box office to
    purchase theater tickets
  • Continuous
  • Discrete

23
Identify the following variable as discrete or
continuous
  • The weight of a dog
  • Continuous
  • Discrete

24
Section 2.2
  • How Can We Describe Data Using Graphical
    Summaries?

25
Graphs for Categorical Data
  • Pie Chart: a circle having a slice of pie for
    each category
  • Bar Graph: a graph that displays a vertical bar
    for each category

26
Example: Sources of Electricity Use in the U.S.
and Canada
27
Pie Chart
28
Bar Chart
29
Pie Chart vs. Bar Chart
  • Which graph do you prefer?
  • Why?

30
Graphs for Quantitative Data
  • Dot Plot: shows a dot for each observation
  • Stem-and-Leaf Plot: portrays the individual
    observations
  • Histogram: uses bars to portray the data

31
Example: Sodium and Sugar Amounts in Cereals
32
Dotplot for Sodium in Cereals
  • Sodium Data
  • 0 210 260 125 220 290 210 140
    220 200 125 170 250 150 170 70
    230 200 290 180
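The plot itself did not survive in this transcript, so here is a rough text stand-in built from the 20 sodium values listed above. It is only a sketch: a real dot plot places the dots along a number line, while this version prints one row per distinct value. The function name is an illustrative choice.

```python
from collections import Counter

# Sodium values for the 20 cereals, as listed on the slide.
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

def dot_plot_lines(values):
    """Return one text line per distinct value, with one dot per observation."""
    counts = Counter(values)
    return [f"{value:>4} | {'.' * counts[value]}" for value in sorted(counts)]

lines = dot_plot_lines(sodium)
print("\n".join(lines))
```

Every observation keeps its own dot, so repeated values such as 125 show up as stacked dots and the unusually low value 0 stands apart at the top.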

33
Stem-and-Leaf Plot for Sodium in Cereal
  • Sodium Data
  • 0 210 260 125 220 290 210 140
    220 200 125 170 250 150 170 70
    230 200 290 180
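The stem-and-leaf plot itself is missing from this transcript. A minimal sketch of how one could be produced from the sodium values follows. It assumes the stem is everything except the final digit and the leaf is the final digit, which is a common convention but is not confirmed by the slide. The function and parameter names are illustrative.

```python
from collections import defaultdict

# Sodium values for the 20 cereals, as listed on the slide.
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

def stem_and_leaf(values, stem_width=10):
    """Map each stem (value // stem_width) to its sorted leaves (value % stem_width)."""
    leaves = defaultdict(list)
    for value in sorted(values):
        leaves[value // stem_width].append(value % stem_width)
    # Include empty stems so that gaps in the data stay visible.
    return {stem: leaves.get(stem, []) for stem in range(min(leaves), max(leaves) + 1)}

plot = stem_and_leaf(sodium)
for stem, row in plot.items():
    print(f"{stem:>2} | {''.join(str(leaf) for leaf in row)}")
```

Because each leaf is a digit of an actual observation, the original values can be read back from the plot, for example stem 12 with leaves 5 and 5 represents 125 and 125.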

34
Frequency Table
  • Sodium Data
  • 0 210 260 125 220 290 210 140
    220 200 125 170 250 150 170 70
    230 200 290 180
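For a quantitative variable with many distinct values, the frequency table usually groups values into intervals. The sketch below counts the sodium values in intervals of width 40. That width is an assumption made for illustration, since the intervals used on the original slide are not shown in this transcript. The row of `#` marks next to each count is a crude text histogram.

```python
# Sodium values for the 20 cereals, as listed on the slide.
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

WIDTH = 40  # assumed interval width, not taken from the slide

def binned_frequencies(values, width):
    """Count observations in consecutive intervals [start, start + width)."""
    first = (min(values) // width) * width
    last = (max(values) // width) * width
    counts = {start: 0 for start in range(first, last + width, width)}
    for value in values:
        counts[(value // width) * width] += 1
    return counts

counts = binned_frequencies(sodium, WIDTH)
for start, count in counts.items():
    print(f"{start:>3} to {start + WIDTH - 1:>3} | {count:>2} | {'#' * count}")
```

Changing `WIDTH` changes the look of the display, which is the flexibility (and the risk) that comes with choosing intervals for a histogram.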

35
Histogram for Sodium in Cereals
36
Which Graph?
  • Dot-plot and stem-and-leaf plot
  • More useful for small data sets
  • Data values are retained
  • Histogram
  • More useful for large data sets
  • Most compact display
  • More flexibility in defining intervals

37
Shape of a Distribution
  • Overall pattern
  • Clusters?
  • Outliers?
  • Symmetric?
  • Skewed?
  • Unimodal?
  • Bimodal?

38
Symmetric or Skewed?
39
Example: Hours of TV Watching
40
  • Identify the minimum and maximum sugar values
  • 2 and 14
  • 1 and 3
  • 1 and 15
  • 0 and 16
41
Consider a data set containing IQ scores for the
general public
  • What shape would you expect a histogram of this
    data set to have?
  • Symmetric
  • Skewed to the left
  • Skewed to the right
  • Bimodal

42
Consider a data set of the scores of students on
a very easy exam in which most score very well
but a few score very poorly
  • What shape would you expect a histogram of this
    data set to have?
  • Symmetric
  • Skewed to the left
  • Skewed to the right
  • Bimodal

43
Section 2.3
  • How Can We Describe the Center of Quantitative
    Data?

44
Mean
  • The sum of the observations divided by the number
    of observations

45
Median
  • The midpoint of the observations when they are
    ordered from the smallest to the largest (or from
    the largest to the smallest)

46
Find the mean and median
  • CO2 pollution levels in the 8 largest nations,
    measured in metric tons per person
  • 2.3 1.1 19.7 9.8 1.8 1.2 0.7 0.2
  • Mean = 4.6, Median = 1.5
  • Mean = 4.6, Median = 5.8
  • Mean = 1.5, Median = 4.6
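The arithmetic for this exercise can be checked with a short Python sketch that applies the two definitions directly to the eight CO2 values. The function names are illustrative. With an even number of observations, the median is taken as the average of the two middle values.

```python
# CO2 values (metric tons per person) for the 8 nations, as listed on the slide.
co2 = [2.3, 1.1, 19.7, 9.8, 1.8, 1.2, 0.7, 0.2]

def mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)

def median(values):
    """Midpoint of the ordered observations."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    # Even count: average the two middle observations.
    return (ordered[middle - 1] + ordered[middle]) / 2

print(round(mean(co2), 1))    # 4.6
print(round(median(co2), 1))  # 1.5
```

The sum of the values is 36.8, so the mean is 36.8 / 8 = 4.6. In sorted order the two middle values are 1.2 and 1.8, so the median is 1.5.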

47
Outlier
  • An observation that falls well above or below the
    overall set of data
  • The mean can be highly influenced by an outlier
  • The median is resistant: it is affected little,
    if at all, by an outlier
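To see the difference in sensitivity, the sketch below reuses the eight CO2 values from the earlier exercise and treats the largest value, 19.7, as the outlier (an assumption made for illustration). It compares the mean and median with and without that observation, using Python's standard `statistics` module.

```python
import statistics

# CO2 values (metric tons per person) from the earlier exercise.
co2 = [2.3, 1.1, 19.7, 9.8, 1.8, 1.2, 0.7, 0.2]

# Treat the largest observation as the outlier and drop it.
without_outlier = [x for x in co2 if x != max(co2)]

mean_shift = statistics.mean(co2) - statistics.mean(without_outlier)
median_shift = statistics.median(co2) - statistics.median(without_outlier)

print(f"mean:   {statistics.mean(co2):.2f} -> {statistics.mean(without_outlier):.2f}")
print(f"median: {statistics.median(co2):.2f} -> {statistics.median(without_outlier):.2f}")
```

Dropping the single large value moves the mean by more than 2 units, while the median moves by only about 0.3.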

48
Mode
  • The value that occurs most frequently.
  • The mode is most often used with categorical data

49
Section 2.4
  • How Can We Describe the Spread of Quantitative
    Data?

50
Measuring Spread: Range
  • Range: the difference between the largest and
    smallest observations

51
Measuring Spread: Standard Deviation
  • Creates a measure of variation by summarizing the
    deviations of each observation from the mean and
    calculating an adjusted average of these
    deviations
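Both measures of spread can be sketched in Python using the sodium values from the cereal example. The version below assumes the "adjusted average" means dividing the sum of squared deviations by n - 1 (the usual sample standard deviation) and then taking the square root. The slide does not spell out the formula, so treat that as an assumption. The function names are illustrative.

```python
import math
import statistics

# Sodium values for the 20 cereals from the earlier example.
sodium = [0, 210, 260, 125, 220, 290, 210, 140, 220, 200,
          125, 170, 250, 150, 170, 70, 230, 200, 290, 180]

def value_range(values):
    """Difference between the largest and smallest observations."""
    return max(values) - min(values)

def sample_standard_deviation(values):
    """Square root of the sum of squared deviations divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    # Dividing by n - 1 rather than n is the assumed "adjustment".
    variance = sum(squared_deviations) / (n - 1)
    return math.sqrt(variance)

s = sample_standard_deviation(sodium)
print(value_range(sodium))
print(round(s, 1))
# Cross-check against the standard library's sample standard deviation.
print(math.isclose(s, statistics.stdev(sodium)))
```

The range uses only the two extreme observations, whereas the standard deviation uses every observation's distance from the mean.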

52
Empirical Rule
  • For bell-shaped data sets
  • Approximately 68% of the observations fall within
    1 standard deviation of the mean
  • Approximately 95% of the observations fall within
    2 standard deviations of the mean
  • Nearly 100% (about 99.7%) of the observations fall
    within 3 standard deviations of the mean
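The rule can be checked informally on simulated bell-shaped data. The sketch below draws 10,000 values from a normal distribution with mean 100 and standard deviation 15 (arbitrary illustrative choices, not values from the slides) and reports the share of observations within 1, 2, and 3 sample standard deviations of the sample mean.

```python
import random
import statistics

random.seed(0)  # fixed seed so the simulation is repeatable

# Simulated bell-shaped data; the mean and spread are arbitrary choices.
data = [random.gauss(100, 15) for _ in range(10_000)]

def share_within(values, k):
    """Proportion of observations within k standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    inside = sum(1 for x in values if abs(x - mean) <= k * sd)
    return inside / len(values)

shares = {k: share_within(data, k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} SD: {share:.1%}")
```

The printed shares should land close to 68%, 95%, and just under 100%. For data that are skewed or bimodal, the same calculation can give noticeably different shares.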

53
Parameter and Statistic
  • A parameter is a numerical summary of the
    population
  • A statistic is a numerical summary of a sample
    taken from a population
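The distinction can be illustrated with a small simulation. The population of 5,000 ages below is hypothetical and randomly generated. Its mean plays the role of the parameter, and the mean of a random sample of 50 plays the role of the statistic.

```python
import random
import statistics

random.seed(1)  # fixed seed so the simulation is repeatable

# Hypothetical population of student ages (not from the slides).
population = [random.randint(18, 30) for _ in range(5_000)]

# A simple random sample of 50 subjects drawn without replacement.
sample = random.sample(population, 50)

parameter = statistics.mean(population)  # numerical summary of the population
statistic = statistics.mean(sample)      # numerical summary of the sample

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

In practice the parameter is usually unknown, and the statistic computed from a sample is used to estimate it.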