ENMA 420/520 Statistical Processes Spring 2007 - PowerPoint PPT Presentation

About This Presentation
Title:

ENMA 420/520 Statistical Processes Spring 2007

Description:

ENMA 420/520 Statistical Processes Spring 2007 Michael F. Cochrane, Ph.D. Dept. of Engineering Management Old Dominion University ENMA 420/520 Syllabus Instructor ... – PowerPoint PPT presentation

Date added: 11 October 2019
Slides: 77
Provided by: Autho496
Learn more at: http://www.odu.edu
Transcript and Presenter's Notes

Title: ENMA 420/520 Statistical Processes Spring 2007


1
ENMA 420/520 Statistical Processes Spring 2007
  • Michael F. Cochrane, Ph.D.
  • Dept. of Engineering Management
  • Old Dominion University

2
ENMA 420/520 Syllabus
  • Instructor
  • Michael F. Cochrane, Ph.D.
  • Chief Analyst, US Joint Forces Command, Joint
    Experimentation Directorate (J9)
  • Phones
  • 757-203-5131 (Office)
  • mcochran@odu.edu
  • http://www.odu.edu/engr/mcochrane

3
ENMA 420/520 Syllabus - Continued
  • Office hours (by appointment)
  • Textbooks
  • Statistics for Engineering and the Sciences
  • Mendenhall, William and Sincich, Terry
  • Data Analysis with Microsoft Excel
  • Berk and Carey

4
ENMA 420/520 Syllabus - Continued
  • Course Objectives
  • Target audience
  • Never taken statistics or need refresher
  • Purpose
  • Foundation linkages to engineering applications

5
ENMA 420/520 Syllabus - Continued
  • Spreadsheet software
  • Excel
  • Theory → application
  • Data Analysis add-in to Excel (/Tools/Data
    Analysis)
  • Software as an enabling tool
  • Availability of more powerful statistical software
  • JMP, SAS, SPSS, @RISK
  • Not as prevalent as spreadsheet software
  • Targeted to specialized users
  • Will demo

6
ENMA 420/520 Syllabus - Continued
End of course project for 520 students
  • Course format
  • Class schedule in syllabus
  • Two examinations
  • Mid-term: take home
  • Final: in-class
  • Two quizzes: take home
  • Homework
  • Assigned but not graded

7
ENMA 420/520 Syllabus - Continued
  • ENMA 520
  • Mid-term exam 30 points
  • Final exam 30 points
  • Two quizzes 30 points
  • Course project 10 points
  • ENMA 420
  • Mid-term exam 30 points
  • Final exam 30 points
  • Two quizzes 40 points

8
ENMA 420/520 Syllabus - Continued
  • Do not copy everything on board!!!
  • Use the web (www.odu.edu/engr/mcochrane)

9
ENMA 420/520 Syllabus - Continued
  • The ODU Honor Code

10
Probability & Statistics: Getting Started
  • Flow of course
  • Statistics as means of information processing
  • Introduction to probability
  • Standard probability models
  • Statistics as means of inferring population
    characteristics
  • Alternative approach to inferential statistics

11
Getting Started: Class 1
  • Reading assignments
  • M&S
  • Sections 1.1 - 1.4 (Introduction)
  • Sections 2.1 - 2.8 (Descriptive Statistics)
  • Recommended Problems
  • M&S Chapter 2: 45-48, 53-56

The textbook chapters will always be from
the syllabus!
12
Statistics: Principal Branches
  • Descriptive Statistics
  • Organizing
  • Summarizing
  • Describing
  • Contrasting
  • Develop insights into data sets
  • Inferential Statistics
  • Inferring
  • Estimating
  • Modeling
  • Develop insights into populations

13
Data Sets: Populations & Samples
  • Population (universe)
  • Enumerative data set
  • Examples of population data sets?
  • Sample
  • Subset of data from population
  • Examples of sample data sets?

Important to distinguish between a population &
a sample
Parameters - characteristics of population
Statistic - characteristic of sample
14
Are All Data the Same? Measurable Data
  • Quantitative data
  • Measurable quantity (numerically valued scales)
  • Example of quantitative data?
  • Interval data
  • Distinct units of distance but no zero
  • Example?
  • Ratio data
  • Distinct units of distance and a zero
  • Example?

15
Types of Data: Non-Measurable Data
  • Qualitative (categorical) data
  • Non-measurable quantity
  • Caution: sometimes quantified for convenience
  • Example of qualitative data?
  • Nominal data
  • No meaningful order
  • Example?
  • Ordinal
  • Distinct ranking possible
  • Example?

16
Recurring Key Concept: A Data Detective
  • A data detective
  • Essential for modeling
  • Understand data & underlying processes
  • Prerequisite first step
  • Population or sample?
  • Type of data?
  • Data patterns
  • Data as clue to process
  • Next step - describing data!

17
Descriptive Statistics
  • Digest data
  • Arrange or present data
  • Develop summary characteristics of data
  • Communicating with data

18
Topics and Concepts
  • Describing qualitative data
  • Various graphical methods
  • Describing quantitative data
  • Graphical methods
  • Numerical descriptions of data sets
  • Important concepts
  • Understand strengths & weaknesses of methods
  • Suitability of method to specific applications
  • Present data to highlight insights

19
Describing Qualitative Data
  • Basic steps
  • Define categories
  • Assign observations → categories
  • Category frequency
  • Category relative frequency
  • Present graphically
  • Key concept
  • Minimize category ambiguity
  • Observation → 1 & only 1 category
  • Examples?
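
The basic steps above can be sketched in a few lines of Python. The category labels and observations are hypothetical and are used only to show the frequency and relative-frequency calculations; they are not data from the course.

```python
from collections import Counter

# Hypothetical observations; each one falls in exactly one category.
observations = ["speeding", "alcohol", "speeding", "weather",
                "alcohol", "speeding", "other", "weather"]

# Category frequency: number of observations in each category.
frequency = Counter(observations)

# Category relative frequency: frequency divided by the number of observations.
n = len(observations)
relative_frequency = {cat: count / n for cat, count in frequency.items()}

# Print a small table, most frequent category first.
for cat, count in frequency.most_common():
    print(f"{cat:10s} {count:3d} {relative_frequency[cat]:6.3f}")
```

Because every observation is assigned to one and only one category, the relative frequencies sum to 1.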

20
Categorical Data: Principal Graphing Techniques
  • Visualizing categorical data
  • Histograms
  • Frequency
  • Relative frequency
  • Cumulative frequency
  • Pie charts
  • Pictographs
  • Use common sense
  • What is message you are communicating?

21
Visualizing Categorical Data: Example Problem
  • Problem: Cause of accidents in Florida (1988)

22
Categorical Data: Example Problem
  • Many different ways to present data
  • Use good judgment
  • Which approach highlights your message
  • What if you had data for 1988 & 1989?
  • What if you had data for many years?

23
Visualizing Data: Graphing Quantitative Data
  • Same fundamental purpose
  • Communicate information
  • Gain insights into data sets, relationships
  • Methods to be discussed
  • Dot plots
  • Stem & leaf diagrams
  • Histograms (relative & cumulative frequency)

24
Visualizing Quantitative Data: Example Data Set
  • Took sample of home mortgage rates in a
    neighborhood

25
Dot Plot: A Quick & Dirty Method
  • Simplest graph
  • Suitable for small data sets
  • Single axis → scale that spans data range
  • Range → minimum to maximum values
  • Each observation is dot on axis
  • Most statistical software includes
  • Excel does not provide

26
Example Dot Plot
  • Constructed using Minitab
  • What are strengths & weaknesses?

27
Dot Plot Summary
  • Advantages
  • Easy to construct (back of napkin)
  • Identify range, possible outliers, distribution
    of data
  • Outlier → highly unusual observation
  • Disadvantages
  • Limited to small data sets
  • May be difficult to reconstruct original data set
  • Have to be careful with scale

28
Dot Plot: Compressed Scaling
  • Note contrast with previous dot plot
  • Lose observation measurability
  • However can now observe data groupings
  • Before → observations evenly distributed
  • Now → see mound-shaped distribution

29
Stem & Leaf Diagrams: Alternative Graphing Technique
  • Steps in constructing
  • Divide observation into 2 parts
  • Stem & leaf (choose convenient scales)
  • Example: 8.2 → stem 8, leaf 2
  • List stems in order
  • Proceed through all observations
  • Arrange leaves in order
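
A minimal sketch of these construction steps in Python, assuming a leaf unit of 0.1 so that the last digit is the leaf and everything before it is the stem. The function name `stem_and_leaf` and the sample rates are illustrative, and stems with no observations are simply omitted.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=0.1):
    """Return {stem: sorted list of leaf digits} for the given values."""
    stems = defaultdict(list)
    for v in values:
        units = round(v / leaf_unit)    # e.g. 8.2 -> 82 leaf units
        stem, leaf = divmod(units, 10)  # 82 -> stem 8, leaf 2
        stems[stem].append(leaf)
    # List stems in order and arrange the leaves of each stem in order.
    return {s: sorted(stems[s]) for s in sorted(stems)}

rates = [8.2, 6.8, 7.5, 10.0, 7.0, 9.2, 8.0, 6.0]  # hypothetical rates (%)
for stem, leaves in stem_and_leaf(rates).items():
    print(f"{stem:3d} | {''.join(str(leaf) for leaf in leaves)}")
```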

30
Stem & Leaf Diagram: Example Data Set
  • Stem-and-leaf of Rate
  • Leaf Unit 0.10

      2     6 | 08
      8     7 | 024579
     (6)    8 | 001235
      6     9 | 025
      3    10 | 058

  • Stem-and-leaf of Rate (two lines per stem)
  • Leaf Unit 0.10

      1     6 | 0
      2     6 | 8
      5     7 | 024
      8     7 | 579
     (5)    8 | 00123
      7     8 | 5
      6     9 | 02
      4     9 | 5
      3    10 | 0
      2    10 | 58

31
Stem & Leaf Diagrams: Summary
  • Advantages
  • Visualize data groupings
  • Can recreate original data set
  • Simple to construct
  • Disadvantages
  • Limited to small data sets

32
Histograms: Dealing With Large Data Sets
  • Visualizing large data sets
  • Aggregate observations into classes
  • Observations lose individual identity
  • Example data
  • Possible classes
  • 6 - 7
  • 7 - 8
  • 8 - 9
  • and so forth

33
Histograms: Four Easy Steps
  • Determine range of data
  • Divide range into convenient class intervals
  • Key step
  • Consider open intervals for extremes
  • Text mentions rules of thumb
  • Excel will do it for you (do not let it!)
  • Count observations in each interval
  • Graph
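
The four steps can be sketched as follows. The rates are read back from the stem-and-leaf display of the mortgage example (leaf unit 0.10), so treat them as a reconstruction rather than the original data file. The class edges are chosen by hand, and each class is assumed to include its lower edge and exclude its upper edge.

```python
# Mortgage rates (%) read back from the stem-and-leaf display.
rates = [6.0, 6.8, 7.0, 7.2, 7.4, 7.5, 7.7, 7.9, 8.0, 8.0,
         8.1, 8.2, 8.3, 8.5, 9.0, 9.2, 9.5, 10.0, 10.5, 10.8]

# Steps 1-2: the data range from 6.0 to 10.8, so use classes 6-7, 7-8, ...
edges = [6, 7, 8, 9, 10, 11]
classes = list(zip(edges, edges[1:]))

# Step 3: count observations in each class [lo, hi).
counts = [sum(lo <= r < hi for r in rates) for lo, hi in classes]

# Relative frequency and cumulative relative frequency per class.
n = len(rates)
relative = [c / n for c in counts]
cumulative = [sum(counts[: i + 1]) / n for i in range(len(counts))]

# Step 4: graph (a text histogram stands in for a chart here).
for (lo, hi), c, rel, cum in zip(classes, counts, relative, cumulative):
    print(f"{lo:2d}-{hi:<2d} {'*' * c:8s} {c:2d} {rel:5.2f} {cum:5.2f}")
```

Changing `edges` shows how strongly the picture depends on the class intervals, which is why the choice should not be left to the software default.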

34
Histogram: Example Problem Using Excel
  • Consider home mortgage problem
  • Build
  • Histogram
  • Relative frequency histogram
  • Cumulative frequency histogram
  • Also called ogive
  • Pareto diagram

35
Insert Excel Demo Here
41
Quantitative Data: Numerical Descriptors
  • Develop numeric characterizations of data
  • Central tendency
  • Locate general location (center) of data
  • Variation
  • Describe dispersion (spread) of observations in
    set
  • Relative standing
  • Describe observations relative to others in set
  • Note: each provides different perspective of data.

42
Measures of Central Tendency
  • Three most common measures
  • Mean
  • Median
  • Mode
  • Others exist
  • Trimmed mean
  • Truncated mean
  • Conditional mean

43
Measures of Central Tendency: The Average
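
For reference, the standard definitions of the average (arithmetic mean), written with y_i for the observations as elsewhere in these slides. The symbols n (sample size) and N (population size) are assumed notation.

```latex
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \quad \text{(sample mean)}
\qquad
\mu = \frac{1}{N}\sum_{i=1}^{N} y_i \quad \text{(population mean)}
```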
44
Measures of Central Tendency: Median
  • Median - the middle observation
  • 50% of observations below, 50% above
  • m = median of sample
  • ? ? median of population
  • A resistant measure
  • Relatively insensitive to extreme values
  • Contrast to the mean!

45
Determining the Median
46
Determining the Median: Even Numbered Example
47
Determining the Median: Odd Numbered Example
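
A short sketch of the usual rule for both cases: sort the data, take the middle observation when n is odd, and average the two middle observations when n is even. The data values are made up for illustration.

```python
def median(values):
    """Middle value of the sorted data; mean of the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_example = [5, 3, 9, 1, 7]          # sorted: 1 3 5 7 9
even_example = [5, 3, 9, 1, 7, 100]    # sorted: 1 3 5 7 9 100

print(median(odd_example))    # 5
print(median(even_example))   # 6.0
```

Adding the extreme value 100 moves the median only from 5 to 6, while the mean jumps from 5 to about 20.8, which illustrates why the median is called a resistant measure.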
48
Measures of Central Tendency: The Mode
  • Mode of Y
  • Value y_i that occurs with the greatest frequency
  • Note
  • Mode may or may not exist
  • Y = {1, 2, 4, 6} (no mode)
  • There may be one or more modes
  • One mode: unimodal
  • Two modes: bimodal
  • More than two modes: multimodal
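
A small sketch of these cases in Python. Treating a data set in which every value occurs once as having no mode is the convention assumed here; some software instead reports every value as a mode.

```python
from collections import Counter

def modes(values):
    """Return all values tied for the highest frequency, or [] if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1:              # every value occurs once: report "no mode"
        return []
    return sorted(v for v, c in counts.items() if c == top)

print(modes([1, 2, 4, 6]))        # []      no mode
print(modes([1, 2, 2, 3]))        # [2]     unimodal
print(modes([1, 1, 2, 2, 3]))     # [1, 2]  bimodal
```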

49
Measures of Central Tendency: Excel Special Functions
  • Mean
  • AVERAGE( )
  • Median
  • MEDIAN( )
  • Mode
  • MODE( )

50
Types of Frequency Curves: A Symmetric Frequency Distribution
Where are the mean, median and mode?
52
Types of Frequency Curves: An Asymmetric Frequency Distribution
Skewed to the Right, Right Tailed, or Positive Skewness
Where are the mean, median and mode?
54
Types of Frequency Curves: An Asymmetric Frequency Distribution
Skewed to the Left, Left Tailed or Negative Skewness
Where are the mean, median and mode?
56
Measures of Dispersion
  • Central tendency incomplete/misleading picture
  • Consider 3 sets of student grades
  • Y1 = {60, 60, 100, 100}: What is the mean?
  • Y2 = {30, 100, 95, 95}: What is the mean?
  • Y3 = {75, 80, 80, 85}: What is the mean?
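
A quick check of the three grade sets in Python, printing the mean next to the smallest and largest grade in each set:

```python
from statistics import mean

grades = {
    "Y1": [60, 60, 100, 100],
    "Y2": [30, 100, 95, 95],
    "Y3": [75, 80, 80, 85],
}

for name, ys in grades.items():
    print(f"{name}: mean = {mean(ys)}, min = {min(ys)}, max = {max(ys)}")
```

All three sets have the same mean even though the grades are spread very differently, so a measure of central tendency alone cannot tell them apart.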

57
Measures of Dispersion
  • Will discuss
  • Range
  • Variance
  • Standard deviation

58
Measure of Dispersion: The Range
  • Range = Max(y_i) - Min(y_i)
  • Note: Excel special functions
  • Is the range resistant to extreme values?

59
Measures of Dispersion: Variance and Standard Deviation
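
For reference, the standard definitions, using y_i for the observations. The symbols n and N for the sample and population sizes are assumed notation.

```latex
s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n - 1}, \qquad s = \sqrt{s^2}
\quad \text{(sample)}
\\[6pt]
\sigma^2 = \frac{\sum_{i=1}^{N} (y_i - \mu)^2}{N}, \qquad \sigma = \sqrt{\sigma^2}
\quad \text{(population)}
```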
60
General CommentsVariances
  • Why not use average deviation from the mean as a
    measure of dispersion?
  • Why is denominator (n-1) for a sample?
  • What happens to variances of samples and
    populations as n increases?
  • Make sure you can use relevant Excel functions
  • Easier expression
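
A sketch that touches several of these questions numerically. It assumes the "easier expression" is the usual computational shortcut, s² = (Σy² − (Σy)²/n) / (n − 1), and uses a four-value sample for illustration.

```python
from statistics import pvariance, variance

y = [5, 10, 12, 15]          # small illustrative sample
n = len(y)
ybar = sum(y) / n

# Deviations from the mean always sum to zero, so their plain average
# cannot serve as a measure of dispersion.
sum_of_deviations = sum(yi - ybar for yi in y)

# Definition: sum of squared deviations divided by (n - 1).
s2_definition = sum((yi - ybar) ** 2 for yi in y) / (n - 1)

# Shortcut: needs only the sum of y and the sum of y squared.
s2_shortcut = (sum(yi ** 2 for yi in y) - sum(y) ** 2 / n) / (n - 1)

print(sum_of_deviations)                          # 0.0
print(s2_definition, s2_shortcut, variance(y))    # sample variance, three ways
print(pvariance(y))                               # population formula divides by n
```

The gap between the n − 1 and n versions shrinks as n grows, since the two differ only by the factor n / (n − 1).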

61
Distribution Heuristics
  • Empirical rule
  • Assume a bell shaped frequency curve
  • Distribution of observations about mean
  • 68% of observations within mean ± s
  • 95% of observations within mean ± 2s
  • > 99% of observations within mean ± 3s
  • What is basis?

62
Distribution Heuristics
  • Camp-Meidell Inequality
  • Assume unimodal distribution
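
For reference, the inequality in its usual form, which is consistent with the 88.9% figure quoted for k = 2 in the resistor example. The symbols μ and σ for the mean and standard deviation are assumed notation.

```latex
P\left(|Y - \mu| < k\sigma\right) \;\ge\; 1 - \frac{1}{2.25\,k^2}
```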

63
Distribution Heuristics
  • Chebyshev's Rule
  • For any distribution (no assumptions)
  • For k > 1
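
For reference, the rule in its usual form, which gives the 75% figure quoted for k = 2 in the resistor example. The symbols μ and σ for the mean and standard deviation are assumed notation.

```latex
P\left(|Y - \mu| < k\sigma\right) \;\ge\; 1 - \frac{1}{k^2}, \qquad k > 1
```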

64
Distribution Heuristics: Example
  • Resistors have following characteristics
  • μ = 200 Ω and σ = 2 Ω
  • Chebyshev's Rule
  • k = 2: at least 75% of resistors within 200 ± 4 Ω
  • No assumptions needed
  • Camp-Meidell
  • k = 2: at least 88.9% of resistors within 200 ± 4 Ω
  • Assumption: distribution is unimodal
  • Empirical Rule
  • 95% of resistors within 200 ± 4 Ω
  • Assumption: distribution is bell shaped
  • Note impact of the assumptions.
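
The three figures can be reproduced with a short calculation. Using the normal distribution as the basis for the empirical rule is an assumption of this sketch, although it matches the 95% figure above.

```python
from statistics import NormalDist

mu, sigma, k = 200.0, 2.0, 2      # resistor example: 200 ± 4 ohms
lower, upper = mu - k * sigma, mu + k * sigma

chebyshev = 1 - 1 / k**2                  # any distribution
camp_meidell = 1 - 1 / (2.25 * k**2)      # unimodal distribution
std_normal = NormalDist()                 # bell-shaped (normal) distribution
normal = std_normal.cdf(k) - std_normal.cdf(-k)

print(f"Interval: {lower} to {upper} ohms")
print(f"Chebyshev    at least {chebyshev:.1%}")
print(f"Camp-Meidell at least {camp_meidell:.1%}")
print(f"Normal       about    {normal:.1%}")
```

Each added assumption about the shape of the distribution tightens the statement that can be made about the same interval.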

65
Measure of Dispersion: Coefficient of Variation
  • Data set one (a sample)
  • Y1 = {5, 10, 12, 15}
  • mean = 10.5, std. deviation = 4.20
  • Data set two (a sample)
  • Y2 = {50, 100, 120, 150}
  • mean = 105, std. deviation = 42.03
  • Is Y2 more widely dispersed than Y1?

66
Coefficient of Variation
  • Notes
  • CV compensates for different scaling
  • CV is unitless
  • Can compare distributions with different units
  • Fails to be useful when mean is ≈ 0
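
A sketch of the calculation for the two samples from the preceding slide, assuming the usual definition CV = s / mean (often reported as a percentage). The function name is illustrative.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean (unitless)."""
    return stdev(values) / mean(values)

y1 = [5, 10, 12, 15]
y2 = [50, 100, 120, 150]

cv1 = coefficient_of_variation(y1)
cv2 = coefficient_of_variation(y2)
print(f"CV of Y1 = {cv1:.3f}")
print(f"CV of Y2 = {cv2:.3f}")
```

Y2 is Y1 multiplied by 10, so its standard deviation is ten times larger but its CV is the same. Relative to their means, the two sets are equally dispersed.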

67
Measures of Relative Standing: Percentiles
68
Special Percentiles
  • QL - lower quartile or 25th percentile
  • Qm - mid-quartile or 50th percentile (median)
  • QU - upper quartile or 75th percentile
  • Suppose that 650 is QU for GRE scores, what can
    you say about people's scores?
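
A sketch of the three special percentiles using Python's `statistics.quantiles` with its default interpolation method. Textbooks and software packages use slightly different percentile rules, so results may differ a little from a hand calculation. The rates are read back from the stem-and-leaf display of the mortgage example and should be treated as a reconstruction.

```python
from statistics import median, quantiles

rates = [6.0, 6.8, 7.0, 7.2, 7.4, 7.5, 7.7, 7.9, 8.0, 8.0,
         8.1, 8.2, 8.3, 8.5, 9.0, 9.2, 9.5, 10.0, 10.5, 10.8]

# n=4 splits the data into quarters and returns the three cut points.
q_l, q_m, q_u = quantiles(rates, n=4)

# Share of observations at or below the upper quartile.
share_below_qu = sum(r <= q_u for r in rates) / len(rates)

print(f"QL = {q_l:.3f}, Qm = {q_m:.3f}, QU = {q_u:.3f}")
print(f"Median = {median(rates):.3f}")
print(f"Share at or below QU = {share_below_qu:.0%}")
```

Read the same way, a QU of 650 means roughly three quarters of the scores fall at or below 650 and roughly one quarter above it.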

69
Data Detective
  • Exploratory data analysis
  • Understanding data
  • Six characteristics of a data set
  • Shape - unimodal, bimodal?
  • Location - measures of central tendency
  • Spread - variance
  • Outliers - unusual observations in data set
  • Clustering - similar to modality
  • Granularity - what discrete values are allowed?

70
Reasons for outliers
  • Measurement errors
  • Observation comes from different population
  • Rare occurrence

71
Identifying Outliers: Empirical Rule
  • Objective
  • Determine how many standard deviations
    observation is from mean
  • Approach
  • Standardized variables → z scores

How many std. deviations is y_i away from the mean?
72
Identifying Outliers: Using z Scores
  • Suppose you had observation with z = 3.5
  • What would you naturally conclude?
  • Note
  • May be used on populations or samples
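
A minimal sketch of the z-score approach for a sample, using z = (y_i − mean) / s. The data are hypothetical, and the cutoff |z| > 3 is a common rule of thumb assumed here rather than a value stated on the slide.

```python
from statistics import mean, stdev

def z_scores(values):
    """Number of sample standard deviations each observation lies from the mean."""
    ybar, s = mean(values), stdev(values)
    return [(y - ybar) / s for y in values]

# Hypothetical sample with one unusually large observation.
data = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11, 10, 9, 11, 10, 25]

zs = z_scores(data)
flagged = [y for y, z in zip(data, zs) if abs(z) > 3]

print(f"largest z = {max(zs):.2f}")
print("flagged as possible outliers:", flagged)
```

With a bell-shaped distribution, well under 1% of observations lie more than 3 standard deviations from the mean, so a z score near 3.5 points to a highly unusual observation.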

73
Identifying Outliers: Boxplots
  • Another method for visualizing data set
  • Data variation
  • Identification of outliers
  • Need to understand components of boxplot

74
Boxplot Example: Car Horsepower Ratings
[Figure: boxplot with labels, top to bottom: outer fence, inner fence, 75th percentile, mean (104.5), median (94), 25th percentile, inner fence]
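
A sketch of how the boxplot components are computed, assuming the common convention that inner fences lie 1.5 interquartile ranges beyond the quartiles and outer fences lie 3 interquartile ranges beyond them. The horsepower values are hypothetical and are not the data behind the figure.

```python
from statistics import quantiles

def fences(values):
    """Return (inner, outer) fence pairs based on the quartiles and the IQR."""
    q_l, _, q_u = quantiles(values, n=4)
    iqr = q_u - q_l
    inner = (q_l - 1.5 * iqr, q_u + 1.5 * iqr)
    outer = (q_l - 3.0 * iqr, q_u + 3.0 * iqr)
    return inner, outer

# Hypothetical horsepower ratings.
horsepower = [65, 70, 78, 85, 90, 94, 98, 105, 110, 125, 150, 230]

inner, outer = fences(horsepower)

# Between inner and outer fence: suspect outlier.
suspect = [y for y in horsepower
           if (outer[0] <= y < inner[0]) or (inner[1] < y <= outer[1])]
# Beyond the outer fence: highly suspect outlier.
highly_suspect = [y for y in horsepower if y < outer[0] or y > outer[1]]

print("inner fences:", inner)
print("outer fences:", outer)
print("suspect:", suspect, "highly suspect:", highly_suspect)
```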
75
Descriptive Statistics
  • The art of communications
  • Learn the tools available
  • Understand purpose of analysis
  • Use common sense
  • Look for insights into process/experiment

76
Homework Assignment: Class 1
  • Reading assignments
  • M&S
  • Sections 1.1 - 1.4 (Introduction)
  • Sections 2.1 - 2.8 (Descriptive Statistics)
  • B&C
  • Chapters 1 through 4
  • Recommended Problems
  • M&S Chapter 2: 45-48, 53-56