Analyze Phase Introduction to Hypothesis Testing - PowerPoint PPT Presentation

Title: Analyze - Intro to Hypothesis Testing
Subject: Analyze - Intro to Hypothesis Testing
Author: Open Source Six Sigma
Keywords: Analyze - Intro to Hypothesis Testing

Slides: 29
Provided by: OpenSourc63
Transcript and Presenter's Notes

1
Analyze Phase: Introduction to Hypothesis Testing
2
Hypothesis Testing (ND)
Welcome to Analyze
X Sifting
Inferential Statistics
Hypothesis Testing Purpose
Tests for Central Tendency
Intro to Hypothesis Testing
Tests for Variance
Hypothesis Testing ND P1
ANOVA
Hypothesis Testing ND P2
Hypothesis Testing NND P1
Hypothesis Testing NND P2
Wrap Up Action Items
3
Six Sigma Goals and Hypothesis Testing
  • Our goal is to improve our Process Capability.
    This translates to the need to move the process
    Mean (or proportion) and reduce the Standard
    Deviation.
  • Because it is too expensive or too impractical
    (not to mention theoretically impossible) to
    collect population data, we will make decisions
    based on sample data.
  • Because we are dealing with sample data, there is
    some uncertainty about the true population
    parameters.
  • Hypothesis Testing helps us make fact-based
    decisions about whether the population parameters
    really differ or whether the differences are
    just due to expected sample variation.

4
Purpose of Hypothesis Testing
  • The purpose of appropriate Hypothesis Testing is
    to integrate the Voice of the Process with the
    Voice of the Business to make data-based
    decisions to resolve problems.
  • Hypothesis Testing can help avoid high costs of
    experimental efforts by using existing data. This
    can be likened to:
      • Local store costs versus mini bar expenses.
  • There may be a need to eventually use
    experimentation, but careful data analysis can
    indicate a direction for experimentation if
    necessary.
  • The probability of occurrence is based on a
    pre-determined statistical confidence.
  • Decisions are based on:
      • Beliefs (past experience)
      • Preferences (current needs)
      • Evidence (statistical data)
      • Risk (acceptable level of failure)

5
The Basic Concept for Hypothesis Tests
  • Recall from the discussion on classes and causes
    of distributions that a data set may seem Normal,
    yet still be made up of multiple distributions.
  • Hypothesis Testing can help establish a
    statistical difference between factors from
    different distributions.

Did my sample come from this population? Or
this? Or this?
6
Significant Difference
  • Are the two distributions significantly
    different from each other? How sure are we of our
    decision?
  • How does the number of observations affect our
    confidence in detecting the population Mean?

[Figure: distributions of Sample 1 and Sample 2]
7
Detecting Significance
  • Statistics provide a methodology to detect
    differences.
  • Examples might include differences in suppliers,
    shifts or equipment.
  • Two types of significant difference occur and
    must be well understood: practical and
    statistical.
  • Failure to tie these two differences together is
    one of the most common errors in statistics.

H0: The sky is not falling. Ha: The sky is falling.
8
Practical vs. Statistical
  • Practical Difference: The difference that
    results in an improvement of practical or
    economic value to the company.
      • Example: an improvement in yield from 96 to 99
        percent.
  • Statistical Difference: A difference or change
    to the process that probably (with some defined
    degree of confidence) did not happen by chance.
      • Examples might include differences in suppliers,
        markets or servers.

We will see that it is possible to realize a
statistically significant difference without
realizing a practically significant difference.
9
Detecting Significance
  • During the Measure Phase, it is important that
    the nature of the problem be well understood.
  • In understanding the problem, the practical
    difference to be achieved must match the
    statistical difference.
  • The difference can be either a change in the
    Mean or in the variance.
  • Detection of a difference is then accomplished
    using statistical Hypothesis Testing.

Mean Shift
Variation Reduction
10
Hypothesis Testing
  • A hypothesis is an a priori theory relating
    to differences between variables.
  • A statistical test or Hypothesis Test is
    performed to support or refute the theory.
  • A Hypothesis Test converts the practical problem
    into a statistical problem.
  • Since relatively small sample sizes are used to
    estimate population parameters, there is always a
    chance of collecting a non-representative sample.
  • Inferential statistics allows us to estimate the
    probability of getting a non-representative
    sample.

11
DICE Example
  • We could throw a die a number of times and track
    how many times each face occurred. With a standard
    die, we would expect each face to occur 1/6 or
    16.67% of the time.
  • If we threw the die 5 times and got 5 ones, what
    would you conclude? How sure can you be?
  • Pr(1 one) = 0.1667; Pr(5 ones) = (0.1667)^5 =
    0.00013
  • There are approximately 1.3 chances out of 10,000
    that we could have gotten 5 ones with a standard
    die.
  • Therefore, we would say we are willing to take a
    0.013% chance of being wrong about our hypothesis
    that the die was loaded, since the results do
    not come close to our predicted outcome.
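
The dice arithmetic above can be checked with a short sketch. It assumes a fair six-sided die and independent throws; the variable names are illustrative only.

```python
from fractions import Fraction

# Probability of rolling a one on a single throw of a fair six-sided die.
p_one = Fraction(1, 6)

# Five independent throws: multiply the single-throw probabilities.
p_five_ones = p_one ** 5

print(f"Pr(5 ones) = {p_five_ones} = {float(p_five_ones):.6f}")
print(f"About {float(p_five_ones) * 10_000:.1f} chances in 10,000")
```

Using exact fractions avoids the rounding that comes from starting with 0.1667.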

12
Hypothesis Testing
[Figure: DECISIONS, linked to Type I Error (α), Type II Error (β) and Sample Size (n)]
13
Statistical Hypotheses
  • A hypothesis is a predetermined theory about the
    nature of, or relationships between, variables.
    Statistical tests can show (with a certain
    degree of confidence) that a relationship
    exists.
  • We have two alternative hypotheses:
      • The null hypothesis Ho assumes that there are
        no differences or relationships. This is the
        default assumption of all statistical tests.
      • The alternative hypothesis Ha states that there
        is a difference or relationship.

P-value > 0.05: Ho, no difference or relationship
P-value < 0.05: Ha, there is a difference or relationship
Making a decision does not FIX a problem, taking
action does.
14
Steps to Statistical Hypothesis Test
  • State the Practical Problem.
  • State the Statistical Problem.
      • Ho: ___ = ___
      • Ha: ___ ≠, >, < ___
  • Select the appropriate statistical test and risk
    levels.
      • α = 0.05
      • β = 0.10
  • Establish the sample size required to detect the
    difference.
  • State the Statistical Solution.
  • State the Practical Solution.

Noooot THAT practical solution!
15
How Likely is Unlikely?
  • Any differences between observed data and claims
    made under H0 may be real or due to chance.
  • Hypothesis Tests determine the probabilities of
    these differences occurring solely due to chance
    and call them P-values.
  • The α level of a test (level of significance)
    represents the yardstick against which P-values
    are measured, and H0 is rejected if the P-value
    is less than the alpha level.
  • The most commonly used levels are 5%, 10% and 1%.
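
As a minimal sketch of how a P-value is measured against the alpha level, the snippet below computes a two-sided P-value for a z-test and compares it with the three common levels. The choice of a z-test, the function name `two_sided_p_value` and the observed statistic of 2.2 are assumptions made for illustration; they are not taken from the presentation.

```python
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """P-value for a two-sided z-test, given the test statistic z."""
    # Probability of a result at least this extreme in either tail under H0.
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


z_observed = 2.2  # hypothetical test statistic, for illustration only
p_value = two_sided_p_value(z_observed)

for alpha in (0.10, 0.05, 0.01):
    decision = "reject H0" if p_value < alpha else "fail to reject H0"
    print(f"alpha = {alpha:.2f}: P-value = {p_value:.4f} -> {decision}")
```

The same data can lead to different decisions at different alpha levels, which is why the level should be chosen before the test is run.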

16
Hypothesis Testing Risk
  • The alpha risk or Type I Error (generally called
    the Producer's Risk) is the probability that we
    could be wrong in saying that something is
    different. It is an assessment of the
    likelihood that the observed difference could
    have occurred by random chance. Alpha is the
    primary decision-making tool of most statistical
    tests.

17
Alpha Risk
  • Alpha (α) risks are expressed relative to a
    reference distribution.
  • Distributions include:
      • t-distribution
      • z-distribution
      • χ²-distribution
      • F-distribution

The α-level is represented by the clouded
areas. Sample results in this area lead to
rejection of H0.
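
For the z-distribution, the boundary of the rejection area can be sketched as follows. This assumes α = 0.05 and uses only the Python standard library; the t-, χ²- and F-distributions are not covered here and would need a dedicated statistics library.

```python
from statistics import NormalDist

alpha = 0.05
std_normal = NormalDist()  # z-distribution: mean 0, standard deviation 1

# One-sided test: the whole alpha area sits in a single tail.
z_crit_one_sided = std_normal.inv_cdf(1 - alpha)

# Two-sided test: alpha is split equally between the two tails.
z_crit_two_sided = std_normal.inv_cdf(1 - alpha / 2)

print(f"One-sided critical z: {z_crit_one_sided:.3f}")
print(f"Two-sided critical z: +/-{z_crit_two_sided:.3f}")
```

A test statistic beyond the critical value falls in the rejection area.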
18
Hypothesis Testing Risk
  • The beta risk or Type II Error (also called the
    Consumer's Risk) is the probability that we
    could be wrong in saying that two or more things
    are the same when, in fact, they are different.

19
Beta Risk
  • Beta Risk is the probability of failing to reject
    the null hypothesis when a difference exists.

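[Figure: the distribution if H0 is true (α = 0.05, centered on the H0 value) and the distribution if Ha is true (β = Pr(Type II error)), divided by the critical value of the test statistic; results on the H0 side of the critical value lead to "Accept H0"]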
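
A minimal sketch of the beta calculation, assuming a one-sided (upper-tail) z-test with a known Standard Deviation. The function name `beta_risk` and the example values for the difference, Standard Deviation and sample size are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def beta_risk(delta: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Pr(Type II error) for a one-sided (upper-tail) z-test with known sigma."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)    # critical value under H0
    shift = delta * sqrt(n) / sigma  # true shift in standard-error units
    # Under Ha the test statistic is centred on `shift`; beta is the area
    # that still falls below the critical value.
    return z.cdf(z_crit - shift)


# Hypothetical example: true difference of 1 unit, sigma of 2 units.
for n in (10, 30):
    print(f"n = {n}: beta = {beta_risk(1.0, 2.0, n):.3f}")
```

Holding alpha fixed, a larger sample moves the Ha distribution further from the critical value, so beta shrinks.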
20
Distinguishing between Two Samples
  • Recall from the Central Limit Theorem that as the
    number of individual observations increases, the
    Standard Error decreases.
  • In this example, when n = 2 we cannot distinguish
    the difference between the Means (> 5% overlap,
    P-value > 0.05).
  • When n = 30, we can distinguish between the Means
    (< 5% overlap, P-value < 0.05): there is a
    significant difference.

[Figure: Theoretical Distribution of Means when n = 2, δ = 5, S = 1]
[Figure: Theoretical Distribution of Means when n = 30, δ = 5, S = 1]
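
The effect of sample size on the Standard Error can be sketched with the usual formula, Standard Deviation divided by the square root of n. The function name `standard_error` is illustrative; the two sample sizes and S = 1 are taken from the slide.

```python
from math import sqrt


def standard_error(s: float, n: int) -> float:
    """Standard Error of the Mean: sample Standard Deviation over sqrt(n)."""
    return s / sqrt(n)


s = 1.0  # Standard Deviation used on the slide (S = 1)
for n in (2, 30):
    print(f"n = {n:2d}: Standard Error = {standard_error(s, n):.3f}")
```

The narrower distribution of Means at n = 30 is what reduces the overlap between the two curves.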
21
Delta/Sigma: The Ratio between δ and S
  • Delta (δ) is the size of the difference between
    two Means or one Mean and a target value.
  • Sigma (S) is the sample Standard Deviation of the
    distribution of individuals of one or both of the
    samples under question.
  • When δ / S is large, we don't need statistics
    because the differences are so large.
  • If the variance of the data is large, it is
    difficult to establish differences. We need
    larger sample sizes to reduce uncertainty.

We want to be 95% confident in all of our
estimates!
22
Typical Questions on Sampling
  • Question: How many samples should we take?
    Answer: Well, that depends on the size of your
    delta and Standard Deviation.
  • Question: How should we conduct the sampling?
    Answer: Well, that depends on what you want to
    know.
  • Question: Was the sample we took large enough?
    Answer: Well, that depends on the size of your
    delta and Standard Deviation.
  • Question: Should we take some more samples just
    to be sure?
    Answer: No, not if you took the correct number of
    samples the first time!

23
The Perfect Sample Size
  • The minimum sample size required to provide
    exactly 5% overlap (risk) in order to
    distinguish the Delta.
  • Note: If you are working with Non-normal Data,
    multiply your calculated sample size by 1.1.

24
Hypothesis Testing Roadmap
25
Hypothesis Testing Roadmap
26
Hypothesis Testing Roadmap
27
Common Pitfalls to Avoid
  • While using Hypothesis Testing, the following
    facts should be borne in mind at the conclusion
    stage:
      • The decision is about Ho and NOT Ha.
      • The conclusion statement is whether the
        contention of Ha was upheld.
      • The null hypothesis (Ho) is on trial.
  • When a decision has been made:
      • Nothing has been proved.
      • It is just a decision.
      • All decisions can lead to errors (Types I and
        II).
  • If the decision is to Reject Ho, then the
    conclusion should read "There is sufficient
    evidence at the α level of significance to show
    that [state the alternative hypothesis Ha]."
  • If the decision is to Fail to Reject Ho, then
    the conclusion should read "There isn't
    sufficient evidence at the α level of
    significance to show that [state the alternative
    hypothesis Ha]."

28
Summary
  • At this point, you should be able to:
      • Articulate the purpose of Hypothesis Testing
      • Explain the concepts of Central Tendency
      • Be familiar with the types of Hypothesis Tests