L1b.1 - PowerPoint PPT Presentation

Provided by: Scott744

1
Lecture 1b: Some basic statistical principles II
  • Statistical null hypotheses and the meaning of p
  • Test statistics
  • Statistical errors in hypothesis testing
  • Power and effect size
  • Statistical null hypotheses: problems and caveats

2
Statistical null hypotheses
  • The default to which you compare your data
  • Usually, one sets up the analysis such that if
    you reject the null hypothesis, you have a
    pattern which is consistent with the biological
    prediction
  • so that in many cases, the null hypothesis
    specifies a lack of pattern.

3
The meaning of p
  • Informal (loose shorthand): the probability that
    the null hypothesis is true
  • Strictly correct: the probability of observing
    data at least as deviant (from the expected
    results) as the observed results if in fact the
    null hypothesis were true, assuming the data
    were properly collected and all statistical
    assumptions are met.

4
To reject or not reject?
  • The decision to reject or accept the null
    hypothesis is based on p.
  • This requires some agreement (convention) as to
    what p value we will consider as significant.
    This threshold value is arbitrary!

5
Test statistics
  • In standard statistical analysis, p is estimated
    by reference to the distribution of an
    appropriate test statistic.
  • If we know the distribution of the test
    statistic, we can calculate the probability of
    getting a test statistic value at least as large
    (small) as the calculated value if H0 were true,
    i.e., p.
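
A minimal sketch of reading p off a known reference distribution. It assumes the test statistic is standard normal under H0 (a z statistic, used here only because the Python standard library has no t distribution); the function name `p_value` and the value 1.96 are illustrative, not from the lecture.

```python
from statistics import NormalDist


def p_value(z: float, tails: int = 2) -> float:
    """Probability, under H0, of a statistic at least as extreme as z.

    Assumes the statistic is standard normal when H0 is true.
    """
    upper_tail = 1.0 - NormalDist().cdf(abs(z))  # P(Z >= |z|) under H0
    return upper_tail * tails


p_two = p_value(1.96)           # two-tailed
p_one = p_value(1.96, tails=1)  # one-tailed, in the observed direction
print(round(p_two, 3), round(p_one, 3))
```

The one-tailed value is only meaningful when the observed deviation lies in the direction named before the analysis.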

6
An example
  • Two samples (1, 2) with mean values that differ
    by some amount d.
  • What is the probability p of observing this
    difference under H0 that the two means are in
    fact equal?

[Figure: frequency distributions of samples 1 and 2; y-axis: Frequency]

7
An example (contd)
  • If H0 is true, the expected distribution of the
    test statistic t is

[Figure: expected distribution of t under H0; x-axis: t (-3 to 3), y-axis: Probability (p)]

8
An example (contd)
  • For the two samples, suppose t = 2.01
  • What is the probability of getting a value at
    least this large under H0 that the two means are
    in fact equal?
  • Since p is small, a value this large would be
    unlikely if H0 were true.
  • Therefore, reject H0.

9
Statistical errors in hypothesis testing
  • Two types: a true null hypothesis may be
    rejected, or a false null hypothesis may be
    accepted.
  • Type I error (α): the probability of rejecting a
    true null hypothesis
  • Type II error (β): the probability of accepting
    a false null hypothesis
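
Both error rates can be estimated by simulation. The sketch below is illustrative only: it assumes a two-tailed z-test with known standard deviation, samples of 20, and a hypothetical true mean of 0.5 for the false-H0 case. None of these numbers come from the lecture.

```python
import random
from statistics import NormalDist, fmean


def rejects_h0(sample, mu0=0.0, sigma=1.0, alpha=0.05):
    """Two-tailed z-test of H0: mean == mu0, with sigma assumed known."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > critical


random.seed(1)  # arbitrary seed, only for repeatability
trials = 4000

# H0 true: every sample really comes from a population with mean 0.
false_rejections = sum(
    rejects_h0([random.gauss(0.0, 1.0) for _ in range(20)])
    for _ in range(trials)
)
type_i_rate = false_rejections / trials  # should be close to alpha

# H0 false: the true mean is 0.5 (hypothetical), but we still test mean == 0.
missed = sum(
    not rejects_h0([random.gauss(0.5, 1.0) for _ in range(20)])
    for _ in range(trials)
)
type_ii_rate = missed / trials  # estimate of beta for this alternative

print(type_i_rate, type_ii_rate)
```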

10
Errors in inference
                          Reality
Conclusion     H0 is true          H0 is false
Accept H0      no error            Type II error (β)
Reject H0      Type I error (α)    no error

11
Errors in inference: an example
Reality: No HIV (H0), HIV (HA)
Conclusion: Seronegative, Seropositive
Values shown: 5, 99, 1, 95

12
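One- and two-tailed null hypotheses
  • For 2-tailed H0, there are two rejection regions
    of size α/2.
  • For 1-tailed H0, there is one rejection region of
    size α.

[Figure: distributions of t showing the acceptance region (1 − α) with two rejection regions of size α/2 (2-tailed) or one rejection region of size α (1-tailed); x-axis: t, y-axis: Probability]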
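
The sizes of the rejection regions translate into critical values. A small sketch, assuming a standard normal reference distribution in place of t and α = 0.05 (both are assumptions made for illustration):

```python
from statistics import NormalDist

alpha = 0.05
z = NormalDist()  # assumed null distribution of the test statistic

two_tailed_cut = z.inv_cdf(1 - alpha / 2)  # reject if |statistic| exceeds this
one_tailed_cut = z.inv_cdf(1 - alpha)      # reject if statistic exceeds this

print(round(two_tailed_cut, 3), round(one_tailed_cut, 3))
```

The one-tailed cut-off is smaller because the whole of α sits in a single tail.
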
13
Example: 2-tailed H0
  • No difference in populations
  • H0: μ1 = μ2
  • Since H0 is 2-tailed, would reject H0 if μ1 − μ2
    > 0 or μ1 − μ2 < 0.

14
Example: 1-tailed H0
  • The average size of individuals in population 1
    is greater than in population 2
  • H0: μ1 − μ2 ≤ 0
  • Since H0 is 1-tailed, would reject H0 if μ1 − μ2
    > 0 only.

15
One versus two-tailed hypotheses
[Figure: frequency distributions of Sample 1 and Sample 2; y-axis: Frequency]
  • 2-tailed hypothesis: reject if any non-random
    pattern is detected.
  • 1-tailed hypothesis: reject if a specified
    directional non-random pattern is detected
  • H0: μ1 = μ2 (2-tailed, reject)
  • H0: μ1 ≤ μ2 (1-tailed, accept)

16
Important note!
  • For a given directionality, a 1-tailed test is
    more powerful than a 2-tailed test
  • Therefore, always specify the nature of H0 before
    your analysis!

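[Figure: upper tail of the distribution with rejection regions of size α (1-tailed) and α/2 (2-tailed); y-axis: Probability]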
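
The power advantage of the 1-tailed test can be checked numerically. This is a sketch under assumptions: a z-test, α = 0.05, and a hypothetical true effect of 2 standard errors in the predicted direction. `power_z` is an illustrative helper, not a library function.

```python
from statistics import NormalDist


def power_z(shift: float, alpha: float = 0.05, tails: int = 2) -> float:
    """Power of a z-test whose statistic is centred on `shift` rather than 0."""
    z = NormalDist()
    if tails == 1:
        return 1 - z.cdf(z.inv_cdf(1 - alpha) - shift)
    cut = z.inv_cdf(1 - alpha / 2)
    # Upper rejection region plus the (usually tiny) lower one.
    return (1 - z.cdf(cut - shift)) + z.cdf(-cut - shift)


shift = 2.0  # hypothetical true effect, in standard-error units
power_one = power_z(shift, tails=1)
power_two = power_z(shift, tails=2)
print(round(power_one, 2), round(power_two, 2))
```
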
17
Parameters of statistical inference
  • Type I error rate (α)
  • Power (1 − Type II error rate = 1 − β)
  • Sample size (N)
  • Effect size (d)
  • Each of the above is a function of the other
    three. Hence, if three are known, so is the
    fourth.

18
Power
  • Power is the probability of rejecting the null
    hypothesis when it is false and a specified
    alternative hypothesis is true, i.e. 1 − β.
  • Power can only be calculated when a specific
    alternative hypothesis is specified. Therefore,
    power depends on the alternative hypothesis.
  • Powerful tests can detect small differences, weak
    tests only large differences.

19
Calculating power: an example
  • Expected distribution of means of samples of 5
    housefly wing lengths from normal populations
    specified by μ as shown above curves and σȲ =
    1.74. Centre curve represents null hypothesis,
    H0: μ = 45.5; curves at sides represent
    alternative hypotheses, μ = 37 or μ = 54.
    Vertical lines delimit 5% rejection regions for
    the null hypothesis.

[Figure: three curves labelled H1: μ = 37, H0: μ = 45.5 and H1: μ = 54; x-axis from 35 to 55]
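
The power against each alternative in this example can be worked out from the numbers on the slide (H0: μ = 45.5, standard error 1.74, 5% two-tailed rejection regions). The sketch assumes sample means are normally distributed with that standard error; the helper name `power` is illustrative.

```python
from statistics import NormalDist

mu0, se, alpha = 45.5, 1.74, 0.05  # values from the slide
z_cut = NormalDist().inv_cdf(1 - alpha / 2)
lower, upper = mu0 - z_cut * se, mu0 + z_cut * se  # edges of the rejection regions


def power(mu1: float) -> float:
    """P(sample mean falls in a rejection region) when the true mean is mu1."""
    means = NormalDist(mu1, se)  # sampling distribution of the mean under H1
    return means.cdf(lower) + (1 - means.cdf(upper))


power_37, power_54 = power(37.0), power(54.0)
print(round(lower, 2), round(upper, 2), round(power_37, 3), round(power_54, 3))
```

Both alternatives lie 8.5 units from 45.5, so the two powers coincide, and both are close to 1 because the curves barely overlap the acceptance region.
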
20
Power (contd)
  • Increases in type II error, β, as alternative
    hypothesis, H1, approaches null hypothesis, H0,
    that is, as μ1 approaches μ. Shading represents
    β. Vertical lines mark off 5% critical regions
    (2.5% in each tail) for the null hypothesis. To
    simplify the graph, the alternative distributions
    are shown for one tail only.

21
Effect size
  • Every null hypothesis in any statistical test
    implies a value for some population parameter.
  • E.g. under H0 that two population means are
    equal, the absolute value of the difference d
    between the two populations is zero

[Figure: frequency distributions; x-axis: X, y-axis: Frequency]

22
Effect size (contd)
  • More generally, since H0 specifies a lack of some
    phenomenon, d quantifies the degree to which the
    phenomenon is present.
  • So if H0 is false, it is false to some specific
    degree, quantified by d, the effect size.

[Figure: frequency distributions; x-axis: X, y-axis: Frequency]

23
Types of power analysis I: power as a function of α, d and N
  • Often done after a statistical test, where N
    (sample size) and effect size (d) are determined
    and the null hypothesis has been accepted.
  • Then, for specified α, we can calculate 1 − β
    (the power of the test)
  • If 1 − β is low, then the Type II error rate is
    large, so there is a good chance we have accepted
    a false H0.

[Figure: frequency distributions; x-axis: X, y-axis: Frequency]
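
A sketch of this first kind of power analysis. It assumes a two-tailed, two-sample z-test with equal group sizes, and takes d as a standardized effect size (difference in means divided by a common standard deviation). The function name and the example values (d = 0.5, n = 20 per group) are hypothetical.

```python
from statistics import NormalDist


def power_two_sample(d: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-tailed, two-sample z-test.

    d: standardized effect size (assumed definition: mean difference / SD)
    n: number of observations in each group
    """
    z = NormalDist()
    shift = d * (n / 2) ** 0.5  # expected value of the z statistic under H1
    cut = z.inv_cdf(1 - alpha / 2)
    return (1 - z.cdf(cut - shift)) + z.cdf(-cut - shift)


power = power_two_sample(d=0.5, n=20)  # hypothetical study
beta = 1 - power                       # Type II error rate
print(round(power, 2), round(beta, 2))
```

With power this low, accepting H0 says little about whether the effect is absent.
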
24
Types of power analysis II: N as a function of α, d and power
  • A certain effect size (d) is anticipated (perhaps
    based on a preliminary sample) with a desired α
    and 1 − β.
  • Given α, β and d, we can calculate the minimum
    sample size Nmin required to achieve the desired
    specifications.
  • This exercise can be very useful in planning
    experiments.

[Figure: frequency distributions; x-axis: X, y-axis: Frequency]
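
A sketch of the sample-size calculation, using the usual normal approximation for a two-tailed, two-sample comparison with equal group sizes. The standardized definition of d and the example values are assumptions; an exact t-based calculation would give a slightly larger N.

```python
import math
from statistics import NormalDist


def min_n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Smallest per-group n reaching the desired power (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # cut-off for the two-tailed test
    z_beta = z.inv_cdf(power)           # quantile matching the desired 1 - beta
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)


n_min = min_n_per_group(d=0.5)  # hypothetical anticipated effect
print(n_min)
```

Halving the anticipated effect size roughly quadruples the required sample.
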
25
Types of power analysis III: d as a function of α, N and power
  • Given a desired α, 1 − β and N, what is the
    minimal detectable effect size dmin?
  • If dmin is large, then only large deviations from
    H0 will be detected (i.e. will result in
    rejection of H0).
  • Thus, we should be VERY VERY careful NOT to infer
    that some phenomenon does not exist if we accept
    H0.

[Figure: frequency distributions; x-axis: X, y-axis: Frequency]
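
A sketch of the minimal detectable effect size under the same assumptions (two-tailed, two-sample z-test, equal group sizes, standardized d). The two study sizes are hypothetical.

```python
from statistics import NormalDist


def min_detectable_d(n: int, alpha: float = 0.05, power: float = 0.80) -> float:
    """Smallest standardized effect detectable with n observations per group."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) * (2 / n) ** 0.5


d_small_study = min_detectable_d(n=10)
d_large_study = min_detectable_d(n=200)
print(round(d_small_study, 2), round(d_large_study, 2))
```

In the small study only a difference well over one standard deviation is reliably detected, so accepting H0 there is weak evidence that the phenomenon is absent.
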
26
Power: dependence on sample size
  • Power curves for testing H0: μ = 45.5 against
    H1: μ ≠ 45.5, for n = 5 and for n = 35.
  • For a given observed wing length, the probability
    of rejecting a false null hypothesis decreases as
    N decreases.
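
The two power curves can be tabulated numerically. The sketch assumes normally distributed sample means and a population standard deviation of 1.74·√5, which is what the standard error of 1.74 quoted for samples of 5 implies. The grid of true means is arbitrary.

```python
from statistics import NormalDist

mu0, alpha = 45.5, 0.05
sigma = 1.74 * 5 ** 0.5  # SD implied by a standard error of 1.74 at n = 5


def power(mu1: float, n: int) -> float:
    """Power of the two-tailed test of H0: mu = mu0 when the true mean is mu1."""
    se = sigma / n ** 0.5
    cut = NormalDist().inv_cdf(1 - alpha / 2) * se
    means = NormalDist(mu1, se)  # sampling distribution of the mean
    return means.cdf(mu0 - cut) + (1 - means.cdf(mu0 + cut))


for mu1 in (43.5, 44.5, 45.5, 46.5, 47.5):
    print(mu1, round(power(mu1, 5), 3), round(power(mu1, 35), 3))
```

At μ = 45.5 both curves bottom out at α; away from it the n = 35 curve rises much faster.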

27
Why power matters
  • Two cases with identical means and variances,
    differing only in N
  • in the first case (N = 200), power is large,
    p < .05, therefore reject H0
  • in the second case (N = 30), power is low,
    p > .05, therefore accept H0.

[Figure: two panels, N = 200 and N = 30, each showing frequency distributions with means μ1 and μ2; x-axis: Size, y-axis: Frequency]

28
Power: conclusions
  • If sample sizes are small, the power of any test
    is usually low.
  • So, unless one knows the power of the analysis, a
    decision to accept the null hypothesis is
    meaningless!
  • Conversely, if power is very high, rejection of
    the null is very likely, even if deviations from
    null expectations are small (and perhaps
    biologically meaningless)!

29
Statistical hypothesis testing: problems and caveats
  • Problem 1: many H0s are very unlikely to be true
    a priori
  • so that their rejection is not very informative.

[Figure: average yield for Control, Treatment 1 and Treatment 2; x-axis: Treatment, y-axis: Average yield]

30
Statistical hypothesis testing: problems and caveats
  • Problem 2: nominal type I error (e.g. α = 0.05)
    is entirely arbitrary, and may not bear any
    relationship to biological significance
  • and even less to decision-making

[Figure: distribution of t with a marked threshold for decision-making; x-axis: t (-3 to 3), y-axis: Probability]

31
Statistical hypothesis testing: problems and caveats
  • Problem 3: p is the probability of obtaining a
    test statistic at least as extreme as that
    observed if H0 is true
  • but often the actual (sampling) distribution of
    the test statistic does not match the (assumed)
    distribution under the null.

[Figure: sampled versus null distributions of t; x-axis: t (-3 to 3), y-axis: Probability]

32
Statistical hypothesis testing: problems and caveats
  • Problem 4: for fixed effect size, p depends on
    sample size (n)
  • so that one can almost always reject H0 if the
    sample is sufficiently large, even if the
    observed effect is trivial

[Figure: curves for a larger and a smaller effect size against sample size (n), with a reference line at 0.05; x-axis: Sample size (n), y-axis: Type I error]
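
A numerical illustration of Problem 4, assuming a two-sample z-test with known standard deviation. The observed difference (0.1 standard deviations) and the sample sizes are invented for the example.

```python
from statistics import NormalDist


def p_two_sample(diff: float, sd: float, n: int) -> float:
    """Two-tailed p for an observed mean difference, n per group, known sd."""
    z = diff / (sd * (2 / n) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


sizes = (10, 100, 1000, 10000)
# Same trivial observed difference every time; only n changes.
p_values = [p_two_sample(diff=0.1, sd=1.0, n=n) for n in sizes]
for n, p in zip(sizes, p_values):
    print(n, p)
```

The observed effect never changes, yet p falls from far above 0.05 to far below it as n grows.
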
33
Statistical hypothesis testing: problems and caveats
  • Problem 5: since p depends on sample size (n),
    using a fixed nominal α (e.g. α = 0.05) as n
    increases is logically inconsistent: even for
    n → ∞ and true H0, α = 0.05!

[Figure: nominal type I error (α) against sample size (n), comparing a fixed α (e.g. 0.05) with an α that depends on n]

34
Statistical hypothesis testing: solutions
  • Avoid testing trivial null hypotheses
  • Distinguish between biological (or other)
    significance and statistical significance
  • Always provide estimates of effect sizes and
    their precision, statistical significance (or
    lack thereof) notwithstanding
  • Consider using randomization and/or resampling
    methods to generate the actual distribution of
    test statistics.
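
One way the last suggestion might look in practice is a randomization (permutation) test for a difference in means. This is a minimal sketch, not a method prescribed by the lecture: the function name, the number of permutations and the two small data sets are all invented for illustration.

```python
import random
from statistics import fmean


def permutation_p(sample1, sample2, n_perm=5000, seed=0):
    """Two-tailed randomization test for a difference in means."""
    rng = random.Random(seed)
    observed = abs(fmean(sample1) - fmean(sample2))
    pooled = list(sample1) + list(sample2)
    n1 = len(sample1)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random, as H0 implies
        diff = abs(fmean(pooled[:n1]) - fmean(pooled[n1:]))
        if diff >= observed:
            extreme += 1
    # Counting the observed arrangement itself keeps p strictly above zero.
    return (extreme + 1) / (n_perm + 1)


# Hypothetical measurements, invented for illustration.
group1 = [12.1, 13.4, 11.8, 14.2, 13.0, 12.7]
group2 = [10.2, 11.1, 9.8, 10.9, 11.5, 10.4]
p = permutation_p(group1, group2)
print(p)
```

Here the reference distribution is built from the data themselves, so no particular distribution of the test statistic has to be assumed.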