L1b.1 - PowerPoint PPT Presentation

About This Presentation

Title:

L1b.1

Description:

Lecture 1b: Some basic statistical principles II. Statistical null hypotheses and the meaning of p; test statistics; statistical errors in hypothesis testing

Slides: 35

Provided by: Scott744

Transcript and Presenter's Notes

1
Lecture 1b: Some basic statistical principles II

Statistical null hypotheses and the meaning of p
Test statistics
Statistical errors in hypothesis testing
Power and effect size
Statistical null hypotheses: problems and caveats

2
Statistical null hypotheses

The default to which you compare your data
Usually, one sets up the analysis such that if
you reject the null hypothesis, you have a
pattern which is consistent with the biological
prediction,
so that in many cases the null hypothesis
specifies a lack of pattern.

3
The meaning of p

Informal: the probability that the null
hypothesis is true.
Strictly correct: the probability of observing
data at least as deviant (from the expected results)
as the observed results if in fact the null
hypothesis were true, assuming the data were
properly collected and all statistical
assumptions are met.

4
To reject or not reject?

The decision to reject or accept the null
hypothesis is based on p.
This requires some agreement (convention) as to
what p value we will consider as significant.
This threshold value is arbitrary!

5
Test statistics

In standard statistical analysis, p is estimated
by reference to the distribution of an
appropriate test statistic.
If we know the distribution of the test
statistic, we can calculate the probability of
getting a test statistic value at least as large
(small) as the calculated value if H0 were true,
i.e., p.
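
As a concrete illustration of a test statistic, the sketch below computes a pooled-variance two-sample t. The data values and the function name `pooled_t` are made up for illustration, and the equal-variance form of t is an assumption; the slides do not say which form is used.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    """Two-sample t statistic with pooled variance (assumes equal variances)."""
    n1, n2 = len(x), len(y)
    # Pooled estimate of the common variance, weighted by degrees of freedom.
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    se = sqrt(sp2 * (1 / n1 + 1 / n2))  # standard error of the difference in means
    return (mean(x) - mean(y)) / se, n1 + n2 - 2

# Hypothetical measurements, not taken from the lecture.
sample1 = [5.1, 4.9, 5.6, 5.8, 5.2]
sample2 = [4.4, 4.8, 4.6, 5.0, 4.2]
t, df = pooled_t(sample1, sample2)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The returned degrees of freedom identify which t distribution the statistic should be referred to when estimating p.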

6
An example

Two samples (1, 2) with mean values that differ
by some amount d.
What is the probability p of observing this
difference under H0 that the two means are in
fact equal?

[Figure: frequency distributions of samples 1 and 2]

7
An example (contd)

If H0 is true, the expected distribution of the
test statistic t is

[Figure: distribution of t under H0; x-axis t from -3 to 3, y-axis Probability (p)]

8
An example (contd)

For the two populations, suppose t = 2.01
What is the probability of getting a value at
least this large under H0 that the two means are
in fact equal?
Since p is small, it is unlikely that H0 is true.
Therefore, reject H0.
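
The slide does not state the sample sizes, so p for t = 2.01 cannot be read off directly. The sketch below shows one way to get it, assuming 50 degrees of freedom (a made-up value) and integrating the t density numerically with Simpson's rule. The function names are illustrative.

```python
from math import gamma, pi, sqrt

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|) under H0, via Simpson's rule on the density from 0 to |t|."""
    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * (h / 3) * total  # probability mass between -|t| and |t|
    return 1 - central

df = 50  # assumed: the slide does not give the sample sizes
p_two = two_tailed_p(2.01, df)
print(f"two-tailed p for t = 2.01, df = {df}: {p_two:.4f}")
print(f"one-tailed p: {p_two / 2:.4f}")
```

Because `math.gamma` overflows for large arguments, this sketch is only suitable for moderate degrees of freedom (up to a few hundred).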

9
Statistical errors in hypothesis testing

Two types: a true null hypothesis may be
rejected, or a false null hypothesis may be
accepted.
Type I error (α): the probability of rejecting a
true null hypothesis
Type II error (β): the probability of accepting
a false null hypothesis
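
Both error rates can be estimated by simulation. The sketch below is an illustration under assumed settings, not part of the lecture: it uses a two-tailed z test of H0: mean = 0 with known standard deviation 1, n = 20, and a hypothetical true mean of 0.5 for the false-H0 case.

```python
import random
from math import sqrt
from statistics import NormalDist

def rejection_rate(true_mean, n=20, alpha=0.05, trials=5000, seed=1):
    """Fraction of simulated samples in which a two-tailed z test rejects H0: mean = 0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]  # sd known to be 1
        z = (sum(sample) / n) * sqrt(n)
        rejections += abs(z) > z_crit
    return rejections / trials

type_1 = rejection_rate(true_mean=0.0)      # H0 true: every rejection is a Type I error
type_2 = 1 - rejection_rate(true_mean=0.5)  # H0 false: every non-rejection is a Type II error
print(f"estimated alpha = {type_1:.3f}, estimated beta = {type_2:.3f}")
```

The estimated Type I rate should sit close to the nominal 0.05, while the Type II rate depends on the assumed true mean and on n.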

10
Errors in inference

                 Reality
Conclusion       H0 is true          H0 is false
Accept H0        no error            Type II error (β)
Reject H0        Type I error (α)    no error

11
Errors in inference: an example

                 Reality
Conclusion       No HIV (H0)    HIV (HA)
Seronegative     99             5
Seropositive     1              95

12
One- and two-tailed null hypotheses
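
For 2-tailed H0, there are two rejection regions
of size α/2.
For 1-tailed H0 there is one rejection region of
size α.

[Figure: distributions of t (y-axis Probability) with rejection regions: α/2 in each tail and 1 - α between them for the 2-tailed case; a single region of size α, with 1 - α outside it, for the 1-tailed cases]
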
13
Example: 2-tailed H0

No difference in populations
H0: μ1 = μ2
Since H0 is 2-tailed, would reject H0 if μ1 - μ2
> 0 or μ1 - μ2 < 0.

14
Example: 1-tailed H0

Prediction: the average size of individuals in population 1
is greater than in population 2
H0: μ1 - μ2 ≤ 0
Since H0 is 1-tailed, would reject H0 if μ1 - μ2
> 0 only.

15
One versus two-tailed hypotheses
[Figure: frequency distributions of Sample 1 and Sample 2]

2-tailed hypothesis: reject if any non-random
pattern is detected.
1-tailed hypothesis: reject if a specified
directional non-random pattern is detected.

H0: μ1 = μ2 (2-tailed, reject)
H0: μ1 ≤ μ2 (1-tailed, accept)

16
Important note!

For a given directionality, a 1-tailed test is
more powerful than a 2-tailed test.
Therefore, always specify the nature of H0 before
your analysis!
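
A short numerical sketch of why the 1-tailed test is more powerful. It uses the standard normal distribution in place of t, and a hypothetical true effect of 2.5 standard errors in the predicted direction; both are assumptions made only for illustration.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal, used here in place of t
alpha = 0.05
crit_one = z.inv_cdf(1 - alpha)      # one rejection region of size alpha
crit_two = z.inv_cdf(1 - alpha / 2)  # two rejection regions of size alpha / 2

shift = 2.5  # hypothetical true effect, in standard-error units
power_one = 1 - z.cdf(crit_one - shift)
power_two = (1 - z.cdf(crit_two - shift)) + z.cdf(-crit_two - shift)
print(f"critical values: one-tailed {crit_one:.3f}, two-tailed {crit_two:.3f}")
print(f"power: one-tailed {power_one:.3f}, two-tailed {power_two:.3f}")
```

The 1-tailed critical value is smaller, so the same effect in the predicted direction is rejected more often. An effect in the opposite direction would never be detected by the 1-tailed test.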

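[Figure: upper tail of the distribution (y-axis Probability), comparing a 1-tailed rejection region of size α with a 2-tailed region of size α/2]
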
17
Parameters of statistical inference

Type I error rate (α)
Power (1 - Type II error rate = 1 - β)
Sample size (N)
Effect size (d)
Each of the above is a function of the other
three. Hence, if three are known, so is the
fourth.

18
Power

Power is the probability of rejecting the null
hypothesis when it is false and a specified
alternative hypothesis is true, i.e. 1 - β.
Power can only be calculated when a specific
alternative hypothesis is specified.
Therefore, power depends on the alternative
hypothesis.
Powerful tests can detect small differences, weak
tests only large differences.

19
Calculating power: an example

Expected distribution of means of samples of 5
housefly wing lengths from normal populations
specified by μ as shown above curves and σȲ =
1.74. Centre curve represents null hypothesis,
H0: μ = 45.5; curves at sides represent
alternative hypotheses, μ = 37 or μ = 54.
Vertical lines delimit 5% rejection regions for
the null hypothesis.

[Figure: three curves labelled H1: μ = 37, H0: μ = 45.5 and H1: μ = 54; x-axis from 35 to 55]

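
The power for the two alternatives in this example can be computed from the numbers on the slide. The sketch below assumes the sample mean is normally distributed with the quoted standard error of 1.74 and that the 5% rejection region is split equally between the tails; the function name `power` is illustrative.

```python
from statistics import NormalDist

mu0, se, alpha = 45.5, 1.74, 0.05  # values from the slide
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
lower, upper = mu0 - z_crit * se, mu0 + z_crit * se  # edges of the acceptance region

def power(mu1):
    """Probability that a sample mean lands in either rejection region when the true mean is mu1."""
    alt = NormalDist(mu1, se)
    return alt.cdf(lower) + (1 - alt.cdf(upper))

for mu1 in (37, 54):
    print(f"H1: mu = {mu1}: power = {power(mu1):.4f}, beta = {1 - power(mu1):.4f}")
```

When the alternative equals the null value, `power(45.5)` reduces to α, which is a quick sanity check on the calculation.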
20
Power (contd)

Increases in type II error, β, as alternative
hypothesis, H1, approaches null hypothesis, H0,
that is, as μ1 approaches μ. Shading represents β.
Vertical lines mark off 5% critical regions
(2.5% in each tail) for the null hypothesis. To
simplify the graph, the alternative distributions
are shown for one tail only.

21
Effect size

Every null hypothesis in any statistical test
implies a value for some population parameter.
E.g. under the H0 that two population means are
equal, the absolute value of the difference d
between the two populations is zero

[Figure: Frequency vs X]

22
Effect size (contd)

More generally, since H0 specifies a lack of some
phenomenon, d quantifies the degree to which the
phenomenon is present.
So if H0 is false, it is false to some specific
degree, quantified by d, the effect size.

[Figure: Frequency vs X]

23
Types of power analysis I: power as a function of
α, d and N

Often done after a statistical test, where N
(sample size) and effect size (d) are determined
and the null hypothesis has been accepted.
Then, for specified α, we can calculate 1 - β
(the power of the test).
If 1 - β is low, then the Type II error rate is
large, so there is a good chance we have accepted
a false H0.

[Figure: Frequency vs X]

24
Types of power analysis II: N as a function of α,
d and power

A certain effect size (d) is anticipated (perhaps
based on a preliminary sample) with a desired α
and 1 - β.
Given α, β and d, we can calculate the minimum
sample size Nmin required to achieve the desired
specifications.
This exercise can be very useful in planning
experiments.
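
A minimal sketch of this calculation, assuming a two-tailed one-sample z test with the effect size d expressed in standard-deviation units. The normal-approximation formula and the name `n_min` are assumptions for illustration; the lecture does not specify a test.

```python
from math import ceil
from statistics import NormalDist

def n_min(d, alpha=0.05, power=0.80):
    """Smallest N for a two-tailed one-sample z test to reach the given power.

    d is the effect size in standard-deviation units. The tiny chance of
    rejecting in the wrong tail is ignored, as is usual for this formula.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(((z_alpha + z_beta) / d) ** 2)

for d in (0.2, 0.5, 0.8):  # illustrative effect sizes
    print(f"d = {d}: N_min = {n_min(d)}")
```

Halving the anticipated effect size roughly quadruples the required sample size, which is why a realistic estimate of d matters so much at the planning stage.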

[Figure: Frequency vs X]

25
Types of power analysis III: d as a function of
α, N and power

Given a desired α, 1 - β and N, what is the
minimal detectable effect size dmin?
If dmin is large, then only large deviations from
H0 will be detected (i.e. will result in
rejection of H0).
Thus, we should be VERY VERY careful NOT to infer
that some phenomenon does not exist if we accept
H0.
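
The same normal approximation can be turned around to give the minimal detectable effect size. This is a sketch under the assumption of a two-tailed one-sample z test, with dmin in standard-deviation units; the name `d_min` and the sample sizes are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def d_min(n, alpha=0.05, power=0.80):
    """Smallest standardized effect a two-tailed one-sample z test detects with the given power."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) / sqrt(n)

for n in (5, 35, 200):  # illustrative sample sizes
    print(f"N = {n}: d_min = {d_min(n):.2f} standard deviations")
```

With very small samples, dmin exceeds one standard deviation, so accepting H0 says little about whether smaller effects exist.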

[Figure: Frequency vs X]

26
Power dependence on sample size

Power curves for testing H0: μ = 45.5, H1: μ ≠
45.5, for n = 5 and for n = 35.
For given observed wing length, the probability
of rejecting a false null hypothesis decreases as
N decreases.
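
Points on such power curves can be approximated as follows. The population standard deviation is not given on this slide, so the sketch back-calculates it from the standard error of 1.74 quoted earlier for n = 5; that step, the normal approximation and the chosen true means are all assumptions.

```python
from math import sqrt
from statistics import NormalDist

mu0, alpha = 45.5, 0.05
sigma = 1.74 * sqrt(5)  # assumed: back-calculated from the standard error quoted for n = 5
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

def power(mu1, n):
    """Power of the two-tailed test of H0: mu = 45.5 when the true mean is mu1."""
    se = sigma / sqrt(n)
    alt = NormalDist(mu1, se)
    return alt.cdf(mu0 - z_crit * se) + (1 - alt.cdf(mu0 + z_crit * se))

for mu1 in (46, 48, 50):  # hypothetical true means
    print(f"mu = {mu1}: power n=5 {power(mu1, 5):.3f}, n=35 {power(mu1, 35):.3f}")
```

For every true mean other than 45.5, the n = 35 curve lies above the n = 5 curve.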

27
Why power matters

Two pairs of samples with identical means and
variances that differ only in N:
in the first case (N = 200), power is large, p < .05,
therefore reject H0;
in the second case (N = 30), power is low, p > .05,
therefore accept H0.

[Figure: two panels of Frequency vs Size showing μ1 and μ2, one for N = 200 and one for N = 30]

28
Power: conclusions

If sample sizes are small, the power of any test
is usually low.
So, unless one knows the power of the analysis, a
decision to accept the null hypothesis is
meaningless!
Conversely, if power is very high, rejection of
the null is very likely, even if deviations from
null expectations are small (and perhaps
biologically meaningless)!

29
Statistical hypothesis testing: problems and
caveats

Problem 1: many H0s are very unlikely to be true
a priori,
so that their rejection is not very informative.

[Figure: average yield by treatment (Control, Treatment 1, Treatment 2)]

30
Statistical hypothesis testing: problems and
caveats

Problem 2: nominal type I error (e.g. α = 0.05) is
entirely arbitrary, and may not bear any
relationship to biological significance,
and even less to decision-making.

[Figure: distribution of t (x-axis from -3 to 3, y-axis Probability) with a marked threshold for decision-making]

31
Statistical hypothesis testing: problems and
caveats

Problem 3: p is the probability of obtaining a test
statistic at least as extreme as that observed if
H0 is true,
but often the actual (sampling) distribution of
the test statistic does not match the (assumed)
distribution under the null.

[Figure: sampled versus null distributions of t; x-axis t from -3 to 3, y-axis Probability]

32
Statistical hypothesis testing: problems and
caveats

Problem 4: for fixed effect size, p depends on
sample size (n),
so that one can almost always reject H0 if the
sample is sufficiently large, even if the
observed effect is trivial.
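
A small numerical sketch of this dependence, assuming a two-tailed one-sample z test and a hypothetical observed effect of 0.05 standard deviations. The test, the effect size and the name `p_value` are chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def p_value(d, n):
    """Two-tailed p for a one-sample z test when the observed standardized effect is d."""
    z = d * sqrt(n)
    return 2 * (1 - NormalDist().cdf(abs(z)))

d = 0.05  # a hypothetical, trivially small observed effect
for n in (100, 1000, 10000):
    print(f"n = {n}: p = {p_value(d, n):.4g}")
```

The effect size never changes, yet p falls steadily with n and eventually drops below any fixed threshold.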

[Figure: curves for a larger and a smaller effect size plotted against sample size (n); y-axis labelled Type I error, with 0.05 marked]

33
Statistical hypothesis testing: problems and
caveats

Problem 5: since p depends on sample size (n),
using a fixed nominal α (e.g. α = 0.05) as n
increases is logically inconsistent: even for n =
infinity and a true H0, α = 0.05!

[Figure: nominal type I error (α) against sample size (n), comparing a fixed α (e.g. 0.05) with an α that depends on n]

34
Statistical hypothesis testing: solutions

Avoid testing trivial null hypotheses
Distinguish between biological (or other)
significance and statistical significance
Always provide estimates of effect sizes and
their precision, statistical significance (or
lack thereof) notwithstanding
Consider using randomization and/or resampling
methods to generate actual distribution of test
statistics.
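
As one possible illustration of the last point, the sketch below runs a randomization (permutation) test for a difference in two sample means. The data, the number of permutations and the function name `permutation_p` are made up; this is a sketch of the general idea, not the procedure used in the lecture.

```python
import random
from statistics import mean

def permutation_p(x, y, n_perm=5000, seed=0):
    """Two-tailed p for a difference in means, from random relabelling of the pooled data."""
    rng = random.Random(seed)
    observed = abs(mean(x) - mean(y))
    pooled = list(x) + list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(x)]) - mean(pooled[len(x):]))
        hits += diff >= observed - 1e-12  # tolerance guards against float round-off
    # Counting the observed arrangement itself keeps p strictly above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical measurements, not taken from the lecture.
sample1 = [5.1, 4.9, 5.6, 5.8, 5.2]
sample2 = [4.4, 4.8, 4.6, 5.0, 4.2]
p_perm = permutation_p(sample1, sample2)
print(f"permutation p = {p_perm:.4f}")
```

Here the reference distribution of the test statistic is generated from the data themselves, so no theoretical distribution has to be assumed.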
