Inference about a Mean Vector
(PowerPoint transcript; provided by jamesjc; 111 slides)

Transcript and Presenter's Notes
1
III. Inferences about a Mean Vector
  • A. Basic Inference about a Single Mean μ
  • 1. Hypothesis Testing - Scientific method-based
    means for using sample data to evaluate
    conjectures about a population.
  • 2. Null Hypothesis - Statement of the
    conjectured value(s) for the parameter that
    includes (but is not necessarily limited to)
    equality between the conjectured value and the
    tested parameter. Usually denoted
  • H0: parameter (= or ≤ or ≥) hypothesized value
  • This is equivalent to a claim that the
    difference between the observations and the
    hypothesized value is due to random variation.
2
3. Alternative Hypothesis - Statement of the
conjectured value(s) for the parameter that is
mutually exclusive and collectively exhaustive
with respect to the Null Hypothesis (and so
includes the < and/or > relationship between the
conjectured value and the tested parameter).
Usually denoted
H1: parameter (< or ≠ or >) hypothesized value.
This is equivalent to a claim that the difference
between the observations and the hypothesized
value is systematic (i.e., due to something other
than random variation). Note that our conclusion
is stated with regard to the null hypothesis,
which can either be i) rejected or ii) not
rejected; that is, we never "accept" the null
hypothesis.
3
4. Critical Region - Area containing all possible
estimated values of the parameter that will
result in rejection of the null hypothesis.
5. Critical Value(s) - The value(s) that divide
the critical (rejection) region(s) from the
"do not reject" region.
6. Test Statistic - Sample-based value that will
be compared to the critical region(s) to decide
whether to reject or not reject the null
hypothesis. The generic form of the test
statistic is
(point estimate - hypothesized value) / standard error of the estimate
4
7. Decision Rule - Statement that specifies
values of the test statistic that will result in
rejection or non-rejection of the null
hypothesis.
8. Two-Tailed Hypothesis Test - Evaluation of a
conjecture for which sample results that are
either sufficiently less than or greater than the
conjectured value of the parameter will result in
rejection of the null hypothesis, i.e., for a
null hypothesis that only includes equality
between the conjectured value and the tested
parameter.
5
9. One-Tailed Hypothesis Test - Evaluation of a
conjecture for which only sample results that are
sufficiently less than or greater than the
conjectured value of the parameter will result in
rejection of the null hypothesis, i.e., for a
null hypothesis that includes an inequality
between the conjectured value and the tested
parameter.
10. Upper-Tailed Test - Hypothesis test for which
only sample results that are sufficiently greater
than the conjectured value of the parameter will
result in rejection of the null hypothesis.
11. Lower-Tailed Test - Hypothesis test for which
only sample results that are sufficiently less
than the conjectured value of the parameter will
result in rejection of the null hypothesis.
6
12. Type I Error - Rejection of a true null
hypothesis. The probability of this occurrence
(given that the null hypothesis is true) is
denoted as α.
13. Type II Error - Non-rejection of a false null
hypothesis. The probability of this occurrence
(given that the null hypothesis is false) is
denoted as β.
7
14. Level of Significance - The probability of
rejecting the null when it is actually true. For
a two-tailed test:
[Figure: t distribution with rejection regions in both tails]
Decision rule: do not reject H0 if -tα/2 ≤ t ≤ tα/2;
otherwise reject H0
8
For an upper-tailed test:
[Figure: t distribution with rejection region in the upper tail]
Decision rule: do not reject H0 if t ≤ tα;
otherwise reject H0
9
For a lower-tailed test:
[Figure: t distribution with rejection region in the lower tail]
Decision rule: do not reject H0 if -tα ≤ t;
otherwise reject H0
10
15. Steps in Hypothesis Testing
- State the Null and Alternative Hypotheses
- Select the Appropriate Test Statistic
- State the Desired Level of Significance α, Find
  the Critical Value(s), and State the Decision Rule
- Calculate the Test Statistic
- Use the Decision Rule to Evaluate the Test
  Statistic and Decide Whether to Reject or Not
  Reject the Null Hypothesis. Interpret your results.
11
When testing a hypothesis about a single mean,
the appropriate test statistic (when the parent
population is normal or the sample is
sufficiently large) is
t = (x̄ - μ0) / (s/√n)
where t has n - 1 degrees of freedom and
s = √( Σ(xi - x̄)² / (n - 1) )
12
Example: suppose we had the following fifteen
sample observations on some random variable X1.
At a significance level of α = 0.10, do these
data support the assertion that they were drawn
from a population with a mean of 4.0? In other
words, test the null hypothesis
H0: μ1 = 4.0
13
Let's use the five steps of hypothesis testing
to assess the potential validity of this
conjecture.
- State the Null and Alternative Hypotheses:
  H0: μ1 = 4.0, H1: μ1 ≠ 4.0
- Select the Appropriate Test Statistic: n = 15 < 30,
  but the data appear normal, so use the one-sample
  t statistic
14
- State the Desired Level of Significance α, Find
  the Critical Value(s), and State the Decision
  Rule: α = 0.10 and we have a two-tailed test, so
  tα/2 = ±1.761.
[Figure: t distribution with rejection regions in both tails]
Decision rule: do not reject H0 if -1.761 ≤ t ≤ 1.761;
otherwise reject H0
15
- Calculate the Test Statistic: We have
  x̄1 = 5.265 and s1 = 2.669, so
  t = (5.265 - 4.0) / (2.669/√15) = 1.84
- Use the Decision Rule to Evaluate the Test
  Statistic and Decide Whether to Reject or Not
  Reject the Null Hypothesis: t does not satisfy
  -1.761 ≤ t ≤ 1.761 (since 1.84 > 1.761), so
  reject H0. At α = 0.10, the sample evidence does
  not support the claim that the mean of X1 is 4.0.
16
Another example: suppose we had the following
fifteen sample observations on some random
variable X2.
At a significance level of α = 0.10, do these
data support the assertion that they were drawn
from a population with a mean of -1.5? In other
words, test the null hypothesis
H0: μ2 = -1.5
17
Let's use the five steps of hypothesis testing
to assess the potential validity of this
conjecture.
- State the Null and Alternative Hypotheses:
  H0: μ2 = -1.5, H1: μ2 ≠ -1.5
- Select the Appropriate Test Statistic: n = 15 < 30,
  but the data appear normal, so use the one-sample
  t statistic
18
- State the Desired Level of Significance α, Find
  the Critical Value(s), and State the Decision
  Rule: α = 0.10 and we have a two-tailed test, so
  tα/2 = ±1.761.
[Figure: t distribution with rejection regions in both tails]
Decision rule: do not reject H0 if -1.761 ≤ t ≤ 1.761;
otherwise reject H0
19
- Calculate the Test Statistic: We have
  x̄2 = -3.091 and s2 = 3.525, so
  t = (-3.091 - (-1.5)) / (3.525/√15) = -1.748
- Use the Decision Rule to Evaluate the Test
  Statistic and Decide Whether to Reject or Not
  Reject the Null Hypothesis: -1.761 ≤ -1.748 ≤ 1.761,
  so do not reject H0. At α = 0.10, the sample
  evidence does not refute the claim that the mean
  of X2 is -1.5.
20
Consider the two-tailed test - note that
rejecting H0 when |t| is large is equivalent to
rejecting H0 if its square
t² = n(x̄ - μ0)(s²)⁻¹(x̄ - μ0)
is large! Note also that
- t² represents the squared distance from the
  sample mean x̄ to the hypothesized population
  mean μ0, expressed in units of sx̄ (i.e., a
  squared generalized distance)
- If the null hypothesis is true, t² ~ F1,n-1.
21
What if we wished to make simultaneous
inferences about the two variables X1 and X2?
We could take Bonferroni's approach: set the
experimentwise probability of Type I error to α,
then use α/m (where m is the number of
simultaneous inferences we wish to make) as the
comparisonwise probability of Type I error. For
our previous examples (with experimentwise
probability of Type I error α = 0.10 and n - 1 =
14 degrees of freedom), we have a comparisonwise
probability of Type I error of α/m = 0.05. This
gives us critical values of ±2.145, and we would
reject neither of the previous two null
hypotheses!
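The Bonferroni adjustment above can be sketched numerically, assuming scipy is available (the deck itself reads critical values from t tables):

```python
# Per-test vs. Bonferroni-adjusted two-tailed critical values,
# 14 degrees of freedom, experimentwise alpha = 0.10, m = 2 tests.
from scipy.stats import t

alpha, m, df = 0.10, 2, 14
t_single = t.ppf(1 - alpha / 2, df)      # per-test: +/-1.761
t_bonf = t.ppf(1 - (alpha / m) / 2, df)  # Bonferroni: +/-2.145
print(round(t_single, 3), round(t_bonf, 3))
```

With the wider ±2.145 bounds, neither t = 1.84 nor t = -1.748 falls in a rejection region.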
22
B. Inference about a Mean Vector μ
A natural generalization of the squared
univariate distance t² is its multivariate
analog, Hotelling's T²:
T² = n(x̄ - μ0)′ S⁻¹ (x̄ - μ0)
where
x̄ = (1/n) Σ xj and S = (1/(n - 1)) Σ (xj - x̄)(xj - x̄)′
Note that here n⁻¹S is the estimated covariance
matrix of x̄.
23
This gives us a framework for testing hypotheses
about a mean vector, where the null and
alternative hypotheses are
H0: μ = μ0, H1: μ ≠ μ0
The T² statistic can be rewritten as
T² = [√n(x̄ - μ0)]′ S⁻¹ [√n(x̄ - μ0)]
i.e., as
(multivariate normal Np(μ, Σ) random vector)′
(Wishart Wp,n-1(Σ) random matrix / d.f.)⁻¹
(multivariate normal Np(μ, Σ) random vector)
24
So when the null hypothesis is true, the T²
statistic can be written as a product involving a
multivariate normal Np(μ, Σ) random vector and the
inverse of a Wishart Wp,n-1(Σ) random matrix
divided by its degrees of freedom. This is a
complete generalization of the univariate case,
where we could write the squared test statistic
t² as
(univariate normal N(μ, σ) random variable) ×
(chi-square random variable / d.f.)⁻¹ ×
(univariate normal N(μ, σ) random variable)
We could use special tables to find critical
values of T² for various combinations of α and
degrees of freedom, but that is not necessary
because
T² ~ ((n - 1)p / (n - p)) Fp,n-p
What happens to this in the univariate case?
25
Example: suppose we had the following fifteen
sample observations on some random variables X1
and X2.
At a significance level of α = 0.10, do these
data support the assertion that they were drawn
from a population with a centroid (4.0, -1.5)′?
In other words, test the null hypothesis
H0: μ = μ0 = (4.0, -1.5)′
26
The scatter plot of pairs (x1, x2), the sample
centroid (x̄1, x̄2), and the hypothesized centroid
(μ1, μ2)′ is provided below.
[Figure: scatter plot showing the sample centroid (x̄1, x̄2) and the hypothesized centroid (μ1, μ2)′ = μ0]
Do these data appear to support our null
hypothesis?
27
Let's go through the five steps of hypothesis
testing to assess the potential validity of our
assertion.
- State the Null and Alternative Hypotheses:
  H0: μ = μ0, H1: μ ≠ μ0
- Select the Appropriate Test Statistic:
  n - p = 15 - 2 = 13 is not very large, but the
  data appear relatively bivariate normal, so use
  Hotelling's T²
28
- State the Desired Level of Significance α, Find
  the Critical Value(s), and State the Decision
  Rule: α = 0.10 and with ν1 = p = 2, ν2 = n - p = 13
  degrees of freedom, we have F2,13(0.10) = 2.76.
[Figure: F2,13 distribution with 90% do-not-reject region and 10% rejection region]
But we don't yet have a decision rule, since
T² ~ ((n - 1)p / (n - p)) Fp,n-p
29
Thus our critical value is
((n - 1)p / (n - p)) F2,13(0.10) = (14 · 2 / 13)(2.76) ≈ 5.95
So we have: Decision rule: do not reject H0 if
T² ≤ 5.95; otherwise reject H0
30
- Calculate the Test Statistic: We have
  x̄ = (5.265, -3.091)′ and S = [ 7.122  -0.726 ; -0.726  12.425 ]
  so
  T² = 15(x̄ - μ0)′ S⁻¹ (x̄ - μ0) = 5.970
31
- Use the Decision Rule to Evaluate the Test
  Statistic and Decide Whether to Reject or Not
  Reject the Null Hypothesis: T² = 5.970 > 5.95,
  so reject H0. The sample evidence supports the
  claim that the mean vector differs from
  μ0 = (4.0, -1.5)′.
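The bivariate test above can be reproduced numerically. This sketch (assuming numpy and scipy; the deck itself uses SAS) takes x̄ and S from the PROC CORR output shown later in the deck:

```python
# Hotelling's T^2 test of H0: mu = (4.0, -1.5)'.
import numpy as np
from scipy.stats import f

n, p = 15, 2
xbar = np.array([5.26533, -3.09133])
mu0 = np.array([4.0, -1.5])
S = np.array([[7.12164095, -0.72558524],
              [-0.72558524, 12.42459810]])

d = xbar - mu0
T2 = float(n * d @ np.linalg.inv(S) @ d)          # test statistic
crit = (n - 1) * p / (n - p) * f.ppf(0.90, p, n - p)  # ((n-1)p/(n-p)) F(0.10)
print(round(T2, 2), round(crit, 2), T2 > crit)
```

This reproduces the slide's T² = 5.97 against the critical value 5.95, so H0 is rejected.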
32
Why were the results of the original univariate
tests and the bivariate test so consistent? Look
at the data!
[Figure: scatter plot showing the sample centroid (x̄1, x̄2) and the hypothesized centroid (μ1, μ2)′ = μ0]
What do these data suggest about the
relationship between X1 and X2?
33
What if we took the same values for variables X1
and X2 and changed their pairings?
Of course, the univariate test results would not
change, but at a significance level of α = 0.10,
do these data now support the assertion that they
were drawn from a population with a centroid
(4.0, -1.5)′? In other words, retest the null
hypothesis
H0: μ = μ0 = (4.0, -1.5)′
34
Since we are using the same significance level
(α = 0.10) and the degrees of freedom are
unchanged, the decision rule is unchanged as
well.
[Figure: F2,13 distribution with 90% do-not-reject region and 10% rejection region]
So our critical value is still 5.95.
Decision rule: do not reject H0 if T² ≤ 5.95;
otherwise reject H0
35
However, we will have to recalculate the test
statistic (why? because re-pairing the values
changes the sample covariance matrix S), so
T² = 3.411
36
Use the Decision Rule to Evaluate the Test
Statistic and Decide Whether to Reject or Not
Reject the Null Hypothesis: T² = 3.411 ≤ 5.95,
so do not reject H0. The sample evidence does not
support the claim that the mean vector differs
from μ0 = (4.0, -1.5)′.
37
Why are we further from rejecting in the
bivariate case? Again, look at the data!
[Figure: scatter plot of the re-paired data with sample and hypothesized centroids]
What do these data suggest about the
relationship between X1 and X2?
38
What if we took the same values for variables X1
and X2 and changed their pairings so that their
correlation was positive?
Again, the univariate test results would not
change, but at a significance level of α = 0.10,
do these data now support the assertion that they
were drawn from a population with a centroid
(4.0, -1.5)′? In other words, retest the null
hypothesis
H0: μ = μ0 = (4.0, -1.5)′
39
Since we are using the same significance level
(α = 0.10) and the degrees of freedom are
unchanged, the decision rule is unchanged as
well.
[Figure: F2,13 distribution with 90% do-not-reject region and 10% rejection region]
So our critical value is still 5.95.
Decision rule: do not reject H0 if T² ≤ 5.95;
otherwise reject H0
40
However, we will again have to recalculate the
test statistic, so
T² = 247.470
41
Use the Decision Rule to Evaluate the Test
Statistic and Decide Whether to Reject or Not
Reject the Null Hypothesis: T² = 247.470 > 5.95,
so reject H0. The sample evidence supports the
claim that the mean vector differs from
μ0 = (4.0, -1.5)′.
42
Why is our decision so radically different?
Again, look at the data!
[Figure: scatter plot of the positively correlated re-paired data with sample and hypothesized centroids]
What do these data suggest about the
relationship between X1 and X2?
43
C. Hotelling's T² and Likelihood Ratio Tests
Likelihood Ratio Method - a general principle for
constructing test procedures which
- have several optimal properties for large samples
- are particularly convenient for testing
  multivariate hypotheses
Recall that the maximum of the multivariate
normal likelihood (over possible values of μ and
Σ) is given by
max L(μ, Σ) = (2π)^(-np/2) |Σ̂|^(-n/2) e^(-np/2)
44
where
Σ̂ = (1/n) Σ (xj - x̄)(xj - x̄)′
Under H0: μ = μ0, the normal likelihood
specializes to
L(μ0, Σ) = (2π)^(-np/2) |Σ|^(-n/2) exp( -(1/2) Σ (xj - μ0)′ Σ⁻¹ (xj - μ0) )
45
By an earlier result we can rewrite the exponent
in L(μ0, Σ), which yields
L(μ0, Σ) = (2π)^(-np/2) |Σ|^(-n/2) exp( -(1/2) tr[ Σ⁻¹ Σ (xj - μ0)(xj - μ0)′ ] )
46
Recall our earlier result: For a p × p symmetric
positive definite matrix B and scalar b > 0, it
follows that
(1/|Σ|^b) e^(-tr(Σ⁻¹B)/2) ≤ (1/|B|^b) (2b)^(pb) e^(-pb)
for all positive definite Σ of dimension p × p,
with equality holding only for
Σ = (1/(2b)) B
47
If we apply this result with
B = Σ (xj - μ0)(xj - μ0)′ and b = n/2
we have
max L(μ0, Σ) = (2π)^(-np/2) |Σ̂0|^(-n/2) e^(-np/2)
with
Σ̂0 = (1/n) Σ (xj - μ0)(xj - μ0)′
48
Now we can compare the maximum of L(μ0, Σ) with
the unrestricted maximum of L(μ, Σ) to determine
the plausibility of the null hypothesis; this
ratio is called the Likelihood Ratio Statistic:
Λ = max L(μ0, Σ) / max L(μ, Σ) = ( |Σ̂| / |Σ̂0| )^(n/2)
or equivalently
Λ^(2/n) = |Σ̂| / |Σ̂0|
which is called Wilks' Lambda.
49
Note that a small value of Wilks' lambda
suggests that the null hypothesis H0: μ = μ0 is
not likely to be true (and leads to rejection);
that is, reject H0 if
Λ = ( |Σ̂| / |Σ̂0| )^(n/2) < cα
where cα is the lower (100α)th percentile of the
distribution of Λ. But what is the distribution
of Wilks' Lambda?
50
There is a relationship between T² and Λ that
can help us here: suppose X1, …, Xn is a random
sample from an Np(μ, Σ) population. Then the test
based on T² is equivalent to the likelihood ratio
test of H0: μ = μ0 because
Λ^(2/n) = ( 1 + T²/(n - 1) )⁻¹
(substitute the appropriate critical value of T²
here to find the critical likelihood ratio
value), which also has the advantage of
demonstrating that T² can be calculated from the
ratio of two determinants (avoiding the need to
calculate S⁻¹). Solving for T² yields
T² = (n - 1)( Λ^(-2/n) - 1 ) = (n - 1)( |Σ̂0|/|Σ̂| - 1 )
51
Example: for our previous bivariate data,
perform the likelihood ratio test of the
hypothesis that the centroid is (4.0, -1.5)′.
In our previous test of the null hypothesis
H0: μ = μ0 = (4.0, -1.5)′
we obtained a Hotelling's T² test statistic value
of 5.970 and a critical value (at α = 0.10) of
5.95.
52
Let's go through the five steps of hypothesis
testing to assess the potential validity of our
assertion.
- State the Null and Alternative Hypotheses:
  H0: μ = μ0, H1: μ ≠ μ0
- Select the Appropriate Test Statistic:
  n - p = 15 - 2 = 13 is not very large, but the
  data appear relatively bivariate normal, so use
  Wilks' lambda
53
- State the Desired Level of Significance α, Find
  the Critical Value(s), and State the Decision
  Rule: α = 0.10 and with ν1 = p = 2, ν2 = n - p = 13
  degrees of freedom, we have F2,13(0.10) = 2.76.
[Figure: F2,13 distribution with 90% do-not-reject region and 10% rejection region]
But we don't yet have a decision rule, since
T² ~ ((n - 1)p / (n - p)) Fp,n-p
54
Thus our critical T² value is 5.95.
This leads to a critical likelihood ratio value of
Λ^(2/n) = ( 1 + 5.95/14 )⁻¹ ≈ 0.7018
So we have: Decision rule: do not reject H0 if
Λ^(2/n) ≥ 0.7018; otherwise reject H0
55
- Calculate the Test Statistic: From our
  earlier results we have T² = 5.970, so the
  calculated value of the likelihood ratio test
  statistic for this sample is
  Λ^(2/n) = ( 1 + 5.970/14 )⁻¹ ≈ 0.7011
56
- Use the Decision Rule to Evaluate the Test
  Statistic and Decide Whether to Reject or Not
  Reject the Null Hypothesis: Λ^(2/n) = 0.7011 <
  0.7018, so reject H0. The sample evidence
  supports the claim that the mean vector differs
  from μ0 = (4.0, -1.5)′.
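The T²-to-Wilks'-lambda conversion above can be checked directly with the values from the earlier slides (a Python sketch; the deck itself uses SAS):

```python
# Lambda^(2/n) = (1 + T^2/(n-1))^(-1), applied to both the test
# statistic and the critical value.
n = 15
T2, T2_crit = 5.970, 5.95
wilks = 1 / (1 + T2 / (n - 1))            # ~0.7011
wilks_crit = 1 / (1 + T2_crit / (n - 1))  # ~0.7018
print(round(wilks, 4), round(wilks_crit, 4))
```

The computed 0.7011 also matches the Wilks' Lambda value printed by PROC GLM later in the deck (0.70106280).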
57
SAS code for a Hypothesis Test of a Mean Vector

OPTIONS LINESIZE=72 NODATE PAGENO=1;
DATA stuff;
  INPUT x1 x2;
  x1o = 4.0;
  x2o = -1.5;
  x1dif = x1 - x1o;
  x2dif = x2 - x2o;
  LABEL x1='Observed Values of X1'
        x2='Observed Values of X2'
        x1o='Hypothesized Value of X1'
        x2o='Hypothesized Value of X2'
        x1dif='Difference Between Observed and Hypothesized Values of X1'
        x2dif='Difference Between Observed and Hypothesized Values of X2';
CARDS;
1.43 -0.69
. .
. .
9.42 -7.64
;
58
SAS code for a Hypothesis Test of a Mean Vector
(continued)

PROC MEANS DATA=stuff N MEAN STD T PRT;
  VAR x1 x2 x1dif x2dif;
  TITLE4 'Using PROC MEANS to generate univariate summary statistics';
RUN;
PROC CORR DATA=stuff COV;
  VAR x1 x2;
  TITLE4 'Using PROC CORR to generate the sample covariance matrix';
RUN;
PROC GLM DATA=stuff;
  MODEL x1dif x2dif = / nouni;
  MANOVA H=INTERCEPT;
  TITLE4 'Using PROC GLM to test a Hypothesis of a Mean Vector';
RUN;
59
SAS output for Univariate Hypothesis Test of Means

The MEANS Procedure

Variable  Label                                                       N
x1        Observed Values of X1                                      15
x2        Observed Values of X2                                      15
x1dif     Difference Between Observed and Hypothesized Values of X1  15
x2dif     Difference Between Observed and Hypothesized Values of X2  15

Variable         Mean      Std Dev    t Value    Pr > |t|
x1          5.2653333    2.6686403       7.64      <.0001
x2         -3.0913333    3.5248543      -3.40      0.0043
x1dif       1.2653333    2.6686403       1.84      0.0876
x2dif      -1.5913333    3.5248543      -1.75      0.1023

60
SAS output: Sample Covariance and Correlation Matrices

The CORR Procedure
2 Variables: x1 x2

Covariance Matrix, DF = 14
                                       x1             x2
x1  Observed Values of X1     7.12164095    -0.72558524
x2  Observed Values of X2    -0.72558524    12.42459810

Simple Statistics
Variable   N       Mean   Std Dev        Sum   Minimum   Maximum  Label
x1        15    5.26533   2.66864   78.98000   1.43000   9.42000  Observed Values of X1
x2        15   -3.09133   3.52485  -46.37000  -7.88000   2.87000  Observed Values of X2

Pearson Correlation Coefficients, N = 15
Prob > |r| under H0: Rho=0
                                  x1         x2
x1  Observed Values of X1    1.00000   -0.07714
                                         0.7847
x2  Observed Values of X2   -0.07714    1.00000
                              0.7847
61
SAS output for a Hypothesis Test of a Mean Vector

The GLM Procedure
Multivariate Analysis of Variance

Characteristic Roots and Vectors of: E Inverse * H, where
H = Type III SSCP Matrix for Intercept
E = Error SSCP Matrix

Characteristic   Percent   Characteristic Vector V'EV=1
Root                            x1dif          x2dif
0.42640573        100.00   0.07016070    -0.05016332
0.00000000          0.00   0.07188395     0.05715782

MANOVA Test Criteria and Exact F Statistics for
the Hypothesis of No Overall Intercept Effect
H = Type III SSCP Matrix for Intercept
E = Error SSCP Matrix
S=1  M=0  N=5.5

Statistic                    Value   F Value   Num DF   Den DF   Pr > F
Wilks' Lambda           0.70106280      2.77        2       13   0.0994
Pillai's Trace          0.29893720      2.77        2       13   0.0994
Hotelling-Lawley Trace  0.42640573      2.77        2       13   0.0994
Roy's Greatest Root     0.42640573      2.77        2       13   0.0994
62
The previous test was a specific example of
application of the General Likelihood Ratio
Method. Let θ be a vector consisting of all the
unknown parameters that take values in some
parameter set Θ (i.e., θ ∈ Θ). For example, in
the p-dimensional multivariate normal case,
θ′ = [μ1, …, μp, σ11, …, σ1p, σ21, …, σ2p, …, σp1, …, σpp]
and Θ consists of the p-dimensional space of
means, where -∞ < μi < ∞, i = 1, …, p, combined
with the 0.5p(p + 1)-dimensional space of
variances and covariances such that Σ is positive
definite. Thus Θ has dimension ν = p + 0.5p(p + 1).
Also let L(θ) be the likelihood function obtained
by evaluating the joint density of X1, …, Xn at
their observed values x1, …, xn.
63
Now under the null hypothesis H0: θ ∈ Θ0, θ is
restricted to lie in some region Θ0 ⊂ Θ. For
example, in the p-dimensional multivariate normal
case with μ = μ0 and Σ unspecified, we have
Θ0 = { θ : μ = μ0, Σ positive definite }
so Θ0 has dimension ν0 = 0 + 0.5p(p + 1) =
0.5p(p + 1). Now a likelihood ratio test rejects
H0: θ ∈ Θ0 in favor of H1: θ ∉ Θ0 when
Λ = max L(θ) over Θ0 / max L(θ) over Θ < c
for a suitably chosen constant c (some
percentile of the distribution of Λ).
64
Conveniently, for a relatively large sample size
n, under the null hypothesis H0: θ ∈ Θ0,
-2 ln Λ is approximately distributed as chi-square
with ν - ν0 degrees of freedom.
65
D. Confidence Regions and Simultaneous
Comparisons of Component Means
To extend the concept of a univariate 100(1 - α)%
confidence interval for the mean
x̄ ± tn-1(α/2) s/√n
to a p-dimensional multivariate space, let θ be
a vector of unknown population parameters such
that θ ∈ Θ. The 100(1 - α)% Confidence Region
determined by data X = [X1, X2, …, Xn], denoted
R(X), is the region satisfying
P[R(X) will contain the true θ] = 1 - α
66
For example, a univariate 100(1 - α)% confidence
interval for the mean may look like this:
[Figure: interval on the number line centered at x̄]
A corresponding confidence region in
p-dimensional multivariate space is given by the
set of all μ satisfying
n(x̄ - μ)′ S⁻¹ (x̄ - μ) ≤ c²
provided n(·)′S⁻¹(·) is the appropriate measure of
distance.
67
More formally, a 100(1 - α)% confidence region
for the mean vector μ of a p-dimensional normal
distribution is the ellipsoid determined by all
possible points μ that satisfy
n(x̄ - μ)′ S⁻¹ (x̄ - μ) ≤ (p(n - 1)/(n - p)) Fp,n-p(α)
where
x̄ = (1/n) Σ xj and S = (1/(n - 1)) Σ (xj - x̄)(xj - x̄)′
68
Example: Use our original bivariate sample to
construct a two-dimensional 90% confidence region
and determine whether the point (4.0, -1.5)′ lies
in this region.
We have already calculated the summary
statistics
x̄ = (5.265, -3.091)′ and S = [ 7.122  -0.726 ; -0.726  12.425 ]
69
So a 100(1 - α)% confidence region for the mean
vector μ is the ellipsoid determined by all
possible points μ that satisfy
15(x̄ - μ)′ S⁻¹ (x̄ - μ) ≤ (2(14)/13) F2,13(0.10) = 5.95
So does
μ0 = (4.0, -1.5)′
fall inside this 90% confidence region?
70
We have that
15(x̄ - μ0)′ S⁻¹ (x̄ - μ0) = 5.970 > 5.95
so it does not fall inside the 90% confidence
region! Does the point (-1.5, 4.0)′ lie in this
region? We have that the corresponding statistic
far exceeds 5.95, so this point falls FAR outside
the 90% confidence region!
71
We can draw the axes of the ellipsoid that
represents the 100(1 - α)% confidence region for
the mean vector μ, the ellipsoid determined by
all possible points μ that satisfy
n(x̄ - μ)′ S⁻¹ (x̄ - μ) ≤ (p(n - 1)/(n - p)) Fp,n-p(α)
For our ongoing problem at the 90% level of
confidence, this is equivalent to
15(x̄ - μ)′ S⁻¹ (x̄ - μ) ≤ 5.95
72
We can draw the axes of the ellipsoid that
represents the 100(1 - α)% confidence region for
the mean vector μ. Recall that the directions and
relative lengths of the axes of this confidence
ellipsoid are determined by going
√λi √( p(n - 1) Fp,n-p(α) / (n(n - p)) )
units along the corresponding eigenvectors ei.
Beginning at the centroid x̄, the axes of the
confidence ellipsoid are
x̄ ± √λi √( p(n - 1) Fp,n-p(α) / (n(n - p)) ) ei
Note that the ratios of the λi's aid in
identification of the relative amounts of
elongation along pairs of axes.
73
Example: We can draw the 100(1 - α)% confidence
region for the mean vector μ. The eigenvalue-
eigenvector pairs (λi, ei) for the sample
covariance matrix S satisfy Sei = λiei, so the
half-lengths of the major and minor axes are
given by
√λi √( p(n - 1) Fp,n-p(α) / (n(n - p)) )
74
The axes lie along the corresponding
eigenvectors ei when these vectors are plotted
with the sample centroid x̄ as the origin.
75
[Figure: confidence ellipse centered at the sample centroid (x̄1, x̄2)]
76
[Figure: confidence ellipse centered at the sample centroid (x̄1, x̄2), with axes drawn]
Now we move ±3.73 units along the vector e1 and
±2.793 units along the vector e2.
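The axis construction can be sketched numerically (assuming numpy and scipy; the deck itself works from tables). The slide's plotted distances of 3.73 and 2.793 may reflect a different plot scaling, so only the eigenstructure of S is asserted here:

```python
# Eigen decomposition of S and half-lengths of the 90% confidence
# ellipse axes: sqrt(lambda_i) * sqrt(p(n-1)F/(n(n-p))).
import numpy as np
from scipy.stats import f

n, p, alpha = 15, 2, 0.10
S = np.array([[7.12164095, -0.72558524],
              [-0.72558524, 12.42459810]])
lam, vecs = np.linalg.eigh(S)   # eigenvalues in ascending order
c2 = p * (n - 1) / (n - p) * f.ppf(1 - alpha, p, n - p)
half_lengths = np.sqrt(lam * c2 / n)
print(lam, half_lengths)
```

The ratio of the half-lengths equals √(λ1/λ2), which quantifies the elongation of the ellipse.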

77
We can also project the ellipse onto the axes to
form simultaneous confidence intervals.
[Figure: shadows of the 90% confidence region on the X1 and X2 axes]
78
Simultaneous Confidence Statements
The joint confidence region
n(x̄ - μ)′ S⁻¹ (x̄ - μ) ≤ c²
for some constant c correctly assesses the
plausibility of various values of μ.
However, such regions are somewhat awkward; we
may desire confidence statements about individual
components of the mean vector μ that would hold
simultaneously with a reasonable level of
confidence. Such intervals are referred to as
Simultaneous Confidence Intervals. One approach
to constructing these intervals is to relate them
to the T²-based confidence region.
79
To start, suppose X ~ Np(μ, Σ) and form the
linear combination
Z = a′X = a1X1 + … + apXp
From earlier results we have that
μZ = E(Z) = a′μ
and
σZ² = Var(Z) = a′Σa
and Z ~ N(a′μ, a′Σa).
80
Now, if we take a sample X1, …, Xn from the
Np(μ, Σ) population and construct a corresponding
sample of Z's
zj = a′xj, j = 1, …, n
the sample mean and variance of the observed
values z1, …, zn are
z̄ = a′x̄
and
sz² = a′Sa
81
We can now develop simultaneous confidence
intervals by judicious choices of a. For a fixed
a and unknown σZ², a 100(1 - α)% confidence
interval for μZ = a′μ is based on
t = (z̄ - μZ) / (sz/√n)
which leads to the statement
z̄ - tn-1(α/2) sz/√n ≤ μZ ≤ z̄ + tn-1(α/2) sz/√n
which can be rewritten as
a′x̄ - tn-1(α/2) √(a′Sa/n) ≤ a′μ ≤ a′x̄ + tn-1(α/2) √(a′Sa/n)
82
Note that the inequality
a′x̄ - tn-1(α/2) √(a′Sa/n) ≤ a′μ ≤ a′x̄ + tn-1(α/2) √(a′Sa/n)
can be interpreted as a statement about
components of μ. For example, taking a to be a
vector with a 1 in the ith position and zeros
elsewhere will yield the usual confidence
interval for μi. However, it is obvious that if
we did so for i = 1, …, p at some fixed level of
confidence 1 - α, the confidence of all
statements taken together (i.e., the
experimentwise level of confidence) is not 1 - α!
83
Given a set of data x1, …, xn and a particular a,
the confidence interval is the set of all a′μ
values such that
|t| = | √n(a′x̄ - a′μ) / √(a′Sa) | ≤ tn-1(α/2)
or equivalently
t² = n(a′(x̄ - μ))² / (a′Sa) ≤ tn-1²(α/2)
So a simultaneous confidence region is given by
the set of all a′μ values such that t² is
relatively small for all choices of a!
84
Considering values of a for which t² ≤ c², it is
natural to try to determine
max over a of n(a′(x̄ - μ))² / (a′Sa)
  • Now by a previous maximization lemma:
  • For a given p x 1 vector d and p x p positive
    definite matrix B, and an arbitrary nonzero
    vector x,
max over x ≠ 0 of (x′d)² / (x′Bx) = d′B⁻¹d
with the maximum attained when x = cB⁻¹d for any
constant c ≠ 0. Now if we take x = a, d = (x̄ - μ),
and B = S, we get
max over a of n(a′(x̄ - μ))² / (a′Sa) = n(x̄ - μ)′S⁻¹(x̄ - μ) = T²
85
with the maximum occurring where a is
proportional to S⁻¹(x̄ - μ).
86
Thus, for a sample X1, …, Xn from the Np(μ, Σ)
population with positive definite Σ, we have
simultaneously for all a that the interval
a′x̄ ± √( p(n - 1)/(n - p) Fp,n-p(α) ) √(a′Sa/n)
will contain a′μ with probability 1 - α. These
are often referred to as T²-intervals.
87
Note that successive choices of a′ = [1 0 0 … 0],
a′ = [0 1 0 … 0], …, a′ = [0 0 0 … 1] for the
T²-intervals allow us to conclude that the
intervals
x̄i ± √( p(n - 1)/(n - p) Fp,n-p(α) ) √(sii/n), i = 1, …, p
will all hold simultaneously with probability
1 - α.
88
Alternatively, a judicious choice of a (such as
a′ = [1 -1 0 … 0]) for the T²-intervals allows us
to test special contrasts, e.g.,
(x̄i - x̄k) ± √( p(n - 1)/(n - p) Fp,n-p(α) ) √((sii - 2sik + skk)/n)
which again will hold simultaneously with
probability 1 - α.
89
Example: Use our original bivariate sample to
construct simultaneous 90% confidence intervals
for the means of the two variables X1 and X2.
We have already calculated the summary
statistics
x̄ = (5.265, -3.091)′ and S = [ 7.122  -0.726 ; -0.726  12.425 ]
90
Our choices of a′ are [1 0] and [0 1], which
yield T²-intervals of (approximately)
3.6 ≤ μ1 ≤ 6.9
91
and
-5.3 ≤ μ2 ≤ -0.9
which will hold simultaneously with probability
1 - α.
92
Example: Use our original bivariate sample to
construct simultaneous 90% confidence intervals
for the sum of and difference between the means
of the two variables X1 and X2.
Again, we will make use of our previously
calculated summary statistics
x̄ = (5.265, -3.091)′ and S = [ 7.122  -0.726 ; -0.726  12.425 ]
93
Our choices of a′ are [1 1] and [1 -1], so the
T²-intervals are
(x̄1 + x̄2) ± √( p(n - 1)/(n - p) F2,13(0.10) ) √((s11 + 2s12 + s22)/n)
94
and
(x̄1 - x̄2) ± √( p(n - 1)/(n - p) F2,13(0.10) ) √((s11 - 2s12 + s22)/n)
which will hold simultaneously with probability
1 - α.
96
[Figure: shadows of the 90% confidence region: on the X1 axis from 3.6 to 6.9; on the X2 axis from -5.3 to -0.9]
Note that the projections of the ellipse onto
the axes do form the simultaneous confidence
intervals.
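The shadow intervals above can be reproduced from the summary statistics (a Python sketch assuming scipy; the deck itself uses SAS and tables). They match the projections of the ellipse onto the axes:

```python
# Simultaneous T^2-intervals: xbar_i +/- c * sqrt(s_ii / n),
# with c = sqrt(p(n-1)/(n-p) * F_{p,n-p}(alpha)).
import math
from scipy.stats import f

n, p, alpha = 15, 2, 0.10
xbar = [5.26533, -3.09133]
diag_S = [7.12164095, 12.42459810]  # diagonal of S from PROC CORR
c = math.sqrt(p * (n - 1) / (n - p) * f.ppf(1 - alpha, p, n - p))
intervals = []
for xb, v in zip(xbar, diag_S):
    half = c * math.sqrt(v / n)
    intervals.append((round(xb - half, 1), round(xb + half, 1)))
print(intervals)  # [(3.6, 6.9), (-5.3, -0.9)]
```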
97
Also note that, for this example, the
one-at-a-time (univariate) confidence intervals
would be
x̄1 ± t14(0.05) √(s11/n)
98
and
x̄2 ± t14(0.05) √(s22/n)
These intervals are not guaranteed to hold
simultaneously with probability 1 - α. Why?
99
[Figure: univariate 90% confidence intervals versus the ellipse shadows: for X1, the univariate interval runs 4.0 to 6.5 (shadow 3.6 to 6.9); for X2, -4.7 to -1.5 (shadow -5.3 to -0.9)]
Note that the univariate intervals are shorter;
they do not consider covariance between X1 and X2!
100
Even if we make a Bonferroni-type adjustment to
the one-at-a-time (univariate) confidence
intervals
x̄1 ± t14(0.025) √(s11/n)
101
and
x̄2 ± t14(0.025) √(s22/n)
these intervals still do not account for the
covariance between X1 and X2. Why?
102
[Figure: Bonferroni-adjusted univariate 90% confidence intervals versus the ellipse shadows: for X1, the Bonferroni interval runs 3.8 to 6.7 (shadow 3.6 to 6.9); for X2, -5.0 to -1.1 (shadow -5.3 to -0.9)]
Note that the Bonferroni-adjusted univariate
intervals are still shorter; they do not
consider covariance between X1 and X2!
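A sketch (assuming scipy) comparing the half-widths of the three interval types for μ1; the Bonferroni interval sits between the one-at-a-time and T² intervals:

```python
# Half-widths for mu1 under three critical values:
# one-at-a-time t, Bonferroni-adjusted t, and the simultaneous T^2 value.
import math
from scipy.stats import t, f

n, p, m, alpha = 15, 2, 2, 0.10
se1 = math.sqrt(7.12164095 / n)  # std error for x1, from PROC CORR output
c_uni = t.ppf(1 - alpha / 2, n - 1)
c_bonf = t.ppf(1 - alpha / (2 * m), n - 1)
c_t2 = math.sqrt(p * (n - 1) / (n - p) * f.ppf(1 - alpha, p, n - p))
halves = [round(c * se1, 2) for c in (c_uni, c_bonf, c_t2)]
print(halves)  # widths increase: univariate < Bonferroni < T^2
```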
103
E. Large Sample Inferences About a Population
Mean Vector
When the sample is large (n >> p) we don't need
to rely on the multivariate normality of the
population to make inferences about the mean
vector μ (why not?). From an earlier result, we
know that
√n(X̄ - μ) is approximately Np(0, Σ)
so we can say that
n(X̄ - μ)′ S⁻¹ (X̄ - μ) is approximately χ²p
104
This result leads directly to large sample
(n >> p) simultaneous confidence intervals and
hypothesis tests for the mean vector μ. Let
X1, …, Xn be a random sample from a population
with mean μ, positive definite covariance matrix
Σ, and some arbitrary distribution. When n - p is
large, the hypothesis H0: μ = μ0 is rejected in
favor of H1: μ ≠ μ0 at a level of significance α
if
n(x̄ - μ0)′ S⁻¹ (x̄ - μ0) > χ²p(α)
105
The difference between the normal theory (small
sample) test and the large sample test is the
critical value of the distribution:
((n - 1)p/(n - p)) Fp,n-p(α) versus χ²p(α)
In fact, as n - p grows, the critical value for
the normal theory (small sample) test approaches
that of the large sample test:
((n - 1)p/(n - p)) Fp,n-p(α) → χ²p(α)
So it is never truly inappropriate to use the
normal theory (small sample) test, just
conservative.
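The convergence claim above can be illustrated numerically (a sketch assuming scipy):

```python
# Small-sample critical value ((n-1)p/(n-p)) F_{p,n-p}(alpha) shrinks
# toward the large-sample chi-square critical value as n grows.
from scipy.stats import f, chi2

p, alpha = 2, 0.10
chi2_crit = chi2.ppf(1 - alpha, p)  # ~4.61
crits = []
for n in (15, 50, 500):
    crits.append((n - 1) * p / (n - p) * f.ppf(1 - alpha, p, n - p))
print([round(c, 2) for c in crits], round(chi2_crit, 2))
```

At n = 15 the small-sample critical value is 5.95 (as on the earlier slides); it decreases toward, but always exceeds, 4.61, which is why the small-sample test is conservative.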
106
We can also build large sample (n >> p)
simultaneous confidence intervals for the mean
vector μ. Let X1, …, Xn be a random sample from a
population with mean μ, positive definite
covariance matrix Σ, and some arbitrary
distribution. When n - p is large,
a′x̄ ± √χ²p(α) √(a′Sa/n)
will contain a′μ, for every a, with probability
1 - α. Consequently, we can make the following
100(1 - α)% simultaneous confidence statements:
x̄i ± √χ²p(α) √(sii/n), i = 1, …, p
107
Furthermore, for all pairs (μi, μk), i, k =
1, …, p, the corresponding sample mean-centered
ellipses
n [x̄i - μi, x̄k - μk] [ sii sik ; sik skk ]⁻¹ [x̄i - μi, x̄k - μk]′ ≤ χ²p(α)
contain (μi, μk) with probability 1 - α.
108
Example: Suppose we have collected a random
sample of 107 observations in p = 5 dimensions
and calculated the summary statistics x̄ and S.
Construct the five simultaneous 90% confidence
intervals for the individual mean components μi,
i = 1, …, 5.
109
We have a large sample (n = 107 >> p = 5), so
we will use the large sample approach to
construct the simultaneous 90% confidence
intervals for the individual components of the
mean vector μ:
x̄i ± √χ²5(0.10) √(sii/107), i = 1, …, 5
110
If we used these intervals to test the null
hypothesis
H0: μ = μ0
at an α = 1 - 0.90 = 0.10 significance level, we
would reject the null (why?).