Techniques for studying correlation and covariance structure

Transcript and Presenter's Notes
1
Techniques for studying correlation and
covariance structure
  • Principal Components Analysis (PCA)
  • Factor Analysis

2
Principal Component Analysis
3
Let x′ = (x1, x2, …, xp) have a p-variate Normal distribution with mean vector μ and covariance matrix S.
Definition
The linear combination C1 = a1′x is called the first principal component if a1 is chosen to maximize Var(C1) = a1′Sa1 subject to a1′a1 = 1.
4
Consider maximizing a′Sa subject to a′a = 1.
Using the Lagrange multiplier technique, let
Q(a, λ) = a′Sa − λ(a′a − 1)
5
Now ∂Q/∂a = 2Sa − 2λa = 0, so Sa = λa (that is, λ is an eigenvalue of S and a is a corresponding eigenvector),
and ∂Q/∂λ = −(a′a − 1) = 0, so a′a = 1.
6
Summary
C1 = a1′x is the first principal component if a1 is the eigenvector (of length 1) of S associated with the largest eigenvalue λ1 of S. Then Var(C1) = a1′Sa1 = λ1.
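This summary can be checked numerically. A minimal sketch with NumPy, using made-up 3-variable data (the mean and covariance below are purely illustrative):

```python
import numpy as np

# Simulate an illustrative 3-variable Normal sample (hypothetical parameters)
rng = np.random.default_rng(0)
x = rng.multivariate_normal(mean=[0, 0, 0],
                            cov=[[4, 2, 1], [2, 3, 1], [1, 1, 2]],
                            size=500)

S = np.cov(x, rowvar=False)           # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(S)  # eigh returns eigenvalues in ascending order
a1 = eigvecs[:, -1]                   # eigenvector for the largest eigenvalue, length 1
l1 = eigvals[-1]

C1 = x @ a1                           # first principal component scores
# Var(C1) = a1' S a1 equals the largest eigenvalue of S
print(np.var(C1, ddof=1), l1)
```

The sample variance of the scores C1 matches λ1 exactly (up to floating point), because a1 is a unit-length eigenvector of the sample covariance matrix.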
7
The complete set of Principal components
Let x′ = (x1, x2, …, xp) have a p-variate Normal distribution with mean vector μ and covariance matrix S.
Definition
The set of linear combinations Ci = ai′x, i = 1, …, p, are called the principal components of x if a1, …, ap are chosen such that ai′ai = 1, i = 1, …, p,
8
and
  • Var(C1) is maximized.
  • Var(Ci) is maximized subject to Ci being independent of C1, …, Ci−1 (the previous i − 1 principal components).

Note: we have already shown that a1 is the eigenvector of S associated with the largest eigenvalue, λ1, of the covariance matrix S, and that Var(C1) = a1′Sa1 = λ1.
9
We will now show that ai is the eigenvector of S associated with the ith largest eigenvalue, λi, of the covariance matrix S, and that Var(Ci) = ai′Sai = λi.
Proof (by induction: assume the claim is true for 1, …, i − 1, then prove it true for i).
10
Now the vector (C1, …, Ci−1, Ci)′ = (a1, …, ai−1, ai)′x has covariance matrix whose (j, k) element is aj′Sak.
11
Hence Ci is independent of C1, …, Ci−1 if aj′Sai = 0 for j = 1, …, i − 1.
We want to maximize Var(Ci) = ai′Sai subject to ai′ai = 1 and aj′Sai = 0 for j < i.
Let Q(ai, λ, f1, …, fi−1) = ai′Sai − λ(ai′ai − 1) − Σ_{j<i} fj aj′Sai
12
Now ∂Q/∂ai = 2Sai − 2λai − Σ_{j<i} fj Saj = 0
and ∂Q/∂λ = −(ai′ai − 1) = 0, so ai′ai = 1.
13
Now Saj = λjaj for j < i (by the induction hypothesis), hence
2Sai − 2λai − Σ_{j<i} fj λj aj = 0.   (1)
Also, for j < i, premultiplying (1) by aj′ and using aj′Sai = 0 and aj′ak = δjk gives −fj λj = 0.
Hence fj = 0 for j < i, and equation (1) becomes Sai = λai.
14
Thus a1, …, ap are the eigenvectors of S associated with the eigenvalues λ1 ≥ λ2 ≥ … ≥ λp, and
  • Var(C1) is maximized.
  • Var(Ci) is maximized subject to Ci being independent of C1, …, Ci−1 (the previous i − 1 principal components),

where Var(Ci) = ai′Sai = λi.
15
Recall that any positive definite matrix S can be written
S = λ1a1a1′ + λ2a2a2′ + … + λpapap′
where a1, …, ap are eigenvectors of S of length 1 and λ1 ≥ λ2 ≥ … ≥ λp are the eigenvalues of S.
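This spectral decomposition is easy to verify numerically. A quick NumPy check, using an arbitrary positive definite matrix as a stand-in for S:

```python
import numpy as np

# Arbitrary positive definite matrix, chosen only for illustration
S = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 2.0]])

eigvals, eigvecs = np.linalg.eigh(S)   # columns of eigvecs are unit eigenvectors

# Rebuild S as the sum of the rank-one pieces lambda_i * a_i a_i'
S_rebuilt = sum(l * np.outer(a, a) for l, a in zip(eigvals, eigvecs.T))

print(np.allclose(S, S_rebuilt))
```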
16
Example
  • In this example wildlife (moose) population
    density was measured over time (once a year) in
    three areas.

17
[Figure: the three study areas (Area 1, Area 2, Area 3)]
18
The Sample Statistics
The mean vector
The covariance matrix
The correlation matrix
19
Principal Component Analysis
The eigenvalues of S
The eigenvectors of S
The principal components
20
[Figure: plots for Area 1, Area 2, and Area 3]
21
[Figure: plots for Area 1, Area 2, and Area 3]
22
[Figure: plots for Area 1, Area 2, and Area 3]
23
Graphical Picture of Principal Components
Multivariate Normal data fall in an ellipsoidal pattern.
The shape and orientation of the ellipsoid are determined by the covariance matrix S.
The eigenvectors of S are vectors giving the directions of the axes of the ellipsoid; the eigenvalues determine the lengths of these axes.
24
  • Recall that if S is a positive definite matrix, then
S = PDP′
where P is an orthogonal matrix (P′P = PP′ = I) with columns equal to the eigenvectors of S, and D is a diagonal matrix with diagonal elements equal to the eigenvalues of S.
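This factorization is what numerical eigenvalue routines return directly. A short NumPy check, again with an arbitrary positive definite matrix standing in for S:

```python
import numpy as np

# Arbitrary positive definite stand-in for the covariance matrix S
S = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 2.0]])

eigvals, P = np.linalg.eigh(S)   # columns of P are eigenvectors of S
D = np.diag(eigvals)

print(np.allclose(P.T @ P, np.eye(3)))  # P is orthogonal: P'P = I
print(np.allclose(P @ D @ P.T, S))      # S = P D P'
print(np.allclose(P.T @ S @ P, D))      # covariance of C = P'x is diagonal: P'SP = D
```

The last line is the next slide's point in matrix form: the principal components C = P′x are uncorrelated, with variances equal to the eigenvalues.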
25
  • The vector of principal components C = (C1, …, Cp)′ = P′x has covariance matrix
Var(C) = P′SP = D
26
  • An orthogonal matrix rotates vectors; thus P′ rotates the data vector x into the vector of principal components C = P′x.
Also
Var(C1) + … + Var(Cp) = λ1 + … + λp = tr(D) = tr(S) = Var(x1) + … + Var(xp)
27
  • The ratio
λi / (λ1 + λ2 + … + λp)
denotes the proportion of variance explained by the ith principal component Ci.
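In practice these proportions are computed straight from the eigenvalues. A sketch with an illustrative covariance matrix:

```python
import numpy as np

# Illustrative covariance matrix (not the moose data from the example)
S = np.array([[4.0, 2.0, 1.0],
              [2.0, 3.0, 1.0],
              [1.0, 1.0, 2.0]])

eigvals = np.linalg.eigvalsh(S)[::-1]   # sort descending: l1 >= l2 >= l3
explained = eigvals / eigvals.sum()     # proportion of variance per component

# The eigenvalues sum to tr(S), the total variance, so the proportions sum to 1
print(np.isclose(eigvals.sum(), np.trace(S)))
print(explained)
```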
28
The Example
29
Also
where
30
  • Comment
  • If, instead of the covariance matrix S, the correlation matrix R is used to extract the principal components, then the principal components are defined in terms of the standard scores of the observations:
zij = (xij − x̄j) / sj
The correlation matrix is the covariance matrix of the standard scores of the observations.
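This comment is easy to verify: standardizing the observations and taking the covariance matrix of the standard scores reproduces R exactly. A sketch with made-up data on very different scales:

```python
import numpy as np

# Made-up data whose three variables have very different scales
rng = np.random.default_rng(1)
x = rng.normal(size=(200, 3)) * np.array([1.0, 10.0, 100.0])

R = np.corrcoef(x, rowvar=False)        # correlation matrix of the raw data

# Standard scores: z_ij = (x_ij - mean_j) / s_j
z = (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)
S_z = np.cov(z, rowvar=False)           # covariance matrix of the standard scores

print(np.allclose(R, S_z))              # R is the covariance of the standard scores
```

Because the scales differ so much, PCA on S would be dominated by the third variable; PCA on R weights all three variables equally.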
31
More Examples
32-36
(No Transcript)
37
  • Computation of the eigenvalues and eigenvectors
    of S

Recall S = λ1a1a1′ + λ2a2a2′ + … + λpapap′, hence
S^2 = λ1^2 a1a1′ + λ2^2 a2a2′ + … + λp^2 apap′
38
Continuing, we see that
S^n = λ1^n a1a1′ + λ2^n a2a2′ + … + λp^n apap′ = λ1^n [a1a1′ + (λ2/λ1)^n a2a2′ + … + (λp/λ1)^n apap′]
For large values of n (since λi/λ1 < 1 for i > 1),
S^n ≈ λ1^n a1a1′
39
The algorithm for computing the first eigenvector
  • Compute S^n for a large value of n, rescaling at each step so that the elements do not become too large in value, i.e. rescale so that the largest element is 1.
  • Compute a1 using the fact that S^n ≈ λ1^n a1a1′: each column of S^n is proportional to a1, so rescale a column to length 1.
  • Compute λ1 using Sa1 = λ1a1 (e.g. λ1 = a1′Sa1).
40
  • Repeat using the matrix S − λ1a1a1′ to obtain λ2 and a2.
  • Continue with i = 2, …, p − 1, using the matrix S − λ1a1a1′ − … − λiaiai′ to obtain λi+1 and ai+1.
Example Using Excel - Eigen