Principal Components

- As part of a study on football helmets, scientists collected head measurements from a number of football players
- They measured 6 different aspects of the players' heads

Principal Components

- Dealing with 6-dimensional data is difficult
- We could plot pairs of variables
- There are 15 such pairs (6 choose 2)
- But it turns out that even this may not give a clear picture of the data
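As a quick check of the count, the pairs are the 2-element subsets of the 6 measurements. A minimal MATLAB/Octave sketch follows; the variable names are illustrative and not from the original slides.

```matlab
% Enumerate every pair of the 6 measured variables.
k = 6;
pairs = nchoosek(1:k, 2);   % each row is one (i, j) pair with i < j
npairs = size(pairs, 1);    % equals k*(k-1)/2
```

Each row of pairs could then drive one scatter plot, for example plot(data(:,i), data(:,j), '.') for a data matrix with one column per measurement.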

Principal Components

- Suppose we have data that is distributed over the plane in 3D that passes through (1,0,0), (0,1,0), (0,0,1)
- No matter which pair of axes we use, the plots will just look like a cloud of points
- We will never realize that the data actually lies on a 2-dimensional surface
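The plane example can be reproduced numerically. The sketch below assumes 200 pseudo-random points on the plane x + y + z = 1 (the plane through the three points above); the sample size and variable names are assumptions made for illustration. One eigenvalue of the covariance matrix comes out numerically zero, and its eigenvector points along the plane's normal, which is the "missing" third dimension.

```matlab
% Assumption: 200 pseudo-random points lying on the plane x + y + z = 1.
n = 200;
xy = rand(n, 2);
data = [xy, 1 - sum(xy, 2)];     % third coordinate is forced by the plane

c = cov(data);                   % 3x3 covariance matrix
[v, d] = eig(c);
[lambda, idx] = sort(diag(d));   % ascending: smallest eigenvalue first
normal = v(:, idx(1));           % direction in which the data has no spread
```

Here lambda(1) is zero up to rounding, and normal is proportional to (1,1,1) up to sign.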

Principal Components

- Recall the formula for correlation in simple linear regression
- Corr = Sxy / √(Sxx Syy)
- The numerator, Sxy, is proportional to the covariance
- It measures the extent to which larger values of X are associated with larger (or smaller) values of Y
- If Cov = 0, then X and Y are uncorrelated (not linearly related)
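To make the formula concrete, here is a small MATLAB/Octave sketch that computes Sxx, Syy, and Sxy directly and compares the result with the built-in cov and corrcoef. The five data points are made up for illustration.

```matlab
% Made-up sample: 5 paired observations.
x = [1 2 3 4 5]';
y = [2.1 3.9 6.2 7.8 10.1]';
n = numel(x);

Sxx = sum((x - mean(x)).^2);
Syy = sum((y - mean(y)).^2);
Sxy = sum((x - mean(x)) .* (y - mean(y)));

r = Sxy / sqrt(Sxx * Syy);   % correlation from the formula
covxy = Sxy / (n - 1);       % sample covariance: Sxy scaled by 1/(n-1)

R = corrcoef([x y]);         % built-in check: R(1,2) should match r
C = cov([x y]);              % built-in check: C(1,2) should match covxy
```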

Principal Components

- In terms of a plot, this means that the plot of Y vs X is just a cloud of points
- The plot does not tilt in either direction

Principal Components

- The problem with the data in the plane example is that the values are correlated
- If we could look down the edge of the plane, then we would see that there is not a third dimension to the data
- All the data lies in only the two dimensions of the plane

Principal Components

- In reality, data rarely lies exactly on a plane
- But the cloud of points can extend much more in some directions than in others
- Typically, the data forms a cloud that resembles an ellipsoid

Principal Components

- If we can find the axes of the ellipsoid, we can view the data in terms of these components
- The longest axis is the most interesting
- The shortest axis does not have much information

Principal Components

- NOTE: There are two ways to proceed
- We can work with the Covariance matrix
- Or we can work with the Correlation matrix
- The theory is developed for the Cov matrix
- Sometimes the Corr matrix makes more sense, for example when the variables are measured on very different scales
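The two choices are closely related: the correlation matrix is the covariance matrix of the standardized data. A minimal MATLAB/Octave sketch, using a made-up 5-by-3 data matrix whose columns are on deliberately different scales:

```matlab
% Made-up data: 5 observations of 3 variables on very different scales.
data = [170 65.2 0.9;
        182 80.1 1.4;
        165 58.7 0.7;
        177 72.3 1.1;
        190 88.0 1.3];
n = size(data, 1);

centered = data - ones(n, 1) * mean(data);   % subtract column means
z = centered ./ (ones(n, 1) * std(data));    % divide by column std devs

covMat  = cov(data);   % dominated by the large-scale columns
corrMat = cov(z);      % same as corrcoef(data); unit diagonal
```

Working with corrMat gives every variable equal weight, whereas covMat lets the variables with the largest variances dominate.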

Principal Components

- For variables x1, x2, ..., xk, define the covariance matrix so that the (i,j) element is proportional to Sxixj (the covariance of xi and xj)
- This means that the matrix will be symmetric
- If two variables are uncorrelated, then the corresponding element of Cov will be zero (or nearly so)
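The definition can be checked by building the matrix by hand. The sketch below uses a made-up 5-by-3 data matrix and assumes the usual sample normalization by n - 1, which is what the built-in cov uses.

```matlab
% Made-up data: 5 observations (rows) of 3 variables (columns).
data = [2 1 4; 3 5 1; 6 2 2; 7 8 3; 9 4 5];
n = size(data, 1);

centered = data - ones(n, 1) * mean(data);   % subtract each column's mean
S = centered' * centered;                    % (i,j) entry is the sum of cross-products of xi and xj
c = S / (n - 1);                             % sample covariance matrix

asymmetry = max(max(abs(c - c')));           % zero up to rounding: c is symmetric
difference = max(max(abs(c - cov(data))));   % zero up to rounding: matches cov()
```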

Principal Components

- If we find the eigenvectors and eigenvalues of Cov, this will diagonalize the Cov matrix
- c = cov(data)
- [v,d] = eig(c)
- d is a diagonal matrix of eigenvalues
- diag(d) returns a list of the eigenvalues
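What "diagonalize" means here can be verified directly. The following MATLAB/Octave sketch uses a made-up data matrix; the explicit symmetrization step is a defensive assumption so that eig treats the matrix as exactly symmetric.

```matlab
% Made-up data: 5 observations of 3 variables.
data = [2 1 4; 3 5 1; 6 2 2; 7 8 3; 9 4 5];

c = cov(data);
c = (c + c') / 2;            % guard against rounding-level asymmetry
[v, d] = eig(c);             % columns of v: eigenvectors; d: diagonal matrix
lambda = diag(d);            % eigenvalues as a column vector

residual = c * v - v * d;    % eigen-equation c*v = v*d, so this is ~0
diagonalized = v' * c * v;   % equals d up to rounding
orthoCheck = v' * v;         % identity: eigenvectors are orthonormal
```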

Principal Components

- If we transform our data by v, then Cov of the transformed data will be d
- data2 = (data - ones(size(data))*diag(mean(data)))*v
- Then cov(data2) = d
- This means that all the variables in data2 are uncorrelated
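The transformation can be tried on a small example. This sketch uses a made-up data matrix and the same centering expression as the slide; the off-diagonal check is added here for illustration.

```matlab
% Made-up data: 5 observations of 3 variables.
data = [2 1 4; 3 5 1; 6 2 2; 7 8 3; 9 4 5];

c = cov(data);
[v, d] = eig(c);

% Center each column, then rotate onto the eigenvector axes.
data2 = (data - ones(size(data)) * diag(mean(data))) * v;

c2 = cov(data2);              % should equal d up to rounding
offDiag = c2 - diag(diag(c2)); % covariances between different PCs, ~0
```

The reason this works is that cov(data2) = v' * c * v, and v diagonalizes c.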

Principal Components

- data2 is called the principal components of data
- Furthermore, the variances of data2 are the eigenvalues of cov(data)
- This means that the PC with the largest eigenvalue is the most variable
- This corresponds to the longest axis of the original ellipsoid of data
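The link between the largest eigenvalue and the longest axis can be seen on a synthetic cloud. The sketch below assumes a 2-D grid of points stretched along the direction (1,1) and squeezed along (1,-1); the data and names are invented for illustration.

```matlab
% Assumption: synthetic 2-D cloud, long along (1,1), short along (1,-1).
[t, s] = meshgrid(-5:5, -1:0.5:1);   % t: wide spread, s: narrow spread
t = t(:);  s = s(:);
data = t * [1 1] / sqrt(2) + s * [1 -1] / sqrt(2);

[v, d] = eig(cov(data));
[lambda, idx] = sort(diag(d), 'descend');   % largest eigenvalue first
v = v(:, idx);
longAxis = v(:, 1);                  % proportional to (1,1), up to sign

data2 = (data - ones(size(data)) * diag(mean(data))) * v;
pcVar = var(data2)';                 % variances of the PCs, equal to lambda
```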

Principal Components

- In some sense, the sum of the e-values is the overall variance (it equals the sum of the variances of the original variables)
- We can think of the individual e-values in terms of what percent of the total they are
- eig() tends to return the e-values in ascending order
- We want them in descending order
- dsort = -sort(-diag(d))
- Then cumsum(dsort)/sum(dsort) tells us what fraction of the total would be contained in the first k PCs
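A short worked example of the cumulative fraction, using six hypothetical eigenvalues chosen so that they sum to 10 (they are not from the helmet data):

```matlab
% Hypothetical eigenvalues, in the ascending order eig() often returns.
d = diag([0.05 0.15 0.3 1 2.5 6]);

dsort = -sort(-diag(d));                 % descending: 6, 2.5, 1, 0.3, 0.15, 0.05
fraction = cumsum(dsort) / sum(dsort);   % approximately 0.60 0.85 0.95 0.98 0.995 1.00
```

An equivalent way to sort is sort(diag(d), 'descend'); taking its second output also gives the permutation needed to reorder the columns of v to match.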

Principal Components

- General rule: use enough PCs to contain 80-90% of the total
- Balance this against how many PCs that takes
- If only 2-3 PCs contain most of the total, then our problem is a lot simpler than we thought
- Besides plots, we can use the PCs to detect groupings of the data or outliers, say
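One way to apply the rule is to pick the smallest k whose cumulative fraction reaches a chosen target. The sketch below uses hypothetical eigenvalues and a 90% target; both are assumptions for illustration.

```matlab
% Hypothetical eigenvalues, already sorted in descending order.
dsort = [6; 2.5; 1; 0.3; 0.15; 0.05];
fraction = cumsum(dsort) / sum(dsort);

target = 0.90;                      % assumed threshold, within the 80-90% rule
k = find(fraction >= target, 1);    % smallest number of PCs reaching the target
```

With these values k is 3, so the first three columns of the (sorted) principal components would be kept for plotting or for looking for groups and outliers.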

Principal Components

- Other multivariate methods
- MANOVA: ANOVA based on several variables rather than just one
- MV discriminant analysis: what distinguishes one group from another?
- Canonical correlation: what components of two sets of data are most correlated?
- All of these involve ideas similar to PCA