1
Lecture Part 3: Principal Component Analysis and
its application in bioinformatics
Christine Steinhoff
2
Part 1: Linear Algebra Basics
Part 2: Principal Component Analysis: An
Introduction
Part 3: Principal Component Analysis: Examples
3
Part 1: Linear Algebra Basics
4
OUTLINE
What do we need from linear algebra for
understanding principal component analysis?
  • Motivation
  • Standard deviation, Variance, Covariance
  • The Covariance matrix
  • Symmetric matrix and orthogonality
  • Eigenvalues and Eigenvectors
  • Properties

5
Motivation
6
Motivation
Proteins 1 and 2 measured for 200 patients
[Scatter plot: Protein 1 (x-axis) vs. Protein 2 (y-axis)]
7
Motivation
[Data matrix: patients 1-200, genes 1-22,000]
Microarray experiment -> Visualize? -> Which
genes are important? -> For which subgroup of
patients?
8
Motivation
[Data matrix: genes 1-200, patients 1-10]
9
Basics for Principal Component Analysis
  • Orthogonal/Orthonormal
  • Some Theorems...
  • Standard deviation, Variance, Covariance
  • The Covariance matrix
  • Eigenvalues and Eigenvectors

10
Standard Deviation
The average distance from the mean of the data
set to a point:
SD = sqrt( sum_i (Xi - mean)^2 / (n - 1) )
Example: Measurement 1: 0, 8, 12, 20; Measurement 2: 8, 9, 11, 12
M1: Mean = 10, SD = 8.33
M2: Mean = 10, SD = 1.83
11
Variance
Variance is the square of the standard deviation:
Var = sum_i (Xi - mean)^2 / (n - 1)
Example: Measurement 1: 0, 8, 12, 20; Measurement 2: 8, 9, 11, 12
M1: Mean = 10, SD = 8.33, Var = 69.33
M2: Mean = 10, SD = 1.83, Var = 3.33
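As a quick check in R (the language referenced later for eigen()), using just the two example measurements above:

m1 <- c(0, 8, 12, 20)   # Measurement 1
m2 <- c(8, 9, 11, 12)   # Measurement 2

mean(m1); mean(m2)      # both 10
sd(m1); sd(m2)          # 8.33 and 1.83 (sample SD, divisor n - 1)
var(m1); var(m2)        # 69.33 and 3.33 (sample variance)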
12
Covariance
Standard deviation and variance are
1-dimensional. How much do the dimensions vary
from the mean with respect to each other?
Covariance measures this between 2 dimensions:
cov(X, Y) = sum_i (Xi - mean(X)) (Yi - mean(Y)) / (n - 1)
We easily see that if X = Y we end up with the variance.
13
Covariance Matrix
Let X be a random vector. Then the covariance
matrix of X, denoted by Cov(X), is
Cov(X)_ij = cov(Xi, Xj) = E[ (Xi - E[Xi]) (Xj - E[Xj]) ]
The diagonal entries of Cov(X) are the variances Var(Xi).
In matrix notation,
Cov(X) = E[ (X - E[X]) (X - E[X])^T ]
The covariance matrix is symmetric.
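A minimal R sketch of these properties; the toy 4x2 data matrix below is an assumption for illustration (rows = observations, columns = variables):

X <- cbind(v1 = c(0, 8, 12, 20),
           v2 = c(8, 9, 11, 12))   # hypothetical toy data

C <- cov(X)          # 2x2 sample covariance matrix
diag(C)              # diagonal entries are the variances (69.33 and 3.33)
isSymmetric(C)       # TRUE: the covariance matrix is symmetric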
14
Symmetric Matrix
Let A = (a_ij) be a square matrix of size
nxn. The matrix A is symmetric if a_ij = a_ji
for all i, j.
15
Orthogonality/Orthonormality
<v1, v2> = <(1, 0), (0, 1)> = 0
Two vectors v1 and v2 for which <v1, v2> = 0 holds
are said to be orthogonal.
Unit vectors which are orthogonal are said to be
orthonormal.
16
Eigenvalues/Eigenvectors
Let A be an nxn square matrix and x an nx1 column
vector. Then a (right) eigenvector of A is a
nonzero vector x such that A x = λ x for some scalar λ.
Procedure: find the eigenvalues by solving det(A - λI) = 0,
then find the corresponding eigenvectors by solving (A - λI) x = 0.
R: eigen(matrix); Matlab: eig(matrix)
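A minimal R illustration of eigen(); the 2x2 symmetric matrix A below is an arbitrary example, not taken from the slides:

A <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)   # arbitrary symmetric matrix

e <- eigen(A)
e$values     # eigenvalues: 3 and 1
e$vectors    # corresponding eigenvectors (columns)

# check A x = lambda x for the first eigenpair (result is ~ the zero vector)
A %*% e$vectors[, 1] - e$values[1] * e$vectors[, 1]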
17
Some Remarks
If A and B are matrices whose sizes are such that
the given operations are defined and c is any
scalar, then
(A^T)^T = A,  (A + B)^T = A^T + B^T,  (cA)^T = c A^T,  (AB)^T = B^T A^T
(these transpose rules are used in the derivation below)
18
Now,
we have enough definitions to go through the
procedure for performing Principal Component
Analysis.
19
Part 2: Principal Component Analysis: An
Introduction
20
OUTLINE
What is principal component analysis good
for? Principal Component Analysis (PCA)
  • One toy example: a spring
  • The basic idea of Principal Component Analysis
  • The idea of transformation
  • How to get there? The mathematics part
  • Some remarks
  • Basic algorithmic procedure

21
Idea of PCA
We often do not know which measurements best
reflect the dynamics in our system.
http://www.snl.salk.edu/shlens/pub/notes/pca.pdf
22
Idea of PCA
We often do not know which measurements best
reflect the dynamics in our system.
So, PCA should reveal: the dynamics are along
the x axis.
[Figure: data cloud with its dynamics along the x axis]
23
Idea of PCA
  • Introduced by Pearson (1901) and Hotelling (1933)
    to describe the variation in a set of
    multivariate data in terms of a set of
    uncorrelated variables
  • We typically have a data matrix of n observations
    on p correlated variables x1, x2, ..., xp
  • PCA looks for a transformation of the xi into p
    new variables yi that are uncorrelated

24
Idea
[Data matrix X: patients 1..n, genes x1..xp]
The dimension is high. So how can we reduce the
dimension? Simplest way: take the first one,
two, or three variables, plot them, and discard the rest.
Obviously a very bad idea.
25
Transformation
We want to find a transformation that involves
ALL columns, not only the first ones. So find a
new basis and order it such that the first
component carries almost ALL the information of the
whole dataset.
We are looking for a transformation of the data matrix X
(pxn) such that Y = a^T X = a1 X1 + a2
X2 + ... + ap Xp
26
Transformation
What is a reasonable choice for the weights a? Remember:
we wanted a transformation that maximizes
information, that means one that captures the variance in
the data.
Maximize the variance of the projection of the
observations onto the Y variables! Find a such
that Var(a^T X) is maximal. The matrix
C = Var(X) is the covariance matrix of the Xi
variables.
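A small R sketch of this idea on simulated data (the data and the candidate directions are assumptions for illustration): the variance of the projection is largest when the weight vector a points along the leading eigenvector of Cov(X).

set.seed(1)
x1 <- rnorm(200, sd = 3)            # simulated 2-D data with most variance along x1
x2 <- 0.5 * x1 + rnorm(200, sd = 1)
X  <- cbind(x1, x2)

# variance of the projection of the observations onto the unit vector a
proj_var <- function(a) var(as.vector(X %*% (a / sqrt(sum(a^2)))))

proj_var(c(1, 0))                   # variance along the x1 axis
proj_var(c(0, 1))                   # variance along the x2 axis
a1 <- eigen(cov(X))$vectors[, 1]    # leading eigenvector of Cov(X)
proj_var(a1)                        # maximal, equal to the largest eigenvalue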
27
Transformation
Can we intuitively see that in a picture?
[Two projection directions on the same point cloud: a 'good' one
and a 'better' one that captures more of the variance]
28
Transformation
[Scatter plot with the two principal axes PC1 and PC2;
PC1 and PC2 are orthogonal]
29
How do we get there?
[Data matrix X: genes x1..xp, patients 1..n]
X is a real-valued pxn matrix. Cov(X) is a real-valued
pxp matrix or nxn matrix -> decide whether
you want to analyse patient groups or whether you want
to analyse gene groups.
30
How do we get there?
Let's decide for genes:
Cov(X) is then the pxp covariance matrix of the gene vectors.
31
How do we get there?
Some features of Cov(X):
  • Cov(X) is a symmetric pxp matrix
  • The diagonal terms of Cov(X) are the variances
    of the genes across patients
  • The off-diagonal terms of Cov(X) are the
    covariances between gene vectors
  • Cov(X) captures the correlations between all
    possible pairs of measurements
  • In the diagonal terms, by assumption, large
    values correspond to interesting dynamics
  • In the off-diagonal terms, large values correspond
    to high redundancy

32
How do we get there?
The principal components of X are the
eigenvectors of Cov(X).
Assume we can manipulate X a bit; let's call
the result Y. Y should be manipulated in a way that it
is a bit more optimal than X was. What does
optimal mean? It means:
the off-diagonal covariances in Cov(Y) should be SMALL
and the diagonal variances should be LARGE.
In other words, Cov(Y) should be diagonal, with large
values on the diagonal.
33
How do we get there?
The manipulation is a change of basis with
orthonormal vectors, and they are ordered in a
way that the most important comes first
(principal)... How do we put this in
mathematical terms?
Find an orthonormal P such that
Y = P X
with Cov(Y) diagonalized.
Then the rows of P are the principal components
of X.
34
How do we get there?
Cov(Y) = 1/(n-1) Y Y^T = 1/(n-1) (PX)(PX)^T = 1/(n-1) P (X X^T) P^T
Define A = X X^T, so Cov(Y) = 1/(n-1) P A P^T.
35
How do we get there?
A is symmetric. Therefore there is a matrix E of
eigenvectors and a diagonal matrix D such that
A = E D E^T
Now define P to be the transpose of the matrix E
of eigenvectors: P = E^T.
Then we can write A = P^T D P.
36
How do we get there?
Now we can go back to our covariance expression:
Cov(Y) = 1/(n-1) P A P^T = 1/(n-1) P (P^T D P) P^T = 1/(n-1) (P P^T) D (P P^T)
37
How do we get there?
The inverse of an orthogonal matrix is its
transpose (by definition).
In our context that means P P^T = P P^-1 = I, so
Cov(Y) = 1/(n-1) D
38
How do we get there?
P diagonalizes Cov(Y), where P is the transpose of
the matrix of eigenvectors of X X^T. The principal
components of X are the eigenvectors of X X^T
(that's the same as the rows of P). The ith
diagonal value of Cov(Y) is the variance of X
along pi (along the ith principal component).
Essentially we need to compute the EIGENVALUES
(the explained variance) and the EIGENVECTORS
(the principal components) of the covariance
matrix of the original matrix X.
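A brief R check of this derivation on simulated toy data (the data are an assumption): with P set to the transpose of the eigenvector matrix of Cov(X), the covariance of Y = PX comes out diagonal, with the eigenvalues on the diagonal.

set.seed(2)
n  <- 500
x1 <- rnorm(n, sd = 3)              # simulated, correlated toy data:
x2 <- 0.5 * x1 + rnorm(n, sd = 1)   # p = 2 variables in rows, n observations in columns
X  <- rbind(x1, x2)
X  <- X - rowMeans(X)               # mean-center each variable

C <- X %*% t(X) / (n - 1)           # Cov(X) as used on the slides (2x2 here)
P <- t(eigen(C)$vectors)            # rows of P = principal components of X

Y <- P %*% X                        # change of basis
round(Y %*% t(Y) / (n - 1), 3)      # Cov(Y): diagonal, eigenvalues of C on the diagonal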
39
Some Remarks
  • If you multiply one variable by a scalar you get
    different results
  • This is because PCA uses the covariance matrix (and
    not the correlation matrix)
  • PCA should be applied to data that have
    approximately the same scale in each variable
  • The relative variance explained by each PC is
    given by eigenvalue/sum(eigenvalues)
    (see the sketch after this list)
  • When to stop? For example: keep enough PCs to reach a
    cumulative variance explained by the PCs that is
    > 50-70%
  • Kaiser criterion: keep PCs with eigenvalues > 1
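A minimal R sketch of these two stopping rules; the eigenvalues below are made-up example values:

ev <- c(4.2, 2.1, 0.9, 0.5, 0.3)    # hypothetical eigenvalues, e.g. from eigen(cov(X))$values

explained  <- ev / sum(ev)          # relative variance explained by each PC
cumulative <- cumsum(explained)     # cumulative variance explained
which(cumulative > 0.7)[1]          # number of PCs needed to exceed 70%
sum(ev > 1)                         # Kaiser criterion: PCs with eigenvalue > 1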

40
Some Remarks
41
Some Remarks
If variables have very heterogeneous variances, we
standardize them. The standardized variables are
Xi' = (Xi - mean) / sqrt(variance)
The new variables all have the same variance, so each
variable has the same weight.
42
REMARKS
  • PCA is useful for finding new, more informative,
    uncorrelated features; it reduces dimensionality
    by rejecting low-variance features
  • PCA is only powerful if the biological question
    is related to the highest variance in the dataset

43
Algorithm
Standardize: Data = (Data.old - mean) / sqrt(variance)
Cov(Data) = 1/(N-1) Data Data^T
Find eigenvectors/eigenvalues (function eigen in R,
eig in Matlab) and sort by decreasing eigenvalue
Eigenvectors -> V, eigenvalues -> explained variance; P = V^T
Project the original data: Y = P Data
Plot as many components as necessary
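A compact R sketch of the whole procedure on a hypothetical data matrix (variables in rows, observations in columns, as on the slides); the last line cross-checks against R's built-in prcomp():

set.seed(3)
X <- matrix(rnorm(3 * 50), nrow = 3)   # hypothetical data: 3 variables, 50 observations

Xs <- t(scale(t(X)))                   # standardize each variable: (x - mean) / sd
C  <- Xs %*% t(Xs) / (ncol(Xs) - 1)    # covariance matrix of the standardized data
e  <- eigen(C)                         # eigenvalues/eigenvectors, sorted by decreasing eigenvalue

P <- t(e$vectors)                      # rows of P are the principal components
Y <- P %*% Xs                          # project the (standardized) data

e$values / sum(e$values)               # relative variance explained by each component
prcomp(t(Xs))$sdev^2                   # cross-check (observations in rows); matches e$values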
44
Part 3: Principal Component Analysis: Examples
45
OUTLINE
Principal component analysis in bioinformatics
47
Example 1
48
Lefkovits et al.
T cells belong to a group of white blood cells
known as lymphocytes and play a central role in
cell-mediated immunity.
49
Lefkovits et al.
50
Lefkovits et al.
51
Lefkovits et al.
[Data matrix X: clones 1..n, spots x1..xp]
X is a real-valued pxn matrix. They want to
analyse the relatedness of clones, so Cov(X) is a
real-valued nxn matrix. They take the correlation matrix
(which is, on top of the covariance, the division by the
standard deviations).
52
Lefkovits et al.
53
Example 2
54
Yang et al.
  • Transforming growth factor-beta
  • TGF-beta is a potent inducer of growth arrest in
    many cell types, including epithelial cells.
  • This activity is the basis of the tumor
    suppressor role of the TGF-beta signaling system
    in carcinomas.
  • It can, however, also contribute to cancer progression.
  • It has special relevance in mesenchymal differentiation,
    including bone development.
  • Deregulated expression or activation of
    components of this signaling system can
    contribute to skeletal diseases, e.g.
    osteoarthritis.

55
Yang et al.
Stock 1 (T): constitutively active tkv receptor
Stock 2 (B): constitutively active babo receptor
Samples: T1, T2, T3; B1, B2, B3; Contr1, Contr2, Contr3
[Data matrix: samples x genes x1..xp]
56
Yang et al.
  • Filter genes:
  • only expressed (present) genes
  • that show at least some effect when comparing the
    three groups

57
Yang et al.
58
Yang et al.
[PCA plot: samples grouped by condition - tkv, babo, control]
59
Ulloa-Montoya et al.
Multipotent adult progenitor cells
Pluripotent embryonic stem cells
Mesenchymal stem cells
60
Ulloa-Montoya et al.
61
(No Transcript)
62
Yang et al.
But we only see the different experiments. If
we do it the other way round, that means
analysing the genes rather than the experiments,
we see a grouping of genes. But we never see both
together. So, can we somehow relate the
experiments and the genes? That means: group
genes whose expression might be explained by the
respective experimental group (tkv, babo,
control)? This leads to correspondence
analysis.
63
(No Transcript)
64
Vectorspace and Basis
  • Let F be a field (for example the real numbers) whose
    elements are called scalars.
  • A vector space over the field F is a set V
    together with the operations
  • vector addition: V x V -> V, denoted v + w,
    where v, w ∈ V, and
  • scalar multiplication: F x V -> V, denoted a v,
    where a ∈ F and v ∈ V,
  • satisfying:
  • Vector addition is associative (for all u, v, w ∈ V:
    u + (v + w) = (u + v) + w)
  • Vector addition is commutative (for all v, w ∈ V:
    v + w = w + v)
  • Vector addition has an identity element (there is 0 ∈ V
    such that v + 0 = v for all v ∈ V)
  • Vector addition has an inverse element
  • (for all v ∈ V there exists w ∈ V such that v + w =
    0)
  • Distributivity holds for scalar multiplication
    over vector addition
  • (for all a ∈ F and v, w ∈ V: a(v + w) = a v + a w)
  • Distributivity holds for scalar multiplication
    over field addition
  • (for all a, b ∈ F and v ∈ V: (a + b) v = a v + b v)
  • Scalar multiplication is compatible with
    multiplication in the field of scalars
  • (for all a, b ∈ F and v ∈ V: a (b v) = (ab) v)
  • Scalar multiplication has an identity element
    (1 v = v for all v ∈ V)

65
Vectorspace and Basis
  • A basis of V is a linearly independent set of
    vectors in V which spans V.
  • Example: F^n with the standard basis
  • V is finite dimensional if there is a finite
    basis. Dimension of V is the number of elements
    of a basis. (Independent of the choice of basis.)

66
Orthogonal
An orthogonal matrix is a square matrix Q whose
transpose is its inverse: Q^T Q = Q Q^T = I.
A matrix A is orthogonally diagonalizable, that
is, there exists an orthogonal matrix Q and a diagonal
matrix D such that A = Q D Q^T.
Orthogonal vectors: the inner product is zero,
<v, w> = 0. Orthonormal vectors: orthogonality and
length 1.
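A tiny R check of these definitions (the rotation angle and the eigenvalues below are arbitrary choices, not from the slides):

theta <- pi / 6
Q <- matrix(c(cos(theta), -sin(theta),
              sin(theta),  cos(theta)), nrow = 2, byrow = TRUE)   # a rotation matrix is orthogonal
round(t(Q) %*% Q, 10)                 # identity: the transpose is the inverse

D <- diag(c(3, 1))                    # arbitrary diagonal matrix
A <- Q %*% D %*% t(Q)                 # symmetric, orthogonally diagonalizable by construction
isSymmetric(A)                        # TRUE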