PLS Vs. SVD in Dimensionality Reduction - PowerPoint PPT Presentation

Learn more at: http://www.cs.cmu.edu
1
PLS Vs. SVD in Dimensionality Reduction
  • Paul Hsiung
  • December 3, 2002
  • 16-811

2
Problem
  • Curse of dimensionality
  • Very sparse data
  • A lot of 0s
  • Some attributes irrelevant
  • Others are repeated
  • Many machine learning algorithms are infeasible at
    high dimensions.
  • Compound dataset example

3
SVD Quick Review
  • Find the axis with greatest variance.
  • Project your data onto this axis.
  • Let the top n eigenvectors be the space of your
    new decomposed data.

(Figure: 2-D data points with principal axes e1 and e2 in coordinates x1, x2.)
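The projection described above can be sketched in a few lines of NumPy. This is an illustrative example, not code from the slides; `svd_reduce` is a hypothetical helper name.

```python
import numpy as np

# Sketch: reduce X to its top-n principal axes via SVD of the centered data.
def svd_reduce(X, n):
    Xc = X - X.mean(axis=0)           # center so the axes pass through the mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n].T              # project onto the top-n right singular vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
T = svd_reduce(X, 2)
print(T.shape)  # (100, 2)
```

The columns of `T` come out ordered by decreasing variance, matching the "axis with greatest variance" intuition above.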
4
Partial Least Squares Intuition 1
  • SVD maximizes the variance of X; PLS maximizes the
    covariance of X and Y.
  • SVD does not factor in Y when decomposing.
  • (Illustrative figure not transcribed.)

5
Linear Regression
  • Given data output Y, input X
  • Find a w such that wᵀx best approximates Y in the
    least-squares sense.
  • The magical formula for w is w = (XᵀX)⁻¹XᵀY.

(Figure: y plotted against x with the fitted weight w.)
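The normal-equations solution above can be sketched as follows (an assumed standard least-squares example; the data and names are illustrative):

```python
import numpy as np

# Sketch: solve min_w ||Xw - y||^2 via the normal equations w = (X^T X)^{-1} X^T y.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true                          # noiseless data for illustration

# Solving the linear system is preferable to forming the inverse explicitly.
w = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose(w, w_true))  # True
```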
6
PLS Intuition 2
  • The problem with linear regression is that computing
    (XᵀX)⁻¹ is expensive, and often ill-conditioned or
    singular, for sparse high-dimensional X.
  • PLS does as the name says: it finds the least
    squares, except it's partial.
  • As it builds Bpls, it will decompose X into T.
  • We can control how many dimensions T has by the
    number of iterations in PLS.

7
PLS No Guts, No Glory
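The algorithm on this slide is not transcribed. As a reconstruction under standard assumptions, a NIPALS-style PLS1 loop for a single output y looks roughly like this (`pls1` and all variable names are mine, not from the slide):

```python
import numpy as np

# Sketch of NIPALS-style PLS1 (single output y); a standard reconstruction,
# not a transcript of the slide's algorithm.
def pls1(X, y, n_components):
    X, y = X.copy().astype(float), y.copy().astype(float)
    T, P, W, B = [], [], [], []
    for _ in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)        # weight: direction of max covariance with y
        t = X @ w                     # score vector (one new coordinate per sample)
        tt = t @ t
        p = X.T @ t / tt              # loading vector
        b = (t @ y) / tt              # regression coefficient for this component
        X -= np.outer(t, p)           # deflate X ...
        y -= b * t                    # ... and y before the next component
        T.append(t); P.append(p); W.append(w); B.append(b)
    return (np.column_stack(T), np.column_stack(P),
            np.column_stack(W), np.array(B))

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 8))
y = X @ rng.normal(size=8)
T, P, W, B = pls1(X, y, 3)
print(T.shape)  # (60, 3)
```

Each iteration adds one dimension to T, which is how the iteration count controls the dimensionality of the decomposition.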
8
PLS Aftermath
  • Collect all the small t1…tn into T. Same for P, B,
    and W.
  • Notice that T is s × n, and that's our decomposed
    dataset.
  • We define R = W(PᵀW)⁻¹. R will transform any X
    into T.
  • Prediction is done by Ŷ = TBQᵀ, where Q is a
    column of 1s.
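A self-contained sketch, assuming the standard PLS definition R = W(PᵀW)⁻¹, showing that R maps the original X directly to the scores T (the small NIPALS-style fit inside is illustrative, not the slide's code):

```python
import numpy as np

rng = np.random.default_rng(3)
X0 = rng.normal(size=(40, 5))
y = X0 @ rng.normal(size=5)

# Minimal NIPALS-style fit collecting T, P, W (two components).
X, r = X0.copy(), y.copy()
T, P, W = [], [], []
for _ in range(2):
    w = X.T @ r; w /= np.linalg.norm(w)
    t = X @ w
    p = X.T @ t / (t @ t)
    X -= np.outer(t, p)               # deflate X and the residual output
    r -= (t @ r) / (t @ t) * t
    T.append(t); P.append(p); W.append(w)
T, P, W = (np.column_stack(m) for m in (T, P, W))

# The transform from the slide: R sends the *original* X to the scores T.
R = W @ np.linalg.inv(P.T @ W)
print(np.allclose(X0 @ R, T))  # True
```

PᵀW is triangular with a unit diagonal, so the inverse always exists.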

9
Dataset
  • Training set is 26,000 by 6,000. Test is 1,400 by
    6,000.
  • Single output.
  • Very sparse: lots of 0s.
  • Used the ROC curve to rank results.
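Ranking results with a ROC curve presumably means the area under the curve (AUC); an assumed sketch of that metric, computed as the fraction of positive/negative pairs ranked correctly (`roc_auc` is my name, not from the slides):

```python
import numpy as np

# Sketch: AUC = probability that a random positive outscores a random negative.
# (Ties are given no credit in this simplified version.)
def roc_auc(y_true, scores):
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    return (pos[:, None] > neg[None, :]).mean()

y = np.array([1, 1, 0, 0, 0])
s = np.array([0.9, 0.4, 0.5, 0.2, 0.1])
print(roc_auc(y, s))  # 5/6 of the positive-negative pairs are ranked correctly
```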

10–13
(No transcript for slides 10–13.)
14
PLS Overfitting
                Predict Training   Predict Test
  5 Dim              0.853             0.862
 10 Dim              0.927             0.870
 20 Dim              0.954             0.866
100 Dim              0.966             0.820
15
Conclusion
  • PLS at dim 10 is equivalent to SVD at dim 100,
    but SVD is slightly better at high dimensions.
  • PLS tends to overfit after dim 10.
  • PLS works quite well as a predictor.