PLS Vs. SVD in Dimensionality Reduction - PowerPoint PPT Presentation

Learn more at: http://www.cs.cmu.edu
1
PLS Vs. SVD in Dimensionality Reduction
  • Paul Hsiung
  • December 3, 2002
  • 16-811

2
Problem
  • Curse of dimensionality
  • Very sparse data
  • A lot of 0s
  • Some attributes irrelevant
  • Others are repeated
  • Many machine learning algorithms are infeasible at
    high dimensions.
  • Compound dataset example

3
SVD Quick Review
  • Find the axis with greatest variance.
  • Project your data onto this axis.
  • Let the top n eigenvectors be the space of your
    new decomposed data.

(Figure: 2-D data points with principal axes e1 and e2 in coordinates x1, x2.)
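The projection described above can be sketched in a few lines of NumPy. This is an illustrative example, not code from the slides; `svd_reduce` is a hypothetical helper name.

```python
import numpy as np

# Sketch: reduce X to its top-n principal axes via SVD of the centered data.
def svd_reduce(X, n):
    Xc = X - X.mean(axis=0)           # center so the axes pass through the mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n].T              # project onto the top-n right singular vectors

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))
T = svd_reduce(X, 2)
print(T.shape)  # (100, 2)
```

The columns of `T` come out ordered by decreasing variance, matching the "axis with greatest variance" intuition above.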
4
Partial Least Squares Intuition 1
  • SVD maximizes the variance of X; PLS maximizes the
    covariance of X and Y.
  • SVD does not factor in Y when decomposing.
  • (Illustrative figure not transcribed.)

5
Linear Regression
  • Given data output Y, input X
  • Find a w such that wᵀx best approximates Y in the
    least-squares sense.
  • The magical formula for w is w = (XᵀX)⁻¹XᵀY.

(Figure: y plotted against x with the fitted weight w.)
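The normal-equations solution above can be sketched as follows (an assumed standard least-squares example; the data and names are illustrative):

```python
import numpy as np

# Sketch: solve min_w ||Xw - y||^2 via the normal equations w = (X^T X)^{-1} X^T y.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true                          # noiseless data for illustration

# Solving the linear system is preferable to forming the inverse explicitly.
w = np.linalg.solve(X.T @ X, X.T @ y)
print(np.allclose(w, w_true))  # True
```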
6
PLS Intuition 2
  • The problem with linear regression is that computing
    (XᵀX)⁻¹ is expensive, and often ill-conditioned or
    singular, for sparse high-dimensional X.
  • PLS does as the name says: it finds the least
    squares, except it's partial.
  • As it builds Bpls, it will decompose X into T.
  • We can control how many dimensions T has by the
    number of iterations in PLS.

7
PLS No Guts, No Glory
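The algorithm on this slide is not transcribed. As a reconstruction under standard assumptions, a NIPALS-style PLS1 loop for a single output y looks roughly like this (`pls1` and all variable names are mine, not from the slide):

```python
import numpy as np

# Sketch of NIPALS-style PLS1 (single output y); a standard reconstruction,
# not a transcript of the slide's algorithm.
def pls1(X, y, n_components):
    X, y = X.copy().astype(float), y.copy().astype(float)
    T, P, W, B = [], [], [], []
    for _ in range(n_components):
        w = X.T @ y
        w /= np.linalg.norm(w)        # weight: direction of max covariance with y
        t = X @ w                     # score vector (one new coordinate per sample)
        tt = t @ t
        p = X.T @ t / tt              # loading vector
        b = (t @ y) / tt              # regression coefficient for this component
        X -= np.outer(t, p)           # deflate X ...
        y -= b * t                    # ... and y before the next component
        T.append(t); P.append(p); W.append(w); B.append(b)
    return (np.column_stack(T), np.column_stack(P),
            np.column_stack(W), np.array(B))

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 8))
y = X @ rng.normal(size=8)
T, P, W, B = pls1(X, y, 3)
print(T.shape)  # (60, 3)
```

Each iteration adds one dimension to T, which is how the iteration count controls the dimensionality of the decomposition.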
8
PLS Aftermath
  • Collect all the small t1…tn into T. Same for P, B,
    and W.
  • Notice that T is s × n, and that's our decomposed
    dataset.
  • We define R = W(PᵀW)⁻¹. R will transform any X
    into T.
  • Prediction is done by Ŷ = TBQᵀ, where Q is a
    column of 1s.
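A self-contained sketch, assuming the standard PLS definition R = W(PᵀW)⁻¹, showing that R maps the original X directly to the scores T (the small NIPALS-style fit inside is illustrative, not the slide's code):

```python
import numpy as np

rng = np.random.default_rng(3)
X0 = rng.normal(size=(40, 5))
y = X0 @ rng.normal(size=5)

# Minimal NIPALS-style fit collecting T, P, W (two components).
X, r = X0.copy(), y.copy()
T, P, W = [], [], []
for _ in range(2):
    w = X.T @ r; w /= np.linalg.norm(w)
    t = X @ w
    p = X.T @ t / (t @ t)
    X -= np.outer(t, p)               # deflate X and the residual output
    r -= (t @ r) / (t @ t) * t
    T.append(t); P.append(p); W.append(w)
T, P, W = (np.column_stack(m) for m in (T, P, W))

# The transform from the slide: R sends the *original* X to the scores T.
R = W @ np.linalg.inv(P.T @ W)
print(np.allclose(X0 @ R, T))  # True
```

PᵀW is triangular with a unit diagonal, so the inverse always exists.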

9
Dataset
  • Training set is 26,000 by 6,000. Test is 1,400 by
    6,000.
  • Single output.
  • Very sparse: lots of 0s.
  • Used the ROC curve to rank results.
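Ranking results with a ROC curve presumably means the area under the curve (AUC); an assumed sketch of that metric, computed as the fraction of positive/negative pairs ranked correctly (`roc_auc` is my name, not from the slides):

```python
import numpy as np

# Sketch: AUC = probability that a random positive outscores a random negative.
# (Ties are given no credit in this simplified version.)
def roc_auc(y_true, scores):
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    return (pos[:, None] > neg[None, :]).mean()

y = np.array([1, 1, 0, 0, 0])
s = np.array([0.9, 0.4, 0.5, 0.2, 0.1])
print(roc_auc(y, s))  # 5/6 of the positive-negative pairs are ranked correctly
```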

10–13
(No transcript for slides 10–13.)
14
PLS Overfitting
                Predict Training   Predict Test
  5 Dim              0.853             0.862
 10 Dim              0.927             0.870
 20 Dim              0.954             0.866
100 Dim              0.966             0.820
15
Conclusion
  • PLS at dim 10 is equivalent to SVD at dim 100,
    but SVD is slightly better at high dimensions.
  • PLS tends to overfit after dim 10.
  • PLS works quite well as a predictor.