PEGASOS: Primal Efficient subGrAdient SOlver for SVM

Slides: 20
Provided by: sha7155
Learn more at: https://home.ttic.edu
1
PEGASOS: Primal Efficient sub-GrAdient SOlver for SVM
YASSO: Yet Another Svm SOlver
  • Shai Shalev-Shwartz
  • Yoram Singer
  • Nati Srebro

The Hebrew University Jerusalem, Israel
2
Support Vector Machines
QP form
More natural form
Regularization term
Empirical loss
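The two labelled terms can be made concrete with a small sketch. It assumes the standard primal SVM objective, f(w) = (λ/2)·||w||^2 + (1/m)·Σ max(0, 1 − y·⟨w, x⟩), which the transcript itself does not reproduce; the names svm_objective and lam are illustrative only.

```python
def svm_objective(w, examples, lam):
    """Assumed primal SVM objective: regularization term + empirical hinge loss.

    w        -- weight vector (list of floats)
    examples -- list of (x, y) pairs with y in {+1, -1}
    lam      -- regularization parameter (lambda)
    """
    # Regularization term: (lam / 2) * ||w||^2
    regularization = 0.5 * lam * sum(wj * wj for wj in w)
    # Empirical loss: average hinge loss over the training set
    hinge_total = 0.0
    for x, y in examples:
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        hinge_total += max(0.0, 1.0 - margin)
    return regularization + hinge_total / len(examples)
```

With w = 0 every example has margin 0, so the objective equals 1 whatever the value of lam.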
3
Outline
  • Previous Work
  • The Pegasos algorithm
  • Analysis: faster convergence rates
  • Experiments: outperforms state-of-the-art
  • Extensions
      • kernels
      • complex prediction problems
      • bias term

4
Previous Work
  • Dual-based methods
      • Interior Point methods
          • Memory m^2, time m^3 log(log(1/ε))
      • Decomposition methods
          • Memory m, time super-linear in m
  • Online learning & Stochastic Gradient
      • Memory O(1), time 1/ε^2 (linear kernel)
      • Memory 1/ε^2, time 1/ε^4 (non-linear kernel)
  • Typically, online learning algorithms do not
    converge to the optimal solution of SVM

Better rates for finite dimensional instances
(Murata, Bottou)
5
PEGASOS
A_t = S: Subgradient method
|A_t| = 1: Stochastic gradient
Subgradient
Projection
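The slide's algorithm box did not survive in the transcript, so the following is only a sketch of a loop with the two named steps (subgradient, then projection). The step size 1/(λt) and the projection radius 1/sqrt(λ) are taken from the published description of Pegasos and are assumptions here; the function and parameter names are illustrative.

```python
import math
import random

def pegasos(examples, lam, T, k=1, seed=0):
    """Sketch of a Pegasos-style loop on (x, y) pairs with y in {+1, -1}.

    k = 1 gives the stochastic-gradient variant; k = len(examples)
    gives the plain subgradient method (A_t = S).
    """
    rng = random.Random(seed)
    n = len(examples[0][0])
    w = [0.0] * n
    radius = 1.0 / math.sqrt(lam)          # assumed projection radius
    for t in range(1, T + 1):
        batch = rng.sample(examples, k)    # A_t: a subset of S of size k
        eta = 1.0 / (lam * t)              # assumed step size
        # Subgradient of (lam/2)*||w||^2 + (1/k) * sum of hinge losses on A_t
        grad = [lam * wj for wj in w]
        for x, y in batch:
            if y * sum(wj * xj for wj, xj in zip(w, x)) < 1.0:
                for j in range(n):
                    grad[j] -= y * x[j] / k
        w = [wj - eta * gj for wj, gj in zip(w, grad)]
        # Projection onto the ball of the assumed radius
        norm = math.sqrt(sum(wj * wj for wj in w))
        if norm > radius:
            w = [wj * radius / norm for wj in w]
    return w
```

Each iteration touches only k examples, which is why the cost of one step does not grow with the size of the training set.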
6
Run-Time of Pegasos
  • Choosing |A_t| = 1 and a linear kernel over R^n
  • ⇒ Run-time required for Pegasos to find an
    ε-accurate solution w.p. 1-δ: [formula missing from transcript]
  • Run-time does not depend on the number of examples
  • Depends on difficulty of problem (λ and ε)

7
Formal Properties
  • Definition: w is ε-accurate if [formula missing from transcript]
  • Theorem 1: Pegasos finds an ε-accurate solution w.p.
    1-δ after at most [bound missing from transcript] iterations.
  • Theorem 2: Pegasos finds log(1/δ) solutions s.t.
    w.p. 1-δ, at least one of them is ε-accurate
    after [bound missing from transcript] iterations

8
Proof Sketch
A second look on the update step
9
Proof Sketch
  • Lemma (free projection)
  • Logarithmic regret for OCP (Hazan et al. '06)
  • Take expectation
  • f(w_r) - f(w*) ≥ 0 ⇒ Markov gives that w.p. 1-δ
    [formula missing from transcript]
  • Amplify the confidence
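The last two steps can be written out as a short derivation. This is a generic statement of Markov's inequality and of confidence amplification by independent repetition, not the slide's exact formulas; B is a hypothetical bound on the expected suboptimality and k is the number of independent runs.

```latex
% Z := f(w_r) - f(w^\star) \ge 0, and B is a hypothetical bound on its expectation.
\mathbb{E}[Z] \le B
\;\Longrightarrow\;
\Pr\!\left[ Z \ge \tfrac{B}{\delta} \right] \le \frac{\mathbb{E}[Z]}{B/\delta} \le \delta ,
\qquad\text{so w.p. } \ge 1-\delta:\quad f(w_r) - f(w^\star) \le \frac{B}{\delta}.

% Amplification: k independent runs, each failing with probability at most 1/2.
\Pr[\text{all } k \text{ runs fail}] \le 2^{-k} \le \delta
\quad\text{for } k = \lceil \log_2(1/\delta) \rceil .
```

Applying Markov directly costs a factor 1/δ, while repeating a constant-confidence run costs only a factor log(1/δ), which matches the log(1/δ) solutions mentioned in Theorem 2.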

10
Experiments
  • 3 datasets (provided by Joachims)
      • Reuters CCAT (800K examples, 47k features)
      • Physics ArXiv (62k examples, 100k features)
      • Covertype (581k examples, 54 features)
  • 4 competing algorithms
      • SVM-light (Joachims)
      • SVM-Perf (Joachims '06)
      • Norma (Kivinen, Smola, Williamson '02)
      • Zhang '04 (stochastic gradient descent)
  • Source code available online

11
Training Time (in seconds)
12
Compare to Norma (on Physics)
[Plots: objective value; test error]
13
Compare to Zhang (on Physics)
[Plot: objective]
But tuning the parameter is more expensive than learning
14
Effect of k = |A_t| when T is fixed
[Plot: objective]
15
Effect of k = |A_t| when kT is fixed
[Plot: objective]
16
I want my kernels!
  • Pegasos can seamlessly be adapted to employ
    non-linear kernels while working solely on the
    primal objective function
  • No need to switch to the dual problem
  • Number of support vectors is bounded by
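One way to work on the primal objective with a kernel is to store a per-example counter instead of an explicit w. The sketch below assumes |A_t| = 1, step size 1/(λt), and no projection step, so that w_t is always a scaled sum of the selected examples; these choices, and all function names, are assumptions rather than the slide's own formulation.

```python
import math
import random

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel; any positive-definite kernel could be used instead."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def kernel_score(alpha, examples, x, kernel):
    """Unscaled score: sum_j alpha[j] * y_j * K(x_j, x)."""
    return sum(a * y_j * kernel(x_j, x)
               for a, (x_j, y_j) in zip(alpha, examples) if a)

def kernel_pegasos(examples, lam, T, kernel=rbf_kernel, seed=0):
    """Return alpha, where alpha[j] counts margin violations found on example j."""
    rng = random.Random(seed)
    alpha = [0] * len(examples)
    for t in range(1, T + 1):
        i = rng.randrange(len(examples))
        x_i, y_i = examples[i]
        # Implicitly w_t = (1 / (lam * (t - 1))) * sum_j alpha[j] * y_j * phi(x_j),
        # with w_1 = 0.
        scale = 1.0 / (lam * (t - 1)) if t > 1 else 0.0
        if y_i * scale * kernel_score(alpha, examples, x_i, kernel) < 1.0:
            alpha[i] += 1
    return alpha

def predict(alpha, examples, x, kernel=rbf_kernel):
    """Sign of the score; the positive factor 1/(lam * T) does not change the sign."""
    return 1 if kernel_score(alpha, examples, x, kernel) >= 0 else -1
```

In this sketch at most one counter is incremented per iteration, so the number of examples with a non-zero counter can never exceed T.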

17
Complex Decision Problems
  • Pegasos works whenever we know how to calculate
    subgradients of the loss function ℓ(w; (x,y))
  • Example: structured output prediction
  • Subgradient is φ(x,ŷ) - φ(x,y), where ŷ is the
    maximizer in the definition of ℓ
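As a small illustration of such a subgradient, the sketch below uses a multiclass problem as the simplest structured output. The loss is assumed to be max over y' of [cost(y, y') + ⟨w, φ(x, y')⟩ − ⟨w, φ(x, y)⟩] with a 0/1 cost, and φ is a hypothetical block feature map; neither is specified in the transcript.

```python
def phi(x, y, num_classes):
    """Hypothetical joint feature map: copy x into the block belonging to label y."""
    out = [0.0] * (len(x) * num_classes)
    out[y * len(x):(y + 1) * len(x)] = x
    return out

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def loss_and_subgradient(w, x, y, num_classes):
    """Return (loss value, subgradient w.r.t. w) for the assumed max-type loss."""
    correct_score = dot(w, phi(x, y, num_classes))

    def violation(label):
        cost = 0.0 if label == y else 1.0          # assumed 0/1 cost
        return cost + dot(w, phi(x, label, num_classes)) - correct_score

    y_hat = max(range(num_classes), key=violation)  # maximizer in the loss
    subgradient = [a - b for a, b in zip(phi(x, y_hat, num_classes),
                                         phi(x, y, num_classes))]
    return violation(y_hat), subgradient
```

When the maximizer is the correct label itself, the loss is zero and the returned subgradient is the zero vector.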

18
Bias term
  • Popular approach: increase dimension of x
    Cons: pay for b in the regularization term
  • Calculate subgradients w.r.t. w and w.r.t. b
    Cons: convergence rate is 1/ε^2
  • Define [formula missing from transcript]
    Cons: |A_t| needs to be large
  • Search b in an outer loop
    Cons: evaluating objective is 1/ε^2
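The first approach in the list, increasing the dimension of x, is simple enough to sketch. The helper names and the constant feature value 1.0 are illustrative assumptions.

```python
def augment_with_bias(examples, bias_feature=1.0):
    """Append a constant feature to every x so that the last weight acts as b."""
    return [(list(x) + [bias_feature], y) for x, y in examples]

def split_bias(w_augmented):
    """Split a weight vector learned on augmented data into (w, b)."""
    return w_augmented[:-1], w_augmented[-1]
```

Because b is now just another coordinate of the weight vector, it is included in the squared norm, which is the drawback listed for this approach.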

19
Discussion
  • Pegasos: simple and efficient solver for SVM
  • Sample vs. computational complexity
      • Sample complexity: how many examples do we need
        as a function of VC-dim (λ), accuracy (ε), and
        confidence (δ)
      • In Pegasos, we aim at analyzing computational
        complexity based on λ, ε, and δ (also in Bottou &
        Bousquet)
  • Finding argmin vs. calculating min: it seems that
    Pegasos finds the argmin more easily than it
    requires to calculate the min value