Lagrangian Support Vector Machines - PowerPoint PPT Presentation

Transcript and Presenter's Notes
1
Lagrangian Support Vector Machines
  • David R. Musicant and O.L. Mangasarian
  • December 1, 2000

Carleton College
2
Lagrangian SVM (LSVM)
  • Fast algorithm: a simple iterative approach,
    expressible in 11 lines of MATLAB code
  • Requires no specialized solvers or software
    tools, apart from a freely available equation
    solver
  • Inverts a matrix of the order of the number of
    features (in the linear case)
  • Extendible to nonlinear kernels
  • Linear convergence

3
The Discrimination Problem: The Fundamental
2-Category Linearly Separable Case
[Figure: point sets A+ and A- separated by a separating surface]
4
The Discrimination Problem: The Fundamental
2-Category Linearly Separable Case
  • Given m points in n-dimensional real space R^n
  • Represented by an m x n matrix A
  • Membership of each point A_i in the class +1 or
    -1 is specified by an m x m diagonal matrix D
    with +1 or -1 along its diagonal

5
Preliminary Attempt at the (Linear) Support
Vector Machine: Robust Linear Programming
  • Solve the following mathematical program:
      min_{w, gamma, y}  e'y
      s.t.  D(Aw - e*gamma) + y >= e,  y >= 0
  • where y is a nonnegative error (slack) vector
  • Note: y = 0 if the convex hulls of A+ and A- do
    not intersect.

6
The (Linear) Support Vector Machine: Maximize
Margin Between Separating Planes
[Figure: bounding planes around A+ and A-, with the margin between them maximized]
7
The (Linear) Support Vector Machine Formulation
  • Solve the following mathematical program:
      min_{w, gamma, y}  nu*e'y + (1/2)w'w
      s.t.  D(Aw - e*gamma) + y >= e,  y >= 0
  • where y is a nonnegative error (slack) vector
  • Note: y = 0 if the convex hulls of A+ and A- do
    not intersect.

8
SVM Reformulation
  • Standard SVM formulation:
      min_{w, gamma, y}  nu*e'y + (1/2)w'w
      s.t.  D(Aw - e*gamma) + y >= e,  y >= 0
  • Add gamma^2 to the objective function, and use
    the 2-norm of the slack variable y:
      min_{w, gamma, y}  (nu/2)y'y + (1/2)(w'w + gamma^2)
      s.t.  D(Aw - e*gamma) + y >= e

Experiments show that this does not reduce
generalization capability.
9
Simple Dual Formulation
  • The dual of this problem is
      min_{u >= 0}  (1/2)u'(I/nu + D(AA' + ee')D)u - e'u
  • I = identity matrix
  • Non-negativity constraints only
  • Leads to a very simple algorithm

Formulation ideas explored by Friess, Burges,
and others
10
Simplified Notation
  • Make the substitution H = D[A  -e] in the dual
    problem to simplify
  • The dual problem then becomes
      min_{u >= 0}  (1/2)u'(I/nu + HH')u - e'u
  • When computing (I/nu + HH')^(-1), we use the
    Sherman-Morrison-Woodbury identity:
      (I/nu + HH')^(-1) = nu*(I - H(I/nu + H'H)^(-1)H')
  • Only need to invert a matrix of size (n+1) x (n+1)
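The SMW step above can be sketched in NumPy (my own illustration, not from the slides; the sizes m, n and the matrix H are made-up stand-ins for the quantities in the slides' notation):

```python
import numpy as np

# Sherman-Morrison-Woodbury for this problem:
#   (I/nu + H H')^{-1} = nu * (I - H (I/nu + H'H)^{-1} H')
# The m x m inverse on the left is obtained by inverting only the
# small (n+1) x (n+1) matrix on the right.
rng = np.random.default_rng(0)
m, n, nu = 200, 5, 10.0                  # many points, few features (illustrative sizes)
H = rng.standard_normal((m, n + 1))      # plays the role of H = D[A -e]

direct = np.linalg.inv(np.eye(m) / nu + H @ H.T)        # O(m^3) inverse
small = np.linalg.inv(np.eye(n + 1) / nu + H.T @ H)     # O(n^3) inverse
via_smw = nu * (np.eye(m) - H @ small @ H.T)
```

Checking `np.allclose(direct, via_smw)` confirms the identity; in LSVM only the small inverse is ever formed.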

11
Deriving the LSVM Algorithm
  • Start with the dual formulation:
      min_{u >= 0}  (1/2)u'Qu - e'u,  where Q = I/nu + HH'
  • The Karush-Kuhn-Tucker necessary and sufficient
    optimality conditions are
      0 <= u,  Qu - e >= 0,  u'(Qu - e) = 0
  • This is equivalent to the following equation, for
    any alpha > 0:
      Qu - e = ((Qu - e) - alpha*u)_+
    where (x)_+ denotes the componentwise maximum of
    x and 0
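The equivalence between complementarity and the plus-function equation can be checked numerically; here is a small scalar sketch of mine (the pairs below are illustrative values, with g standing for a component of Qu - e):

```python
import numpy as np

def plus(x):
    # plus function: componentwise max(x, 0), i.e. (abs(x) + x) / 2
    return np.maximum(x, 0.0)

# KKT complementarity for a component: 0 <= u, 0 <= g, u*g = 0.
# It holds exactly when g = plus(g - alpha*u) for any alpha > 0.
alpha = 1.9
complementary = [(0.0, 3.0), (2.0, 0.0), (0.0, 0.0)]   # u*g = 0 in each pair
violating = [(1.0, 1.0)]                               # u*g > 0
ok = all(np.isclose(g, plus(g - alpha * u)) for u, g in complementary)
bad = any(np.isclose(g, plus(g - alpha * u)) for u, g in violating)
```

For the complementary pairs the equation holds; for the violating pair it fails, which is what lets the fixed-point iteration on the next slide detect non-optimal points.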

12
LSVM Algorithm
  • The last equation generates a fast algorithm if
    we replace the u on the left-hand side by u^(i+1)
    and the u on the right-hand side by u^(i), as follows:
      u^(i+1) = Q^(-1)(e + ((Qu^(i) - e) - alpha*u^(i))_+)
  • The algorithm converges linearly if 0 < alpha < 2/nu
  • In practice, we take alpha = 1.9/nu
  • Only one matrix inversion is necessary
  • Use the SMW identity

13
LSVM Algorithm, Linear Kernel: 11 Lines of MATLAB
Code

function [it, opt, w, gamma] = svml(A,D,nu,itmax,tol)
% lsvm with SMW for min 1/2*u'*Q*u-e'*u s.t. u>=0,
% Q=I/nu+H*H', H=D[A -e]
% Input: A, D, nu, itmax, tol; Output: it, opt, w, gamma
% [it, opt, w, gamma] = svml(A,D,nu,itmax,tol);
[m,n]=size(A);alpha=1.9/nu;e=ones(m,1);H=D*[A -e];it=0;
S=H*inv((speye(n+1)/nu+H'*H));
u=nu*(1-S*(H'*e));oldu=u+1;
while it<itmax & norm(oldu-u)>tol
  z=(1+pl(((u/nu+H*(H'*u))-alpha*u)-1));
  oldu=u;
  u=nu*(z-S*(H'*z));
  it=it+1;
end
opt=norm(u-oldu);w=A'*D*u;gamma=-e'*D*u;

function pl = pl(x); pl = (abs(x)+x)/2;
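For readers without MATLAB, here is my NumPy transcription of the routine above (a sketch, not the authors' code; `d` is the vector of +/-1 labels that forms the diagonal of D):

```python
import numpy as np

def lsvm(A, d, nu=1.0, itmax=100, tol=1e-5):
    """LSVM with SMW: min 1/2 u'Qu - e'u s.t. u >= 0, Q = I/nu + HH', H = D[A -e]."""
    m, n = A.shape
    alpha = 1.9 / nu
    e = np.ones(m)
    H = d[:, None] * np.hstack([A, -np.ones((m, 1))])    # H = D[A -e]
    S = H @ np.linalg.inv(np.eye(n + 1) / nu + H.T @ H)  # SMW factor: only an (n+1)x(n+1) inverse
    u = nu * (1.0 - S @ (H.T @ e))                       # u = Q^{-1} e via SMW
    oldu = u + 1.0
    it = 0
    while it < itmax and np.linalg.norm(oldu - u) > tol:
        # z = e + ((Qu - alpha*u) - e)_+  with  Qu = u/nu + H(H'u)
        z = 1.0 + np.maximum(((u / nu + H @ (H.T @ u)) - alpha * u) - 1.0, 0.0)
        oldu = u
        u = nu * (z - S @ (H.T @ z))                     # u = Q^{-1} z via SMW
        it += 1
    w = A.T @ (d * u)                                    # w = A'Du
    gamma = -e @ (d * u)                                 # gamma = -e'Du
    return it, np.linalg.norm(u - oldu), w, gamma
```

A new point x is then classified by sign(x'w - gamma).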
14
LSVM with Nonlinear Kernel
  • Start with the dual problem
  • Substitute a nonlinear kernel K(G, G') for GG',
    where G = [A  -e], to obtain
      min_{u >= 0}  (1/2)u'(I/nu + D*K(G, G')*D)u - e'u

15
Nonlinear Kernel Algorithm
  • Define Q = I/nu + D*K(G, G')*D
  • Then the algorithm is identical to the linear case
  • One caveat: the SMW identity no longer applies,
    unless an explicit decomposition for the kernel
    is known
  • LSVM in its current form is effective on
    moderately sized nonlinear problems.
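A sketch of mine of the kernel variant, using a Gaussian kernel as an assumed example (the function names and parameters are my own); since SMW no longer applies, the m x m matrix Q is inverted directly:

```python
import numpy as np

def rbf(X, Y, mu=1.0):
    # Gaussian kernel: K_ij = exp(-mu * ||X_i - Y_j||^2)
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-mu * d2)

def lsvm_kernel(K, d, nu=1.0, itmax=1000, tol=1e-5):
    # K: m x m kernel matrix K(G, G') with G = [A -e]; d: vector of +/-1 labels
    m = K.shape[0]
    alpha = 1.9 / nu
    e = np.ones(m)
    Q = np.eye(m) / nu + (d[:, None] * K) * d[None, :]   # Q = I/nu + D K(G,G') D
    Qinv = np.linalg.inv(Q)                              # one m x m inversion (no SMW)
    u = Qinv @ e
    oldu = u + 1.0
    it = 0
    while it < itmax and np.linalg.norm(oldu - u) > tol:
        z = e + np.maximum((Q @ u - alpha * u) - e, 0.0)  # same plus-function step
        oldu = u
        u = Qinv @ z
        it += 1
    return u
```

Training points are then classified by the sign of K @ (d * u); the XOR pattern, which no linear separator handles, makes a quick sanity check.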

16
Experiments
  • Compared LSVM with standard SVM (SVM-QP) for
    generalization accuracy and running time
  • CPLEX 6.5 and SVMlight 3.10b
  • A tuning set with tenfold cross-validation was
    used to find appropriate values of nu
  • Demonstrated that LSVM performs well on massive
    problems
  • Data generated with the NDC data generator
  • All experiments run on Locop2
  • 400 MHz Pentium II Xeon, 2 gigabytes of memory
  • Windows NT Server 4.0, Visual C++ 6.0

17
LSVM on UCI Datasets
LSVM is extremely simple to code, and performs
well.
18
LSVM on UCI Datasets (continued)
19
LSVM on Massive Data
  • NDC (Normally Distributed Clusters) data
  • This is all accomplished with MATLAB code, in
    core
  • The method is extendible to out-of-core
    implementations

LSVM classifies massive datasets quickly.
20
LSVM with Nonlinear Kernels
Nonlinear kernels improve classification accuracy.
21
Checkerboard Dataset
22
k-Nearest Neighbor Algorithm
23
LSVM on Checkerboard
  • Early stopping: 100 iterations
  • Finished in 58 seconds

24
LSVM on Checkerboard
  • Stronger termination criterion (100,000
    iterations)
  • 2.85 hours

25
Conclusions and Future Work
  • Conclusions
  • LSVM is an extremely simple algorithm,
    expressible in 11 lines of MATLAB code
  • LSVM performs competitively with other well-known
    SVM solvers for linear kernels
  • Only a single matrix inversion in n+1 dimensions
    (where n is usually small) is required
  • LSVM can be extended to nonlinear kernels
  • Future work
  • Out-of-core implementation
  • Parallel processing of data
  • Integrating reduced SVM or other methods for
    reducing the number of columns in the kernel matrix