Title: Incremental Support Vector Machine Classification, Second SIAM International Conference on Data Mining
1 Incremental Support Vector Machine Classification
Second SIAM International Conference on Data Mining
Arlington, Virginia, April 11-13, 2002
- Glenn Fung, Olvi Mangasarian
Data Mining Institute, University of Wisconsin - Madison
2 Key Contributions
- Fast incremental classifier based on PSVM, the Proximal Support Vector Machine
- Capable of modifying an existing linear classifier by both adding and retiring data
- Extremely simple to implement
- Small memory requirement, even for huge problems (1 billion points)
- NO optimization packages (LP, QP) needed
3 Outline of Talk
- (Standard) support vector machines (SVM)
  - Classification by halfspaces
- Proximal linear support vector machines (PSVM)
  - Classification by proximity to planes
- The incremental and decremental algorithm
  - Option of keeping or retiring old data
- Numerical results
  - 1 billion points in 10-dimensional space classified in less than 3 hours!
  - Numerical results confirm that algorithm time is linear in the number of data points
4 Support Vector Machines: Maximizing the Margin between Bounding Planes
[Figure: point sets A+ and A- separated by two parallel bounding planes with the margin between them maximized]
5 Proximal Support Vector Machines: Fitting the Data Using Two Parallel Bounding Planes
[Figure: point sets A+ and A- clustered around two parallel proximal planes]
6 Standard Support Vector Machine: Algebra of the 2-Category Linearly Separable Case
7 Standard Support Vector Machine Formulation
8 PSVM Formulation
We start from the standard QP SVM formulation.
This simple but critical modification changes the nature of the optimization problem tremendously!
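The slide's displayed equations are not reproduced here; as a sketch following the published PSVM formulation of Fung and Mangasarian (with diagonal label matrix D, vector of ones e, slack y, and penalty parameter ν), the modification can be written as:

```latex
% Standard SVM (QP): 1-norm error, inequality constraints
\min_{w,\gamma,y}\ \nu\, e^{\top}y + \tfrac{1}{2}\, w^{\top}w
\quad \text{s.t.} \quad D(Aw - e\gamma) + y \ge e,\ \ y \ge 0.

% PSVM: squared 2-norm error, \gamma^2 added to the margin term,
% and the inequalities replaced by equalities
\min_{w,\gamma,y}\ \tfrac{\nu}{2}\, \|y\|^{2}
  + \tfrac{1}{2}\,\bigl(w^{\top}w + \gamma^{2}\bigr)
\quad \text{s.t.} \quad D(Aw - e\gamma) + y = e.
```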
9 Advantages of New Formulation
- Objective function remains strongly convex.
- An explicit exact solution can be written in terms of the problem data.
- The PSVM classifier is obtained by solving a single system of linear equations in the usually small-dimensional input space.
- Exact leave-one-out correctness can be obtained in terms of the problem data.
10 Linear PSVM
- Setting the gradient equal to zero gives a nonsingular system of linear equations.
- Solution of the system gives the desired PSVM classifier.
11 Linear PSVM Solution
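The solution slide's equation is sketched here from the PSVM derivation: substituting y = e - D(Aw - eγ) into the objective and setting the gradient to zero gives

```latex
\left( \frac{I}{\nu} + E^{\top}E \right)
\begin{bmatrix} w \\ \gamma \end{bmatrix}
= E^{\top} D e,
\qquad
E = \begin{bmatrix} A & -e \end{bmatrix},
```

an (n+1)-by-(n+1) system whose size is independent of the number of data points m once E'E and E'De are formed.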
12 Linear Proximal SVM Algorithm
13 Linear & Nonlinear PSVM MATLAB Code

function [w, gamma] = psvm(A,d,nu)
% PSVM: linear and nonlinear classification
% INPUT: A, d = diag(D), nu. OUTPUT: w, gamma
% [w, gamma] = psvm(A,d,nu)
[m,n] = size(A); e = ones(m,1); H = [A -e];
v = (d'*H)';                    % v = H'*D*e
r = (speye(n+1)/nu + H'*H)\v;   % solve (I/nu + H'*H)*r = v
w = r(1:n); gamma = r(n+1);     % getting w, gamma from r
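As a cross-check, here is a minimal NumPy re-expression of the same linear PSVM computation; the toy data at the bottom is illustrative and not from the talk.

```python
import numpy as np

def psvm(A, d, nu):
    """Linear PSVM: solve (I/nu + H'H) r = H'De with H = [A, -e].

    A : m-by-n data matrix; d : +1/-1 label vector (d = diag(D));
    nu: penalty parameter. Returns (w, gamma) for the plane x'w = gamma.
    """
    m, n = A.shape
    H = np.hstack([A, -np.ones((m, 1))])   # H = [A  -e]
    v = H.T @ d                            # v = H'*D*e, since D*e = d
    r = np.linalg.solve(np.eye(n + 1) / nu + H.T @ H, v)
    return r[:n], r[n]                     # w, gamma

# Tiny illustrative example: two linearly separable 2-D clusters.
A = np.array([[2.0, 2.0], [3.0, 2.5], [-2.0, -2.0], [-3.0, -2.5]])
d = np.array([1.0, 1.0, -1.0, -1.0])
w, gamma = psvm(A, d, nu=10.0)
pred = np.sign(A @ w - gamma)   # classify the training points
```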
14 Incremental PSVM Classification
15 Linear Incremental Proximal SVM Algorithm
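The algorithm's equations did not survive extraction; a sketch, assuming the data arrives in blocks with rows A_i, label matrices D_i, and E_i = [A_i  -e_i]:

```latex
E^{\top}E = \sum_{i=1}^{s} E_i^{\top}E_i, \qquad
E^{\top}De = \sum_{i=1}^{s} E_i^{\top}D_i e_i, \qquad
\begin{bmatrix} w \\ \gamma \end{bmatrix}
= \left( \frac{I}{\nu} + E^{\top}E \right)^{-1} E^{\top}De .
```

Each block contributes only an (n+1)-by-(n+1) matrix and an (n+1)-vector, so these small running sums are all that must be kept in memory.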
16 Linear Incremental Proximal SVM: Adding & Retiring Data
- Capable of modifying an existing linear classifier by both adding and retiring data
- Option of retiring old data is similar to adding new data
  - Financial data: old data is obsolete
- Option of keeping old data and merging it with the new data
  - Medical data: old data does not obsolesce
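The add/retire bookkeeping can be sketched in NumPy; the class name and data below are hypothetical, but the update rule (adding or subtracting each block's E_i'E_i and E_i'd_i contribution) follows the slide's description.

```python
import numpy as np

class IncrementalPSVM:
    """Incremental linear PSVM sketch: only M = sum of E_i'E_i (size
    (n+1)x(n+1)) and v = sum of E_i'd_i are kept; raw blocks are
    discarded after their contributions are accumulated."""

    def __init__(self, n, nu):
        self.M = np.zeros((n + 1, n + 1))
        self.v = np.zeros(n + 1)
        self.nu = nu

    @staticmethod
    def _contribution(A, d):
        E = np.hstack([A, -np.ones((A.shape[0], 1))])   # E_i = [A_i  -e]
        return E.T @ E, E.T @ d

    def add_block(self, A, d):
        Mi, vi = self._contribution(A, d)
        self.M += Mi
        self.v += vi

    def retire_block(self, A, d):      # block must have been added earlier
        Mi, vi = self._contribution(A, d)
        self.M -= Mi
        self.v -= vi

    def classifier(self):
        k = self.M.shape[0]
        r = np.linalg.solve(np.eye(k) / self.nu + self.M, self.v)
        return r[:-1], r[-1]           # w, gamma

# Illustrative daily simulation: two well-separated synthetic blocks.
rng = np.random.default_rng(0)
A1 = rng.normal(size=(50, 2)) + [3.0, 3.0]; d1 = np.ones(50)
A2 = rng.normal(size=(50, 2)) - [3.0, 3.0]; d2 = -np.ones(50)
inc = IncrementalPSVM(n=2, nu=1.0)
inc.add_block(A1, d1)
inc.add_block(A2, d2)
w, gamma = inc.classifier()
```

Retiring a block simply subtracts the same contribution it added, so after `inc.retire_block(A1, d1)` the stored sums match a fresh pass over the remaining data.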
17 Numerical Experiments: One-Billion-Point Two-Class Dataset
- Synthetic dataset consisting of 1 billion points in 10-dimensional input space
- Generated by the NDC (Normally Distributed Clustered) dataset generator
- Dataset divided into 500 blocks of 2 million points each
- Solution obtained in less than 2 hours and 26 minutes
- About 30% of the time was spent reading data from disk
- Testing set correctness: 90.79%
18 Numerical Experiments: Simulation of a Two-Month 60-Million-Point Dataset
- Synthetic dataset consisting of 60 million points (1 million per day) in 10-dimensional input space
- Generated using NDC
- At the beginning, we only have data corresponding to the first month
- Every day:
  - The oldest block of data is retired (1 million points)
  - A new block is added (1 million points)
  - A new linear classifier is calculated
- Only an 11-by-11 matrix is kept in memory at the end of each day; all other data is purged.
19 Numerical Experiments: Separator Changing through Time
20 Numerical Experiments: Normals to the Separating Hyperplanes Corresponding to 5-Day Intervals
21 Conclusion
- The proposed algorithm is an extremely simple procedure for generating linear classifiers in an incremental fashion for huge datasets.
- The linear classifier is obtained by solving a single system of linear equations in the small-dimensional input space.
- The proposed algorithm has the ability to retire old data and add new data in a very simple manner.
- Only a matrix of the size of the input space is kept in memory at any time.
22 Future Work
- Extension to nonlinear classification
- Parallel formulation and implementation on remotely located servers for massive datasets
- Real-time online applications, e.g., fraud detection