Introduction to Support Vector Machines for Data Mining

Slides: 20
Provided by: BerksLehi2

1
Introduction to Support Vector Machines for Data
Mining
  • Mahdi Nasereddin Ph.D.
  • Pennsylvania State University
  • School of Information Sciences and Technology

2
Agenda
  • Introduction
  • Support Vector Machines
  • Preliminary Experimentation
  • Conclusion
  • Questions?

3
Data Mining Techniques
  • Neural Networks
  • Decision Trees
  • Multivariate Adaptive Regression Splines (MARS)
  • Rule Induction
  • Nearest Neighbor Method and Discriminant Analysis
  • Genetic Algorithms
  • Support Vector Machines

4
Support Vector Machines
  • First introduced by Boser, Guyon, and Vapnik at
    COLT-92
  • Based on Statistical Learning Theory
  • Applications
  • Basic Theory
  • Classification
  • Regression

5
Successful Applications of SVMs
  • Protein Structure Prediction
    http://www.cs.umn.edu/hpark/papers/surface.pdf
  • Intrusion Detection www.cs.nmt.edu/IT
  • Handwriting Recognition
  • Detecting Steganography in digital images
    http://www.cs.dartmouth.edu/farid/publications/ih02.html

6
Successful Applications of SVMs
  • Breast Cancer Prognosis: Chemotherapy Effect on
    Survival Rate (Lee, Mangasarian and Wolberg,
    2001)
  • Particle and Quark-Flavour Identification in High
    Energy Physics
    (http://wwwrunge.physik.uni-freiburg.de/preprints/EHEP9901.ps)
  • Function Approximation

7
Support Vector Machines (Linearly separable case)

8
Support Vector Machines (Linearly separable case)

9
Support Vector Machines (Linearly separable case)
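
The bodies of these slides are not preserved in the transcript. As a reminder of what the linearly separable case usually refers to, here is the standard hard-margin formulation; the symbols (w, b, x_i, y_i, n) are the conventional textbook ones and are an assumption, not taken from the slides.

```latex
% Training data: (x_i, y_i), i = 1, ..., n, with labels y_i \in \{-1, +1\}.
% Separating hyperplane: w^\top x + b = 0.
\begin{aligned}
\min_{w,\, b} \quad & \tfrac{1}{2}\, \lVert w \rVert^2 \\
\text{subject to} \quad & y_i \,(w^\top x_i + b) \ge 1, \qquad i = 1, \dots, n .
\end{aligned}
% The margin between the two classes is 2 / \lVert w \rVert, so minimizing
% \lVert w \rVert^2 maximizes the margin. The training points for which the
% constraint holds with equality are the support vectors.
```
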
10
Non-Linearly separable case
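
The body of this slide is also missing. One common way to handle data that is not linearly separable is to replace inner products with a kernel function, which implicitly maps the inputs into a higher-dimensional feature space. The sketch below is an illustration under that assumption: it checks that a degree-2 polynomial kernel on 2-D inputs equals an ordinary inner product after an explicit feature map. The function names are hypothetical.

```python
import math


def poly2_kernel(x, z):
    """Homogeneous degree-2 polynomial kernel: k(x, z) = (x . z)^2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def phi(x):
    """Explicit feature map for 2-D inputs that corresponds to poly2_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)


def dot(u, v):
    """Plain inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


# Two arbitrary 2-D points (illustrative values only).
x, z = (1.0, 2.0), (3.0, -1.0)

kernel_value = poly2_kernel(x, z)      # computed in the original 2-D space
explicit_value = dot(phi(x), phi(z))   # computed in the 3-D feature space
```

Both values agree up to floating-point rounding, which is why an SVM can work in the feature space without ever computing `phi` explicitly.
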
11
SVM for Regression
  • In regression, the goal is to construct a
    hyperplane that lies close to as many points as
    possible.
  • For both classification and regression, learning
    is done via quadratic programming, which has a
    single global optimum (no local optima).
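
A minimal sketch of what "close to as many points as possible" commonly means in support vector regression: errors smaller than a tolerance epsilon cost nothing, and larger errors are penalized linearly. The slides do not state the loss function, so the epsilon-insensitive loss and the default value of `epsilon` are assumptions.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Return 0 inside the epsilon tube, else the distance beyond it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def total_loss(targets, predictions, epsilon=0.1):
    """Sum the epsilon-insensitive loss over a set of points."""
    return sum(
        epsilon_insensitive_loss(t, p, epsilon)
        for t, p in zip(targets, predictions)
    )


# A prediction within 0.1 of the target costs nothing;
# one that is off by 0.3 costs roughly 0.3 - 0.1.
inside_tube = epsilon_insensitive_loss(1.0, 1.05)
outside_tube = epsilon_insensitive_loss(1.0, 1.3)
```
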

12
Strengths and Weaknesses of SVM
  • Strengths
    • Training is relatively easy
    • No local optima, unlike in neural networks
    • It scales relatively well to high dimensional
      data
  • Weaknesses
    • Need a good kernel function

13
Preliminary Experimentation: Forecasting GDP
using Oil Prices (with F. Malik)
  • Forecasting model
  • Objective: To predict the Gross Domestic Product
    (GDP) for the next quarter using
    • Oil prices (including time lag)
    • GDP time
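
One way the forecasting inputs described above could be assembled is as a lagged dataset: each row holds the previous quarters' oil and GDP values, and the target is GDP for the next quarter. The number of lags and the function name are assumptions; the slides do not specify them.

```python
def make_lagged_dataset(gdp, oil, n_lags=1):
    """Build (features, targets) pairs for one-step-ahead forecasting.

    Each feature row is [oil lags..., gdp lags...] for the n_lags quarters
    before quarter t, and the target is gdp[t]. n_lags is an assumption.
    """
    if len(gdp) != len(oil):
        raise ValueError("gdp and oil must have the same length")
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")

    features, targets = [], []
    for t in range(n_lags, len(gdp)):
        row = list(oil[t - n_lags:t]) + list(gdp[t - n_lags:t])
        features.append(row)
        targets.append(gdp[t])
    return features, targets


# Toy series (illustrative values, not the data used in the presentation).
toy_gdp = [1, 2, 3, 4, 5]
toy_oil = [10, 20, 30, 40, 50]
X, y = make_lagged_dataset(toy_gdp, toy_oil, n_lags=2)
```
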

14
Data Set
  • We looked at quarterly oil prices and GDP data
    from January 1947 to December 2002
  • Oil price data were obtained from the Bureau of
    Labor Statistics
  • GDP data were obtained from the Bureau of
    Economic Analysis.
  • We used the growth rate of GDP and the growth
    rate of oil prices.
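
The slides do not say how the growth rates were computed. A minimal sketch, assuming the simple period-over-period percentage change (a log difference is another common choice):

```python
def growth_rates(series):
    """Return (x[t] - x[t-1]) / x[t-1] for each consecutive pair.

    Assumes simple percentage change; the presentation does not state
    which definition of growth rate was used.
    """
    rates = []
    for previous, current in zip(series, series[1:]):
        if previous == 0:
            raise ValueError("growth rate is undefined for a zero base value")
        rates.append((current - previous) / previous)
    return rates


# Toy quarterly levels (illustrative values only).
toy_levels = [100.0, 110.0, 99.0]
toy_rates = growth_rates(toy_levels)
```

The output has one fewer element than the input, because the first quarter has no predecessor.
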

15
Models
  • Neural Networks
    • Back-propagation
    • One hidden layer
    • Delta rule was used for training
  • LS-SVM (Van Gestel, 2001)
    • Matlab toolbox
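
For readers unfamiliar with LS-SVM, the sketch below illustrates the general idea in plain Python: least-squares SVM regression replaces the quadratic program with a single linear system. This is not the Matlab toolbox used in the experiment; the RBF kernel, the values of `gamma` and `sigma`, and all function names are assumptions chosen for illustration.

```python
import math


def rbf_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_linear_system(A, rhs):
    """Solve A u = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    u = [0.0] * n
    for r in range(n - 1, -1, -1):  # back-substitution
        s = sum(M[r][c] * u[c] for c in range(r + 1, n))
        u[r] = (M[r][n] - s) / M[r][r]
    return u


def lssvm_fit(X, y, gamma=10.0, sigma=1.0):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0]
        for j in range(n):
            k = rbf_kernel(X[i], X[j], sigma)
            row.append(k + (1.0 / gamma if i == j else 0.0))
        A.append(row)
    solution = solve_linear_system(A, [0.0] + list(y))
    return solution[0], solution[1:]  # bias b, coefficients alpha


def lssvm_predict(X_train, b, alphas, x, sigma=1.0):
    """Evaluate f(x) = sum_i alpha_i * k(x_i, x) + b."""
    return b + sum(a * rbf_kernel(xi, x, sigma) for a, xi in zip(alphas, X_train))


# Toy 1-D data (illustrative values, not the GDP data).
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0.0, 1.0, 2.0, 3.0]
b, alphas = lssvm_fit(X_train, y_train, gamma=1e6, sigma=1.0)
prediction = lssvm_predict(X_train, b, alphas, [1.0], sigma=1.0)
```

A large `gamma` makes the model fit the training points closely; smaller values trade fit for smoothness.
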

16
Experimentation
  • Created the training data to predict the last 40
    quarters of GDP (test data)
  • Trained the neural network and the SVM
  • Used the model to predict GDP, and calculated the
    error of prediction
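
The evaluation procedure above can be sketched as a chronological hold-out split followed by an error calculation. The mean absolute error (MAE) is used here because it is the metric reported in the results; the helper names are hypothetical.

```python
def holdout_split(series, n_test=40):
    """Split a time-ordered series into (train, test), keeping the
    last n_test observations as the test set."""
    if not 0 < n_test < len(series):
        raise ValueError("n_test must be between 1 and len(series) - 1")
    return series[:-n_test], series[-n_test:]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy series of 100 observations (illustrative values only).
toy_series = list(range(100))
train, test = holdout_split(toy_series, n_test=40)

toy_mae = mean_absolute_error([1.0, 2.0, 3.0], [1.0, 3.0, 5.0])
```

Splitting by position rather than at random keeps the test quarters strictly later than the training quarters, which matters for forecasting.
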

17
Results
  Model            MAE
  Neural Network   0.0044
  LS-SVM           0.0052

18
Good References
  • Introductions
    • Martin Law, An Introduction to Support Vector
      Machines
    • Andrew Moore, Support Vector Machines
      www.cs.cmu.edu/awm
    • N. Cristianini
      www.support-vector.net/tutorial.html
  • In depth
    • Support Vector Machines book
      www.support-vector.net

19
Questions
  • E-mail: mxn16_at_psu.edu
  • Presentation will be posted (by Friday) at
    http://www.bklv.psu.edu/faculty/nasereddin