Change Detection in Data Streams by Testing Exchangeability - PowerPoint PPT Presentation

About This Presentation
Title:

Change Detection in Data Streams by Testing Exchangeability

Description:

Change Detection in Data Streams by Testing Exchangeability. Shen-Shyang Ho, JPL/Caltech. The research is part of the author's PhD dissertation (in computer science ...

Slides: 36
Provided by: niss150
Learn more at: https://www.niss.org
Transcript and Presenter's Notes

Title: Change Detection in Data Streams by Testing Exchangeability


1
Change Detection in Data Streams by Testing
Exchangeability
  • Shen-Shyang Ho
  • JPL/Caltech

The research is part of the author's PhD
dissertation (in computer science) at George
Mason University. Conference travel is partially
sponsored by a NASA Postdoctoral Program (NPP)
Travel Grant.
2
Outline
  • Introduction
  • Previous Work (Statistics and Machine
    Learning/Data Mining/Computer Vision)
  • Intuition
  • Background (Exchangeability/Martingale)
  • Methodology
  • Comparison and Experimental Results
  • Application I: Adaptive Support Vector Machine
    (Classification Model)
  • Application II: Video Shot Change Detection
    (Cluster Model)

3
Introduction
Assumption: Data vectors are observed
sequentially.
4
Introduction
5
Previous Work
  • Statistics - Sequential Analysis: statistical
    inference in which the number of
    observations/samples required is not
    pre-determined.
  • Sequential Probability Ratio Test: A. Wald
    (1945)
  • Application: Quality Control
    (Military/Manufacturing)
  • CUSUM (Cumulative Sum): E. S. Page (1954)
  • Refer to the journal Sequential Analysis: Design
    Methods and Applications for recent research.
  • Most recent issue (vol. 27, no. 2, 2008): 3 out
    of 6 papers address change detection (structural
    change, minimax methods for change-point
    detection problems, multidecision quickest
    change-point detection).
  • Machine Learning/Data Mining
  • Applications: Concept Drift Problem, Adaptive
    Classifier, Anomaly in Internet Traffic,
    Video-Shot Change Detection
  • Proposed methodology is usually problem-specific
  • Monitoring error, sliding window, weighted data,
    ensemble classifier
  • Statistical methods: Likelihood ratio method,
    Bayesian methods, Hypothesis Testing
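
As a point of comparison for the sequential methods listed above, the following is a minimal sketch of a one-sided CUSUM detector in the spirit of Page (1954). The function name, the `drift` allowance, and the `threshold` value are illustrative assumptions and do not come from the slides.

```python
def cusum_detect(xs, target_mean, drift, threshold):
    """One-sided (upward) CUSUM sketch.

    Accumulates deviations above target_mean + drift, clipped at zero,
    and returns the first index at which the statistic exceeds threshold.
    Returns None if no alarm is raised. All parameters are illustrative.
    """
    s = 0.0
    for i, x in enumerate(xs):
        # Add the new deviation; reset to zero when the sum goes negative.
        s = max(0.0, s + (x - target_mean - drift))
        if s > threshold:
            return i
    return None


# Synthetic stream: the mean shifts from 0 to 1 at index 20.
stream = [0.0] * 20 + [1.0] * 20
alarm = cusum_detect(stream, target_mean=0.0, drift=0.25, threshold=3.0)
print(alarm)
```

Note that this detector needs a reference mean and an expected shift size up front, which is the kind of problem-specific tuning the slide contrasts with.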

6
Related Data Mining/Machine Learning/Computer
Vision Research
  1. Xiuyao Song, Mingxi Wu, Christopher M. Jermaine,
    Sanjay Ranka: Statistical change detection for
    multi-dimensional data. KDD 2007: 667-676
  2. J. Z. Kolter, M. A. Maloof: Dynamic Weighted
    Majority: An ensemble method for drifting
    concepts. Journal of Machine Learning Research
    8: 2755-2790, 2007
  3. Ralf Klinkenberg, Thorsten Joachims:
    Detecting Concept Drift with Support Vector
    Machines. Proceedings of the Seventeenth
    International Conference on Machine Learning
    (ICML): 487-494, 2000
  4. Bi Song, Namrata Vaswani, Amit K. Roy Chowdhury:
    Closed-Loop Tracking and Change Detection in
    Multi-Activity Sequences. CVPR 2007
  5. Paul L. Rosin: Thresholding for Change Detection.
    ICCV 1998: 274-279
  6. Balachander Krishnamurthy, Subhabrata Sen, Yin
    Zhang, Yan Chen: Sketch-based change detection:
    methods, evaluation, and applications. Internet
    Measurement Conference 2003: 234-247
  7. Tsuyoshi Idé, Keisuke Inoue: Knowledge Discovery
    from Heterogeneous Dynamic Systems using
    Change-Point Correlations. SDM 2005
  8. Tsuyoshi Idé, Koji Tsuda: Change-Point Detection
    using Krylov Subspace Learning. SDM 2007
  9. Daniel Kifer, Shai Ben-David, Johannes Gehrke:
    Detecting Changes in Data Streams. Proc. 30th
    VLDB Conference, 2004
  10. ...

7
Motivation
Lack of Exchangeability implies Change in Data
Distribution/Model
3/29/2016
7
8
Intuition
9
Background
  • Vovk et al.'s work, Testing Exchangeability
    Online (ICML 2003) and Algorithmic Learning in
    a Random World (Springer):
  • Testing the exchangeability assumption in an
    online mode.
  • Explicit martingale for testing the hypothesis of
    exchangeability

(Refer to http://www.vovk.net (conformal
prediction))
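
One explicit martingale from Vovk et al.'s Testing Exchangeability Online is the randomized power martingale, M_n = prod_{i<=n} epsilon * p_i^(epsilon - 1), built from a sequence of p-values. The sketch below assumes the p-values are already available; the default `epsilon`, the alarm `threshold`, the clamping floor, and the function names are illustrative assumptions, not values taken from the slides. Under exchangeability, the probability that M_n ever reaches a level lambda is at most 1/lambda (Ville's/Doob's maximal inequality), which is what justifies a threshold-style alarm.

```python
def power_martingale(p_values, epsilon=0.92):
    """Yield M_1, M_2, ... for the power martingale with M_0 = 1.

    Each step multiplies by epsilon * p**(epsilon - 1), so small p-values
    (unusually strange points) make the martingale grow.
    """
    m = 1.0
    for p in p_values:
        p = max(p, 1e-12)  # assumption: clamp to avoid dividing by zero
        m *= epsilon * p ** (epsilon - 1.0)
        yield m


def first_alarm(p_values, threshold=20.0, epsilon=0.92):
    """Return the 1-based step at which M_n first reaches threshold, else None."""
    for n, m in enumerate(power_martingale(p_values, epsilon), start=1):
        if m >= threshold:
            return n
    return None


# Consistently small p-values drive the martingale up quickly.
print(first_alarm([0.01] * 10, threshold=20.0, epsilon=0.5))
# p-values equal to 1 shrink it, so no alarm is raised.
print(first_alarm([1.0] * 10, threshold=20.0, epsilon=0.5))
```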
10
Background
11
Background
12
Methodology - Strangeness
  • Strangeness measures how well one data point
    (among the data points seen so far) is
    represented by a data model compared to the
    other points
  • Applicable to classification, regression or
    cluster models
  • Measures diversity/disagreement, i.e. the
    higher the strangeness of a point, the less
    likely it is to come from the model

Condition for a valid strangeness measure: the
strangeness value of a data point at a particular
time instance should be independent of the order
in which it is observed with respect to the other
data points.
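
To make the validity condition concrete, here is a minimal sketch for a cluster-style model on one-dimensional data: the strangeness of a point is its distance to the mean of all points seen so far, which does not depend on arrival order. The second function turns strangeness values into a randomized p-value for the newest point, in the way conformal prediction does. The function names and the choice of a single mean as the "cluster model" are assumptions made for illustration.

```python
import random


def cluster_strangeness(points):
    """Strangeness of each point = distance to the mean of all points seen.

    The mean is a symmetric function of the points, so reordering the
    input only reorders the output (the validity condition).
    """
    center = sum(points) / len(points)
    return [abs(x - center) for x in points]


def randomized_p_value(strangeness, rng=random):
    """p-value of the newest point (last entry) among all points seen so far.

    p = (#{s_i > s_n} + theta * #{s_i == s_n}) / n, theta uniform on [0, 1).
    """
    s_new = strangeness[-1]
    greater = sum(1 for s in strangeness if s > s_new)
    equal = sum(1 for s in strangeness if s == s_new)  # counts the new point too
    theta = rng.random()
    return (greater + theta * equal) / len(strangeness)


seen = [0.0, 2.0, 4.0]
print(cluster_strangeness(seen))
# The newest point is the strangest of three, so its p-value is below 1/3.
print(randomized_p_value([1.0, 2.0, 5.0]))
```

A small p-value means the newest point is stranger than most of its predecessors; a run of small p-values is what a change in the data-generating model would produce.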
13
Classification Model
Strangeness (SVM): Lagrange Multiplier
14
Classification Model
Strangeness (SVM): Lagrange Multiplier
15
Cluster Model
16
Regression Model
(Papadopoulos et al., Inductive Confidence
Machines for Regression, ECML, LNAI 2430, pp.
345-356, 2002)
17
Methodology
18
Methodology
19
Methodology
20
Methodology
21
Methodology
22
Experimental Result: Performance Measure
23
Experimental Result Varying
24
Experimental Result: Varying Strangeness
25
Experimental Result Varying
26
Experimental Result
27
Experimental Result
28
Experimental Result: Different Methods
29
Application: Adaptive SVM
30
Application: Adaptive SVM
Simulated USPS 3-Digit Image Data Stream
[Figure residue: time axis t; digit labels 01120120034003340415655611577789987]
31
Application: Adaptive SVM
  • A (blue): True change point known to the SVM
  • B (red): Adaptive SVM using martingale method
  • C (magenta): SVM using sliding window of size 250
  • D (black): SVM using sliding window of size 500
  • E (green): SVM using sliding window of size 1000
32
Application: Video-Shot Change Detection
Martingale change detection using multiple
features (MVMT: Multiple-View Martingale Test)
33
Application: Video-Shot Change Detection
  • Histogram Intersection (HI)
  • Chi-Square Measure
  • Euclidean Distance (ED)
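
The three measures above can be sketched for a pair of normalized frame histograms as follows. The slides do not give the exact formulas used, so the chi-square form below (one common symmetric variant) and the function names are assumptions; each measure could serve as a separate feature view for a strangeness computation.

```python
import math


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; equals 1.0 for identical normalized histograms."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def chi_square(h1, h2):
    """Symmetric chi-square distance (assumed variant); empty bins are skipped."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def euclidean_distance(h1, h2):
    """Straight-line distance between the two histogram vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(h1, h2)))


# Two toy 3-bin histograms standing in for consecutive video frames.
frame_a = [0.5, 0.5, 0.0]
frame_b = [0.25, 0.25, 0.5]
print(histogram_intersection(frame_a, frame_b))
print(chi_square(frame_a, frame_b))
print(euclidean_distance(frame_a, frame_b))
```

Histogram intersection is a similarity (larger means more alike), while the other two are distances, so a strangeness measure built on HI would need to be inverted.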

34
Reference
  1. S.-S. Ho and H. Wechsler, Detecting Change-Points
    in Unlabeled Data Streams using Martingale, Proc.
    20th Int. Joint Conf. on Artificial Intelligence
    (IJCAI 2007), Hyderabad, India, Jan. 6 - 12,
    2007.
  2. S.-S. Ho, A Martingale Framework for Concept
    Change Detection in Time-Varying Data Streams,
    Proc. Int. Conf. on Machine Learning (ICML 2005),
    Bonn, Germany, Aug. 7 - 11, 2005.
  3. S.-S. Ho and H. Wechsler, Adaptive Support Vector
    Machine for Time-Varying Data Streams Using the
    Martingale, Proc. Int. Joint Conf. on Artificial
    Intelligence (IJCAI 2005), Edinburgh, Scotland,
    July 30 - Aug. 5, 2005.
  4. S.-S. Ho and H. Wechsler, On the Detection of
    Concept Change in Time-Varying Data Streams by
    Testing Exchangeability, Proc. Conference on
    Uncertainty in Artificial Intelligence (UAI
    2005), Edinburgh, Scotland, July 26 - 29, 2005.
  5. http://shenshyang.googlepages.com/codes (MATLAB
    codes and datasets)

35
Acknowledgement
  • Harry Wechsler, PhD Advisor (George Mason
    University)
  • Volodya Vovk (Royal Holloway, University of
    London)
  • Alexander Gammerman (Royal Holloway, University
    of London)
  • Oak Ridge Associated Universities (ORAU)
