Learning Stable Multivariate Baseline Models for Outbreak Detection

1
Learning Stable Multivariate Baseline Models for
Outbreak Detection
  • Sajid M. Siddiqi, Byron Boots, Geoffrey J.
    Gordon, Artur W. Dubrawski
  • The Auton Lab, School of Computer Science,
    Carnegie Mellon University

Presented by Robin Sabhnani from the Auton Lab
This work was partly funded by NSF grant
IIS-0325581 and CDC award R01-PH000028
2
Motivation
  • Lots of health-related data available
  • Much of this data is temporal
  • Many data sources are also multivariate

3
Motivation
  • When detecting anomalies, the crucial information
    could be hidden in the dynamics of the data as
    well as the interaction between different data
    streams
  • Our goal: learn good models for simulating
    baseline data, for use in training algorithms as
    well as detecting anomalies
  • Linear Dynamical Systems are a good choice

4
Outline
  • Linear Dynamical Systems
  • Learning Stable Models
  • Experimental Setup
  • Results
  • Conclusion

8
Linear Dynamical Systems (LDS)
[Diagram: a chain of low-dimensional hidden variables $X_1, X_2, \dots, X_t, X_{t+1}$, each emitting a high-dimensional observation $Y_1, Y_2, \dots, Y_t, Y_{t+1}$]
  • Dynamics matrix A models temporal evolution
  • Multivariate Gaussian noise terms $v_t$, $w_t$ model
    interaction between streams
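
As a minimal sketch of the model on this slide, the code below samples from an LDS written in the common form $x_{t+1} = A x_t + w_t$, $y_t = C x_t + v_t$. The observation matrix `C`, the noise covariances `Q` and `R`, and all numeric values are assumptions chosen for illustration; the slides name only `A`, $v_t$ and $w_t$.

```python
import numpy as np

def simulate_lds(A, C, Q, R, x0, steps, seed=0):
    """Sample states and observations from x_{t+1} = A x_t + w_t, y_t = C x_t + v_t."""
    rng = np.random.default_rng(seed)
    n, m = A.shape[0], C.shape[0]
    xs = np.zeros((steps, n))
    ys = np.zeros((steps, m))
    x = np.asarray(x0, dtype=float)
    for t in range(steps):
        xs[t] = x
        # Observation noise v_t ~ N(0, R); off-diagonal entries of R would couple streams.
        ys[t] = C @ x + rng.multivariate_normal(np.zeros(m), R)
        # State noise w_t ~ N(0, Q).
        x = A @ x + rng.multivariate_normal(np.zeros(n), Q)
    return xs, ys

# Illustrative values only: 2 hidden dimensions, 3 observed streams.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
Q = 0.01 * np.eye(2)
R = 0.1 * np.eye(3)
xs, ys = simulate_lds(A, C, Q, R, x0=[1.0, 1.0], steps=60)
print(xs.shape, ys.shape)
```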

9
Linear Dynamical Systems
  • The Good
    • Linear Dynamical Systems (aka State-Space models,
      aka Kalman Filters) are a generalization of ARMA
      models and can represent a wide range of time
      series
    • LDS parameters can be learned from data
  • The Bad
    • LDSs learned from data are often unstable
    • Simulation from an unstable LDS degenerates
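
To illustrate what "degenerates" can look like, the sketch below rolls a noise-free state forward under two hypothetical dynamics matrices that differ in a single entry. The matrices and the horizon of 200 steps are made up for the example and are not taken from the talk.

```python
import numpy as np

def state_norms(A, x0, steps):
    """Norm of the noise-free state x_t = A^t x_0 at each step."""
    x = np.asarray(x0, dtype=float)
    norms = []
    for _ in range(steps):
        norms.append(float(np.linalg.norm(x)))
        x = A @ x
    return norms

# Hypothetical matrices: the only difference is the top-left entry.
stable_A = np.array([[0.95, 0.10], [0.00, 0.90]])    # eigenvalues 0.95 and 0.90
unstable_A = np.array([[1.05, 0.10], [0.00, 0.90]])  # eigenvalues 1.05 and 0.90

stable = state_norms(stable_A, [1.0, 1.0], 200)
unstable = state_norms(unstable_A, [1.0, 1.0], 200)
print(f"stable: {stable[-1]:.3g}, unstable: {unstable[-1]:.3g}")
```

The stable sequence decays toward zero while the unstable one grows without bound, so a long simulated baseline from the second model would stop resembling real data.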

10
Stability
  • Stability of an LDS depends on its dynamics
    matrix A
  • Let $\lambda_1, \dots, \lambda_n$ be the eigenvalues of A in
    decreasing order of magnitude
  • A is stable if $|\lambda_1| < 1$
  • Constraining $\lambda_1$ during learning is hard
  • We devise an iterative optimization method that
    beats previous approaches in efficiency and
    accuracy

A Constraint Generation Approach to Learning
Stable Linear Dynamical Systems, S. Siddiqi, B.
Boots, G. Gordon, NIPS 2007
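
The stability criterion on this slide can be checked numerically. The following is a small sketch using NumPy's eigenvalue routine; the helper names `spectral_radius` and `is_stable` and the two test matrices are illustrative assumptions. It only tests stability and does not implement the constraint generation method.

```python
import numpy as np

def spectral_radius(A):
    """Largest eigenvalue magnitude |lambda_1| of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))

def is_stable(A):
    """True when every eigenvalue of A lies strictly inside the unit circle."""
    return spectral_radius(A) < 1.0

A_good = np.array([[0.5, 0.4], [-0.4, 0.5]])  # complex pair 0.5 +/- 0.4i
A_bad = np.array([[1.1, 0.0], [0.0, 0.3]])    # one eigenvalue above 1
print(is_stable(A_good), is_stable(A_bad))
```

Checking a given matrix is cheap. Enforcing the condition while fitting is the harder part, in part because the set of matrices with spectral radius below 1 is not convex.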
11
Stability
  • Learning stable LDS models allows us to
  • Compress large temporal multivariate datasets
  • Generate realistic data sequences
  • Predict the future given some data
  • Deviations from predicted data indicate anomalies
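
One simple way to turn "deviations from predicted data" into a detector is to score each time step by its squared prediction error and flag unusually large scores. The sketch below does this on synthetic data. The mean-plus-`k`-standard-deviations rule, the value `k=3.0`, and the injected spike are all assumptions for illustration, not the procedure used in the talk.

```python
import numpy as np

def anomaly_scores(observed, predicted):
    """Squared prediction error per time step, summed over data streams."""
    residual = np.asarray(observed, dtype=float) - np.asarray(predicted, dtype=float)
    return np.sum(residual ** 2, axis=1)

def flag_anomalies(scores, baseline_scores, k=3.0):
    """Flag steps whose score exceeds baseline mean + k * std (k is an assumed threshold)."""
    limit = np.mean(baseline_scores) + k * np.std(baseline_scores)
    return scores > limit

rng = np.random.default_rng(0)
predicted = np.full((30, 3), 10.0)  # stand-in for model predictions
observed = predicted + rng.normal(0.0, 1.0, size=predicted.shape)
observed[25] += 15.0                # injected spike standing in for an outbreak signal

scores = anomaly_scores(observed, predicted)
flags = flag_anomalies(scores, baseline_scores=scores[:20])
print(np.flatnonzero(flags))
```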

12
Experimental Setup
  • Data
    • OTC drug sales data for 22 categories in 29
      Pittsburgh zip codes over 60 days
    • Track all zip codes for the cough/cold category
      (multi-zipcode data)
    • Track all drug categories for the city of Pittsburgh
      (multi-drug data)
  • Experiments
    • Learn an LDS model using the first 15 days, then
      • Simulate a sequence (qualitative task)
      • Reconstruct state sequence (quantitative task)
      • Predict future occurrences (quantitative task)
  • Algorithms
    • Constraint Generation (our method)
    • LB-1 (state-of-the-art stability algorithm)
    • Least Squares (naïve, no stability guarantees)

Subspace Identification with Guaranteed Stability
Using Constrained Optimization, S. Lacy
and D. Bernstein, ACC 2002
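
For reference, the least-squares baseline can be sketched as an ordinary regression from each state to its successor. The function name `fit_dynamics_least_squares` and the synthetic, noise-free 15-step state sequence are assumptions for illustration. How the state estimates are obtained from the sales data is not covered here.

```python
import numpy as np

def fit_dynamics_least_squares(states):
    """Estimate A minimizing sum_t ||A x_t - x_{t+1}||^2; rows of `states` are x_t."""
    past, future = states[:-1], states[1:]
    # Solve past @ A.T ~= future in the least-squares sense.
    A_T, *_ = np.linalg.lstsq(past, future, rcond=None)
    return A_T.T

# Synthetic state sequence generated from a known matrix.
true_A = np.array([[0.9, 0.2], [-0.2, 0.9]])
states = np.zeros((15, 2))
states[0] = [1.0, 0.0]
for t in range(14):
    states[t + 1] = true_A @ states[t]

A_hat = fit_dynamics_least_squares(states)
print(np.round(A_hat, 3))
```

Nothing in this fit restricts the eigenvalues of the estimate, which is why the slide lists it as having no stability guarantees.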
13
Data Simulations
  • Instability causes Least-Squares simulations to
    diverge
  • Constraint Generation yields most realistic
    simulations that are also stable

14
State Reconstruction
  • Obtained by computing the residual $\sum_t \|A x_t - x_{t+1}\|^2$,
    where $x_t$ are the estimated states
  • Least squares has the best score by definition,
    since it is learned by regression $x_t \to x_{t+1}$, but
    at the cost of instability

Squared error           Multi-drug data   Multi-zip data
Constraint Generation   57,338            26,171
LB-1                    60,669            26,431
Least Squares           56,203            16,918
  • Stable methods trade off reconstruction error vs.
    stability
  • Constraint Generation learns the most accurate
    models that are also stable
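
The reconstruction score on this slide is straightforward to compute once state estimates are available. The sketch below uses random synthetic states and compares the unconstrained least-squares fit against an arbitrary shrunken copy of it. The shrunken matrix is only a stand-in for "some other candidate" and is not one of the stable methods from the talk.

```python
import numpy as np

def reconstruction_error(A, states):
    """Sum over t of ||A x_t - x_{t+1}||^2; rows of `states` are x_t."""
    residual = states[:-1] @ A.T - states[1:]
    return float(np.sum(residual ** 2))

rng = np.random.default_rng(1)
states = rng.normal(size=(40, 3))  # synthetic stand-in for estimated states

# Unconstrained least-squares fit of the dynamics matrix.
A_ls = np.linalg.lstsq(states[:-1], states[1:], rcond=None)[0].T
# Any other matrix, here a shrunken copy, cannot score lower on this metric.
A_other = 0.8 * A_ls

print(reconstruction_error(A_ls, states), reconstruction_error(A_other, states))
```

This is the sense in which least squares wins "by definition": it minimizes exactly the quantity being reported, so any constrained method can only match or exceed its score.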

15
Prediction (preliminary results)
  • Average prediction error obtained by tracking
    (filtering) up to time $t$, then simulating up to
    time $t'$ and calculating the sum of squared error,
    and averaging this over all $t$ and $t' > t$

Avg. squared error (std. dev.)   Multi-drug data   Multi-zip data
Constraint Generation            59,845 (310)      45,465 (317)
LB-1                             53,494 (364)      44,677 (266)
Least Squares                    79,649 (648)      n/a
  • Stable methods yield superior results to least
    squares
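
The sketch below shows one possible reading of the prediction metric: start from the state estimate at time t, roll the model forward without noise, accumulate the squared observation error up to each later time t', and average over all such pairs. It assumes the filtered state estimates are already given, so the filtering step itself is not implemented, and the observation matrix `C` and all data are synthetic assumptions.

```python
import numpy as np

def average_prediction_error(A, C, states, observations):
    """Mean over all t and t' > t of the squared error summed over steps t+1..t'."""
    T = len(observations)
    errors = []
    for t in range(T - 1):
        x = states[t]   # assumed to be the filtered estimate at time t
        running = 0.0
        for t_prime in range(t + 1, T):
            x = A @ x   # simulate one step forward without noise
            running += float(np.sum((C @ x - observations[t_prime]) ** 2))
            errors.append(running)
    return float(np.mean(errors))

# Synthetic, noise-free data generated by the model itself.
A = np.array([[0.9, 0.1], [0.0, 0.8]])
C = np.array([[1.0, 0.0], [1.0, 1.0]])
states = np.zeros((10, 2))
states[0] = [1.0, 2.0]
for t in range(9):
    states[t + 1] = A @ states[t]
observations = states @ C.T

exact = average_prediction_error(A, C, states, observations)
mismatched = average_prediction_error(0.5 * A, C, states, observations)
print(exact, mismatched)
```

With the generating matrix the error is zero up to rounding, while a mismatched dynamics matrix accumulates error as the horizon grows.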

16
Conclusion
  • Linear Dynamical Systems effective at modeling
    multivariate time series data
  • Stability crucial for accurate performance
  • Superior performance of stable methods in
    baseline generation and prediction on OTC data
  • Constraint Generation learns a more accurate
    model with more realistic simulations, most
    efficiently. Further work needed on prediction
    accuracy metric.

A Constraint Generation Approach to Learning
Stable Linear Dynamical Systems, S. Siddiqi, B.
Boots, G. Gordon, NIPS 2007
17
Thank You! Questions?
  • further questions to siddiqi_at_cs.cmu.edu