Dot Plots For Time Series Analysis - PowerPoint PPT Presentation

About This Presentation
Slides: 20
Provided by: drag77
Learn more at: http://alumni.cs.ucr.edu

Transcript and Presenter's Notes



1
Dot Plots For Time Series Analysis
  • Dragomir Yankov, Eamonn Keogh, Stefano Lonardi
  • Dept. of Computer Science & Eng.
  • University of California Riverside
  • Ada Waichee Fu
  • Dept. of Computer Science & Eng.
  • The Chinese University of Hong Kong

2
Sequence analysis with dot plots
  • Introduced by Gibbs & McIntyre (1970)

[Figure: example dot plot of two short nucleotide sequences; axis labels "t a g t a" and "a t g t a g"]
  • Observed patterns
  • Matches (homologies)
  • Reverses
  • Gaps (differences or mutations)
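The classic dot plot for discrete sequences can be sketched in a few lines. This is a minimal illustration, not code from the presentation: the function names `dot_plot` and `render` are hypothetical, and the two strings are simply the axis labels that survive from the slide's figure, with the axis orientation assumed.

```python
def dot_plot(seq_x, seq_y):
    """Return a matrix whose entry [i][j] is True when seq_y[i] == seq_x[j]."""
    return [[a == b for b in seq_x] for a in seq_y]


def render(matrix, seq_x, seq_y):
    """Draw the matrix as text: '*' marks a matching pair, '.' a mismatch."""
    lines = ["  " + " ".join(seq_x)]
    for symbol, row in zip(seq_y, matrix):
        cells = " ".join("*" if hit else "." for hit in row)
        lines.append(symbol + " " + cells)
    return "\n".join(lines)


# Assumed orientation: "atgtag" on the horizontal axis, "tagta" on the vertical.
plot = dot_plot("atgtag", "tagta")
print(render(plot, "atgtag", "tagta"))
```

In the rendered grid, runs of `*` along a diagonal correspond to matches, runs along an anti-diagonal to reverses, and breaks in a diagonal to gaps.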

3
Dot Plots For Time Series Analysis
  • Problem statement: How can we meaningfully adapt
    dot plot (DP) analysis to real-valued data?
  • The DP method would ideally be:
  • Robust to noise
  • Invariant to value and time shifts
  • Invariant to a certain amount of time warping
  • Efficiently computable

4
Related work
Recurrence plots (Eckmann et al., 1987)
  • Provide an intuitive 2D view of multidimensional
    dynamical systems
  • The recurrence matrix is computed with the Heaviside
    step function
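As a rough sketch of how a recurrence matrix is usually built: entry (i, j) is the Heaviside step of eps minus the distance between states i and j, so it is 1 when the two states lie within eps of each other. The threshold name `eps`, the Euclidean norm, and the convention that the step function equals 1 at zero are assumptions made for this example; the slide does not specify them.

```python
def heaviside(x):
    """Heaviside step function; the value at x == 0 is taken as 1 (an assumption)."""
    return 1 if x >= 0 else 0


def recurrence_matrix(points, eps):
    """Entry [i][j] is 1 when states i and j are within eps of each other."""
    def dist(p, q):
        # Euclidean distance between two state vectors (assumed norm).
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    return [[heaviside(eps - dist(p, q)) for q in points] for p in points]


# Four one-dimensional states; a real system would use embedded vectors.
states = [(0.0,), (0.1,), (1.0,), (0.05,)]
R = recurrence_matrix(states, eps=0.2)
for row in R:
    print(row)
```

Each entry compares a single pair of states, which is why the matches in a recurrence plot are point-based.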

Problem with recurrence plots:
Matches are local (point-based) rather than
subsequence-based
5
The proposed solution
  • Reducing the dot plot procedure to the motif-finding
    problem
  • Applying the Random Projection algorithm for
    finding motifs in time series data (Chiu et al.,
    2003)

It satisfies the initial requirements of
robustness to outliers and invariance to time and
value shifts
  • Presegmenting the series to achieve time warping
    invariance

6
Dot plots and motif finding
  • Definitions: match, trivial match, motif

- D(P,Q) < R: we say that Q is a match of P
- D(P,Q) < R, D(P,Q1) < R: we say that Q1 is
a trivial match of P
- A non-trivial match is a motif
  • Definition (time series dot plot): a plot that
    contains a point at position (i,j) iff TS1(i) and
    TS2(j) represent the same motif
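To make the definition concrete, here is a brute-force sketch that places a point at (i, j) whenever the subsequences starting at TS1(i) and TS2(j) are within distance R. It is not the method proposed in the presentation, which uses Random Projection to avoid the quadratic comparison. The Euclidean distance, the z-normalization, the window length `w`, and the function names are all assumptions, and trivial matches are not filtered out.

```python
import math


def znorm(seq):
    """Rescale a subsequence to zero mean and unit variance (assumed preprocessing)."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((v - mean) ** 2 for v in seq) / len(seq))
    if std == 0:
        return [0.0] * len(seq)
    return [(v - mean) / std for v in seq]


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def brute_force_dot_plot(ts1, ts2, w, R):
    """Return the set of (i, j) whose length-w subsequences satisfy D(P, Q) < R."""
    subs1 = [znorm(ts1[i:i + w]) for i in range(len(ts1) - w + 1)]
    subs2 = [znorm(ts2[j:j + w]) for j in range(len(ts2) - w + 1)]
    return {(i, j)
            for i, p in enumerate(subs1)
            for j, q in enumerate(subs2)
            if euclidean(p, q) < R}


# The bump at the start of ts1 reappears in ts2, shifted in time and in value.
ts1 = [0, 1, 2, 1, 0, 0, 0, 0]
ts2 = [5, 5, 5, 10, 11, 12, 11, 10]
points = brute_force_dot_plot(ts1, ts2, w=5, R=0.5)
print(sorted(points))
```

Because every subsequence is normalized before comparison, the bump is matched even though its absolute level differs between the two series.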

7
The Random Projection algorithm
  • Based on PROJECTION (Buhler & Tompa, 2002)
  • Algorithm outline:
  • Split the TS into subsequences and symbolize them
  • Separate the symbolic sequences into equivalence
    classes using PROJECTION
  • Mark sequences from the same equivalence class as
    motifs

8
Random Projection: symbolization
Utilizes the Symbolic Aggregate Approximation
(SAX) scheme
  • Applies PAA (Piecewise Aggregate Approximation)
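The symbolization step can be sketched as follows: z-normalize the subsequence, reduce it with PAA to a few frame averages, then map each average to a letter using breakpoints that cut the standard normal distribution into equiprobable regions. This is a simplified sketch of the general SAX idea. The function names, the four-letter alphabet, and the requirement that the length be divisible by the number of frames are assumptions of this example.

```python
import math
from statistics import NormalDist


def znorm(seq):
    """Rescale to zero mean and unit variance."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((v - mean) ** 2 for v in seq) / len(seq))
    if std == 0:
        return [0.0] * len(seq)
    return [(v - mean) / std for v in seq]


def paa(seq, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-length frame."""
    if len(seq) % n_segments != 0:
        raise ValueError("this sketch needs len(seq) divisible by n_segments")
    frame = len(seq) // n_segments
    return [sum(seq[k * frame:(k + 1) * frame]) / frame
            for k in range(n_segments)]


def sax_word(seq, n_segments, alphabet="abcd"):
    """Map a subsequence to a SAX-style word over the given alphabet."""
    a = len(alphabet)
    # Breakpoints split the standard normal into `a` equiprobable regions.
    breakpoints = [NormalDist().inv_cdf(k / a) for k in range(1, a)]
    word = ""
    for value in paa(znorm(seq), n_segments):
        index = sum(value > b for b in breakpoints)
        word += alphabet[index]
    return word


print(sax_word([1, 2, 3, 4, 5, 6, 7, 8], n_segments=4))
print(sax_word([11, 12, 13, 14, 15, 16, 17, 18], n_segments=4))
```

Both calls yield the same word, since the normalization removes the value shift between the two inputs.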

9
Random Projection: motif finding
  • d random dimensions are masked and the strings
    are divided into separate bins

- The symbolic representations of the plotted
time series are stored in tables
10
Random Projection: motif finding
- Updating the dot plot collision matrix
- The update is performed for m iterations.
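A minimal sketch of the collision-matrix update, assuming each subsequence has already been turned into a fixed-length symbolic word. In every iteration, d randomly chosen positions are masked, words that agree on the remaining positions fall into the same bin, and the count for each colliding pair (i, j) is incremented. The function name, the dictionary-based sparse matrix, and the fixed random seed are choices made for this illustration, not details taken from the slides.

```python
import random
from collections import defaultdict


def collision_matrix(words1, words2, d, m, seed=0):
    """Count, over m iterations, how often words1[i] and words2[j] share a bin."""
    rng = random.Random(seed)
    length = len(words1[0])          # all words are assumed to have equal length
    counts = defaultdict(int)        # sparse matrix: only colliding pairs are stored
    for _ in range(m):
        masked = set(rng.sample(range(length), d))
        keep = [k for k in range(length) if k not in masked]
        # Bin the second series by its unmasked symbols.
        bins = defaultdict(list)
        for j, word in enumerate(words2):
            bins[tuple(word[k] for k in keep)].append(j)
        # Every word of the first series collides with its whole bin.
        for i, word in enumerate(words1):
            for j in bins.get(tuple(word[k] for k in keep), ()):
                counts[(i, j)] += 1
    return counts


words1 = ["abca", "bbcd"]
words2 = ["abcb", "dddd", "abca"]
counts = collision_matrix(words1, words2, d=1, m=10)
print(dict(counts))
```

Identical words collide in every iteration, words that differ in more than d positions never collide, and near matches collide only when the differing position happens to be masked. One plausible way to draw the plot is then to place a dot at every (i, j) whose count exceeds a chosen threshold.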
11
Random Projection for streaming
  • Complexity: space O(M), time O(mM)
  • For practical data sets M is very sparse
  • For time series data, small values of m (on the
    order of 10) generate highly descriptive plots
  • Random Projection as an online algorithm:
  • Good time performance
  • Updatability

12
Experimental evaluation
Dot Plots for anomaly detection
  • Recurrent data with variable state length
  • The anomaly is of the same type (A)
  • Small time warpings (shifts) are detected (B)
  • Larger time warpings are omitted (C)

13
Experimental evaluation
Dot Plots for anomaly detection
Recurrent data with fixed state length
14
Experimental evaluation
Dot Plots for pattern detection
Stock market data
15
Experimental evaluation
Dot Plots for pattern detection
Audio data
16
Experimental evaluation
Dot Plots for pattern detection
[Figure panels: MUMmer, Random Projection]
Discrete data: for some tasks, obtaining a real-valued
representation is beneficial
17
Dynamic sliding window
  • The fixed window does not perform well when:
  • The size of the recurrent states varies
  • We do not guess the size of the states correctly
  • Solution: use time series segmentation heuristics
    and a dynamic sliding window

18
Dynamic sliding window
Comparison of the dynamic and fixed sliding windows
[Figure panels: tide data set, synthetic data set]
The dynamic sliding window preserves more information
about the frequency variability
19
Conclusion
  • This work studies the problem of building dot
    plots for real-valued time series data
  • It demonstrates the equivalence of this problem to
    the motif-finding problem
  • It introduces an efficient and robust approach
    for building the dot plots
  • The performance of the tool is evaluated
    empirically on a number of data sets with
    different characteristics
  • Finally, a dynamic sliding window technique is
    proposed, which improves the quality and the
    descriptiveness of the plots