Title: The Evolution of Spatial Outlier Detection Algorithms: An Analysis of Design
1The Evolution of Spatial Outlier Detection Algorithms: An Analysis of Design
- CSci 8715 Spatial Databases
- Ryan Stello, Kriti Mehra
2Outline
- Background of the project
- Problem Statement
- Review of issues in traditional outlier detection
- Spatial Outlier Detection
- Outlier Detection in Spatio-Temporal Data
- Issues with modeling spatio-temporal data
- Solutions proposed
- Techniques to model spatio-temporal data
- Issues with detecting outliers in these models
- Summary
- Suggestions for Future Work
3Background of the project
- Survey of papers
- Focus is on Spatio-Temporal Outlier Detection
- Classification is done with a view to understanding outlier detection in the Spatio-Temporal domain
4Problem Statement
- Given: Techniques for outlier detection in the traditional, spatial and spatio-temporal domains
- Find: A classification of spatial outlier detection algorithms that highlights shortcomings as problem complexity increases to the spatio-temporal domain
- Objective: To inform the reader of the complexity of spatial outlier detection and motivate further efforts
- Constraints: Non-exhaustive
5Review of issues in traditional outlier techniques
- Low Dimensional spatial outlier detection
- Restricted so that imposing a grid is easy
- Distributive
- Normalizes the whole data and pulls out an outlier
- Average of values is a typical metric
- Median could be used
- Iterative
- Apply to each neighborhood
- High-Dimensional Spatial Outlier Detection
- Statistical: Applied to the entire data and hence fails
- Space Reduction
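A minimal sketch of the distributive idea above: the whole data set is normalized once and values far from the center are pulled out. The function names, the sample values and the thresholds are illustrative assumptions, not taken from the surveyed papers. The median variant uses the median absolute deviation (MAD) as its spread, which is one common choice when "median could be used".

```python
import statistics

def mean_outliers(values, threshold=3.0):
    """Indices of values far from the global mean, in standard deviations."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]

def median_outliers(values, threshold=3.0):
    """Indices of values far from the global median, in scaled MAD units."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
    return [i for i, v in enumerate(values) if abs(v - med) / scale > threshold]

# Hypothetical readings: one value (50) is clearly unlike the rest.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
print(mean_outliers(readings))    # []  (the outlier inflates mean and std)
print(median_outliers(readings))  # [7]
```

In this small sample the extreme value distorts the average enough to hide itself at a threshold of 3, while the median-based version still flags it.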
6Spatial Outlier Detection
- Spatial Data
- Define a region
- Define proximity relationship
- Only when region and proximity relationship are defined can the concept of spatial outlier be defined
- Various methods have been devised to define region and proximity relationship.
- Which one should be applied to our application?
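To make the dependence on region and proximity concrete, here is a hedged sketch. The region is taken to be the full set of sample points and the proximity relationship is taken to be "the k nearest points by Euclidean distance". Both are assumptions chosen for illustration, and a different choice would flag different points. Each value is compared with the average of its neighbors, and the differences are then standardized.

```python
import math
import statistics

def k_nearest(points, i, k):
    """Indices of the k points closest to points[i] (Euclidean proximity)."""
    others = [j for j in range(len(points)) if j != i]
    others.sort(key=lambda j: math.dist(points[i], points[j]))
    return others[:k]

def spatial_outliers(points, values, k=2, threshold=1.5):
    """Indices whose value differs unusually from the mean of its neighbors."""
    diffs = []
    for i in range(len(points)):
        neighbor_mean = statistics.fmean([values[j] for j in k_nearest(points, i, k)])
        diffs.append(values[i] - neighbor_mean)
    mu = statistics.fmean(diffs)
    sigma = statistics.pstdev(diffs)
    if sigma == 0:
        return []
    return [i for i, d in enumerate(diffs) if abs(d - mu) / sigma > threshold]

# Hypothetical transect: values rise steadily except at position 4.
points = [(x, 0) for x in range(8)]
values = [10, 20, 30, 40, 15, 60, 70, 80]
print(spatial_outliers(points, values))  # [4]
```

The value 15 is unremarkable globally (the data range from 10 to 80) but stands out against its spatial neighbors, which is what distinguishes a spatial outlier from a traditional one.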
7Spatio-Temporal Data
- Increased Complexity due to
- Issues with spatial outlier detection already exist
- A new attribute has to be considered: Time
- High dimensionality
- Scenarios
- Car making a sharp turn
- Movement of a cluster of stars
- Pollution of a lake due to industrial dump
- Global warming
- Should the definition of region be the same?
- Should the definition of proximity relationship be the same?
- Would the data model used in these scenarios be the same?
8Modeling Spatio-Temporal Data: Issues and Solutions
9Issues
- Based on the snapshot phenomenon: GIS applications take photographs of a region periodically.
- Difficult to determine whether two mobile systems interacted between snapshots
- Drawback: Data-oriented, data can be recorded only at fixed intervals of time.
- F1, F2: Initial and final position of a flock of sheep
- I1, I2: Initial and final position of rain clouds
- Did the flock of sheep get wet?
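The sheep-and-clouds question can be illustrated with a small sketch. It assumes both objects moved in a straight line at constant speed between the two snapshots. That assumption is exactly what the snapshot model cannot verify, so the result is a guess about the unobserved interval, not a measurement. The coordinates and the function name are hypothetical.

```python
import math

def closest_approach(a1, a2, b1, b2):
    """Closest approach of two objects moving linearly from a1 to a2 and
    from b1 to b2 over the same interval.
    Returns (distance, fraction of the interval at which it occurs)."""
    # Relative position at the first snapshot.
    rx, ry = a1[0] - b1[0], a1[1] - b1[1]
    # Relative displacement over the whole interval.
    vx = (a2[0] - a1[0]) - (b2[0] - b1[0])
    vy = (a2[1] - a1[1]) - (b2[1] - b1[1])
    vv = vx * vx + vy * vy
    if vv == 0:
        s = 0.0  # no relative motion, the distance never changes
    else:
        # Minimizer of |r + s*v|, clamped to the observed interval [0, 1].
        s = max(0.0, min(1.0, -(rx * vx + ry * vy) / vv))
    return math.hypot(rx + s * vx, ry + s * vy), s

# Hypothetical snapshot positions for the flock (F) and the clouds (I).
F1, F2 = (0.0, 0.0), (10.0, 0.0)
I1, I2 = (5.0, 5.0), (5.0, -5.0)

print(math.dist(F1, I1), math.dist(F2, I2))  # about 7.07 apart in both snapshots
print(closest_approach(F1, F2, I1, I2))      # (0.0, 0.5)
```

Both snapshots show the flock and the clouds well apart, yet under the linear-motion assumption they coincide halfway through the interval. The recorded data alone cannot distinguish this from a path on which they never met.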
10Solution Proposed
- Transition from data-based modeling to representation-based modeling
- Representation of the data is required to incorporate spatial and temporal aspects
11Spatio-Temporal Representations
- Neighborhood-Based
- Time Series Matching
12Neighborhood-Based
- Determine the neighborhood of object
- Merge neighborhoods sharing edges based on common concept
- Process
- 1. Create Micro Neighborhood based on immediate spatial neighborhood to obtain a Voronoi Polygon
- The Voronoi Diagram of a set of objects O is the subdivision of the plane into n polygons, with the property that a point q lies in the polygon corresponding to an object $o_i$ iff $\mathrm{dist}(q, o_i) < \mathrm{dist}(q, o_j)$ for each $o_j \in O$ with $j \neq i$
- 2. Create Macro Neighborhood by merging micro neighborhoods that share an edge.
- 3. Detect outliers based on how much the value differs from a threshold.
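The three steps above can be sketched as follows, under several stated assumptions. Voronoi cells are approximated by rasterizing a square and assigning every grid point to its nearest object. The "common concept" is a hypothetical label attached to each object. An outlier is taken to be a value that differs from the median of its macro neighborhood by more than a threshold. None of these details are prescribed by the slides, and a real implementation would likely use a computational geometry library for exact cells.

```python
import math
import statistics

def nearest(q, sites):
    """Index of the object whose Voronoi cell contains q (nearest object wins)."""
    return min(range(len(sites)), key=lambda i: math.dist(q, sites[i]))

def adjacent_pairs(sites, size, step=0.25):
    """Approximate which micro neighborhoods share an edge by rasterizing
    [0, size] x [0, size] and comparing horizontally/vertically adjacent cells."""
    n = int(size / step) + 1
    owner = [[nearest((x * step, y * step), sites) for x in range(n)]
             for y in range(n)]
    pairs = set()
    for y in range(n):
        for x in range(n):
            for dx, dy in ((1, 0), (0, 1)):
                if x + dx < n and y + dy < n:
                    a, b = owner[y][x], owner[y + dy][x + dx]
                    if a != b:
                        pairs.add((min(a, b), max(a, b)))
    return pairs

def macro_neighborhoods(sites, concepts, size):
    """Merge edge-sharing micro neighborhoods that carry the same concept."""
    parent = list(range(len(sites)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for a, b in adjacent_pairs(sites, size):
        if concepts[a] == concepts[b]:
            parent[find(a)] = find(b)
    groups = {}
    for i in range(len(sites)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def neighborhood_outliers(groups, values, threshold):
    """Objects whose value is more than `threshold` from their group median."""
    flagged = []
    for group in groups:
        center = statistics.median([values[i] for i in group])
        flagged += [i for i in group if abs(values[i] - center) > threshold]
    return sorted(flagged)

# Hypothetical sensors along a shoreline, with a concept label and a reading.
sites = [(1, 4), (3, 4), (5, 4), (7, 4), (9, 4), (11, 4)]
concepts = ["lake", "lake", "lake", "city", "city", "city"]
values = [5.0, 5.2, 9.0, 20.0, 20.5, 19.8]

groups = macro_neighborhoods(sites, concepts, size=12)
print(groups)                                      # [[0, 1, 2], [3, 4, 5]]
print(neighborhood_outliers(groups, values, 2.0))  # [2]
```

The reading 9.0 is not extreme across all sensors, but it is unusual within the "lake" macro neighborhood, so it is the one flagged.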
13Extension to the concept
- Is the temporal aspect embedded in the semantic process?
- More a spatial than a spatio-temporal representation
- Extension: e.g., cars' neighborhoods overlap
14Outlier Detection in Mobile Objects
- Data for mobile objects contains a large number of outliers
- Metric-based outlier detection is not effective
- Non-metric distance-based functions: Similarity Based
- Time series compared against a known/expected time series
- This method has complexity due to difficulty in
- Determining the expected time series
- What is the acceptable tolerance for imprecise matches?
- How much noise is acceptable?
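One similarity function of this kind is the longest common subsequence (LCSS) with a matching tolerance. The slides do not name a specific function, so choosing LCSS here is an assumption. In the sketch below, `eps` plays the role of the "acceptable tolerance for imprecise matches", and samples that match nothing are simply skipped, which is how noise is absorbed.

```python
def lcss_similarity(observed, expected, eps):
    """Length of the longest common subsequence of two numeric series,
    where two samples match if they differ by at most eps.
    Normalized by the shorter series length, so the result lies in [0, 1]."""
    n, m = len(observed), len(expected)
    if n == 0 or m == 0:
        return 0.0
    # table[i][j] = LCSS length of observed[:i] and expected[:j]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(observed[i - 1] - expected[j - 1]) <= eps:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m] / min(n, m)

# Hypothetical expected series and an observation with one noisy sample (9.0).
expected = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
observed = [0.1, 1.0, 9.0, 3.1, 4.0, 5.2]

print(lcss_similarity(observed, expected, eps=0.25))  # 5 of 6 samples match
print(lcss_similarity(observed, expected, eps=0.05))  # 2 of 6 samples match
```

The same observation looks normal or anomalous depending on `eps`, which illustrates why choosing the tolerance and the expected series is the hard part of this approach.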
15Example of Spatio-temporal analysis of mobile object
- Normal behavior
- Representation of normal behavior of a car would require defining possibilities (variations caused by taking an exit and lane changing)
- Precision
- frequency matching
- deviance from norm
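A rough sketch of "deviance from norm" when normal behavior has several acceptable variations. Each variation is represented by a template of lateral offsets (in meters) sampled at equal time steps, and an observation is flagged only if it is far from every template. The templates, the tolerance and the root-mean-square measure are all illustrative assumptions.

```python
import math

def rms_deviation(observed, template):
    """Root-mean-square difference between two equally sampled series."""
    if len(observed) != len(template):
        raise ValueError("series must have the same number of samples")
    total = sum((o - t) ** 2 for o, t in zip(observed, template))
    return math.sqrt(total / len(observed))

def closest_normal(observed, normal_templates, tolerance):
    """Return (best matching variation, its deviation, is_outlier)."""
    name, dev = min(
        ((n, rms_deviation(observed, t)) for n, t in normal_templates.items()),
        key=lambda pair: pair[1],
    )
    return name, dev, dev > tolerance

# Hypothetical variations of normal driving (lateral offset per time step).
normal_templates = {
    "keep_lane":   [0.0, 0.0, 0.0, 0.0, 0.0],
    "lane_change": [0.0, 1.0, 2.5, 3.5, 3.5],
    "take_exit":   [0.0, 2.0, 5.0, 9.0, 14.0],
}

smooth = [0.1, 0.9, 2.4, 3.6, 3.5]   # resembles a lane change
erratic = [0.0, 0.0, 6.0, -4.0, 8.0] # resembles none of the variations

print(closest_normal(smooth, normal_templates, tolerance=0.5))
print(closest_normal(erratic, normal_templates, tolerance=0.5))
```

The first observation is matched to the lane-change variation and accepted. The second is far from all three templates and is reported as an outlier.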
16Summary
17Future Work
- Past helps to analyze cause of events
- Food for thought
- Using spatio-temporal outlier detection to
predict the future is more relevant than using it
to analyze the past