Dimensionality Reduction in Sensor Networks - PowerPoint PPT Presentation

1
Dimensionality Reduction in Sensor Networks
  • Alfred O. Hero
  • Dept. EECS, Dept. BME, Dept. Statistics
  • University of Michigan - Ann Arbor
    hero_at_eecs.umich.edu
  • http://www.eecs.umich.edu/hero

Boston University, May 2006
  • Sensor net applications
  • The importance of dimensionality reduction
  • Sensor localization via DR
  • Distributed Weighted Multi-dimensional scaling
  • Laplacian Eigenmaps Adaptive Neighbors
  • Anomaly detection via DR
  • Conclusions

2
Acknowledgements
  • Sensor Net collaborators
  • (PG) Randy Moses, Rob Nowak, Raviv Raich, Neal
    Patwari, Jose Costa, Doron Blatt
  • (G) Kevin Carter, Clyde Shih, Derek Justice
  • (UG) Adam Pacholski, Jionglin Wu
  • (K12) Panna Felsen, Abiola Adatero
  • Sensor net sponsors
  • NSF ITR program (J. Cozzens)
  • DARPA ISP program (D. Cochran, C. Schwartz)
  • AFOSR MURI program (J. Tagney)
  • ARL (B. Sadler)
  • Motorola (J. Correal)
  • Raytheon (H. Schmitt)

3
Sensor Network Applications
Environmental monitoring and localization
Internet monitoring and anomaly detection
Internally sensed tomography and endpoint
estimation
Intruder detection and surveillance
Multiple source tracking with sensor swarms
4
Dimensionality Bottlenecks
  • Data dimension
  • Sensor response variables Y
  • 1,000,000 samples of an EM/acoustic field on each
    of N sensors
  • 1024^2 pixels of a projected image on an IR camera
    sensor
  • N^2 expansion factor to account for all pairwise
    correlations
  • Latent variables S
  • 250 targets with 6-dimensional states, each with
    10 possible labels
  • 1024^3 image volume
  • 1000 behavior patterns
  • Information dimension
  • Number of free parameters describing probability
    densities f(Y) or f(S|Y)
  • For known statistical model: info dim = model dim
  • For unknown model: info dim = dim of density
    approximation
  • Parametric-model driven dimension reduction
  • DR by sufficiency, DR by maximum likelihood, DR
    by ancillarity
  • Data-driven dimension reduction
  • Manifold learning, structure discovery

5
Two Geometries to Consider
Manifold Embedding
  • (Metric) data geometry
  • (Non-metric) information geometry

Figure: manifold domain; the observations are i.i.d.
samples from a distribution supported on it
6
Data-driven DR
  • Data-driven projection to lower dimensional
    subspace
  • Extract low-dim structure from high-dim data
  • Data may lie on curved (but locally linear)
    subspace
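
The simplest instance of data-driven projection is the linear case. The sketch below is an illustration, not taken from the slides: it uses PCA via the SVD to project synthetic 10-dimensional data that lie near a 2-dimensional linear subspace. The function name `pca_project` and the synthetic data are assumptions; curved manifolds need nonlinear methods such as Isomap or Laplacian eigenmaps.

```python
import numpy as np

def pca_project(Y, d):
    """Project the rows of Y onto the top-d principal directions.

    Returns the low-dimensional coordinates and the fraction of the
    total variance they retain.
    """
    Yc = Y - Y.mean(axis=0)                      # center the data
    _, s, Vt = np.linalg.svd(Yc, full_matrices=False)
    coords = Yc @ Vt[:d].T                       # coordinates in the subspace
    explained = (s[:d] ** 2).sum() / (s ** 2).sum()
    return coords, explained

# Synthetic example (assumed, for illustration only): 2 latent
# variables mixed linearly into 10 observed variables plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 10))
Y = latent @ mixing + 0.01 * rng.normal(size=(500, 10))

coords, explained = pca_project(Y, d=2)
print(coords.shape, round(explained, 4))
```

Because the data are generated from two latent variables, two principal directions retain almost all of the variance here.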

[1] Joshua B. Tenenbaum, Vin de Silva, and John C.
Langford, "A Global Geometric Framework for
Nonlinear Dimensionality Reduction," Science, 22
Dec. 2000.
[2] Jose Costa, Neal Patwari and Alfred O. Hero,
"Distributed Weighted Multidimensional Scaling for
Node Localization in Sensor Networks," IEEE/ACM
Trans. Sensor Networks, to appear 2005.
[3] Misha Belkin and Partha Niyogi, "Laplacian
eigenmaps for dimensionality reduction and data
representation," Neural Computation, 2003.
7
Application: Cooperative Localization
  • Use measurements made between pairs of
    unknown-location devices to self localize
  • Time-of-Arrival (TOA)
  • Received Signal Strength (RSS)
  • Connectivity (Proximity)
  • Quantized RSS (QRSS)
  • Angle-of-Arrival (AOA)
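
To illustrate how pairwise measurements alone can yield a map, here is a minimal sketch of classical (unweighted, centralized) MDS applied to noise-free inter-device distances, as might be derived from TOA. It is not the dwMDS algorithm discussed in this talk; the function name `classical_mds` and the random device layout are assumptions. The recovered coordinates match the true ones only up to rotation, reflection, and translation, which is why anchor nodes with known positions are needed for absolute localization.

```python
import numpy as np

def classical_mds(D, dim=2):
    """Recover relative coordinates from a matrix of pairwise distances."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    w, V = np.linalg.eigh(B)                     # eigenvalues in ascending order
    top = np.argsort(w)[::-1][:dim]              # indices of the largest ones
    return V[:, top] * np.sqrt(np.maximum(w[top], 0.0))

# Hypothetical layout: 8 devices in a 10 m x 10 m area.
rng = np.random.default_rng(1)
true_pos = rng.uniform(0, 10, size=(8, 2))
D = np.linalg.norm(true_pos[:, None, :] - true_pos[None, :, :], axis=-1)

est = classical_mds(D)
D_est = np.linalg.norm(est[:, None, :] - est[None, :, :], axis=-1)
print(np.abs(D - D_est).max())  # distances are reproduced up to round-off
```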

8
Manifold Learning for Localization
See [4], [5], [6].
[4] Y. Shang, W. Ruml, Y. Zhang, M.P.J. Fromherz,
"Localization from mere connectivity," in MobiHoc
'03, June 2003, pp. 201-212.
[5] N. Patwari, A.O. Hero III, "Adaptive
neighborhoods for manifold learning-based sensor
localization," IEEE SPAWC 2005, June 2005.
[6] J. Costa, N. Patwari, A.O. Hero III,
"Distributed Weighted Multidimensional Scaling for
Node Localization in Sensor Networks," IEEE/ACM
Trans. Sensor Networks, (submitted) June 2004.
9
Iterative self-localization algorithm
10
dwMDS: RSS measurements
  • When initialized with an NN oracle, dwMDS is
    unbiased and comes close to the Cramér-Rao bound
    (CRB)
  • Without the oracle, NNs are estimated by in-range
    neighbors
  • First-stage dwMDS location estimates have high
    bias
  • Two-stage dwMDS attains performance similar to
    single-stage dwMDS with the NN oracle
11
LEAN: Connectivity
See [7].
[7] Y. Shang, W. Ruml, Y. Zhang, M.P.J. Fromherz,
"Localization from mere connectivity," in MobiHoc
'03, June 2003, pp. 201-212.
12
Application: Internet anomaly detection
  • Measurements: distribution of traffic (5 min)
  • From each sensor (router) in space and time

Figure: Abilene Network, 11 routers, backbone of
US .edu / research network
Figure labels: Destination IP d, Port p, Source IP s
  • Related Work
  • Subspace-based decomposition [8]

[8] A. Lakhina, M. Crovella, C. Diot, "Mining
Anomalies Using Traffic Feature Distributions,"
ACM SIGCOMM 2005, Aug. 2005.
13
Internet anomaly detection: background
  • Anomalies: worm outbreaks, DoS attacks,
    intrusion activity (scans)
  • Monitor: collect data from sensors (routers) in
    space and time
  • Hypothesis: anomalies will change the distribution
    of traffic across sensors
  • Distribution: traffic by src/dst port, IP
    addresses, packet sizes, etc.
  • Problem: how to find anomalous relationships
    across space and time?

[9] N. Patwari, A. O. Hero, A. Pacholski,
"Manifold Learning Visualization of Network
Traffic Data," ACM Wksp on Mining Net. Data
(MineNet05), Aug. 2005.
14

Spatial degrees of freedom
  • Spatio-temporal measurement vector

15
Intrinsic dimension estimation
  • Scree plots
  • Plot residual fitting errors of
  • SVD, Isomap, LE, LLE
  • Kolmogorov/Entropy/Correlation dimension
  • Box counting, sphere packing (Liebovitch and
    Toth, 1989)
  • Maximum likelihood
  • Poisson approximation to Binomial
    (Levina and Bickel, 2004)
  • Entropic graphs
  • Spanner-graph length approximation to entropy
    functional (Costa and Hero, 2003)

Figure: ISOMAP residual curve, with a possible knee marked ("Knee?")
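
As an illustration of the maximum likelihood approach, the sketch below implements the Levina and Bickel estimator in its basic form: at each point, the local estimate is the inverse of the average of log(T_k / T_j) over j = 1..k-1, where T_j is the distance to the j-th nearest neighbor, and the local estimates are then averaged. The function name, the choice k = 20, and the synthetic data (a 2-dimensional square rotated into 5 dimensions) are assumptions for the example; the brute-force distance computation is only suitable for small data sets.

```python
import numpy as np

def mle_intrinsic_dim(X, k=20):
    """Levina-Bickel maximum likelihood estimate of intrinsic dimension."""
    # Brute-force pairwise Euclidean distances, sorted along each row.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    D.sort(axis=1)
    T = D[:, 1:k + 1]                       # column 0 is the point itself
    logs = np.log(T[:, -1:] / T[:, :-1])    # log(T_k / T_j), j = 1..k-1
    local = (k - 1) / logs.sum(axis=1)      # per-point dimension estimate
    return local.mean()

# Synthetic data (assumed): uniform samples on a 2-D square, embedded
# isometrically in 5-D by a random orthogonal rotation.
rng = np.random.default_rng(3)
square = rng.uniform(size=(600, 2))
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))
X = np.hstack([square, np.zeros((600, 3))]) @ Q

dim_hat = mle_intrinsic_dim(X, k=20)
print(dim_hat)  # expected to be near 2 for this data
```

The estimate depends on k and on boundary effects, so in practice it is usually examined over a range of neighborhood sizes.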
16
Intrinsic Dimension Estimation
  • Lakhina, Crovella, Diot: subspace-based
    detection of traffic anomalies [8]
  • Intrinsic dim. estimation via kNN entropic graphs
    [10]

Figure: Data set of 7, total packets by link,
has dimension between 4 and 5
[10] J.A. Costa, A.O. Hero, "Geodesic Entropic
Graphs for Dimension and Entropy Estimation in
Manifold Learning," IEEE Trans. on Signal
Processing, vol. 52, no. 8, pp. 2210-2221,
August 2004.
17
Dimension-based Anomaly detection
  • The k-NN algorithm is more sensitive to small
    complexity changes than the Maximum Likelihood
    algorithm [11]

[11] E. Levina and P. Bickel, "Maximum likelihood
estimation of intrinsic dimension," Neural
Information Processing Systems (NIPS), Vancouver,
Canada, Dec. 2004.
18
Clustering router flows: spatial
  • Sensors at routers measure flows per source IP
    address
  • 07-Jan-2005 during 15:45-15:50 UTC
  • Packets are sampled 1/100
  • Last 11 bits zeroed for privacy -> data are
    2^21-length (sparse) vectors
  • WASH measures
  • NYCM measures
  • ATLA measures

19
Dynamic Sensor Maps
  • Typical router map, 18-Jan, 17:00 UTC
  • Sensors (routers) as positioned by dwMDS
  • Coordinates are normalized (flows) so are unitless
  • Lines show physical Abilene links
  • Small dots (- - -) show distance from 4-week mean
    coord

20
Maps Respond to Anomalous Traffic
  • Wed. 19-Jan 2005, 0:00-1:00 UTC
  • At 0:30, 0:35: large network scan
  • 22,000 anomalous flows observed at STTL, DNVR,
    KSCY, IPLS, ATLA
  • 60-byte, TCP
  • From a few Miss. State U. IPs, Src Port < 1024
  • To range of Microsoft IPs, Dest Port 113

21
Pure Time Series: Small Change
  • Abilene Backbone Total Flows, by router
  • 18-19 Jan

Network Scan
22
Anomaly Detection Algorithm
  • Multivariate t-test comparing the current coords
    to a history of coordinates
  • Declare alarm when t-value exceeds threshold
  • E.g., 18-19 Jan. 2005
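
One common form of a multivariate t-test is Hotelling's T-squared statistic, sketched below as an assumption about what such a detector could look like; the slides do not specify the exact statistic. The function name, the synthetic history, and the threshold value of 15 are all hypothetical. In practice the threshold would be set from the desired false alarm rate.

```python
import numpy as np

def hotelling_t2(history, current):
    """Hotelling T^2 of one new observation against a history sample.

    history: (n, p) array of past coordinates; current: length-p vector.
    """
    history = np.asarray(history, dtype=float)
    n = history.shape[0]
    mean = history.mean(axis=0)
    cov = np.cov(history, rowvar=False)          # sample covariance (p x p)
    diff = np.asarray(current, dtype=float) - mean
    # n / (n + 1) accounts for the uncertainty in the estimated mean.
    return n / (n + 1) * diff @ np.linalg.solve(cov, diff)

# Synthetic history of 2-D map coordinates (assumed for illustration).
rng = np.random.default_rng(2)
history = rng.normal(size=(200, 2))

THRESHOLD = 15.0                                  # hypothetical alarm level
t2_normal = hotelling_t2(history, [0.1, -0.2])
t2_anomaly = hotelling_t2(history, [6.0, 6.0])
print(t2_normal > THRESHOLD, t2_anomaly > THRESHOLD)
```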

Figure labels: Network Scan; (2) 45k-flow port scan
from .tw to .dk; (3) 46k-flow port scan from .tw
to .pl
23
Clustering router flows: temporal
  • Before: sensors on all routers, for one 5-min
    interval
  • Now: sensor on one router, for each 5-min
    interval, over 24 hours
  • How does traffic distribution change over time?
  • Flows by source IP
  • During 2-Jan-05
  • Using Isomap
  • Credit: Jionglin Wu

24
Application: Wireless Motion Detection Sensor Net
  • Hypothesis: movement changes RF propagation
    channel
  • Issues: low SNR, missing data, antennas move in
    wind
  • Normal variations: battery powers, frequency
    hopping
  • Method: use N sensors to measure O(N^2) channels
  • Experiment: gridded SN on unmowed grass, N = 15
  • H0: No motion in deployment area
  • H1: Person walks through/around deployment area

Pictures: Deployment; Test in Motion condition
Picture: Crossbow mica2 sensors
25
Geometric Entropy Minimization
  • Minimum-entropy-sets [14] and anomaly detection
  • Equivalent to level-set and minimum-volume-set
    tests [15] for Lebesgue densities
  • UMP for testing composite hypotheses

[14] A. Hero and O. Michel, "Asymp. theory of
minimal k-point random graphs," IEEE Trans. IT,
1999.
[15] C. Scott and R. Nowak, "Learning minimum
volume sets," JMLR, 2006.
26
GEM vs. UMP test
27
Geometric Entropy Minimization (GEM)
  • GEM learns minimum volume set of given
    probability [16]
  • Sliding window draws latest 100 samples from H0
  • Level of significance 0.001

Figure: score (1-p) versus sample number (2 sps),
with anomalies from GEM and ground truth of motion
marked, for the In Motion and No Motion conditions
[16] A. Hero, N. Patwari, J. Costa, "GEM for
non-parametric anomaly detection," NIPS-06.
28
Conclusions
  • Any modeling of sensor data produces DR
  • Data-driven DR can be useful
  • Estimator of required dimension is essential
  • Distributed DR is feasible
  • Other directions
  • Blind calibration: calibration-while-track
  • Folding in semi-parametric likelihood models
  • Accounting for energy/bandwidth/throughput
    constraints
  • Resource allocation and sensor management: POMDP,
    RL, CROPS