Applications of Active Learning in Sensor Network

Provided by: tyy

1
Applications of Active Learning in Sensor Network
  • 11-785 Active Learning Seminar
  • November 13, 2008

2
Road Map
  • Paper 1: Active Learning Driven Data Acquisition
    for Sensor Networks (Muttreja, Raghunathan, Ravi
    and Jha, ISCC 2006)
      • A sampling strategy for sensor networks based
        on a predictive modeling technique.
  • Paper 2: Cost-effective Outbreak Detection in
    Networks (Leskovec, Krause, Guestrin, Faloutsos,
    VanBriesen, Glance, KDD 2007)
      • An optimal subset selection algorithm applied
        to sensor placement.

3
Active Learning Driven Data Acquisition for
Sensor Networks
  • (Muttreja, Raghunathan, Ravi and Jha, ISCC 2006)

4
What is this paper about?
  • Task: Given a sensor network that monitors some
    physical phenomenon (temperature, air pressure,
    precipitation, etc.), decide when to query which
    sensor.
  • This problem is called data acquisition policy
    design.
  • Approach: Construct a probabilistic model over
    the network, and have it guide the sampling
    decisions.

5
Why is this relevant to Active Learning?
  • Think of the unsampled sensor nodes as a pool of
    candidate data points whose true value (label)
    can be obtained at some cost. The task is then
    analogous to choosing a sampling policy in
    active learning.
  • In fact, many of the model-driven approaches are
    studied from an active learning perspective:
      • Model-driven data acquisition in sensor
        networks (Deshpande, Guestrin, et al., 2004)
      • Using probabilistic models for data management
        in acquisitional environments (Deshpande,
        Guestrin, Madden, 2005)

6
Sensor Network
  • A wireless network consisting of spatially
    distributed autonomous devices that use sensors
    to monitor physical or environmental conditions,
    such as temperature, sound, vibration, pressure,
    etc.
  • Nodes operate under a severe energy constraint.
  • Hence the data acquisition policy is critical.

7
Key assumption
  • The data field is spatio-temporally correlated -
    i.e., a measurement at one sensor node is
    correlated with those at neighboring nodes and
    with measurements in the near future/past.
  • A certain level of distortion is acceptable -
    i.e., users are often interested only in an
    approximation of the data field.

8
Model-driven data collection
  • Given those assumptions, how do you design a data
    acquisition policy?
  • The solution presented here is called the
    predictive modeling technique.
  • In a nutshell: maintain a model of the data
    field, and have it decide which node to query at
    each cycle.

9
Model-driven data collection
  • Procedure:
  • 1. Build a probabilistic model of the data field
    based on the currently known readings.
  • 2. At each cycle T:
      • At each node, compute the predicted
        measurement and its confidence level.
      • Sample a node if its confidence dips below
        some threshold.
      • Recalibrate the model with the new data.
      • Repeat until all nodes meet the threshold.
  • 3. Repeat step 2 at every cycle.

10
Model-driven data collection
  • Which model?
  • Some criteria:
      • As general as possible
      • Outputs a probabilistic estimate
      • Incrementally trainable
      • Provides an efficient way to assess the
        confidence at a candidate data point without
        actually measuring the reading at that point
  • The Gaussian Process (GP) is a popular choice
  • http://www.gaussianprocess.org/

11
Gaussian Process (GP) for sensor network modeling
  • A Gaussian process is a nonparametric
    generalization of the Gaussian distribution,
    extending the multivariate Gaussian to infinitely
    many dimensions.
  • A sensor network maps naturally onto a Gaussian
    process:
      • Each node at each time slot is a Gaussian r.v.
      • The true measurement is the mean.
      • The distortion is the variance.

12
Gaussian Process for sensor network modeling
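  • If the data field is a Gaussian process,
  • then the collection of random variables
    y = (y1, y2, ..., yN) (i.e., the readings from
    the sensors) is jointly Gaussian given the set of
    sensor locations.
  • Suppose each node is indexed by a tuple
    xi = <node, time>, and X = {x1, x2, ..., xN} is
    any set of such nodes. Then y is a multivariate
    Gaussian r.v.:

$$P(\mathbf{y} \mid X) = \frac{1}{Z} \exp\left(-\tfrac{1}{2} (\mathbf{y} - \boldsymbol{\mu})^{\top} C^{-1} (\mathbf{y} - \boldsymbol{\mu})\right)$$

Here C is the covariance matrix set by the covariance
function C(x, x′), µ is the mean function, and Z is
the normalizing constant.
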
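13
Gaussian Process for sensor network modeling
  • Given N training points, the model makes an
    inference about the random variable y at a new
    node location x_{N+1} (derivation in the tutorial
    on the website). For a zero mean function, the
    predictive mean and variance are:

$$\hat{y}_{N+1} = \mathbf{k}^{\top} C_N^{-1} \mathbf{y}_N, \qquad \sigma^2_{N+1} = \kappa - \mathbf{k}^{\top} C_N^{-1} \mathbf{k}$$

Here C_N is the covariance matrix of the training
data points, computed via the covariance function C;
k is the vector of covariances between the new node
and all nodes in the training set; y_N is the vector
of training readings; and κ = C(x_{N+1}, x_{N+1}) is
the prior variance at the new node.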
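
The prediction step on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the squared-exponential covariance, its `length_scale`, the small `noise_var` jitter, the 1-D sensor positions, and the function names `sq_exp_cov` and `gp_predict` are all assumptions made for the example, and a zero mean function is used.

```python
import numpy as np

def sq_exp_cov(a, b, length_scale=1.0, signal_var=1.0):
    """Assumed covariance function C(x, x'): squared exponential on 1-D inputs."""
    d = a[:, None] - b[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_predict(x_train, y_train, x_new, noise_var=1e-6):
    """Zero-mean GP regression: predictive mean and variance at x_new."""
    # C_N: covariance matrix of the N training points (plus jitter/noise).
    C_N = sq_exp_cov(x_train, x_train) + noise_var * np.eye(len(x_train))
    # k: covariances between each new point and all training points, shape (N, M).
    k = sq_exp_cov(x_train, x_new)
    # kappa: prior variance at each new point.
    kappa = sq_exp_cov(x_new, x_new).diagonal() + noise_var
    # Solve with a Cholesky factor instead of forming C_N^{-1} explicitly.
    L = np.linalg.cholesky(C_N)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))  # C_N^{-1} y_N
    v = np.linalg.solve(L, k)
    mean = k.T @ alpha                   # k^T C_N^{-1} y_N
    var = kappa - np.sum(v * v, axis=0)  # kappa - k^T C_N^{-1} k
    return mean, var

# Hypothetical readings at four sensor positions.
x_train = np.array([0.0, 1.0, 2.0, 4.0])
y_train = np.sin(x_train)
mean, var = gp_predict(x_train, y_train, np.array([1.0, 3.0, 8.0]))
print(mean, var)
```

The variance is near zero at an already-sampled position and approaches the prior variance far from any sample, which is what makes it usable as a confidence score.
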

14
Confident measurement
  • Given the GP, the predictive variance at any node
    is available without measuring it.
  • One more leap of mind: maximizing confidence is
    equivalent to minimizing variance.
  • The task then seems really simple.
  • The simplest greedy heuristic: compute the
    variance at all the nodes, and choose the one
    with the largest variance to query next (often
    used in practice).
  • It may not be globally optimal.
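
The greedy heuristic, combined with the threshold rule from the procedure slide, might look like the sketch below. It is an illustration under assumptions rather than the authors' algorithm: nodes sit on a 1-D line, the covariance is squared exponential, and `predictive_variance`, `greedy_select`, and the threshold of 0.1 on the variance are hypothetical choices. It relies on the fact that the GP predictive variance depends only on which locations were sampled, not on the readings.

```python
import numpy as np

def cov(a, b, length_scale=1.0):
    """Assumed squared-exponential covariance on 1-D node positions."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def predictive_variance(x_sampled, x_all, noise_var=1e-6):
    """GP predictive variance at every node, given the sampled positions."""
    prior = np.ones(len(x_all)) + noise_var
    if len(x_sampled) == 0:
        return prior
    C_N = cov(x_sampled, x_sampled) + noise_var * np.eye(len(x_sampled))
    k = cov(x_sampled, x_all)
    return prior - np.sum(k * np.linalg.solve(C_N, k), axis=0)

def greedy_select(x_all, threshold):
    """Query the most uncertain node until every variance meets the threshold."""
    sampled = []
    var = predictive_variance(x_all[sampled], x_all)
    while var.max() > threshold and len(sampled) < len(x_all):
        sampled.append(int(np.argmax(var)))  # node with the largest variance
        var = predictive_variance(x_all[sampled], x_all)  # recalibrate
    return sampled, var

x_all = np.linspace(0.0, 10.0, 21)  # 21 hypothetical node positions
sampled, var = greedy_select(x_all, threshold=0.1)
print(len(sampled), "of", len(x_all), "nodes queried:", sampled)
```

Only a subset of the nodes is queried in one cycle, which is the source of the energy savings, but nothing in the loop guarantees that the subset is the smallest possible one.
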

15
Problems with the model 1
  • Greedy search may not be optimal.
  • The authors propose a heuristic called the
    distribution of interest: essentially, add
    weight to sensors that have historically
    recorded unpredictable changes.
  • Define a relative error and a distribution of
    interest.
  • Weight the confidence score by the interest.

16
Problems with the model 2
  • A Gaussian process is not incrementally
    trainable.
  • Each time the data set changes, the covariance
    matrix must be recomputed and inverted.
  • Proposed solution: sparse approximation for GP
    regression.
  • Idea: find a set of basis vectors that spans the
    covariance matrix.
  • The paper does not provide details on this
    technique.

17
Problems with the model 2
  • Express the covariance matrix in terms of the
    weights and the basis vectors (a compact
    representation).
  • Once the basis vectors are found, the computation
    is almost the same as before.

C_M is the covariance matrix of the basis vectors, W
is a vector of weights that is estimated during
training, and k is the vector of covariances between
the new node and the basis vectors.
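
Since the slides leave the sparse technique unspecified, the sketch below swaps in one well-known sparse scheme, the subset-of-regressors approximation, purely to illustrate how a weight vector W over M basis vectors replaces the full N-point computation. It is not claimed to be the method used in the paper or in the cited thesis; the covariance function, `noise_var`, the choice of every second training point as a basis vector, and the names `fit_weights` and `sparse_mean` are assumptions.

```python
import numpy as np

def cov(a, b, length_scale=0.7):
    """Assumed squared-exponential covariance on 1-D node positions."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def fit_weights(x_train, y_train, x_basis, noise_var=0.01):
    """Subset-of-regressors weights W, one per basis vector."""
    K_mn = cov(x_basis, x_train)   # basis vs. training points, shape (M, N)
    C_M = cov(x_basis, x_basis)    # covariance matrix of the basis vectors
    A = K_mn @ K_mn.T + noise_var * C_M
    return np.linalg.solve(A, K_mn @ y_train)  # only an M x M system is solved

def sparse_mean(x_new, x_basis, W):
    """Predictive mean k^T W, with k taken against the basis vectors only."""
    k = cov(x_basis, x_new)
    return k.T @ W

x_train = np.arange(8.0)           # 8 hypothetical training positions
y_train = np.sin(x_train)
x_basis = x_train[::2]             # assumed basis: every second training point
x_new = np.array([2.5, 5.5])

W = fit_weights(x_train, y_train, x_basis)
pred = sparse_mean(x_new, x_basis, W)
print(pred)
```

With M basis vectors the linear system is M x M rather than N x N. When the basis is the full training set, this mean reduces to the ordinary GP predictive mean with the same noise variance.
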

18
Experiment
  • Experiments on simulated data.
  • Two sets of experiments: centralized data
    aggregation and clustered data aggregation
    (they differ in energy efficiency).
  • Run over 100 time units, with a model history
    length of 5.
  • Two confidence thresholds: 0.1 and 0.2.
  • Used a fixed covariance matrix.

Note: w1 and w2 were NOT learned during the
experiment but were tuned during the development
phase. Others suggest learning the covariance
parameters as well. See David MacKay's tutorial
from http://www.gaussianprocess.org/

19
Experiment
  • Metric: root mean square error averaged over all
    the cycles.
  • The baseline method queries all nodes at each
    cycle (hence no error).
  • The proposed model demonstrates significant
    energy savings.
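
One plausible reading of the error metric is sketched below: compute the RMSE across nodes within each cycle, then average over cycles. The slides do not spell out the order of averaging, so that order, the function name `mean_rmse`, and the toy readings are assumptions.

```python
import numpy as np

def mean_rmse(truth, estimate):
    """RMSE across nodes for each cycle (row), then averaged over cycles."""
    truth = np.asarray(truth, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    per_cycle = np.sqrt(np.mean((truth - estimate) ** 2, axis=1))
    return per_cycle.mean()

# Rows are cycles, columns are nodes (made-up values for illustration).
truth = np.array([[20.0, 21.0], [22.0, 23.0]])
estimate = np.array([[20.0, 21.0], [21.0, 24.0]])
score = mean_rmse(truth, estimate)
print(score)
```

A policy that queries every node at every cycle reproduces the true readings exactly and scores zero, which matches the baseline described on the slide.
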

20
Summary of the paper
  • The Gaussian process is a nice framework for
    node selection in sensor networks.
  • Is there a more principled way to search for the
    optimal subset? (The proposed solution looks a
    bit ad hoc.)
  • How to choose the covariance matrix is not clear.
  • How does the accuracy compare to that of
    competing frameworks?

21
Summary of the paper
  • Many details of the Gaussian process are missing
    from this paper. See the following for more
    thorough discussion:
  • GP regression
      • Gaussian process. Tutorial (MacKay, 1998)
  • Sparse approximation
      • Gaussian process Iterative Sparse
        Approximation (Csató, 2002)
  • Sampling strategy using GP
      • Model-driven data acquisition in sensor
        networks (Deshpande, Guestrin, et al., 2004)
      • Using probabilistic models for data management
        in acquisitional environments (Deshpande,
        Guestrin, Madden, 2005)