Title: Exploiting Spatial Correlation Towards an Energy Efficient Clustered AGgregation Technique CAG
1Exploiting Spatial Correlation Towards an Energy
Efficient Clustered AGgregation Technique (CAG)
SunHee Yoon and Cyrus Shahabi
Department of Computer Science, USC
Presented by Sevcan Bilir
2Outline of the talk
- Goal/Motivation
- Background
- CAG algorithm
- Metrics/Simulation setup/Data sets
- Results
- Analysis of CAG
- Performance/Accuracy/Packet loss/Density
- Conclusion
3Goal of this paper
- Goal
- Provide an efficient and precise in-network clustered aggregation mechanism for users
- Challenges
- Energy efficiency vs. correctness tradeoff
- Guaranteed error
- given user-provided error threshold τ
4Motivation
- Why spatial correlation?
- Nearby sensor nodes monitor similar environmental features, so they report similar readings (e.g., temperature, pressure)
- Densely deployed nodes register more redundant data
- Evidence: correlation is prevalent in the real world
- Great Duck Island (GDI), UC Berkeley
- James Reserve, UCLA
- How to leverage spatial correlation?
- CAG
5Background
- Two in-network aggregation infrastructures
- TAG
- Directed Diffusion/Digest Diffusion/Synopsis Diffusion
- Clustering techniques (Energy Efficiency)
- LEACH
- Clusterhead rotation for energy preserving
- Avoid energy-draining bottlenecks
- TEEN/APTEEN
- Reactive/proactive clustering
- Reduce number of transmissions
- Clustering with Correlation
- TiNA
- Focus on temporal correlation in in-network aggregation
- PREMON
- Prediction model based on spatio-temporal correlation
6CAG Algorithm
- Comparison of TAG and CAG.
- TAG requires every node to participate in aggregation.
- CAG is a lossy algorithm that requires only representative (clusterhead) values to participate in aggregation
- CAG constructs the cluster topology simultaneously
- Forms clusters while the TAG-like forwarding tree is built, using the user-specified error threshold τ in a query.
7CAG Algorithm-Query
- Algorithm of Cluster Formation
- func_query received
- if ((CR - CR * τ) < MR < (CR + CR * τ))
- {
- clusterhead = false // same cluster
- Broadcast query Q
- }
- else
- {
- CR = MR
- clusterhead = true // becomes clusterhead
- Broadcast query Q
- }
- CR = clusterhead sensor reading
- MR = my local sensor reading
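The cluster-formation rule above can be sketched in Python. This is only an illustration under stated assumptions: the threshold is passed as a fraction (0.1 for 10%), readings are positive, and the function name `on_query_received` and its return shape are hypothetical, not taken from the paper.

```python
def on_query_received(cr, mr, threshold):
    """Decide whether a node joins the sender's cluster or becomes a clusterhead.

    cr: clusterhead reading carried in the query (CR)
    mr: this node's local reading (MR)
    threshold: user-provided error threshold as a fraction (assumption), e.g. 0.1
    Returns (is_clusterhead, cr_to_forward).
    """
    lower = cr - cr * threshold
    upper = cr + cr * threshold
    if lower < mr < upper:
        # Local reading is within the threshold of CR: same cluster, forward CR unchanged.
        return False, cr
    # Otherwise start a new cluster and advertise the local reading as the new CR.
    return True, mr
```

For example, `on_query_received(20.0, 21.0, 0.1)` keeps the node in the sender's cluster, while a reading of 23.0 makes it a clusterhead. With a threshold of 0 the strict inequalities never hold, so in this sketch every node becomes a clusterhead.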
8CAG Algorithm-Response
- In-network aggregation only with clusterhead values
- Only clusterhead nodes transmit their values
- Bridge nodes: nodes other than clusterheads.
- They bridge the segments of the forwarding tree
- They do not contribute their local readings to the aggregation
- They transmit the aggregated values of clusterheads and suppress their own readings.
- They make calculations for local aggregation values
- Bridge nodes optionally participate in aggregation
- Clusterhead may change for every query and response cycle
- Prevents clusterheads from becoming energy-draining bottlenecks.
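A minimal sketch of the response phase for AVG, assuming a (sum, count) partial state as TAG uses for averages. The dictionary-based tree, the node ids, and the sample readings are hypothetical and only illustrate that bridge nodes merge and forward values without adding their own.

```python
def aggregate_avg(node, children, readings, clusterheads):
    """Return the (sum, count) partial state that `node` forwards to its parent.

    children: dict mapping node id -> list of child ids (forwarding tree)
    readings: dict mapping node id -> local sensor reading
    clusterheads: set of node ids that became clusterheads in the query phase
    """
    total, count = 0.0, 0
    if node in clusterheads:
        # Only clusterheads contribute their own reading.
        total, count = readings[node], 1
    for child in children.get(node, []):
        # Bridge nodes still merge and forward their children's partial states.
        child_total, child_count = aggregate_avg(child, children, readings, clusterheads)
        total += child_total
        count += child_count
    return total, count


# Hypothetical 4-node tree rooted at node 0; nodes 1 and 3 are bridge nodes.
children = {0: [1, 2], 1: [3], 2: []}
readings = {0: 20.0, 1: 20.5, 2: 30.0, 3: 21.0}
clusterheads = {0, 2}
total, count = aggregate_avg(0, children, readings, clusterheads)
cag_avg = total / count
```

Here the root ends up with the partial state (50.0, 2), so the reported average is 25.0, computed from the two clusterhead readings only.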
9An example execution of CAG
10An example execution of TAG
Query
SELECT AVG (temp) FROM sensors
11Simulation Metrics
- Reduced number of transmissions
- (nTX(TAG) - nTX(CAG)) / nTX(TAG) × 100
- Relative error
- |EstimateValue - ActualValue| / ActualValue × 100
- Is relative error always bounded by the threshold value?
- Robustness
- Measured reliability vs. perfect reliability
- Impact on different densities
- Compare CAG with TAG, with different τ
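The two percentage metrics above can be written as small Python helpers. The function names are hypothetical, and the absolute value in the relative error is an assumption based on the usual definition.

```python
def reduced_transmissions_pct(ntx_tag, ntx_cag):
    """Percentage of transmissions saved by CAG relative to TAG."""
    return (ntx_tag - ntx_cag) / ntx_tag * 100


def relative_error_pct(estimate, actual):
    """Relative error of the aggregate estimate, in percent (actual must be nonzero)."""
    return abs(estimate - actual) / abs(actual) * 100
```

With made-up counts of 1000 transmissions for TAG and 400 for CAG, the first helper gives a 60% reduction. An estimate of 21.0 against an actual value of 20.0 gives a relative error of 5%.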
12Simulation Setup
- 250 m × 250 m grid
- Random node placement
- 3 different densities
- Dense: average 26 neighbors/node (550 nodes)
- Moderate: average 17 neighbors/node (375 nodes)
- Sparse: average 9 neighbors/node (200 nodes)
- Multi-hop random topology: at least 5 hops
- 5 different levels of synthetic correlated data using Jindal et al.
- Correlation parameter H = 1, 3, 5, 7, and 9
- H = 1 generates data with almost no spatial correlation
- Measured reliability profile from Zhao et al.
- Loss profiles to assign links between nodes
- Aggregation operator: AVG
- COUNT, SUM, and standard deviation operators are also implemented
- τ = 0, 0.5, 1, 2, 4, 10; average of 30 runs each
- Simulations are also conducted for τ = 15, 20, 40
- Only τ ≤ 10 results are presented, because they are more interesting for evaluating the relative error metric.
13Data Sets
- Synthetic data from the statistical model
- Five data sets with different degrees of correlation
- Correlation coefficient H = 1, 3, 5, 7, and 9 (Jindal et al. 2004)
- H = 1 generates data with almost no spatial correlation
- Synthetic data from ecological model
- Realistic spatial pattern with known spatial properties (fractal pattern) (Lennon 2000)
- Measured sensor data from Great Duck Island
- Four modalities (Mainwaring 2004)
- Humidity, Temperature, Light, Pressure
14Synthetic data
Synthetic data: (a) 1H, (b) 3H, (c) 5H, (d) 7H, (e) 9H, and (f) Spatial Pattern
15Variograms of data sets
Variograms of synthetic data and Spatial Pattern
Variograms of real sensor data from Great Duck
Island
- A variogram characterizes the spatial correlation between pairs of points
- $\gamma(h) = \frac{1}{2}\,E\left[(X(p) - X(p+h))^2\right]$
- for all possible locations p, where X(p) and X(p+h) are the values at the head and tail of each pair of points separated by distance h
- The Spatial Pattern variogram is similar to those of 7H and 9H
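A minimal sketch of how an empirical variogram could be estimated from scattered sensor readings, following the definition above (half the mean squared difference over pairs at a given separation). The distance binning, the `bin_width` parameter, and the sample data are assumptions for illustration, not the procedure used in the paper.

```python
import math
from collections import defaultdict


def empirical_variogram(points, values, bin_width):
    """Estimate gamma(h) = 1/2 * E[(X(p) - X(p+h))^2] from scattered samples.

    points: list of (x, y) coordinates
    values: list of readings, one per point
    bin_width: width of each distance bin (hypothetical tuning parameter)
    Returns {bin_index: gamma}, where bin_index = floor(distance / bin_width).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            h = math.dist(points[i], points[j])
            b = int(h // bin_width)
            sums[b] += (values[i] - values[j]) ** 2
            counts[b] += 1
    # Half the mean squared difference within each distance bin.
    return {b: 0.5 * sums[b] / counts[b] for b in sorted(sums)}


# Made-up readings on a line; differences grow with separation.
points = [(0, 0), (1, 0), (2, 0), (3, 0)]
values = [1.0, 2.0, 4.0, 7.0]
gamma = empirical_variogram(points, values, bin_width=1.0)
```

For spatially correlated data the estimate typically rises with distance, as it does for these sample values, since nearby points differ less than distant ones.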
16Analysis of CAG
- Nbc (number of transmissions) comparison
- TAG: 2N = N (query) + N (response)
- CAG: 2N - (N-1)Pu (uncorrelated (i.i.d.) data) vs. 2N - (N-1)Pc (correlated data)
- where Pu and Pc are the probabilities that two nodes are in the same cluster.
- When Pu or Pc = 1 => N (query) + 1 (response, one clusterhead)
- When Pu or Pc = 0 => N (query) + N (response, N clusterheads)
- Accuracy of result
- The error bound using CAG
- Relative error is guaranteed to be within τ,
- in correlated data such that N ≫ k
- N = number of nodes, k = number of clusters
- In normally distributed data
- k(random) > k(correlated), which means R-error in random data is greater than in correlated data
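The transmission-count comparison can be checked numerically. This sketch assumes the CAG expression reads 2N - (N-1)P, which is the form that reproduces both boundary cases listed above. The function names are hypothetical.

```python
def ntx_tag(n):
    """TAG: N query broadcasts plus N responses."""
    return 2 * n


def ntx_cag(n, p_same_cluster):
    """CAG: 2N - (N-1)P, where P is the probability that two nodes share a cluster."""
    return 2 * n - (n - 1) * p_same_cluster
```

For N = 100, P = 1 gives 101 transmissions (100 query broadcasts plus one clusterhead response), and P = 0 gives 200, the same as TAG.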
17Experiment Results (1) Performance
Performance with measured reliability
Performance with Spatial Pattern
As τ increases, the number of clusters decreases, so fewer Xs. As H increases, the number of clusters also decreases, so fewer Xs.
18Experiment Results (2) Accuracy
Precision with measured reliability
Precision with perfect reliability
Relative error is always bounded by τ, except for two cases of H = 9
- With increasing τ (from experimental observations)
- τ < 10: higher relative error as data becomes more correlated (increasing H)
- τ = 10: lower relative error as data becomes more correlated (increasing H)
- WHY?
19Experiment Results (3) Density
Impact on different densities (fixed τ = 0.5)
With increasing density: fewer transmissions with increasing H
20Experiment Results (4) Tradeoff
Tradeoff between efficiency and accuracy with
measured reliability
CAG can maximally take advantage of the highly
correlated sensor data
21Results with GDI data
- There are different levels of spatial correlation in real sensor data
- For pressure and temperature, R-error is always bounded by τ
- Pressure readings benefit the most from CAG,
- because pressure readings are strongly correlated
- Light readings are the most weakly correlated, so R-error is generally higher
22Limitations of CAG
- Cluster range is scale dependent (if sensors with correlated data are very far from each other, energy consumption is high)
- With normally distributed data, more clusters are likely to form, so R-error may not be bounded by the threshold all the time.
23Benefits of CAG
- Energy efficient
- For each query, different clusterheads.
- Accurate result
- As data is correlated and distribution is balanced
- Robust to packet loss in terms of correctness
- Scalable in terms of number of sensor nodes
- Simple and easy
- Architecture and implementation
- Only one more clause (threshold τ) in TAG syntax
24Contributions and Future work
- Contributions
- Prove the importance of semantic broadcast for reducing energy use through the combined effort of the routing and application layers: CAG
- Leverage the spatial correlation property to address the efficiency and correctness challenges of WSNs
- Analysis and systematic evaluation of CAG
- Future work
- Systematic measurement of spatially correlated data
- Sophisticated modeling of measured data
- Automated system to guide users toward a proper threshold for a better efficiency and precision tradeoff.