Data Quality and Query Cost in Wireless Sensor Networks

Transcript and Presenter's Notes

1
Data Quality and Query Cost in Wireless Sensor
Networks
  • David Yates, Erich Nahum, Jim Kurose, and
    Prashant Shenoy
  • IEEE PerCom 2008

2
Papers
  • Data Quality and Query Cost in Wireless Sensor
    Networks, IEEE PerSeNS 2007
  • with analysis of performance trends
  • Data Quality and Query Cost in Wireless Sensor
    Networks, IEEE PerCom 2008

3
Outline
  • Introduction
  • Caching and Lookup Policies
  • Data Quality and Query Cost
  • Discussion of Results
  • Performance Trends
  • when value deviation is most important
  • when end-to-end delay is most important
  • Conclusion

4
Introduction (1/4)
  • Data-centric WSNs
  • Environmental and infrastructure monitoring
  • Commercial and industrial sensing
  • Performance Metrics
  • accuracy
  • total system end-to-end delay
  • the quality of the data provided to sensor
    network applications

5
Introduction (2/4)
Sensor Network Deployment Example
Monitoring and control center
Routers and switches
Sensor Field
Data server / Gateway (and cache)
What if the gateway is augmented with storage?
Data Acquisition and Caching
6
Introduction (3/4)
Data Server or Gateway with a Cache
cache hit vs. cache miss
7
Introduction (4/4)
  • system delay
  • the time between a query arriving at the gateway
    and the corresponding reply departing from it
  • zero for a cache hit
  • value deviation
  • the unsigned difference between the data value in
    the reply and the true value at location i
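
The two metrics defined above can be illustrated with a minimal Python sketch. The `Reply` record and its field names are hypothetical, introduced only for this example; the slides do not specify any data structure.

```python
from dataclasses import dataclass


@dataclass
class Reply:
    """Hypothetical record of one answered query (names are assumptions)."""
    query_arrival: float    # time the query arrived (seconds)
    reply_departure: float  # time the reply departed (seconds)
    reported_value: float   # data value carried in the reply
    true_value: float       # true value at the queried location i


def system_delay(r: Reply) -> float:
    # Time between the query arriving and the reply departing.
    return r.reply_departure - r.query_arrival


def value_deviation(r: Reply) -> float:
    # Unsigned difference between the reported and the true value.
    return abs(r.reported_value - r.true_value)


# A cache hit answers immediately (zero delay) but may return a stale value.
hit = Reply(query_arrival=10.0, reply_departure=10.0,
            reported_value=21.5, true_value=22.0)
# A cache miss waits for the sensor field but returns a fresh value.
miss = Reply(query_arrival=10.0, reply_departure=11.5,
             reported_value=22.0, true_value=22.0)
```

The example values are made up; they only show that a hit trades value deviation for delay, and a miss does the opposite.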

8
Caching and Lookup Policies: Precise Policies and
Approximate Policies
  • All hits
  • cache entries are never deleted, updated, or
    replaced
  • Greedy Policies
  • Greedy age lookups (T), Greedy distance lookups
    (T), Median-of-3 lookups (T)
  • Precise Policies
  • Simple lookups (T), Piggybacked queries (T)
  • All misses
  • Diagram labels: Full, Not Available, age
    threshold parameter T, Spatial Locality, Cache
    Utilization
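
The slide names the policies but does not spell out their rules. The Python sketch below shows one plausible reading of an age-threshold lookup, under the assumption that a cached entry counts as a hit only while its age is below the threshold. The class name, method names, and the strict comparison are all assumptions, not the authors' definitions; the greedy and median-of-3 policies are not modeled.

```python
import math


class AgeThresholdCache:
    """Hypothetical cache keyed by location with an age threshold.

    Assumption: an entry is a hit only if its age is strictly less
    than the threshold. The slides do not define the exact rule.
    """

    def __init__(self, age_threshold):
        self.age_threshold = age_threshold
        self.entries = {}  # location -> (value, time_stored)

    def store(self, location, value, now):
        self.entries[location] = (value, now)

    def lookup(self, location, now):
        """Return the cached value on a hit, or None on a miss."""
        entry = self.entries.get(location)
        if entry is None:
            return None
        value, stored_at = entry
        if now - stored_at < self.age_threshold:
            return value
        return None


# Under this reading, the two extreme policies fall out of the threshold:
all_hits = AgeThresholdCache(math.inf)  # any stored entry is always fresh
all_misses = AgeThresholdCache(0.0)     # no entry is ever fresh enough
```

A finite threshold between these extremes would trade staleness against the cost of querying the sensor field; an empty cache still misses even with an infinite threshold.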
9
Data Quality and Query Cost: Quality Measurement
  • Data Quality
  • linear combination of normalized system delay and
    normalized value deviation
  • relative importance A
  • Softmax normalization
  • Z-score normalization
  • Small values indicate better data quality!
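
The slide's formula did not survive in the transcript, so the Python sketch below is only one way to realize "a linear combination of normalized system delay and normalized value deviation". It assumes that A weights the delay term and (1 - A) the value deviation term, which matches the later slides where A = 0.9 is labeled "delay is more important". It also assumes a common logistic form of softmax normalization applied to z-scores. All function names are hypothetical.

```python
import math
from statistics import mean, pstdev


def z_scores(xs):
    """Z-score normalization: (x - mean) / standard deviation."""
    mu, sigma = mean(xs), pstdev(xs)
    if sigma == 0:
        return [0.0 for _ in xs]
    return [(x - mu) / sigma for x in xs]


def softmax_normalize(xs):
    """Squash z-scores into (0, 1) with a logistic function (assumed form)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in z_scores(xs)]


def data_quality(delays, deviations, A):
    """Assumed combination: A * delay + (1 - A) * deviation (smaller is better)."""
    d = softmax_normalize(delays)
    v = softmax_normalize(deviations)
    return [A * di + (1.0 - A) * vi for di, vi in zip(d, v)]


# Made-up measurements for three replies.
delays = [0.0, 1.0, 2.0]      # seconds
deviations = [3.0, 2.0, 1.0]  # sensor units
quality_delay_driven = data_quality(delays, deviations, A=0.9)
quality_deviation_driven = data_quality(delays, deviations, A=0.1)
```

With A close to 1 the ranking follows delay, and with A close to 0 it follows value deviation, which is the distinction the results slides draw between A = 0.9 and A = 0.1.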
10
Data Quality and Query Cost: Simulated Changes to
the Environment (1/2)
  • 3-dimensional sensor field
  • Rectangular planes on six faces
  • sensors
  • Four base stations are placed on the X-Y plane
  • These base stations are connected to the gateway
    server that has the common cache.
  • The sensors always communicate with their closest
    base station.

  • Figure: sensor field with axes X, Y, and Z;
    dimension labels 8 unit, 6 unit, 4 unit
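
The rule that sensors always communicate with their closest base station can be sketched in Python as a nearest-neighbor search. The base station coordinates below are hypothetical (the slide only says that four stations lie on the X-Y plane), and Euclidean distance is an assumption.

```python
import math

# Hypothetical positions of four base stations on the X-Y plane (z = 0).
BASE_STATIONS = [
    (2.0, 1.5, 0.0),
    (6.0, 1.5, 0.0),
    (2.0, 4.5, 0.0),
    (6.0, 4.5, 0.0),
]


def closest_base_station(sensor, base_stations=BASE_STATIONS):
    """Return the index of the base station nearest to `sensor`.

    Uses Euclidean distance (an assumption); ties go to the lowest index.
    """
    return min(range(len(base_stations)),
               key=lambda i: math.dist(sensor, base_stations[i]))


# Example: a sensor near the far corner of an assumed 8 x 6 x 4 field.
station = closest_base_station((7.0, 5.0, 4.0))
```

Because the assignment depends only on fixed positions, it could be computed once per sensor and reused for every query.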
11
Data Quality and Query Cost: Simulated Changes to
the Environment (2/2)
  • One-way communication to and from
  • minimum cost to query a location: 2 units (query
    and reply)
  • maximum delay to query a location: 2 seconds
  • Equation annotations (appearing twice):
    normalization constant, distance
12
Data Quality and Query Cost: Trace-driven Changes
to the Environment
  • Intel Lab Dataset
  • 2-dimensional field
  • 54 Mica2Dot sensors
  • light intensity: the most dynamically changing of
    the sensor values
  • Assume the sensors always communicate with their
    closest base station.
  • Figure: Sensor Field, Intel Berkeley Research Lab
13
Data Quality and Query Cost: Query Workload Model
(1/2)
  • Query Workload Model
  • periodic arrival process
  • random arrival process
  • The superposition of two query processes
  • polling component
  • slowly scans the sensor field at fixed rate
  • the period of the polling component of the query
    workload
  • random component
  • queries to different locations in the sensor
    field
  • average query arrival rate of the random
    component

14
Data Quality and Query Cost: Query Workload Model
(2/2)
  • Simulated changes to the environment
  • exponentially distributed inter-arrival times
    with mean
  • 90 queries per second
  • Trace-driven changes to the environment
  • 0.9 queries per second

9 queries/second
0.09 queries/second
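
The workload model on these two slides, a periodic polling component superposed with a random component that has exponentially distributed inter-arrival times, can be sketched in Python as follows. The function names, the round-robin scan order, and the uniform choice of random locations are assumptions; the slides give only the overall structure and the rates.

```python
import random


def polling_arrivals(period, locations, duration):
    """Periodic component: one query every `period` seconds.

    Assumption: locations are scanned in round-robin order.
    """
    arrivals = []
    k = 0
    while k * period < duration:
        arrivals.append((k * period, locations[k % len(locations)], "poll"))
        k += 1
    return arrivals


def random_arrivals(rate, locations, duration, rng):
    """Random component: exponential inter-arrival times with mean 1 / rate.

    Assumption: each query targets a uniformly chosen location.
    """
    arrivals = []
    t = rng.expovariate(rate)
    while t < duration:
        arrivals.append((t, rng.choice(locations), "random"))
        t += rng.expovariate(rate)
    return arrivals


def workload(period, rate, locations, duration, seed=0):
    """Superpose both components and order the queries by arrival time."""
    rng = random.Random(seed)
    merged = (polling_arrivals(period, locations, duration)
              + random_arrivals(rate, locations, duration, rng))
    merged.sort(key=lambda q: q[0])
    return merged


# Example: 1000 locations, random component at 90 queries per second,
# and a hypothetical polling period of 1 second, over 10 seconds.
queries = workload(period=1.0, rate=90.0,
                   locations=list(range(1000)), duration=10.0)
```

Each entry is a tuple of arrival time, location, and component label, so the stream can be replayed against any caching policy.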
15
Discussion of Results: Simulated Testing Dataset
  • Reference: A. Jindal and K. Psounis
  • Modeling Spatially-correlated Sensor Network
    Data, SECON 2004
  • Modeling Spatially Correlated Data in Sensor
    Networks, TOSN 2006

Download Tools
16
Discussion of Results: Query Cost vs. Data Quality
Trade-off
  • Figure: Query Cost vs. Data Quality, A = 0.1
  • Panels: correlated changes over 1000 locations;
    trace-driven changes over 54 locations
  • Annotations: 0% cache hit, 100% cache hit,
    linear trade-off (both panels)
17
Discussion of Results: Query Cost vs. End-to-End
Delay
  • Figure: Query Cost vs. End-to-End Delay, A = 0.1
  • Panels: correlated changes over 1000 locations;
    trace-driven changes over 54 locations
  • Annotations: 1.18, 4.4, an increase in the
    normalized delay term!
18
Discussion of Results: Query Cost vs. Data Quality
Trade-off
  • Figure: Query Cost vs. Data Quality, A = 0.9
  • Panels: correlated changes over 1000 locations;
    trace-driven changes over 54 locations
  • Annotations: no trade-off (both panels), the
    best performance (both panels)
19
Discussion of Results: Hit Ratios, Query Costs,
and End-to-End Delays
  • Figure: Hit Ratios, Query Costs, and End-to-End
    Delays
  • Panels: correlated changes over 1000 locations
    (90 queries/second); trace-driven changes over
    54 locations (T = 90, 0.9 queries/second)
  • Plotted: hit ratio, query cost, end-to-end delay
20
Discussion of Results: Query Cost vs. Value
Deviation
  • Figure: Query Cost vs. Value Deviation, A = 0.1
  • Panels: correlated changes over 1000 locations;
    trace-driven changes over 54 locations
  • Annotation: increase the dispersion
21
Discussion of Results: Whether Delay or Value
Deviation?
  • value deviation is more important than delay
  • Figure: Query Cost vs. Data Quality, A = 0.1
  • Panels: correlated changes over 1000 locations;
    trace-driven changes over 54 locations
  • Annotations: Quality is more important. Cost is
    at a premium.
22
Discussion of Results: Whether Delay or Value
Deviation?
  • delay is more important than value deviation
  • Figure: Query Cost vs. Data Quality, A = 0.9
  • Panels: correlated changes over 1000 locations;
    trace-driven changes over 54 locations
  • Annotation: Getting the fast response time of a
    cache hit is worthwhile!
23
Performance Trends: When Value Deviation is Most
Important
  • value deviation is more important than delay
  • Figure: Query Cost vs. Data Quality, A = 0.1
  • Panels: 9 of 1000, 90 of 1000, and 900 of 1000
    correlated changes / sec
  • Annotations: linear trade-off, The results are
    robust!
24
Performance Trends: When Value Deviation is Most
Important
  • value deviation is more important than delay
  • Figure: Value Deviation vs. Data Quality, A = 0.1
  • Panels: 9 of 1000, 90 of 1000, and 900 of 1000
    correlated changes / sec
  • Annotations: strong positive correlation!,
    Environment Changes, Value Deviation
25
Performance Trends: When Value Deviation is Most
Important
  • value deviation is more important than delay
  • Figure: Query Cost vs. Data Quality, A = 0.1
  • Panels: trace-driven changes at 90, 9, and 0.9
    queries/second
  • Annotation: linear trade-off
26
Performance Trends: When Value Deviation is Most
Important
  • value deviation is more important than delay
  • Figure: Value Deviation vs. Data Quality, A = 0.1
  • Panels: trace-driven changes at 90, 9, and 0.9
    queries/second
  • Annotation: strong positive correlation!
27
Performance Trends: When System Delay is Most
Important
  • delay is more important than value deviation
  • Figure: Query Cost vs. Data Quality, A = 0.9
  • Panels: 9 of 1000, 90 of 1000, and 900 of 1000
    correlated changes / sec
  • Annotations: no trade-off, the best performance,
    The results are robust!
28
Performance Trends: When System Delay is Most
Important
  • delay is more important than value deviation
  • Figure: End-to-End Delay vs. Data Quality,
    A = 0.9
  • Panels: 9 of 1000, 90 of 1000, and 900 of 1000
    correlated changes / sec
  • Annotation: strong positive correlation!
29
Performance Trends: When System Delay is Most
Important
  • delay is more important than value deviation
  • Figure: Query Cost vs. Data Quality, A = 0.9
  • Panels: trace-driven changes at 90, 9, and 0.9
    queries/second
  • Annotation: the best performance
30
Conclusion
  • We measure the benefit and cost of seven
    different caching and lookup policies.
  • when delay drives data quality
  • when value deviation drives data quality
  • Query Cost vs. Data Quality
  • linear trade-off
  • cost vs. accuracy and/or cost vs. delay are also
    linear
  • The performance trends in query cost and data
    quality generally remain the same as the
    environment changes.