Database Middleware for Sensor Networks - PowerPoint PPT Presentation

Learn more at: https://db.csail.mit.edu
1
Database Middleware for Sensor Networks
Sam Madden Assistant Professor,
MIT madden_at_csail.mit.edu
Slides prepared with Wei Hong
2
Motivation
  • Sensor networks (aka sensor webs, emnets) are
    here
  • Several widely deployed HW/SW platforms
  • Low power radio, small processor, RAM/Flash
  • Variety of (novel) applications: scientific,
    industrial, commercial
  • Great platform for mobile ubicomp
    experimentation
  • Real, hard research problems to be solved
  • Networking, systems, languages, databases
  • Central problem: ease of access, appropriate
    programming abstractions
  • I will summarize
  • Low-level sensornet issues
  • A particular middleware architecture
  • TinyDB and TASK
  • Current and future research middleware ideas

3
Some Sensornet Apps
smart cooling in data centers
redwood forest microclimate monitoring
http://www.hpl.hp.com/research/dca/smart_cooling/
And More
condition-based maintenance
  • Homeland security
  • Container monitoring
  • Mobile environmental apps
  • Bird tracking
  • Zebranet
  • Home automation
  • Etc!

structural integrity
4
Architectural Overview
Internet
Directed Diffusion COUGAR
Middleware Issues: APIs for current and historical
access? Which data when? How to act on
data? Network and node status?
5
Declarative Queries
  • Programming Apps is Hard
  • Limited power budget
  • Lossy, low bandwidth communication
  • Require long-lived, zero admin deployments
  • Distributed Algorithms
  • Limited tools, debugging interfaces
  • Queries abstract away much of the complexity
  • Burden on the database developers
  • Users get
  • Safe, optimizable programs
  • Freedom to think about apps instead of details

6
TinyDB: Declarative Query Interface to Sensornets
  • Platform: Berkeley Motes and TinyOS
  • Continuous variant of SQL: TinySQL
  • Power and data-acquisition based in-network
    optimization framework
  • Extensible interface for aggregates, new types of
    sensors

7
Agenda
  • Part 1: Sensor Networks (40 mins)
  • TinyOS
  • NesC
  • Part 2: TinyDB and TASK (50 mins)
  • Data Model and Query Language
  • Software Architecture
  • 30 minute break
  • Part 3: Alternative Middleware Architectures and
    Research Directions (130 mins)
  • Finish around 12

8
Part 1
  • Sensornet Background
  • Motes Mote Hardware
  • TinyOS
  • Programming Model NesC
  • TinyOS Architecture
  • Major Software Subsystems
  • Networking Services

9
Sensor Networks a hot topic
  • New university courses
  • New conferences
  • ACM SenSys, IEEE IPSN, etc.
  • New industrial research lab projects
  • Intel, PARC, MSR, HP, Accenture, etc.
  • Startup companies
  • Crossbow, Dust, Ember, Sensicast, Moteiv, etc.
  • Media Buzz
  • Over 30 news articles since July 2002 covering
    Intel-Berkeley/UC Berkeley sensor network
    activities
  • One of 10 emerging technologies that will change
    the world MIT Technology Review

10
A Brief History of Sensornets
  • People have used sensors for a long time
  • Recent CS History
  • (1998) Pottie and Kaiser: Radio based networks of
    sensors
  • (1998) Pister et al.: Smart Dust
  • Initial focus on optical communication
  • By 1999, radio based networks, COTS Dust, Motes
  • (1999) Estrin and Govindan
  • Ad-hoc networks of sensors
  • (2000) Culler/Hill et al.: TinyOS and Motes
  • (2000) Bonnet/Seshadri: Device Database Systems
  • (2002) Madden/Franklin/Hellerstein/Hong: TinyDB
  • (2002) Hill / Dust: SPEC, mm^3 scale computing
  • UCLA / USC / Berkeley Continue to Lead Research
  • Many other players now
  • TinyOS/Motes as most common platform
  • Emerging commercial space
  • Crossbow, Ember, Dust, Sensicast, Moteiv, Intel

11
Why Now?
  • Commoditization of radio hardware
  • Cellular and cordless phones, wireless
    communication
  • Low cost -> many/tiny -> new applications!
  • Real application for ad-hoc network research from
    the late 90s
  • Coming together of EE and CS communities

12
Motes
13
History of Motes
  • Initial research goal wasn't hardware
  • Has since become more of a priority with emerging
    hardware needs, e.g.
  • Power consumption
  • (Ultrasonic) ranging localization
  • MIT Cricket, NEST Project
  • Connectivity with diverse sensors
  • UCLA sensor board
  • Even so, now on the 5th generation of devices
  • Costs down to $50/node (Moteiv, Dust)
  • Greatly improved radio quality
  • Multitude of interfaces USB, Ethernet, CF, etc.
  • Variety of form factors, packages

14
Motes vs. Traditional Computing
  • Embedded OS
  • Lossy, Adhoc Radio Communication
  • Sensing Hardware
  • Severe Power Constraints

15
NesC/TinyOS
  • NesC: a C dialect for embedded programming
  • Components, wired together
  • Quick commands and asynch events
  • TinyOS: a set of NesC components
  • hardware components
  • ad-hoc network formation and maintenance
  • time synchronization

Think of the pair as a programming environment
16
Radio Communication
  • Low Bandwidth Shared Radio Channel
  • 40kBits on motes
  • Much less in practice
  • Encoding, Contention for Media Access (MAC)
  • Very lossy: 30% base loss rate
  • Argues against TCP-like end-to-end retransmission
  • And for link-layer retries
  • Generally, not well behaved

17
Types of Sensors
  • Sensors attach via daughtercard
  • Weather
  • Temperature
  • Light x 2 (high intensity PAR, low intensity,
    full spectrum)
  • Air Pressure
  • Humidity
  • Vibration
  • 2 or 3 axis accelerometers
  • Tracking
  • Microphone (for ranging and acoustic signatures)
  • Magnetometer
  • GPS
  • RFID Reader

18
Non-Volatile Storage
  • EEPROM
  • 512K off chip, 32K on chip
  • Writes at disk speeds, reads at RAM speeds
  • Interface: random access, read/write 256 byte
    pages
  • Maximum throughput: 10 Kbytes / second
  • MatchBox Filing System
  • Provides a Unix-like file I/O interface
  • Single, flat directory
  • Only one file being read/written at a time

19
Power Consumption and Lifetime
  • Power typically supplied by a small battery
  • 1000-2000 mAh
  • 1 mAh = 1 milliamp of current for 1 hour
  • Typically at optimum voltage, current drain rates
  • Power: Watts (W) = Amps (A) x Volts (V)
  • Energy: Joules (J) = W x time
  • Lifetime, power consumption varies by application
  • Processor: 5 mA active, 1 mA idle, 5 uA sleeping
  • Radio: 5 mA listen, 10 mA xmit/receive, 20 ms /
    packet
  • Sensors: 1 uA -> 100s of mA, 1 us -> 1 s / sample

20
Energy Usage in A Typical Data Collection Scenario
  • Each mote collects 1 sample of (light,humidity)
    data every 10 seconds, forwards it
  • Each mote can hear 10 other motes
  • Process
  • Wake up, collect samples ( 1 second)
  • Listen to radio for messages to forward (1
    second)
  • Forward data
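
The duty-cycle arithmetic behind this scenario can be sketched in Python. The current draws are taken from the previous slide (processor 5 mA active and 5 uA sleeping, radio 5 mA listening). The 2000 mAh battery, the 2 s awake time per 10 s period, and the decision to ignore sensor and transmit costs are simplifying assumptions made only for illustration.

```python
def average_current_ma(active_ma, sleep_ma, active_s, period_s):
    """Time-weighted average current for a node awake active_s of every period_s seconds."""
    duty_cycle = active_s / period_s
    return duty_cycle * active_ma + (1.0 - duty_cycle) * sleep_ma


def lifetime_days(battery_mah, avg_current_ma):
    """Capacity (mAh) divided by average draw (mA) gives hours; convert to days."""
    return battery_mah / avg_current_ma / 24.0


# Assumed scenario: awake 2 s (1 s sampling + 1 s listening) out of every 10 s.
ACTIVE_MA = 5.0 + 5.0   # processor active + radio listening
SLEEP_MA = 0.005        # 5 uA processor sleep current
avg = average_current_ma(ACTIVE_MA, SLEEP_MA, active_s=2.0, period_s=10.0)
days = lifetime_days(2000.0, avg)
print(f"average current: {avg:.3f} mA, lifetime: {days:.1f} days")
```

Under these assumptions the node averages about 2 mA and lasts roughly six weeks, which shows why the awake fraction dominates lifetime.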

21
Sensors Slow, Power Hungry, Noisy
22
TinyOS Getting Started
  • The TinyOS home page
  • http://webs.cs.berkeley.edu/tinyos
  • Start with the tutorials!
  • The CVS repository
  • http://sf.net/projects/tinyos
  • The NesC Project Page
  • http://sf.net/projects/nescc
  • Crossbow motes (hardware)
  • http://www.xbow.com
  • Intel Imote
  • www.intel.com/research/exploratory/motes.htm

23
Part 2
  • The Design and Implementation of TinyDB

24
Part 2 Outline
  • TinyDB Overview
  • Data Model and Query Language
  • TinyDB Java API and Scripting
  • Demo with TinyDB GUI
  • TinyDB Internals
  • Extending TinyDB
  • TinyDB Status and Roadmap

25
TinyDB Revisited
SELECT MAX(mag) FROM sensors WHERE mag > thresh
SAMPLE PERIOD 64ms
  • High level abstraction
  • Data centric programming
  • Interact with sensor network as a whole
  • Extensible framework
  • Under the hood
  • Intelligent query processing: query optimization,
    power efficient execution
  • Fault Mitigation: automatically introduce
    redundancy, avoid problem areas

App
Query, Trigger
Data
TinyDB
26
Feature Overview
  • Declarative SQL-like query interface
  • Metadata catalog management
  • Multiple concurrent queries
  • Network monitoring (via queries)
  • In-network, distributed query processing
  • Extensible framework for attributes, commands and
    aggregates
  • In-network, persistent storage

27
Architecture
[Figure: PC side: TinyDB GUI, TinyDB Client API, JDBC, DBMS. Mote side: TinyDB query processor running on each node of the sensor network (nodes 0-8).]
28
Data Model
  • Entire sensor network as one single,
    infinitely-long logical table: sensors
  • Columns consist of all the attributes defined in
    the network
  • Typical attributes
  • Sensor readings
  • Meta-data: node id, location, etc.
  • Internal states: routing tree parent, timestamp,
    queue length, etc.
  • Nodes return NULL for unknown attributes
  • On server, all attributes are defined in
    catalog.xml
  • Discussion: other alternative data models?

29
Query Language (TinySQL)
  • SELECT <aggregates>, <attributes>
  • FROM {sensors | <buffer>}
  • WHERE <predicates>
  • GROUP BY <exprs>
  • SAMPLE PERIOD <const> | ONCE
  • INTO <buffer>
  • TRIGGER ACTION <command>

30
Comparison with SQL
  • Single table in FROM clause
  • Only conjunctive comparison predicates in WHERE
    and HAVING
  • No subqueries
  • No column alias in SELECT clause
  • Arithmetic expressions limited to column op
    constant
  • Only fundamental difference: SAMPLE PERIOD clause

31
TinySQL Examples
Find the sensors in bright nests.
  • SELECT nodeid, nestNo, light
  • FROM sensors
  • WHERE light > 400
  • EPOCH DURATION 1s

Sensors
Epoch  Nodeid  nestNo  Light
0      1       17      455
0      2       25      389
1      1       17      422
1      2       25      405
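
To make the semantics of the bright-nests query concrete, here is a small Python sketch that applies the predicate epoch by epoch. It treats the four rows on the slide as raw readings, which is an assumption (the slide does not say whether they are inputs or outputs); `Reading` and `bright_nests` are hypothetical names, not part of TinyDB.

```python
from typing import NamedTuple


class Reading(NamedTuple):
    epoch: int
    nodeid: int
    nest_no: int
    light: int


# The four rows shown on the slide, treated here as raw sensor readings.
SENSORS = [
    Reading(0, 1, 17, 455),
    Reading(0, 2, 25, 389),
    Reading(1, 1, 17, 422),
    Reading(1, 2, 25, 405),
]


def bright_nests(readings, threshold=400):
    """Mimic SELECT nodeid, nestNo, light FROM sensors WHERE light > threshold,
    producing one batch of result tuples per epoch."""
    results = []
    for epoch in sorted({r.epoch for r in readings}):
        for r in readings:
            if r.epoch == epoch and r.light > threshold:
                results.append((epoch, r.nodeid, r.nest_no, r.light))
    return results


for row in bright_nests(SENSORS):
    print(row)
```

Node 2 is filtered out in epoch 0 (389 is below the threshold) and reappears in epoch 1, illustrating that a continuous query is re-evaluated every epoch.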
32
TinySQL Examples (cont.)
Count the number of occupied nests in each loud
region of the island.
Epoch  region  CNT(...)  AVG(...)
0      North   3         360
0      South   3         520
1      North   3         370
1      South   3         520
33
Event-based Queries
  • ON event SELECT
  • Run query only when interesting events happen
  • Event examples
  • Button pushed
  • Message arrival
  • Bird enters nest
  • Analogous to triggers but events are user-defined

34
Query over Stored Data
  • Named buffers in Flash memory
  • Store query results in buffers
  • Query over named buffers
  • Analogous to materialized views
  • Example
  • CREATE BUFFER name SIZE x (field1 type1, field2
    type2, ...)
  • SELECT a1, a2 FROM sensors SAMPLE PERIOD d INTO
    name
  • SELECT field1, field2, ... FROM name SAMPLE PERIOD d

35
Using the Java API
  • SensorQueryer
  • translateQuery() converts TinySQL string into
    TinyDBQuery object
  • Static query optimization
  • TinyDBNetwork
  • sendQuery() injects query into network
  • abortQuery() stops a running query
  • addResultListener() adds a ResultListener that is
    invoked for every QueryResult received
  • removeResultListener()
  • QueryResult
  • A complete result tuple, or
  • A partial aggregate result, call
    mergeQueryResult() to combine partial results
  • Key difference from JDBC: push vs. pull
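
The push model can be illustrated with a tiny listener registry. This is a language-neutral sketch in Python, not the TinyDB Java API: `ResultSource`, `add_result_listener`, `remove_result_listener`, and `deliver` are hypothetical names that only mirror the shape of the calls listed above.

```python
class ResultSource:
    """Hypothetical stand-in for a network object that pushes results to listeners."""

    def __init__(self):
        self._listeners = []

    def add_result_listener(self, listener):
        self._listeners.append(listener)

    def remove_result_listener(self, listener):
        self._listeners.remove(listener)

    def deliver(self, result):
        """Called when a result arrives; every registered listener is invoked."""
        for listener in list(self._listeners):
            listener(result)


received = []
on_result = received.append          # the callback the application registers

source = ResultSource()
source.add_result_listener(on_result)
source.deliver({"epoch": 0, "nodeid": 1, "light": 455})   # pushed to the listener
source.remove_result_listener(on_result)
source.deliver({"epoch": 1, "nodeid": 1, "light": 422})   # nobody is listening
print(received)
```

With a pull interface the application would instead loop and ask for the next row; here results arrive whenever the network produces them.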

36
Writing Scripts with TinyDB
  • TinyDB's text interface
  • java net.tinyos.tinydb.TinyDBMain run select
  • Query results printed out to the console
  • All motes get reset each time new query is posed
  • Handy for writing scripts with shell, perl, etc.

37
Using the GUI Tools
  • Demo time

38
Inside TinyDB
Multihop Network
Query Processor
10,000 Lines Embedded C Code
5,000 Lines (PC-Side) Java
3200 Bytes RAM (w/ 768 byte heap)
58 kB compiled code (3x larger than 2nd largest TinyOS Program)
Filter: light > 400
Schema
TinyOS
TinyDB
39
Tree-based Routing
  • Tree-based routing
  • Used in
  • Query delivery
  • Data collection
  • In-network aggregation
  • Relationship to indexing?
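
A routing tree of this kind can be sketched as a breadth-first flood from the root, where each node adopts the first neighbor it hears from as its parent. The sketch assumes lossless, symmetric links and a made-up five-node topology; a real deployment would also weigh link quality when choosing parents.

```python
from collections import deque


def build_routing_tree(neighbors, root):
    """Return {node: parent} by flooding from the root; each node adopts
    the first sender it hears as its parent."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nbr in neighbors[node]:
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return parent


def path_to_root(parent, node):
    """Follow parent pointers: the route a data tuple takes to the root."""
    path = [node]
    while parent[node] is not None:
        node = parent[node]
        path.append(node)
    return path


# Hypothetical radio connectivity (symmetric links); node 1 is the root.
LINKS = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
tree = build_routing_tree(LINKS, root=1)
print(tree)
print(path_to_root(tree, 5))
```

Queries travel down this tree and data (or partial aggregates) travel up it, which is why the same structure serves query delivery, data collection, and in-network aggregation.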

40
Power Consumption and Lifetime
  • Power typically supplied by a small battery
  • At full power, device will last 2-3 days ->
    Critical Constraint
  • Lifetime, power consumption varies by application
  • Scales with duty cycle: amount of time on
  • Low data rate (< 1 sample / 30 secs): > 6 months
    possible from AA batteries

Must Synchronize!
Fundamental challenge: distributed coordination
with low power!
41
Time Synchronization
  • All messages include a 5 byte time stamp
    indicating system time in ms
  • Synchronize (e.g. set system time to timestamp)
    with
  • Any message from parent
  • Any new query message (even if not from parent)
  • Punt on multiple queries
  • Timestamps written just after preamble is xmitted
  • All nodes agree that the waking period begins
    when (system time % epoch dur = 0)
  • And lasts for WAKING_PERIOD ms
  • Adjustment of clock happens by changing duration
    of sleep cycle, not wake cycle.
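
The wake schedule described above reduces to modular arithmetic on the shared system time. The following Python sketch shows the idea; the function names and the 10 s epoch with a 2 s waking period are illustrative assumptions, not TinyDB's actual constants.

```python
def is_waking(system_time_ms, epoch_dur_ms, waking_period_ms):
    """A node is awake during the first waking_period_ms of every epoch."""
    return system_time_ms % epoch_dur_ms < waking_period_ms


def ms_until_next_wake(system_time_ms, epoch_dur_ms):
    """Sleep duration until the next instant where system_time % epoch_dur == 0."""
    remainder = system_time_ms % epoch_dur_ms
    return 0 if remainder == 0 else epoch_dur_ms - remainder


def synchronize(local_time_ms, timestamp_ms):
    """Adopt the timestamp carried in a message; return the new time and the correction."""
    return timestamp_ms, timestamp_ms - local_time_ms


EPOCH_MS = 10_000    # assumed epoch duration
WAKING_MS = 2_000    # assumed WAKING_PERIOD

local = 23_000
sleep_before = ms_until_next_wake(local, EPOCH_MS)
local, correction = synchronize(local, 23_150)   # parent's clock is 150 ms ahead
sleep_after = ms_until_next_wake(local, EPOCH_MS)
print(sleep_before, correction, sleep_after)
```

The clock correction shows up as a shorter (or longer) sleep interval, so the waking period itself keeps its full length, matching the last bullet.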

42
Extending TinyDB
  • Why extend TinyDB?
  • New sensors -> attributes
  • New control/actuation -> commands
  • New data processing logic -> aggregates
  • New events
  • Analogous to concepts in object-relational
    databases

43
Adding Attributes
  • Types of attributes
  • Sensor attributes: raw or cooked sensor readings
  • Introspective attributes: parent, voltage, RAM
    usage, etc.
  • Constant attributes: constant values that can be
    statically or dynamically assigned to a mote,
    e.g., nodeid, location, etc.

44
Adding Attributes (cont)
  • Interfaces provided by Attr component
  • StdControl: init, start, stop
  • AttrRegister
  • command registerAttr(name, type, len)
  • event getAttr(name, resultBuf, errorPtr)
  • event setAttr(name, val)
  • command getAttrDone(name, resultBuf, error)
  • AttrUse
  • command startAttr(attr)
  • event startAttrDone(attr)
  • command getAttrValue(name, resultBuf, errorPtr)
  • event getAttrDone(name, resultBuf, error)
  • command setAttrValue(name, val)

45
Adding Attributes (cont)
  • Steps to adding attributes to TinyDB
  • Create attribute nesC components
  • Wire new attribute components to TinyDBAttr
    configuration
  • Reprogram TinyDB motes
  • Add new attribute entries to catalog.xml
  • Constant attributes can be added on the fly
    through TinyDB GUI

46
Adding Aggregates
  • Step 1: wire new nesC components

47
Adding Aggregates (cont)
  • Step 2: add entry to catalog.xml
      <aggregate>
        <name>AVG</name>
        <id>5</id>
        <temporal>false</temporal>
        <readerClass>net.tinyos.tinydb.AverageClass</readerClass>
      </aggregate>
  • Step 3 (optional): implement reader class in Java
  • A reader class interprets and finalizes aggregate
    state received from the mote network, and returns
    the final result as a string for display.

48
TinyDB Status
  • Latest released with TinyOS 1.1 (9/03)
  • Install the task-tinydb package in TinyOS 1.1
    distribution
  • First release in TinyOS 1.0 (9/02)
  • Widely used by research groups as well as
    industry pilot projects
  • Successful deployments in Intel Berkeley Lab and
    redwood trees at UC Botanical Garden
  • Largest deployment: 80 weather station nodes
  • Network longevity: 4-5 months

49
The Redwood Tree Deployment
  • Redwood Grove in UC Botanical Garden, Berkeley
  • Collect dense sensor readings to monitor climatic
    variations across
  • altitudes,
  • angles,
  • time,
  • forest locations, etc.
  • Versus sporadic monitoring points with 30lb
    loggers!
  • Current focus: study how dense sensor data affect
    predictions of conventional tree-growth models

50
Data from Redwoods
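[Figure: sensor data plotted by node height in the tree (scale up to 36m). Node placement: 33m: node 111; 32m: node 110; 30m: nodes 109, 108, 107; 20m: nodes 106, 105, 104; 10m: nodes 103, 102, 101.]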
51
TASK
52
A SensorNet Dilemma
  • Sensors still packaged like HeathKits
  • Pretty hard to cope with out of the box
  • Bare metal encourages one-off applications
  • Inhibits reuse
  • Deployment not intuitive
  • No configuration/monitoring tools
  • SensorNet PhD Factor
  • Today: 2.5 PhDs needed to deploy a SensorNet
  • Needs to be Zero

53
TASK Design Requirements
  • Ease of S/W Installation
  • Deployment tools
  • Reconfigurability
  • Health/Mgmt Monitoring
  • Network Reliability Guarantee
  • Interpretable Sensor Results
  • Tool Integration
  • Audit Trails
  • Lifetime estimates

For Developers
  • Familiar API
  • Extensibility of S/W
  • Modular services

54
Tiny Application Sensor Kit
TASK Client Tools
External Tools
TaskView
Internet
TASK Field Tools
SensorNet Appliance
TASK Server
  • Simplicity vs. Functionality
  • Modularity
  • Remote control
  • Fault Tolerant

TinyDB Sensor Network
55
SensorNet Appliance
SNA
  • Intelligent Gateway
  • Proxy for the sensornet
  • Distributes query
  • Stages results
  • Manages configuration
  • Components
  • TASK Server
  • TinyDB Client (Java)
  • DBMS (PostgreSQL)
  • WebServer (Apache)

http, other
TASKServer
DBMS
ODBC
TinyDB Client
SensorNet
56
Tools
  • Field Tool
  • In-situ diagnostics
  • TaskView
  • Integrated tool for management and monitoring

57
For more information
  • http://triplerock.cs.berkeley.edu/tinydb

58
Part 3
  • Middleware Architecture and Research Topics

59
Architectural Overview
Internet
60
What's Left?
  • TinyDB and TinyOS provide a reasonable low-level
    substrate
  • TASK sufficient for many data collection apps
  • But there are other architecture issues
  • Efficiency concerns
  • Currently transmit readings from all sensors on
    each epoch
  • Variable, context sensitive rates
  • Data quality issues
  • Missing and faulty sensors?
  • Architectural issues
  • Actuation / closed loop issues
  • Disconnection, etc.

61
Sensor Network Research
  • Very active research area
  • Can't summarize it all
  • Focus: database-relevant research topics
  • Some outside of Berkeley
  • Other topics that are itching to be scratched
  • But, some bias towards work that we find
    compelling

62
Topics
  • Improving TinyDB Efficiency
  • In-network aggregation
  • Acquisitional Query Processing
  • Alternative Architectures
  • Statistical Techniques
  • Heterogeneity
  • Intermittent Connectivity
  • New features
  • In-network storage
  • Closing the loop
  • Integration with traditional databases

64
Tiny Aggregation (TAG)
  • In-network processing of aggregates
  • Common data analysis operation
  • Aka gather operation or reduction in parallel
    programming
  • Communication reducing
  • Operator dependent benefit
  • Across nodes during same epoch
  • Exploit query semantics to improve efficiency!

Madden, Franklin, Hellerstein, Hong. Tiny
AGgregation (TAG), OSDI 2002.
65
Basic Aggregation
  • In each epoch
  • Each node samples local sensors once
  • Generates partial state record (PSR)
  • local readings
  • readings from children
  • Outputs PSR during assigned comm. interval
  • Interval assigned based on depth in tree
  • At end of epoch, PSR for whole network output at
    root
  • New result on each successive epoch

66
Illustration: In-Network Aggregation
SELECT COUNT(*) FROM sensors
Interval 4
[Figure: grid of sensors (columns 1-5) against intervals within the sample period (rows 4, 3, 2, 1, 4; time runs downward). Filled so far: interval 4 = 1.]
67
Illustration: In-Network Aggregation
SELECT COUNT(*) FROM sensors
Interval 3
[Figure: sensor/interval grid. Filled so far: interval 4 = 1; interval 3 = 2.]
68
Illustration: In-Network Aggregation
SELECT COUNT(*) FROM sensors
Interval 2
[Figure: sensor/interval grid. Filled so far: interval 4 = 1; interval 3 = 2; interval 2 = 1 and 3.]
69
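Illustration: In-Network Aggregation
SELECT COUNT(*) FROM sensors
Interval 1
[Figure: sensor/interval grid. Filled so far: interval 4 = 1; interval 3 = 2; interval 2 = 1 and 3; interval 1 = 5 (the full count, output at the root).]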
70
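Illustration: In-Network Aggregation
SELECT COUNT(*) FROM sensors
Interval 4
[Figure: sensor/interval grid. Filled so far: interval 4 = 1; interval 3 = 2; interval 2 = 1 and 3; interval 1 = 5; interval 4 of the next epoch = 1.]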
71
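Illustration: In-Network Aggregation
SELECT COUNT(*) FROM sensors
Interval 4
(zzz = asleep, . = awake and listening, number = partial count transmitted)
Interval   Sensor 1   Sensor 2   Sensor 3   Sensor 4   Sensor 5
4          zzz        zzz        zzz        .          1
3          zzz        zzz        .          2          zzz
2          .          1          3          zzz        zzz
1          5          zzz        zzz        zzz        zzz
4          zzz        zzz        zzz        .          1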
72
Aggregation Framework
  • As in extensible databases, TinyDB supports any
    aggregation function conforming to:

Agg_n = {f_init, f_merge, f_evaluate}
f_init{a0} -> <a0>
f_merge{<a1>, <a2>} -> <a12>
f_evaluate{<a1>} -> aggregate value
(<a> denotes a Partial State Record, PSR)

Example: Average
AVG_init{v} -> <v, 1>
AVG_merge{<S1, C1>, <S2, C2>} -> <S1 + S2, C1 + C2>
AVG_evaluate{<S, C>} -> S/C

Restriction: Merge associative, commutative
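
The init/merge/evaluate triple for AVG translates directly into code. The Python sketch below merges partial state records up a routing tree; the tree shape and readings are made-up illustration data, and `aggregate_subtree` is a hypothetical helper that stands in for what each node does during its communication interval.

```python
def avg_init(value):
    """Partial state record for a single reading: (sum, count)."""
    return (value, 1)


def avg_merge(a, b):
    """Combine two partial state records; associative and commutative."""
    return (a[0] + b[0], a[1] + b[1])


def avg_evaluate(state):
    """Turn the final partial state record into the aggregate value."""
    total, count = state
    return total / count


def aggregate_subtree(node, children, readings):
    """PSR for the subtree rooted at node: own reading merged with children's PSRs."""
    state = avg_init(readings[node])
    for child in children.get(node, []):
        state = avg_merge(state, aggregate_subtree(child, children, readings))
    return state


# Hypothetical routing tree (node 1 is the root) and sensor readings.
CHILDREN = {1: [2, 3], 3: [4], 4: [5]}
READINGS = {1: 10.0, 2: 20.0, 3: 30.0, 4: 40.0, 5: 50.0}

root_state = aggregate_subtree(1, CHILDREN, READINGS)
print(root_state, avg_evaluate(root_state))
```

Each node forwards a single (sum, count) pair regardless of subtree size, which is where the communication savings come from.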
73
Taxonomy of Aggregates
  • TAG insight: classify aggregates according to
    various functional properties
  • Yields a general set of optimizations that can
    automatically be applied

Drives an API!
Property               | Examples                                  | Affects
Partial State          | MEDIAN: unbounded, MAX: 1 record          | Effectiveness of TAG
Monotonicity           | COUNT: monotonic, AVG: non-monotonic      | Hypothesis Testing, Snooping
Exemplary vs. Summary  | MAX: exemplary, COUNT: summary            | Applicability of Sampling, Effect of Loss
Duplicate Sensitivity  | MIN: dup. insensitive, AVG: dup. sensitive | Routing Redundancy
74
Use Multiple Parents
  • Use graph structure
  • Increase delivery probability with no
    communication overhead
  • For duplicate insensitive aggregates, or
  • Aggs expressible as sum of parts
  • Send (part of) aggregate to all parents
  • In just one message, via multicast
  • Assuming independence, decreases variance

SELECT COUNT(*)

Single parent:
P(link xmit successful) = p
P(success from A->R) = p^2
E(cnt) = c * p^2
Var(cnt) = c^2 * p^2 * (1 - p^2) = V

# of parents = n:
E(cnt) = n * (c/n * p^2)
Var(cnt) = n * (c/n)^2 * p^2 * (1 - p^2) = V/n
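
The expectation and variance on this slide can be checked numerically. The Python sketch below assumes, as the slide does, that link losses are independent and that a count c is split evenly across n parents, each share reaching the root with probability p^2; `count_stats` is a hypothetical helper name.

```python
def count_stats(c, p, n_parents=1):
    """Expected value and variance of the count that reaches the root when a
    partial count c is split evenly across n_parents two-hop paths."""
    q = p * p                              # both hops must succeed
    share = c / n_parents                  # portion of the count sent to each parent
    expected = n_parents * share * q
    variance = n_parents * share ** 2 * q * (1.0 - q)
    return expected, variance


e1, v1 = count_stats(c=10, p=0.9, n_parents=1)
e2, v2 = count_stats(c=10, p=0.9, n_parents=2)
print(e1, v1)
print(e2, v2)
```

The expected count is unchanged while the variance drops by a factor of n, which is the V/n result stated above.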
75
Multiple Parents Results
  • Better than previous analysis expected!
  • Losses aren't independent!
  • Insight: spreads data over many links

76
Acquisitional Query Processing (ACQP)
  • TinyDB acquires AND processes data
  • Could generate an infinite number of samples
  • An acquisitional query processor controls
  • when,
  • where,
  • and with what frequency data is collected!
  • Versus traditional systems where data is provided
    a priori

Madden, Franklin, Hellerstein, and Hong. The
Design of an Acquisitional Query Processor.
SIGMOD, 2003.
77
ACQP: What's Different?
  • How should the query be processed?
  • Sampling as a first class operation
  • How does the user control acquisition?
  • Rates or lifetimes
  • Event-based triggers
  • Which nodes have relevant data?
  • Index-like data structures
  • Which samples should be transmitted?
  • Prioritization, summary, and rate control

78
Operator Ordering: Interleave Sampling and Selection
At 1 sample / sec, total power savings could be
as much as 3.5mW -> Comparable to processor!
  • SELECT light, mag
  • FROM sensors
  • WHERE pred1(mag)
  • AND pred2(light)
  • EPOCH DURATION 1s
  • E(sampling mag) >> E(sampling light)
  • 1500 uJ vs. 90 uJ
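
The benefit of interleaving sampling with selection can be sketched with a simple expected-cost model: sample the first attribute, and sample the second only if the first predicate passes. The per-sample energies (1500 uJ for mag, 90 uJ for light) come from the slide; the 50% selectivities are assumptions chosen purely for illustration.

```python
E_MAG_UJ = 1500.0     # energy to sample the magnetometer (from the slide)
E_LIGHT_UJ = 90.0     # energy to sample the light sensor (from the slide)


def expected_cost(first_cost, second_cost, first_selectivity):
    """Energy per epoch when the second sensor is sampled only if the
    predicate on the first sensor passes."""
    return first_cost + first_selectivity * second_cost


SEL_MAG = 0.5      # assumed fraction of epochs where pred1(mag) holds
SEL_LIGHT = 0.5    # assumed fraction of epochs where pred2(light) holds

mag_first = expected_cost(E_MAG_UJ, E_LIGHT_UJ, SEL_MAG)
light_first = expected_cost(E_LIGHT_UJ, E_MAG_UJ, SEL_LIGHT)
sample_both = E_MAG_UJ + E_LIGHT_UJ
print(mag_first, light_first, sample_both)
```

Sampling the cheap light sensor first and skipping the magnetometer whenever pred2 fails roughly halves the energy in this example; a traditional plan that samples both attributes before filtering pays the full cost every epoch.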

79
Exemplary Aggregate Pushdown
  • SELECT WINMAX(light,8s,8s)
  • FROM sensors
  • WHERE mag > x
  • EPOCH DURATION 1s
  • Novel, general pushdown technique
  • Mag sampling is the most expensive operation!

80
Topics
  • Improving TinyDB Efficiency
  • In-network aggregation
  • Acquisitional Query Processing
  • Alternative Architectures
  • Statistical Techniques
  • Heterogeneity
  • Intermittent Connectivity
  • New features
  • In-network storage
  • Closing the loop
  • Integration with traditional databases

81
Statistical Techniques
  • Approximations, summaries, and sampling based on
    statistics and statistical models
  • Applications
  • Limited bandwidth and large number of nodes ->
    data reduction
  • Lossiness -> predictive modeling
  • Uncertainty -> tracking correlations and changes
    over time
  • Physical models -> improved query answering

82
TinyDB Retrospective
  • Data aggregation
  • Can reduce communication

TinyDB
Query
SQL-style query
  • Declarative interface
  • Sensor nets are not just for PhDs
  • Decrease deployment time

Every time step
83
Limitations of TinyDB approach
TinyDB
Query
SQL-style query
  • Data collection
  • Every node must wake up at every time step
  • Data loss ignored
  • No quality guarantees
  • Wastes resources by ignoring correlations
  • Query distribution
  • Every node must receive query

Every time step
84
Sensor net data is correlated
  • Data is not i.i.d. -> shouldn't ignore missing
    data
  • Observing one sensor -> information about other
    sensors (and future values)
  • Observing one type of reading -> information about
    other local readings

85
BBQ: Model-driven data acquisition
posterior belief
Probabilistic Model
Example model: Multidimensional Gaussian
Query
Middleware Layer
SQL-style query with desired confidence
  • Strengths of model-based data acquisition
  • Observe fewer attributes
  • Exploit correlations
  • Reuse information between queries
  • Directly deal with missing data
  • Answer more complex (probabilistic) queries

86
Probabilistic models and queries
User's perspective
Query: SELECT nodeId, temp ± 0.5°C, conf(.95)
FROM sensors WHERE nodeId in {1..8}
System selects and observes subset of nodes
Observed nodes: {3,6,8}
Query result:
Node    1     2     3     4     5     6     7     8
Temp.   17.3  18.1  17.4  16.1  19.2  21.3  17.5  16.3
Conf.   98%   95%   100%  99%   95%   100%  98%   100%
87
Supported queries
  • Value query
  • Xi ± ε with prob. at least 1-δ
  • SELECT and Range query
  • Xi ∈ [a,b] with prob. at least 1-δ
  • which sensors have temperature greater than 25°C?
  • Aggregation
  • average ± ε of subset of attribs. with prob. >
    1-δ
  • combine aggregation and selection
  • probability > 10 sensors have temperature greater
    than 25°C?
  • Queries require solution to integrals
  • Many queries computed in closed-form
  • Some require numerical integration/sampling
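
A value query against a Gaussian model has a closed form, which the following Python sketch illustrates for two correlated sensors. It is a minimal example under stated assumptions: a bivariate Gaussian with made-up means, variances, and covariance, standard Gaussian conditioning, and hypothetical function names. It is not the BBQ implementation.

```python
import math


def condition_bivariate(mean_x, mean_y, var_x, var_y, cov_xy, observed_y):
    """Posterior mean and variance of X after observing Y = observed_y
    under a bivariate Gaussian model."""
    gain = cov_xy / var_y
    post_mean = mean_x + gain * (observed_y - mean_y)
    post_var = var_x - gain * cov_xy
    return post_mean, post_var


def prob_within(eps, variance):
    """P(|X - mean| <= eps) for a Gaussian with the given variance."""
    return math.erf(eps / math.sqrt(2.0 * variance))


# Assumed model: two sensors with mean 18 C, unit variance, covariance 0.9.
prior_conf = prob_within(0.5, 1.0)
post_mean, post_var = condition_bivariate(18.0, 18.0, 1.0, 1.0, 0.9, observed_y=19.0)
post_conf = prob_within(0.5, post_var)
print(f"prior confidence in +/-0.5 C: {prior_conf:.3f}")
print(f"posterior estimate: {post_mean:.2f} C, confidence: {post_conf:.3f}")
```

Observing the correlated neighbor tightens the estimate of the unobserved sensor without sampling it; if the resulting confidence is still below the requested level, the system would observe further sensors.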

88
Experimental results
  • Redwood trees and Intel Lab datasets
  • Learned models from data
  • Static model
  • Dynamic model: Kalman filter, time-indexed
    transition probabilities
  • Evaluated on a wide range of queries

89
Cost versus Confidence level
90
Obtaining approximate values
Query: True temperature value ± epsilon with
confidence 95%
91
Next Step Outliers and Unusual Events
  • Once we have a model of the expected behavior, we
    can
  • Detect unusual (low probability) events
  • Predict missing values
  • Often, there are several expected behavior
    modes, which we want to differentiate between
  • E.g., if we can characterize failure modes, we
    can discard them
  • Applying well known probabilistic techniques to
    allow TinyDB to deal with such issues.

92
IDSQ
  • Similar idea: suppose you want to, e.g., localize
    a vehicle in a field of sensors
  • Idea: task sensors in order of best improvement
    to estimate of some value
  • Choose leader(s)
  • Suppress subordinates
  • Task subordinates, one at a time
  • Until some measure of goodness (error bound) is
    met

See Scalable Information-Driven Sensor Querying
and Routing for ad hoc Heterogeneous Sensor
Networks. Chu, Haussecker and Zhao. Xerox TR
P2001-10113. May, 2001.
93
Model location estimate as a point with
2-dimensional Gaussian uncertainty.
Graphical Representation
Principal Axis
94
Lots of Other Work of This Flavor
  • Precision / Energy Tradeoff -- Want nodes to
    sleep except when their data is needed
  • Olston et al. Approximate Caching. SIGMOD 03.
  • Cheng et al. Kalman Filters. SIGMOD 04.
  • Lazaridis and Mehrotra. Approximate Selection
    Queries over Imprecise Data. ICDE 2004.
  • UCI Quasar Project
  • Timeliness Real Time Constraints
  • John A. Stankovic et al. Real Time Communication
    and Coordination in Sensor Networks. Proceedings
    of the IEEE, 91(7), July 2003.
  • Tian He et al. SPEED: a stateless protocol
    (ICDCS 03)

95
In-Net Regression
  • Linear regression: simple way to predict future
    values, identify outliers
  • Regression can be across local or remote values,
    multiple dimensions, or with high degree
    polynomials
  • E.g., node A's readings vs. node B's
  • Or, location (X,Y) versus temperature
  • E.g., over many nodes
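
As a concrete instance of the node A vs. node B case, here is a plain least-squares fit in Python, used to predict one node's reading from another's and to flag outliers. It is a centralized, single-variable sketch with made-up readings and hypothetical function names; the distributed kernel-based scheme on the next slide is considerably more involved.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def is_outlier(x, y, slope, intercept, tolerance):
    """Flag a reading whose distance from the regression line exceeds tolerance."""
    return abs(y - (slope * x + intercept)) > tolerance


# Hypothetical paired readings from node A and node B.
node_a = [10.0, 12.0, 14.0, 16.0]
node_b = [21.0, 25.0, 29.0, 33.0]

slope, intercept = fit_line(node_a, node_b)
print(slope, intercept)
print(is_outlier(18.0, 37.0, slope, intercept, tolerance=2.0))
print(is_outlier(18.0, 45.0, slope, intercept, tolerance=2.0))
```

Once the model is fitted, node B's value can be estimated from node A's alone, which is also how a missing reading could be filled in.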

Guestrin, Thibaux, Bodik, Paskin, Madden.
Distributed Regression: an Efficient Framework
for Modeling Sensor Network Data. Under
submission.
96
In-Net Regression (Continued)
  • Problem: may require data from all sensors to
    build model
  • Solution: partition sensors into overlapping
    kernels that influence each other
  • Run regression in each kernel
  • Requiring just local communication
  • Blend data between kernels
  • Requires some clever matrix manipulation
  • End result: regressed model at every node
  • Useful in failure detection, missing value
    estimation

97
Topics
  • Improving TinyDB Efficiency
  • In-network aggregation
  • Acquisitional Query Processing
  • Alternative Architectures
  • Statistical Techniques
  • Heterogeneity
  • Intermittent Connectivity
  • New features
  • In-network storage
  • Closing the loop
  • Integration with traditional databases

98
Heterogeneous Sensor Networks
  • Leverage small numbers of high-end nodes to
    benefit large numbers of inexpensive nodes
  • Still must be transparent and ad-hoc
  • Key to scalability of sensor networks
  • Interesting heterogeneities
  • Energy: battery vs. outlet power
  • Link bandwidth: Chipcon vs. 802.11x
  • Computing and storage: ATMega128 vs. Xscale
  • Pre-computed results
  • Sensing nodes vs. QP nodes

99
Computing Heterogeneity with TinyDB
  • Separate query processing from sensing
  • Provide query processing on a small number of
    nodes
  • Attract packets to query processors based on
    service value
  • Compare the total energy consumption of the
    network
  • No aggregation
  • All aggregation
  • Opportunistic aggregation
  • HSN proactive aggregation

Mark Yarvis and York Liu, Intel's Heterogeneous
Sensor Network Project,
ftp://download.intel.com/research/people/HSN_IR_Day_Poster_03.pdf.
100
5x7 TinyDB/HSN Mica2 Testbed
101
Data Packet Saving
  • How many aggregators are desired?
  • Does placement matter?

102
Topics
  • Improving TinyDB Efficiency
  • In-network aggregation
  • Acquisitional Query Processing
  • Alternative Architectures
  • Statistical Techniques
  • Heterogeneity
  • Intermittent Connectivity
  • New features
  • In-network storage
  • Closing the loop
  • Integration with traditional databases

103
Occasionally Connected Sensornets
internet
GTWY
Mobile GTWY
Mobile GTWY
Mobile GTWY
GTWY
104
Occasionally Connected Sensornets Challenges
  • Networking support
  • Tradeoff between reliability, power consumption
    and delay
  • Data custody transfer: duplicates?
  • Load shedding
  • Routing of mobile gateways
  • Query processing
  • Operation placement: in-network vs. on mobile
    gateways
  • Proactive pre-computation and data movement
  • Tight interaction between networking and QP

Fall, Hong and Madden, Custody Transfer for
Reliable Delivery in Delay Tolerant Networks,
http://www.intel-research.net/Publications/Berkeley/081220030852_157.pdf.
105
Other Occasionally Connected Work
  • Kevin Fall. Delay Tolerant Networks. SIGCOMM
    2003.
  • Juang et al. Energy efficient computing for
    wildlife tracking. ASPLOS 2002.
  • Li et al. Sending messages to mobile users in
    disconnected ad-hoc wireless networks. MOBICOM
    2000.
  • Shah et al. Data Mules. SNPA 2003.

106
Topics
  • Improving TinyDB Efficiency
  • In-network aggregation
  • Acquisitional Query Processing
  • Alternative Architectures
  • Statistical Techniques
  • Heterogeneity
  • Intermittent Connectivity
  • New features
  • In-network storage
  • Closing the loop
  • Integration with traditional databases

107
Distributed In-network Storage
  • Collectively, sensornets have large amounts of
    in-network storage
  • Good for in-network consumption or caching
  • Challenges
  • Distributed indexing for fast query dissemination
  • Resilience to node or link failures
  • Graceful adaptation to data skews
  • Minimizing index insertion/maintenance cost

108
Example: DIM
  • Functionality
  • Efficient range query for multidimensional data.
  • Approaches
  • Divide sensor field into bins.
  • Locality preserving mapping from m-d space to
    geographic locations.
  • Use geographic routing such as GPSR.
  • Assumptions
  • Nodes know their locations and network boundary
  • No node mobility

Xin Li, Young Jin Kim, Ramesh Govindan and Wei
Hong, Distributed Index for Multi-dimensional
Data (DIM) in Sensor Networks, SenSys 2003.
109
Topics
  • Improving TinyDB Efficiency
  • In-network aggregation
  • Acquisitional Query Processing
  • Alternative Architectures
  • Statistical Techniques
  • Heterogeneity
  • Intermittent Connectivity
  • New features
  • In-network storage
  • Closing the loop
  • Integration with traditional databases

110
Closing the Loop
  • Challenge: want more than data collection
  • Condition-based sensing, rate adjustment
  • Condition-based actuation
  • E.g.,
  • Kansal et al. Sensor Uncertainty Reduction Using
    Low Complexity Actuation. IPSN 2004.
  • Work from Qiong Luo (HKUST) et al. in CIDR.
  • Various process control systems: ladder logic,
    SCADA, etc.
  • Questions
  • Appropriate languages
  • Resource contention on actuators
  • Closed-loop safety concerns

111
Topics
  • Improving TinyDB Efficiency
  • In-network aggregation
  • Acquisitional Query Processing
  • Alternative Architectures
  • Statistical Techniques
  • Heterogeneity
  • Intermittent Connectivity
  • New features
  • In-network storage
  • Closing the loop
  • Integration with traditional databases

112
Alternative Middleware: Integration into an
Existing DBMS
113
Concluding Remarks
  • Sensor networks are an exciting emerging
    technology, with a wide variety of applications
  • Many research challenges in all areas of computer
    science
  • Database community included
  • Some agreement that a declarative interface is
    right
  • TinyDB and other early work are an important
    first step
  • But there's lots more to be done!
  • Real challenge is building appropriate middleware
    abstractions

114
Questions?
http://db.lcs.mit.edu/madden/middleware_tutorial.ppt
115
In-Network Join Strategies
  • Types of joins
  • non-sensor -> sensor
  • sensor -> sensor
  • Optimization questions
  • Should the join be pushed down?
  • If so, where should it be placed?
  • What if a join table exceeds the memory available
    on one node?

116
Choosing Where to Place Operators
  • Idea: choose a join node to run the operator
  • Over time, explore other candidate placements
  • Nodes advertise data rates to their neighbors
  • Neighbors compute expected cost of running the
    join based on these rates
  • Neighbors advertise costs
  • Current join node selects a new, lower cost node
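
The neighbor-to-neighbor migration described above can be sketched as a greedy local search. The Python example below assumes a made-up cost model (data rate times hop count for inputs plus output), a five-node line topology, and hypothetical function names; it is an illustration of the idea, not the algorithm from the cited paper.

```python
def placement_cost(node, source_rates, sink, output_rate, hops):
    """Expected traffic (rate x hops) if the join runs at node."""
    inbound = sum(rate * hops(src, node) for src, rate in source_rates.items())
    return inbound + output_rate * hops(node, sink)


def next_join_node(current, neighbors, cost_of):
    """Move to the cheapest neighbor if it beats the current node, else stay."""
    best = min(neighbors[current], key=cost_of)
    return best if cost_of(best) < cost_of(current) else current


def line_hops(a, b):
    """Hop count on a line topology where node ids are positions."""
    return abs(a - b)


# Assumed setup: nodes 0..4 on a line, sources at 0 and 1, sink at 4.
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
SOURCE_RATES = {0: 5.0, 1: 5.0}    # advertised data rates (tuples/s)


def cost_of(node):
    return placement_cost(node, SOURCE_RATES, sink=4, output_rate=1.0, hops=line_hops)


node = 3
trace = [node]
while True:
    nxt = next_join_node(node, NEIGHBORS, cost_of)
    if nxt == node:
        break
    node = nxt
    trace.append(node)
print(trace, cost_of(node))
```

Because the join output rate is much lower than the input rates in this example, the operator drifts toward the sources and settles next to them.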

Bonfils and Bonnet, Adaptive and Decentralized
Operator Placement for In-Network Query Processing.
IPSN 2003.
117
Topics
  • In-network aggregation
  • Acquisitional Query Processing
  • Heterogeneity
  • Intermittent Connectivity
  • In-network Storage
  • Statistics-based summarization and sampling
  • In-network Joins
  • Adaptivity and Sensor Networks
  • Multiple Queries

118
Adaptivity In Sensor Networks
  • Queries are long running
  • Selectivities change
  • E.g. night vs day
  • Network load and available energy vary
  • All suggest that some adaptivity is needed
  • Of data rates or granularity of aggregation when
    optimizing for lifetimes
  • Of operator orderings or placements when
    selectivities change (c.f., conditional plans for
    correlations)
  • As far as we know, this is an open problem!

119
Multiple Queries and Work Sharing
  • As sensornets evolve, users will run many queries
    simultaneously
  • E.g., traffic monitoring
  • Likely that queries will be similar
  • But have different end points, parameters, etc
  • Would like to share processing, routing as much
    as possible
  • But how? Again, an open problem.