TinyDB - PowerPoint PPT Presentation

1 / 45
About This Presentation
Title:

TinyDB

Description:

Mote side. PC side. 8. 3. Part 1. Query Language. 4. Query Language (TinySQL) ... 1. 2. 3. 4. 5. Mote ID. time. Epoch (10s -100s of seconds) 2-4s Waking Period ... – PowerPoint PPT presentation

Number of Views:249
Avg rating:3.0/5.0
Slides: 46
Provided by: S248
Category:
Tags: tinydb | mote

less

Transcript and Presenter's Notes

Title: TinyDB


1
TinyDB
  • Slides modified from
  • Sam Madden, Wei Hong

2
TinyDB Architecture
TinyDB GUI
JDBC
TinyDB Client API
DBMS
PC side
0
Mote side
0
TinyDB query processor
2
1
3
8
4
5
6
Sensor network
7
3
Part 1
  • Query Language

4
Query Language (TinySQL)
  • SELECT ltaggregatesgt, ltattributesgt
  • FROM sensors ltbuffergt
  • WHERE ltpredicatesgt
  • GROUP BY ltexprsgt
  • SAMPLE PERIOD ltconstgt ONCE
  • INTO ltbuffergt
  • TRIGGER ACTION ltcommandgt

5
Comparison with SQL
  • Single table in FROM clause
  • Only conjunctive comparison predicates in WHERE
    and HAVING
  • No subqueries
  • No column alias in SELECT clause
  • Arithmetic expressions limited to column op
    constant
  • Only fundamental difference SAMPLE PERIOD clause

6
TinySQL Examples
Find the sensors in bright nests.
Sensors
  • SELECT nodeid, nestNo, light
  • FROM sensors
  • WHERE light gt 400
  • EPOCH DURATION 1s

1
7
TinySQL Examples (cont.)
Count the number occupied nests in each loud
region of the island.
8
Data Model
  • Entire sensor network as one single,
    infinitely-long logical table sensors
  • Columns consist of all the attributes defined in
    the network
  • Typical attributes
  • Sensor readings
  • Meta-data node id, location, etc.
  • Internal states routing tree parent, timestamp,
    queue length, etc.
  • Nodes return NULL for unknown attributes
  • On server, all attributes are defined in
    catalog.xml

9
Query over Stored Data
  • Named buffers in Flash memory
  • Store query results in buffers
  • Query over named buffers
  • Analogous to materialized views
  • Example
  • CREATE BUFFER name SIZE x (field1 type1, field2
    type2, )
  • SELECT a1, a2 FROM sensors SAMPLE PERIOD d INTO
    name
  • SELECT field1, field2, FROM name SAMPLE PERIOD d

10
Event-based Queries
  • ON event SELECT
  • Run query only when interesting events happens
  • Event examples
  • Button pushed
  • Message arrival
  • Bird enters nest
  • Analogous to triggers but events are user-defined

11
Inside TinyDB
Multihop Network
Query Processor
10,000 Lines Embedded C Code 5,000 Lines
(PC-Side) Java 3200 Bytes RAM (w/ 768 byte
heap) 58 kB compiled code (3x larger than 2nd
largest TinyOS Program)
Filterlight gt 400
Schema
TinyOS
TinyDB
12
Part 2
  • Routing Structure in TinyDB

13
Tree-based Routing
  • Tree-based routing
  • Used in
  • Query delivery
  • Data collection
  • In-network aggregation
  • Relationship to indexing?

14
Power Management Approach
  • Coarse-grained app-controlled communication
    scheduling

Epoch (10s -100s of seconds)
Mote ID
1
zzz
zzz
2
3
4
5
time
2-4s Waking Period
15
Illustration Aggregation
SELECT COUNT() FROM sensors
Interval 4
Sensor
Epoch
Interval
1
16
Illustration Aggregation
SELECT COUNT() FROM sensors
Interval 3
Sensor
2
Interval
17
Illustration Aggregation
SELECT COUNT() FROM sensors
Interval 2
Sensor
1
3
Interval
18
Illustration Aggregation
SELECT COUNT() FROM sensors
Interval 1
5
Sensor
Interval
19
Illustration Aggregation
SELECT COUNT() FROM sensors
Interval 4
Sensor
Interval
1
20
Time Synchronization
  • All messages include a 5 byte time stamp
    indicating system time in ms
  • Synchronize (e.g. set system time to timestamp)
    with
  • Any message from parent
  • Any new query message (even if not from parent)
  • Punt on multiple queries
  • Timestamps written just after preamble is xmitted
  • All nodes agree that the waking period begins
    when (system time epoch dur 0)
  • And lasts for WAKING_PERIOD ms
  • Adjustment of clock happens by changing duration
    of sleep cycle, not wake cycle.

21
Part 3
  • Aggregation Support in TinyDB

22
Topics
  • In-network aggregation
  • Acquisitional Query Processing
  • Heterogeneity
  • Intermittent Connectivity
  • In-network Storage
  • Statistics-based summarization and sampling
  • In-network Joins
  • Adaptivity and Sensor Networks
  • Multiple Queries

23
Topics
  • In-network aggregation
  • Acquisitional Query Processing
  • Heterogeneity
  • Intermittent Connectivity
  • In-network Storage
  • Statistics-based summarization and sampling
  • In-network Joins
  • Adaptivity and Sensor Networks
  • Multiple Queries

24
Tiny Aggregation (TAG)
  • Aggregate data along the tree
  • Classify aggregates according to various
    functional properties

Madden, Franklin, Hellerstein, Hong. Tiny
AGgregation (TAG), OSDI 2002.
25
Aggregation Framework
  • As in extensible databases, TinyDB supports any
    aggregation function conforming to

Aggnfinit, fmerge, fevaluate Finit a0 ?
lta0gt Fmerge lta1gt,lta2gt ? lta12gt Fevaluate lta1gt
? aggregate value
Partial State Record (PSR)
Example Average AVGinit v ?
ltv,1gt AVGmerge ltS1, C1gt, ltS2, C2gt ? lt S1
S2 , C1 C2gt AVGevaluateltS, Cgt ? S/C
Restriction Merge associative, commutative
26
Part 1
  • Designing an Acquisitional Query Processor

Madden, Franklin, Hellerstein, and Hong. The
Design of An Acqusitional Query Processor.
SIGMOD, 2003.
27
Acquisitional Query Processing (ACQP)
  • Traditional databases assume that data is
    provided a priori
  • Sensing and sensor data processing incurs
    overhead as well!
  • An acqusitional query processor controls
  • When should samples for a particular query be
    taken?
  • In what order should samples for query be taken
    and should sampling be interleaved with
    processing?
  • Is it worth expending computational power to
    process and relay a sample

28
ACQP Whats Different?
  • How should the query be processed?
  • Sampling as a first class operation
  • How does the user control acquisition?
  • Rates or lifetimes
  • Event-based triggers
  • Which nodes have relevant data?
  • Index-like data structures
  • Which samples should be transmitted?
  • Prioritization, summary, and rate control

29
Operator Ordering Interleave Sampling Selection
At 1 sample / sec, total power savings could be
as much as 3.5mW ? Comparable to processor!
  • SELECT light, mag
  • FROM sensors
  • WHERE pred1(mag)
  • AND pred2(light)
  • EPOCH DURATION 1s
  • E(sampling mag) gtgt E(sampling light)
  • 1500 uJ vs. 90 uJ

30
Exemplary Aggregate Pushdown
  • SELECT WINMAX(light,8s,8s)
  • FROM sensors
  • WHERE mag gt x
  • EPOCH DURATION 1s
  • Novel, general pushdown technique
  • Mag sampling is the most expensive operation!

31
Part 3
  • Open Database Research Issues in Sensor Networks

32
Topics
  • In-network aggregation
  • Acquisitional Query Processing
  • Heterogeneity
  • Intermittent Connectivity
  • In-network Storage
  • Statistics-based summarization and sampling
  • In-network Joins
  • Adaptivity and Sensor Networks
  • Multiple Queries

33
Heterogeneous Sensor Networks
  • Leverage small numbers of high-end nodes to
    benefit large numbers of inexpensive nodes
  • Still must be transparent and ad-hoc
  • Key to scalability of sensor networks
  • Interesting heterogeneities
  • Energy battery vs. outlet power
  • Link bandwidth Chipcon vs. 802.11x
  • Computing and storage ATMega128 vs. Xscale
  • Pre-computed results
  • Sensing nodes vs. QP nodes

34
Occasionally Connected Sensornets
internet
GTWY
Mobile GTWY
Mobile GTWY
Mobile GTWY
GTWY
35
Distributed In-network Storage
  • Collectively, sensornets have large amounts of
    in-network storage
  • Good for in-network consumption or caching
  • Challenges
  • Distributed indexing for fast query dissemination
  • Resilience to node or link failures
  • Graceful adaptation to data skews
  • Minimizing index insertion/maintenance cost

36
Statistical Techniques
  • Approximations, summaries, and sampling based on
    statistics
  • Applications
  • Limited bandwidth and large number of nodes -gt
    data reduction
  • Lossiness -gt predictive modeling
  • Uncertainty -gt tracking correlations and changes
    over time

37
In-Net Regression
  • Linear regression simple way to predict future
    values, identify outliers
  • Regression can be across local or remote values,
    multiple dimensions, or with high degree
    polynomials
  • E.g., node A readings vs. node Bs
  • Or, location (X,Y), versus temperature
  • E.g., over many nodes

Guestrin, Thibaux, Bodik, Paskin, Madden.
Distributed Regression an Efficient Framework
for Modeling Sensor Network Data . Under
submission.
38
In-Net Regression (Continued)
  • Problem may require data from all sensors to
    build model
  • Solution partition sensors into overlapping
    kernels that influence each other
  • Run regression in each kernel
  • Requiring just local communication
  • Blend data between kernels
  • Requires some clever matrix manipulation
  • End result regressed model at every node
  • Useful in failure detection, missing value
    estimation

39
Correlated Attributes
  • Data in sensor networks is correlated e.g.,
  • Temperature and voltage
  • Temperature and light
  • Temperature and humidity
  • Temperature and time of day
  • etc.

40
Exploiting Correlations in Query Processing
  • Simple idea
  • Given predicate P(A) over expensive attribute A
  • Replace it with P over cheap attribute A such
    that P evaluates to P
  • Problem unless A and A are perfectly
    correlated, P ? P for all time
  • So we could incorrectly accept or reject some
    readings
  • Alternative use correlations to improve
    selectivity estimates in query optimization
  • Construct conditional plans that vary predicate
    order based on prior observations

41
Exploiting Correlations (Cont.)
  • Insight by observing a (cheap and correlated)
    variable not involved in the query, it may be
    possible to improve query performance
  • Improves estimates of selectivities
  • Use conditional plans
  • Example

42
In-Network Join Strategies
  • Types of joins
  • non-sensor -gt sensor
  • sensor -gt sensor
  • Optimization questions
  • Should the join be pushed down?
  • If so, where should it be placed?
  • What if a join table exceeds the memory available
    on one node?

43
Choosing Where to Place Operators
  • Idea choose a join node to run the operator
  • Over time, explore other candidate placements
  • Nodes advertise data rates to their neighbors
  • Neighbors compute expected cost of running the
    join based on these rates
  • Neighbors advertise costs
  • Current join node selects a new, lower cost node

Bonfils Bonnet, Adaptive and Decentralized
Operator Placement for In-Network QueryProcessing
IPSN 2003.
44
Adaptivity In Sensor Networks
  • Queries are long running
  • Selectivities change
  • E.g. night vs day
  • Network load and available energy vary
  • All suggest that some adaptivity is needed
  • Of data rates or granularity of aggregation when
    optimizing for lifetimes
  • Of operator orderings or placements when
    selectivities change (c.f., conditional plans for
    correlations)
  • As far as we know, this is an open problem!

45
Multiple Queries and Work Sharing
  • As sensornets evolve, users will run many queries
    simultaneously
  • E.g., traffic monitoring
  • Likely that queries will be similar
  • But have different end points, parameters, etc
  • Would like to share processing, routing as much
    as possible
  • But how? Again, an open problem.
Write a Comment
User Comments (0)
About PowerShow.com