1
Monitoring Streams -- A New Class of Data
Management Applications
Don Carney, Brown University
Ugur Çetintemel, Brown University
Mitch Cherniack, Brandeis University
Christian Convey, Brown University
Sangdon Lee, Brown University
Greg Seidman, Brown University
Michael Stonebraker, MIT
Nesime Tatbul, Brown University
Stan Zdonik, Brown University
2
Background
  • MIT/Brown/Brandeis team
  • First Aurora, then Borealis
  • Practical system
  • Designed for Scalability: 10^6 stream inputs,
    queries
  • QoS-Driven Resource Management
  • Stream Storage Management
  • Reliability / Fault Tolerance
  • Distribution and Adaptivity
  • First stream startup: StreamBase
  • Financial applications

3
Example Stream Applications
  • Market Analysis: Streams of Stock Exchange Data
  • Critical Care: Streams of Vital Sign Measurements
  • Physical Plant Monitoring: Streams of Environmental Readings
  • Biological Population Tracking: Streams of Positions from Individuals of a Species

4
Not Your Average DBMS
  1. External, Autonomous Data Sources
  2. Querying Time-Series
  3. Triggers-in-the-large
  4. Real-time response requirements
  5. Noisy Data, Approximate Query Results

5
Outline
  • 2. Aurora Overview / Query Model
  • 3. Runtime Operation
  • 4. Adaptivity


6
Aurora from 100,000 Feet
[Diagram: boxes labeled Query]
7
Aurora from 100 Feet
[Diagram: Aurora network with boxes labeled Slide, Tumble, σ, μ, and ∪]
  • Queries = Workflow (Boxes and Arcs)
  • Workflow Diagram = Aurora Network
  • Boxes = Query Operators
  • Arcs = Streams
  • Query Operators (Boxes)
  • Simple: FILTER, MAP, RESTREAM
  • Binary: UNION, JOIN, RESAMPLE
  • Windowed: TUMBLE, SLIDE, XSECTION, WSORT
  • Streams (Arcs)
  • stream: tuple sequence from common source
    (e.g., sensor)
  • tuples timestamped on arrival (internal use: QoS)
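
The box-and-arc model above can be illustrated with a minimal Python sketch. This is not Aurora's implementation: the names `StreamTuple`, `filter_box`, `map_box`, and `tumble_box` are hypothetical, arcs are modeled as plain Python iterables, and TUMBLE is simplified to fixed-size, non-overlapping, count-based windows.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class StreamTuple:
    timestamp: float  # assigned on arrival, as the slide describes
    value: float


def filter_box(pred: Callable[[StreamTuple], bool],
               arc: Iterable[StreamTuple]) -> Iterator[StreamTuple]:
    """FILTER: pass through only the tuples that satisfy the predicate."""
    return (t for t in arc if pred(t))


def map_box(fn: Callable[[StreamTuple], StreamTuple],
            arc: Iterable[StreamTuple]) -> Iterator[StreamTuple]:
    """MAP: apply a function to every tuple."""
    return (fn(t) for t in arc)


def tumble_box(size: int,
               agg: Callable[[List[StreamTuple]], float],
               arc: Iterable[StreamTuple]) -> Iterator[float]:
    """Simplified TUMBLE (assumption): aggregate consecutive, disjoint windows."""
    window: List[StreamTuple] = []
    for t in arc:
        window.append(t)
        if len(window) == size:
            yield agg(window)
            window = []


# Wire three boxes together; each intermediate iterable plays the role of an arc.
source = [StreamTuple(float(i), v) for i, v in enumerate([3, -1, 4, 1, -5, 9, 2, 6])]
positive = filter_box(lambda t: t.value > 0, source)
doubled = map_box(lambda t: StreamTuple(t.timestamp, t.value * 2), positive)
averages = list(tumble_box(2, lambda w: sum(t.value for t in w) / len(w), doubled))
```

Chaining generators keeps each box independent of its neighbors, which loosely mirrors how a workflow diagram composes operators.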

8
Aurora in Action
[Diagram: Aurora network with boxes labeled Slide, Tumble, σ, μ, ∪, and App]
Arcs = Tuple Queues
Box-at-a-time Scheduling
Outputs Monitored for QoS
9
Continuous and Historical Queries
[Diagram labels: 1 Hour, Connection Point]
10
Quality-of-Service (QoS)
[Figure: three QoS graphs labeled A, B, and C; axis labels: Delay, Tuples Delivered, Output Value]
  • Specifies Utility Of Imperfect Query Results
  • Delay-Based (specify utility of late results)
  • Delivery-Based, Value-Based (specify utility of
    partial results)
  • QoS Influences: Scheduling, Storage Management, Load Shedding
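
A delay-based QoS specification of the kind listed above can be sketched as a piecewise-linear utility function. The helper `make_qos_graph` and the sample points are hypothetical and chosen only for illustration; the slides do not give concrete graphs.

```python
from bisect import bisect_right
from typing import Callable, List, Tuple


def make_qos_graph(points: List[Tuple[float, float]]) -> Callable[[float], float]:
    """Build a utility function by linear interpolation between (x, utility) points.

    Assumes `points` is sorted by strictly increasing x.
    """
    xs = [x for x, _ in points]
    us = [u for _, u in points]

    def utility(x: float) -> float:
        if x <= xs[0]:
            return us[0]
        if x >= xs[-1]:
            return us[-1]
        i = bisect_right(xs, x)  # index of the first point strictly right of x
        x0, x1 = xs[i - 1], xs[i]
        u0, u1 = us[i - 1], us[i]
        return u0 + (u1 - u0) * (x - x0) / (x1 - x0)

    return utility


# Hypothetical delay-based graph: full utility up to 2 sec, none after 6 sec.
delay_qos = make_qos_graph([(0.0, 1.0), (2.0, 1.0), (6.0, 0.0)])
```

The same helper could represent delivery-based or value-based graphs by changing what the x-axis means.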

11
Talk Outline
  • 1. Introduction
  • 2. Aurora Overview
  • 3. Runtime Operation
  • 4. Adaptivity
  • 5. Related Work and Conclusions


12
Runtime Operation: Basic Architecture
Router
Scheduler
Box Processors
QoS Monitor
13
Runtime Operation: Scheduling - Maximize Overall QoS
  • Choice 1: Box A, Cost 1 sec, queued tuple (age 1 sec): Delay 2 sec, Utility 0.5
  • Choice 2: Box B, Cost 2 sec, queued tuple (age 3 sec): Delay 5 sec, Utility 0.8
  • Schedule Box A now rather than later
  • Ideal: Maximize Overall Utility
  • Presently exploring scalable heuristics (e.g., feedback-based)
14
Runtime Operation: Scheduling - Minimizing Per-Tuple Processing Overhead
[Diagram: boxes A and B; box calls A(x), A(y), A(z), B(A(x)), B(A(y)), B(A(z))]
Default Operation: Context Switch
  • Train Scheduling
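
The difference between the default mode and train scheduling can be sketched by counting how often the active box changes. This is an illustrative model only: `run_tuple_at_a_time` and `run_train` are hypothetical names, and treating every change of box as one "switch" is a simplifying assumption about where the overhead comes from.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[str, Callable[[int], int]]


def run_tuple_at_a_time(boxes: Sequence[Box], tuples: Sequence[int]):
    """Push each tuple through every box before taking the next tuple."""
    switches, current, out = 0, None, []
    for t in tuples:
        for name, fn in boxes:
            if name != current:  # the active box changes: count one switch
                switches += 1
                current = name
            t = fn(t)
        out.append(t)
    return out, switches


def run_train(boxes: Sequence[Box], tuples: Sequence[int]):
    """Push the whole batch (a "train") through one box, then through the next."""
    switches = 0
    batch: List[int] = list(tuples)
    for _name, fn in boxes:
        switches += 1  # one switch per box, regardless of batch size
        batch = [fn(t) for t in batch]
    return batch, switches


boxes = [("A", lambda v: v + 1), ("B", lambda v: v * 10)]
per_tuple_out, per_tuple_switches = run_tuple_at_a_time(boxes, [1, 2, 3])
train_out, train_switches = run_train(boxes, [1, 2, 3])
```

Both functions produce the same outputs; only the number of switches differs, growing with the number of tuples in the first case and with the number of boxes in the second.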

15
Runtime Operation: Storage Management
  • 1. Run-time Queue Management
  • Prefetch Queues Prior to Being Scheduled
  • Drop Tuples from Queues to Improve QoS
  • 2. Connection Point Management
  • Support Efficient (Pull-Based) Access to
    Historical Data
  • E.g., indexing, sorting, clustering

16
Talk Outline
  • 1. Introduction
  • 2. Aurora Overview
  • 3. Runtime Operation
  • 4. Adaptivity
  • 5. Related Work and Conclusions


17
Stream Query Optimization
  • Differences with Traditional Query Optimization?

18
Stream Query Optimization
  • New classes of operators (windows) may mean new
    rewrites
  • New execution modes (continuous/pipelining)
  • More dynamic fluctuations in statistics → compile-
    time optimization not possible
  • Global optimization not practical as huge query
    networks → Adaptive optimization.
  • Other cost models taking memory into account, not
    throughput but output rate, etc.
  • Query optimization and load shedding

19
Query Optimization
  • Compile-time, Global Optimization Infeasible
  • Too Many Boxes
  • Too Much Volatility in Network, Data
  • Dynamic, Local Optimization
  • Threshold re when to optimize

20
Motivation of Query Migration
  • Continuous query over streams
  • Statistics unknown before start
  • Statistics changing during execution
  • Stream rates, arrival pattern, distribution, etc
  • Need for dynamic adaptation
  • Plan re-optimization
  • Change the shape of query plan tree

21
Run-time Plan Re-Optimization
  • Step 1 - Decide when to optimize
  • Statistics Monitoring
  • Step 2 - Generate new query plan
  • Query Optimization
  • Step 3 - Replace current plan by new plan
  • Plan Migration
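
The three steps above can be sketched as a small control loop. Everything here is an assumption made for illustration: the relative-drift threshold, the function names, and the use of a classic "most selective filter first" ordering as a stand-in optimizer. Step 3 is reduced to returning the new plan; actual plan migration is the subject of the following slides.

```python
from typing import Dict, List, Tuple


def should_reoptimize(planned: Dict[str, float],
                      observed: Dict[str, float],
                      threshold: float = 0.2) -> bool:
    """Step 1: trigger when any statistic drifts by more than `threshold` (relative)."""
    return any(
        abs(observed[k] - planned[k]) > threshold * planned[k] for k in planned
    )


def generate_plan(selectivity: Dict[str, float]) -> List[str]:
    """Step 2 (stand-in optimizer): run the filter with the lowest pass rate first."""
    return sorted(selectivity, key=selectivity.get)


def reoptimize(current_plan: List[str],
               planned: Dict[str, float],
               observed: Dict[str, float],
               threshold: float = 0.2) -> Tuple[List[str], Dict[str, float]]:
    """Step 3 (simplified): return the plan to run next and the statistics it assumes."""
    if not should_reoptimize(planned, observed, threshold):
        return current_plan, planned
    return generate_plan(observed), dict(observed)


planned_stats = {"f1": 0.1, "f2": 0.5, "f3": 0.9}
plan = generate_plan(planned_stats)
observed_stats = {"f1": 0.8, "f2": 0.5, "f3": 0.3}
new_plan, new_stats = reoptimize(plan, planned_stats, observed_stats)
```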

22
Adaptivity in Query Optimization
Dynamic Optimization Migration
1. Identify Subnetwork
2. Buffer Inputs
3. Drain Subnetwork
4. Optimize Subnetwork
5. Turn on Taps
23
Naïve Plan Migration Strategy
[Diagram: old plan and new plan over inputs A, B, C, each with join boxes AB and BC]
  • Migration Steps
  • Pause execution of old plan
  • Drain out all tuples inside old plan
  • Replace old plan by new plan
  • Resume execution of new plan

Problem: Works for stateless operators only
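
For stateless operators, the four migration steps can be sketched as follows. The `Plan` class and the `migrate` function are hypothetical; a plan is modeled as a chain of pure functions with a single input queue, which is much simpler than a real query network.

```python
from collections import deque
from typing import Callable, Iterable, List, Sequence


class Plan:
    """A chain of stateless operators with one input queue (hypothetical model)."""

    def __init__(self, operators: Sequence[Callable[[int], int]]):
        self.operators = list(operators)
        self.queue: deque = deque()

    def drain(self) -> List[int]:
        """Process every queued tuple and return the outputs."""
        out = []
        while self.queue:
            t = self.queue.popleft()
            for op in self.operators:
                t = op(t)
            out.append(t)
        return out


def migrate(old_plan: Plan, new_plan: Plan, arrivals: Iterable[int]) -> List[int]:
    # (1) Pause: tuples arriving during migration are buffered, not processed.
    buffered = deque(arrivals)
    # (2) Drain out all tuples still inside the old plan.
    outputs = old_plan.drain()
    # (3) Replace: the buffered inputs are handed to the new plan.
    new_plan.queue.extend(buffered)
    # (4) Resume execution using the new plan only.
    return outputs + new_plan.drain()


old = Plan([lambda v: v + 1, lambda v: v * 2])
new = Plan([lambda v: (v + 1) * 2])  # equivalent plan with a different shape
old.queue.extend([1, 2])
results = migrate(old, new, [3, 4])
```

Because no operator keeps state, draining always terminates and nothing is lost when the old plan is discarded.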
24
Stateful Operator in CQ
  • Why stateful?
  • Need non-blocking operators in CQ
  • Operator needs to output partial results
  • State data structure keeps received tuples

Example: Symmetric NL join w/ window constraints
[Diagram: join box AB over inputs A and B, with State A and State B; tuples shown: ax, b1, b2, b3, b4, b5]
Key Observation: The purge of tuples in states
relies on processing of new tuples.
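
The key observation can be made concrete with a small sketch of a symmetric window join. This is an assumed, simplified design: the class name `SymmetricWindowJoin` is hypothetical, the join is an equality join on a key, the window is time-based, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import deque
from typing import List, Tuple

Item = Tuple[float, str]  # (timestamp, join key)


class SymmetricWindowJoin:
    """Each input keeps a state; a new tuple purges and probes the opposite state."""

    def __init__(self, window: float):
        self.window = window
        self.state = {"A": deque(), "B": deque()}

    def insert(self, side: str, ts: float, key: str) -> List[Tuple[Item, Item]]:
        other = "B" if side == "A" else "A"
        opposite = self.state[other]
        # Purge: expired tuples are removed only now, when a new tuple arrives.
        while opposite and ts - opposite[0][0] > self.window:
            opposite.popleft()
        new = (ts, key)
        # Probe the opposite state for matches (nested-loop scan).
        matches = [o for o in opposite if o[1] == key]
        # Insert the new tuple into its own state.
        self.state[side].append(new)
        if side == "A":
            return [(new, m) for m in matches]
        return [(m, new) for m in matches]


join = SymmetricWindowJoin(window=5)
first = join.insert("A", 0, "x")   # nothing to match yet
second = join.insert("B", 1, "x")  # matches the A tuple
third = join.insert("B", 10, "y")  # purges the expired A tuple, matches nothing
```

If no further tuples arrived after the second insert, the A tuple would stay in State A indefinitely, which is why simply waiting for a plan to drain does not empty a stateful operator.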
25
Naïve Migration Strategy Revisited
Deadlock Waiting Problem
[Diagram: plan with join boxes AB and BC over inputs A, B, C]
  • Steps
  • (1) Pause execution of old plan
  • (2) Drain out all tuples inside old plan
  • (3) Replace old plan by new plan
  • (4) Resume execution of new plan

Diagram annotations: (2) All tuples drained; (3) Old Replaced By New; (4) Processing Resumed
26
Adaptivity: Query Optimization
  • State Movement Protocol
  • Parallel Track Protocol

27
Moving State Strategy
  • Basic idea
  • Share common states between two migration boxes
  • Key steps
  • State Matching
  • Match states based on IDs.
  • State Moving
  • Create new pointers for matched states in new box
  • What's left?
  • Unmatched states in new box

[Diagram: Old Box and New Box, each with input queues QA, QB, QC, QD, output queue QABCD, join boxes AB, BC, CD, and states drawn from SA, SB, SC, SD, SAB, SBC, SABC, SBCD]
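
State matching and state moving can be sketched with dictionaries keyed by state ID. The function `move_states` is hypothetical, and the assignment of state IDs to the old and new box below is an assumption read off the diagram labels, not something the slide states explicitly.

```python
from typing import Dict, List, Tuple


def move_states(old_states: Dict[str, list],
                new_state_ids: List[str]) -> Tuple[Dict[str, list], List[str]]:
    """Match states by ID; matched states are shared by reference, not copied."""
    new_states: Dict[str, list] = {}
    unmatched: List[str] = []
    for sid in new_state_ids:
        if sid in old_states:
            new_states[sid] = old_states[sid]  # new pointer to the same state
        else:
            new_states[sid] = []               # what's left: must be built separately
            unmatched.append(sid)
    return new_states, unmatched


# Assumed example: the old box holds SAB and SABC, the new box needs SBC and SBCD.
old_box = {"SA": ["a1"], "SB": ["b1"], "SC": ["c1"], "SD": ["d1"],
           "SAB": ["a1b1"], "SABC": ["a1b1c1"]}
new_box, unmatched_ids = move_states(old_box, ["SA", "SB", "SC", "SD", "SBC", "SBCD"])
```

Sharing by reference means no tuples are copied for matched states; only the unmatched intermediate states still need to be computed.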
28
Parallel Track Strategy
  • Basic idea
  • Execute both plans in parallel and gradually
    push old tuples out of old box by purging
  • Key steps
  • Connect boxes
  • Execute in parallel
  • Until old box expired (no old tuple or
    sub-tuple)
  • Disconnect old box
  • Start executing new box only

[Diagram: old box and new box connected in parallel, each with input queues QA, QB, QC, QD, output queue QABCD, join boxes AB, BC, CD, and states drawn from SA, SB, SC, SD, SAB, SBC, SABC, SBCD]
29
Adaptivity: Load Shedding
  • 1. Two Load Shedding Techniques
  • Random Tuple Drops
  • Add DROP box to network (DROP a special case of
    FILTER)
  • Position to affect queries w/ tolerant
    delivery-based QoS requirements
  • Semantic Load Shedding
  • FILTER values with low utility (according to
    value-based QoS)
  • 2. Triggered by QoS Monitor
  • e.g., after Latency Analysis reveals certain
    applications are continuously receiving poor QoS
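
The two shedding techniques above can be sketched as filters over a stream. The names `drop_box` and `semantic_shed`, the drop rate, and the sample utility function are hypothetical; a real system would also need to decide where in the network to place such boxes.

```python
import random
from typing import Callable, Iterable, Iterator


def drop_box(arc: Iterable[int], drop_rate: float,
             rng: random.Random) -> Iterator[int]:
    """Random DROP: a FILTER whose decision ignores the tuple's content."""
    return (t for t in arc if rng.random() >= drop_rate)


def semantic_shed(arc: Iterable[int],
                  value_utility: Callable[[int], float],
                  min_utility: float) -> Iterator[int]:
    """Semantic shedding: FILTER out values whose utility is below a cutoff."""
    return (t for t in arc if value_utility(t) >= min_utility)


def sample_utility(value: int) -> float:
    """Hypothetical value-based QoS: readings of 7 or more matter most."""
    return 1.0 if value >= 7 else 0.1


readings = list(range(10))
randomly_kept = list(drop_box(readings, 0.5, random.Random(0)))
semantically_kept = list(semantic_shed(readings, sample_utility, 0.5))
```

Random drops reduce load without looking at values, while semantic shedding keeps the tuples that the value-based QoS graph rates as most useful.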

30
Adaptivity: Detecting Overload
  • Throughput Analysis

Cost c, Selectivity s
Input rate r
Input rate exceeds box capacity (r > 1/c) ⇒ Problem
Latency Analysis
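
Throughput analysis over a chain of boxes can be sketched as below. This rests on stated assumptions: cost c is taken to be processing time per tuple, so a box sustains at most 1/c tuples per second and is overloaded when the input rate r exceeds that; the output-rate formula min(1/c, r) * s and the function names are likewise assumptions for illustration.

```python
from typing import List, Tuple

BoxSpec = Tuple[str, float, float]  # (name, cost in sec per tuple, selectivity)


def output_rate(cost: float, selectivity: float, input_rate: float) -> float:
    """Assumed model: a box emits at most (1/cost) * selectivity tuples per second."""
    return min(1.0 / cost, input_rate) * selectivity


def is_overloaded(cost: float, input_rate: float) -> bool:
    """A box cannot keep up when its input rate exceeds its capacity 1/cost."""
    return input_rate > 1.0 / cost


def find_overloaded(chain: List[BoxSpec], input_rate: float) -> List[str]:
    """Propagate rates through a linear chain and report boxes that fall behind."""
    overloaded = []
    rate = input_rate
    for name, cost, selectivity in chain:
        if is_overloaded(cost, rate):
            overloaded.append(name)
        rate = output_rate(cost, selectivity, rate)
    return overloaded


chain = [("filter", 0.001, 0.5), ("join", 0.01, 1.0)]
light_load = find_overloaded(chain, 150.0)
heavy_load = find_overloaded(chain, 400.0)
```

In the heavy-load case the filter keeps up, but it forwards 200 tuples per second to a join that can handle only 100, so the problem appears downstream of the input.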
31
Implementation: GUI
32
Implementation: Runtime
33
Conclusions
  • Aurora Stream Query Processing System
  • Designed for Scalability
  • QoS-Driven Resource Management
  • Continuous and Historical Queries
  • Stream Storage Management
  • Implemented Prototype
  • Web site: www.cs.brown.edu/research/aurora/