Monitoring Streams -- A New Class of Data Management Applications


Title: Monitoring Streams -- A New Class of Data Management Applications


1
Monitoring Streams -- A New Class of Data Management Applications
  • Don Carney, Brown University
  • Ugur Çetintemel, Brown University
  • Mitch Cherniack, Brandeis University
  • Christian Convey, Brown University
  • Sangdon Lee, Brown University
  • Greg Seidman, Brown University
  • Michael Stonebraker, MIT
  • Nesime Tatbul, Brown University
  • Stan Zdonik, Brown University
2
Example Stream Applications
  • Critical Care: Streams of Vital Sign Measurements
  • Physical Plant Monitoring: Streams of Environmental Readings
  • Market Analysis: Streams of Stock Exchange Data
  • Biological Population Tracking: Streams of Positions from Individuals of a Species

3
Not Your Average DBMS
  1. External, Autonomous Data Sources
  2. Querying Time-Series
  3. Triggers-in-the-large
  4. Real-time response requirements
  5. Approximate Query Results

4
Aurora At-A-Glance
  • Stream Query Processing System
  • 3 Schools, 5 Faculty, 11 Grad Students, Several Undergrads
  • Features
  • Designed for Scalability: 10^6 stream inputs, queries
  • QoS-Driven Resource Management
  • Continuous and Historical Queries
  • Stream Storage Management
  • Implemented Prototype: Demo Submission, Fall 02
  • This paper
  • System Overview: Architecture and High-Level Strategies

5
Talk Outline
  • 1. Introduction
  • 2. Aurora Overview
  • 3. Runtime Operation
  • 4. Adaptivity
  • 5. Related Work and Conclusions


6
Aurora from 100,000 Feet
[Figure: high-level system diagram; surviving label: Query (three occurrences)]
7
Aurora from 100 Feet
[Figure: example Aurora network; surviving box labels: Slide, Tumble, σ, μ, ∪]
  • Queries: Workflow (Boxes and Arcs)
  • Workflow Diagram: Aurora Network
  • Boxes: Query Operators
  • Arcs: Streams
  • Query Operators (Boxes)
  • Simple: FILTER, MAP, RESTREAM
  • Binary: UNION, JOIN, RESAMPLE
  • Windowed: TUMBLE, SLIDE, XSECTION, WSORT
  • Streams (Arcs)
  • stream: tuple sequence from common source (e.g., sensor)
  • tuples timestamped on arrival (internal use: QoS)
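The boxes-and-arcs model above can be sketched as a chain of Python generators. This is only an illustration, not Aurora's implementation: the function names, the dictionary tuple layout, the heart-rate readings, and the count-based window used for TUMBLE are all assumptions made for the example.

```python
def stamp(values):
    """Arc from a source: attach an arrival timestamp to each value (assumed integer clock)."""
    for ts, v in enumerate(values):
        yield {"ts": ts, "value": v}

def filter_box(pred, stream):
    """FILTER: pass only tuples whose value satisfies the predicate."""
    return (t for t in stream if pred(t["value"]))

def map_box(fn, stream):
    """MAP: apply a function to each tuple's value, keeping its timestamp."""
    return ({"ts": t["ts"], "value": fn(t["value"])} for t in stream)

def tumble_box(size, agg, stream):
    """TUMBLE (simplified assumption): aggregate consecutive, non-overlapping
    windows of `size` tuples; the output carries the last timestamp in the window."""
    window = []
    for t in stream:
        window.append(t)
        if len(window) == size:
            yield {"ts": window[-1]["ts"], "value": agg(x["value"] for x in window)}
            window = []

# Hypothetical heart-rate readings; 0 marks a failed measurement.
readings = [72, 0, 75, 80, 78]
network = tumble_box(
    2, max,
    map_box(lambda bpm: bpm - 70,          # deviation from an assumed baseline of 70
            filter_box(lambda bpm: bpm > 0, stamp(readings))))
result = list(network)
```

Each generator plays the role of a box, and passing one generator into the next plays the role of an arc.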

8
Aurora in Action
[Figure: animated Aurora network with two App outputs; surviving box labels: Slide, Tumble, σ, μ, ∪]
  • Arcs: Tuple Queues
  • Box-at-a-time Scheduling
  • Outputs Monitored for QoS
9
Continuous and Historical Queries
[Figure: surviving labels: 1 Hour, Connection Point]
10
Quality-of-Service (QoS)
[Figure: three QoS graphs labeled A, B, C; surviving axis labels: Delay, Tuples Delivered, Output Value]
  • Specifies Utility of Imperfect Query Results
  • Delay-Based (specify utility of late results)
  • Delivery-Based, Value-Based (specify utility of partial results)
  • QoS Influences: Scheduling, Storage Management, Load Shedding
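One way to picture a QoS specification is as a piecewise-linear graph mapping an observed quantity (delay, or fraction of tuples delivered) to a utility between 0 and 1. The sketch below assumes that representation; `make_qos_graph` and the breakpoints are hypothetical and are not taken from the slides.

```python
def make_qos_graph(points):
    """Return a utility function that linearly interpolates between (x, utility) points.
    Values left of the first point or right of the last are clamped."""
    pts = sorted(points)

    def utility(x):
        if x <= pts[0][0]:
            return pts[0][1]
        for (x0, u0), (x1, u1) in zip(pts, pts[1:]):
            if x <= x1:
                return u0 + (u1 - u0) * (x - x0) / (x1 - x0)
        return pts[-1][1]

    return utility

# Hypothetical delay-based graph: full utility up to 1 sec, none after 5 sec.
delay_qos = make_qos_graph([(0, 1.0), (1, 1.0), (5, 0.0)])

# Hypothetical delivery-based graph: utility as a function of the fraction of tuples delivered.
delivery_qos = make_qos_graph([(0.0, 0.0), (0.5, 0.8), (1.0, 1.0)])

on_time = delay_qos(0.5)     # inside the flat region
late = delay_qos(3)          # halfway down the slope
too_late = delay_qos(10)     # clamped at the last point
```

A delivery-based graph with a shallow slope near 1.0, like the one above, describes an application that tolerates some dropped tuples.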

11
Talk Outline
  • 1. Introduction
  • 2. Aurora Overview
  • 3. Runtime Operation
  • 4. Adaptivity
  • 5. Related Work and Conclusions


12
Runtime Operation: Basic Architecture
[Figure: architecture diagram; labeled components: Router, Scheduler, Box Processors, QoS Monitor]
13
Runtime Operation: Scheduling (Maximize Overall QoS)
  • Choice 1: Box A, Cost 1 sec, tuple (, age 1 sec): Delay 2 sec, Utility 0.5
  • Choice 2: Box B, Cost 2 sec, tuple (, age 3 sec): Delay 5 sec, Utility 0.8
  • Schedule Box A now rather than later
  • Ideal: Maximize Overall Utility
  • Presently exploring scalable heuristics (e.g., feedback-based)
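The numbers on this slide are consistent with delay = tuple age + time until the box finishes (1 + 1 = 2 sec for A, 3 + 2 = 5 sec for B). The sketch below compares the two possible execution orders under that reading. The linear utility curves are hypothetical: they were chosen only so that they pass through the slide's points (delay 2, utility 0.5) and (delay 5, utility 0.8), and the slide does not give the real curves.

```python
from itertools import permutations

# Costs and ages are from the slide; the "qos" curves are assumptions.
boxes = {
    "A": {"cost": 1.0, "age": 1.0, "qos": lambda d: max(0.0, 1.0 - 0.25 * d)},
    "B": {"cost": 2.0, "age": 3.0, "qos": lambda d: max(0.0, 1.0 - 0.04 * d)},
}

def total_utility(order):
    """Run the boxes back to back in `order` and sum the utility of each output."""
    clock, total = 0.0, 0.0
    for name in order:
        box = boxes[name]
        clock += box["cost"]            # time at which this box finishes
        delay = box["age"] + clock      # age of the tuple when it is output
        total += box["qos"](delay)
    return total

# Exhaustive search over orders; fine for two boxes, not for a large network.
best = max(permutations(boxes), key=total_utility)
```

With these assumed curves, running A first wins because A's utility falls quickly with delay while B's barely changes. Enumerating every order grows factorially with the number of boxes, which is why the slide points to scalable heuristics instead.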
14
Runtime Operation: Scheduling (Minimizing Per-Tuple Processing Overhead)
[Figure: box A feeding box B; execution order shown: A(x), A(y), A(z), B(A(x)), B(A(y)), B(A(z))]
Default Operation Context Switch
  • Train Scheduling
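A rough way to see the benefit of train scheduling is to count how often execution moves from one box to another. The sketch below assumes that the default operation pushes each tuple through A and then B before taking the next tuple, and that a train runs all queued tuples through A before moving to B, as in the order listed on the slide. `count_switches` is a hypothetical helper.

```python
def count_switches(schedule):
    """Count how many times consecutive steps run on different boxes."""
    return sum(1 for (prev, _), (cur, _) in zip(schedule, schedule[1:]) if prev != cur)

tuples = ["x", "y", "z"]

# Assumed default: A(x), B(A(x)), A(y), B(A(y)), ...
tuple_at_a_time = [(box, t) for t in tuples for box in ("A", "B")]

# Train: A(x), A(y), A(z), B(A(x)), B(A(y)), B(A(z))
train = [(box, t) for box in ("A", "B") for t in tuples]

default_switches = count_switches(tuple_at_a_time)
train_switches = count_switches(train)
```

For three tuples and two boxes the assumed default switches boxes five times and the train switches once, and the gap widens as the train gets longer.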

15
Runtime Operation: Storage Management
  • 1. Run-time Queue Management
  • Prefetch Queues Prior to Being Scheduled
  • Drop Tuples from Queues to Improve QoS
  • 2. Connection Point Management
  • Support Efficient (Pull-Based) Access to
    Historical Data
  • E.g., indexing, sorting, clustering,
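As a minimal sketch of a connection point, the class below forwards every tuple downstream (the continuous path) while keeping a bounded history that can be queried on demand (the pull-based path). The class name, its methods, and the eviction rule are assumptions for illustration. The one-hour retention echoes the "1 Hour" label on the Continuous and Historical Queries slide.

```python
from collections import deque

class ConnectionPoint:
    """Keeps the last `retention` seconds of tuples for historical (pull-based) access."""

    def __init__(self, retention):
        self.retention = retention
        self.history = deque()          # (timestamp, value) pairs in arrival order

    def push(self, ts, value):
        """Record an arriving tuple, evict expired history, and pass the value on."""
        self.history.append((ts, value))
        while self.history and self.history[0][0] <= ts - self.retention:
            self.history.popleft()
        return value                    # continues downstream unchanged

    def query(self, since):
        """Pull-based access: all retained values with timestamp >= `since`."""
        return [v for ts, v in self.history if ts >= since]

cp = ConnectionPoint(retention=3600)    # one hour, in seconds
for ts, value in [(0, "a"), (1800, "b"), (3600, "c"), (5400, "d")]:
    cp.push(ts, value)

retained = cp.query(since=0)
recent = cp.query(since=4000)
```

A linear scan is enough here; the indexing, sorting, and clustering mentioned on the slide would replace it when the retained history is large.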

16
Talk Outline
  • 1. Introduction
  • 2. Aurora Overview
  • 3. Runtime Operation
  • 4. Adaptivity
  • 5. Related Work and Conclusions


17
Adaptivity: Query Optimization
  • Compile-time, Global Optimization Infeasible
  • Too Many Boxes
  • Too Much Volatility in Network, Data

Dynamic, Local Optimization
1. Identify Subnetwork
2. Buffer Inputs
3. Drain Subnetwork
4. Optimize Subnetwork
5. Turn on Taps
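The five steps can be sketched on a toy subnetwork made of a chain of filters. Everything here is an assumption for illustration: the `Subnetwork` class, its method names, and the choice of optimization (running the most selective filter first, which is safe because filters commute) are not taken from the slides.

```python
from collections import deque

class Subnetwork:
    """1. The identified subnetwork: a chain of (name, predicate, selectivity) filters."""

    def __init__(self, boxes):
        self.boxes = list(boxes)
        self.queue = deque()     # tuples already inside the subnetwork
        self.held = deque()      # tuples buffered at the inputs during optimization
        self.taps_open = True

    def push(self, t):
        if self.taps_open:
            self.queue.append(t)
        else:
            self.held.append(t)

    def close_taps(self):
        """2. Buffer inputs: new arrivals are held rather than admitted."""
        self.taps_open = False

    def drain(self):
        """3. Drain: process every tuple currently inside the subnetwork."""
        out = []
        while self.queue:
            t = self.queue.popleft()
            if all(pred(t) for _name, pred, _sel in self.boxes):
                out.append(t)
        return out

    def optimize(self):
        """4. Optimize (assumed rule): most selective filter first."""
        self.boxes.sort(key=lambda box: box[2])

    def open_taps(self):
        """5. Turn on taps: admit the buffered tuples and resume."""
        self.taps_open = True
        self.queue.extend(self.held)
        self.held.clear()

net = Subnetwork([("even", lambda t: t % 2 == 0, 0.5),
                  ("big", lambda t: t > 90, 0.1)])
for t in (92, 95, 4):
    net.push(t)
net.close_taps()
net.push(96)                 # arrives during optimization, so it is held
before = net.drain()
net.optimize()
net.open_taps()
after = net.drain()
order = [name for name, _pred, _sel in net.boxes]
```

Draining before rewriting means no tuple is processed by a half-changed subnetwork, and the buffered inputs mean nothing is lost while the change is made.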
18
Adaptivity: Load Shedding
  • 1. Two Load Shedding Techniques
  • Random Tuple Drops
  • Add DROP box to network (DROP is a special case of FILTER)
  • Position to affect queries with tolerant delivery-based QoS requirements
  • Semantic Load Shedding
  • FILTER values with low utility (according to value-based QoS)
  • 2. Triggered by QoS Monitor
  • e.g., after Latency Analysis reveals certain applications are continuously receiving poor QoS
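Both shedding techniques can be written as filters, which is the sense in which DROP is a special case of FILTER. The sketch below is illustrative only: the function names, the heart-rate readings, and the value-utility function are hypothetical.

```python
import random

def drop_box(drop_rate, stream, rng):
    """Random DROP: a FILTER whose predicate ignores the tuple's content.
    Each tuple is discarded with probability `drop_rate`."""
    return (t for t in stream if rng.random() >= drop_rate)

def semantic_drop_box(value_utility, threshold, stream):
    """Semantic shedding: FILTER out tuples whose value has utility below `threshold`."""
    return (t for t in stream if value_utility(t) >= threshold)

# Hypothetical value-based QoS: abnormal heart rates matter more than normal ones.
def heart_rate_utility(bpm):
    return 1.0 if bpm < 50 or bpm > 120 else 0.2

readings = [72, 130, 68, 45, 80]

kept_semantic = list(semantic_drop_box(heart_rate_utility, 0.5, readings))
kept_random = list(drop_box(0.5, readings, random.Random(0)))   # seeded for repeatability
kept_all = list(drop_box(0.0, readings, random.Random(0)))
kept_none = list(drop_box(1.0, readings, random.Random(0)))
```

The random box sheds the same fraction of load regardless of content, while the semantic box keeps exactly the readings the application values most.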

19
Adaptivity: Detecting Overload
  • Throughput Analysis
  • Each box: Cost c, Selectivity s, Input rate r
  • 1/c < r ⇒ Problem
  • Latency Analysis
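Throughput analysis can be sketched for a simple chain of boxes. A box with per-tuple cost c can process at most 1/c tuples per second, so it falls behind whenever its input rate r exceeds 1/c. The rule used below to propagate rates, output rate = min(r, 1/c) * s, is an assumption of this sketch, as are the function name and the example numbers.

```python
def throughput_analysis(boxes, input_rate):
    """Walk a chain of (name, cost, selectivity) boxes.
    Returns the names of boxes that cannot keep up, and the final output rate."""
    overloaded = []
    r = input_rate
    for name, cost, selectivity in boxes:
        capacity = 1.0 / cost            # max tuples per second this box can process
        if r > capacity:
            overloaded.append(name)
        r = min(r, capacity) * selectivity
    return overloaded, r

# Hypothetical chain: a cheap filter followed by an expensive box.
chain = [("filter", 0.25, 0.5),          # 0.25 sec/tuple, passes half its input
         ("join", 1.0, 1.0)]             # 1 sec/tuple, passes everything

overloaded, output_rate = throughput_analysis(chain, input_rate=3.0)
```

Here the filter keeps up (3 < 4 tuples/sec) but hands 1.5 tuples/sec to a box that can process only 1, so the second box is flagged.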
20
Talk Outline
  • 1. Introduction
  • 2. Aurora Overview
  • 3. Runtime Operation
  • 4. Adaptivity
  • 5. Related Work and Conclusions


21
Related Work
  • Stream Processing Systems
  • Niagara [CDTY00], STREAM [BW01], Tribeca [SH98]
  • Telegraph [MF02, MSHR02]
  • Adaptive Query Processing
  • Eddies [AH00], Tukwila [IFFLW99], Query Scrambling [AFTU96]
  • Multiple Query Optimization
  • [SG90], [RC88]
  • Approximate Query Answering
  • Online Aggregation [HHW97], AQUA [AGP99]
  • Active Databases
  • [PD99], [SPAM91], [HC99]
  • Continuous Queries
  • Tapestry [TGNO92], OpenCQ [LPT99], Chronicle [JMS95]

22
Conclusions
  • Aurora Stream Query Processing System
  • Designed for Scalability
  • QoS-Driven Resource Management
  • Continuous and Historical Queries
  • Stream Storage Management
  • Implemented Prototype
  • Web site: www.cs.brown.edu/research/aurora/

23
Implementation: GUI
24
Implementation: Runtime