Dynamic Plan Migration for Continuous Query over Data Streams

Transcript and Presenter's Notes

Title: Dynamic Plan Migration for Continuous Query over Data Streams


1
Dynamic Plan Migration for Continuous Query over Data Streams
  • Yali Zhu, Elke Rundensteiner and George Heineman
  • Database System Research Group
  • Worcester Polytechnic Institute
  • Massachusetts, USA

Research partly supported by the RDC grant 2003-04 on On-line Stream Monitoring Systems: Untethered Healthcare, Intrusion Detection, and Beyond.
2
Motivation
  • Continuous query over streams
  • Statistics unknown before start
  • Statistics changing during execution
  • Stream rates, arrival patterns, distributions, etc.
  • Need for dynamic adaptation
  • Plan re-optimization
  • Change the shape of the query plan tree

3
Run-time Plan Re-Optimization
  • Step 1: Decide when to optimize
  • Statistics Monitoring
  • Step 2: Generate new query plan
  • Query Optimization
  • Step 3: Replace current plan by new plan
  • Plan Migration

4
Naïve Plan Migration Strategy
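[Figure: two join plans over input streams A, B and C, one computing join AB before join BC and the other computing join BC before join AB]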
  • Migration Steps
  • Pause execution of old plan
  • Drain out all tuples inside old plan
  • Replace old plan by new plan
  • Resume execution of new plan

Problem: Works for stateless operators only
5
Stateful Operator in CQ
  • Why stateful?
  • Need non-blocking operators in CQ
  • Operator needs to output partial results
  • State data structure keeps received tuples

Example: Symmetric NL join with window constraints
[Figure: symmetric join AB over inputs A and B, showing an A tuple ax, State A, and State B holding tuples b1 to b5]
Key Observation: The purge of tuples in states relies on the processing of new tuples.
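The key observation can be illustrated with a minimal sketch of a symmetric window join. The class name, the `insert` method and the tuple layout are assumptions made for illustration, not the CAPE implementation; timestamps are assumed to arrive in non-decreasing order. Note that a state is purged only when a tuple arrives on the opposite input.

```python
from collections import deque


class SymmetricWindowJoin:
    """Illustrative symmetric nested-loop join with a sliding time window."""

    def __init__(self, window, predicate=lambda a, b: True):
        self.window = window
        self.predicate = predicate
        # One state per input; each entry is (timestamp, value).
        self.state = {"A": deque(), "B": deque()}

    def insert(self, side, ts, value):
        """Process one arriving tuple: purge, probe, then insert."""
        other = "B" if side == "A" else "A"
        opposite = self.state[other]
        # Purge: expired tuples leave the opposite state only now,
        # i.e. only because a new tuple is being processed.
        while opposite and ts - opposite[0][0] > self.window:
            opposite.popleft()
        # Probe: nested-loop scan of the opposite state.
        results = []
        for _, other_value in opposite:
            pair = (value, other_value) if side == "A" else (other_value, value)
            if self.predicate(*pair):
                results.append(pair)
        # Insert: keep the new tuple for future probes from the other side.
        self.state[side].append((ts, value))
        return results
```

If no further A tuples arrive, State B keeps all of its tuples no matter how old they are, which is why pausing the plan and waiting for states to drain cannot work for such operators.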
6
Naïve Migration Strategy Revisited
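Problem: Deadlock waiting. Draining stateful operators relies on new tuples to purge their states, but the paused plan processes none.
  • Steps
  • (1) Pause execution of old plan
  • (2) Drain out all tuples inside old plan
  • (3) Replace old plan by new plan
  • (4) Resume execution of new plan

[Figure: plan with joins AB and BC over inputs A, B and C, annotated "(2) All tuples drained", "(3) Old replaced by new" and "(4) Processing resumed"]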
7
Problem Definition
  • Dynamic Plan Migration
  • Input (two migration boxes)
  • One contains old plan
  • One contains new plan
  • Have same input and output queues
  • Result
  • Old box is replaced by new box
  • Valid Migration
  • No missing tuples
  • No duplicates
  • Key points
  • Involved plans contain stateful operators
  • Need to migrate while retaining useful states and discarding useless ones
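The validity condition (no missing tuples, no duplicates) can be read as a multiset comparison. The sketch below assumes a reference output, for example what a plan that was never migrated would produce on the same input; the function names are hypothetical.

```python
from collections import Counter


def check_migration(reference_output, migrated_output):
    """Return (missing, duplicates) as multisets of result tuples."""
    expected = Counter(reference_output)
    actual = Counter(migrated_output)
    missing = expected - actual      # results the migration lost
    duplicates = actual - expected   # results produced too many times
    return missing, duplicates


def is_valid_migration(reference_output, migrated_output):
    """A migration is valid if nothing is missing and nothing is duplicated."""
    missing, duplicates = check_migration(reference_output, migrated_output)
    return not missing and not duplicates
```

Output order is deliberately ignored here; only the multiset of results is compared.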

8
State of the Art
  • Efficient mid-query re-optimization of
    sub-optimal query execution plans
  • Kabra, DeWitt 1998
  • Only migrates unprocessed portion
  • Query plan competing model
  • Ioannidis, Ng, et al. 1992; Graefe, Cole 1994
  • Generate several candidate query plans before
    start
  • Execute all, choose one after a while

9
Outline
  • Problem Motivation and Definition
  • Dynamic Migration Strategies
  • Moving State Strategy
  • Parallel Track Strategy
  • Experimental Results

10
Moving State Strategy
  • Basic idea
  • Share common states between two migration boxes
  • Key steps
  • State Matching
  • Match states based on IDs.
  • State Moving
  • Create new pointers for matched states in new box
  • What's left?
  • Unmatched states in new box

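[Figure: old box and new box, both reading input queues QA, QB, QC, QD and writing output queue QABCD. Old box: join AB (states SA, SB) feeds join BC (states SAB, SC), which feeds join CD (states SABC, SD). New box: join BC (states SB, SC) feeds join CD (states SBC, SD), which feeds join AB (states SA, SBCD).]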
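State matching and state moving can be sketched as follows. This is a minimal illustration that assumes a state ID is simply a string such as "SAB" and that each state is a Python list; `match_and_move` is a hypothetical helper, not part of CAPE.

```python
def match_and_move(old_states, new_state_ids):
    """Share matched states with the new box and report the unmatched ones.

    old_states: dict mapping state ID -> list of tuples held by the old box.
    new_state_ids: state IDs required by the new box.
    """
    new_states = {}
    unmatched = []
    for state_id in new_state_ids:
        if state_id in old_states:
            # State moving: the new box points at the very same object,
            # so no tuples are copied.
            new_states[state_id] = old_states[state_id]
        else:
            # No counterpart in the old box; must be handled separately.
            new_states[state_id] = None
            unmatched.append(state_id)
    return new_states, unmatched
```

With the state IDs of the two boxes from the slide, the old box offers SA, SB, SAB, SC, SABC and SD, while the new box needs SB, SC, SBC, SD, SA and SBCD, so SBC and SBCD are left unmatched.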
11
Unmatched States
  • State Recomputing
  • Recursively recompute unmatched SBC and SBCD from
    bottom up
  • Why always possible?
  • Old and new boxes have same input queues
  • The states associated with input queues always
    match
  • Why necessary?

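[Figure: new box with output queue QABCD over input queues QA, QB, QC, QD: join BC (states SB, SC) feeds join CD (states SBC, SD), which feeds join AB (states SA, SBCD)]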
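Bottom-up recomputation of the unmatched states can be sketched as below. The tuple layout (a Python tuple of `(stream, key, timestamp)` sub-tuples), the equality join condition and both function names are assumptions made for this illustration.

```python
def same_key_within_window(window):
    """Hypothetical join condition: equal keys, timestamps within the window."""
    def can_join(left, right):
        subs = left + right
        keys = {sub[1] for sub in subs}
        stamps = [sub[2] for sub in subs]
        return len(keys) == 1 and max(stamps) - min(stamps) <= window
    return can_join


def recompute_unmatched(states, plan, can_join):
    """Fill every unmatched (None) state by joining the states below it.

    plan: list of (target_id, left_id, right_id), listed bottom-up so that
    each input state is available before it is needed.
    """
    for target, left, right in plan:
        if states.get(target) is not None:
            continue  # matched state, already shared with the old box
        states[target] = [
            l + r  # concatenate the sub-tuples of both sides
            for l in states[left]
            for r in states[right]
            if can_join(l, r)
        ]
    return states
```

For the new box on this slide the plan would be `[("SBC", "SB", "SC"), ("SBCD", "SBC", "SD")]`: SB, SC and SD sit directly on input queues and therefore always match, so the recursion always has something to start from.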
12
Terms on Tuples
  • New/Old tuples
  • Old tuples: already in old box when migration starts
  • New tuples: not in old box when migration starts
  • Sub-tuples
  • Tuple ABCD is the result of joining tuples A, B, C and D
  • Tuples A, B, C and D are sub-tuples of tuple ABCD
  • Tuple ABCD has 2^4 = 16 possible combinations of old/new sub-tuples

[Figure: old box with output queue QABCD over input queues QA, QB, QC, QD: join AB (states SA, SB) feeds join BC (states SAB, SC), which feeds join CD (states SABC, SD)]
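The count of old/new combinations follows from each of the four sub-tuples being independently old or new. A short enumeration, with hypothetical names, makes the cases explicit:

```python
from itertools import product

STREAMS = ("A", "B", "C", "D")


def old_new_combinations(streams=STREAMS):
    """List every assignment of 'old' or 'new' to the sub-tuples of a result."""
    return [
        dict(zip(streams, labels))
        for labels in product(("old", "new"), repeat=len(streams))
    ]
```

The list has 2^4 = 16 entries, ranging from all four sub-tuples old to all four new.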
13
Why Recompute Unmatched States
  • To get the complete results of ABCD, we need all
    16 old/new combinations

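[Figure: new box with joins BC (states SB, SC), CD (states SBC, SD) and AB (states SA, SBCD) over input queues QA, QB, QC, QD; a legend distinguishes old tuples from new tuples]
If SBC is not recomputed, results in which both B and C are old will be missed.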
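Why the pairs with both B and C old go missing can be seen in a small sketch of join BC in the new box. It ignores window purging and join predicates (every pair joins), and both function names are hypothetical. A join result is produced only when a tuple arrives, so two tuples that are already sitting in SB and SC never meet again.

```python
def bc_results_without_recompute(sb_old, sc_old, new_b, new_c):
    """BC pairs reaching SBC when SBC starts empty after migration."""
    sb, sc = list(sb_old), list(sc_old)
    sbc = []
    for b in new_b:
        sbc += [(b, c) for c in sc]  # new B probes State C
        sb.append(b)
    for c in new_c:
        sbc += [(b, c) for b in sb]  # new C probes State B
        sc.append(c)
    return sbc


def bc_results_with_recompute(sb_old, sc_old, new_b, new_c):
    """Same run, but SBC is first recomputed from the moved states."""
    recomputed = [(b, c) for b in sb_old for c in sc_old]
    return recomputed + bc_results_without_recompute(sb_old, sc_old, new_b, new_c)
```

With one old and one new tuple per stream, the first function yields three of the four BC pairs and the (old, old) pair appears only in the second.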
14
Cost Estimation of MS Migration
  • Cost of MS consists of
  • Cost of state matching
  • ID comparison (negligible)
  • Cost of state moving
  • Create pointers (negligible)
  • Cost of state recomputing
  • Majority of cost
  • Affecting parameters
  • Operator selectivities
  • Number of tuples in states
  • Estimated as (input rate × window size)
  • See paper for detailed cost models

One cost model conclusion: the cost of MS has a polynomial relation to window size.
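A rough back-of-the-envelope sketch shows where a polynomial dependence on window size can come from. This is not the paper's cost model: it assumes nested-loop recomputation of SBC and SBCD, counts tuple comparisons only, and uses hypothetical function and parameter names.

```python
def estimated_state_size(input_rate, window):
    """Number of tuples in a base state: input rate times window size."""
    return input_rate * window


def recompute_comparisons(rate_b, rate_c, rate_d, sel_bc, window):
    """Comparisons needed to recompute SBC and then SBCD by nested loops."""
    sb = estimated_state_size(rate_b, window)
    sc = estimated_state_size(rate_c, window)
    sd = estimated_state_size(rate_d, window)
    cost_sbc = sb * sc            # grows with window squared
    sbc = sel_bc * sb * sc        # estimated size of the recomputed SBC
    cost_sbcd = sbc * sd          # grows with window cubed
    return cost_sbc + cost_sbcd
```

Under these assumptions, doubling the window multiplies the cost by a factor between 4 and 8, depending on the selectivity of join BC.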
15
MS Migration Pros and Cons
  • Pros
  • Fast when number of tuples in states is small
  • Low input rates, low selectivity or small window
  • Cons
  • Output silence during entire migration stage
  • Can the query keep producing output during migration?
  • Motivation for Parallel Track Strategy

16
Parallel Track Strategy
  • Basic idea
  • Execute both plans in parallel and gradually
    push old tuples out of old box by purging
  • Key steps
  • Connect boxes
  • Execute in parallel
  • Until old box expires (no old tuples or sub-tuples remain)
  • Disconnect old box
  • Execute new box only

[Figure: old box and new box connected to the same input queues QA, QB, QC, QD and the same output queue QABCD, executing in parallel]
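The condition for disconnecting the old box can be sketched as a scan over its states. The representation is an assumption for illustration: each stored tuple is a sequence of the arrival timestamps of its sub-tuples, and a sub-tuple counts as old if it arrived before migration started.

```python
def old_box_expired(states, migration_start):
    """True once no state in the old box holds an old tuple or sub-tuple.

    states: dict mapping state ID -> list of tuples, where each tuple is a
    sequence of sub-tuple arrival timestamps.
    """
    return not any(
        ts < migration_start
        for tuples in states.values()
        for tup in tuples
        for ts in tup
    )
```

A scheduler could evaluate this check after each purge and disconnect the old box the first time it returns True.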
17
Potential Duplicates
Duplicate Prevention
  • Tuple ABCD
  • 2^4 = 16 possible old/new sub-tuple combinations
  • The same case must not be generated by both boxes
  • Otherwise we may have duplicates
  • In new box
  • All states start empty
  • Only generates ABCD as (new,new,new,new)
  • In old box
  • May generate all 16 cases
  • Would duplicate the case of (new,new,new,new)

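At root op in old box: if both to-be-joined tuples have all-new sub-tuples, don't join.
Other ops in old box: proceed as normal.
[Figure: old box with output queue QABCD over input queues QA, QB, QC, QD: join AB (states SA, SB) feeds join BC (states SAB, SC), which feeds root join CD (states SABC, SD)]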
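The rule at the root operator of the old box can be sketched as a guard placed in front of the join. As an assumption for illustration, each tuple is represented by the arrival timestamps of its sub-tuples, and a sub-tuple is new if it arrived at or after the start of migration; the function names are hypothetical.

```python
def all_new(tup, migration_start):
    """True if every sub-tuple arrived at or after the start of migration."""
    return all(ts >= migration_start for ts in tup)


def root_should_join(left, right, migration_start):
    """Guard for the root operator of the OLD box only.

    The (new,new,new,new) result is produced by the new box, so the old box
    skips any pair in which both inputs consist solely of new sub-tuples.
    """
    return not (all_new(left, migration_start) and all_new(right, migration_start))
```

Every other operator in the old box joins as usual, because an all-new intermediate tuple may still meet an old tuple at the root.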
18
Estimation of PT Migration
[Figure: timeline T for the states of the old box, from T_M-start to T_M-end, spanning a 1st window W and a 2nd window W during which old tuples are replaced by new tuples]
Estimation formula: T_PT = 2W
19
PT Migration Duration
  • Given enough system computing resources
  • New tuples processed right away
  • PT migration duration = 2W
  • If not enough system resources
  • New tuples accumulate in queues
  • PT migration duration > 2W

20
Cost Estimation of PT Migration
  • Cost of PT
  • Cost of processing the tuples that arrive during 2W in old box
  • Cost of processing the tuples that arrive during 2W in new box
  • Parameters
  • Input rates, window size, selectivity
  • Similar to MS strategy

21
PT Migration Pros and Cons
  • Pros
  • Keeps producing results even during migration
  • MS produces no results during migration
  • Cons
  • Migration duration is at least 2W
  • MS may be faster, depending on number of tuples in states

22
Outline
  • Problem Definition and Motivation
  • Dynamic Migration Strategies
  • Moving State Strategy
  • Parallel Track Strategy
  • Experimental Results

23
Experimental Setup
  • Embedded in the CAPE system
  • CAPE: Continuous Adaptive Processing Engine
  • A streaming query engine developed at DSRG, WPI
  • VLDB04 demo
  • Layers of Adaptations
  • Punctuation exploring
  • Adaptive scheduling
  • Query migration
  • Dynamic distribution
  • Input Streams
  • Generated by the stream generator of CAPE
  • Poisson arrival pattern
  • Experiments on migration duration
  • Vary window size

24
Migration Duration vs. Window Size
25
Conclusions
  • Identify problem of migration for stateful
    operators
  • First solutions for continuous query migration
  • Moving state strategy
  • Parallel track strategy
  • Embed both strategies into stream system
  • Cost model and experimental evaluation
  • Cost model confirmed by experiments
  • Identify performance trade-off of the two
    strategies

26
Thank You
  • For more information, check the CAPE website at
  • http://davis.wpi.edu/dsrg/CAPE/