Eddies: Continuously Adaptive Query Processing - PowerPoint PPT Presentation

Slides: 19
Provided by: wimA3

Transcript and Presenter's Notes

1
Eddies: Continuously Adaptive Query Processing
  • Advanced Database Management
  • Presentation 2
  • Tzavala Polina EYO617

2
Problems with static query optimization
  • During query execution, the query plan chosen by
    the query optimizer can become inefficient. This
    is due to changes in:
  • 1. Computing resources: the cost of operators
  • 2. Data characteristics: metadata statistics,
    operator selectivities
  • 3. User preferences: users control properties
    of queries while they execute
  • 4. The rate at which tuples arrive from inputs

3
Overview
  • 1. Which algorithms are suitable for
    implementing adaptive query operators?
  • 2. How does an Eddy operate?
  • Description of the Eddy architecture.
  • 3. Efficiency issues.

4
Algorithms for Adaptable Joins (1)
  • 1. Few synchronization barriers
  • Barrier: one input is frozen, waiting for the
    other
  • The join can't adapt while waiting at a barrier.
  • E.g. merge join: one table scan waits until
    the other table scan produces a value
    larger than any seen before.
  • The algorithm should favor joins that have
  • Few and short barriers
  • At worst, adaptable barriers

5
Algorithms for Adaptable Joins (2)
  • 2. Many moments of symmetry
  • Moment of symmetry: the end of a scheduling
    dependency between the input relations.
  • It marks a point at which re-ordering the
    inputs is allowed.
  • E.g. for a nested-loops join
  • Moment of symmetry: the end of each inner
    loop.

6
Which algorithms to choose for reoptimization?
  • 1. Frequent moments of symmetry,
  • 2. Adaptive or nonexistent barriers, and
  • 3. Minimal ordering constraints.
  • (-) Merge joins, nested loops
  • (+) Ripple joins

7
Ripple Joins: Prime for Adaptivity
  • Ripple joins
  • Pipelined hash join (a.k.a. hash ripple, XJoin)
  • No synchronization barriers
  • Continuous symmetry
  • Good for equi-join
  • Simple (or block) ripple join
  • Synchronization barriers at corners
  • Moments of symmetry at corners
  • Good for non-equi-join
  • Index nested loops
  • Short barriers
  • No symmetry
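
The pipelined hash join listed above can be illustrated with a short sketch. This is only a minimal in-memory version of the general symmetric hash join idea, not the code used in the presentation or in XJoin; the function name, the `("R" | "S", row)` stream encoding, and the key functions are all assumptions made for illustration.

```python
def symmetric_hash_join(stream, key_r, key_s):
    """Equi-join tuples arriving interleaved from two inputs, "R" and "S".

    `stream` yields (source, row) pairs in arrival order. Each row is
    inserted into its own side's hash table and immediately probed against
    the other side's table, so neither input ever waits for the other.
    """
    table_r = {}  # join key -> R rows seen so far
    table_s = {}  # join key -> S rows seen so far
    for source, row in stream:
        if source == "R":
            k = key_r(row)
            table_r.setdefault(k, []).append(row)
            for match in table_s.get(k, ()):
                yield (row, match)
        else:
            k = key_s(row)
            table_s.setdefault(k, []).append(row)
            for match in table_r.get(k, ()):
                yield (match, row)


# Hypothetical arrival order: the inputs are consumed in whatever
# interleaving they happen to arrive in.
arrivals = [
    ("R", (1, "a")),
    ("S", (1, "x")),
    ("S", (2, "y")),
    ("R", (2, "b")),
    ("R", (1, "c")),
]
results = list(
    symmetric_hash_join(arrivals, key_r=lambda r: r[0], key_s=lambda s: s[0])
)
```

Because every arriving row is both built and probed at once, results appear as soon as both matching rows have been seen, whichever input is faster. This is what "no synchronization barriers" and "continuous symmetry" amount to in this sketch.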

8
Extreme Adaptivity: Eddies
  • The basic idea
  • Query processing consists of sending tuples
    through a series of operators
  • Why not treat it like a routing problem?
  • Rely on back-pressure (i.e., queue overflow) to
    tell us where to send tuples
  • Merge the optimization and execution phases of
    query processing.
  • Each tuple has a flexible ordering of the query
    operators.

9
The Eddy
  • N-ary module consisting of query operators
  • Basic unit of adaptivity
  • Subplan with select, project, join operators
  • Represents set of possible orderings of a subplan
  • Each tuple may flow through a different
    ordering.

10
Eddies Architecture
  • Modules (query operators) communicate via a fixed
    dataflow graph (a query plan).
  • Each module runs as an independent thread.
  • The edges in the graph correspond to finite
    message queues of fixed size.
  • When a producer and consumer run at differing
    rates, the faster thread may block on the queue
    waiting for the slower thread to catch up.
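
The blocking behaviour of a fixed-size queue between two threads can be sketched with Python's standard library. This is an illustrative toy, not the system described in the slides; the queue size, the item count, the artificial delay, and the `None` end-of-stream sentinel are all assumptions.

```python
import queue
import threading
import time


def producer(out_q, n):
    """Fast module: emits n items, then an end-of-stream sentinel."""
    for i in range(n):
        out_q.put(i)   # blocks while the queue is full (back-pressure)
    out_q.put(None)    # sentinel chosen for this sketch


def consumer(in_q, results, delay):
    """Slow module: doubles each item after a simulated processing delay."""
    while True:
        item = in_q.get()   # blocks while the queue is empty
        if item is None:
            break
        time.sleep(delay)
        results.append(item * 2)


edge = queue.Queue(maxsize=4)   # finite message queue of fixed size
results = []
threads = [
    threading.Thread(target=producer, args=(edge, 20)),
    threading.Thread(target=consumer, args=(edge, results, 0.001)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Here the producer is faster than the consumer, so it spends most of its time blocked in `put` once the four slots fill up. That stall is the back-pressure signal an eddy can exploit when deciding where to send tuples.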

11
Eddies Architecture
  • A tuple in an Eddy is associated with a tuple
    descriptor
  • It contains a vector of Ready bits and Done
    bits.
  • The Eddy ships the tuple only to operators
    whose Ready bit is turned on
  • After an operator has processed the tuple, its
    Done bit is set
  • If all Done bits are set, the tuple is sent to
    the Eddy's output
  • Otherwise it is sent to another operator
  • How should tuples be routed to the different
    Eddy operators?
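
The Ready/Done bookkeeping described above can be sketched as follows. This is a simplified illustration under stated assumptions: all operators are selections (predicates), so every Ready bit starts on and a tuple that fails a predicate is dropped; the routing policy is a placeholder that takes the first eligible operator. The names `TupleDescriptor`, `eligible`, and `run_eddy` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TupleDescriptor:
    row: dict
    ready: int      # bit i set: operator i may process this tuple
    done: int = 0   # bit i set: operator i has already processed it


def eligible(desc, n_ops):
    """Operators whose Ready bit is on and whose Done bit is still off."""
    return [
        i for i in range(n_ops)
        if ((desc.ready >> i) & 1) and not ((desc.done >> i) & 1)
    ]


def run_eddy(rows, operators):
    """Route each row through the predicates until all Done bits are set."""
    n = len(operators)
    all_bits = (1 << n) - 1
    output = []
    for row in rows:
        desc = TupleDescriptor(row=row, ready=all_bits)
        alive = True
        while alive and desc.done != all_bits:
            i = eligible(desc, n)[0]       # placeholder routing policy
            alive = operators[i](desc.row)
            desc.done |= 1 << i            # mark operator i as done
        if alive:
            output.append(row)             # all Done bits set: emit
    return output


ops = [lambda r: r["a"] > 1, lambda r: r["b"] < 5]
rows = [{"a": 2, "b": 3}, {"a": 0, "b": 3}, {"a": 5, "b": 9}]
out = run_eddy(rows, ops)
```

Replacing the line marked as the placeholder policy is where a routing scheme would plug in.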

12
1. Naïve Scheme
  • The Eddy's buffer is implemented as a priority
    queue.
  • When a tuple enters the buffer, its priority is
    set to low.
  • After it is processed by an operator in the Eddy
    and returned to the buffer, its priority is set
    to high.
  • This ensures that tuples do not get stuck in the
    Eddy (i.e. it prevents starvation).
  • This scheme dynamically adjusts to the work
    required by operators
  • Slower operators (e.g. taking 4 seconds vs.
    1 second) will receive fewer tuples
  • It ignores operator selectivity.
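
The two-level priority buffer described above might look like the sketch below. It is a toy built on `heapq`, not the presenter's implementation; the class name `EddyBuffer`, the two numeric priority levels, and the FIFO tie-breaking within a level are assumptions.

```python
import heapq
import itertools

HIGH, LOW = 0, 1   # heapq pops the smallest value first


class EddyBuffer:
    """Priority buffer: returned tuples are served before new ones."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()   # FIFO order within a priority level

    def add_new(self, tup):
        """A tuple entering the eddy for the first time gets low priority."""
        heapq.heappush(self._heap, (LOW, next(self._seq), tup))

    def add_returned(self, tup):
        """A tuple coming back from an operator gets high priority."""
        heapq.heappush(self._heap, (HIGH, next(self._seq), tup))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


buf = EddyBuffer()
buf.add_new("t1")
buf.add_new("t2")
buf.add_returned("t0")   # already partly processed, so it jumps the queue
order = [buf.pop() for _ in range(len(buf))]
```

Serving returned tuples first means a partly processed tuple is finished before fresh input is admitted, which is how this scheme avoids starvation.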

13
2. Lottery Scheme
  • Each time a tuple is routed to an operator, the
    operator is credited with a ticket
  • When the operator returns a tuple to the Eddy,
    one ticket is debited from the Eddy's running
    count for that operator
  • This tracks how efficiently an operator drains
    tuples from the system
  • When a tuple is to be routed to an operator, the
    Eddy holds a lottery
  • Only operators that have their Ready bit set
    can participate in the lottery
  • An operator's chance of winning the lottery and
    receiving the tuple corresponds to the count of
    tickets for that operator.
  • Dynamically adjusts to the selectivity of
    operators.
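
A rough sketch of the ticket accounting and the lottery follows. It is a simplification under stated assumptions: operators are selections, a dropped tuple is simply never returned, and each weight is the ticket count plus one so that an operator with zero tickets can still win (that offset is this sketch's choice, not something stated in the slides). `LotteryRouter` and `simulate` are hypothetical names.

```python
import random


class LotteryRouter:
    def __init__(self, n_ops, rng=None):
        self.tickets = [0] * n_ops
        self.rng = rng or random.Random()

    def choose(self, ready_ops):
        """Hold a lottery among the operators whose Ready bit is set."""
        weights = [self.tickets[i] + 1 for i in ready_ops]  # +1: assumption
        winner = self.rng.choices(ready_ops, weights=weights, k=1)[0]
        self.tickets[winner] += 1   # credit: a tuple was routed to it
        return winner

    def returned(self, op):
        self.tickets[op] -= 1       # debit: the operator returned a tuple


def simulate(n_tuples, predicates, seed=0):
    """Route integers 0..n_tuples-1 through selection predicates."""
    router = LotteryRouter(len(predicates), random.Random(seed))
    survivors = 0
    for x in range(n_tuples):
        remaining = list(range(len(predicates)))
        alive = True
        while alive and remaining:
            op = router.choose(remaining)
            remaining.remove(op)
            alive = predicates[op](x)
            if alive:
                router.returned(op)   # dropped tuples are never returned
        survivors += alive
    return router.tickets, survivors


# Operator 0 passes 10% of tuples, operator 1 passes 90%.
tickets, survivors = simulate(
    2000, [lambda x: x % 10 == 0, lambda x: x % 10 != 9]
)
```

An operator that drops most of its input keeps most of its tickets, so the selective operator 0 ends up with far more tickets than operator 1 and wins most later lotteries.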

14
Window Scheme
  • The problem with the lottery scheme is that it
    uses too much past information
  • Example: an operator that gained a lot of
    tickets initially but then became slow
  • In this scheme the lottery is modified so that
    it only looks at tickets gained by an operator
    in a fixed window.
  • Keeps track of two types of tickets
  • Banked tickets
  • Used when running the lottery
  • Escrow tickets
  • Used to measure efficiency during the window
  • At the end of a window
  • Banked tickets = escrow tickets
  • Escrow tickets = 0
  • This ensures operators re-prove themselves each
    window
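
The banked/escrow bookkeeping can be sketched as below. How a window is measured is not stated in the slides; this sketch assumes a window is a fixed number of routing events, and the class and method names are hypothetical.

```python
class WindowedTickets:
    """Banked tickets drive the lottery; escrow tickets measure the window."""

    def __init__(self, n_ops, window):
        self.banked = [0] * n_ops
        self.escrow = [0] * n_ops
        self.window = window   # routing events per window (assumption)
        self._events = 0

    def credit(self, op):
        """A tuple was routed to `op`: credit its escrow account."""
        self.escrow[op] += 1
        self._events += 1
        if self._events == self.window:
            self._end_window()

    def debit(self, op):
        """`op` returned a tuple: debit its escrow account."""
        self.escrow[op] -= 1

    def _end_window(self):
        self.banked = self.escrow                # banked = escrow
        self.escrow = [0] * len(self.banked)     # escrow = 0
        self._events = 0


w = WindowedTickets(n_ops=2, window=4)
for op in (0, 0, 0, 1):      # first window: operator 0 looks good
    w.credit(op)
first = list(w.banked)
for op in (1, 1, 1, 1):      # second window: only operator 1 earns tickets
    w.credit(op)
second = list(w.banked)
```

After the second window, operator 0's earlier tickets are gone from the banked account, so its early success no longer influences the lottery.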

15
Comparison
  • Shows that, due to fluid dynamics (the varying
    rates of operators), the naïve approach naturally
    adjusts based on the cost of operators
  • Shows that the lottery scheme also adjusts based
    on workload

16
Comparison
17
Comparison
  • The naïve Eddy does not adjust based on
    selectivity
  • The naïve scheme performs between the best and
    the worst
  • The lottery scheme does adjust based on
    selectivity

18
Eddy advantages and disadvantages
  • Mechanism for adaptively re-routing queries
  • Makes the optimizer's task simpler
  • Can do nearly as well as a well-optimized plan
    in some cases
  • Handles variable costs and variable
    selectivities
  • But doesn't really handle joins very well;
    follow-up work attempts to address this
  • STeMs break a join into separate data
    structures; this requires re-computation at each
    step
  • STAIRs create intermediate state and shuffle it
    back and forth.