CS4432: Database Systems II - PowerPoint PPT Presentation

About This Presentation
Title:

CS4432: Database Systems II

Description:

NOTE: often only left-deep trees are being considered to keep the search space small. ... If at leaf of plan, identify any potential index to use index-lookup ... – PowerPoint PPT presentation

Number of Views:18
Avg rating:3.0/5.0
Slides: 16
Provided by: siro7
Learn more at: http://web.cs.wpi.edu
Category:

less

Transcript and Presenter's Notes

Title: CS4432: Database Systems II


1
CS4432 Database Systems II
  • Query Optimizer Cost Based Optimization

2
SQL query
parse
parse tree
convert
answer
logical query plan
execute
apply laws
statistics
Pi
improved l.q.p
pick best
estimate result sizes
(P1,C1),(P2,C2)...
l.q.p. sizes
estimate costs
consider physical plans
P1,P2,..
3
A Query (Evaluation) Plan
  • An extended relational algebra tree
  • Annotations at each node indicate
  • access methods to use for each table.
  • implementation methods used for each relational
    operator.

4
How to cost a physical plan?
  • We need estimated size of intermediate results
    Chapter 16.4
  • Cost of each operator/algorithm Chapter 15
  • Buffer available for the query

5
Result of cost-based optimization
  • Good physical plan
  • Consider different join orderings
  • Consider different access methods for accessing
    the relations

6
How to generate that good Physical Plan?
  • Many alternate search algorithms are possible
  • Exhaustive listing of all possible plans
  • Dynamic programming
  • Branch and bound
  • Greedy bottom-up plan construction
  • NOTE often only left-deep trees are being
    considered to keep the search space small.

7
Why left-deep trees?
  • Fundamental decision in System R (IBM)
  • Only left-deep join trees are considered.
  • Left-deep trees can generate all fully pipelined
    plans.
  • Intermediate results not written to temporary
    files.
  • Not all left-deep trees are fully pipelined
    (e.g., SM join).

8
Enumeration of Left-Deep Trees
  • Left-deep trees differ in
  • the order of relations,
  • the access method for each relation, and
  • the join method for each join.
  • Number of left deep plans still exponential n
    relations implies n! left-deep tree orderings

9
Enumeration of Left-Deep Trees
  • Enumerated using N passes (if N relations
    joined)
  • Pass 1 Find best 1-relation plan for each
    relation.
  • Pass 2 Find best way to join result of each
    1-relation plan (as outer) to another relation.
    (All 2-relation plans.)
  • Pass N Find best way to join result of a
    (N-1)-relation plan (as outer) to the Nth
    relation. (All N-relation plans.)
  • For each subset of relations, retain
  • Cheapest plan overall, plus
  • Cheapest plan for each interesting
  • order of the tuples.

B
A
C
D
Pass 1
Pass 2
Pass 3
10
Enumeration Example
  • Example 16.3.3, Also read Chapter 16.6.5
  • If too many relations (Chapter 16.6.6)
  • Dynamic Programming expensive if too many
    relations (say more than 6 relations).
  • Use greedy (faster algorithm, but may yield plans
    not as good as Dynamic Programming)

11
Operator Types
  • Stateful versus stateless operators
  • Select is stateless
  • Join is stateful
  • Blocking versus non-blocking operators
  • Select is non-blocking
  • Agg functions are blocking
  • Pipelined versus non-pipelined operators
  • Select is pipelinable
  • What about Join ? (see next slide)

12
Join?
  • Join Revelation is that it depends on the
    implementation strategy chosen for an operator
  • Iteration-join pipelinable
  • Merge-sort join blocking
  • Index join pipelinable
  • Hash join blocking

13
Costing of a complete plan
  • We went over an example query plan
  • Important first we classify operators as
    pipelined or not-pipelined
  • If pipelined, then for stateless operators the IO
    cost is zero (for example, for Select or Project)

14
Costing of a Complete Query Plan
  • What about a Select? How is it implemented?
  • If in middle of plan, pipeline it (one tuple at a
    time iteration)
  • If at leaf of plan, identify any potential index
    to use index-lookup to implement the Select
  • If index available, cost of implemention the
    select operator is equal to cost of an index
    lookup

15
Costing of a complete plan
  • Main idea
  • Determine of distinct values V(R,a)
  • Determine physical impl. Strategies per operator
  • Then, compute IO costs for each operator
  • Then, sum up all costs.
  • Done.
Write a Comment
User Comments (0)
About PowerShow.com