Multi-Query Optimization - PowerPoint PPT Presentation

About This Presentation
Title:

Multi-Query Optimization

Description:

Multi-Query Optimization Prasan Roy Indian Institute of Technology - Bombay Overview Multi-Query Optimization: What? Problem statement Multi-Query Optimization: Why? – PowerPoint PPT presentation

Number of Views:84
Avg rating:3.0/5.0
Slides: 30
Provided by: Pras86
Category:

less

Transcript and Presenter's Notes

Title: Multi-Query Optimization


1
Multi-Query Optimization
  • Prasan Roy
  • Indian Institute of Technology - Bombay

2
Overview
  • Multi-Query Optimization What?
  • Problem statement
  • Multi-Query Optimization Why?
  • Application scenarios
  • Multi-Query Optimization How?
  • A cost-based practical approach
  • Prototyping Multi-Query Optimization
  • On MS SQL-Server at Microsoft
  • Research prototype at IIT-Bombay

3
Multi-Query Optimization What?
  • Exploit common subexpressions (CSEs) in query
    optimization
  • Consider DAG execution plans in addition to tree
    execution plans

4
Example
B
A
C
D
B
C
Best Plan for A JOIN B JOIN C
Best Plan for B JOIN C JOIN D
5
Example (contd)
  • Alternative

D
A
B
C
Common Subexpression
6
Multi-Query Optimization Why?
  • Queries on views, nested queries,
  • Overlapping query batches generated by
    applications
  • Update expressions for materialized views
  • Query invocations with different parameters
  • . . .

Practical solutions needed!
7
Multi-Query Optimization How?
  • Set up the search space
  • Identify the common subexpressions
  • Explore the search space efficiently
  • Find the best way to exploit the common
    subexpressions

8
Problems
  • Materializing and sharing a CSE not necessarily
    cheaper
  • Mutually exclusive alternatives
  • (A JOIN B JOIN C)
  • (B JOIN C JOIN D)
  • (C JOIN D JOIN E)
  • What to share (B JOIN C) or (C JOIN D) ?

Huge search space!
9
Earlier Work Practical Solutions
  • As early as 1976
  • Preprocess query before optimization Hall,
    IBM-JRD76
  • As late as 1998
  • Postprocess optimized plans Subramanium and
    Venkataraman, SIGMOD98

Query optimizer is not aware!
10
Earlier Work Theoretical Studies
  • Sellis, TODS88, Cosar et al., CIKM93, Shim
    et al., DKE94,...
  • Set of queries Q1, Q2, , Qn
  • For each query Qi, set of execution plans
    Pi1, Pi2, , Pim
  • Pij is a set of tasks from a common pool
  • Pick a plan for each query such that the cost of
    tasks in the union is minimized

Not integrated with existing optimizers, no
practical study
11
Microsoft Experience
  • with Paul Larson,
  • Microsoft Research

12
Prototyping MQO on SQL-Server
  • Add multi-query optimization capability to
    SQL-Server
  • Well integrated with the existing optimization
    framework
  • another optimization level
  • minimal changes, minimal extra lines of code
  • First cut exhaustive
  • How slow can it be?
  • A working prototype by the summer-end

13
What (almost) already exists in the SQL-Server
Optimizer
  • AND/OR Query-DAG representation of plan space

Group (OR node)
14
What actually exists in the SQL-Server Optimizer
  • Relations cloned for each use

B1
A
C1
D
B2
C2
15
Preprocessing Step Query-DAG
Unification
  • Performed in a bottom-up traversal

?
?
?
?
?
?
B1
A
C1
D
B2
C2
16
Common Subexpression Identification
  • Unified nodes are CSEs

Common Subexpression
A
B
C
D
17
Exploring the Search Space A Naïve Algorithm
  • For each set S of common subexpressions
  • materialize each node in S
  • MatCost(S) sum of materialization costs of the
    nodes in S
  • invoke optimizer to find the best plan for the
    root and for each node S
  • CompCost(S) sum of costs of above plans
  • Cost(S) MatCost(S) CompCost(S)
  • Pick S with the minimum Cost

18
Doing Better Incremental
Reoptimization
  • Goal best plan for Si ? best plan for Sj
  • Observation
  • Best plans change for only the ancestors of
    nodes in Si XOR Sj
  • Algorithm
  • Propagate changed costs in bottom-up
    topological order from nodes in Si XOR Sj
  • Update min-cost plan at each node visited
  • Do not propagate further up if min-cost plan
    remains unchanged at a node

Work done at IIT-Bombay
19
Incremental Optimization Example
  • Si ?

min-cost
A
B
C
D
20
Incremental Optimization Example
  • Si ? Sj (B JOIN C)

?
?
Now materialized
?
Previous min-cost
New min-cost
A
B
C
D
21
Current Status
  • A first-cut implementation working
  • Lines of C code added 1500 approx.

22
Future Work
  • Performance tuning and smarter data structures
    needed
  • Ways to restrict enumeration taking DAG structure
    into account

23
Research at IIT-Bombay Heuristics for
MQO
  • with S. Sudarshan, S. Seshadri

24
A Greedy Heuristic
  • Pick nodes for materialization one at a time, in
    benefit order
  • Benefit(n) reduction in cost on materialization
    of n

Benefit computation is expensive
25
Monotonicity Assumption
  • Benefit of a node does not increase due to
    materialization of other nodes
  • Exploited to avoid some benefit computations

Optimization costs decrease by 90
26
A Postpass Heuristic Volcano-SH
  • No change in Volcano best plan computation
  • Cost-based materialization of nodes in best
    Volcano plan
  • Implementation easy
  • Low overhead

Optimizer is not aware
27
A Volcano Variant Volcano-RU
  • Volcano best plan search aware of best plans for
    earlier queries
  • Cost based materialization of best plan nodes
    that are used by later queries
  • Implementation easy
  • Low overhead

Local decisions, plan quality sensitive to query
sequence
28
Experimental Conclusion
  • Greedy
  • Expensive, but practical
  • Overheads typically offset by plan quality
  • especially for expensive canned queries
  • Almost linear scaleup with query batch size
  • typically, only the width of the Query DAG
    affected
  • Volcano-RU
  • Mostly better than Volcano-SH, same overhead
  • Negligible overhead over Volcano
  • recommended for cheap but complex queries

29
Conclusion
  • Multi-query optimization is needed
  • Multi-query optimization is practical!
  • Multi-query optimization is an easy next step for
    DAG-based optimizers
Write a Comment
User Comments (0)
About PowerShow.com