Title: COMP 5138 Relational Database Management Systems
1COMP 5138Relational Database Management Systems
Semester 2, 2007 Lecture 12 Query Processing and
Optimization
2Todays Agenda
- Parsing and Translation
- Evaluation
- Individual operations
- Entire operations
- Optimization
- Cost-based
- Heuristic
3Overview Query Processing
- Basic Steps
- Parsing and Translation
- Optimisation
- Evaluation
Source Silberschatz/Korth/Sudarshan Database
System Concepts, 2002.
4Basic Steps in Query Processing
- Parsing and Translation
- translate the query into its internal form. This
is then translated into relational algebra. - Parser checks syntax, verifies relations
- Query Optimization
- Amongst all equivalent query-evaluation plans
choose the one with lowest cost. - Query Evaluation
- Query-execution engine takes a query-evaluation
plan, executes that plan, and returns the answers
to the query.
5Parsing and Translation
- SQL gets translated into relational algebra,
which can be shown as expression tree. - Operators have one or more input sets and return
one (or more) output set(s). - Leafs correspond to (base) relations.
- Example SELECT name FROM students ,
enrolled WHERE cid COMP5138
Projection
?name
Join
Selection
?cidCOMP5138
students
enrolled
6Todays Agenda
- Parsing and Translation
- Evaluation
- Individual operations
- Entire operations
- Optimization
- Cost-based
- Heuristic
7Query Evaluation Plan
- Each relational algebra operation can be
evaluated using one of several different
algorithms - Correspondingly, a relational-algebra expression
can be evaluated in many ways. - Annotated expression specifying detailed
evaluation strategy is called an evaluation-plan. - Example
- We can use an index onbalance to find
accountswith balancelt2500. -
- Or can perform completerelation scan and
discardaccounts with balance ? 2500
? balance
?balance?2500
account
8Expression Tree vs. Query Evaluation Plan
? balance
Project balance
Project balance
?balance?2500
Index Range Scan(account.balance, balance?2500)
Filterbalance?2500
account
Table Scan (account)
- Relational algebra operators logical operators
which represent the intermediate results. - Physical operators show how query is evaluated.
9Measures of Query Costs
- Query Optimization Amongst all equivalent
evaluation plans choose the one with lowest cost.
- Cost is generally measured as total elapsed time
for answering query - Many factors contribute to time cost
- disk accesses, CPU, or even network communication
- Typically disk access is the predominant cost,
and is also relatively easy to estimate. - For simplicity, we just use number of block
transfers from disk as the cost measure - We ignore the difference in cost between
sequential and random I/O for simplicity - We also ignore CPU costs for simplicity
10Access Path
- An access path is a method of retrieving tuples
- File scan, or index that matches a selection (in
the query) - A tree index matches (a conjunction of) terms
that involve only attributes in a prefix of the
search key. - E.g., Tree index on lta, b, cgt matches the
selection a5 AND b3, and matches a5 AND bgt6,
but not b3. - A hash index matches (a conjunction of) terms
that has a term attribute value for every
attribute in the search key of the index. - E.g., Hash index on lta, b, cgt matches a5 AND
b3 AND c5 but it does not match b3, or a5
AND b3, or agt5 AND b3 AND c5.
11Selection
- Find the most selective access path, retrieve
tuples using it, and apply any remaining terms
that dont match the index - Most selective access path An index or file scan
that we estimate will require the fewest page
I/Os. - Table scan search algorithms that locate and
retrieve records that fulfill a selection
condition. - Cost estimate (number of disk blocks scanned)
bR - bR denotes number of blocks containing records
from relation R - e.g. Oracle TABLE SCAN, TABLE ACCESS FULL
- Index scan search algorithms that use an index
- selection condition must be on search-key of
index. - Cost HTi number of blocks containing
retrieved records (PK 1 block) - HT number of index level
- e.g. Oracle INDEX RANGE SCAN or INDEX UNIQUE SCAN
12Projection
- The expensive part is removing duplicates.
- SQL systems dont remove duplicates unless the
keyword DISTINCT is specified in a query. - Sorting Approach Sort on ltsid, bidgt and remove
duplicates. (Can optimize this by dropping
unwanted information while sorting.) - Hashing Approach Hash on ltsid, bidgt to create
partitions. Load partitions into memory one at a
time, build in-memory hash structure, and
eliminate duplicates.
13Sorting
- Importance of sorting
- SQL queries can specify that the output be sorted
- SQL and relational algebra can be implemented
efficiently if the input are sorted - We may build an index on the relation, and then
use the index to read the relation in sorted
order. May lead to one disk block access for
each tuple. - For relations that fit in memory, techniques like
quick sort can be used. For relations that dont
fit in memory, external sort-merge is a good
choice.
14External Sort-Merge
?
Let M denote memory size (in pages).
- Create sorted runs. Let i be 0 initially.
Repeatedly do the following till the end of the
relation (a) Read M blocks of relation
into memory (b) Sort the in-memory blocks
(c) Write sorted data to run Ri increment
i.Let the final value of i be N - Merge the runs (N-way merge). We assume (for now)
that N lt M. - Use N blocks of memory to buffer input runs, and
1 block to buffer output. Read the first block of
each run into its buffer page - repeat
- Select the first record (in sort order) among all
buffer pages - Write the record to the output buffer. If output
is full, write it to disk. - If this is the last records of the input buffer
page then read
the next block (if any) of the run into the
buffer. - until all input buffer pages are empty
- If i ? M, several merge passes are required.
- In each pass, contiguous groups of M - 1 runs are
merged.
15External Sort-Merge - Example
16External Sort-Merge
?
- Cost analysis
- Total number of merge passes required
?logM1(br/M)?. - Disk accesses for initial run creation as well as
in each pass is 2br - for final pass, we dont count write cost
- we ignore final write cost for all operations
since the output of an operation may be sent to
the parent operation without being written to
disk - Thus total number of disk accesses for external
sorting - br ( 2 ?logM1(br / M)? 1)
17Join Operations
- Several different algorithms to implement joins
- Nested-loop join
- Block nested-loop join
- Indexed nested-loop join
- Merge-join
- Hash-join
- Choice based on cost estimate
- Examples use the following information
- Number of records of students 1,000 enrolled
10,000 - Number of blocks of students 100 enrolled 400
18Nested-Loop Join
- To compute the theta join R ? Sfor each tuple
r in R do begin for each tuple s in S do begin
if ?(r,s)true then add rs to the
result endend - R is called the outer relation,S the inner
relation of the join - Requires no indices and can be used with any kind
of join condition. - Expensive since it examines every pair of tuples
in the two relations.
19Nested-Loop Join Cost Analysis
- In the worst case, if there is memory only to
hold one block of each relation, the estimated
cost is R ? bS
bR - If the smaller relation fits entirely in memory,
use that as the inner relation. Reduces cost to
bR bS disk accesses. - Example
- In the worst case scenariostudents as outer
relation 1000 ? 400 100 400,100 disk
accessesenrolled as outer relation 10000 ?100
400 1,000,400 disk accesses - If smaller relation (students) fits entirely in
memory, the cost estimate will be 500 disk
accesses. - Block nested-loops algorithm (next slide) is
preferable.
20Block Nested-Loop Join
- Variant of nested-loop join in which every block
of inner relation is paired with every block of
outer relation. - for each block BR of R do begin for each block
Bs of S do begin for each tuple r in BR do
begin for each tuple s in Bs do begin
if ?(r,s)true then add rs to the
result end end endend
21Block Nested-Loop Join Cost Analysis
- Worst case estimate bR ? bS bR block
accesses. - Each block in the inner relation S read once for
each block in the outer relation (instead of once
for each tuple in the outer relation) - Best case bR bS block accesses.
- Improvements to nested loop block nested loop
algorithms - In block nested-loop, use M 2 disk blocks as
blocking unit for outer relations, where M
memory size in blocks use remaining two blocks
to buffer inner relation and output - Cost ?bR / (M-2)? ? bS bR
- If equi-join attribute forms a key on inner
relation, stop inner loop on first match - Scan inner loop forward and backward alternately,
to make use of the blocks remaining in buffer
(with LRU replacement) - Use index on inner relation if available (next
slide)
22Indexed Nested-Loop Join
- Index lookups can replace file scans if
- join is an equi-join or natural join and
- an index is available on the inner relations
join attribute - For each tuple r in the outer relation R, use
the index to look up tuples in S that satisfy the
join condition with tuple R. - Worst case buffer has space for only one page
of R, and, for each tuple in R, we perform an
index lookup on S. - Cost of the join bR R ? c
- Where c is the cost of traversing index and
fetching all matching S tuples for one tuple of R - c can be estimated as cost of a single selection
on S using the join condition. - If an index exists on the appropriate column of
each relation, you can choose which relation is
the outer one - Compare the costs each way
23Merge-Join
- Sort both relations on their join attribute (if
not already sorted on the join attributes). - Merge the sorted relations to join them
- Join step is similar to the merge stage of the
sort-merge algorithm. - Main difference is handling of duplicate values
in join attribute - Can construct an index just to compute a join.
- Detailed algorithm in textbook
24Merge-Join Cost Analysis
- Can be used only for equi-joins and natural joins
- Each block needs to be read only once (assuming
all tuples for any given value of the join
attributes fit in memory - Thus number of block accesses for merge-join is
bR bS the cost of sorting if
relations are unsorted. - hybrid merge-join If one relation is sorted, and
the other has a secondary B-tree index on the
join attribute - Merge the sorted relation with the leaf entries
of the B-tree . - Sort the result on the addresses of the unsorted
relations tuples - Scan the unsorted relation in physical address
order and merge with previous result, to replace
addresses by the actual tuples - Sequential scan more efficient than random lookup
25Hash-Join
- Applicable for equi-joins and natural joins.
- A hash function h is used to partition tuples of
both relations - h maps JoinAttrs values to 0, 1, ..., n, where
JoinAttrs denotes the common attributes of R and
S used in the natural join. - R0, R1, . . ., Rn denote partitions of R tuples
- Each tuple r ? R is put in partition Ri where i
h(r JoinAttrs). - S0, S1. . ., Sn denotes partitions of S tuples
- Each tuple s ? S is put in partition Si, where i
h(sJoinAttrs).
26Hash-Join (contd)
27Other Operations
- Duplicate elimination can be implemented via
hashing or sorting. - On sorting duplicates will come adjacent to each
other, and all but one set of duplicates can be
deleted. - Hashing is similar duplicates will come into
the same bucket. - Aggregation can be implemented in a manner
similar to duplicate elimination. - Sorting or hashing can be used to bring tuples in
the same group together, then the aggregate
functions can be applied on each group. - Set Operations (UNION, INTERSECT, SET MINUS)
28Evaluation of Expressions
- So far we have seen algorithms for individual
operations - Alternatives for evaluating an entire expression
tree - Materialization (also set-at-a-time) simply
evaluate one operation at a time. The result of
each evaluation is materialized (stored) in a
temporary relation for subsequent use. - Pipelining (also tuple-at-a-time) evaluate
several operations simultaneously in a pipeline
29Materialization
- Materialized evaluation evaluate one operation
at a time, starting at the lowest-level. Use
intermediate results materialized into temporary
relations to evaluate next-level operations. - E.g., in figure below, compute and storethen
compute the store its join with customer, and
finally compute the projections on customer-name.
- Materialized evaluation is always
applicable - Costs can be quite high
30Pipelining
- Pipelined evaluation evaluate several
operations simultaneously, passing the results of
one operation on to the next. - E.g., in previous expression tree, dont store
result of - instead, pass tuples directly to the join.
Similarly, dont store result of join, pass
tuples directly to projection. - Much cheaper than materialization no need to
store a temporary relation to disk. - Pipelining may not always be possible e.g.,
sort, hash-join.
31Todays Agenda
- Parsing and Translation
- Evaluation
- Individual operations
- Entire operations
- Optimization
- Cost-based
- Heuristic
32Intro to Query Optimization
- A relational algebra expression may have many
equivalent expressions - Example SELECT BALANCE
FROM account WHERE
balance lt 2500 - Can be translated into
?balance?2500(?balance(account)) which is
equivalent to
?balance(?balance?2500(account))
33Query Optimization
- Central Problems
- Query is (by definition) declarative, e.g. it
does not specify the execution order. But we
need an executable plan. - The goal of query optimization is less to find
the optimal plan, but more to avoid the worst.
Time for query optimization adds to total
query execution time. - Three evaluation plan
- Interaction of evaluation techniques
- Cost-based
- Heuristic
34Cost-based Query Optimization
- Generation of query-evaluation plans for an
expression involves several steps - Generating logically equivalent expressions
- Use equivalence rules to transform an expression
into an equivalent one. - Annotating resultant expressions to get
alternative query plans - Choosing the cheapest plan based on estimated
cost - The overall process is called cost-based
optimization. - Some systems only have rule-based optimization,
i.e. they dont take statistical information into
account.
35Statistic Information in the DB Catalog
- Need information about the relations and indexes
involved. Catalogs typically contain at least - tuples (NTuples) and pages (NPages) for each
relation. - distinct key values (NKeys) and NPages for each
index. - Index height, low/high key values (Low/High) for
each tree index. - Catalogs are updated periodically.
- Updating whenever data changes is too expensive
lots of approximation anyway, so slight
inconsistency ok. - More detailed information (e.g., histograms of
the values in some field) are sometimes stored.
36Heuristic Optimization
- Cost-based optimization is expensive, even with
dynamic programming. - Systems may use heuristics to reduce the number
of choices that must be made in a cost-based
fashion. - Heuristic optimization transforms the query-tree
by using a set of rules that typically (but not
in all cases) improve execution performance - Perform selection early (reduces the number of
tuples) - Perform projection early (reduces the number of
attributes) - Perform most restrictive selection and join
operations before other similar operations. - Some systems use only heuristics, others combine
heuristics with partial cost-based optimization.
37Transformation Example
- Query Find the names of all students who have
enrolled in some course with 6 credit
points. ?name(?credit_points6 (course
(enrolled student))) - Transformation using distribution rule 3
?name( (?credit_points6 (course))
(enrolled student))) - Performing the selection as early as possible
reduces the size of the relation to be joined.
38Example with Multiple Transformations
- Query Find the names of all students which
enrolled in a 6 credit_point course, whose grade
is a distinction (D).?name(?gradeD ?
credit_points6 (course (enrolled
student))) - Transformation using join association
rule?name(?gradeD ? credit_points6
((course enrolled) student)) - Second form provides an opportunity to apply the
perform selections early rule, resulting in the
subexpression - ?credit_points6 (course) ?
gradeD (enrolled) - Thus a sequence of transformations can be useful
39Optimization Strategies
- Ideally find the optimal plan.
- This approach is very expensive in time and
space. - Typically Find a good plan, avoiding the worst
plan. - do not generate all expressions
- Central Problem Join Ordering (Join
Enumeration) - Typical Approach
- use dynamic programming for generating join plans
- restrict to left-deep join trees
40Left Deep Join Trees
- In left-deep join trees, the right-hand-side
input for each join is a relation, not the result
of an intermediate join.
41Dynamic Programming in Optimization
- To find best join tree for a set of n relations
- To find best plan for a set S of n relations,
consider all possible plans of the form S1
(S S1) where S1 is any non-empty subset of S. - Recursively compute costs for joining subsets of
S to find the cost of each plan. Choose the
cheapest of the 2n 1 alternatives. - When plan for any subset is computed, store it
and reuse it when it is required again, instead
of recomputing it - Dynamic programming
- E.g. r1, r2, r3, r4, r5
- (r1, r2, r3), (r4, r5) 1212144
- (r1, r2, r3) and (r4, r5) 1212 24
42Wrap-Up
- Parsing and Translation
- Evaluation
- Individual operations
- Entire operations
- Optimization
- Cost-based
- Heuristic