Query Optimization - PowerPoint PPT Presentation

1 / 94
About This Presentation
Title:

Query Optimization

Description:

... of null values in new relation. Introducing repeating groups 'relation becomes attributes' ... Most cost estimates are based upon the cardinality of the relation ... – PowerPoint PPT presentation

Number of Views:38
Avg rating:3.0/5.0
Slides: 95
Provided by: thebutl
Category:

less

Transcript and Presenter's Notes

Title: Query Optimization


1
Query Optimization
  • Jonathon McDonald
  • Eric Nelson
  • Jordan Neshev
  • Ben Oberkirsch
  • Hung Tang

2
Agenda
  • Tuning the database for performance
  • Query decomposition
  • Heuristics
  • Cost estimation
  • Pipelining

3
Query Optimization
  • Tuning the Database for Performance

4
Controlled Redundancy
  • Objective - to introduce redundancy into
    relations by denormalization to improve query
    performance.
  • Denormalization - refinement to a relation such
    that the new relation(s) may violate the degree
    of normalization of the original relation.
  • Apply denormalization when
  • performance on current relations is not
    satisfactory
  • update rate is low and query rate is high

5
Idea and Application
  • With one table, speed up SQL queries by using
    indexes.
  • Create relations by duplicating certain
    attributes or by combining relations so as to
    accommodate SQL queries by reducing table joins.
  • 7 situations to use denormalization.
  • Combining relations (3 situations)
  • Duplicating attributes in relations (3
    situations)
  • Creating an extract table

6
Combining tables
  • Combining 11 relationships
  • should be considered if the relations are
    frequently referenced together
  • consider amount of null values in new relation
  • Introducing repeating groups
  • relation becomes attributes
  • in general, used if the number in groups is
    known, static and not too large.
  • Merging reference table into base relation

7
Duplicating Attributes in Relations to Reduce
Joins
  • Duplicating non-key attributes in 1
    relationships
  • Duplicating foreign key attributes in 1
    relationships
  • note this sets a new relationship in place
  • Duplicating attributes in (many-to-many)
    relationships
  • a relationship is created between two
    relations by introducing an intermediate
    relation.
  • duplicate attribute(s) from one of the relations
    into the intermediate relation.

8
Creating an Extract Table
  • Create a single table for report query.
  • The data stays static for a period of time.
  • The table is updated at certain off-peak time
    (e.g., overnight) as a batch job.
  • This avoids running the complex multi-join SQL
    during peak time.

9
Query Optimization
  • Query Decomposition

10
Intro
  • Query Processing - transforming a SQL query into
    an equivalent relational algebra form,
    implementing it, executing it, and returning the
    data result
  • Four phases of Query Processing
  • Query Decomposition
  • Optimization
  • Code Generation
  • Execution

11
Objective
  • Transform a query in high-level language into an
    equivalent relational algebra form.
  • checks for correct syntax
  • checks for correct semantic
  • Five steps
  • Analysis
  • Normalization
  • Semantic Analysis
  • Simplification
  • Query Restructuring

12
Analysis
  • SQL query is checked for syntax validity
  • Relations and attributes are verified with the
    system catalogue
  • Proper application of operations to attributes
    and object types
  • Creates an internal representation in the form of
    a query tree

13
Normalization
  • Convert the predicate (WHERE clause) into a
    normalized form that is more easily manipulated
  • Two normalization transformation
  • conjunctive normal form - conjuncts connected
    with the AND operator
  • disjunctive normal form - conjuncts connected
    with the OR operator

14
Semantic Analysis
  • Objective - check normalized queries for
    incorrect formulation or contradiction
  • Incorrect formulation occurs when components do
    not contribute to the generation of the result
  • Contradictory component occurs when no tuples
    satisfy the component

15
Simplification
  • Objective - transform query into a semantically
    equivalent and efficient form by eliminating
    redundant qualifications and common
    subexpressions
  • Applies idempotency rules of boolean algebra
  • Also takes into account access restrictions, view
    definitions, and integrity constraints

16
Query Restructuring
  • More transformation of the query into a form with
    a more efficient implementation
  • Gets into the query optimization aspect

17
Query Optimization
  • Heuristic Approaches

18
Overview
  • Represent the query as a relational algebra
    statement
  • Use transformation rules to obtain a more
    efficient representation of the query

19
Rules
  • Cascade of Selection
  • spqr sp(sq(sr(R)))
  • Commutativity of Selection
  • sp(sq(R)) sq(sp(R))
  • Intermediate Projections Extraneous
  • ?att1 ?att1, att2(R) ?att1(R)
  • Commutativity of Selection with Projection
  • sp(?att3, att4(R)) ?att3, att4(sp(R)) where p
    is a subset of att3, att4

20
Rules
  • Commutativity of theta join and Cartesian join
  • R lttheta joingt? S S lttheta joingt? R
  • R X S S X R
  • Commutativity of s with theta join (or Cartesian
    join)
  • When selection criteria are from 1 relation
  • sp(R lttheta joingt? S) sp(R) lttheta joingt? S
  • sp(R X S) sp(R) X S
  • Criteria from both relations
  • spq(RXS) sp(R) X sq(S)

21
Rules
  • Commutativity of ? and theta join (or Cartesian
    join)
  • ?L1 ?L2(RXS) ?L1 (R) X ?L2(S) where L1
    contains only attributes of R and L2 contains
    only attributes of S
  • Commutativity of ? and ?
  • R ? S S ? R S ? R R ? S
  • Commutativity of s with set ops (?,?,-)
  • sp(R op S) sp(R) op sp(S)
  • Commutativity of ? with ?
  • ?L(R?S) ?L(R) ? ?L(S)

22
Rules
  • Associativity of theta join (or Cartesian join)
  • (R X S) X T R X (S X T)
  • (R lttheta joingtp S) lttheta joingtqr T R lttheta
    joingtpr (S lttheta joingtq T) where q is a subset
    of attributes from S and T only
  • Associativity of ? and ?
  • (R ? S) ? T R ? (S ? T)
  • (R ? S) ? T R ? (S ? T)

23
Strategies for Rule usage
  • Select as early as possible
  • Combine X with subsequent s when s represents a
    join condition into a join operation.
  • Using associativity, rearrange leaf nodes to
    execute the most restrictive selection first.
  • Project as early as possible.
  • Compute repeated expressions only once.

24
Query Optimization
  • Relational Model Cost Estimation

25
Database Statistics
  • Why statistics
  • There are many ways to implement relational
    algebra operations
  • A DBMS must chose the most efficient one
  • Efficiency is determined by calculating formulae
    on current statistical information

26
Database Statistics
  • Which Statistics
  • Most cost estimates are based upon the
    cardinality of the relation
  • The dominant cost is disk access
  • Accuracy of estimate depends upon the currency of
    statistical information in the system catalog

27
Database Statistics
  • Types of Statistics
  • For each base relation R
  • nTuples(R) the number of tuples in R
  • bFactor(R) the number of tuples in R that fit
    into one block
  • nBlocks(R) the number of blocks needed to store
    R (assuming that the tuples of R are stored
    together, then nBlocks(R) nTuples(R) /
    bFactor(R)

28
Database Statistics
  • Types of Statistics (contd)
  • For each attribute A of base relation R
  • nDistinctA(R) the number of distinct values of
    A in R
  • minA(R), maxA(R) the minimum and maximum
    possible values of A in R
  • SCA(R) the selection cardinality of A in R.
    This is the average number of tuples that satisfy
    an equality condition on A
  • Equality SCA(R) 1 if A is key attribute of R
    SCA(R) nTuples(R)/nDistinctA(R)
    otherwise
  • Others

29
Database Statistics
  • Types of Statistics (contd)
  • For each multi-level index I on attribute set A
  • nLevelsA(R) the number of levels of I
  • nLfBlocksA(R) the number of leaf blocks in I

30
Relational Algebra Operations
  • Selection
  • Join
  • Project
  • Set Operations

31
Selection Operation
  • Works on a single relation R, and defines a
    single relations S
  • Predicate may be simple or complicated
  • Implementation strategy depends upon
  • The structure of the file in which it is stored
  • Whether the attributes in the predicate have been
    indexed/hashed

Implementation strategy is an algorithm
32
Selection Operation
  • Estimation of Cardinality of result set
  • Predicate of forms A ? x
  • ntuples(S) SCA(R)
  • For any attribute B ? A
  • nDistinctB(S) nTuples(S) if nTuples(S) lt
    nDistinctB(R)/2
  • nDistinctB(S) nDistinctB(R) if nTuples(S)
    gt 2nDistinctB(R)
  • Others

33
Selection Operation
  • Implementation Strategies
  • Linear search
  • Binary search
  • Equality on hash key
  • Equality on primary key
  • Inequality on primary key
  • Equality on clustering
  • Equality on non-clustering
  • Inequality on secondary B-tree index

34
Selection Operation
  • Linear Search
  • Unordered file (Heap file)
  • No index
  • Generally need to scan each tuple in each block
  • Equality condition of Primary/Unique Key
  • On average nBlocks(R)/2
  • Generally
  • nBlocks(R)

35
Selection Operation
  • Binary Search
  • Ordered file (Sequential file)
  • No index
  • Predicate of form Ax, file ordered on A
  • On average log2(nBlocks(R))
  • Generally
  • log2(nBlocks(R)) SCA(R)/bFactor(R)) -1

36
Selection Operation
  • Equality on hash key
  • Attribute A is the hash key
  • No overflow
  • Constant (1)
  • Overflow
  • Additional access may be necessary
  • Cost depends on amount of overflow and method to
    handle overflow

37
Selection Operation
  • Equality condition on Primary Key
  • By definition indexed
  • Equality condition
  • nLevels(I) 1

38
Selection Operation
  • Inequality condition on Primary Key
  • Strategy
  • Find tuple satisfying equality condition
  • If index sorted, access all tuples before or
    after this one
  • nLevels(I) nBlocks(R)/2

39
Selection Operation
  • Equality condition on clustering index
  • Equality condition on attribute A
  • Secondary index
  • Clustered index gt tuples are on same blocks
  • Number of tuples may require overflow into more
    than one block
  • Cost determined as first index accessed, then the
    number of blocks required to store the tuples
    satisfying the equality condition
  • nLevels(I) SCA(R)/bFactor(R)

40
Selection Operation
  • Equality condition on a non-clustering index
  • Equality condition on attribute A
  • Non-clustered index gt must assume tuples on
    different blocks
  • Cost determined as first index accessed, then the
    selection cardinality
  • nLevels(I) SCA(R)

41
Selection Operation
  • Inequality condition on a secondary B-tree index
  • On average, half the leaf nodes accessed
  • Via index, on average half the tuples accessed
  • nLevels(I) nLfBlocksA(I)/2 nTuples(R)/2

42
Selection Operation
  • Composite predicates
  • Conjunctive selection
  • Contains tuples satisfying all conjuncts
  • (Aa ? B b) C c
  • Disjunctive selection
  • Contains union of tuples satisfying disjuncts
  • (Aa Bb) ? C c

43
Selection Operation
  • Composite predicates
  • Cost analysis
  • Conjunctive selection without disjunction
  • Use a selection implementation strategy (2-8)
    discussed earlier if
  • Attribute in conjunct has index, or is ordered,
    estimate as in 2-8 discussed earlier
  • Selection condition involves equality on two or
    more attributes, and a composite index or hash
    exists on the combined attributes
  • Secondary indexes are defined on one or more
    attributes, and these attributes are involved
    only in equality conditions in the predicate

44
Selection Operation
  • Composite predicates
  • Cost analysis
  • Selections with disjunction
  • If an attribute in the selection condition
    contains an or, and that attribute has no index
    or sort order, then entire selection operation
    requires linear search
  • If an index exists on every attribute of the
    selection, then optimize by retrieving tuples
    that satisfy condition, and apply Union (see Set
    Operations)

45
Selection Operation
  • Estimation example (Staff relation)
  • There exists a hash index on staffNo
  • There exists a clustering index on branchNo
  • There exists a B-tree index on salary
  • Staff has the following statistics
  • nTuples 3000 bFactor 30 gt nBlocks 100
  • nDistinctsalary 500 gt SCsalary 6
  • minsalary 10 000 maxsalary 50 000
  • nLevelssalary 2 nLfBlockssalary 50

46
Selection Operation
  • Estimation example (contd)
  • Linear search cost
  • Search on key attribute is 50 blocks
  • Search on non-key attributes is 100 blocks
  • Can we improve
  • ? staffNo SG5
  • Equality condition in PK attribute
  • PK attribute is hashed
  • Use strategy 3 equality on hash key gt cost
    of one block

47
Selection Operation
  • Estimation example (contd)
  • Can we improve (contd)
  • ? salary gt 20 000
  • Range search on salary attribute
  • Salary attribute has Btree index
  • Use strategy 8
  • 2 50/2 3000/2 1527
  • Strategy 7 worse than linear search
  • Use linear search
  • Estimated cardinality
  • 3000(50000-20000)/(50000-10000) 2250

48
Join Operation
  • Commonly a performance concern
  • Most time consuming after the Cartesian Product
  • Critical to perform as efficiently as possible

49
Join Operation
  • Theta join
  • Tuples satisfying a predicate from the Cartesian
    product of R and S
  • Predicate of form R.a ? S.b
  • ? a logical operator
  • Equi-join
  • A theta-join where the predicate is equality
  • Natural join
  • Join involves all common attributes of R and S

50
Join Operation
  • Implementation Strategies
  • Block-nested loop join
  • Indexed nested loop join
  • Sort-merge join
  • Hash join

51
Join Operation
  • Estimation of cardinality
  • Cartesian product
  • nTuples(R) nTuples (S)
  • Join
  • Worst case
  • nTuples(T) ? nTuples(R) nTuples (S)
  • Assuming uniform distribution, better estimates
    exist

52
Join Operation
  • Estimation of cardinality (contd)
  • Equi-join (R.A S.B)
  • If A is a key attribute of R, then a tuple of S
    can only join with one tuple of R
  • nTuples(T) ? nTuples(S)
  • If B is a key attribute of S
  • nTuples(T) ? nTuples(S)
  • If neither A nor B are keys
  • nTuples(T) SCA(R) nTuples(S) or
  • nTuples(T) SCB(S) nTuples(R)

For any tuple s in S, expect SCA(R) for a given
value of A, and this number to appear in join.
Multiply this by number of tuples in S.
53
Join Operation
  • Nested loop join
  • Simplest algorithm
  • Outer loop iterates over each tuple in R
  • Inner loop iterates over each tuple in S
  • Can improve
  • Block nested loop join
  • Basic unit of read/write is a disk block
  • Add two additional loops that process blocks

54
Join Operation
  • Block nested loop join (contd)
  • Cost
  • Each block of R must be read
  • Each block of S must be read for each block of R
  • nBlocks(R) (nBlocks(R) nBlocks(S))
  • Improvement
  • Since cost is based upon disk access, read as
    many blocks of the outer loop relation in as
    possible into the database buffer
  • Cost

So choose smallest relation for outer loop
Save one buffer for inner relation and one buffer
for outer relation
55
Join Operation
  • Block nested loop join (contd)
  • Cost
  • nBlocks(R) nBlocks(S)(nBlocks(R)/(nBuffer-2))
  • nBlocks(R) nBlocks(S)

If all blocks of R can be read into the buffer
56
Join Operation
  • Index nested loop join
  • Index or hash function exists on join attributes
  • Replace inefficient file scan with index/hash
    lookup
  • Avoids enumeration of Cartesian product
  • Again, read in as many blocks as possible into
    database buffers
  • Cost
  • Join attribute A in S is PK

57
Join Operation
  • Index nested loop join (contd)
  • Join attribute A in S is PK
  • nBlocks(R) nTuples(R) (nLevelsA(I) 1)
  • Join attribute A in S is clustering index
  • nBlocks(R) nTuples(R)(nLevelsA(I)SCA(R)/bFact
    or(R))

58
Join Operation
  • Sort-merge join
  • For Equi-join, most efficient join is when both
    relations are sorted on join attributes
  • Then look for qualifying tuples by merging two
    relations
  • Tuples with same join attribute value guarenteed
    to be in consecutive order
  • Each block of each relation needs to be read only
    once

Assume each set of tuples with same join value
can be held in DB buffer at same time
59
Join Operation
  • Sort-merge join (contd)
  • Cost
  • nBlocks(R) nBlocks(S)
  • Need to presort?
  • Add in nBlocks(R)log2(nBlocks(R))

60
Join Operation
  • Hash join
  • May be used for Natural join or Equijoin
  • Partition phase
  • Select hash function h to partition R and S
  • h(R.B) ! h(S.C) gt R.B ! S.C
  • Probing phase
  • Note h(R.B) h(S.C) ?gt R.B S.C
  • Read each R partition
  • Attempt to join tuples with tuples in equivalent
    S partition
  • Can use nested loop join for this phase

61
Join Operation
  • Hash join (contd)
  • Cost
  • Read R and S to partition
  • nBlocks(R) nBlocks(S)
  • Write each partition to disk
  • nBlocks(R) nBlocks(S)
  • Read in each partition to find matching tuples
  • nBlocks(R) nBlocks(S)
  • Total
  • 3(nBlocks(R) nBlocks(S)

Assumes no overflow and in memory hash index
62
Join Operation
  • Hash join (contd)
  • Cost (contd)
  • If above assumption doesnt hold, then
    partitioning of relations cannot be done in one
    pass
  • Use recursive partitioning algorithm
  • 2(nBlocks(R) nBlocks(S))lognBuffer-1(nBlocks(S
    ))-1 nBlocks(R) nBLocks(S)

63
Join Operation
  • Estimation Example (Branch Staff)
  • There exists a hash index on PK staffNo
  • There exists a hash index on PK branchNo
  • There are 100 database buffer blocks
  • The system catalog holds
  • nTuples(Staff) 3000
  • bFactor(Staff) 30 gt nBlocks(Staff) 200
  • nTuples(Branch) 500
  • bFactor(Branch) 50 gt nBlocks(Branch) 10

64
Join Operation
  • Estimation Example (Branch Staff) (contd)
  • nTuples(PropertyForRent) 100 000
  • bFactor(PFR) 50 gt nBlocks(PFR) 2000
  • Estimate
  • J1 Staff ltJgtstaffNo PropertyForRent
  • J2 Branch ltJgtstaffNo PropertyForRent

65
Join Operation
  • Estimation Example (Branch Staff) (contd)

Best strategy depends on situation
66
Projection Operation
  • Extracts vertical subset of relation R, and
    produces single relation S
  • Implementation strategy considers
  • Removal of attributes that arent required
  • Not problematic
  • Elimination of duplicate tuples produced from
    previous step
  • More problematic

67
Projection Operation
  • Estimation of Cardinality of result set
  • Projection contains key
  • ntuples(S) nTuples(R)
  • Projection contains single non-key attribute A
  • ntuples(S) SCA(R)

68
Projection Operation
  • Cost Estimation of Duplicate Elimination
  • Duplicate elimination using sorting
  • Sort attributes left in reduced relation of R
  • Makes duplicates adjacent
  • Must create intermediate relation at a cost of
    nBlocks(R)
  • Sorting costs nBlocks(R)log2(nBlocks(R))
  • Total cost -
  • nBlocks(R) nBlocks(R)log2(nBlocks(R))

69
Projection Operation
  • Cost Estimation of Duplicate Elimination
  • Duplicate elimination using hashing
  • Partition
  • Allocate one buffer block for reading R, allocate
    one nBuffer 1 buffer blocks to store partitioned
    output
  • Remove unwanted attributes
  • Apply hash function to remaining attributes
    (chose hash function for uniform distribution)
  • Two tuples in different partitions guaranteed not
    to be duplicates
  • Search area for duplicates now reduced

Useful if large number of buffer blocks compared
to number of blocks for R
70
Projection Operation
  • Cost Estimation of Duplicate Elimination
  • Duplicate elimination using hashing (contd)
  • Eliminate duplicates
  • For each nBuffer 1 buffer blocks of stored
    partitioned output
  • Apply second hash funtion to tuple
  • Insert hash value into in-memory hash table
  • If tuples hash to same value as another tuple,
    check if duplicate
  • Eliminate new tuple if duplicate
  • Write result tuples

71
Projection Operation
  • Cost Estimation of Duplicate Elimination
  • Duplicate elimination using hashing (contd)
  • Cost
  • Assume hashing has no overflow
  • Exclude cost of writing result relation
  • Cost is
  • nBlocks(R) nb

nb is number of blocks required by intermediate
relation before duplicate elimination.
72
Set Operations
  • Binary operations
  • Union
  • Intersection
  • Set difference
  • Must be union compatible
  • Implementation
  • Sort both relations, and scan result set once
  • Union
  • Place in result set any tuple that appears in
    either original relation, eliminating duplicates

73
Set Operations
  • Implementation (contd)
  • Intersection
  • Place in result set any tuple that appears both
    original relation, eliminating duplicates
  • Set difference (R-S)
  • Place in result set any tuple that appears in R
    that is not in S
  • Cost
  • Use sort-merge join algorithm
  • nBlocks(R) nBlocks(S) nBlocks(R)log2(nBlocks
    (R)) nBlocks(S)log2(nBlocks(S))

74
Set Operations
  • Estimation of Cardinality of result set
  • Estimate of set operations more problematic,
    usually upper and lower bound given
  • Union (RUS)
  • max(nTuples(R), nTuples(S)) nTuples(T)
    nTuples(R) nTuples(S)
  • Intersection (RnS)
  • 0 nTuples(T) min(nTuples(R), nTuples(S))
  • Set Difference (R-S)
  • 0 nTuples(T) nTuples(R)

75
Query Optimization
  • Pipelining and Query Optimization in Oracle

76
Pipelining
  • Used to improve query processing performance
  • (aka On-the-Fly Processing)
  • What weve assumed until now
  • Materialization The output of one operation is
    stored in a temp relation (and written
    temporarily to disk) for use by the next
    operation
  • Alternative Pipeline results of one operation
    to another operation so that no temporary
    relation needs to be created for the intermediate
    result

77
Pipeline Implementation
  • Typically a separate thread within DBMS
  • Uses a buffer to hold the tuples passing from one
    operation to the other
  • Drawback Operation inputs may not all be
    available at once for processing which can limit
    algorithm options
  • (ie. Cant use Sort-Merge Join Algorithm if the
    pipelines input tuples arent also sorted on the
    join attribute)
  • Benefit Some execution strategies can be greatly
    improved

78
Pipelining example
  • sposition Manager(ssalary gt 2000(Staff))
  • Normal approach
  • Process the selection on salary
  • Store result in temp relation
  • Do second selection on temp relation
  • Pipeline approach
  • Apply the second selection to each tuple in the
    result of the first relation AS IT IS PRODUCED
  • Does away with need for a temp relation

79
Trees
  • Used to show the execution order of operations
  • 2 general types of trees
  • Linear trees (Left-deep, Right-deep, Other)
  • Non-linear trees (aka Bushy trees)

80
Linear Trees
  • Relation on one side of the operator always a
    base relation
  • When the operations are a join, the left child is
    the outer relation and the right is the inner
    relation
  • Inner relations must always be materialized
  • Makes left-deep trees extremely useful since all
    inner relations are base relations
  • Allows left-deep trees to be implemented using a
    fully-pipelined execution

81
Query Optimization in Oracle
  • Supports both Rule-Based and Cost-Based
    Optimization
  • Gives user ability to create and maintain
    Histograms
  • Allows the Execution Plan to be viewed

82
Oracle Rule-Based Optimizer
  • Contains a table of rankings for different access
    paths available for queries.
  • Optimizer determines rankings according to this
    table for each different Execution Strategy and
    then selects the strategy with the best score

83
Rule-Based Optimization Rankings
84
Rule-Based Optimization Ex.
  • SELECT propertyNoFROM PropertyForRentWHERE
    rooms gt 7 AND city St Louis
  • Available access paths
  • Full Table Scan (available for every query)Rank
    15
  • Unbounded range search on rooms column indexRank
    11
  • Single-Column access on the city indexRank 9

85
Oracle Cost-Based Optimizer
  • Use the Execution Strategy requiring the least
    amount of resource use
  • User can specify whether this is based on
    throughput or on response time
  • Can also suggest hints for the optimizer to
    consider
  • Relies on gathering of statistics generated by
    user
  • Manual updates necessary when database changes
  • Done on entire data structure or using sampling
    (row-based and block-based)

86
Optimizer Hints in Oracle
  • Used to force the optimizer into a decision
    different than it would otherwise choose
  • Can force the use of
  • Rule-Based Optimizer
  • Particular Access Path
  • Particular Join Order
  • Particular Join Operation (ie. Sort-Merge Join)

87
Optimizer Hints in Oracle
  • SELECT / INDEX(sexidx) / name, positionFROM
    StaffWHERE sex M
  • Forces the use of the sex index
  • Can improve performance over a full-table scan if
    we know that a relatively high of the staff are
    female

88
Histograms
  • Provide a more realistic estimation of data value
    distribution within columns
  • Used to better estimate of tuples returned by
    queries/operations
  • Width-Balanced Data divided into fixed number of
    equal width buckets
  • Height-Balanced Data divided such that each
    bucket contains the same of values
  • Overhead Storing and Maintaining

89
Histograms
90
Width-Balanced Histogram
20
40
28
8
4
91
Height-Balanced Histogram
92
Oracle Execution Plan
  • Seen by using the EXPLAIN PLAN command
  • Output written to a database table
  • Main Table Columns
  • STATEMENT_ID - Defined as EXPLAIN PLAN parameter
  • OPERATION - Name of internal operation performed
    (ie SELECT)
  • OPTIONS - Name of another internal operation
    performed
  • OBJECT_NAME - Name of the table or index
  • ID - Number assigned to each execution plan step
  • PARENT_ID - ID of next step that operates on ID
    step output
  • POSITION - Processing order for steps with same
    PARENT_ID
  • COST - Estimated cost of the operation (null for
    rule_based)
  • CARDINALITY - Estimated number of rows accessed
    by this operation

93
Oracle Execution Plan
94
References
  • Alfred V. Aho, Yehoshua Sagiv, Jeffrey D. Ullman
    Equivalences Among Relational Expressions. SIAM
    J. Comput. 8(2) 218-246(1979)
  • Stefano Ceri, Georg Gottlob Translating SQL Into
    Relational Algebra Optimization, Semantics, and
    Equivalence of SQL Queries. TSE 11(4)
    324-345(1985)
  • Sang K. Cha and Gio Wiederhold, "Kaleidoscope a
    cooperative menu-guidedquery interface,"
    Proceedings of the 1990 ACM SIGMOD
    internationalconference on Management of data,
    page 387, 1990, Atlantic City, NewJersey, United
    States
  • Raghu Ramakrishnan, Johannes Gehrke, "Database
    management systems Boston McGraw-Hill, 2000
    Chapter 12, 13
  • Abraham Silberschatz, Henry F. Korth, S.
    Sudarshan, Database system concepts Boston
    McGraw-Hill, 2002 Chapter 14
Write a Comment
User Comments (0)
About PowerShow.com