Title: Deferred update updates are not written to the database until after a transaction has reached it com
1Recovery Techniques
- Deferred update updates are not written to the
database until after a transaction has reached it
commit point. - Immediate update updates are applied to the
database as they occur without waiting to reach
the commit point.
2Query Processing
- Query decomposition transform a high-level
query into a relational algebra query and check
that the query is syntactically and semantically
correct. - Query optimization use transformation rules to
convert one relational algebra expression into an
equivalent form that is know to be more
efficient. - Code generation
- Runtime query execution
3SQL Processing Architecture
4Query Tree
- A leaf node is created for each base relation in
the query - A non-leaf node is created for each intermediate
relation produced by a relational algebra
operation - The root of the tree represents the result of the
query - The sequence of operation is directed from the
leaves to the root
5Normalization
- Conjunctive normal form a sequence of conjuncts
that are connected with the AND (?) operator.
Each conjunct contains one or more terms
connected by the OR (?) operator. - Disjunctive normal form a sequence of disjuncts
that are connected with the OR (?) operator. Each
disjunct contains one or more terms connected by
the AND (?) operator.
6Semantic Analysis
- Relation connection graph if the graph is not
connected, the query is incorrectly formulated. - Normalized attribute connection graph if the
graph has a cycle for which the valuation sum is
negative, the query is contradictory.
7Relation Connection Graph
- Create a node for each relation
- Create a node for the result
- Create edges between two nodes that represent a
join - Create edges between nodes that represent the
source of projection operations
8Normalized Attribute Connection Graph
- Create a node for each reference to an attribute
- Create a node for constant 0
- Create a directed edge between nodes that
represent a join - Create a directed edge between an attribute node
and a constant 0 node that represents a selection
operation. - Weight the edges a ? b with the value c, if it
represents the inequality condition (a ? b c) - Weight the edges 0 ? a with the value c, if it
represents the inequality condition (a ? c).
9Heuristical Processing Strategies
- Perform selection operations as early as possible
- Combine the Cartesian product with a subsequent
selection operation whose predicate represents a
join condition into a join operation. - Use associativitiy of binary operations to
rearrange leaf nodes so that the leaf nodes with
the most restrictive selection operations are
executed first. - Perform projection as early as possible.
- Compute common expression once and store the
result this is beneficial only if the size of
the result from the common expression is small
enough.
10Query Optimization in Oracle
- The process of choosing the most efficient way to
execute a SQL statement. - Many different ways to execute a SQL statement
often exist, for example, by varying the order in
which tables or indexes are accessed. - To execute a DML statement, Oracle may have to
perform many steps (execution plan). Each of
these steps either retrieves rows of data
physically from the database or prepares them in
some way for the user issuing the statement.
11A Query Example
SELECT ename, job, sal, dname FROM emp, dept
WHERE emp.deptno dept.deptno AND NOT EXISTS
(SELECT FROM salgrade WHERE emp.sal BETWEEN
losal AND hisal)
12An Execution Plan
13Execution Order of the Example Query
- Oracle performs step 3 and returns the resulting
rows, one by one, to step 2. - For each row returned by step 3, Oracle performs
the following steps - Oracle performs step 5 and returns the resulting
rowid to step 4. - Oracle performs step 4 and returns the resulting
row to step 2. - Oracle performs step 2, joining the single row
from step 3 with a single row from step 4, and
returns a single row to step 1. - Oracle performs step 6 and returns the resulting
row, if any, to step 1. - Oracle performs step 1. If a row is not returned
from step 6, then Oracle returns the row from
step 2 to the user issuing the SQL statement.
14Using EXPLAIN PLAN
- Displays execution plans chosen by the Oracle
optimizer for SELECT, UPDATE, INSERT, and DELETE
statements. - UTLXPLAN.SQL create a sample output table
called PLAN_TABLE in your schema. - UTLXPLS.SQL - Shows plan table output for serial
processing. - UTLXPLP.SQL - Shows plan table output with
parallel execution columns. - http//technet.oracle.com/doc/oracle8i_816/server.
816/a76992/ch13_exp.htm
15Components of Execution Plan
- An ordering of the tables referenced by the
statement. - An access method for each table mentioned in the
statement. - A join method for tables affected by join
operations in the statement.
16Goals of Optimization
- The best throughput i.e., using the least amount
of resources necessary to process all rows
accessed by the statement. - The best response time i.e., using the least
amount of resources necessary to process the
first row accessed by a SQL statement. - For parallel execution of a SQL statement, the
optimizer can choose to minimize elapsed time at
the expense of resource consumption.
17Rule-Based Optimization
- The optimizer chooses an execution plan based on
the access paths available and the ranks of these
access paths. - Oracle's ranking of the access paths is
heuristic. - Usually, operations of lower rank execute faster
than those associated with constructs of higher
rank. - http//technet.oracle.com/doc/oracle8i_816/server.
816/a76992/optimops.htm38864
18Access Paths and Their Ranking
- Single Row by Rowid Single Row by Cluster Join
- Single Row by Hash Cluster Key with Unique or
Primary Key - Single Row by Unique or Primary Key
- Clustered Join Hash Cluster Key Indexed Cluster
Key - Composite Index Single-Column Indexes
- Bounded Range Search on Indexed Columns
- Unbounded Range Search on Indexed Columns
- Sort-Merge Join
- MAX or MIN of Indexed Column
- ORDER BY on Indexed Column
- Full Table Scan
19Cost-Based Optimization
- The optimizer generates a set of potential
execution plans for the SQL statement based on
its available access paths and hints. - The optimizer estimates the cost of each
execution plan based on statistics in the data
dictionary for the data distribution and storage
characteristics of the tables, indexes, and
partitions accessed by the statement. - The optimizer compares the costs of the execution
plans and chooses the one with the smallest cost.
- http//technet.oracle.com/doc/oracle8i_816/server.
816/a76992/optimops.htm30274
20Cost-Based Optimizer Architecture
21Related Views
- USER_CLUSTERS and DBA_CLUSTERS
- USER_TABLES, ALL_TABLES, and DBA_TABLES
- USER_TAB_COLUMNS
- USER_INDEXES
- USER_TAB_PARTITIONS
- USER_TAB_SUBPARTITIONS
- USER_IND_PARTITIONS
- USER_IND_SUBPARTITIONS
- USER_PART_COL_STATISTICS
- USER_SUBPART_COL_STATISTICS
22Optimizing Joins
- Choose an access path to retrieve data from each
table in the join. - Select a join operation.
- Determine join order if more than two tables
involved. - SELECT
- FROM emp, dept
- WHERE emp.deptno dept.deptno
23Nested Loops Join
- The optimizer chooses one of the tables as the
outer table, or the driving table. The other
table is called the inner table. - For each row in the outer table, Oracle finds all
rows in the inner table that satisfy the join
condition. - Oracle combines the data in each pair of rows
that satisfy the join condition and returns the
resulting rows.
24Nested Loops Join
25Execution Steps
- Step 2 accesses the outer table (emp) with a full
table scan. - For each row returned by step 2, step 4 uses the
emp.deptno value to perform a unique scan on the
pk_dept index. - Step 3 uses the rowid from step 4 to locate the
matching row in the inner table (dept). - Oracle combines each row returned by step 2 with
the matching row returned by step 4 and returns
the result.
26Sort Merge Join
- Oracle sorts each row source to be joined if they
have not been sorted already by a previous
operation. The rows are sorted on the values of
the columns used in the join condition. - Oracle merges the two sources so that each pair
of rows, one from each source, that contain
matching values for the columns used in the join
condition are combined and returned as the
resulting row source.
27Sort Merge Join
28Execution Steps
- Steps 3 and 5 perform full table scans of the emp
and dept tables. - Steps 2 and 4 sort each row source separately.
- Step 1 merges the sources from steps 2 and 4
together, combining each row from step 2 with
each matching row from step 4, and returns the
resulting row source.
29Hash Join
- Oracle performs a full table scan on each of the
tables and splits each into as many partitions as
possible based on the available memory. - Oracle builds a hash table from one of the
partitions (if possible, Oracle selects a
partition that fits into available memory).
Oracle then uses the corresponding partition in
the other table to probe the hash table. All
partition pairs that do not fit into memory are
placed onto disk. - For each pair of partitions (one from each
table), Oracle uses the smaller one to build a
hash table and the larger one to probe the hash
table.
30Hash Join
31Execution Steps
- Steps 2 and 3 perform full table scans of the emp
and dept tables. - Step 1 builds a hash table out of the rows coming
from step 2 and probes it with each row coming
from step 3.
32Cluster Join
33Execution Steps
- Step 2 accesses the outer table (dept) with a
full table scan. - For each row returned by step 2, step 3 uses the
dept.deptno value to find the matching rows in
the inner table (emp) with a cluster scan.
34Cost of Join Operation
- The cost of a nested loops join access cost of
A (access cost of B number of rows from A) - The cost of a merge join access cost of A
access cost of B (sort cost of A sort cost of
B) - Estimated costs to perform a hash join (access
cost of A number of hash partitions of B)
access cost of B
35Additional Factors
- A nested loops join (NL) is inefficient when a
join returns a large number of rows typically,
more than 10,000 rows is considered large, and
the optimizer may choose not to use it. - If you are using the RBO, then a merge join is
the most efficient join when a join returns a
large number or rows. - If you are using the CBO, then a hash join is the
most efficient join when a join returns a large
number or rows.