Chapter 13 Query Processing - PowerPoint PPT Presentation

1 / 28
About This Presentation
Title:

Chapter 13 Query Processing

Description:

File scan is the lowest-level operator to access data. File scans are search algorithms that locate and retrieve records that fulfill a ... – PowerPoint PPT presentation

Number of Views:40
Avg rating:3.0/5.0
Slides: 29
Provided by: melissa115
Learn more at: http://www.cs.sjsu.edu
Category:

less

Transcript and Presenter's Notes

Title: Chapter 13 Query Processing


1
Chapter 13 Query Processing
  • Melissa Jamili
  • CS 157B
  • November 11, 2004

2
Outline
  • Overview
  • Measures of query cost
  • Selection operation
  • Basic algorithms
  • Selections using indices
  • Selections involving comparisons
  • Implementation of complex selections
  • Sorting

3
Overview
  • Query processing the range of activities
    involved in extracting data from a database.
  • Includes translation of queries,
    query-optimizing transformations, evaluation of
    queries.

4
Overview (cont.)
  • Steps involved in processing a query
  • Parsing and translation
  • Optimization
  • Evaluation

5
Step 1. Parsing and translation
  • System checks the syntax of the query.
  • Creates a parse-tree representation of the query.
  • Translates the query into a relational-algebra
    expression.

6
Step 2. Optimization
  • Each relational-algebra operation can be executed
    by one of several different algorithms.
  • Ex. Every tuple can be searched linearly or a B
    tree can be used to index the tuples.
  • A query optimizer must know the cost of each
    operation.

7
Step 3. Evaluation
  • A relational-algebra expression and evaluation
    primitive are needed.
  • evaluation primitive - relational-algebra
    expression annotated with instructions specifying
    how to evaluate each operation.

8
Step 3. Evaluation (cont.)
  • Query execution plan or query-evaluation plan
    sequence of primitive operations that can be used
    to evaluate the expression.

? balance (s balance lt 2500 (account))
9
Step 3. Evaluation (cont.)
  • Then the query-execution engine takes a
    query-evaluation plan, executes that plan, and
    returns the answers to the query.

10
Steps in query processing
11
Measures of Query Cost
  • Measured in terms of different resources,
    including disk accesses and CPU execution time
  • Measured in response time for a query-evaluation
    plan (clock time to execute the plan)

12
Measures of Query Cost (cont.)
  • Disk accesses, number of block transfers from
    disk, are usually the most important cost
  • To calculate add these numbers
  • of seek operations performed
  • of blocks read
  • of blocks written
  • After multiplying them by the average seek time,
    average transfer time, and average transfer time,
    respectively.

13
Selection Operation
  • File scan is the lowest-level operator to access
    data.
  • File scans are search algorithms that locate and
    retrieve records that fulfill a selection
    condition.
  • Allows an entire relation to be read where the
    relation is stored in a single, dedicated file.

14
Basic Algorithms
  • Two basic scan algorithms to implement the
    selection operation
  • A1 (linear search).
  • The system scans each file block and tests all
    records to see whether they satisfy the selection
    condition.
  • System terminates if the required record is
    found, without looking at the other records.
  • A2 (binary search).
  • System performs the binary search on the blocks
    of the file.
  • File is ordered on an attribute, and selection
    condition is an equality comparison.

15
Selections Using Indices
  • Index structures referred to a access paths,
    since they provide a path through which data can
    be located and accessed.
  • Primary index allows records to be read in an
    order that corresponds to the physical order in
    the file.
  • Secondary index any index that is not a primary
    index.

16
Selections Using Indices (cont.)
  • A3 (primary index, equality on key).
  • Use the index to retrieve a single record that
    satisfies the corresponding equality condition.
  • A4 (primary index, equality on nonkey).
  • Same as A3, but multiple records may need to be
    fetched.

17
Selections Using Indices (cont.)
  • A5 (secondary index, equality).
  • Retrieve a single record if the equality
    condition is on a key.
  • Retrieve multiple records if the indexing field
    is not a key.

18
Selections Involving Comparisons
  • Consider the form sA v (r).
  • Can be implemented by linear or binary search or
    by using indices
  • A6 (primary index, comparison).
  • For A v, look for first tuple that has the
    value of A v, return all the tuples starting
    from the tuple up to the end of the file.
  • For A lt v, file scan from the beginning up to
    (but not including) the first tuple with
    attribute A v.

19
Selections Involving Comparisons (cont.)
  • A7 (secondary index, comparison).
  • The lowest-level blocks are scanned
  • From the smallest value up to v for lt and .
  • From v up to the maximum value for gt and .

20
Implementation of Complex Selections
  • Previously only considered simple selection
    conditions with an equality or comparison
    operation.
  • Now consider more complex selection predicates.

21
Implementation of Complex Selections (cont.)
  • Conjunction s?1 n ?2 n n ?n (r)
  • Disjunction s?1 ? ?2 ? ? ?n (r)
  • Negation s-? (r)

22
Implementation of Complex Selections (cont.)
  • A8 (conjunctive selection using one index).
  • Determine if there is an access path for an
    attribute in one of the simple conditions.
  • If yes, algorithms A2 through A7 can be used.
  • On all records retrieved test if they satisfy the
    remaining simple conditions.

23
Implementation of Complex Selections (cont.)
  • A9 (conjunctive selection using composite index
    (multiple attributes).
  • Search index directly if selection specifies an
    equality condition on 2 or move attributes and a
    composite index exists.
  • Type of index determines if A3, A4, or A5 will be
    used.

24
Implementation of Complex Selections (cont.)
  • A10 (conjunction selection by intersection of
    identifiers).
  • Requires indices with record pointers.
  • Scan each index for pointers that satisfy an
    individual condition.
  • Take intersection of all retrieved pointers to
    get set of pointers that satisfy the conjunctive
    condition.
  • Then pointers used to retrieve actual records.
  • If indices not available on all conditions, then
    algorithm tests retrieved records against
    remaining conditions.

25
Implementation of Complex Selections (cont.)
  • A11 (disjunctive selection by union of
    identifiers).
  • Each index is scanned for pointers to tuples that
    satisfy the individual condition.
  • Union of all the retrieved pointers gives the set
    of pointers to all tuples that satisfy the
    disjunctive condition.
  • Pointers then used to retrieve actual records.

26
Sorting
  • Focus on external sorting where relations that
    are bigger than memory.
  • Most common is external merge-sort algorithm.

27
Sorting (cont.)
  • N-way merge
  • A number of runs are created and sorted.
  • Runs are merged.

28
Conclusion
  • Query processing - translation, transformation,
    and evaluation.
  • Query cost measured by number of disk accesses.
  • Selection operation algorithms
  • Sorting N-way merge.
Write a Comment
User Comments (0)
About PowerShow.com