File Processing : Basic Concepts of Query Processing - PowerPoint PPT Presentation

1 / 15
About This Presentation
Title:

File Processing : Basic Concepts of Query Processing

Description:

Choose ONE QEP among QEPs based on. Execution Cost of each QEP, where cost means ... Selectivity. Distribution of data. STEM. PNU. Cost Model : Basic Concepts ... – PowerPoint PPT presentation

Number of Views:52
Avg rating:3.0/5.0
Slides: 16
Provided by: lik
Category:

less

Transcript and Presenter's Notes

Title: File Processing : Basic Concepts of Query Processing


1
File Processing Basic Concepts of Query
Processing
  • 2005, Spring
  • Pusan National University
  • Ki-Joune Li

2
Basic Concepts of Query
  • Query
  • Retrieve records satisfying predicates
  • Types of Query
  • Operators
  • Aggregate Query
  • Sorting

3
Relational Operators Select
  • Selection (?condition)
  • Retrieve records satisfying predicates
  • Example
  • Find Student where Student.Score gt 3.5
  • ?scoregt3.5(Student)
  • Index or Hash

4
Relational Operators Project
  • Project (?attributes)
  • Extract interesting attributes
  • Example
  • Find Student.name where score gt 3.5
  • ?name(?acoregt3.5(Student))
  • Full Scan

5
Cartisan Product
  • Cartisan Product (?)
  • Two Tables R1 ? R2
  • Produce all cross products
  • Join ( )

?

6
Join
  • Join ( )
  • Select combined records of cartisan product with
    same value of a common attribute (Natural Join)
  • Example
  • Student (StudentName, AdvisorProfessorID,
    Department, Score)
  • Professor(ProfessorName, ProfessorID,
    Department)
  • Student AdivsorProfessorIDProfessorID
    Professor
  • ? AdivsorProfessorIDProfessorID(Student
    ?Professor)
  • Double Scan Expensive Operation

7
Relational Algebra
  • Relational Algebra
  • Operand Table (Relation)
  • Operator Relational Operator (?, ?, , etc)
  • Example
  • Find Student Name where Student Score gt 3.5 and
    Advisor Professor belongs to CSE Department
  • ?student.name(?acoregt3.5(Student)
    ?DepartmentCSE (Professor) )
  • Relational Algebra Specifies the sequence of
    operations

8
Query Processing Mechanism
  • Query Processing Steps
  • 1. Parsing and translation
  • 2. Optimization
  • 3. Evaluation

9
Parsing and Translation
  • Parsing Query Statement (e.g. in SQL)
  • Translation into relational algebra
  • Equivalent Expression
  • For a same query statement
  • several relation algebraic expressions are
    possible
  • Example
  • ?balance ? 2500(?name(account )) ? ?name(?balance
    ? 2500(account ))
  • Different execution schedules
  • Query Execution Plan (QEP)
  • Determined by relational algebra
  • Several QEPs may be produced by Parsing and
    Translation

10
Query Optimization
  • Choose ONE QEP among QEPs based on
  • Execution Cost of each QEP, where cost means
    execution time
  • How to find cost of each QEP ?
  • Real Execution
  • Exact but Not Feasible
  • Cost Estimation
  • Types of Operations
  • Number of Records
  • Selectivity
  • Distribution of data

11
Cost Model Basic Concepts
  • Cost Model Number of Block Accesses
  • Cost
  • C Cindex Cdata
  • where Cindex Cost for Index Access
  • Cdata Cost for Data Block Retrieval
  • Cindex vs. Cdata ?
  • Cindex depends on index
  • Cdata
  • depends on selectivity
  • Random Access or Sequential Access
  • Selectivity
  • Number (or Ratio) of Objects Selected by Query

12
Cost Model Type of Operations
  • Cost model for each type of operations
  • Select
  • Project
  • Join
  • Aggregate Query
  • Query Processing Method for each type of
    operations
  • Index/Hash or Not

13
Cost Model Number of Records
  • Number of Records
  • Nrecord ? Nblocks
  • Number of Scans
  • Single Scan
  • O(N) Linear Scan
  • O(logN ) Index
  • Multiple Scans
  • O(NM ) Multiple Linear Scans
  • O(N logM ) Multiple Scans with Index

14
Selectivity
  • Selectivity
  • Affects on Cdata
  • Random Access
  • Scattered on several blocks
  • Nblock ? Nselected
  • Sequential Access
  • Contiguously stored on blocks
  • Nblock Nselected / Bf

15
Selectivity Estimation
  • Selectivity Estimation
  • Depends on Data Distribution
  • Example
  • Q1 Find students where 60 lt weight lt 70
  • Q2 Find students where 80 lt weight lt 90
  • How to find the distribution
  • Parametric Method
  • e.g. Gaussian Distribution
  • No a priori knowledge
  • Non-Parametric Method
  • e.g. Histogram
  • Smoothing is necessary
  • Wavelet, Discrete Cosine
Write a Comment
User Comments (0)
About PowerShow.com