SQL Server 2000 Query Processing and Optimization - PowerPoint PPT Presentation

1 / 43
About This Presentation
Title:

SQL Server 2000 Query Processing and Optimization

Description:

Clustering columns automatically added non-clustered indices (as key/bookmark) ... Locates base table row using 'bookmark' retrieved from non-clustered index ... – PowerPoint PPT presentation

Number of Views:132
Avg rating:3.0/5.0
Slides: 44
Provided by: donv2
Category:

less

Transcript and Presenter's Notes

Title: SQL Server 2000 Query Processing and Optimization


1
SQL Server 2000Query Processing and Optimization
  • Don Vilen
  • Program Manager
  • SQL Server Development Team

2
Agenda
  • SQL Server Overview
  • SQL Server Architecture
  • Storage and Access Methods
  • Query Processing and Optimization
  • Transaction Processing
  • Other Topics

3
Query Processing and Optimization
4
Query Processing and Optimization
  • SQL Server Architecture
  • Query Processor goals
  • Optimization techniques
  • Query plans
  • Futures

5
Server ArchitectureThe Big Picture
6
Query Processor Components
  • Query optimization
  • Selection of best execution plan
  • Cost-based, transformation-driven
  • Extensive logical inferences
  • Query execution
  • Algorithms to perform join, group by,
  • Hash-based, merge-based
  • Parallelism
  • Provides plan building blocks

7
Query Processor Goals
  • Responsibilities
  • Processing of DML queries (T-SQL and )
  • SELECT, INSERT, UPDATE, DELETE,
  • Processing of DDL operations
  • Index creation, DBCC CHECK,
  • Creation and maintenance of statistics
  • DBCC UPDATE STATISTICS

8
Query ProcessorGoal Performance
  • SQL Server 6.5 provides excellent OLTP
    performance
  • SQL Server 7.0 extends to set-oriented, decision
    support queries (star schema)
  • SQL Server 2000 extends to handle snowflake
    schemas, indexed views, partitioned views, and
    extensive parallelism

9
Query ProcessorGoal Modularity
  • Robustness
  • Rapid future innovation
  • Uniform internal interfaces
  • Learn from the lessons of 6.5
  • Keep the code clean ? extensible

10
Query Processor Goal Functionality
  • Distributed and heterogeneous queries
  • More complex queries, gt 16 tables
  • Some extensions, like SELECT TOP
  • Indexed views support
  • Statistics on non-indexed columns
  • Partitioned views
  • SELECT, INSERT, UPDATE, DELETE
  • Cascading DRI, etc
  • Rich heterogeneous query support (OLEDB)

11
Agenda
  • SQL Server Architecture
  • Query Processor goals
  • Optimization techniques
  • Query plans
  • Futures

12
Query OptimizerOverview
  • Rewrite- property- cost-driven
  • Like DB2 C/S v5 Tandem, not Oracle v6 (but
    Oracle 8i does)
  • Extensive rewrite set
  • E.g. index selection, join order,
  • Rich inference capabilities
  • E.g. contradiction detection
  • Sensitive to query complexity
  • Optimization time-out based on estimated
    execution cost

13
Query OptimizerOptimization model
Input tree
Subtree
Pool of alternatives
Rewrite
Output cheapest plan
New subtree
14
Query OptimizerOptimization time
  • Will you wait for the exhaustive optimization of
    your 20-table query?
  • Goal Make optimization time proportional to the
    query complexity
  • Query complexity ? cost of the optimal plan

15
Query OptimizerMulti-stage optimization
  • Multiple stages
  • No-choice queries (trivial plan)
  • Transaction processing queries
  • Complex query I
  • Complex query II
  • Parallel queries
  • No knobs!

16
Query OptimizerTree-rewrite and search
  • Change join order
  • (R JOIN S) JOIN T, can also be done as
  • (R JOIN T) JOIN S
  • Alternative or replacement
  • Evaluate filter conditions early
  • Extent of search based on query
  • Sub-second queries optimized quickly

17
Query OptimizerTransformations
  • Lots of transformations in SQL Server (over 300)

Filter (A.x 5)
GrpBy A.x, sum(A.y)
Join
Join
Join
B
A
B
A
B
A
Join
Join
Hash-Join
Filter (A.x 5)
B
B
GrpBy A.x, sum(A.y)
B
A
A
A
Simplification
Implementation
Exploration
18
Query OptimizerOver 300 transformation rules
  • Join reordering
  • Outerjoins
  • Subqueries
  • Aggregation
  • Star and snowflakes
  • Join elimination
  • Materialized views
  • Index plans
  • Update plans
  • Halloween protection
  • Empty tab simplification
  • (Integrity constraints)
  • Partitioned tables
  • Parallelism
  • Remote queries

19
Query OptimizerLogical inferences
  • Equivalence classes for columns
  • If ab then sort(a) same as sort(b)
  • Implied join predicates
  • Keys and functional dependencies
  • GROUPBY(e,ename) same as GROUPBY(e)
  • Contradiction detection
  • Infer empty table(s) from check constraints
  • Join simplification using FK constraints
  • Outer join simplification using nullability

20
Query OptimizerResult size estimation
  • Basis for cost estimation
  • Uses statistics on stored data
  • Densities and histograms
  • Keys (from unique indices)
  • Constraints (DRI, check constraints)
  • Available in showplan

21
Query OptimizerStatistics on demand
  • Optimizer relies on up-to-date statistics
  • Automatic create, drop and update statistics
  • Fall-back mechanism
  • Quick statistics estimation (heuristics)
  • Statistics on histogram and density
  • MAXDIFF to capture frequent values

22
Query OptimizerCost calculation
  • Cost to first/last row
  • I/O CPU, normalized to seconds
  • Row goals during optimization
  • E.g. optimize for first-10 rows retrieval
  • Available in showplan output
  • Forms basis for FAST-N/TOP-N hints

23
Query OptimizerChoosing the right plan
  • Give as much information as possible!
  • You wont always know how your tables will be
    used
  • Declare constraints they can help
  • Uniqueness
  • DRI
  • Nullability
  • Keep statistics up to date
  • Use auto-stats or your own maintenance plan
  • Provide useful indexes ?

24
Query OptimizerHints
  • Youre smart but
  • One or more indexes
  • Join order
  • Join, grouping, distinct algorithms
  • Row goal (FAST N)
  • But be careful!
  • Maintenance
  • Compatibility
  • your data volume can change

25
Query OptimizerUtility Operations
  • More than queries
  • Update plans
  • Bulk Import, export convert
  • CREATE INDEX
  • DBCC CHECKDB/CHECKTABLE
  • ALTER TABLE
  • CREATE STATISTICS

26
Query OptimizerOptimized update plans
  • Small updates (e.g., OLTP operations)
  • Row-by-row update all indexes for each row
  • Standard technique may use lots of random I/O
  • Large updates (e.g., warehouse refresh)
  • Index-by-index update
  • Pre-sorting per index merges change into index
  • Each index leaf is touched at most once
  • Saved index update cost often exceeds sort cost

27
1 of every 8 rows is deleted
(These spool operations share one work file)
8KB/page / 24B/entry 335 entries/page 70
fill factor ? 235 1 in 8 ? 30 deletions/page
A sort operation per updated index very fast
index maintenance
Is each index leaf touchedonce or 30 times?
On average, after lots of random insertions and
deletions, B-tree pages are about 70 full
thats why this fill factor is used in the
example.
28
Query OptimizerUtility example BCP IN
  • OLE/DB over local text file or TDS INSERT
    SELECT from tds-stream
  • Query semantics conversion, restriction
  • Standard optimized update plans
  • E.g. sorted insertions in each index

29
Agenda
  • SQL Server Architecture
  • Query processor goals
  • Optimization techniques
  • Query plans
  • Futures

30
Query PlansIterator execution model
  • An iterator consumes/produces a stream of rows
  • A query plan is a graph of iterators
  • Open, get-next, close methods
  • Pull parent-demand-driven
  • Modular, easy to add new iterators
  • Just like LEGO
  • E.g. sample, rank, top, segment, sort,

31
Query PlansExample Filter
  • GetNext Method
  • Repeat
  • Call child.GetNext() to obtain row r
  • IF EndOfStream, RETURN END_OF_STREAM
  • IF predicate(r) is TRUE, RETURN r

32
Query PlansCovering indices
  • Index with all needed columns
  • Clustering columns automatically added
    non-clustered indices (as key/bookmark)
  • CREATE INDEX ON Orders(employeeid)
  • Delivers employeeid and orderid columns
  • Clustering column not in 6.5 indices
  • 6.5 uses a RID (physical locator) model

33
Query PlansMulti-index operations
  • Index intersection, union, difference
  • month January AND store 3
  • month January AND store ltgt 3
  • Join indices of a single table
  • Extends covering indices
  • Performed using existing join algorithms (no
    special stuff)
  • Good for complex DSS/adhoc queries

34
Query PlansIndex-lookup JOIN
  • CUSTOMER JOIN ORDERS ON cust
  • For every row in CUSTOMER
  • Use index on ORDERS.cust to look up matches
  • Apply fetch performance techniques
  • Sort/pre-fetch lookup values
  • Fetch columns from CUSTOMER or ORDERS after join
  • Workhorse for OLTP queries

35
Query PlansIndex lookup/scan, Fetch
  • Indices are used to
  • Restrict rows (T.a between 10 and 20)
  • Extract columns (covering indices)
  • Obtain rows in sorted order or random order
  • QP can scan forwards or backwards
  • SQL 2000 supports mixed ASC and DESC indexes
  • Fetch locates rows in table from bookmark
  • Fetch is a separate operator

36
Query PlansFetch performance
  • Locates base table row using bookmark retrieved
    from non-clustered index
  • Simple fetch
  • Synchronous I/O
  • Scattered I/O
  • Pre-fetch reduces I/O latency issues
  • Delayed, hinted queue of records
  • Sort by RID for good caching

37
Query PlansSorting
  • Integrated as iterator, not a step
  • Pipeline result of last merge to consumer
  • Support for sort distinct
  • Sorted-input algorithms for join, grouping
  • Merge join
  • Stream aggregation

38
Query PlansAggregation GROUP BY
  • MIN/MAX
  • Use sort order provided by index
  • First non-null value
  • Forwards or backwards direction
  • Multiple min/max computed independently then
    joined
  • Local-partial-global
  • Aggregates split into multiple stages
  • Exploit interesting sort orders, joins, etc

39
Query PlansAggregation hash/stream
  • Select sum(sales) group by quarter

sort on quarterstream aggregation
hash on quarterhash aggregation
40
Query PlansMerge Join
  • T join R on T.a R.a
  • Sorted input streams
  • Nested loops on groups
  • Many-1 / Many-Many
  • Finds non-matches
  • Mark and re-scan
  • Outerjoin, not exist subq
  • Preserves order on join columns

41
Query PlansSummary
  • Large set of interesting plans
  • Combined together to answer user queries
  • Plan choice and combination decided at compile
    time by the Optimizer
  • We looked at-
  • Index-lookup join
  • Small outer input, large indexed inner
  • Merge join
  • Sorted inputs, e.g. (covering) index
  • Sorted output, e.g. stream aggregation
  • Aggregation min/max/local/global/partial
  • Others hash-join, nested-loop join,

42
Query ProcessorConclusion
  • State of the art query technology
  • Industrial strength
  • Framework for rapid future innovation
  • More than just a query processor
  • How you can help
  • Provide as much information as you can DRI,
    CHECK, NULL, etc
  • Keep statistics up to date (try auto-stats)
  • Provide sensible indexes

43
Questions?
Write a Comment
User Comments (0)
About PowerShow.com