DBMS Performance: A multidimensional Challenge - PowerPoint PPT Presentation

About This Presentation
Title:

DBMS Performance: A multidimensional Challenge

Description:

DBMS servers are defacto backbend for most applications. Simple Client ... Each application has unique performance requirement. Throughput ... Dual-Core/Quad core ... – PowerPoint PPT presentation

Number of Views:55
Avg rating:3.0/5.0
Slides: 37
Provided by: rca56
Category:

less

Transcript and Presenter's Notes

Title: DBMS Performance: A multidimensional Challenge


1
DBMS PerformanceA multidimensional Challenge
  • Vadiraja Bhatt
  • Database Performance engineering group.

2
DBMS Performance
  • DBMS servers are defacto backbend for most
    applications
  • Simple Client server
  • Muli-tier
  • ERP ( SAP, PeopleSoft..)
  • Financial
  • Healthcare
  • Web applications
  • Each application has unique performance
    requirement
  • Throughput v/s Response time

3
Performance challenges
  • Keeping up with various application
    characteristics
  • Financial
  • Mostly OLTP
  • Batch jobs
  • Report Generations
  • Security
  • Internet Portal
  • Mostly readonly
  • Loosely structured data
  • Transactions are not important
  • Short-lived sessions

4
Performance challengesTHE DATA EXPLOSION
  • Web 174 TB on the surface
  • Email 400,000 TB/ year
  • Instant Messaging 274 TB/ year
  • Transactions Growing 125/ year

Data Volume
Time
Source http//www.sims.berkeley.edu/research/proj
ects/how-much-info-2003/
5
Performance challenges
  • Demand for increased capacity
  • Large number of users
  • Varying class of services
  • Need to satisfy SLA
  • Several generations of applications.
  • Client servers to Web applications.
  • Legacy is always painful

6
Performance Challenges
  • Consolidation
  • Cheaper hardware
  • Increased maintenance cost
  • Increased CPU power
  • Reduction in TCO ( Total cost of Ownership)
  • Net effect
  • DBMS have to work lot harder to keep-up

7
System Performance 101
  • How do I get my hardware and software to do more
    work without buying more hardware ?
  • System Performance mainly depends on 3 areas

8
De-mystifying Processors
  • Classes of processors
  • CISC/RISC/EPIC
  • Penitum is the most popular one
  • Itanium didnt make a big impact ( inspite of
    superior performance)
  • Dual-Core/Quad core are making noices
  • One 1.5GHz CPU does not necessarily yield the
    same performance as two 750MHz CPUs. Various
    parameters to account for are
  • L1/L2/L3 cache sizes (Internal CPU caches)
  • Memory Latencies
  • Cycles per instruction

9
DBMS and CPU utilization
  • DBMS servers are memory centric
  • Experiments have shown that lt 50 utilization of
    CPU
  • gt 50 cycles wasted in memory related stalls
  • Computational power of CPU is underutilized
  • Memory access is very expensive
  • Large L2 cache Increased memory
    latency
  • Cacheline sharing
  • Light-weight locking constructs causes lot of
    cache-to-cache transfers
  • Context switch becomes expensive
  • Maintaining locality is very important

10
ASE Architecture overview
Operating System
Disks
Engine 0
Engine 1 ...
Engine N
CPUs
Registers File Descriptors
Registers File Descriptors
Registers File Descriptors
5
2
1
Running
Running
Running
Shared Executable (Program Memory)
Shared Memory
Lock Chains
Proc Cache
Sleep Queue
Run Queues
lock sleep
4
3
6
disk I/O
7
Pending I/Os
N E T
N E T
D I S K
send sleep
Other Memory
8
11
ASE Engines
  • ASE Engines are OS processes that schedule tasks
  • ASE Engines are multi-threaded
  • Native thread implementation
  • ASE Performs automatic load balancing of tasks
  • Load balancing is dynamic
  • N/w endpoints are also balanced
  • ASE has automatic task affinity management
  • Tasks tend to run on the same engine that they
    last ran on to improve locality

12
ASE Engines
  • 2 configuration parameters control the number of
    engines
  • The sp_engine stored procedure can be used to
    online or offline engines dynamically
  • Tune the number of engines based on the Engine
    Busy Utilization values presented by sp_sysmon

processors max online engines 4 number of engines at startup 2
13
Benefits of Multiple Engines
  • Multiple Engines take advantage of CPU processing
    power
  • Tasks operate in parallel
  • Efficient asynchronous processing
  • Network I/O is distributed across all engines
  • Adaptive Server performance is scalable according
    to the CPU processing power

14
ASE Engines
  • Logical Process management
  • Lets users do create execution classes
  • Bind applications/login to specific execution
    classes
  • Add engines to execution classes
  • Priority can be assigned to execution classes
  • Lets DBA create various service hierarchies
  • Dynamically change the priorities
  • Dynamically modify the engine assignments based
    on the CPU workload

15
Lets not forget Memory
  • Memory is a very critical parameter to obtain
    overall system performance
  • Every disk I/O saved is performance gained.
  • Tools to monitor and manage memory

16
ASE Memory Consumption
  • Over 90 of memory is reserved for buffer caches.
  • DBMS I/Os
  • Varying sizes
  • Random v/s Sequential
  • Asynchronous v/s Synchronous
  • Object access pattern
  • Most often accessed objects

17
Traditional cache management
  • All the objects share the space
  • Different objects can keep throwing each other
    out
  • Long running low priority query can throw out a
    often accessed objects
  • Increased contention in accessing data in cache

18
ASE Distributing I/O across Named Caches
  • If there there are many widely used objects in
    the database, distributing them across named
    caches can reduce spinlock contention
  • Each named cache has its own hash table

19
Distributing I/O across cachelets
20
ASE Named Caches
  • What to bind
  • Transaction Log
  • Tempdb
  • Hot objects
  • Hot indexes
  • When to use Named caches
  • For highly contented objects
  • Hot lookup tables, frequently used indexes,
    tempdb activity, high transaction throughput
    applications are all good scenarios for using
    named caches.
  • How to determine what is hot ??
  • ASE provides Cache Wizard

21
Cache Wizard Examples
  • sp_sysmon 000500, cache wizard, 2,
    default data cache
  • default data cache
  • Buffer Pool Information

  • Object Statistics Cache Occupancy
    Information


Run Size 100 Mb Usage 80 LR/sec 2500 PR/sec 1500 Hit 40
IO Size Wash Run Size APF LR/Sec PR/Sec Hit APF Eff Usage
4 Kb 17202 Kb 16 Mb 10 800 100 87.50 75 80
2 Kb 3276 Kb 84 Mb 10 1700 1400 17.65 20 80
Object LR/sec PR/sec Hit
db.dbo.cost_cutting 1800 1150 36.11
db.dbo.emp_perks 500 200 60.00
Obj_Size Size in Cache Obj_Cache Cache_Occp
102400 Kb 40960 Kb 40 40
2048 Kb 1024 Kb 50 1
22
Reading n Writing Disk I/O
  • I/O avoided is Performance Gained
  • ASE buffer cache has algorithms to
    avoid/delay/Optimize I/Os whenever possible
  • LRU replacement
  • MRU replacement
  • Tempdb writes delayed
  • Write ahead logging Only Log is written
    immediately
  • Group commit to batch Log writes
  • Coalescing I/O using Large buffer pools
  • UFS support
  • Raw devices and File systems supported
  • Asynchronous I/O is supported

23
ASE Asynchronous Prefetch
  • Issue I/Os in advance on data that we are going
    to process later
  • Non-clustered index scan
  • Leaf pages have (key, pageno)
  • Table scan
  • Recovery
  • Reduces waiting on I/o completion.
  • Eliminates context switches.

24
ASE Private Log cache
  • begin tran
  • sql..
  • Sql..
  • Sql..
  • commit tran

PLC
L1
L2
L3
Plc flush
Log Disk
Log Write
25
Logging Resource Contention
  • PLC eliminates steady state logging contention
  • During Commit
  • Acquire Lock on Last Log page
  • Flush the PLC
  • Allocate new log pages while flushing PLC
  • Issue write on the dirty log pages
  • On high transaction throughput systems (
    1million/min)
  • Asynchronous logging service ( ALS) acts as
    coordinator for flushing PLCs and issuing log
    writes
  • Eliminates contention for Log cache and
    streamlines log writing.

26
Very Large Database Support
  • Very Large Device Support
  • Support storage of 1 Million Terabytes in 1
    Server instance
  • Virtually unlimited devices per server
    (gt2Billion)
  • Individual devices up to 4TB (support large S-ATA
    drives)
  • Partitions
  • Divide up and manage very large tables as
    individual components

27
More on PARTITION
Small chunk
28
PARTITIONS - Benefits
  • Why Partitions
  • VLDB Support
  • High Performance and Parallel processing
  • Increased Operational Scalability
  • Lower Total Cost of Ownership (TCO) through
  • Improved Data Availability
  • Improved Index Availability
  • Enhanced Data Manageability
  • Benefits
  • Partition-level management operations require
    smaller maintenance windows, thereby increasing
    data availability for all applications
  • Partition-level management operations reduce DBA
    time required for data management
  • Improved VLDB and mixed work-load performance
    reduces the total cost of ownership by enabling
    server consolidation
  • Partitions that are unavailable perhaps due to
    disk problems do not impede queries that only
    need to access other partitions

29
Sustained Query performance
  • Upgrades are double edge sword
  • Offers new features, fixes, enhancement
  • May destabilized applications due to fixing
    double negative issues
  • Customer dont want to compromise current
    performance levels
  • Long testing process before any upgrade happens
  • Time is money
  • Expensive to upgrade from customer perspective
  • Unless we have mechanism to ensure same
    performance level we are at loss

30
Query Optimization - The Reality
  • ASE query optimization usually chooses the best
    query plan, however, sub-optimal query plans will
    occasionally result.
  • Furthermore, whenever changes are made to the
    optimization process on ASE
  • Most queries will execute faster
  • Some queries will be unaffected
  • Some queries may execute slower (possibly a lot
    slower)
  • The bottom line is that optimizers are not
    perfect!
  • As a Senior Sybase Optimizer Developer once told
    me,
  • If they were perfect, they would call them
    perfectizers.

31
ASE Solution Abstract Plans
  • The Abstract Plans feature provides a new
    mechanism for users to communicate with ASE
    regarding how queries are executed.
  • This communication
  • Does not require changing application code
  • Applies to virtually all query plan attributes
  • Occurs at the individual query-level
  • May be stored and bundled in with applications
  • Will persist across ASE releases

32
What Are Abstract Plans?
  • An abstract plan (AP) is a persistent, readable
    description of a query plan.
  • Abstract plans
  • Are associated with exactly one query
  • Are not syntactically part of the query
  • May describe all or part of the query plan
  • Override or influence the optimizer
  • A relational algebra language, specific to ASE,
    is used to define the grammar of the abstract
    plan language.
  • This AP language, which uses a Lisp-like syntax

33
What Do Abstract Plans Look Like?
  • Full plan examples
  • select from t1 where c0 (i_scan c_index
    t1)
  • select from t1, t2 where (nl_g_join
  • t1.c t2.c and t1.c 0 (i_scan i1 t1)
    (i_scan i2 t2) )

Instructs the optimizer to perform an index scan
on table t1 using the c_index index.
  • Instructs the optimizer to
  • perform a nested loop join with table t1 outer
    to t2
  • perform an index scan on table t1 using the i1
    index
  • perform an index scan on table t2 using the i2
    index

34
Compilation Flow - Capture Mode
  • Abstract plans are generated and captured during
    compilation
  • Query plans are generated

System Catalog on Disk (Persistent Storage)
SQL Text
For Compiled Objects Only
Query Resolution Process
Query Tree
Stored in
sysprocedures
ASE Memory (Non-Persistent Storage)
Query Compilation Process
Stored in
Query Plan
Abstract Plan
sysqueryplans
Stored in
Procedure Cache
35
Compilation Flow - Association Mode
  • At query execution, the optimizer will search for
    a matching abstract plan. If a match is found, it
    will impact the query plan. If not, it wont.

System Catalog on Disk (Persistent Storage)
SQL Text
Query Resolution Process
For Compiled Objects Only
Stored in
Query Tree
sysprocedures
Query Compilation Process
Matching AP?
No
Yes
ASE Memory (Non-Persistent Storage)
Abstract Plan
Read from
sysqueryplans
Stored in
Procedure Cache
Query Plan
36
Q A

Thank you
Write a Comment
User Comments (0)
About PowerShow.com