ASE133: Performance Tuning of ASE with special emphasis on Linux - PowerPoint PPT Presentation

1 / 61
About This Presentation
Title:

ASE133: Performance Tuning of ASE with special emphasis on Linux

Description:

What is hyper threading ... processors/Clock speed/Hyper threading % cat /proc/cpuinfo. fpu ... Hyperthreading is not equivalent to having a physical processor ... – PowerPoint PPT presentation

Number of Views:358
Avg rating:3.0/5.0
Slides: 62
Provided by: jeffta
Category:

less

Transcript and Presenter's Notes

Title: ASE133: Performance Tuning of ASE with special emphasis on Linux


1
ASE133 Performance Tuning of ASE with special
emphasis on Linux
  • How to make Penguins Fly

Girish VaitheeswaranStaff Software
EngineerSybase, Inc. girish_at_sybase.com
2
Contents
  • No Introduction to Linux
  • System Performance 101
  • De-mystifying Processors
  • Lets not forget Memory
  • Reading and Writing Disk I/O
  • Sending and Receiving Network I/O
  • Conclusion

3
No Introduction to Linux
  • Does Linux need an introduction!! ?
  • Non Microkernel based architecture (monolithic
    kernel)
  • Multitasking, Secure virtual memory OS
  • Different flavors of Linux available
  • Redhat, Suse, Redflag etc
  • Supported on various hardware platforms.
  • Intel (Xeon, Pentium, Itanium), AMD (Athlon),
    IBM, SGI, SUN etc
  • Importantly, appealing to Bean Counters

4
Linux Trends
  • Consolidation from big boxes to commodity
    hardware
  • Migration to 4CPU P3s with 4/8GB RAM running
    some flavor of Linux
  • Exploiting Moores law
  • Moving from 750Mhz CPUs to 1.5GHz CPUs with
    hyperthreading
  • Sizing based on Increased clock speeds.

5
ASE on linux
  • ASE on Linux supported since 11.0.3.3
  • ASE 12.5.0.3 supports RH2.1
  • ASE 12.5.1 will support RH 3.0
  • SuSE/RedFlag supported as well.
  • More info available at www.sybase.com/linux/ase

6
System Performance 101
  • How do I get my hardware and software to do more
    work without buying more hardware ?
  • System Performance mainly depends on 3 areas

7
De-mystifying Processors
  • Classes of processors
  • Xeon with HT Technology
  • Pentium with HT Technology
  • What is hyper threading
  • Doing more work in each clock cycle by providing
    thread level parallelism in each processor
  • Advantages of hyper threading
  • Support for multi-threaded code and multi-tasking
    operations through better utilization of
    processor resources.
  • Multiple threads/tasks running simultaneously to
    increase the number of transactions that can be
    executed.
  • Improved reaction and response times for end
    users.
  • Increased number of users a server system can
    support.

8
Consolidation
  • Consolidation Guidelines
  • One 1.5GHz CPU does not necessarily yield the
    same performance as two 750MHz CPUs. Various
    parameters to account for are
  • L1/L2/L3 cache sizes (Internal CPU caches)
  • Memory Latencies
  • Cycles per instruction
  • Number of engines to run
  • On HT enabled processors 1 physical CPU can run 2
    engines
  • Having a HT enabled processor is not equivalent
    to having 2 physical processors

9
Processor Tips
  • Identifying Number of processors/Clock
    speed/Hyper threading
  • cat /proc/cpuinfo

Processor 0
Vendor_id GenuineIntel
Cpu family 15
Model name Intel Xeon CPU 2.8GHz
Cache size 512KB
Flags fpu sse ht
Processor 1
Vendor_id GenuineIntel
Cpu family 15
Model name Intel Xeon CPU 2.8GHz
Cache size 512KB
Flags fpu sse ht
10
Processor Tips
  • Determining cpu usage (mpstat, top, vmstat)
  • mpstat 5 5

010333PM CPU user nice system idle Intr/s
010333PM All 83.40 0.00 11.80 4.80 1485.
010338PM All 81.20 0.00 14.20 4.60 1433
010343PM All 83.40 0.00 11.60 5.00 1450
010348PM All 84.60 0.00 10.40 5.00 1457
010353PM All 82.60 0.00 13.40 4.00 1429
010358PM all 83.04 0.00 12.28 4.68 1451

11
Processor Tips
  • Enabling Hyperthreading
  • Enabled by default in most processors
  • Note that if ht is shown in the cat
    /proc/cpuinfo output does not mean HT is enabled.
  • Disabling Hyperthreading
  • Can be disabled during BIOS setup
  • This has been useful in cases where there are
    multiple engines and the load on the cpus is
    close to 100

12
ASE Engines Unleashed
  • ASE Engines are Linux processes that schedule
    tasks
  • ASE Engines are multi-threaded
  • ASE Performs automatic load balancing of tasks
  • ASE has automatic task affinity management
  • Tasks tend to run on the same engine that they
    last ran on to improve locality
  • Linux does not have an ability to explicitly bind
    engines to processors. (Not Yet)
  • RH AS 2.1 has built in process-processor affinity
  • Internal benchmarks have demonstrated ASEs
    ability to scale to 64 engines.

13
ASE Engines Unleashed
  • 2 configuration parameters control the number of
    engines
  • The sp_engine stored procedure can be used to
    online or offline engines dynamically
  • Tune the number of engines based on the Engine
    Busy Utilization values presented by sp_sysmon
  • Extra dataserver threads RH 7.2 only
  • For Posix aio support

processors max online engines 4 number of engines at startup 2
14
Monitoring and Tuning Engines
  • sp_sysmons kernel section reports utilization as
    shown
  • sp_sysmon 000200, kernel

Engine busy Utilization CPU Busy I/O Busy Idle
Engine 0 97.3 3.7 0.0
Engine 1 97.5 3.5 0.0
Average 97.4 3.6 0.0
15
Monitoring and Tuning Engines
  • Influencing kernel utilization
  • CPU bound tasks
  • I/O bound tasks
  • Tuning runnable process search count
  • I/O polling process count

16
Logical Process Management
  • Logical process management can be used to
    influence the priority of tasks or to do load
    balancing by using engine groups.
  • E.g.Housekeeper tuning for aggressive garbage
    collection

Identify spid for HK GC sp_who --------------------------------------------- 7 sleeping HK GC Set the garbage collection to high priority sp_setpsexe 7, priority, high Validate the priority has been set correctly sp_showpsexe spid current_priority ----------------------------------- 7 HIGH
17
Logical Process Management
  • I/O bound tasks and cpu bound tasks can be
    balanced by using engine groups.
  • E.g. Mixed work load scenario running a resource
    hogging reporting application and an Online
    reservation at the same time.
  • Step1 Create 2 engine groups and associate
    engines to engine groups

exec sp_addengine 0, onl_reservation_engroup exec sp_addengine 1, onl_reservation_engroup exec sp_addengine 2, reporting_engroup exec sp_addengine 3, reporting_engroup
18
Logical Process Management
  • Step 2 Display information about execution
    objects
  • exec sp_showcontrolinfo
  • Step 3 Create 2 execution classes
    onl_reservation_execlass and reporting_execlass
  • exec sp_addexeclass onl_reservation_execlass,
    MEDIUM, 0, onl_reservation_engroup
  • exec sp_addexeclass reporting_execlass, MEDIUM,
    0, reporting_engroup

Type Engines Engine_group
EG 0 Onl_reservation_engroup
EG 1 Onl_reservation_engroup
EG 2 Reporting_engroup
EG 3 Reporting_engroup
19
Logical Process Management
  • Step 4 Bind application logins to the respective
    execution class
  • exec sp_bindexeclass onl_sa, LG, NULL,
    onl_reservation_execlass
  • exec sp_bindexeclass reporting_sa, LG, NULL,
    reporting_execlass
  • Step 5 Validate binding information
  • exec sp_showexeclass

Classname Priority Engine_group Engines
onl_reservation_execlass MEDIUM onl_reservation_engroup 0,1
reporting_execlass MEDIUM reporting_engroup 2,3
20
Some more Engine related Tunes
  • Runnable process search count determines the
    number of times ASE engines loop looking for
    runnable tasks before yielding to the OS.
  • Default value is good in general
  • Tune this parameter only if all of the below are
    true
  • There are multiple applications running on the
    same machine and you require ASE to yield to the
    OS so that the other applications can be
    scheduled
  • The average cpu busy utilization is lt 5

21
Some more Engine related Tunes
  • I/O polling process count determines the number
    of processes ASE runs before checking for Network
    or Disk I/O.
  • Tune this parameter only if all the following
    conditions are met
  • Increase the value if the total I/O checks is
    very high and the Avg Disk I/Os per check or Avg
    Net I/Os per check is very low.
  • If the avg cpu utilization is between 70-90

22
Lets take a Checkpoint
  • Hyperthreading is not equivalent to having a
    physical processor
  • Just clock speed does not give performance
  • Add more engines if Engine busy utilization is
    high
  • Logical process management for priority
    scheduling and mixed workloads

23
Lets not forget Memory
  • Memory is a very critical parameter to obtain
    overall system performance
  • Every disk I/O saved is performance gained.
  • Tools to monitor and manage memory

24
Using Large Memory
  • Users can use 2.7G of memory out of the box by
    just changing max memory parameter in ASE
    12.5.1

Physical Memory max memory 1380000 allocate max shared memory 1
Key Shmid Owner Perms Bytes nattach
0xac03b5d9 7208963 girish 600 2126512128 3
0xac03b5da 7241732 girish 600 699727872 3
Note that 2 shared memory segments have been
created one for 1.98G and one for 667M
25
Using Large memory 12.5.0.3 and below
  • In ASE 12.5.0.3 and below, to use Large memory on
    Linux do the following
  • Max configurable shared memory
  • 2.7GB addressable memory

Add in rc.local echo ltshm in bytesgt gt /proc/sys/kernel/shmmax Or add the following line in /etc/sysctl.conf kernel.shmmaxltmax_memory_in_bytesgt /sbin/sysctl p
In RUN_SERVER add the following echo 268435456 gt /proc//mapped_base
26
Monitoring Memory
  • View Memory parameters in kb
  • free k
  • cat /proc/meminfo

Total Used Free Shared Buffers cached
5855528 2596348 3259180 1979904 179260 120084
MemTotal 5855528 kB MemFree 3256604 kB MemShared 1979904 kB Buffers 179580 kB Cached 122636 kB SwapCached 0 kB Active 1988108 kB Inact_dirty 293532 kB Inact_clean 480 kB
27
Configuring ASE memory
  • sp_configure max memory to tune memory
    configured for ASE. Dynamic option since 12.5
  • Tune this parameter based on ASE resource
    requirements
  • Remaining memory does not go to default data
    cache starting ASE 12.5
  • Do I have extra memory ?

sp_configure memory .. An additional 6020480 K bytes of memory is available for reconfiguration. This is the difference between 'max memory' and 'total logical memory'.
28
Monitoring and Tuning ASE Parameters
  • To tune various ASE memory parameters
  • sp_monitorconfig all
  • If Reused column has yes watch out.

Name Num_free Num_active Pct_act Max_used Reused
Additional network memory 2611200 0 0.00 0 NA
Number of open indexes 15 480 96.96 510 Yes
Number of open objects 491 9 1.80 9 No
29
Monitoring and Tuning ASE Parameters
Config parameter How to tune Applications
Procedure cache size sp_monitorconfig, sp_sysmon Triggers/Stored procedures
Heap memory per user sp_monitorconfig Wide columns
Number of sort buffers Size of tables involved Create index, sort, group by, order by
Number of locks sp_monitorconfig Row locks, high throughput OLTP
Number of large I/o buffers sp_monitorconfig create database/alter database
Number of worker processes sp_monitorconfig Parallel Sort/Parallel Query
Additional network memory sp_monitorconfig Text/Image
30
Memory Tuning Tips
  • Lock Shared memory
  • Guarantees shared memory to be in RAM
  • Improves performance
  • Tune through sp_configure interface
  • Static option
  • Validate through message in errorlog
  • 112308.33 kernel Locking shared memory into
    physical memory.

sp_configure lock shared memory, 1
31
Named Caches
  • Sizing Caches key to improved performance
  • Cache Partitions improve performance and
    scalability
  • How ?
  • Create the Named cache with required size
  • Bind the cache to the hot table

sp_cacheconfig stock_cache, 5M 0200000000122003/07/10 103645.18 server Size of the 2K memory pool 5120 Kb The change is completed. The option is dynamic and the SQL Server need not be rebooted for the change to take effect.
sp_bindcache "stock_cache","production","stock"
32
Named Caches
  • What to bind
  • Transaction Log
  • Tempdb
  • Hot objects
  • Hot indexes
  • When to use Named caches
  • sp_sysmon Data Cache Management section reports
    gt 10 spinlock contention
  • sp_sysmon provides the recommendation to do so.
  • Hot lookup tables, frequently used indexes,
    tempdb activity, high transaction throughput
    applications are all good scenarios for using
    named caches.
  • How to determine what is hot ??
  • Cache Wizard

33
Cache Wizard
  • A new option to sp_sysmon cache wizard has been
    added in 12.5.1 to help in
  • Identifying hot objects in a cache
  • Evaluating effectiveness of Large buffer pools
  • Sizing data caches.
  • Evaluating effectiveness of APF

34
Cache Wizard Usage
  • Usage
  • sp_sysmon interval , cache wizard , top_N ,
    filter
  • Ranking Criterion
  • LogicalReads / sec
  • Always in decreasing order of PhysicalReads / sec
  • Filter clause
  • Caches containing filter pattern

35
Cache Wizard Examples
  • sp_sysmon 000500, cache wizard, 2,
    default data cache
  • default data cache
  • Buffer Pool Information

  • Object Statistics Cache Occupancy
    Information


Run Size 100 Mb Usage 80 LR/sec 2500 PR/sec 1500 Hit 40
IO Size Wash Run Size APF LR/Sec PR/Sec Hit APF Eff Usage
4 Kb 17202 Kb 16 Mb 10 800 100 87.50 75 80
2 Kb 3276 Kb 84 Mb 10 1700 1400 17.65 20 80
Obj_Size Size in Cache Obj_Cache Cache_Occp
102400 Kb 40960 Kb 40 40
2048 Kb 1024 Kb 50 1
Object LR/sec PR/sec Hit
db.dbo.cost_cutting 1800 1150 36.11
db.dbo.emp_perks 500 200 60.00
36
Cache Wizard Recommendations
  • Identifying hot objects
  • default data cache
  • Object Statistics Cache Occupancy
    Information
  • If Cache Hit is low
  • For each Object
  • If LR/sec is high and Obj hit is low, move
    object to a new cache
  • OR add memory to the cache.

Run Size 100 Mb Usage 88.40 LR/sec 2500 PR/sec 1500 Hit 40
Object LR/sec PR/sec Hit
master.dbo.cost_cutting 1800 1150 36.11
master.dbo.emp_perks 500 200 60.00
master.dbo.emp_info 100 90 10.00
master.dbo.revenue 100 60 40.00
Obj_Size Size in Cache Obj_Cache Cache_Occp
102400 Kb 40960 Kb 40 40
2048 Kb 1024 Kb 50 1
8192 Kb 1024 Kb 12.5 1
16384 Kb 8192 Kb 50 8
37
Cache Wizard Recommendations
  • Effectiveness of large buffer pools, apf
  • default data cache
  • Buffer Pool Information

  • If Pool usage is high and Pool Hit is low, add
    memory to the buffer pool
  • APF effectiveness provides information on how
    many pages brought in on account of APF got used.
  • If Pool hit is low and APF effectiveness is
    high, then consider increasing APF percentage.

Run Size 100 Mb Usage 88.40 LR/sec 2500 PR/sec 1500 Hit 40.00
IO Size Wash Run Size APF LR/Sec PR/Sec Hit APF Eff Usage
4 Kb 17202 Kb 16 Mb 10 800 100 87.50 75 80
2 Kb 3276 Kb 84 Mb 10 1700 1400 17.65 20 90
38
Cache Partitions
  • Cache Partitions help improve scaling
  • Decomposes the cache spinlock
  • Recommendation is to use as many cache partitions
    as there are engines.
  • How

sp_cacheconfig "order_index_cache", "5M", "cache_partition2 104420.85 server Size of the 2K memory pool 2560 Kb 104420.85 server Memory allocated for the order_index_cache cachelet 1 2560 Kb 104420.85 server Size of the 2K memory pool 2560 Kb 104420.85 server Memory allocated for the order_index_cache cachelet 2 2560 Kb The change is completed. The option is dynamic and the SQL Server need not be rebooted for the change to take effect.
39
Named Caches Vs Cache Partitions
  • Which one should I use ?
  • Named caches
  • Easily identifiable hot objects, indexes
  • Transaction Log
  • Tempdb
  • Cache Partitions
  • Complex applications with many objects
  • Named cache with heavy spinlock contention
  • Both
  • Best fit is to have named caches with cache
    partitions

40
Named Caches Bottom Line
79
41
Lets take a Checkpoint
  • Every disk I/O saved is performance gained
  • sp_monitorconfig to tune procedure cache, worker
    threads etc
  • Named caches help improve performance and
    scalability
  • Cache partitions Named Caches best combination
  • If application has large number of objects have
    as many cache partitions as engines
  • Tempdb, transaction log, hot indexes, hot objects
    are ideal candidates

42
Reading n Writing Disk I/O
  • I/O avoided is Performance Gained
  • ASE buffer cache has algorithms to
    avoid/delay/Optimize I/Os whenever possible
  • LRU replacement
  • MRU replacement
  • Tempdb writes delayed Improved select into
    performance
  • Write ahead logging Only Log is written
    immediately
  • Group commit to batch Log writes
  • Coalescing I/O using Large buffer pools
  • UFS support
  • Raw devices and File systems supported
  • Asynchronous I/O is supported

43
Raw Devices
  • Raw devices provide exceptional write performance
    and good read performance
  • Recommended for Transaction Log

44
File Systems
  • File System Caching can be effectively used to
    improve performance (especially reads).
  • File Systems as Secondary cache for ASE
  • Enables ASE to use gt 2.7GB
  • Very useful as pages not fitting in ASE cache are
    accommodated in FS Cache
  • Helps avoid expensive disk I/O
  • Many File System Flavors on Linux
  • extfs, xfs. IBM's JFS and the Reiserfs
  • Recommended file system
  • EXT2
  • EXT3 with journaling disabled

45
What file system to use
  • EXT3 with journaling disabled

24
46
File system vs Raw devices
  • When to use File system
  • Frequent reads
  • Infrequent writes
  • E.g. tempdb WITH DSYNC off using
    sp_deviceattr stored procedure
  • 1gt sp_deviceattr tmpdbdev","dsync","false"
  • 2gt go
  • 'dsync' attribute of device tmpdbdev' turned
    'off'. Restart Adaptive Server for the
  • change to take effect.
  • When to use Raw devices
  • Frequent writes
  • Infrequent reads
  • E.g Transaction log
  • How does one compare against the other ?

47
And the winner is.
  • Bottom line Use mix of File System and Raw
    devices.

60
48
Asynchronous I/O
  • Enables ASE to service user tasks after I/O is
    issued
  • The recommended scheme for doing I/O
  • AIO supported on Raw devices and File Systems on
    Linux
  • Enabled by default
  • Posix aio and Kernel Supported AIO RH AS 2.1
    are supported
  • What should I use ??

49
And the Winner is
95
50
Asynchronous I/O tunes
  • fs.aio.max-size specifies the maximum block size
    performed by one aio read or aio write
  • For optimal create database, alter database
    performance this should be tuned to 1048676(1MB)
  • To tune this parameter

Add the following in /etc/rc.local echo 1048576 gt /proc/sys/fs/aio-max-size Or Add fs.aio-max-size1048576 in /etc/sysctl.conf and run /bin/sysctl -p
51
Disk Tuning Recommendations
  • More disks in an array improve I/O parallelism
  • Stripe size for OLTP applications
  • Data 64K
  • Log 8K
  • Service Times
  • Data 12-15 ms
  • Log 2-3 ms

52
Disk Tuning Recommendations
  • RAID 5 for data
  • Independent Data disks with distributed parity
    blocks
  • Space utilization is lesser than 01
  • Read Throughput is high
  • Reliability is high
  • Write Ahead Logging helps delaying data writes
  • RAID 01 for Transaction Log
  • RAID 0 for the striping
  • RAID 1 for mirroring
  • Very reliable
  • High write throughput

53
I/O Commands
  • iostat for device statistics.
  • iostat -d -x /dev/sdb1 /dev/sdc1 /dev/sdd1 5 5
  • Linux 2.4.9-e.3 (vayu) 06/11/2003

Device Rrqm/s Wrqm/s R/s W/s Avrq-sz Avgqu-sz Await Svctm util
Sdb1 0.00 0.20 86.60 68.80 8.04 11.58 74.13 34.48 59.80
Sdc1 0.00 0.20 24.40 48.80 8.02 3.86 52.73 38.25 28.00
Sdd1 0.00 0.20 26.80 45.00 8.02 3.42 47.63 36.77 26.40
54
Linux I/O Commands
  • Identify file system types
  • /sbin/mount
  • Create an ext3 file system
  • /sbin/mke2fs b lt2K4Kgt -j /dev/sdg2
  • View existing raw devices
  • raw -qa
  • Create a raw device
  • raw /dev/raw/raw1 /dev/sdb1
  • And add the following in /etc/sysconfig/rawdevices
  • /dev/raw/raw1 /dev/sdb1
  • Tweaking the file system options
  • mount -t ext3 -o rw,noatime,datawriteback
    /dev/sde1 /tpcc

55
ASE Disk Monitoring
  • sp_sysmon 000001, diskio
  • Device
  • /tpcc_2/data/data1.dat

Data1 Per sec Per xacts Count of total
Reads
APF 0.0 0.0 0 0.0
Non-APF 0.4 0.1 2 0.3
Writes 154.4 22.1 772 99.7
Total I/O 154.8 22.1 774 23.9
56
Lets take a Checkpoint
  • Use Kernel Supported aio
  • Use Mixture of File systems and Raw devices
  • Raid 01 for transaction log, RAID 5 for data
    devices
  • High Speed device for transaction log and tempdb
  • I/O parallelism by having enough disks per array
  • 64KB stripe for data and 8KB stripe for
    transaction log

57
Sending and Receiving Network I/O
  • ASE is capable of handling thousands of users
    with good throughput and response time
  • 100BASE-T and Gigabit ethernet cards help improve
    network throughput and response time
  • Full duplex for simultaneous reading and writing
    to the network is recommended
  • One could also use multiple NIC cards for load
    balancing

58
Load Balancing using Multiple NICs
  • Client Machine 1
  • query ipaddr1 port1

Server Machine master ipaddr1 port1 master
ipaddr2 port2 query ipaddr1 port1
Serviced by listener1
Client Machine 2 query ipaddr2 port2
Serviced by listener2
59
Tuning ASE Network I/O
  • sp_sysmon Network I/O Management section
    provides information on ASE network performance
  • sp_configure send doneinproc tokens, 0 send
    doneinproc tokens only for select statements.
  • Network packet size tuning both on the client
    side and on the server side.
  • Server Side
  • default network packet size
  • max network packet size
  • Client Side
  • -A option for isql
  • CS_PACKETSIZE using ct-library
  • PACKETSIZE property using JConnect
  • The default value for tcp no delay of 1 should
    improve network performance.

60
Network Commands
  • netstat -ts
  • Optimize tcp performance by using port numbers
    between 1024 and 65000
  • echo 1024 65000 gt /proc/sys/net/ipv4/ip_local_p
    ort_range
  • Some more tcp tunes to improve performance
  • echo 0 gt /proc/sys/net/ipv4/tcp_sack
  • echo 0 gt /proc/sys/net/ipv4/tcp_timestamps
  • echo 0 gt /proc/sys/net/ipv4/tcp_window_scaling
  • Full duplex information
  • cat /proc/net/nicinfo/eth0.info grep Duplex
  • Duplex full

61
Lets take a Checkpoint
  • 100 BASE-T and Giga bit ethernet help improve
    network throughput
  • Large packet sizes for bcp, text/image or for any
    other large data transfer needs
  • Setting send doneinproc tokens to 0 helps improve
    network performance
Write a Comment
User Comments (0)
About PowerShow.com