Those Who Don't Learn From History - PowerPoint PPT Presentation

1
Those Who Don't Learn From History
  • Tom Bascom, President
  • Greenfield Technologies

2
"Those who do not learn from history are doomed
to repeat it." -- George Santayana (1863-1952)
3
Overview
  • Types: monitoring activities
  • What: what to monitor
  • How: methods for gathering and
    presenting monitoring data
  • Why: benefits of historical data

4
Types of Monitoring Activities
  • Baselining
  • Benchmarking
  • Interactive troubleshooting
  • Capacity management
  • Resource Optimization

5
Baselining
  • Allows you to quantify changes in performance
  • Apply to:
    • Frequently executed tasks
    • Important tasks
  • Time, Activity, Costs, Revenue, Resources
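
One way to quantify a baseline is to reduce a history of samples to a mean and a standard deviation. The sketch below is only an illustration, not something from the original slides: the function name `baseline_stats` and the sample values are hypothetical, and it assumes one numeric sample per line on standard input.

```shell
#!/bin/sh
# baseline_stats (hypothetical helper): read one numeric sample per line
# on stdin and print "mean stddev" using the population standard deviation.
baseline_stats() {
  awk '
    NF { n++; sum += $1; sumsq += $1 * $1 }
    END {
      if (n == 0) exit 1            # no samples, nothing to report
      mean = sum / n
      var = sumsq / n - mean * mean
      if (var < 0) var = 0          # guard against rounding error
      printf "%.3f %.3f\n", mean, sqrt(var)
    }'
}

# Made-up samples, for illustration only
printf '%s\n' 2 4 4 4 5 5 7 9 | baseline_stats
```

The two numbers it prints can then serve as the reference point for "how is NOW different from the baseline?".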

6
Benchmarking
  • Benchmarking is much like baselining but it
    generally seeks to find the limits of a
    configuration.
  • SpecINT
  • TPC
  • ATM
  • ReadProbe
  • 4glProbe

7
Interactive Troubleshooting
  • Confirm configuration (trust, but verify)
  • What is happening NOW?
  • You can prove anything with a single data
    point
  • How is NOW different from the baseline?

8
Capacity Management
  • Filesystem space
  • Database extent utilization
  • Memory consumption
  • Peak CPU utilization
  • Network utilization
  • IO throughput

9
Resource Optimization
  • Finding unbalanced resources
  • Disk hot spots
  • Wasted memory
  • -spin
  • Buffers (-B, -mmax)

10
What Metrics To Monitor
  • DB metrics
  • OS metrics
  • Application metrics
  • Business metrics

11
DB Metrics
  • Logical IO
  • CRUD
  • Global and per user
  • Table and index stats
  • Physical IO
  • Latch waits and timeouts
  • Record locks and transactions
  • Connections and servers
  • Extent and area utilization

12
OS Metrics
  • CPU utilization -- usr, sys, wio
  • Disk
    • Free space
    • Operations
    • Queues and service times
  • Memory -- budget, usage, leaks
  • Network -- bandwidth, latency
  • Tables and limits -- nfiles, nproc, semaphores

13
Application/Business Metrics
  • Orders, applications taken
  • Shipped orders, closed loans
  • Items shipped, invoices printed, document
    packages prepared
  • Items in inventory, users online
  • Turn time, fallout ratio

14
Business Metrics
  • Expenses
  • Revenue
  • Margin
  • Profit

15
How To Gather Data
  • Screen Scraping
    • PROMON
    • Scripts
  • VSTs
    • ProTop
    • ProMonitor
  • Low Level APIs
    • Fathom

16
PROMON
04/29/05        Activity Summary
1012 (103605)

Event                  Total   Per Sec    Event                  Total   Per Sec
Commits                  354      35.4    DB Reads                1724     172.4
Undos                      0       0.0    DB Writes                 50       5.0
Record Reads           15091    1509.1    BI Reads                   0       0.0
Record Updates            64       6.4    BI Writes                 12       1.2
Record Creates           183      18.3    AI Writes                  7       0.7
Record Deletes            71       7.1    Checkpoints                0       0.0
Record Locks            3776     377.6    Flushed at chkpt           0       0.0
Record Waits               0       0.0

Rec Lock Waits       0    BI Buf Waits        0    AI Buf Waits       0
Writes by APW      100    Writes by BIW      17    Writes by AIW     71
DB Size          26 GB    BI Size        249 MB    AI Size        87 MB
Empty blocks   1268766    Free blocks     84945    RM chain      805939
Buffer Hits         96    Active trans        0

121 Servers, 513 Users (204 Local, 309 Remote, 10 Batch), 4 Apws
17
Screen Scraping
promon $DBNAME > $TMP/mon.$TM <<- "EOF" 2> /dev/null
RD17p4 3 p 2 p p p 2 1 p 3 x
EOF
18
Screen Scraping
ls -1 sample.??.?? | while read FILE
do
  grep -i "logical read" $FILE | cut -c46-56 >> $TMP/lr
  grep -i "o/s read" $FILE | cut -c46-56 >> $TMP/osr
  grep -i "hit ratio" $FILE | cut -c11-14 >> $TMP/hr
  grep -i "commit " $FILE | cut -c46-56 >> $TMP/trx
  grep -i "latch " $FILE | cut -c46-56 >> $TMP/lto
done
paste $TMP/lr $TMP/osr $TMP/hr $TMP/trx $TMP/lto
19
VSTs -- ProTop
160650  ProTop xvi -- Progress Database Monitor  05/01/05
Sample  sports2000  /data/s2k/sports2000  Rate

Hit Ratio      631     1481    Commits          2      3    Sessions   18
Miss         1.590    0.676    Latch Waits    232    234    Local      17
Hit         98.410   99.324    Tot/Mod Bufs   170     25    Remote      0
Log Reads     4465    10498    Evict Bufs     908    206    Batch      16
OS Reads        71       71    Lock Table    8192     27    Server      0
Rec Reads     1141     1062    LkHWMOldTrx     67   0003    Other       1
Log/Rec     3.9132   9.8851    Old/Curr BI      1      1    TRX         6
Area Full        1    99.12    After Image Disabled         Blocked     0

Resource Waits
 Id Resource                  Locks      Waits    Lock
--- -------------------- ---------- ---------- -------
 10 DB Buf S Lock             10497          0  100.00
  6 Record Get                 1061          0  100.00
  7 DB Buf Read                  71          0  100.00
  2 Record Lock                  28          0  100.00
 11 DB Buf X Lock                14          0  100.00
 19 TXE Share Lock               14          0  100.00
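
The Hit and Miss figures on this screen are consistent with being derived from the logical and OS read counts: 71 OS reads against 4465 logical reads is a 1.590% miss rate, and 71 against 10498 is 0.676%. A minimal sketch of that arithmetic follows. The function name `hit_pct` is hypothetical, and the formula is an assumption inferred from the numbers shown rather than a description of ProTop's own code.

```shell
#!/bin/sh
# hit_pct (hypothetical helper): buffer hit percentage from
# $1 = logical reads and $2 = OS reads, assuming
#   hit% = 100 * (1 - os_reads / logical_reads)
hit_pct() {
  awk -v lr="$1" -v osr="$2" 'BEGIN {
    if (lr <= 0) exit 1             # avoid dividing by zero
    printf "%.3f\n", 100 * (1 - osr / lr)
  }'
}

hit_pct 4465 71     # first column of the screen above
hit_pct 10498 71    # second column of the screen above
```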
20
Low Level APIs -- Fathom
21
Low Level APIs -- Fathom
22
Metrics Database
  • Multiple targets
  • Generic metrics
  • Individualized metric properties
  • Arbitrary grouping of metrics
  • Flexible reporting
  • Graphical output

23
Why? The Benefits of History
  • Spend your money wisely!
  • Be the first to know when something is wrong!
  • Better yet, know before it happens and prevent
    it!
  • Don't be Doomed!

24
Why?
  • Spend your money wisely!
  • Be the first to know when something is wrong!
  • Better yet, know before it happens and prevent
    it!
  • Don't be Doomed!

25
Effect of Changes
EMC
8k Blocks
Storage areas
conv89
26
Why?
  • Spend your money wisely!
  • Be the first to know when something is wrong!
  • Better yet, know before it happens and prevent
    it!
  • Don't be Doomed!

27
Where's the Problem?
04/29/05        Activity Summary
1012 (10 sec)

Event                  Total   Per Sec    Event                  Total   Per Sec
Commits                  354      35.4    DB Reads                1724     172.4
Undos                      0       0.0    DB Writes                 50       5.0
Record Reads           15091    1509.1    BI Reads                   0       0.0
Record Updates            64       6.4    BI Writes                 12       1.2
Record Creates           183      18.3    AI Writes                  7       0.7
Record Deletes            71       7.1    Checkpoints                0       0.0
Record Locks            3776     377.6    Flushed at chkpt           0       0.0
Record Waits               0       0.0

Rec Lock Waits       0    BI Buf Waits        0    AI Buf Waits       0
Writes by APW      100    Writes by BIW      17    Writes by AIW     71
DB Size          26 GB    BI Size        249 MB    AI Size        87 MB
Empty blocks   1268766    Free blocks     84945    RM chain      805939
Buffer Hits         96    Active trans        0

121 Servers, 513 Users (204 Local, 309 Remote, 10 Batch), 4 Apws
28
TRX Rate
Normal
29
TRX Rate
Elevated
30
Jump in Background TRX Rate
31
Why?
  • Spend your money wisely!
  • Be the first to know when something is wrong!
  • Better yet, know before it happens and prevent
    it!
  • Don't be Doomed!

32
Business Surge
5x init volume
Hiring lags
Again!
Rates Drop!
33
Statistical Rules of Thumb
  • A metric is "in control" if it is within 3
    standard deviations of the baseline.
  • Metrics that fall between the 2nd and 3rd standard
    deviations should be viewed as warnings.
  • 4 consecutive samples trending away from the mean
    are a warning.
  • 4 out of 6 samples on one side of the mean is a warning.
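
These rules of thumb can be checked mechanically once a baseline mean and standard deviation are known. The sketch below is one possible reading, not the presenter's implementation. The function name `control_check` and the status labels are hypothetical. "Trending away" is assumed to mean four samples in a row, each on the same side of the mean as the sample before it and farther out. The "4 out of 6" rule is assumed to apply to the six most recent samples.

```shell
#!/bin/sh
# control_check (hypothetical helper): $1 = baseline mean, $2 = baseline
# standard deviation, samples one per line on stdin.
# Prints "<sample> <status> [reasons]" for each sample.
control_check() {
  awk -v mean="$1" -v sd="$2" '
    function abs(x) { return x < 0 ? -x : x }
    NF {
      n++
      d = $1 - mean
      side[n] = (d > 0) - (d < 0)   # +1 above, -1 below, 0 on the mean

      # Assumed reading of "trending away": same side as the previous
      # sample and farther from the mean, four times in a row.
      if (n > 1 && d * prev > 0 && abs(d) > abs(prev)) {
        away++
      } else {
        away = 0
      }
      prev = d

      # Count sides over the six most recent samples.
      above = 0; below = 0
      for (i = n; i > n - 6 && i >= 1; i--) {
        if (side[i] > 0) above++
        else if (side[i] < 0) below++
      }

      status = "OK"; why = ""
      if (abs(d) > 3 * sd) {
        status = "OUT"; why = " beyond-3sd"
      } else {
        if (abs(d) > 2 * sd) why = why " 2-3sd"
        if (away >= 4) why = why " trending-away"
        if (above >= 4 || below >= 4) why = why " one-sided"
        if (why != "") status = "WARN"
      }
      printf "%s %s%s\n", $1, status, why
    }'
}

# Made-up samples against a made-up baseline of mean 35, stddev 2
printf '%s\n' 35 36 34 40 42 | control_check 35 2
```

With these made-up inputs, 40 is flagged as a warning (between 2 and 3 standard deviations) and 42 as out of control.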

34
Summary
  • Obtain a baseline
  • Consistently gather comprehensive data
    • Business
    • Application
    • Database
    • Operating System
  • Review, analyze, and publish

35
"Insanity is doing the same thing over and over
again and expecting different results." -- Albert
Einstein (1879-1955)
36
Questions?
  • ?
  • Resources
  • http://www.greenfieldtech.com/exchange05.shtml