Title: Arial 28pt. - PowerPoint PPT Presentation

1 / 24
About This Presentation
Title:

Title: Arial 28pt.

Description:

Limitations of this approach to database performance monitoring ... Analysis Using SQL Robert P. Trueblood, John N. Lovett, Jr. (data mining ... – PowerPoint PPT presentation

Number of Views:201
Avg rating:3.0/5.0
Slides: 25
Provided by: fellenm
Category:
Tags: 28pt | arial | blood | title | true

less

Transcript and Presenter's Notes

Title: Title: Arial 28pt.


1
ASE101 Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
Jeffrey Wong Consultant, Anton Ventures wongjtf_at_ao
l.com August 15-19, 2004
2
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Overview
  • Limitations of other database performance
    monitoring approaches
  • Limitations of this approach to database
    performance monitoring
  • Typical uses for the real-time (RT) MDA database
    system
  • Developmental timeline
  • Architecture of the RT MDA database system
  • Barrier synchronized time base and surrogate key
    equivalence
  • Fast insertion mechanism for capturing MDA table
    output
  • 2NF normalization and surrogate key substitution
    of foreign keys
  • Stored procedure generation from specification
    files
  • Restart of capture or archiving DML after
    deadlock victim termination
  • Visualization of archived MDA performance data
  • Visualization examples (3)
  • Futures
  • References
  • Questions and answers

3
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Limitations of other database performance
    monitoring approaches
  • Network Packet Sniffing Limited to
    request-response measurements. No information on
    Sybase internals efficiency
  • Sampling via Separate Monitoring Process
    Separate process introduces latency effects. Can
    be highly invasive for short sampling periods
    (typically, anything less than 10 seconds). IPC
    (inter process communication) overheads can be
    significant. Typically, no synchronized time base
    for collected samples
  • System Performance Report Typically, output is
    not in data-mining-friendly (i.e. tabular)
    format. Highly invasive for any sampling period
    less than two minutes
  • Ad-hoc Measurements May not be monitoring the
    correct indicators. Very heavily dependent on the
    analysts skill level

4
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Limitations of this approach to database
    performance monitoring
  • MDA Table Queries Hang at High CPU Utilization
    Levels This is a current limitation of the CIS
    proxy table mechanism upon which the MDA tables
    are based
  • Specialized Resource Needs Ideally, both tempdb
    and the RT data capture database need to be on
    RAM disk, with the RT data capture database also
    requiring battery backup of its RAM disk. Next
    best would be configuring a named cache for
    tempdb and RAM disk hosting for the tempdb
    transaction log
  • Lower Limit to Sampling Granularity Due to
    various technical considerations (including
    excessive resource utilization of CPU and disk),
    the current lower limit for the sampling
    periodicity is 5 seconds (c.f. MDA table minimum
    sampling periodicity of 1 second)
  • Counter Wraparound Steady-state averaging
    technique was employed to estimate a counter
    value where that counter value had overflowed and
    wrapped around. This may or may not always be the
    correct technique to use

5
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Typical uses for the real-time (RT) MDA database
    system
  • System Instability Root Cause Analysis The RT
    MDA database system was first deployed on a
    warm-standby replication reporting subsystem
    which had been experiencing intermittent
    stability issues. The two root causes of system
    instability were rapidly identified using this
    analysis system
  • Capacity Planning and System Performance
    Monitoring Actual resource utilization of a
    business intelligence and data mining vendor
    application was quantified by the RT MDA database
    system, prior to its introduction on a 7x24 OLTP
    production system. The new application was
    subsequently monitored by the RT MDA database
    system to certify that it exhibited appropriate
    resource utilization in production
  • System Performance Characterization System
    performance characteristics of 8 7x24 OLTP
    production systems are being collected by the RT
    MDA database system, prior to these 8 systems
    being consolidated on a 4 node cluster hardware
    system. The data collected by the RT MDA database
    system will be used to appropriately size and
    configure HA resource groups for that cluster

6
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Developmental Timeline
  • April, 2003 Release of ASE 12.5.0.3 ESD 1 (MDA
    Tables release). Commenced private RD into MDA
    Tables functionality
  • July, 2003 Commenced private development of RT
    MDA database system V1.0. This version of RT was
    single channel with a capture database and time
    base / surrogate key generator
  • September, 2003 Deployment of RT MDA V1.0 to a
    production site (first commercial use). Commenced
    private development of RT MDA V1.1 Multi-channel
    for both capture and archiving databases, bit-map
    barrier synchronization for both capture and
    archiving, monitoring profiles, and
    capture/archiving stored procedure generation
    from specification templates
  • January, 2004 Deployment of RT MDA V1.1 to a
    production site
  • March, 2004 Commenced joint development of RT
    MDA V1.2 (visualization and capacity
    planning/system performance characterization)
  • April, 2004 Deployment of RT MDA V1.2
    (visualization) to a production site
  • June, 2004 Deployment of RT MDA V1.2 (capacity
    planning/system performance characterization) to
    a production site

7
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Architecture of the RT MDA database system

8
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Architecture of the RT MDA database system
    (continued)
  • Transitions Description
  • Transition A Parallel capture of MDA table data
    synchronized on a standardized time base / sample
    number identification pair
  • Transitions B, C, D Capture stored procedures
    perform parallel inserts into the capture
    database. Tempdb is used to perform fast
    enumeration of capture data, such that for any
    capture data set, each row contains a
    monotonically increasing unique identifier. Each
    capture stored procedure schedules itself in a
    sleep-retry loop, if there is no data to process
  • Transition E Parallel archiving and purge of
    captured MDA table data synchronized on user
    configured archiving and purging time intervals
  • Transitions F, G, H, I Archive / purge stored
    procedures perform parallel inserts into the
    archive database when archiving, and parallel
    deletes into the capture and archive databases
    when purging. Tempdb is used to perform fast
    enumeration of filtered capture data, such that
    for any archive data set, each row contains a
    monotonically increasing unique identifier. Each
    archive / purge stored procedure schedules itself
    in a sleep-retry loop, if there is no data to
    process

9
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Architecture of the RT MDA database system
    (continued)
  • Transitions Description (continued)
  • Transitions J, K Time base trigger fires for
    each generated time base / sample number
    identification pair stored in the capture
    database, thereby ensuring that the time base
    history tables in the capture and archive
    databases are kept completely in sync
  • Transition L Every day at 1 AM, a batch job is
    executed that encapsulates the previous days
    archive data into a set of time-stamped,
    time-series visualization tables and views
  • Transitions M, N Data mining is performed on
    extracted time-series visualization data at the
    operations analysts convenience. Currently,
    unsophisticated visualization is carried out
    using Microsoft Excel (using its in-built
    charting module). It is intended in the future to
    extend the scope of the visualization to
    encompass other OLAP/BI tools (such as Impromptu)

10
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Barrier Synchronized Time Base and Surrogate Key
    Equivalence
  • Barrier synchronization can handle up to 64
    parallel channels
  • Most capture barrier synchronization is at one
    hierarchical level only, except for process
    monitoring which is at two hierarchical levels
    (this was done to enforce strict correlation
    between MDA process monitoring tables, albeit at
    the expense of introducing a one second time skew
    between sampling the process lookup table, and
    sampling the other process monitoring tables)
  • The time base generation is also barrier
    synchronized, and is currently set for 15 second
    sampling intervals
  • At the instant the current time base is
    synthesized, a monotonically increasing, unique
    surrogate key (identifier) is assigned to that
    time base. This surrogate key is the sample
    number for this particular parallel sampling
    activity
  • Also at the instant the current time base is
    synthesized, it, along with the sample number,
    initial time base indicator and sampling
    interval, are written to the capture time base
    history table (NB This table forms the
    time-series frame of reference for the parallel
    capture activities of the RT MDA database system)

11
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Fast Insertion Mechanism for Capturing MDA Table
    Output
  • The quantity of data output from MDA tables in a
    short period of time can be quite significant.
    For example, a trial deployment of RT MDA V1.0
    filled up 3 GB of disk in only one hour. There
    was only moderate activity on this target system,
    sampling intervals were 15 seconds, and only 4
    MDA tables were being monitored
  • Consequently, the following techniques were
    employed to ensure that capture database table
    insertion was as fast as possible
  • Most RT MDA tables are defined to use allpage
    locking, with SQL command retry used to recover
    from non-clustered index deadlocks
  • All DML to the capture or archive databases is
    done through highly-optimized (TransactSQL)
    stored procedures
  • Surrogate key updating of the control table is
    done once, at the end of the set insert composite
    transaction
  • Minimally logged insert commands (select into)
    are used to insert data into the tempdb staging
    tables
  • Set insert commands are used to insert the
    staging table rows into the respective capture
    database tables
  • Tempdb has been configured on each database
    server that is being monitored, as having both a
    bound named data cache and a transaction log
    hosted on RAM disk

12
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • 2NF Normalization and Surrogate Key Substitution
    of Foreign Keys
  • As discussed earlier, the volume of information
    that needs to be stored when building a database
    repository from sampled MDA table data, can be
    significant
  • Hence, for V1.1 of the RT MDA database system, an
    archive database was introduced, along with a
    parallel time base synchronization mechanism
  • This archive database contains 2NF versions of
    the data tables of the capture database, along
    with associated domain tables
  • The domain tables contain the original column
    data as captured from the MDA tables, along with
    associated surrogate keys
  • Part of the parallel archiving process involves
    substitution of the original MDA column data with
    the associated surrogate key. These surrogate
    keys now act as foreign keys for the 2NF data
    tables of the archive database
  • The space savings accomplished by the above
    strategy are substantial. Currently, RT MDA V1.1
    can handle 7x24 capture and archiving with up to
    30 parallel channels on a 15 second cyclical
    sampling duration and a rotating 8 day archiving
    window with minimal invasiveness to the host OLTP
    system and with modest disk needs (3GB capture
    db, 6GB archive db)

13
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Stored Procedure Generation from Specification
    Files
  • Given the performance and non-invasive
    requirements for the RT MDA database system, it
    is mandatory that the capture, archive,
    normalization and purge operations be performed
    via stored procedure executions
  • However, the tedious and error-prone nature with
    manually encoding these stored procedures
    necessitated a more accurate and efficient
    process for generating these database objects
  • Thus initially, the manual encoding process was
    performed for a representative MDA table, and for
    capture, archive, normalization and purge
    activities
  • Then, the finished set of four stored procedures
    was parameterized and encoded as three perl
    template programs (NB archive and normalization
    activities were merged together)
  • Three configuration files were then constructed,
    with each non-blank line of the files containing
    the parameter information for a particular stored
    procedure

14
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Stored Procedure Generation from Specification
    Files (continued)
  • The above mentioned perl programs were next run
    with the respective configuration files, thereby
    producing the deployment source code for the RT
    MDA database system stored procedures
  • Installation and unit testing of the generated
    stored procedures was then carried out on a
    development system
  • Finally, the tested stored procedures were
    deployed on target production database servers
    and carefully monitored for possible malfunction

15
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Restart of Capture or Archiving DML After
    Deadlock Victim Termination
  • Given that most tables in the RT MDA capture
    database are of allpage locking (for performance
    reasons), the inclusion of deadlock retry logic
    is mandatory due to the distinct possibility of
    non-clustered index deadlocks occurring during
    normal operation
  • Initial implementation involved using a C or
    sybperl program per channel to re-submit the RT
    MDA stored procedure for execution automatically
    upon detection of the deadlock. However, this
    approach was abandoned due to logistic reasons
    (i.e. installing system software on multiple
    production systems requires significant
    commitment from outside groups for any large
    organization)
  • Current implementation involves a perl program
    per channel performing re-submittal of a shell
    script for execution automatically upon detection
    of the deadlock. The shell script in turn calls
    up the RT MDA stored procedure via isql
  • Future plans involve adapting Jean-Paul Martins
    asemon-logger Java program to provide the same
    functionality that was initially implemented to
    handle the non-clustered deadlock issue

16
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Visualization of Archived MDA Performance Data
  • Current methodology is simply applying Microsoft
    Excels charting module to RT MDA archive data
    extracted using custom visualization views
  • Major limitation of this approach is that there
    is no easy way to prevent the charting module
    from auto scaling the chart axes (especially the
    Y-axis), thus distorting the relevancy of the
    visualization
  • Advantages of this approach include easy
    auto-filtering of column data, easy chart type
    composition, and fast charting performance (when
    the chart data is taken from a flat file rather
    than dynamically through a database connection)
  • A recent discovery has been the existence of
    Jean-Paul Martins asemon Java GUI program for
    display of MDA table data in a tabular format.
    Consequently, the current focus for visualization
    has been temporarily shifted to adaptation of
    this program to access RT MDA capture and archive
    databases (rather than further development of the
    Microsoft Excel approach)
  • Three visualization examples have been included
    in this presentation to show the effectiveness of
    the Microsoft Excel approach to date

17
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Visualization Example 1 (disk activity)

18
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Visualization Example 2 (engine activity)

19
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Visualization Example 3 (buffer pool activity)

20
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Visualization Example 3 (buffer pool activity,
    continued)

21
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Futures (in no particular order)
  • Adaptation of the RT MDA database system to
    utilize other data mining/BI GUI tools (such as
    Impromptu)
  • Adaptation of Jean-Paul Martins asemon Java GUI
    program to visualize data contained in both the
    RT MDA capture and archive databases
  • Adaptation of Jean-Paul Martins asemon-logger
    Java program to utilize the barrier
    synchronization and capture/archive stored
    procedures of the RT MDA database system
  • Implementation of missing value simulation and
    time axis normalization in the RT MDA archive
    database
  • Design and implementation of event notification
    triggers in the RT MDA capture database. It is
    intended to integrate these notification triggers
    into the resource group failover mechanism of a
    well-known hardware vendors cluster computing
    system
  • Integration of Replication Server into the RT MDA
    database, so that the RT MDA archive,
    normalization and purge operations can be handled
    by DSI invocation of custom function strings
  • Integration of Replication Server 12.5
    performance counters into the RT MDA database
    system

22
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • References
  • Whats new in ASE 12.5.0.3 Sybase Incorporated
    (the original reference for MDA Tables
    functionality)
  • http//www.sypron.nl (Rob Verschoors Web site
    more information of MDA Tables setup and usage)
  • http//www.sybase.com/developer/codexchange
    (source code for Jean-Paul Martins asemon and
    David Owens sybmon Java GUI monitoring programs
    are available at this Web site)
  • Tips, Tricks Recipes for Sybase ASE Rob
    Verschoor (where some components of the fast
    insertion mechanism for MDA table data capture
    were initially discussed)
  • Predictive Data Mining, A Practical Guide
    Sholom M. Weiss, Nitin Indurkhya (background
    information on data mining techniques)
  • Data Mining and Statistical Analysis Using SQL
    Robert P. Trueblood, John N. Lovett, Jr. (data
    mining statistical analysis queries using
    Transact-SQL)

23
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
  • Questions and Answers
  • We have 10 minutes for QA. Please feel free to
    ask. Thank you
  • I will defer the difficult questions for when the
    10 minute period has expired (your schedule
    permitting)
  • Any questions that I cannot answer I will try to
    do so later via email, time permitting. Please be
    so kind as to leave me your email address in this
    case
  • My email address is wongjtf_at_aol.com. If you still
    have questions regarding this presentation,
    kindly send me a note and I will respond as soon
    as I can. Thanks

24
Building a Real-Time Database Performance
Repository Using the Sybase MDA Tables
Jeffrey Wong Consultant, Anton Ventures wongjtf_at_ao
l.com August 17, 2004
Write a Comment
User Comments (0)
About PowerShow.com