DM221 Build a Productive 24x7 Database Operation Infrastructure - PowerPoint PPT Presentation

Slides: 39
Provided by: sybas

Transcript and Presenter's Notes


1
DM221: Build a Productive 24x7 Database Operation Infrastructure
George Wang, Principal Consultant, Sybase Inc., zwang_at_sybase.com
Melinda Meyers, Senior DBA, America Online Inc., mmeyersm_at_aol.com
2
Agenda
  • AOL Business
  • Challenge
  • Architecture
  • Future Direction
  • Q&A

3
AOL Business
  • World's leader in interactive services, Web
    brands, Internet technologies, and e-commerce
    services
  • AOL Mission Statement
  • To build a global medium as central to people's
    lives as telephone or television, and even more
    valuable.
  • AOL Vision Statement
  • To build an interactive medium that improves the
    lives of people and benefits society as no other
    medium before it.

4
AOL Business
5
AOL Business
Note: The number of ASE and RS on the graph does not indicate production deployment.
[Graph legend: ASE, RS]
6
Challenge
  • Explosive growth
  • Large-scale distributed deployment
  • 24x7 operation
  • Heterogeneous environment
  • Mixed versions of OS
  • Mixed versions of ASE and RS
  • Dynamic configuration
  • Staff

7
Architecture
  • Operation
  • System monitoring
  • Problem detection
  • Maintenance
  • Performance analysis
  • Repository
  • Automation
  • Notification

8
Operation: Minimize Downtime
  • Standardization
  • Installation
  • Configuration
  • Procedures
  • High-availability
  • Fast response
  • History analysis
  • Proactive action

9
Operation
Chain of Escalation
10
System Monitoring
Ping
[Diagram labels: ASE (x2), RS, BAK (x2)]
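The slide does not say how the ping is performed. As a minimal sketch, assuming a plain TCP connect to each server's listener port is an acceptable stand-in, a reachability check could look like the following. The server names, hosts, and ports passed in are hypothetical.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def unreachable_servers(servers, timeout=2.0):
    """servers maps a label (hypothetical, e.g. 'ASE1') to a (host, port) pair.

    Returns the sorted labels that did not answer.
    """
    return sorted(
        name
        for name, (host, port) in servers.items()
        if not is_reachable(host, port, timeout)
    )
```

A TCP connect only proves the listener is up; it says nothing about whether the server can actually process work, which is why the later slides add errorlog and heartbeat checks.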
11
System Monitoring
Data and Log Space
[Gauge scale: 0 to 100]
12
Problem Detection
  • LogChecker
  • Rule-based
  • Check ASE errorlog
  • Detect error messages
  • Filter out informational messages
  • Check RS errorlog
  • Capture message tags
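The rule-based approach above can be sketched as an ordered list of patterns where the first match decides whether a line is ignored or raised. This is only an illustration: the rule names, the patterns, and the sample log lines are assumptions, not the actual LogChecker rules.

```python
import re

# Illustrative rules only: (rule name, compiled pattern, action).
# The first matching rule wins, so "ignore" rules are listed first.
RULES = [
    ("informational", re.compile(r"^I\. "), "ignore"),
    ("ase_error", re.compile(r"Error: (\d+), Severity: (\d+)"), "alert"),
    ("rs_error", re.compile(r"^[EF]\. "), "alert"),
]


def check_log(lines):
    """Return (rule name, line) pairs for every line that should raise an alert."""
    alerts = []
    for line in lines:
        for name, pattern, action in RULES:
            if pattern.search(line):
                if action == "alert":
                    alerts.append((name, line.rstrip("\n")))
                break  # first match decides; later rules are skipped
    return alerts
```

Keeping the rules in data rather than in code means a new message tag or a newly noisy informational message can be handled by editing the list.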

13
Problem Detection
  • RS Heartbeat Latency
  • Program flow
  • Alarm if latency > threshold
  • Detect health of RS
  • Latency analysis

[Diagram: ASE, RS, ASE]
Insert a row at Time A
Detect the row at Time A + latency
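The heartbeat flow (insert a row at Time A, detect it after replication, alarm when the gap exceeds a threshold) can be sketched as below. The function name, the status labels, and the treatment of a row that has not arrived yet are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def check_heartbeat(inserted_at, detected_at, threshold, now):
    """Classify one heartbeat row.

    inserted_at: when the row was inserted at the primary (Time A).
    detected_at: when the row was seen at the replicate, or None if not yet seen.
    threshold:   maximum acceptable latency as a timedelta.
    now:         current time, passed in so the check is easy to test.
    Returns (status, elapsed) where status is "OK", "PENDING", or "ALARM".
    """
    if detected_at is None:
        # Row not replicated yet: the wait so far is a lower bound on latency.
        waited = now - inserted_at
        return ("ALARM", waited) if waited > threshold else ("PENDING", waited)
    latency = detected_at - inserted_at
    return ("ALARM", latency) if latency > threshold else ("OK", latency)
```

Storing each measured latency alongside its timestamp would give the raw data for the latency analysis mentioned on the slide.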
14
Maintenance
  • Database Transaction Dump
  • Dump to file system
  • Copy system tables
  • Unix backup
  • Monitor capacity

15
Maintenance
  • Miscellaneous
  • Threshold of transaction log
  • 50%, 75%, and 90%
  • Update statistics
  • Rotate errorlogs
  • Database consistency check
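As a small illustration of the three transaction log thresholds, the sketch below maps a log usage figure to a level. It assumes the 50, 75, and 90 values are percent-full marks; the level names are hypothetical and not from the slides.

```python
from bisect import bisect_right

THRESHOLDS = (50, 75, 90)  # read here as percent of the log that is full
LEVELS = ("ok", "notice", "warning", "critical")  # hypothetical labels


def log_space_level(used_pct):
    """Map transaction log usage (0 to 100) to a level.

    A value equal to a threshold counts as having crossed it.
    """
    return LEVELS[bisect_right(THRESHOLDS, used_pct)]
```

For example, `log_space_level(49.9)` gives `"ok"` and `log_space_level(75)` gives `"warning"`; each level could trigger a different action, such as dumping the log at the first mark and paging a DBA at the last.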

16
Performance Analysis
  • Performance Data Collection
  • ASE Monitor Historical Server
  • CPU utilization
  • Stored procedure execution
  • IO activity
  • Cache activity
  • Object activity
  • Server status
  • Locking

17
Performance Analysis
  • Performance Data Analysis
  • Exception
  • Trend
  • Capacity
  • Load
  • Benchmark
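The slides do not describe how exceptions and trends are computed. One common approach, shown here purely as an assumption, is to flag samples far from a baseline mean and to fit a least-squares slope to a series of evenly spaced samples.

```python
from statistics import mean, stdev


def find_exceptions(baseline, current, k=3.0):
    """Return the values in `current` more than k standard deviations
    away from the mean of `baseline` (needs at least two baseline samples)."""
    mu, sigma = mean(baseline), stdev(baseline)
    return [x for x in current if abs(x - mu) > k * sigma]


def trend_slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...).

    A positive slope means the metric grows per sample interval.
    Needs at least two values.
    """
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den
```

The slope of a space or CPU series, extrapolated forward, is one simple input to the capacity analysis listed above.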

18
Operation
19
Repository
  • Server inventory
  • Maintenance log
  • Problem history
  • Performance warehouse

20
Server Inventory
  • Server characteristics
  • name, hostname, SA, CPU
  • version, OS, PM, memory
  • type, subsystem, POC, connection
  • Manual update via web pages
  • Automatic update via collection agents

21
Maintenance Log
  • Maintenance history
  • Installation, Upgrade, Bounce
  • OS maintenance, Configuration, Space allocation
  • Update and query via web pages
  • Help diagnose problems

22
Problem History
  • Problem history
  • Symptom, Diagnostics, Solution
  • Case tracking, Workaround
  • Update and query via web pages
  • Benefits
  • Diagnose similar problems
  • Share knowledge and skills

23
Performance Warehouse
  • Automatic data collection
  • Automatic data summary
  • Analysis model
  • Dynamic: on-demand analysis on the Web
  • Static: pre-defined and complex data model
  • Delivery via the Web

24
Operation Repository
SI: Server Inventory; ML: Maintenance Log; PH: Problem History; PW: Performance Warehouse
25
Job Scheduling
  • Unix Cron
  • Benefits
  • Simple
  • Drawbacks
  • Failure detection
  • Job stream dependency
  • Standalone vs. Distributed environment

26
Job Scheduling
  • Autosys
  • Scheduling and operations automation for
    distributed environments
  • Benefits
  • Centralized job scheduling management
  • Flexible job scheduling and dependency
  • Uninterrupted job processing
  • Failure detection
  • Fault tolerance

27
Autosys
[Diagram labels: Ethernet, Client, Server, Remote Agent (x2), Autosys Database, Event Processor; the Event Processor polls the Autosys Database]
  • Event Processor
  • event found
  • starting conditions met
  • start up remote agent
  • Remote Agent
  • start up
  • run job
  • return job status
  • exit

28
Job @ Autosys
  • Job location

29
Job @ Autosys
  • Start condition
  • Min/max run time
  • Automatic restart
  • Grouping
  • Dependency
  • Standard input/output redirection
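To make the dependency idea concrete, the sketch below runs a set of jobs in dependency order and skips any job whose prerequisite did not succeed. It is not Autosys syntax or behavior; the job names, the status labels, and the `run_jobs` function are hypothetical, and the ordering comes from Python's standard `graphlib` module.

```python
from graphlib import TopologicalSorter


def run_jobs(jobs, deps):
    """Run jobs in dependency order.

    jobs: maps a job name to a callable that returns True on success.
    deps: maps a job name to the set of job names it depends on.
          Every name used in deps is assumed to be defined in jobs.
    Returns a dict of job name -> "SUCCESS", "FAILURE", or "SKIPPED".
    """
    graph = {name: deps.get(name, set()) for name in jobs}
    status = {}
    for name in TopologicalSorter(graph).static_order():
        if all(status.get(d) == "SUCCESS" for d in deps.get(name, ())):
            status[name] = "SUCCESS" if jobs[name]() else "FAILURE"
        else:
            # A prerequisite failed or was itself skipped.
            status[name] = "SKIPPED"
    return status
```

With plain cron, the equivalent logic has to be hand-written into each script, which is the job stream dependency drawback noted on the earlier slide.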

30
Email Notification
[Diagram labels: ASE (x2), RS, Monitoring Host, Autosys, NOC, SA On-call, Designated SA, Group]
31
Email Notification
  • Benefits
  • Easy to configure
  • Drawbacks
  • Large volume
  • Duplicate
  • Hard to prioritize
  • Broken escalation chain
  • Difficult to identify problem

32
Netcool/OMNIbus
[Diagram labels: ASE (x2), RS, Monitoring Host, Autosys, NOC, SA On-call, Designated SA, Group]
33
Netcool/OMNIbus
  • Real-time event monitoring and management
  • Consolidates
  • Integrates
  • Configurable
  • Meaningful
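Consolidation addresses the duplicate and prioritization problems listed for email notification. As a generic sketch, not the product's actual mechanism, repeated events can be merged on a key, counted, and sorted by severity. The `Event` fields and the choice of key are assumptions.

```python
from dataclasses import dataclass


@dataclass
class Event:
    source: str
    message: str
    severity: int  # higher means more urgent (an assumption of this sketch)
    count: int = 1


def consolidate(raw_events):
    """Merge (source, message, severity) tuples that share source and message.

    Each merged event keeps the highest severity seen and a repeat count.
    The result is sorted most severe first, then most frequent first.
    """
    merged = {}
    for source, message, severity in raw_events:
        key = (source, message)
        if key in merged:
            merged[key].count += 1
            merged[key].severity = max(merged[key].severity, severity)
        else:
            merged[key] = Event(source, message, severity)
    return sorted(merged.values(), key=lambda e: (-e.severity, -e.count))
```

One hundred identical log-full messages then show up as a single entry with a count of 100 instead of one hundred emails.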

34
Infrastructure
[Diagram labels: Component, Web-enabled, Operation, Repository, Automation, Notification]
35
Infrastructure
  • Productivity
  • Availability
  • Reliability
  • Scalability

36
Business Impact
  • Improved quality of online services
  • Overall system availability 99.6% in 1999
  • Strong subscriber growth (10M to 20M in 2 years)
  • Strong revenue growth ($500M to $1.3B in 2 years)
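To put the 99.6 availability figure in concrete terms, the arithmetic below converts an availability percentage into hours of downtime. It assumes the figure is a percentage measured over a full 365-day year, which the slide does not state.

```python
def downtime_hours(availability_pct, period_hours=365 * 24):
    """Hours of downtime implied by an availability percentage over a period."""
    return (1 - availability_pct / 100) * period_hours
```

Under those assumptions, 99.6 corresponds to roughly 35 hours of downtime in a year (0.4 percent of 8,760 hours).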

37
Future Direction
  • Web-based infrastructure
  • Knowledge-centric event analysis
  • Performance-based early detection
  • Automated agent
  • Problem auto-correction
  • Enterprise management integration

38
Conclusion
  • Contact Information
  • George Wang <zwang_at_sybase.com>
  • Mendy Meyers <mmeyersm_at_aol.com>
  • Questions & Answers

Thank You