Reduce MTTR - PowerPoint PPT Presentation

About This Presentation
Title:

Reduce MTTR

Description:

with Combined Performance Management and Forensic Analysis. Brian Robertson ... Does not require entire data capture files to be sent. Deep forensic analysis ... – PowerPoint PPT presentation

Slides: 38
Provided by: Robert1111

Transcript and Presenter's Notes

Title: Reduce MTTR


1
Reduce MTTR
  • with Combined Performance Management and Forensic
    Analysis

2
Introduction
  • Brian Robertson
  • Solutions Marketing Manager
  • NetScout Systems, Inc.

Jim Bauer, Director of Network Infrastructure
Services
3
Agenda
  • Network Performance Management Challenges
  • Meeting the Challenge with Performance Management
    Solutions
  • Top-Down Troubleshooting Approach
  • Case Studies
  • Survey Results

4
Balancing Proactive Performance Management with
Post-Incident Investigation and Analysis
  • Challenge
  • Need proactive monitoring and analysis for
    everyday troubleshooting and capacity planning
  • Intermittent problems wreak havoc on the
    performance of business-critical and
    revenue-generating applications
  • Problem cause is often difficult to discover
  • Needs
  • Ability to automatically analyze traffic and
    simultaneously retain evidence to recreate a
    complex incident and discover its culprit without
    having to wait for the event to recur

5
Who Finds Performance Problems?
Source: NetScout Systems Survey, March 2007 (N = 232)
6
Performance Problem Lifecycle
7
The Ultimate Payoff: Reduced MTTR
Faster time to resolution / minimized user impact
[Timeline figure. Labels: Problem Origin, End-Users Impacted, User Calls Help Desk, Service Outage, Final Verification, End Users Not Impacted!]
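
As a rough illustration of the metric named in the slide title, MTTR can be computed as the mean of the detection-to-resolution times across incidents. The sketch below assumes a hypothetical log of (detected, resolved) timestamp pairs; the function names and the sample data are illustrative and do not come from the presentation.

```python
from datetime import datetime, timedelta

def mean_time_to_resolution(incidents):
    """Return the mean of (resolved - detected) over all incidents.

    incidents: list of (detected, resolved) datetime pairs (hypothetical format).
    """
    if not incidents:
        raise ValueError("need at least one incident")
    total = sum((resolved - detected for detected, resolved in incidents), timedelta())
    return total / len(incidents)

def percent_reduction(before, after):
    """Percentage by which MTTR dropped from `before` to `after` (both timedeltas)."""
    return (before - after) / before * 100.0

# Made-up incident log, for illustration only.
t0 = datetime(2007, 3, 1, 9, 0)
incidents = [
    (t0, t0 + timedelta(hours=4)),
    (t0, t0 + timedelta(hours=2)),
]
mttr = mean_time_to_resolution(incidents)
print("MTTR:", mttr)
```

Comparing the value before and after a tooling change with `percent_reduction` gives the kind of "time savings" figure a survey might report.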
8
What is needed to Lower MTTR
  • A Performance Management System that
  • Combines key performance indicator (KPI)
    monitoring and analysis with continuous packet
    capture for post event data mining
  • Top down approach to troubleshooting
  • Application Fabric Monitoring
  • Has an architecture built for high performance
    recording and infrastructure monitoring
  • Uses a high capacity, highly available hardware
    platform

9
Unique Approach: Three-Tier Data Architecture
for Rapid Top-Down Workflow
Top-Down Data Set
  • KPIs: Retransmits, VoIP QoS Errors, App Resp
    Time, etc.
  • Flows (CDM): Applications/Services, Conversations,
    Utilization, Volume, etc.
  • Packets: Header and Payload
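
The three tiers can be pictured as progressively narrower views over the same recorded traffic. The sketch below is a minimal model of that drill-down, assuming a hypothetical `Packet` record and helper names; it is not the product's data model, only one way to show moving from a KPI to the flows behind it and then to the stored packets.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Packet:  # hypothetical record; field names are assumptions
    ts_ms: int
    src: str
    dst: str
    app: str
    size: int
    retransmit: bool

def kpi_retransmits_by_app(packets):
    """Tier 1 (KPI): retransmit count per application."""
    counts = defaultdict(int)
    for p in packets:
        if p.retransmit:
            counts[p.app] += 1
    return dict(counts)

def flows_for_app(packets, app):
    """Tier 2 (flows): bytes per (src, dst) conversation for one application."""
    volume = defaultdict(int)
    for p in packets:
        if p.app == app:
            volume[(p.src, p.dst)] += p.size
    return dict(volume)

def packets_for_flow(packets, app, src, dst):
    """Tier 3 (packets): the stored packets behind one conversation."""
    return [p for p in packets if (p.app, p.src, p.dst) == (app, src, dst)]

# Made-up capture, for illustration only.
capture = [
    Packet(0, "10.0.0.5", "10.0.1.9", "http", 1500, False),
    Packet(3, "10.0.0.5", "10.0.1.9", "http", 1500, True),
    Packet(4, "10.0.0.7", "10.0.1.9", "http", 400, False),
    Packet(9, "10.0.0.8", "10.0.2.2", "sip", 200, False),
]
kpis = kpi_retransmits_by_app(capture)          # start from the KPI
worst_app = max(kpis, key=kpis.get)             # application with most retransmits
flows = flows_for_app(capture, worst_app)       # drill into its conversations
src, dst = max(flows, key=flows.get)            # heaviest conversation
evidence = packets_for_flow(capture, worst_app, src, dst)  # packets for decode
```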
10
Step 1: Detection
Is there an issue on the network?
  • Trending, alarming and analytical data on key
    performance indicators will provide
  • Notification when link or application utilization
    increases
  • See when unwanted protocols are on the network
  • Measure VoIP links for jitter, packet loss, or
    low MOS scores
  • This information has to be available in real-time
    and historically
  • Collaboration with groups across the organization
    requires quick and easy distribution of reports

11
Understand how your business uses the network
Do you know all the applications running on your
network? Application visibility provides
business justification for IT decisions. Are
there good reasons for an upgrade? Are there
non-business uses of the network?
12
Baseline response time of key business
applications
Looking at several parameters, such as MPLS,
VoIP, or QoS classes, can provide critical data
regarding the health of the network.
Response time provides insight into the end-user
experience and should be an integral part of any
performance management audit.
13
Top-Down Approach for Rapid Troubleshooting
Power Alarms - Micro-Burst Alarming
  • Based on traffic rates exceeding a 1 millisecond
    threshold
  • Interval can be configured as low as 5
    milliseconds up to 1 second
  • An alarm is received when the burst starts and
    one when it ends
  • Evidence is launched based on applications,
    hosts, conversations during the alarm interval
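
The alarm behavior described above (one alarm when a burst starts, one when it ends) can be sketched as follows. This is a minimal illustration, assuming packets arrive as (timestamp in ms, size in bytes) pairs and that a burst is any measurement interval whose bit rate exceeds a configured threshold; the function name, record format, and threshold are assumptions, not the product's actual algorithm.

```python
from dataclasses import dataclass

@dataclass
class BurstAlarm:
    kind: str      # "start" or "end"
    time_ms: int   # start of the interval that triggered the alarm

def detect_microbursts(packets, interval_ms, threshold_bps):
    """Emit a start alarm when an interval's rate first exceeds the
    threshold and an end alarm at the first interval that falls back.

    packets: iterable of (timestamp_ms, size_bytes) pairs (hypothetical format).
    """
    bits_per_interval = {}
    for ts_ms, size in packets:
        idx = int(ts_ms // interval_ms)
        bits_per_interval[idx] = bits_per_interval.get(idx, 0) + size * 8
    if not bits_per_interval:
        return []
    # Number of bits an interval may carry before it counts as a burst.
    limit = threshold_bps * interval_ms / 1000.0
    alarms, in_burst = [], False
    # Scan one interval past the last packet so a trailing burst is closed.
    for idx in range(min(bits_per_interval), max(bits_per_interval) + 2):
        over = bits_per_interval.get(idx, 0) > limit
        if over and not in_burst:
            alarms.append(BurstAlarm("start", idx * interval_ms))
        elif not over and in_burst:
            alarms.append(BurstAlarm("end", idx * interval_ms))
        in_burst = over
    return alarms

# Made-up traffic: ten 1500-byte packets land inside one 5 ms interval.
packets = [(0, 100)] + [(5, 1500)] * 10 + [(12, 100)]
alarms = detect_microbursts(packets, interval_ms=5, threshold_bps=10_000_000)
```

The alarm window (start time to end time) is what would then be used to pull the applications, hosts, and conversations active during the burst.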

14
Step 2: Diagnosis
What is the root cause of the issue?
  • Look at the flows associated with the KPI
    affected by the issue
  • Applications
  • Conversations
  • Utilization
  • Volume
  • Historical view of trends for flows shows
  • When and where an issue commonly occurs
  • Upward / downward trends for flows

15
Flow Based Troubleshooting Example
Music downloads from 3 different sites by the same user
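
A flow-level view makes this kind of finding a simple aggregation. The sketch below assumes hypothetical flow records of the form (source host, destination host, byte count); the host names and volumes are invented for illustration.

```python
from collections import Counter

def top_talkers(flows, n=3):
    """Return the n source hosts that sent the most bytes.

    flows: iterable of (src_host, dst_host, byte_count) (hypothetical format).
    """
    totals = Counter()
    for src, _dst, nbytes in flows:
        totals[src] += nbytes
    return totals.most_common(n)

def destinations_of(flows, src_host):
    """Return the sorted set of destinations one source talked to."""
    return sorted({dst for src, dst, _ in flows if src == src_host})

# Made-up flow records, for illustration only.
flows = [
    ("user-a", "music-site-1", 40_000_000),
    ("user-a", "music-site-2", 35_000_000),
    ("user-a", "music-site-3", 25_000_000),
    ("user-b", "mail-server", 2_000_000),
]
heaviest_user, volume = top_talkers(flows, n=1)[0]
sites = destinations_of(flows, heaviest_user)
```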
16
Troubleshoot Intermittent or Subtle Problems with
Application Fabric Monitoring
  • Efficient on-board analysis
  • Sends only the data requested by the client
  • Does not require entire data capture files to be
    sent
  • Deep forensic analysis
  • Logically move from KPI to Flow to Packet
  • View data with millisecond granularity
  • Record packet level audit trails to view subtle
    and intermittent issues

17
Highly Efficient Architecture
18
Highly Efficient Architecture
On Board Analysis
Client Analysis
Efficient On-board Analysis
Inefficient Client-based Analysis
19
Superior Forensic Analysis
  • Expert Analysis
  • Expert Zoom and Data Mining interface with
    multiple workspaces
  • Broad support of over 1000 protocol decodes
  • TCP Session Follow
  • Bounce Diagrams
  • Intuitive, flexible GUI
  • Multi-user, Web-based console
  • Eases search process
  • Incident Reports for collaborating and
    communicating with others

20
Case in Point: Solution Architecture Wins Business
  • Bank in North America has assets approaching 300
    billion
  • 2 trading floors (Chicago, New York) plus NJ
    disaster recovery
  • Pain Point: Needs continuous capture and
    monitoring on trading floors
  • Needs both flow-based monitoring and trending for
    troubleshooting and capacity planning, plus
    recording for in-depth analysis because of the
    value of the financial trades
  • Current solution in Chicago not working: network
    managers constrained by time delays in viewing
    packet trace files when pulling them over the
    network.

Able to perform top-down troubleshooting with
detailed analysis of applications and
conversations. When necessary for in-depth
troubleshooting, the actual packets are available
for event reconstruction and forensic data
mining without adding load to the network.
21
High Definition Visibility Deep Forensics
22
Quickly navigate to the needed level of
granularity, consistently maintaining context as
you go deeper.
What do you do if you need a packet decode, but
the offending traffic occurred two hours ago?
23
Step 3: Verification
Have we resolved the issue?
  • Does the KPI meet expectations now that changes
    have been implemented?
  • Re-evaluate response time of critical
    applications
  • Are key applications being delivered within
    previous response time levels? Have there been
    negative or positive impacts? Have VoIP QoE
    metrics changed?
  • Determine whether bandwidth utilization meets
    estimates
  • If the changes resolve the issue, the baseline
    should be reset
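
One way to make the verification step concrete is to compare response times measured after the change with the earlier baseline. The sketch below uses the median and a 10% tolerance; both choices, along with the function name and sample values, are assumptions made for illustration rather than anything prescribed in the slides.

```python
from statistics import median

def verify_against_baseline(baseline_ms, current_ms, tolerance=0.10):
    """Classify the change in median response time relative to the baseline.

    Returns (verdict, baseline_median, current_median), where verdict is
    "improved", "regressed", or "unchanged". The tolerance is an assumed
    fraction of the baseline median.
    """
    base = median(baseline_ms)
    now = median(current_ms)
    change = (now - base) / base
    if change > tolerance:
        verdict = "regressed"
    elif change < -tolerance:
        verdict = "improved"
    else:
        verdict = "unchanged"
    return verdict, base, now

# Made-up response-time samples in milliseconds.
baseline = [100, 120, 110]
after_fix = [80, 85, 90]
verdict, base, now = verify_against_baseline(baseline, after_fix)

# If the issue is resolved, adopt the new measurements as the baseline.
if verdict != "regressed":
    baseline = list(after_fix)
```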

24
Case in Point: Troubleshooting employee remote
access problem
  • New England based insurance company
  • Pain Point: Remote employees having trouble
    accessing network resources
  • LDAP servers source of many problems
  • Intermittent issues were elusive
  • Application Fabric Monitoring provided continuous
    capture and recording for in-depth
    troubleshooting and forensics

Discovered two LDAP servers had their
authentication databases out of sync and were
spending their cycles trying to sync their
databases
25
Step 4: On-Going Management
How is your network growing and changing over time?
  • Converged networks need unified performance
    management
  • Continuation of the tasks you performed in
    detection and post-deployment impact phases
  • Troubleshooting - requiring real-time information
  • Planning and traffic engineering - requiring
    longer-term historical information
  • Communication to key constituents
  • Easy to create, customizable reports

26
Visibility into how the network is used
Complete Application monitoring and profiling with CDM
Virtualization
  • Application identification - Common matrix for
    multimedia voice, video and data
  • Well-known, complex, custom, URL-based apps
  • VoIP for RTP, SIP, MGCP, H.323, SCCP
  • Industry specific, e.g. FIX protocol, IP
    Multicast and PACs
  • Application discovery for unknown TCP and UDP
    traffic
  • No data reduction: all applications
  • QoE / Response time analysis
  • Proactive Alarms for thresholds, response times,
    time over threshold and microbursts
  • Virtual interface analysis for VLANs, VRFs, QoS,
    or sites
  • Post-capture filters by a variety of metrics, not
    just by IP address, e.g. CDM port

27
Added Benefits of a Unified Performance
Management System
  • Minimize Total Cost of Ownership (TCO) by
    choosing Performance Management Systems that
  • Support all applications, conversations and
    diverse network technologies that make up the
    network environment
  • Present
  • Future
  • Uses a common interface for all data sources
  • Deep Forensic data
  • Probe Flow data
  • NetFlow / sFlow

28
Common Data Model: More Performance Metrics for
services or applications
VoIP packet loss
Link Usage over Time
Details from drill down on spike
29
Responsiveness Tracking with QoE
  • Quality of Experience Tracking (QoE)
  • Key Features
  • Adds support for Virtual Interfaces
  • Granularity down to 1-minute
  • Tracks TCP and HTTP Error counters
  • Adds support for both passive and active VoIP
    metrics
  • Adds support for IP-SLA transactions

TCP Errors
30
Responsiveness Tracking with QoE: Visibility with
1-Minute Granularity
Response Time with 1-minute resolution
QoE Virtual Interface Support
31
Application Discovery: Visibility into Unknown
Traffic
  • Identify Port-to-Port conversations for unknown
    applications
  • Can be logged historically with 1-minute
    resolution
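
The idea of logging port-to-port conversations for unrecognized traffic at 1-minute resolution can be sketched as a simple bucketing step. The known-port set, record format, and function name below are assumptions chosen for illustration; a real system would use a much richer application map.

```python
from collections import defaultdict

# Hypothetical set of ports already mapped to known applications.
KNOWN_PORTS = {25, 53, 80, 443}

def unknown_conversations(packets, known_ports=KNOWN_PORTS):
    """Log bytes per (minute, src_port, dst_port) for unrecognized traffic.

    packets: iterable of (ts_seconds, src_port, dst_port, size_bytes)
    (hypothetical format). A packet is treated as unknown when neither
    port is in known_ports.
    """
    log = defaultdict(int)
    for ts, sport, dport, size in packets:
        if sport in known_ports or dport in known_ports:
            continue  # already attributed to a known application
        log[(int(ts // 60), sport, dport)] += size
    return dict(log)

# Made-up packets, for illustration only.
packets = [
    (5, 40000, 80, 500),     # known: web traffic
    (10, 40001, 9999, 300),  # unknown, minute 0
    (50, 40001, 9999, 200),  # unknown, minute 0
    (70, 40001, 9999, 100),  # unknown, minute 1
]
history = unknown_conversations(packets)
```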

32
Survey Results
  • In 2007 NetScout partnered with Ashton, Metzler
    and Associates
  • Goal: To see what impact performance management
    systems had in diagnosing critical issues
  • 138 participants were asked how long it took to
    diagnose a critical network issue
  • Before implementing a Performance Management
    Solution
  • After implementing a Performance Management
    Solution

33
Improved MTTR with a Performance Management System
Before Implementing the nGenius Performance
Management System
After Implementing the nGenius Performance
Management System
69% time savings; 6 hours time savings
Source: NetScout Systems Survey, March 2007 (N = 138)
34
Ability to Diagnose Issues in the First 3 Hours
Before Implementing the nGenius Performance
Management System
After Implementing the nGenius Performance
Management System
Percentage increased from 26% to 77%
Source: NetScout Systems Survey, March 2007 (N = 138)
35
Summary: Benefits of Proactive Management with
Post-Incident Analysis
  • Reduced MTTR with a Unified Solution
  • Must have vision into the network during the
    detection phase through to the on-going
    management phase of the performance problem
    lifecycle
  • Must be able to support real-time and historical
    reporting
  • Top down approach with context-sensitive data
    mining
  • Store packets and report on flows concurrently
  • Lower TCO with Architecture Advantages
  • Flexible to support today's and tomorrow's
    applications and network technologies
  • Provide vision into the entire network, not just
    certain pieces
  • Highly Available Hardware Architecture
  • Integrated automatic and ad-hoc reporting
    functionality for collaboration

36
About NetScout
  • The most experienced team in the industry
  • Founded in 1984
  • Growing, profitable
  • $102M revenues in 2007
  • World-wide distribution and support
  • Winner of 2004 2006 Omega Northface Award for
    customer satisfaction

37
Case in Point: Bates College
  • 3000 Users
  • Needed to see
  • Who was using large amounts of resources
  • Who was using the available bandwidth
  • Number of flows
  • Pain Point: Needed to minimize impact of single
    users on resources
  • Diagnostics to see service and applications
  • Looking for heavily utilized links
  • Needed more than layer 2 RMON devices

Please welcome Jim Bauer, Director of Network
Infrastructure Services
Company Confidential