How the Mandate to Perform is Shaping Network Management Now - PowerPoint PPT Presentation


1
How the Mandate to Perform is Shaping Network
Management Now
  • Jim Metzler
  • Vice President - Ashton, Metzler Associates
  • jim_at_ashtonmetzler.com

2
Agenda
  • Background
  • VoIP
  • Data Applications
  • Service Management
  • Bandwidth Optimization

3
Goals and Non-Goals
  • Goals
  • To identify the changing value proposition of
    network and systems management
  • To outline an approach to some of the key
    topics in network and systems management
  • Non-Goals
  • To be unduly prescriptive
  • To read every bullet on every slide

4
Key Premises
  • Over the past five years, there have been two
    waves of sensationalism relative to the value of
    IT. The first wave was the Y2K non-event. The
    second wave was the .com hysteria.
  • Now there is a reaction to that sensationalism,
    e.g., Carr's "IT Doesn't Matter."
  • To successfully demonstrate the value of IT, we
    must continually run IT as a utility while
    simultaneously finding ways for IT to add
    business value.

5
Business Managers are More Likely to See Value in
Applications than in the Infrastructure
Source: Ashton, Metzler Associates
6
Importance of Showing the Linkage Between the
Infrastructure and Applications
Source: Ashton, Metzler Associates
7
The Most Important Management Tasks
  • Performance
  • Application
  • Service Level
  • Configuration
  • Fault
  • Capacity

Source: Ashton, Metzler Associates
8
How Value Plays Out on the Technology
Adoption Life Cycle
  • Innovators
  • Early Adopters
  • The Chasm
  • Early Majority
  • Late Majority
  • Laggards

Source: Geoffrey Moore
9
Agenda
  • Background
  • VoIP
  • Data Applications
  • Service Management
  • Bandwidth Optimization

10
Benefits of IP Telephony in Descending Order of
Importance
  • A Year Ago:
  • Cheaper calls between company sites
  • VoIP systems cheaper to administer and upgrade
  • Easier to deploy new integrated apps
  • Able to deploy voice functionality such as ACDs
    or three-way calls
  • Cheaper international calls
  • Significant drop in the cost of M/A/C
  • Now:
  • Easier to deploy new integrated apps
  • Able to deploy voice functionality such as ACDs
    or three-way calls
  • Significant drop in the cost of M/A/C
  • Cheaper calls between company sites
  • VoIP systems cheaper to administer and upgrade
  • Cheaper international calls

11
Current Inhibitors of VoIP Deployment
  • First Tier: Lack of budget; systems for
    managing and troubleshooting VoIP quality
  • Second Tier: Concerns about security; concerns
    about interoperability
  • Third Tier: Benefits of VoIP are not compelling
    enough; lack of people to plan, design,
    implement and manage VoIP; concerns about E-911;
    concerns about whether or not QoS technologies
    are ready for production

12
Managing VoIP Performance
  • Network Requirements
  • One-way delay of around 200ms or less
  • Packet loss of 1% or less
  • Jitter of 25ms or less
  • Defining Jitter
  • Mean delay over some time period
  • Median delay over some time period
  • Degraded mean or median delay over some time
    period
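
The network requirements listed above can be checked mechanically against measurements. The following is a minimal sketch, assuming the three limits are treated as hard thresholds (the slide says "around" 200ms); the function and constant names are illustrative and do not come from any particular tool.

```python
# Limits taken from the slide; names are hypothetical.
MAX_ONE_WAY_DELAY_MS = 200.0
MAX_PACKET_LOSS_PCT = 1.0
MAX_JITTER_MS = 25.0


def voip_violations(delay_ms, loss_pct, jitter_ms):
    """Return the names of the requirements that a measured path fails."""
    violations = []
    if delay_ms > MAX_ONE_WAY_DELAY_MS:
        violations.append("one-way delay")
    if loss_pct > MAX_PACKET_LOSS_PCT:
        violations.append("packet loss")
    if jitter_ms > MAX_JITTER_MS:
        violations.append("jitter")
    return violations


# A path with 250 ms delay and 2% loss fails two of the three requirements.
print(voip_violations(delay_ms=250, loss_pct=2.0, jitter_ms=10))
```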

13
Defining Jitter
  • Jitter is thought of as the variation in delay.
  • RFC 1889 defines jitter for the Real Time
    Transport Protocol (RTP).
  • D(i-1, i) is the difference in the transit time
    for two consecutive packets sent between two
    devices.
  • |D(i-1, i)| is the absolute value of D(i-1,
    i). This means that a positive variation in
    delay and a negative variation in delay are
    treated the same.
  • Jitter = Jitter + (|D(i-1, i)| - Jitter)/16
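
The RFC 1889 running estimate described above can be sketched in a few lines. This is an illustrative sketch, assuming the transit time of each packet (receive timestamp minus send timestamp) is already available in milliseconds; the function names are hypothetical.

```python
def update_jitter(jitter, prev_transit, curr_transit):
    """Apply one step of the RFC 1889 estimator.

    d is the difference in transit time between two consecutive packets;
    the estimate moves 1/16 of the way toward |d|.
    """
    d = curr_transit - prev_transit
    return jitter + (abs(d) - jitter) / 16.0


def running_jitter(transit_times):
    """Fold the estimator over a sequence of per-packet transit times."""
    jitter = 0.0
    for prev, curr in zip(transit_times, transit_times[1:]):
        jitter = update_jitter(jitter, prev, curr)
    return jitter


# Constant delay produces no jitter; a single 16 ms swing moves the
# estimate by 16/16 = 1 ms.
print(running_jitter([50.0, 50.0, 50.0]))
print(running_jitter([50.0, 66.0]))
```

The division by 16 smooths the estimate, so a single late packet shifts it only slightly while sustained variation accumulates.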

14
Agenda
  • Background
  • VoIP
  • Data Applications
  • Service Management
  • Bandwidth Optimization

15
Application Degradation
  • 62% of companies in general, and 85% of large
    companies in particular, have experienced
    incidents of significant application performance
    degradation.
  • The effects of application performance
    degradation include lost employee productivity,
    lost revenue, and a reduction in customer
    service.
  • One common technique that companies plan to use
    to ensure application performance is the same
    technique that has been used for years:
    throw bandwidth at the problem.

Source: Network World
16
Representative Enterprise Application: SAP
  • The compute component of SAP consists of
    Application Server(s), Database Server(s), and
    a client running software that is
    sometimes known as SAPGUI.

[Figure: Database Server and Application Servers]
17
Understanding the Compute Component of SAP
Requires Understanding
  • The performance of all of the servers (e.g., Sun, HP)
    and operating systems (e.g., Windows, UNIX, Linux)
    that are involved.
  • The type of database being used, e.g., Oracle,
    IBM DB2, Informix.
  • The particular SAP application(s) being
    used, e.g., the Sales and Distribution
    application.
  • The particular sub-applications and transactions:
    creating orders? Updating orders?
  • Other tasks that are also being run, e.g.,
    maintenance.

18
Impact of Complex Compute Environments
  • In one test of SAP performance, the response
    time as seen by the user was one tenth of a
    second. However, when the test was re-run with
    two compute-bound processes on the client
    machine, the response time was roughly 4 seconds.
  • In another test of SAP performance, a
    maintenance task that was executing periodically
    caused the response time to jump by
    several seconds.

Sources: N. Gloy and J. B. Chen, "Service Level Directed
Management of SAP R/3," Appliant; Y. Somin, "Digging
into SAP R/3 for Capacity Planning," BMC
19
Representative Enterprise Application Performance
20
Agenda
  • Background
  • VoIP
  • Data Applications
  • Service Management
  • Bandwidth Optimization

21
Service Management: The Link Between IT and
Business Managers
  • Concept taken from the SNA world
  • Could be explicit or implicit
  • Define and report on the services you offer
  • Develop a few key performance and unit cost
    metrics for that service

22
How Do IT Professionals Define a Service?
  • WAN Connectivity
  • DNS
  • Remote Access
  • Directory
  • PC Repair
  • Instant Messaging
  • Internet Access
  • Conferencing
  • Email
  • Voice

23
Tiered Model of Service Level Management
Business-based Service
Network Function
Network Management
NSP
HW Vendor
24
Approaches to Service Level Management
Source: Ashton, Metzler Associates
25
Agenda
  • Background
  • VoIP
  • Data Applications
  • Service Management
  • Bandwidth Optimization

26
Sample WAN Upgrade Costs
  • An enterprise network has 100 branch offices that
    it connects with a hub-and-spoke frame relay
    network that has:
  • T-1 access links at each branch office.
  • 128 Kbps PVCs and 256 Kbps frame relay ports.
  • Due to performance issues, the company is
    considering upgrading the PVCs to 256 Kbps.
  • This would cost around $220 per month per office,
    for a total of $22,000 per month, or roughly
    $800,000 over a three-year life cycle.
  • The company should first evaluate bandwidth
    optimization: compression, caching, and bandwidth
    management.
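
The upgrade arithmetic above can be reproduced with a short calculation. This is a sketch using only the figures on the slide (100 offices, 220 per office per month, a 36-month life cycle); the function name is illustrative.

```python
def upgrade_cost(offices, monthly_cost_per_office, months):
    """Return (total monthly cost, total cost over the life cycle)."""
    monthly_total = offices * monthly_cost_per_office
    return monthly_total, monthly_total * months


monthly, lifecycle = upgrade_cost(offices=100, monthly_cost_per_office=220, months=36)
# 22,000 per month and 792,000 over three years, which the slide
# rounds to roughly 800,000.
print(monthly, lifecycle)
```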

27
Compression
  • The basic function of compression is to reduce
    the size of the file that is to be sent over the
    WAN.
  • Most traditional compression tools (e.g.,
    WinZip) use a dictionary compression algorithm
    such as the Lempel-Ziv algorithm.
  • A dictionary compression algorithm replaces a
    continuous stream of characters with codes.
  • The compression ratio of a compression algorithm
    is the ratio of the input bytes to the output
    bytes. A 2:1 ratio implies 50% compression.
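
The relationship between compression ratio and percentage reduction can be illustrated with Python's standard zlib module, whose DEFLATE format belongs to the Lempel-Ziv family. This is a sketch only: the sample data is made up, and the ratio achieved on real WAN traffic will differ.

```python
import zlib


def compression_ratio(input_bytes, output_bytes):
    """Ratio of input bytes to output bytes, e.g. 2.0 for 2:1."""
    return input_bytes / output_bytes


def percent_reduction(ratio):
    """Percentage by which the data shrinks for a given ratio."""
    return (1.0 - 1.0 / ratio) * 100.0


# Highly repetitive sample data (an assumption) compresses very well.
data = b"network management " * 200
compressed = zlib.compress(data)
ratio = compression_ratio(len(data), len(compressed))

print(f"{len(data)} -> {len(compressed)} bytes, ratio {ratio:.1f}:1")
print(f"a 2:1 ratio is a {percent_reduction(2.0):.0f}% reduction")
```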

28
Compression Issues to Consider
  • At a hub site:
  • With traditional approaches, the memory
    requirements can be an issue; i.e., the memory
    requirements of a dictionary compression
    algorithm grow linearly with the number of sites.
  • The CPU requirements also tend to grow linearly
    with the number of sites.
  • Some classes of traffic do not lend themselves to
    traditional compression techniques, e.g., traffic
    that is already compressed or is encrypted,
    since encrypted traffic typically does not have
    repeated patterns.

29
Header Compression
  • The IP/UDP/RTP headers associated with IP
    Telephony packets add 40 octets to each IP
    Telephony packet, driving the need for header
    compression.
  • Header Compression RFCs
  • RFC 1144, Compressing TCP/IP Headers for
    Low-Speed Serial Links
  • RFC 2507, IP Header Compression
  • RFC 2508, Compressing IP/UDP/RTP Headers for
    Low-Speed Serial Links
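
To see why 40 octets of header matter, the overhead can be computed for a small voice payload. This is a sketch under stated assumptions: the 20-byte payload is an illustrative value (small voice payloads of this size are common with low-bit-rate codecs), and the 2-byte compressed header is the best case described in RFC 2508 when UDP checksums are not sent.

```python
# Minimum header sizes in octets: IPv4 without options, UDP, RTP.
IP_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12


def header_overhead(payload_bytes, header_bytes):
    """Fraction of each packet that is header rather than voice payload."""
    return header_bytes / (header_bytes + payload_bytes)


full_header = IP_HEADER + UDP_HEADER + RTP_HEADER  # the 40 octets on the slide
payload = 20  # assumed voice payload size in bytes

uncompressed = header_overhead(payload, full_header)
compressed = header_overhead(payload, 2)  # assumed RFC 2508 best case

print(f"uncompressed: {uncompressed:.0%} of each packet is header")
print(f"compressed:   {compressed:.0%} of each packet is header")
```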

30
Caching
  • Originally, cache referred to very fast memory
    inside of a mainframe computer. This type of
    cache is used to speed the memory access within a
    computer.
  • The motivation for deploying caching in the WAN
    is to accelerate the delivery of content, and
    optimize the use of WAN bandwidth by placing
    re-usable content close to the user.
  • When caching works well, it causes the WAN to
    perform as if it had received a bandwidth
    upgrade, i.e., bandwidth enhancement.

31
Web Caching: An Example
  • A user requests a Web page.
  • The network analyzes the request and decides to
    send the request to a local proxy cache.
  • If the local cache can fulfill the request, it
    does. If it cannot, it will send a request to
    the original Web server.
  • The Web server delivers the page to the local
    cache, which both stores the page and delivers it
    to the requesting user.
  • This works well if the content is static and the
    local proxy cache services multiple users with a
    need for the same content.
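
The request flow above can be sketched as a small in-memory cache. This is a simplified illustration, not a real proxy: it ignores expiry and cache-control rules, and `ProxyCache`, `fake_origin` and the example URL are hypothetical names introduced for the example.

```python
class ProxyCache:
    """Serve pages from a local store, falling back to the origin server."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable standing in for the Web server
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            # The local cache can fulfill the request.
            self.hits += 1
            return self._store[url]
        # Otherwise ask the origin, store the page, and deliver it.
        self.misses += 1
        page = self._fetch(url)
        self._store[url] = page
        return page


origin_requests = []


def fake_origin(url):
    """Stand-in for the original Web server; records each WAN request."""
    origin_requests.append(url)
    return "<html>content of " + url + "</html>"


cache = ProxyCache(fake_origin)
first = cache.get("http://example.com/index.html")
second = cache.get("http://example.com/index.html")
print(len(origin_requests), cache.hits, cache.misses)
```

Two users asking for the same static page generate only one request across the WAN, which is the case the slide describes as working well.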

32
Congestion => Jitter and/or Packet Loss
  • Example: Traffic from multiple input ports
    converges on a single output port, oversubscribing
    its bandwidth. The output port buffer fills,
    introducing buffer delay. Output buffers
    eventually overflow if congestion continues.

[Figure: data, voice, and video traffic entering a switch or router with a single output link]
33
Example: VoIP in a Congested Network
  • Example: Carrying delay-sensitive traffic, such
    as VoIP, on a congested network.
  • Assume that the switch's buffer capacity is 300
    1,518-byte packets (about 3.6 million bits) and
    that the switch buffer fills during the gap
    between voice packets.
  • The resulting delay is a function of the speed of
    the output link. For example:
  • Gigabit Ethernet port: 3.6 msec
  • Fast Ethernet port: 36 msec
  • Ethernet port: 360 msec
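
The delay figures follow directly from the buffer size and the link speed. The sketch below reproduces the arithmetic, assuming a voice packet has to wait for the entire full buffer to drain ahead of it; the function name is illustrative.

```python
def buffer_drain_delay_ms(packets, packet_bytes, link_bps):
    """Time in milliseconds to transmit a full buffer onto the output link."""
    bits = packets * packet_bytes * 8
    return bits / link_bps * 1000.0


links = [
    ("Gigabit Ethernet", 1_000_000_000),
    ("Fast Ethernet", 100_000_000),
    ("Ethernet", 10_000_000),
]

# 300 packets of 1,518 bytes is 3,643,200 bits.
for name, bps in links:
    print(f"{name}: {buffer_drain_delay_ms(300, 1518, bps):.1f} msec")
```

Each tenfold drop in link speed multiplies the queuing delay by ten, from roughly 3.6 msec to 36 msec to 364 msec.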

34
Positioning Bandwidth Management
  • Bandwidth management manages network resources
    (e.g., bandwidth, queues) in such a way as to at
    least attempt to ensure that key applications
    perform well.
  • Bandwidth management is required in situations in
    which:
  • Bandwidth is notably expensive.
  • There are one or more business critical
    applications whose performance would be
    negatively impacted by surges in traffic.
  • Note that some companies implement performance
    management in the LAN by over-provisioning the
    LAN. Over-provisioning the WAN is usually not
    economically feasible.
  • Bandwidth Management techniques include queuing,
    MPLS and potentially 802.1X.

35
Thank You!!
36
Adaptive Management: Aligning People, Processes
and Systems
  • www.voyence.com

37
Adaptive Management Redefines the Role of IT
  • From: Technology focus; pillars of expertise;
    static management; reactive
  • To: Business process focus; cross-enterprise
    expertise; dynamic infrastructure; proactive
38
Technology Focus to Business Focus
  • The IT organization or service provider becomes
    an integral part of providing the enterprise
    with:
  • Dynamic infrastructure in line with business
    goals
  • The agility to shift resources to meet business
    priorities
  • Services based on business realities
  • Two way communication for better business
    decisions

Changing business needs
Business Services
IT Services
Network Services
39
Static Management to Dynamic Decisions
Adaptive Management provides a better context for
decision making utilizing network management tools
  • Business Services Management
  • IT Services Management: Uses business context to
    adjust resource utilization in response to
    changes in environmental conditions
  • Network Services Management: Provides up-to-date
    information to dynamically modify infrastructure
    aligned with business goals
  • Network Configuration Management: Provides a
    secure and centralized mechanism to effect
    network device configuration changes based on
    business priorities
40
Pillars of Expertise to Cross Enterprise Expertise
IT Management must be prepared to:
  • Analyze demand for resources: Understand, manage,
    and optimize the user experience while ensuring
    continuous and secure operations.
  • Adjust supply of and access to resources:
    Dynamically allocate resources and optimize
    access to automatically address changes in
    service demands.

[Figure: Adaptive Enterprise layers: Customers, Suppliers, Employees; Business Services; IT Management; Networks, Applications, Services; Devices, Servers, Software, Storage]
41
Reactive to Proactive
Adaptive Management optimizes IT's value to the
enterprise (Adaptive Enterprise)
  • Business Services Management needs to:
  • Improve service levels
  • Respond to changes in business priorities
  • Integrate and manage infrastructure
  • Design and deploy infrastructure
  • Network Services Management provides:
  • Automation of network deployment
  • Dynamic network change management
  • Integrated view of networks, devices and
    configurations
  • Design and deployment of network devices to
    support services
  • IT Services Management provides:
  • Automation of service provisioning
  • Dynamic service provisioning
  • Integrated view of applications and servers
  • Design and deployment of application services
42
Steps to Adaptive Management
Enable the network to provide flexible, on-demand
services in a controlled environment.
[Figure: steps over time, building on Network Configuration Management]
  • Capture and control network infrastructure
  • Map network infrastructure to network services
  • Integrate into adaptive management
    infrastructure
43
Capture and Control Network Infrastructure
  • All services rely on the network
  • In order to create an adaptive environment, IT
    management needs:
  • Control over network availability and security
  • Standardization of configuration
  • The ability to quickly shift resources
  • Dynamic network infrastructure
  • Automation of routine changes
  • Historical data
  • Reporting

Capture and control network infrastructure
44
Control Over Change Leads to Adaptive Environment
  • Network Configuration Management provides:
  • Dynamic Resource Allocation
  • Standardized configuration of the network tied to
    business needs
  • Dynamically allocate resources based on business
    priorities
  • Decreased Mean Time to Repair
  • Roll back a device to a working state in a single
    click
  • Enhanced Network Reliability and Security
  • Comprehensive audit trail and restore
    capabilities for all devices
  • Increased Network Availability
  • Implement routing and security policies
    consistently to all elements

45
Map Infrastructure to Services
  • Deploy services quickly and error-free
  • End points and their service requirements for
    different applications, e.g., email, ERP, CRM
  • Knowledge of the related network components
  • Match physical network infrastructure to
    requirements
  • Provide Automated Provisioning
  • More progress today with IT infrastructure than
    network

Map network infrastructure to network services
46
Integrate into Adaptive Management
  • End to end asset and problem management including
    network
  • Improved Root Cause Analysis
  • Correlate faults and performance degradation with
    configuration change events
  • Prevent future outages by adjusting
    configuration policies
  • Mapping change events to problem events
  • Dynamic IT provisioning includes network
  • IT service requests drive network service
    provisioning

Integrate into adaptive management
infrastructure
47
Network Configuration Management: Necessary
Component of Adaptive Management
  • Voyence provides the first systemic approach to
    Network Configuration Management
  • Voyence provides software solutions to validate
    intended network changes and to verify that those
    changes result in an optimized network

Design Management
Network Resource Repository
Compliance Management
Change Management
48
Network Configuration Management
  • VoyenceControl! Advantage
  • Scalable architecture to manage tens of thousands
    of heterogeneous devices across multiple networks
  • Automation of network change management
  • Graphical view of enterprise-wide network
    configurations
  • What?, how?, where?
  • Logical view of infrastructure
  • Validation of changes during design process
  • Audit and archive of network configuration changes

49
Network Configuration Management
  • VoyenceControl! Benefits
  • Improved Network Availability
  • Reduce injected errors through automation,
    standardization, and process
  • 20-40% improvement in MTTR
  • Tighter Network Security
  • Single point of control for all network
    configuration changes
  • Automation of periodic security related updates
  • Audit trail of all network changes
  • Reduced Network Administration Costs
  • 10-80% reduction in network engineering costs
  • Recovery and re-deployment of stranded assets
  • Accurate capacity forecasting for budgeting and
    acquisition

50
Dynamic Network Configuration
Dynamic graphical or tabular workspace provides
the ability to verify and validate changes
and automate deployment.
51
Managing Your Network As A Business Service
  • www.aprisma.com

52
Agenda
  • The Choice You Will Make
  • The Vision
  • What You Can Do Today: Case Study
  • Practical Advice
  • Aprisma's SPECTRUM Solution Suite

53
You Have A Hard Job
  • It's always the network's fault
  • Spiraling complexity and constant state of change
    as new applications, content, vendors and
    technologies are introduced
  • Complexity problem cannot be solved by continuous
    increase in people, time or money invested

54
The Choice You Will Make
Stay down in the network plumbing, managing
your bits and bytes and jitter and latency, and
become irrelevant and outsourced. Or rise up to
understand the connection between your network
components and business-relevant services, and
become vital, even irreplaceable.
55
Network Management in CIOs' Top Technology
Priorities
TOP TEN TECHNOLOGY PRIORITIES, GARTNER CIO SURVEY
2003
[Table: rankings for 2003, 2002, 2001 and Forecast 2006, with average weighted score (10 max), for the following priorities]
  • Security enhancement tools
  • Applications integration/middleware/messaging
  • Enterprise portal deployment
  • Network infrastructure/management tools
  • Internal e-enabling infrastructure
  • Web design, development and content management
    tools
  • Storage management (SAN, NAS) deployment
  • Customer Relationship Management (CRM)
  • Web services internal or external
  • Deploying XML based processes/messaging
56
Why Network Management Remains Relevant
  • The network is the foundation of any electronic
    business service or transaction
  • On Demand, Real Time, Grid Computing and Web
    Services don't work without the network
  • Keeps internal / external providers and vendors
    honest by validating their performance against
    SLAs
  • An insurance policy against the cost of downtime
    or reduced performance

57
The Vision
58
The Vision
  • Define, Document and Discover Services
  • Unified view across all silos of IT
    infrastructure from the end-user perspective
    and business relevance
  • Understand Relationships
  • Physical and Logical Topology
  • Impact Analysis and Prioritization
  • Root Cause Analysis
  • Infrastructure to Services
  • Services to Customers
  • Verify and Validate Quality
  • Real Time, Historical, Predictive

59
People, Process and Politics, then Products
  • Proactive, responsive and skilled people that
    communicate and present a unified face to the
    customer
  • Automated, integrated workflow and escalation
  • Isolated management silos (e.g., network,
    server) must be dismantled
  • Cross the political divide with Intelligent
    Finger-Pointing

60
What You Can Do Today
  • Pick one service or business process
  • Model it, understand what is normal, and notify
    upon exception
  • Document success, then move to the next
    service/process
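
The "understand what is normal, and notify upon exception" step can be sketched as a simple statistical baseline. This is only one possible approach, assuming response-time samples for the chosen service and a threshold of three standard deviations; the sample values, the threshold `k` and the function name are all illustrative assumptions.

```python
from statistics import mean, stdev


def is_exception(baseline, value, k=3.0):
    """Flag a measurement more than k standard deviations from normal."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    return abs(value - mu) > k * sigma


# Hypothetical response times (ms) observed while the service was healthy.
normal_response_ms = [100, 102, 98, 101, 99]

print(is_exception(normal_response_ms, 101))  # within the normal range
print(is_exception(normal_response_ms, 250))  # would trigger a notification
```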

61
What to Look for in a Service Management Solution
  • Service Dashboard
  • Distributed architecture with role-based views
    that integrate Fault and Performance data grouped
    by customer or business process
  • Predict/Prevent Problems
  • Intelligent thresholds that notify of impending
    SLA violation
  • Automated Response Time Monitoring
  • Know about Problems and their Root Cause
  • Sophisticated notification via email, phone, etc.
    with automated escalation and impact analysis
  • Root Cause Analysis and cross-silo event
    correlation leveraging both rules-based and
    model-based reasoning
  • Fix Problems
  • Recommend corrective actions and integrate
    configuration management
  • Verify and Validate SLAs
  • Reports on service inventory, capacity,
    availability and performance

62
Case Study: WAN Savings
  • Verify SLAs with your service providers
  • Ensure you have not purchased too much or too
    little bandwidth
  • A retail customer is saving more than $100,000
    per month

63
Practical Advice
  • Seek out business goals
  • Define IT services related to the business
  • Put instrumentation in place
  • Link IT infrastructure status and performance
    with business-oriented IT services
  • Measure service levels in business terms, set
    objectives, predict results, publish reports,
    offer service-level guarantees

64
Manage What Matters
  • Who is Aprisma?
  • Aprisma's SPECTRUM software manages the health
    and performance of networks and the business
    services that rely on them
  • What is SPECTRUM?
  • Reduces the number of tools required to
    accomplish management tasks
  • Automatically discovers and understands the
    relationships between IT infrastructure elements,
    services and customers
  • Proactively predicts problems
  • Intelligently isolates problems to the root cause
  • Understands what customers and services are
    impacted
  • Presents integrated, actionable information
  • Enables you to quickly fix the problem
  • Prevents problems from happening again in the
    future

65
1,000 Customers in 40 Countries
66
THANK YOU!
67
Network Services Management
  • www.openview.hp.com

68
Network Services Management Trends
  • Business trends
  • Organizations demand more capabilities from
    existing investments to maximize ROI
  • Investments focused at building foundation for
    future implementations
  • Cost minimization with acceptable service level
  • Network technology trends
  • Voice, data and video over IP
  • IP routing evolution
  • Built-in redundancy/resiliency
  • Management trends
  • Move beyond managing devices to understand
    complex relationships between network components
  • Need to understand network degradation risk as
    much as network faults
  • Need to understand impact of network outages on
    higher level business/IT services

69
Customer Challenges / Needs
  • Complexity
  • Chaotic, unpredictable traffic patterns
  • Volumes of management data
  • Emerging dynamic network technologies, e.g.,
    redundant routing protocols, IP Telephony
  • Cohesiveness
  • Getting a cohesive, integrated view of the entire
    network and its relationships and interactions
    with business processes
  • Cost pressures
  • Faster deployment
  • Fewer resources
  • Lack of special skills
  • Predict problems before they arise
  • Software that understands the network so I don't
    have to
  • Need to turn the data into usable information
  • Tie business services to network activity
  • Give network operators tools that make them more
    effective
  • Provide automation and out-of-the-box
    functionality
  • Meaningful, intelligent root cause analysis
  • Broad device and protocol support
  • Fast ROI

70
Network Services Protocols
Network Services Interdependencies
[Figure: Business Services over network services and protocols (OSPF, BGP, L2/L3, VLAN, VPN), ranging from highly dynamic to fairly static]
71
Network Services Management Goals
  • Adapt to dynamic networks
  • Support IP routing redundancy
  • Adjust to changes in network rapidly
  • Dramatically reduce TCO
  • Fewer servers for net mgt. env.
  • Fewer operators for env.
  • Integrate fault and performance
  • Performance data as part of RCA
  • Topology data to drive more useful performance
    reporting
  • Common config of managed objects
  • Fastest Mean-Time-To-Repair
  • Greatest range of data sources (events AND topo,
    perf, path, etc.)
  • Prioritization based on biz impact
  • Network services management: service-centric
    network mgt.
  • Out-of-the-box mgmt. for specific network
    services through Net Mgmt. Smart Plug-ins
  • Impact mgt. of NW on higher level apps and effect
    of cfg changes on NW failures

72
Root-cause Analysis: Intelligent Diagnostics for
Networks
73
Cisco HSRP
What is it? Hot Standby Router Protocol (HSRP) is
a widely used Cisco protocol that provides
fail-over for routers supporting critical
business services. It is very complex and is
essentially a specialized, virtualized network
service in which each router in a group shares
common things such as an IP address.
The Problem: Due to HSRP's many configurations and
its resiliency, problems are not easy to diagnose.
Furthermore, since it is designed to function
under some degree of failure, RCA is often not
enough: the operator needs to know if the service
is still working and if it's at risk of further
failure. Since operators usually triage problems,
an HSRP problem might be delayed.
74
Cisco HSRP
  • The Solution: NNM Advanced Edition 7 PLUS the NNM
    Advanced Routing SPI deliver the most
    comprehensive diagnosis available today for
    Cisco HSRP environments. Operators will be able
    to:
  • understand which routers are participating in
    HSRP
  • which HSRP routers are in common groups
  • which routers are currently the primary and
    secondary routers
  • understand high-level RCA HSRP events
  • drill down into dynamic views to further
    understand the interrelationships between all
    network infrastructure supporting HSRP

75
Cisco HSRP
Intelligent Diagnostics for Networks
Active Problem Analysis for Cisco HSRP!
Root Cause HSRP Events
Intelligent messages to operator
Understand the STATE of your services
Warning: HSRP router down; standby now active
  • Contextual launch of Detailed Views in the
    neighborhood of HSRP failure
  • Is it OK/CRITICAL?
  • What's the risk of future failure?

76
Network Performance Brown-Outs
What is it? Network Brown-Outs are conditions
that significantly affect performance between
points in the network.
The Problem: A customer is concerned about network
performance in a specific region of the core
network. Both client-server and Web-based
applications, as well as the remote backups, use
this common path along the network. Performance
varies widely and some applications are timing
out for users at peak times. The customer needs
to be able to identify the problems related to
congestion, not just hard faults, while they
redesign their network for better handling of
high traffic loads.
77
Network Performance Brown-Outs
The Solution: NNM Advanced Edition 7 PLUS
OpenView Performance Insight for Networks
delivers the most comprehensive network fault
and performance solution available. Customers
need to distinguish between hard network
faults and temporary performance Brown-Outs.
They can limit unnecessary short-term remedy
efforts while they collect data for the
longer-term redesign of their network. NNM AE 7
includes the previously stand-alone Problem
Diagnosis (PD) product, which has been enhanced
to identify performance problems along a network
path. In addition, PD is also more tightly
integrated with NNM ET, utilizing Layer 2
topology for its path views. OVPI now includes
thresholding and is integrated with NNM. It
will provide much of the data required for the
network redesign.
78
Network Performance Brown-Outs
The Solution
  • NNM receives threshold events from PD's probes;
    hence it's not a hard fault!
  • PD discovers congestion here!
  • OVPI collects on interfaces between these devices

[Figure: Clients, Servers, and two PD Probes]
79
Adjacent Device Failure Analysis
What is it? When a device, port or cable fails,
other network equipment in the neighborhood is
affected and emits events (symptoms). In
addition, some parts of the network may become
unreachable.
The Problem: When a failure happens, a number of
different alarms and polling failures will be
detected. If these are not correlated, the user
will get a number of failures in an alarm
browser. Since these failures can happen on
different network devices, they will clutter up
the operator's alarm browser and distract from
focusing on the actionable fault.
80
Adjacent Device Failure Analysis
  • The Solution: NNM Advanced Edition 7 includes
    the Active Problem Analyzer, which effectively
    correlates related failures and presents the
    root cause as the primary event in the
    operator's browser.
  • uses ET's Layer 2 topology to understand the
    neighborhood of devices in the network,
    including network designs with redundancy
  • monitors IP addresses for reachability as well
    as device interfaces for availability
  • in the event of failure, does active polling
    of neighboring devices to help in diagnosis
  • correlates all symptoms from the neighborhood of
    the failure under the root cause
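
The idea of grouping neighborhood symptoms under a root cause can be illustrated with a small reachability walk over the topology. This is a generic sketch and not the Active Problem Analyzer's actual algorithm: the topology, the device names and the `find_root_causes` function are hypothetical. It treats an unresponsive device as a root cause when it is adjacent to a device the management station can still reach, and treats every other unresponsive device as a downstream symptom.

```python
from collections import deque


def find_root_causes(topology, station, unresponsive):
    """Split unresponsive devices into root causes and downstream symptoms."""
    reachable = {station}
    queue = deque([station])
    root_causes = set()
    while queue:
        node = queue.popleft()
        for neighbor in topology.get(node, ()):
            if neighbor in unresponsive:
                # First failing device met on a path from the station.
                root_causes.add(neighbor)
            elif neighbor not in reachable:
                reachable.add(neighbor)
                queue.append(neighbor)
    symptoms = set(unresponsive) - root_causes
    return root_causes, symptoms


# Hypothetical topology: the station reaches three nodes only through Switch V.
topology = {
    "Station": ["Switch Z"],
    "Switch Z": ["Station", "Switch V"],
    "Switch V": ["Switch Z", "Node C", "Node D", "Node G"],
    "Node C": ["Switch V"],
    "Node D": ["Switch V"],
    "Node G": ["Switch V"],
}
unresponsive = {"Switch V", "Node C", "Node D", "Node G"}

root, symptoms = find_root_causes(topology, "Station", unresponsive)
print(sorted(root), sorted(symptoms))
```

With this input, only the switch is reported as the actionable fault and the three node timeouts are grouped beneath it.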

81
Pinpoint Failure Analysis: An Example, Edge
Switch Down
  • Desired Root Cause: Switch V down
  • Secondary: Interface Z-23 down
  • Impact: C, D, G unreachable
  • Symptoms:
  • Switch Z reports Z-23 down
  • Switch Z emits Linkdown
  • V ICMP/SNMP timeout
  • STP Root Convergence Count
  • STP Root changes
  • C, D, G ping/SNMP timeouts
  • FALSE Symptoms:
  • Intermittent ping/SNMP timeouts

[Figure: network diagram with numbered nodes 1-30 across VLAN1 and VLAN2, showing the NNM Station (Advanced Edition), Switch Z, Switch V, and Nodes C, D and G; callouts: "Fastest MTTR", "Adapt to dynamic networks", "Don't generate Events for this Area!"]
82
Case Study
Commonwealth of Massachusetts: NNM with Extended
Topology, HSRP, Frame Relay
  • 50,000 end users
  • 170 agencies
  • 8 centralized enterprise applications
  • 1,500 network devices including switches
  • ATM backbone

"We're connecting 400 Frame Relay sites with 1,200
PVCs. When a connection failure occurs, network
faults were overloading the operators. 90 percent
of the time the source of the problem was within
the Frame Relay service provider's network, but
we still needed to hash through the volume of
data to isolate the problem. HP OpenView NNM with
the HP OpenView NNM SPI for Frame Relay enables us
to quickly identify the source of the problem,
what is impacted, and who is responsible. The
bottom line is our department now provides our
customers with exceptional WAN Services, which is
absolutely critical for our distributed
business."
Richard Glasberg, Commonwealth of Massachusetts
83
OpenView Network Node Manager (NNM) 7 Starter
and Advanced Editions
New product packaging/pricing allows increased
flexibility for you to buy the product that fits
your needs
NNM Advanced Edition 7: Designed for all sizes of
networks requiring advanced network management,
including switches/VLANs, sophisticated
root-cause analysis, and distributed environments
for large networks spanning multiple
sites/departments. This is the platform for
even greater advanced capabilities delivered
through the NNM Smart Plug-ins.
NNM Starter Edition 7: Entry-level product
designed for smaller networks (250-500 nodes)
needing basic network management (primarily Layer
3 routers/hubs) from a single management station.
84
Top 3 Reasons to Purchase NNM 7
  • Reduced Mean Time to Repair
  • Advanced Intelligent diagnostics for networks
  • Event reduction through intelligent filters and
    correlators
  • Layer 2 discovery and root cause analysis
  • Enhanced Correlation Composer
  • Intelligent multi-threaded Poller and State
    Analysis
  • Dynamic views
  • Management of duplicate IP addresses from single
    station
  • Optimize your investment
  • Higher scalability to 30,000 nodes per station
  • Fewer servers required
  • NNM SPIs for specific technologies
  • New device and protocol support
  • Efficiency of resources
  • Up to 99% reduction in noise events; focus on the
    critical 1% (up to 14x improvement over NNM 6)
  • Intuitive new GUI for any operator level

Optimize your total cost of ownership and
operation
85
Five Recommendations for Optimized Management of
the IT Infrastructure
  • Jim Metzler
  • Vice President - Ashton, Metzler & Associates
  • jim_at_ashtonmetzler.com

86
Recommendations
  • Metrics
  • Applications
  • Service Management
  • Bandwidth Optimization
  • Plan Holistically

87
Two Key Premises
  • It is difficult to improve the performance of a
    dynamic system if you do not have some metrics
    that describe the performance of that system.
  • It is difficult to get credit for improving the
    performance of a dynamic system if you do not
    have some metrics that describe the improved
    performance of that system.

88
Recommendation: Metrics
  • Metrics are also necessary to set the
    expectations of your stakeholders. Stakeholders
    include:
  • Business and Functional Managers
  • Applications Developers
  • Customers and Partners
  • Companies should set goals for the performance
    (e.g., response time, availability) of at least a
    few key applications. These application goals
    can then be used to establish goals for the
    network.
  • In some cases, the network performance goals
    should also include packet loss and
    jitter.

89
Recommendation: Metrics
  • Organizations should also establish some cost
    metrics. However, being measured by the
    bottom-line cost of IT will generally not lead to
    success.
  • Total Cost = Unit Cost x the Number of Units
  • Possible unit cost metrics include:
  • Cost per minute
  • Cost per megabyte
  • A few companies are beginning to identify the IT
    costs associated with specific business
    functions, such as:
  • Booking a sales order
  • Responding to a customer service inquiry

90
Recommendations
  • Metrics
  • Applications
  • Service Management
  • Bandwidth Optimization
  • Plan Holistically

91
Recommendation: Applications
  • Investigate existing application models that
    describe the behavior of any major enterprise
    applications that your company either has
    deployed or is looking to deploy. These
    models are computer simulations of how the
    generic application will perform.
  • Develop the ability, either in-house or
    outsourced, to do applications profiling or
    benchmarking. This refers to testing the
    application as it will be implemented in your
    company.
  • Integrate the infrastructure organizations into
    the application development lifecycle. The idea
    is to profile key applications at various stages
    in the development lifecycle.

92
Recommendation: Applications
  • Implement some form of QoS to ensure that key
    enterprise applications (e.g., SAP, Oracle
    Financials) get the bandwidth they need to
    perform well.
  • Implement proactive monitoring and management.
    Continually monitor the infrastructure looking
    for deteriorating conditions with the tools and
    processes in place to respond to these conditions
    before they become more serious.
  • Evaluate applications management tools,
    particularly ones that can manage n-tier
    applications as well as applications that utilize
    Web services.

93
Recommendations
  • Metrics
  • Applications
  • Service Management
  • Bandwidth Optimization
  • Plan Holistically

94
Recommendation: Getting Started with Service
Management
  • Stabilize infrastructure.
  • Assign senior IT resources that act as interfaces
    to the key business and functional organizations.
  • Define a few services and the customers for those
    services.
  • Develop a business model: Service Delivery,
    Management, Review and Modification.
  • Develop linked service level agreements, both
    internal and external.
  • Note: Some companies are beginning to develop
    multi-tiered SLAs. For example, two WAN services,
    one that is based on the Internet and one that
    is not.

95
A Multi-layered Approach to Service Level
Management
96
External SLAs with Service Providers
  • When negotiating an SLA with service providers,
    it is important to focus on how they define the
    metrics. Even simple concepts such as
    availability have many different interpretations.
  • In general, the metrics in most SLAs are averages
    of averages. For example, the way most service
    providers calculate availability is averaged over
    all of the hours of the month, further averaged
    over all the sites in the network.
  • In most cases, the standard remedies for missing
    an SLA are at best weak. One option is to
    negotiate a progressive credit structure whereby
    the credits increase each month that the service
    provider fails to meet the SLA.
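
The "averages of averages" point can be made concrete with a small calculation. The sketch below is illustrative only: the 100-site network, the 30-day month, and the single 24-hour outage are assumed numbers, not figures from the presentation.

```python
HOURS_IN_MONTH = 720  # assumption: a 30-day month


def site_availability(outage_hours, hours=HOURS_IN_MONTH):
    """Availability of one site, averaged over all hours of the month."""
    return 1.0 - outage_hours / hours


def network_availability(outage_hours_by_site):
    """Per-site availability, further averaged over all sites."""
    per_site = [site_availability(h) for h in outage_hours_by_site]
    return sum(per_site) / len(per_site)


# Hypothetical month: 99 sites with no outage, one site down for a full day.
outages = [0.0] * 99 + [24.0]
worst_site = min(site_availability(h) for h in outages)
reported = network_availability(outages)
print(f"worst site: {worst_site:.2%}, reported network average: {reported:.2%}")
```

Under these assumptions the one affected site sees roughly 96.7% availability while the network-wide figure stays near 99.97%, which is why the definition of the metric matters in the negotiation.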

97
Recommendations
  • Metrics
  • Applications
  • Service Management
  • Bandwidth Optimization
  • Plan Holistically

98
Recommendation: Bandwidth Optimization
  • Determine when techniques such as header
    compression make sense.
  • Evaluate the deployment of compression
    techniques, including techniques more
    sophisticated than the traditional dictionary
    compression algorithms.
  • International links are one candidate for
    compression. Another is 56K links that are
    approaching exhaustion.
  • Evaluate the deployment of caching for existing
    applications. Also, ensure that an analysis of
    caching is part of the systems development cycle.

99
Recommendations
  • Metrics
  • Applications
  • Service Management
  • Bandwidth Optimization
  • Plan Holistically

100
Recommendation: Plan Holistically
  • To maximize performance and minimize cost, a
    company's IT infrastructure (i.e., LAN, WAN, Data
    Center) must be planned holistically.
  • For example, assume that a company is deploying
    SAP and that the goal is that SAP will be
    available 99.9% of the time.
  • Assume that the factors that drive an outage in
    one component of the infrastructure (e.g., a
    fiber cut on an access circuit) do not impact the
    availability of the other components.
  • To meet the availability goal for SAP, have the
    LAN, WAN and Data Center each designed to have
    99.97% availability.

101
Recommendation: Plan Holistically
  • However, if one component of the infrastructure
    is only designed for 99.9% availability, the
    other two components of the infrastructure would
    have to be designed for 100% availability to meet
    the availability goal for SAP.
  • Even worse, if one component of the
    infrastructure is designed for something less
    than 99.9% availability, it is impossible to meet
    the availability goal for SAP no matter how much
    money the company spends increasing the
    availability of the other two components of the
    infrastructure.
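
The availability arithmetic on these two slides can be checked with a few lines of code. This is a minimal sketch that assumes, as the slides do, that LAN, WAN and Data Center fail independently and that SAP needs all three; the function names are illustrative.

```python
def serial_availability(components):
    """End-to-end availability when every component is required
    and outages are independent: the product of the parts."""
    total = 1.0
    for availability in components:
        total *= availability
    return total


def required_per_component(target, count):
    """Equal per-component availability needed to reach the target."""
    return target ** (1.0 / count)


GOAL = 0.999  # 99.9% availability goal for SAP

balanced = serial_availability([0.9997, 0.9997, 0.9997])
one_at_goal = serial_availability([0.999, 1.0, 1.0])
one_below_goal = serial_availability([0.998, 1.0, 1.0])

print(f"three components at 99.97%: {balanced:.5f}")
print(f"per-component minimum:      {required_per_component(GOAL, 3):.5f}")
print(f"one at 99.9%, two perfect:  {one_at_goal:.5f}")
print(f"one at 99.8%, two perfect:  {one_below_goal:.5f}")
```

Three components at 99.97% multiply out to just over 99.9%, while a single component below 99.9% caps the end-to-end figure below the goal even if the other two never fail.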

102
Thank You!!
Title: How the Mandate to Perform is Shaping Network Management Now


1
How the Mandate to Perform is Shaping Network
Management Now
  • Jim Metzler
  • Vice President - Ashton, Metzler & Associates
  • jim_at_ashtonmetzler.com

2
Agenda
  • Background
  • VoIP
  • Data Applications
  • Service Management
  • Bandwidth Optimization

3
Goals and Non-Goals
  • Goals
  • To identify the changing value proposition of
    network and systems management
  • To outline an approach to some of the key
    topics in network and systems management
  • Non-Goals
  • To be unduly prescriptive
  • To read every bullet on every slide

4
Key Premises
  • Over the past five years, there have been two
    waves of sensationalism relative to the value of
    IT. The first wave was the Y2K non-event. The
    second wave was the .com hysteria.
  • Now there is a reaction to that sensationalism,
    i.e., Carr's "IT Doesn't Matter."
  • To successfully demonstrate the value of IT, we
    must continually run IT as a utility while
    simultaneously finding ways for IT to add
    business value.

5
Business Managers are More Likely to See Value in
Applications than in the Infrastructure
Source: Ashton, Metzler & Associates
6
Importance of Showing the Linkage Between the
Infrastructure and Applications
Source: Ashton, Metzler & Associates
7
The Most Important Management Tasks
  • Performance
  • Application
  • Service Level
  • Configuration
  • Fault
  • Capacity

Source: Ashton, Metzler & Associates
8
How Value Plays Out on the Technology
Adoption Life Cycle
[Figure: technology adoption life cycle showing Innovators, Early Adopters, the chasm, Early Majority, Late Majority, and Laggards]
Source: Geoffrey Moore
9
Agenda
  • Background
  • VoIP
  • Data Applications
  • Service Management
  • Bandwidth Optimization

10
Benefits of IP Telephony in Descending Order of
Importance
  • A Year Ago:
  • Cheaper calls between company sites
  • VoIP systems cheaper to administer and upgrade
  • Easier to deploy new integrated apps
  • Able to deploy voice functionality such as ACDs,
    or three-way calls
  • Cheaper international calls
  • Significant drop in the cost of
    moves/adds/changes (M/A/C)
  • Now:
  • Easier to deploy new integrated apps
  • Able to deploy voice functionality such as ACDs,
    or three-way calls
  • Significant drop in the cost of
    moves/adds/changes (M/A/C)
  • Cheaper calls between company sites
  • VoIP systems cheaper to administer and upgrade
  • Cheaper international calls

11
Current Inhibitors of VoIP Deployment
  • First Tier: Lack of budget; systems for
    managing and troubleshooting VoIP quality
  • Second Tier: Concerns about security; concerns
    about interoperability
  • Third Tier: Benefits of VoIP are not compelling
    enough; lack of people to plan, design,
    implement and manage VoIP; concerns about E-911;
    concerns about whether or not QoS technologies
    are ready for production

12
Managing VoIP Performance
  • Network Requirements
  • One-way delay of around 200 ms or less
  • Packet loss of 1% or less
  • Jitter of 25 ms or less
  • Defining Jitter
  • Mean delay over some time period
  • Median delay over some time period
  • Degraded mean or median delay over some time
    period

13
Defining Jitter
  • Jitter is thought of as the variation in delay.
  • RFC 1889 defines jitter for the Real Time
    Transport Protocol (RTP).
  • $D(i-1, i)$ is the difference in the transit time
    for two consecutive packets sent between two
    devices.
  • $|D(i-1, i)|$ is the absolute value of
    $D(i-1, i)$. This means that a positive variation
    in delay and a negative variation in delay are
    treated the same.
  • $J \leftarrow J + (|D(i-1, i)| - J)/16$, where $J$
    is the running jitter estimate.
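
The RFC 1889 running estimate described on this slide can be sketched in a few lines. The function name and the sample timestamps are illustrative assumptions; only the update rule (move the estimate 1/16 of the way toward the latest absolute transit-time difference) comes from the RFC.

```python
def rtp_jitter(send_times_ms, recv_times_ms):
    """Return the running jitter estimate after the last packet.

    The transit time of packet i is recv - send. D(i-1, i) is the change
    in transit time between two consecutive packets.
    """
    jitter = 0.0
    for i in range(1, len(send_times_ms)):
        transit_prev = recv_times_ms[i - 1] - send_times_ms[i - 1]
        transit_curr = recv_times_ms[i] - send_times_ms[i]
        d = transit_curr - transit_prev
        # Move the estimate 1/16 of the way toward |D(i-1, i)|.
        jitter += (abs(d) - jitter) / 16.0
    return jitter


# Hypothetical timestamps in ms: the third packet takes 5 ms longer in transit.
send = [0, 20, 40]
recv = [50, 70, 95]
print(rtp_jitter(send, recv))  # 0.3125
```

Because only differences in transit time are used, a constant clock offset between sender and receiver cancels out, and the 1/16 gain smooths the estimate so that a single late packet moves it only slightly.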

14
Agenda
  • Background
  • VoIP
  • Data Applications
  • Service Management
  • Bandwidth Optimization

15
Application Degradation
  • 62% of companies in general, and 85% of large
    companies in particular, have experienced
    incidents of significant application performance
    degradation.
  • The effects of application performance
    degradation include lost employee productivity,
    lost revenue, and a reduction in customer
    service.
  • One common technique that companies plan to use
    to ensure application performance is the same
    technique that has been used for years: throw
    bandwidth at the problem.

Source: Network World
16
Representative Enterprise Application: SAP
  • The compute component of SAP is comprised of
    Application Server(s), Database Server(s), as
    well as a client running software that is
    sometimes known as SAPGUI.

Database Server
Application Servers
17
Understanding the Compute Component of SAP
Requires Understanding
  • The performance of all of the servers (e.g., Sun,
    HP) and operating systems (e.g., Windows, UNIX,
    Linux) that are involved.
  • The type of database being used, e.g., Oracle,
    IBM DB2, Informix.
  • The particular SAP application(s) being used,
    e.g., the Sales and Distribution application.
  • The particular sub-applications and transactions:
    creating orders? Updating orders?
  • Other tasks that are also being run, e.g.,
    maintenance.

18
Impact of Complex Compute Environments
  • In one test of SAP performance, the response
    time as seen by the user was one tenth of a
    second. However, when the test was re-run with
    two compute-bound processes on the client
    machine, the response time was roughly 4 seconds.
  • In another test of SAP performance, a
    maintenance task that was periodically executing
    caused the response time to jump by several
    seconds.

N. Gloy, J. B. Chen, "Service Level Directed
Management of SAP R/3," Appliant

Y. Somin, "Digging into SAP R/3 for
Capacity Planning," BMC
19
Representative Enterprise Application Performance
20
Agenda
  • Background
  • VoIP
  • Data Applications
  • Service Management
  • Bandwidth Optimization

21
Service Management: The Link Between IT and
Business Managers
  • Concept taken from the SNA world
  • Could be explicit or implicit
  • Define and report on the services you offer
  • Develop a few key performance and unit cost
    metrics for that service

22
How Do IT Professionals Define a Service?
  • WAN Connectivity
  • DNS
  • Remote Access
  • Directory
  • PC Repair
  • Instant Messaging
  • Internet Access
  • Conferencing
  • Email
  • Voice

23
Tiered Model of Service Level Management
Business-based Service
Network Function
Network Management
NSP
HW Vendor
24
Approaches to Service Level Management
Source: Ashton, Metzler & Associates
25
Agenda
  • Background
  • VoIP
  • Data Applications
  • Service Management
  • Bandwidth Optimization

26
Sample WAN Upgrade Costs
  • An enterprise network has 100 branch offices that
    it connects with a hub-and-spoke frame relay
    network that has:
  • T-1 access links at each branch office.
  • 128 Kbps PVCs and 256 Kbps frame relay ports.
  • Due to performance issues, the company is
    considering upgrading the PVCs to 256 Kbps.
  • This would cost around $220 per month per office
    for a total of $22,000 per month, or roughly
    $800,000 over a three-year life cycle.
  • The company should first evaluate bandwidth
    optimization: compression, caching, bandwidth
    management.
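
The upgrade arithmetic on this slide is easy to reproduce. The sketch below uses the slide's own figures (100 offices, 220 per office per month, 36 months); the function name is illustrative.

```python
def upgrade_cost(offices, monthly_cost_per_office, months):
    """Return (total monthly cost, total cost over the life cycle)."""
    monthly_total = offices * monthly_cost_per_office
    return monthly_total, monthly_total * months


monthly, lifecycle = upgrade_cost(offices=100, monthly_cost_per_office=220, months=36)
print(f"per month: {monthly:,}; over three years: {lifecycle:,}")
```

The exact three-year figure is 792,000, which the slide rounds to roughly 800,000. That total is the budget against which any bandwidth optimization alternative would be compared.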

27
Compression
  • The basic function of compression is to reduce
    the size of the file that is to be sent over the
    WAN.
  • Most traditional compression tools (e.g.,
    WinZip) use a dictionary compression algorithm
    such as the Lempel-Ziv algorithm.
  • A dictionary compression algorithm replaces
    recurring strings of characters with shorter
    codes.
  • The compression ratio of a compression algorithm
    is the ratio of the input bytes to the output
    bytes. A 2:1 ratio implies 50% compression.
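
The relationship between compression ratio and percent reduction can be illustrated with Python's standard zlib module, whose DEFLATE format is a member of the Lempel-Ziv family. This is a sketch only: the sample payload is made up, and WAN compression products may use different algorithms.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Input bytes divided by output bytes."""
    return len(data) / len(zlib.compress(data))


def percent_reduction(ratio: float) -> float:
    """Share of the original bytes removed, e.g. 2:1 -> 50%."""
    return (1.0 - 1.0 / ratio) * 100.0


# Hypothetical, highly repetitive payload.
sample = b"GET /orders/status HTTP/1.1\r\nHost: erp.example.com\r\n\r\n" * 200

ratio = compression_ratio(sample)
print(f"ratio {ratio:.1f}:1, reduction {percent_reduction(ratio):.1f}%")
print(f"a 2:1 ratio is a {percent_reduction(2.0):.0f}% reduction")
```

Real traffic is far less repetitive than this sample, so the ratio printed here should not be read as typical of a WAN link.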

28
Compression Issues to Consider
  • At a hub site
  • With traditional approaches, the memory
    requirements can be an issue: the memory
    requirements of a dictionary compression
    algorithm grow linearly with the number of sites.
  • The CPU requirements also tend to grow linearly
    with the number of sites.
  • Some classes of traffic do not lend themselves to
    traditional compression techniques, e.g., traffic
    that is already compressed or that is encrypted,
    as encrypted traffic typically does not have
    repeated patterns.
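
The last point can be demonstrated with a short experiment. In this sketch, seeded pseudo-random bytes stand in for encrypted traffic (an assumption made so the example is repeatable), and zlib stands in for a traditional dictionary compressor.

```python
import random
import zlib


def ratio(data: bytes) -> float:
    """Input bytes divided by compressed output bytes."""
    return len(data) / len(zlib.compress(data))


# Repetitive text compresses well because patterns recur.
text = b"customer order record; " * 1000

# Stand-in for encrypted traffic: bytes with no repeated patterns.
rng = random.Random(0)
random_like = bytes(rng.getrandbits(8) for _ in range(20000))

print(f"repetitive text: {ratio(text):.1f}:1")
print(f"random-like:     {ratio(random_like):.3f}:1")
```

The random-like input comes out at about 1:1 or slightly worse, since the compressor adds its own framing without finding anything to replace.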

29
Header Compression
  • The IP/UDP/RTP headers associated with IP
    Telephony packets add 40 octets to each IP
    Telephony packet, driving the need for header
    compression.
  • Header Compression RFCs:
  • RFC 1144, Compressing TCP/IP Headers for
    Low-Speed Serial Links
  • RFC 2507, IP Header Compression
  • RFC 2508, Compressing IP/UDP/RTP Headers for
    Low-Speed Serial Links
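
A quick calculation shows why 40 octets of headers matter for voice. The sketch below assumes a 20-byte voice payload (typical of a low-bit-rate codec such as G.729 at 20 ms packetization) and the 2-octet best case that RFC 2508 describes for a compressed IP/UDP/RTP header; both values are illustrative assumptions rather than figures from the slide.

```python
IP_HEADER = 20   # octets, IPv4 without options
UDP_HEADER = 8   # octets
RTP_HEADER = 12  # octets, fixed RTP header


def header_overhead(payload_octets, header_octets):
    """Fraction of each packet that is header rather than voice."""
    return header_octets / (payload_octets + header_octets)


full_header = IP_HEADER + UDP_HEADER + RTP_HEADER  # 40 octets
payload = 20            # assumption: 20-byte voice payload
compressed_header = 2   # assumption: RFC 2508 best case

print(f"uncompressed: {header_overhead(payload, full_header):.0%} of each packet is header")
print(f"compressed:   {header_overhead(payload, compressed_header):.0%} of each packet is header")
```

With these assumptions, headers account for about two thirds of every uncompressed voice packet and under a tenth after header compression.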

30
Caching
  • Originally, cache referred to very fast memory
    inside of a mainframe computer. This type of
    cache is used to speed the memory access within a
    computer.
  • The motivation for deploying caching in the WAN
    is to accelerate the delivery of content, and
    optimize the use of WAN bandwidth by placing
    re-usable content close to the user.
  • When caching works well, it causes the WAN to
    perform as if it had received a bandwidth
    upgrade (i.e., bandwidth enhancement).

31
Web Caching: An Example
  • A user requests a Web page.
  • The network analyzes the request and decides to
    send the request to a local proxy cache.
  • If the local cache can fulfill the request, it
    does. If it cannot, it will send a request to
    the original Web server.
  • The Web server delivers the page to the local
    cache, which both stores the page and delivers it
    to the requesting user.
  • This works well if the content is static and the
    local proxy cache services multiple users with a
    need for the same content.
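
The request flow above can be sketched as a tiny in-memory proxy cache. Everything here is hypothetical (the class name, the stand-in origin server, the URL), and a real cache would also handle expiry and validation, which this sketch deliberately omits.

```python
class ProxyCache:
    """Serve a page locally if stored; otherwise fetch, store, deliver."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            self.hits += 1            # fulfilled locally, no WAN traffic
            return self._store[url]
        self.misses += 1
        page = self._fetch(url)       # request goes to the original server
        self._store[url] = page       # keep a copy for later requests
        return page


origin_requests = []


def origin(url):
    """Stand-in for the original Web server across the WAN."""
    origin_requests.append(url)
    return f"<html>content of {url}</html>"


cache = ProxyCache(origin)
for _user in range(5):
    cache.get("http://intranet.example.com/price-list")

print(f"hits={cache.hits} misses={cache.misses} WAN requests={len(origin_requests)}")
```

Five users asking for the same static page produce a single WAN request, which is the situation in which the slide says caching works well.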

32
Congestion => Jitter and/or Packet Loss
  • Example: Traffic from multiple input ports
    converges on a single output port, oversubscribing
    its bandwidth. The output port buffer fills,
    introducing buffer delay. The output buffer
    eventually overflows if congestion continues.

Data
Single Output Link
Switch or Router
Data
Voice
Video
33
Example: VoIP in a Congested Network
  • Example: Carrying delay-sensitive traffic, such
    as VoIP, on a congested network.
  • Assume that the switch's buffer capacity is 300
    1,518-byte packets (i.e., about 3.6 million bits)
    and that the switch buffer fills during the gap
    between voice packets.
  • The resulting delay is a function of the speed of
    the output link. For example:
  • Gigabit Ethernet port: 3.6 msec
  • Fast Ethernet port: 36 msec
  • Ethernet port: 360 msec
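
The buffer delay figures follow directly from the buffer size and the link speed. A minimal sketch, using the slide's 300 packets of 1,518 bytes and nominal link rates (the function name is illustrative):

```python
def buffer_drain_delay_ms(packets, packet_bytes, link_bps):
    """Time for a full output buffer to drain onto the link, in ms."""
    buffered_bits = packets * packet_bytes * 8
    return buffered_bits / link_bps * 1000.0


LINKS_BPS = {
    "Gigabit Ethernet": 1_000_000_000,
    "Fast Ethernet": 100_000_000,
    "Ethernet": 10_000_000,
}

for name, bps in LINKS_BPS.items():
    print(f"{name}: {buffer_drain_delay_ms(300, 1518, bps):.1f} msec")
```

A voice packet that arrives behind a full buffer waits roughly 3.6 msec on Gigabit Ethernet, about 36 msec on Fast Ethernet, and about 360 msec on 10 Mbps Ethernet, the last of which alone exceeds the 200 ms one-way delay budget cited earlier.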

34
Positioning Bandwidth Management
  • Bandwidth management manages network resources
    (i.e., bandwidth, queues) in such a way as to at
    least attempt to ensure that key applications
    perform well.
  • Bandwidth management is required in situations in
    which
  • Bandwidth is notably expensive.
  • There are one or more business critical
    applications whose performance would be
    negatively impacted by surges in traffic.
  • Note that some companies implement performance
    management in the LAN by over-provisioning the
    LAN. Over-provisioning the WAN is usually not
    economically feasible.
  • Bandwidth Management techniques include queuing,
    MPLS and potentially 802.1X.

35
Thank You!!
36
Adaptive Management: Aligning People, Processes
and Systems
  • www.voyence.com

37
Adaptive Management Redefines the Role of IT
  • Technology focus to business process focus
  • Pillars of expertise to cross-enterprise
    expertise
  • Static management to dynamic infrastructure
  • Reactive to proactive
38
Technology Focus to Business Focus
  • The IT organization or service provider becomes
    an integral part of providing the enterprise
    with
  • Dynamic infrastructure in line with business
    goals
  • The agility to shift resources to meet business
    priorities
  • Services based on business realities
  • Two way communication for better business
    decisions

Changing business needs
Business Services
IT Services
Network Services
39
Static Management to Dynamic Decisions
Adaptive Management provides a better context for
decision making utilizing network management tools
  • Business Services Management
  • IT Services Management: Uses business context to
    adjust resource utilization in response to
    changes in environmental conditions
  • Network Services Management: Provides up-to-date
    information to dynamically modify infrastructure
    aligned with business goals
  • Network Configuration Management: Provides a
    secure and centralized mechanism to effect
    network device configuration changes based on
    business priorities
40
Pillars of Expertise to Cross Enterprise Expertise
IT Management must be prepared to:
  • Analyze demand for resources: Understand, manage,
    and optimize the user experience while ensuring
    continuous and secure operations.
  • Adjust supply of and access to resources:
    Dynamically allocate resources and optimize
    access to automatically address changes in
    service demands.
[Figure labels: Adaptive Enterprise; Customers, Suppliers, Employees; Business Services; IT Management; Networks, Applications, Services; Devices, Servers, Software, Storage]
41
Reactive to Proactive
Adaptive Enterprise
Adaptive Management optimizes IT's value to the
enterprise
Network Services Management provides:
  • Automation of network deployment
  • Dynamic network change management
  • Integrated view of networks, devices and
    configurations
  • Design and deployment of network devices to
    support services
Business Services Management needs to:
  • Improve service levels
  • Respond to changes in business priorities
  • Integrate and manage infrastructure
  • Design and deploy infrastructure
IT Services Management provides:
  • Automation of service provisioning
  • Dynamic service provisioning
  • Integrated view of applications and servers
  • Design and deployment of application services
42
Steps to Adaptive Management
Enable the network to provide flexible, on demand
services in a controlled environment.
Integrate into adaptive management
infrastructure
Map network infrastructure to network services
Capture and control network infrastructure
Network Configuration Management
Time
43
Capture and Control Network Infrastructure
  • All services rely on the network
  • In order to create an adaptive environment, IT
    management needs
  • Control over network availability and security
  • Standardization of configuration
  • The ability to quickly shift resources
  • Dynamic network infrastructure
  • Automation of routine changes
  • Historical data
  • Reporting

Capture and control network infrastructure
44
Control Over Change Leads to Adaptive Environment
  • Network Configuration Management provides
  • Dynamic Resource Allocation
  • Standardized configuration of the network tied to
    business needs
  • Dynamically allocate resources based on business
    priorities
  • Decreased Mean Time to Repair
  • Roll back a device to a working state in a single
    click
  • Enhanced Network Reliability and Security
  • Comprehensive audit trail and restore
    capabilities for all devices
  • Increased Network Availability
  • Implement routing and security policies
    consistently to all elements

45
Map Infrastructure to Services
  • Deploy services quickly and error-free
  • End points and their service requirements for
    different applications, e.g., email, ERP, CRM
  • Knowledge of the related network components
  • Match physical network infrastructure to
    requirements
  • Provide Automated Provisioning
  • More progress today with IT infrastructure than
    network

Map network infrastructure to network services
46
Integrate into Adaptive Management
  • End to end asset and problem management including
    network
  • Improved Root Cause Analysis
  • Correlate faults and performance degradation with
    configuration change events
  • Prevent future outages by adjusting
    configuration policies
  • Mapping change events to problem events
  • Dynamic IT provisioning includes network
  • IT service requests drive network service
    provisioning

Integrate into adaptive management
infrastructure
47
Network Configuration Management: Necessary
Component of Adaptive Management
  • Voyence provides the first systemic approach to
    Network Configuration Management
  • Voyence provides software solutions to validate
    intended network changes and to verify that those
    changes result in an optimized network

Design Management
Network Resource Repository
Compliance Management
Change Management
48
Network Configuration Management
  • VoyenceControl! Advantage
  • Scalable architecture to manage tens of thousands
    of heterogeneous devices across multiple networks
  • Automation of network change management
  • Graphical view of enterprise-wide network
    configurations
  • What?, how?, where?
  • Logical view of infrastructure
  • Validation of changes during design process
  • Audit and archive of network configuration changes

49
Network Configuration Management
  • VoyenceControl! Benefits
  • Improved Network Availability
  • Reduce injected errors through automation,
    standardization, and process
  • 20-40% improvement in MTTR
  • Tighter Network Security
  • Single point of control for all network
    configuration changes
  • Automation of periodic security related updates
  • Audit trail of all network changes
  • Reduced Network Administration Costs
  • 10-80% reduction in network engineering costs
  • Recovery and re-deployment of stranded assets
  • Accurate capacity forecasting for budgeting and
    acquisition

50
Dynamic Network Configuration
Dynamic graphical or tabular workspace provides
the ability to verify and validate changes
and automate deployment.
51
Managing Your Network As A Business Service
  • www.aprisma.com

52
Agenda
  • The Choice You Will Make
  • The Vision
  • What You Can Do Today Case Study
  • Practical Advice
  • Aprisma's SPECTRUM Solution Suite

53
You Have A Hard Job
  • It's always the network's fault
  • Spiraling complexity and constant state of change
    as new applications, content, vendors and
    technologies are introduced
  • Complexity problem cannot be solved by continuous
    increase in people, time or money invested

54
The Choice You Will Make
  • Stay down in the network plumbing, managing
    your bits and bytes and jitter and latency, and
    become irrelevant and outsourced.
  • Rise up to understand the connection between
    your network components and business-relevant
    services, and become vital, even irreplaceable.
55
Network Management in CIOs' Top Technology
Priorities
Top Ten Technology Priorities, Gartner CIO Survey
2003
[Table: 2003 ranking (1-10) with columns for the 2002 and 2001 rankings, the 2006 forecast, and the average weighted score (10 max); the cell values are not recoverable from the transcript. Priorities in the order listed:]
  • Security enhancement tools
  • Applications integration/middleware/messaging
  • Enterprise portal deployment
  • Network infrastructure/management tools
  • Internal e-enabling infrastructure
  • Web design, development and content management
    tools
  • Storage management (SAN, NAS) deployment
  • Customer Relationship Management (CRM)
  • Web services, internal or external
  • Deploying XML-based processes/messaging
56
Why Network Management Remains Relevant
  • The network is the foundation of any electronic
    business service or transaction
  • On Demand, Real Time, Grid Computing and Web
    Services don't work without the network
  • Keeps internal / external providers and vendors
    honest by validating their performance against
    SLAs
  • An insurance policy against the cost of downtime
    or reduced performance

57
The Vision
58
The Vision
  • Define, Document and Discover Services
  • Unified view across all silos of IT
    infrastructure from the end-user perspective
    and business relevance
  • Understand Relationships
  • Physical and Logical Topology
  • Impact Analysis and Prioritization
  • Root Cause Analysis
  • Infrastructure Services
  • Services Customers
  • Verify and Validate Quality
  • Real Time, Historical, Predictive

59
People, Process and Politics, then Products
  • Proactive, responsive and skilled people that
    communicate and present a unified face to the
    customer
  • Automated, integrated workflow and escalation
  • Isolated management silos (e.g., network,
    server) must be dismantled
  • Cross the political divide with Intelligent
    Finger-Pointing

60
What You Can Do Today
  • Pick one service or business process
  • Model it, understand what is normal, and notify
    upon exception
  • Document success, then move to the next
    service/process
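
One simple way to "understand what is normal, and notify upon exception" is to derive a threshold from a baseline of past measurements. The sketch below is only one possible reading of that advice: the mean-plus-three-standard-deviations rule, the function name, and the response-time samples are all assumptions for illustration.

```python
from statistics import mean, stdev


def find_exceptions(baseline, observations, k=3.0):
    """Return observations above mean + k standard deviations of the baseline."""
    upper_limit = mean(baseline) + k * stdev(baseline)
    return [value for value in observations if value > upper_limit]


# Hypothetical response times in ms for one service during normal operation.
baseline_ms = [100, 105, 98, 102, 101, 99, 103, 100]

# New measurements to check against the model of "normal".
latest_ms = [101, 104, 180, 99]

print(find_exceptions(baseline_ms, latest_ms))  # [180]
```

Starting with one service keeps the baseline small enough to sanity-check by hand before the same approach is applied to the next service or process.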

61
What to Look for in a Service Management Solution
  • Service Dashboard
  • Distributed architecture with role-based views
    that integrate Fault and Performance data grouped
    by customer or business process
  • Predict/Prevent Problems
  • Intelligent thresholds that notify of impending
    SLA violation
  • Automated Response Time Monitoring
  • Know about Problems and their Root Cause
  • Sophisticated notification via email, phone, etc.
    with automated escalation and impact analysis
  • Root Cause Analysis and cross-silo event
    correlation leveraging both rules-based and
    model-based reasoning
  • Fix Problems
  • Recommend corrective actions and integrate
    configuration management
  • Verify and Validate SLAs
  • Reports on service inventory, capacity,
    availability and performance

62
Case Study: WAN Savings
  • Verify SLAs with your service providers
  • Ensure you have not purchased too much or too
    little bandwidth
  • A retail customer is saving more than $100,000
    per month

63
Practical Advice
  • Seek out business goals
  • Define IT services related to the business
  • Put instrumentation in place
  • Link IT infrastructure status and performance
    with business-oriented IT services
  • Measure service levels in business terms, set
    objectives, predict results, publish reports,
    offer service-level guarantees

64
Manage What Matters
  • Who is Aprisma?
  • Aprisma's SPECTRUM software manages the health
    and performance of networks and the business
    services that rely on them
  • What is SPECTRUM?
  • Reduces the number of tools required to
    accomplish management tasks
  • Automatically discovers and understands the
    relationships between IT infrastructure elements,
    services and customers
  • Proactively predicts problems
  • Intelligently isolates problems to the root cause
  • Understands what customers and services are
    impacted
  • Presents integrated, actionable information
  • Enables you to quickly fix the problem
  • Prevents problems from happening again in the
    future

65
1,000 Customers in 40 Countries
66
THANK YOU!
67
Network Services Management
  • www.openview.hp.com

68
Network Services Management Trends
  • Business trends
  • Organizations demand more capabilities from
    existing investments to maximize ROI
  • Investments focused at building foundation for
    future implementations
  • Cost minimization with acceptable service level
  • Network technology trends
  • Voice, data and video over IP
  • IP routing evolution
  • Built-in redundancy/resiliency
  • Management trends
  • Move beyond managing devices to understand
    complex relationships between network components
  • Need to understand network degradation and risk
    as much as network faults
  • Need to understand impact of network outages on
    higher level business/IT services

69
Customer Challenges / Needs
  • Complexity
  • Chaotic, unpredictable traffic patterns
  • Volumes of management data
  • Emerging dynamic network technologies, e.g.,
    redundant routing protocols, IP Telephony
  • Cohesiveness
  • Getting a cohesive, integrated view of the entire
    network and its relationships and interactions
    with business processes
  • Cost pressures
  • Faster deployment
  • Fewer resources
  • Lack of special skills
  • Predict problems before they arise
  • Software that understands the network so I don't
    have to
  • Need to turn the data into usable information
  • Tie business services to network activity
  • Give network operators tools that make them more
    effective
  • Provide automation and out-of-the-box
    functionality
  • Meaningful, intelligent root cause analysis
  • Broad device and protocol support
  • Fast ROI

70
Network Services: Protocols and Interdependencies
[Diagram: Business Services depend on network services and protocols (BGP, OSPF, VPN, VLAN, L2/L3), which range from highly dynamic to fairly static.]
71
Network Services Management Goals
  • Adapt to dynamic networks
  • Support IP routing redundancy
  • Adjust to changes in network rapidly
  • Dramatically reduce TCO
  • Fewer servers for net mgt. env.
  • Fewer operators for env.
  • Integrate fault and performance
  • Performance data as part of RCA
  • Topology data to drive more useful performance
    reporting
  • Common config of managed objects
  • Fastest Mean-Time-To-Repair
  • Greatest range of data sources (events AND topo,
    perf, path, etc.)
  • Prioritization based on biz impact
  • Network services management: service-centric
    network mgt.
  • Out-of-the-box mgmt. for specific network
    services through Net Mgmt. Smart Plug-ins
  • Impact mgt. of NW on higher level apps and effect
    of cfg changes on NW failures

72
Root-cause Analysis: Intelligent Diagnostics for
Networks
73
Cisco HSRP
What is it? Hot Standby Router Protocol (HSRP) is a
widely used Cisco protocol that provides
fail-over for routers supporting critical
business services. It is complex: in essence, it is
a specialized virtualized network service in which
the routers in a group share common attributes
such as an IP address.
The Problem: Because HSRP has many configurations and
is resilient by design, problems are not easy to
diagnose. Furthermore, since it is designed to function
under some degree of failure, RCA is often not
enough: the operator needs to know if the service
is still working and if it's at risk of further
failure. Since operators usually triage problems,
handling of an HSRP problem might be delayed.
74
Cisco HSRP
  • The Solution: NNM Advanced Edition 7 PLUS the NNM
    Advanced Routing SPI deliver the
    most comprehensive diagnosis available today for
    Cisco HSRP environments. Operators will be able
    to
  • understand which routers are participating in
    HSRP
  • which HSRP routers are in common groups
  • which routers are currently the primary and
    secondary routers
  • understand high-level RCA HSRP events
  • drill down into dynamic views to further
    understand the interrelationships between all
    network infrastructure supporting HSRP

75
Cisco HSRP
Intelligent Diagnostics for Networks
Active Problem Analysis for Cisco HSRP!
Root Cause HSRP Events
Intelligent messages to operator
Understand STATE of your services
Warning: HSRP router down; standby now active
  • Contextual launch of Detailed Views in the
    neighborhood of HSRP failure
  • Is it OK/CRITICAL?
  • What's the risk of future failure?

76
Network Performance Brown-Outs
What is it? Network Brown-Outs are conditions
that significantly affect performance between
points in the network.
The Problem: A customer is concerned about network
performance in a specific region of the core network.
Both client-server and Web-based applications use
this common path along the network, as do the
remote backups. Performance varies widely, and
some applications are timing out for users at
peak times. The customer needs to be able to
identify the problems related to congestion, not
just hard faults, while they redesign their
network for better handling of high traffic loads.
77
Network Performance Brown-Outs
The Solution: NNM Advanced Edition 7 PLUS
OpenView Performance Insight for Networks (OVPI)
delivers the most comprehensive network fault
and performance solution available. Customers
need to distinguish between hard network
faults and temporary performance Brown-Outs.
They can limit unnecessary short-term remedy
efforts while they collect data for the longer-term
redesign of their network.
NNM AE 7 includes the previously stand-alone Problem
Diagnosis (PD) product, which has been enhanced to
identify performance problems along a network
path. In addition, PD is more tightly integrated
with NNM ET, utilizing Layer 2 topology for its
path views.
OVPI now includes thresholding and is integrated
with NNM. It will provide much of the data required
for the network redesign.
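To make the distinction between hard faults and brown-outs concrete, here is a minimal Python sketch. It is not how NNM, PD or OVPI implement thresholding; the `PathSample` fields, the `classify_path` function and the threshold values are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class PathSample:
    reachable: bool      # did the far end answer at all?
    latency_ms: float    # measured round-trip time
    loss_pct: float      # measured packet loss, in percent


def classify_path(sample, max_latency_ms=200.0, max_loss_pct=2.0):
    """Return 'hard fault', 'brown-out' or 'ok' for one measurement.

    The thresholds are illustrative defaults, not recommended values.
    """
    if not sample.reachable:
        return "hard fault"
    if sample.latency_ms > max_latency_ms or sample.loss_pct > max_loss_pct:
        return "brown-out"
    return "ok"


# Hypothetical measurements along one network path.
samples = [
    PathSample(True, 40.0, 0.0),
    PathSample(True, 450.0, 5.0),
    PathSample(False, 0.0, 100.0),
]
labels = [classify_path(s) for s in samples]
print(labels)
```

Keeping the two outcomes separate lets an operator treat congestion as input to the redesign rather than as an outage to chase.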
78
Network Performance Brown-Outs
The Solution
NNM receives threshold events from PD's probes;
hence it's not a hard fault!
PD discovers congestion here!
OVPI collects on interfaces between these devices
[Diagram labels: Clients, Servers, two PD Probes]
79
Adjacent Device Failure Analysis
What is it? When a device, port or cable fails,
other network equipment in the neighborhood is
affected and emits events (symptoms). In
addition, some parts of the network may become
unreachable.
The Problem: When a failure happens,
a number of different alarms and polling failures
will be detected. If these are not correlated,
the user will get a number of failures in the
alarm browser. Since these failures can happen
on different network devices, they will clutter
up the operator's alarm browser and distract from
focusing on the actionable fault.
80
Adjacent Device Failure Analysis
  • The Solution: NNM Advanced Edition 7 includes
    the Active Problem Analyzer, which effectively
    correlates related failures and presents the root
    cause as the primary event in the operator's
    browser.
  • uses ET's Layer 2 topology to understand the
    neighborhood of devices in the network,
    including network designs with redundancy
  • monitors IP addresses for reachability as well
    as device interfaces for availability
  • in the event of failure, does active polling
    of neighboring devices to help in diagnosis
  • correlates all symptoms from the neighborhood of
    the failure under the root cause
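
The correlation idea on this slide can be sketched in a few lines of Python. This is a hypothetical illustration, not the Active Problem Analyzer's actual algorithm: the topology, the device names and the `find_root_cause` helper are assumptions made for the example. It treats the network as a graph, walks outward from the management station over working devices, and reports the first failed device it meets on each path as a root cause; every other failed device is filed under it as a symptom.

```python
from collections import deque


def find_root_cause(topology, station, failed):
    """Split unresponsive devices into root causes and downstream symptoms.

    topology: dict mapping each device to a list of its neighbors
    station:  the management station the polling starts from
    failed:   set of devices that did not answer polls
    """
    reachable = {station}
    queue = deque([station])
    root_causes = set()
    while queue:
        device = queue.popleft()
        for neighbor in topology.get(device, []):
            if neighbor in failed:
                # First unresponsive device on a working path: root-cause candidate.
                root_causes.add(neighbor)
            elif neighbor not in reachable:
                reachable.add(neighbor)
                queue.append(neighbor)
    symptoms = failed - root_causes
    return root_causes, symptoms


# Hypothetical topology: station -> Z -> V -> {C, D, G}
topology = {
    "station": ["Z"],
    "Z": ["station", "V"],
    "V": ["Z", "C", "D", "G"],
    "C": ["V"],
    "D": ["V"],
    "G": ["V"],
}
roots, symptoms = find_root_cause(topology, "station", {"V", "C", "D", "G"})
print(sorted(roots), sorted(symptoms))
```

With redundant paths the same walk still works: a device that remains reachable through a second path is never marked as a symptom.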

81
Pinpoint Failure Analysis: An Example (Edge
Switch Down)
Desired Root Cause: Switch V down. Secondary:
Interface Z-23 down. Impact: C, D, G
unreachable.
  • Symptoms
  • Switch Z reports Z-23 down
  • Switch Z emits Linkdown
  • V ICMP/SNMP timeout
  • STP Root Convergence Count
  • STP Root changes
  • C, D, G ping/SNMP timeouts
  • FALSE Symptoms
  • Intermittent ping/SNMP timeouts

[Network diagram: NNM Station (Advanced Edition), VLAN1, VLAN2, Switch Z, Switch V, and Nodes C, D and G. Callouts: Fastest MTTR; Adapt to dynamic networks; Don't generate events for this area!]
82
Case Study
Commonwealth of Massachusetts: NNM with Extended
Topology, HSRP, Frame Relay
  • 50,000 end users
  • 170 agencies
  • 8 centralized enterprise applications
  • 1,500 network devices including switches
  • ATM Backbone

"We're connecting 400 Frame
Relay sites with 1200 PVCs. When a connection
failure occurs, network faults were overloading
the operators. 90 percent of the time the source
of the problem was within the Frame Relay service
provider's network, but we still needed to hash
through the volume of data to isolate the
problem. HP OpenView NNM with the HP OpenView NNM
SPI for Frame Relay enables us to quickly
identify the source of the problem, what is
impacted, and who is responsible. The
bottom line is our department now provides our
customers with exceptional WAN Services, which is
absolutely critical for our distributed
business."
Richard Glasberg, Commonwealth of Massachusetts
83
OpenView Network Node Manager (NNM) 7 Starter
and Advanced Editions
New product packaging/pricing allows increased
flexibility for you to buy the product that fits
your needs
NNM Advanced Edition 7: Designed for all sizes of
networks requiring advanced network management,
including switches/VLANs, sophisticated
root-cause analysis, and distributed environments
for large networks spanning multiple
sites/departments. This is the platform for
even greater advanced capabilities delivered
through the NNM Smart Plug-ins.
NNM Starter Edition 7: Entry-level product
designed for smaller networks (250-500 nodes)
needing basic network management (primarily Layer
3 routers/hubs) from a single management station.
84
Top 3 Reasons to Purchase NNM 7
  • Reduced Mean Time to Repair
  • Advanced Intelligent diagnostics for networks
  • Event reduction through intelligent filters and
    correlators
  • Layer 2 discovery and root cause analysis
  • Enhanced Correlation Composer
  • Intelligent multi-threaded Poller and State
    Analysis
  • Dynamic views
  • Management of duplicate IP addresses from single
    station
  • Optimize your investment
  • Higher scalability to 30,000 nodes per station
  • Fewer servers required
  • NNM SPIs for specific technologies
  • New device and protocol support
  • Efficiency of resources
  • Up to 99% reduction in noise events; focus on the
    critical 1% (up to 14x improvement over NNM 6)
  • Intuitive new GUI for any operator level

Optimize your total cost of ownership and
operation
85
Five Recommendations for Optimized Management of
the IT Infrastructure
  • Jim Metzler
  • Vice President - Ashton, Metzler Associates
  • jim_at_ashtonmetzler.com

86
Recommendations
  • Metrics
  • Applications
  • Service Management
  • Bandwidth Optimization
  • Plan Holistically

87
Two Key Premises
  • It is difficult to improve the performance of a
    dynamic system if you do not have some metrics
    that describe the performance of that system.
  • It is difficult to get credit for improving the
    performance of a dynamic system if you do not
    have some metrics that describe the improved
    performance of that system.

88
Recommendation: Metrics
  • Metrics are also necessary to set the
    expectations of your stakeholders. Stakeholders
    include
  • Business and Functional Managers
  • Applications Developers
  • Customers and Partners
  • Companies should set goals for the performance
    (e.g., response time, availability) of at least a
    few key applications. These application goals
    can then be used to establish goals for the
    network.
  • In some cases, the network performance goals
    should also include packet loss and
    jitter.

89
Recommendation: Metrics
  • Organizations should also establish some cost
    metrics. However, being measured by the
    bottom-line cost of IT will generally not lead to
    success.
  • Total Cost = Unit Cost x the Number of Units
  • Possible unit cost metrics include
  • Cost per minute
  • Cost per megabyte
  • A few companies are starting to identify the
    IT costs associated with specific
    business functions, such as
  • Booking a sales order
  • Responding to a customer service inquiry
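
The relation between total cost and unit cost can be shown with a short worked example in Python. The figures are hypothetical and the `total_cost` helper is an assumption made for illustration. It shows one way to read the slide's warning about bottom-line cost: if usage grows, total cost can rise even while the unit cost falls.

```python
def total_cost(unit_cost_cents, units):
    """Total Cost = Unit Cost x Number of Units (in cents, to avoid rounding)."""
    return unit_cost_cents * units


# Hypothetical figures: cost per minute (in cents) and minutes used.
last_year = total_cost(unit_cost_cents=5, units=1_000_000)
this_year = total_cost(unit_cost_cents=4, units=1_500_000)

print(f"Last year: ${last_year / 100:,.2f}")
print(f"This year: ${this_year / 100:,.2f}")
```

In this made-up case the unit cost drops by 20 percent while the total rises by 20 percent, so a bottom-line measure alone would hide the improvement.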

90
Recommendations
  • Metrics
  • Applications
  • Service Management
  • Bandwidth Optimization
  • Plan Holistically

91
Recommendation: Applications
  • Investigate application models that already exist
    that describe the behavior of any major
    enterprise applications that your company either
    has deployed or is looking to deploy. These
    models are computer simulations of how the
    generic application will perform.
  • Develop the ability, either in-house or
    outsourced, to do applications profiling or
    benchmarking. This refers to testing the
    application as it will be implemented in your
    company.
  • Integrate the infrastructure organizations into
    the application development lifecycle. The idea
    is to profile key applications at various stages
    in the development lifecycle.

92
Recommendation: Applications
  • Implement some form of QoS to ensure that key
    enterprise applications (e.g., SAP, Oracle
    Financials) get the bandwidth they need to
    perform well.
  • Implement proactive monitoring and management.
    Continually monitor the infrastructure looking
    for deteriorating conditions with the tools and
    processes in place to respond to these conditions
    before they become more serious.
  • Evaluate applications management tools,
    particularly ones that can manage n-tier
    applications as well as applications that utilize
    Web services.
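
Proactive monitoring for deteriorating conditions can be as simple as watching the trend of a metric rather than its latest value. The following Python sketch is one possible approach, not a method named in the presentation: the `slope` and `is_deteriorating` helpers, the sample data and the 5 ms threshold are all assumptions for illustration.

```python
def slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def is_deteriorating(response_times_ms, max_slope_ms=5.0):
    """Flag a series whose response time grows faster than max_slope_ms per sample."""
    return slope(response_times_ms) > max_slope_ms


# Hypothetical response-time samples in milliseconds.
steady = [100, 102, 99, 101, 100, 101]
degrading = [100, 110, 125, 135, 150, 160]

print(is_deteriorating(steady), is_deteriorating(degrading))
```

A trend check like this can raise a warning while every individual sample is still below a hard alarm threshold.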

93
Recommendations
  • Metrics
  • Applications
  • Service Management
  • Bandwidth Optimization
  • Plan Holistically

94
Recommendation: Getting Started with Service
Management
  • Stabilize infrastructure.
  • Assign senior IT resources that act as interfaces
    to the key business and functional organizations.
  • Define a few services and the customers for those
    services.
  • Develop business model: Service Delivery,
    Management, Review and Modification.
  • Develop linked service level agreements,
    internal and external.
  • Note: Some companies are beginning to develop
    multi-tiered SLAs. For example, two WAN services
    - one that is based on the Internet and one that
    is not.

95
A Multi-layered Approach to Service Level
Management
96
External SLAs with Service Providers
  • When negotiating an SLA with service providers,
    it is important to focus on how they define the
    metrics. Even simple concepts such as
    availability have many different interpretations.
  • In general, the metrics in most SLAs are averages
    of averages. For example, the way most service
    providers calculate availability is averaged over
    all of the hours of the month, further averaged
    over all the sites in the network.
  • In most cases, the standard remedies for missing
    an SLA are at best weak. One option is to
    negotiate a progressive credit structure whereby
    the credits increase each month that the service
    provider fails to meet the SLA.
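
Two points from this slide lend themselves to a small Python sketch: how an average of averages can hide a badly affected site, and what a progressive credit structure might look like. The site data, the `network_availability` and `progressive_credit` helpers, and the credit percentages are hypothetical and are not taken from any real SLA.

```python
def network_availability(site_availability_pct):
    """Average the monthly availability of every site (an average of averages)."""
    return sum(site_availability_pct) / len(site_availability_pct)


def progressive_credit(consecutive_misses, base_pct=5, step_pct=5, cap_pct=25):
    """Credit, as a percent of the monthly fee, that grows with each missed month."""
    if consecutive_misses <= 0:
        return 0
    return min(base_pct + step_pct * (consecutive_misses - 1), cap_pct)


# Hypothetical month: 99 sites with no downtime, and one site
# that was down for three full days of a 30-day month (90% available).
sites = [100.0] * 99 + [90.0]
average = network_availability(sites)
credits = [progressive_credit(months) for months in range(5)]

print(f"Network-wide availability: {average:.1f}%")
print(f"Credits for 0 to 4 consecutive missed months: {credits}")
```

The network-wide figure still looks healthy even though one site lost three days of service, which is why the definition of the metric matters as much as the target.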

97
Recommendations
  • Metrics
  • Applications
  • Service Management
  • Bandwidth Optimization
  • Plan Holistically

98
Recommendation: Bandwidth Optimization
  • Determine when techniques such as header
    compression make sense.
  • Evaluate the deployment of compression
    techniques, including techniques more
    sophisticated than the traditional dictionary
    compression algorithms.
  • International links are a candidate for
    compression. Another candidate is 56 Kbps links
    that are approaching capacity exhaustion.
  • Evaluate the deployment of caching for existing
    applications. Also, ensure that an analysis of
    caching is part of the systems development cycle.

99
Recommendations
  • Metrics
  • Applications
  • Service Management
  • Bandwidth Optimization
  • Plan Holistically

100
Recommendation: Plan Holistically
  • To maximize performance and minimize cost, a
    company's IT infrastructure (e.g., LAN, WAN, Data
    Center) must be planned holistically.
  • For example, assume that a company is deploying
    SAP and that the goal is that SAP will be
    available 99.9% of the time.
  • Assume that the factors that drive an outage in
    one component of the infrastructure (e.g., a
    fiber cut on an access circuit) do not impact the
    availability of the other components.
  • To meet the availability goal for SAP, have the
    LAN, WAN and Data Center each designed to have
    99.97% availability.
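
The arithmetic behind this example can be checked with a few lines of Python. This is a sketch under the slide's own assumption that the three components fail independently and that SAP needs all of them; the function names are illustrative.

```python
import math


def combined_availability(components):
    """Availability of components in series, assuming independent failures."""
    return math.prod(components)


def per_component_target(goal, count):
    """Equal availability that each of `count` components needs to meet `goal`."""
    return goal ** (1 / count)


# LAN, WAN and Data Center each designed for 99.97% availability.
end_to_end = combined_availability([0.9997] * 3)
target = per_component_target(0.999, 3)

print(f"End-to-end availability: {end_to_end:.4%}")
print(f"Minimum equal per-component target: {target:.4%}")
```

Three components at 99.97 percent multiply out to roughly 99.91 percent, which just clears the 99.9 percent goal.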

101
Recommendation: Plan Holistically
  • However, if one component of the infrastructure
    is only designed for 99.9% availability, the
    other two components of the infrastructure would
    have to be designed for 100% availability to meet
    the availability goal for SAP.
  • Even worse, if one component of the
    infrastructure is designed for something less
    than 99.9% availability, it is impossible to meet
    the availability goal for SAP no matter how much
    money the company spends increasing the
    availability of the other two components of the
    infrastructure.
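
Both cases on this slide follow from the same series model. The Python sketch below, with a hypothetical `required_for_others` helper, computes what the remaining components would need once one component's availability is fixed. It assumes independent failures and that every component is required.

```python
def required_for_others(goal, known, others=2):
    """Availability each remaining component needs, or None if the goal is out of reach.

    goal:   end-to-end availability target (e.g. 0.999)
    known:  availability of the component that is already fixed
    others: number of remaining components, assumed to be designed equally
    """
    if known < goal:
        # Even perfect remaining components cannot compensate.
        return None
    return (goal / known) ** (1 / others)


for known in (0.9997, 0.999, 0.998):
    needed = required_for_others(0.999, known)
    if needed is None:
        print(f"{known:.2%}: goal cannot be met")
    else:
        print(f"{known:.2%}: others need {needed:.4%}")
```

The weakest component sets a ceiling on the end-to-end result, which is the case for planning the LAN, WAN and Data Center together.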

102
Thank You!!