Title: Optimum Performance, Maximum Insight: Behind the Scenes with Network Measurement Tools
1. Optimum Performance, Maximum Insight: Behind the Scenes with Network Measurement Tools
Conference, April 17-18, 2007
- Loki Jorgenson, Chief Scientist
- Peter Van Epp, Network Director
- Simon Fraser University
2. Overview
- Network measurement, performance analysis, and troubleshooting are critical elements of effective network management.
- Recommended tools, methodologies, and practices, with a bit of hands-on
3. Overview
- Quick-start hands-on
- Elements of Network Performance
- Realities: Industry and Campus
- Contexts
- Methodologies
- Tools
- Demos
- Q&A
4. Troubleshooting the LAN: NDT (Public Domain)
- Preview: NDT
- Source: http://e2epi.internet2.edu/ndt/
- Local server: http://ndtbby.ucs.sfu.ca:7123
  - http://192.75.244.191:7123
  - http://142.58.200.253:7123
- Local instructions
  - http://XXX.XXX
5. Troubleshooting the LAN: AppCritical (Commercial)
- Preview: AppCritical
- Source: http://apparentNetworks.com
- Local server: http://XXX.XXXX.XXX
- Local instructions
  - http://XXX.XXX
- Login: guest / bcnet2007
- Downloads
  - User Interface
  - Download User Interface
- Install
- Start and login (see above)
6. INTRO
7. Network Performance
- Measurement
  - How big? How long? How much?
  - Quantification and characterization
- Troubleshooting
  - Where is the problem? What is causing it?
  - Diagnosis and remediation
- Optimization
  - What is the limiter? Which applications are affected?
  - Design analysis and planning
8. Functional vs. Dysfunctional
- Functional networks operate as spec'd
- Consistent with
- Only problem is congestion
- Bandwidth is the answer (or QoS)
- Dysfunctional networks operate otherwise
- Broken but ping works
- Does not meet application requirements
- Bandwidth and QoS will NOT help
9. Causes of Degradation
- Five categories of degradation:
  - Exceeds specification
    - Insufficient capacity
  - Diverges from design
    - Failed over to T1; auto-negotiate selects half-duplex
  - Presents dysfunction
    - EM interference on cable
  - Includes devices and interfaces that are mis-configured
    - Duplex mismatch
  - Manifests emergent features
    - Extreme burstiness on high-capacity links (TCP)
10. STATS AND EXPERIENCE
11. Trillions of Dollars
- Global annual spend on telecom: $2 trillion
  - Network/systems management: $10 billion
- 82% of network problems identified by end users complaining about application performance (Network World)
- 38% of 20,000 helpdesk tests showed network issues impacting application performance (Apparent Networks)
- 78% of network problems are beyond our control (TELUS)
- 50% of network alerts are false positives (Netuitive)
- 85% of networks are not ready for VoIP (Gartner 2004)
- 60% of IT problems are due to human error (Networking/CompTIA 2006)
12. Real World Customer Feedback
- Based on a survey of 20,000 customer tests: serious network issue 38% of the time
- 20% of networks have bad NIC drivers
- 29% of devices have packet loss, caused by:
  - 50% high utilization
  - 20% duplex conflicts
  - 11% rate-limiting behaviors
  - 8% media errors
  - 8% firewall issues
13. Last Mile
- Last 100m
- LAN
- Workstations
- Office environment
- Servers
- WAN
- Leased lines
- Limited capacities
- Service providers / core networks
14. METHODOLOGIES
15. Real examples from the SFU network
- 2 links out: one to CAnet4 at 1G, usually empty
- 100M commodity link, heavily loaded
  - (typically 6 times the volume of the C4 link)
- Physics grad student doing something data-intensive to a grid site in Taiwan
- First indication: total saturation of the commodity link
- Argus pointed at the grid transfer as the symptom
  - routing problem as the cause
16. Real examples (cont.)
- Problem: an asymmetric route
- 12:45:52 tcp taiwan_ip.port -> sfu_ip.port  809  0  1224826  0
  - (packets in/out, bytes in/out)
- Reported the problem to the Canarie NOC, who quickly got it fixed
- User's throughput much increased, commodity link less saturated!
- Use of NDT might have increased the stress!
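The flow record above betrayed the asymmetric route by showing traffic in only one direction. A minimal sketch of that check; the record layout (`pkts_in`/`pkts_out` fields) is illustrative, not Argus's actual output schema:

```python
def find_asymmetric(flows):
    """Return flows where packets were seen in only one direction."""
    return [f for f in flows if (f["pkts_in"] == 0) != (f["pkts_out"] == 0)]

flows = [
    # Shaped like the record above: 809 packets in, 0 packets out
    {"src": "taiwan_ip", "dst": "sfu_ip", "pkts_in": 809, "pkts_out": 0},
    # A healthy two-way flow for contrast
    {"src": "a", "dst": "b", "pkts_in": 120, "pkts_out": 118},
]
print(find_asymmetric(flows))  # only the taiwan_ip flow is flagged
```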
17. Network Life Cycle (NLC)
- Network life cycle
- Business case
- Requirements
- Request for Proposal
- Planning
- Staging
- Deployment
- Operation
- Review
18. NLC Staging/Deployment
- Two hosts with a crossover cable
  - ensure the end points work
- Move one segment closer to the end (testing each time)
  - Not easy to do if sites are geographically/politically distinct
- Establish connectivity to the end points
- Tune for required throughput
- One of multiple possible points of failure: lack of visibility
- Tools (even very disruptive tools) can help by stressing the network
  - Localize and characterize
19. NLC Staging/Deployment (cont.)
- Various bits of hardware (typically network cards) and software (the IP stack) have flaws
  - default configurations that are inappropriate for very high-throughput networks
- Careful what you buy
  - (cheapest is not best, and may be disastrous)
  - (optical is much better, but also much more expensive than copper)
- Tune the IP stack for high performance
- If possible, try whatever you want to buy in a similar environment (RFP/Staging)
- Staging won't guarantee anything
  - something unexpected will always bite you
20. NLC Operation
- Easier if the network was known to work at implementation
- Probably disrupting work, so pressure is higher
  - may not be able to use the disruptive tools
  - may be occurring at a time when staff are unavailable
- Support the user (e.g. NDT)
  - researcher can point the web browser on their machine at an NDT server
  - save the results (even if they don't understand them) for a network person to look at and comment on later
21. NLC Operation (cont.)
- Automated monitoring / data collection
  - can be very expensive to implement
  - someone must eventually interpret it
  - consider issues/costs when applying for funding
- A passive continuous monitor on the network can make your life (and success) much easier
- Multiple lightpath endpoints or a dynamically routed network can be challenging
  - issues may be (or appear to be) intermittent
  - changes that happen automatically can be maddening
22. NLC Dependencies
23. METHODOLOGIES: Measurement
24. Visibility
- Basic problem is lack of visibility at the network level
- Performance depends on:
  - Application type
  - End-user / task
  - Benchmarks
- Healthy networks have design limits
- Broken networks are everything else
25. Measurement Methodologies
- Device-centric (NMS)
  - SNMP
  - RTCP/XR / NetCONF
  - e.g. HP OpenView
- Network behaviors
  - Passive
    - Flow-based, e.g. Cisco NetFlow
    - Packet-based, e.g. Network General Sniffer
  - Active
    - Flooding, e.g. AdTech AX/4000
    - Probing, e.g. AppCritical
26. E2E Measurement Challenges
- Layer 1
- Optical / light paths
- Wireless
- Layer 2
- MPLS
- Ethernet switch fabric
- Wireless
- Layer 3
- Layer 4
- TCP
- Layer 5
- Federation
27. Existing Observatory Capabilities
- One-way latency, jitter, loss
  - IPv4 and IPv6 (owamp)
- Regular TCP/UDP throughput tests to 1 Gbps
  - IPv4 and IPv6; on-demand available (bwctl)
- SNMP
  - Octets, packets, errors; collected 1/min
- Flow data
  - Addresses anonymized by zeroing the low-order 11 bits
- Routing updates
  - Both IGP and BGP; measurement device participates in both
- Router configuration
  - Visible backbone; collected 1/hr from all routers
- Dynamic updates
- Syslog; also alarm generation (Nagios), polling via router proxy
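The flow-data anonymization described above (zeroing the low-order 11 bits of each address) can be sketched in a few lines; the function name is illustrative:

```python
# Sketch of the observatory's flow-record anonymization: clear the
# low-order 11 bits of an IPv4 address before storing it.
import ipaddress

def anonymize(addr: str) -> str:
    ip = int(ipaddress.IPv4Address(addr))
    masked = ip & ~((1 << 11) - 1)   # zero bits 0-10
    return str(ipaddress.IPv4Address(masked))

print(anonymize("142.58.200.253"))  # -> 142.58.200.0
```

The last octet is always cleared, and the bottom three bits of the third octet go with it, so individual hosts within a /21 become indistinguishable.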
28. Observatory Databases: Data Types
- Data is collected locally and stored in distributed databases
- Databases:
- Usage Data
- Netflow Data
- Routing Data
- Latency Data
- Throughput Data
- Router Data
- Syslog Data
29. GARR User Interface
30. METHODOLOGIES: Troubleshooting
31. Challenges to Troubleshooting
- Need resolution quickly
- Operational networks
- May not be able to instrument everywhere
- Often relies on expert engineers
- Does not work across 3rd party networks
- Authorization/access
- Converged networks
- Application-specific symptoms
- End-user driven
32. HPC Networks
- Three potential problem sources:
  - user site to edge (x 2)
  - core network
- Quickly eliminate as many of these as possible
  - -> binary search
- Easiest during the implementation phase
- Ideally: 2 boxes at the same site, then move them one link at a time
- Often impractical: deploy and pray (and troubleshoot)
33. HPC Networks (cont.)
- Major difference between dedicated lightpaths and a shared network
- Lightpath: end-to-end test
  - iperf/netperf on loopback
  - likely too disruptive on a shared network: DANGEROUS
- Alternately, NDT to a local server to isolate
- Recommended to have at least mid-path ping!
34. HPC Networks (cont.)
- Shared: see if other users have problems
  - If no core problem, not common
  - If core, outside agencies involved
- Start troubleshooting
  - both end-user segments in parallel
- Preventive measures
  - support user-runnable diagnostics
  - ping and owamp: low-impact monitoring
35. E2EPI Problem Statement: "The Network is Broken"
- How can the user self-diagnose first-mile problems without being a network expert?
- How can the user do partial path decomposition across multiple administrative domains?
36. Strategy
- Most problems are local
  - Test ahead of time!
- Is there connectivity and reasonable latency? (ping -> OWAMP)
- Is routing reasonable? (traceroute)
- Is the host reasonable? (NDT / Web100)
- Is the path reasonable? (iperf -> BWCTL)
37. What Are The Problems?
- TCP: lack of buffer space
  - Forces the protocol into stop-and-wait
  - The number one TCP-related performance problem
- 70 ms x 1 Gbps = 70x10^6 bits, or 8.4 MB
- 70 ms x 100 Mbps = 855 KB
- Many stacks default to 64 KB, limiting throughput to about 7.4 Mbps
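The buffer arithmetic above is the bandwidth-delay product: the TCP window must cover one RTT's worth of data, or the protocol stalls waiting for acks. A small sketch reproducing the slide's figures:

```python
# Bandwidth-delay product: the buffer needed to keep a path full,
# and the throughput ceiling imposed by a too-small buffer.

def bdp_bytes(rtt_s: float, bandwidth_bps: float) -> float:
    """Bytes in flight needed to fill the path for one round trip."""
    return rtt_s * bandwidth_bps / 8

def buffer_limited_bps(buffer_bytes: float, rtt_s: float) -> float:
    """TCP can send at most one window (buffer) per RTT."""
    return buffer_bytes * 8 / rtt_s

print(bdp_bytes(0.070, 1e9) / 2**20)                 # ~8.3 MiB: the slide's 8.4 MB
print(buffer_limited_bps(64 * 2**10, 0.070) / 1e6)   # ~7.5 Mbps from a 64 KB default
```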
38. What Are The Problems?
- Video/Audio: lack of buffer space
  - Makes broadcast streams very sensitive to the previous problems
- Application behaviors
  - Stop-and-wait behavior: can't stream
  - Lack of robustness to network anomalies
39. The Usual Suspects
- Host configuration errors (TCP buffers)
- Duplex mismatch (Ethernet)
- Wiring/Fiber problem
- Bad equipment
- Bad routing
- Congestion
- Real traffic
- Unnecessary traffic (broadcasts, multicast,
denial of service attacks)
40. Typical Sources of Performance Degradation
- Half/Full-Duplex Conflicts
- Poorly Performing NICs
- MTU Conflicts
- Bandwidth Bottlenecks
- Rate-Limiting Queues
- Media Errors
- Overlong Half-duplex
- High Latency
41. Self-Diagnosis
- Find a measurement server near me
- Detect common problems in the first mile
- Don't need to be a network engineer
- Instead of:
  - "The network is broken."
- Hoped-for result:
  - "I don't know what I'm talking about, but I think I have a duplex mismatch problem."
42. Partial Path Decomposition
- Identify end-to-end path.
- Discover measurement nodes near to and
representative of hops along the route. - Authenticate to multiple measurement domains
(locally-defined policies). - Initiate tests between remote hosts.
- See test data for already run tests. (Future)
43. Partial Path Decomposition
- Instead of:
  - "Can you give me an account on your machine?"
  - "Can you set up, and leave up, an Iperf server?"
  - "Can you get up at 2 AM to start up Iperf?"
  - "Can you make up a policy on the fly, just for me?"
- Hoped-for result:
  - Regular means of authentication
  - Measurement peering agreements
  - No chance of polluted test results
  - Regular and consistent policy for access and limits
44. METHODOLOGIES: Application Performance
- Network Dependent Vendors
- Applications groups (e.g. VoIP)
- Field engineers
- Industry focused on QoE
45. Simplified Three Layer Model
OSI layers:
  Layer 7: Application
  Layer 6: Presentation
  Layer 5: Session
  Layer 4: Transport
  Layer 3: Network
  Layer 2: Data Link
  Layer 1: Physical
46. New Layer Model
47. App-to-Net Coupling
- Codec
- Dynamics
- Requirements
Application Model
48. E-Model Mapping: R -> MOS
- The E-model generates an R-value (0-100) that maps to the well-known MOS score
- E-Model Analysis -> MOS (QoE)
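The R-to-MOS mapping referred to here is the standard conversion from ITU-T G.107 (the E-model recommendation); a sketch:

```python
# ITU-T G.107 mapping from E-model R-value to estimated MOS.
# The cubic term penalizes the mid-range, where impairments are most audible.

def r_to_mos(r: float) -> float:
    if r <= 0:
        return 1.0          # unusable
    if r >= 100:
        return 4.5          # MOS saturates at 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6

print(round(r_to_mos(93.2), 2))  # ~4.41, a "toll quality" call
print(round(r_to_mos(50.0), 2))  # ~2.58, most users dissatisfied
```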
49. Coupling the Layers
- User / Task / Process
- Application Behaviors
- Application Models
  - test/monitor for QoE
  - network requirements (QoS/SLA)
- Network Behaviors
50. METHODOLOGIES: Optimization
51. Network visibility
End-to-end network path
App-to-net coupling
End-to-end visibility
52. Iterating to Performance
53. Wizard Gap
Reprinted with permission (Matt Mathis, PSC): http://www.psc.edu/mathis/
54. Wizard Gap
- Working definition
- Ratio of effective network performance attained
by an average user to that attainable by a
network wizard.
55. Fix the Network First
- Three Steps to Performance
- Clean the network
- Pre-deployment
- Monitoring
- Model traffic
- Application requirements for QoS/SLA
- Monitoring for application performance
- Deploy QoS
56. Lessons Learned
- Guy Almes, chief engineer, Abilene:
- "The general consensus is that it's easier to fix a performance problem by host tuning and healthy provisioning rather than reserving. But it's understood that this may change over time. ... For example, of the many performance problems being reported by users, very few are problems that would have been solved by QoS if we'd have had it."
57. Tools
- CAIDA Tools (Public)
- http://www.caida.org/tools/
- Taxonomies
- Topology
- Workload
- Performance
- Routing
- Multicast
58. Recommended (Public) Tools
- MRTG (SNMP-based router stats)
- iPerf / NetPerf (active stress testing)
- Ethereal/WireShark (passive sniffing)
- NDT (TCP/UDP e2e active probing)
- Argus (Flow-based traffic monitoring)
- perfSonar (test/monitor infrastructure)
- Including OWAMP, BWCTL(iPerf), etc.
59. Tools: OWAMP/BWCTL
- OWAMP: One-Way Active Measurement Protocol
  - Ping by any other name would smell as sweet
  - depends on a stratum-1 time server at both ends
  - allows finding one-way latency problems
- BWCTL: Bandwidth Control
  - management front end to iperf
  - prevents disruption of the network with iperf
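Why OWAMP needs synchronized clocks, shown in a few lines (the timestamps are hypothetical): one-way delay is simply receive time minus send time, measured on two different clocks, so any clock offset pollutes the result one-for-one.

```python
# One-way delay = receiver's arrival timestamp - sender's transmit timestamp.
# Unlike ping's RTT, the two timestamps come from two different clocks,
# hence the stratum-1 time-server requirement above.

send_ts = 1176800000.0000   # sender's clock at transmit (seconds, hypothetical)
recv_ts = 1176800000.0345   # receiver's clock at arrival (seconds, hypothetical)

one_way_ms = (recv_ts - send_ts) * 1000
print(round(one_way_ms, 1))  # 34.5 ms on the forward path; the return path may differ

# A receiver clock running 10 ms fast inflates the measurement by exactly 10 ms:
biased_ms = (recv_ts + 0.010 - send_ts) * 1000
print(round(biased_ms, 1))   # 44.5 ms
```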
60. Tools: BWCTL
- Typical constraints to running iperf:
  - Need software on all test systems
  - Need permissions on all systems involved (usually full shell accounts)
  - Need to coordinate testing with others
  - Need to run software on both sides with specified test parameters
- (BWCTL was designed to help with these)
61. Tools: ARGUS
- http://www.qosient.com/argus
- Open-source IP auditing tool
- Entirely passive
- Operates from network taps
- Network accounting down to the port level
62. Traffic Summary from Argus
From Wed Aug 25 05:59:00 2004 to Thu Aug 26 05:59:00 2004
Total: 18,972,261,362 (10,057,240,289 out / 8,915,021,073 in)

  aaa.bb.cc.ddd          6,064,683,683 tot   5,009,199,711 out   1,055,483,972 in
  ww.www.ww.www          1,490,107,096       1,396,534,031          93,573,065
  ww.www.ww.www:11003    1,490,107,096       1,396,534,031          93,573,065
  xx.xx.xx.xxx             574,727,508         548,101,513          26,625,995
  xx.xx.xx.xxx:6885        574,727,508         548,101,513          26,625,995
  yy.yyy.yyy.yyy           545,320,698         519,392,671          25,928,027
  yy.yyy.yyy.yyy:6884      545,320,698         519,392,671          25,928,027
  zzz.zzz.zz.zzz           428,146,146         414,054,598          14,091,548
  zzz.zzz.zz.zzz:6890      428,146,146         414,054,598          14,091,548
63. Tools: ARGUS
- Using ARGUS to identify retransmission-type problems
- Compare total packet size to application data size
- full (complete packet including IP headers):
  12:59:06  d tcp  sfu_ip.port -> taiwan_ip.port  9217  18455  497718  27940870
- app (application data bytes delivered to the user):
  12:59:06  d tcp  sfu_ip.port -> taiwan_ip.port  9217  18455  0  26944300
- data transfer is one-way
  - acks coming back carry no user data
64. Tools: ARGUS
- Compare to a misconfigured IP stack
- full:
  15:27:38  tcp  outside_ip.port -> sfu_ip.port  967  964  65885  119588
- app:
  15:27:38  tcp  outside_ip.port -> sfu_ip.port  967  964  2051  55952
- retransmit rate is constantly above 50%
- poor throughput
- this should (and did) set off alarm bells
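The full-versus-app comparison above can be sketched as follows, using the byte counts from the misconfigured-stack record. Note this simple ratio lumps protocol headers in with retransmissions, so it is a coarse alarm signal rather than a precise retransmission rate:

```python
# Coarse retransmission check in the style of the Argus comparison above:
# how many bytes on the wire did NOT end up as delivered application data?

full_bytes = 119588   # total bytes (headers + retransmissions + payload)
app_bytes = 55952     # application bytes actually delivered to the user

overhead = 1 - app_bytes / full_bytes
print(f"{overhead:.0%} of bytes were headers or retransmissions")
```

For the healthy transfer on the previous slide the same ratio is only a few percent, so a jump past 50% is exactly the alarm bell the slide describes.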
65. Tools: NDT
- (Many thanks to Lixin Liu)
- Test 1: 50% signal on 802.11g
- WEB100 Enabled Statistics:
  Checking for Middleboxes . . . . . . . . . . . . . . . . . . Done
  checking for firewalls . . . . . . . . . . . . . . . . . . . Done
  running 10s outbound test (client-to-server [C2S]) . . . . . 12.00Mb/s
  running 10s inbound test (server-to-client [S2C]) . . . . . . 13.90Mb/s
- ------ Client System Details ------
  OS data: Name = Windows XP, Architecture = x86, Version = 5.1
  Java data: Vendor = Sun Microsystems Inc., Version = 1.5.0_11
66. Tools: NDT
- ------ Web100 Detailed Analysis ------
- 45 Mbps T3/DS3 link found.
- Link set to Full Duplex mode
- No network congestion discovered.
- Good network cable(s) found
- Normal duplex operation found.
- Web100 reports the Round trip time = 13.09 msec and the Packet size = 1460 Bytes
- There were 63 packets retransmitted, 447 duplicate acks received, and 0 SACK blocks received. The connection was idle 0 seconds (0%) of the time. C2S throughput test: Packet queuing detected: 0.10%. S2C throughput test: Packet queuing detected: 22.81%. This connection is receiver limited 3.88% of the time.
- This connection is network limited 95.87% of the time.
- Web100 reports TCP negotiated the optional Performance Settings to:
  - RFC 2018 Selective Acknowledgment: OFF
  - RFC 896 Nagle Algorithm: ON
  - RFC 3168 Explicit Congestion Notification: OFF
  - RFC 1323 Time Stamping: OFF
  - RFC 1323 Window Scaling: ON
67. Tools: NDT
- Server 'sniffer.ucs.sfu.ca' is not behind a firewall. Connection to the ephemeral port was successful. Client is not behind a firewall. Connection to the ephemeral port was successful. Packet size is preserved End-to-End. Server IP addresses are preserved End-to-End. Client IP addresses are preserved End-to-End.
- ... (lots of web100 stats removed!)
- aspd: 0.00000
- CWND-Limited: 4449.30
- The theoretical network limit is 23.74 Mbps
- The NDT server has a 8192.0 KByte buffer which limits the throughput to 9776.96 Mbps
- Your PC/Workstation has a 63.0 KByte buffer which limits the throughput to 38.19 Mbps
- The network based flow control limits the throughput to 38.29 Mbps
- Client Data reports link is 'T3', Client Acks report link is 'T3'
- Server Data reports link is 'OC-48', Server Acks report link is 'OC-12'
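The buffer-limit lines NDT prints follow from the same window/RTT arithmetic as slide 37. A sketch using the 13.09 ms RTT reported on the previous slide and the client's 63.0 KB buffer; the small gap from NDT's printed 38.19 Mbps presumably reflects a slightly different internal RTT estimate:

```python
# Reproduce NDT's "your buffer limits the throughput to X Mbps" line:
# a window of buffer bytes can be sent at most once per RTT.

rtt_s = 0.01309                 # Round trip time reported by Web100 above
client_buffer_bytes = 63.0 * 1024   # "63.0 KByte buffer" from the NDT output

limit_mbps = client_buffer_bytes * 8 / rtt_s / 1e6
print(round(limit_mbps, 1))  # ~39 Mbps, close to NDT's reported 38.19 Mbps
```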
68. Tools: NetPerf
- netperf on the same link
- available throughput less than max
- liu@CLM% netperf -l 60 -H sniffer.ucs.sfu.ca -- -s 1048576 -S 1048576 -m 1048576
  TCP STREAM TEST from CLM (0.0.0.0) port 0 AF_INET to sniffer.ucs.sfu.ca (142.58.200.252) port 0 AF_INET
  Recv     Send     Send
  Socket   Socket   Message   Elapsed
  Size     Size     Size      Time      Throughput
  bytes    bytes    bytes     secs.     10^6 bits/sec
  2097152  1048576  1048576   60.10     9.91
- (second run)
  2097152  1048576  1048576   61.52     5.32
69. Tools: NDT
- Test 3: 80% signal on 802.11a
- WEB100 Enabled Statistics:
  Checking for Middleboxes . . . . . . . . . . . . . . . . . . Done
  checking for firewalls . . . . . . . . . . . . . . . . . . . Done
  running 10s outbound test (client-to-server [C2S]) . . . . . 20.35Mb/s
  running 10s inbound test (server-to-client [S2C]) . . . . . . 20.61Mb/s
- ...
- The theoretical network limit is 26.7 Mbps. The NDT server has a 8192.0 KByte buffer which limits the throughput to 9934.80 Mbps. Your PC/Workstation has a 63.0 KByte buffer which limits the throughput to 38.80 Mbps. The network based flow control limits the throughput to 38.90 Mbps.
- Client Data reports link is 'T3', Client Acks report link is 'T3'
- Server Data reports link is 'OC-48', Server Acks report link is 'OC-12'
70. Tools: NetPerf
- liu@CLM% netperf -l 60 -H sniffer.ucs.sfu.ca -- -s 1048576 -S 1048576 -m 1048576
  TCP STREAM TEST from CLM (0.0.0.0) port 0 AF_INET to sniffer.ucs.sfu.ca (142.58.200.252) port 0 AF_INET
  Recv     Send     Send
  Socket   Socket   Message   Elapsed
  Size     Size     Size      Time      Throughput
  bytes    bytes    bytes     secs.     10^6 bits/sec
  2097152  1048576  1048576   60.25     21.86
- No one else using wireless on A (i.e. the case on a lightpath)
- NetPerf gets full throughput, unlike the G case
71. Tools: perfSONAR
- Performance Middleware
- perfSONAR is an international consortium in which Internet2 is a founder and leading participant
- perfSONAR is a set of protocol standards for interoperability between measurement and monitoring systems
- perfSONAR is a set of open-source web services that can be mixed-and-matched and extended to create a performance monitoring framework
- Design Goals:
  - Standards-based
  - Modular
  - Decentralized
  - Locally controlled
  - Open Source
  - Extensible
72. perfSONAR Integrates
- Network measurement tools
- Network measurement archives
- Discovery
- Authentication and authorization
- Data manipulation
- Resource protection
- Topology
73. Performance Measurement Project Phases
- Phase 1: Tool Beacons (Today)
  - BWCTL (Complete), http://e2epi.internet2.edu/bwctl
  - OWAMP (Complete), http://e2epi.internet2.edu/owamp
  - NDT (Complete), http://e2epi.internet2.edu/ndt
- Phase 2: Measurement Domain Support
  - General Measurement Infrastructure (Prototype in Progress)
  - Abilene Measurement Infrastructure Deployment (Complete), http://abilene.internet2.edu/observatory
- Phase 3: Federation Support (Future)
  - AA (Prototype: optional AES key, policy file, limits file)
  - Discovery (Measurement Nodes, Databases) (Prototype: nearest NDT server, web page)
  - Test Request/Response Schema Support (Prototype: GGF NMWG Schema)
74. Implementation
- Applications
- bwctld daemon
- bwctl client
- Built upon protocol abstraction library
- Supports one-off applications
- Allows authentication/policy hooks to be
incorporated
75. LIVE DEMOS
82. Application Ecology
- Paraphrasing ITU categories:
- Real-time
  - Jitter sensitive
  - Voice, video, collaborative
- Synchronous/transactional
  - Response time (RTT) sensitive
  - Database, remote control
- Data
  - Bandwidth sensitive
  - Transfer, backup/recover
- Best-effort
  - Not sensitive