Lessons Learned Monitoring the WAN

1
Lessons Learned Monitoring the WAN
  • Les Cottrell, SLAC
  • ESnet R&D Advisory Workshop, April 23, 2007
  • Arlington, Virginia

Partially funded by DOE and by Internet2
2
Uses of Measurements
  • Automated problem identification & troubleshooting
  • Alerts for network administrators, e.g.
  • Baselines, bandwidth changes in time-series,
    iperf, SNMP
  • Alerts for systems people
  • OS/Host metrics
  • Forecasts for Grid middleware, e.g. replica
    manager, data placement
  • Engineering, planning, SLA (set & verify),
    expectations
  • Also (not addressed here):
  • Security: spot anomalies, intrusion detection
  • Accounting

3
WAN History
  • PingER (1994), IEPM-BW (2001), Netflow
  • E2E, active, regular, end-user view,
  • all hosts owned by individual sites,
  • core mainly centrally designed & developed
    (homogeneous), contributions from FNAL, GATech,
    NIIT (close collaboration)
  • Why are you monitoring?
  • network trouble management, planning,
    auditing/setting SLAs, and Grid forecasting are very
    different, though they may use the same measurements

4
PingER (1994)
  • PingER project originally (1995) for measuring
    network performance for the US, European & Japanese HEP
    community; now mainly R&E sites
  • Extended this century to measure the Digital Divide
  • Collaboration with the ICTP Science Dissemination
    Unit, http://sdu.ictp.it
  • ICFA/SCIC http://icfa-scic.web.cern.ch/ICFA-SCIC/
  • >120 countries (99% of the world's connected population)
  • >35 monitor sites in 14 countries
  • Uses the ubiquitous ping facility
  • Monitors 44 sites in S. Asia
  • Maybe the most extensive active E2E monitoring in
    the world

5
PingER Design Details
  • PingER design (1994: no web services or RRD,
    security not a big thing, etc.)
  • Simple: no remote software (ping is everywhere), no
    probe development; monitor host install is ~0.5 day of
    effort for a sys-admin
  • Data centrally gathered, archived, analyzed, so the
    hard jobs (archiving, analysis, viz) do NOT
    require distribution; only one copy
  • Database: flat ASCII files (raw data and analyzed
    data, one file/pair/day). Compression saves a factor of 6
    (~100 GB)
  • Data available via web (lots of use, some uses
    unexpected, often analysis with Excel)

6
PingER Lessons
  • Measurement code rewritten twice: once to add
    extra data, once to document (perldoc) /
    parameterize / simplify installation
  • Gathering code (uses LYNX or FTP) pulls from the
    archive; no major mods in 10 years
  • Most of the development is for downloading and analyzing
    data, viz, and management
  • New ways to use the data (jitter, out-of-order,
    duplicates, derived throughput, MOS) all required
    study of the data, then implementation and integration
  • Dirty data (pathologies not related to the network)
    require filtering or filling before analysis
  • Had to develop an easy make/install download,
    instructions, FAQ; new installs still require
    communication
  • pre-reqs, getting name registration, getting cron
    jobs running, getting the web server running,
    unblocking, clarifying documentation (often non-native
    English speakers)
  • Documentation (tutorials, help, FAQs), publicity
    (brochures, papers, maps, presentations/travel),
    getting funding/writing proposals
  • Monitor availability of (developed tools to
    simplify/automate):
  • monitor sites (hosts stop working, security
    blocks, hosts replaced, site forgets), nudge
    contacts
  • critical remote sites (beacons), choose a new one
    (automatically updates monitor sites)
  • Validate/update metadata (name, address,
    institute, lat/long, contact) in the database (need
    easy update)

7
IEPM-BW (2001)
  • 40 target hosts in 13 countries
  • Bottlenecks vary from 0.5 Mbits/s to 1 Gbits/s
  • Traverse 50 ASes, 15 major Internet providers
  • 5 targets at PoPs, rest at end sites
  • Added Sunnyvale for UltraLight
  • Covers all USATLAS tier 0, 1, 2 sites
  • Recently added FZK, QAU
  • Main author (Connie Logg) retired

8
IEPM-BW Design Details
  • IEPM-BW (2001)
  • More focused (than PingER): fewer sites (e.g.
    BaBar collaborators), more intense, more probe
    tools (iperf, thrulay, pathload, traceroute,
    owamp, bbftp, ...), more flexibility
  • Complete code set (measurement, archive, analyze,
    viz) at each monitoring site. Data distributed.
  • Needs a dedicated host
  • Remote sites need code installed
  • Originally executed remotely via ssh, which still needed
    code installed
  • Security, accounts (require training), recovery
    problems
  • Major changes with time:
  • Use servers rather than ssh for remote hosts
  • Use MySQL for configuration databases rather
    than requiring Perl scripts
  • Provide management tools for configuration data,
    etc.
  • Add/replace probes

9
IEPM-BW Lessons (1)
  • Problems & recommendations
  • Need the right versions of MySQL, gnuplot, Perl (and
    modules) installed on hosts
  • All possible failure modes for probe tools need
    to be understood and accommodated
  • Timeout everything, clean up hung processes (see
    the sketch after this list)
  • Keep logfiles for a day or so for debugging
  • Review how processes run with Netflow (mainly
    manual)
  • Scheduling:
  • don't run file transfer, iperf, thrulay, pathload
    at the same time on the same path
  • Limit duration and frequency of intensive probes
    so they do not impact the network
  • Host loses disk, upgrades OS, loses DNS,
    applications upgraded (e.g. gnuplot), IEPM
    database zapped, etc.
  • Need backup
  • Have a local host as a target for sanity checks
    (e.g. monitoring host-based issues)
  • Monitor the monitoring host load (e.g. Ganglia,
    Nagios)
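
A minimal sketch of the "timeout everything, clean up hung processes" advice,
using Python's subprocess module; the probe command and timeout value are
illustrative assumptions, not the actual (Perl-based) IEPM-BW code.

    import subprocess

    def run_probe(cmd, timeout_s=120):
        """Run a probe command, kill it if it hangs, and return its output (or None)."""
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE, text=True)
        try:
            out, _err = proc.communicate(timeout=timeout_s)
            return out
        except subprocess.TimeoutExpired:
            proc.kill()           # clean up the hung probe
            proc.communicate()    # reap it so it does not linger as a zombie
            return None

    # Hypothetical use: give an iperf-style probe 60 s before giving up
    result = run_probe(["iperf", "-c", "target.example.org", "-t", "10"], timeout_s=60)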

10
IEPM-BW Lessons (2)
  • Different paths need different probes
    (performance and interest related)
  • Experiences with probes (a lot of work to
    understand, analyze & compare):
  • OWAMP vs. ping: OWAMP needs a server and accurate
    time; ping is round-trip only, available everywhere,
    but may be blocked
  • Traceroute: need to analyze the significance of
    results
  • Packet pair separation:
  • ABwE: noisy, inaccurate especially on Gbps paths
  • Pathchirp better, pathload best (most intense,
    approaches iperf); problems at 10 Gbps; look at
    pathneck
  • TCP:
  • thrulay: more information, more manageable than
    iperf
  • need to keep TCP buffers optimized/updated
  • File transfer:
  • Disk-to-disk is close to iperf/thrulay
  • disk measures the file/disk system, not the network,
    but the end-user view is important
  • Adding new hosts is still not easy

11
Other Lessons (1)
  • Traceroute is no good for layers 2 & 1
  • Packet-pair separation at 10 Gbps is below the
    available timing granularity
  • A net admin cannot review thousands of graphs each
    day
  • need event detection, alert notification, and
    diagnosis assistance
  • Comparing QoS vs. best effort requires adding path
    reservation
  • Keeping TCP buffer parameters optimized is difficult
  • Network configurations are not static
  • Forecasting is hard if the path is congested; need to
    account for diurnal etc. variations

12
Examples of real data
  • Caltech thrulay (Nov05-Mar06, 0-800 Mbps):
    misconfigured windows, new path, very noisy
  • UToronto multi-stream iperf (Nov05-Jan06, 0-250 Mbps):
    seasonal effects, daily & weekly
  • UTDallas pathchirp vs. thrulay vs. iperf
    (Mar-10-06 to Mar-20-06, 0-120 Mbps): some effects are
    seasonal, others are not; events may affect
    multiple metrics
  • Events can be caused by host or site congestion
  • Few route changes result in bandwidth changes
    (~20%)
  • Many significant events are not associated with
    route changes (~50%)

13
Netflow et al.
  • Switch identifies a flow by src/dst ports and protocol
  • Cuts a record for each flow (see the sketch after
    this list)
  • src, dst, ports, protocol, TOS, start and end time
  • Collect the records and analyze
  • No intrusive traffic; real traffic, real
    collaborators, real applications
  • No accounts/pwds/certs/keys
  • No reservations, etc.
  • Characterize traffic: top talkers, applications,
    flow lengths, etc.
  • May be able to use for forecasting for some sites
    and event detection (security also wants it)
  • LHC-OPN requires edge routers to provide Netflow
    data
  • Internet2 backbone:
  • http://netflow.internet2.edu/weekly/
  • SLAC:
  • www.slac.stanford.edu/comp/net/slac-netflow/html/SLAC-netflow.html
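
As an illustration of what such a record carries, a simplified sketch in Python
(field names are assumptions for exposition, not the actual NetFlow export layout):

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class FlowKey:
        src_ip: str
        dst_ip: str
        src_port: int
        dst_port: int
        protocol: int              # e.g. 6 = TCP, 17 = UDP

    @dataclass
    class FlowRecord:
        key: FlowKey
        tos: int                   # type-of-service byte
        start_s: float             # flow start time
        end_s: float               # flow end time
        packets: int
        octets: int                # bytes in the flow

    # Hypothetical record (documentation-range IPs, made-up values)
    rec = FlowRecord(FlowKey("192.0.2.10", "198.51.100.2", 45112, 5001, 6),
                     tos=0, start_s=0.0, end_s=60.0, packets=500000, octets=730000000)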

14
Netflow limitations
  • Can be a lot of data to collect each day; needs
    lots of CPU
  • Hundreds of MBytes to GBytes
  • Use of dynamic ports makes it harder to detect the
    application
  • GridFTP, bbcp, bbftp can use fixed ports (but may
    not)
  • P2P often uses dynamic ports
  • Discriminate the type of flow based on headers (not
    relying on ports)
  • Types: bulk data, interactive
  • Discriminators: inter-arrival time, length of
    flow, packet length, volume of flow
  • Use machine learning/neural nets to cluster flows
    (see the sketch after this list)
  • E.g. http://www.pam2004.org/papers/166.pdf
  • Aggregation of parallel flows (needs care, but
    not difficult)
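
A minimal sketch of clustering flows by such discriminators, using scikit-learn's
KMeans on made-up feature vectors; the feature values, scaling and cluster count
are illustrative assumptions, not the method of the paper cited above.

    import numpy as np
    from sklearn.cluster import KMeans

    # Each row: [mean inter-arrival time (s), flow duration (s),
    #            mean packet length (bytes), flow volume (bytes)] -- made-up values
    flows = np.array([
        [0.0005, 300.0, 1460.0, 2.0e9],   # looks like bulk data
        [0.0004, 600.0, 1460.0, 5.0e9],   # bulk data
        [0.2500,  60.0,   80.0, 2.0e4],   # looks interactive (ssh-like)
        [0.3000, 120.0,  100.0, 4.0e4],   # interactive
    ])

    # Standardize the features so volume does not dominate, then cluster into 2 types
    scaled = (flows - flows.mean(axis=0)) / flows.std(axis=0)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scaled)
    print(labels)                          # e.g. [0 0 1 1]: bulk data vs. interactive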

15
NG PerfSONAR
  • Our future focus (for us the 3rd generation); NSF
    proposal
  • Open source, open community
  • Both end users (LHC, GATech, SLAC, Delaware) and
    network providers (ESnet, I2, GEANT, European NRENs,
    Brazil, ...); achieve critical mass?
  • Many developers from multiple fields
  • Requires, from the get-go, shared code,
    documentation, collaboration
  • Hopefully not as dependent on funding as a single
    team, so persistent?
  • Transparent gathering and storage of measurements,
    both from NRENs and end users
  • Sharing of information across autonomous domains
  • Uses standard formats
  • More comprehensive view
  • AAA to provide protection of sensitive data
  • Reduces debugging time
  • Access to multiple components of the path
  • No need to play telephone tag
  • Currently mainly middleware; needs:
  • Data mining and viz
  • Topology, also at layers 1 & 2
  • Forecasting
  • Event detection and event diagnosis

16
Challenges
  • Probe tools fail at > 1 Gbps
  • Dedicated circuits, QoS, layers 2 & 1
  • Impact of new technology (e.g. TOE NICs)
  • Auto event detection, alerts, diagnosis
  • Integrate passive & active measurements
  • Tie in end systems, file/disk systems, apps
  • Sustainability (funding disappears)
  • Factorise components (measure, archive, analyze)
    AND tasks
  • Provide standard, published interfaces
  • Engage the community (multiple developers, users,
    providers); this has its own challenges

17
Questions, More information
  • Comparisons of Active Infrastructures:
  • www.slac.stanford.edu/grp/scs/net/proposals/infra-mon.html
  • Some active public measurement infrastructures:
  • www-iepm.slac.stanford.edu/
  • www-iepm.slac.stanford.edu/pinger/
  • www.slac.stanford.edu/grp/scs/net/talk06/IEPM-BW%20Deployment.ppt
  • e2epi.internet2.edu/owamp/
  • amp.nlanr.net/ (no longer available)
  • Monitoring tools:
  • www.slac.stanford.edu/xorg/nmtf/nmtf-tools.html
  • www.caida.org/tools/
  • Google for iperf, thrulay, bwctl, pathload,
    pathchirp
  • Event detection:
  • www.slac.stanford.edu/grp/scs/net/papers/noms/noms14224-122705-d.doc

18
More Slides
19
Active E2E Monitoring
20
E.g. Using Active IEPM-BW measurements
  • Focus on high performance for a few hosts needing
    to send data to a small number of collaborator
    sites, e.g. the HEP tiered model
  • Makes regular measurements with probe tools:
  • ping (RTT, connectivity), owamp (one-way delay),
    traceroute (routes)
  • pathchirp, pathload (available bandwidth)
  • iperf (single & multi-stream), thrulay (achievable
    throughput)
  • supports bbftp, bbcp (file transfer applications,
    not network)
  • Looking at GridFTP, but it is complex, requiring
    renewal of certificates
  • Choice of probes depends on the importance of the
    path, e.g.
  • For major paths (tier 0, 1 & some 2) use the full
    suite
  • For tier 3 use just ping and traceroute
  • Running at major HEP sites (CERN, SLAC, FNAL,
    BNL, Caltech, Taiwan, SNV) to about 40 remote
    sites
  • http://www.slac.stanford.edu/comp/net/iepm-bw.slac.stanford.edu/slac_wan_bw_tests.html

21
IEPM-BW Measurement Topology
  • 40 target hosts in 13 countries
  • Bottlenecks vary from 0.5 Mbits/s to 1 Gbits/s
  • Traverse 50 ASes, 15 major Internet providers
  • 5 targets at PoPs, rest at end sites
  • Added Sunnyvale for UltraLight
  • Adding FZK Karlsruhe
  • [Map: measurement topology, including Taiwan
    reached via TWAREN]
22
Top page [screenshot]
23
Probes: Ping/traceroute
  • Ping is still useful:
  • Is the path connected / node reachable?
  • RTT, jitter, loss
  • Great for low-performance links (e.g. Digital
    Divide), e.g. AMP (NLANR)/PingER (SLAC)
  • Nothing to install, but may be blocked
  • OWAMP/I2 is similar but one-way
  • But needs a server installed at the other end and good
    timers
  • Now built into IEPM-BW
  • Traceroute:
  • Needs good visualization (traceanal/SLAC)
  • No use for dedicated layer 1 or 2 circuits
  • However, still want to know the topology of paths

24
Probes: Packet Pair Dispersion
  • Used by pathload, pathchirp, ABwE for available bandwidth
  • Send packets with a known separation
  • See how the separation changes due to the bottleneck
    (see the sketch after this list)
  • Can be of low network intrusiveness, e.g. ABwE uses only 20
    packets/direction, and is fast (< 1 sec)
  • From the PAM paper, pathchirp is more accurate than
    ABwE, but:
  • Ten times as long (10 s vs. 1 s)
  • More network traffic (factor of 10)
  • Pathload: a factor of 10 more again
  • http://www.pam2005.org/PDF/34310310.pdf
  • IEPM-BW now supports ABwE, pathchirp, pathload
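
A toy illustration of the packet-pair idea: back-to-back packets are spread apart
by the bottleneck link, so the received separation gives a capacity estimate of
roughly packet size / dispersion. The numbers are assumptions for illustration,
not ABwE or pathchirp internals.

    # Packet-pair capacity estimate: C ~= packet_size / received_dispersion
    packet_size_bits = 1500 * 8       # 1500-byte probe packets
    received_gap_s = 12e-6            # assumed measured separation of 12 microseconds

    bottleneck_bps = packet_size_bits / received_gap_s
    print(f"Estimated bottleneck capacity: {bottleneck_bps / 1e6:.0f} Mbits/s")  # ~1000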

25
BUT
  • Packet pair dispersion relies on accurate timing
    of the inter-packet separation
  • At > 1 Gbps this is getting beyond the resolution of
    Unix clocks (see the arithmetic after this list)
  • AND 10GE NICs are offloading functions:
  • Coalescing interrupts, Large Send & Receive
    Offload, TOE
  • Need to work with TOE vendors
  • Turn off offload (Neterion supports multiple
    channels; one can eliminate offload to get more
    accurate timing in the host)
  • Do timing in the NICs
  • No standards for interfaces
  • Possibly use packet trains, e.g. pathneck
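
The rough arithmetic behind the timing problem (assumed packet size): at 10 Gbits/s
a 1500-byte packet is serialized in about 1.2 microseconds, so the separations a
packet-pair tool must resolve are comparable to, or below, typical host timer and
interrupt-coalescing granularity.

    # Serialization time of a 1500-byte packet at different link speeds
    packet_bits = 1500 * 8
    for gbps in (1, 10):
        t_us = packet_bits / (gbps * 1e9) * 1e6
        print(f"{gbps:>2} Gbits/s: {t_us:.1f} microseconds per packet")
    # 1 Gbits/s: 12.0 us; 10 Gbits/s: 1.2 us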

26
Achievable Throughput
  • Use TCP or UDP to send as much data as possible
    memory-to-memory from source to destination
  • Tools: iperf (bwctl/I2), netperf, thrulay (from
    Stas Shalunov/I2), udpmon
  • Pseudo file copy: bbcp also has a memory-to-memory
    mode to avoid disk/file problems

27
BUT
  • At 10 Gbits/s on a transatlantic path slow start
    takes over 6 seconds
  • To get 90% of the measurement in congestion avoidance,
    need to measure for ~1 minute (5.25 GBytes at
    7 Gbits/s, today's typical performance); see the
    arithmetic after this list
  • Needs scheduling to scale, and even then:
  • It's not disk-to-disk or application-to-application
  • So use bbcp, bbftp, or GridFTP
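
The arithmetic behind those figures, under the stated assumptions (a ~6 s slow-start
ramp and ~7 Gbits/s steady throughput): the ramp alone moves about 5.25 GBytes, and
keeping it to ~10% of the test implies measuring for roughly a minute.

    ramp_s = 6            # assumed slow-start duration on a transatlantic path
    rate_gbps = 7         # assumed typical achievable throughput (2007)

    ramp_gbytes = ramp_s * rate_gbps / 8
    print(f"Data moved during slow start: ~{ramp_gbytes:.2f} GBytes")        # ~5.25

    measure_s = ramp_s / 0.10          # keep the ramp to ~10% of the measurement
    total_gbytes = measure_s * rate_gbps / 8
    print(f"Measure for ~{measure_s:.0f} s, i.e. ~{total_gbytes:.1f} GBytes total")  # ~60 s, ~52.5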

28
AND
  • For testbeds such as UltraLight, UltraScienceNet,
    etc., one has to reserve the path
  • So the measurement infrastructure needs to add the
    capability to reserve the path (so it needs an API to the
    reservation application)
  • OSCARS from ESnet is developing a web services
    interface (http://www.es.net/oscars/)
  • For lightweight probes, have a persistent capability
  • For more intrusive probes, must reserve just before
    making the measurement

29
Visualization & Forecasting in the Real World
30
Examples of real data
  • Caltech thrulay (Nov05-Mar06, 0-800 Mbps):
    misconfigured windows, new path, very noisy
  • UToronto multi-stream iperf (Nov05-Jan06, 0-250 Mbps):
    seasonal effects, daily & weekly
  • UTDallas pathchirp vs. thrulay vs. iperf
    (Mar-10-06 to Mar-20-06, 0-120 Mbps): some effects are
    seasonal, others are not; events may affect
    multiple metrics
  • Events can be caused by host or site congestion
  • Few route changes result in bandwidth changes
    (~20%)
  • Many significant events are not associated with
    route changes (~50%)

31
Scatter plots & histograms
  • Scatter plots quickly identify correlations
    between metrics, e.g. thrulay (Mbps) vs. RTT (ms),
    and pathchirp & iperf (Mbps) vs. thrulay
  • Histograms of throughput (Mbits/s) for pathchirp and
    thrulay quickly identify variability or
    multimodality
32
Changes in network topology (BGP) can result in
dramatic changes in performance
  • [Figures: snapshot of the traceroute summary table and
    sample traceroute trees generated from the table,
    showing the remote host reached via Los-Nettos (100 Mbps)]
  • [Figure: ABwE measurements (one/minute for 24 hours,
    Thurs Oct 9 9:00am to Fri Oct 10 9:01am) of dynamic BW
    capacity (DBC), available BW (DBC-XT) and cross-traffic
    (XT) in Mbits/s vs. hour; drop in performance from the
    original path (SLAC-CENIC-Caltech) to
    SLAC-ESnet-Los-Nettos (100 Mbps)-Caltech, then back to
    the original path; changes detected by IEPM-iperf and
    ABwE; the ESnet-Los-Nettos segment in the path is
    100 Mbits/s]
  • Notes:
  • 1. Caltech misrouted via the Los-Nettos 100 Mbps
    commercial net 14:00-17:00
  • 2. ESnet/GEANT working on routes from 02:00 to 14:00
  • 3. A previous occurrence went unnoticed for 2 months
  • 4. Next step is to auto-detect and notify
33
On the other hand
  • Route changes may affect the RTT (shown in yellow)
  • Yet have no noticeable effect on available
    bandwidth or throughput
  • [Figure: time series of available bandwidth,
    achievable throughput, and route changes]
34
However
  • Elegant graphics are great for understanding problems,
    BUT:
  • There can be thousands of graphs to look at (many site
    pairs, many devices, many metrics)
  • Need automated problem recognition AND diagnosis
  • So we are developing tools to reliably detect
    significant, persistent changes in performance
  • Initially using a simple plateau algorithm to
    detect step changes (see the sketch after this list)
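
A minimal sketch of a plateau-style step detector (illustrative thresholds and
buffer lengths, not the exact IEPM-BW algorithm): keep a history buffer and a
shorter trigger buffer, and flag an event when the trigger mean moves outside the
history mean by more than k standard deviations.

    from statistics import mean, stdev

    def plateau_events(series, hist_len=50, trig_len=10, k=2.0):
        """Return indices where the recent mean steps outside history mean +/- k*sigma."""
        events = []
        for i in range(hist_len + trig_len, len(series) + 1):
            history = series[i - hist_len - trig_len : i - trig_len]
            trigger = series[i - trig_len : i]
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(mean(trigger) - mu) > k * sigma:
                events.append(i - 1)   # index of the latest sample in the trigger buffer
        return events

    # Example: a step down from ~800 Mbits/s to ~400 Mbits/s is flagged
    ts = [800 + (j % 5) for j in range(80)] + [400 + (j % 5) for j in range(40)]
    print(plateau_events(ts)[:3])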

35
Seasonal Effects on Events
  • Change in bandwidth (drops) between 19:00 & 22:00
    Pacific Time (7:00-10:00am Pakistan time)
  • Causes more anomalous events around this time

36
Forecasting
  • Over-provisioned paths should have a pretty flat
    time series:
  • Short/local-term smoothing
  • Long-term linear trends
  • Seasonal smoothing
  • But seasonal trends (diurnal, weekly) need to be
    accounted for on about 10 of our paths
  • Use Holt-Winters triple exponentially weighted
    moving averages (see the sketch after this list)
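
A minimal sketch of additive Holt-Winters (triple exponential smoothing) as it might
be applied to an hourly bandwidth series with a daily cycle; the smoothing constants
and season length are illustrative assumptions, not the production IEPM settings.

    import math

    def holt_winters_additive(y, m=24, alpha=0.3, beta=0.05, gamma=0.2):
        """One-step-ahead additive Holt-Winters forecasts for series y, season length m."""
        level = sum(y[:m]) / m
        trend = 0.0
        season = [y[i] - level for i in range(m)]   # crude initial seasonal indices
        forecasts = []
        for t in range(m, len(y)):
            forecasts.append(level + trend + season[t % m])
            prev_level = level
            level = alpha * (y[t] - season[t % m]) + (1 - alpha) * (level + trend)
            trend = beta * (level - prev_level) + (1 - beta) * trend
            season[t % m] = gamma * (y[t] - level) + (1 - gamma) * season[t % m]
        return forecasts

    # Example: a synthetic hourly series with a diurnal swing around 500 Mbits/s
    hourly = [500 + 100 * math.sin(2 * math.pi * h / 24) for h in range(24 * 7)]
    print(holt_winters_additive(hourly)[-3:])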

37
Experimental Alerting
  • Have false positives down to a reasonable level
    (a few per week), so sending alerts to developers
  • Saved in a database
  • Links to traceroutes, event analysis, time series

38
Passive
  • Active monitoring:
  • Pro: regularly spaced data on known paths, can
    make measurements on demand
  • Con: adds traffic to the network, can interfere with
    real data and measurements
  • What about passive?

39
Netflow et al.
  • Switch identifies a flow by src/dst ports and protocol
  • Cuts a record for each flow
  • src, dst, ports, protocol, TOS, start and end time
  • Collect the records and analyze
  • Can be a lot of data to collect each day; needs
    lots of CPU
  • Hundreds of MBytes to GBytes
  • No intrusive traffic; real traffic, real
    collaborators, real applications
  • No accounts/pwds/certs/keys
  • No reservations, etc.
  • Characterize traffic: top talkers, applications,
    flow lengths, etc.
  • LHC-OPN requires edge routers to provide Netflow
    data
  • Internet2 backbone:
  • http://netflow.internet2.edu/weekly/
  • SLAC:
  • www.slac.stanford.edu/comp/net/slac-netflow/html/SLAC-netflow.html

40
Typical day's flows
  • Very much work in progress
  • Look at the SLAC border
  • Typical day:
  • 28K flows/day
  • 75 sites with > 100 KB bulk-data flows
  • A few hundred flows > 1 GByte
  • Collect records for several weeks
  • Filter on 40 major collaborator sites, big (>
    100 KBytes) flows, bulk transport apps/ports
    (bbcp, bbftp, iperf, thrulay, scp, ftp ...)
  • Divide by remote site, aggregate parallel streams
    (see the sketch after this list)
  • Look at the throughput distribution
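
A minimal sketch of the "aggregate parallel streams" step (a simplified record
format, not SLAC's actual Netflow schema): flow records to the same remote site are
grouped, their bytes summed, and a throughput derived from the combined time span;
a real implementation would also bin by transfer/time window.

    from collections import defaultdict

    # Hypothetical simplified records: (remote_site, start_s, end_s, bytes)
    records = [
        ("padova.example", 0.0,  60.0, 1.0e9),   # four parallel bbcp streams
        ("padova.example", 0.1,  60.2, 1.1e9),
        ("padova.example", 0.2,  59.9, 0.9e9),
        ("padova.example", 0.0,  60.1, 1.0e9),
        ("bnl.example",   10.0, 100.0, 3.0e9),
    ]

    by_site = defaultdict(list)
    for site, start, end, nbytes in records:
        by_site[site].append((start, end, nbytes))

    for site, flows in by_site.items():
        total_bytes = sum(b for _, _, b in flows)
        span_s = max(e for _, e, _ in flows) - min(s for s, _, _ in flows)
        print(f"{site}: {total_bytes * 8 / span_s / 1e6:.0f} Mbits/s over {span_s:.0f} s")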

41
Netflow et al.
  • Peaks at known capacities and RTTs
  • RTTs might suggest windows are not optimized; peaks
    at the default OS window size (BW = Window/RTT); see
    the example after this list
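
The BW = Window/RTT rule of thumb behind those peaks, with an assumed 64 KByte
default window at a few representative RTTs:

    # TCP throughput ceiling ~ window size / round-trip time
    window_bytes = 64 * 1024                 # assumed default OS window size
    for rtt_ms in (20, 80, 170):             # e.g. regional, cross-country, transatlantic
        bw_mbps = window_bytes * 8 / (rtt_ms / 1000) / 1e6
        print(f"RTT {rtt_ms:>3} ms -> ~{bw_mbps:4.1f} Mbits/s ceiling")
    # 20 ms -> ~26.2, 80 ms -> ~6.6, 170 ms -> ~3.1 Mbits/s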

42
How many sites have enough flows?
  • In May '05 found 15 sites at the SLAC border with >
    1440 flows (one per 30 minutes)
  • Maybe enough for time-series forecasting of
    seasonal effects
  • Three sites (Caltech, BNL, CERN) were actively
    monitored
  • The rest were "free" (not actively monitored)
  • Only 10 sites have big seasonal effects in
    active measurements
  • The remainder need fewer flows

43
Mining data for sites
  • Real application use (bbftp) over 4 months
  • Gives a rough idea of the throughput (and confidence)
    for 14 sites seen from SLAC

44
Multi-month
Bbcp throughput from SLAC to Padova
  • bbcp SLAC to Padova
  • Fairly stable with time, but large variance
  • Many non-network-related factors

45
Netflow limitations
  • Use of dynamic ports makes it harder to detect the
    application
  • GridFTP, bbcp, bbftp can use fixed ports (but may
    not)
  • P2P often uses dynamic ports
  • Discriminate the type of flow based on headers (not
    relying on ports)
  • Types: bulk data, interactive
  • Discriminators: inter-arrival time, length of
    flow, packet length, volume of flow
  • Use machine learning/neural nets to cluster flows
  • E.g. http://www.pam2004.org/papers/166.pdf
  • Aggregation of parallel flows (needs care, but
    not difficult)
  • Can be used for giving performance forecasts
  • Unclear if it can be used for detecting steps in
    performance

46
Conclusions
  • Some tools fail at higher speeds
  • Throughputs often depend on non-network factors:
  • Host interface speeds (DSL, 10 Mbps Ethernet,
    wireless), loads, resource congestion
  • Configurations (window sizes, hosts, number of
    parallel streams)
  • Applications (disk/file vs. mem-to-mem)
  • Looking at distributions by site, often
    multi-modal
  • Predictions may have large standard deviations
  • Need automated assistance to diagnose events

47
In Progress
  • Working on Netflow viz (currently at BNL & SLAC),
    then work with other LHC sites to deploy
  • Add support for pathneck
  • Look at other forecasters, e.g. ARMA/ARIMA, maybe
    Kalman filters, neural nets
  • Working on diagnosis of events
  • Multi-metrics, multi-paths
  • Signed a collaborative agreement with Internet2 to
    collaborate with PerfSONAR
  • Provide web services access to IEPM data
  • Provide analysis, forecasting and event detection
    for PerfSONAR data
  • Use PerfSONAR (e.g. router) data for diagnosis
  • Provide viz of PerfSONAR route information
  • Apply to LHCnet
  • Look at layer 1 & 2 information