Structured approach in troubleshooting


Slides: 66
Provided by: mikew211
Transcript and Presenter's Notes

1
Structured approach in troubleshooting
  • Collect and analyze symptoms
  • Localize the problem
  • Isolate the trouble
  • Locate and correct the problem
  • Verify the fix

2
Network Baseline
  • network monitoring
  • monitor the day-to-day operation of the
    network to establish what is normal for the
    network
  • learn the average traffic of the network
  • learn the peak traffic times over the day, over
    the week, and over the month
  • learn which applications are used most and least
  • identify the network users that are most prone
    to difficulties
  • logs should be kept, so that the administrator can
    compare the problems encountered with the
    baseline information showing what normal network
    operation should be

3
Document network problems
  • The more information an administrator has, the
    easier it should be to solve the problem
  • information can be collected by
  • Network Management tools
  • network analyzers
  • problems collected from clients
  • identification
  • preliminary information (who reported it, time,
    relation to previous problems, location, etc.)
  • network information collected by the network
    technician
  • list of actions taken
  • summary (hardware, software, configuration, user
    problems)

4
Analyzing: locate and fix
  • localize and locate problems
  • list all possible causes
  • generate a problem scenario based on knowledge
    and previous information
  • determine the most likely cause, by isolation and
    elimination
  • use diagnostic utilities built into the devices
    (e.g. NIC, routers, PC) to help locate
    the problem
  • check physical and logical indicators (LEDs) for
    the status of devices
  • correct the problems
  • use the replacement method to eliminate the possible
    causes
  • start from the basics: user, cables, patch cords,
    malfunctioning machines
  • verify that the problems have disappeared

5
Focus: Basics and Standard Tools
  • Solving network problems depends a lot on your
    understanding
  • Simple tools can tell you what you need to know
  • Example: ping is incredibly useful!

6
Troubleshooting
  • Avoid it by
  • redundancy
  • documentation
  • training
  • Try quick fixes first
  • simple problems often have big effects
  • is the power on?
  • is the network cable plugged into the right
    socket? Is LED flashing?
  • has anything changed recently?
  • Change only one thing at a time
  • test thoroughly after the change
  • Be familiar with the system
  • maintain documentation
  • Be familiar with your tools
  • before trouble strikes

7
Troubleshooting: Learn as you go
  • Study and be familiar with the normal behaviour
    of your network
  • Monitoring tools can tell you when things are
    wrong
  • if you know what things look like when they are
    right
  • Using tools such as Ethereal can help you
    understand
  • your network, and
  • TCP/IP better

8
Documentation
  • Maintain an inventory of equipment and software
  • a list mapping MAC addresses to machines can be
    very helpful
  • Maintain a change log for each major system,
    recording
  • each significant change
  • each problem with the system
  • each entry dated, with name of person who made
    the entry
  • Two categories of documentation
  • Configuration information
  • describes the system
  • use system tools to obtain a snapshot, e.g.,
    sysreport in Red Hat Linux
  • Procedural information
  • How to do things
  • use tools that automatically document what you
    are doing, e.g., script

9
Connectivity Testing: Cabling
  • Label cables clearly at each end
  • Cable testers
  • ensure wired correctly, check
  • attenuation
  • length: is it too long?
  • 100BaseT: less than 100m
  • Is the activity light on the interface blinking?

10
Software tools: ping
  • Most useful check of connectivity
  • Universal
  • If you ping a hostname, this includes a rough check of DNS
  • Sends an ICMP (Internet Control Message Protocol)
    ECHO_REQUEST
  • Waits for an ICMP ECHO_REPLY
  • Most pings can display round trip time
  • Most pings can allow setting size of packet
  • Can use to make a crude measurement of throughput

11
ping: Roughly Estimating Throughput
  • Example
  • ping with packet size 100 bytes, round-trip
    time 30ms
  • ping with packet size 1100 bytes, round-trip
    time 60ms
  • So it takes 30ms extra (15ms one way) to send an
    additional 1000 bytes, or 8000 bits
  • Throughput is roughly 8000 bits per 15ms, or
    about 533,000 bits per second
  • A very crude measurement: it takes no account of other
    traffic, and treats all links on the path, there and
    back, as one.
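The arithmetic in this example can be sketched in a few lines of Python. The function name and parameters are illustrative only (they do not come from the slides), and the result is as crude as the slide warns. Note that 8000 bits in 15 ms works out to roughly 533,000 bits per second.

```python
def ping_throughput_bps(small_bytes, small_rtt_ms, large_bytes, large_rtt_ms):
    """Crude throughput estimate from two ping sizes and their round-trip times.

    Hypothetical helper for illustration; not part of any ping tool.
    """
    extra_bits = (large_bytes - small_bytes) * 8
    # Halve the round-trip difference to approximate the extra one-way time,
    # then convert milliseconds to seconds.
    extra_one_way_s = (large_rtt_ms - small_rtt_ms) / 2 / 1000
    if extra_one_way_s <= 0:
        raise ValueError("the larger packet must have the larger round-trip time")
    return extra_bits / extra_one_way_s


# Values from the slide: 100 bytes in 30 ms, 1100 bytes in 60 ms.
estimate = ping_throughput_bps(100, 30, 1100, 60)
print(f"{estimate:,.0f} bits per second")
```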

12
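ping: Roughly Estimating Throughput
  • This can be expressed as a simple formula:
    throughput (bps) = 2 x 8 x (large size - small size, in bytes)
    / (large RTT - small RTT, in seconds)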

13
Multi Hop Paths

14
Throughput Measuring with ping 1
  • To measure throughput between two remote hosts, you
    can use tools like ping
  • ping two locations with two packet sizes (4 pings
    altogether, minimum)
  • Example

15
Throughput Measuring with ping 2
  • Halve the time difference (round trip time (RTT) -> one
    way)
  • Divide the size difference in bits (8000) by that one-way time
  • Multiply by 1000 (ms -> seconds)
  • Convert bps to Mbps
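The worked example for these slides did not survive in the transcript, so the following is only a sketch of one plausible reading: ping the nearer and the farther host with the same two packet sizes, and attribute the difference between their time differences to the segment between them. The function name, argument layout, and sample round-trip times are all hypothetical.

```python
def link_throughput_mbps(near_rtts_ms, far_rtts_ms, small_bytes, large_bytes):
    """Estimate throughput of the segment between two remote hosts.

    near_rtts_ms and far_rtts_ms are (small-packet RTT, large-packet RTT)
    pairs in milliseconds, i.e. the four pings mentioned on the slide.
    """
    near_small, near_large = near_rtts_ms
    far_small, far_large = far_rtts_ms
    # Extra round-trip time caused by the larger packet on the far segment only.
    extra_rtt_ms = (far_large - far_small) - (near_large - near_small)
    if extra_rtt_ms <= 0:
        raise ValueError("no measurable extra delay between the two hosts")
    one_way_ms = extra_rtt_ms / 2                # RTT -> one way
    size_bits = (large_bytes - small_bytes) * 8  # bytes -> bits
    bps = size_bits / one_way_ms * 1000          # per ms -> per second
    return bps / 1_000_000                       # bps -> Mbps


# Made-up measurements: 100-byte and 1100-byte pings to each host.
print(link_throughput_mbps((2.0, 4.0), (10.0, 28.0), 100, 1100), "Mbps")
```

With these invented numbers the extra round trip on the far segment is 16 ms, so 8000 bits cross it in 8 ms one way.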

16
Throughput Measuring with ping 3

17
Throughput Measuring with ping 4

18
Throughput Measuring with ping 5
  • Completing calculation for throughput between
    205.153.61.1 and 205.153.61.2

19
How to Use ping?
  • Ensure local host networking is enabled first:
    ping localhost, local IP address
  • ping a known host on local network
  • ping local and remote interfaces on router
  • ping by IP as well as by hostname if hostname
    ping fails
  • confirm DNS with nslookup (or dig); see later
  • Ping from more than one host

20
What ping Result is Good, Bad?
  • A steady stream of consistent replies indicates
    the path is probably okay
  • Usually first reply takes longer due to ARP
    lookups at each router
  • After that, ARP results are cached
  • ICMP error messages can help understand results
  • Destination Network Unreachable indicates the
    host doing ping cannot reach the network
  • Destination Host Unreachable may come from
    routers further away

21
Ping Responses
  • On a Cisco router you will get the responses as
    to the right
  • Actual response is
  • router>ping www.yahoo.com
  • Translating "www.yahoo.com"...domain server
    (209.1.221.10) OK
  • Type escape sequence to abort. Sending 5,
    100-byte ICMP Echos to 216.115.102.81, timeout is
    2 seconds:
  • !!!!!
  • Success rate is 100 percent (5/5), round-trip
    min/avg/max = 4/16/24 ms

22
Troubleshooting with ping (1)
  • Standard ping: used to check the availability of
    a host
  • ping <ip address>
  • Extended ping: used to track packet loss or
    latency (sending out 1 ping per second until
    the process is halted by CTRL-C)
  • Unix (e.g. Solaris)
  • ping -s <ip address>
  • Linux (ping already runs until halted; there -s sets
    the packet size)
  • ping <ip address>
  • Windows
  • ping -t <ip address>

23
Troubleshooting with ping 2
  • Cisco router sends a fixed no. of packets as
    fast as it can and waits for response
  • router>ping
    Protocol [ip]:
    Target IP address: www.inetdaemon.com
    Translating 209.1.221.10
    Repeat count [5]: 100
    Datagram size [100]:
    Timeout in seconds [2]:
    Extended commands [n]:
    Sweep range of sizes [n]:
    Type escape sequence to abort.
    Sending 100, 100-byte ICMP Echos to
    207.150.192.12, timeout is 2 seconds:
    !!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    Success rate is 100 percent (100/100), round-trip
    min/avg/max = 12/19/280 ms

24
Possible Causes of Being Unable to ping
  • If you are trying to ping a name, try pinging the
    IP address of the destination machine. If ping
    fails when you try the name of the site, but
    works when you try the IP address, it's NOT a
    network problem, it's DNS.
  • If you are trying to ping a site and both the
    name and the IP address fail, they may be
    blocking ping with an access list. Try a
    traceroute instead.
  • If there are multiple hops between you and the
    destination, then try pinging each host in the
    path to the destination until you find the host
    that fails to respond to ping. Use a traceroute
    (successfully) to get a list of the hosts between
    you and the destination for this purpose.
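The decision steps above can be sketched as two small functions. This is an illustration only: the function names are made up, the ping results are passed in as booleans rather than measured, and the addresses in the usage lines are documentation-range placeholders.

```python
def diagnose_ping(name_ok, ip_ok):
    """Map the outcome of pinging by name and by IP address to a likely cause."""
    if name_ok:
        return "reachable"
    if ip_ok:
        return "DNS problem: the name fails but the IP address answers"
    return ("no reply by name or IP: ping may be blocked by an access list; "
            "try traceroute, then ping each hop in the path")


def first_silent_hop(hops, answers_ping):
    """Return the first hop in the path that does not answer ping, or None.

    hops is the ordered list of hosts from a successful traceroute;
    answers_ping is any callable that reports whether a host replied.
    """
    for hop in hops:
        if not answers_ping(hop):
            return hop
    return None


# Hypothetical path in which the third hop is silent.
path = ["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.4"]
print(diagnose_ping(name_ok=False, ip_ok=True))
print(first_silent_hop(path, lambda host: host != "192.0.2.3"))
```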

25
What ping is not
  • Standard ping should not be used to prove the
    following
  • Routing problems
  • Latency
  • Packet loss
  • There are eight reasons why you cannot trust ICMP
    ping

26
1. Ping is end to end
  • Ping reveals nothing regarding the intermediate
    devices
  • Cannot be used for Routing

27
2. Platform differences
  • PCs, Unix machines and routers all handle ICMP
    and ping packets differently
  • Introduces false latency that does not occur with
    ordinary TCP or UDP data which is treated
    identically by all of the above platforms

28
3. Ping does not identify the host causing the problem
  • If you ping www.yahoo.com and you think you see a
    problem at Yahoo, you have no way to know what
    the cause is without running additional tests
    (traceroute, dig or nslookup)
  • A device in the middle of the path between you
    and Yahoo might be failing or over-utilised,
    making it appear that Yahoo is dropping packets
    when it is not; the reverse could be true
    too

29
4. Queuing and QoS
  • Routers can implement queuing strategies, forcing
    them to handle ICMP differently from TCP and UDP
  • Devices providing QoS functions may also handle
    ICMP in a way that differs from the standards and
    specifications in order to optimise availability
    for TCP and UDP traffic
  • A QoS device might be programmed to drop 80% of
    all ICMP regardless of how much TCP or UDP
    traffic there is currently

30
5. Rate Limits
  • A host may have an artificial rate limit, or
    access-list imposed to reduce the effect of a
    possible future denial of service attack
  • This will artificially drop only the ICMP packets
    and leave the TCP and UDP packets untouched, i.e.
    100% of the TCP and UDP packets will get through
    even though there is loss seen with ICMP

31
6. Baseline Dependency
  • Most network administrators fail to do a 24-hour
    baseline performance evaluation before they buy
    bandwidth
  • When the Internet pipe hits maximum utilisation
    during peak hours, the administrators panic and
    start screaming at their service providers after
    running a ping or two to various remote sites and
    before checking their own networks

32
7. Local Network Issues
  • Momentary glitches in performance are normal
    occurrences on every network; this is yet
    another reason for performing extensive
    baselining
  • In networks running OSPF, the entire network
    experiences latency every time the update timer
    ticks down to zero and the network is flooded
    with OSPF updates; good baselining and network
    planning will help to avoid this
  • Ping can do nothing to identify the OSPF latency
    problem above; it will give totally random,
    unpredictable and therefore useless results

33
8. Bad Network Design
  • A bottleneck may be engineered into the network
    making everything on the far side of that
    connection appear to be slow
  • There is no physical failure and all equipment is
    functioning normally; there simply isn't enough
    capacity
  • A bad ping result is useless in this case as
    there is no faulty equipment

34
arping: uses ARP requests
  • Limited to local network
  • Can work with MAC or IP addresses
  • use to probe for ARP entries in router (very
    useful!)
  • packet filtering
  • can block ICMP pings, but
  • won't block ARP requests

35
Path Discovery: traceroute
  • Sends UDP packets
  • (Microsoft tracert sends ICMP packets)
  • increments Time to Live (TTL) in IP packet header
  • Sends three packets at each TTL
  • records round trip time for each
  • increases TTL until enough to reach destination

36
traceroute: How it Works
  • As IP packets pass through each router, TTL in IP
    header is decremented
  • Packet is discarded when TTL decrements to 0
  • Router sends ICMP TIME_EXCEEDED message back to
    traceroute host
  • When UDP packet reaches destination, gets ICMP
    PORT_UNREACHABLE, since uses an unused high UDP
    port
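The TTL mechanism can be illustrated with a small simulation that sends no real packets. The function names are made up for this sketch; the addresses are the ones used in the traceroute example later in this presentation (two routers, then the destination).

```python
def probe(routers, destination, ttl):
    """Follow one probe along the path and report who answers it, and how."""
    for router in routers:
        ttl -= 1  # each router decrements TTL before forwarding
        if ttl == 0:
            # Packet is discarded here; the router reports back.
            return router, "TIME_EXCEEDED"
    # TTL was large enough: the probe hits an unused UDP port on the target.
    return destination, "PORT_UNREACHABLE"


def simulate_traceroute(routers, destination, max_ttl=30):
    """Raise TTL one step at a time until the destination itself answers."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, message = probe(routers, destination, ttl)
        hops.append((ttl, responder, message))
        if message == "PORT_UNREACHABLE":
            break
    return hops


for hop in simulate_traceroute(["192.168.5.1", "140.112.4.126"], "140.112.30.28"):
    print(hop)
```

A real traceroute sends three probes per TTL and times each one; the simulation keeps a single probe per TTL to show only the stopping rule.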

37
traceroute: Limitations
  • Each router has a number of IP addresses
  • but traceroute only shows the one it used
  • you get different addresses when you run traceroute
    from the other end
  • sometimes route is asymmetric
  • router may be configured to not send ICMP
    TIME_EXCEEDED messages
  • you get stars instead of round-trip time in
    traceroute output

38
traceroute: Example
  • Explain the functions that are performed by
    packets 1 to 26.

39
traceroute: Example (2)
  • Packet 1 sends a DNS query to the DNS server
    (140.112.254.4) to query the IP address of
    www.csie.ntu.edu.tw
  • Packet 2 sends back the IP address of
    www.csie.ntu.edu.tw, which is 140.112.30.28
  • Packet 3: the host sends a UDP packet to 140.112.30.28
    with time-to-live (TTL) set to 1. TTL decrements by 1
    when the packet passes a router. Here the TTL
    turns to 0, causing the first router
    (192.168.5.1) to send back an ICMP message
    Time-to-live exceeded (packet 4).
  • Packets 5 and 6 try to resolve the name of the
    first router but are unsuccessful.
  • Packets 7 to 10 repeat what packets 3 and 4
    did two more times so that different response
    times can be collected to calculate the average.

40
traceroute: Example (3)
  • Packets 11 to 18 send a UDP packet to
    140.112.30.28 with TTL set to 2. This time the
    packet manages to reach the second router
    (140.112.4.126) before it dies, causing the
    second router to send back an ICMP message
    Time-to-live exceeded. The same UDP packet is
    repeated two more times to calculate the average
    response time.
  • Packets 19 to 26 send a UDP packet to
    140.112.30.28 with TTL set to 3. This time the
    packet reaches the third hop
    (140.112.30.28), and since this
    hop's IP address matches the destination
    address of the UDP packet, it returns another
    ICMP message, Destination unreachable, because
    the destination port is deliberately selected to be
    one that is normally not used (> 3000). Name
    resolution is performed by packets 21 and 22
    successfully. The same packet is repeated two
    more times to calculate the average response time.

41
Performance Measurements: delay
  • Three sources of delay
  • transmission delay: time to put signal onto
    cable or media
  • depends on transmission rate and size of frame
  • propagation delay: time for signal to travel
    across the media
  • determined by type of media and distance
  • queuing delay: time spent waiting in a router's
    queue to be forwarded
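The first two delays can be computed directly, as in the sketch below. The function names are illustrative, and the default signal speed of 2e8 m/s (about two thirds of the speed of light, a common rule of thumb for copper and fibre) is an assumption, not a figure from the slides. Queuing delay depends on load and has to be measured rather than calculated.

```python
def transmission_delay_s(frame_bytes, rate_bps):
    """Time to put the whole frame onto the medium: frame size / rate."""
    return frame_bytes * 8 / rate_bps


def propagation_delay_s(distance_m, signal_speed_mps=2e8):
    """Time for the signal to cross the medium: distance / signal speed.

    The default speed is an assumed rule-of-thumb value.
    """
    return distance_m / signal_speed_mps


# A 1500-byte frame on a 10 Mbps link over a 100 m cable.
tx = transmission_delay_s(1500, 10_000_000)
prop = propagation_delay_s(100)
print(f"transmission {tx * 1000:.3f} ms, propagation {prop * 1e6:.2f} microseconds")
```

On a short LAN cable the transmission delay dominates; over long distances the propagation delay takes over.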

42
Performance Measurements 2
  • bandwidth: the transmission rate through the
    link
  • relates to transmission time
  • throughput: amount of data that can be sent over
    link in given time
  • relates to all causes of delay
  • is not the same as bandwidth
  • Other measurements needed
  • e.g., for quality of service for multimedia
43
Using netstat -tua to See Network Connections
  • netstat -tua shows all network connections,
    including those listening
  • netstat -tu shows only connections that are
    established
  • netstat -i is like ifconfig, shows info and stats
    about each interface
  • netstat -nr shows the routing table, like route
    -n
  • Windows provides netstat also.

44
Traffic Measurements: netstat -i
  • The netstat program can show statistics about
    network interfaces
  • Linux netstat shows lost packets in three
    categories
  • errors,
  • drops (queue full: shouldn't happen!)
  • overruns (last data overwritten by new data
    before old data was read: shouldn't happen!)
  • drops and overruns indicate faulty flow control:
    bad!
  • These values are cumulative (since interface was
    up)
  • Could put a load on interface to see current
    condition, with ping -l, to send large number of
    packets to destination
  • See the difference in values

45
Measuring Traffic: netstat -i
  • Here we run netstat -i on ictlab
  • netstat -i
  • Kernel Interface table
    Iface   MTU Met     RX-OK RX-ERR RX-DRP RX-OVR      TX-OK TX-ERR TX-DRP TX-OVR Flg
    eth0   1500   0 407027830      0      0      0 1603191764      0      0      3 BMRU
    lo    16436   0   2858402      0      0      0    2858402      0      0      0 LRU
  • Notice that of the 1.6 billion packets transmitted,
    there were 3 overruns.
  • Next, blast the path you want to test with
    packets using ping -l or the spray program, and
    measure again.

46
Issues with netstat -i
  • (Collis + Ierrs + Oerrs) / (Ipkts + Opkts) > 2%: network
    hardware problem
  • (Collis / Opkts) > 10%: interface overloaded.
    Redistribute traffic to other interfaces or
    servers
  • (Ierrs / Ipkts) > 25%: host drops packets, network
    / servers overloaded
  • If > 120 collisions/s, network is overloaded
  • If sum of input and output packets is > 600 for a
    10Mbps interface or 6000 for a 100Mbps interface,
    network segment is too busy
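The three ratio checks above can be sketched as a single function. Two assumptions are made here: that the thresholds 2, 10 and 25 are percentages (the percent signs appear to have been lost in this transcript), and that the counters are supplied by the caller. The function name and the sample counter values are hypothetical.

```python
def check_interface(ipkts, ierrs, opkts, oerrs, collis):
    """Apply the ratio rules of thumb to one interface's cumulative counters."""
    findings = []
    total = ipkts + opkts
    # Assumed thresholds: 2%, 10% and 25%.
    if total and (collis + ierrs + oerrs) / total > 0.02:
        findings.append("network hardware problem")
    if opkts and collis / opkts > 0.10:
        findings.append("interface overloaded")
    if ipkts and ierrs / ipkts > 0.25:
        findings.append("host dropping packets")
    return findings


# Made-up counters: many collisions, no errors.
print(check_interface(ipkts=1000, ierrs=0, opkts=1000, oerrs=0, collis=150))
```

Because the counters are cumulative since the interface came up, it is more informative to apply the checks to the difference between two samples than to the raw totals.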

47
What is Packet Capture?
  • Real time collection of data as it travels over
    networks
  • Tools called
  • packet sniffers
  • packet analysers
  • protocol analysers, and sometimes even
  • traffic monitors
  • Sniffer, tcpdump, Ethereal
  • See Ethereal Lab at
    http://ictlab.tyict.vtc.edu.hk/tsangkt/snm/Tutorials/ethereal/

48
When to Use Packet Capture?
  • Most powerful technique
  • When you need to see what client and server are
    actually saying to each other
  • When you need to analyse type of traffic on network
  • Requires understanding of network protocols to
    use effectively

49
Example
  • The following gives the contents of an Ethernet
    frame captured by a protocol analyzer
  • Sequence  Captured bit stream
    0000  23 87 45 9A 43 88 34 CD 7E FF 34 62 08 00 45 FF
    0010  12 34 23 76 40 00 64 06 CD AB 85 23 43 59 85 23
    0020  43 5A 23 87 52 63 25 41 40 43 00 00 00 00 FF 75
    0030  20 00 35 75 00 00 82 04 05 91 70 90
  • Given that the Ethernet II frame format is
    Field  Preamble  Dest. Address  Source Address  Type  Data      FCS
    Bytes  8         6              6               2     variable  4
  • What are the source and destination MAC
    addresses?
  • What is the type of Ethernet data?

50
Example (cont'd)
  • If the frame contains an IP datagram with the
    above format, determine
  • The Ethernet type value and version no. of the IP
    protocol.
  • The source and destination IP addresses in
    dot-decimal notation.
  • What is the protocol type?

51
Troubleshooting Protocols
  • DNS
  • email
  • using telnet

52
DNS troubleshooting
  • Suspect DNS when you get long timeouts before
    seeing any response
  • ping name, IP address, see if only IP address
    works
  • tools on Linux, Unix
  • nslookup, dig, host
  • tools on Windows
  • nslookup

53
nslookup: an interactive program
Here a user asks nslookup to provide the address of
sysadmin.no-ip.com. nslookup displays the name
and address of the server used to resolve the
query, then displays the answer to the query.
  • nslookup
  • > sysadmin.no-ip.com
  • Server: dns04.netvigator.com
  • Address: 218.102.32.208#53
  • Non-authoritative answer:
  • Name: sysadmin.no-ip.com
  • Address: 202.69.77.139

54
nslookup: reverse lookups
  • Maps IP address to hostname (PTR)
  • > 202.69.77.139
  • Server: dns04.netvigator.com
  • Address: 218.102.32.208#53
  • Non-authoritative answer:
  • 139.77.69.202.in-addr.arpa name =
    077-139.onebb.com.
  • Authoritative answers can be found from:
  • 77.69.202.in-addr.arpa nameserver =
    ns1.onebb.com.
  • 77.69.202.in-addr.arpa nameserver =
    ns2.onebb.com.
  • ns1.onebb.com internet address = 202.180.160.1
  • ns2.onebb.com internet address = 202.180.161.1
  • >
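The name queried in a reverse lookup is the IP address with its octets reversed, under in-addr.arpa. A minimal Python sketch using the standard ipaddress module shows the construction; the helper name ptr_name is made up for this example, and no DNS query is sent.

```python
import ipaddress


def ptr_name(ip):
    """Return the in-addr.arpa name that a PTR lookup for this address uses."""
    return ipaddress.ip_address(ip).reverse_pointer


print(ptr_name("202.69.77.139"))
```

To perform the lookup itself from Python, socket.gethostbyaddr asks the system resolver for the same mapping.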

55
DNS Record Types
  • Type   Name                Function
  • Zone Records
  • SOA    Start of Authority  This name server is authoritative for this domain
  • NS     Name Server         Identifies the name server for this domain
  • Basic Records
  • A      Address             Name-to-address mappings
  • PTR    Pointer             Address-to-name mappings
  • MX     Mail Exchanger      Makes mail routing decision
  • Optional Records
  • CNAME  Canonical Name      Nicknames for a host
  • HINFO  Host Info           Identifies hardware and OS

56
The SOA Record
  • Indicates that this name server is the best
    source of information for the data within this
    domain.
  • There is only one SOA record for each zone; the
    zone continues until another SOA record is
    encountered.
  • Secondary name servers within a domain hold a
    copy of the zone data obtained from the primary
  • A secondary name server checks the primary name
    server periodically (refresh time) and requests a
    zone transfer whenever the serial number has been
    incremented.

57
SOA Record Example
  • @ IN SOA rusty.austin.edu admin.austin.edu (
        1        ; Serial
        10800    ; Refresh after 3 hours
        3600     ; Retry after 1 hour
        604800   ; Expire after 1 week
        86400 )  ; Minimum TTL of 1 day
  • The symbol @ in the name field is a shorthand for
    the name of the current zone. In this example, it
    is the same as austin.edu
  • rusty.austin.edu is the zone's primary name
    server
  • admin.austin.edu is the email address (replace
    first dot with @) of the technical contact in
    charge of the data.
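Two details of this record are easy to get wrong by eye: the contact address and the timers in seconds. The sketch below converts both. The helper names are made up, and the contact conversion assumes the mailbox name itself contains no escaped dots.

```python
def soa_contact_email(rname):
    """Turn the SOA contact field into an email address (first dot -> @)."""
    local, _, domain = rname.rstrip(".").partition(".")
    return f"{local}@{domain}"


def describe_seconds(seconds):
    """Express an SOA timer in the largest unit that divides it evenly."""
    for label, size in (("week", 604800), ("day", 86400),
                        ("hour", 3600), ("minute", 60)):
        if seconds >= size and seconds % size == 0:
            count = seconds // size
            return f"{count} {label}{'s' if count != 1 else ''}"
    return f"{seconds} seconds"


print(soa_contact_email("admin.austin.edu"))
for timer in (10800, 3600, 604800, 86400):
    print(timer, "=", describe_seconds(timer))
```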

58
Email testing with telnet
  • Email protocols SMTP and POP3 are text-based
  • telnet is a good tool to test them
  • syntax:
  • telnet server portnumber
  • SMTP: port 25
  • POP3: port 110

59
Test the VTC mail server
  • telnet smtp.vtc.edu.hk 25
  • Trying 192.168.79.191...
  • Connected to smtp.vtc.edu.hk (192.168.79.191).
  • Escape character is '^]'.
  • 220 pandora.vtc.edu.hk ESMTP Mirapoint 3.2.2-GA;
    Tue, 25 Feb 2003 11:15:30 +0800 (HKT)
  • helo nickpc.tyict.vtc.edu.hk
  • 250 pandora.vtc.edu.hk Hello 172.19.32.30,
    pleased to meet you
  • mail from:<nicku@vtc.edu.hk>
  • 250 <nicku@vtc.edu.hk>... Sender ok
  • rcpt to:<nicku@vtc.edu.hk>
  • 250 <nicku@vtc.edu.hk>... Recipient ok
  • data
  • 354 Enter mail, end with "." on a line by itself
  • My message body.
  • .
  • 250 AFF21826 Message accepted for delivery
  • quit
  • 221 pandora.vtc.edu.hk closing connection
  • Connection closed by foreign host.

60
SMTP commands for sending mail
  • helo: identify your computer
  • mail from: specify sender
  • rcpt to: specify receiver
  • data: indicates start of message body
  • quit: terminate session
  • Use names, not IP addresses, to specify
    destination
61
Testing the VTC pop3 server 1
  • telnet pop.vtc.edu.hk 110
  • Trying 192.168.79.12...
  • Connected to pop.vtc.edu.hk (192.168.79.12).
  • Escape character is '^]'.
  • +OK carme.vtc.edu.hk POP3 service (iPlanet
    Messaging Server 5.2 Patch 1 (built Aug 19 2002))
  • user nicku
  • +OK Name is a valid mailbox
  • pass password
  • +OK Maildrop ready
  • stat
  • +OK 1 673

62
Testing the pop3 server 2
  • retr 1
  • +OK 673 octets
  • Return-path: <nicku@vtc.edu.hk>
  • Received: from pandora.vtc.edu.hk
    (pandora.vtc.edu.hk 192.168.79.191)
  • by carme.vtc.edu.hk (iPlanet Messaging Server
    5.2 Patch 1 (built Aug 19 2002))
  • with ESMTP id <0HAU00I35H3HGL@carme.vtc.edu.hk>
    for nicku@ims-ms-daemon
  • (ORCPT nicku@vtc.edu.hk); Tue, 25 Feb 2003
    11:16:29 +0800 (CST)
  • Received: from nickpc.tyict.vtc.edu.hk
    (172.19.32.30)
  • by pandora.vtc.edu.hk (Mirapoint
    Messaging Server MOS 3.2.2-GA)
  • with SMTP id AFF21826; Tue, 25 Feb 2003
    11:16:01 +0800 (HKT)
  • Date: Tue, 25 Feb 2003 11:15:30 +0800 (HKT)
  • From: Nick Urbanik <nicku@vtc.edu.hk>
  • Message-id: <200302250316.AFF21826@pandora.vtc.edu.hk>
  • My message body.
  • .
  • dele 1
  • +OK message deleted
  • quit

63
pop3 commands: retrieving mail
  • See RFC 1939 for easy-to-read details
  • First, must authenticate
  • user username
  • pass password
  • stat: shows number of messages and total size in
    bytes
  • list: lists all the message numbers and size in
    bytes of each message
  • retr messagenum: retrieves the message with
    number messagenum
  • dele messagenum: deletes the message with message
    number messagenum
  • quit

64
telnet: Testing Other Applications
  • Many network protocols are text-based. telnet can be
    helpful in checking
  • IMAP servers
  • telnet hostname 143
  • Web servers
  • telnet hostname 80
  • Ftp servers
  • telnet hostname 21
  • Even ssh (can check version, if responding)
  • telnet hostname 22

65
Conclusion
  • Check the simple things first
  • Document what you do
  • Become familiar with common tools
  • Use the tools to become familiar with your
    network before troubles strike
  • Know what is normal