Title: TCP/IP and Other Transports for High Bandwidth Applications Back to Basics
1TCP/IP and Other Transports for High Bandwidth
Applications Back to Basics
Richard Hughes-Jones, The University of Manchester
www.hep.man.ac.uk/rich/ then "Talks", then look for "Brasov"
2Structure of the Talks
- The aim is to give you a picture of how researchers are using high performance networks to support their work.
- Back to Basics
- Simple Introduction to Networking
- TCP/IP on High Bandwidth Long Distance Networks
- But TCP/IP works !
- The effect of packet loss
- Advanced TCP Stacks
- Fairness
- Real Applications on Real Networks
- Disk-2-disk applications on real networks
- Memory-2-memory tests
- Transatlantic disk-2-disk at Gigabit speeds
- Remote Computing Farms
- The effect of distance
- Radio Astronomy e-VLBI
- Thanks for allowing me to use their slides to:
- Sylvain Ravot (CERN), Les Cottrell (SLAC), Brian Tierney (LBL), Robin Tasker (DL)
3- Simple Introduction to Networking
4What is a Protocol Stack ?
- ISO OSI (Open Systems Interconnection) Seven Layer Model defines a framework allowing development of real network protocols
- A layer:
- performs unique and specific tasks
- only has knowledge of those layers immediately above and below
- uses services of the layer below, and provides services to the layer above
- the services defined by a layer are implementation independent: it's a definition of how things work
- conceptually communicates with its peer in the remote system
5The Layering Principle
- Encapsulation
- Each protocol layer N adds a Header to the data unit from layer N+1
- Header contains control information
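The encapsulation idea on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `encapsulate` and the bracketed header strings are hypothetical placeholders, not real protocol encodings.

```python
from typing import List


def encapsulate(payload: bytes, headers: List[bytes]) -> bytes:
    """Pass a payload down the stack: each layer prepends its own header."""
    data_unit = payload
    for header in headers:  # ordered from the highest layer to the lowest
        data_unit = header + data_unit
    return data_unit


# Placeholder headers (assumptions for illustration only).
app_data = b"hello"
frame = encapsulate(app_data, [b"[TCP]", b"[IP]", b"[ETH]"])
print(frame)
```

The lowest layer's header ends up outermost, which is why a receiver strips headers in the reverse order as the data unit travels back up the stack.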
6What do the Layers do?
- Transport Layer: acts as a go-between for the user and network
- Provides end-to-end data movement control
- Gives the level of reliability/integrity needed by the application
- Can ensure a reliable service (which the network layer cannot), e.g. assigns sequence numbers to identify lost packets
- Network Layer: deals with logical addressing, the transmission of packets, and the mechanism for routing.
- Data Link Layer: provides the synchronization and error checking for the data transmitted over a single physical link (may ensure correct delivery of frames)
- Going down: fits packets from the network layer above into frames.
- Going up: groups bits from the physical layer into frames.
- Physical Layer: concerned with the transmission of individual bits.
7How do the IP Protocols fit together?
- Application (Presentation, Session): TFTP (RFC 783), File Transfer Protocol FTP (RFC 959), ssh, ping, SNMP (RFC 1157), TELNET (RFC 854), DNS, traceroute, Simple Mail Transfer Protocol SMTP (RFC 821), NFS (RFC 1014, 1057 and 1094), HTTP, POP3/IMAP
- Transport: Transmission Control Protocol TCP (RFC 793), User Datagram Protocol UDP (RFC 768)
- Network: Internet Protocol IP (RFC 791), Internet Control Message Protocol ICMP (RFC 792), Address Resolution Protocols ARP (RFC 826) and RARP (RFC 903), Routing (OSPF, BGP)
- Data Link: Network Interface Cards; Ethernet, Token Ring, ISDN, FDDI, SMDS, ATM, SDH/SONET, xDSL
- Physical: Transmission Mode; TP Copper, Fibre Optic, Satellite, Microwave, DWDM, CWDM etc.
8Some of the IP Protocols
- Transmission Control Protocol. TCP provides application programs access to the network using a reliable, connection-oriented transport layer service.
- User Datagram Protocol. UDP provides an unreliable, connection-less delivery service using the IP protocol to transport messages between machines. It adds the ability to distinguish among multiple destinations on a single host computer.
- Internet Protocol. IP receives datagrams from the upper-layer software and transmits them to the destination host based upon a best effort, connection-less delivery service.
- Internet Control Message Protocol. ICMP allows internet routers to transmit error messages and test messages.
- Internet Group Management Protocol. IGMP is used with multicast to send UDP datagrams to multiple hosts.
- Address Resolution Protocol. ARP translates from the 32 bit IP address to a 48 bit LAN address.
- Reverse Address Resolution Protocol. RARP translates from the 48 bit LAN address to the 32 bit IP address.
9The Physical Layer 1 Ethernet
10The Link Layer 2 Ethernet Frame
Frame header | IP Datagram | FCS | Inter Frame Gap (12 bytes)
Preamble: 56 bits of alternating 0s and 1s. The preamble gives all the nodes on the network a signal against which to synchronize.
Start Frame Delimiter: marks the start of a frame. It is 8 bits long with the pattern 10101011.
Media Access Control (MAC) Address: every Ethernet network card has, built into its hardware, a unique six-octet (48-bit) number, usually written in hexadecimal, that differentiates it from all other Ethernet cards in the universe. The DA (destination address) and SA (source address) define the path across the link.
Length/Type field: two octets long. A value of 1500 (0x05DC) or less indicates the length of the data; a value greater than 1500 indicates the network-layer protocol (Ethernet Types).
Data: the reason the frame exists. MTU: Maximum Transmission Unit, the largest data field a frame can carry.
Frame Check Sequence: protects the frame contents.
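As a rough sketch of how software might decode the frame header described above, the Python below unpacks the two MAC addresses and the Length/Type field. It assumes the frame is handed over without the preamble, start frame delimiter or FCS (as is typical for captured frames), and the 0x0600 lower bound for type values is taken from IEEE 802.3 rather than from the slide. The function names and the sample frame are hypothetical.

```python
import struct


def format_mac(raw: bytes) -> str:
    """Render six octets in the usual colon-separated hexadecimal form."""
    return ":".join(f"{octet:02x}" for octet in raw)


def parse_ethernet_header(frame: bytes) -> dict:
    """Decode the 14-byte header: destination MAC, source MAC, Length/Type."""
    dst, src, length_or_type = struct.unpack("!6s6sH", frame[:14])
    if length_or_type <= 1500:
        kind = "length"          # IEEE 802.3 length of the data field
    elif length_or_type >= 0x0600:
        kind = "type"            # identifies the network-layer protocol
    else:
        kind = "undefined"       # values in between are not assigned
    return {
        "dst": format_mac(dst),
        "src": format_mac(src),
        "value": length_or_type,
        "kind": kind,
        "payload": frame[14:],
    }


# Made-up sample: broadcast destination, type 0x0800 (IPv4).
sample_frame = bytes.fromhex("ffffffffffff" "001122334455" "0800") + b"payload"
header = parse_ethernet_header(sample_frame)
print(header["dst"], header["src"], hex(header["value"]), header["kind"])
```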
11The Link Layer Ethernet VLANs
VLANs are logical networks built over the same physical cable plant. Ethernet frames are assigned to their logical networks using a VLAN header.
A VLAN frame is identified by the value 0x8100 in the Type field location. The next two octets are composed of the following three fields:
- User Priority field: 3 bits in length, used to define the priority of the Ethernet frame. This is utilized to define and deliver a class of service.
- Canonical format indicator: 1 bit in length. Just don't ask!!!
- VLAN Identifier field: 12 bits in length, contains the VLAN identifier (VID) of this frame.
The original Length/Type field then follows the inserted VLAN tag.
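The three VLAN tag fields can be pulled out of the two octets with shifts and masks. The sketch below assumes an untagged header is 14 bytes, so the tag starts at byte offset 12; the function name `parse_vlan_tag` and the sample values are hypothetical.

```python
import struct
from typing import Optional

VLAN_TYPE = 0x8100  # value in the Type field location that marks a VLAN tag


def parse_vlan_tag(frame: bytes) -> Optional[dict]:
    """Return the VLAN fields, or None if the frame carries no VLAN tag."""
    if len(frame) < 18:
        return None
    type_field, tag, inner_type = struct.unpack("!HHH", frame[12:18])
    if type_field != VLAN_TYPE:
        return None
    return {
        "priority": tag >> 13,         # top 3 bits: user priority
        "cfi": (tag >> 12) & 0x1,      # 1 bit: canonical format indicator
        "vid": tag & 0x0FFF,           # low 12 bits: VLAN identifier
        "inner_type": inner_type,      # original Length/Type field
    }


# Made-up tagged frame: priority 5, VID 100, carrying IPv4 (0x0800).
tag_value = (5 << 13) | 100
tagged = bytes(12) + struct.pack("!HHH", VLAN_TYPE, tag_value, 0x0800)
print(parse_vlan_tag(tagged))
```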
12The Network Layer 3 IP
- IP Layer properties
- Provides best effort delivery
- It is unreliable
- Packet may be lost
- Duplicated
- Out of order
- Connection less
- Provides logical addresses
- Provides routing
- Demultiplex data on protocol number
13The Internet datagram
Frame header | IP header (20 bytes) | Transport | FCS
14IP Datagram Format (cont.)
- Type of Service (TOS): now being used for QoS
- Total length: length of datagram in bytes, includes header and data
- Time to live (TTL): specifies how long the datagram is allowed to remain in the internet
- Routers decrement by 1
- When TTL = 0 the router discards the datagram
- Prevents infinite loops
- Protocol: specifies the format of the data area
- Protocol numbers are administered by a central authority to guarantee agreement, e.g. ICMP = 1, TCP = 6, UDP = 17
- Source and destination IP address (32 bits each): contain the IP addresses of the sender and intended recipient
- Options (variable length): mainly used to record a route, or timestamps, or specify routing
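A minimal sketch of decoding the fixed 20-byte IP header follows, assuming an IPv4 datagram with no options. The function name, the small protocol table and the sample addresses are illustrative choices; the header checksum is left at zero in the sample and is not verified here.

```python
import socket
import struct

PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP"}


def parse_ipv4_header(packet: bytes) -> dict:
    """Decode the fixed 20-byte part of an IPv4 header (options ignored)."""
    (vers_hlen, tos, total_length, identification, flags_fragment,
     ttl, protocol, checksum, src, dst) = struct.unpack(
        "!BBHHHBBH4s4s", packet[:20])
    return {
        "version": vers_hlen >> 4,
        "header_bytes": (vers_hlen & 0x0F) * 4,  # Hlen counts 32-bit words
        "tos": tos,
        "total_length": total_length,            # header plus data, in bytes
        "ttl": ttl,
        "protocol": PROTOCOL_NAMES.get(protocol, str(protocol)),
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Made-up header: version 4, Hlen 5, TTL 64, protocol 17 (UDP).
sample_header = struct.pack(
    "!BBHHHBBH4s4s", 0x45, 0, 28, 1, 0, 64, 17, 0,
    socket.inet_aton("192.168.22.123"), socket.inet_aton("10.0.0.1"))
print(parse_ipv4_header(sample_header))
```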
15Internet Class-based addresses
- An Address looks like 192.168.22.123
- Class A large number of hosts, few networks
- 0nnnnnnn hhhhhhhh hhhhhhhh hhhhhhhh
- 7 network bits (0 and 127 reserved, so 126 networks), 24 host bits (> 16M hosts/net)
- Initial byte 1-127 (decimal)
- Class B medium number of hosts and networks
- 10nnnnnn nnnnnnnn hhhhhhhh hhhhhhhh
- 16,384 class B networks, 65,534 hosts/network
- Initial byte 128-191 (decimal)
- Class C large number of small networks
- 110nnnnn nnnnnnnn nnnnnnnn hhhhhhhh
- 2,097,152 networks, 254 hosts/network
- Initial byte 192-223 (decimal)
- Class D Multicast (See RFC 1112)
- 1110nnnn nnnnnnnn nnnnnnnn hhhhhhhh
- Initial byte 224-239 (decimal)
- Class E Reserved
- Initial byte 240-255 (decimal)
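The class of an address follows from the leading bits of its first byte (0, 10, 110, 1110, 1111), which can be checked with simple comparisons. This is a small sketch with a hypothetical function name; it does no validation of the address string.

```python
def address_class(address: str) -> str:
    """Return the class letter implied by the first byte of a dotted address."""
    first_byte = int(address.split(".")[0])
    if first_byte < 128:    # leading bit 0
        return "A"
    if first_byte < 192:    # leading bits 10
        return "B"
    if first_byte < 224:    # leading bits 110
        return "C"
    if first_byte < 240:    # leading bits 1110
        return "D"
    return "E"              # leading bits 1111


for example in ("10.1.2.3", "172.16.0.1", "192.168.22.123", "224.0.0.1"):
    print(example, address_class(example))
```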
16The Transport Layer 4 UDP
- UDP Provides
- Connection less service over IP
- No setup or teardown
- One packet at a time
- Minimal overhead, high performance
- Provides best effort delivery
- It is unreliable
- Packet may be lost
- Duplicated
- Out of order
- Application is responsible for
- Data reliability
- Flow control
- Error handling
17UDP Datagram format
Frame header | IP header | UDP header (8 bytes) | Application data | FCS
- Source/destination port: port numbers identify the sending and receiving processes
- Port number and IP address together allow any application on the Internet to be uniquely identified
- Ports can be static or dynamic
- Static (< 1024): assigned centrally, known as well-known ports
- Dynamic
- Message length: in bytes, includes the UDP header and data (min 8, max 65,535)
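The 8-byte UDP header is four 16-bit fields: source port, destination port, message length and checksum. The sketch below builds and decodes one. The function names and port numbers are hypothetical, and the checksum is simply set to zero, which UDP over IPv4 treats as "no checksum computed".

```python
import struct

UDP_HEADER_BYTES = 8


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prepend a UDP header; the length field covers header plus data."""
    length = UDP_HEADER_BYTES + len(payload)
    checksum = 0  # assumption: checksum not computed in this sketch
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def parse_udp_header(datagram: bytes) -> dict:
    """Decode the four 16-bit header fields."""
    src_port, dst_port, length, checksum = struct.unpack(
        "!HHHH", datagram[:UDP_HEADER_BYTES])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "dst_is_well_known": dst_port < 1024,  # static, centrally assigned
    }


datagram = build_udp_datagram(5001, 53, b"query")
print(parse_udp_header(datagram))
```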
18The Transport Layer 4 TCP
- TCP (RFC 793, RFC 1122) provides
- Connection-oriented service over IP
- During setup the two ends agree on details
- Explicit teardown
- Multiple connections allowed
- Reliable end-to-end byte stream delivery over an unreliable network
- It takes care of
- Lost packets
- Duplicated packets
- Out of order packets
- TCP provides
- Data buffering
- Flow control
- Error detection and handling
- Limits network congestion
19The TCP Segment Format
20 Bytes
20TCP Segment Format cont.
- Source/Dest port: TCP port numbers to ID applications at both ends of the connection
- Sequence number: first byte in the segment from the sender's byte stream
- Acknowledgement: identifies the number of the byte the sender of this segment expects to receive next
- Code: used to determine segment purpose, e.g. SYN, ACK, FIN, URG
- Window: advertises how much data this station is willing to accept. Can depend on buffer space remaining.
- Options: used for window scaling, SACK, timestamps, maximum segment size etc.
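A sketch of decoding the fixed 20-byte TCP header, including the code bits, might look like the following. The function name and sample values are hypothetical, options are ignored, and the flag bit positions are the standard ones from the TCP specification rather than anything shown on the slide.

```python
import struct

# Code bits in the low 6 bits of the 16-bit offset/flags field.
FLAG_BITS = {"FIN": 0x01, "SYN": 0x02, "RST": 0x04,
             "PSH": 0x08, "ACK": 0x10, "URG": 0x20}


def parse_tcp_header(segment: bytes) -> dict:
    """Decode the fixed 20-byte part of a TCP header (options ignored)."""
    (src_port, dst_port, sequence, acknowledgement,
     offset_flags, window, checksum, urgent) = struct.unpack(
        "!HHIIHHHH", segment[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "sequence": sequence,                # first byte carried in this segment
        "acknowledgement": acknowledgement,  # next byte expected from the peer
        "header_bytes": (offset_flags >> 12) * 4,
        "flags": [name for name, bit in FLAG_BITS.items() if offset_flags & bit],
        "window": window,                    # bytes the sender will accept
    }


# Made-up SYN+ACK segment: data offset 5 words, flags 0x12.
sample_segment = struct.pack(
    "!HHIIHHHH", 80, 40000, 1000, 2001, (5 << 12) | 0x12, 65535, 0, 0)
print(parse_tcp_header(sample_segment))
```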
21The RTP Header Format
Frame header | IP header | UDP header | RTP header | Application data | FCS
22
IP header layout, one 32-bit word per row (bit positions 0, 4, 8, 16, 19, 24, 31 marked):
Vers | Hlen | Type of serv. | Total length
Identification | Flags | Fragment offset
TTL | Protocol | Header Checksum
Source IP address
Destination IP address
IP Options (if any) | Padding
23More Information
- Lectures, tutorials etc. on TCP/IP
- www.nv.cc.va.us/home/joney/tcp_ip.htm
- www.cs.pdx.edu/jrb/tcpip.lectures.html
- www.raleigh.ibm.com/cgi-bin/bookmgr/BOOKS/EZ306200/CCONTENTS
- www.cisco.com/univercd/cc/td/doc/product/iaabu/centri4/user/scf4ap1.htm
- www.cis.ohio-state.edu/htbin/rfc/rfc1180.html
- www.jbmelectronics.com/tcp.htm
- Encyclopaedia
- http://www.freesoft.org/CIE/index.htm
- TCP/IP Resources
- www.private.org.il/tcpip_rl.html
- Understanding IP addresses
- http://www.3com.com/solutions/en_US/ncs/501302.html
- Configuring TCP (RFC 1122)
- ftp://nic.merit.edu/internet/documents/rfc/rfc1122.txt
- Assigned protocols, ports etc (RFC 1010)
- http://www.es.net/pub/rfcs/rfc1010.txt
- /etc/protocols
24 25 26More Information Some URLs
- UKLight web site: http://www.uklight.ac.uk
- MB-NG project web site: http://www.mb-ng.net/
- DataTAG project web site: http://www.datatag.org/
- UDPmon / TCPmon kit and writeup: http://www.hep.man.ac.uk/rich/net
- Motherboard and NIC Tests
- http://www.hep.man.ac.uk/rich/net/nic/GigEth_tests_Boston.ppt and http://datatag.web.cern.ch/datatag/pfldnet2003/
- "Performance of 1 and 10 Gigabit Ethernet Cards with Server Quality Motherboards", FGCS Special issue 2004
- http://www.hep.man.ac.uk/rich/
- TCP tuning information may be found at: http://www.ncne.nlanr.net/documentation/faq/performance.html and http://www.psc.edu/networking/perf_tune.html
- TCP stack comparisons: "Evaluation of Advanced TCP Stacks on Fast Long-Distance Production Networks", Journal of Grid Computing 2004
- PFLDnet: http://www.ens-lyon.fr/LIP/RESO/pfldnet2005/
- Dante PERT: http://www.geant2.net/server/show/nav.00d00h002
27tcpdump / tcptrace
- tcpdump: dump all TCP header information for a specified source/destination
- ftp://ftp.ee.lbl.gov/
- tcptrace: format tcpdump output for analysis using xplot
- http://www.tcptrace.org/
- NLANR TCP Testrig: nice wrapper for tcpdump and tcptrace tools
- http://www.ncne.nlanr.net/TCP/testrig/
- Sample use
- tcpdump -s 100 -w /tmp/tcpdump.out host hostname
- tcptrace -Sl /tmp/tcpdump.out
- xplot /tmp/a2b_tsg.xpl