CSE 524: Lecture 12 - PowerPoint PPT Presentation

Slides: 50
Provided by: wuch
1
CSE 524 Lecture 12
  • Transport Layer (Part 3)

2
Transport Layer
  • Last class
  • CIDR exam question
  • Specific transport layers
  • UDP
  • This class
  • TCP

3
TL TCP and Transport Layer Functions
  • Demux to upper layer
  • Quality of service
  • Security
  • Delivery semantics
  • Flow control
  • Congestion control
  • Reliable data transfer

4
TL TCP Overview (RFCs 793, 1122, 1323, 2018, 2581)
  • point-to-point
  • one sender, one receiver
  • reliable, in-order byte stream
  • no message boundaries
  • pipelined
  • TCP congestion and flow control set window size
  • send and receive buffers
  • full duplex data
  • bi-directional data flow in same connection
  • MSS: maximum segment size
  • connection-oriented
  • handshaking (exchange of control msgs) inits
    sender, receiver state before data exchange
  • protocol implemented at ends (fate-sharing)
  • flow and congestion controlled
  • sender will not overwhelm receiver or network

5
TL TCP header
  • URG: urgent data (generally not used)
  • sequence number, acknowledgement number: counting by bytes of data (not segments!)
  • ACK: ACK # valid
  • PSH: push data now (generally not used)
  • receive window: # bytes rcvr willing to accept
  • RST, SYN, FIN: connection estab (setup, teardown commands)
  • checksum: Internet checksum (as in UDP)
6
TL TCP connections
  • TCP sender, receiver establish connection
    before exchanging data segments
  • initialize TCP variables
  • Initial sequence #s
  • Buffers, flow control info (e.g. RcvWindow)
  • Window scaling
  • client: connection initiator
  • server: contacted by client
  • Java API
  • Socket clientSocket = new Socket("hostname", port);
  • Socket connectionSocket = welcomeSocket.accept();

7
TL TCP connections
  • Three way handshake
  • Step 1: client end system sends TCP SYN control
    segment to server
  • specifies initial seq #
  • should be random to prevent spoofing (
    http://www.rfc-editor.org/rfc/rfc1948.txt )
  • Step 2: server end system receives SYN, replies
    with SYNACK control segment
  • ACKs received SYN
  • allocates buffers
  • specifies server->receiver initial seq. #
  • Step 3: client receives SYNACK control segment,
    replies with ACK and potentially data
  • ACKs received SYNACK
  • goes to established state

8
TL TCP Connection Establishment
  • A and B must agree on initial sequence number
    selection
  • 3-way handshake

  • A -> B: SYN, Seq A
  • B -> A: SYN, ACK-A, Seq B
  • A -> B: ACK-B
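
The exchange above can be sketched in a few lines. This is an illustrative model rather than a real TCP stack: the function name `three_way_handshake` and the tuple layout are assumptions made for this example. It shows that each side acknowledges the other's initial sequence number plus one, because the SYN itself consumes one sequence number.

```python
SEQ_MOD = 2**32  # TCP sequence numbers are 32 bits wide and wrap around


def three_way_handshake(isn_a, isn_b):
    """Return the three handshake segments as (sender, flags, seq, ack) tuples.

    isn_a and isn_b are the initial sequence numbers chosen by A and B.
    """
    # Step 1: A sends SYN carrying its initial sequence number.
    syn = ("A", "SYN", isn_a, None)
    # Step 2: B replies with its own ISN and ACKs A's SYN.
    # The SYN consumes one sequence number, so the ACK is isn_a + 1.
    synack = ("B", "SYN+ACK", isn_b, (isn_a + 1) % SEQ_MOD)
    # Step 3: A ACKs B's SYN; A's own next sequence number is isn_a + 1.
    ack = ("A", "ACK", (isn_a + 1) % SEQ_MOD, (isn_b + 1) % SEQ_MOD)
    return [syn, synack, ack]


for segment in three_way_handshake(isn_a=1000, isn_b=5000):
    print(segment)
```

The modulo keeps the arithmetic correct when an initial sequence number sits at the top of the 32-bit range.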
9
TL TCP Sequence Number Selection
  • Why not simply choose 0?
  • Must avoid overlap with earlier incarnation
  • Client machine uses seq # 0, initiates connection to
    server with seq # 0.
  • Client sends one byte and machine crashes
  • Client reboots and initiates connection again
  • Server thinks new incarnation is the same as old
    connection

10
TL TCP Sequence Number Selection
  • Why is selecting a random ISN important?
  • Suppose machine X selects ISN based on
    predictable sequence
  • Fred has .rhosts to allow login to X from Y
  • Evil Ed attacks
  • Disables host Y: denial of service attack
  • Makes a bunch of connections to host X
  • Determines ISN pattern and guesses next ISN
  • Fake pkt1: <src Y><dst X>, guessed ISN
  • Fake pkt2: desired command
  • Attack popularized by K. Mitnick

11
TL TCP ISN selection and spoofing attacks
(Figure: host X holds a .rhosts entry trusting host Y; attacker Ed sends packets to X that claim to come from Y.)
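
One defence against ISN guessing is the scheme described in RFC 1948 (linked on the handshake slide): ISN = M + F(connection 4-tuple, secret), where M is a steadily increasing clock. The sketch below illustrates that idea only. The function name `choose_isn`, its parameters, and the use of SHA-256 are assumptions of this example (the RFC suggests MD5), and the clock is passed in as a plain number to keep the example deterministic.

```python
import hashlib

SEQ_MOD = 2**32  # sequence numbers are 32 bits wide


def choose_isn(local_ip, local_port, remote_ip, remote_port, secret, clock):
    """ISN = clock + F(connection 4-tuple, secret), modulo 2**32.

    clock:  value of a steadily increasing counter (hypothetical input here).
    secret: bytes known only to this host, so outsiders cannot compute F.
    """
    conn_id = f"{local_ip}:{local_port}|{remote_ip}:{remote_port}".encode()
    # SHA-256 is an assumption of this sketch, not taken from the lecture.
    digest = hashlib.sha256(conn_id + secret).digest()
    offset = int.from_bytes(digest[:4], "big")
    return (clock + offset) % SEQ_MOD
```

Because the offset depends on the 4-tuple and a secret, an attacker who opens his own connections to X learns nothing useful about the ISN that X would use for a connection from Y, while successive incarnations of the same connection still get increasing sequence numbers.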
12
TL TCP connection setup
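State transitions (event / action):
  • CLOSED -> LISTEN: passive OPEN / create TCB
  • CLOSED -> SYN SENT: active OPEN / create TCB, snd SYN
  • LISTEN -> CLOSED: CLOSE / delete TCB
  • LISTEN -> SYN RCVD: rcv SYN / snd SYN, ACK
  • LISTEN -> SYN SENT: APP SEND / snd SYN
  • SYN SENT -> CLOSED: CLOSE / delete TCB
  • SYN SENT -> SYN RCVD: rcv SYN / snd ACK
  • SYN SENT -> ESTAB: rcv SYN, ACK / snd ACK
  • SYN RCVD -> ESTAB: rcv ACK of SYN
  • SYN RCVD -> (teardown): CLOSE / send FIN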
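
The setup portion of the state diagram can be written as a lookup table. This is a minimal sketch: the names `SETUP_TRANSITIONS` and `run_events` are made up for this example, the transitions follow the RFC 793 connection state diagram, and error handling (RST, timeouts) is left out.

```python
# (current state, event) -> (next state, action taken on the transition)
SETUP_TRANSITIONS = {
    ("CLOSED", "passive OPEN"): ("LISTEN", "create TCB"),
    ("CLOSED", "active OPEN"): ("SYN SENT", "create TCB, snd SYN"),
    ("LISTEN", "rcv SYN"): ("SYN RCVD", "snd SYN, ACK"),
    ("LISTEN", "SEND"): ("SYN SENT", "snd SYN"),
    ("LISTEN", "CLOSE"): ("CLOSED", "delete TCB"),
    ("SYN SENT", "rcv SYN"): ("SYN RCVD", "snd ACK"),
    ("SYN SENT", "rcv SYN, ACK"): ("ESTAB", "snd ACK"),
    ("SYN SENT", "CLOSE"): ("CLOSED", "delete TCB"),
    ("SYN RCVD", "rcv ACK of SYN"): ("ESTAB", None),
}


def run_events(events, state="CLOSED"):
    """Apply events in order and return the final state.

    Raises KeyError if an event is not valid in the current state.
    """
    for event in events:
        state, _action = SETUP_TRANSITIONS[(state, event)]
    return state


# Client side of the three-way handshake.
print(run_events(["active OPEN", "rcv SYN, ACK"]))
# Server side of the three-way handshake.
print(run_events(["passive OPEN", "rcv SYN", "rcv ACK of SYN"]))
```

Both calls end in ESTAB; the simultaneous-open path (SYN SENT, rcv SYN, then rcv ACK of SYN) reaches the same state through SYN RCVD.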
13
TL TCP connections
  • Data transfer for established connections using
    sequence numbers and sliding windows with
    cumulative ACKs
  • Seq. #s
  • byte stream number of first byte in segment's
    data
  • ACKs
  • seq # of next byte expected from other side
  • cumulative ACK
  • duplicate ACKs sent when out-of-order packet
    received
  • See web trace
  • Java API
  • connectionSocket.getInputStream() (read received bytes)
  • clientSocket.getOutputStream() (write bytes to send)

Simple telnet scenario (Host A is the client, Host B the server):
  • User types 'C'
  • A -> B: Seq=42, ACK=79, data='C'
  • Host B ACKs receipt of 'C', echoes back 'C'
  • B -> A: Seq=79, ACK=43, data='C'
  • Host A ACKs receipt of echoed 'C'
  • A -> B: Seq=43, ACK=80
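
The numbers in the telnet scenario follow from two rules: a reply's sequence number is whatever the peer said it expects next, and a reply's ACK is the incoming sequence number advanced by the number of data bytes received. A minimal sketch, with the hypothetical helper name `reply_to` and segments modelled as plain tuples:

```python
def reply_to(seq, ack, data, reply_data=b""):
    """Build the (seq, ack, data) a host sends in response to a segment.

    seq, ack, data describe the segment that just arrived.
    reply_data is the payload the replying host wants to send back.
    """
    # Our next byte is the one the peer asked for; we now expect the byte
    # that follows the data we just received.
    return (ack, seq + len(data), reply_data)


first = (42, 79, b"C")                   # A -> B: user typed 'C'
echo = reply_to(*first, reply_data=b"C")  # B -> A: ACK plus echoed 'C'
final = reply_to(*echo)                   # A -> B: pure ACK, no data
print(echo, final)
```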
14
TL TCP connections
  • Closing a connection
  • Client-initiated close (reverse process for
    server-initiated close)
  • Java API: clientSocket.close()
  • Step 1: client end system sends TCP FIN control
    segment to server
  • Step 2: server receives FIN, replies with ACK.
    Closes connection, sends FIN.

15
TL TCP connections
  • Step 3: client receives FIN, replies with ACK.
  • Enters timed wait: will respond with ACK to
    received FINs
  • Step 4: server receives ACK. Connection closed.
  • Note: with small modification, can handle
    simultaneous FINs.

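  • client -> server: FIN (client closing)
  • server -> client: ACK
  • server -> client: FIN (server closing)
  • client -> server: ACK (client enters timed wait)
  • server closed on receiving the ACK; client closed after timed wait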
16
TL TCP Connection Tear-down
(Figure: Sender/Receiver timeline with the exchanges FIN, FIN-ACK, Data write, Data ack, FIN, FIN-ACK.)
17
TL TCP Connection Tear-down
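State transitions (event / action):
  • ESTAB -> FIN WAIT-1: CLOSE / send FIN
  • ESTAB -> CLOSE WAIT: rcv FIN / send ACK
  • CLOSE WAIT -> LAST-ACK: CLOSE / snd FIN
  • FIN WAIT-1 -> FIN WAIT-2: rcv ACK of FIN
  • FIN WAIT-1 -> CLOSING: rcv FIN / snd ACK
  • FIN WAIT-1 -> TIME WAIT: rcv FIN+ACK / snd ACK
  • FIN WAIT-2 -> TIME WAIT: rcv FIN / snd ACK
  • CLOSING -> TIME WAIT: rcv ACK of FIN
  • LAST-ACK -> CLOSED: rcv ACK of FIN
  • TIME WAIT -> CLOSED: Timeout = 2 MSL / delete TCB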
18
TL Time Wait Issues
  • Cannot close connection immediately after
    receiving FIN
  • What if a new connection restarts and uses same
    sequence number?
  • Web servers, not clients, close connection first
  • Established -> Fin-Waits -> Time-Wait -> Closed
  • Why would this be a problem?
  • Time-Wait state lasts for 2 * MSL
  • MSL should be 120 seconds (is often 60s)
  • Servers often have order of magnitude more
    connections in Time-Wait
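
To see why Time-Wait connections pile up on a busy server, note that every connection the server closes first lingers for 2 * MSL. A rough steady-state estimate (arrival rate times holding time) is sketched below; the function name and the rate of 100 closes per second are hypothetical values chosen for illustration.

```python
def time_wait_connections(closes_per_second, msl_seconds=60):
    """Estimate how many connections sit in Time-Wait at steady state.

    Each actively closed connection stays in Time-Wait for 2 * MSL, so the
    average number present is the close rate times that holding time.
    """
    return closes_per_second * 2 * msl_seconds


# A server closing 100 connections per second with MSL = 60 s.
print(time_wait_connections(100))
```

If the same server keeps each connection established for about a second, it holds roughly 100 established connections at a time, far fewer than the number waiting in Time-Wait.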

19
TL TCP connections
TCP server lifecycle
TCP client lifecycle
20
TL TCP Demux to upper layer
Multiplexing: gathering data from multiple app processes,
enveloping data with header (later used for
demultiplexing)
  • multiplexing/demultiplexing
  • based on sender, receiver port numbers, IP
    addresses
  • source, dest port #s in each segment
  • recall well-known port numbers for specific
    applications
  • Servers wait on well known ports (/etc/services)

TCP/UDP segment format (32 bits wide):
  • source port #, dest port #
  • other header fields
  • application data (message)
21
TL TCP Demux to upper layer
(Figure, port use for a simple telnet app: host A and server B.)
(Figure, port use for a Web server: Web client host A, Web client host C, and Web server B.)
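
Demultiplexing on the full (source IP, source port, dest IP, dest port) tuple is what lets one Web server port serve many clients at once. The sketch below is a toy model, not an operating system's socket table: the class name `Demultiplexer`, its methods, and the string labels standing in for sockets are all assumptions of this example.

```python
class Demultiplexer:
    """Maps incoming TCP segments to sockets (labels stand in for sockets)."""

    def __init__(self):
        self.established = {}  # (src_ip, src_port, dst_ip, dst_port) -> socket
        self.listening = {}    # dst_port -> listening (welcome) socket

    def listen(self, port, sock):
        """Register a server socket waiting on a well-known port."""
        self.listening[port] = sock

    def accept(self, src_ip, src_port, dst_ip, dst_port, sock):
        """Register a per-connection socket for one client."""
        self.established[(src_ip, src_port, dst_ip, dst_port)] = sock

    def deliver(self, src_ip, src_port, dst_ip, dst_port):
        """Return the socket that should receive a segment, or None."""
        key = (src_ip, src_port, dst_ip, dst_port)
        if key in self.established:
            return self.established[key]
        # No matching connection: hand it to a listener on that port, if any.
        return self.listening.get(dst_port)


demux = Demultiplexer()
demux.listen(80, "welcome")
demux.accept("A", 1180, "B", 80, "conn-A")
demux.accept("C", 1180, "B", 80, "conn-C")
print(demux.deliver("A", 1180, "B", 80), demux.deliver("C", 1180, "B", 80))
```

Clients A and C use the same source port here, yet their segments reach different sockets because the source IP address is part of the key.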
22
TL TCP Flow control
  • TCP is a sliding window protocol
  • For window size n, can send up to n bytes without
    receiving an acknowledgement
  • When the data is acknowledged then the window
    slides forward
  • Each packet advertises a window size
  • Indicates number of bytes the receiver has space
    for
  • Original TCP always sent entire window
  • Congestion control now limits this

23
TL TCP Flow control
  • receiver explicitly informs sender of
    (dynamically changing) amount of free buffer
    space
  • RcvWindow field in TCP segment
  • sender keeps the amount of transmitted, unACKed
    data less than most recently received RcvWindow

Flow control: sender won't overrun receiver's buffers
by transmitting too much, too fast
RcvBuffer = size of TCP Receive Buffer
RcvWindow = amount of spare room in Buffer
(Figure: receiver buffering)
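
The bookkeeping on both ends is simple arithmetic. In the sketch below the variable names LastByteRcvd, LastByteRead, LastByteSent and LastByteAcked are assumptions borrowed from common textbook notation (they do not appear on the slide), and the byte counts in the example call are made up.

```python
def rcv_window(rcv_buffer, last_byte_rcvd, last_byte_read):
    """Receiver side: spare room left in the receive buffer."""
    buffered = last_byte_rcvd - last_byte_read  # received but not yet read
    return rcv_buffer - buffered


def sender_allowance(advertised_window, last_byte_sent, last_byte_acked):
    """Sender side: how many more bytes may be sent right now."""
    in_flight = last_byte_sent - last_byte_acked  # transmitted, unACKed
    return max(0, advertised_window - in_flight)


# 4096-byte buffer holding 1000 unread bytes; sender has 2000 bytes in flight.
window = rcv_window(4096, last_byte_rcvd=3000, last_byte_read=2000)
print(window, sender_allowance(window, last_byte_sent=5000, last_byte_acked=3000))
```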
24
TL TCP Flow control
  • What happens if window is 0?
  • Receiver updates window when application reads
    data
  • What if this update is lost?
  • Deadlock
  • TCP Persist timer
  • Sender periodically sends window probe packets
  • Receiver responds with ACK and up-to-date window
    advertisement

25
TL TCP flow control enhancements
  • Problem (Clark, 1982)
  • If receiver advertises small increases in the
    receive window then the sender may waste time
    sending lots of small packets
  • What happens if window is small?
  • Small packet problem known as Silly window
    syndrome
  • Receiver advertises one byte window
  • Sender sends one byte packet (1 byte data, 40
    byte header = 4000% overhead)

26
TL TCP flow control enhancements
  • Solutions to silly window syndrome
  • Clark (1982)
  • receiver avoidance
  • prevent receiver from advertising small windows
  • increase advertised receiver window by min(MSS,
    RecvBuffer/2)
  • Nagle's algorithm (1984)
  • sender avoidance
  • prevent sender from unnecessarily sending small
    packets
  • http://www.rfc-editor.org/rfc/rfc896.txt
  • Inhibit the sending of new TCP segments when new
    outgoing data arrives from the user if any
    previously transmitted data on the connection
    remains unacknowledged
  • Allow only one outstanding small (not full sized)
    segment that has not yet been acknowledged
  • Works for idle connections (no deadlock)
  • Works for telnet (send one-byte packets
    immediately)
  • Works for bulk data transfer (delay sending)
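
The sender-side rule can be reduced to one decision: a full-sized segment may always go out, while a small one may go out only if nothing is outstanding. The sketch below captures just that decision. The function name `nagle_can_send` is hypothetical, and the receiver's window and any timers are deliberately ignored.

```python
def nagle_can_send(pending_bytes, mss, unacked_bytes):
    """Decide whether the sender may transmit a segment now.

    pending_bytes: data the application has written but TCP has not sent.
    unacked_bytes: previously transmitted data not yet acknowledged.
    """
    if pending_bytes == 0:
        return False  # nothing to send
    if pending_bytes >= mss:
        return True   # a full-sized segment is never delayed
    # Small segment: allow only one outstanding at a time.
    return unacked_bytes == 0


print(nagle_can_send(1, 1460, 0))       # idle telnet connection
print(nagle_can_send(1, 1460, 1))       # keystroke while one is outstanding
print(nagle_can_send(3000, 1460, 1460))  # bulk transfer
```

On an idle connection the first keystroke leaves immediately; later keystrokes accumulate until the ACK returns and then travel together in one segment.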

27
TL TCP reliable data transfer
  • Segment integrity
  • Acknowledgement generation
  • Retransmission

28
TL TCP RDT segment integrity
  • Checksum included in header
  • Is it sufficient to just checksum the packet
    contents?
  • No, need to ensure correct source/destination
  • Pseudoheader: portions of IP hdr that are
    critical
  • Checksum covers Pseudoheader, transport hdr, and
    packet body
  • Layer violation, redundant with parts of IP
    checksum
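
A sketch of the computation for IPv4 follows. It assumes the usual pseudoheader layout (source address, destination address, a zero byte, the protocol number, and the TCP length) and the Internet checksum as described in RFC 1071; the function names are made up for this example, and the segment passed in is expected to have its checksum field set to zero when computing.

```python
import socket
import struct

TCP_PROTOCOL = 6  # IP protocol number for TCP


def internet_checksum(data):
    """16-bit ones' complement of the ones' complement sum of 16-bit words."""
    data = bytes(data)
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def tcp_checksum(src_ip, dst_ip, segment):
    """Checksum over pseudoheader, transport header, and packet body."""
    pseudoheader = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!BBH", 0, TCP_PROTOCOL, len(segment))
    )
    return internet_checksum(pseudoheader + segment)
```

A receiver can verify a segment by running the same computation with the checksum field left in place: the result is 0 when nothing was corrupted and the addresses match.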

29
TL TCP RDT acks and timeouts
  • TCP's reliable data transfer approach
  • Cumulative acknowledgements
  • Receiver sends back the byte number it expects to
    receive next
  • Out of order packets generate duplicate
    acknowledgements
  • Receive 1, Ack 2
  • Receive 4, Ack 2
  • Receive 3, Ack 2
  • Receive 2, Ack 5
  • Retransmissions
  • Sender sends segment and sets a timer
  • Waits for an acknowledgement indicating segment
    was received
  • Send 1
  • Wait for Ack 2
  • No Ack 2 and timer expires
  • Send 1 again
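
The receive/ACK sequence above can be reproduced with a few lines. This sketch numbers whole segments 1, 2, 3, ... as the slide does (real TCP counts bytes), and the function name `cumulative_acks` is made up for the example.

```python
def cumulative_acks(arrivals):
    """Return the ACK sent after each arriving segment number.

    The ACK always names the next segment the receiver expects, so
    out-of-order arrivals produce duplicate ACKs.
    """
    received = set()
    next_expected = 1
    acks = []
    for segment in arrivals:
        received.add(segment)
        # Advance past every segment that is now contiguous.
        while next_expected in received:
            next_expected += 1
        acks.append(next_expected)
    return acks


print(cumulative_acks([1, 4, 3, 2]))
```

Segments 4 and 3 each trigger a duplicate "Ack 2"; once segment 2 fills the gap, a single "Ack 5" covers everything received so far.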

30
TL TCP RDT acks and timeouts
Simplified sender, assuming
  • one way data transfer
  • no flow, congestion control

Wait for event, then:
  • event: data received from application above -> create, send segment
  • event: timer timeout for segment with seq # y -> retransmit segment
  • event: ACK received, with ACK # y -> ACK processing
31
TL TCP RDT acks and timeouts
Simplified TCP sender

sendbase = initial_sequence_number
nextseqnum = initial_sequence_number

loop (forever) {
  switch (event)
    event: data received from application above
      create TCP segment with sequence number nextseqnum
      start timer for segment nextseqnum
      pass segment to IP
      nextseqnum = nextseqnum + length(data)
    event: timer timeout for segment with sequence number y
      retransmit segment with sequence number y
      compute new timeout interval for segment y
      restart timer for sequence number y
    event: ACK received, with ACK field value of y
      if (y > sendbase) { /* cumulative ACK of all data up to y */
        cancel all timers for segments with sequence numbers < y
        sendbase = y
      }
      else { /* a duplicate ACK for already ACKed segment */
        increment number of duplicate ACKs received for y
        if (number of duplicate ACKs received for y == 3) {
          /* TCP fast retransmit */
          resend segment with sequence number y
          restart timer for segment y
        }
      }
} /* end of loop forever */
32
TL TCP delayed acknowledgements
  • Problem
  • In request/response programs, you send separate
    ACK and Data packets for each transaction
  • Delay ACK in order to send ACK back along with
    data
  • Solution
  • Don't ACK data immediately
  • Wait 200ms (must be less than 500ms; why?)
  • Must ACK every other packet
  • Must not delay duplicate ACKs
  • Without delayed ACK: 40 byte ACK + data packet
  • With delayed ACK: data packet includes ACK
  • See web trace example
  • Extensions for asymmetric links
  • See later part of lecture

33
TL TCP ACK generation RFC 1122, RFC 2581
Event -> TCP receiver action
  • in-order segment arrival, no gaps, everything else already ACKed
    -> delayed ACK: wait up to 500ms for next segment; if no next segment, send ACK
  • in-order segment arrival, no gaps, one delayed ACK pending
    -> immediately send single cumulative ACK
  • out-of-order segment arrival, higher-than-expected seq. #, gap detected
    -> send duplicate ACK, indicating seq. # of next expected byte
  • arrival of segment that partially or completely fills gap
    -> immediate ACK if segment starts at lower end of gap
34
TL TCP retransmission
  • Wait at least one RTT before retransmitting
    packet
  • Importance of accurate RTT estimators
  • Estimator too low -> unneeded retransmissions
  • Estimator too high -> poor throughput, slow
    reaction to segment loss
  • RTT estimator must adapt to change in RTT
  • But not too fast, or too slow!
  • Backing off the retransmission timeout
  • Exponential backoff
  • Double retransmission timer interval after every
    loss until successful retransmission

35
TL TCP retransmission scenarios
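Scenario: premature timeout, cumulative ACKs (Host A sends data, Host B ACKs)
  • A -> B: Seq=92, 8 bytes data
  • A -> B: Seq=100, 20 bytes data
  • B -> A: ACK=100, then ACK=120 (both delayed in the network)
  • Seq=92 timeout expires before ACK=100 arrives
  • A -> B: Seq=92, 8 bytes data (retransmission)
  • B -> A: ACK=120 (cumulative ACK)
  • Seq=100 timeout runs separately for the second segment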
36
TL Initial Round-trip Estimator
  • Round trip times exponentially averaged
  • Recommended value for x: 0.1-0.2
  • 0.125 for most TCPs
  • Influence of given sample decreases exponentially
    fast
  • Retransmit timer set to b * RTT, where b = 2
  • Every time timer expires, RTO exponentially
    backed-off
  • Like Ethernet
  • Not good at preventing spurious timeouts

EstimatedRTT = (1-x) * EstimatedRTT + x * SampleRTT
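
A minimal sketch of the exponential average and the original timer rule, with x = 0.125 and b = 2 as on the slide. The function names and the sample values (in milliseconds) are made up for illustration.

```python
def update_estimated_rtt(estimated_rtt, sample_rtt, x=0.125):
    """One step of the exponentially weighted moving average."""
    return (1 - x) * estimated_rtt + x * sample_rtt


def retransmit_timer(estimated_rtt, b=2):
    """Original rule: timer is a fixed multiple of the RTT estimate."""
    return b * estimated_rtt


def sample_weight(age, x=0.125):
    """Weight that a sample still carries after `age` newer samples."""
    return x * (1 - x) ** age


estimate = 100.0
for sample in [200.0, 100.0, 100.0]:
    estimate = update_estimated_rtt(estimate, sample)
    print(estimate, retransmit_timer(estimate))
```

A single 200 ms outlier moves the estimate by only one eighth of the difference, and each later sample shrinks its remaining influence by a factor of (1 - x).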
37
TL Jacobson's Retransmission Timeout
  • Key observation
  • At high loads round trip variance is high
  • Need larger safety margin with larger variations
    in RTT
  • Solution
  • Base RTO value on RTT and standard deviation
    of the RTT

38
TL Jacobson's Retransmission Timeout
EstimatedRTT = (1-x) * EstimatedRTT + x * SampleRTT
  • Setting the timeout
  • EstimatedRTT plus safety margin
  • large variation in EstimatedRTT -> larger safety
    margin

Timeout = EstimatedRTT + 4 * Deviation
Deviation = (1-x) * Deviation + x * |SampleRTT - EstimatedRTT|
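
The three formulas can be combined into one update step. Two details are assumptions of this sketch, since the slide does not pin them down: the deviation is computed against the estimate from before the update, and the same gain x is used for both averages (deployed TCPs typically use a different gain for the deviation). Values are in milliseconds and made up for illustration.

```python
def jacobson_update(estimated_rtt, deviation, sample_rtt, x=0.125):
    """Return (new EstimatedRTT, new Deviation, Timeout) after one sample."""
    # Assumption: measure the error against the previous estimate.
    error = abs(sample_rtt - estimated_rtt)
    deviation = (1 - x) * deviation + x * error
    estimated_rtt = (1 - x) * estimated_rtt + x * sample_rtt
    timeout = estimated_rtt + 4 * deviation
    return estimated_rtt, deviation, timeout


print(jacobson_update(100.0, 10.0, 200.0))  # one late sample widens the margin
print(jacobson_update(100.0, 10.0, 100.0))  # a steady sample narrows it
```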
39
TL Retransmission Ambiguity
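(Figure: A/B timeline. The original transmission is lost (X), the RTO expires, A retransmits, and B's ACK arrives. It is ambiguous whether the Sample RTT should be measured from the original transmission or from the retransmission.)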
40
TL Karn's algorithm
  • Accounts for retransmission ambiguity
  • If a segment has been retransmitted
  • Don't count RTT sample on ACKs for this segment
  • Keep backed off time-out for next packet
  • Reuse RTT estimate only after one successful
    transmission
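
The rules above fit in a small state holder. This is a sketch under simplifying assumptions: the class name `RetransmitTimer` is hypothetical, the RTO is derived with the simple b * RTT rule from the earlier slide rather than the deviation-based one, and no upper bound is placed on the backed-off timeout.

```python
class RetransmitTimer:
    """Tracks the RTT estimate and RTO, applying Karn's algorithm."""

    def __init__(self, estimated_rtt, x=0.125, b=2):
        self.estimated_rtt = estimated_rtt
        self.x = x
        self.b = b
        self.rto = b * estimated_rtt

    def on_timeout(self):
        """Exponential backoff: double the RTO on every expiry."""
        self.rto *= 2

    def on_ack(self, sample_rtt, was_retransmitted):
        """Process an ACK for a segment."""
        if was_retransmitted:
            # Ambiguous sample: ignore it and keep the backed-off RTO.
            return
        # Clean sample: update the estimate and recompute the RTO from it.
        self.estimated_rtt = (
            (1 - self.x) * self.estimated_rtt + self.x * sample_rtt
        )
        self.rto = self.b * self.estimated_rtt


timer = RetransmitTimer(estimated_rtt=100.0)
timer.on_timeout()                                # RTO backed off
timer.on_ack(50.0, was_retransmitted=True)        # sample discarded
print(timer.estimated_rtt, timer.rto)
timer.on_ack(100.0, was_retransmitted=False)      # clean sample resets RTO
print(timer.estimated_rtt, timer.rto)
```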

41
TL Timer Granularity
  • Many TCP implementations set RTO in multiples of
    200, 500, or 1000 ms
  • Why?
  • Avoid spurious timeouts: RTTs can vary quickly
    due to cross traffic
  • Make timer interrupts efficient

42
TL TCP Congestion Control
  • Motivated by ARPANET congestion collapse
  • Flow control, but no congestion control
  • Sender sends as much as the receiver resources
    will allow
  • Go-back-N on loss, burst out advertised window
  • Congestion control
  • Extending control to network resources
  • Underlying design principle: packet conservation
  • At equilibrium, inject packet into network only
    when one is removed
  • Basis for stability of physical systems (fluid
    model)
  • Why was this not working before?
  • No equilibrium
  • Solved by self-clocking and congestion window
  • Spurious retransmissions
  • Solved by accurate RTO estimation (see earlier
    discussion)
  • Resource limitations prevent equilibrium
  • Solved by congestion window and congestion
    avoidance algorithms

43
TL TCP congestion control basics
  • Keep a congestion window, cwnd
  • Book calls this Congwin
  • Denotes how much network is able to absorb
  • Sender's maximum window
  • Min (receiver's advertised window, cwnd)
  • Sender's actual window
  • Max window - unacknowledged segments
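
The two window definitions translate directly into code. A minimal sketch, with a hypothetical function name and with all quantities counted in segments for simplicity:

```python
def actual_window(advertised_window, cwnd, unacknowledged):
    """How many more segments the sender may put into the network now."""
    # Maximum window: limited by both the receiver and the network.
    max_window = min(advertised_window, cwnd)
    # Actual window: what is left after subtracting outstanding data.
    return max(0, max_window - unacknowledged)


print(actual_window(advertised_window=20, cwnd=8, unacknowledged=5))
print(actual_window(advertised_window=4, cwnd=8, unacknowledged=5))
```

In the first call the network (cwnd) is the limit; in the second the receiver's window is smaller than the data already outstanding, so the sender must wait.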

44
TL TCP Congestion Control
  • end-end control (no network assistance)
  • transmission rate limited by congestion window
    size, cwnd, over segments

cwnd
  • w segments, each with MSS bytes sent in one RTT

45
TL TCP congestion control
  • two phases
  • slow start
  • congestion avoidance
  • important variables
  • cwnd
  • ssthresh: defines threshold between slow start
    phase and congestion control phase (Book calls
    this threshold)
  • useful reference
  • http://www.aciri.org/floyd/papers/sacks.ps.Z
  • probing for usable bandwidth
  • ideally transmit as fast as possible (cwnd as
    large as possible) without loss
  • increase cwnd until loss (congestion)
  • on loss: decrease cwnd, then begin probing
    (increasing) again

46
TL TCP slow start
  • Start the self-clocking behavior of TCP
  • Use acks to clock sending new data
  • Do not send entire advertised window in one shot

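(Figure: self-clocking. Sender and Receiver are joined through a bottleneck link. Pb and Pr mark packet spacing at the bottleneck and at the receiver; Ar, Ab and As mark ACK spacing at the receiver, at the bottleneck, and at the sender.)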
47
TL TCP slow start
Slow start algorithm:
  initialize cwnd = 1
  for (each segment ACKed)
      cwnd++
  until (loss event OR cwnd > ssthresh)

(Figure: Host A sends one segment to Host B in the first RTT, two segments in the second, four segments in the third.)
  • exponential increase (per RTT) in window size
  • Window actually increases to W in RTT * log2(W)
  • Can overshoot window and cause packet loss
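
The per-ACK increment and the resulting per-RTT doubling can be checked with a short simulation. This sketch assumes no loss and that every segment sent in one RTT is ACKed before the next RTT begins; the function name `slow_start` is made up for the example.

```python
def slow_start(ssthresh):
    """Return cwnd (in segments) at the start of each RTT during slow start.

    Stops as soon as cwnd exceeds ssthresh; loss events are not modelled.
    """
    cwnd = 1
    history = [cwnd]
    while cwnd <= ssthresh:
        # One ACK returns for every segment sent in this RTT.
        for _ in range(cwnd):
            cwnd += 1
            if cwnd > ssthresh:
                break
        history.append(cwnd)
    return history


print(slow_start(8))
```

With ssthresh = 8 the window reaches W = 8 after log2(8) = 3 round trips, and the first ACK of the following round trip pushes cwnd past the threshold and ends slow start.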

48
TL TCP slow start example
49
TL TCP slow start sequence plot
(Plot: sequence number vs. time)