1
TCP Server Fault Tolerance Using Connection
Migration to a Backup Server
  • Manish Marwah (1,2), Shivakant Mishra (1),
    Christof Fetzer (3)

(1) Department of Computer Science, University of
Colorado, Campus Box 0430, Boulder, CO 80309
(2) Avaya Labs, 1300 W 120th Avenue, Westminster, CO
80234
(3) AT&T Labs-Research, 180 Park Avenue, Florham
Park, NJ 07932
IEEE International Conference on Dependable
Systems and Networks 2003, San Francisco, June
22-25, 2003
2
Outline
  • Introduction
  • ST-TCP Architecture
  • ST-TCP Protocol Details
  • ST-TCP System Architecture
  • Experimental Results
  • Conclusions
  • Future Work

3
Introduction
  • TCP is hugely popular
  • Used in numerous applications
  • Provides a rich set of features
  • However, TCP does not provide server
    fault-tolerance

4
TCP Server Fault Tolerance
  • Consider a TCP based client-server application
  • Server failure => TCP connection failure =>
    Application session failure
  • A backup server provides fast service restoration
  • Restoring the application session additionally
    requires an application-level recovery protocol
  • Custom software on the clients is required

5
ST-TCP (Server fault Tolerant TCP)
  • We propose ST-TCP
  • A light-weight active primary-backup server
    fault tolerance mechanism, implemented at the TCP
    layer, for fast, transparent failover of a TCP
    connection to a backup server

6
ST-TCP Design Principles
  • No changes required in the client
  • No changes required in the server application
  • Fast and transparent failover
  • Behavior exactly the same as standard TCP
  • Minimal overhead during normal operation
  • Simple to implement: minimal kernel changes
  • Assumes primary and backup on same LAN and
    deterministic application

7
ST-TCP Architecture
  • Architecture Overview
  • Ethernet Tapping Architecture

8
Architecture Overview
  • Active Primary-Backup System
  • Application Replica
  • Ethernet Tapping
  • Backup receives same byte stream as primary
  • Primary-backup heartbeat

9
Architecture Overview (cont.)
  • Application Replica
  • Runs on the backup
  • Processes same client TCP stream
  • Produces same TCP stream (as primary) for the
    client
  • Suppresses output TCP stream during normal
    operation

10
Architecture Overview (cont.)
  • Failure Detection and Recovery
  • Primary and backup exchange heartbeats (HB) for
    failure detection
  • Backup takes over the client-primary TCP
    connection if it detects that the primary has
    failed
  • To prevent false positives, backup turns off the
    power to the primary before taking over

11
Ethernet Tapping Architecture
  • Option 1: Promiscuous Mode

[Figure: hub/switch tapping topology]
  • Works for hubs
  • For switches, either
      • Replicate all primary port traffic on the
        backup port, or
      • Make sure that the switch does not learn the
        MAC address of the primary

12
Ethernet Tapping Architecture (cont.)
  • Option 2: Multicast Ethernet Address
  • Create virtual NICs with service IP addr (SVI) on
    primary and backup
  • Associate a multicast Ethernet addr (SME) to both
    these virtual interfaces
  • Configure a static ARP entry in the router
    mapping SVI to SME
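
An Ethernet address is a multicast (group) address when the least-significant bit of its first octet is set, which is the property the SME relies on so that frames reach both virtual interfaces. The check can be sketched in Python as below; the function name and the sample addresses are illustrative and do not come from the presentation.

```python
def is_multicast_mac(mac: str) -> bool:
    """Return True if the group (I/G) bit of the first octet is set."""
    octets = mac.replace("-", ":").split(":")
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}: {mac!r}")
    first = int(octets[0], 16)
    # Bit 0 of the first octet distinguishes group addresses from unicast ones.
    return (first & 0x01) == 1


print(is_multicast_mac("01:00:5e:00:00:01"))  # address in the IPv4 multicast range
print(is_multicast_mac("00:1a:2b:3c:4d:5e"))  # ordinary unicast address
```

A check like this could be run on the chosen SME before installing the static ARP entry in the router.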

13
ST-TCP Protocol Details
  • Initialization
  • Failure Free Period
  • Primary-Backup Synchronization
  • Failure Detection and Recovery

14
Initialization
  • Replica application started on backup
  • Uses same port number as primary
  • Uses same sequence numbers

[Figure: three-way handshake (SYN, SYN/ACK, ACK) between the client and the ST-TCP server; the backup syncs sequence numbers with the primary]
15
Failure Free Period
  • Backup drops all TCP segments destined for the
    client
  • Client Acks destined for the primary serve as
    Acks for backup as well
  • Primary and backup exchange heartbeat (HB)
    messages over a UDP connection
  • This connection is also used by the backup for
    sending sequence number of the latest client
    bytes received
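
A heartbeat that also carries the backup's latest client sequence number could be encoded as in the sketch below. The presentation does not give the message format, so the field layout, the role codes, and the function names are all assumptions made for illustration.

```python
import struct

# Hypothetical wire layout (not from the presentation):
#   1 byte  role (0 = primary, 1 = backup)
#   4 bytes heartbeat counter
#   4 bytes TCP sequence number of the latest client byte received
HB_FORMAT = "!BII"  # network byte order, no padding


def pack_heartbeat(role: int, counter: int, last_client_seq: int) -> bytes:
    # TCP sequence numbers are 32-bit and wrap around, so mask before packing.
    return struct.pack(HB_FORMAT, role, counter, last_client_seq & 0xFFFFFFFF)


def unpack_heartbeat(payload: bytes) -> tuple:
    return struct.unpack(HB_FORMAT, payload)


msg = pack_heartbeat(role=1, counter=42, last_client_seq=1_000_150)
print(len(msg), unpack_heartbeat(msg))
```

The resulting bytes would be sent as the payload of a UDP datagram; piggybacking the sequence number on the heartbeat avoids a separate synchronization channel.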

16
Primary-Backup Synchronization
  • Problem - Backup may miss bytes
  • Primary discards client bytes only after the
    backup server has received them
  • Uses additional receive buffer space
  • Backup periodically informs Primary of latest
    client bytes
  • Monitors primary-client segments to determine if
    it missed any bytes
  • Asks primary for missing bytes
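
The retention rule above can be modelled with a small toy buffer: a client byte is discarded only once the server application has read it and the backup has confirmed receiving it. This is a user-space sketch, not the kernel change described in the presentation; the class and method names are hypothetical, and sequence-number wraparound is ignored for brevity.

```python
class PrimaryReceiveBuffer:
    """Toy model of the primary's receive buffer with backup-aware discard."""

    def __init__(self, start_seq: int = 0) -> None:
        self.base_seq = start_seq      # sequence number of data[0]
        self.data = bytearray()        # client bytes still retained
        self.app_read_seq = start_seq  # next byte the application will read
        self.backup_seq = start_seq    # next byte the backup still needs

    def on_client_segment(self, payload: bytes) -> None:
        self.data.extend(payload)

    def app_read(self, n: int) -> bytes:
        start = self.app_read_seq - self.base_seq
        chunk = bytes(self.data[start:start + n])
        self.app_read_seq += len(chunk)
        self._discard()
        return chunk

    def on_backup_report(self, backup_seq: int) -> None:
        # The backup periodically reports the latest client byte it holds.
        self.backup_seq = max(self.backup_seq, backup_seq)
        self._discard()

    def fetch_for_backup(self, seq: int, n: int) -> bytes:
        """Serve bytes that the backup reports as missing."""
        start = seq - self.base_seq
        if start < 0:
            raise ValueError("requested bytes were already discarded")
        return bytes(self.data[start:start + n])

    def _discard(self) -> None:
        # Only bytes both consumed locally and confirmed by the backup may go.
        safe = min(self.app_read_seq, self.backup_seq)
        drop = safe - self.base_seq
        if drop > 0:
            del self.data[:drop]
            self.base_seq = safe


buf = PrimaryReceiveBuffer()
buf.on_client_segment(b"hello world")
buf.app_read(11)                   # the application has consumed everything
print(len(buf.data))               # bytes are still held for the backup
buf.on_backup_report(5)            # backup confirms the first 5 bytes
print(buf.fetch_for_backup(5, 6))  # backup asks for the bytes it missed
```

Holding bytes past the application read is what costs the additional receive buffer space mentioned above.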

17
Primary TCP Receive Buffer Management
[Figure: standard TCP receive buffer compared with the ST-TCP receive buffer]
  • Additional buffer and standard buffer are
    independent
  • Only kernel change in the primary

18
Failure Detection and Recovery
  • Failure detection of the primary by the backup
    involves monitoring the primary-backup heartbeat
    (HB)
  • Failure detection of the backup by the primary
    involves monitoring
      • the primary-backup heartbeat (HB)
      • the status of the primary's additional
        receive buffer
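
Heartbeat-based detection on the backup side, followed by the takeover order described earlier (power off the primary, then start sending), can be sketched as follows. The threshold of three missed heartbeats and all names are assumptions for illustration; the presentation does not state how many missed heartbeats count as a failure.

```python
class HeartbeatMonitor:
    """Declares the peer failed after several heartbeat intervals of silence."""

    def __init__(self, interval_s: float, missed_limit: int = 3) -> None:
        # missed_limit is an assumed parameter, not a value from the slides.
        self.timeout_s = interval_s * missed_limit
        self.last_seen = None

    def on_heartbeat(self, now: float) -> None:
        self.last_seen = now

    def peer_failed(self, now: float) -> bool:
        if self.last_seen is None:
            return False  # nothing received yet; treat the peer as starting up
        return (now - self.last_seen) > self.timeout_s


def backup_step(monitor, now, power_off_primary, unsuppress_output) -> bool:
    """Run one detection step on the backup; return True if it took over."""
    if not monitor.peer_failed(now):
        return False
    # Power off the primary first so a false positive cannot leave two senders.
    power_off_primary()
    unsuppress_output()
    return True


events = []
mon = HeartbeatMonitor(interval_s=0.05)
mon.on_heartbeat(now=1.0)
print(backup_step(mon, 1.1, lambda: events.append("power-off"),
                  lambda: events.append("send")))
print(backup_step(mon, 1.2, lambda: events.append("power-off"),
                  lambda: events.append("send")))
print(events)
```

Passing the clock value in explicitly keeps the sketch deterministic; a real monitor would read a monotonic clock.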

19
System Architecture
  • No Single point of failure

20
Experimental Results
  • Setup
  • Applications
  • Performance Results

21
Experimental Setup
  • Primary: 800 MHz AMD Athlon, 512 KB cache, 256 MB
    RAM, 10/100 Mbps NIC, Linux 2.2.18
  • Backup: 800 MHz AMD Athlon, 512 KB cache, 256 MB
    RAM, 10/100 Mbps NIC, Linux 2.2.18
  • Client: 900 MHz Pentium III, 10/100 Mbps NIC,
    Linux 2.4.9 (can be any OS)
  • All three machines are connected through a
    10/100 Mbps hub
22
Applications
  • Performance of ST-TCP is measured with simulated
    applications representing various communication
    characteristics
  • Three such applications are considered
      • Echo: client sends 150 bytes; server echoes
        them back (rlogin/telnet)
      • Interactive: client sends 150 bytes; server
        responds with 10 kbytes (http)
      • Bulk Transfer: client sends 150 bytes; server
        responds with a large data transfer
        (1, 5, 20, 100 MB) (ftp)
  • Experiments are run with heartbeat intervals
    varying from 5 s down to 50 ms

23
Performance Results
  • Two quantities are measured
      • Overhead of ST-TCP compared with standard TCP
        during the failure-free period: no
        significant overhead from using ST-TCP
      • Failover times, which depend on
          • Failure detection time
          • How much the backup TCP has backed off
24
Failover Times: Echo
25
Failover Times: Interactive
26
Failover Times: Bulk Transfer
27
Conclusions
  • ST-TCP extends TCP to tolerate server failures
  • Hot backup server, tapping architecture minimizes
    overhead
  • No changes to the client or the server application
  • Negligible impact during failure free periods
  • Insignificant performance overhead, minimal
    impact of backup on primary, no bandwidth impact
  • No deviation from standard TCP
  • Fast failover (few hundred ms), completely
    transparent to the client
  • Logger for double failure scenarios
  • Light-weight

28
Future Work
  • Performance Enhancements
  • Address application nondeterminism issues
  • Extend it to other transport layer protocols,
    e.g., SCTP
  • Run real applications on ST-TCP
  • Address issues related to using one backup for
    multiple primary servers
  • Primary and Backup on different LANs

29
Backup Slides
30
Bandwidth Used by Primary-Backup Heartbeat
Assuming each HB message (including all headers)
is 128 bytes
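
Given the 128-byte message size, the heartbeat bandwidth follows directly from the heartbeat interval. The sketch below works through the arithmetic, assuming one message per interval in each direction (an assumption, since the transcript does not say). The 5 s and 50 ms intervals are the range used in the experiments; the intermediate values are illustrative.

```python
HB_BYTES = 128  # per-message size assumed on the slide, headers included


def heartbeat_bandwidth_bps(interval_s: float, directions: int = 2) -> float:
    """Bits per second used by heartbeats sent once per interval."""
    if interval_s <= 0:
        raise ValueError("interval must be positive")
    return HB_BYTES * 8 * directions / interval_s


for interval in (5.0, 1.0, 0.5, 0.05):
    kbps = heartbeat_bandwidth_bps(interval) / 1000
    print(f"{interval:>5} s -> {kbps:.3f} kbit/s")
```

Even at the shortest interval the result is a small fraction of a 10/100 Mbps link, which is consistent with the "no bandwidth impact" conclusion.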