1
A Survey on High Availability Mechanisms for IP Services
11 October 2005
  • N. Ayari, FT R&D, D. Barbaron, FT R&D
  • L. Lefevre, INRIA, P. Primet, INRIA

2005 High Availability and Performance Computing
Workshop (HAPCW'2005), Santa Fe, USA
2
Introduction: Different types of clusters
  • MPP and SMP clusters,
  • Scalability via CPU and Memory interconnects
  • Using special purpose hardware and/or software,
  • High availability through
  • Job scheduling and migration,
  • Fault detection and checkpointing.
  • Clusters of independently working nodes
  • An attractive alternative based on commodity hardware
    and/or general-purpose operating systems
  • Scalability achieved by efficient distribution
    of the incoming requests on the available nodes
  • High availability?
  • Service non-interruption and service integrity

3
Introduction: Scalability issues in clusters of commodity hw/sw nodes
  • The request distribution should
  • Increase performance by
  • Improving the system responsiveness
  • Number of concurrent connections supported per
    unit of time,
  • Keeping reasonable response times
  • When is the bottleneck observed?
  • Support upper layer session integrity
  • Integrity depends on the switching granularity
  • On a per datagram, connection or session
    distribution basis.

4
Switch designs
  • Can be
  • Stateless or stateful
  • Applies to
  • Layer 4 switching
  • Uses layer 2-4 packet information (TCP/IP model)
  • Layer 5 switching
  • Uses layer 2-5 packet information (TCP/IP model)

5
Stateless vs. stateful switch designs: Stateless switch design
  • Stateless switch design
  • Achieves a better latency by
  • Processing each datagram independently from its
    predecessors
  • Does not maintain any state information
  • Implements service integrity
  • On a per connection basis in Layer 4 Switching
  • Uses hashing to compute the same cluster node
    for all datagrams originating from the same client,
    identified by <IP address, port number, protocol> (sketch below).
  • On a per session basis in layer 5 Switching
  • Depends on the IP data application
  • - Cookie-based persistence for web traffic
  • - Cookie switching
  • - Cookie-based hashing
  • What about other data applications?
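
As a rough illustration of the hashing bullet above, here is a minimal user-space sketch in C that maps a <client IP, port, protocol> tuple to one of a fixed set of nodes; the node count, field names, and hash constant are illustrative assumptions, not taken from the presentation.

    #include <stdint.h>
    #include <stdio.h>

    #define NUM_NODES 4  /* fixed node count assumed by the stateless design */

    /* Flow key: all datagrams from the same <IP, port, protocol> hash alike. */
    struct flow_key {
        uint32_t client_ip;   /* IPv4 address */
        uint16_t client_port;
        uint8_t  protocol;    /* e.g. 6 = TCP, 17 = UDP */
    };

    /* Simple multiplicative hash; any robust hash over the 3-tuple works. */
    unsigned pick_node(const struct flow_key *k)
    {
        uint32_t h = k->client_ip;
        h = h * 2654435761u ^ k->client_port;
        h = h * 2654435761u ^ k->protocol;
        return h % NUM_NODES;  /* the same client always maps to the same node */
    }

    int main(void)
    {
        struct flow_key k = { 0xC0A80001u /* 192.168.0.1 */, 5060, 17 };
        printf("datagram goes to node %u\n", pick_node(&k));
        return 0;
    }

Because the mapping is a pure function of the tuple, no per-connection state is needed; the flip side, as the next slide notes, is that changing the node count remaps most flows.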

6
Stateless vs. stateful switch designs: Stateless switch design limitations
  • Upper layer session integrity
  • A request belonging to one session may go to the
    wrong server
  • Hash collisions: robust hash functions are needed
  • Failed node handling
  • When the hash function depends on the number of
    active nodes
  • Replaying all sessions when one or more nodes
    crash
  • Fair load distribution
  • The stateless nature imposes static load balancing
  • Source hashing,
  • While requests have varying service times and
    resource demands
  • Long SIP sessions, bandwidth-consuming FTP
    transfers, etc.

7
Stateless vs. stateful switch designs: Stateful switch designs
  • It aims to improve both
  • Upper layer session integrity
  • By maintaining connection/session state
  • Source and destination IP addresses, port numbers,
    transport protocol
  • - No semantics to delimit a UDP connection
  • Maintains multi-purpose timers
  • To avoid keeping inactive sessions/connections
  • - DDoS countermeasure
  • Computes statistics on the average client session
    duration
  • Needs to speed up the lookup for each datagram
  • Uses index hashing (sketch below)
  • Load distribution fairness
  • Using service-state-aware load distribution
    policies
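
A minimal sketch of the state table such a stateful switch keeps, with index hashing for fast per-datagram lookup and an idle-timer sweep; the layout and constants are illustrative assumptions, not the actual IPVS structures.

    #include <stdint.h>
    #include <stdlib.h>
    #include <time.h>

    #define TABLE_SIZE 65536       /* power of two: index by a hash of the flow */
    #define IDLE_TIMEOUT_SECS 300  /* expire inactive entries (DDoS countermeasure) */

    /* Per-connection state kept by a stateful layer 4 switch (illustrative). */
    struct conn_entry {
        uint32_t src_ip, dst_ip;
        uint16_t src_port, dst_port;
        uint8_t  protocol;          /* UDP has no FIN/RST, so only the timer ends it */
        unsigned chosen_node;       /* back end selected on the first datagram */
        time_t   last_seen;         /* refreshed on every datagram */
        struct conn_entry *next;    /* hash-collision chain */
    };

    static struct conn_entry *table[TABLE_SIZE];

    /* Index hashing: locate the entry for a datagram quickly. */
    unsigned conn_hash(uint32_t sip, uint16_t sport, uint8_t proto)
    {
        return ((sip * 2654435761u) ^ (sport << 8) ^ proto) & (TABLE_SIZE - 1);
    }

    /* Timer sweep: drop entries idle for longer than IDLE_TIMEOUT_SECS. */
    void expire_idle(time_t now)
    {
        for (unsigned i = 0; i < TABLE_SIZE; i++) {
            struct conn_entry **pp = &table[i];
            while (*pp) {
                if (now - (*pp)->last_seen > IDLE_TIMEOUT_SECS) {
                    struct conn_entry *dead = *pp;
                    *pp = dead->next;
                    free(dead);
                } else {
                    pp = &(*pp)->next;
                }
            }
        }
    }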

8
Stateless vs. stateful switch designs: Stateful design limitations
  • Cost effectiveness
  • Server state distribution overhead
  • Efficiency depends on the granularity of the
    switching operation
  • Layer 4 or Layer 5?
  • Does layer 4 scale for all IP services?
  • Load distribution fairness?
  • Decision taken on the first datagram in a
    session/connection
  • Need new mechanisms

9
Fair Scheduling
  • How to measure load?
  • Using a robust, simple, quickly adaptable summary
    metric
  • CPU, Memory and Disk I/O utilization,
  • Number of active application processes and
    connections,
  • The availability of network protocol buffers,
  • Number of active users.
  • Policies?
  • Static
  • Randomization, (Weighted) Round Robin,
    Source/Destination Hashing (see the round-robin sketch below).
  • Dynamic (server/client state aware)
  • (Weighted) Least Connections, Shortest Expected
    Delay, Minimum Misses,
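
To make one of the static policies concrete, here is a minimal weighted round-robin selector in C; the node names and weights are made up, and the real LVS scheduler uses a more elaborate interleaved variant.

    #include <stdio.h>

    /* Static weighted round robin: each node is picked in proportion to its weight. */
    struct node { const char *name; int weight; };

    static struct node nodes[] = { {"web1", 3}, {"web2", 2}, {"web3", 1} };
    #define NUM_NODES (int)(sizeof(nodes) / sizeof(nodes[0]))

    /* Returns the next node; 'credit' counts how many more picks the current node gets. */
    static int next_node(void)
    {
        static int current = 0, credit = 0;
        if (credit == 0) {
            current = (current + 1) % NUM_NODES;
            credit = nodes[current].weight;
        }
        credit--;
        return current;
    }

    int main(void)
    {
        for (int i = 0; i < 12; i++)   /* 12 requests -> 6 web1, 4 web2, 2 web3 */
            printf("request %2d -> %s\n", i, nodes[next_node()].name);
        return 0;
    }

Dynamic policies replace the fixed weight with a live metric such as the current connection count (least connections) or an estimated delay.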

10
Fair Scheduling
  • Policies?
  • Dynamic (Server/Client state aware) (cont.)
  • Cache affinity,
  • The file is partitioned among the nodes
  • SITA-E (Size Interval Task Assignment with Equal
    load),
  • The node is determined based on the 'size' of
    the request
  • CAP (Client Aware Policy)
  • Consecutive connections from the same client
    assigned to the same node
  • Admission Control Policies
  • Locality-Based Least-Connection, Locality-Based
    Least-Connection with Replication.

11
Fair Scheduling
  • Policies?
  • Network traffic based balancing
  • Focus on predicting the volume of incoming
    traffic from a source based upon past history
  • Priority based balancing
  • Assigns higher priority to some data traffic
  • Topology based Redirection
  • Redirect traffic to the cluster nearest the
    client in terms of
  • Hop count (static),
  • Network latency (dynamic).
  • Application specific Redirection
  • Layer 5 load balancing lets back-end servers
    specialize in particular content or services
  • Etc.

12
Layer 4 Switching: How?
  • Works at the TCP/IP level
  • Content blind switching

[Figure: Layer 4 switches]
13
Layer 4 Switching: A kernel implementation
  • The IP Virtual Server implementation
  • Supports NAT, DR, and Tunnelling
  • As add-on modules in the networking layer of the
    kernel
  • Based on the Linux packet filtering and routing
    capabilities
  • The Linux Virtual Server
  • A cluster of independently working nodes,
  • Using the IPVS load balancer.
  • Some recommendations [WZ]

14
Layer 4 Switching: Performance on single-CPU Linux 2.2, LVS-NAT vs. LVS-DR scaling
  • Performance results from [Rou2001].

15
Layer 4 Switching: Some Layer 4 switching products
Approaches: Two Way (packet double rewriting), One Way (packet single rewriting), One Way (packet tunneling), One Way (packet forwarding)
Products: Cisco's Local Director (commercial), Magic Router (Berkeley), LSNAT, F5 Networks' BIG-IP 5100, LVS, Foundry Networks' ServerIron, CyberIQ's HyperFlow, Coyote Point's Equalizer, TCP Router, LVS, IBM Network Dispatcher (component of IBM WebSphere Edge Server), OneIP (Bell Labs), LSMAC, Intel NetStructure Traffic Director, Nortel Networks' Alteon 780 series, Foundry Networks' ServerIron, Radware WSD Pro, LVS, VA Balance (VA Linux Systems Japan)
16
Layer 4 Switching: The Netfilter capabilities and return codes
Return Code Meaning
NF_DROP Discard the packet
NF_ACCEPT Keep the packet
NF_STOLEN Forget about the packet
NF_QUEUE Queue packet for user space
NF_REPEAT Call this hook function again
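
For context, a skeleton of how a kernel module registers a Netfilter hook and returns one of the verdicts above, written against the 2.4/2.6-era API that IPVS targeted (the hook signature and registration calls changed in later kernels); the module is illustrative, not part of IPVS.

    #include <linux/module.h>
    #include <linux/kernel.h>
    #include <linux/netfilter.h>
    #include <linux/netfilter_ipv4.h>
    #include <linux/skbuff.h>
    #include <linux/ip.h>

    /* Hook body: inspect each packet and return a Netfilter verdict. */
    static unsigned int inspect_hook(unsigned int hooknum, struct sk_buff **pskb,
                                     const struct net_device *in,
                                     const struct net_device *out,
                                     int (*okfn)(struct sk_buff *))
    {
        struct iphdr *iph = (*pskb)->nh.iph;

        if (iph->protocol == IPPROTO_UDP)
            printk(KERN_DEBUG "UDP datagram seen at LOCAL_IN\n");

        return NF_ACCEPT;   /* keep the packet; a balancer could return NF_STOLEN */
    }

    static struct nf_hook_ops inspect_ops = {
        .hook     = inspect_hook,
        .pf       = PF_INET,
        .hooknum  = NF_IP_LOCAL_IN,   /* after routing, before local delivery */
        .priority = NF_IP_PRI_FIRST,
    };

    static int __init inspect_init(void)  { return nf_register_hook(&inspect_ops); }
    static void __exit inspect_exit(void) { nf_unregister_hook(&inspect_ops); }

    module_init(inspect_init);
    module_exit(inspect_exit);
    MODULE_LICENSE("GPL");
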
17
Layer 4 Switching: The IPVS architecture
18
Layer 4 Switching: Persistence handling
19
Layer 4 Switching: Issues
  • The persistence template for layer 4 switching
    may not scale
  • Example: VoIP data exchange using SIP
  • Different transport connections for different
    transactions within the same SIP session.
  • Session corruption implies datagram losses
  • More latency (TCP AIMD)

20
Layer 5 Switching: The solution?
  • The switch is also the single view of the cluster
  • The request distribution is done on the basis of
  • the load estimation on the cluster's nodes
  • the connection identifiers of the request
  • <source and destination IP addresses, source and
    destination port numbers, protocol>
  • the session identifiers of the request and the
    content type
  • Layer 5 header information
  • Additional delay
    Need to complete the connection to parse the data (sketch below)
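
To illustrate why the connection must be completed first, here is a small user-space sketch of content-based node selection on an HTTP request line; the URL prefixes and node numbers are invented for the example.

    #include <stdio.h>
    #include <string.h>

    /* Content-aware dispatch: route by URL prefix once the request has been read.
     * This only works after completing the TCP handshake and receiving data,
     * which is exactly the extra latency mentioned above. */
    static int pick_node_by_content(const char *request_line)
    {
        if (strncmp(request_line, "GET /images/", 12) == 0)
            return 1;                  /* node specialised for static images */
        if (strncmp(request_line, "GET /cgi-bin/", 13) == 0)
            return 2;                  /* node specialised for dynamic content */
        return 0;                      /* default pool */
    }

    int main(void)
    {
        const char *req = "GET /images/logo.png HTTP/1.1";
        printf("route to back-end node %d\n", pick_node_by_content(req));
        return 0;
    }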

21
Layer 5 Switching: The solution?
[Figure: Layer 5 switches]
22
Layer 5 Switching: TCP Gateway, the problems
  • Cost effectiveness
  • Multiple data copies and context switches
  • The proxy rapidly becomes the bottleneck because
    it is a two-way architecture.

23
Layer 5 Switching: TCP Splicing, the packet mapping operations
  • Modifications also affect
  • The IP pseudo-header
  • Socket options
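
A simplified view of the per-segment rewriting a TCP splice performs once the two connections are joined: addresses, ports, and sequence/ack numbers are shifted by offsets recorded at splice time, and because the TCP checksum covers the IP pseudo-header, both checksums must be recomputed. The structures and the checksum stub below are illustrative, not a real kernel API.

    #include <stdint.h>

    /* Offsets recorded when the client-side and server-side connections are spliced. */
    struct splice_state {
        uint32_t new_src_ip, new_dst_ip;
        uint16_t new_src_port, new_dst_port;
        uint32_t seq_delta;   /* difference between the two initial sequence numbers */
        uint32_t ack_delta;
    };

    /* Simplified header views (a kernel would use struct iphdr / struct tcphdr). */
    struct ip_hdr  { uint32_t saddr, daddr; uint16_t check; };
    struct tcp_hdr { uint16_t source, dest; uint32_t seq, ack_seq; uint16_t check; };

    /* Stub: a real implementation folds the pseudo-header, headers and payload. */
    static void recompute_checksums(struct ip_hdr *ip, struct tcp_hdr *tcp)
    {
        ip->check = 0;
        tcp->check = 0;
    }

    /* Map one client-to-server segment onto the spliced server-side connection. */
    void splice_rewrite(const struct splice_state *st,
                        struct ip_hdr *ip, struct tcp_hdr *tcp)
    {
        ip->saddr   = st->new_src_ip;
        ip->daddr   = st->new_dst_ip;
        tcp->source = st->new_src_port;
        tcp->dest   = st->new_dst_port;
        tcp->seq     += st->seq_delta;   /* shift into the server's sequence space */
        tcp->ack_seq += st->ack_delta;
        recompute_checksums(ip, tcp);    /* TCP checksum covers the IP pseudo-header */
    }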

24
Layer 5 Switching: TCP Splicing message timeline, the delayed binding


25
Layer 5 Switching: TCP Splicing, the issues
  • Delayed binding
  • Double processing overhead
  • Two way switch mechanism
  • Buffer size for large scale forwarders
  • The transition between the control mode and the
    forwarder mode
  • Delaying the activation of the spliced connection
    until the buffers are drained, or
  • Forwarding data concurrently with draining the
    buffers.
  • End-to-end flow control
  • From a small/big advertised window to a big/small
    advertised window

26
Layer 5 Switching: TCP Splice improvements
  • Pre-forking TCP splice
  • Reduces the three-way handshake cost
  • Pre-allocated server scheme
  • Guesses the real server on receipt of the TCP SYN
  • Etc.

27
Layer 5 Switching: TCP Handoff
  • One-way mechanism
  • Migrates the TCP connection from the front end to
    the back-end servers using the handoff protocol
    messages/acks
  • Message fields: magic number, handoff protocol
    identifier, connection magic, next sequence number;
    an Ack message reports the handoff result (see the sketch below)
  • The connection is established on the back end without
    going through the three-way handshake procedure.
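
The field names above suggest a handoff request roughly like the following, carrying enough state for the back end to recreate the accepted connection without a new three-way handshake; the exact layout is an assumption for illustration, not the TCPHA wire format.

    #include <stdint.h>

    /* Illustrative handoff request: the front end freezes the accepted connection
     * and ships its state to the chosen back-end server. */
    struct handoff_request {
        uint32_t magic;        /* magic number identifying the handoff protocol */
        uint8_t  hd_proto_id;  /* handoff protocol identifier / version */
        uint32_t conn_magic;   /* identifies the migrated connection */
        uint32_t client_ip;    /* client endpoint of the accepted connection */
        uint16_t client_port;
        uint32_t next_seq;     /* next sequence number expected from the client */
        uint32_t next_ack;     /* next ack number to send to the client */
    };

    /* Illustrative handoff reply: the Ack message reports the handoff result. */
    struct handoff_ack {
        uint32_t magic;
        uint32_t conn_magic;
        uint8_t  status;       /* 0 = connection recreated, non-zero = failure */
    };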

28
Layer 5 Switching: TCP Handoff message timeline
29
TCP Handoff vs TCP Splice
  • Based on LVS TCPSP and TCPHA 2.4 kernel
    implementations
  • Throughput (13 KB file)
  • Overhead due to L7 processing: the front end becomes
    the bottleneck, hence low scalability

[Chart: Apache throughput (conn/sec) vs. number of back-end nodes in the cluster]
30
Layer 5 Switching: The limitations
  • Highly available connections?
  • Connection failover
  • One-way vs. two-way architectures
  • Improvements on TCP Handoff
  • Current implementations do not cover all data
    traffic

31
Layer 5 Switching: Some layer 5 switching products
Approaches: Two-way architectures (TCP Gateway, TCP Splicing), one-way architectures (TCP Handoff, TCP Connection Hop)
Products: IBM Network Dispatcher CBR, CAP (Client Aware Proxy), Vovida's load balancer proxy, Foundry Networks' ServerIron, Radware WSD Pro, Hydra WS Hydra2500, Nortel's Alteon application switching series, Sharp Corporation Super Proxy, Resonate's Central Dispatcher (with redirection capabilities), Cisco's CSS 11500 (Content Services Switch), OpenFusion load balancing service for CORBA-based applications and services from PrismTech, Kemp Technologies LoadMaster series (2460, 2860, etc.), Sun Fire B10n Content Load Balancing Blade switch (tunneling based), Procera MLXP layer 5 switch, OctaGate Smart Web switch, Extreme Networks layer 5 CA switch device, ScalaServer, TCPHA, Resonate's Central Dispatch
32
High Availability
  • How to detect that a member has failed?
  • Pings, timeouts,
  • Heartbeat message exchange
  • Status, cluster transition and retransmission
    messages
  • TCPHA includes state message exchange
  • The accuracy of the failure detection
  • Timeouts with multiple retries detect failures
    accurately with high probability (sketch below)
  • How to recover through failover?
  • Load balancer failover
  • State synchronization
  • Subsystem failover
  • IP Takeover through channel bonding
  • Application Failover
  • The Linux watchdog timer interface, etc.
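
A minimal user-space sketch of timeout-plus-retries failure detection over UDP heartbeats, as referenced above; the port, retry count, timeout, and peer address are illustrative assumptions.

    #include <arpa/inet.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/time.h>
    #include <unistd.h>

    #define HEARTBEAT_PORT 9999   /* assumed peer heartbeat responder */
    #define RETRIES        3      /* several missed replies => declare failure */
    #define TIMEOUT_SECS   2

    /* Returns 1 if the peer answered a heartbeat, 0 if it is considered dead. */
    int peer_alive(const char *peer_ip)
    {
        int sock = socket(AF_INET, SOCK_DGRAM, 0);
        struct timeval tv = { TIMEOUT_SECS, 0 };
        struct sockaddr_in peer = { 0 };
        char buf[16];

        setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
        peer.sin_family = AF_INET;
        peer.sin_port = htons(HEARTBEAT_PORT);
        inet_pton(AF_INET, peer_ip, &peer.sin_addr);

        for (int i = 0; i < RETRIES; i++) {
            sendto(sock, "ping", 4, 0, (struct sockaddr *)&peer, sizeof(peer));
            if (recvfrom(sock, buf, sizeof(buf), 0, NULL, NULL) > 0) {
                close(sock);
                return 1;             /* got a heartbeat reply */
            }
            /* timeout expired: retry before declaring the member failed */
        }
        close(sock);
        return 0;
    }

    int main(void)
    {
        printf("peer %s\n", peer_alive("192.0.2.10") ? "alive" : "failed");
        return 0;
    }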

33
High Availability
  • More on connection failover
  • Through connection migration and reliable sockets
  • Different from TCP Handoff
  • Approaches include
  • Migratory TCP
  • Fault-tolerant TCP
  • Connection passing

34
High Availability: The accuracy in distributed architectures
  • DNS scalability through site redundancy
  • DNS SRV RRs used in service location (lookup sketch below)
  • Locating available SIP proxies
  • The effectiveness of DNS-based scalability and
    failover is undermined by the DNS cache update
    frequency.
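
For reference, an SRV lookup such as the one used to locate SIP proxies can be done with the libresolv API (link with -lresolv on glibc systems); the service name and domain below are examples, and error handling is minimal.

    #include <arpa/nameser.h>
    #include <netinet/in.h>
    #include <resolv.h>
    #include <stdio.h>

    int main(void)
    {
        unsigned char answer[NS_PACKETSZ];
        ns_msg msg;
        ns_rr  rr;

        /* Query the SRV records advertising SIP-over-UDP proxies for a domain. */
        int len = res_query("_sip._udp.example.com", ns_c_in, ns_t_srv,
                            answer, sizeof(answer));
        if (len < 0) {
            fprintf(stderr, "SRV lookup failed\n");
            return 1;
        }

        ns_initparse(answer, len, &msg);
        for (int i = 0; i < ns_msg_count(msg, ns_s_an); i++) {
            ns_parserr(&msg, ns_s_an, i, &rr);
            const unsigned char *rd = ns_rr_rdata(rr);
            /* RDATA layout for SRV: priority, weight, port, then the target name. */
            printf("priority=%u weight=%u port=%u\n",
                   ns_get16(rd), ns_get16(rd + 2), ns_get16(rd + 4));
        }
        return 0;
    }

Because resolvers cache these records, a failed proxy may keep being returned until the TTL expires, which is the accuracy limit noted above.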

35
High Availability: The accuracy in distributed architectures
  • RSerPool

36
High Availability: Other tips for distributed architectures
  • Multicast
  • Needs explicit support from all routers on the
    client-server path
  • IP Anycast: route redundancy
  • Different servers running the same service can
    all have the same anycast address on one of their
    interfaces
  • If a server fails, the router will update its route
    to the nearest available node
  • Depends on the router's update frequency

37
Conclusion and Future directions
  • Further work will address
  • Kernel implementation of layer 5 switching to
    handle session-oriented data transfers.
  • Improvements on the forwarder kernel component
  • Fair load distribution in session-oriented data
    transfers.
  • IPv6 compliance?
  • Security concerns in connection failover

38
  • THANKS