Status of Joint Research Project

Provided by: nrjatmasay
Learn more at: http://www.ietf.org
1
TIPC as TML
draft-maloy-tipc-01.txt
Jon Maloy, Ericsson
Steven Blake, Modularnet
Maarten Koning, WindRiver
Jamal Hadi Salim, Znyx
Hormuzd Khosravi, Intel
IETF-61, Washington DC, Nov 2004
2
TIPC
  • A transport protocol for cluster environments
  • Connectionless and connection-oriented, reliable
    or unreliable
  • Reliable or unreliable multicast
  • Usage not limited to ForCES context
  • A framework for detecting, supervising and
    maintaining cluster topology
  • Available as portable open source code package
    under BSD licence
  • 12000 lines of C code, 112 kbyte Linux kernel
    module
  • Runs on 4 OSes so far, and more to come
  • Proven concept, used and deployed in several
    Ericsson products

3
ForCES Protocol Framework
ForCES Protocol Messages
4
TIPC as L2 TML
ForCES Protocol Messages
5
Interface Adaptation
[Diagram labels: Interface Adaptation, ForCES Protocol Messages]
6
Fulfilling Requirements (1)
  • Reliability
      • Reliable transport in all modes
      • Can be made unreliable per socket/direction
  • Security
      • Only secure within closed networks
      • No explicit authentication/encryption support
        yet, but planned
      • Not IP-based, so no router will forward TIPC
        messages
  • Congestion Control
      • At three levels: connection/transport,
        signalling link and carrier level
      • Gives feedback to the PL layer if a connection
        is broken or a message is rejected
  • Multicast/Broadcast
      • Supported

7
Fulfilling Requirements (2)
  • Timeliness
      • Immediate delivery (no Nagle algorithm)
      • Inter-node delivery time on the order of 100
        microseconds
  • HA Considerations
      • L2 link failure detection and failover handled
        transparently for the user
      • Connection aborted with an error code if no
        redundant carrier is available
      • Peer node failure detected after 0.5-1.5 seconds
  • Encapsulation
      • 24-byte extra header
      • 40 bytes extra for connectionless
  • Priorities
      • Supports 4 message importance priorities, which
        determine congestion levels and abort/rejection
        levels
      • Are 8 levels really needed?

8
Connection Directly on TIPC
[Diagram: CE (CE Object, FB X, FB Y) and FE (FE Object, LFB 1, LFB 2), with TIPC between them]
9
Connections via FE/CE Object
[Diagram: CE (CE Object, FB X, FB Y) and FE (FE Object, LFB 1, LFB 2), with TIPC between them]
10
Connection Usage
[Diagram: CE (CE Object, FB X, FB Y) and FE (FE Object, LFB 1, LFB 2), with TIPC between them]
  • Traffic data connection: low priority, reliable
    CE->FE, unreliable FE->CE
  • Control connection: high priority, reliable in
    both directions
11
Functional Addressing: Unicast
  • Function Address
      • Persistent, reusable 64-bit port identifier
        assigned by user
      • Consists of type number and instance number
  • Function Address Sequence
      • Sequence of function addresses with same type

Server Process, Partition A: bind(type=foo, lower=0, upper=99)
Server Process, Partition B: bind(type=foo, lower=100, upper=199)
Client Process: sendto(type=foo, instance=33)
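
The unicast lookup on this slide can be illustrated with a small sketch. It is a toy, single-process model written for this transcript, not TIPC code: the `NameTable` and `Binding` names and the string port labels are hypothetical, and it assumes the first matching binding wins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    """One bind() call: a port serving instances lower..upper of a type."""
    name_type: str
    lower: int
    upper: int
    port: str  # hypothetical label standing in for the bound socket


class NameTable:
    """Toy name table (assumption: one process, first match wins)."""

    def __init__(self):
        self._bindings = []

    def bind(self, name_type, lower, upper, port):
        self._bindings.append(Binding(name_type, lower, upper, port))

    def resolve(self, name_type, instance):
        """Return the port whose range contains the instance, else None."""
        for b in self._bindings:
            if b.name_type == name_type and b.lower <= instance <= b.upper:
                return b.port
        return None


table = NameTable()
table.bind("foo", 0, 99, "partition-A")
table.bind("foo", 100, 199, "partition-B")
print(table.resolve("foo", 33))
```

With the two bindings from the slide, a message sent to type foo, instance 33 resolves to Partition A because 33 lies in the range 0 to 99.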
12
Address Mapping - Unicast
[Diagram: CE and FE, with TIPC between them]
CE (CE Object, FB X, RSVP 77):
  tml_bind(RSVP,77) at the TML API -> bind(RSVP,77,77) at the TIPC API
FE (FE Object, LFB 1, Meter 44):
  tml_bind(meter,44) at the TML API -> bind(meter,44,44) at the TIPC API
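
The mapping shown on this slide, where a TML bind of one (type, instance) pair becomes a TIPC bind on a range of width one, might be sketched as follows. The `tml_bind` signature with an injected `tipc_bind` callable and the `fake_tipc_bind` recorder are assumptions made for illustration; the slide names the calls but not their signatures.

```python
def tml_bind(tipc_bind, fn_type, instance):
    """Map a TML bind of (type, instance) to a TIPC bind of (type, lower, upper).

    Assumption: lower and upper are both set to the instance, as in the
    slide's bind(RSVP,77,77) and bind(meter,44,44).
    """
    return tipc_bind(fn_type, instance, instance)


calls = []


def fake_tipc_bind(fn_type, lower, upper):
    """Hypothetical stand-in for the TIPC API; it only records the call."""
    calls.append((fn_type, lower, upper))
    return 0


tml_bind(fake_tipc_bind, "RSVP", 77)    # CE side
tml_bind(fake_tipc_bind, "meter", 44)   # FE side
print(calls)
```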
13
Connection Setup
[Diagram: CE 8 (CE Object, FB X, RSVP 77) and FE 17 (FE Object, LFB 1, Meter 44), with TIPC between them]
FE 17: tml_connect(RSVP,77, CEID8) -> connect(RSVP,77,node8) at the TIPC API
If instance numbers are coordinated over the whole
cluster, there is no need for LFBs to know the CEID
14
Functional Addressing: Multicast
  • Based on Function Address Sequences
  • Any partition overlapping with the range used in
    the destination address will receive a copy of
    the message
  • Client defines multicast group per call

Server Process, Partition A: bind(type=foo, lower=0, upper=99)
Server Process, Partition B: bind(type=foo, lower=100, upper=199)
Client Process: sendto(type=foo, lower=33, upper=133)
Message foo,33,133 is delivered to Partition A and Partition B
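
The overlap rule for multicast can be checked with a few lines of code. This is a sketch of the range arithmetic only, assuming inclusive ranges. The tuple layout, the function names and the extra partition C are hypothetical and were added to show a non-matching case.

```python
def overlaps(lower_a, upper_a, lower_b, upper_b):
    """True if the inclusive ranges [lower_a, upper_a] and [lower_b, upper_b] intersect."""
    return lower_a <= upper_b and lower_b <= upper_a


def multicast_targets(bindings, name_type, lower, upper):
    """Return every bound port whose partition overlaps the destination range."""
    return [
        port
        for (b_type, b_lower, b_upper, port) in bindings
        if b_type == name_type and overlaps(b_lower, b_upper, lower, upper)
    ]


bindings = [
    ("foo", 0, 99, "partition-A"),
    ("foo", 100, 199, "partition-B"),
    ("foo", 200, 299, "partition-C"),  # hypothetical, not on the slide
]
print(multicast_targets(bindings, "foo", 33, 133))
```

The destination range 33 to 133 intersects both 0 to 99 and 100 to 199, so partitions A and B each receive a copy, while the hypothetical partition C does not.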
15
Address Mapping - Multicast
[Diagram: CE and FE, with TIPC between them]
CE (CE Object, FB X, RSVP 77):
  tml_mcast(meter_mc, groupX) -> sendto(meter_mc,X,X)
FE (FE Object, Meter 13 and Meter 44):
  each tml_join(meter_mc,X) -> bind(meter_mc,X,X)
16
Questions???
17
Why TIPC in ForCES?
  • Congestion control at three levels
      • Connection level, signalling link level and
        media level
      • Based on 4 importance priorities
  • Simple to configure
      • Each node needs to know its own identity, that
        is all
      • Automatic neighbour detection using
        multicast/broadcast
  • Lightweight, Reactive Connections
      • Immediate connection abortion at node/process
        failure or overload
  • Topology Subscription Service
      • Functional and physical topology

18
Functional View
Socket API Adapter
Port API Adapter
Other API Adapters
User Adapter API
Address Subscription
Address Resolution
Address Table Distribution
Connection Supervision
Route/Link Selection
Reliable Multicast
Neighbour Detection
Link Establish/Supervision/Failover
Node Internal
Fragmentation/De-fragmentation
Packet Bundling
Congestion Control
Sequence/Retransmission Control
Bearer Adapter API
Infiniband
Mirrored Memory
Ethernet
SCTP
UDP
19
Network Topology
Zone <1>
Zone <2>
Cluster <2.1>
Slave Node <2.1.3333>
Node <1.2.3>
20
Functional Addressing: Unicast
  • Function Address
      • Persistent, reusable 64-bit port identifier
        assigned by user
      • Consists of type number and instance number
  • Function Address Sequence
      • Sequence of function addresses with same type

Server Process, Partition A: bind(type=foo, lower=0, upper=99)
Server Process, Partition B: bind(type=foo, lower=100, upper=199)
Client Process: sendto(type=foo, instance=33)
21
Functional Addressing: Multicast
  • Based on Function Address Sequences
  • Any partition overlapping with the range used in
    the destination address will receive a copy of
    the message
  • Client defines multicast group per call

Server Process, Partition A: bind(type=foo, lower=0, upper=99)
Server Process, Partition B: bind(type=foo, lower=100, upper=199)
Client Process: sendto(type=foo, lower=33, upper=133)
Message foo,33,133 is delivered to Partition A and Partition B
22
Location Transparency
  • Location of server not known by client
  • Lookup of physical destination performed
    on-the-fly
  • Efficient, no secondary messaging involved

Node <1.1.1>
Server Process, Partition A: bind(type=foo, lower=0, upper=99)
Server Process, Partition B: bind(type=foo, lower=100, upper=199)
Client Process: sendto(type=foo, lower=33, upper=133)
Message: foo,33,133
23
Location Transparency
  • Location of server not known by client
  • Lookup of physical destination performed
    on-the-fly
  • Efficient, no secondary messaging involved

Node <1.1.2>
Node <1.1.1>
Server Process, Partition A: bind(type=foo, lower=0, upper=99)
Server Process, Partition B: bind(type=foo, lower=100, upper=199)
Client Process: sendto(type=foo, lower=33, upper=133)
Message: foo,33,133
24
Location Transparency
  • Location of server not known by client
  • Lookup of physical destination performed
    on-the-fly
  • Efficient, no secondary messaging involved

Node <1.1.1>
Server Process, Partition A: bind(type=foo, lower=0, upper=99)
Server Process, Partition B: bind(type=foo, lower=100, upper=199)
Client Process: sendto(type=foo, lower=33, upper=133)
Message: foo,33,133
25
Address Binding
  • Many sockets may bind to same partition
  • Closest-First or Round-Robin algorithm chosen by
    client

Server Process, Partition A: bind(type=foo, lower=0, upper=99)
Server Process, Partition A: bind(type=foo, lower=0, upper=99)
Client Process: sendto(type=foo, lower=33, upper=133)
Message: foo,33,133
26
Address Binding
  • Many sockets may bind to same partition
  • Closest-First or Round-Robin algorithm chosen by
    client
  • Same socket may bind to many partitions

Server Process, Partition B: bind(type=foo, lower=100, upper=199)
Server Process, Partition AB: bind(type=foo, lower=0, upper=99),
  bind(type=foo, lower=100, upper=199)
Client Process: sendto(type=foo, lower=33, upper=133)
Message: foo,33,133
27
Address Binding
  • Many sockets may bind to same partition
  • Closest-First or Round-Robin algorithm chosen by
    client
  • Same socket may bind to many partitions
  • Same socket may bind to different functions

Server Process, Partition B: bind(type=foo, lower=100, upper=199)
Server Process, Partition A: bind(type=foo, lower=0, upper=99),
  bind(type=bar, lower=0, upper=999)
Client Process: sendto(type=foo, lower=33, upper=133)
Message: foo,33,133
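
When several sockets bind to the same partition, the client picks between Closest-First and Round-Robin. The two policies might look roughly like this. It is a simplified sketch: "closest" is reduced to "on the caller's own node", and the node strings, port labels and function names are hypothetical.

```python
import itertools


def closest_first(candidates, local_node):
    """Prefer a binding on the caller's own node, else fall back to the first one.

    Assumption: candidates is a non-empty list of (node, port) pairs.
    """
    for node, port in candidates:
        if node == local_node:
            return port
    return candidates[0][1]


def round_robin(candidates):
    """Return an endless iterator that rotates over all bound ports."""
    return itertools.cycle([port for _, port in candidates])


# Two sockets bound to the same partition on different nodes.
candidates = [("1.1.1", "server-1"), ("1.1.2", "server-2")]

print(closest_first(candidates, "1.1.2"))
rotation = round_robin(candidates)
print([next(rotation) for _ in range(3)])
```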
28
Functional Topology Subscription
  • Function Address/Address Partition bind/unbind
    events

Server Process, Partition A: bind(type=foo, lower=0, upper=99)
Server Process, Partition B: bind(type=foo, lower=100, upper=199)
Client Process: subscribe(type=foo, lower=0, upper=500)
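
A subscription over a range of function addresses can be pictured as a callback that fires whenever a matching partition is bound or unbound. The sketch below is an in-process toy under that assumption. The `TopologyService` class, its method names and the event strings are hypothetical, and it only reports changes that happen after the subscription is made.

```python
class TopologyService:
    """Toy publish/subscribe model of bind/unbind events."""

    def __init__(self):
        self._subscriptions = []  # (type, lower, upper, callback)

    def subscribe(self, name_type, lower, upper, callback):
        self._subscriptions.append((name_type, lower, upper, callback))

    def _notify(self, event, name_type, lower, upper):
        for s_type, s_lower, s_upper, callback in self._subscriptions:
            # Deliver the event if the bound range overlaps the subscribed range.
            if s_type == name_type and s_lower <= upper and lower <= s_upper:
                callback(event, name_type, lower, upper)

    def bind(self, name_type, lower, upper):
        self._notify("bind", name_type, lower, upper)

    def unbind(self, name_type, lower, upper):
        self._notify("unbind", name_type, lower, upper)


events = []
service = TopologyService()
service.subscribe("foo", 0, 500, lambda *event: events.append(event))
service.bind("foo", 0, 99)      # Partition A appears
service.bind("foo", 100, 199)   # Partition B appears
service.bind("bar", 0, 999)     # different type, not reported
service.unbind("foo", 0, 99)    # Partition A disappears
print(events)
```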
29
Network Topology Subscription
  • Node/Cluster/Zone availability events
  • Same mechanism as for function events

Node <1.1.1>: Client Process,
  subscribe(type=node, lower=0x1001000, upper=0x1001009)
Node <1.1.2>: node,0x1001002
Node <1.1.3>: node,0x1001003
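
The hexadecimal instance numbers on this slide line up with the <zone.cluster.node> notation if the address is packed as 8 bits of zone, 12 bits of cluster and 12 bits of node. That layout is an assumption here, chosen because it reproduces the slide's values. The helper names are hypothetical.

```python
def tipc_addr(zone, cluster, node):
    """Pack <zone.cluster.node> into one integer (assumed 8/12/12-bit layout)."""
    return (zone << 24) | (cluster << 12) | node


def split_addr(addr):
    """Unpack an address into (zone, cluster, node) under the same assumption."""
    return (addr >> 24) & 0xFF, (addr >> 12) & 0xFFF, addr & 0xFFF


def format_addr(addr):
    """Render an address in the <Z.C.N> notation used on the slides."""
    return "<%d.%d.%d>" % split_addr(addr)


print(hex(tipc_addr(1, 1, 3)))
print(format_addr(0x1001002))
```

Under this layout the subscribed range 0x1001000 to 0x1001009 covers node numbers 0 through 9 of cluster <1.1>, which includes the two nodes shown.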
30
ForCES Applied on TIPC
[Diagram: Network Equipment containing a Control Element (OSPF, RIP; COPS, CLI, SNMP; Other Applications) and a Forwarding Element, joined by ForCES Protocol/TIPC]
31
ForCES Applied on TIPC
[Diagram: Network Equipment containing three Control Elements (OSPF, RIP; COPS, CLI, SNMP; Other Applications) and two Forwarding Elements, joined by ForCES Protocol/TIPC]
32
CONNECTIONS
  • Establishment based on functional addressing
  • Selectable lookup algorithm, partitioning,
    redundancy etc
  • No protocol messages exchanged during
    setup/shutdown
  • Only payload carrying messages
  • Traditional TCP-style connection setup/shutdown
    as alternative
  • End-to-end flow control
  • SOCK_SEQPACKET
  • SOCK_STREAM
  • SOCK_RDM for connectionless and multicast
  • SOCK_DGRAM can easily be added if needed
  • Same with Unreliable SOCK_SEQPACKET

33
CONNECTIONS
  • No protocol messages exchanged during
    setup/shutdown
  • Only payload carrying messages

Server Process, Partition B: foo,117
sendto(type=foo, instance=117)
34
CONNECTIONS
  • No protocol messages exchanged during
    setup/shutdown
  • Only payload carrying messages

Server Process, Partition B: connect(client), send()
35
CONNECTIONS
  • No protocol messages exchanged during
    setup/shutdown
  • Only payload carrying messages

Server Process, Partition B

connect(server)
36
CONNECTIONS
  • Immediate abortion event in case of peer
    process crash

Server Process, Partition B
37
CONNECTIONS
  • Immediate abortion event in case of peer node
    crash

Node <1.1.5>
Node <1.1.3>
Server Process, Partition B
abort
38
CONNECTIONS
  • Immediate abortion event in case of
    communication failure

Node <1.1.5>
Node <1.1.3>
Server Process, Partition B
abort
39
CONNECTIONS
  • Immediate abortion event in case of node
    overload

Node <1.1.5>
Node <1.1.3>
Server Process, Partition B
40
Network Redundancy
  • Retransmission protocol and congestion control at
    signalling link level
  • Normally two links per node pair, for full load
    sharing and redundancy

Node <1.1.5>
Node <1.1.3>
Server Process, Partition B
41
Network Redundancy
  • Retransmission protocol and congestion control at
    signalling link level
  • Normally two links per node pair, for full load
    sharing and redundancy
  • Smooth failover in case of single link failure,
    with no consequences for user level connections

Node <1.1.5>
Node <1.1.3>
Server Process, Partition B
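
Load sharing over two links with failover to the survivor can be sketched as below. This is a toy model, not the TIPC link protocol: the `LinkPair` class and link names are hypothetical, retransmission is left out, and it assumes the connection is aborted only when no link remains.

```python
class LinkPair:
    """Toy model of the links between one pair of nodes."""

    def __init__(self, names=("link-A", "link-B")):
        self.up = {name: True for name in names}
        self._counter = 0

    def fail(self, name):
        """Mark one link as failed."""
        self.up[name] = False

    def pick(self):
        """Choose a link for the next message, alternating over working links."""
        active = [name for name, ok in self.up.items() if ok]
        if not active:
            # No redundant carrier left: the user-level connection is aborted.
            raise ConnectionAbortedError("no redundant carrier available")
        link = active[self._counter % len(active)]
        self._counter += 1
        return link


pair = LinkPair()
before = [pair.pick() for _ in range(4)]   # both links share the load
pair.fail("link-A")
after = [pair.pick() for _ in range(2)]    # traffic continues on link-B only
print(before, after)
```

As long as one link survives, the caller of `pick()` sees no error, which mirrors the slide's point that a single link failure has no consequences for user-level connections.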
42
Remaining Work
  • Implementation
      • Reliable Multicast not fully implemented yet
        (expected end of Q1)
      • Re-stabilization after most recent changes
      • Re-implementation of multi-cluster neighbour
        detection and link setup
  • Protocol
      • Fully manual inter-cluster link setup
      • Guaranteeing Name Table consistency between
        clusters
      • Slave node Name Table reduction
      • ?????

43
http://tipc.sourceforge.net
44
QUESTIONS ??