FIRB Project High performance enabling platforms for computational grid oriented scalable virtual or - PowerPoint PPT Presentation

1
FIRB Project: High performance enabling platforms
for computational grid oriented scalable virtual
organization (GRID.IT)
(Resp. Prof. Marco Vanneschi, UniPi/CNR)
PROJECT WORKSHOP
WP1: GRID ORIENTED OPTICAL SWITCHING PARADIGMS
Piero Castoldi, Luca Valcarenghi
Rome, 1-3 March 2005
2
Outline of this talk
  • Objectives of WP1
  • Network scenarios
  • Positioning of the activity
  • State of the activities of WP1
  • Sample architectures and results
  • Focus on grid network services for NA-PE (by Luca
    Valcarenghi)
  • Conclusions and future work

3
WP1 activities
WP 1: Grid oriented optical switching
paradigms (Resp. P. Castoldi)
We are here!
  • Activity 1: Connections, topologies and network
    service models
    (Resp. R. Battiti) [Lab: PI, TN]
  • Activity 2: Grid computing on state-of-the-art
    optical networks
    (Resp. P. Castoldi) [Lab: PI, TN, UTD]
  • Activity 3: Migration scenarios to intelligent
    flexible optical networks
    (Resp. F. Callegati) [Lab: PI, BO, TN, UTD]
  • Activity 4: Control plane and network emulation
    for optical packet switching networks
    (Resp. A. Fumagalli) [Lab: PI, TN, UTD]
  • Activity 5: Enabling technologies for optical
    switching networks
    (Resp. G. Cancellieri) [Lab: PI, BO, AN]

[Timeline chart: duration of each activity over project years 1 to 3.]
4
WP1 Objectives (1)
  • Overall WP1 objective: introduce network awareness
    into a grid programming environment (by means of
    new grid network services) for operation over any
    (optical) WAN transport infrastructure
  • Supporting global grid computing in a WAN requires
    guaranteeing QoS in terms of new grid application
    requirements:
  • Fault tolerance
  • Low latency
  • Dynamic provisioning / dynamic reconfigurability
    (logical topology)
  • Bit rate / protocol independence
  • Definition of a Network Aware Programming
    Environment (NA-PE): it should be able to dynamically
    adapt the network resources used to seamlessly
    meet grid application requirements

5
WP1 Objectives (2)
  • Communication: from connectionless to
    connection-oriented, packet-based
  • From end-to-end IP-based best-effort transport to
    GMPLS-controlled optical transport, through
    Diffserv and IP/MPLS transport
  • Extended resource database, i.e. computational
    network database
  • Collaboration between application middleware and
    network middleware is introduced
  • Introduction of new grid network services

6
Positioning of WP1 (1)
  • To the best of our knowledge:
  • National projects (PRIN, FIRB, CofinLab)
  • No activities are really ongoing on the topic: the
    telecommunications (TLC) world has developed some
    ideas, and the computer science (CS) world could use them
  • CNIT developed ad hoc solutions within the LABNET and
    VICOM projects
  • Need for cooperation between the two worlds
  • European projects (IP, NoE, STREPs)
  • CSCE cooperation pushed within IST IP NOBEL
    yielded positive feedback, but a killer
    application is missing (grid!)
  • Concepts pushed within IST NoE e-photon/ONE
    raised interest, especially from the architectural
    point of view
  • Interest of TLC operators (Wind in Italy)

7
Positioning of WP1 (2)
  • Outside Europe
  • Different needs in different countries: many
    countries push for grid computing research
    efforts
  • E.g. Japan, Korea, and the US have over-provisioned
    networks (no bottlenecks in access or metro)
  • Great interest everywhere in the so-called
    service platforms
  • Service platforms are becoming of interest where
    network bottlenecks may affect transaction
    performance
  • Service platforms allow automatic pipe
    provisioning customized to services (including
    grid)

8
Network scenario 1 (NS1)
  • Backbone network based on IP/MPLS network with
    centralized/distributed service plane to support
    grid general purpose and grid network services
    for NA-PE

9
Network scenario 2 (NS2)
  • Backbone network based on WDM optical transport with
    centralized/distributed service plane to support
    grid general purpose and grid network services
    for a NA-PE

10
Network scenario 3 (NS3)
  • Backbone network based on optical packet
    switching
  • No real distributed service plane can be built
    due to processing limitation of nodes
  • Services must be driven at the edge routers; native
    services can be exploited

11
State of the activity (1)
  • Architectures (see PRC3)
  • Centralized and distributed support for NA-PE
    operating over IP/MPLS packet networks (NS1)
  • Centralized and distributed support for NA-PE
    operating over optical circuit networks (NS2)
  • On-going for optical packet switched networks
    (NS3)
  • Technological issues
  • Networking issues
  • The two issues are not disjoint

12
State of the activity (2)
  • Demonstration (our metropolitan testbed, see
    PRC3)
  • Hardware
  • Clusters (workers) and IP/MPLS routers are available
    and in operation (NS1)
  • WDM optical network elements are available but not in
    operation (NS2)
  • Optical packet routers are in the prototyping phase
    (NS3)
  • Software
  • IP/MPLS router dynamic configuration through XML
    scripts (NS1)
  • WDM optical network elements: only static configuration
    through the management interface is available so
    far (NS2)
  • Optical routers are not reconfigurable in
    software; hardware configuration only (NS3)

13
Performance evaluation
  • Functional validation (qualitative, see PRC3)
  • Through use cases
  • Numerical validation (quantitative, see PRC3)
  • Measurements on the testbed
  • Includes:
  • Network performance
  • Service plane algorithm efficiency
  • Parsing/formatting of XML messages
  • ... and computation time

14
Sample architectures and results
  • New Grid Network Services for NA-PE (Luca will
    expand)
  • New Service Oriented ASTN Architecture (SO-ASTN)
    provided with Service Plane functionalities
  • VPN topology discovery use case
  • Fault tolerance service (Luca will expand)
  • Performance evaluation of all-optical switching
    technologies

15
New Grid Services
General Purpose Grid Services
  • Job Scheduling Service
  • Replica Optimization Service
  • Authentication/Authorization Service
  • Resource Discovery Service
Grid Network Services
  • Network Information Service (NIS)
  • Network Monitoring Service (NMS)
  • Network Cost Estimation Service (NCES)
  • Connectivity Service (CS)
NIMS
GRID User to Network Interface
  • NIMS provides network status information
    (network topology, available bandwidth)
  • NCES lets Grid services use monitoring
    information for dynamic adaptation to the
    Grid status
  • CS consists of the Reachability Service and
    the Connectivity Establishment Service

16
SO-ASTN Architecture
  • SO-ASTN: Service Oriented Automatically
    Switched Transport Network
  • G-UNI: GRID User to Network Interface
  • UNI: User to Network Interface
  • NMI-A: Network Management Interface ASTN
  • NMI-T: Network Management Interface Transport
  • CCI: Connection Controller Interface
[Figure: SO-ASTN architecture. Labels: Grid Access Network, GRID Applications, G-UNI, Service Plane, UNI, Management Plane, Control Plane, SO-ASTN, ASTN, NMI-A, NMI-T, CCI, Transport Plane.]
17
VPN topology discovery (1)
[Figure: VPN topology discovery. Labels: PC A, PC B, PC C, GRID Service, GRID Network layer, VPN, GUNI, FE, Distributed Network Service Plane, DSE, UNI, GE, Network Plane, Edge Layer, IR, DWDM Ring, Inner Layer.]
18
VPN topology discovery (2)
<?xml version="1.0" encoding="UTF-8"?>
<Topology xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="TopologyResponse.xsd">
  <Node ID="100" Name="A">
    <Interface ID="1" Address="217.9.70.11" Type="2">
      <Port ID="1">
        <Performance Delay="300" Jitter="50" BER="7"/>
        <Bandwidth Available="1" Utilized="0"/>
        <Destination NodeId="300" InterfaceId="1" PortId="1"/>
      </Port>
    </Interface>
    <Interface ID="2" Address="217.9.70.12" Type="2">
      <Port ID="1">
        <Performance Delay="400" Jitter="50" BER="8"/>
        <Bandwidth Available="1" Utilized="0"/>
        <Destination NodeId="200" InterfaceId="2" PortId="1"/>
      </Port>
    </Interface>
  </Node>
</Topology>
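A topology response of this kind can be consumed with any standard XML parser. The following is a minimal Python sketch, assuming the element and attribute names of the sample response (Node, Interface, Port, Performance, Destination). The function name parse_topology and the layout of the returned records are illustrative assumptions, not part of the service, and the sample is trimmed (no XML declaration or schema attributes).

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample response (declaration and schema attributes omitted).
SAMPLE = """<Topology>
  <Node ID="100" Name="A">
    <Interface ID="1" Address="217.9.70.11" Type="2">
      <Port ID="1">
        <Performance Delay="300" Jitter="50" BER="7"/>
        <Bandwidth Available="1" Utilized="0"/>
        <Destination NodeId="300" InterfaceId="1" PortId="1"/>
      </Port>
    </Interface>
    <Interface ID="2" Address="217.9.70.12" Type="2">
      <Port ID="1">
        <Performance Delay="400" Jitter="50" BER="8"/>
        <Bandwidth Available="1" Utilized="0"/>
        <Destination NodeId="200" InterfaceId="2" PortId="1"/>
      </Port>
    </Interface>
  </Node>
</Topology>"""


def parse_topology(xml_text):
    """Return one record per port that names a destination node.

    The record layout is a hypothetical choice made for this sketch.
    """
    root = ET.fromstring(xml_text)
    edges = []
    for node in root.findall("Node"):
        for iface in node.findall("Interface"):
            for port in iface.findall("Port"):
                dest = port.find("Destination")
                if dest is None:
                    continue  # port with no known neighbor
                perf = port.find("Performance")
                edges.append({
                    "src": node.get("ID"),
                    "dst": dest.get("NodeId"),
                    "address": iface.get("Address"),
                    "delay": int(perf.get("Delay")) if perf is not None else None,
                })
    return edges


edges = parse_topology(SAMPLE)
```

Each record describes one directed adjacency, which is enough to rebuild the VPN logical topology as a graph on the client side.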
19
Fault tolerance service: Path Restoration and
Service Replication
Integrated path restoration and service
replication: fault tolerance problem solution
[Figure: three panels (Path Protection/Restoration, Service Replication, Integrated path restoration and service replication), each showing Service A with primary, secondary, and recovery instances or paths. Other labels: NCES, NIMS, CS, ASSIST.]
20
NS3: Adaptive routing in OPS (1)
  • The forwarding algorithm determines
    the output fiber and the output wavelength
  • If the wavelengths are busy, the packet is either
  • delayed in the FDL buffer, or
  • dropped, because the required delay is not
    available
  • Wavelength and delay selection (WDS) are
    correlated; the goals are to
  • minimize the gaps
  • maximize the wavelength utilization
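The correlation between wavelength and delay selection can be illustrated with a small Python sketch. It assumes a gap-minimizing policy and an FDL buffer that offers delays of 0, D, 2D, ..., B*D. The function name, the parameters (free_at, fdl_unit, num_fdls), and the tie-breaking rule are assumptions made for illustration and are not taken from the slides.

```python
from typing import Optional, Sequence, Tuple


def select_wavelength_and_delay(
    free_at: Sequence[float],  # time at which each wavelength becomes free
    arrival: float,            # packet arrival time
    fdl_unit: float,           # delay granularity D of the FDL buffer (assumed)
    num_fdls: int,             # number of FDLs B, so delays are 0..B*D (assumed)
) -> Optional[Tuple[int, int]]:
    """Return (wavelength index, number of delay units), or None if dropped."""
    best = None
    for wavelength, t_free in enumerate(free_at):
        for k in range(num_fdls + 1):
            start = arrival + k * fdl_unit
            if start >= t_free:
                # Gap = idle time left on the wavelength before the packet starts.
                candidate = (start - t_free, k, wavelength)
                if best is None or candidate < best:
                    best = candidate
                break  # a longer delay on this wavelength only widens the gap
    if best is None:
        return None  # no available delay fits on any wavelength: packet dropped
    _, k, wavelength = best
    return wavelength, k


choice = select_wavelength_and_delay([5.0, 2.0, 9.0], arrival=3.0, fdl_unit=2.0, num_fdls=2)
dropped = select_wavelength_and_delay([9.0], arrival=3.0, fdl_unit=2.0, num_fdls=2)
```

In the first call, wavelength 0 with one delay unit leaves no gap, so it is preferred over the idle wavelength 1, which would leave a gap of one time unit. Other policies (for example, minimum delay first) would order the candidates differently.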

21
NS3: Adaptive routing in OPS (2)
  • The routing algorithm provides
  • a default path, used as a first choice
  • a few alternative paths, used in case the default
    is congested
  • Traffic flows are routed according to different
    path selection strategies
  • SL (Single Link): only the default path is used
    (static routing)
  • SA (Single Alternative): a single alternative
    path is used
  • MA (Multiple Alternative): more than one
    alternative path is used
  • Packets may be transmitted on different
    wavelengths according to given strategies
  • PS (Partial Sharing): if the default path is
    congested, the best wavelength is chosen on one
    of the alternative paths
  • CS (Complete Sharing): the best wavelength is
    chosen over the default and the alternative paths
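The combinations of path selection (SL, SA, MA) and wavelength sharing (PS, CS) can be read as rules for which paths are eligible when a packet is forwarded. A minimal Python sketch follows, assuming that the alternatives are given in order of preference and that congestion of the default path is known as a boolean. The function name candidate_paths and its signature are hypothetical.

```python
def candidate_paths(default, alternatives, routing, sharing, default_congested):
    """Return the paths on which the best wavelength may be searched.

    routing: "SL", "SA", or "MA"; sharing: "PS" or "CS".
    """
    if routing not in ("SL", "SA", "MA"):
        raise ValueError(f"unknown routing strategy: {routing}")
    if sharing not in ("PS", "CS"):
        raise ValueError(f"unknown sharing strategy: {sharing}")

    if routing == "SL":
        return [default]  # static routing: only the default path

    # SA keeps a single alternative, MA keeps all of them.
    alts = list(alternatives[:1]) if routing == "SA" else list(alternatives)

    if sharing == "CS":
        # Complete sharing: search over default and alternatives together.
        return [default] + alts

    # Partial sharing: alternatives are used only when the default is congested.
    return alts if default_congested else [default]


paths = candidate_paths("d", ["a1", "a2"], "MA", "CS", default_congested=False)
```

This covers the five curves compared in the results (SL, PS-SA, CS-SA, PS-MA, CS-MA); the wavelength choice itself would then run over the returned paths.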

22
Results on European network topology
[Figure: packet loss probability (log scale, 1e-05 to 1e-01) versus number of FDLs (1 to 5); curves: SL, PS-SA, CS-SA, PS-MA, CS-MA.]
23
Conclusions and future work
  • Identified roadmap for supporting NA-PE on
    different network infrastructures
  • Proposed new network services
  • Given sample use cases of network services
  • Dissemination: 1 tutorial at Hot Interconnect 12
    Conf. (USA)
  • Publications in 2004: 2 journal papers, 15
    conference papers
  • Complete architectural and implementation work to
    support NA-PE
  • Strengthen interactions with WP1, WP2, WP8 and
    possibly other WPs

24
ACK to all the people who have contributed!
25
FIRB Project: High performance enabling platforms
for computational grid oriented scalable virtual
organization (GRID.IT)
(Resp. Prof. Marco Vanneschi, UniPi/CNR)
PROJECT WORKSHOP
NEW NETWORK SERVICES FOR GRID COMPUTING
Luca Valcarenghi
Rome, 1-3 March 2005
26
Network Aware Programming Environment
[Figure: Network Aware Programming Environment. Labels: User, User Interface (UI), Run (max_exec_time, reliable, etc.), Application Requests, Application, f(max_exec_time, reliable, etc.), Programming Environment, General Purpose Grid Services, Grid Network Services, Network Resource DB, Computational Resource DB, Infos, Update, Notification, Allocation request, Elaboration, Resource Allocation, Middleware (Grid Abstract Machine), Basic HW/SW platform.]
27
Grid Network Service Interaction with Network
Management and Control Plane
  • General Purpose Grid Services
  • Resource Discovery Service
  • Job Scheduling Service
  • Replica Optimization Service
  • Authentication and authorization service
  • Grid Network Services
  • Network Information and Monitoring Services
    (NIMS)
  • Connectivity Service (CS)
  • Network Cost Estimation Service (NCES)

28
Topology Discovery Service as Part of the NIMS
Why?
MOTIVATIONS
  • Network infrastructure status information is needed
    in order to improve GRID performance: GRID
    Network Information and Monitoring Service
  • Existing monitoring tools, such as the Network Weather
    Service (NWS), measure bandwidth and latency of
    end-to-end paths through invasive TCP/IP-based
    probing
  • Topology Discovery Service: enhanced information
    includes the different available routes, reserved
    resources, and the physical and logical topology

GOALS
  • Deriving the GRID network physical and logical
    topology (assuming all the involved nodes
    are known).
  • Detecting a real-time snapshot of busy and
    available GRID resources.
  • Using network information to allow GRID
    network services to reserve capacity resources,
    utilize alternative paths, and avoid or resolve
    congestion or failure events.

29
Centralized Approach
[Figure: network topology diagrams]

30
Strategy
  • Assumption: router connections in the whole MPLS
    network are based on point-to-point links.
  • Each node is queried with the following set of
    requests:
  • 1. Physical interface type (FE, GE, ATM) and
    speed.
  • 2. Logical interface IP address (local
    subnet).
  • 3. Reserved and available MPLS-RSVP bandwidth.
  • 4. Number of LSPs passing through.
  • The topology server merges all information sets and
    builds:
  • 1. Node adjacencies.
  • 2. Link types and speeds.
  • 3. Current reserved traffic and available
    bandwidth.
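Under the point-to-point assumption, node adjacencies can be derived by matching local subnets: two logical interfaces that fall in the same subnet are the two ends of one link. The Python sketch below illustrates this merge step. The input layout (a mapping from node name to a list of interface name and address pairs), the function name build_adjacencies, and the sample addresses are assumptions made for the example.

```python
import ipaddress
from collections import defaultdict


def build_adjacencies(node_interfaces):
    """Pair up logical interfaces that share a subnet.

    node_interfaces: {node: [(interface_name, "address/prefix"), ...]}
    Returns a sorted list of (end_a, end_b, subnet) tuples.
    """
    by_subnet = defaultdict(list)
    for node, interfaces in node_interfaces.items():
        for name, cidr in interfaces:
            subnet = ipaddress.ip_interface(cidr).network
            by_subnet[subnet].append((node, name))

    links = []
    for subnet, ends in by_subnet.items():
        # A point-to-point link has exactly two ends; anything else is skipped.
        if len(ends) == 2:
            links.append((ends[0], ends[1], str(subnet)))
    return sorted(links)


# Hypothetical per-node replies, reduced to interface name and address.
replies = {
    "A": [("ge-0/0/0.0", "172.16.0.1/29"), ("ge-0/0/1.0", "172.16.0.9/29")],
    "B": [("ge-0/0/2.0", "172.16.0.2/29")],
    "C": [("ge-0/0/0.0", "172.16.0.10/29")],
}
links = build_adjacencies(replies)
```

Because the pairing relies only on the interface addresses reported by each router, it does not depend on any routing protocol being active between the nodes.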

31
XML-based Implementation
Centralized topology Service
  • Why XML?
  • Flexible and light.
  • Easy update.
  • Extensible solution.

[Figure: centralized topology service. Labels: Service Database, GRID server, XSLT engine, Communication module, XSLT files database, Junoscript XML server.]
  • GRID server: GRID application request/reply
    manager.
  • XSLT engine: XML transformation module.
  • Communication module: Junoscript-based
    simple query manager.
32
XML Messages Format
<topology>
  <node address="217.9.70.112" type="mpls">
    <physical_interface name="ge-0/0/0" type="ge">
      <logical_interface name="ge-0/0/0.0"
                         local="172.16.0.1/29"
                         subnet="172.16.0.0/29"
                         destination_node="217.9.70.111"
                         destination_interface="ge-0/0/2.0"
                         total_bandwidth="1000"
                         reserved_bandwidth="10"
                         active_LSPs="1">
      </logical_interface>
    </physical_interface>
  </node>
</topology>
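A message in this format carries enough information to derive the bandwidth still available on each logical interface. The Python sketch below assumes the attribute names of the sample message and that total_bandwidth and reserved_bandwidth use the same unit, which the slides do not state. The function name available_bandwidth is hypothetical.

```python
import xml.etree.ElementTree as ET

MESSAGE = """<topology>
  <node address="217.9.70.112" type="mpls">
    <physical_interface name="ge-0/0/0" type="ge">
      <logical_interface name="ge-0/0/0.0"
                         local="172.16.0.1/29"
                         subnet="172.16.0.0/29"
                         destination_node="217.9.70.111"
                         destination_interface="ge-0/0/2.0"
                         total_bandwidth="1000"
                         reserved_bandwidth="10"
                         active_LSPs="1"/>
    </physical_interface>
  </node>
</topology>"""


def available_bandwidth(xml_text):
    """Map (node address, logical interface name) to total minus reserved bandwidth."""
    root = ET.fromstring(xml_text)
    result = {}
    for node in root.iter("node"):
        for lif in node.iter("logical_interface"):
            total = float(lif.get("total_bandwidth"))
            reserved = float(lif.get("reserved_bandwidth"))
            # Assumption: both attributes are expressed in the same unit.
            result[(node.get("address"), lif.get("name"))] = total - reserved
    return result


free = available_bandwidth(MESSAGE)
```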
  • Junoscript queries
  • Get-interface-information.
  • Show rsvp interface detail

[Figure: nodes A, B, and C exchanging the files 1a.xml, 1b.xml, and 1c.xml.]
TOPOLOGY IS COMPLETE! No routing protocol
dependence!
33
Final considerations
Advantages
  • Topology is complete even if hosts are not
    connected to all routers. No dependence on
    routing protocols.
  • Detailed information: more details about the network,
    depending on router-specific requests (e.g., VPN
    logical topology)
  • Highly customizable: can be interfaced to set-up
    operations (LSP setup, guaranteed bandwidth
    allocation request).
  • Flexible and light structure (XML facilities)
Drawbacks
  • Network administration problem: not suitable for
    extended inter-domain scenarios.
  • Detailed network element knowledge (router and node
    addresses) is not always available.
  • Central server: single point of failure
Future extensions and works
  • Distributed approach: integration in large GRID
    inter-domain scenarios.
  • Performance evaluations.

34
Approaches for Grid Computing Fault Tolerance
Layered Grid Architecture
  • Application: end-user applications
  • Middleware
  • Collective: collective resource control
  • Resource: resource management
  • Connectivity: inter-process communication,
    protection
  • Fabric: basic hardware and software
TCP/IP Stack
  • Application
  • Transport
  • Internet/Network
  • Link
Failover Schemes
  • Application-specific fault tolerant schemes
    based on middleware fault detection
  • Tasks and data replicas: Condor-G (checkpointing,
    migration, DAGMan), GT2/GT3 (GridFTP, Reliable File
    Transfer (RFT), Replica Location Service (RLS)),
    Fault Tolerant TCP (FT-TCP)
  • Delegated to WAN, MAN, and LAN resilience schemes
  • Delegated to HW, SW, and farm failover schemes
35
Basic Question
  • Is it efficient to replicate/migrate a process
    when a network infrastructure failure occurs?
  • Replication/migration implies (at least):
  • Saving the status of the process
  • Setting up the connection to transfer the process
    to the replica location
  • Setting up a new connection (with guaranteed
    bandwidth) between the original client and the
    process replica location
  • Restarting the process

36
Integrating Service Migration and GMPLS Path
Restoration
  • Current scenario
  • Application and middleware fault tolerant
    schemes:
  • checkpointing, migration, and replication
  • address both hardware and software failures
  • Fabric (LAN, MAN, and WAN) resilience schemes:
  • Ethernet Rapid Spanning Tree Protocol (RSTP),
    GMPLS/MPLS path restoration, IP dynamic rerouting
  • address network infrastructure failures
  • Grid computing resilience is guaranteed by
    application/middleware and fabric resilience
    schemes independently
  • Objectives
  • Integrate application/middleware and fabric fault
    tolerant schemes to overcome specific network
    infrastructure failures more efficiently while
    guaranteeing the connectivity QoS requirements
    (e.g., minimum guaranteed bandwidth)
  • Move most of the burden of recovering from
    grid network infrastructure failures to the
    fabric, lightening the connectivity, resource,
    collective, and application layer services

37
Integrated Resilience
  • Integrating network layer connection rerouting
    with task/data replication/migration
  • Integrated scheme modeled as a centralized MILP
    problem formulation
  • Objective: maximizing the number of connections
    restored after failure
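The objective can be illustrated on a toy instance. The Python sketch below is not the MILP formulation of the talk: it replaces the solver with an exhaustive search, and it assumes that each failed connection has a short list of candidate recovery options (a rerouted path to the original server or a path to a replica location), each expressed as a tuple of links, with a residual capacity per link. All names (max_restored, demands, options, capacity) are hypothetical.

```python
from itertools import product


def max_restored(demands, options, capacity):
    """Exhaustively pick at most one recovery option per failed connection.

    demands:  {connection: required bandwidth}
    options:  {connection: [path, ...]}, a path being a tuple of links
    capacity: {link: residual capacity}
    Returns (number of restored connections, {connection: chosen path}).
    """
    conns = list(demands)
    choices = [[None] + list(options.get(c, [])) for c in conns]
    best_count, best_choice = 0, {}
    for combo in product(*choices):
        load = {}
        for conn, path in zip(conns, combo):
            if path is None:
                continue  # this connection is left unrecovered
            for link in path:
                load[link] = load.get(link, 0) + demands[conn]
        # Capacity constraint: no link may carry more than its residual capacity.
        if all(load[link] <= capacity.get(link, 0) for link in load):
            count = sum(path is not None for path in combo)
            if count > best_count:
                best_count = count
                best_choice = {c: p for c, p in zip(conns, combo) if p is not None}
    return best_count, best_choice


demands = {"c1": 1, "c2": 1}
options = {
    "c1": [(("A", "B"),)],                  # reroute only
    "c2": [(("A", "B"),), (("A", "C"),)],   # reroute, or migrate to a replica via A-C
}
capacity = {("A", "B"): 1, ("A", "C"): 1}
count, choice = max_restored(demands, options, capacity)
```

Here both connections are restored only if c2 is moved to the replica reachable over link A-C, which is the kind of trade-off the integrated scheme is meant to capture. The search is exponential in the number of connections, so it is suitable only for illustration.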

[Figure: three copies of a sample network with nodes numbered 0 to 5, annotated with the labels A, B, D, G, and H.]
38
Simulation Scenario
  • Replica location utilization pattern
  • Evaluation scenarios
  • Limited link capacity c_{i,j}
  • Limited number of replicas
  • per location, R_l: total number of (s,d) pairs for
    which location l can be utilized for service
    migration
  • per failed connection between an (s,d) pair, R_{s,d}:
    total number of locations allowed for the
    migration of services hosted in d and
    communicating with services hosted in s
  • Limited distance (in hops) of allowed replica
    locations, H
  • Minimum required replication flow ?
  • Physical network
  • 100 randomly generated connection matrices
  • Bidirectional client-server connection generation
  • Bidirectional connection rerouting
  • Expected network blocking probability:
    average ratio between the number of unrecovered
    connections and the number of failed connections
  • Expected path restoration utilization:
    average number of times the original server
    location node is utilized as replica location,
    normalized to the number of replica locations
    utilized
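The two metrics are averages of per-run ratios, which can be sketched in a few lines of Python. The input layout (one pair of counts per simulation run) and the function names are assumptions; runs with a zero denominator are skipped here, a choice the slides do not specify.

```python
def mean_ratio(pairs):
    """Average of numerator/denominator over the runs with a non-zero denominator."""
    ratios = [num / den for num, den in pairs if den > 0]
    return sum(ratios) / len(ratios) if ratios else 0.0


def blocking_probability(runs):
    """runs: [(unrecovered connections, failed connections), ...]"""
    return mean_ratio(runs)


def path_restoration_utilization(runs):
    """runs: [(times the original server node is used, replica locations used), ...]"""
    return mean_ratio(runs)


# Hypothetical counts from two simulation runs.
pb = blocking_probability([(1, 4), (1, 2)])
util = path_restoration_utilization([(3, 4), (1, 2)])
```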

39
Network Topologies
Prism
Pan-European
40
Integrated Resilience Performance
  • Integrated restoration outperforms OSPF dynamic
    rerouting resilience
  • Integrated restoration and migration-only
    resilience show the same performance, but
    utilizing path restoration decreases the need for
    service synchronization and restart

41
Replication Patterns
Replica utilization pattern for recovery of (0,1)
connectivity (100 simulations)
[Figure: four panels showing how often each node is utilized as replica location, with blocking probabilities Pb = 0.19, 0.36, 0.36, and 0.25.]
42
Ongoing and Future Work
  • Topology discovery service
  • Distributed implementation based on traceroute
    and ping
  • Fault tolerance service
  • Experimental evaluation of integrated migration
    and GMPLS path restoration fault tolerance

43
Back-up
44
Service and Network Provider relationship
[Figure: relationship between the service provider and the network provider. Labels: Service Provider, GRID Application, Centralized Service Layer, Grid Network Service Plane, CSE, GUNI, Service signaling, Distributed Service Layer, DSE, UNI, Control Plane, NMI-A, Management Plane, CPE, CCI, NMI-T, Transport Plane, Network Provider.]
45
Distributed Service Element Module
[Figure: DSE module. Labels: GRID Producer, NIMS, CS, NCES, XML Service Request, GUNI, XML-SLA Data Base, DSE, XML Validator, XSLT Service Mapping Database, DSE Signaling Engine, XSLT-based mapping, Service Requester, Service Signaling, UNI, Network Edge Router.]
SOAP is the protocol utilized for transporting
XML files
46
Grid Network Services and Network Services
Interaction