Scalable, Fault-tolerant Management of Grid Services: Application to Messaging Middleware

About This Presentation

Slides: 38

Provided by: gridsUcsI

Learn more at: http://grids.ucs.indiana.edu

Transcript and Presenter's Notes

1
Scalable, Fault-tolerant Management of Grid Services: Application to Messaging Middleware

Harshawardhan Gadgil
Advisor: Prof. Geoffrey Fox

Ph.D. Defense Exam, April 5, 2007
2
Talk Outline

Use Cases and Motivation
Architecture
Handling Consistency and Security Issues
Performance Evaluation
Application: Managing Grid Messaging Middleware
Conclusion
Thesis Contribution and Future Work

3
Grid: Large Number of Distributed Resources

Applications are distributed and composed of a large number and variety (hardware, software) of resources
Components are widely dispersed and disparate in nature and access

4
Sensor Grid
Galip Aydin, Ph.D. Thesis, Jan 2007
5
Example: Audio Video Conferencing

The GlobalMMCS project uses NaradaBrokering as an event delivery substrate
Consider a scenario with one teacher and 10,000 students. One way to serve them is to form a TREE-shaped hierarchy of brokers
One broker can support up to 400 simultaneous video clients and 1500 simultaneous audio clients with acceptable quality. So one would need 10000 / 400 = 25 broker nodes for video.
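
The broker count above is a ceiling division of the number of clients by the per-broker capacity. A minimal sketch in Python, assuming the figure counts only brokers that directly serve clients (interior brokers of the tree are not modelled); `brokers_needed` is a hypothetical helper name:

```python
import math


def brokers_needed(clients: int, clients_per_broker: int) -> int:
    """Smallest number of brokers whose combined capacity covers all clients."""
    return math.ceil(clients / clients_per_broker)


# Capacities quoted on the slide: 400 video or 1500 audio clients per broker.
video_brokers = brokers_needed(10_000, 400)
audio_brokers = brokers_needed(10_000, 1_500)
print(video_brokers, audio_brokers)
```

Video is the binding constraint here, since it needs more brokers than audio for the same audience.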

Scalable Service Oriented Architecture for
Audio/Video Conferencing, Ahmet Uyar, Ph.D.
Thesis, May 2005
6
Definition: Use of the term Resource

Consider a Digital Entity on the network
Specific case where this entity can be controlled by modest external state
This state can be captured via a few messages (typically 1)
The digital entity in turn can bootstrap and manage components that may require more state.
Thus, Digital Entity + Component = Manageable Resource
A resource could thus be hardware or software (services)
Henceforth, we refer to the Service being managed as a Manageable Resource

7
Definition: What is Management?

Resource Management: maintaining a system's ability to provide its specified services with a prescribed QoS
Management operations include
Configuration and lifecycle operations (CREATE, DELETE)
Handling RUNTIME events
Monitoring status and performance
Maintaining system state (according to user-defined criteria)
This thesis addresses
Configuring, deploying and maintaining a valid runtime configuration
Crucial to the successful working of applications
Static (configure and bootstrap) and dynamic (monitoring / event handling)

From WS Distributed Management
http://devresource.hp.com/drc/slide_presentations/wsdm/index.jsp
8
Existing Systems

Distributed Monitoring frameworks
NWS, Ganglia, MonALISA
Primarily serve to gather metrics (which is one
aspect of resource management, as we defined)
Management Frameworks
SNMP: primarily for hardware (hubs, routers)
CMIP: improved security and logging over SNMP
JMX: managing and monitoring for Java applications
WBEM: system management to unify management of distributed computing environments
Management systems are not interoperable, hence the move to Web Services based management of resources
XML based interactions that facilitate
implementation in different languages, running on
different platforms and over multiple transports
Competing Specifications (WS Management and WS
Distributed Management)

9
Motivation: Issues in Management

Resources must meet
General QoS and life-cycle features
(User-defined) application-specific criteria
Improper management, such as wrong configuration, is a major cause of service downtime
Large number of widely dispersed resources
Decreasing hardware cost => easier to replicate for fault-tolerance (esp. software replication)
Presence of firewalls may restrict direct access to resources
Resource-specific management systems have evolved independently (different platform / language / protocol)
Requires use of proprietary technologies
Central management system
Scalability and single point of failure

10
Desired Features of the Management Framework

Fault Tolerance
Failures are normal: resources may fail, but so may components of the management framework.
Framework MUST recover from failure
Scalability
With growing application complexity, the number of resources (application components) increases
E.g. the LHC Grid consists of a large number of CPUs, disks and mass storage servers (on the order of 30K)
In future, much larger systems will be built
MUST cope with a large number of resources in terms of the additional components required

11
Desired Features of the Management Framework

Performance
Initialization Cost, Recovery from failure,
Responding to run-time events
Interoperability
Services exist on different platforms, are written in different languages, and are managed using system-specific protocols, and hence are not INTEROPERABLE
Framework must implement interoperable protocols, such as those based on Web Service standards
Generality
Management framework must be a generic framework.
Should be applicable to any type of resource
(hardware / software)
Usability
Autonomous operation (as much as possible)

12
Architecture

We assume resource specific external state to be
maintained by a Registry (assumed scalable,
fault-tolerant by known techniques)
We leverage well-known strategies for providing
Fault-tolerance (E.g. Replication, periodic
check-pointing, request-retry)
Fault-detection (E.g. Service heartbeats)
Scalability (E.g. hierarchical organization)
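
As an illustration of heartbeat-based fault detection, the sketch below flags a service as suspect when no heartbeat has arrived within a timeout. The class name, the timeout policy and the use of caller-supplied timestamps are assumptions made for this example, not details of the framework.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat per service and reports overdue ones."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: dict[str, float] = {}

    def beat(self, service_id: str, now: float) -> None:
        # Record the time at which the service last reported in.
        self.last_seen[service_id] = now

    def suspected_failed(self, now: float) -> list[str]:
        # A service is suspect once its last heartbeat is older than the timeout.
        return sorted(
            sid for sid, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


monitor = HeartbeatMonitor(timeout_s=10.0)
monitor.beat("broker-1", now=0.0)
monitor.beat("broker-2", now=8.0)
print(monitor.suspected_failed(now=12.0))
```

Passing `now` explicitly keeps the sketch deterministic; a real monitor would read a clock.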

13
Management Architecture built in terms of

Hierarchical Bootstrap System
Resources in different domains can be managed
with separate policies for each domain
Periodically spawns a System Health Check that
ensures components are up and running
Registry for metadata (distributed database)
Made robust by standard database techniques, and by our system itself for the Service Interfaces
Stores resource specific information
(User-defined configuration / policies, external
state required to properly manage a resource)
Generates a unique ID per instance of registered
component
Our present implementation is a simple registry
service

14
Management Architecture built in terms of

Messaging Nodes form a scalable messaging
substrate
Provides transport protocol independent messaging
between components
Can provide Secure delivery of messages
In our case, we use NaradaBrokering Broker as a
messaging node
Managers: active stateless agents that manage resources.
Since they don't maintain state, they are robust
Actual management functions are performed by a resource-specific manager component
Resources: what you are managing
Wrapped by a Service Adapter which provides a Web
Service interface.
Service Adapter connects to messaging node to
leverage transport independent publish subscribe
communication with other components

15
Architecture: Scalability: Hierarchical distribution
[Figure: bootstrap hierarchy with ROOT at the top, US and EUROPE below it, and CGL, CARDIFF and FSU as leaf domains, e.g. /ROOT/EUROPE/CARDIFF]
Each node spawns its children if not present and ensures they are up and running
Passive Bootstrap Nodes: only ensure that all child bootstrap nodes are always up and running
Active Bootstrap Nodes: always the leaf nodes in the hierarchy; responsible for maintaining a working set of management components in the domain
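
The hierarchy can be sketched as a tree in which every parent re-spawns missing children and only the leaves act as active bootstrap nodes. This is a hedged illustration: the `BootstrapNode` class, the `alive` flag standing in for a real process check, and the placement of CGL and FSU under US are assumptions; only the path /ROOT/EUROPE/CARDIFF comes from the slide.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BootstrapNode:
    name: str
    children: list[BootstrapNode] = field(default_factory=list)
    alive: bool = True  # stand-in for "process is up and running"

    @property
    def is_active(self) -> bool:
        # Active bootstrap nodes are always the leaves of the hierarchy.
        return not self.children


def health_check(node: BootstrapNode, path: str = "") -> list[str]:
    """Walk the tree; each parent re-spawns any child that is down."""
    here = f"{path}/{node.name}"
    respawned: list[str] = []
    for child in node.children:
        if not child.alive:
            child.alive = True  # placeholder for spawning a new process
            respawned.append(f"{here}/{child.name}")
        respawned.extend(health_check(child, here))
    return respawned


cardiff = BootstrapNode("CARDIFF", alive=False)
root = BootstrapNode("ROOT", [
    BootstrapNode("US", [BootstrapNode("CGL"), BootstrapNode("FSU")]),
    BootstrapNode("EUROPE", [cardiff]),
])
print(health_check(root))
```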
16
Architecture: Conceptual Idea (Internals)
WS Management
Periodically spawn
Manager processes periodically check for available resources to manage, and read/write resource-specific external state from/to the registry
Connect to Messaging Node for sending and receiving messages
User writes system configuration to registry
Publish-subscribe based communication via Messaging Node
17
Architecture: User Component

Resource characteristics are determined by the user.
Events generated by the resources are handled by the manager
Event processing is determined via WS-Policy constructs
E.g., automatically instantiate a failed resource instance
<pol:Policy xmlns:pol="http://schemas.xmlsoap.org/ws/2004/09/policy"
    xmlns:pol1="http://www.hpsearch.org/schemas/2006/07/policy">
  <pol:All>
    <pol1:AUTOInstantiate forkProcessLocator="udp://156.56.104.152:65535"/>
  </pol:All>
</pol:Policy>
Managers can set up services
Writing information to the registry can be used to start up a set of services

18
Issues in the Distributed System: Consistency

Examples of inconsistent behavior
Two or more managers managing the same resource
Old messages / requests arriving after new requests
Multiple copies of a resource existing at the same time / orphan resources, leading to inconsistent system state
Use a registry-generated, monotonically increasing Unique Instance ID (IID) to distinguish between new and old instances
Requests from manager A are considered obsolete IF IID(A) < IID(B)
Service Adapter stores the last known MessageID (IID:seqNo), allowing it to differentiate between duplicates AND obsolete messages
Service Adapter periodically renews with the registry
IF IID(ResourceInstance_1) < IID(ResourceInstance_2)
THEN ResourceInstance_1 is OBSOLETE
SO ResourceInstance_1 silently shuts down
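
The IID rule can be sketched as a small filter inside the service adapter. This is an illustrative reading only: it assumes a message ID is a pair of the sender's IID and a per-sender sequence number that increases with each request, and the `ServiceAdapter` class and its return labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ServiceAdapter:
    last_iid: int = -1  # IID of the newest manager seen so far
    last_seq: int = -1  # last sequence number accepted from that manager

    def accept(self, iid: int, seq: int) -> str:
        """Classify an incoming request by its (IID, seqNo) message ID."""
        if iid < self.last_iid:
            return "obsolete"   # sent by an older manager instance
        if iid == self.last_iid and seq == self.last_seq:
            return "duplicate"  # same message delivered again
        if iid == self.last_iid and seq < self.last_seq:
            return "obsolete"   # old request arriving after a newer one
        self.last_iid, self.last_seq = iid, seq
        return "accepted"


adapter = ServiceAdapter()
for message_id in [(1, 0), (1, 0), (1, 1), (1, 0), (2, 0), (1, 5)]:
    print(message_id, adapter.accept(*message_id))
```

Storing only the last accepted pair is enough because both components increase monotonically under the stated assumptions.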

19
Issues in the Distributed System: Security

NaradaBrokering's Topic Creation and Discovery and Security Scheme addresses
Message-level security
Provenance, Lifetime, Unique Topics
Secure discovery of endpoints
Prevent unauthorized access to services
Prevent malicious users from modifying messages
Thus message interactions are secure when passing through insecure intermediaries

NB-Topic Creation and Discovery - Grid2005 / IJHPCN: http://grids.ucs.indiana.edu/ptliupages/publications/NB-TopicDiscovery-IJHPCN.pdf
NB-Security (Grid2006): http://grids.ucs.indiana.edu/ptliupages/publications/NB-SecurityGrid06.pdf
20
Implemented

Management framework
Management of NaradaBrokering Brokers
Released with NaradaBrokering in Feb 2007
WS Specifications
WS Management (could use WS-DM), June 2005 parts (WS Transfer, Sep 2004; WS Enumeration, Sep 2004), and WS Policy, Sep 2004; SOAP v 1.2 (needed for WS-Management)
WS Eventing (Leveraged from the WS Eventing
support in NaradaBrokering)
Used XmlBeans 2.0.0 for manipulating XML in
custom container

21
Performance Evaluation: Measurement Model - Test Setup

Multithreaded manager process - Spawns a Resource
specific management thread (A single manager can
manage multiple different types of resources)
Limit on maximum resources that can be managed
Limited by maximum threads per JVM possible
(memory constraints)
Theoretically 1800 resources (refer Thesis
writeup)
Practical Limit on maximum requests that can be
handled
Performance Model in Thesis

22
Performance Evaluation: Results
23
Performance Evaluation: Results

Scenario illustrating a case with multiple concurrent events
Response time increases with an increasing number of concurrent requests
Response time is RESOURCE-DEPENDENT and the times shown are illustrative
Increases rapidly once the no. of requests > 210
Processing MAY involve a dependency on external services, such as Registry access, which will increase overall response time but can allow more than 210 concurrent requests to be processed

24
Performance Evaluation: Comparing increasing managers on same machine w.r.t. different machines
25
Performance Evaluation: Research Question: How much infrastructure is required to manage N resources?

N = number of resources to manage
M = max. no. of resources that connect to a single messaging node
D = maximum concurrent requests that can be processed by a single manager process before saturating
For analysis, we set this as the number of resources assigned per manager
R = min. no. of registry service components required to provide the desired level of fault-tolerance
Assume every leaf domain has 1 messaging node. Hence we have N/M leaf domains
Further, the no. of managers required per leaf domain is M/D
Other passive bootstrap nodes are not counted here since their number is << N
Total components in lowest level
= (R registry + 1 bootstrap service + 1 messaging node + M/D managers) * (N/M such leaf domains)
= (2 + R + M/D) * (N/M)

26
Performance Evaluation: Research Question: How much infrastructure is required to manage N resources?

Thus the percentage of additional infrastructure is
= [(2 + R + M/D) * (N/M)] * 100 / N
= [(2 + R)/M + 1/D] * 100 %
A few cases
If D = 200, M = 800 and R = 4, then additional infrastructure
= [(2 + 4)/800 + 1/200] * 100 % = 1.25 %
Shared registry: if there is one registry interface per domain, R = 1, then additional infrastructure
= [(2 + 1)/800 + 1/200] * 100 % = 0.875 %
If NO messaging node is used (assume D = 200), then additional infrastructure
= [(R registry + 1 bootstrap node + N/D managers) / N] * 100 %
= [(1 + R)/N + 1/D] * 100 %
≈ 100/D % (for N >> R)
= 0.5 %

No. of resources (N), no. of resources assigned to a manager (D), registry service instances (R), max. entities connected to a messaging node (M)
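
The overhead formulas are easy to evaluate for other parameter choices. A minimal sketch, using the symbols N, M, D and R as defined on these slides; the function names are hypothetical and fractional manager counts (M/D, N/D) are left unrounded, as in the derivation:

```python
def overhead_with_messaging(R: int, M: int, D: int) -> float:
    """Additional infrastructure in percent: ((2 + R)/M + 1/D) * 100."""
    return ((2 + R) / M + 1 / D) * 100


def overhead_without_messaging(R: int, N: int, D: int) -> float:
    """Additional infrastructure in percent when no messaging node is used:
    ((1 + R)/N + 1/D) * 100, which tends to 100/D for N >> R."""
    return ((1 + R) / N + 1 / D) * 100


# Parameter sets taken from the cases on the slide.
print(overhead_with_messaging(R=4, M=800, D=200))
print(overhead_with_messaging(R=1, M=800, D=200))
print(overhead_without_messaging(R=4, N=1_000_000, D=200))
```

Because N cancels in the first formula, the overhead with messaging nodes is independent of the total number of resources.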
27
Performance Evaluation: Research Question: How much infrastructure is required to manage N resources?
28
Prototype: Managing Grid Messaging Middleware

We illustrate the architecture by managing the distributed messaging middleware NaradaBrokering
This example is motivated by the presence of a large number of dynamic peers (brokers) that need configuration and deployment in specific topologies
Runtime metrics provide dynamic hints on improving routing, which leads to redeployment of the messaging system, (possibly) using a different configuration and topology, or to the use of (dynamically) optimized protocols (UDP vs. TCP vs. Parallel TCP) and going through firewalls
Broker Service Adapter
NB illustrates an electronic entity that didn't start off with an administrative service interface
So we add a wrapper over the basic NB BrokerNode object that provides a WS Management front-end
Allows CREATION, CONFIGURATION and MODIFICATION of broker configuration and broker topologies

29
Performance Evaluation: XML Processing Overhead

XML processing overhead is measured as the total marshalling and un-marshalling time required, including validation against schema
For Broker Management interactions, typical processing time (including validation against schema) is about 5 ms
Broker Management operations are invoked only during initialization and recovery from failure
Reading broker state using a GET operation involves 5 ms overhead and is invoked periodically (e.g. every 1 minute, depending on policy)
Further, for most operations that change broker state, the actual operation processing time is >> 5 ms, and hence the XML overhead of 5 ms is acceptable.

30
Prototype: Observed Operation Costs (Individual Resources - Brokers)
Time in msec (average values)
Operation            Un-Initialized (first time)   Initialized (later modifications)
Set Configuration    778 ± 5                       33 ± 3
Create Broker        610 ± 6                       57 ± 2
Create Link          160 ± 2                       27 ± 2
Delete Link          104 ± 2                       20 ± 1
Delete Broker        142 ± 1                       129 ± 2
31
Recovery: Estimated Recovery Cost per Broker
Recovery Time = T(Read State From Registry) + T(Bring Resource up to speed)
              = T(Read State) + T[SetConfig + Create Broker + CreateLink(s)]
Topology: Ring, N nodes, N links (1 outgoing link per node)
Resource-specific configuration entries: 2 resource objects per node
Recovery time: 10 + (778 + 610 + 160) = 1558 msec
Topology: Cluster, N nodes, links per broker vary from 0 to 3
Resource-specific configuration entries: 1 to 4 resource objects per node
Recovery time: Min: 5 + 778 + 610 = 1393 msec; Max: 20 + 778 + 610 + 160 + 2 * 27 = 1622 msec
Assuming 5 ms read time from registry per resource object
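
The estimates above follow one pattern, sketched here under assumptions inferred from the numbers rather than stated on the slide: each broker has one resource object for itself plus one per outgoing link, the first link costs the un-initialized Create Link time (160 ms) and each further link the initialized time (27 ms). The function name and constant names are hypothetical.

```python
READ_MS = 5  # assumed registry read time per resource object

# Observed average operation costs in msec, from the operation cost table.
SET_CONFIG_MS = 778
CREATE_BROKER_MS = 610
FIRST_LINK_MS = 160
EXTRA_LINK_MS = 27


def estimated_recovery_ms(links: int) -> int:
    """Estimated time to recover one broker with the given number of links."""
    resource_objects = 1 + links  # the broker itself plus one object per link
    total = resource_objects * READ_MS + SET_CONFIG_MS + CREATE_BROKER_MS
    if links >= 1:
        total += FIRST_LINK_MS + (links - 1) * EXTRA_LINK_MS
    return total


for links in (0, 1, 3):
    print(links, estimated_recovery_ms(links))
```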
32
Prototype: Observed Recovery Cost per Broker
Operation                      Average (msec)
Spawn Process                  2362 ± 18
Read State                     8 ± 1
Restore (1 Broker + 1 Link)    1421 ± 9
Restore (1 Broker + 3 Links)   1616 ± 82
Time for Create Broker depends on the number and type of transports opened by the broker. E.g. SSL transport requires negotiation of keys and would require more time than simply establishing a TCP connection.
If brokers connect to other brokers, the destination broker MUST be ready to accept connections, else topology recovery takes more time.
33
Contributions

Designed and implemented a Resource Management
Framework
Scalable to manage large number of resources
Tolerant to failures in framework itself
Can handle failures in managed resources via user
defined policies
We have shown that Management framework can be
built on top of a publish subscribe framework to
provide transport independent messaging between
framework components
Implemented Web Service Management to manage
resources
Detailed evaluation of the system components to
show that the proposed architecture has
acceptable costs
Implemented a prototype to illustrate management of a distributed messaging middleware system: NaradaBrokering

34
Future Work

Apply the framework to broader domains
Investigate application of architecture to
resources with significant runtime state that
needs to be maintained
Higher frequency and size of messages
XML processing overhead becomes significant
Investigate strategies to distribute framework
components (load balance) considering factors
such as locality of resources and runtime metrics

35
Publications

On the presented work
Scalable, Fault-tolerant Management in a Service Oriented Architecture
Harshawardhan Gadgil, Geoffrey Fox, Shrideep Pallickara, Marlon Pierce. Accepted as poster in HPDC 2007
Managing Grid Messaging Middleware
Harshawardhan Gadgil, Geoffrey Fox, Shrideep Pallickara, Marlon Pierce. In Proceedings of Challenges of Large Applications in Distributed Environments (CLADE 2006), pp. 83-91, June 19, 2006, Paris, France
A Retrospective on the Development of Web Service Specifications
Shrideep Pallickara, Geoffrey Fox, Mehmet Aktas, Harshawardhan Gadgil, Beytullah Yildiz, Sangyoon Oh, Sima Patel, Marlon Pierce and Damodar Yemme. Chapter in the book Securing Web Services: Practical Usage of Standards and Specifications
On NaradaBrokering
On the Secure Creation, Organization and Discovery of Topics in Distributed Publish/Subscribe Systems
Shrideep Pallickara, Geoffrey Fox, Harshawardhan Gadgil. (To appear) International Journal of High Performance Computing and Networking (IJHPCN), 2006. Special issue of extended versions of the 6 best papers at the ACM/IEEE Grid 2005 Workshop
On the Discovery of Brokers in Distributed Messaging Infrastructure
Shrideep Pallickara, Harshawardhan Gadgil, Geoffrey Fox. In Proceedings of the IEEE Cluster 2005 Conference, Boston, MA

36
Publications (contd.)

On NaradaBrokering (contd.)
On the Discovery of Topics in Distributed Publish/Subscribe Systems
Shrideep Pallickara, Geoffrey Fox, Harshawardhan Gadgil. In Proceedings of the 6th IEEE/ACM International Workshop on Grid Computing (Grid 2005), pp. 25-32, Seattle, WA (selected as one of six Best Papers)
A Framework for Secure End-to-End Delivery of Messages in Publish/Subscribe Systems
Shrideep Pallickara, Marlon Pierce, Harshawardhan Gadgil, Geoffrey Fox, Yan Yan, Yi Huang. (To appear) In Proceedings of the 7th IEEE/ACM International Conference on Grid Computing (Grid 2006), Barcelona, September 28-29, 2006
HPSearch, GIS and Misc.
A Scripting based Architecture for Management of Streams and Services in Real-time Grid Applications
Harshawardhan Gadgil, Geoffrey Fox, Shrideep Pallickara, Marlon Pierce, Robert Granat. In Proceedings of the IEEE/ACM Cluster Computing and Grid 2005 Conference (CCGrid 2005), Vol. 2, pp. 710-717, Cardiff, UK
Building Messaging Substrates for Web and Grid Applications
Geoffrey Fox, Shrideep Pallickara, Marlon Pierce, Harshawardhan Gadgil. In special issue on Scientific Applications of Grid Computing, Philosophical Transactions of the Royal Society, London, Volume 363, Number 1833, pp. 1757-1773, August 2005

37
Publications (contd.)

HPSearch, GIS and Misc. (contd.)
Building and Applying Geographical Information System Grids
Galip Aydin, Ahmet Sayar, Harshawardhan Gadgil, Mehmet S. Aktas, Geoffrey C. Fox, Sunghoon Ko, Hasan Bulut, and Marlon E. Pierce. 12 January 2006, Special Issue on Geographical Information Systems and Grids based on GGF15 workshop, Concurrency and Computation: Practice and Experience
SERVOGrid Complexity Computational Environments (CCE) Integrated Performance Analysis
Galip Aydin, Mehmet S. Aktas, Geoffrey C. Fox, Harshawardhan Gadgil, Marlon Pierce, Ahmet Sayar. In Proceedings of the 6th IEEE/ACM International Workshop on Grid Computing (Grid 2005), 13-14 Nov. 2005, pp. 256-261