Title: The DataGrid project and its Monitoring Services Workpackage
1 The DataGrid project and its Monitoring Services Workpackage
- Peter Kacsuk
- Laboratory of Parallel and Distributed Systems
- MTA SZTAKI Research Institute
- kacsuk_at_sztaki.hu
- www.lpds.sztaki.hu
2 Contents
- Overview of the DataGrid project
- Goals and tasks of WP3
- Current work
3 Project Overview
- EU contract signed (29 Dec 2000); project started (1 Jan 2001)!
- 9.8 MEuro over 3 years, 21 partners (6 principal) -> BIG
- Large unfunded effort -> total project size ? 30 MEuro
- Collaboration with Earth & Bio Sciences
- Industrial participation (3 partners)
- Project Management Board (strategic resource)
- Project Technical Board (WP managers, applications)
- Architecture Task Force: small (incl. Foster & Kesselman)
- CERN Project Office (PM, Architect, Secretariat)
- Requirements, evaluations, prototypes: started Summer 2000!
- Presented at DANTE/Geant, Terena, ECFA, HEP-CCC, IST2000, GridForum-5, ...
4 Workpackages
5 Workpackage Relationships
- Applications: HEP Apps (WP8), EO Apps (WP9), Bio Apps (WP10)
- Data Grid Services: Workload Management (WP1), Data Management (WP2), Monitoring Services (WP3)
- Core Middleware: Globus Middleware
- Physical Fabric: Fabric Management (WP4), Networking (WP7), Mass Storage Management (WP5)
6 WP3 GRID Monitoring Services
- Goals
  - to specify, develop, integrate and test tools and infrastructure to enable end-user and administrator access to status and error information in a Grid environment
  - to permit both job performance optimisation and problem tracing, crucial to facilitating high performance Grid computing
- Issues
  - Unified Information Architecture
  - Schema design for monitoring info
  - Directory services (GRIS/GIIS, LDAP, RDBMS, ...)
  - Performance analysis, visualisation, ...
7 Tasks of WP3
- Requirements & Design
- Current Technology
- Infrastructure
- Analysis & Presentation
- Demonstration
8 Tasks of WP3
- Requirements & Design
  - A full requirements analysis will be performed to evaluate the needs of all classes of end-users.
  - Interfaces to other sub-systems will be defined and needs for instrumentation of components will be identified.
  - An architectural specification of the components (and their relationships) necessary to meet the WP objectives will be established.
  - Boundary conditions and interfaces with other Grid components will be specified and, where appropriate, APIs will be defined.
  - Standards for message formats will be set up.
- Current Technology
- Infrastructure
- Analysis & Presentation
- Demonstration
9 Tasks of WP3
- Requirements & Design
- Current Technology
  - Evaluation of existing distributed computing monitoring technologies to understand potential uses and limitations.
  - Tests will be made in the first demonstration environments of the project to gain experience with tools currently available.
  - Issues studied will include functionality, scalability, robustness, resource usage, etc.
- Infrastructure
- Analysis & Presentation
- Demonstration
10 Tasks of WP3
- Requirements & Design
- Current Technology
- Infrastructure
  - Software libraries supporting instrumentation APIs will be developed and gateway or interface mechanisms established to computing fabrics, networks and mass storage.
  - Where appropriate, local monitoring tools will be developed to provide the contact point for status information and to be a routing channel for errors.
  - Directory services will be exploited to enable location and/or access to information.
  - Methods for short and long term storage of monitoring information will be developed to enable both archiving and near real-time analysis functions.
- Analysis & Presentation
- Demonstration
11 Tasks of WP3
- Requirements & Design
- Current Technology
- Infrastructure
- Analysis & Presentation
  - Development of software for analysis of monitoring data and tools for presentation of results.
  - Monitoring and evaluating job parallelism.
  - Techniques for analysing the multivariate data must be developed.
  - Effective means of visual presentation will be established.
- Demonstration
12 Tasks of WP3
- Requirements & Design
- Current Technology
- Infrastructure
- Analysis & Presentation
- Demonstration
  - Integration and test of tools and infrastructure developed in this workpackage.
  - Instrumentation and monitoring services will be tested in demonstration grid environments developed elsewhere in the project.
  - Evaluations will be performed in terms of scalability, etc.
13 Current Work
- A relational approach to the Information and Monitoring System following the principles of the Grid Monitoring Architecture of the Global Grid Forum
  - prototype of the relational GMA, implemented in Java
- Research in the area of information services to define the tasks of a grid information service and determine the disadvantages of existing information service solutions
- Further development of the cluster monitoring / performance tools GRM/PROVE so that they will be usable in the grid environment
14 WP3 Architecture
15 Relational GMA
- Information stored in relational database tables
  - currently SQL servlet via JDBC to MySQL
- Query language: SQL
- Communication between Consumer and Producer: XML over http(s)
- Meta-directory: OpenLDAP, extended with Persistent Search (subscription)
- Java servlets: Producer, Consumer, Registry (meta-directory) and Schema (table description)
- First prototype
- Integration into DataGrid from September
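
The relational idea on this slide can be illustrated with a minimal sketch. It is not the prototype itself: it is written in Python rather than Java, it uses an in-memory SQLite database where the slide names MySQL via JDBC, it uses direct method calls where the slide names XML over http(s), and a plain dictionary where the slide names OpenLDAP. The class names `Registry`, `Producer` and `Consumer` follow the slide, but their methods, the table `cpu_load` and the host names are assumptions made for illustration.

```python
import sqlite3


class Registry:
    """Hypothetical stand-in for the meta-directory: table name -> producers."""

    def __init__(self):
        self._producers = {}

    def register(self, table, producer):
        self._producers.setdefault(table, []).append(producer)

    def lookup(self, table):
        return list(self._producers.get(table, []))


class Producer:
    """Publishes monitoring tuples into a relational table it owns."""

    def __init__(self, registry, table, columns_sql):
        self.table = table
        # Assumption: in-memory SQLite replaces the MySQL back end.
        self._db = sqlite3.connect(":memory:")
        self._db.execute(f"CREATE TABLE {table} ({columns_sql})")
        registry.register(table, self)

    def publish(self, row):
        placeholders = ", ".join("?" for _ in row)
        self._db.execute(
            f"INSERT INTO {self.table} VALUES ({placeholders})", row
        )

    def answer(self, sql, params=()):
        """Run a consumer's SQL query against this producer's tuples."""
        return self._db.execute(sql, params).fetchall()


class Consumer:
    """Finds producers through the registry and queries each with SQL."""

    def __init__(self, registry):
        self._registry = registry

    def query(self, table, sql, params=()):
        rows = []
        for producer in self._registry.lookup(table):
            rows.extend(producer.answer(sql, params))
        return rows


registry = Registry()
site_a = Producer(registry, "cpu_load", "host TEXT, load_avg REAL")
site_b = Producer(registry, "cpu_load", "host TEXT, load_avg REAL")
site_a.publish(("node1.site-a", 0.42))
site_a.publish(("node2.site-a", 1.75))
site_b.publish(("node1.site-b", 2.10))

consumer = Consumer(registry)
busy = consumer.query(
    "cpu_load",
    "SELECT host, load_avg FROM cpu_load WHERE load_avg > ? ORDER BY host",
    (1.0,),
)
print(busy)
```

The consumer never needs to know where the tuples live: it asks the registry which producers serve a table and sends the same SQL query to each of them.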
16 Relational GMA
"Time, Information Services and the Grid", B. Coghlan, A. Djaoui, S. Fisher, J. Magowan, A. Martin, BNCOD 2001
17 GRM monitor for grid
- GRM has been a semi-on-line, application-level monitor for P-GRADE applications in cluster environments
- Goal is to use GRM in a grid environment
- Hardest problem: start-up of a monitor in the grid
- GRM's structure is being revisited and modified for the requirements of the grid
  - monitor start-up changed significantly
  - new instrumentation API independent from P-GRADE
  - trace collection scheme is revisited (considering wide-area networks, scalability and latency issues)
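
What an application-level instrumentation API might look like can be sketched as follows. This is not GRM's API, which the slides do not show: the names `TraceBuffer`, `Instrumentation`, `event` and `block` are hypothetical, and a Python list stands in for whatever buffer the real monitor reads. The clock is injectable so that the example is deterministic.

```python
import itertools
import time
from contextlib import contextmanager


class TraceBuffer:
    """Hypothetical buffer that a monitor process would drain periodically."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)

    def drain(self):
        """Hand over all buffered events and leave the buffer empty."""
        events, self._events = self._events, []
        return events


class Instrumentation:
    """Hypothetical application-side API: emits timestamped trace events."""

    def __init__(self, process_id, buffer, clock=time.time):
        self._pid = process_id
        self._buffer = buffer
        self._clock = clock

    def event(self, name, **data):
        self._buffer.append(
            {"time": self._clock(), "process": self._pid,
             "event": name, "data": data}
        )

    @contextmanager
    def block(self, name):
        """Mark the begin and end of a code region in the trace."""
        self.event(name + ".begin")
        try:
            yield
        finally:
            self.event(name + ".end")


# Usage with a fake clock (1, 2, 3, ...) instead of wall-clock time.
ticks = itertools.count(1)
buffer = TraceBuffer()
probe = Instrumentation("proc-0", buffer, clock=lambda: next(ticks))

with probe.block("solve"):
    probe.event("iteration", step=1)

trace = buffer.drain()
for entry in trace:
    print(entry["time"], entry["event"])
```

The application only calls `event` and `block`, and has no dependency on a particular development environment, which is the kind of independence the slide asks for.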
18 GRM Architecture in the Grid
[Figure: GRM architecture in the grid]
- Local Host: Main Monitor (MM)
- Site 1, Site 2: one Site Monitor (SM) per site
- Hosts (Host 1, Host 2, Host 1): one Local Monitor (LM) per host
- Each Local Monitor is attached through shared memory (shm) to the Application Processes on its host
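
The three monitor levels of this slide can be sketched as a simple collection hierarchy. This is an illustration under assumptions, not GRM's implementation. The `collect` methods and host names are hypothetical, direct calls replace network transfer, and a list replaces the shared-memory buffer. Merging by timestamp assumes that host clocks are comparable, which the slides do not address.

```python
import heapq


class LocalMonitor:
    """Buffers trace events produced by application processes on one host."""

    def __init__(self, host):
        self.host = host
        self._buffer = []  # assumption: stands in for the shm segment

    def record(self, timestamp, process, event):
        self._buffer.append((timestamp, self.host, process, event))

    def collect(self):
        """Return buffered events in time order and empty the buffer."""
        events, self._buffer = sorted(self._buffer), []
        return events


class SiteMonitor:
    """Merges the traces of all Local Monitors at one site."""

    def __init__(self, site, local_monitors):
        self.site = site
        self.local_monitors = local_monitors

    def collect(self):
        streams = [lm.collect() for lm in self.local_monitors]
        return list(heapq.merge(*streams))


class MainMonitor:
    """Runs on the user's local host and merges the per-site traces."""

    def __init__(self, site_monitors):
        self.site_monitors = site_monitors

    def collect(self):
        streams = [sm.collect() for sm in self.site_monitors]
        return list(heapq.merge(*streams))


# Hypothetical layout: two hosts at one site, one host at another.
lm_a = LocalMonitor("site1-host1")
lm_b = LocalMonitor("site1-host2")
lm_c = LocalMonitor("site2-host1")
lm_a.record(3, "p0", "send")
lm_a.record(1, "p0", "start")
lm_b.record(2, "p1", "start")
lm_c.record(4, "p2", "recv")

mm = MainMonitor([
    SiteMonitor("Site 1", [lm_a, lm_b]),
    SiteMonitor("Site 2", [lm_c]),
])
trace = mm.collect()
for entry in trace:
    print(entry)
```

Because each Site Monitor forwards one merged stream, the Main Monitor holds one connection per site rather than one per host, which is one way to address the wide-area scalability concern raised on the previous slide.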
19 ?
Thank you