Title: Report on the INFN-GRID Globus evaluation
1. Report on the INFN-GRID Globus evaluation
- Massimo Sgaravatto
- INFN Padova
- for the INFN Globus group
- globus_at_infn.it
- http://www.infn.it/globus
2. Why Globus?
- Some basic services (security, information services, resource management, ...) must be deployed in order to implement and use a Grid for real applications
- Globus identified as a possible Grid framework providing these services
- Need to assess the Globus packages (effectiveness, completeness, robustness, ease of use, ...)
- → WP "Installation and Evaluation of the Globus Toolkit" of the INFN-GRID Project
- Goal: evaluation of the Globus toolkit
  - Which services can be useful?
  - What is necessary to integrate/modify?
  - What is missing?
3. Globus activities within INFN
- Activities driven by the following work plan
  - Evaluation of Globus security services
  - Evaluation of the Grid Information Service
  - Evaluation of Globus services for resource management
  - Evaluation of Globus tools for data management
  - Evaluation of Globus HBM for fault monitoring
  - Evaluation of Globus GEM for execution environment management
  - Globus deployment and installation tools
- Not only a simple evaluation
  - Some existing shortcomings addressed
  - Specific configurations and customizations implemented
- INFN-GRID Globus evaluation activities performed between June 2000 and January 2001
- Official Globus 1.1.3 release tested (1.1.4 for MPICH-G2)
4. Globus security services
- The Globus GSI security model seems to satisfy the INFN community's current security requirements
  - One-time login mechanism
  - Use of X.509 certificates
  - Possibility of extending relations of trust to multiple CAs without having to interfere with their X.500 naming scheme
- Some shortcomings
  - Need for limited (by scope or purpose) proxies
  - Memory leaks in the GAA library
  - Cryptic diagnostics
  - Interface between GSI and AFS
    - Hopefully addressed with gsiklog
  - No tools for group management
    - Hopefully addressed with CAS
5. INFN customizations on security
- INFN-CA
- CRL distribution
- Centralized management of the grid-mapfile
  - Goal: ease the sharing of the same access policies (represented by the grid-mapfiles) among groups of hosts with common purposes
  - Proposed system
    - Central repository (LDAP server) to store user certificates (subjects) and to define groups of users
    - Certificates published by the CA manager
    - Group manager responsible for editing group memberships (using an LDAP client)
    - Resource owners (Globus administrators) periodically (e.g. via a cron job) connect to this repository, download the subjects of the certificates that meet a specified criterion (e.g. all users of group X), and produce grid-mapfile entries
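The last step of the proposed system (turning downloaded certificate subjects into grid-mapfile entries) can be sketched as follows. This is a minimal sketch under stated assumptions, not the INFN implementation: the LDAP query is replaced by an in-memory dictionary, and the function name, the per-group local account mapping, and the sample subjects are all hypothetical. The only format assumed is the usual grid-mapfile line, a quoted certificate subject followed by a local account name.

```python
from typing import Dict, Iterable, List


def grid_mapfile_entries(
    group_members: Dict[str, List[str]],
    account_for_group: Dict[str, str],
    groups: Iterable[str],
) -> List[str]:
    """Return grid-mapfile lines for the members of the selected groups.

    group_members stands in for the result of the LDAP query (hypothetical);
    account_for_group maps each group to a local account (an assumption).
    """
    entries: List[str] = []
    seen = set()
    for group in groups:
        account = account_for_group.get(group)
        if account is None:
            continue  # no local account configured for this group
        for subject in group_members.get(group, []):
            if subject in seen:
                continue  # keep only the first mapping for each subject
            seen.add(subject)
            entries.append(f'"{subject}" {account}')
    return entries


if __name__ == "__main__":
    # Hypothetical data; a real deployment would read it from the repository.
    members = {"groupX": ["/C=IT/O=INFN/CN=Alice Example"]}
    for line in grid_mapfile_entries(members, {"groupX": "gridx"}, ["groupX"]):
        print(line)
```

A cron job on each resource could run such a script and write the result to the local grid-mapfile, so that hosts selecting the same groups end up with the same access policy.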
7. Globus Information Services
- INFN implemented a hierarchical structure of GIS based on geographical entities
  - Site GIISs
  - Local GRISs registered at the site GIIS
  - Root GIIS where local GIISs are registered
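8. INFN GIS Topology
[Figure: tree of information servers. Top-level INFN GIIS (dc=infn, dc=it, o=grid); below it the site GIISs for Milano (dc=mi, dc=infn, dc=it, o=grid) and Padova (dc=pd, dc=infn, dc=it, o=grid); GRIS below the site GIISs.]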
10. Root GIIS
[Figure: query levels in the GIIS hierarchy.]
- Root GIIS: a global view
  - High availability: ldbm backend (?), GIIS replication (?)
- 1st level query: focus on a set of resources (scheduling / resource discovery)
- 2nd and 3rd level queries (lower-level GIISs): get more up-to-date info
11. Globus Information Services
- Problems
  - Performance
    - When querying the root GIIS server, in the worst case the whole namespace must be searched
    - The overall response time is limited by the slowest response of a descendant
    - Poor GRIS performance (shell backend)
    - Example (querying a site GIIS)
      - 1 sec. when the cache is on
      - 5-10 sec. when the cache has expired and GIIS and GRIS are not busy
      - > 1 min. when the cache has expired and the GRIS is busy
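The observation that the overall response time is bounded by the slowest descendant can be illustrated with a small model. This is only a sketch under assumptions that are not in the slides: each server forwards a query to all its children in parallel, a server with a valid cache answers without contacting its children, and the timings used in the example are hypothetical values chosen to resemble the figures quoted above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One information server (GIIS or GRIS) in the hierarchy."""
    name: str
    own_time: float            # seconds the server itself needs to answer
    cache_valid: bool = False  # True if the answer can come from the cache
    children: List["Node"] = field(default_factory=list)


def response_time(node: Node) -> float:
    """Time to answer a query, assuming children are queried in parallel."""
    if node.cache_valid or not node.children:
        return node.own_time
    # The node must wait for its slowest descendant before replying.
    return node.own_time + max(response_time(c) for c in node.children)


if __name__ == "__main__":
    busy_gris = Node("gris-mi", 60.0)   # hypothetical busy GRIS
    idle_gris = Node("gris-pd", 5.0)    # hypothetical idle GRIS
    site_mi = Node("giis-mi", 1.0, children=[busy_gris])
    site_pd = Node("giis-pd", 1.0, children=[idle_gris])
    root = Node("root-giis", 1.0, children=[site_mi, site_pd])
    print(response_time(root))  # dominated by the busy GRIS
```

In this model a single busy GRIS determines the answer time of the root query, while a valid cache at its site GIIS hides it completely.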
12. Globus Information Services
- Problems
  - Pull model
    - Mixed push/pull model more suitable
  - Security and access controls
    - Any GRIS can register itself with a GIIS
    - No access control when searching the GIS
  - Fault tolerance
    - No automatic failover mechanisms
13. Globus Information Services
- Other INFN customisations
  - INFN-GIS browser
  - Tools (MRTG based) to monitor LDAP servers
    - Entries returned
    - Connections
- On-going MDS-2.1 alpha evaluation
14. INFN-GIS browser
http://bond.cnaf.infn.it/cgi-bin/mdsbrowse1.pl
15. Resource Management
- Evaluation of Globus GRAM
  - Focus on the possible use of GRAM as a uniform interface to different underlying local resource management systems (LRMS)
  - Tests with Condor, LSF and PBS as LRMS
  - INFN WAN Condor pool as Globus resource
- The model is fine, but it lacks the robustness needed for real production environments
  - Memory leaks in the Globus job manager
    - Fixes provided by our group were fed back to Globus
  - Scalability (one job manager for each job)
  - Reliability (the job manager is not persistent)
    - Hopefully addressed with the new jobmanager (by the Condor team)
- Globus GRAM integrated in the first workload management system prototype of the DataGrid project
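The idea of GRAM as a uniform interface to different LRMS can be illustrated by keeping the job description fixed and changing only the target. The sketch below builds a conjunctive RSL-style request string and a contact string per LRMS. It is illustrative only: the helper names and the host name are hypothetical, the `host/jobmanager-<lrms>` naming is an assumed convention rather than something stated in the slides, and no quoting of special characters is attempted.

```python
from typing import Dict


def build_rsl(attributes: Dict[str, str]) -> str:
    """Build a conjunctive request of the form &(name=value)(name=value)."""
    return "&" + "".join(f"({name}={value})" for name, value in attributes.items())


def gram_contact(host: str, lrms: str) -> str:
    """Contact string selecting the job manager for one LRMS (assumed naming)."""
    return f"{host}/jobmanager-{lrms}"


if __name__ == "__main__":
    # The same job description is reused for every local resource manager.
    rsl = build_rsl({"executable": "/bin/hostname", "count": "1"})
    for lrms in ("condor", "lsf", "pbs"):
        # grid.example.org is a placeholder host, not a real INFN resource.
        print(gram_contact("grid.example.org", lrms), rsl)
```

Only the contact string differs between the three targets, which is the property the evaluation was testing with Condor, LSF and PBS.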
16. INFN WAN Condor pool
- Single pool
  - To optimize CPU usage of all INFN hosts
  - > 200 machines
  - Mainly Linux and Digital Unix machines
  - Spread across the different INFN sites
- Sub-pools
  - To define policies/priorities on resource usage
- Multiple checkpoint servers
  - To guarantee the performance and the efficiency of the system
  - To reduce network traffic for checkpointing activity
- General purpose computing facility for all INFN users
  - Different kinds of applications
  - Allocation time for Condor jobs, January 2000 to December 2000: > 45 years
- http://www.infn.it/condor
17. Resource Management
- GRAM Reporter (information providers), in particular for farms
  - Many useless attributes (at least for our needs), attributes not calculated (always defined as 0), some attributes not properly calculated by the Globus shell scripts
  - Some important information describing the farms and the submitted jobs (necessary, for example, for a resource broker) is missing
  - → We are addressing this problem in the context of the DataGrid Project
- Submission of Condor jobs to Globus resources
  - Condor-G
    - Useful as a reliable job submission service
    - Persistent queue of jobs
    - Logging information
    - Exploitation of the new persistent Globus jobmanager (hopefully in the next release)
    - Reliable (two-phase commit) submission protocol (hopefully in the next release)
    - Exploited in the first workload management system prototype of the DataGrid project as job submission service
  - GlideIn
- Evaluation of MPICH-G2 vs. MPICH
  - Some shortcomings found (lack of support for shared memory, worse latency for small messages with respect to MPICH)
18. Data management
- Tests with GASS
- Tests with GridFTP alpha release 2
  - Capability of resuming an interrupted file transfer successfully tested
  - Support for the GSI authentication mechanisms successfully tested
  - Throughput tests
    - Increasing number of parallel streams and fixed file size
    - Increasing file size and fixed number of streams
    - Increasing TCP buffer size
    - Increasing block size
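When varying the TCP buffer size and the number of parallel streams in throughput tests like those listed above, a common rule of thumb is to compare the buffer with the bandwidth-delay product of the path (link bandwidth multiplied by round-trip time). The sketch below is not taken from the evaluation: the function names are hypothetical, the link speed and round-trip time are made-up inputs, and dividing the window evenly among parallel streams is a simplification.

```python
def bdp_bytes(bandwidth_bit_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes, using integer arithmetic."""
    # bits in flight = bandwidth * rtt; divide by 1000 (ms) and by 8 (bits).
    return bandwidth_bit_s * rtt_ms // 8000


def per_stream_buffer(total_bytes: int, streams: int) -> int:
    """Buffer needed per stream if the window is split evenly (rounded up)."""
    if streams < 1:
        raise ValueError("streams must be at least 1")
    return -(-total_bytes // streams)  # ceiling division


if __name__ == "__main__":
    # Hypothetical path: 100 Mbit/s with a 20 ms round-trip time.
    window = bdp_bytes(100_000_000, 20)
    print(window)                        # 250000
    print(per_stream_buffer(window, 4))  # 62500
```

Under these assumptions, a single stream with a buffer much smaller than the bandwidth-delay product cannot fill the link, which is one reason why larger buffers or more parallel streams may increase throughput.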
19. Other services
- Fault Monitoring (HBM)
  - Evaluation of HBM for fault detection (for system and user processes)
  - but the HBM package is not seeing active development
- Execution Environment Management (GEM)
  - Evaluation of GEM as a service for code migration
  - but Globus now provides only limited capabilities (executable staging)
20. Globus installation tools
- Various problems installing and deploying Globus using the standard install procedures
  - Installation and configuration partially manual (error prone)
  - Very long compilation time
  - No hooks for local customizations
  - ...
- → INFN-GRID Globus installation toolkit
  - To shorten the installation time of the Globus toolkit
  - Support for specific customisations
  - Quick distribution of patches
  - Support for distribution of new tools and packages
21. INFN-GRID Installation toolkit
- Characteristics
  - Distribution of binary files
  - Distribution of the packages needed to install/use Globus
  - Distribution of various Globus flavoured compilations (Kerberos, MPICH, AFS)
  - Support for the most used platforms in the HENP community (Linux RH, Solaris)
  - Binary file relocation supported
  - Latest patches included (e.g. fixes for Globus jobmanager memory leaks)
  - Support for local customisations (hook to support different CAs, support for different GIS configurations, support for different LRMS, ...)
  - Support for distribution of new tools and packages (certretrieve, GDMP, ...)
  - Upgrade and uninstall procedures
  - Documentation
- Proven to be successful
  - Used to set up an INFN GRID Testbed and also outside INFN (CERN, FNAL, ...)
  - Used as installation tool for DataGrid Testbed 0
22. Conclusions
- The Globus toolkit can provide basic services useful to create and deploy usable Grids, but various shortcomings and issues must be addressed
- Other info
  - Report on the INFN-GRID Globus Evaluation
    - http://www.infn.it/globus/Docs/infn-globus-evaluation.pdf
  - Response from the Globus team to the Report on the INFN-GRID Globus Evaluation
    - http://www.isi.edu/annc/infn/responsetoinfn.pdf
  - http://www.infn.it/globus