Title: Overview of the GLUE Project (Grid Laboratory Unified Environment). Author: Piotr Nowakowski, M.Sc., Cyfronet
1. Overview of the GLUE Project (Grid Laboratory Unified Environment)
Author: Piotr Nowakowski, M.Sc., Cyfronet, Kraków
2. Presentation Summary
- Goals of GLUE
- Key GLUE contributors
- GLUE schema
- GLUE activities
- Unresolved issues
3. Goals of GLUE
- Promote coordination between European and US Grid projects
- Define, construct, test and deliver interoperable middleware to all Grid projects
- Experiment with intercontinental Grid deployment and operational issues
- Establish procedures and policies regarding interoperability
- Once the GLUE collaboration establishes the necessary minimum requirements for middleware interoperability, any future software designed by the projects under the umbrella of the HICB and JTB must maintain the achieved interoperability.
4. GLUE Organization
- Management by iVDGL and DataTAG
- Guidance and oversight by the High Energy Physics Intergrid Coordination Board (HICB) and the Joint Technical Board (JTB)
- Participating organizations (19 entities in all):
  - Grid projects (EDG, GriPhyN, CrossGrid, etc.)
  - LHC experiments (ATLAS, CMS, etc.)
5. HENP Collaboration
- The HENP (High-Energy Nuclear Physics) Grid R&D projects (initially DataGrid, GriPhyN and PPDG, as well as the national European Grid projects in the UK, Italy, the Netherlands and France) have agreed to coordinate their efforts to design, develop and deploy a consistent, open-source, standards-based global Grid infrastructure.
- To that effect, their common efforts are organized in three major areas:
  - A HENP InterGrid Coordination Board (HICB) for high-level coordination
  - A Joint Technical Board (JTB)
  - Common Projects and Task Forces to address needs in specific technical areas
6. The DataTAG Project
Aim: Creation of an intercontinental Grid testbed using DataGrid (EDG) and GriPhyN components.
Work packages:
- WP1: Establishment of an intercontinental testbed infrastructure
- WP2: High-performance networking
- WP3: Bulk data transfer validation and performance monitoring
- WP4: Interoperability between Grid domains
- WP5: Information dissemination/exploitation
- WP6: Project management
7. DataTAG WP4
- Aims:
  - To produce an assessment of interoperability solutions
  - To provide a test environment for LHC applications, extending existing use cases to test interoperability of Grid components
  - To provide input to a common LHC Grid architecture
  - To plan integrated EU-US Grid deployment
- WP4 tasks:
  - T4.1: Develop an intergrid resource discovery schema
  - T4.2: Develop intergrid Authentication, Authorization and Accounting (AAA) mechanisms
  - T4.3: Plan and deploy an intergrid VO in collaboration with iVDGL
8. DataTAG WP4: Framework and Relationships
9. The iVDGL Project (International Virtual Data Grid Laboratory)
Aim: To provide high-performance global computing infrastructure for keynote experiments in physics and astronomy (ATLAS, LIGO, SDSS, etc.)
- iVDGL activities:
  - Establishing supercomputing sites throughout the U.S. and Europe and linking them with a multi-gigabit transatlantic link
  - Establishing a Grid Operations Center (GOC) in Indiana
  - Maintaining close cooperation with partner projects in the EU and with the GriPhyN project
10. U.S. iVDGL Network
- Selected participants:
- Fermilab
- Brookhaven National Laboratory
- Argonne National Laboratory
- Stanford Linear Accelerator Center (SLAC)
- University of Florida
- University of Chicago
- California Institute of Technology
- Boston University
- University of Wisconsin
- Indiana University
- Johns Hopkins University
- Northwestern University
- University of Texas
- Pennsylvania State University
- Hampton University
- Salish Kootenai College
11. iVDGL Organization Plan
- Project Steering Group: advises iVDGL directors on important project decisions and issues.
- Project Coordination Group: provides a forum for short-term planning and tracking of project activities and schedules. The PCG includes representatives of related Grid projects, particularly EDT/EDG.
- Facilities Team: identification of testbed sites, hardware procurement.
- Core Software Team: definition of software suites and toolkits (Globus, VDT, operating systems, etc.).
- Operations Team: performance monitoring, networking, coordination, security, etc.
- Applications Team: planning the deployment of applications and the related requirements.
- Outreach Team: website maintenance, planning conferences, publishing research materials, etc.
Note: The GLUE effort is coordinated by the Interoperability Team (a.k.a. the GLUE Team).
12. The GriPhyN Project
- Aims:
  - To provide the necessary IT solutions for petabyte-scale data-intensive science by advancing the Virtual Data concept
  - To create Petascale Virtual Data Grids (PVDGs) to meet the computational needs of thousands of scientists spread across the globe
- Timescale: 5 years (2000-2005)
- GriPhyN applications:
  - The CMS and ATLAS LHC experiments at CERN
  - LIGO (Laser Interferometer Gravitational-Wave Observatory)
  - SDSS (Sloan Digital Sky Survey)
13. The Virtual Data Concept
Virtual data: the definition and delivery to a large community of a (potentially unlimited) virtual space of data products derived from experimental data. In virtual data space, requests can be satisfied via direct access and/or computation, with local and global resource management, policy, and security constraints determining the strategy used.
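The "direct access and/or computation" idea can be sketched as a materialize-or-derive lookup. This is an illustrative toy, not GriPhyN software; all class and function names below are invented for the example.

```python
# Minimal sketch of the virtual-data idea: a requested data product is
# returned from storage if it has already been materialized, otherwise
# it is derived on demand from its recorded transformation and inputs.
# All names here are illustrative; they are not GriPhyN APIs.

class VirtualDataCatalog:
    def __init__(self):
        self.materialized = {}   # product name -> concrete data
        self.derivations = {}    # product name -> (function, input names)

    def register(self, name, func, inputs):
        """Record how a product can be derived (its 'virtual' definition)."""
        self.derivations[name] = (func, inputs)

    def store(self, name, data):
        """Materialize a product directly (e.g. raw experimental data)."""
        self.materialized[name] = data

    def request(self, name):
        """Satisfy a request via direct access or, failing that, computation."""
        if name in self.materialized:
            return self.materialized[name]                   # direct access
        func, inputs = self.derivations[name]
        data = func(*(self.request(i) for i in inputs))      # recursive derivation
        self.materialized[name] = data                       # cache for later requests
        return data

cat = VirtualDataCatalog()
cat.store("raw_events", [4, 1, 3])
cat.register("calibrated", lambda raw: sorted(raw), ["raw_events"])
cat.register("summary", lambda cal: sum(cal), ["calibrated"])
print(cat.request("summary"))  # derived on demand from raw_events -> 8
```

In a real PVDG the "strategy used" would be chosen by resource management and policy constraints (recompute locally vs. fetch a replica), which this sketch deliberately omits.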
- GriPhyN IT targets:
  - Virtual Data technologies: new methods of cataloging, characterizing, validating, and archiving software components to implement virtual data manipulations
  - Policy-driven request planning and scheduling of networked data and computational resources: mechanisms for representing and enforcing both local and global policy constraints, and new policy-aware resource discovery techniques
  - Management of transactions and task execution across national-scale and worldwide virtual organizations: new mechanisms to meet user requirements for performance, reliability, and cost
14. Sample VDG Architecture
15. Petascale Virtual Data Grids
Petascale: both computationally intensive (petaflops) and data-intensive (petabytes). Virtual: containing little ready-to-use information, instead focusing on methods of deriving this information from other data.
The Tier Concept
Developed for use by the most ambitious LHC experiments, ATLAS and CMS.
- Tier 0: CERN HQ
- Tier 1: National center
- Tier 2: Regional center
- Tier 3: HPC center
- Tier 4: Desktop PC cluster
16. The DataGrid (EDG) Project
Aim: To enable next-generation scientific exploration that requires intensive computation and analysis of shared large-scale databases, from hundreds of terabytes to petabytes, across widely distributed scientific communities.
DataGrid Work Packages:
- WP1: Workload Management
- WP2: Data Management
- WP3: Monitoring Services
- WP4: Fabric Management
- WP5: Storage Management
- WP6: Integration (testbeds)
- WP7: Network
- WP8: Application: Particle Physics
- WP9: Application: Biomedical Imaging
- WP10: Application: Satellite Surveys
- WP11: Dissemination
- WP12: Project Management
17. GLUE Working Model
- The following actions take place once an interoperability issue is encountered:
  - The DataTAG/iVDGL managers define a plan and sub-tasks to address the issue. This plan includes integrated tests and demonstrations which define overall success.
  - The DataTAG/iVDGL sub-task managers assemble all the input required to address the issue at hand. The HIJTB and other relevant experts are strongly involved.
  - The DataTAG/iVDGL sub-task managers organize getting the work done using the identified solutions.
  - At appropriate points the work is presented to the HICB, which discusses it on a technical level. Iterations take place.
  - At appropriate points the evolving solutions are presented to the HICB.
  - At an appropriate point the final solution is presented to the HICB with a recommendation that it be accepted by the Grid projects.
18. GLUE Working Model: Example
- Issue: DataGrid and iVDGL use different data models for publishing resource information; therefore Resource Brokers (RBs) cannot work across domains.
- The HIJTB recognizes this and proposes it as an early topic to address. The DataTAG/iVDGL management is advised to discuss it early on.
- DataTAG management has already identified this as a sub-task.
- DataTAG/iVDGL employees are assigned to the problem.
- Many possible solutions exist, from consolidation to translation at various levels (the information services level or even the RB level). The managers discuss the problem with clients in order to ascertain the optimal solution.
- The group involved organizes its own meetings (independent of the monthly HIJTB meetings); this is taking place now.
- A common resource model is proposed. Once it has been demonstrated to work within a limited test environment, the HIJTB/HICB will discuss if and when to deploy it generally, taking into account the ensuing modifications needed to other components such as the Resource Broker.
19. GLUE Schemas
- GLUE schemas: descriptions of the objects and attributes needed to describe Grid resources and their mutual relations.
- GLUE schemas include:
  - Computing Element (CE) schema: in development
  - Storage Element (SE) schema: TBD
  - Network Element (NE) schema: TBD
The development of schemas is coordinated by the JTB in collaboration with Globus, PPDG and EDG WP managers.
20. CE Schema (version 4, 24/05/2002)
- Computing Element: an entry point into a queuing system. Each queue points to one or more clusters.
- Cluster: a group of subclusters or individual nodes. A cluster may be referenced by more than one Computing Element.
- Subcluster: a homogeneous group of individual computing nodes (all nodes must be represented by a predefined set of attributes).
- Host: a physical computing element. No host may be part of more than one subcluster.
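The hierarchy and its two constraints (a cluster may serve several CEs; a host belongs to at most one subcluster) can be sketched as a small object model. The class names below are ours for illustration; they are not the GLUE schema's own type names.

```python
# Illustrative model of the CE-schema hierarchy described above:
# CE -> queue -> cluster(s) -> subcluster(s) -> hosts.
# Not the actual GLUE schema definition; names are invented.

class Host:
    def __init__(self, name):
        self.name = name
        self.subcluster = None   # a host joins at most one subcluster

class Subcluster:
    """A homogeneous group of hosts sharing a predefined attribute set."""
    def __init__(self, name, attributes):
        self.name = name
        self.attributes = attributes  # e.g. CPU type, RAM, OS
        self.hosts = []

    def add_host(self, host):
        # Enforce: no host may be part of more than one subcluster.
        if host.subcluster is not None:
            raise ValueError(f"{host.name} already belongs to a subcluster")
        host.subcluster = self
        self.hosts.append(host)

class Cluster:
    """A group of subclusters; may be referenced by more than one CE."""
    def __init__(self, name, subclusters):
        self.name = name
        self.subclusters = subclusters

class ComputingElement:
    """Entry point into a queuing system; each queue points to clusters."""
    def __init__(self, name):
        self.name = name
        self.queues = {}          # queue name -> list of clusters

    def add_queue(self, queue, clusters):
        self.queues[queue] = clusters

sc = Subcluster("sc1", {"cpu": "PIII", "ram_mb": 512})
sc.add_host(Host("node01"))
cluster = Cluster("c1", [sc])
ce_a, ce_b = ComputingElement("ce-a"), ComputingElement("ce-b")
ce_a.add_queue("short", [cluster])   # the same cluster may be referenced
ce_b.add_queue("long", [cluster])    # by more than one computing element
```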
21. GLUE Schema Representation
In existing MDS models, GLUE schemas and their hierarchies can be represented through DITs (Directory Information Trees). Globus MDS v2.2 will be updated to handle the new schema. In future OGSA-based implementations (Globus v3.0) the structure can be converted to an XML document.
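As a rough illustration of the DIT approach, a fragment of the CE hierarchy might be rendered as LDIF entries nested by distinguished name. The attribute and objectClass names below are invented for illustration and do not reproduce the actual GLUE schema definitions.

```ldif
# Hypothetical LDIF rendering of part of a CE hierarchy in an MDS DIT.
# Attribute and objectClass names are illustrative only.
dn: ceId=ce01.example.org, o=grid
objectClass: ComputingElement
ceId: ce01.example.org
queueName: short

dn: clusterId=cluster01, ceId=ce01.example.org, o=grid
objectClass: Cluster
clusterId: cluster01

dn: subClusterId=sc01, clusterId=cluster01, ceId=ce01.example.org, o=grid
objectClass: SubCluster
subClusterId: sc01
cpuModel: PIII
ramSizeMB: 512
```

The nesting of each entry's DN under its parent's DN is what encodes the CE > cluster > subcluster hierarchy in the directory tree.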
22. GLUE Stage I
Aims: Integration of the US (iVDGL) and European (EDG) testbeds; development of a permanent set of reference tests for new releases and services.
- Phase I:
  - Cross-organizational authentication
  - Unified service discovery and information infrastructure
  - Test of Phase I infrastructure
- Phase II:
  - Data movement infrastructure
  - Test of Phase II infrastructure
- Phase III:
  - Community authorization services
  - Test of the complete service
Status: in progress
23. Grid Middleware and Testbed
- The following middleware will be tested in Stage I of GLUE:
  - EDG Work Packages: WP1 (Workload Management), WP2 (Data Management), WP3 (Information and Monitoring Services), WP5 (Storage Management)
  - GriPhyN middleware: Globus 2.0, Condor v6.3.1, VDT 1.0
- The GLUE testbed will consist of:
  - Computational resources: several CEs from DataTAG and iVDGL respectively
  - Storage: access to mass storage systems at CERN and US Tier 1 sites
  - Network: standard production networks should be sufficient
24. GLUE Stage I Schedule
- Feb 2002: Test interoperating certificates between US and EU (done)
- May 2002: Review of common resource discovery schema (in progress)
- Jun 2002: Full testbed proposal available for review; review of common storage schema; first version of common use cases (EDG WP8); refinement of testbed proposals through HICB feedback
- Jul 2002: Intercontinental resource discovery infrastructure in test mode, for production deployment in September
- Sep 2002: Interoperating community and VO authorization available; implementation of common use cases by the experiments
- Nov 2002: Demonstrations planned
- Dec 2002: Sites integrated into the Grid, executing all goals of Stage I
25. Unresolved Issues
- Ownership of GLUE schemas
- Maintenance of GLUE schemas
- Ownership (and maintenance) of MDS information
providers
26. Web Addresses
- GLUE homepage at HICB: http://www.hicb.org/glue/glue.html
- GLUE-Schema site: http://www.hicb.org/glue/glue-schema/schema.htm
- HENP Collaboration page: http://www.hicb.org
- The DataTAG Project: http://www.datatag.org
- The iVDGL Project: http://www.ivdgl.org
- The GriPhyN Project: http://www.griphyn.org
- European DataGrid: http://www.eu-datagrid.org