Title: EGEE Providing a Production Grid Infrastructure for Collaborative Science
1 EGEE: Providing a Production Grid Infrastructure for Collaborative Science
- Erwin Laure
- EGEE Technical Director
ISSGC08 July 6-18, 2008 Balatonfüred, Hungary
2 Defining the Grid
- A Grid is the combination of networked resources
and the corresponding middleware, which provides
services for the user.
3 The EGEE Project
- Aim of EGEE
  - to establish a seamless European Grid infrastructure for the support of the European Research Area (ERA)
- EGEE
  - 1 April 2004 to 31 March 2006
  - 71 partners in 27 countries, federated in regional Grids
- EGEE-II
  - 1 April 2006 to 30 April 2008
  - Expanded consortium
- EGEE-III
  - 1 May 2008 to 30 April 2010
  - Transition to sustainable model
4 Defining the Grid
- A Grid is the combination of networked resources
and the corresponding middleware, which provides
services for the user.
5 EGEE Infrastructure
[Map: countries participating in EGEE]
6 EGEE Infrastructures
- Production service
  - Scaling up the infrastructure with resource centres around the globe
  - Stable, well-supported infrastructure, running only well-tested and reliable middleware
- Pre-production service
  - Run in parallel with the production service (restricted number of sites)
  - First deployment of new versions of the gLite middleware
  - Test-bed for applications and other external functionality
- T-Infrastructure (Training & Education)
  - Complete suite of Grid elements and applications (testbed, CA, VO, monitoring, support, ...)
  - Everyone can register and use GILDA for training and testing
20 sites on 3 continents
7 EGEE Operations Process
- Geographically distributed responsibility for operations
  - There is no central operation
  - Regional Operation Centers
    - Responsible for resource centers in their region
  - Tools are developed/hosted at different sites
    - GOC DB (RAL), SAM (CERN), GStat (Taipei), CIC Portal (Lyon)
- Grid operator on duty
  - 10 teams working in weekly rotation
  - Crucial in improving site stability and management
- Operations coordination
  - Weekly operations meetings
  - Regular ROC managers meetings
  - Series of EGEE Operations Workshops
- Procedures described in Operations Manual
  - Introducing new sites
  - Site downtime scheduling
- Highlights
  - Distributed operation
  - Evolving and maturing procedures
  - Procedures being introduced into and shared with the related infrastructure projects
8 Improved reliability through multi-level monitoring
Doubled size and usage without impact on operations
Central probes (SAM)
Local probes
9 Defining the Grid
- A Grid is the combination of networked resources
and the corresponding middleware, which provides
services for the user.
10 gLite Middleware Distribution
- Combines components from different providers
  - Condor and Globus (via VDT)
  - LCG
  - EGEE
  - Others
- Focus on providing a deployable MW distribution for EGEE production service
  - Middleware services configuration tools
- Follows a service oriented approach
  - Usage of web services where useful and possible performance-wise
  - Complemented by application-level services
11 Production Grid Middleware
- Key factors in EGEE Grid Middleware Development
  - Strict software process
    - Use industry standard software engineering methods
    - Software configuration management, version control, defect tracking, automatic build system, ...
  - Conservative approach in what software to use
    - Avoid cutting-edge software
      - Deployment on over 200 sites cannot assume a homogeneous environment: middleware needs to work with many underlying software flavors
    - Avoid evolving standards
      - Evolving standards change quickly (and sometimes significantly, cf. OGSI vs. WSRF): impossible to keep pace on > 200 sites
Long (and tedious) path from prototypes to production
12 gLite Process
[Flow diagram. Stage labels: Development (with inputs Directives and External Software, and an Error Fixing loop), Integration, Certification, Pre-Production, Production Infrastructure. Test gates, each marked Pass/Fail: Integration Tests, Functional Tests, Scalability Tests. Other labels: Software, Deployment Packages, Testbed Deployment, Pre-Production Deployment, Problem, Installation Guide, Release Notes, etc., Release.]
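The pass/fail gating named on this slide can be illustrated with a small sketch. It assumes the three test gates run in the order integration, functional, scalability, and that any failure returns the software to error fixing; the gate ordering, the function name and the `checks` callables are illustrative assumptions, not part of the gLite tooling.

```python
from typing import Callable, Dict, List, Tuple

# Test gates in the order assumed from the diagram (hypothetical names).
GATES: List[str] = ["integration tests", "functional tests", "scalability tests"]


def run_release_process(checks: Dict[str, Callable[[], bool]]) -> Tuple[str, List[str]]:
    """Run each gate in order and stop at the first failure.

    Returns the final state and a log of the gates that were attempted.
    """
    log: List[str] = []
    for gate in GATES:
        passed = checks[gate]()
        log.append(f"{gate}: {'pass' if passed else 'fail'}")
        if not passed:
            # A failed gate sends the software back to development.
            return "error fixing", log
    # All gates passed: the candidate can be released to production.
    return "release", log


if __name__ == "__main__":
    state, log = run_release_process({
        "integration tests": lambda: True,
        "functional tests": lambda: True,
        "scalability tests": lambda: False,
    })
    print(state)
    for entry in log:
        print(" ", entry)
```

Stopping at the first failed gate mirrors the conservative approach described for the production service: nothing reaches production unless every earlier stage has passed.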
13 Building Software for the Grid
[Layer diagram, courtesy IBM. Platform Infrastructure: Unix, Windows, JVM, TCP/IP, MPI, .Net Runtime, VPN, SSH]
Slide courtesy David Abramson
14 Building Software for the Grid
[Layer diagram, courtesy IBM. Layers: Upper Middleware Tools; Platform Infrastructure (Unix, Windows, JVM, TCP/IP, MPI, .Net Runtime, VPN, SSH)]
Slide courtesy David Abramson
15 Portals on EGEE
16 Not only portals
- Portals are a good way to bring computing power to end-users
  - In most cases domain specific
- Application programmers (and portal programmers) need more powerful interfaces
  - Workflow engines
  - Higher level programming abstractions (SAGA, DRMAA, ...)
  - Programming environments (g-Eclipse)
17 Defining the Grid
- A Grid is the combination of networked resources
and the corresponding middleware, which provides
services for the user.
18 EGEE Applications
- >270 VOs from several scientific domains
  - Astronomy & Astrophysics
  - Civil Protection
  - Computational Chemistry
  - Comp. Fluid Dynamics
  - Computer Science/Tools
  - Condensed Matter Physics
  - Earth Sciences
  - Fusion
  - High Energy Physics
  - Life Sciences
- Further applications under evaluation
Applications have moved from testing to routine and daily usage: 80-95% efficiency
19 Accelerating and colliding particles
Large Hadron Collider
- 27 km circumference tunnel
- Due to start up in 2008
- 40 million particle collisions per second
- Online filter reduces to a few 100 good events per second, recorded on disk and magnetic tape at 100-1,000 MegaBytes/sec
- 15 PetaBytes per year for all four experiments
- Data analyzed by 100s of research groups worldwide
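A quick back-of-the-envelope check relates the yearly volume to the recording rate. The sketch assumes decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes) and averages over a full calendar year, neither of which the slide states; the function name is illustrative.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: averaged over a full calendar year
BYTES_PER_PB = 10**15               # assumption: decimal petabytes
BYTES_PER_MB = 10**6                # assumption: decimal megabytes


def average_rate_mb_per_s(petabytes_per_year: float) -> float:
    """Average sustained rate in MB/s needed to accumulate the given yearly volume."""
    return petabytes_per_year * BYTES_PER_PB / SECONDS_PER_YEAR / BYTES_PER_MB


if __name__ == "__main__":
    rate = average_rate_mb_per_s(15)
    print(f"15 PB/year is about {rate:.0f} MB/s averaged over the year")
```

Under these assumptions, 15 PetaBytes per year corresponds to an average of a few hundred MB/s, which sits inside the 100-1,000 MegaBytes/sec recording range quoted on the slide.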
20 The Data Acquisition
21 Acquisition, First pass reconstruction, Storage & Distribution
22 Data Distribution on the Grid
23 Earth Science Applications in EGEE
- Flood of a Danube river: cascade of models (meteorology, hydraulic, hydrodynamic, ...), UISAV (SK)
- Production and validation of 7 years of ozone profiles from GOME: ESA, UTV (IT), KNMI (NL), IPSL (FR)
- Rapid earthquake analysis (mechanism and epicenter), 50-100 CPUs: IPGP (FR)
- Geocluster for academia and industry: CGG (FR)
- Data mining, meteorology, space weather: GCRAS (RU)
- Modelling seawater intrusion in coastal aquifers (SWIMED): CRS4 (IT), INAT (TU), Univ. Neuchâtel (CH)
- Data access studies, climate impacts on agriculture: DKRZ (DE)
- Specfem3D seismic application, benchmark for MPI (2 to 2000 CPUs): IPGP (FR)
- Air pollution model: BAS (BG)
- Mars atmosphere: CETP (FR)
24 Improved Efficiency Through VO Monitoring
- SAM allows plugging in VO-specific tests
  - Only responsive sites taken into account for scheduling
- Experiment dashboards
  - Better understand reason for failures
  - Extensively used by the LHC community
  - VLMED VO (biomed) using the dashboard for a year now, others interested
- Evolution similar to operations grid monitoring
  - Feed VO monitoring results to the sites
  - Common mechanism
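The idea of scheduling only onto responsive sites can be sketched as a filter over per-site test results. The record layout, site names and test names below are hypothetical and do not reflect the actual SAM data model or API.

```python
from typing import Dict, List

# Hypothetical results: site name -> {test name -> passed?}
SiteResults = Dict[str, Dict[str, bool]]


def responsive_sites(results: SiteResults, required_tests: List[str]) -> List[str]:
    """Return the sites that passed every required VO-specific test.

    A site with no recorded result for a required test is treated as not responsive.
    """
    return sorted(
        site
        for site, tests in results.items()
        if all(tests.get(name, False) for name in required_tests)
    )


if __name__ == "__main__":
    results = {
        "site-a": {"job-submit": True, "vo-software": True},
        "site-b": {"job-submit": True, "vo-software": False},
        "site-c": {"job-submit": True},  # no result for vo-software
    }
    print(responsive_sites(results, ["job-submit", "vo-software"]))
```

Treating a missing result as a failure is a deliberately cautious choice in this sketch: a site that has not reported is excluded rather than assumed healthy.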
25 How mature are we?
[Figure: Gartner Group chart. Grid on the Computing in High Energy Physics conferences timeline: Padova 2000, Beijing 2001, San Diego 2003, Interlaken 2004, Mumbai 2006, Victoria 2007]
Slide courtesy of Les Robertson, LCG Project Leader
26 The Future of Grids
- Increasing the number of infrastructure users by increasing awareness
  - Dissemination and outreach
  - Training and education
- Increasing the number of applications by improving application support and middleware functionality
  - Improved usability through high level grid middleware extensions
- Increasing the grid infrastructure
  - Incubating related projects
  - Ensuring interoperability between projects
- Protecting user investments
  - Towards a sustainable grid infrastructure
27 Grid Interoperability
- Incubator for new Grid efforts world-wide
  - Infrastructure and application efforts
- Leading role in building world-wide Grids through interoperation efforts
  - Bilateral: EGEE/OSG, EGEE/NDGF, EGEE/NAREGI, EGEE/Unicore/DEISA
  - Multilateral: Grid Interoperability Now (GIN)
- Experiences and requirements fed back into standardization process (OGF)
  - Many EGEE members are area directors, WG chairs, WG members
- Contacts with industry strengthened
  - Industry Days, Industry Task Force, Business Associates Programme
28 EGEE working with related infrastructure projects
29 Evolution
National
European e-Infrastructure
Global
30 European Grid Initiative
- Need to prepare permanent, common Grid infrastructure
- Ensure the long-term sustainability of the European e-Infrastructure independent of short project funding cycles
- Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
- Operate the production Grid infrastructure on a European level for a wide range of scientific disciplines
31 Summary
- Grids represent a powerful new tool for science
- Today we have a window of opportunity to move grids from research prototypes to permanent production systems (as networks did a few years ago)
- EGEE offers
  - a mechanism for linking together people, resources and data of many scientific communities
  - a basic set of middleware for gridifying applications with documentation, training and support
  - regular forums for linking with grid experts, other communities and industry
32 Summary
- Success will lead to the adoption of grids as the main computing infrastructure for science
- If we succeed then the potential return to international scientific communities will be enormous and open the path for commercial and industrial applications
33 EGEE08 Conference