Title: Towards Seamless Grid Computing: The EGEE Experience on Interoperable Grid Infrastructures
1 Towards Seamless Grid Computing: The EGEE Experience on Interoperable Grid Infrastructures
- Erwin Laure
- EGEE-II Technical Director
- Erwin.Laure_at_cern.ch
2 eScience
- Science is becoming increasingly digital and needs to deal with increasing amounts of data and computational needs
- Simulations get ever more detailed
- Nanotechnology: design of new materials from the molecular scale
- Modelling and predicting complex systems (weather forecasting, river floods, earthquakes)
- Decoding the human genome
- Experimental science uses ever more sophisticated sensors to make precise measurements
- Need high statistics
- Huge amounts of data
- Serves user communities around the world
3 Scientific trends
- Scientific advances are more and more based on simulations using virtual laboratories
- Ulf Dahlsten, Former Director Emerging Technologies and Infrastructures, EU, predicts that in five years 80 percent of all scientific papers in all areas will be made in virtual laboratories. Fifty percent of social science documents will go the same way in five to ten years.
- The size of data an organization owns, manages, and depends on is dramatically increasing
- Ownership cost of storage capacity goes down
- Data generated and consumed goes up
- Network capacity goes up
- Distributed computing technology matures and is more widely adopted
4 EGEE
- Flagship grid infrastructure project co-funded by the European Commission
- Now in 2nd phase with 91 partners in 32 countries
- Main Objectives
- Operate a large-scale, production quality grid infrastructure for e-Science
- Attract new resources and users from industry as well as sciences
5 EGEE: What do we deliver?
- Infrastructure operation
- Sites distributed across many countries
- Large quantity of CPUs and storage
- Continuous monitoring of grid services and automated site configuration/management
- Support multiple Virtual Organisations from diverse research disciplines
- Middleware
- Production quality middleware distributed under business friendly open source licence
- Implements a service-oriented architecture that virtualises resources
- Adheres to recommendations on web service inter-operability and evolving towards emerging standards
- User Support
- Managed process from first contact through to production usage
- Training
- Expertise in grid-enabling applications
- Online helpdesk
- Networking events (User Forum, Conferences etc.)
6
250 sites, 48 countries, 50,000 CPUs, 13 PetaBytes, >5000 users, >200 VOs, >140,000 jobs/day
Archeology, Astronomy, Astrophysics, Civil Protection, Comp. Chemistry, Earth Sciences, Finance, Fusion, Geophysics, High Energy Physics, Life Sciences, Multimedia, Material Sciences
7 Users and resources distribution
8 EGEE Grid Management Structure
- Operations Coordination Centre (OCC)
- management, oversight of all operational and support activities
- Regional Operations Centres (ROC)
- providing the core of the support infrastructure, each supporting a number of resource centres within its region
- Grid Operator on Duty
- Resource centres
- providing resources (computing, storage, network, etc.)
- Grid User Support (GGUS)
- At FZK, coordination and management of user support, single point of contact for users
9 Example: GridMap Monitoring Visualization
10 Registered Collaborating Projects
25 projects have registered as of September 2007
11 EGEE working with related infrastructure projects
12 LHC: Use of Multiple Grid Infrastructures
13 Why is Interoperability difficult?
- Grid infrastructures use different technologies
- And even if the same technologies are used, they are usually heavily customized
- Only a few widely adopted standards
- gridFTP, X.509 (but used differently!)
- Prototypes: BES, JSDL, ...
- Production Grids are difficult to change; adopting standards takes time
- Standards need to be stable before adoption
- Apart from technological differences, access policies also differ
- Dialogue among major Grid infrastructure providers started at the last OGF meeting (OGF22)
- Strong interactions between infrastructures and application community needed
- HEP was driving interop efforts for LHC
- Other applications can build on these experiences
14 Access to Compute Resources
[Figure: grid infrastructures (Nordugrid, EGEE, OSG, DEISA, Teragrid, NAREGI) and their computing interfaces (ARC, CREAM, GRAM v2/v4, GRAM v4 WS, Unicore, Naregi)]
There are as many Computing Interfaces as Batch Systems!
15 How to Start
- Understanding the differences
- Compatibility matrix
- Domains that have to be linked for interoperability
- Security
- Information Services
- Job Management
- Data Management
- For interoperation you have to add
- Monitoring
- Accounting
- Operational links and joint policies
- Trouble ticket systems
- Operational security
16 Interoperability Matrix
- Understand both middleware stacks
- Identify the common interfaces
- Create an interoperability matrix
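The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two stacks, `grid_a` and `grid_b`, and the interface names `schema-a`, `schema-b`, `interface-a` and `interface-b` are hypothetical placeholders, not a description of any real middleware. The domains are the four listed on the "How to Start" slide, and gridFTP and X.509 are used as the shared entries because the slides name them as the few widely adopted standards.

```python
from typing import Dict, List, Set

# Domains that have to be linked for interoperability (from the slides).
DOMAINS = ["Security", "Information Services", "Job Management", "Data Management"]

# A middleware stack: for each domain, the set of interfaces it supports.
Stack = Dict[str, Set[str]]


def interoperability_matrix(stack_a: Stack, stack_b: Stack) -> Dict[str, List[str]]:
    """Return, per domain, the interfaces that both stacks have in common."""
    return {
        domain: sorted(stack_a.get(domain, set()) & stack_b.get(domain, set()))
        for domain in DOMAINS
    }


def domains_without_common_interface(matrix: Dict[str, List[str]]) -> List[str]:
    """Domains where a gateway, adaptor or parallel deployment would be needed."""
    return [domain for domain, common in matrix.items() if not common]


# Hypothetical stacks, for illustration only.
grid_a: Stack = {
    "Security": {"X.509"},
    "Information Services": {"schema-a"},
    "Job Management": {"interface-a"},
    "Data Management": {"gridFTP"},
}
grid_b: Stack = {
    "Security": {"X.509"},
    "Information Services": {"schema-b"},
    "Job Management": {"interface-b"},
    "Data Management": {"gridFTP"},
}

matrix = interoperability_matrix(grid_a, grid_b)
gaps = domains_without_common_interface(matrix)
```

In this made-up case the matrix shows common ground for security and data management, while information services and job management are the domains that still need bridging.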
17 Different Strategies
- Long term solution
- Common interfaces
- Standards
- Medium term solutions
- Gateways
- Adaptors and Translators
- Short term solutions
- Parallel Infrastructures
- User driven
- Site driven
18 Parallel Infrastructures
- User Driven
- The user joins both grids
- Uses different clients
- Depending on which interface
- More work for the User
- Required for each infrastructure
- Keyhole approach
- Restricts functionality
- Method initially used by ATLAS
- Split workload between grids
19 Parallel Infrastructures
- Site Driven
- The site joins both grids
- Deploys both interfaces
- User only sees their grid interface
- More work for the site
- Can only be supported by large sites
- Reduced resources
- Used by FZK
- Participating in EGEE, Nordugrid and D-Grid
20 Gateway
- A gateway is a bridge between grid infrastructures
- Single point of failure
- Gateway breaks, grid disappears
- Scalability bottleneck
- All the load through one service
- Useful as a proof of concept and to demonstrate the need
- NAREGI approach using gLite-CE
[Figure: Gateway]
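The two weaknesses listed above can be made concrete with a toy model. This is a minimal sketch, not how any real gateway is implemented: the `Gateway` class, its round-robin placement, and the site names are all assumptions made for illustration.

```python
from typing import List


class GatewayDown(Exception):
    """Raised when the gateway cannot be reached."""


class Gateway:
    """Toy bridge that forwards jobs from one grid to the sites of another."""

    def __init__(self, remote_sites: List[str]) -> None:
        if not remote_sites:
            raise ValueError("a gateway needs at least one remote site")
        self.remote_sites = list(remote_sites)
        self.up = True
        self.forwarded = 0  # every forwarded job is counted here

    def visible_sites(self) -> List[str]:
        # Single point of failure: if the gateway is down, the whole
        # remote grid disappears from the user's point of view.
        return list(self.remote_sites) if self.up else []

    def forward(self, job: str) -> str:
        # Scalability bottleneck: every job passes through this one method.
        if not self.up:
            raise GatewayDown(f"cannot forward {job}: gateway unavailable")
        site = self.remote_sites[self.forwarded % len(self.remote_sites)]
        self.forwarded += 1
        return site


gateway = Gateway(["site-1", "site-2"])  # hypothetical site names
placements = [gateway.forward(f"job-{i}") for i in range(4)]
gateway.up = False  # simulate a gateway failure
sites_after_failure = gateway.visible_sites()
```

All four jobs are counted by the single gateway object, and once it is marked as down no remote site is visible any more, even though the sites themselves are unaffected.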
21 Adaptors and Translators
- Adaptors allow connection
- Translators understand/modify information
- They are built into the middleware
- The middleware can then work with both interfaces
- Useful feature even when using standards!
- Requires modification to the grid middleware
- Existing service interfaces can still be used
- Used in the GIN information system and in most portals
[Figure: API with plugins]
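The API-with-plugins idea can be sketched as follows. This is an illustrative sketch only: the class names, the interface names `format-a` and `format-b`, and both output formats are invented for the example and do not correspond to any real job description language or middleware API.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

Job = Dict[str, str]  # neutral job description used inside the middleware


class Adaptor(ABC):
    """Plugin that connects the middleware to one computing interface."""

    interface: str

    @abstractmethod
    def translate(self, job: Job) -> str:
        """Translator step: rewrite the neutral description for this interface."""

    def submit(self, job: Job) -> str:
        # A real plugin would contact a remote service; this sketch only
        # returns the text it would send.
        return f"{self.interface} <- {self.translate(job)}"


class FormatAAdaptor(Adaptor):
    interface = "format-a"  # hypothetical interface name

    def translate(self, job: Job) -> str:
        return ";".join(f"{key}={value}" for key, value in sorted(job.items()))


class FormatBAdaptor(Adaptor):
    interface = "format-b"  # hypothetical interface name

    def translate(self, job: Job) -> str:
        return " ".join(f"--{key} {value}" for key, value in sorted(job.items()))


class Middleware:
    """Keeps its own API and delegates interface-specific work to plugins."""

    def __init__(self) -> None:
        self._plugins: Dict[str, Adaptor] = {}

    def register(self, adaptor: Adaptor) -> None:
        self._plugins[adaptor.interface] = adaptor

    def interfaces(self) -> List[str]:
        return sorted(self._plugins)

    def submit(self, interface: str, job: Job) -> str:
        return self._plugins[interface].submit(job)


middleware = Middleware()
middleware.register(FormatAAdaptor())
middleware.register(FormatBAdaptor())
job = {"executable": "/bin/hostname", "cpus": "1"}
sent_a = middleware.submit("format-a", job)
sent_b = middleware.submit("format-b", job)
```

The caller uses one `submit` call for both interfaces; supporting a further interface means registering another plugin rather than changing the callers, which is why the approach stays useful even once standards exist.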
22 Worldwide Grids
APAC, DEISA, EGEE, Naregi, NDGF, NGS, OSG, Pragma, Teragrid
23 How mature are we?
[Figure (Gartner Group): Grid on the Computing in High Energy Physics conferences timeline: Padova 2000, Beijing 2001, San Diego 2003, Interlaken 2004, Mumbai 2006, Victoria 2007]
Slide courtesy of Les Robertson, LCG Project Leader
24 From e-Infrastructures to Knowledge Infrastructures
- Network infrastructure connects computing and data resources and allows their seamless usage via Grid infrastructures
- Federated resources and new technologies enable new application fields
- Distributed digital libraries
- Distributed data mining
- Digital preservation of cultural heritage
- Data curation
- → Knowledge Infrastructure
25 ICT for Science: e-Infrastructures
Connecting the finest minds
Sharing and federating the best scientific resources
Building global virtual communities
Sharing and federating scientific data
Sharing computers, instruments and applications
Linking at the speed of the light
Mario Campolargo, Acting Director Emerging Technologies and Infrastructures, EU
European Information Space: Infrastructures, Services and Applications Workshop, Rome, 29-30 October 2007
26 Evolution
National
European e-Infrastructure
Global
27 European Grid Initiative
- Need to prepare permanent, common Grid infrastructure
- Ensure the long-term sustainability of the European e-Infrastructure independent of short project funding cycles
- Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
- Operate the production Grid infrastructure on a European level for a wide range of scientific disciplines
28 Summary
- EGEE provides a dependable production quality Grid infrastructure to a wide variety of scientific disciplines.
- Collaborations on technical and political topics are key to implementing a truly world-wide infrastructure
- Need to cover the full spectrum from individual sites and small scale Grids to world-wide infrastructures
- Grids are increasingly becoming an essential part of the scientific computing infrastructure; sustainability needs to be ensured