Title: User communities and applications
Author: David Fergusson
Last modified by: Joanne Barnett
Created Date: 2/25/2005 9:54:36 PM


Title: The EGEE project: building an international production grid infrastructure

EGEE is a project co-funded by the European
Commission under contract INFSO-RI-508833
  • EGEE - what is it and why is it needed?
  • Middleware: current and future
  • Operations: providing a stable service
  • Networking: enabling collaboration
  • Summary
  • The material for this talk has been contributed
    by many colleagues in the EGEE and LCG projects.
  • It is heavily based on Bob Jones's talk at the UK AHM

Technological push?
  • Is Grid technology merely a funding
  • Is it an example of scientists wanting to do
    something because they (just about) could?
  • Is it in fact a technology driven activity,
    without any real purpose?
  • Consider this diagram

Grids vs. Distributed Computing
  • Existing distributed applications
  • tend to be specialised systems
  • intended for a single purpose or user group
  • Grids go further and take into account
  • Different kinds of resources
  • Not always the same hardware, data and
  • Different kinds of interactions
  • User groups or applications want to interact with
    Grids in different ways
  • Dynamic nature
  • Resources and users added/removed/changed

What is Grid Computing?
  • A Virtual Organisation is
  • People from different institutions working to
    achieve a common goal
  • Sharing distributed processing and data resources
  • Grid infrastructure enables virtual organisations

Grid computing is coordinated resource sharing
and problem solving in dynamic,
multi-institutional virtual organizations

The terms of the problem
  • Technological progress produces more
    sophisticated digital sensors (particle physics
    detectors, satellites, radio-telescopes, ...)
  • Much of science is therefore becoming
    increasingly data-intensive
  • Huge amounts of data need to be analyzed by large
    and geographically distributed scientific
  • Consequently, single computers, clusters or
    supercomputers are not powerful enough for the
    necessary calculations and the data processing

Result: access to large facilities is difficult
and expensive for the scientific community,
particularly in less favoured countries, leading
to an increase of the electronic divide

The Grid: a possible solution
  • The World Wide Web provides seamless access to
    information stored in different geographical
  • The Grid provides seamless access to computing
    power and data storage capacity distributed over
    the globe
  • Relies on advanced software, called middleware
  • authenticates, authorizes and accounts (AAA)
  • understands and locates the data which the
    scientist needs
  • distributes the computing processing to wherever
    in the world there is available and useful
  • sends the results back
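
The four middleware duties listed above can be pictured as a small pipeline. What follows is only a toy sketch in Python: the user table, the replica catalogue, the site names and the submit function are all invented for illustration and do not correspond to any real EGEE or LCG interface.

```python
from dataclasses import dataclass

# Hypothetical tables standing in for real grid services.
AUTHORISED_USERS = {"alice": "biomed"}                    # user -> virtual organisation
REPLICA_CATALOGUE = {"run42.dat": ["site-a", "site-c"]}   # file -> sites holding a copy
FREE_CPUS = {"site-a": 0, "site-b": 12, "site-c": 4}      # site -> idle CPUs


@dataclass
class Job:
    user: str
    input_file: str


def submit(job: Job) -> str:
    # 1. Authenticate and authorise the user (accounting is omitted here).
    if job.user not in AUTHORISED_USERS:
        raise PermissionError(f"unknown user: {job.user}")
    # 2. Locate the data which the scientist needs.
    sites_with_data = REPLICA_CATALOGUE[job.input_file]
    # 3. Send the processing to a site that has both the data and free CPUs.
    candidates = [s for s in sites_with_data if FREE_CPUS[s] > 0]
    if not candidates:
        raise RuntimeError("no site has both the data and free CPUs")
    site = max(candidates, key=lambda s: FREE_CPUS[s])
    # 4. Send the result back (here just a status string).
    return f"{job.input_file} processed at {site} for {job.user}"
```

In this sketch, site-b has the most free CPUs but no copy of the file, so the job goes to site-c: the scheduler moves the computation to the data rather than the reverse.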

The name Grid was chosen by analogy with the
electric power grid
  • Must share data between thousands of scientists
    with multiple interests
  • Must connect major computer centres, not just PCs
    (not P2P computing)
  • Must ensure that all data is accessible anywhere,
  • Must grow rapidly, yet remain reliable for more
    than a decade
  • Must cope with different computer centres access
  • Must ensure data security

  • Effective and seamless collaboration of dispersed
    communities, scientific first and then industrial
  • Ability to run large-scale applications
    aggregating thousands of computers, for very wide
    range of applications
  • Transparent access to distributed resources from
    your desktop
  • The term e-Science has been coined to express
    these benefits
  • In the vision of the Knowledge Grid, the Grid
    can act as unifying agent between applications
    and non homogeneous data

What are the characteristics of a Grid system?
  • Numerous Resources
  • Ownership by Mutually Distrustful Organizations
  • Connected by Heterogeneous, Multi-Level Networks
  • Different Security Requirements and Policies
  • Different Resource Management Policies
  • Potentially Faulty Resources
  • Geographically Separated
  • Resources are Heterogeneous

EGEE Overview
  • Goal
  • Create a world-wide production-quality Grid
    infrastructure for e-Science
  • on top of present and future EU Research
    Networking infrastructure
  • Build on
  • EU and EU member states' major investments in
    Grid Technology
  • International connections (US and AP)
  • Several pioneering prototype results
  • Large Grid development teams in EU require major
    EU funding effort
  • Approach
  • Leverage current and planned national and
    regional Grid initiatives and infrastructures
  • Work closely with relevant industrial Grid
    developers, NRENs and US-AP projects
  • http//

Grid infrastructure
Geant-NREN networks

The (Science) Grid Vision
The Grid: networked data processing centres and
middleware software as the glue of resources.

In 2 years EGEE will
  • Establish production quality sustained Grid
  • 3000 users from at least 5 disciplines
  • over 8,000 CPUs, 50 sites
  • over 5 Petabytes (1 PB = 10^15 bytes) of storage
  • Demonstrate a viable general process to bring
    other scientific communities on board
  • Propose a second phase in mid 2005 to take over
    EGEE in early 2006

  • EGEE builds on the work of LCG to establish a
    grid operations service
  • LCG (LHC Computing Grid) - Building and operating
    the LHC Grid
  • A collaboration between
  • The physicists and computing specialists from the
    LHC experiment
  • The projects in Europe and the US that have been
    developing Grid middleware
  • The regional and national computing centres that
    provide resources for LHC
  • The research networks

EGEE Activities
32 million euros of EU funding over 2 years, starting
1st April 2004
  • 48% service activities (Grid Operations, Support
    and Management, Network Resource Provision)
  • 24% middleware re-engineering (Quality
    Assurance, Security, Network Services
  • 28% networking (Management, Dissemination and
    Outreach, User Training and Education,
    Application Identification and Support, Policy
    and International Cooperation)
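
As a quick check of the funding split: the three figures above (48, 24 and 28) sum to 100, so they read naturally as percentage shares. Assuming, purely for illustration, that each share applies directly to the 32 million euro EU contribution, the amounts can be worked out as in this short Python sketch.

```python
TOTAL_FUNDING_MEUR = 32  # EU funding over 2 years, as stated on the slide

# Shares by activity type, read as percentages (an interpretation of the slide).
SHARES = {"service": 48, "middleware re-engineering": 24, "networking": 28}

# The shares should account for the whole budget.
assert sum(SHARES.values()) == 100


def allocation_meur(share_percent: int) -> float:
    """Amount in million euros for a given percentage share."""
    return TOTAL_FUNDING_MEUR * share_percent / 100


for activity, share in SHARES.items():
    print(f"{activity}: {allocation_meur(share):.2f} million euros")
```

Under that assumption, roughly 15.36 million euros go to service activities, 7.68 million to middleware re-engineering and 8.96 million to networking.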

Emphasis in EGEE is on operating a
production grid and supporting the end-users

  • EGEE - what is it and why is it needed?
  • Middleware: current and future
  • Operations: providing a stable service
  • Networking: enabling collaboration
  • Summary

gLite Approach
  • Exploit experience and components from existing
  • AliEn, VDT, EDG, LCG, and others
  • Design team works out architecture and design
  • Feedback and guidance from EGEE PTF,
    applications, Operations, LCG GAG and ARDA
  • Components are initially deployed on a prototype
  • Small scale (CERN and Univ. Wisconsin)
  • Get user feedback on service semantics and
  • After internal integration and testing,
    components are delivered to grid operations group
    and deployed on the pre-production service

  • gLite - the new EGEE middleware
  • Service oriented - components that are
  • Loosely coupled (by messages)
  • Accessible across network; modular and
    self-contained; clean modes of failure
  • So can change implementation without changing
  • Can be developed in anticipation of new uses
  • and are based on standards. Opens EGEE to
  • New middleware (plethora of tools now available)
  • Heterogeneous resources (storage, computation)
  • Interact with other Grids (international,
    regional and national)
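
The idea of loosely coupled, message-based services whose implementation can change behind a stable interface can be illustrated with a minimal Python sketch. The StorageService interface, the message fields and the in-memory implementation are hypothetical and are not part of gLite; they only show the pattern.

```python
from typing import Protocol


class StorageService(Protocol):
    """Hypothetical service interface: clients depend only on the messages."""

    def handle(self, message: dict) -> dict: ...


class InMemoryStorage:
    """One possible implementation; it could be swapped for a disk or tape one."""

    def __init__(self) -> None:
        self._files = {}

    def handle(self, message: dict) -> dict:
        if message["op"] == "put":
            self._files[message["name"]] = message["data"]
            return {"status": "ok"}
        if message["op"] == "get":
            if message["name"] not in self._files:
                # Clean mode of failure: an error reply, not an exception.
                return {"status": "error", "reason": "not found"}
            return {"status": "ok", "data": self._files[message["name"]]}
        return {"status": "error", "reason": "unknown op"}


def archive(service: StorageService, name: str, data: bytes) -> bool:
    """Client code: works with any implementation answering the same messages."""
    reply = service.handle({"op": "put", "name": name, "data": data})
    return reply["status"] == "ok"
```

Because archive only sends and receives messages, replacing InMemoryStorage with another class that answers the same messages requires no change to the client.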

Future EGEE Middleware - gLite
  • Intended to replace LCG-2
  • Starts with existing components from AliEn, EDG,
    VDT etc.
  • Aims to address LCG-2 shortcomings and advanced
    needs from applications
  • Prototyping: short development cycles for fast
    user feedback
  • Initial web-services based prototypes being
    tested with representatives from the application

Architecture Guiding Principles
  • Lightweight (existing) services
  • Easily and quickly deployable
  • Use existing services where possible as basis for
  • Interoperability
  • Allow for multiple implementations
  • Resilience and Fault Tolerance
  • Co-existence with deployed infrastructure
  • Reduce requirements on site components
  • Co-existence (and convergence) with LCG-2 and
    Grid3 are essential for the EGEE Grid service
  • Service oriented approach
  • Follow WSRF standardization
  • No mature WSRF implementations exist to date so
    start with plain WS (WS-I)
  • Provide framework to others so higher-level
    services can be developed quickly
  • Architecture

  • EGEE - what is it and why is it needed?
  • Middleware: current and future
  • Operations: providing a stable service
  • Needs more than middleware
  • Organisational, operational infrastructure
  • Networking: enabling collaboration
  • Summary

User view of EGEE: a multi-VO Grid
User Interface
Grid services

EGEE: adding a VO
  • EGEE has a formal procedure for adding selected
    new user communities (Virtual Organisations)
  • Negotiation with one of the Regional Operations
    Centres
  • Seek balance between the resources contributed by
    a VO and those that they consume.
  • Resource allocation will be made at the VO level.
  • Many resources need to be available to multiple
    VOs: shared use of resources is fundamental to a
    Grid

SA1 - Operations
  • Scale of the production service
  • April 2004: 2000 CPUs over 30 sites (LCG-1 ?
  • December 2004: 8000 CPUs over 80 sites
    (Migrated to Scientific Linux)
  • This is far beyond the project milestones!
  • Continuous improvements to LCG-2 middleware
  • Set-up of CIC/ROCs
  • Roles/responsibilities defined in execution plans
  • documented and implemented
  • On-going
  • Complete set-up of pre-production service
  • Deployment planning for gLite (EGEE1 M/W version)
  • Deploy accounting infrastructure

Running the Production Service
  • Grid deployment has entered a new phase
  • Basic middleware is working
  • responsible now for a small fraction of the
  • Outstanding performance/functionality issues
  • RLS, RB / little modularity lack of consistent
  • some solutions are being developed but many
    cannot be addressed in current
    software/architecture - set priorities for new
    middleware (gLite)
  • Many operational issues
  • mis-configuration, out-of-date middleware, single
    points of failure, failover, mgmt interfaces
  • resources unsuitable for applications needs (e.g.
    insufficient disk space)
  • slow response by sites to problems (holiday
    periods, security concerns)
  • new middleware will not help for many of these
    issues - grid partners must think Service

The grid still does not appear as a single
coherent facility: applications must adapt to the
current service to gain maximum profit. But the
result has been very effective for LHCb: 3000
concurrent jobs

EGEE Operations (I): OMC and CIC
  • Operation Management Centre
  • located at CERN, coordinates operations and
  • coordinates with other grid projects
  • Core Infrastructure Centres
  • behave as single organisations
  • operate core services (VO specific and general
    Grid services)
  • develop new management tools
  • provide support to the Regional Operations
    Centres

EGEE Operations: ROC
  • Regional Operations Centre responsibilities and
  • Testing (certification) of new middleware on a
    variety of platforms before deployment
  • Deployment of middleware releases: coordination
    and distribution inside the region
  • Integration of local VOs
  • Development of procedures and capabilities to
    operate the resources
  • First-line user support
  • Bring new resources into the infrastructure and
    support their operation
  • Coordination of integration of national grid
    infrastructures
  • Provide resources for the pre-production service

Production grid service
Launched in September 2003 with 12 sites; now more
than 100 sites and continues to grow

Grid projects
  • Many Grid development efforts all over the
  • UK: e-Science Grid
  • Netherlands: VLAM, PolderGrid
  • Germany: UNICORE, Grid proposal
  • France: Grid funding approved
  • Italy: INFN Grid
  • Eire: Grid proposals
  • Switzerland: Network/Grid proposal
  • Hungary: DemoGrid, Grid proposal
  • Norway, Sweden: NorduGrid
  • NASA Information Power Grid
  • DOE Science Grid
  • NSF National Virtual Observatory
  • NSF GriPhyN
  • DOE Particle Physics Data Grid
  • NSF TeraGrid
  • DOE ASCI Grid
  • DOE Earth Systems Grid
  • DARPA CoABS Grid
  • NEESGrid
  • EuroGrid (Unicore)
  • DataTag (CERN, ...)
  • DataGrid (CERN, ...)
  • Astrophysical Virtual Observatory
  • GRIP (Globus/Unicore)
  • GRIA (Industrial applications)
  • GridLab (Cactus Toolkit)
  • CrossGrid (Infrastructure Components)
  • EGSO (Solar Physics)

Authentication, Authorisation
  • Authentication
  • User obtains certificate from CA
  • Connects to UI by ssh
  • Downloads certificate
  • Invokes Proxy server
  • Single logon to UI - then Secure Socket Layer
    with proxy identifies user to other nodes

VO mgr
VO service
  • Authorisation - currently
  • User joins Virtual Organisation
  • VO negotiates access to Grid nodes and resources
    (CE, SE)
  • Authorisation tested by CE, SE
  • gridmapfile maps user to local account
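
The last authorisation step, mapping an authenticated user to a local account, can be sketched in Python as below. The sketch assumes the grid-mapfile layout commonly used by Globus-based middleware (a quoted certificate subject followed by a local account name). The subjects, account names and function names are invented for illustration.

```python
import shlex

# Invented example entries: "certificate subject" local_account
EXAMPLE_GRIDMAP = '''
# certificate subject (DN)           local account
"/C=UK/O=ExampleOrg/CN=Jane Doe" biomed001
"/C=DE/O=ExampleOrg/CN=Max Muster" atlas007
'''


def parse_gridmap(text: str) -> dict:
    """Return a mapping from certificate subject to local account."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the subject
        if len(parts) != 2:
            raise ValueError(f"malformed entry: {line!r}")
        subject, account = parts
        mapping[subject] = account
    return mapping


def local_account(mapping: dict, subject: str):
    """Return the local account for a subject, or None if not authorised."""
    return mapping.get(subject)
```

A CE or SE node would consult such a table after the SSL handshake has established the user's subject; a subject that is absent from the file is simply not authorised on that node.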

VO database
SSL (proxy)
Gridmapfiles on CE, SE nodes

JRA3 - EGEE Authentication Scheme - EUGridPMA
  • Policy Management Authority: club of trusted
    Certification Authority managers
  • Green: CA accredited
  • Yellow: being discussed
  • Other Accredited CAs
  • DoEGrids (US)
  • GridCanada
  • ASCCG (Taiwan)
  • CERN
  • Russia (HEP)
  • FNAL Service CA (US)
  • Israel
  • Pakistan
  • Greece Hellasgrid CA (AUTH)

  • EGEE - what is it and why is it needed?
  • Middleware: current and future
  • Operations: providing a stable service
  • Networking: enabling collaboration
  • Current application communities
  • Summary

Bringing new applications to the grid
  • Outreach events inform people about the grid /
  • Application experts discuss specific
    characteristics with the users
  • Migrate application to EGEE infrastructure with
    the help of EGEE experts
  • Initial deployment for testing purposes
  • Production usage - user community contributes
    computing resources for heavy production
    demands - Canadian dinner party

NA3: User training and induction
  • NA3 has been involved in more than 130 training
    events across the world
  • (including the GGF and other grid schools)
  • 2000 people trained
  • induction, application developer, advanced,
    activity retreats
  • Material archive online with 1000 presentations
  • Strong links made with GILDA testbed and use of
    GENIUS portal
  • Regularly used as part of tutorials
  • Essential element of the virtuous cycle for new
  • Training is one of the first things new
    communities need
  • Process for handling feedback defined
  • Helping to improve material and organisation
  • Roadmap for future events planned
  • Open to new suggestions
  • Produced status report and updated training plan
    taking into account lessons learned
  • On-going
  • Plan for next EGEE M/W (gLite) training

EGEE User Support infrastructure
  • General approach: 3 main support centers to
    guarantee 24/7 coverage, 365 days a year, and
    provide a single point of contact to customers
    and to local Grid operations.

To ensure 24x7 support, it was decided to have 3
GGUS teams in different time zones. GGUS started
off at Forschungszentrum Karlsruhe in Germany in
2003 and has had a partner group at Academia
Sinica in Taiwan since April 2004. A third
partner in North America will complete the
24-hour cycle.
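
The time-zone arrangement described above amounts to a "follow the sun" rota. The Python sketch below shows one way such a rota could be expressed. The 8-hour UTC shift boundaries and the function name are assumptions made for illustration; the text names the sites but not their hours.

```python
from datetime import datetime, timezone

# Hypothetical 8-hour shifts in UTC, chosen to fall in each site's working day.
SHIFTS = [
    (0, 8, "Academia Sinica (Taiwan)"),
    (8, 16, "Forschungszentrum Karlsruhe (Germany)"),
    (16, 24, "North American partner"),
]


def team_on_duty(when: datetime) -> str:
    """Return the support team covering the given (timezone-aware) moment."""
    hour = when.astimezone(timezone.utc).hour
    for start, end, team in SHIFTS:
        if start <= hour < end:
            return team
    raise AssertionError("shifts must cover all 24 hours")
```

With three shifts of eight hours each, every hour of the day falls in exactly one team's window, which is what makes the third partner necessary to close the cycle.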
EGEE User Support infrastructure
  • The ROCs and VOs and the other project wide
    groups such as the Core Infrastructure Center
    (CIC), middleware groups (JRA), network groups
    (NA), service groups (SA) will be connected via a
    central integration platform provided by GGUS.
  • This central helpdesk keeps track of all service
    requests and assigns them to the appropriate
    support groups. In this way, formal communication
    between all support groups is possible. To enable
    this, each group has to build only one interface
    between its internal support structure and the
    central GGUS application.
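
The benefit of a central integration platform can be made concrete with a small Python sketch. The routing table, the default group and the class name are hypothetical; only the group abbreviations (CIC, JRA, NA, SA, ROC) come from the text above.

```python
from collections import defaultdict

# Hypothetical request categories mapped to support groups named above.
ROUTES = {"middleware": "JRA", "network": "NA", "service": "SA", "core": "CIC"}


def interfaces_needed(groups: int, central_hub: bool) -> int:
    """One interface per group with a hub, one per pair of groups without."""
    if central_hub:
        return groups
    return groups * (groups - 1) // 2


class CentralHelpdesk:
    def __init__(self) -> None:
        self.tickets = []                # every request is tracked centrally
        self.queues = defaultdict(list)  # ticket ids per support group

    def submit(self, category: str, text: str) -> int:
        """Record a request and assign it to the matching support group."""
        group = ROUTES.get(category, "ROC")  # assumed default: regional first line
        ticket_id = len(self.tickets) + 1
        self.tickets.append((ticket_id, group, text))
        self.queues[group].append(ticket_id)
        return ticket_id
```

For example, six support groups need 6 interfaces when each connects only to the central application, against 15 if every pair of groups had to be connected directly.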

  • More about applications and communities in the
    next talk

  • EGEE - what is it and why is it needed?
  • Middleware: current and future
  • Operations: providing a stable service
  • Networking: enabling collaboration
  • Current application communities
  • Enabling new and effective use of EGEE
  • Summary

Who else can benefit from EGEE?
  • EGEE Generic Applications Advisory Panel
  • For new applications
  • EU projects: MammoGrid, Diligent, SEE-GRID
  • Expressions of interest: Planck/Gaia
    (astroparticle), SimDat (drug discovery)

Intellectual Property
  • The existing EGEE grid middleware (LCG-2) is
    distributed under an Open Source License
    developed by EU DataGrid
  • Derived from modified BSD - no restriction on
    usage (academic or commercial) beyond
  • Same approach for new middleware (gLite)
  • Application software maintains its own licensing
  • Sites must obtain appropriate licenses before

  • EGEE is the first attempt to build a worldwide
    Grid infrastructure for data intensive
    applications from many scientific domains
  • A large-scale production grid service is already
    deployed and being used for HEP and BioMed
    applications with new applications being ported
  • Resources and user groups will rapidly expand
    during the project
  • A process is in place for migrating new
    applications to the EGEE infrastructure
  • A training programme has started with events
    already held
  • Prototype next generation middleware is being
    tested (gLite)
  • Plans for a follow-on project are being discussed

When will the grid disappear?
  • Two possibilities
  • 1. Grids will not fulfill their promise and fade
    into being a niche distributed computing domain
  • 2. Grids will become ubiquitous and easily usable,
    transparent to the user, and so disappear
  • Following the trajectory of other networked