The EGEE project: building an international production grid infrastructure - PowerPoint PPT Presentation

About This Presentation
Title:

The EGEE project: building an international production grid infrastructure

Description:

Title: User communities and applications Author: David Fergusson Last modified by: Joanne Barnett Created Date: 2/25/2005 9:54:36 PM Document presentation format – PowerPoint PPT presentation

Slides: 44
Provided by: DavidFe74
Transcript and Presenter's Notes



1
The EGEE project: building an international
production grid infrastructure
EGEE is a project co-funded by the European
Commission under contract INFSO-RI-508833
2
  • EGEE - what is it and why is it needed?
  • Middleware: current and future
  • Operations: providing a stable service
  • Networking: enabling collaboration
  • Summary
  • The material for this talk has been contributed
    by many colleagues in the EGEE and LCG projects.
  • It is heavily based on Bob Jones's talk at UK AHM
    2004.

3
Technological push?
  • Is Grid technology merely a funding
    opportunity?
  • Is it an example of scientists wanting to do
    something because they (just about) could?
  • Is it in fact a technology driven activity,
    without any real purpose?
  • Consider this diagram

4
Grids vs. Distributed Computing
  • Existing distributed applications
  • tend to be specialised systems
  • intended for a single purpose or user group
  • Grids go further and take into account
  • Different kinds of resources
  • Not always the same hardware, data and
    applications
  • Different kinds of interactions
  • User groups or applications want to interact with
    Grids in different ways
  • Dynamic nature
  • Resources and users added/removed/changed
    frequently

5
What is Grid Computing?
  • A Virtual Organisation is
  • People from different institutions working
    towards a common goal
  • Sharing distributed processing and data resources
  • Grid infrastructure enables virtual organisations

"Grid computing is coordinated resource sharing
and problem solving in dynamic,
multi-institutional virtual organizations"
(I. Foster)
6
The terms of the problem
  • Technological progress produces more
    sophisticated digital sensors (particle physics
    detectors, satellites, radio-telescopes,
    synchrotrons)
  • Much of science is therefore becoming
    increasingly data-intensive
  • Huge amounts of data need to be analyzed by large
    and geographically distributed scientific
    communities
  • Consequently, single computers, clusters or
    supercomputers are not powerful enough for the
    necessary calculations and the data processing

Result: access to large facilities is difficult
and expensive for the scientific community,
particularly in less favoured countries =>
increase of the electronic divide
7
The Grid: a possible solution
  • The World Wide Web provides seamless access to
    information stored in different geographical
    locations
  • The Grid provides seamless access to computing
    power and data storage capacity distributed over
    the globe
  • Relies on advanced software, called middleware
  • authenticates, authorizes and accounts (AAA)
  • understands and locates the data which the
    scientist needs
  • distributes the computing processing to wherever
    in the world there is available and useful
    capacity
  • sends the results back

The name Grid was chosen by analogy with the
electric power grid
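
The four middleware steps listed on this slide (authenticate and authorise, locate the data, send the work to a site with spare capacity, return the results) can be sketched as a toy broker. This is only an illustration under stated assumptions: the `Site` record, the `AUTHORISED_USERS` table and the `submit` function are hypothetical names invented here, not part of any EGEE or LCG middleware.

```python
from dataclasses import dataclass


@dataclass
class Site:
    """Hypothetical description of one computing centre."""
    name: str
    free_cpus: int
    datasets: frozenset


# Hypothetical user -> virtual organisation table standing in for the AAA step.
AUTHORISED_USERS = {"alice": "biomed", "bob": "lhcb"}


def submit(user, dataset, cpus_needed, sites):
    """Toy version of the four middleware steps listed on the slide."""
    # 1. Authenticate / authorise: the user must belong to a VO.
    vo = AUTHORISED_USERS.get(user)
    if vo is None:
        raise PermissionError(f"{user} is not a member of any VO")
    # 2. Locate the data: keep only the sites that hold the dataset.
    holders = [s for s in sites if dataset in s.datasets]
    # 3. Distribute the work: pick a site with enough free capacity.
    candidates = [s for s in holders if s.free_cpus >= cpus_needed]
    if not candidates:
        raise RuntimeError("no site has both the data and free capacity")
    chosen = max(candidates, key=lambda s: s.free_cpus)
    chosen.free_cpus -= cpus_needed  # account for the CPUs now in use
    # 4. Send a result record back to the user.
    return {"user": user, "vo": vo, "site": chosen.name}
```

A real resource broker weighs many more factors (queue lengths, VO policies, data replication); the sketch only shows the order of the steps.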
8
Challenges
  • Must share data between thousands of scientists
    with multiple interests
  • Must connect major computer centres, not just PCs
    (not P2P computing)
  • Must ensure that all data is accessible anywhere,
    anytime
  • Must grow rapidly, yet remain reliable for more
    than a decade
  • Must cope with different computer centres' access
    policies
  • Must ensure data security

9
Benefits
  • Effective and seamless collaboration of dispersed
    communities, scientific first and then industrial
  • Ability to run large-scale applications
    aggregating thousands of computers, for very wide
    range of applications
  • Transparent access to distributed resources from
    your desktop
  • The term e-Science has been coined to express
    these benefits
  • In the vision of the Knowledge Grid, the Grid
    can act as unifying agent between applications
    and non-homogeneous data

10
What are the characteristics of a Grid system?
  • Numerous Resources

Ownership by Mutually Distrustful Organizations &
Individuals
Different Security Requirements & Policies
Required
Potentially Faulty Resources
Resources are Heterogeneous
11
What are the characteristics of a Grid system?
Standards
  • Numerous Resources

Ownership by Mutually Distrustful Organizations &
Individuals
Connected by Heterogeneous, Multi-Level Networks
Different Security Requirements & Policies
Required
Different Resource Management Policies
Potentially Faulty Resources
Geographically Separated
Resources are Heterogeneous
12
EGEE Overview
  • Goal
  • Create a world-wide production-quality Grid
    infrastructure for e-Science
  • on top of present and future EU Research
    Networking infrastructure
  • Build on
  • EU and EU member states' major investments in
    Grid Technology
  • International connections (US and AP)
  • Several pioneering prototype results
  • Large Grid development teams in EU require major
    EU funding effort
  • Approach
  • Leverage current and planned national and
    regional Grid initiatives and infrastructures
  • Work closely with relevant industrial Grid
    developers, NRENs and US-AP projects
  • http://www.eu-egee.org

Applications
Grid infrastructure
Geant-NREN networks
13
The (Science) Grid Vision
The Grid: networked data processing centres and
middleware software as the glue of resources.
14
In 2 years EGEE will
  • Establish production quality sustained Grid
    services
  • 3000 users from at least 5 disciplines
  • over 8,000 CPUs, 50 sites
  • over 5 Petabytes (10^15 bytes) of storage
  • Demonstrate a viable general process to bring
    other scientific communities on board
  • Propose a second phase in mid 2005 to take over
    from EGEE in early 2006

15
EGEE and LCG
  • EGEE builds on the work of LCG to establish a
    grid operations service
  • LCG (LHC Computing Grid) - Building and operating
    the LHC Grid
  • A collaboration between
  • The physicists and computing specialists from the
    LHC experiments
  • The projects in Europe and the US that have been
    developing Grid middleware
  • The regional and national computing centres that
    provide resources for LHC
  • The research networks

16
EGEE Activities
32 million euros of EU funding over 2 years, starting
1st April 2004
  • 48% service activities (Grid Operations, Support
    and Management, Network Resource Provision)
  • 24% middleware re-engineering (Quality
    Assurance, Security, Network Services
    Development)
  • 28% networking (Management, Dissemination and
    Outreach, User Training and Education,
    Application Identification and Support, Policy
    and International Cooperation)

Emphasis in EGEE is on operating a
production grid and supporting the end-users
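
As a rough worked example of the split on this slide, the sketch below reads the figures 48, 24 and 28 as percentage shares of the 32 million euro budget (an assumption, though they do sum to 100). The names `TOTAL_EUR`, `SHARES_PERCENT` and `split_budget` are made up for the illustration.

```python
# Total EU funding quoted on the slide, in euros.
TOTAL_EUR = 32_000_000

# Activity shares, assumed to be percentages of the total.
SHARES_PERCENT = {
    "service activities": 48,
    "middleware re-engineering": 24,
    "networking": 28,
}


def split_budget(total, shares_percent):
    """Return the amount per activity, checking that the shares sum to 100."""
    if sum(shares_percent.values()) != 100:
        raise ValueError("shares must add up to 100 percent")
    return {name: total * pct // 100 for name, pct in shares_percent.items()}


budget = split_budget(TOTAL_EUR, SHARES_PERCENT)
for activity, euros in budget.items():
    print(f"{activity}: {euros:,} EUR")
```

Under that reading, roughly 15.4 million euros go to service activities, 7.7 million to middleware re-engineering and 9.0 million to networking.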
17
  • EGEE - what is it and why is it needed?
  • Middleware: current and future
  • Operations: providing a stable service
  • Networking: enabling collaboration
  • Summary

18
gLite Approach
  • Exploit experience and components from existing
    projects
  • AliEn, VDT, EDG, LCG, and others
  • Design team works out architecture and design
  • Feedback and guidance from EGEE PTF,
    applications, Operations, LCG GAG & ARDA
  • Components are initially deployed on a prototype
    infrastructure
  • Small scale (CERN & Univ. Wisconsin)
  • Get user feedback on service semantics and
    interfaces
  • After internal integration and testing,
    components are delivered to grid operations group
    and deployed on the pre-production service

19
gLite
  • gLite - the new EGEE middleware
  • Service oriented - components that are
  • Loosely coupled (by messages)
  • Accessible across network; modular and
    self-contained; clean modes of failure
  • So can change implementation without changing
    interfaces
  • Can be developed in anticipation of new uses
  • and are based on standards. Opens EGEE to
  • New middleware (plethora of tools now available)
  • Heterogeneous resources (storage, computation)
  • Interact with other Grids (international,
    regional and national)
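
The point that a service-oriented component "can change implementation without changing interfaces" can be illustrated with a small sketch. The `StorageService` interface and both implementations below are hypothetical and are not gLite APIs; they only show a caller that depends on the interface alone.

```python
from typing import Protocol


class StorageService(Protocol):
    """Hypothetical service interface: the only thing callers depend on."""

    def put(self, name: str, data: bytes) -> None: ...

    def get(self, name: str) -> bytes: ...


class InMemoryStorage:
    """First implementation: keeps files in a plain dictionary."""

    def __init__(self) -> None:
        self._files: dict[str, bytes] = {}

    def put(self, name: str, data: bytes) -> None:
        self._files[name] = data

    def get(self, name: str) -> bytes:
        return self._files[name]


class ChunkedStorage:
    """Second implementation: stores data in fixed-size chunks internally."""

    def __init__(self, chunk_size: int = 4) -> None:
        self._chunk_size = chunk_size
        self._chunks: dict[str, list[bytes]] = {}

    def put(self, name: str, data: bytes) -> None:
        size = self._chunk_size
        self._chunks[name] = [data[i:i + size] for i in range(0, len(data), size)]

    def get(self, name: str) -> bytes:
        return b"".join(self._chunks[name])


def archive_result(storage: StorageService, job_id: str, output: bytes) -> bytes:
    """Client code written against the interface, not an implementation."""
    storage.put(f"{job_id}.out", output)
    return storage.get(f"{job_id}.out")
```

Either implementation can be passed to `archive_result` unchanged, which is the property the slide attributes to loosely coupled services.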

20
Future EGEE Middleware - gLite
  • Intended to replace LCG-2
  • Starts with existing components from AliEn, EDG,
    VDT etc.
  • Aims to address LCG-2 shortcomings and advanced
    needs from applications
  • Prototyping: short development cycles for fast
    user feedback
  • Initial web-services based prototypes being
    tested with representatives from the application
    groups

21
Architecture Guiding Principles
  • Lightweight (existing) services
  • Easily and quickly deployable
  • Use existing services where possible as basis for
    re-engineering
  • Interoperability
  • Allow for multiple implementations
  • Resilience and Fault Tolerance
  • Co-existence with deployed infrastructure
  • Reduce requirements on site components
  • Co-existence (and convergence) with LCG-2 and
    Grid3 are essential for the EGEE Grid service
  • Service oriented approach
  • Follow WSRF standardization
  • No mature WSRF implementations exist to date so
    start with plain WS (WS-I)
  • Provide framework to others so higher-level
    services can be developed quickly
  • Architecture
    https://edms.cern.ch/document/476451

22
  • EGEE - what is it and why is it needed?
  • Middleware: current and future
  • Operations: providing a stable service
  • Needs more than middleware
  • Organisational, operational infrastructure
  • Networking: enabling collaboration
  • Summary

23
User view of EGEE: a multi-VO Grid
User Interface
User Interface
Grid services
24
EGEE: adding a VO
  • EGEE has a formal procedure for adding selected
    new user communities (Virtual Organisations)
  • Negotiation with one of the Regional Operations
    Centres
  • Seek balance between the resources contributed by
    a VO and those that they consume.
  • Resource allocation will be made at the VO level.
  • Many resources need to be available to multiple
    VOs: shared use of resources is fundamental to a
    Grid

25
SA1 - Operations
  • Scale of the production service
  • April 2004: 2000 CPUs over 30 sites (LCG-1 ->
    LCG-2)
  • December 2004: 8000 CPUs over 80 sites
    (Migrated to Scientific Linux)
  • This is far beyond the project milestones!
  • Continuous improvements to LCG-2 middleware
  • Set-up of CIC/ROCs
  • Roles/responsibilities defined in execution plans
  • documented and implemented
  • On-going
  • Complete set-up of pre-production service
  • Deployment planning for gLite (EGEE1 M/W version)
  • Deploy accounting infrastructure

26
Running the Production Service
  • Grid deployment has entered a new phase
  • Basic middleware is working
  • responsible now for a small fraction of the
    problems
  • Outstanding performance/functionality issues
  • RLS, RB / little modularity, lack of consistent
    interfaces
  • some solutions are being developed but many
    cannot be addressed in current software/architecture
    - set priorities for new middleware (gLite)
  • Many operational issues
  • mis-configuration, out-of-date middleware, single
    points of failure, failover, management interfaces
  • resources unsuitable for applications' needs (e.g.
    insufficient disk space)
  • slow response by sites to problems (holiday
    periods, security concerns)
  • new middleware will not help for many of these
    issues - grid partners must think "Service"

The grid still does not appear as a single
coherent facility: applications must adapt to the
current service to gain maximum profit, but the
result has been very effective for LHCb - 3000
concurrent jobs
27
EGEE Operations (I): OMC and CIC
  • Operation Management Centre
  • located at CERN, coordinates operations and
    management
  • coordinates with other grid projects
  • Core Infrastructure Centres
  • behave as single organisations
  • operate core services (VO specific and general
    Grid services)
  • develop new management tools
  • provide support to the Regional Operations
    Centres

28
EGEE Operations: ROC
  • Regional Operations Centre responsibilities and
    roles
  • Testing (certification) of new middleware on a
    variety of platforms before deployment
  • Deployment of middleware releases: coordination and
    distribution inside the region
  • Integration of local VOs
  • Development of procedures and capabilities to
    operate the resources
  • First-line user support
  • Bring new resources into the infrastructure and
    support their operation
  • Coordination of integration of national grid
    infrastructures
  • Provide resources for pre-production service

29
Production grid service
Launched in September 2003 with 12 sites; now more
than 100 sites and continuing to grow
32
Grid projects
  • Many Grid development efforts all over the
    world
  • UK: e-Science Grid
  • Netherlands: VLAM, PolderGrid
  • Germany: UNICORE, Grid proposal
  • France: Grid funding approved
  • Italy: INFN Grid
  • Eire: Grid proposals
  • Switzerland: Network/Grid proposal
  • Hungary: DemoGrid, Grid proposal
  • Norway, Sweden: NorduGrid
  • NASA Information Power Grid
  • DOE Science Grid
  • NSF National Virtual Observatory
  • NSF GriPhyN
  • DOE Particle Physics Data Grid
  • NSF TeraGrid
  • DOE ASCI Grid
  • DOE Earth Systems Grid
  • DARPA CoABS Grid
  • NEESGrid
  • DOH BIRN
  • NSF iVDGL
  • EuroGrid (Unicore)
  • DataTag (CERN, ...)
  • DataGrid (CERN, ...)
  • Astrophysical Virtual Observatory
  • GRIP (Globus/Unicore)
  • GRIA (Industrial applications)
  • GridLab (Cactus Toolkit)
  • CrossGrid (Infrastructure Components)
  • EGSO (Solar Physics)

33
Authentication, Authorisation
  • Authentication
  • User obtains certificate from CA
  • Connects to UI by ssh
  • Downloads certificate
  • Invokes Proxy server
  • Single logon to UI - then Secure Socket Layer
    with proxy identifies user to other nodes

CA
Personal
VO mgr
VO service
  • Authorisation - currently
  • User joins Virtual Organisation
  • VO negotiates access to Grid nodes and resources
    (CE, SE)
  • Authorisation tested by CE, SE
  • gridmapfile maps user to local account

VO database
SSL (proxy)
Gridmapfiles On CE, SE nodes
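
The last authorisation step, where a gridmapfile maps a user to a local account, can be sketched as below. The sketch assumes the simple layout commonly used for Globus-style grid-mapfiles (a quoted certificate subject followed by one local account name per line); the distinguished names and account names in `SAMPLE` are invented, and multi-account entries are not handled.

```python
import shlex

# Invented sample content; real files live on the CE and SE nodes.
SAMPLE = '''# subject DN                      local account
"/C=UK/O=eScience/CN=Jane Doe" biomed001

"/C=CH/O=CERN/CN=John Smith" lhcb042
'''


def parse_gridmap(text):
    """Return a dict mapping certificate subject (DN) to local account."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the DN
        if len(parts) != 2:
            raise ValueError(f"unexpected grid-mapfile line: {line!r}")
        dn, account = parts
        mapping[dn] = account
    return mapping


def local_account(mapping, dn):
    """Return the local account for a DN, or None if the user is not mapped."""
    return mapping.get(dn)
```

A user whose subject is missing from the mapping has no local account on that node and so is refused there, which is how the CE or SE "tests" authorisation in this scheme.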
34
JRA3 - EGEE Authentication Scheme - EUGridPMA
  • Policy Management Authority: a club of trusted
    Certification Authority managers
    (www.eugridpma.org)
  • Green: CA accredited
  • Yellow: being discussed
  • Other Accredited CAs
  • DoEGrids (US)
  • GridCanada
  • ASCCG (Taiwan)
  • CERN
  • Russia (HEP)
  • FNAL Service CA (US)
  • Israel
  • Pakistan
  • Greece Hellasgrid CA (AUTH)

35
  • EGEE - what is it and why is it needed?
  • Middleware: current and future
  • Operations: providing a stable service
  • Networking: enabling collaboration
  • Current application communities
  • Summary

36
Bringing new applications to the grid
  • Outreach events inform people about the grid /
    EGEE
  • Application experts discuss specific
    characteristics with the users
  • Migrate application to EGEE infrastructure with
    the help of EGEE experts
  • Initial deployment for testing purposes
  • Production usage - user community contributes
    computing resources for heavy production
    demands - Canadian dinner party

37
NA3: User training and induction
  • NA3 has been involved in more than 130 training
    events across the world
  • (including the GGF and other grid schools)
  • 2000 people trained
  • induction, application developer, advanced,
    activity retreats
  • Material archive online with 1000 presentations
  • Strong links made with GILDA testbed and use of
    GENIUS portal
  • Regularly used as part of tutorials
  • Essential element of the virtuous cycle for new
    communities
  • Training is one of the first things new
    communities need
  • Process for handling feedback defined
  • Helping to improve material and organisation
  • Roadmap for future events planned
  • Open to new suggestions
  • Produced status report and updated training plan
    taking into account lessons learned
  • On-going
  • Plan for next EGEE M/W (gLite) training

38
EGEE User Support infrastructure
  • General approach: 3 main support centres to
    guarantee 24/7, 365-day support coverage and
    provide a single point of contact to customers
    and to local Grid operations.

To ensure 24x7 support, it was decided to have 3
GGUS teams in different time zones. GGUS started
off at Forschungszentrum Karlsruhe in Germany in
2003 and has had a partner group at Academia
Sinica in Taiwan since April 2004. A third
partner in North America will complete the 24-hour
cycle.
39
EGEE User Support infrastructure
  • The ROCs and VOs and the other project wide
    groups such as the Core Infrastructure Center
    (CIC), middleware groups (JRA), network groups
    (NA), service groups (SA) will be connected via a
    central integration platform provided by GGUS.
  • This central helpdesk keeps track of all service
    requests and assigns them to the appropriate
    support groups. In this way, formal communication
    between all support groups is possible. To enable
    this, each group has to build only one interface
    between its internal support structure and the
    central GGUS application.
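
The hub arrangement described on this slide, where every support group builds a single interface to the central helpdesk and the helpdesk tracks and assigns each request, can be sketched as follows. `CentralHelpdesk`, its methods and the group names are hypothetical and do not reflect the actual GGUS application.

```python
class CentralHelpdesk:
    """Toy central hub: one registered interface (callable) per support group."""

    def __init__(self):
        self._interfaces = {}  # group name -> callable handling a ticket
        self.log = []          # every request is tracked centrally

    def register(self, group, handler):
        """Each group plugs in exactly one interface to its internal system."""
        self._interfaces[group] = handler

    def submit(self, ticket_id, group, text):
        """Record the request and assign it to the appropriate group."""
        if group not in self._interfaces:
            raise KeyError(f"no support group registered as {group!r}")
        self.log.append((ticket_id, group))
        return self._interfaces[group](ticket_id, text)


def roc_handler(ticket_id, text):
    # Stand-in for a Regional Operations Centre's internal support system.
    return f"ROC accepted ticket {ticket_id}"


def middleware_handler(ticket_id, text):
    # Stand-in for a middleware (JRA) group's internal support system.
    return f"JRA accepted ticket {ticket_id}"


helpdesk = CentralHelpdesk()
helpdesk.register("ROC", roc_handler)
helpdesk.register("JRA", middleware_handler)
```

With n groups this needs n interfaces to the hub rather than one per pair of groups, which is the saving the slide points to.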

40
  • More about applications and communities in the
    next talk

41
  • EGEE - what is it and why is it needed?
  • Middleware: current and future
  • Operations: providing a stable service
  • Networking: enabling collaboration
  • Current application communities
  • Enabling new and effective use of EGEE
  • Summary

42
Who else can benefit from EGEE?
  • EGEE Generic Applications Advisory Panel
  • For new applications
  • EU projects: MammoGrid, Diligent, SEE-GRID
  • Expressions of interest: Planck/Gaia
    (astroparticle), SimDat (drug discovery)

43
Intellectual Property
  • The existing EGEE grid middleware (LCG-2) is
    distributed under an Open Source License
    developed by EU DataGrid
  • Derived from modified BSD - no restriction on
    usage (academic or commercial) beyond
    acknowledgement
  • Same approach for new middleware (gLite)
  • Application software maintains its own licensing
    scheme
  • Sites must obtain appropriate licenses before
    installation

44
Summary
  • EGEE is the first attempt to build a worldwide
    Grid infrastructure for data intensive
    applications from many scientific domains
  • A large-scale production grid service is already
    deployed and being used for HEP and BioMed
    applications with new applications being ported
  • Resources and user groups will rapidly expand
    during the project
  • A process is in place for migrating new
    applications to the EGEE infrastructure
  • A training programme has started with events
    already held
  • Prototype next generation middleware is being
    tested (gLite)
  • Plans for a follow-on project are being discussed

45
When will the grid disappear?
  • Two possibilities
  • 1. Grids will not fulfill their promise and fade
    into being a niche distributed computing domain
  • 2. Grids will become ubiquitous and easily usable,
    transparent to the user, and so disappear
  • Following the trajectory of other networked
    services