Experiences with using the EGEE grid infrastructure and lessons for the future

1
Experiences with using the EGEE grid
infrastructure and lessons for the future
Bob Jones (CERN), EGEE Project Director

2
Contents
  • EGEE in one slide
  • What EGEE does today (one more slide)
  • Our understanding of what CLARIN wants
  • Based on Peter Wittenburg's presentation at
    EGEE09 last week and the CLARIN short-guides
    (very useful!)
  • Centres, Trust Domain, Metadata, Virtual
    Collections, etc.
  • Mapped on to what exists today
  • I don't pretend that we have a turn-key solution
    for CLARIN; rather, these are examples of what
    is possible
  • How CLARIN could interface with EGI
  • Suggested Next Steps
  • Lots of material contributed by EGEE and WLCG
    colleagues

3
EGEE-III
Flagship Grid infrastructure project co-funded
by the European Commission
  • Main Objectives
  • Expand/optimise existing EGEE infrastructure,
    include more resources and user communities
  • Prepare migration from a project-based model to a
    sustainable federated infrastructure based on
    National Grid Initiatives

Duration: 2 years
Consortium: 140 organisations across 33 countries
EC co-funding: EUR 32 million
4
EGEE: What do we deliver?
  • Infrastructure operation - Sites distributed
    across many countries
  • Large quantity of CPUs and storage
  • Continuous monitoring of grid services and
    automated site configuration/management
  • Support multiple Virtual Organisations from
    diverse research disciplines
  • Middleware - Production-quality software
    distributed under a business-friendly open source
    licence
  • Implements a service-oriented architecture that
    virtualises resources
  • Adheres to recommendations on web service
    interoperability and is evolving towards emerging
    standards
  • User Support - Managed process from first contact
    through to production usage
  • Training
  • Expertise in grid-enabling applications
  • Online helpdesk
  • Dedicated support for specific disciplines
  • Networking events (User Forum, Conferences etc.)
    for cross-discipline interaction

5
CLARIN Centres
  • Centres classification
  • Recognized (R)
  • Metadata (C)
  • Service (B)
  • Infrastructure (A), roughly equivalent to EGEE
    Regional Operations Centres
  • External (E)
  • Need to monitor quality of services provided by
    centres
  • Need more details on the service definitions for
    each type
  • Probably need a Service Level Agreement for each
    type
  • Example from EGEE: the EGEE Service Level
    Agreement between Regional Operations Centres
    and Sites
  • EGEE/EGI has an extensible monitoring
    infrastructure
  • Based on Nagios, a widely used and extensible
    open source monitoring toolkit
  • See the Service Availability Monitoring in EGEE
    and Beyond video demo @ EGEE09 on YouTube

6
  • WLCG depends on two major science grid
    infrastructures:
  • EGEE - Enabling Grids for E-Science
  • OSG - US Open Science Grid

Interoperability and interoperation are vital:
significant effort has gone into building the
procedures to support them
7
An example: WLCG (Tier 0, Tier 1, Tier 2)
  • Tier-0 (CERN)
  • Data recording
  • Initial data reconstruction
  • Data distribution
  • Tier-1 (11 centres)
  • Permanent storage
  • Re-processing
  • Analysis
  • Tier-2 (130 centres)
  • Simulation
  • End-user analysis

The WLCG MoU: http://lcg.web.cern.ch/lcg/mou.htm
8
Monitoring Centres
http://gstat-dev/gstat/summary/grid/WLCG/
9
Trust Domain
  • The choices made by CLARIN appear to be very
    sensible
  • Not exactly the same as EGEE/EGI but
    interoperation is possible
  • A pilot project between BiG Grid, SURFnet and MPI
    has already built an integrated online SLCS
    Certificate Authority service, with the IMDI
    browser (a linguistic corpus access browser) as
    an example use case
  • Talk to AAI community
  • GEANT and IGTF/EUGridPMA have a lot of useful
    experience
  • Europe should avoid separate sets of CAs
  • joint-security-policy-group@cern.ch

https://www.eugridpma.org/members/worldmap/
10
Centre availability/reliability reporting
See the VO Specific Service Monitor using Service
Level Status video demo @ EGEE09 on YouTube
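As a rough illustration of what such a report computes, the sketch below derives availability and reliability figures for one centre from a series of probe results. The status labels and the two formulas are assumptions (they follow definitions commonly used for grid site reporting, not anything stated on the slide) and should be checked against the applicable Service Level Agreement.

```python
from collections import Counter

# Hypothetical status labels for periodic probe results; not taken from the slides.
UP, DOWN, SCHEDULED, UNKNOWN = "up", "down", "scheduled_downtime", "unknown"


def availability_and_reliability(samples):
    """Return (availability, reliability) as fractions in [0, 1].

    Assumed definitions (verify against the actual SLA):
      availability = up / (total - unknown)
      reliability  = up / (total - scheduled_downtime - unknown)
    """
    counts = Counter(samples)
    known = len(samples) - counts[UNKNOWN]
    expected_up = known - counts[SCHEDULED]
    availability = counts[UP] / known if known else 0.0
    reliability = counts[UP] / expected_up if expected_up else 0.0
    return availability, reliability


# One day of hourly probes for an imaginary centre.
samples = [UP] * 18 + [DOWN] * 2 + [SCHEDULED] * 3 + [UNKNOWN] * 1
availability, reliability = availability_and_reliability(samples)
print(f"availability={availability:.3f} reliability={reliability:.3f}")
```

Scheduled downtime lowers availability but not reliability, which is why the two numbers are usually reported side by side.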
11
Component Metadata
  • AMGA: the ARDA Metadata Grid Application
  • Metadata catalogue of EGEE's gLite middleware
  • Millions of files, 6000 users, 200 computing
    centres
  • Mainly (read-only) file metadata
  • Main concerns: scalability, performance,
    fault-tolerance, support for hierarchical
    collections, security
  • Replicates metadata between different AMGA
    instances, allowing the federation of metadata
  • Supports different authentication methods via
    (Grid-Proxy-) Certificates, as well as very
    flexible access control mechanisms for
    individual data items based on ACLs
  • Does not yet support Persistent Identifiers
  • AMGA uses grid file LFNs (Logical File Names), as
    does the rest of gLite
  • Would require some development
  • http://amga.web.cern.ch/amga/
  • support-amga@cern.ch
  • AMGA 2.0 presentation at EGEE09

Same campus as KAIST (possible ISOcat mirror for
CLARIN)
12
Workflows
  • Many workflow managers supported
  • WMS (part of gLite)
  • GridWay (part of RESPECT)
  • Kepler, Taverna etc.
  • Example - WISDOM

13
Virtual Collections
  • VOMS: Virtual Organization Membership Service
  • VOMS is a system for managing authorization data
    within multi-institutional collaborations. VOMS
    provides a database of user roles and
    capabilities and a set of tools for accessing and
    manipulating the database and using the database
    contents to generate Grid credentials for users
    when needed
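
To make the idea of roles attached to a credential more concrete, here is a minimal sketch in plain Python. It assumes attributes are written in the usual FQAN style (`/vo/group/Role=.../Capability=...`); the VO name `clarin`, the group `corpora`, the role `curator` and the `may_access` helper are all hypothetical, and no real VOMS client library is used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Fqan:
    vo: str
    groups: tuple
    role: Optional[str]


def parse_fqan(text):
    """Parse '/vo/group/.../Role=x/Capability=NULL' into its parts."""
    if not text.startswith("/"):
        raise ValueError(f"FQAN must start with '/': {text!r}")
    vo, groups, role = None, [], None
    for part in text.strip("/").split("/"):
        if part.startswith("Role="):
            value = part[len("Role="):]
            role = None if value == "NULL" else value
        elif part.startswith("Capability="):
            continue  # capabilities are ignored in this sketch
        elif vo is None:
            vo = part  # the first plain component names the VO
        else:
            groups.append(part)
    if not vo:
        raise ValueError(f"FQAN has no VO name: {text!r}")
    return Fqan(vo=vo, groups=tuple(groups), role=role)


def may_access(fqan, collection_acl):
    """collection_acl: set of (vo, role) pairs allowed to read a collection."""
    return (fqan.vo, fqan.role) in collection_acl


# Hypothetical ACL for one virtual collection.
acl = {("clarin", "curator")}
user = parse_fqan("/clarin/corpora/Role=curator/Capability=NULL")
print(user, may_access(user, acl))
```

A virtual collection could then be protected by an ACL keyed on VO and role rather than on individual user identities.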

gCube (http://www.gcube-system.org/) offers a
full-featured platform for distributed hosting,
management and retrieval of data and information.
See the EGEE09 demo on YouTube: A Virtual Research
Environment for Species Distribution Map
Generation and Management
14
Goal: Long-term sustainability of grid
infrastructures in Europe
Approach: Establish a federated model bringing
together National Grid Infrastructures (NGIs) to
build the European Grid Infrastructure (EGI)
EGI Organisation: Coordination and operation of a
common multi-national, multi-disciplinary Grid
infrastructure
  • To enable and support international Grid-based
    collaboration
  • To provide support and added value to NGIs
  • To liaise with corresponding infrastructures
    outside Europe
15
CLARIN and EGI
  • The creation of National Grid Infrastructures and
    their overall coordination can provide an ICT
    context for the research infrastructures
  • An operational framework for centres involved in
    CLARIN
  • In the EGI context, Specialised Support Centres
    (SSCs) are the means of interaction with user
    communities
  • The EGI SSCs are established and governed by the
    user communities
  • Humanities SSC foreseen in ROSCOE project
    proposal

16
ESFRI @ EGEE09
Cherenkov Telescope Array
17
How to be future proof
  • Consider ALL (production grids, supercomputers,
    commercial cloud systems, volunteer grids,
    network etc.) as a combined e-Infrastructure
    ecosystem
  • Aim for interoperability and combine the
    resources into a consistent whole
  • Work closely with EGEE/EGI, DEISA/PRACE and
    GEANT: they are ready to help, and they have
    links around the world
  • Keep the applications agile
  • Don't make the code so specialised that it can
    only use one specific installation: things will
    change!
  • Make it easy for the users
  • Consider a community gateway/portal
  • Simplify authorisation/authentication
  • Easy access to common codes (handle license
    issues)
  • Relevant tutorials and documentation

18
Grids, clouds, supercomputers, etc.
  • Grids
  • Collaborative environment
  • Distributed resources (political/sociological)
  • Commodity hardware (also supercomputers)
  • (HEP) data management
  • Complex interfaces (a bug, not a feature)
  • Supercomputers
  • Expensive
  • Low latency interconnects
  • Applications peer reviewed
  • Parallel/coupled applications
  • Traditional interfaces (login)
  • Also SC grids (DEISA, TeraGrid)

Many different problems, amenable to different
solutions: no single right answer
  • Clouds
  • Proprietary (implementation)
  • Economies of scale in management
  • Commodity hardware
  • Virtualisation for service provision and
    encapsulating application environment
  • Details of physical resources hidden
  • Simple interfaces (too simple?)
  • Volunteer computing
  • Simple mechanism to access millions of CPUs
  • Difficult if (much) data involved
  • Control of environment ? check
  • Community building: people involved in science
  • Potential for huge amounts of real work

19
European E-Infrastructure Forum
  • Forum for the discussion of principles and
    practices to create synergies for distributed
    Infrastructures
  • Goal: seamless interoperation of leading
    e-Infrastructures serving the European Research
    Area
  • Focus: needs of the user communities that require
    services which can only be achieved by
    collaborating Infrastructures
  • Initial membership
  • EGEE/EGI
  • DEISA/PRACE
  • TERENA/GEANT
  • Offers a way of interacting as a whole with user
    communities of a multi-national nature that are
    interested in making use of the Infrastructures

20
Proposed next steps (1)
  • Identify clear contact points between the ESFRI
    projects and e-Infrastructures
  • E-Infrastructure projects have been talking to
    individuals (users or partners)
  • Can we make contacts more official and identify
    contact points in specific areas
  • Security
  • Data management
  • Network
  • Etc.
  • These will be useful for establishing links
    between different ESFRI projects, between ESFRI
    projects and e-Infrastructures etc.

21
Proposed next steps (2)
  • Use these contacts to build a matrix of technical
    requirements and organisational aspects

requirement CLARIN DARIAH/CESSDA EISCAT3D EPOS LIFEWATCH ELIXIR XFEL CTA FAIR SKA
Single sign-on
Persistent storage
Global
workflows
Virt Org
stds
DRAFT
22
Proposed next steps (3)
  • Once the matrix has been built it can be used to
    focus
  • Collaboration between ESFRI projects
  • Collaboration between ESFRI projects and
    e-Infrastructures
  • Provide input to roadmaps for e-Infrastructures
    of the future
  • Provide input to national funding agencies and
    European Commission on their future funding
    programmes
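
One way such a matrix could be put to work is sketched below. The project names (`ProjectA` and so on) and every entry are placeholders invented for illustration, since the draft matrix on the previous slide has no cell values; only the requirement names "Single sign-on" and "Persistent storage" come from it.

```python
from itertools import combinations

# Placeholder matrix: requirement -> projects that have it.
# All entries are invented; the source matrix is an unfilled draft.
requirements = {
    "Single sign-on": {"ProjectA", "ProjectB", "ProjectC"},
    "Persistent storage": {"ProjectA", "ProjectC"},
    "Workflows": {"ProjectB"},
}


def shared_requirements(matrix):
    """Map each pair of projects to the requirements they have in common."""
    pairs = {}
    for requirement, projects in matrix.items():
        for pair in combinations(sorted(projects), 2):
            pairs.setdefault(pair, []).append(requirement)
    return pairs


def most_demanded(matrix):
    """Order requirements by how many projects need them (ties by name)."""
    return sorted(matrix, key=lambda r: (-len(matrix[r]), r))


shared = shared_requirements(requirements)
ranking = most_demanded(requirements)
for pair, common in sorted(shared.items()):
    print(pair, common)
print(ranking)
```

Pairs with many requirements in common suggest collaboration between projects, while the ranking hints at which services an e-Infrastructure roadmap might address first.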

23
Why this work is important
  • If the recent trends continue, in 2025, the
    United States and Europe will have lost their
    scientific and technological supremacy for the
    benefit of Asia (China and India will have caught
    up with or even overtaken the Triad)

http://ec.europa.eu/research/social-sciences/pdf/the-world-in-2025-report_en.pdf
24
Summary
  • The key added value of grid infrastructures is a
    framework for collaboration
  • Global secure access to computing resources,
    data, software and results
  • CPU power for computing-intensive tasks
  • Data management capabilities
  • Metadata and annotation
  • Security
  • Replication
  • High-speed data transfers
  • Facilitate creation of distributed data
    repositories, data mining, indexing and search
  • Software services
  • Availability of open source software
  • Integration with commercial software packages
  • Scalable and dynamic architecture which can be
    extended with additional services as required
  • All organisations can participate AND contribute
  • The EGI operational model and SSCs are a
    candidate mechanism for CLARIN to interact with
    EGI