EGEE and gLite - PowerPoint PPT Presentation

1
EGEE and gLite
  • Kyung-Lang Park
  • (2005. 9. 6)

2
Definitions
  • LCG project: LHC Computing Grid Project
  • LHC: Large Hadron Collider at CERN
  • EGEE project: Enabling Grids for E-Science in
    Europe
  • LCG-2: EGEE middleware based on GT2
  • gLite: EGEE middleware based on web services (WS)
  • GILDA: training infrastructure

3
Contents
  • EGEE Overview
  • gLite Overview
  • GILDA Practicals
  • Conclusion

4
EGEE Overview
5
Motivation
  • Science is becoming increasingly digital and
    needs to deal with increasing amounts of data
  • Particle Physics
  • Large amount of data produced
  • Large worldwide organized collaborations
  • Large Hadron Collider (LHC) at CERN
  • 40 million collisions per second
  • 100,000 of today's fastest PC processors
  • The solution: the Grid
  • HEP: LHC Computing Grid project (LCG)
  • Close integration of LCG and EGEE projects

6
The largest e-Infrastructure: EGEE
  • Objectives
  • consistent, robust and secure service grid
    infrastructure
  • improving and maintaining the middleware
  • attracting new resources and users from industry
    as well as science
  • Structure
  • 71 leading institutions in 27 countries,
    federated in regional Grids
  • leveraging national and regional grid activities
    worldwide
  • funded by the EU with 32 million euros for the first 2
    years, starting 1 April 2004

7
Service Usage
  • VOs and users on the production service
  • Active VOs
  • HEP: 4 LHC experiments, D0, CDF, Zeus, BaBar
  • Biomed
  • ESR (Earth Sciences)
  • Computational chemistry
  • Magic (Astronomy)
  • EGEODE (Geo-Physics)
  • Registered users in these VOs: 600
  • Many local VOs, supported by their ROCs
  • Scale of work performed
  • LHC Data challenges 2004
  • > 1 M SI2K years of CPU time (1000 CPU years)
  • 400 TB of data generated, moved and stored
  • 1 VO achieved 4000 simultaneous jobs (4 times
    CERN grid capacity)

[Figure: number of jobs processed per month (April 2004 to April 2005)]
8
EGEE infrastructure usage
  • Average job duration, January 2005 to June 2005,
    for the main VOs

9
EGEE pilot applications
  • High-Energy Physics (HEP)
  • Provides computing infrastructure (LCG)
  • Challenging
  • thousands of processors world-wide
  • generating petabytes of data
  • chaotic use of grid with individual user
    analysis (thousands of users interactively
    operating within experiment VOs)
  • Biomedical Applications
  • Similar computing and data storage requirements
  • Major additional challenge
  • security and privacy

10
EGEE pilot applications
  • Bioinformatics
  • BioMed
  • GPS@
  • xmipp_Mlrefine
  • Drug Discovery
  • Medical Imaging
  • Generic applications
  • Earth sciences applications

11
gLite Overview
12
Grid middleware
  • The Grid relies on advanced software, called
    middleware, which interfaces between resources
    and the applications
  • The GRID middleware
  • Finds convenient places for the application to
    be run
  • Optimises use of resources
  • Organises efficient access to data
  • Deals with authentication to the different sites
    that are used
  • Runs the job and monitors progress
  • Recovers from problems
  • Transfers the result back to the scientist

13
EGEE Middleware gLite
  • First release of gLite: end of March 2005
  • Focus on providing users early access to
    prototype
  • Release 1.3 in August 2005
  • Intended to replace present middleware with
    production quality services
  • Aims to address present shortcomings and advanced
    needs from applications
  • Developed from existing components
  • Interoperability and co-existence with deployed
    infrastructure
  • Robust: performance and fault tolerance
  • Open source license
  • Prototyping: short development cycles for fast
    user feedback
  • Initial web-services based prototypes being tested

[Figure: middleware evolution from LCG-1 and LCG-2 (Globus 2 based) to gLite-1 and gLite-2 (web services based)]
14
gLite and computation
  • Jobs are
  • (as in LCG) run from batch queues, termed
    computing elements (CEs)
  • Described in Job Description Language
  • gLite also supports
  • Interactive jobs
  • Jobs run in batch mode; a listener receives
    messages from the CE
  • Parallelism using MPI
  • MPI jobs can run on CEs that support MPI, but not
    across administrative domains (not MPICH-G)
  • Workflow (DAGs, from Condor)
  • Checkpointing
  • Partitioned jobs (soon), e.g. Monte Carlo

15
gLite and data
  • Simple data
  • Files
  • Requires
  • Replica files
  • Move data to computation
  • Virtual filesystems
  • Metadata for files
  • File transfer
  • These services are amongst those provided in gLite
  • Structured data
  • RDBMS, XML databases
  • Require extendable middleware tools to support
  • computation near to data
  • easy access, controlled by AA
  • integration and federation
  • Hence OGSA-DAI (DAI: Data Access and Integration)
  • OGSA-DAI is NOT currently being ported to gLite

16
EGEE middlewares face to face
  • LCG
  • Security
  • GSI
  • Job Management
  • Condor + Globus
  • CE, WN
  • Logging & Bookkeeping
  • Data Management
  • LCG services
  • Information & Monitoring
  • BDII (evolution of MDS)
  • Grid Access
  • CLI + API
  • Operating system
  • gLite
  • Security
  • GSI and VOMS
  • Job Management
  • Condor + Globus + blahp
  • CE, WN
  • Logging & Bookkeeping
  • Job Provenance
  • Package management
  • Data Management
  • LFC
  • gLite-I/O + FiReMan
  • Information & Monitoring
  • BDII
  • R-GMA + Service Discovery
  • Grid Access
  • CLI + API + Web Services
  • Easier installation / configuration
  • Currently Scientific LINUX, will be available on
    others, incl. Windows

17
gLite components overview
[Figure: gLite service groups; parts of the figure are marked "now" and "near future"]
  • Access Services: Grid Access Service, API, CLI
  • Security Services: Authentication, Authorization, Auditing, Dynamic Connectivity
  • Information & Monitoring Services: Information & Monitoring, Job Monitoring, Service Monitoring, Service Discovery
  • Data Services: Metadata Catalog, File & Replica Catalog, Storage Element, Data Movement
  • Job Management Services: Job Provenance, Package Manager, Accounting, Computing Element, Workload Management
  • Site Proxy
18
Overview of gLite JMS
  • Job Management Services
  • main services related to job management/execution
    are
  • computing element
  • job management (job submission, job control,
    etc.), but it must also provide
    information about its characteristics and status
  • workload management
  • core component, discussed in detail
  • Accounting
  • special case as it will eventually take into
    account
  • computing, storage and network resources
  • job provenance
  • keep track of the definition of submitted jobs,
    execution conditions and environment, and
    important points of the job life cycle for a long
    period
  • debugging, post-mortem analysis, comparison of
    job execution
  • package manager
  • automates the process of installing, upgrading,
    configuring, and removing software packages from
    a shared area on a grid site.
  • extension of a traditional package management
    system to a Grid

19
Architecture Overview
[Figure: architecture overview; labels: Resource Broker Node (Workload Manager, WM), Job status, Storage Element]
20
WMS Architecture
[Figure: WMS architecture with numbered steps 1 to 5]
21
Job State Machine
22
gLite deployment scenario
23
GILDA Practicals

24
The GILDA t-Infrastructure
  • Why t-infrastructure?
  • e-Infrastructure for production
  • t-Infrastructure for training
  • Need guaranteed response for tutorials; limit the
    vulnerability of production systems
  • use training grid
  • have training CA
  • able to change middleware to prepare participants
    for future releases on production system
  • Also
  • need safe resources for installation training
  • easy entry point for new communities

25
The GILDA project (https://gilda.ct.infn.it)
26
The GILDA Test-bed (https://gilda.ct.infn.it/testbed.html)
15 sites on 3 continents!
27
The GILDA Services (https://gilda.ct.infn.it/testbed.html)
Ready for gLite!
28
WMS layout in GILDA
[Figure labels: RB (LCG); GILDA sites]
29
GRID Security: the players
30
Digital certificates
  • Authentication and authorization of
    users and resources are done through digital
    certificates in X.509 format
  • Certification Authority (CA)
  • Issue Digital Certificates for users and machines
  • Check the identity and the personal data of the
    requestor
  • Registration Authorities (RAs) do the actual
    validation
  • CAs periodically publish a list of compromised
    certificates
  • Certificate Revocation Lists (CRL) contain all
    the revoked certificates yet to expire
  • CA certificates are self-signed
  • For each player, a CA guarantees its authenticity
    with a certificate
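
The checks listed above (trusted CA, validity period, revocation list) can be sketched in Python. This is only an illustration under stated assumptions, not how GSI is implemented: signature verification is left out entirely, and the dictionary layout, the function name `certificate_acceptable`, and the sample values are made up for the example.

```python
from datetime import datetime

def certificate_acceptable(cert, trusted_issuers, revoked_serials, now):
    """Apply three checks: trusted CA, validity window, not on the CRL.

    Signature verification is deliberately left out of this sketch.
    """
    if cert["issuer"] not in trusted_issuers:
        return False
    if not (cert["not_before"] <= now <= cert["not_after"]):
        return False
    return cert["serial"] not in revoked_serials

# Hypothetical certificate record (values chosen for illustration only).
cert = {
    "issuer": "GILDA Certification Authority",
    "serial": 1783,
    "not_before": datetime(2005, 6, 30, 7, 14, 13),
    "not_after": datetime(2005, 7, 30, 7, 14, 13),
}
trusted = {"GILDA Certification Authority"}
check_time = datetime(2005, 7, 10)

accepted = certificate_acceptable(cert, trusted, set(), check_time)
rejected_by_crl = certificate_acceptable(cert, trusted, {1783}, check_time)
```

A CRL only needs to list revoked certificates that have not yet expired, because an expired certificate already fails the validity check.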

31
Certificate Use
  • Digital credentials consist of a public certificate
    and a private key
  • The certificate (public key) is distributed over the net, while the
    private key stays encrypted on the disk
  • Default location for the certificate and private key is
    $HOME/.globus (pay attention to file permissions)
  • ls -l $HOME/.globus
  • -rw-r--r-- 1 local local 1143 Jun 30
    16:01 usercert.pem
  • -r-------- 1 local local 963 Jun 30
    16:01 userkey.pem
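
The warning about file permissions can be turned into a small check. The sketch below assumes a POSIX system; the helper name `key_permissions_ok` is hypothetical, and a temporary file stands in for the real userkey.pem.

```python
import os
import stat
import tempfile

def key_permissions_ok(path):
    """Return True if group and others have no access to the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0

# Demonstration on a temporary stand-in for userkey.pem.
with tempfile.TemporaryDirectory() as globus_dir:
    key_path = os.path.join(globus_dir, "userkey.pem")
    with open(key_path, "w") as handle:
        handle.write("dummy key material\n")
    os.chmod(key_path, 0o644)   # readable by everyone: too permissive
    loose = key_permissions_ok(key_path)
    os.chmod(key_path, 0o400)   # owner read-only, as in the listing
    strict = key_permissions_ok(key_path)
```

The certificate file can stay world-readable because it holds only public information; the private key is the file that must be protected.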

32
Verify your certificate
  • To get information on your certificate, run
  • > openssl x509 -in .globus/usercert.pem -noout
    -text
  • Certificate:
  • Data:
  • Version: 3 (0x2)
  • Serial Number: 1783 (0x6f7)
  • Signature Algorithm: md5WithRSAEncryption
  • Issuer: C=IT, O=GILDA, CN=GILDA
    Certification Authority
  • Validity
  • Not Before: Jun 30 07:14:13 2005 GMT
  • Not After : Jul 30 07:14:13 2005 GMT
  • Subject: C=IT, O=GILDA, OU=Personal
    Certificate, L=SEOUL, CN=SEOUL20/Email=roberto.barbera@ct.infn.it
  • ......

33
X.509 proxy certificates
  • GSI extension to X.509 Identity Certificates
  • signed by the normal end entity cert (or by
    another proxy)
  • Support some important features
  • Delegation and Mutual authentication
  • Has a limited lifetime (minimized risk of
    compromised credentials)
  • It is created by the grid-proxy-init command
  • > grid-proxy-init
  • Your identity: /C=IT/O=GILDA/OU=Personal
    Certificate/L=SEOUL/CN=SEOUL20/Email=roberto.barbera@ct.infn.it
  • Enter GRID pass phrase for this identity:
  • Creating proxy ...................................
    .............................. Done
  • Your proxy is valid until: Mon Jul 18 07:14:28
    2005

Grid Pass Phrase: SEOUL
34
Inspecting your proxy
  • With grid-proxy-info you can inspect
    your proxy
  • > grid-proxy-info -all
  • subject : /C=IT/O=GILDA/OU=Personal
    Certificate/L=SEOUL/CN=SEOUL20/Email=roberto.barbera@ct.infn.it/CN=proxy
  • issuer : /C=IT/O=GILDA/OU=Personal
    Certificate/L=SEOUL/CN=SEOUL20/Email=roberto.barbera@ct.infn.it
  • identity : /C=IT/O=GILDA/OU=Personal
    Certificate/L=SEOUL/CN=SEOUL20/Email=roberto.barbera@ct.infn.it
  • type : full legacy globus proxy
  • strength : 512 bits
  • path : /tmp/x509up_u500
  • timeleft : 11:57:24
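
A script that wraps grid-proxy-info might want to decide whether the proxy is about to expire. The sketch below assumes the timeleft field is printed as hours:minutes:seconds (for example 11:57:24); the function names and the one-hour threshold are illustrative choices, not part of any grid tool.

```python
def timeleft_seconds(timeleft):
    """Convert an 'H:MM:SS' timeleft string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in timeleft.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def needs_renewal(timeleft, min_hours=1):
    """Return True if less than `min_hours` of proxy lifetime remains."""
    return timeleft_seconds(timeleft) < min_hours * 3600

remaining = timeleft_seconds("11:57:24")
renew_now = needs_renewal("11:57:24")
renew_soon = needs_renewal("0:30:00")
```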

35
Long term proxy
  • Proxy has limited lifetime (default is 12 h)
  • Bad idea to have longer proxy
  • However, a grid task might need to use a proxy
    for a much longer time
  • Grid jobs in HEP Data Challenges last up to 2
    days
  • myproxy server
  • Allows you to create and store a long-term proxy
    certificate
  • -s <host_name>: specifies the hostname of the MyProxy
    server
  • -l <user>: defines the user that will own the remote
    credentials
  • myproxy-init -s <host_name> -l <user>
  • myproxy-info -s <host_name> -l <user>
  • Get information about the stored long-lived proxy
  • myproxy-get-delegation -s <host_name> -l <user>
  • Get a new proxy from the MyProxy server
  • myproxy-destroy -l <user> -s <host_name>
  • Destroy the credential on the server
  • Check out the myproxy-xxx --help option
  • A dedicated service on the RB can automatically
    renew the proxy
  • contacts the myproxy server

36
Store credentials on MyProxy Server
  • > grid-proxy-destroy (remove local credentials)
  • > myproxy-init -s grid001.ct.infn.it -l
    <UniqueUsername>
  • Your identity: /C=IT/O=GILDA/OU=Personal
    Certificate/L=SEOUL/CN=<UniqueUsername>/Email=roberto.barbera@ct.infn.it
  • Enter GRID pass phrase for this identity:
  • Creating proxy ....................... Done
  • Proxy Verify OK
  • Your proxy is valid until: Sun Jul 24 18:53:44
    2005
  • Enter MyProxy pass phrase:
  • Verifying password - Enter MyProxy pass phrase:
  • A proxy valid for 168 hours (7.0 days) for user
    <UniqueUsername> now exists on grid001.ct.infn.it.
  • Now your credentials are stored on the MyProxy
    server and are available
    for delegation or renewal by the RB.
  • ATTENTION! <UniqueUsername> MUST BE your PERSONAL
    username

37
Get delegation
  • > myproxy-get-delegation -s grid001.ct.infn.it -l
    <UniqueUser>
  • Enter MyProxy pass phrase:
  • A proxy has been received for user <UniqueUser>
    in /tmp/x509up_u500
  • > grid-proxy-info -all
  • subject : /C=IT/O=GILDA/OU=Personal
    Certificate/L=SEOUL/CN=<UniqueUser>/Email=roberto.barbera@ct.infn.it/CN=proxy/CN=proxy/CN=proxy
  • issuer : /C=IT/O=GILDA/OU=Personal
    Certificate/L=SEOUL/CN=<UniqueUser>/Email=roberto.barbera@ct.infn.it/CN=proxy/CN=proxy
  • identity : /C=IT/O=GILDA/OU=Personal
    Certificate/L=SEOUL/CN=<UniqueUser>/Email=roberto.barbera@ct.infn.it
  • type : full legacy globus proxy
  • strength : 512 bits
  • path : /tmp/x509up_u500
  • timeleft : 11:56:58

38
Workload Management System
  • The user interacts with Grid via a Workload
    Management System (WMS)
  • The Goal of WMS is the distributed scheduling
    and resource management in a Grid environment.
  • What does it allow Grid users to do?
  • To submit their jobs
  • To execute them on the best resources
  • The WMS tries to optimize the usage of resources
  • To get information about their status
  • To retrieve their output

39
JDL
  • Information to be specified when a job has to be
    submitted
  • Job characteristics
  • Job requirements and preferences on the computing
    resources
  • Also including software dependencies
  • Job data requirements
  • Information specified using a Job Description
    Language (JDL)
  • Based upon Condor's CLASSified ADvertisement
    language (ClassAd)
  • Fully extensible language
  • A ClassAd
  • Constructed with the classad construction
    operator []
  • It is a sequence of attributes separated by
    semicolons (;).
  • So the JDL allows the definition of a set of
    attributes that the WMS takes into account when making
    its scheduling decision

40
Job Preparation
  • An attribute is a pair (key, value), where value
    can be a Boolean, an Integer, a list of strings,
    ....
  • <attribute> = <value>;
  • For literal string values
  • if a string itself contains double quotes, they
    must be escaped with a backslash
  • Arguments = " \"Hello\" 10";
  • the character ' cannot be specified in the JDL
  • special characters such as &, |, >, < are only
    allowed
  • if specified inside a quoted string
  • if preceded by a triple \
  • Arguments = "-f file1\\\&file2";
  • Comments must be preceded by a sharp character
    (#) or have to follow the C syntax
  • The JDL is sensitive to blank characters and tabs
  • they should not follow the semicolon (;) at the
    end of a line
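
The quoting rules above can be illustrated with a small generator that writes attribute/value pairs in JDL form. This is a sketch under assumptions: the helper names `jdl_value` and `to_jdl` are hypothetical, only booleans, integers, strings, and lists of strings are handled, and expression-valued attributes such as Requirements are out of scope.

```python
def jdl_value(value):
    """Format a Python value as a JDL literal."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return str(value)
    if isinstance(value, (list, tuple)):
        return "{" + ",".join(jdl_value(item) for item in value) + "}"
    # Strings: escape embedded double quotes with a backslash.
    escaped = str(value).replace('"', '\\"')
    return '"' + escaped + '"'

def to_jdl(attributes):
    """Render a dict as JDL, one attribute per line.

    Each line ends with the semicolon and nothing after it, since
    trailing blanks or tabs after the semicolon should be avoided.
    """
    return "\n".join(
        f"{key} = {jdl_value(val)};" for key, val in attributes.items()
    )

jdl_text = to_jdl({
    "Executable": "/bin/bash",
    "Arguments": ' "Hello" 10',
    "OutputSandbox": ["std.err", "std.out"],
    "RetryCount": 7,
})
print(jdl_text)
```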

41
Job Description Language
  • The supported attributes are grouped in two
    categories
  • Job Attributes
  • Define the job itself
  • Resources
  • Taken into account by the RB for carrying out the
    matchmaking algorithm (to choose the best
    resource where to submit the job)
  • Computing Resource
  • Used to build expressions of Requirements and/or
    Rank attributes by the user
  • Have to be prefixed with "other."
  • Data and Storage resources (see the talk "Job Services
    With Data Requirements")
  • Input data to process, SE where to store output
    data, protocols spoken by application when
    accessing SEs
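
The matchmaking idea (filter resources by Requirements, then order them by Rank) can be sketched as follows. This is a toy model and not the Resource Broker's algorithm: the function `matchmake`, the host names, and the attribute values are invented for the example, and it assumes that a higher Rank value means a more preferred resource.

```python
def matchmake(resources, requirements, rank):
    """Keep resources that satisfy `requirements`, best rank first."""
    suitable = [res for res in resources if requirements(res)]
    return sorted(suitable, key=rank, reverse=True)

# Hypothetical computing elements described by GLUE-style attributes.
computing_elements = [
    {"GlueCEUniqueID": "ce1.example.org:2119/jobmanager-pbs-short",
     "GlueCEStateStatus": "Production", "GlueCEStateFreeCPUs": 2},
    {"GlueCEUniqueID": "ce2.example.org:2119/jobmanager-pbs-long",
     "GlueCEStateStatus": "Production", "GlueCEStateFreeCPUs": 10},
    {"GlueCEUniqueID": "ce3.example.org:2119/jobmanager-pbs-long",
     "GlueCEStateStatus": "Closed", "GlueCEStateFreeCPUs": 50},
]

# `other` plays the role of the "other." prefix used in JDL expressions.
ranked = matchmake(
    computing_elements,
    requirements=lambda other: other["GlueCEStateStatus"] == "Production",
    rank=lambda other: other["GlueCEStateFreeCPUs"],
)
best = ranked[0]["GlueCEUniqueID"]
```

The closed CE is excluded even though it has the most free CPUs, because Requirements is applied before Rank.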

42
JDL Relevant Attributes
  • JobType: Normal (simple, sequential job),
    Interactive, MPICH, Checkpointable, or a
    combination of them
  • Executable (mandatory): the command name
  • Arguments (optional): job command-line arguments
  • StdInput, StdOutput, StdError (optional): standard
    input/output/error of the job
  • InputSandbox (optional): list of files on the UI
    local disk needed by the job for running; the
    listed files are automatically staged to the
    remote resource
  • OutputSandbox (optional): list of files, generated
    by the job, which have to be retrieved
  • VirtualOrganisation (optional): a different way to
    specify the VO of the user
43
Job Submission
  • glite-job-submit performs the job submission to
    the WMS
  • Usage: glite-job-submit [options] <jdl file>
  • Principal options
  • --vo <vo name>: perform submission with a
    different VO than the UI default one
  • --output, -o <output file>: save the jobId to a
    file, instead of printing it on STDOUT
  • --resource, -r <resource value>: specify the
    resource for execution (needs the GLUE UniqueId of
    the queue, obtainable with list-match)
  • --debug: show function calls and parameters
44
Job life cycle check
  • glite-job-status <job id>
  • check job execution status
  • glite-job-output <job id>
  • if the job status is Done, retrieves the
    output
  • glite-job-cancel <job id>
  • cancels the job
  • All of these commands accept input from a file
    (with the option -i <file>).
  • glite-job-status -i myjobId
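
When the commands are driven from a script, it helps to build the argument lists in one place. The sketch below only assembles the command lines and does not run them; it uses just the options named in these slides (--vo, -o, -r, -i), and the helper names, the file names, and the VO name "gilda" are assumptions for illustration.

```python
def submit_command(jdl_file, vo=None, jobid_file=None, resource=None):
    """Build the argument list for glite-job-submit."""
    argv = ["glite-job-submit"]
    if vo is not None:
        argv += ["--vo", vo]
    if jobid_file is not None:
        argv += ["-o", jobid_file]      # save the jobId to this file
    if resource is not None:
        argv += ["-r", resource]        # target a specific CE queue
    argv.append(jdl_file)
    return argv

def status_command(jobid_file):
    """Build the argument list for glite-job-status, reading ids from a file."""
    return ["glite-job-status", "-i", jobid_file]

submit_argv = submit_command("hello.jdl", vo="gilda", jobid_file="myjobId")
status_argv = status_command("myjobId")
```

On a configured UI machine with a valid proxy, each list could be passed to `subprocess.run`; saving the jobId with -o at submission is what makes the later `-i myjobId` calls possible.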

45
JDL -- Example
  • Type = "Job";
  • JobType = "Normal";
  • Executable = "/bin/bash";
  • StdOutput = "std.out";
  • StdError = "std.err";
  • InputSandbox = {"yourscript.sh"};
  • OutputSandbox = {"std.err","std.out"};
  • Arguments = "yourscript.sh";

46
Job Requirements
  • Requirements
  • Job requirements on the resources
  • Specified using GLUE attributes of resources
    published in the Information Service
  • Its value is a boolean expression
  • Only one Requirements attribute can be specified
  • if there is more than one, only the last one is
    taken into account
  • If you need several requirements, combine them
    through logical operators (&&, ||, !, ...).
  • If not specified, the default value defined in the UI
    configuration file is used
  • Default: other.GlueCEStateStatus == "Production"
    (the resource has to be able to accept jobs and
    dispatch them on WNs)
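
Because only the last Requirements attribute counts, several conditions have to be merged into one expression. A minimal sketch, assuming the conditions should all hold and are therefore joined with the ClassAd logical AND (&&); the function name `combine_requirements` is hypothetical.

```python
def combine_requirements(*expressions):
    """Join boolean expressions into a single Requirements attribute."""
    if not expressions:
        raise ValueError("at least one expression is required")
    # Parenthesize each part so operator precedence cannot change meaning.
    joined = " && ".join(f"({expr})" for expr in expressions)
    return f"Requirements = {joined};"

requirements_line = combine_requirements(
    'other.GlueCEStateStatus == "Production"',
    "other.GlueCEPolicyMaxWallClockTime > 720",
)
print(requirements_line)
```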

47
JDL Requirements
  • Insert a requirement to select only the short
    queues.
  • Requirements = (other.GlueCEPolicyMaxWallClockTime
    > 720);
  • Insert a requirement to select only the long
    queues.
  • Requirements = (other.GlueCEPolicyMaxWallClockTime
    > 1440);
  • Insert a requirement to select only the infinite
    queues.
  • Requirements = (other.GlueCEPolicyMaxWallClockTime
    > 2880);
  • Insert a requirement to steer the execution to a
    particular CE queue.
  • Requirements = other.GlueCEUniqueID ==
    "grid010.ct.infn.it:2119/jobmanager-lcgpbs-long";

48
Job Submission
  • glite-job-list-match lists the
    resources suitable for execution
  • No job submission is performed, only the list-match
    is performed
  • Usage: glite-job-list-match [options] <jdl file>
  • Principal options
  • --vo <vo name>: perform list-match with a
    different VO than the UI default one
  • --rank: show resources in order of ranking
  • --output, -o <output file>: redirect output to a
    file, instead of STDOUT
  • --debug: show function calls and parameters

49
JDL -- Requirements
  • Type = "Job";
  • JobType = "Normal";
  • Executable = "/bin/sh";
  • StdOutput = "povray_cubo.out";
  • StdError = "povray_cubo.err";
  • InputSandbox = {"start_povray_cubo.sh","cubo.pov"};
  • OutputSandbox = {"povray_cubo.out","povray_cubo.err","cubo.png"};
  • RetryCount = 7;
  • Arguments = "start_povray_cubo.sh";
  • Requirements = Member("POVRAY-3.5", other.GlueHostApplicationSoftwareRunTimeEnvironment);
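
The Member(...) requirement restricts the job to CEs that publish the given software tag. Its effect can be mimicked in Python as a plain membership test; the CE records and host names below are hypothetical and only the attribute name comes from the JDL above.

```python
def member(value, values):
    """Mimic the Member() test: is `value` one of `values`?"""
    return value in values

def ce_has_software(tag, ce):
    """Check a CE's published runtime-environment tags for `tag`."""
    tags = ce.get("GlueHostApplicationSoftwareRunTimeEnvironment", [])
    return member(tag, tags)

# Hypothetical CEs and the software tags they publish.
ces = [
    {"GlueCEUniqueID": "ce1.example.org:2119/jobmanager-pbs-short",
     "GlueHostApplicationSoftwareRunTimeEnvironment": ["POVRAY-3.5", "GEANT4"]},
    {"GlueCEUniqueID": "ce2.example.org:2119/jobmanager-pbs-long",
     "GlueHostApplicationSoftwareRunTimeEnvironment": ["GEANT4"]},
]

matching = [ce["GlueCEUniqueID"] for ce in ces if ce_has_software("POVRAY-3.5", ce)]
```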

50
start_povray_cubo.sh
  • #!/bin/bash
  • mv cubo.pov OBJECT.POV # rename input file
  • /usr/bin/povray /usr/share/povray-3.5/ini/res800.ini # run povray
  • mv OBJECT.png cubo.png # rename output file

51
From Phase I to II
  • From 1st EGEE EU Review in February 2005
  • The reviewers found the overall performance of
    the project very good.
  • remarkable achievement to set up this
    consortium, to realize appropriate structures to
    provide the necessary leadership, and to cope
    with changing requirements.
  • EGEE I
  • Large scale deployment of EGEE infrastructure to
    deliver production level Grid services with
    selected number of applications
  • EGEE II
  • Natural continuation of the project's first phase
  • Emphasis on providing an infrastructure for
    e-Science
  • → increased support for applications
  • → increased multidisciplinary Grid
    infrastructure
  • → more involvement from Industry
  • Extending the Grid infrastructure world-wide
  • → increased international collaboration
  • (Asia-Pacific is already a partner!)

52
Conclusion
  • EGEE is an open project to construct
    e-infrastructure
  • gLite is production-level grid middleware of EGEE
  • Well-defined architecture and reliable software
  • Towards service-oriented architecture
  • Migrate from LCG to gLite incrementally
  • gLite = Condor + GTK 2.0 (?)
  • Focus on data grid
  • Doesn't support multi-site MPI jobs