Intro to ARC middleware

1
Introduction to ARC Middleware
ISSGC'09, Sophia Antipolis, Nice, France
  • Ivan Degtyarenko and Michael Gindonis
  • CSC IT Center for Science, Espoo, Finland
  • July 11th, 2009

2
Today's session
What is it about?
After a quick introduction, you will familiarize
yourselves with ARC middleware through practical
examples. By this point you have already covered
grid middleware basics, X.509 certificates,
proxies, virtual organizations, etc., so let's
dive in!
3
ARC Tutorial timetable for this morning
4
Short introduction: A "Hello Grid" job with ARC
  • grid-proxy-init (generate proxy)
  • ngsub -f hello.xrsl (submit)
  • ngstat -a (monitor)
  • ngget hello (fetch the results)
  • hello.sh
    #!/bin/sh
    echo Hello Grid!
  • hello.xrsl
    &(executable=hello.sh)(jobname=hello)
    (stdout=hello.out)(stderr=hello.err)
    (gmlog=gridlog)(cputime=10)(memory=200)(disk=1)
5
Steps to start running on Grid
  • get an account for a system with a Grid User
    Interface installed (or install it on your own
    PC)
  • request a certificate from a Certificate
    Authority (CA)
  • install the certificate into ~/.globus/
  • join a VO
  • log in to the Grid (create a proxy)
  • write a job description in a file
  • check available resources (optional)
  • submit the job
  • monitor the progress of the job
  • fetch the results

The steps up to joining a VO are done once; the
steps from logging in onwards are repeated every
session.
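The per-session steps map onto the client commands used in this tutorial. The sketch below only lists them in order; it assumes the command names and the `-f` flag exactly as they appear on these slides, and the helper name `session_commands` is hypothetical. Nothing is executed.

```python
def session_commands(xrsl_file, jobname):
    """Return the per-session ARC client commands, in order, as argv lists."""
    return [
        ["grid-proxy-init"],         # log in: create a time-limited proxy
        ["ngsub", "-f", xrsl_file],  # submit the job description
        ["ngstat", jobname],         # monitor the progress of the job
        ["ngget", jobname],          # fetch the results
    ]


if __name__ == "__main__":
    for argv in session_commands("hello.xrsl", "hello"):
        print(" ".join(argv))
```

Each list could be handed to `subprocess.run` on a machine where the ARC client is installed; that part is left out here because it depends on the local setup.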
6
Privacy
Note! When working on the Grid, you must accept
that some information about your jobs and your
Grid identity may be made public, for example via
monitoring tools, such as:
  • your name / affiliation
  • IP address of your client computer
  • job names and duration
  • runtime environment
  • other information
Fortunately, for today you are relatively anonymous:
/C=IT/O=GILDA/OU=Personal Certificate/L=Sophia Antipolis/CN=ISSGCXX
7
Security Policies
  • policies vary in different grids and VOs
  • you will need to accept these terms to use the
    resources
  • Since you are in the Gilda VO you have already
    accepted its policy
  • You will need to accept the M-grid Acceptable Use
    Policy since some resources used in this tutorial
    are part of M-grid

8
The NorduGrid collaboration
  • a community around the open source ARC Grid
    middleware
  • national Grids (e.g. M-grid, SweGrid, NorGrid),
    users also outside the Nordic countries
  • real users, real applications
  • implemented a production Grid system that has been
    running non-stop since May 2002
  • open for anyone to participate

9
ARC Middleware
  • ARC middleware (Advanced Resource Connector)
  • open source out-of-the-box Grid solution software
    which enables production quality computational
    and data Grids
  • Easily Installable/Buildable for a variety of
    distributions
  • non-intrusive server installation
  • supports many common LRMS (batch systems):
    Grid Engine, PBS/Torque, Platform LSF
  • builds upon standard Open Source solutions such
    as OpenLDAP, OpenSSL, SASL and Globus Toolkit
  • adds services not provided by Globus such as
    scheduling
  • extends or completely replaces some Globus
    components

10
ARC Middleware (cont.)
  • provides a reliable implementation of the
    fundamental Grid services, such as information
    services, resource discovery and monitoring, job
    submission and management, brokering, data
    management and resource management
  • integrates computing resources and storage
    elements via a secure Grid layer
  • provides a light-weight standalone client, the
    User Interface (UI), which lets users submit,
    manage and monitor jobs on the Grid, move data
    around and query resource information
  • the UI's built-in broker selects the best
    resource for a job
  • Grid job requirements are expressed in the extended
    Resource Specification Language (xRSL)

11
ARC Middleware Architecture
12
The not-so-short introduction: Installing the
ARC client
  • required to submit jobs to NorduGrid
  • download from http://ftp.nordugrid.org/download/
  • binaries for various Linux distributions, source
    code also available
  • the easiest way to install the client is to use
    the standalone version
  • uncompress in a directory (no root privileges
    required):
    tar zxvf nordugrid-standalone-<latest>.i386.tgz
  • run the environment setup script:
    cd nordugrid-standalone-<latest>
    . ./setup.sh
  • RPM packages are recommended for multi-user
    installations

13
Requesting and Installing the Grid Certificate
  • create a certificate request
  • grid-cert-request -int
  • generates the .globus subdirectory with a key
    (userkey.pem) and the request
    (usercert_request.pem)
  • identity string, e.g.
    /O=Grid/O=NorduGrid/OU=bccs.uib.no/CN=Per Hansen
  • remember to select a good passphrase and keep the
    key secret!
  • send the file ~/.globus/usercert_request.pem to a
    Certification Authority (CA)
  • check the instructions for your local site or
    country to find out which CA to contact
  • wait for an answer from the CA
  • the signed certificate returned by the Certificate
    Authority should be saved as the file
    .globus/usercert.pem

14
Logging in to the Grid
  • "Log in": grid-proxy-init
  • the command does not actually log in anywhere,
    but decrypts the private key and uses it to
    create a time-limited proxy
  • the proxy is used for authenticating to the
    resources
  • "Log out": grid-proxy-destroy
  • destroys the proxy
  • "whoami": grid-proxy-info
  • shows information about the validity of the proxy
    subject  : /O=Grid/O=NorduGrid/OU=csc.fi/CN=Michael Gindonis/CN=413289378
    issuer   : /O=Grid/O=NorduGrid/OU=csc.fi/CN=Michael Gindonis
    identity : /O=Grid/O=NorduGrid/OU=csc.fi/CN=Michael Gindonis
    type     : Proxy draft (pre-RFC) compliant impersonation proxy
    strength : 512 bits
    path     : /tmp/x509up_u7060
    timeleft : 11:59:39
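In the grid-proxy-info output, the proxy subject is the user's identity with one extra CN component appended. The sketch below shows that relationship and converts the time left into seconds. It assumes DNs written as `/KEY=value` components with no slashes inside values, a single appended `/CN=` component as in the example above, and a `timeleft` value in `H:MM:SS` form. All function names are hypothetical.

```python
def parse_dn(dn):
    """Split '/O=Grid/OU=csc.fi/CN=Name' into a list of (key, value) pairs."""
    pairs = []
    for part in dn.strip("/").split("/"):
        key, _, value = part.partition("=")
        pairs.append((key, value))
    return pairs


def identity_of(proxy_subject):
    """Drop the last CN component, which the proxy adds to the user's identity."""
    head, sep, _ = proxy_subject.rpartition("/CN=")
    return head if sep else proxy_subject


def timeleft_seconds(timeleft):
    """Convert an 'H:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(x) for x in timeleft.split(":"))
    return hours * 3600 + minutes * 60 + seconds
```

A client script could use `timeleft_seconds` to decide whether the proxy will outlive a planned job before submitting it.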

15
Writing a job description file
  • Resource Specification Language (RSL) files are
    used to specify job requirements and parameters
    for submission
  • NorduGrid uses an extended language (xRSL) based
    on the Globus RSL
  • similar to scripts for local batch systems, but
    include some additional attributes
  • job name
  • executable location and parameters
  • location of input and output files of the job
  • architecture, memory, disk and CPU time
    requirements
  • runtime environment requirements

16
xRSL example
  • hellogrid.sh
    #!/bin/sh
    echo Hello Grid!
  • hellogrid.xrsl
    &(executable=hellogrid.sh)(jobname=hellogrid)
    (stdout=hello.out)(stderr=hello.err)
    (gmlog=gridlog)(cputime=10)(memory=200)(disk=1)
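When many similar jobs are needed, the job description can be generated rather than typed. The following is a minimal sketch, assuming an xRSL description is a `&` followed by `(attribute="value")` relations; it quotes every value, which xRSL accepts, and it rejects values containing a double quote instead of trying to escape them. The function name `xrsl` is hypothetical.

```python
def xrsl(attributes):
    """Render (name, value) pairs as one xRSL conjunction: &(name="value")..."""
    parts = []
    for name, value in attributes:
        text = str(value)
        if '"' in text:
            # Quoting rules are not covered by this sketch, so refuse such values.
            raise ValueError("value contains a double quote: %r" % text)
        parts.append('({}="{}")'.format(name, text))
    return "&" + "".join(parts)


if __name__ == "__main__":
    print(xrsl([
        ("executable", "hellogrid.sh"),
        ("jobname", "hellogrid"),
        ("stdout", "hello.out"),
        ("stderr", "hello.err"),
        ("gmlog", "gridlog"),
        ("cputime", 10),
        ("memory", 200),
        ("disk", 1),
    ]))
```

Writing the returned string to a file gives something that can be passed to `ngsub -f`.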

17
Submitting the job
  • submit the job
  • ngsub -d 1 -f hellogrid.xrsl
  • a job id is returned
  • => Job submitted with jobid
    gsiftp://ametisti.grid.helsinki.fi:2811/jobs/455611239779372141331307
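The job id is a URL, so its parts can be taken apart with a standard URL parser. A small sketch, assuming the id has the form `gsiftp://host:port/jobs/number` as in the example above; the function name `parse_jobid` is hypothetical.

```python
from urllib.parse import urlparse


def parse_jobid(jobid):
    """Split a job id URL into scheme, host, port and the trailing job number."""
    url = urlparse(jobid)
    return {
        "scheme": url.scheme,
        "host": url.hostname,
        "port": url.port,
        "number": url.path.rsplit("/", 1)[-1],
    }
```

The host part tells you which cluster accepted the job, and in this tutorial's example the job number is also the name of the directory that ngget later creates for the results.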

18
ARC Grid Monitor
  • shows currently connected resources
  • almost all elements "clickable"
  • browse queues and job states by cluster
  • list jobs belonging to a certain user
  • no authentication, anyone can browse the info
  • privacy issues

19
Monitoring the Job
  • Query the status using the command line
  • ngstat hellogrid
  • => Job gsiftp://ametisti.grid.helsinki.fi:2811/jobs/455611239779372141331307
       Jobname: hellogrid
       Status: INLRMS:Q
  • The most common status values are ACCEPTED,
    PREPARING, INLRMS:Q, INLRMS:R, FINISHING and
    FINISHED
  • Or use the Grid Monitor

20
Fetching the results
  • print the job output
  • ngcat hellogrid
  • shows the standard output of the job
  • this can also be done while the job is running
  • download the result files
  • ngget hellogrid
  • => ngget: downloading files to
    /home/ajt/455611239779372141331307
    ngget: download successful - deleting job from
    gatekeeper.

21
Using a storage element
  • Storage Elements are disk servers accessible via
    the Grid
  • can be used to store job output while the user is
    logged out and the client machine is disconnected
    from the Grid
  • allow input files to be stored close to the
    cluster where the program is executed, on a high
    bandwidth network
  • files can be local and remote in the same job
  • (inputFiles=("input1" "/home/user/myexperiment")
    ("input2" "gsiftp://se.example.com/files/data"))
  • (outputFiles=("output" "gsiftp://se.example.com/mydir/result1")
    ("prog.out" "gsiftp://se.example.com/mydir/stdout"))
  • (stdout="prog.out")
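The file lists follow a regular pattern: each entry pairs the file name seen by the job with its local path or storage element URL. A minimal sketch of generating such an attribute, assuming each pair is written as two quoted strings separated by whitespace; the helper name `file_list` is hypothetical and `se.example.com` is the placeholder host from the slide.

```python
def file_list(attribute, pairs):
    """Render (attribute=("name" "location")...) from (name, location) pairs."""
    items = "".join(
        '("{}" "{}")'.format(name, location) for name, location in pairs
    )
    return "({}={})".format(attribute, items)


if __name__ == "__main__":
    print(file_list("inputFiles", [
        ("input1", "/home/user/myexperiment"),
        ("input2", "gsiftp://se.example.com/files/data"),
    ]))
    print(file_list("outputFiles", [
        ("output", "gsiftp://se.example.com/mydir/result1"),
        ("prog.out", "gsiftp://se.example.com/mydir/stdout"),
    ]))
```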

22
Runtime environments
  • software packages which are preinstalled on a
    computing resource and made available through
    the Grid
  • just send the data and/or parameters to be
    processed
  • useful if there are many users of the same
    software or if the same program is used
    frequently
  • allows local platform specific optimizations
  • For a specific CPU or Parallel Environment
  • Perhaps in the near future GPUs, CUDA
  • required runtime environments can be specified in
    the job description file, for example:
    (runtimeenvironment=APPS/GRAPH/POVRAY-3.6)
  • Runtime Environment Registry
  • http://www.csc.fi/grid/rer/

23
ARC / NorduGrid / M-grid references
  • NorduGrid (resource monitor, presentations,
    tutorials, docs, ...)
  • http://nordugrid.org/
  • ARC middleware
  • http://nordugrid.org/middleware
  • User guide: http://www.nordugrid.org/documents/ui.pdf
  • user support mailing list: nordugrid-support at
    nordugrid.org
  • M-grid (Finnish National Grid)
  • http://www.csc.fi/english/research/Computing_services/grid_environments/mgrid
  • https://extras.csc.fi/mgrid/
  • support email at CSC: grid-support at csc.fi
  • regular ARC training by CSC: http://www.csc.fi/english/csc/courses

24
Do I need to change my application to use ARC?
  • three different approaches
  • using the application as is: grid middleware will
    move the executable and the data to the target
    system
  • library dependencies often need to be resolved by
    linking statically or packing them to go with the
    application
  • installing the application on the target system
    and using it via the Grid interface
  • batch processing type applications normally work
    without changes, interactive applications are
    more difficult
  • with ARC middleware this is facilitated by
    runtime environments (RE)
  • modifying the application to fully exploit a
    distributed environment
  • using ARC libraries
  • distributing over a large geographical area is
    not practical unless the computation can be split
    into independent parts

25
Real life applications
  • it's common to send several smaller jobs to the
    Grid to solve a larger problem
  • parallel MPI jobs to a single cluster are
    supported (if the correct runtime environment is
    installed), but no MPI between clusters
  • splitting the job into suitable parts and gathering
    the parts together is left to the user
  • more error prone environment than traditional
    local systems => error checking and recovery are
    important
  • fault reporting and debugging have room for
    improvement
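Since splitting and gathering is left to the user, it helps to see the pattern in miniature. The sketch below splits an index range into independent chunks, computes each chunk separately, and combines the partial results. The `work` function is a stand-in for one grid job, not anything ARC provides, and all names are hypothetical.

```python
def split_range(start, stop, parts):
    """Split range(start, stop) into `parts` contiguous chunks of near-equal size."""
    total = stop - start
    bounds = [start + (total * i) // parts for i in range(parts + 1)]
    return [(bounds[i], bounds[i + 1]) for i in range(parts)]


def work(lo, hi):
    """Stand-in for one independent job: sum of squares over [lo, hi)."""
    return sum(n * n for n in range(lo, hi))


def gather(partials):
    """Combine the per-job results into the final answer."""
    return sum(partials)


if __name__ == "__main__":
    chunks = split_range(0, 10, 3)
    print(chunks)
    print(gather(work(lo, hi) for lo, hi in chunks))
```

On the Grid each chunk would become its own job description, and the gather step would run on the client after the results have been fetched. The pattern only works when the chunks do not depend on each other.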

26
Real life applications
  • Size your job to best exploit the grid
  • group many short jobs into one to avoid
    submission overhead
  • If possible break up larger or longer jobs into
    independent parts
  • If your job must run for a long time, checkpoint
    your results so that your calculation can be
    resumed; no resource will stay up indefinitely
  • M-grid is ideally suited to jobs of length 1 hour
    to 1 day.
  • Use file caching if it is available
  • Eliminate unnecessary file transfers (load on
    network)
  • Save time needed to stage files
  • Save disk space on the cluster front-ends
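Grouping many short tasks into one job can be done with simple greedy packing. A minimal sketch, assuming you have a rough runtime estimate in seconds for each task and a target job length (for example 3600 seconds, the lower end of the range quoted for M-grid). The function name `group_tasks` is hypothetical.

```python
def group_tasks(durations, target):
    """Greedily pack task durations (seconds) into batches of at most `target` seconds.

    Tasks are kept in their original order. A single task longer than
    `target` gets a batch of its own.
    """
    batches, current, used = [], [], 0
    for d in durations:
        if current and used + d > target:
            batches.append(current)
            current, used = [], 0
        current.append(d)
        used += d
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # Thirteen ten-minute tasks packed into jobs of about one hour each.
    for batch in group_tasks([600] * 13, 3600):
        print(len(batch), "tasks,", sum(batch), "seconds")
```

Each batch then becomes one submitted job whose script runs its tasks one after another, so the submission overhead is paid once per batch instead of once per task.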

27
Further development of ARC middleware
  • Stated goal: not to undermine existing
    functionality and capabilities available in
    pre-ARC components (current stable version)
  • Two SVN branches
  • ARC0 (version 0.6.5, 0.8rc)
  • Pre-existing production components (Pre-KnowARC
    project)
  • Backported features from KnowARC
  • Nordic DataGrid Facility provides support and
    backports features from the KnowARC project into
    the current stable releases of ARC
  • ARC1 (0.9.xxx)
  • Next generation components developed by the
    KnowARC project
  • More information at www.ndgf.org and
    www.knowarc.eu

28
What is new
  • Service Oriented Architecture
  • Modular structure
  • Self-sufficient core components
  • Interoperability built on standards
  • User and developer friendly
  • Business friendly open source
  • License: Apache 2.0
  • Portable: runs on almost all Linux variants and
    Solaris; porting to Windows and Mac OS in
    progress
  • Aiming at integration into Fedora, Debian and
    Ubuntu

11/18/2009
www.knowarc.eu
28
29
ARC WS-based components
  • Internal structure of ARC components

30
Key Feature - New ARC client
  • Relies on dedicated library
  • Implemented in C++
  • Python and Java bindings
  • Allows easy development of application-specific
    clients
  • Implements a user Grid toolbox
  • Handling of user host credentials
  • computing resource discovery and information
    retrieval
  • matchmaking, brokering and job submission
  • input/output data handling
  • The new library and arc commands can handle
    gLite CREAM and UNICORE
  • Windows and Mac OS client
  • Graphical user interface, just delivered!

31
Key Feature - HED
  • HED: the Hosting Environment Daemon
  • Container for all the server-side functional
    components
  • Main functions
  • Routes messages between the services and the
    outside world
  • Provides inter-service communication
  • Provides a basic security infrastructure
  • Consists of pluggable modules
  • Light-weight (no Apache, no Axis)

32
Key Service - A-REX
  • ARC Resource-coupled Execution Service
  • Provides Execution Management capability
  • The Grid Manager from ARC Classic as core
  • Extended with WS interface implementing Basic
    Execution Service (BES)
  • Accepts Job Submission Description Language
    (JSDL)
  • Information and resource discovery: GLUE 2
    schema
  • Support for a wide range of Local Resource
    Management Systems
  • Torque, PBS/OpenPBS, SGE,
  • LoadLeveler, LSF, Condor and SLURM
  • Released in ARC 0.8, available at
    http://wiki.nordugrid.org/index.php/ARC_v0.8

33
Key Service - New Storage
  • Distributed-by-design storage system
  • Global namespace
  • Supports collections and subcollections to any
    depth
  • A-Hash: a replicated database to store metadata
  • Librarian handles:
  • Metadata and hierarchy of collections and files
  • The location of replicas
  • Health data of the Shepherd services
  • Bartender: high-level interface for the users and
    for other services
  • Shepherd: manages storage services, and provides
    a simple interface for storing files on storage
    nodes

34
Welcome to ARC
Let's begin
Off to the PC classroom! (unless the coffee is
ready)
35
Abstracting the middleware
  • http://technical.eu-egee.org/index.php?id=290
  • Expand the functionality of the grid
    infrastructure for users,
  • Reduce duplicated development when porting
    applications, and
  • Speed the porting of new applications to the
    grid.
  • GridWay Metascheduler (http://www.gridway.org/)
  • The GridWay Metascheduler performs job execution
    management and resource brokering, allowing
    unattended, reliable, and efficient execution of
    jobs, job arrays, and workflows on heterogeneous
    and dynamic Grids.
  • P-GRADE Portal (http://portal.p-grade.hu/)
  • The Parallel Grid Run-time and Application
    Development Environment Portal (P-GRADE Portal)
    is a workflow-oriented graphical environment that
    covers every stage of Grid application
    lifecycles.
  • Ganga (http://ganga.web.cern.ch/ganga/)
  • Ganga is an easy-to-use frontend for job
    definition and management, implemented in Python.
    Ganga allows trivial switching between testing on
    a local batch system and large-scale processing
    on Grid resources.