Metascheduling on the BioMed Grid - PowerPoint PPT Presentation

1
Metascheduling on the BioMed Grid
  • Francis Tang, Sebastian Ho
  • (joint with Arun Krishnan and Ryan Lim)

2
Scenario
  • Various resources (Beowulf-style clusters, SMPs,
    NUMA machines) are available on the Grid
  • Problem: how do you run your program?
  • Answer: globusrun
  • Is there a better solution?
  • A solution where you don't have to write RSL
    scripts
  • Where you don't have to pick the computer with
    the least load

3
A better solution?
  • Better for the User: easy to use, less trouble to
    set up
  • Better for the Systems Administrator: easier to
    maintain
  • Better for the Grid Administrator: easier to
    deploy
  • Better for the Organisation: saves money

4
Overview
  • Current state of affairs
    • RSL wrestle (globusrun)
    • Accounting blackhole
    • Gridmap hell
  • GridX
    • Web interface
    • Usage tracking
    • Logical accounts
  • Scalability
  • Summary
  • Discussion
  • Demo

5
The current state of affairs...
6
RSL wrestle (globusrun)
  • Currently, to run a job on a Grid resource, you
    use globusrun and RSL syntax
  • You have to specify precisely the target machine
    and target scheduler
  • There is very limited support for staging:
    typically, you must take care of transferring
    files yourself

globusrun -r white/jobmanager-pbs
  '&(executable=/bin/uname)(arguments=-a)'
globusrun -r tglobus3/jobmanager-grd
  '&(executable=mpiprog)(count=4)(jobtype=mpi)'
7
Accounting blackhole
  • Globusrun/GRAM does not provide accounting
  • At best, usage records are buried deep in Globus
    log files

8
Gridmap hell
  • Access to individual resources is controlled by a
    file (the infamous Gridmap file) stored locally
    on the resource itself
  • This file maps Grid DNs (e.g.
    /C=SG/O=Grid/O=BMG/OU=bii.a-star.edu.sg/CN=francis@bii.a-star.edu.sg)
    to unix logins (e.g. francis); an example entry
    appears after this list
  • Globus (GRAM) looks up the requestor's DN in the
    Gridmap file, and then runs the job using that
    unix login
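
For illustration, a grid-mapfile entry is simply the quoted DN followed by
the local login it maps to (using the example DN and login from the slide):

  "/C=SG/O=Grid/O=BMG/OU=bii.a-star.edu.sg/CN=francis@bii.a-star.edu.sg" francis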

9
Gridmap admin issues
  • It is infeasible to create a separate unix login
    for each Globus DN
  • Typically, several distinct DNs are mapped to the
    same unix login (e.g. by organisational
    affiliation: BII, NUS, etc.)
  • Security issue: no ring-fencing

10
Gridmap user issues
  • After receiving his certificate, the user must
    wait for each Gridmap file to be updated
  • There is no guarantee that all Gridmap files will
    be updated at the same time
  • A user may have a cert but still be unable to
    access the whole Grid
  • To provide good service, the Grid administrator
    must enforce operating procedures to coordinate
    Gridmap updates, e.g. email sysadmins when a new
    cert is issued

11
GridX...
12
GridX web interface
  • GridX shields the user from RSL horrors
  • GridX allows the user to provide job requirements
    intuitively
  • GridX chooses a suitable resource (a selection
    sketch follows this list)
  • GridX stages the files (incl. third-party
    transfer)
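
As a rough illustration of the resource-selection step (the selection
policy and the Resource fields below are assumptions for this sketch, not
the actual GridX implementation), a metascheduler might pick the
least-loaded resource that satisfies the job's CPU requirement:

  from dataclasses import dataclass
  from typing import List, Optional

  @dataclass
  class Resource:
      name: str        # e.g. "white" or "tglobus3"
      free_cpus: int   # CPUs currently available
      load: float      # current load average

  def choose_resource(resources: List[Resource],
                      cpus_needed: int) -> Optional[Resource]:
      """Pick the least-loaded resource with enough free CPUs."""
      candidates = [r for r in resources if r.free_cpus >= cpus_needed]
      return min(candidates, key=lambda r: r.load) if candidates else None

  # e.g. a 4-CPU job would be sent to "white" here
  pool = [Resource("white", 8, 0.5), Resource("tglobus3", 4, 2.0)]
  print(choose_resource(pool, cpus_needed=4).name)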

13
GridX usage tracking
  • GridX explicitly tracks resource usage
  • GridX lets resource-providers specify rates

14
GridX logical accounts
  • Idea: do the Gridmap-style mapping dynamically
  • each resource has a pool of logical accounts:
    user01, user02, ..., user50
  • to run a job, pick a free account and run the job
    using that account
  • after the job has finished, clean up and free the
    account (a sketch of this cycle follows the list)
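
A minimal sketch of the logical-account cycle, assuming a simple in-memory
pool (the class and method names are illustrative, not the GridX code):

  # Logical-account pool: acquire on job start, release on job completion.
  class AccountPool:
      def __init__(self, size=50):
          # e.g. user01 ... user50, as on the slide
          self.free = [f"user{i:02d}" for i in range(1, size + 1)]
          self.in_use = {}                  # requestor DN -> logical account

      def acquire(self, dn):
          """Map the requestor's DN to a free logical account for this job."""
          account = self.free.pop()
          self.in_use[dn] = account
          return account

      def release(self, dn):
          """After the job finishes, clean up and free the account."""
          account = self.in_use.pop(dn)
          # a real implementation would also wipe the account's home directory
          self.free.append(account)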

15
Logical account advantages
  • When a user receives a new cert, only GridX needs
    to be changed, not each resource
  • Relieves resource sysadmin of Gridmap chore
  • All resources become available immediately
  • We automatically get ring-fencing

16
Scalability
  • Propose splitting GridX into front- and back-ends
    (sketch after this list)
  • One back-end, acting as a Facade for its
    resources, for each Organisation (e.g. BII, NUS)
  • Front-ends communicate with many back-ends
  • Redundancy through several front- and back-ends
  • Access control at a per-organisation level of
    granularity
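
A rough sketch of this split (the class names, the forwarding interface
and the organisation/resource pairing are made up for illustration):

  class OrgBackend:
      """One back-end per organisation, acting as a facade for its resources."""
      def __init__(self, org, resource_names):
          self.org = org
          self.resource_names = resource_names   # e.g. ["white", "tglobus3"]

      def submit(self, job_description):
          # a real back-end would pick a resource and call the local scheduler;
          # here we only report where the job would go
          return f"{self.org}: run {job_description!r} on {self.resource_names[0]}"

  class FrontEnd:
      """A front-end talks to many organisation back-ends."""
      def __init__(self, backends):
          self.backends = backends               # organisation name -> OrgBackend

      def submit(self, job_description, org):
          # per-organisation access control would be enforced at this point
          return self.backends[org].submit(job_description)

  # hypothetical wiring: one front-end over two organisation back-ends
  front = FrontEnd({"BII": OrgBackend("BII", ["white", "tglobus3"]),
                    "NUS": OrgBackend("NUS", ["clusterA"])})
  print(front.submit("uname -a", "BII"))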

17
Summary
  • GridX benefits
  • Users: resources are more accessible; no need to
    wait for a Gridmap update
  • System administrators: no need to maintain a
    large Gridmap file
  • Grid administrators: scalability and easier
    deployment
  • Organisations: better resource usage and
    accounting

18
Discussion
  • A prototype has been developed (demo next)
  • Can submit jobs to tglobus3, tbnode1, white
  • Further development requires help....
  • Need the CA to help us sign a handful of certs
  • Might need to relax some BMG protocols
  • Need sysadmins to help us
    • Set up/tweak a handful of logical accounts
    • Deploy inGRD on clusters
    • Improve compatibility between clusters
      (SGE/PBS/LSF, MPI, ssh/rsh)
  • Buggy Globus jobmanager-pbs: no good for MPI jobs!

19
  • And now for the demonstration....

20
(No Transcript)
21
Job Submission
22
Accounting Matrix
  • Functions
    • Maintain independence in policy administration
    • Value resource usage
    • Enforce usage patterns
  • Accounting Matrix (data-structure sketch below)
    • Rows: Providers
    • Columns: Consumers
    • Each cell is another table
      • Rows: Resources
      • Columns: Rates for various attributes
    • Each row owned and maintained by the
      corresponding organization
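
A minimal sketch of this structure as nested dictionaries (the attribute
names, rate values and provider/consumer pairing are made-up placeholders;
the organisations and resources are those named elsewhere in the talk):

  # Accounting matrix: provider -> consumer -> per-resource rate table.
  accounting_matrix = {
      "BII": {                                  # row: provider organisation
          "NUS": {                              # column: consumer organisation
              "white":    {"cpu_hour": 1.0, "gb_storage": 0.1},
              "tglobus3": {"cpu_hour": 0.8, "gb_storage": 0.1},
          },
      },
  }

  def cost(provider, consumer, resource, usage):
      """Value the resource usage of one job against the agreed rates."""
      rates = accounting_matrix[provider][consumer][resource]
      return sum(rates[attr] * amount for attr, amount in usage.items())

  # e.g. a job that used 4 CPU-hours and 2 GB of storage on 'white'
  print(cost("BII", "NUS", "white", {"cpu_hour": 4, "gb_storage": 2}))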

23
Accounting Cycle