JSS Job Submission Service

Provided by: massimosg
1
JSS Job Submission Service
  • Marco Verlato
  • Massimo Sgaravatto
  • INFN Padova

2
Current activities
  • Analysis of Condor-G
  • Development of a wrapper for Condor-G

3
Analysis of Condor-G
  • So far, the following bugs and missing functionalities
    (with respect to those required for the PM9 release)
    have been identified:
    1. Support for the x509userproxy attribute
       • Needed since the JSS must submit jobs on behalf
         of different users
    2. "Start running" event missing for very short jobs
    3. Info about the failure of a submission to a Globus
       resource is missing in the job log file (only
       present in the gridmanager log file)
       • Necessary if we want to exploit libcondorapi.a
         to parse the log file
    4. Info about when a job has been successfully sent
       to a Globus resource is missing
       • Only the events 'submitted', 'running' and
         'completed' are recorded
       • Necessary to notify the LB service
    5. Support for refresh of the user credential
    6. API for condor_submit and condor_rm

4
Analysis of Condor-G
  • This list of desiderata was submitted to the Condor
    team (to understand if, how and when they can address
    these issues, and how we could contribute)
  • Response of the Condor team
    • 1, 2: hopefully addressed
      • The fixes just received seem to work
    • 3, 4: they are going to address these shortly
    • 5: problem!
      • The Globus GRAM API doesn't have a real way to
        refresh the x509 proxy
      • The Condor team has asked the Globus team for
        this functionality, but it is not clear if and
        how they are planning to address the problem
    • 6: not in the short term
      • We can survive without the condor_submit and
        condor_rm API (not too elegant, but ...)
    • Access to the source code

5
Development of JSS
  • Architecture
    • 2 processes
      • 1 process listening for incoming client (RB)
        requests for job submission/cancel
        • Client-server communication between RB and JSS
          achieved by means of API calls
        • Set of client APIs provided by JSS to RB for
          job submission and job cancel
        • API provided by RB to JSS for asynchronous
          notifications
      • 1 process parsing the job log file

6
Job submission
  • Receive from RB: JDL expression (augmented with
    GlobusResourceContactString, QueueName,
    LocalPathname), InputSandboxDir, OutputSandboxDir
  • Build the job wrapper script
  • Build the Condor-G submit file
    • Value for the x509userproxy attribute is missing
    • Need to agree on and implement a solution for user
      credential delegation (see CESNET's proposals on
      MyProxy)
  • Issue condor_submit
  • Persistently save info about the job (dg_jobId,
    condor_jobId, ...)
  • Notify RB (JOB_ACCEPTED/JOB_REFUSED)
  • Notify LB (JSSAcceptedEvent/JSSRefusedEvent)
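
The "build the Condor-G submit file" step can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the JSS implementation: the function name `build_submit_file`, its parameters, the host name and all file paths are hypothetical, and the submit commands shown (`universe = globus`, `globusscheduler`, `x509userproxy`) are assumed to match the Condor-G release in use. Where the proxy path comes from is the open delegation question, so it is simply passed in as an argument.

```python
def build_submit_file(wrapper_script, contact_string, log_file, proxy_path=None):
    """Return the text of a Condor-G submit description (illustrative only)."""
    lines = [
        "universe = globus",
        # Assumed to carry the GlobusResourceContactString received from the RB.
        f"globusscheduler = {contact_string}",
        # The job wrapper script built in the previous step.
        f"executable = {wrapper_script}",
        # Job log file, later read by the log-parsing process.
        f"log = {log_file}",
    ]
    if proxy_path is not None:
        # Only written once a per-user proxy is actually available.
        lines.append(f"x509userproxy = {proxy_path}")
    lines.append("queue")
    return "\n".join(lines) + "\n"


# Hypothetical values, for illustration only.
submit_text = build_submit_file(
    wrapper_script="/tmp/jss/job_wrapper.sh",
    contact_string="ce.example.org/jobmanager-pbs",
    log_file="/tmp/jss/condor.log",
    proxy_path="/tmp/jss/user_proxy.pem",
)
print(submit_text)
```

The resulting text would be written to a file and passed to condor_submit. Keeping the proxy optional makes the missing x509userproxy value explicit instead of hiding it behind a default.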

7
Job removal
  • Receive from RB list of dg_jobIds to remove
  • For each job
    • Find the corresponding condor_jobId
    • Issue condor_rm
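
The removal loop above can be sketched as a lookup followed by one condor_rm invocation per job. This is a minimal Python sketch: the function name `build_removal_commands`, the in-memory `job_table` and the sample identifiers are hypothetical, standing in for whatever persistent store holds the dg_jobId to condor_jobId mapping saved at submission time. The commands are only built here, not executed.

```python
def build_removal_commands(dg_job_ids, job_table):
    """Map each dg_jobId to its condor_jobId and build a condor_rm argv for it.

    Returns (commands, unknown): the argv lists to run, and the dg_jobIds
    for which no condor_jobId was found.
    """
    commands, unknown = [], []
    for dg_id in dg_job_ids:
        condor_id = job_table.get(dg_id)
        if condor_id is None:
            # No record of this job: report it rather than guess an id.
            unknown.append(dg_id)
        else:
            commands.append(["condor_rm", condor_id])
    return commands, unknown


# Hypothetical mapping saved at submission time (dg_jobId -> condor_jobId).
job_table = {"dg-001": "12.0", "dg-002": "13.0"}

commands, unknown = build_removal_commands(["dg-001", "dg-003"], job_table)
print(commands)
print(unknown)
```

Each entry of `commands` could then be handed to something like `subprocess.run`. Returning the unknown ids separately lets the caller decide how to report them back to the RB.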

8
What is missing/issues
  • Condor-G missing functionalities (in particular
    item 3)
  • Process for parsing the job log file
    • Cancelling the job if failures > RetryCount
    • Mail to UserContact when job starts running
    • Notifying the RB (JOB_DONE, JOB_CANCELLED,
      JOB_ABORTED)
    • Notifying the LB service (JSSTransferEvent)
  • Testing
  • Improve implementation
  • Issues
    • Refresh of user credential
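
The "cancel the job if failures > RetryCount" rule planned for the log-parsing process can be sketched as a simple count over observed events. This is a minimal Python sketch under assumptions: the event tuples, the "failure" label and the function name `jobs_to_cancel` are hypothetical, since the slides note that Condor-G does not yet write submission failures to the job log file.

```python
from collections import Counter


def jobs_to_cancel(events, retry_count):
    """Return the dg_jobIds whose number of failure events exceeds retry_count.

    `events` is an iterable of (dg_jobId, kind) pairs; only pairs whose
    kind is "failure" are counted.
    """
    failures = Counter(job_id for job_id, kind in events if kind == "failure")
    return sorted(job_id for job_id, n in failures.items() if n > retry_count)


# Hypothetical event stream, as the log-parsing process might collect it.
events = [
    ("dg-001", "failure"),
    ("dg-002", "running"),
    ("dg-001", "failure"),
    ("dg-001", "failure"),
    ("dg-002", "failure"),
]

to_cancel = jobs_to_cancel(events, retry_count=2)
print(to_cancel)  # ['dg-001']
```

The comparison is strict, matching the wording of the slide: a job with exactly RetryCount failures is left alone.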