The Condor JobRouter Condor Week EU in Barcelona - PowerPoint PPT Presentation

Slides: 13
Provided by: MironL3

1
The Condor JobRouter: Condor Week EU in Barcelona
2
aka schedd on the side
3
Status
  • It's in the current development series, Condor
    7.1.x, on Unix (Windows soonish)
  • Used heavily by the CMS physics experiment for
    simulation on the Open Science Grid (millions of jobs
    routed)

4
What is job routing?
original (vanilla) job:
    Universe = vanilla
    Executable = sim
    Arguments = seed=345
    Output = stdout.345
    Error = stderr.345
    ShouldTransferFiles = True
    WhenToTransferOutput = ON_EXIT
routed (grid) job:
    Universe = grid
    GridType = gt2
    GridResource = cmsgrid01.hep.wisc.edu/jobmanager-condor
    Executable = sim
    Arguments = seed=345
    Output = stdout
    Error = stderr
    ShouldTransferFiles = True
    WhenToTransferOutput = ON_EXIT
[Diagram: the JobRouter uses its routing table (Site 1, Site 2) to turn the original job into the routed job; the final status flows back to the original job.]
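The transformation on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not how the JobRouter is implemented: the job ad is modelled as a plain dict, and route_job is a hypothetical helper name. The attribute names and values are taken from the slide (with the "=" signs that the transcript dropped restored).

```python
def route_job(original_ad, grid_resource):
    """Return a routed (grid) copy of a vanilla job ad.

    Hypothetical helper for illustration; the job ad is a plain dict.
    """
    routed = dict(original_ad)              # copy, so the original job is untouched
    routed["Universe"] = "grid"             # change an existing attribute
    routed["GridType"] = "gt2"              # insert new attributes
    routed["GridResource"] = grid_resource
    routed["Output"] = "stdout"             # the slide shows plain names on the routed copy
    routed["Error"] = "stderr"
    return routed


original = {
    "Universe": "vanilla",
    "Executable": "sim",
    "Arguments": "seed=345",
    "Output": "stdout.345",
    "Error": "stderr.345",
    "ShouldTransferFiles": True,
    "WhenToTransferOutput": "ON_EXIT",
}

routed = route_job(original, "cmsgrid01.hep.wisc.edu/jobmanager-condor")
```

The original ad stays in the queue unchanged; only the copy carries the grid attributes.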
5
Routing is just site-level matchmaking
  • With feedback from job queue
      • number of jobs currently routed to site X
      • number of idle jobs routed to site X
      • rate of recent success/failure at site X
  • And with power to modify job ad
      • change attribute values (e.g. Universe)
      • insert new attributes (e.g. GridResource)
      • add a portal grid proxy if desired
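As a rough sketch of matchmaking with queue feedback, the Python below filters candidate sites by their idle-job count and recent failure rate. Everything here is an assumption made for illustration: the Route and Feedback classes, the pick_route name, the rule of skipping a site outright when it is over a limit, and the "fewest idle jobs wins" tie-break. The real JobRouter's selection and throttling policy may differ.

```python
from dataclasses import dataclass


@dataclass
class Route:
    name: str
    max_idle_jobs: int             # limit on idle jobs routed to this site
    failure_rate_threshold: float  # tolerated rate of recent failures


@dataclass
class Feedback:
    idle: int = 0                  # idle jobs currently routed to the site
    failure_rate: float = 0.0      # recent failure rate observed at the site


def pick_route(routes, feedback):
    """Return a usable route, or None if every site is over a limit.

    Assumed policy (not from the slides): prefer the site with the fewest
    idle jobs, breaking ties by name.
    """
    candidates = []
    for route in routes:
        fb = feedback.get(route.name, Feedback())
        if fb.idle >= route.max_idle_jobs:
            continue               # enough jobs already waiting there
        if fb.failure_rate > route.failure_rate_threshold:
            continue               # site has been failing recently
        candidates.append((fb.idle, route.name, route))
    if not candidates:
        return None
    return min(candidates, key=lambda c: c[:2])[2]
```

With two sites that both allow 10 idle jobs, a site that already has 10 idle jobs is passed over in favour of one that has 3.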

6
Configuring the Routing Table
  • JOB_ROUTER_ENTRIES
      • list site ClassAds in configuration file
  • JOB_ROUTER_ENTRIES_FILE
      • read site ClassAds periodically from a file
  • JOB_ROUTER_ENTRIES_CMD
      • read periodically from a script
      • example: query a collector such as the Open Science
        Grid Resource Selection Service
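A script for JOB_ROUTER_ENTRIES_CMD only has to print site ClassAds on standard output. The Python below is a minimal sketch under stated assumptions: the ROUTES list is a hard-coded placeholder standing in for a real lookup (such as a collector query), format_route and format_value are hypothetical helper names, and the attribute names are the ones used in this deck's syntax example. The bracketed "[ attr = value; ]" form follows the ClassAd syntax described in the manual; check the manual for the exact requirements of your Condor version.

```python
def format_value(value):
    """Render a Python value as a ClassAd literal (strings, booleans, numbers only)."""
    if isinstance(value, bool):            # check bool before numbers: bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, str):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    return repr(value)


def format_route(attrs):
    """Render one routing-table entry as a bracketed ClassAd on a single line."""
    body = " ".join(f"{name} = {format_value(value)};" for name, value in attrs.items())
    return f"[ {body} ]"


# Placeholder data: a real script would build this list from a live query.
ROUTES = [
    {
        "Name": "Grid Site 1",
        "GridResource": "gt2 gatekeeper",  # "gatekeeper" is a placeholder host name
        "MaxIdleJobs": 10,
        "FailureRateThreshold": 0.01,
    },
]

if __name__ == "__main__":
    for route in ROUTES:
        print(format_route(route))
```

Because the script is re-run periodically, the routing table can follow changes in the set of available sites without editing the configuration file.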

7
Syntax
  • Read the 7.1 manual.
  • It's in the chapter on Grid Computing
  • Example entry:
      [ Name = "Grid Site 1";
        GridResource = "gt2 gatekeeper";
        MaxIdleJobs = 10;
        FailureRateThreshold = 0.01; ]

8
What Types of Input Jobs?
  • Vanilla Universe
  • Self-contained (everything needed is in the file
    transfer list)
  • High throughput (many more jobs than CPUs)

9
What Target Grid Types?
  • Globus, Condor-C work well
      • others untested, but should be fine
  • Why only target the grid universe?
      • no reason at all
      • 7.1.1 now allows any destination universe

10
Grid Gotchas
  • Globus gt2
      • no exit status from job (reported as 0)
      • must explicitly list desired output files

11
JobRouter vs. Glidein
  • Glidein: Condor overlays the grid
      • job never waits in remote queue
      • job runs in its normal universe
      • multiple users prioritized by central glidein
        pool
      • private networks doable, but add to complexity
      • need something to submit glideins on demand
  • JobRouter
      • some jobs wait in remote queue (MaxIdleJobs)
      • multiple users prioritized by remote sites
      • job must be compatible with target grid semantics
      • simple to set up, fully automatic to run

12
Questions?
  • Check the manual for Condor Version 7.1.0 or
    later
  • http://www.cs.wisc.edu/condor/manual/v7.1/5_6Condor_Job.html
  • Send email to condor-admin@cs.wisc.edu
  • (also, thanks to Dan Bradley for preparing
    slides)