Title: Workload Management
David Colling Imperial College London 
2
- Release 2 is not based on Release 1
- Whole new architecture (pretty much as described in D1.4)
- More modular
- I have little practical experience of this new architecture (yet).
3 So what is the new architecture?
See D1.4 for details.
4 The architecture
User Interface: Although there have been several changes to the architecture, the commands available at the user end are (almost) the same, now edg-job-submit etc. There are also now APIs.
Network Server: The Network Server is a generic network daemon, responsible for accepting incoming requests from the UI (e.g. job submission, job removal), which, if valid, are then passed to the Workload Manager.
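The accept, validate and forward behaviour of the Network Server can be pictured with a minimal sketch. Everything here is hypothetical: the request dictionaries, the `is_valid` rule and the queue standing in for the Workload Manager are illustrative assumptions, not the WP 1 implementation.

```python
from queue import Queue

# Request types named on the slide; the dictionary layout is an assumption.
KNOWN_REQUESTS = {"job_submission", "job_removal"}


def is_valid(request):
    """Treat a request as valid if its type is known and it names a job."""
    return request.get("type") in KNOWN_REQUESTS and bool(request.get("job"))


def network_server(incoming, wm_queue):
    """Forward valid requests to the Workload Manager queue; return the rest."""
    rejected = []
    for request in incoming:
        if is_valid(request):
            wm_queue.put(request)
        else:
            rejected.append(request)
    return rejected


wm_queue = Queue()
rejected = network_server(
    [{"type": "job_submission", "job": "myjob.jdl"},
     {"type": "unknown", "job": "x"}],
    wm_queue,
)
```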
5 The architecture
Workload Manager: The Workload Manager is the core component of the Workload Management System. Given a valid request, it has to take the appropriate actions to satisfy it. To do so, it may need support from other components, which are specific to the different request types.
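One way to picture "support from other components, which are specific to the different request types" is a dispatch table. The sketch below is an assumption for illustration only; the handler names and return values are invented and do not reflect the real Workload Manager interfaces.

```python
def handle_submission(request):
    # In the real system this would involve matchmaking, the Job Adapter
    # and the Job Controller; here it only returns a description.
    return f"submit {request['job']}"


def handle_removal(request):
    # In the real system this would ask the Job Controller to remove the job.
    return f"remove {request['job']}"


# Hypothetical mapping from request type to the helper that serves it.
HANDLERS = {
    "job_submission": handle_submission,
    "job_removal": handle_removal,
}


def workload_manager(request):
    """Pick the helper for this request type and let it act."""
    handler = HANDLERS.get(request["type"])
    if handler is None:
        raise ValueError(f"unsupported request type: {request['type']}")
    return handler(request)
```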
6 The architecture
- Resource Broker
  - This has been turned into one of the modules that help the Workload Manager; actually 3 sub-modules:
    - Matchmaking
    - Ranking
    - Scheduling
- Job Adapter
  - The Job Adapter puts the finishing touches to the job's JDL and creates the job wrapper.
7 The architecture
Job Controller and CondorG: actually submit the job to the resources and track its progress.
So how does this all work?
8 Job submission example (for a simple job)
[Diagram: RB node (Network Server, Workload Manager, Job Controller - CondorG), Replica Catalog, Information Service, Computing Element and Storage Element; the CE and SE publish their characteristics and status]
9 Job submission
edg-job-submit myjob.jdl

myjob.jdl:
  JobType = "Normal";
  Executable = "$(CMS)/exe/sum.exe";
  InputData = "LF:testbed0-00019";
  ReplicaCatalog = "ldap://sunlab2g.cnaf.infn.it:2010/rc=WP2 INFN Test Replica Catalog,dc=sunlab2g, dc=cnaf, dc=infn, dc=it";
  DataAccessProtocol = "gridftp";
  InputSandbox = {"/home/user/WP1testC", "/home/file", "/home/user/DATA/"};
  OutputSandbox = {"sim.err", "test.out", "sim.log"};
  Requirements = other.GlueHostOperatingSystemName == "linux" && other.GlueHostOperatingSystemRelease == "Red Hat 6.2" && other.GlueCEPolicyMaxWallClockTime > 10000;
  Rank = other.GlueCEStateFreeCPUs;
Job Status: submitted
UI: allows users to access the functionalities of the WMS.
Job Description Language (JDL): used to specify job characteristics and requirements.
[Diagram: RB node (Network Server, Workload Manager, Job Controller - CondorG), Replica Catalog, Information Service, Computing Element and Storage Element]
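How the Requirements and Rank attributes of the JDL act on candidate resources can be sketched as follows. This is an illustration under stated assumptions: the three requirement conditions are taken to be combined with a logical AND, the CE records and their values are invented, and plain Python functions stand in for ClassAd expression evaluation.

```python
# Hypothetical CE records keyed by the Glue attribute names used in the JDL.
ces = [
    {"id": "ce-a", "GlueHostOperatingSystemName": "linux",
     "GlueHostOperatingSystemRelease": "Red Hat 6.2",
     "GlueCEPolicyMaxWallClockTime": 20000, "GlueCEStateFreeCPUs": 4},
    {"id": "ce-b", "GlueHostOperatingSystemName": "linux",
     "GlueHostOperatingSystemRelease": "Red Hat 6.2",
     "GlueCEPolicyMaxWallClockTime": 50000, "GlueCEStateFreeCPUs": 12},
    {"id": "ce-c", "GlueHostOperatingSystemName": "linux",
     "GlueHostOperatingSystemRelease": "Red Hat 7.3",
     "GlueCEPolicyMaxWallClockTime": 50000, "GlueCEStateFreeCPUs": 40},
    {"id": "ce-d", "GlueHostOperatingSystemName": "linux",
     "GlueHostOperatingSystemRelease": "Red Hat 6.2",
     "GlueCEPolicyMaxWallClockTime": 5000, "GlueCEStateFreeCPUs": 100},
]


def requirements(other):
    """Boolean filter: a CE must satisfy every condition to be considered."""
    return (other["GlueHostOperatingSystemName"] == "linux"
            and other["GlueHostOperatingSystemRelease"] == "Red Hat 6.2"
            and other["GlueCEPolicyMaxWallClockTime"] > 10000)


def rank(other):
    """Numeric preference: a higher value means a more desirable CE."""
    return other["GlueCEStateFreeCPUs"]


matching = [ce for ce in ces if requirements(ce)]
best = max(matching, key=rank)
```

Requirements acts as a filter and Rank as an ordering over whatever survives the filter, so a CE with many free CPUs (such as `ce-d` here) is still excluded if it fails a requirement.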
10 Job submission
NS: network daemon responsible for accepting incoming requests.
[Diagram: RB node architecture; labels at this step: Job, Input Sandbox files, RB storage]
11 Job submission
WM: responsible for taking the appropriate actions to satisfy the request.
[Diagram: RB node architecture; labels at this step: Job, RB storage]
12 Job submission
Where must this job be executed?
[Diagram: RB node architecture; labels at this step: Matchmaker, Workload Manager]
13 Job submission
Matchmaker: responsible for finding the best CE to which to submit a job.
[Diagram: RB node architecture; labels at this step: Matchmaker/Broker, Workload Manager]
14 Job submission
Where (on which SEs) are the needed data?
What is the status of the Grid?
[Diagram: RB node architecture; labels at this step: Matchmaker/Broker, Replica Catalog, Information Service]
15 Job submission
CE choice
[Diagram: RB node architecture; labels at this step: Matchmaker, Workload Manager]
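The matchmaking steps on the preceding slides (where are the data, what is the status of the Grid, CE choice) can be sketched roughly as below. All names and values are invented, and the notion that each CE advertises a set of nearby SEs (`close_ses`) is an assumption made for the example rather than something stated on the slides.

```python
# Hypothetical Replica Catalog: logical file name -> SEs holding a replica.
replica_catalog = {"example-lfn": {"se-1", "se-3"}}

# Hypothetical Information Service view: CE -> published status.
information_service = {
    "ce-x": {"close_ses": {"se-1"}, "free_cpus": 2},
    "ce-y": {"close_ses": {"se-3"}, "free_cpus": 8},
    "ce-z": {"close_ses": {"se-2"}, "free_cpus": 50},
}


def choose_ce(input_data, catalog, info):
    """Keep CEs near a replica of the input data, then take the best ranked."""
    ses_with_data = catalog[input_data]
    candidates = {
        ce: status for ce, status in info.items()
        if status["close_ses"] & ses_with_data
    }
    if not candidates:
        return None
    return max(candidates, key=lambda ce: candidates[ce]["free_cpus"])


chosen = choose_ce("example-lfn", replica_catalog, information_service)
```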
16 Job submission
JA: responsible for the final touches to the job before performing submission (e.g. creation of wrapper script, etc.).
[Diagram: RB node architecture; labels at this step: Job Adapter, Workload Manager]
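As an illustration of what "creation of wrapper script" might involve, the sketch below generates a small shell wrapper around the user executable. The wrapper contents are purely hypothetical; the real EDG job wrapper does considerably more (sandbox transfer, logging, and so on) and its format is not shown in the slides.

```python
import shlex


def make_wrapper(executable, arguments=(), std_output=None):
    """Return the text of a shell wrapper script for one job."""
    command = " ".join(shlex.quote(part) for part in (executable, *arguments))
    if std_output:
        command += " > " + shlex.quote(std_output)
    lines = [
        "#!/bin/sh",
        "# Hypothetical wrapper: run the user executable, report its exit code.",
        command,
        "status=$?",
        'echo "job exit status: $status"',
        "exit $status",
    ]
    return "\n".join(lines) + "\n"


wrapper = make_wrapper("./sum.exe", std_output="test.out")
```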
17 Job submission
JC: responsible for the actual job management operations (done via CondorG).
[Diagram: RB node architecture; labels at this step: Job, Job Controller - CondorG]
18 Job submission
[Diagram: RB node architecture; labels at this step: Input Sandbox files, Job, Computing Element]
19 Job submission
Grid-enabled data transfers/accesses.
[Diagram: RB node architecture; labels at this step: Input Sandbox, Computing Element, Storage Element]
20 Job submission
[Diagram: RB node architecture; labels at this step: Output Sandbox files, Computing Element, RB storage]
21 Job submission
edg-job-get-output <dg-job-id>
[Diagram: RB node architecture; labels at this step: Output Sandbox, RB storage]
22 Job submission
Job Status: submitted, waiting, ready, scheduled, running, done, cleared
[Diagram: RB node architecture, with each job status shown next to the component where it is reached; labels at this step: Output Sandbox files]
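The sequence of job statuses for a successful job can be written down as a small state machine. This is a sketch only: it assumes a strictly linear path through the seven statuses listed on the slide and ignores failure and resubmission.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "submitted"
    WAITING = "waiting"
    READY = "ready"
    SCHEDULED = "scheduled"
    RUNNING = "running"
    DONE = "done"
    CLEARED = "cleared"


# Enum members iterate in definition order, which is the assumed job path.
ORDER = list(JobState)


def advance(state):
    """Return the status that follows `state` on the success path."""
    index = ORDER.index(state)
    if index == len(ORDER) - 1:
        raise ValueError("job is already cleared")
    return ORDER[index + 1]


# Walk a job from submission to the point where its output has been fetched.
history = [JobState.SUBMITTED]
while history[-1] is not JobState.CLEARED:
    history.append(advance(history[-1]))
```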
23 Logging and bookkeeping
edg-job-status <dg-job-id>
LB (Logging & Bookkeeping): receives and stores job events; processes the corresponding job status.
LM (Log Monitor): parses the CondorG log file (where CondorG logs information about jobs) and notifies LB.
[Diagram labels: Job status, Log of job events]
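The division of labour between the Log Monitor and Logging & Bookkeeping can be sketched as follows. The two-field log line format (`job_id event`) is invented for the example and is not the real CondorG log format; likewise, deriving the status as "the most recent event" is a simplifying assumption.

```python
from collections import defaultdict


class LoggingBookkeeping:
    """Stores job events and derives a status from them."""

    def __init__(self):
        self.events = defaultdict(list)

    def log_event(self, job_id, event):
        self.events[job_id].append(event)

    def status(self, job_id):
        # Simplification: the status is the most recent event recorded.
        events = self.events.get(job_id)
        return events[-1] if events else "unknown"


def log_monitor(log_lines, lb):
    """Parse hypothetical 'job_id event' lines and notify LB of each event."""
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip lines that do not fit the assumed format
        job_id, event = parts
        lb.log_event(job_id, event)


lb = LoggingBookkeeping()
log_monitor(["job1 scheduled", "job2 scheduled", "garbage", "job1 running"], lb)
```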
24 New functionality
- Release 2 of WP 1 software
- New functionality includes
  - MPI job submission
  - User APIs
  - Accounting infrastructure (management have decided not to deploy this for testbed 2)
  - Interactive job support
  - Job logical checkpointing
25 New functionality
All these are implemented. Specify which sort of job using the JobType classad attribute, e.g. JobType = "Checkpointable". However, they have only been tested on the WP 1 testbed as yet.
I don't have time to go through all of these, so I will just go through checkpointing.
26 Job checkpointing scenario
[Diagram: RB node (Network Server, Workload Manager, Job Controller - CondorG) and the Logging & Bookkeeping Server]
27 Job checkpointing scenario
edg-job-submit jobchkpt.jdl

jobchkpt.jdl:
  JobType = "Checkpointable";
  Executable = "hsum.exe";
  StdOutput = "Outfile";
  InputSandbox = "/home/user/hsum.exe";
  OutputSandbox = "Outfile";
  Requirements = member("ROOT", other.GlueHostApplicationSoftwareRunTimeEnvironment) && member("CHKPT", other.GlueHostApplicationSoftwareRunTimeEnvironment);
  Rank = -other.GlueCEStateEstimatedResponseTime;

Job Status: submitted
UI: allows users to access the functionalities of the WMS.
Job Description Language (JDL): used to specify job characteristics and requirements.
[Diagram: RB node (Network Server, Workload Manager, Job Controller - CondorG), Replica Catalog and the Logging & Bookkeeping Server]
28 Job checkpointing scenario
[Diagram: submission steps numbered 1 to 6, with the Job and its Input Sandbox files passing through the Network Server, Workload Manager, Matchmaker, RB storage, Job Adapter and Job Controller - CondorG; the Logging & Bookkeeping Server is also shown]
29 Job checkpointing scenario
From time to time the user's job asks to save the intermediate state:
  <save intermediate files>
  State.saveValue(var1, value1)
  ...
  State.saveValue(varn, valuen)
  State.saveState()
[Diagram: RB node (Network Server, Workload Manager, Job Controller - CondorG, RB storage) and the Logging & Bookkeeping Server]
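The save-value and save-state calls can be imitated in a few lines of Python to show the idea. This is a stand-in written for illustration, not the WP 1 checkpointing API: the class, its method names in Python style, and the choice of a local JSON file as the store are all assumptions.

```python
import json
import os
import tempfile


class State:
    """Hypothetical checkpoint state: named values persisted to a JSON file."""

    def __init__(self, path):
        self.path = path
        self.values = {}

    def save_value(self, name, value):
        """Record a value in memory; nothing is persisted yet."""
        self.values[name] = value

    def save_state(self):
        """Persist all recorded values, replacing any earlier checkpoint."""
        directory = os.path.dirname(os.path.abspath(self.path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w") as handle:
            json.dump(self.values, handle)
        # Rename over the old file so a crash never leaves a half-written one.
        os.replace(tmp_path, self.path)

    @classmethod
    def load(cls, path):
        """Return the last saved state, or an empty one if none exists."""
        state = cls(path)
        if os.path.exists(path):
            with open(path) as handle:
                state.values = json.load(handle)
        return state
```

Separating `save_value` from `save_state` mirrors the slide: several values are recorded first, then committed together as one consistent checkpoint.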
30 Job checkpointing scenario
Saving of intermediate files.
Saving of job state.
[Diagram: RB node (Network Server, Workload Manager, Job Controller - CondorG, RB storage) and the Logging & Bookkeeping Server]
31 Job checkpointing scenario
Job fails (e.g. because of a CE problem).
[Diagram: RB node and Logging & Bookkeeping Server, with Computing Element X and Computing Element Y]
32 Job checkpointing scenario
Reschedule and resubmit job.
Where must this job be executed? Possibly on a different CE from the one to which the job was previously submitted.
[Diagram: RB node and Logging & Bookkeeping Server; labels at this step: Matchmaker, Job]
33 Job checkpointing scenario
CE choice: CEy
[Diagram: RB node and Logging & Bookkeeping Server; labels at this step: Matchmaker, Workload Manager]
34 Job checkpointing scenario
[Diagram: RB node and Logging & Bookkeeping Server; labels at this step: Job Adapter, Job, CE characteristics & status]
35 Job checkpointing scenario
[Diagram: RB node and Logging & Bookkeeping Server; labels at this step: Input Sandbox files, Job]
36 Job checkpointing scenario
Job Status: scheduled, done (failed), waiting, ready, scheduled
Retrieval of last saved state when job starts.
Retrieval of intermediate files (previously saved).
[Diagram: RB node (Network Server, Workload Manager, Job Controller - CondorG, RB storage) and the Logging & Bookkeeping Server]
37 Job checkpointing scenario
Job Status: scheduled, done (failed), waiting, ready, scheduled
Job keeps running, starting from the point corresponding to the retrieved state (it doesn't need to start from the beginning).
[Diagram: RB node and Logging & Bookkeeping Server; labels at this step: Job]
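The whole scenario (checkpoint, failure, resubmission, resume) can be condensed into a toy example. It is a sketch under simplifying assumptions: the "job" just sums a list, the saved state is an in-memory dictionary rather than anything held by the Logging & Bookkeeping Server, and the CE failure is simulated by raising an exception.

```python
def run_job(numbers, state, fail_at=None):
    """Sum `numbers`, checkpointing after each step; optionally fail part-way."""
    # Resume from the step recorded in the saved state (0 if there is none).
    for index in range(state.get("next", 0), len(numbers)):
        if index == fail_at:
            raise RuntimeError("simulated CE problem")
        state["total"] = state.get("total", 0) + numbers[index]
        state["next"] = index + 1  # checkpoint: where a rerun should start
    return state["total"]


numbers = [1, 2, 3, 4, 5]
saved = {}

try:
    run_job(numbers, saved, fail_at=3)  # first attempt fails on "CE X"
except RuntimeError:
    pass
after_failure = dict(saved)  # the state that survives the failure

result = run_job(numbers, saved)  # resubmitted, e.g. on "CE Y"
```

The second call starts at step 3 rather than step 0, which is the point of logical checkpointing: work completed before the failure is not repeated.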
38 Further additional functionality
The order of implementation is not up to WP 1 people.
Dependent jobs: using Condor DAGMan. For example:
39 Further additional functionality
[
  A = [
    Executable = "A.sh";
    PreScript = "PreA.sh";
    PreScriptArguments = { "1" };
    Children = { "B", "C" }
  ];
  B = [
    Executable = "B.sh";
    PostScript = "PostA.sh";
    PostScriptArguments = { "RETURN" };
    Children = { "D" }
  ];
  C = [
    Executable = "C.sh";
    Children = { "D" }
  ];
  D = [
    Executable = "D.sh";
    PreScript = "PreD.sh";
    PostScript = "PostD.sh";
    PostScriptArguments = { "1", "a" }
  ]
]
40 Further additional functionality
Job partitioning will be similar to checkpointing, with the jobs being partitioned according to some variable. Partitioned jobs will also have a pre-job and an aggregator, e.g.
41 Further additional functionality
[
  JobType = "Partitionable";
  Executable = ...;
  JobSteps = ...;
  StepWeight = ...;
  Requirements = ...;
  ...
  ...
  Prejob = [
    Executable = ...;
    Requirements = ...;
    ...
    ...
  ];
  Aggregator = [
    Executable = ...;
    Requirements = ...;
    ...
    ...
  ]
]
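The pre-job, partitioned sub-jobs and aggregator pattern can be illustrated with a toy computation. Everything in this sketch is assumed for the example: the steps are plain integers, the partitions are equal-sized contiguous chunks (StepWeight is not modelled), and the three stages are ordinary functions rather than separately submitted jobs.

```python
def partition(steps, n_parts):
    """Split `steps` into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(steps), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(steps[start:end])
        start = end
    return chunks


def prejob(steps):
    """Hypothetical pre-job: prepare the list of steps to be processed."""
    return list(steps)


def sub_job(chunk):
    """Hypothetical partition job: here, sum the squares of its steps."""
    return sum(step * step for step in chunk)


def aggregator(partials):
    """Hypothetical aggregator: combine the partial results."""
    return sum(partials)


steps = prejob(range(1, 11))
chunks = partition(steps, 3)
total = aggregator(sub_job(chunk) for chunk in chunks)
```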
42 Further additional functionality
Also planned are advance reservation of resources and co-location, as well as much more monitoring and performance quantification.
43 Summary
- New architecture has been implemented
- Lots of new functionality, but not stress tested
- Further functionality and performance quantification to be implemented by testbed 3.
44 Further into the future
EDG will not use OGSA; however, the future is in the OGSA grid world. Work is being done at LeSC (see Steven Newhouse's talk tomorrow) to wrap the WP 1 components.
- Communication via JDML and LBML
- Virtualisation of RB through OGSA factory
- Use virtualisation to load balance
- Increase interoperability