Practical Mechanisms for Managing Parallel and Interactive Jobs on Grid Environments

1
Practical Mechanisms for Managing Parallel and
Interactive Jobs on Grid Environments
  • Enol Fernández
  • UAB

2
  • Introduction
  • CrossBroker
  • Glide In
  • Parallel Job Support
  • Interactive Job Support
  • Conclusions

3
Batch execution on Grids
[Diagram labels: Internet, REMOTE SITE, REMOTE SITE]
4
Parallel Interactive Job Execution
  • Use of resources from different sites
  • Resource-sets search
  • Co-allocation synchronization
  • Fast start-up
  • Execution in high-occupancy situations

[Diagram labels: Internet, REMOTE SITE, REMOTE SITE, MPI]
5
CrossBroker
  • CrossBroker does automatic scheduling in Grid
    Environments
    • Resource discovery
    • Resource selection
    • Job execution
  • Jobs not treated by gLite
    • Parallel jobs (MPI)
      • Run on more than one resource, in a coordinated
        fashion.
    • Interactive jobs
      • The user interacts with the application during
        its execution

6
CrossBroker
[Architecture diagram labels: Migrating Desktop, Information Index, Replica Manager, CrossBroker, Scheduling Agent, Resource Searcher, Application Launcher, Condor-G, DAGMan, CE (EGEE/Globus) x2, LRMS x2, WN x2]
Annotations:
  • Outdated information; dynamic changes
  • LRMS (PBS, LSF, Condor): limited external control; non-cooperative LRMS; local user jobs
7
Glide In
  • The idea
    • Each batch job is encapsulated in an agent that
      takes control of the WN independently of its
      LRMS
  • Lightweight Virtual Machines
    • Each Worker Node is divided into 2 VMs
    • Each VM can execute jobs independently (e.g.
      batch and interactive)
    • Fast startup of jobs (no need to go through Globus
      and the LRMS)
    • NOT a full virtual machine (Xen, VMware, ...)
    • NO need for special privileges on the WN
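
The two-slot idea can be pictured with a small toy model. This is only a sketch under stated assumptions: the `GlideInAgent` class and its methods are hypothetical names invented for illustration, not CrossBroker or Condor code, and the slot names are borrowed from the VM1/VM2 labels used in the diagrams.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class GlideInAgent:
    """Toy model (hypothetical) of an agent that splits one WN into two slots."""

    # Slot name -> job currently running there (None means the slot is idle).
    slots: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"VM1": None, "VM2": None}
    )

    def free_slot(self) -> Optional[str]:
        """Return the first idle slot, or None if both are busy."""
        for name, job in self.slots.items():
            if job is None:
                return name
        return None

    def start(self, job: str) -> str:
        """Place a job in an idle slot without going back through the LRMS queue."""
        slot = self.free_slot()
        if slot is None:
            raise RuntimeError("both slots are busy")
        self.slots[slot] = job
        return slot

    def finish(self, job: str) -> None:
        """Release the slot held by a finished job."""
        for name, running in self.slots.items():
            if running == job:
                self.slots[name] = None
                return
        raise KeyError(job)


agent = GlideInAgent()
batch_slot = agent.start("batch-job")
interactive_slot = agent.start("interactive-job")
print(batch_slot, interactive_slot)  # VM1 VM2

agent.finish("batch-job")
print(agent.free_slot())  # VM1
```

The point of the model is that the second job starts as soon as a slot is idle, while the first keeps running in the other slot.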

8
Glide In
[Diagram labels: Grid Resource, CrossBroker, LRMS, Batch Job, Scheduling Agent, Application Launcher, Condor-G]
9
Glide In
[Diagram labels: Grid Resource, CrossBroker, LRMS, Batch Job, Scheduling Agent, Agent, Application Launcher, VM1, VM2, Condor-G]
10
Glide In
[Diagram labels: Grid Resource, CrossBroker, LRMS, Batch Job, Scheduling Agent, Agent, Application Launcher, VM1, VM2, Condor-G]
11
Glide In
[Diagram labels: Grid Resource, CrossBroker, LRMS, Batch Job, Scheduling Agent, Agent, Application Launcher, VM1, VM2, Condor-G. Caption: Available for other jobs]
12
Parallel Job Support
  • Support for parallel jobs
    • Open MPI
    • PACX-MPI
    • MPICH-P4
    • MPICH-G2
    • Plain (just the machines)
  • Takes into account sites' capabilities.
  • Low-level details of MPI implementations and
    sites are handled by starter scripts.
  • mpi-start is configured automatically and used by
    default.

13
Parallel Job Support
  • Changes in JDL
    • JOBTYPE
      • Normal: sequential jobs, just one CPU
      • Parallel: more than one CPU
    • SUBJOBTYPE
      • openmpi
      • pacx-mpi
      • mpich
      • mpich-g2
      • Plain
  • Plain allows easy extension for supporting new
    parallel job types

14
Parallel Job Support
Type = "Job";
VirtualOrganisation = "imain";
JobType = "Parallel";
SubJobType = "pacx-mpi";
NodeNumber = 5;
Executable = "test-app";
Arguments = "-v";
InputSandbox = {"test-app", "inputfile"};
OutputSandbox = {"std.out", "std.err"};
StdErr = "std.err";
StdOutput = "std.out";
Rank = other.GlueHostBenchmarkSI00;
Requirements = other.GlueCEStateStatus == "Production";
15
Parallel Job Support
Groups with 1 CEs
  Rank = 2000
    aocegrid.uab.es:2119/jobmanager-pbs-workq      freeCPUs = 10
Groups with 2 CEs
  Rank = 1500
    zeus.cyf-kr.edu.pl:2119/jobmanager-pbs-workq   freeCPUs = 2
    bee001.ific.uv.es:2119/jobmanager-pbs-workq    freeCPUs = 3
  Rank = 1000
    bee001.ific.uv.es:2119/jobmanager-pbs-workq    freeCPUs = 3
    lngrid02.lip.pt:2129/jobmanager-pbs-workq      freeCPUs = 2
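
The grouping shown on this slide can be approximated with a short search over combinations of CEs. The sketch below is an assumption about how such a search might look, not the CrossBroker algorithm: `find_resource_sets` is a hypothetical helper, it uses bare host names instead of full CE identifiers, and it does not attempt to reproduce the Rank values, whose formula is not given in the slides.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def find_resource_sets(
    free_cpus: Dict[str, int], nodes: int, max_sites: int = 2
) -> List[Tuple[str, ...]]:
    """Return minimal groups of CEs whose combined free CPUs cover `nodes`."""
    found: List[Tuple[str, ...]] = []
    for size in range(1, max_sites + 1):
        for group in combinations(sorted(free_cpus), size):
            total = sum(free_cpus[ce] for ce in group)
            # Skip a group if a smaller group already found is contained in it.
            redundant = any(set(prev) <= set(group) for prev in found)
            if total >= nodes and not redundant:
                found.append(group)
    return found


# Free CPU counts taken from the slide; NodeNumber = 5 as in the JDL example.
free = {
    "aocegrid.uab.es": 10,
    "zeus.cyf-kr.edu.pl": 2,
    "bee001.ific.uv.es": 3,
    "lngrid02.lip.pt": 2,
}
for group in find_resource_sets(free, nodes=5):
    print(len(group), group)
```

With these inputs the search yields one single-CE group and two two-CE groups, the same groups listed on the slide.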
16
Parallel Job Support
[Diagram labels: Startup server, CrossBroker, MPI SubTask, MPI SubTask]
1. Launch a PACX startup server
2. Submit MPI subtasks
3. mpi-start will start each of the subtasks
4. Subtasks notify the startup server and start running
5. CrossBroker monitors the application
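
The handshake in steps 1 to 5 can be mimicked locally with threads. This is a minimal sketch, assuming a hypothetical `StartupServer` class that only counts notifications; the real PACX startup server, mpi-start, and CrossBroker monitoring are not modelled.

```python
import threading
from typing import List


class StartupServer:
    """Toy stand-in (hypothetical) that collects subtask notifications."""

    def __init__(self, expected: int) -> None:
        self.expected = expected
        self.registered: List[str] = []
        self._cond = threading.Condition()

    def notify(self, subtask: str) -> None:
        """Called by a subtask once it has been started on its site."""
        with self._cond:
            self.registered.append(subtask)
            self._cond.notify_all()

    def wait_all(self, timeout: float = 5.0) -> bool:
        """Block until every expected subtask has reported in, or time out."""
        with self._cond:
            return self._cond.wait_for(
                lambda: len(self.registered) >= self.expected, timeout
            )


def run_subtask(server: StartupServer, name: str) -> None:
    # Stand-in for a subtask launched by mpi-start: it only notifies the server.
    server.notify(name)


server = StartupServer(expected=2)            # 1. launch the startup server
threads = [                                   # 2. submit the subtasks
    threading.Thread(target=run_subtask, args=(server, f"subtask-{i}"))
    for i in range(2)
]
for t in threads:                             # 3. start each subtask
    t.start()
all_up = server.wait_all()                    # 4. wait for the notifications
for t in threads:
    t.join()
print(all_up, sorted(server.registered))      # 5. the broker can now monitor
```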
17
Parallel Job Support
  • CrossBroker searches for and selects sets of resources
    for the jobs
  • There is no guarantee that all tasks of the same
    job will start at the same time
  • 1st choice: select only sites with free
    resources. The job will run immediately.
    Unfortunately, free resources are not always
    available
  • 2nd choice: allocate a resource temporarily and
    wait until all other tasks show up. Timeshare the
    resource with a backfilling policy to avoid
    resource idleness
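
The second choice comes down to simple arithmetic: the parallel job can start only when the last task is in place, and every site that arrived earlier has a window it can fill with other work. The sketch below illustrates this with made-up numbers; `coallocation_plan` and the site names are hypothetical, and time is counted in abstract ticks.

```python
from typing import Dict, Tuple


def coallocation_plan(agent_ready: Dict[str, int]) -> Tuple[int, Dict[str, int]]:
    """Given the tick at which each site's agent is ready, return the job
    start tick and, per site, how many ticks are available for backfilling."""
    start = max(agent_ready.values())  # all tasks must be present to start
    backfill_window = {site: start - ready for site, ready in agent_ready.items()}
    return start, backfill_window


# Hypothetical arrival times for three sites.
start, windows = coallocation_plan({"site-a": 0, "site-b": 3, "site-c": 5})
print(start)                  # 5
print(windows)                # {'site-a': 5, 'site-b': 2, 'site-c': 0}
print(sum(windows.values()))  # 7
```

In this example, 7 slot-ticks would sit idle under a plain hold-and-wait scheme; with backfilling they remain usable by other jobs.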

18
Glide In for co-allocation
[Diagram labels: Grid Resource, CrossBroker, LRMS, MPI JOB, Scheduling Agent, Condor-G]
19
Glide In for co-allocation
[Diagram labels: Grid Resource, CrossBroker, LRMS, MPI JOB, Scheduling Agent, Agent, Application Launcher, VM1, VM2, Condor-G, MPI Task. Caption: Waiting for the rest of tasks]
20
Glide In for co-allocation
[Diagram labels: Grid Resource, CrossBroker, JOB, LRMS, Scheduling Agent, Agent, Application Launcher, VM1, VM2, Condor-G, MPI TASK. Caption: Backfilling while the MPI task waits]
21
Glide In for co-allocation
[Diagram labels: Grid Resource, CrossBroker, LRMS, Scheduling Agent, Agent, Application Launcher, VM1, VM2, Condor-G, MPI TASK, JOB. Caption: All tasks ready!]
22
Interactive Job Support
  • Fast startup
    • Cache of resources: fast matchmaking
    • Scheduling priority: use free resources or
      glideins
  • Fast notification of events
    • CrossBroker injects interactive agents that
      enable communication between user and job
    • Transparent to the user
    • Agents: Condor Bypass, glogin

23
Interactive Job Support
  • Changes in JDL
    • INTERACTIVE: true/false. Indicates that the job
      is interactive and the broker should treat it
      with higher priority
    • INTERACTIVEAGENT
    • INTERACTIVEAGENTARGUMENTS
    • These attributes specify the command (and its
      arguments) used to communicate with the user.
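
How the two agent attributes combine into a command line can be sketched as follows. This is an illustration only: `interactive_command` is a hypothetical helper, the job description is represented as a plain Python dictionary rather than parsed JDL, and the agent name and arguments are placeholders, not real glogin or Condor Bypass options.

```python
import shlex
from typing import Dict, List, Optional, Union

JobDescription = Dict[str, Union[str, bool, int]]


def interactive_command(job: JobDescription) -> Optional[List[str]]:
    """Build the agent command for an interactive job, or None for batch jobs."""
    if not job.get("Interactive", False):
        return None
    agent = str(job["InteractiveAgent"])
    # Split the argument string the way a POSIX shell would.
    args = shlex.split(str(job.get("InteractiveAgentArguments", "")))
    return [agent, *args]


# Placeholder values, not taken from the slides.
job: JobDescription = {
    "Interactive": True,
    "InteractiveAgent": "example-agent",
    "InteractiveAgentArguments": "--port 9000",
}
print(interactive_command(job))  # ['example-agent', '--port', '9000']
```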

24
Interactive MPI application
Type = "Job";
VirtualOrganisation = "imain";
JobType = "Parallel";
SubJobType = "openmpi";
NodeNumber = 4;
Interactive = TRUE;
InteractiveAgent = "glogin";
InteractiveAgentArguments = "-r p 195.168.105.65:23433";
Executable = "test-app";
InputSandbox = {"test-app", "inputfile"};
OutputSandbox = {"std.out", "std.err"};
StdErr = "std.err";
StdOutput = "std.out";
Rank = other.GlueHostBenchmarkSI00;
Requirements = other.GlueCEStateStatus == "Production";
25
Interactive MPI application
[Diagram labels: User's Machine, Remote Resource, Master, glogin, Video Stream, MPI, Worker x3. Captions: Started by the CrossBroker; Started with mpi-start]
26
Glide In for interactive jobs
[Diagram labels: Grid Resource, CrossBroker, INT. JOB, LRMS, Scheduling Agent, Agent, Application Launcher, VM1, VM2, Condor-G, BATCH]
27
Glide In for interactive jobs
[Diagram labels: Grid Resource, CrossBroker, INT. JOB, LRMS, Scheduling Agent, Agent, Application Launcher, VM1, VM2, Condor-G, BATCH, BATCH]
  • Priority adjustment
  • Startup-time reduction: only one layer involved
28
Conclusions and Future Work
  • CrossBroker gives support to Parallel and
    Interactive jobs
    • Automatically
    • Interoperable with EGEE
  • Glide In
    • Fast startup of jobs
    • Co-allocation without reservation or wasting
      resources
  • Future work
    • Explore more complex multiprogramming (e.g. 3 or
      more VMs)
    • Decentralization of the services

29
Practical Mechanisms for Managing Parallel and
Interactive Jobs on Grid Environments
  • Enol Fernández
  • UAB