Remote Deployment and Execution in Distributed Systems

Transcript and Presenter's Notes

1
Remote Deployment and Execution in Distributed
Systems
  • Martin Breest

2
Agenda
  • Historical Development of Distributed Systems
  • Basic Questions of Remote Deployment and
    Execution
  • Example Application: Image Rendering with POV-Ray
  • Remote Deployment and Execution with
    • a Simple Shell Script
    • the Distributed Resource Management System Condor
    • the Globus Grid Toolkit

3
Historical Development of Distributed Systems
  • Problem: Execution of computationally intensive
    jobs
  • First solution: supercomputers
  • Expensive; much processing power; owned by one
    organisation; restricted access
  • Introduction of personal computers
  • Cheap; each user can own one; little processing
    power
  • Introduction of the internet and Web technologies
  • Worldwide access to any resource; global
    distribution becomes possible
  • Second solution: clustering of personal computers
  • Cheap; distributed; scalable
  • Type 1: distributed ownership, heterogeneous
    resources
  • Type 2: central ownership, homogeneous resources
  • Third solution: Grid computing
  • Share resources (supercomputers, clusters) across
    organizational borders through standardized
    interfaces and protocols

(Diagram: job submission in the three scenarios described above)
4
Basic Questions of Remote Deployment and Execution
  • How do we describe what we want to do?
  • Shell script, C program
  • Job description
  • How do we transfer files to and from the
    execution machine?
  • Program files
  • Input and output data
  • Log and error data
  • How do we execute and manage the jobs?
  • Uncontrolled (start process manually)
  • Controlled (use management software for cluster
    or grid systems)

5
Example Application Image Rendering with POV-Ray
  • POV-Ray, the Persistence of Vision Raytracer, is
    a ray tracing program that can render a 3D scene
    from a scene description file written in the
    scene description language (SDL).
  • Problem
  • Rendering a scene requires much processing power
    and might take hours or days
  • Solution
  • POV-Ray can render just parts of a scene and
    store them in PPM format
  • POV-Ray can compose a scene out of the
    independently rendered scene parts

6
Example Application Image Rendering with POV-Ray
povray +FP +Irenderfile.pov +Opart1.ppm +W1024 +H768 +SR1 +ER96
  • Image rendering with POV-Ray is a good example
    of the distributed execution of an application
    on a cluster or grid system!
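
The command above renders only rows 1 to 96 of the 768-row image into part1.ppm (+SR and +ER select the start and end row, +FP requests PPM output). The following shell sketch shows how the row bands for all parts could be computed and rendered in parallel; the part count and file names are illustrative assumptions, and the later slides run the equivalent commands remotely over SSH.

#!/bin/sh
# Sketch: split a 1024x768 render into 8 horizontal bands and render each
# band into its own PPM part (part count and file names are assumptions).
HEIGHT=768
PARTS=8
ROWS=$((HEIGHT / PARTS))
for i in $(seq 1 $PARTS); do
  SR=$(( (i - 1) * ROWS + 1 ))   # first row of this band
  ER=$(( i * ROWS ))             # last row of this band
  povray +FP +Irenderfile.pov +Opart$i.ppm +W1024 +H768 +SR$SR +ER$ER &
done
wait  # block until all background renders have finished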

7
Distributing a Job with a Simple Shell Script
  • Requirements
  • Knowledge about the available machines (execution
    nodes)
  • User account on each execution node (NIS)
  • Private/public key pair installed on each
    execution node (ssh-keygen)
  • Executable shell scripts
  • Deployment
  • Alternative 1: NFS; the home directory is
    available on each execution node
  • Alternative 2: SMOUNT; mount a directory of the
    submission node on each execution node
  • Alternative 3: SFTP (SCP); transfer the program
    and input data to, and the output data from, each
    execution node
  • Execution
  • Start a process on each execution node via SSH
    (see the sketch after this list)
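
A minimal sketch of the SSH prerequisites, using the host name from the following slides; ssh-copy-id is one convenient way to install the public key, and with an NFS-shared home directory this only has to be done once:

# generate a key pair once on the submission node
ssh-keygen -t rsa
# install the public key on an execution node (repeat per node, or only
# once if the home directory is shared via NFS)
ssh-copy-id tb0.asg-platform.org
# start a render process on the execution node without a password prompt
ssh tb0.asg-platform.org ./createimagepart.sh part1 +W1024 +H768 +SR1 +ER192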

(Diagram: the submission node reaches several execution nodes via SSH; user accounts are shared via NIS and home directories via NFS)
8
Distributing the POV-Ray Job with a Simple Shell
Script
  • multipovray.sh, executed on the submission node,
    starts the render processes on the execution
    nodes, waits for them to finish, and builds the
    image from the rendered parts using buildimage.sh
  • createimagepart.sh, executed on each execution
    node, renders one image part

Submission node (multipovray.sh):
1 - Prepare the image generation
2 - Start a render process on each execution node via SSH:
    ssh tb0.asg-platform.org ./createimagepart.sh part1 +W1024 +H768 +SR1 +ER192 &
3 - Wait for the processes to finish (each node keeps a marker file in the shared home directory while it renders):
    while [ -f ~/tb0.asg-platform.org ]; do sleep 1; done
4 - Build the image from the rendered parts (buildimage.sh):
    tail --bytes=+17 part1.ppm > part_t1.ppm
    echo "P6" > header
    echo "1024 768" >> header
    echo "255" >> header
    cat header part_t1.ppm > renderedimage.ppm

Execution node (createimagepart.sh):
3 - Render the image part:
    touch ~/$(hostname)
    povray +FP +Irenderfile.pov +O$1.ppm $2 $3 $4 $5
    rm ~/$(hostname)
9
Problems of the Simple Shell Script Solution
  • General Problems
  • The user requires an account on each execution
    node
  • The user needs to know all available execution
    nodes
  • Job code and job management code are mixed up
  • The script can hardly be reused
  • Execution-Specific Problems
  • Job execution is not reliable
  • The capabilities of the execution nodes are not
    considered
  • Number of processors, processor speed, operating
    system, shell, installed software, etc.
  • Process priority is not considered
  • The utilization of the execution nodes is not
    considered
  • Advanced Problems
  • Consumption of resources cannot be monitored,
    metered, accounted, or billed

10
Lessons learned from the Simple Shell Script
Solution
  • We require better resource management that tells
    us which resources are available, what
    capabilities they have, and how heavily they are
    utilized!
  • We require better job management that matches the
    resources required by a job with the available
    resources, executes jobs reliably, defines the
    order of job execution, and charges the user for
    the consumed resources!
  • We require a Distributed Resource Management
    (DRM) system that provides the desired
    functionality!

11
Architecture of a DRM
12
Scheduling Strategies of a DRM
  • Basic Scheduling Algorithm
  • First-Come-First-Serve (FCFS)
  • Queue with priority order
  • Backfilling
  • Allows small jobs to move ahead
  • Problem: starvation
  • Advanced Reservation
  • Book resources in advance to run a job in the
    future
  • Problem: gaps in the schedule
  • Gang Scheduling
  • Schedules related threads or processes to run
    simultaneously on different processors
  • Allows the threads to communicate with each other
    at the same time
  • Jobs are preempted and re-scheduled as a unit

(Diagram: jobs A-D are submitted from the submission node to a head node with a batch scheduler, wait in job queues, and are dispatched to the execution nodes; one node is fully utilized, another is booked from 7:00 to 8:00 PM)
13
What is Condor?
  • System for Distributed Resource Management (DRM)
  • Manages resources (machines) and resource
    requests (jobs)
  • System for High Throughput Computing (HTC)
  • Manages and exploits unused computing resources
    efficiently
  • Maximizes the amount of resources accessible to
    its users
  • Resources are not dedicated and not always
    available, unlike in other DRM systems
  • Ownership of resources is distributed among
    different users

14
Architecture of Condor
15
Key Features of Condor
  • Distributed Infrastructure
  • Available resources are always known
  • Job execution can be monitored and is reliable
  • Declarative Job Description
  • Job code and job management code are separated
  • Resource Matchmaking via the Classified
    Advertisement (ClassAd) Mechanism
  • Resources advertise their capabilities
  • Jobs describe their required and desired
    resources (see the example after this list)
  • Universe Mechanism
  • Different run-time environments for program
    execution can be selected (Standard, Vanilla,
    MPI, etc.)
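
To make the ClassAd matchmaking tangible, here is a hypothetical command-line session against a standard Condor pool; the constraint expression mirrors the requirements used in the job files on slide 18:

condor_status                     # list the resource ClassAds of all execution nodes
condor_status -constraint 'Arch == "INTEL" && OpSys == "LINUX"'
                                  # show only the machines such a job can match
condor_q                          # list the jobs waiting in the local queue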

16
Key Features of Condor
  • Checkpointing
  • Job execution is checkpointed and jobs can be
    migrated to another resource
  • File Transfer Mechanism
  • Program code and data can be transferred
    automatically to the execution node
  • Priority Scheduling Algorithm
  • The priority queue is sorted by user priority,
    job priority, and submission time
  • Starvation is prevented by giving each user the
    same amount of machine allocation time over a
    specified interval
  • The scheduling behaviour can be changed through
    the ClassAd mechanism
  • DAGMan meta-scheduler
  • Job dependencies can be described as Directed
    Acyclic Graphs (DAGs)
  • A DAG can describe sequential and parallel
    executions

17
Distributing a Job with Condor
  • Requirements
  • User account on each execution node (NIS)
  • Machine configured as submission node
  • Valid job description
  • Deployment
  • Alternative 1: NFS; the home directory is
    available on each execution node
  • Alternative 2: Condor's file transfer mechanism
  • Execution
  • Execute condor_submit or condor_submit_dag to add
    the job to the local queue (see the sketch after
    this list)
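
A minimal sketch of the submission step, assuming the job files from slide 18 are stored as createimageparts.sub, buildimage.sub, and multipovray.dag (the file names and extensions are assumptions):

condor_submit createimageparts.sub   # queue a single job described by a submit file
condor_submit_dag multipovray.dag    # queue the whole workflow: render the parts, then build the image
condor_q                             # watch the jobs in the local queue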

(Diagram: the execution nodes send resource ClassAds to the central node, the submission node's resource-request ClassAds are matched against them, and the remote processes are started on the matching execution nodes; user accounts are shared via NIS and home directories via NFS)
18
Distributing the POV-Ray Job with Condor
  • The createimageparts job file contains the job
    description for rendering the image parts
  • The buildimage job file contains the job
    description for the image generation
  • The multipovray job file contains the workflow
    for the image generation

1 - Submit the job on the submission node
2 - The central node starts the first job in the workflow

createimageparts job file:
    Executable   = ./povray-3.6/povray
    Universe     = vanilla
    Requirements = (Arch == "INTEL" && OpSys == "LINUX")
    Arguments    = +FP +Irenderfile.pov +Opart1.ppm +L./povray-3.6/include/ +W1024 +H768 +SR1 +ER96
    Queue

3 - Render the image parts on the execution nodes
4 - Start the next job in the workflow
5 - Start the image generation

buildimage job file:
    Executable   = ./buildimage.sh
    Universe     = vanilla
    Requirements = (Arch == "INTEL" && OpSys == "LINUX")
    Queue

multipovray DAG file:
    JOB A createimageparts
    JOB B buildimage
    PARENT A CHILD B

(Diagram: submission node, central node, and execution nodes; user accounts are shared via NIS and home directories via NFS)
19
Problems with the Condor Solution
  • The user requires an account on each execution
    node
  • No central submission node
  • Jobs cannot be executed if the submission node is
    down
  • No automated distribution and deployment of
    software
  • Software required for job execution is not
    deployed automatically
  • Interoperability with other DRMs and Grid
    solutions
  • Standardized protocols and interfaces are needed
    to access the resources and schedulers of other
    clusters and supercomputers, and to provide such
    access in return (Condor-G, Glide-In, Flocking)

20
Lessons learned from the Condor Solution
  • Condor is an excellent solution for distributing
    a computationally intensive job to a pool of
    available resources. But it would be nice to also
    be able to access the schedulers and resources of
    other DRMs, and to give other DRMs access to
    Condor-managed schedulers and resources.
  • To make a long story short, it would be nice to
    have standardized protocols and interfaces that
    allow sharing (computing) resources across
    organizational borders. This is one goal that
    grid computing tries to achieve.

21
What is the Globus Toolkit?
  • A fundamental enabling technology for the Grid
  • Allows people to share computing power,
    databases, and other tools
  • Resources can be shared across corporate,
    institutional, and geographic boundaries
  • Preserves local autonomy
  • A software toolkit for developing grid
    applications
  • Provides software services and libraries for
    resource management (WS-GRAM), data management
    (RFT, GridFTP), information services (WS MDS:
    Index and Trigger Services), and security
  • Services, interfaces, and protocols are based on
    the WS-Resource Framework (WSRF) and the Open
    Grid Services Architecture (OGSA) standards
  • Goal: Achieve interoperability in distributed,
    dynamic, and heterogeneous environments

22
Globus Toolkit Architecture
23
Distributing a Job with the Globus Toolkit
  • Requirements
  • Valid Globus security credentials
  • User account on each execution host
  • Mapping from Globus credentials to local user
    identity
  • Machine configured as submission node
  • Valid job description
  • Deployment
  • A GridFTP server on the submission node and the
    Reliable File Transfer (RFT) service in the
    Globus grid container
  • The RFT service and the execution nodes use a
    shared file system
  • Execution
  • Create security credentials via grid-proxy-init
  • Submit the job via globusrun-ws to the specified
    WS-GRAM service (see the sketch after this list)
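
A minimal sketch of the submission step from the command line, assuming a working GT4 installation and the job description file from the next slides; the exact globusrun-ws options depend on the setup, for example on how the factory endpoint is specified and whether delegation for file staging is required:

grid-proxy-init                                # create a short-lived proxy from the user's Globus credentials
globusrun-ws -submit -f createimageparts.xml   # submit the WS-GRAM job description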

(Diagram: the submission node uploads and downloads files via GridFTP and submits the WS-GRAM job description to the Globus node; the Globus node submits the job to the LSF or PBS head node via an adapter, and the head node dispatches it to the execution nodes; user credentials, NIS accounts, and NFS home directories are required on the nodes)
24
Distributing the POV-Ray Job with the Globus
Toolkit
  • createimageparts.xml contains the WS-GRAM job
    description
  • First part: definition of the WS-GRAM WSRF
    factory endpoint

1 - Submit the job using the job description
2 - The factory endpoint points to the WS-GRAM service on tb1 (Globus node 1); tb2 hosts a second Globus node
3 - The ResourceID says: use the scheduler of the LSF cluster (an LSF and a PBS head node are available)

<?xml version="1.0" encoding="UTF-8"?>
<job xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
     xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
  ...
</job>

<factoryEndpoint>
  <wsa:Address>
    https://tb1.asg-platform.org:8443/wsrf/services/ManagedJobFactoryService
  </wsa:Address>
  <wsa:ReferenceProperties>
    <gram:ResourceID>LSF</gram:ResourceID>
  </wsa:ReferenceProperties>
</factoryEndpoint>
25
Distributing the POV-Ray Job with the Globus
Toolkit
  • Second part: definition of the program to be
    executed

<?xml version="1.0" encoding="UTF-8"?>
<job xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
     xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
  ...
</job>

<directory>${GLOBUS_USER_HOME}/povray/globus/exec1</directory>
<executable>./povray-3.6/povray</executable>
<argument>+Irenderfile.pov +L./povray-3.6/include/ +OrenderedImage.ppm</argument>
<argument>+FP +W1024 +H768 +SR1 +ER768</argument>
<stderr>multipovray.err</stderr>
<stdin>/dev/null</stdin>
<stdout>multipovray.out</stdout>
<count>1</count>

1 - Execute povray on the execution node (tb3), reached via the Globus node (tb1) and the LSF head node (tb1)
2 - Use renderfile.pov as input and store the result in renderedImage.ppm
26
Distributing the POV-Ray Job with the Globus
Toolkit
  • Third part: file staging statements

<?xml version="1.0" encoding="UTF-8"?>
<job xmlns:gram="http://www.globus.org/namespaces/2004/10/gram/job"
     xmlns:wsa="http://schemas.xmlsoap.org/ws/2004/03/addressing">
  ...
</job>

<fileStageIn>
  <transfer>
    <sourceUrl>gsiftp://tb1.asg-platform.org:2811/povray/globus/renderfile.pov</sourceUrl>
    <destinationUrl>file:///${GLOBUS_USER_HOME}/povray/globus/exec1/renderfile.pov</destinationUrl>
  </transfer>
  <transfer>
    <sourceUrl>gsiftp://tb1.asg-platform.org:2811/povray/globus/povray-3.6</sourceUrl>
    <destinationUrl>file:///${GLOBUS_USER_HOME}/povray/globus/exec1/povray-3.6</destinationUrl>
  </transfer>
</fileStageIn>
<fileStageOut>
  <transfer>
    <sourceUrl>file:///${GLOBUS_USER_HOME}/povray/globus/exec1/renderedImage.ppm</sourceUrl>
    <destinationUrl>gsiftp://tb1.asg-platform.org:2811/povray/globus/output</destinationUrl>
  </transfer>
</fileStageOut>

1 - Download renderfile.pov and the povray-3.6 folder to the execution directory (file stage-in, via GridFTP from the submission node)
2 - Upload renderedImage.ppm to the output directory on the Globus node (tb1) (file stage-out)
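
For comparison, the same transfers could be performed manually with the GridFTP command-line client; this is only an illustrative sketch (the local paths and the exact name of the staged-out file are assumptions), whereas the job description above lets RFT perform the transfers automatically:

# upload the scene file to the GridFTP server used for staging
globus-url-copy file:///tmp/renderfile.pov gsiftp://tb1.asg-platform.org:2811/povray/globus/renderfile.pov
# download the rendered image after the job has finished
globus-url-copy gsiftp://tb1.asg-platform.org:2811/povray/globus/output/renderedImage.ppm file:///tmp/renderedImage.ppm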
27
Problems with the Globus Solution
  • A user account on each execution node is
    required, plus valid security credentials, plus a
    mapping from the credentials to a local user
    identity
  • Additional infrastructure and adapters are
    required
  • Additional overhead through Web services and data
    exchange via XML
  • Not everything (services, interfaces, protocols)
    is standardized yet:
  • Matchmaking
  • Automated deployment of software
  • Workflow
  • etc.

28
Lessons learned from Globus Grid Solution
  • The Globus Grid Toolkit is not the Holy Grail for
    solving interoperability problems in distributed,
    dynamic and heterogeneous environments.
  • The advantage of standardized interfaces,
    services, protocols, and job descriptions comes
    at the price of possibly fewer accessible
    features and more administrative overhead for
    homogenizing the heterogeneous infrastructure.

29
Questions?