Using the Parallel Universe beyond MPI

www.cs.wisc.edu/~bgietzel

Provided by: rga82

1
Using the Parallel Universe beyond MPI
2
Parallel Universe applications using Metronome
  • Metronome's support for running parallel jobs
    builds on Condor's Parallel Universe
  • Coordinated Metronome jobs can run on multiple
    machines at the same time and communicate with
    each other
  • Provides advanced testing opportunities
  • Some examples: client/server, cross-platform,
    compatibility, stress/scalability

3
Service testing challenges
  • Starting multiple services on the same machine
    does not allow for testing across a network or
    different platforms
  • Deciding when to start the services and when to
    start tests requires human intervention
  • Setup of the services is usually a manual
    process, or the testing is skipped altogether
  • Same goes for the teardown of services to return
    the machines to their original state

4
Benefits of using Metronome
  • Condor manages dynamic claiming of resources,
    communication between job nodes and cleaning up
    after the jobs run
  • Metronome publishes basic information about each
    task to the job ad, where it's accessible by any
    node, acting as a scratch space for the job
  • The hostnames of all job nodes, the start time,
    return code, and end time for each task on each
    node are published to this shared job ad
  • This information is useful for communication
    between nodes and for synchronization in the
    user's glue scripts
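
The synchronization idea on this slide can be sketched as a polling loop inside a glue script. This is only an illustration under stated assumptions: the dictionary stands in for the shared job ad, and the helper `wait_for_attr` and the attribute name `hostname_0` are hypothetical, not Metronome's actual names or API.

```python
import time


def wait_for_attr(read_ad, name, timeout=5.0, interval=0.1):
    """Poll read_ad() until attribute `name` appears, then return its value.

    read_ad is any callable returning the current job ad as a dict
    (a hypothetical stand-in for however a glue script queries the ad).
    Raises TimeoutError if the attribute is not published in time.
    """
    deadline = time.monotonic() + timeout
    while True:
        ad = read_ad()
        if name in ad:
            return ad[name]
        if time.monotonic() >= deadline:
            raise TimeoutError(f"attribute {name!r} not published within {timeout}s")
        time.sleep(interval)


if __name__ == "__main__":
    # Fake job ad; in a real run another node would publish into it.
    fake_ad = {"hostname_0": "node-a.example.org"}
    print(wait_for_attr(lambda: fake_ad, "hostname_0"))
```

A node that depends on another node's task would call something like this before starting its own work, and give up with an error if the attribute never appears.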

5
Client/server test example
  • Submit Node: submits the Parallel Job to
    Execute Node 0 and Execute Node 1
SERVER
  • Start server
  • Send port to client
  • Handle client requests
  • Poll for ALLDONE from client
  • Exit
CLIENT
  • Discover server hostname and port
  • Start client
  • Run queries against server
  • Send ALLDONE message to server
  • Exit
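
The client/server flow on this slide can be sketched with plain sockets. This is a minimal illustration, not Metronome glue code: both roles run as threads on one machine, the port is handed over in-process instead of through the shared job ad, and the function names and the "echo" reply format are assumptions made for the example.

```python
import socket
import threading


def run_server(listener, log):
    """Accept one client, answer its queries, and exit on ALLDONE."""
    conn, _ = listener.accept()
    with conn, conn.makefile("r") as reader:
        for line in reader:
            msg = line.strip()
            if msg == "ALLDONE":
                log.append("server: got ALLDONE")
                break
            conn.sendall(f"echo {msg}\n".encode())
    listener.close()


def run_client(host, port, queries):
    """Connect to the server, run each query, then send ALLDONE."""
    replies = []
    with socket.create_connection((host, port)) as sock, sock.makefile("r") as reader:
        for query in queries:
            sock.sendall((query + "\n").encode())
            replies.append(reader.readline().strip())
        sock.sendall(b"ALLDONE\n")
    return replies


def demo():
    """Start the server on a free port, run the client, return both results."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]  # the "send port to client" step
    log = []
    server = threading.Thread(target=run_server, args=(listener, log))
    server.start()
    replies = run_client("127.0.0.1", port, ["q1", "q2"])
    server.join(timeout=5)
    return replies, log


if __name__ == "__main__":
    print(demo())
```

In a parallel job the two functions would live in separate glue scripts on separate execute nodes, with the client discovering the hostname and port instead of receiving them as arguments.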
6
How to submit a parallel job in Metronome
  • Several minor modifications to the Metronome
    submit file are necessary for parallel jobs
  • List of platforms is comma-separated with
    parentheses around the outside
  • platforms = (x86_rhas_3, x86_rhas_4)

7
Parallel job submit files continued
  • Add a glue script for each task/node combination
    to be executed remotely.
  • platform_pre_0 = client/platform_pre
  • platform_pre_1 = server/platform_pre
  • remote_declare_0 = client/remote_declare
  • remote_declare_1 = server/remote_declare
  • remote_task_0 = client/remote_task
  • remote_task_1 = server/remote_task
  • remote_task_args_0 = 9000
  • remote_task_args_1 = 9001
  • and so forth for all glue scripts.
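
Writing one entry per task/node combination by hand is repetitive, so the pattern on slides 6 and 7 can be generated. The sketch below is a hypothetical helper, not part of Metronome, and it assumes the submit file uses `key = value` lines; the key names, script paths, and argument values are the ones shown on the slides.

```python
def parallel_submit_lines(platforms, node_roles, tasks, task_args=None):
    """Build submit-file lines for a parallel job.

    platforms  -- platform names, one per node
    node_roles -- directory holding each node's glue scripts, indexed by node
    tasks      -- glue script names, e.g. "platform_pre"
    task_args  -- optional {node_number: argument string} for remote_task
    """
    lines = ["platforms = (" + ", ".join(platforms) + ")"]
    for task in tasks:
        for node, role in enumerate(node_roles):
            lines.append(f"{task}_{node} = {role}/{task}")
    for node, args in sorted((task_args or {}).items()):
        lines.append(f"remote_task_args_{node} = {args}")
    return lines


if __name__ == "__main__":
    for line in parallel_submit_lines(
        platforms=["x86_rhas_3", "x86_rhas_4"],
        node_roles=["client", "server"],
        tasks=["platform_pre", "remote_declare", "remote_task"],
        task_args={0: "9000", 1: "9001"},
    ):
        print(line)
```

Adding a third node then only means extending the `platforms` and `node_roles` lists.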

8
Other parallel job use cases
  • Cross-platform testing (Linux to Solaris)
  • Scalability/stress testing (1 server, many
    clients)
  • Compatibility testing (cross version, stable vs.
    development series)

9
For more information
  • Documentation is available on the NMI site
  • See http://nmi.cs.wisc.edu/node/1001 for
    information on running parallel jobs using
    Metronome
  • http://nmi.cs.wisc.edu/node/282 describes how to
    set up your own Metronome installation for
    running parallel jobs