A Universal Machine File Format for MPI Jobs - PowerPoint PPT Presentation

1
A Universal Machine File Format for MPI Jobs
  • Francis Tang
  • Ho Liang Yoong
  • Chua Ching Lian
  • Arun Krishnan
  • Bioinformatics Institute, Singapore

2
Background
  • Commodity supercomputing
  • Beowulf clusters = affordable parallel computers
  • Competing queue managers (QM): PBS, LSF, SGE
  • Competing interconnects: Myrinet, Quadrics, InfiniBand,
    Gigabit/Fast Ethernet

3
Consequences
  • Different Beowulf configurations in the same
    department/institute
  • Different (incompatible) submit scripts
  • Different MPI implementations
  • Different execution mechanisms
  • Lack of interoperability support between clusters.

4
Example: BII
  • BII runs three production Linux clusters:
  • Viper: 128 CPUs, LSF, Quadrics
  • Mamba: 64 CPUs, PBS/Torque, Myrinet
  • Cobra: 64 CPUs, SGE, Myrinet
  • Users are unwilling to move between clusters
  • Too much trouble: they must rewrite and debug submit scripts

5
Example: LSF + Quadrics
  • Quadrics prun and LSF scheduling do not mix
  • Must force prun to use the nodes allocated by LSF

    #!/bin/bash
    cd /home/myhome/myworkdir
    rm -f myprocfile
    # Build one proc file entry per host allocated by LSF
    for i in `echo $LSB_HOSTS`; do
      # Strip the domain part of the host name
      HN=`echo $i | cut -f 1 -d .`
      echo "1 $HN/home/myhome/myworkdir/ mympijob" >> myprocfile
    done
    chmod +x myprocfile
    prun -f myprocfile
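
The host-name shortening step in the loop above can be tried in isolation. The following is a minimal sketch, not part of the original script: it uses POSIX parameter expansion instead of cut, and the function name short_hosts and the sample host names are assumptions made for illustration.

```shell
#!/bin/sh
# short_hosts: print the short name of each host given as an argument,
# one per line. (Hypothetical helper, for illustration only.)
short_hosts() {
    for host in "$@"; do
        # ${host%%.*} removes the longest suffix starting at the first dot
        printf '%s\n' "${host%%.*}"
    done
}

# Under LSF, LSB_HOSTS holds a space-separated host list. Fall back to
# made-up sample names so the sketch also runs outside a job.
# The variable is left unquoted on purpose so that it splits into words.
short_hosts ${LSB_HOSTS:-node01.example.org node01.example.org node02.example.org}
```

A host that appears several times in the list is printed several times, which matches the one-entry-per-slot behaviour of the loop above.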

6
Example: SGE + Myrinet
  • Must force mpirun to use the nodes allocated by SGE

    #!/bin/sh
    #$ -S /bin/sh
    #$ -pe make 5-5
    # Keep a copy of the host file SGE generated for this job
    cp -a $PE_HOSTFILE /home/yanghwee/sge/pe_hostfile
    # Keep host name and slot count, joined as host:slots
    cat /home/yanghwee/sge/pe_hostfile | \
      cut -d' ' -f1-2 | \
      sed "s/\ /:/g" \
      > /home/yanghwee/sge/my_machines
    /usr/local/mpich-gm/bin/mpirun -np 5 \
      -machinefile /home/yanghwee/sge/my_machines \
      /home/yanghwee/sge/yhcpi
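
The host file conversion can also be written as a single awk call. This is a sketch under two assumptions: that each host file line starts with the host name and the slot count, and that the machine file should hold one host:slots entry per line. The name pe_to_machines and the sample lines are made up for illustration.

```shell
#!/bin/sh
# pe_to_machines: read SGE host file lines on stdin
# ("host slots queue processor-range") and print "host:slots" per line.
# (Hypothetical helper; the output format is an assumption.)
pe_to_machines() {
    awk 'NF >= 2 { print $1 ":" $2 }'
}

# Made-up sample input so the sketch runs outside an SGE job.
printf '%s\n' \
    'node01 2 all.q@node01 UNDEFINED' \
    'node02 3 all.q@node02 UNDEFINED' | pe_to_machines
```

Inside a real job the function would read the host file directly, for example `pe_to_machines < "$PE_HOSTFILE" > my_machines`.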

7
A partial solution
  • Standardise the MPI implementation
  • Abstraction: VMI
  • Lowest common denominator: MPICH-p4 (TCP/IP)
  • Pros:
  • Homogeneous mechanism for running MPI programs
  • Binary compatibility between clusters
  • Cons:
  • Abstraction is slow
  • MPICH-p4 doesn't use the high-end interconnect
  • Only solves the MPI problem

8
Solution: QM-MPI Adapters
  • A specific QM-MPI adapter:
  • Determines the parallel environment allocated by the QM
  • Translates it into a form suitable for the MPI
    implementation
  • (Optional) Executes the MPI program
  • Important criterion: the solution must be scalable
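
The first adapter step, determining the parallel environment, could start by checking which queue manager's variables are set in the job environment. The sketch below is an assumption about how such detection might look, not the UmfProgs implementation. LSB_HOSTS and PE_HOSTFILE are the variables used in the LSF and SGE examples earlier, PBS_NODEFILE is the node list variable set by PBS/Torque, and detect_qm is a hypothetical name.

```shell
#!/bin/sh
# detect_qm: guess the queue manager from the variables it exports
# into a running job. (Hypothetical helper, for illustration only.)
detect_qm() {
    if [ -n "${LSB_HOSTS:-}" ]; then
        echo lsf
    elif [ -n "${PE_HOSTFILE:-}" ]; then
        echo sge
    elif [ -n "${PBS_NODEFILE:-}" ]; then
        echo pbs
    else
        echo unknown
        return 1
    fi
}

# Outside any job this prints "unknown" plus a note on stderr.
detect_qm || echo "no supported queue manager detected" >&2
```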

9
The Solution: Abstraction
  • Universal Machine File (UMF)
  • Abstract description of the parallel environment
  • Intermediate between QM and MPI descriptions
  • Pros: scalability
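
One way to read the scalability claim: with an intermediate description, each queue manager needs one reader and each MPI implementation needs one writer, instead of one adapter per pairing. The sketch below illustrates the idea with a plain-text stand-in of "host slots" lines. It is not the real UMF format, which is XML, and all function names and sample hosts are hypothetical.

```shell
#!/bin/sh
# Readers: queue manager description -> intermediate "host slots" lines.
from_host_list() {      # arguments: host names, repeated once per slot
    printf '%s\n' "$@" | sort | uniq -c | awk 'NF == 2 { print $2, $1 }'
}
from_sge_hostfile() {   # stdin: "host slots queue processor-range" lines
    awk 'NF >= 2 { print $1, $2 }'
}

# Writers: intermediate lines -> one machine file flavour each.
to_colon_form() {       # host:slots
    awk '{ print $1 ":" $2 }'
}
to_repeated_form() {    # host listed once per slot
    awk '{ for (i = 0; i < $2; i++) print $1 }'
}

# Any reader composes with any writer: 2 + 2 functions cover 4 pairings.
from_host_list node01 node01 node02 | to_colon_form
printf 'node03 2 all.q UNDEFINED\n' | from_sge_hostfile | to_repeated_form
```

Adding a third queue manager here means writing one more reader, and every existing writer works with it unchanged.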

10
UMF Example
  • The following is a sample script for executing
    mpiprog using MPICH-GM and SGE
  • mkumf automagically detects the QM

    #!/bin/sh
    #$ -S /bin/sh
    #$ -pe make 5-5

    # alternatives include mpichp4, mpiqsnet
    MPIVERSION=mpichgm

    # write umf xml
    mkumf -p mpiprog > example.xml

    # write machine file, and execute program
    eval `umf2mpi -u example.xml -o machines.txt -e $MPIVERSION`

11
Availability
  • UmfProgs: a simple Xerces-based implementation of UMF
  • GPL (open source, free)
  • Supported queue managers: SGE, LSF, PBS/Torque
  • Supported MPI implementations: MPICH-GM (Myrinet), MPICH-p4,
    QsNet (Quadrics)
  • http://web.bii.a-star.edu.sg/francis/UmfProgs/

12
Summary
  • Interoperability of clusters is important
  • UMF provides:
  • A scalable platform for developing QM/MPI
    interfaces
  • No run-time performance penalty
  • A freely available implementation: UmfProgs