ECE498 Project 1: Building a Condor Cluster

Slides: 11
Provided by: jstp9
1
ECE498 Project 1: Building a Condor Cluster
  • Jason Thomas
  • Joseph St. Pierre

2
Overview
  • Project Overview
  • Building a Cluster
  • Condor Overview
  • Installing/Configuring Condor
  • Condor and MPI
  • Submitting Jobs
  • Improvements
  • Questions

3
Project Overview
  • Built a cluster of three computers
  • Uses Condor to manage resources
  • Runs MPI jobs

4
Building a Cluster
  • Create host names for machines
  • Configure network files
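
The two steps above amount to giving each machine a name and making sure every node can resolve the others. A minimal sketch of that idea, assuming static addresses and an /etc/hosts-style file; the host names, domain, and IP addresses are hypothetical placeholders, not the ones used in the project:

```python
# Hypothetical three-node cluster; names and addresses are placeholders.
NODES = {
    "node1": "192.168.1.101",
    "node2": "192.168.1.102",
    "node3": "192.168.1.103",
}
DOMAIN = "cluster.example"  # assumed private domain name


def hosts_lines(nodes, domain):
    """Return /etc/hosts-style lines: IP, fully qualified name, short alias."""
    lines = ["127.0.0.1\tlocalhost"]
    for name, ip in sorted(nodes.items()):
        lines.append(f"{ip}\t{name}.{domain}\t{name}")
    return lines


if __name__ == "__main__":
    # The same lines would be placed on every machine in the cluster.
    print("\n".join(hosts_lines(NODES, DOMAIN)))
```

Keeping one table and generating the file from it is only one option; the point is that all three machines end up with identical name-to-address mappings.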

5
Condor Overview
  • Uses desktop computers
  • Schedules parallel jobs on idle desktop computers
  • Each computer's administrator defines when jobs may be scheduled on it
  • Submit multiple jobs at once
  • Checkpointing

6
Installing/Configuring Condor
  • Condor User
  • Central Manager
  • Local and Global Configuration Files
  • Owner Preferences
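
The split between global and local configuration files can be pictured as a merge in which per-machine settings override the shared ones. The sketch below models that with plain dictionaries. CONDOR_HOST and DAEMON_LIST are standard Condor configuration macros, but the values, the host name, and the helper functions are assumptions made for illustration:

```python
# Settings shared by every machine (the global configuration file).
# The host name is a hypothetical placeholder.
GLOBAL_CONFIG = {
    "CONDOR_HOST": "node1.cluster.example",
    "DAEMON_LIST": "MASTER, STARTD, SCHEDD",
}

# Per-machine overrides (the local configuration file) for the machine
# acting as central manager, which also runs the collector and negotiator.
LOCAL_CONFIG_MANAGER = {
    "DAEMON_LIST": "MASTER, COLLECTOR, NEGOTIATOR, STARTD, SCHEDD",
}


def effective_config(global_cfg, local_cfg):
    """Merge the two layers; a local setting replaces the global one."""
    merged = dict(global_cfg)
    merged.update(local_cfg)
    return merged


def render(cfg):
    """Format settings as 'NAME = value' lines, as in a Condor config file."""
    return "\n".join(f"{name} = {value}" for name, value in cfg.items())


if __name__ == "__main__":
    print(render(effective_config(GLOBAL_CONFIG, LOCAL_CONFIG_MANAGER)))
```

Owner preferences, such as when a machine is willing to start jobs, would be further entries in the local layer.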

7
Condor and MPI
  • MPI Requires Dedicated Resources
  • Jobs must be submitted through the dedicated scheduler
  • Cannot checkpoint parallel jobs
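
To make a machine a dedicated resource, its local configuration names the scheduler it belongs to and stops it from suspending or preempting jobs. The sketch below generates such a fragment. The setting names follow the dedicated-scheduling section of the Condor 6.x manual and should be treated as an assumption to check against the installed version; the host name and the helper function are hypothetical:

```python
def dedicated_resource_config(scheduler_host):
    """Return config lines that tie a machine to one dedicated scheduler.

    Setting names are assumed from the Condor 6.x manual; verify them
    against the version actually installed.
    """
    name = f"DedicatedScheduler@{scheduler_host}"
    return [
        f'DedicatedScheduler = "{name}"',
        # Advertise the attribute so the scheduler can claim this machine.
        "STARTD_EXPRS = $(STARTD_EXPRS), DedicatedScheduler",
        # Always willing to run jobs, and never interrupt them:
        # parallel jobs cannot be checkpointed and resumed.
        "START = True",
        "SUSPEND = False",
        "PREEMPT = False",
        "KILL = False",
    ]


if __name__ == "__main__":
    print("\n".join(dedicated_resource_config("node1.cluster.example")))
```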

8
Submitting Jobs
  • Must submit MPI jobs on dedicated scheduler
  • Write a submit script
  • Sample script:
      universe = MPI
      executable = simplempi
      log = logfile
      input = infile.$(NODE)
      output = outfile.$(NODE)
      error = errfile.$(NODE)
      machine_count = 4
      queue
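
The sample script relies on the NODE macro to give each MPI process its own input, output, and error files. The sketch below writes the script out in Condor's `key = value` submit syntax and shows which file names result, assuming that `$(NODE)` expands to the node number starting at 0; the parsing helpers are hypothetical and only mimic that substitution:

```python
SUBMIT_TEMPLATE = """\
universe = MPI
executable = simplempi
log = logfile
input = infile.$(NODE)
output = outfile.$(NODE)
error = errfile.$(NODE)
machine_count = 4
queue
"""


def parse_submit(text):
    """Parse 'key = value' lines; bare commands such as 'queue' are skipped."""
    settings = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            settings[key.strip()] = value.strip()
    return settings


def per_node_files(settings):
    """Expand $(NODE) for every node, assuming numbering starts at 0."""
    count = int(settings["machine_count"])
    return [
        {
            key: settings[key].replace("$(NODE)", str(node))
            for key in ("input", "output", "error")
        }
        for node in range(count)
    ]


if __name__ == "__main__":
    for node, files in enumerate(per_node_files(parse_submit(SUBMIT_TEMPLATE))):
        print(node, files)
```

Under these assumptions, four input files (infile.0 through infile.3) must exist before the script is handed to `condor_submit` on the machine running the dedicated scheduler.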

9
Improvements
  • Set up machines to run both dedicated and opportunistic jobs
  • MPICH-V
  • Add more machines to cluster

10
Questions