Condor Introduction: Asia Pacific Grid Workshop, Tokyo, Japan, October 2001

Slides: 16
Provided by: Miron1
Transcript and Presenter's Notes

1
Condor Introduction
Asia Pacific Grid Workshop
Tokyo, Japan
October 2001
2
Outline
  • Overview: What is Condor?
  • What does Condor do?
  • What is Condor good for?
  • What kind of results can I expect?

3
The Condor Project (Established '85)
  • Distributed High Throughput Computing research
    performed by a team of 25 faculty, full-time
    staff, and students who
    • face software engineering challenges in a
      distributed UNIX/Linux/NT environment,
    • are involved in national and international
      collaborations,
    • actively interact with academic and commercial
      users,
    • maintain and support a large distributed
      production environment,
    • and educate and train students.
  • Funding: US Govt. (DoD, DoE, NASA, NSF),
    AT&T, IBM, Intel, Microsoft, UW-Madison

4
What is High-Throughput Computing?
  • High-performance: CPU cycles/second under ideal
    circumstances.
    • "How fast can I run simulation X on this
      machine?"
  • High-throughput: CPU cycles/day (week, month,
    year?) under non-ideal circumstances.
    • "How many times can I run simulation X in the
      next month using all available machines?"
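
The difference between the two questions can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `runs_per_period` and every number in it (run time, machine count, availability fraction) are assumptions chosen for the example, not figures from the talk.

```python
def runs_per_period(hours_per_run, machines, availability, period_days=30):
    """Estimate how many complete runs fit into a period.

    availability is the assumed fraction of each day that a machine is
    actually free for our jobs (the "non-ideal circumstances").
    """
    usable_hours = machines * availability * 24 * period_days
    return int(usable_hours // hours_per_run)


# High-performance view: one dedicated machine for one day,
# with a hypothetical 6-hour simulation.
single_machine_day = runs_per_period(6, machines=1, availability=1.0, period_days=1)

# High-throughput view: 100 shared machines, each free about half the
# time, over a 30-day month.
shared_pool_month = runs_per_period(6, machines=100, availability=0.5)

print(single_machine_day, shared_pool_month)
```

Under these assumed numbers, the quantity that matters for high-throughput work is the total of usable machine-hours over the month, not the speed of any single machine.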

5
What is Condor?
  • Condor converts collections of distributively
    owned workstations and dedicated clusters into a
    distributed high-throughput computing (HTC)
    facility.
  • Condor uses ClassAd Matchmaking to make sure that
    everyone is happy.
  • Fault tolerance provided with checkpointing and
    other technologies.
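
The checkpointing idea can be sketched in a few lines. Note the assumption: the slide does not describe how Condor implements checkpointing, so the example below shows a simple application-level checkpoint written by the job itself, with a hypothetical function `run_with_checkpoints` and a simulated failure. It is not Condor's mechanism, only the general save-and-resume pattern.

```python
import json
import os
import tempfile


def run_with_checkpoints(total_steps, ckpt_path, fail_at=None):
    """Sum 0..total_steps-1, saving progress after every step.

    If ckpt_path already exists, resume from the saved state instead of
    starting over. fail_at simulates a machine disappearing mid-run.
    """
    state = {"step": 0, "acc": 0}
    if os.path.exists(ckpt_path):
        with open(ckpt_path) as f:
            state = json.load(f)

    while state["step"] < total_steps:
        if fail_at is not None and state["step"] == fail_at:
            raise RuntimeError("simulated machine failure")
        state["acc"] += state["step"]
        state["step"] += 1
        # Write to a temporary file, then rename, so a crash during the
        # write never leaves a half-written checkpoint behind.
        tmp_path = ckpt_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(state, f)
        os.replace(tmp_path, ckpt_path)
    return state["acc"]


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "job.ckpt")
    try:
        run_with_checkpoints(10, path, fail_at=5)
    except RuntimeError:
        pass  # the "machine" went away after five completed steps
    result = run_with_checkpoints(10, path)  # resumes at step 5

print(result)
```

The second call does not redo the first five steps; it picks up from the saved state, which is the property that lets work survive machine failures.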

6
The Condor System
  • Unix and NT
  • Operational since 1986
  • Manages more than 1300 CPUs at UW-Madison
  • Software available free on the web
  • More than 150 Condor installations worldwide in
    academia and industry

7
Some HTC Challenges
  • Condor does whatever it takes to run your jobs,
    even if some machines:
    • Crash (or are disconnected)
    • Run out of disk space
    • Don't have your software installed
    • Are frequently needed by others
    • Are far away and managed by someone else

8
What is ClassAd Matchmaking?
  • Condor uses ClassAd Matchmaking to make sure that
    work gets done within the constraints of both
    users and owners.
  • Users (jobs) have constraints:
    • "I need an Alpha with 256 MB RAM"
  • Owners (machines) have constraints:
    • "Only run jobs when I am away from my desk and
      never run jobs owned by Bob."
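
The two-sided nature of matchmaking can be sketched as follows. This is a minimal illustration using plain Python dictionaries and predicates, not Condor's ClassAd language; the attribute names (`arch`, `memory_mb`, `owner_away`), the machine list, and reading "256 MB" as "at least 256 MB" are all assumptions made for the example.

```python
def job_requirements(machine):
    """The user's constraint: an Alpha with (at least) 256 MB RAM."""
    return machine["arch"] == "ALPHA" and machine["memory_mb"] >= 256


def owner_policy(job, machine):
    """The owner's constraint: only when away, and never for Bob."""
    return machine["owner_away"] and job["owner"] != "bob"


# Hypothetical machine advertisements.
machines = [
    {"name": "m1", "arch": "INTEL", "memory_mb": 512, "owner_away": True},
    {"name": "m2", "arch": "ALPHA", "memory_mb": 256, "owner_away": True},
    {"name": "m3", "arch": "ALPHA", "memory_mb": 512, "owner_away": False},
]


def match(job, machines):
    """Return the names of machines where BOTH sides' constraints hold."""
    return [
        m["name"]
        for m in machines
        if job_requirements(m) and owner_policy(job, m)
    ]


alice_job = {"owner": "alice"}
bob_job = {"owner": "bob"}

print(match(alice_job, machines))
print(match(bob_job, machines))
```

A match requires agreement from both parties: `m1` fails the job's constraint, `m3` fails the owner's constraint, and Bob's job is rejected by every owner even where the hardware fits.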

9
Upgrade to Condor-G
  • A Grid-enabled version of Condor that provides
    robust job management for Globus.
  • Robust replacement for globusrun
  • Provides extensive fault-tolerance
  • Brings Condor's job management features to Globus
    jobs

10
What Have We Done on the Grid Already?
  • Example: NUG30
    • quadratic assignment problem
    • 30 facilities, 30 locations
    • minimize cost of transferring materials between
      them
    • posed in 1968 as a challenge, long unsolved
    • but with a good pruning algorithm and
      high-throughput computing...
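
To make the problem statement concrete, here is a small sketch of the quadratic assignment objective on a toy instance. The 3x3 `flow` and `dist` matrices are made up for illustration, and the exhaustive search shown is not the pruning algorithm mentioned on the slide; it only shows what is being minimized.

```python
from itertools import permutations


def qap_cost(flow, dist, assignment):
    """Total transfer cost when facility i is placed at assignment[i]."""
    n = len(assignment)
    return sum(
        flow[i][j] * dist[assignment[i]][assignment[j]]
        for i in range(n)
        for j in range(n)
    )


def brute_force_qap(flow, dist):
    """Try every assignment of facilities to locations (n! of them)."""
    n = len(flow)
    return min(permutations(range(n)), key=lambda p: qap_cost(flow, dist, p))


# Hypothetical toy data: material flow between facilities and
# distance between locations.
flow = [[0, 5, 2],
        [5, 0, 3],
        [2, 3, 0]]
dist = [[0, 1, 4],
        [1, 0, 2],
        [4, 2, 0]]

best = brute_force_qap(flow, dist)
best_cost = qap_cost(flow, dist, best)
print(best, best_cost)
```

With 3 facilities there are only 6 assignments to try. With 30 there are 30! (about 2.65 × 10^32), so exhaustive search is out of the question, which is why the slide pairs a good pruning algorithm with high-throughput computing.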

11
NUG30 Solved on the Grid with Condor + Globus
  • Resources simultaneously utilized:
    • the Origin 2000 (through LSF) at NCSA
    • the Chiba City Linux cluster at Argonne
    • the SGI Origin 2000 at Argonne
    • the main Condor pool at Wisconsin (600
      processors)
    • the Condor pool at Georgia Tech (190 Linux boxes)
    • the Condor pool at UNM (40 processors)
    • the Condor pool at Columbia (16 processors)
    • the Condor pool at Northwestern (12 processors)
    • the Condor pool at NCSA (65 processors)
    • the Condor pool at INFN (200 processors)

12
NUG30 - Solved!!!
  • Sender: goux@dantec.ece.nwu.edu
  • Subject: Re: Let the festivities begin.
  • Hi dear Condor Team,
  • you all have been amazing. NUG30 required 10.9
    years of Condor Time. In just seven days !
  • More stats tomorrow !!! We are off celebrating !
  • condor rules !
  • cheers,
  • JP.
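
The figures in the email imply an average level of parallelism that is easy to work out. The sketch below assumes that "Condor Time" means CPU time summed over all machines and that a year is 365 days; both are assumptions, not statements from the slides.

```python
cpu_years = 10.9      # "10.9 years of Condor Time" from the email
wall_days = 7         # "In just seven days"
DAYS_PER_YEAR = 365   # assumption: calendar year, leap days ignored

# Average number of processors that must have been busy at any moment.
avg_processors = cpu_years * DAYS_PER_YEAR / wall_days
print(round(avg_processors))
```

Under these assumptions, the run kept several hundred processors busy on average for the whole week.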

13
The Idea
  • Computing power is everywhere; we try to make
    it usable by anyone.

14
Condor Tutorial This Afternoon: Outline
  • Understanding Condor
  • Using Condor to manage jobs
  • Using Condor to manage resources
  • Condor Architecture and Mechanisms
  • Condor on the Grid
    • Flocking
    • Condor-G
  • Case Study: Distributed TeraFlop

15
Thank you!
  • Check us out on the Web:
    http://www.cs.wisc.edu/condor
  • Email:
    condor-admin@cs.wisc.edu