1
Harnessing the Capacity of Computational Grids
for High Energy Physics
Jim Basney, Miron Livny, Paolo Mazzanti
2
Background
  • This work is the result of an ongoing
    collaboration between the Condor Team at the
    University of Wisconsin Madison and the Bologna
    section of INFN
  • Collaboration started in 1996
  • An INFN Condor pool with more than 170 CPUs is
    serving the INFN community
    (http://www.mi.infn.it/condor)
  • New features were developed and tested as a
    result of this collaboration

3
Data Transfer Challenge
  • To harness for HEP the processing capacity of
    large collections of commodity computing
    resources (desktops and clusters), we need
    effective mechanisms and policies to manage the
    transfer and placement of checkpoint and data
    files, and a means to establish affinity between
    execution sites and data storage sites.

4
Need to take into account
  • Network topology and capabilities
  • Distribution, capabilities and availability of
    storage resources
  • Distribution, capacity and availability of
    computing resources
  • Impact on interactive users

5
The Condor HTC System
  • Condor is a distributed job and resource
    management system that employs a novel
    matchmaking approach to allocate resources to
    jobs.
  • Symmetric - Requests and Offers
  • Open - No centralized schema
  • Dynamic - Easy to change information and
    semantics
  • Expressive - Full power of Boolean expressions

6
ClassAd examples
Resource Offer
  OpSys        = "Solaris2.6"
  Arch         = "Sun4u"
  Memory       = 256
  LoadAvg      = 0.25
  Cluster      = "UWCS"
  Requirements = My.LoadAvg < 0.3
  Rank         = (Target.Group == "AI")

Resource Request
  Group        = "AI"
  Requirements = Target.Memory > 80 &&
                 Target.OpSys == "Solaris2.6" &&
                 Target.Arch == "Sun4u"
  Rank         = (Target.Cluster == "UWCS")
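
The symmetric match these two ads describe (each side's Requirements must hold against the other side's attributes) can be sketched as a toy in Python. This is a hypothetical illustration, not Condor's implementation: ads are plain dictionaries and the Requirements/Rank expressions are modelled as Python callables rather than the real ClassAd language.

```python
# Toy sketch of symmetric matchmaking (assumed structure, not Condor code).
# An ad is a dict of attributes; "Requirements" is a callable taking
# (my_ad, target_ad) and returning a bool.

offer = {
    "OpSys": "Solaris2.6", "Arch": "Sun4u",
    "Memory": 256, "LoadAvg": 0.25, "Cluster": "UWCS",
    "Requirements": lambda my, target: my["LoadAvg"] < 0.3,
    "Rank": lambda my, target: target.get("Group") == "AI",
}

request = {
    "Group": "AI",
    "Requirements": lambda my, target: (target["Memory"] > 80
                                        and target["OpSys"] == "Solaris2.6"
                                        and target["Arch"] == "Sun4u"),
    "Rank": lambda my, target: target.get("Cluster") == "UWCS",
}

def match(offer, request):
    # Symmetric: the offer's Requirements are checked against the request,
    # and the request's Requirements against the offer.
    return (offer["Requirements"](offer, request)
            and request["Requirements"](request, offer))

print(match(offer, request))  # → True
```

With the example ads above both Requirements are satisfied (load 0.25 < 0.3; 256 MB > 80, matching OpSys and Arch), so the pair matches; Rank would then be used to order multiple candidate matches.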
7
Checkpoint Domains
  • Every computational resource belongs to a
    checkpoint domain
  • Jobs can start on any resource
  • Checkpoint is saved to the local (domain)
    checkpoint server
  • Jobs are restarted only on local (domain)
    computational resources
  • Checkpoints can migrate
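
The placement rules above can be sketched as a small policy class. This is a hypothetical illustration of the stated rules, not Condor code; the class, attribute, and domain names are assumptions.

```python
# Sketch (assumed names, not Condor code) of the checkpoint-domain policy:
# a job may start on any resource, its checkpoint is written to that
# resource's domain server, and restarts are confined to that domain.

class Job:
    def __init__(self, name):
        self.name = name
        self.ckpt_domain = None  # set once the first checkpoint is saved

    def can_start(self, resource):
        # Before any checkpoint exists, any resource is acceptable;
        # afterwards, only resources in the checkpoint's domain.
        return (self.ckpt_domain is None
                or resource["domain"] == self.ckpt_domain)

    def checkpoint(self, resource):
        # Checkpoint is saved to the local (domain) checkpoint server.
        self.ckpt_domain = resource["domain"]

job = Job("sim01")
bologna = {"name": "pc01", "domain": "bologna"}
madison = {"name": "ws07", "domain": "madison"}

assert job.can_start(bologna)       # jobs can start on any resource
job.checkpoint(bologna)             # saved to Bologna's domain server
assert not job.can_start(madison)   # restart outside the domain refused
assert job.can_start(bologna)       # restart within the domain allowed
```

Migrating a checkpoint between domain servers would, under this sketch, amount to updating `ckpt_domain`, which re-enables starts in the new domain.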

8
I/O Domains
  • Each resource belongs to an I/O domain. A domain
    may consist of a single machine.
  • User stages input data on storage devices and
    updates the ClassAds of the jobs and/or the
    resources to reflect the location and
    availability of the data.
  • User is responsible for moving output data to
    storage system
  • Condor monitors and reports I/O activity
    performed via remote I/O
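
The second bullet, where the user stages data and then updates the ads so matchmaking respects data location, can be sketched as follows. This is a hypothetical toy, not Condor's mechanism; the attribute names (`io_domain`, `input_domain`) are assumptions.

```python
# Sketch (assumed attribute names) of expressing data affinity via ads:
# after staging input into one I/O domain's storage, the user updates the
# job ad so only resources in that domain satisfy its requirements.

resources = [
    {"name": "pc01", "io_domain": "bologna"},
    {"name": "ws07", "io_domain": "madison"},
]

job_ad = {
    # Where the user staged this job's input data.
    "input_domain": "bologna",
    # Only resources co-located with the staged data are acceptable.
    "requirements": lambda job, res: res["io_domain"] == job["input_domain"],
}

eligible = [r["name"] for r in resources
            if job_ad["requirements"](job_ad, r)]
print(eligible)  # → ['pc01']
```

Single-machine I/O domains fall out naturally: a resource whose `io_domain` names only itself can match only jobs whose data was staged on that machine.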

9
Ongoing I/O related work
  • Improve the performance and mapping capabilities
    of Condor's remote I/O
  • Provide interfaces to SRB (SDSC), SAM (FERMI) and
    CORBA (LBL) data storage systems.
  • Support co-scheduling of processing and network
    resources
  • Develop staging services and interface them with
    the matchmaking framework
  • Extend reporting and monitoring capabilities

10
Visit us at http://www.cs.wisc.edu/condor