Remote Operation of a Monte Carlo Production Farm Using Globus

1
Remote Operation of a Monte Carlo Production Farm
Using Globus
  • Dirk Hufnagel, Teela Pulliam,
  • Thomas Allmendinger, Klaus Honscheid
  • (Ohio State University) 

2
The Problem
  • High luminosity experiments need large MC samples
    (Belle, BaBar require hundreds of millions of MC
    events)
  • Massive computing power needed (farms of Linux
    machines)
  • Farms are typically geographically distributed
    • CLEO: two sites
    • DELPHI: five sites
    • BaBar: two dozen sites (US and Europe)
    • Belle: eight sites

3
Hardware alone is not sufficient
  • Hardware, system-level software maintenance
  • Experiment-specific MC software setup
  • MC production
    • Job submission
    • Job monitoring (rerun failed jobs)
    • Data transfer
  • Coordination

4
Is there another way?
  • Reduced manpower requirements
  • More efficient coordination
  • Our approach
    • Select one of the steps in the MC production
      chain
      • MC production
    • Centralize operations
      • Remote submission and monitoring
    • Evaluate GRID tools. Can they help with MC
      production?
      • Globus toolkit

5
OSU MC Production Farm
  • 27 dual Athlon nodes 1U
  • 1 dual Athlon server 4U
  • 840GB disk in RAID
  • OpenPBS batch system
  • File/batch queue server
  • 600-700k MC events/day
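
As a rough illustration of what this rate means, the sketch below estimates how long a farm of this size would need for a large sample. The 100 million event target is an assumed round number (the slides only say "hundreds of millions"), and `days_to_produce` is a hypothetical helper, not part of any production tool.

```python
def days_to_produce(total_events, events_per_day):
    """Days of continuous running needed to produce total_events."""
    if events_per_day <= 0:
        raise ValueError("events_per_day must be positive")
    return total_events / events_per_day


# Assumed target: 100 million events (illustrative, not from the slides).
TARGET_EVENTS = 100_000_000

for rate in (600_000, 700_000):  # the farm's quoted daily range
    days = days_to_produce(TARGET_EVENTS, rate)
    print(f"{rate:,} events/day -> {days:.0f} days")
```

Under these assumptions a single farm of this size needs roughly 140 to 170 days of uninterrupted running per 100 million events.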

6
Globus Toolkit
  • Globus
    • Secure access
      • Certificates for client and server
    • Remote command execution system
      • We observed significant overhead: a few seconds
        for a single command
    • Integrated tools
      • e.g. GridFTP
  • Installation at Ohio State
    • Globus 2.2.4 on dedicated server
    • Separate batch queue system for testing
    • No Resource Broker
    • Farm configuration details hidden
    • Loss of dynamic configurability but much simpler
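
To see why a per-command overhead of a few seconds matters, the following sketch totals the cost of issuing many remote commands one after another. The 3 s overhead and the 3000 commands are assumed values chosen for illustration, and the function name is hypothetical.

```python
def serial_overhead_hours(n_commands, overhead_seconds):
    """Total overhead, in hours, of n_commands issued one after another."""
    return n_commands * overhead_seconds / 3600.0


# Assumed values: 3 s of overhead per remote command, 3000 commands.
print(f"{serial_overhead_hours(3000, 3.0):.1f} hours of pure overhead")
```

The estimate assumes the commands are issued strictly in sequence; anything run in parallel would reduce the wall-clock cost.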

7
MC production I: Job submission
  • Typical input information
    • (MC software release), run range, events
  • To do
    • build MC jobs and submit them
  • Choose one option
    • One Globus command starts whole run range
      production
      • many (thousands of) local jobs
      • still need local script
    • One Globus command starts a single MC production
      job
      • Too slow
  • Submit all production runs at once
  • Only submit enough runs to fill queue
  • Re-submitted jobs proceed faster
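
The "only submit enough runs to fill queue" strategy can be sketched as follows. This is a minimal illustration, not the authors' script: `next_batch`, `drain`, and the `submit` callback are hypothetical names, and a real version would read the queue length from the batch system and call `qsub` where the callback is used here.

```python
from collections import deque


def next_batch(pending, jobs_in_queue, queue_capacity):
    """Remove and return just enough runs from `pending` to fill the queue."""
    free_slots = max(0, queue_capacity - jobs_in_queue)
    batch = []
    while pending and len(batch) < free_slots:
        batch.append(pending.popleft())
    return batch


def drain(run_range, queue_capacity, finished_per_cycle, submit):
    """Simulate polling cycles until every run has been submitted.

    `finished_per_cycle` stands in for jobs completing between polls;
    `submit` is called once per run (a stand-in for a local qsub call).
    Returns the number of polling cycles used.
    """
    if queue_capacity <= 0 or finished_per_cycle <= 0:
        raise ValueError("capacity and finished_per_cycle must be positive")
    pending = deque(run_range)
    in_queue = 0
    cycles = 0
    while pending:
        for run in next_batch(pending, in_queue, queue_capacity):
            submit(run)
            in_queue += 1
        # Pretend some jobs finished before the next poll.
        in_queue = max(0, in_queue - finished_per_cycle)
        cycles += 1
    return cycles


if __name__ == "__main__":
    submitted = []
    cycles = drain(range(1000, 1010), queue_capacity=4,
                   finished_per_cycle=2, submit=submitted.append)
    print(f"submitted {len(submitted)} runs in {cycles} cycles")
```

Keeping the queue short means a job that has to be submitted again waits behind only a few others, which is one plausible reading of the last bullet above.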

8
MC production II: Job monitoring
  • Job Status (qstat)
  • Use local script to monitor log files
  • Resubmit crashed jobs locally
  • Monitor through Globus (remotely)
  • Speed?
  • Data Quality Monitoring
  • check physics histograms
  • not always done during production
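
A local monitoring script of the kind mentioned above might look like the sketch below. It is an assumption-laden illustration: the log markers, the `*.log` naming, and the function names are hypothetical, since the slides do not describe the real log format.

```python
from pathlib import Path

# Hypothetical markers; a real MC application defines its own log messages.
SUCCESS_MARKER = "MC job finished"
FAILURE_MARKERS = ("Segmentation fault", "FATAL")


def classify_log(text):
    """Return 'done', 'crashed', or 'running' for one job's log text."""
    # Success is checked first so a completed job is never resubmitted.
    if SUCCESS_MARKER in text:
        return "done"
    if any(marker in text for marker in FAILURE_MARKERS):
        return "crashed"
    return "running"


def jobs_to_resubmit(log_dir):
    """Return the sorted job names (log file stems) whose logs show a crash."""
    crashed = []
    for log_file in sorted(Path(log_dir).glob("*.log")):
        text = log_file.read_text(errors="replace")
        if classify_log(text) == "crashed":
            crashed.append(log_file.stem)
    return crashed
```

The returned names would then be handed to the local submission script. Running such a scan on the farm itself, and querying only its summary remotely, keeps the number of Globus commands small.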

9
MC production III: Data transfer
  • Easy if MC output is in file format
  • GridFTP
  • Can be complicated otherwise
  • Example would be writing MC into a database
  • Remote or local file management?
  • Limited disk space -> delete generated MC
  • Log files
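
For the file-based case, the transfer-then-delete cycle could be sketched as below. `globus-url-copy` is the GridFTP command-line client, used here in its basic `globus-url-copy <source-url> <destination-url>` form; the host name, the remote directory, and the helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_copy_command(local_file, remote_host, remote_dir):
    """Build the globus-url-copy argument list for one output file."""
    if not remote_dir.startswith("/"):
        raise ValueError("remote_dir must be an absolute path")
    local_path = Path(local_file).resolve()
    source = local_path.as_uri()  # file:///... URL for the local file
    destination = (f"gsiftp://{remote_host}"
                   f"{remote_dir.rstrip('/')}/{local_path.name}")
    return ["globus-url-copy", source, destination]


def transfer_and_delete(local_file, remote_host, remote_dir,
                        run=subprocess.call):
    """Copy a file and delete the local copy only if the copy succeeded.

    `run` takes the argument list and returns an exit code; it can be
    replaced with a stub for testing. Returns True if the file was sent.
    """
    command = build_copy_command(local_file, remote_host, remote_dir)
    if run(command) != 0:
        return False  # keep the local file so the transfer can be retried
    Path(local_file).unlink()
    return True
```

A production version would probably also verify the remote copy, for example by size or checksum, before freeing the local disk space.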

10
Conclusion
  • MC production for a high luminosity experiment
    requires significant hardware and manpower
    resources.
  • GRID tools can help to centralize this effort.
  • Simple tests show that remote operation of MC
    farms is possible
  • Relatively easy to set up
  • Globus framework (secure access, remote command
    execution)
  • Local scripts for job submission, monitoring
  • Still, significant software infrastructure
    (local scripts) required.
  • Other parts of the MC production chain need to be
    addressed before this becomes a realistic option.
  • Remote MC software installation and version
    management