Dave Newbold, University of Bristol, 24/6/2003

1
CMS MC production tools
  • A lot of work in this area recently!
  • Context: PCP03 (100 TB) has just started
  • Short-term: development team of 10 people; core
    deployment team of 10 people? (incl. UK).
  • New generation of tools
  • Based upon the existing distributed toolset:
    IMPALA, BOSS, RefDB
  • Evolution draws from experience gained in DC02
  • Not explicitly designed for use on the LCG testbed,
    but intended to operate on the Grid later (experience
    from the CMS EDG stress test, etc.)
  • New umbrella project: OCTOPUS
  • Covers all CMS distributed production and Grid
    tools
  • Overtly Contrived Toolkit of Previously
    Unrelated Stuff?
  • Oh Crap Time to Operate Production
    Uber-Software
  • Formal support system / bug tracking now in place
    (via Savannah)
  • Our worldwide Octopus has more than eight arms

2
The problems to solve
  • The nature of CMS production
  • Highly distributed (30 sites)
  • Some sites have MUCH more resources (kit, people)
    than others
  • We produce useful data, so DQM (data quality
    monitoring) is very important
  • The application chain is somewhat complex
  • Different event types require different
    processing chains
  • High-lumi background simulation presents some
    special problems
  • Some key issues:
  • Communication (fortnightly VRVS meetings, very
    useful)
  • Documentation, support for installation and use
    of tools
  • Adaptability of production system to local
    conditions (now easier)
  • Real-time data and metadata validation
  • Data storage and migration between sites (data is
    NOT bunged off to CERN)
  • Hotspots in the distributed computing system (CERN,
    RAL, FNAL)

3
Core user-side toolset
  • McRunjob: generic Python local production
    framework
  • Originally a D0 tool; D0 and CMS versions almost
    merged
  • Glues together the various stages of a
    production chain in a consistent and generic way;
    handles job setup and input / output tracking (a
    rough sketch of this pattern appears at the end of
    this slide)
  • CMS-specific classes are provided to configure
    our applications.
  • ImpalaLite: CMS-specific modules in McRunjob
  • Core functionality from IMPALA, handling job
    preparation
  • Interfaces to the global CMS bookkeeping database
    (RefDB), data validation, job submission
  • BOSS: local job submission and tracking
  • Provides a uniform interface to the various batch
    systems (PBS, LSF, BQS, MOP, etc.)
  • Based on a MySQL job-tracking database
  • BODE is a web-based front end for local job
    management
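
The "gluing" pattern above can be made concrete with a minimal
sketch, assuming a toy chain of CMKIN -> CMSIM -> ORCA stages in
which each stage's outputs become the next stage's inputs. All
class, script and file names below are invented for illustration;
this is not the real McRunjob / ImpalaLite API.

# Hypothetical sketch of a production-chain "gluer"; names are illustrative.

class Stage:
    """One step of the chain (e.g. CMKIN, CMSIM, ORCA digitisation)."""
    def __init__(self, name, executable, params):
        self.name = name
        self.executable = executable
        self.params = params                    # run number, event count, ...

    def make_job(self, inputs):
        """Build a job description from upstream outputs plus stage parameters."""
        outputs = ["%s_run%d.out" % (self.name, self.params["run"])]
        script = "%s %s" % (self.executable, " ".join(inputs))
        return {"script": script, "inputs": inputs, "outputs": outputs}


class ProductionChain:
    """Glues stages together and tracks which files flow between them."""
    def __init__(self, stages):
        self.stages = stages

    def prepare(self, primary_inputs):
        jobs, inputs = [], primary_inputs
        for stage in self.stages:
            job = stage.make_job(inputs)
            jobs.append(job)
            inputs = job["outputs"]             # one stage feeds the next
        return jobs


if __name__ == "__main__":
    chain = ProductionChain([
        Stage("cmkin", "run_cmkin.sh", {"run": 2001}),
        Stage("cmsim", "run_cmsim.sh", {"run": 2001}),
        Stage("orca_digi", "run_orca.sh", {"run": 2001}),
    ])
    for job in chain.prepare(["request_2001.cards"]):
        print(job["script"], "->", job["outputs"])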

4
System-side toolset
  • RefDB: central bookkeeping / metadata database
  • Provides the (physicist) user interface for
    requesting data
  • Web interface allows users to track their
    requests, drill down into detailed metadata
    corresponding to produced data
  • Used remotely by ImpalaLite at job preparation
    time to establish job input parameters, etc. (a
    hypothetical query sketch appears at the end of
    this slide)
  • Based upon a MySQL database at CERN
  • DAR: packaging of applications
  • Very simple way of automatically packaging CMS
    software components (CMKIN, CMSIM, OSCAR, ORCA)
    with required libraries, etc
  • Minimal dependence upon site conditions
  • Ensures uniformity of application versions, etc,
    across sites.
  • NB: only one current platform for production,
    Linux RH 7.3
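
As a rough illustration of "ImpalaLite queries RefDB at job
preparation time": a minimal sketch assuming a MySQL connection via
mysql-connector-python and an invented requests table. The real
RefDB schema, host and credentials are not reproduced here.

# Hypothetical sketch only: table / column names, host and credentials
# are placeholders, not the real RefDB schema.

import mysql.connector

def fetch_request_parameters(request_id):
    """Pull the parameters of one production request from a central MySQL DB."""
    conn = mysql.connector.connect(
        host="refdb.example.cern.ch",            # placeholder host
        user="reader", password="secret",        # placeholder credentials
        database="refdb",
    )
    try:
        cur = conn.cursor(dictionary=True)
        cur.execute(
            "SELECT dataset, generator, nevents, first_run, geometry "
            "FROM requests WHERE request_id = %s",
            (request_id,),
        )
        return cur.fetchone()                    # one row of request metadata
    finally:
        conn.close()

if __name__ == "__main__":
    print("Preparing jobs for", fetch_request_parameters(1234))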

5
RefDB web user interface
One drawback: need a big laptop screen for the browser!
6
Data handling
  • dCache: pileup background serving
  • Highly challenging from the hardware point of
    view
  • e.g. need to serve up to 200 MByte/s to the RAL
    farm during the high-lumi digitisation step; cheap
    disk servers don't cut it due to the random-seek
    access pattern
  • Some large sites planning to use dCache for the
    background library
  • Each sub-farm (workers on one network switch)
    has its own local disk pool; should provide a
    scalable solution without killing the network (a
    rough throughput estimate appears at the end of
    this slide)
  • SRB: wide-area data management
  • Subject of some debate in CMS (versus Grid tools)
  • SRB is a short-term solution, since nothing else
    works (at the 100 TB scale); results from the CMS
    EDG stress test and UK / US work in '03.
  • Supported via UCSD / FNAL and the RAL e-Science
    centre
  • RAL will host central MCAT server for PCP03
    (thanks RAL).
  • Generic interface to the RAL datastore in testing
    phase
  • CMS UK responsible for roll-out and support for
    PCP03
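
A back-of-envelope check of the per-sub-farm pool idea, as a minimal
sketch: only the 200 MByte/s aggregate figure comes from the slide;
the farm layout and disk-server numbers below are illustrative
assumptions.

# Rough throughput estimate; all numbers except the 200 MByte/s
# aggregate figure are illustrative assumptions.

AGGREGATE_RATE_MB_S = 200.0   # pileup rate needed by the whole farm (from slide)
N_SUBFARMS = 8                # assumed: worker nodes grouped per network switch
WORKERS_PER_SUBFARM = 20      # assumed
POOL_RATE_MB_S = 40.0         # assumed sustained rate of one local pool server

per_subfarm = AGGREGATE_RATE_MB_S / N_SUBFARMS
per_worker = per_subfarm / WORKERS_PER_SUBFARM

print("Each sub-farm pool must serve ~%.0f MByte/s" % per_subfarm)
print("i.e. ~%.2f MByte/s per worker node" % per_worker)
print("Pool headroom factor: %.1fx" % (POOL_RATE_MB_S / per_subfarm))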

7
Grid integration
  • Current status
  • Toolset designed for distributed use but not
    built on Grid middleware
  • Reflection of the current scalability of many
    Grid components?
  • EDG stress test taught us a lot about what is
    possible (now).
  • Plan: Grid tools to be introduced and tested
    during PCP03
  • The goal: Grid data handling, monitoring, job
    scheduling for DC04
  • Some first targets:
  • BOSS + R-GMA for real-time monitoring
  • Replica management to supplement / replace SRB
  • CMS-owned testbed (LCG-0) in place at several
    sites
  • Yes, yet another testbed
  • Based upon LCG pilot + VOMS + R-GMA + Ganglia
  • Can test the CMSprod product, integrating the
    existing toolset with Grid middleware
  • NB: many crucial local issues unaddressed by the
    Grid model; discuss!

8
The worrying side effects of PCP