Use of the European Data Grid software in the framework of the BaBar distributed computing model

1
Use of the European Data Grid software in the
framework of the BaBar distributed computing
model
  • T. Adye (1), R. Barlow (2), B. Bense (3), D.
    Boutigny (4), D. Colling (5), B. Cowles (3), A.
    Forti (2), D. Smith (6), G. Grosdidier (7), A.
    Hasan (3), J. Martyniak (5), A. McNab (2), R.
    Walker (5)

On behalf of the BaBar computing group
(1) Rutherford Appleton Laboratory - (2)
University of Manchester - (3) Stanford Linear
Accelerator Center - (4) Laboratoire d'Annecy-le-Vieux
de Physique des Particules CNRS / IN2P3
- (5) University of London, Imperial College -
(6) University of Birmingham - (7) Laboratoire de
l'Accélérateur Linéaire CNRS / IN2P3
2
Motivations for BaBar-Grid: BaBar Specificities (1)
  • Distributed computing is one of the main axes of
    the BaBar computing model
  • Tier A: the main computing centers - hold all or a
    large fraction of the data
  • Currently SLAC, IN2P3, RAL and FZK/GridKa
  • INFN Padova is specialized in data reprocessing;
    it will probably become an analysis Tier A later
  • INFN Ferrara (with SLAC) is looking at MC
    production on the Grid
  • Tier B: does not really exist
  • Tier C: smaller centers, which hold only small
    chunks of data or n-tuples

3
Motivations for BaBar-Grid: BaBar Specificities (2)
  • Special Configuration in the UK
  • Large center at RAL
  • Several smaller centers with significant
    computing and data storage resources
  • Main motivations for BaBar-Grid
  • Need a simple and reliable tool for remote job
    submission
  • Data may be spread across several sites
  • Need a metadata catalog and a tool to
    automatically split the jobs and submit them to
    the centers holding the data
  • BaBar is taking data; the introduction of Grid
    tools must not disrupt physics production

4
Short-Term Goals for BaBar-Grid Developments
  • Set up a Grid system able to submit analysis jobs
    to the major Tier-A centers
  • Proof of concept
  • Demonstrate usage in real analysis applications
  • Test various Grid implementations and their
    inter-operability (EDG, LCG-1, VDT, ...)
  • Have to handle two data formats: Objectivity and
    Root
  • Data distribution → "Distributing BaBar Data
    using the Storage Resource Broker (SRB)", W.
    Kroeger (previous talk)
  • BdbServer: a user-driven data location and
    retrieval tool (Poster)
  • Metadata catalog and automatic job splitting →
    "BaBar WEB job submission with Globus
    authentication and AFS access", A. Forti

5
The BaBar Grid as of March 2003
[Diagram: the BaBar Grid testbed - a Resource Broker (RB), the Virtual
Organization (VO) server and Replica Catalog (RC), and several sites each
providing a Computing Element (CE), a Storage Element (SE) and Worker Nodes (WN)]
6
European Data Grid (EDG) Setup
  • BaBar benefits from the EDG test bed
    installations at the European sites
  • We just had to add a dedicated Virtual
    Organization (VO) and a Replica Catalog (in
    Manchester)
  • An automatic system has been developed so that any
    BaBar user can register their certificate with the
    VO (a sketch of such a registration step follows
    this list)
  • The existence of a special file in the SLAC AFS
    space serves as proof that the user is registered
    in BaBar
  • We use the RB installed at Imperial College which
    is shared with other experiments using the EDG
    test bed
  • We decided to restrict ourselves to basic RC
    usage
  • We don't use GDMP
  • We look forward to testing RLS
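
A minimal sketch (not the actual BaBar tool) of what the automatic registration step could look like, assuming a hypothetical AFS marker file and a hypothetical request list later consumed by the VO server; only the openssl call is standard:

    #!/bin/sh
    # Hypothetical paths: the AFS file proving BaBar membership and the list of
    # certificate subjects handed to the VO administrators are assumptions.
    MARKER="/afs/slac.stanford.edu/g/babar/$USER/grid-registered"   # hypothetical marker file
    REQUESTS="$HOME/babar-vo-requests.txt"                          # hypothetical request list

    if [ ! -f "$MARKER" ]; then
        echo "user $USER is not registered as a BaBar member" >&2
        exit 1
    fi

    # Extract the certificate subject (DN) from the user's grid certificate.
    DN=$(openssl x509 -noout -subject -in "$HOME/.globus/usercert.pem" | sed 's/^subject= *//')

    # Queue the DN for insertion into the BaBar VO (assumption: the VO server
    # is fed from this list by an administrative job).
    echo "$DN" >> "$REQUESTS"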

7
SLAC Setup
  • The EDG software has been deployed at SLAC
  • Version 1.3.4 compatible with RB 1.4.x
  • Some special adaptations had to be made
  • The WNs run under LSF
  • The WNs are located behind a firewall, so they
    cannot communicate directly with the RB
  • Solved by splitting the submission scripts in
    such a way that all communication goes through
    the Gatekeeper
  • SLAC accepts both EDG and DOE certificates
  • AFS
  • gssklog has been installed in order to obtain AFS
    tokens (see the wrapper sketch after this list)
  • The fact that EDG 1.3 / 1.4 needs RH 6.2 is a
    real problem and requires a special arrangement
    with the Computing Services
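
As an illustration, a WN job wrapper at SLAC could obtain the AFS token before starting the analysis; this is only a sketch, and the executable and tcl file names, as well as any site-specific gssklog options, are assumptions:

    #!/bin/sh
    # Obtain an AFS token from the user's grid credentials (gssklog may need
    # site-specific options such as the token server; omitted here as an assumption).
    gssklog || { echo "could not obtain an AFS token" >&2; exit 1; }

    # Show the tokens obtained (standard AFS 'tokens' command).
    tokens

    # Run the BaBar analysis executable with the tcl file delivered through the
    # input sandbox (hypothetical names).
    ./BetaApp MyAnalysis.tcl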

8
RB Specificities
  • One major problem with EDG 1.4.x is related to
    the Meta Directory Service (MDS)
  • Resources disappear randomly from the
    Information Index (II)
  • Two solutions:
  • Replace the dynamic Information Index by a static
    one (BDII) → the solution tested and recommended
    by EDG
  • Install monitoring scripts which automatically
    detect disappearing and reappearing resources and
    restart the II accordingly (a sketch follows this
    list)
  • This sometimes gives a flaky II, oscillating as
    resources come in and out
  • If this happens, the resource-matching process
    fails
  • Both solutions have been tested at Imperial
    College
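
A minimal sketch of such a monitoring script, assuming the II is an LDAP server publishing the usual MDS tree; the host, port, entry threshold and restart command are placeholders to be replaced by the site-specific values:

    #!/bin/sh
    # Placeholders (assumptions): II host/port, minimum expected entries,
    # and the command that restarts the Information Index.
    II_HOST="ii.example.org"
    II_PORT=2170
    MIN_ENTRIES=5
    RESTART_CMD="/etc/init.d/information-index restart"

    while true; do
        # Anonymous LDAP search of the MDS tree; count the published entries.
        N=$(ldapsearch -x -h "$II_HOST" -p "$II_PORT" \
                -b "mds-vo-name=local,o=grid" "(objectclass=*)" dn 2>/dev/null \
                | grep -c '^dn:')
        if [ "$N" -lt "$MIN_ENTRIES" ]; then
            echo "$(date): only $N entries published, restarting the II"
            $RESTART_CMD
        fi
        sleep 300
    done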

9
The Analysis Job Use Case
  • The user has an executable and a configuration
    file (tcl)
  • The executable needs input data in Objectivity or
    Root format depending on the running site
  • The result of the analysis job is a Root-tuple
    and a log file
  • We assume that a suitable BaBar release is
    available at the target site
  • In the future, we may package the BaBar release
    so that it can be installed before actually
    running the job
  • We want the executable to be stored in a Storage
    Element (SE) close to the Computing Element (CE)
  • The input tcl file is sent through the input
    sandbox (OK, as it is relatively small)
  • The output log file is returned through the
    output sandbox
  • The Root-tuple is stored in a SE close to the CE
    (a JDL sketch for such a job follows this list)
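
To make the use case concrete, the job description handed to the RB could look roughly like the JDL written below from a shell script; the file names are hypothetical, and the submission command name depends on the EDG release (dg-job-submit in EDG 1.x, edg-job-submit later):

    #!/bin/sh
    # Write a minimal JDL file: the wrapper script and the tcl file travel in the
    # input sandbox, only the small log files come back through the output
    # sandbox; the n-tuple is copied to a close SE by the job itself.
    {
        echo 'Executable    = "run-analysis.sh";'
        echo 'Arguments     = "MyAnalysis.tcl";'
        echo 'StdOutput     = "analysis.log";'
        echo 'StdError      = "analysis.err";'
        echo 'InputSandbox  = {"run-analysis.sh", "MyAnalysis.tcl"};'
        echo 'OutputSandbox = {"analysis.log", "analysis.err"};'
    } > analysis.jdl

    # Submit through the resource broker (command name depends on the EDG release).
    dg-job-submit analysis.jdl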

10
The Machinery
[Diagram: the job machinery - the executable is staged from a close SE, the
job runs on the CE, and the resulting n-tuple is written back to a close SE]
11
Getting a generic script able to run everywhere
  • Make use of the edg-brokerinfo commands
  • Discover the CE and SE parameters
  • For instance
  • edg-brokerinfo getCloseSEs returns the closest SE
    hostnames
  • edg-brokerinfo getSEMountPoint returns the mount
    point of the SE file system
  • → The EDG API makes it very simple to build a
    fully generic script (see the sketch below)
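
A sketch of such a generic worker-node script, using only the two edg-brokerinfo calls named above plus globus-url-copy for the final copy; the executable, tcl file and target directory are hypothetical, and the exact argument convention of getSEMountPoint may differ between EDG releases:

    #!/bin/sh
    # Discover the close SE and its mount point at run time (site-independent).
    CLOSE_SE=$(edg-brokerinfo getCloseSEs | head -1)
    MOUNT=$(edg-brokerinfo getSEMountPoint "$CLOSE_SE")   # argument convention is an assumption

    # Run the analysis (hypothetical wrapper and tcl file names).
    ./run-analysis.sh MyAnalysis.tcl

    # Store the resulting Root-tuple on the close SE, assuming it exposes a GridFTP door.
    globus-url-copy "file://$PWD/ntuple.root" \
        "gsiftp://$CLOSE_SE$MOUNT/babar/ntuple.root"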

12
Results
  • Success rate: "submission OK, and n-tuple and
    log files recovered"
  • With the dynamic MDS equipped with the control
    scripts:
  • Success rate: 55% to 75%
  • 98% of the failing jobs are due to the RB being
    unable to match the requested resources with any CE
  • With the static MDS:
  • Success rate: 99%
  • A few jobs have been lost by the RB!
  • During the tests we were also hit by a limit of
    512 jobs present at the same time in the RB → a
    serious limitation, but it should be removed in
    future versions

13
Monte-Carlo Production
  • Very active work to "grid-ify" BaBar MC
    production
  • Similar to the analysis application previously
    described, but with a stable and controlled
    environment
  • Store the MC executable on the SE(s)
  • Produce output files in Root format → store them
    in the SE
  • Send the data back to SLAC or a Tier-A
  • Need to package the MC production in order to be
    able to run at any institute, even those not
    maintaining the BaBar software
  • One difficulty: even if we produce data in Root
    format, we still need Objectivity for the
    conditions data
  • See the Poster Session for more details

14
Conclusions
  • Grid technology is of prime importance for BaBar
    to fully exploit its distributed computing model
  • Many Grid activities related to
  • Data distribution
  • MC production
  • Analysis
  • We have demonstrated that EDG has all the
    functionality necessary for running analysis
    jobs on the Grid
  • Reliability is much better with the static MDS,
    but several issues remain open regarding the
    scalability of the system
  • We look forward to testing EDG 2.0 and are open
    to other Grid implementations
  • Will test VDT soon
  • Will move to LCG-1 as soon as it is available
  • We do not expect to have the same Grid software
    implemented everywhere
  • We need to work on the inter-operability of the
    various systems