A CMS computing project

Provided by: nianqi
Transcript and Presenter's Notes

Title: A CMS computing project


1
A CMS computing project: BOSS (Batch Object
Submission System)
  • Zhang YongJun
  • (Imperial College London)
  1. Background: Grid and LHC
  2. CMS computing project: BOSS

2
LHC (Large Hadron Collider)
  • The LHC is a particle accelerator located at CERN,
    which is situated near Geneva on the border between
    Switzerland and France. It is scheduled to start
    operation in 2007.
  • The LHC will collide protons at a collision energy
    of 14 TeV and will also collide heavy ions such as
    lead (Pb).

3
Detector and trigger
  • 75 million electronics channels from the various
    subdetectors
  • Data from the detector are electrical signals.
  • By applying calibration, physical quantities
    (momentum, energy) can be derived from the
    strength of the electrical signal.
  • The trigger system selects interesting events.
  • The reconstruction procedure builds physics objects
    and their properties from the raw event.
  • Data analysis applies a set of cuts to select
    the events corresponding to a specific
    physics channel.

4
Software
full simulation
fast simulation
physics
  • Simulation is essential for detector and software
    design as well as for data analysis.
  • Fast simulation is quicker than full simulation,
    but it depends on parameters extracted from
    full simulation.

(Diagram labels: LHC, detector, generator, simulation, digitization,
trigger, fast simulation, reconstruction, data analysis (ROOT))
5
LHC computing model
  • CMS sends 225 MB/s from online to offline. The
    resulting data volume is beyond the ability of any
    single site to process, so a tiered data
    distribution structure is proposed. CERN is
    Tier 0, and every country has one Tier 1 and
    several Tier 2 sites.
  • Tier 1 reconstructs events and hosts data. Tier 2
    runs physicists' analysis jobs.
  • This tier structure is built on Grid software.
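
To get a feel for why one site cannot cope, the quoted rate can be turned into a daily volume. This is only a back-of-envelope sketch: it assumes continuous data taking at 225 MB/s and decimal units (1 TB = 10^6 MB), neither of which is stated on the slide.

```python
# Back-of-envelope estimate; continuous running and decimal units are assumptions.
RATE_MB_PER_S = 225              # online-to-offline rate quoted on the slide
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_tb(rate_mb_per_s: float) -> float:
    """Return the data volume accumulated in one day, in TB (1 TB = 10**6 MB)."""
    return rate_mb_per_s * SECONDS_PER_DAY / 1_000_000


print(f"{daily_volume_tb(RATE_MB_PER_S):.2f} TB per day")
```

Under these assumptions the rate corresponds to roughly 19 TB per day, which motivates spreading storage and processing over several tiers.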

6
Computing before Grid
CERN
Imperial College
(yjzhang)
(yzhang)
RAL
(????????)
  • An account is needed at every site in order to
    submit jobs there.
  • To submit jobs to a newly joined site, a new
    account needs to be created.
  • Although these sites take part in the same
    project, such as CMS, it is difficult to share
    CPU and data.

7
Computing on Grid
CERN
Imperial College
  • Instead of using an account, the user holds a
    certificate to submit jobs.
  • The sites that accept this certificate form a
    Virtual Organization (VO). All sites that have
    joined the CMS experiment can join the CMS VO.
  • A certificate is issued by a certificate
    authority and is signed using the RSA algorithm.
  • On top of the VO, more services can be added to
    help users submit jobs, for example scheduling
    and monitoring.

RAL
certificate
Bag Attributes
    friendlyName: yongjun zhang's eScience ID
    localKeyID: 65 AB 3E 55 38 77 49 B3 3A 93 26 B5 08 68 D1 8C A9 CD 6A D8
subject=/C=UK/O=eScience/OU=Imperial/L=Physics/CN=yongjun zhang
issuer=/C=UK/O=eScience/OU=Authority/CN=CA/emailAddress=ca-operator_at_grid-support.ac.uk
-----BEGIN CERTIFICATE-----
MIIFbzCCBFegAwIBAgICFHowDQYJKoZIhvcNAQEFBQAwcDELMAkGA1UEBhMCVUsx
ETAPBgNVBAoTCGVTY2llbmNlMRIwEAYDVQQLEwlBdXRob3JpdHkxCzAJBgNVBAMT
.
-----END CERTIFICATE-----
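
The subject and issuer lines of the certificate are distinguished names. A minimal sketch of splitting such a name into its fields is shown here; it assumes the usual slash-separated `/KEY=value` form (the `=` signs appear to have been lost in the transcript), and `parse_dn` is a hypothetical helper, not part of any Grid toolkit. It does not verify the certificate signature, which requires a cryptographic library.

```python
def parse_dn(dn: str) -> dict:
    """Split a slash-separated distinguished name into a {key: value} dict."""
    fields = {}
    for part in dn.strip("/").split("/"):
        key, _, value = part.partition("=")
        fields[key] = value
    return fields


# Subject taken from the slide, with the assumed "=" signs restored.
subject = "/C=UK/O=eScience/OU=Imperial/L=Physics/CN=yongjun zhang"
fields = parse_dn(subject)
print(fields["CN"], "from", fields["OU"])
```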
8
Work flow management on Grid
CERN (VO CMS)
Imperial College (VO CMS)
Resource Broker (CMS)
RAL (VO CMS)
certificate
  • To make job submission even easier for users, a
    job submission service, the Resource Broker (RB),
    can be set up on top of the VO. The RB can act on
    the user's behalf and submit the job to a
    non-busy site.
  • To accept jobs submitted from all over the VO, a
    dedicated cluster can be set up as a Computing
    Element (CE).
  • Similarly, many VO-based services, such as
    monitoring and logging, have been developed.
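
The idea of a broker choosing a non-busy site can be sketched as follows. This is an illustration only: the `ComputingElement` fields, the load numbers and the "fewest queued jobs" rule are all assumptions made for the example, not the actual Resource Broker matchmaking algorithm.

```python
from dataclasses import dataclass


@dataclass
class ComputingElement:
    site: str
    free_slots: int    # hypothetical load metric
    queued_jobs: int   # hypothetical load metric


def pick_site(elements, vo_sites):
    """Pick the CE with the fewest queued jobs among VO sites that have free slots."""
    candidates = [ce for ce in elements if ce.site in vo_sites and ce.free_slots > 0]
    if not candidates:
        raise RuntimeError("no free Computing Element in this VO")
    return min(candidates, key=lambda ce: ce.queued_jobs)


# Illustrative numbers only.
elements = [
    ComputingElement("CERN", free_slots=0, queued_jobs=120),
    ComputingElement("Imperial College", free_slots=12, queued_jobs=3),
    ComputingElement("RAL", free_slots=40, queued_jobs=15),
]
cms_vo = {"CERN", "Imperial College", "RAL"}
print(pick_site(elements, cms_vo).site)
```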

9
Work flow management on Grid
CERN (VO CMS)
Imperial College (VO CMS)
Resource Broker (CMS)
RAL (VO CMS)
certificate
LFN
PFN
Catalogue DataBase
  • On the Grid, the user specifies a file by its
    Logical File Name (LFN). A Grid service looks up
    the catalogue database to find all the
    corresponding Physical File Names (PFNs) and
    selects one of them to do the real work. A UUID
    links the LFN to its PFNs.
  • A dedicated site can be built as a Storage
    Element (SE) to host large amounts of data, for
    example gfe02.hep.ph.ic.ac.uk, which uses
    dCache.

10
BOSS - Batch Object Submission System
CRAB
BOSS
BOSS Logging
  • BOSS is part of the CMS workload management system.
  • BOSS provides logging, bookkeeping and
    monitoring.
  • BOSS sits between the user (CRAB) and the
    scheduler/Grid.
  • BOSS is a generic submission tool and will
    provide Python / C++ APIs to be used by CRAB;
    CRAB+BOSS then form the complete submission
    tool.

monitoring
11
Sample Task specification
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<task>
  <iterator name="ITR" start="0" end="100" step="1">
    <chain scheduler="glite" rtupdater="mysql" ch_tool_name="jobExecutor">
      <program exec="test.pl"
               args="ITR"
               stderr="err_ITR"
               program_type="test"
               stdin="in"
               stdout="out_ITR"
               infiles="Examples/test.pl,Examples/in"
               outfiles="out_ITR,err_ITR"
               outtopdir="" />
    </chain>
  </iterator>
</task>
  • Example of a task containing 100 chains, each
    consisting of 1 program.
  • Program-specific monitoring is activated; results
    are returned via a MySQL connection.
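
How one iterator turns a single program description into 100 chains can be sketched as below. This is not BOSS code: it assumes the iterator range excludes `end` (consistent with the "100 chains" stated above) and that the iterator name is substituted as plain text in every attribute. `expand` is a hypothetical helper, and the XML is a shortened version of the sample task.

```python
import xml.etree.ElementTree as ET

# Shortened version of the sample task; attribute names follow the slide.
TASK_XML = """<task>
  <iterator name="ITR" start="0" end="100" step="1">
    <chain scheduler="glite" rtupdater="mysql" ch_tool_name="jobExecutor">
      <program exec="test.pl" args="ITR" stdout="out_ITR" stderr="err_ITR"/>
    </chain>
  </iterator>
</task>"""


def expand(task_xml):
    """Return one dict of program attributes per iterator value."""
    iterator = ET.fromstring(task_xml).find("iterator")
    name = iterator.get("name")
    values = range(int(iterator.get("start")),
                   int(iterator.get("end")),      # assumed to be exclusive
                   int(iterator.get("step")))
    programs = []
    for value in values:
        for program in iterator.iter("program"):
            # Plain text substitution of the iterator name (an assumption).
            programs.append({key: text.replace(name, str(value))
                             for key, text in program.attrib.items()})
    return programs


jobs = expand(TASK_XML)
print(len(jobs), jobs[0]["stdout"], jobs[-1]["stderr"])
```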

12
BOSS components overview
user CLI
admin. CLI
Python interface
user GUI
Pro-active UI ?
BOSS Logging
BOSS kernel APIs kernel objects BossTask,
database scheduler
Grid or Local Scheduler
monitoring
BOSS on WN jobExecutor tar ball Configuration
file and executables
  • BOSS has two parts: (1) BOSS on the UI and
    (2) BOSS on the WN.
  • BOSS on the UI has two sub-layers: (a) the user
    interface and (b) the BOSS kernel.
  • The BOSS kernel in turn includes APIs
    (BossUserSession, BossAdministratorSession) and
    kernel objects (BossConfiguration, BossTask,
    BossDataBase and BossScheduler).
  • BOSS on the WN has a level structure: Task,
    Chain, Program, userExecutable.
13
BOSS internal data flow
administrator
user/CRAB
WN
task.xml
schema.xml
API/user
API/adm
Job.tar ( job.xml, monitoring,,ORCA,input...)
JOB_ID 1
START_TIME
STATUS running

JOB_ID TYPE INPUT
1 ORCA FILE1
2 ORCA FILE2
wrapper /Shreek
Journal file
BOSS logging
scheduler/JDL
monitoring
14
BOSS internal work flow
15
BOSS WN reorganization proposal
BOSS UI reorganization proposal
job.tar
job components
plug-in
core services
blackboard
pro-active Plug-in
JobExecuter
File of job configuration
pro-active service
JobMonitor

programChaining
monitor interface
  • Everything variable goes into the configuration
    file, so the remaining components stay simple and
    no recompilation is needed when new components
    are added.
  • The configuration file is created during the job
    preparation stage and holds all the information
    needed.
  • JobExecuter only has to interpret the
    configuration file.
  • Core services can talk to each other, so they
    depend on each other.
  • A plug-in talks only to services, which keeps it
    independent enough to be pluggable.
  • The tar ball job.tar is created during the job
    preparation stage, in step with the creation of
    the configuration file. Any service or plug-in
    referenced by the configuration files (logically,
    or even physically, there may be more than one
    configuration file) should be added to the tar
    ball as well.
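
The proposal that the executor merely interprets a configuration file can be illustrated with a small registry sketch. Everything here is hypothetical: the JSON layout, the component names `echo` and `count`, and the dictionary used as a blackboard are assumptions for the example, not the BOSS configuration format.

```python
import json

SERVICES = {}  # registry: component name -> callable


def service(name):
    """Decorator that registers a component under a name."""
    def register(func):
        SERVICES[name] = func
        return func
    return register


@service("echo")
def echo(args, blackboard):
    blackboard.setdefault("log", []).append(" ".join(args))


@service("count")
def count(args, blackboard):
    blackboard["count"] = blackboard.get("count", 0) + len(args)


def job_executor(config_text):
    """Interpret the configuration: run each listed component in order."""
    blackboard = {}  # shared state that components read and write
    for step in json.loads(config_text)["steps"]:
        SERVICES[step["component"]](step.get("args", []), blackboard)
    return blackboard


CONFIG = json.dumps({"steps": [
    {"component": "echo", "args": ["hello", "WN"]},
    {"component": "count", "args": ["a", "b"]},
]})
print(job_executor(CONFIG))
```

Adding a component in this sketch means registering one more function and naming it in the configuration; the executor itself does not change.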

16
Structure of level 2, 3 and final
level 1
level 2
level final
level 3
  • The chaining configuration file holds all the
    information needed to chain programs together,
    which keeps the programChaining program clean and
    stable.
  • The chaining configuration file is created during
    the chain preparation stage (a step of the job
    preparation stage).
  • programChaining interprets the chaining
    configuration file and executes its commands.
  • The job, chaining and program configuration files
    have a similar (or the same) structure and
    functionality. They can even share the same
    physical file, but logically they should be kept
    distinct to preserve flexibility.
17
BOSS Status and plans
  • New functionality that has been implemented or is
    being written:
      • Tasks, jobs and executables.
      • XML task description.
      • C++ and Python APIs.
      • Basic executable chaining; currently only the
        default chainer with linear chaining.
      • Separate logging and monitoring DBs.
      • DBs implemented in either MySQL or SQLite
        (more to come).
      • Optional RT monitoring with multiple
        implementations; currently only MonALISA and
        direct MySQL connections (to be deprecated).
  • To be done in the near future:
      • Allow chainer plugins.
      • Implement more RT monitoring solutions,
        e.g. R-GMA.
      • Look at writing the wrapper in a scripting
        language, e.g. Perl/Python.
      • Optimize the architecture and separate data
        from functionality.
18
GRID organizations
  • Resource management: Grid Resource Allocation
    Management Protocol (GRAM)
  • Information services: Monitoring and Discovery
    Service (MDS)
  • Security services: Grid Security Infrastructure
    (GSI)
  • Data movement and management: Global Access to
    Secondary Storage (GASS) and GridFTP
ALICE, ATLAS, CMS, LHCb
Projects: PI, POOL/CondDB, SEAL, ROOT, Simulation, SPI, 3D (GDA)
  1. To build a consistent, robust and secure Grid
    network that will attract additional computing
    resources.
  2. To continuously improve and maintain the
    middleware in order to deliver a reliable service
    to users.
  3. To attract new users from industry as well as
    science and ensure they receive the high standard
    of training and support they need.

There are many national-scale Grid collaborations.
For example, GridPP is a UK national collaboration
funded by the UK government through PPARC as part
of its e-Science Programme. It collaborates with
CERN and EGEE.
19

Backup slides
20
Boss key components
administrator
user/CRAB
WN
task.xml
schema.xml
API/user
API/adm
Bosstask
BossScheduler
BossDB
Job.tar ( job.xml, monitoring,,ORCA,input...)
JOB_ID 1
START_TIME
STATUS running

JOB_ID TYPE INPUT
1 ORCA FILE1
2 ORCA FILE2
wrapper /Shreek
Journal file
scheduler/JDL
monitoring
BOSS logging
21
Boss level structure on WN
Blackboard
Pro-active
Interface?
JobExecuter (wrapper)

pre-filter
user executable
runtime-filter
JobMonitor
programExecutor1
post-filter
JobChaining
programExecutor2



level 0
level 1
level 2
level final
level 3
  • At least level 0, level 1 and level final have to
    be there.
  • Level 2 and level 3 can be omitted; this is
    easily achieved by rewriting the configuration
    file.
  • A new level can easily be inserted between
    level 1 and level final by rewriting the
    configuration file.
  • Every level may or may not have its own
    configuration file.
  • JobExecutor controls all processes on the worker
    node.
  • Pro-active process not planned for the first
    release.
  • JobChaining: simple linear program execution in
    the first release; allows the possibility of
    plugins (e.g. Shreek) in the future.
  • Simple monitoring via output stream filters
    planned for the first release; more extensive
    options available later.
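
Monitoring through an output stream filter can be pictured as a pass-through that watches the program's output for interesting lines. The sketch below is an assumption-laden illustration: the log line format `processed event N`, the `EVENTS` key and the `publish` callback are invented for the example and do not come from BOSS.

```python
import re

# Hypothetical log line format emitted by the user executable.
PATTERN = re.compile(r"processed event (\d+)")


def runtime_filter(lines, publish):
    """Pass every output line through unchanged; publish a value when one matches."""
    for line in lines:
        match = PATTERN.search(line)
        if match:
            publish({"EVENTS": int(match.group(1))})
        yield line


updates = []  # stands in for the monitoring database
output = ["starting", "processed event 10", "warning: slow disk", "processed event 20"]
passed_through = list(runtime_filter(output, updates.append))
print(updates)
```

Because the filter yields every line it reads, the user's standard output is unaffected while the monitoring values are collected on the side.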

22
BOSS history
W. Bacchi, G. Codispoti, C. Grandi (INFN Bologna)
D. Colling, B. MacEvoy, S. Wakefield, Y. Zhang (Imperial College London)
Old BOSS
Italian group: Claudio, 2001-
Imperial group: Hugh, Stuart, Dave, Barry, Yong, 2003-2005
GROSS
logging bookkeeping
CMS specific functionality group of jobs
scheduler
Bologna-Imperial joint meeting: Stuart, Dave, Barry, Yong, Claudio and all
of the Bologna group, 17/12/2004, Bologna
monitoring
New BOSS
Joint meeting: Stuart, Dave, Yong, Henry, Claudio, 02-03/02/2005, Imperial
task, job, program
CMS WM workshop, 14-15/07/2005, Padova
adopted XML structure
defined framework and priority
BOSS Group meeting, 12-14/10/2005, Bologna
23
Schema configuration file proposal
<TABLE NAME="TASK">
  <ELEMENT NAME="TASK_ID" TYPE="INTEGER PRIMARY KEY" DAUGHTER="CHAIN" />
  <ELEMENT NAME="ITERATORS" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="TASK_INFILES" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="DECL_USER" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="DECL_PATH" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="DECL_TIME" TYPE="INTEGER NOT NULL DEFAULT 0" />
</TABLE>
<TABLE NAME="CHAIN">
  <ELEMENT NAME="CHAIN_ID" TYPE="INTEGER PRIMARY KEY" DAUGHTER="PROGRAM" MOTHER="TASK" />
  <ELEMENT NAME="TASK_ID" TYPE="INTEGER NOT NULL DEFAULT 0" TAG4DB="MOTHER_ID" />
  <ELEMENT NAME="SCHEDULER" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="RTUPDATER" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="SCHED_ID" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="CHAIN_CLAD_FILE" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="LOG_FILE" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="SUB_USER" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="SUB_PATH" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="SUB_TIME" TYPE="INTEGER NOT NULL DEFAULT 0" />
</TABLE>
<TABLE NAME="PROGRAM">
  <ELEMENT NAME="PROGRAM_ID" TYPE="INTEGER PRIMARY KEY" MOTHER="CHAIN" />
  <ELEMENT NAME="CHAIN_ID" TYPE="INTEGER NOT NULL DEFAULT 0" TAG4DB="MOTHER_ID" />
  <ELEMENT NAME="TYPE" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="EXEC" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="ARGS" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="STDIN" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="STDOUT" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="STDERR" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="PROGRAM_TIMES" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
  <ELEMENT NAME="INFILES" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" TAG4SCHED="IN_FILES" />
  <ELEMENT NAME="OUTFILES" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" TAG4SCHED="OUT_FILES" />
  <ELEMENT NAME="OUTTOPDIR" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" />
</TABLE>
<TABLE NAME="PROGRAMTYPE">
  <ELEMENT NAME="NAME" TYPE="CHAR(30) NOT NULL PRIMARY KEY" TAG4DB="UPDATE_KEY" TAG4SCHED="META_DATA" />
  <ELEMENT NAME="PROGRAM_SCHEMA" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" TAG4DB="INSERT_FILE_CONTENT,CREATE_TABLE_CONTENT" TAG4SCHED="PROGRAMTYPE_CONTENT" />
  <ELEMENT NAME="COMMENT" TYPE="VARCHAR(100) NOT NULL DEFAULT &quot;&quot;" TAG4SCHED="META_DATA" />
  <ELEMENT NAME="PRE_BIN" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" TAG4DB="INSERT_FILE_CONTENT" TAG4SCHED="PROGRAMTYPE_CONTENT" />
  <ELEMENT NAME="RUN_BIN" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" TAG4DB="INSERT_FILE_CONTENT" TAG4SCHED="PROGRAMTYPE_CONTENT" />
  <ELEMENT NAME="POST_BIN" TYPE="TEXT NOT NULL DEFAULT &quot;&quot;" TAG4DB="INSERT_FILE_CONTENT" TAG4SCHED="PROGRAMTYPE_CONTENT" />
</TABLE>
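
One way to read this proposal is that each TABLE/ELEMENT pair maps directly onto a `CREATE TABLE` statement. The sketch below shows that mapping with SQLite on a cut-down schema. It is an illustration under assumptions: the `<SCHEMA>` wrapper element is added only so the fragment parses, empty-string defaults are written with SQL single quotes, and the extra attributes (DAUGHTER, MOTHER, TAG4DB, TAG4SCHED) are ignored.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Cut-down schema; the <SCHEMA> root is a hypothetical wrapper.
SCHEMA_XML = """<SCHEMA>
<TABLE NAME="TASK">
  <ELEMENT NAME="TASK_ID" TYPE="INTEGER PRIMARY KEY" DAUGHTER="CHAIN" />
  <ELEMENT NAME="DECL_USER" TYPE="TEXT NOT NULL DEFAULT ''" />
  <ELEMENT NAME="DECL_TIME" TYPE="INTEGER NOT NULL DEFAULT 0" />
</TABLE>
<TABLE NAME="CHAIN">
  <ELEMENT NAME="CHAIN_ID" TYPE="INTEGER PRIMARY KEY" MOTHER="TASK" />
  <ELEMENT NAME="TASK_ID" TYPE="INTEGER NOT NULL DEFAULT 0" TAG4DB="MOTHER_ID" />
</TABLE>
</SCHEMA>"""


def create_statements(schema_xml):
    """Build one CREATE TABLE statement per TABLE element."""
    statements = []
    for table in ET.fromstring(schema_xml).iter("TABLE"):
        columns = ", ".join(f'{element.get("NAME")} {element.get("TYPE")}'
                            for element in table.iter("ELEMENT"))
        statements.append(f'CREATE TABLE {table.get("NAME")} ({columns})')
    return statements


connection = sqlite3.connect(":memory:")
for statement in create_statements(SCHEMA_XML):
    connection.execute(statement)

# Columns left out of the INSERT fall back to the defaults from the schema.
connection.execute("INSERT INTO TASK (DECL_USER) VALUES ('yzhang')")
rows = connection.execute("SELECT TASK_ID, DECL_USER, DECL_TIME FROM TASK").fetchall()
print(rows)
```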
24
Dataset and PhEDEx
How to understand PhEDEx?
jm_Hit245_2_g133/jm03b_qcd_120_170
01D4DF4E-A4EB-4047-A94A-1A550265872F.zip
866822E9-244B-4C1D-BF1D-080E71D343F0.zip
021C736B-A2A4-43E2-9F25-829F9E7E8F35.zip
8B3ED1AD-14AB-4696-BADD-71119EA7652A.zip
... 135 files in total, 200 GB
  • Manual dataset transfer:
      • find out where to copy the dataset from
      • copy the files one by one
      • publish the files into the catalog one by one
  • Write private scripts to do the transfer
  • Use PhEDEx:
      • PhEDEx has a collection of scripts and script
        templates
      • PhEDEx provides a framework (a set of agents)
        to support the scripts
      • PhEDEx has a central database (TMDB) to
        coordinate every step in the transfer process
      • PhEDEx has a website to monitor transfer
        status and handle dataset requests
      • ...

file catalog
<?xml version="1.0" encoding="UTF-8" standalone="no" ?>
<POOLFILECATALOG>
  <File ID="01D4DF4E-A4EB-4047-A94A-1A550265872F">
    <physical>
      <pfn filetype="" name="dcap://gfe02.hep.ph.ic.ac.uk:22128/pnfs/hep.ph.ic.ac.uk/data/cms/phedex/jm03b_qcd_120_170/Hit/01D4DF4E-A4EB-4047-A94A-1A550265872F.zip"/>
    </physical>
    <logical>
      <lfn name="ZippedEVD.121000153.121000154.jm_Hit245_2_g133.jm03b_qcd_120_170.zip"/>
    </logical>
    <metadata att_name="dataset" att_value="jm03b_qcd_120_170"/>
    <metadata att_name="jobid" att_value="1126203628"/>
    <metadata att_name="owner" att_value="jm_Hit245_2_g133"/>
  </File>
  <File ID="866822E9-244B-4C1D-BF1D-080E71D343F0"> </File>
</POOLFILECATALOG>
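
A catalogue of this shape can be read with a standard XML parser to go from a logical file name to its identifier and physical replicas. The sketch below uses the first entry from the slide, assuming that the `://` and port colon were lost in the transcript; `lfn_to_replicas` is a hypothetical helper, not a POOL or PhEDEx function.

```python
import xml.etree.ElementTree as ET

# First catalogue entry from the slide, with the assumed punctuation restored.
CATALOG_XML = """<POOLFILECATALOG>
  <File ID="01D4DF4E-A4EB-4047-A94A-1A550265872F">
    <physical>
      <pfn filetype="" name="dcap://gfe02.hep.ph.ic.ac.uk:22128/pnfs/hep.ph.ic.ac.uk/data/cms/phedex/jm03b_qcd_120_170/Hit/01D4DF4E-A4EB-4047-A94A-1A550265872F.zip"/>
    </physical>
    <logical>
      <lfn name="ZippedEVD.121000153.121000154.jm_Hit245_2_g133.jm03b_qcd_120_170.zip"/>
    </logical>
    <metadata att_name="dataset" att_value="jm03b_qcd_120_170"/>
  </File>
</POOLFILECATALOG>"""


def lfn_to_replicas(catalog_xml):
    """Map each LFN to the file ID and the list of PFNs recorded for it."""
    mapping = {}
    for entry in ET.fromstring(catalog_xml).iter("File"):
        pfns = [pfn.get("name") for pfn in entry.iter("pfn")]
        for lfn in entry.iter("lfn"):
            mapping[lfn.get("name")] = {"id": entry.get("ID"), "pfns": pfns}
    return mapping


catalog = lfn_to_replicas(CATALOG_XML)
lfn = "ZippedEVD.121000153.121000154.jm_Hit245_2_g133.jm03b_qcd_120_170.zip"
print(catalog[lfn]["id"])
print(catalog[lfn]["pfns"][0])
```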
25
Developer's point of view of PhEDEx
26
User's point of view of PhEDEx
NODE (IC)
Configuration
NODE
FileDownloadDestination
FileDownload
FileDownloadVeryfy
FilePFNExport
TMDB
WWW
FileDownloadDelete
NodeRouter
FileDownloadPublish
FileRouter
PFNLookup
NODE (RAL)
...
...
The user needs to write glue scripts, which are driven
by agents.
27
Event data model
  • Data is defined by the event data model (EDM).