Title: Distributed Computing and Data Analysis for CMS in view of the LHC startup
1Distributed Computing and Data Analysis for CMS
in view of the LHC startup
- Peter Kreuzer
- RWTH-Aachen IIIa
International Symposium on Grid Computing
(ISGC) Taipei, April 9, 2008
2Outline
- Brief overview of the Worldwide LHC Computing Grid (WLCG)
- Distributed Computing Challenges at CMS
- Simulation
- Reconstruction
- Analysis
- The physicist view
- The road to the LHC startup
3From local to distributed Analysis
- Before centrally organised Analysis
- Example: CMS, with 4-6 PBytes of data per year, 2900 scientists, 40 countries, 184 institutes !
- Solution: Tiered Computing Model
4Worldwide LHC Computing GRID
- Level of distribution motivated by the desire to leverage and empower resources, and to share load, infrastructure and funding
- Tier-0 at CERN
- Prompt Reconstruction
- Calibration and low-latency work
- Archiving
- 1.0 GByte/s
- Tier-1s at large national labs or universities
- Re-Reconstruction
- Physics skimming
- Data Serving
- Archiving
Aggregate rate from CERN to Tier-1s: > 1.0 GByte/s
- Transfer rate to Tier-2: 50-500 MBytes/s
- Tier-2s primarily at Universities
- Simulation
- User Analysis
- Tier-3s at Institutes with modest Infrastructure
- Local User Analysis
- Opportunistic Simulation
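The division of work between the tiers can be written down as a small lookup table. The sketch below only restates the roles listed on this slide; the dictionary layout and the helper name `tiers_for` are illustrative assumptions, not part of any CMS or WLCG software.

```python
# Tier roles as listed on the slide; the data layout is an illustrative assumption.
TIER_ROLES = {
    "Tier-0": ["Prompt Reconstruction", "Calibration and low-latency work", "Archiving"],
    "Tier-1": ["Re-Reconstruction", "Physics skimming", "Data Serving", "Archiving"],
    "Tier-2": ["Simulation", "User Analysis"],
    "Tier-3": ["Local User Analysis", "Opportunistic Simulation"],
}


def tiers_for(activity):
    """Return the tiers whose role list mentions the activity (case-insensitive substring match)."""
    needle = activity.lower()
    return [
        tier
        for tier, roles in TIER_ROLES.items()
        if any(needle in role.lower() for role in roles)
    ]


print(tiers_for("simulation"))
print(tiers_for("archiving"))
```

A substring match is used so that "reconstruction" finds both prompt reconstruction at Tier-0 and re-reconstruction at Tier-1.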
5WLCG Infrastructure
- EGEE: Enabling Grids for E-sciencE
- OSG: Open Science Grid
1 Tier-0, 11 Tier-1, 67 Tier-2
CMS: 1 Tier-0, 7 Tier-1, 35 Tier-2
Tier-0 to Tier-1: dedicated 10 Gbit/s optical network
6Examples of Sites
- T2 RWTH (Aachen)
- CPU: 540 KSI2k, 360 cores
- Disc: 100 TB
- Network (WAN): 2 Gbit/s
- (2009: 450 cores, 150 TB)
- T1 ASGC
- CPU: 2.4 MSI2k, 1800 cores
- Disc: 930 TB -> 1.5 PB
- Tape: 586 TB -> 800 TB
- Network: 10 Gbit/s
- T2 Taiwan
- CPU: 150 KSI2k
- Disc: 19 TB -> 62 TB
- Network: up to 10 Gbit/s
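The CPU figures for Aachen and ASGC give both a SpecInt2000 rating and a core count, so the implied power per core can be compared. This is a back-of-the-envelope sketch using only the numbers quoted above; the function name is an assumption made for illustration.

```python
def ksi2k_per_core(total_ksi2k, cores):
    """Average SpecInt2000 rating per core, in KSI2k."""
    return total_ksi2k / cores


# T2 RWTH (Aachen): 540 KSI2k on 360 cores
aachen = ksi2k_per_core(540, 360)

# T1 ASGC: 2.4 MSI2k = 2400 KSI2k on 1800 cores
asgc = ksi2k_per_core(2400, 1800)

print(f"Aachen: {aachen:.2f} KSI2k/core, ASGC: {asgc:.2f} KSI2k/core")
```

Both sites come out between 1.3 and 1.5 KSI2k per core, which is why core counts and KSI2k figures are used almost interchangeably in the slides.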
7Pledged WLCG Resources
[Plot: pledged CPU in MSI2k, with labels "2008: 66,000 cores" and "250,000 cores"]
[Plot: pledged disc storage in PetaBytes, with label "2008: 40 PetaBytes"]
- Tape storage: 33 PBytes in 2008
(Reference: LCG Project Planning, 1.3.08)
8Challenges for Experiments Example CMS
- Scale-up and test distributed Computing Infrastructure
- Mass Storage Systems and Computing Elements
- Data Transfer
- Calibration and Reconstruction
- Event skimming
- Simulation
- Distributed Data Analysis
- Test CMS Software Analysis Framework
- Operate in quasi-real data taking conditions and simultaneously at various Tier levels
- -> Computing, Software and Analysis (CSA) Challenge
9CMS Computing and Software Analysis Challenges
- CMS scaling-up in the last 4 years

Test (year)   Goal (jobs/day)   Scale
DC04          15,000            5%
CSA06         50,000            25%
CSA07         100,000           50%
CSA08         150,000           100%

- 2005 - 2006: New Data Model and New Software Framework
- Requires 100s of millions of simulated events as input
10The CSA07 data Challenge
[Diagram: CSA07 data flow]
- Input: 100M simulated data
- TIER-0 (CASTOR, CAF): HLT, Reconstruction at 100 Hz, Calibration, Express Analysis
- TIER-0 to TIER-1s: 300 MB/s
- TIER-1s: Re-Reconstruction and Skims, 25k jobs/day
- TIER-1s / TIER-2s transfers: 20-200 MB/s and 10 MB/s
- TIER-2s: Analysis 75k jobs/day, Simulation 50M evt/month
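The two job rates in the CSA07 diagram (25k jobs/day for re-reconstruction and skims, 75k jobs/day for analysis) can be checked against the CSA07 goal of 100,000 jobs/day quoted on the scaling slide. A minimal arithmetic sketch, with variable names chosen for illustration:

```python
# Goal quoted for CSA07 on the scaling-up slide.
CSA07_GOAL_JOBS_PER_DAY = 100_000

# Job rates shown in the CSA07 diagram.
job_rates = {
    "Tier-1 re-reconstruction and skims": 25_000,
    "Tier-2 analysis": 75_000,
}

total_jobs_per_day = sum(job_rates.values())
shares = {name: rate / total_jobs_per_day for name, rate in job_rates.items()}

print(total_jobs_per_day == CSA07_GOAL_JOBS_PER_DAY)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Three quarters of the targeted job load is analysis at the Tier-2s, which is consistent with the later emphasis on analysis scalability.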
11In this presentation
- Mainly covering CMS Simulation, Reconstruction and Analysis challenges
- Data transfer challenges covered in talk by Daniele Bonacorsi during this session
12CMS Simulation System
[Diagram: CMS simulation system]
- CMS Physicist: "Please simulate new physics", "Where are my data ?"
- Components: ProdRequest, Production Manager, several ProdAgent instances, Global Data Bookkeeping (DBS)
- Resources: Tier-1 and Tier-2 sites reached via the GRID
13ProdAgent workflows
1) Processing
2) Merging
- Data processing / bookkeeping / tracking / monitoring in local-scope
- Output promoted to global-scope DBS and to the data transfer system PhEDEx
- Scaling achieved by running multiple ProdAgent instances in parallel
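The two-step workflow (processing, then merging) can be illustrated with a toy model. This is not ProdAgent code: the function names, the job size and the merge threshold are all hypothetical, chosen only to show how many small processing outputs become fewer, larger merged files before promotion to global scope.

```python
def split_into_jobs(total_events, events_per_job):
    """Step 1 (processing): split a request into jobs; returns the event count of each job."""
    full_jobs, remainder = divmod(total_events, events_per_job)
    return [events_per_job] * full_jobs + ([remainder] if remainder else [])


def merge_outputs(job_outputs, min_events_per_file):
    """Step 2 (merging): group consecutive job outputs until a file holds
    at least min_events_per_file events; the last file may be smaller."""
    merged, current = [], 0
    for events in job_outputs:
        current += events
        if current >= min_events_per_file:
            merged.append(current)
            current = 0
    if current:
        merged.append(current)  # leftover events form a final, smaller file
    return merged


# Hypothetical request: 10,500 events, 1,000 events per job, merged files of >= 4,000 events.
jobs = split_into_jobs(10_500, 1_000)
files = merge_outputs(jobs, 4_000)
print(len(jobs), "processing jobs ->", len(files), "merged files:", files)
```

In this toy model, running several instances in parallel would simply mean applying the same two steps to disjoint requests.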
14CMS Simulation Performance
- 250M events in 5 months
- Tier-2 alone: 72%
- OSG alone: 50%
- (Overall 07-08: 450M)
- 20k jobs/day reached
- Average job efficiency: 75%
[Plot: M events / month, June - November 2007; production rate x 1.8]
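The headline numbers on this slide translate into simple rates. The sketch below assumes that the figures 72 and 75 are percentages (the percent signs are missing from the extracted text) and that the Tier-2 figure is a share of produced events; both readings are assumptions.

```python
events_millions = 250      # events produced, in millions
months = 5

# Average monthly production rate in millions of events.
avg_rate_m_per_month = events_millions / months

# Assumed reading: 72% of the events were produced at Tier-2 sites alone.
tier2_share_pct = 72
tier2_events_millions = events_millions * tier2_share_pct / 100

# Assumed reading: 75% average job efficiency at the 20k jobs/day peak.
jobs_per_day = 20_000
job_efficiency_pct = 75
successful_jobs_per_day = jobs_per_day * job_efficiency_pct / 100

print(avg_rate_m_per_month, tier2_events_millions, successful_jobs_per_day)
```

The 50M events/month average matches the simulation rate shown for the Tier-2s in the CSA07 data flow.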
15Utilization of CMS Resources
- Average: 50%
- In best production periods: 75%
[Plot: June - November 2007, with labels "5000 job-slots" and "Missing Requests"]
16CSA07 Simulation lessons
- Major boost in scale and reliability of production machinery
- Still too many manual operations. From 2008 on:
- Deploy ProdManager component (in CSA07 was human !)
- Deploy Resource Monitor
- Deploy CleanUpSchedule component
- Further improvements in scale and reliability
- gLite WMS bulk submission: 20k jobs/day with 1 WMS server
- Condor-G JobRouter bulk submission: 100k jobs/day and can saturate all OSG resources in 1 hour
- Threaded JobTracking and Central Job Log Archival
- Introduced task-force for CMS Site Commissioning
- help detect site issues via stress-test tool (enforce metrics)
- couple site-state to production and analysis machinery
- Regular CMS Site Availability Monitoring (SAM) checks
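Coupling site state to the production and analysis machinery amounts to filtering the list of target sites on a commissioning metric. The sketch below is a hypothetical illustration: the site names, the availability values and the 80% threshold are invented for the example and are not CMS metrics.

```python
def usable_sites(availability_pct, threshold_pct=80.0):
    """Return, sorted by name, the sites whose availability meets the threshold."""
    return sorted(
        site for site, pct in availability_pct.items() if pct >= threshold_pct
    )


# Hypothetical availability values in percent, for illustration only.
example_availability = {"T2_A": 95.0, "T2_B": 40.0, "T2_C": 80.0}

selected = usable_sites(example_availability)
print(selected)
```

Jobs would then be submitted only to the selected sites, so that a failing site stops receiving work until it passes the checks again.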
17CMS Site Availability Monitoring
[Plot: site availability ranking (ARDA Dashboard), 03/22/08 to 04/03/08, scale 0 to 100]
- Important tool to protect CMS use cases at sites
18CSA07 Reconstruction Skimming
0) Preparation of Primary Datasets (mimics real CMS Detector Trigger data)
1) Archive and Reconstruction at CERN T0
2) Archive and Re-Reconstruction at T1s
3) Skimming at T1s
4) Express analysis and Calibration at CERN Analysis Facility
- 3 different calibrations: 10 pb-1, 100 pb-1, 0 pb-1
19Produced CSA07 Data Volumes
[Plot: DIGI-RAW-HLT-RECO events (x1e8), 10/07 to 02/08]
Total CSA07 event counts:
- 80M GEN-SIM
- 80M DIGI-RAW
- 80M HLT
- 330M RECO (3 different calibrations)
- 250M AOD
- 100M skims
- Total: 920M events
- Total data volume: 2 PB
- Corresponds to expected 2008 volume !
CMS data in CASTOR@CERN: 3.7 PB
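The quoted total of 920M events can be checked by summing the per-tier counts, and dividing the 2 PB total by that count gives a rough average size per stored event. The sketch assumes decimal units (1 PB = 1e15 bytes), which the slide does not specify.

```python
# Event counts in millions, as quoted on the slide.
event_counts_millions = {
    "GEN-SIM": 80,
    "DIGI-RAW": 80,
    "HLT": 80,
    "RECO (3 calibrations)": 330,
    "AOD": 250,
    "skims": 100,
}

total_events_millions = sum(event_counts_millions.values())

# Assumption: decimal units, 1 PB = 1e15 bytes.
total_volume_bytes = 2e15
avg_mb_per_event = total_volume_bytes / (total_events_millions * 1e6) / 1e6

print(total_events_millions, f"{avg_mb_per_event:.2f} MB/event on average")
```

The average mixes very different data tiers (RAW-like formats are much larger than AOD), so it is only an order-of-magnitude figure.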
20CSA07 Reconstruction lessons
[Plot: T0 and T1 processing, 2k running jobs]
- T0 Reconstruction at 100 Hz
- only in bursts, mainly due to stream splitting activity
- Heavy load on CASTOR
- Useful feedback to ProdAgent developers to prepare 2008 data taking (repacker, ...)
- T1 Processing: submission rate was main limitation. Now based on gLite bulk submission and reaching 12-14k jobs/day with 1 ProdAgent instance
- Further rate improvement to be expected with T1 resource up-scaling
21CMS Analysis System
CRAB: CMS Remote Analysis Builder
- An interface to the GRID for CMS physicists
- Challenge: match processing resources with large quantities of data (chaotic processing)
[Diagram: CMS analysis system]
- CMS Physicist: "Please analyse datasets X/Y", "Where are my jobs ?"
- Components: CRAB, CRAB Server, Global Data Bookkeeping (DBS)
- Resources: Tier-1 and Tier-2 sites reached via the GRID
22CRAB Architecture
- Easy and transparent means for CMS users to submit analysis jobs via the GRID (LCG RB, gLite WMS, Condor-G)
- CSA07 analysis: direct submission by user to GRID. Simple, but lacking automation and scalability
- -> 2008: CRAB server
- Other new feature: local DBS for private users
23CSA07 Analysis
- 100k jobs/day not achieved
- mainly due to lacking data during the challenge
- still limited by data distribution: 55% of jobs at 3 largest Tier-1s
- and failure rate too high
- 53% successful jobs, 20% failed jobs, 27% unknown
- 20k jobs/day achieved, plus regularly 30k/day JobRobot submissions
[Plot: number of jobs]
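Reading the 53 / 20 / 27 split as percentages (they sum to 100), the job outcomes at the achieved rate of 20k jobs/day work out as follows. Applying the split to the 20k/day figure is an assumption made for illustration; the slide does not say which job sample the split was measured on.

```python
# Outcome split from the slide, read as percentages.
outcome_pct = {"successful": 53, "failed": 20, "unknown": 27}

jobs_per_day = 20_000

# Integer arithmetic keeps the counts exact.
jobs_by_outcome = {
    outcome: jobs_per_day * pct // 100 for outcome, pct in outcome_pct.items()
}

print(sum(outcome_pct.values()), jobs_by_outcome)
```

Roughly half of the submitted jobs are known to have succeeded, which motivates the later goals of automated re-submission and a lower failure rate.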
24CMS Grid Users since 1 year
- plot showing distinct users
- 300 users during February 2008
- 20 most active users carry 1/3 of jobs
[Plot axes: Users vs. Month; label: CRAB Server]
25The Physicist View
- SUSY Search in di-lepton + jets + MET
- Goal: Simulate excess over Standard Model (LM1 at 1 fb-1)
- Infrastructure
- 1 desktop PC
- CMS Software Environment (CMSSW, CRAB, Discovery GUI, ...)
- GRID Certificate, member of a Virtual Organisation (CMS)
- Input data (CSA07 simulation/production)
- Signal (RECO): 120k events, 360 GB
- Skimmed Background (AOD): 3.3M events, 721 GB
- WW / WZ / ZZ / single top
- ttbar / Z / W + jets
- Unskimmed Background: 27M events, 4 TB (for detailed studies only)
- Signal and skimmed background together: 1.1 TB
- Location of input data
- T0/T1: CERN (CH), FNAL (US), FZK (Germany)
- T2: Legnaro (Italy), UCSD (US), IFCA (Spain)
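The 1.1 TB figure is consistent with the sum of the signal (360 GB) and skimmed background (721 GB) samples, and the event counts give a rough size per event for each sample. The sketch assumes decimal units (1 GB = 1e9 bytes); the dictionary keys are labels chosen for the example.

```python
GB = 1e9   # assumption: decimal units
TB = 1e12

# (number of events, size in bytes) for each input sample quoted on the slide.
samples = {
    "signal RECO": (120_000, 360 * GB),
    "skimmed background AOD": (3_300_000, 721 * GB),
    "unskimmed background": (27_000_000, 4 * TB),
}

core_input_tb = (
    samples["signal RECO"][1] + samples["skimmed background AOD"][1]
) / TB

mb_per_event = {
    name: size / events / 1e6 for name, (events, size) in samples.items()
}

print(f"signal + skimmed background: {core_input_tb:.3f} TB")
for name, size_mb in mb_per_event.items():
    print(f"{name}: {size_mb:.2f} MB/event")
```

The RECO signal events come out about an order of magnitude larger than the AOD background events, as expected for the fuller data format.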
26GRID Analysis Result
[Plot: End-Point Signal and Z peak from SUSY cascades; axis in GeV. Credit: Georgia Karapostoli, Athens Univ.]
- Analysis Latency
- Signal + Bgd: 322 jobs, 22h to produce this result !
- Detailed studies: 1300 jobs, 3.5 days
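The two latency figures imply a similar effective throughput. The sketch assumes that 22h and 3.5 days are wall-clock times from first submission to final result, which the slide does not state explicitly.

```python
def jobs_per_hour(n_jobs, hours):
    """Effective throughput: completed jobs per wall-clock hour."""
    return n_jobs / hours


# Signal and background pass: 322 jobs in 22 hours.
first_pass_rate = jobs_per_hour(322, 22)

# Detailed studies: 1300 jobs in 3.5 days (84 hours).
detailed_rate = jobs_per_hour(1300, 3.5 * 24)

print(f"{first_pass_rate:.1f} and {detailed_rate:.1f} jobs/hour")
```

Both cases land near 15 jobs per hour for a single user, so the turnaround scales roughly linearly with the number of jobs.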
27CSA07 Analysis lessons
- Improve Analysis scalability, automation and reliability
- CRAB-Server
- Automate job re-submission
- Optimize job distribution
- Decrease failure rate
- Move Analysis to Tier-2s
- To protect Tier-0/1 LSF and storage systems
- To make use of all available GRID resources
- Encourage Tier-2 to Physics group association
- In close collaboration with sites
- With solid overall Data Management strategy
- Assess local scope DM for Physics groups and storage of user data
- Aim for 500 users by June and exceed capacity of several gLite WMS
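Automated job re-submission can be sketched as a bounded retry loop around a submit call. This is a generic illustration, not the CRAB-Server implementation: `run_with_resubmission`, `make_flaky_submit` and the limit of three attempts are hypothetical.

```python
def run_with_resubmission(submit, job, max_attempts=3):
    """Call submit(job) until it reports success or the attempts run out.

    Returns the number of attempts used, or None if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if submit(job):
            return attempt
    return None


def make_flaky_submit(failures_before_success):
    """Build a stand-in submit function that fails a fixed number of times."""
    state = {"calls": 0}

    def submit(job):
        state["calls"] += 1
        return state["calls"] > failures_before_success

    return submit


attempts_used = run_with_resubmission(make_flaky_submit(2), "analysis-job-1")
gave_up = run_with_resubmission(make_flaky_submit(5), "analysis-job-2")
print(attempts_used, gave_up)
```

Bounding the number of attempts matters: a job that fails for a persistent reason (for example missing input data) should be reported to the user rather than resubmitted forever.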
28Goals for CSA08 (May 08)
- Play through first 3 months of data taking
- Simulation
- 150M events at 1 pb-1 (S43)
- 150M events at 10 pb-1 (S156)
- Tier-0: Prompt reconstruction
- S43 with startup-calibration
- S156 with improved calibration
- CERN Analysis Facility (CAF)
- Demonstrate low turn-around Alignment and Calibration workflows
- Coordinated and time-critical physics analyses
- Proof-of-principle of CAF Data and Workflow Management Systems
- Tier-1: Re-Reconstruction with new calibration constants
- S43 with improved constants based on 1 pb-1
- S156 with improved constants based on 10 pb-1
- Tier-2
- iCSA08 simulation (GEN-SIM-DIGI-RAW-HLT)
- repeat CAF-based Physics analyses with Re-Reco data ?
292008
[Timeline figure, January to October 2008, with two tracks:]
- Detector installation, commissioning and operation
- Preparation of Software, Computing and Physics analysis
Labels on the timeline, in extraction order:
- 2007 Physics Analyses results
- Cooldown of magnet
- Private global runs (2 days/week), private mini-daq
- CCRC08-1
- GRUMM
- CMSSW 1.8.0: sample production
- CMSSW 2.0 release: production start-up MC samples
- Low i test
- 2 weeks of 2.0 testing
- Beam-pipe baked-out, Pixels installed
- CR 0T
- iCSA08 sample generation
- CR 0T
- CR 0T
- iCSA08 / CCRC08-2
- CMS closed
- pre CR 4T
- CRAFT
- CMSSW 2.1 release: all basic sw components ready for LHC, new T0 prod tools
- Initial CMS ready for run
- CR 4T
- fCSA08 or beam !
- Must keep exercises mostly non-overlapped
CCRC: Common-VO Computing Readiness Challenge
CR: Commissioning Run
30Where do we stand ?
- WLCG: major up-scaling over the last 2 years !
- CMS: impressive results and valuable lessons from CSA07
- Major boost in Simulation
- Produced 2 PBytes of data in T0/T1 Reconstruction and Skimming
- Analysis: number of CMS Grid-users ramping up fast !
- Software: addressed memory footprint and data size issues
- Further Challenges for CMS: scale from 50% to 100%
- Simultaneous and continuous operations at all Tier levels
- Analysis distribution and automation
- Transfer rates (see talk by D.Bonacorsi)
- Upscale and commission the CERN Analysis Facility (CAF)
- CSA08, CCRC08, Commissioning Runs
- Challenging and motivating goals in view of Day-1 LHC !