1
Joint DOE and NSF Review of LHC Software and Computing
Lawrence Berkeley Lab, January 14-17, 2003
US CMS Software and Computing Project Overview
  • Lothar A. T. Bauerdick/Fermilab, Project Manager

2
US CMS Software and Computing
  • Provide software engineering support for CMS → CAS subproject
  • Provide the S&C environment to do LHC physics in the U.S. → UF subproject
  • Tier-1 center at Fermilab plus five Tier-2 centers in the U.S.
  • The Tier-2s together will provide the same CPU/disk resources as the Tier-1
  • The US CMS System from the beginning spans Tier-1
    and Tier-2 systems
  • There is an economy of scale, and we plan for a
    central support component
  • Already making opportunistic use of resources
    that are NOT Tier-2 centers
  • Important for delivering the resources to physics AND for involving universities
  • e.g. the UW Madison Condor pool, MRI initiatives at several universities
  • The US CMS Grid System of T1 and T2 prototypes and testbeds has an important function within CMS
  • help develop a truly global and distributed
    approach to the LHC computing problem
  • ensure full participation of the US physics
    community in the LHC research program
  • To succeed, the U.S. needs the ability and ambition for leadership, and strong support to obtain the necessary resources!

3
US CMS S&C Since November 2001
  • Consolidation of the project, shaping the R&D program
  • Project baselined in Nov 2001; workplan for CAS, UF, Grids endorsed
  • CMS has embraced the Grids
  • Working with and profiting from US and European Grid projects
  • UF: US CMS has built an initial, successful Grid for CMS production
  • commissioning of T1/T2 systems: facilities, data storage, data transfers and throughput
  • major production efforts for PRS/HLT studies,
  • Commissioning of Grid-enabled Impala MC event
    production system (MOP), testbed
  • Integration Grid Testbed, which uses T1/T2
    facilities
  • Evaluation, then commissioning of production Grid
  • CAS: many new developments, technical and organizational
  • We do have a working software and computing system that is fit for realistic physics studies! → Higher Level Trigger study, DAQ TDR submitted
  • CCS will drive much of the common LCG Applications Area
  • PM: many changes
  • NSF proposal (Oct 2001) led to a two-year 2002-2003 grant in September 2002; the NSF RP may start in 2003 with a $35M total until 2008
  • New DOE guidance received in spring was devastating, and the 2003/4 baseline cannot be realized
  • partly mitigated by softening the profiles of the construction/S&C projects, through D. Green working with the DOE
  • New bare-bones project defined, smaller than the baseline, but funded; allows us to ramp the Tier-1 effort to the required level

4
Organizational Changes
  • US CMS Research Program (RP) started: Detector M&O, Upgrade R&D, and the S&C Project
  • RP proposal submitted to the NSF covering
    2003-2008
  • Change in US CMS line management: US CMS RP Manager is Dan Green; Detector Operations and the S&C Project are part of the RP
  • S&C will continue as a project with a baseline and funding profile
  • Host lab: The S&C Project and the M&O effort will each have its own separate funding allocation. Changes to the S&C plan will be managed in the same way as changes to the Construction plan, through a change control process including approval by the host laboratory and the funding agencies.
  • Will need to develop a coherent PMP: role of the RP Manager, PMG, ASCB, SCOP, ...

5
US CMS S&C Organization
  • CMS Assignments related to U.S. CMS and CCS
  • Greg Graham/Fermilab CMS CCS Grid Integration
    Task/Production Subtask Leader
  • Julian Bunn/Caltech CMS CCS Grid Integration
    Task/Analysis Subtask Leader
  • Tony Wildish/Princeton CMS CCS L2 Task
    Production Processing and Data Management
  • L. A. T. Bauerdick/Fermilab CMS CCS L2 Task Computing Centers
  • LCG Assignments related to U.S. CMS
  • L. A. T. Bauerdick/Fermilab SC2 member for U.S. LHC; I. Foster on SC2 for U.S. Grid technologies
  • R. Pordes/Fermilab and M. Livny/UW Madison LCG Project Execution Board rep. U.S. Grid Projects
  • V. White/Fermilab U.S. RC representative to the LCG Grid Deployment Board
  • Miron Livny U.S. Grid representative in the GDB
  • Rick Cavanaugh U.S. CMS representative in the
    GAG

6
Project Funds FY02
  • DOE funded US CMS S&C with a total of $2,394,856
  • The total sum was sent to Fermilab,
  • main CAS effort subcontracted w/ universities
  • Funding for CAS efforts at universities in units of one full-time person
  • NSF awarded a two-year grant for 2002-2003 (Nov 2001 RP proposal)
  • $800k in FY02, $1000k in FY03 (through NEU) for the US CMS RP
  • $690k and $750k were allocated to S&C by US CMS RPM Dan Green ($60k overhead)
  • NSF iVDGL started in FY2002, total $13.7M plus $2M matching over 5 years
  • $466k has gone to US CMS prototype Tier-2 centers
  • The project has successfully tracked ACWP for all DOE-funded activities, and has tracked ACWP in FTE-time for the NSF-funded activities
  • We have requested invoices for all these
    activities, and expect to receive them

7
FY02 Funding BA
  • Details of the FY02 Budget Authority

8
FY02 Actual Costs of Work Performed
  • FY02 BA and ACWP for US CMS SC
  • (Numbers in italics are BCWS, as no invoices were
    received for NSF-funded efforts)

9
Accomplishments
  • Prototyped Tier-1 and Tier-2 centers and deployed
    a Grid System
  • Participated in a world-wide 20TB data production
    for HLT studies
  • US CMS delivered key components: IMPALA, DAR
  • Made available large data samples (Objectivity
    and nTuples) to the physics community
  • → successful submission of the CMS DAQ TDR
  • Worked with Grid Projects and VDT to harden
    middleware products
  • Integrated the VDT middleware in CMS production
    system
  • Deployed Integration Grid Testbed and used for
    real productions
  • Decoupled the CMS framework from Objectivity
  • allows data to be written persistently as ROOT I/O files (see the sketch after this list)
  • Released a fully functional Detector Description
    Database
  • Released Software Quality and Assessment Plan
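(Illustrative note, not from the original slides: the ROOT I/O persistency mentioned above can be pictured with a minimal PyROOT sketch. All names here are hypothetical examples, not CMS framework code.)

  from array import array
  import ROOT

  # Open an output ROOT file and create a tree with one entry per event
  f = ROOT.TFile("events.root", "RECREATE")
  tree = ROOT.TTree("Events", "toy event data")

  # Bind a float buffer to a branch; "muon_pt" is a made-up quantity
  pt = array("f", [0.0])
  tree.Branch("muon_pt", pt, "muon_pt/F")

  for value in (12.5, 40.1, 23.7):   # stand-in values
      pt[0] = value
      tree.Fill()

  tree.Write()
  f.Close()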

10
US CMS Contributions to CCS
  • Contributions to CCS Manpower (assuming LCG 10
    FTE)
  • CERN Member State contributions mainly coming in through the LCG AA
  • The U.S. is providing a fair share contribution

11
US Contribution to Spring Production
  • Contribution in events produced
  • This is not a complete metric, but gives a good
    indication
  • US CMS is providing a fair share of the CMS resources for simulations to support trigger and physics studies

[Chart: events simulated and events with high-luminosity pile-up]
12
US Contributions to CMS Production
  • Spring Production: size of the community of people contributing → this is NOT an FTE count!
  • CMS software and production systems attract a
    sizable community in the US
  • US base: physicists/PRS people getting involved in CMS production software for their DAQ-TDR preparations
  • US Grids: Trillium, and also middleware providers (Condor)
  • US Project: in total 5 US CMS S&C engineers were involved

13
Main Current Activities
  • Develop the US CMS T1/T2 system into a working
    Data Grid
  • high throughput data transfers
  • Grid-wide job scheduling
  • monitoring
  • Middleware is the VDT (Condor, Globus et al.)
  • US CMS-developed DPE toolkit and procedures, underlying the standard CMS production environment (Impala, RefDB etc.)
  • Integration Grid Testbed (IGT) was very
    successful, see later talks
  • Next steps: preparing the US CMS Production Grid
  • Major upcoming milestones
  • participation in the CMS 5% data challenge DC04
  • be operational as part of LCG Production Grid
    in June 2003
  • Major development efforts are still needed for
    those milestones
  • Providing a viable storage management solution for multi-terabyte data sets
  • building on dCache, SRM, etc
  • End-to-end throughput with the goal of TB/day sustained rates from mass storage to mass storage (see the transfer sketch after this list)
  • Interfacing the Grid VO system to the local user
    registration/security requirements
  • Consolidating the production system, data bases,
    production configuration and meta-data systems
    (MC_Runjob, catalogs, scheduling)
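(Illustrative note, not from the original slides: a sustained 1 TB/day corresponds to roughly 1e12 bytes / 86400 s ≈ 11.6 MB/s, about 93 Mbit/s. Below is a minimal sketch of driving a GridFTP transfer with the VDT's globus-url-copy client and comparing the achieved rate against that target; the hosts, paths and file size are hypothetical.)

  import subprocess, time

  SRC = "gsiftp://t1.example.org/data/hlt/sample.root"   # hypothetical source URL
  DST = "gsiftp://t2.example.org/data/hlt/sample.root"   # hypothetical destination URL
  SIZE_BYTES = 2 * 1024**3                               # assume a 2 GB file

  start = time.time()
  # 8 parallel TCP streams and 2 MB TCP buffers for a high-latency WAN
  subprocess.run(["globus-url-copy", "-p", "8", "-tcp-bs", "2097152", SRC, DST],
                 check=True)
  elapsed = time.time() - start

  rate_mb_s = SIZE_BYTES / elapsed / 1e6
  # 1 TB/day ≈ 11.6 MB/s sustained (≈ 93 Mbit/s)
  print(f"achieved {rate_mb_s:.1f} MB/s; 1 TB/day needs ~11.6 MB/s sustained")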

14
Upcoming Projects
  • VO management and security (a simple mapping sketch follows this list)
  • Working with Fermilab security and the PPDG Site-AA team
  • many R&D, deployment, integration issues
  • Need to develop an operations scenario
  • Cluster Management: generic farms, partitioning
  • Needed to borrow manpower from Tier-2s (!!)
  • Storage Management and Access
  • Storage architecture: components and interfaces
  • Also data set catalogs, metadata, replication, robust file transfers
  • Networking: terabyte throughput to T2 and to CERN
  • TCP/IP on high-throughput WANs, end-to-end, QoS,
    VPN, Starlight?
  • Web Services, Document Systems, Data Bases
  • Need to ramp up general support infrastructure
  • Physics Analysis Center
  • Analysis cluster
  • Desktop support
  • Software Distribution, Software support, User
    Support Helpdesk
  • Collaborative tools
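(Illustrative note, not from the original slides: Grid sites of this era typically authorized VO members by mapping certificate Distinguished Names to local accounts in a Globus-style grid-mapfile. A minimal sketch of generating such a mapping; the DNs and pooled account names are hypothetical, not the actual US CMS or PPDG Site-AA design.)

  # Build a grid-mapfile: one quoted certificate DN per line, mapped to a local account
  vo_members = [
      "/DC=org/DC=doegrids/OU=People/CN=Alice Example 12345",   # hypothetical DNs
      "/DC=org/DC=doegrids/OU=People/CN=Bob Example 67890",
  ]

  def make_gridmap(dns, pool_prefix="uscms", start=1):
      """Return grid-mapfile lines mapping each DN to a pooled account."""
      return "".join(f'"{dn}" {pool_prefix}{i:03d}\n'
                     for i, dn in enumerate(dns, start))

  with open("grid-mapfile", "w") as out:
      out.write(make_gridmap(vo_members))
  # Result, e.g.:  "/DC=org/.../CN=Alice Example 12345" uscms001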

15
CMS Milestones v33 (June 2002)
  • DC04

16
Set of High Level Milestones
  • Integration Grid Testbed deployed, running PRS
    production
  • October 2002
  • SC2002 demonstration of Grid distributed
    production and WorldGrid
  • November 2002
  • review of SC2002 demo, promotion and termination
    might be combined with the testbed review after
    the SC2002 demonstration
  • December 2002
  • Farm Configuration Definition and Deployment
  • February 2003.
  • Fully Functional Production Grid on a National
    Scale
  • February 2003
  • Migration of TestBed functionality to Production
    Facilities
  • March 2003
  • Start of LCG 24x7 Production Grid,
  • June 2003 -- this needs definition from the
    LCG/GDB as to what it actually means
  • Start of CMS DC04 production preparation, PCP04
  • July 2003
  • Running of DC04
  • Feb 2004

17
Resource Expectations for DC04-prep
18
The need for a new plan
  • Although the project was morally baselined in November 2001, and had a scope that corresponded to the Funding Agency guidance, the DOE found that it would be unable to fund the full baseline
  • We received new guidance well below the previous one, which also included the costs of detector M&O and Upgrade R&D
  • I was asked by the US CMS RPM to produce a bare-bones project plan that would address the funding shortfall during FY03 and FY04
  • This was done using a top-down approach and
    presented to CMS and Funding Agencies during the
    summer
  • I received guidance for the FY03 DOE funds available to S&C, which is about the cost of the bare-bones plan
  • With this input I asked L2 managers and
    sub-project leaders to develop a new WBS and
    resource loaded schedule
  • This is addressing the necessary planning to
    arrive at TDRs in 2004 and 2005
  • Which will include a re-evaluation of costs and
    efforts for the US CMS UF

19
Bare Bones Plan Top-Down
  • In response to the revised funding guidance we defined a Bare-Bones project plan
  • Continue to deliver S&C engineering support to CCS; keep this effort constant over the next 2 years, instead of ramping further
  • However, CMS Software is about 12 months behind
    due to past severe understaffing
  • CMS needs the US (and CERN, etc.) manpower to work with the LCG Applications Area
  • Contingencies, if any, will possibly develop in the course of joining forces with ATLAS et al.
  • Ensure the US CMS system of Regional Centers is ready for the Data Challenges; (finally!) ramp the Tier-1 effort to the required level of 13 FTE in FY2003
  • Starting at 6 FTE now, need to ramp to 13 FTE to
    fully participate in the Data Challenges
  • The Bare-Bones plan should enable US CMS to become part of the LCG Production Grid milestone during 2003
  • Can participate in the DC04 and 10% Data Challenges; will try to pull in as much as we can from Grid projects
  • Modest hardware procurements for Tier-1 and
    Tier-2 centers
  • Typically $500k/year in FY2003, FY2004
  • Need more ($700k) in 2005 (at the latest) for participation in the 10% challenge, RC TDR
  • Start the deployment phase late in 2005
  • Then need to start hiring or re-assigning
    facility support people at the Tier-1
  • Start pilot implementation of US Tier-2

20
Bare Bones Project Costs
21
Bare-Bones Scope BCWS FY02-FY05

22
Funding Guidance to RP
23
FY03 RP Funding Allocation
  • Allocation of RP funding to S&C and M&O by the Research Program Manager
  • Assumption of the RP taking a loan from construction, to be paid back in 2005/6 when the construction project finishes, softening the CP profile
  • This is being recognized in the DOE/NSF
    Strawman Funding Guidance

24
US CMS S&C FY03 Budget
  • Cost objective for FY03: $4M

25
A New Detailed WBS for 2003/4
  • We have reworked the WBS as a tool to manage the
    project
  • Adapt the project plan to the new, shrunken scope of slightly above bare bones -- instead of executing the Nov 2001 baseline
  • Formulate a consistent plan leading to a strong
    US CMS participation in the
  • LCG production grid, starting June 2003, and
    the
  • CMS 5% data challenge DC04
  • Those plans are in the process of becoming concrete enough to be planned for in detail --
  • choice of middleware,
  • RC resource scheduling
  • Security scheme
  • The new WBS now clearly recognizes the roles
  • of our engagement in the Grid and end-to-end
    projects
  • of the testbeds for facilities and Grid
    developments
  • of the short prototype cycles that enable us to put R&D results into production
  • Worked out a WBS that reflects reality, and has a
    structure that works with the new Fermilab
    project accounting scheme, and allows tracking of
    effort and progress at the lowest level in the WBS

26
New WBS Level 3
  • Note the large effort captured from the Grids and/or based at universities
  • This is now being explicitly tracked by US CMS

27
US CMS Approach to R&D, Integration, Deployment
  • prototyping, early roll-out, strong QC/QA documentation, tracking of external practices

28
WBS Level 4 -- 1.1

29
WBS Level 4 -- 1.2

30
WBS Level 4 -- 1.3

31
WBS Level 4 -- 1.4

32
Estimated Resource Needs at T1 FY04/04

33
US CMS T1 and T2 Manpower
34
WBS and Schedule
  • Developing the new project plan and a WBS with
    resource loaded schedule is a rather slow
    process, but we have established the guiding
    principles and have gone a long way
  • The WBS exists and is being used for detailed work planning
  • The UF effort will need to come from existing manpower at Fermilab; working with CD management to extract that manpower
  • There is general agreement on the WBS which will
    allow us to consolidate a large and distributed
    effort with many different funding sources
  • It is very encouraging to be able to agree on the
    general strategy between the Tier-1, the Tier-2
    centers and the Grid projects
  • The new structure is much better suited to having the WBS owned by the local managers. This is necessary to keep the WBS up to date and to track the project.
  • The WBS is being resource-loaded; the detailed schedule covers the time until the end of 2003
  • However, many unknowns in 2003, including success
    of POOL, LCG-1 etc
  • Please browse the WBS and schedule at http://heppc16.ucsd.edu/Planning_new/
  • We are developing a detailed equipment
    procurement plan for FY03

35
NSF Research Program Proposal
  • Have a two-year grant for FY02/FY03
  • expecting the RP to start in FY03, ramping over 5 years
  • This is addressing
  • engineering for CCS
  • developing the distributed Grid computing model
  • building up the LHC computing infrastructure in
    the US
  • operations of the emerging Grid
  • some middleware support
  • participation in the data challenges
  • deploying the US LHC infrastructure

36
NSF RP Proposal until 2007
  • Proposal for Research Program 2003-2007 as
    submitted through NEU

37
Conclusions on US CMS S&C
  • The US CMS S&C Project is delivering a working Grid environment, with strong participation from Fermilab and U.S. universities
  • We need to do a lot more R&D to build the system for physics
  • Our customers (CCS, PRS and US CMS users) are happy (the last time we asked), but need and want more support
  • US CMS is driving the US Grid integration and
    deployment work
  • We have a unique opportunity to bring in our ideas of doing science in a global, open, international and collaborative environment
  • Proposal to the NSF ITR solicitation to Globally
    Enable Analysis Communities
  • That goes beyond the LHC and even HEP
  • US CMS has shown that the US Tier-1/Tier-2 User
    Facility system can indeed work to deliver
    effort and resources to US CMS!
  • We definitely are on the map for LHC computing
    and the LCG
  • With the funding advised by the funding agencies and project oversight, we will have the manpower and equipment at the lab and universities to participate strongly in the CMS data challenges,
  • bringing the opportunity for U.S. leadership into
    the emerging LHC physics program