LCG Milestones for Deployment, Fabric, Grid Technology

1
LCG Milestones for Deployment, Fabric, Grid Technology
  • Ian Bird
  • LCG Deployment Area Manager
  • Presentation to LHCC Referees
  • 25-Nov-2002

2
M1.1 First Global Service: Initial Availability, July 2003
  • This comprises the construction and commissioning
    of the first LHC computing service for physics
    usage. The service must reliably offer 24x7
    availability to all 4 LHC experiments and include
    some 10 Regional Centres in Europe, North
    America, and Asia.
  • The milestone includes delivery of the associated
    Technical Design, containing description of the
    architecture, functionality and quantified
    technical specifications of performance
    (capacity, throughput, reliability,
    availability). It must also include middleware
    specifications, agreed as a common toolkit by
    Europe and US.
  • The service must prove functional, providing a
    batch service for event production and analysis
    of the simulated data set. For the milestone to
    be met, operation must be sustained reliably
    during a 7-day period in which stress tests and
    user productions will be executed, with a failure
    rate below 1%.
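
The acceptance criterion above can be sketched as a simple check. This is only an illustration, not the project's actual procedure: it assumes the threshold is read as 1%, that "failure rate" means failed jobs divided by submitted jobs over the most recent 7 days, and the names DayStats, failure_rate and acceptance_met are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DayStats:
    """Job counts for one day of operation (hypothetical record)."""
    jobs_submitted: int
    jobs_failed: int


def failure_rate(days: Sequence[DayStats]) -> float:
    """Aggregate failure rate: failed jobs / submitted jobs (an assumed definition)."""
    submitted = sum(d.jobs_submitted for d in days)
    failed = sum(d.jobs_failed for d in days)
    if submitted == 0:
        raise ValueError("no jobs submitted in the window")
    return failed / submitted


def acceptance_met(days: Sequence[DayStats],
                   window_days: int = 7,
                   max_failure_rate: float = 0.01) -> bool:
    """True if the last `window_days` days stay strictly below the threshold."""
    if len(days) < window_days:
        return False  # the period has not yet been sustained long enough
    return failure_rate(days[-window_days:]) < max_failure_rate
```

A real acceptance test would presumably also cover availability and throughput, which this sketch ignores.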

3
L2 milestones for M1.1
  • Define LCG-1 in terms of functionality, resources,
    operations, security, support
  • Series of evolving pilot services for testing,
    with increasing resources
  • Testing, certification, packaging and release of
    software
  • Set up infrastructure and operational procedures
  • Set up operations centre and help desk (call
    centre)
  • LCG-1 commissioning and acceptance

4
M1.1 (a)
  • Define LCG-1 functionality, resources,
    operations, security
  • The 5 working groups of the Grid Deployment Board
    will define LCG-1.
  • Functionality needed by the experiments for their
    data challenges: identify VDT and EDG components
    to provide it; negotiate support agreements with
    providers.
  • Resources and Regional Centres to participate in
    LCG-1: deployment schedule and resource ramp-up.
    Define resource request and review process.
  • Negotiate initial security model (authentication,
    authorization etc.) acceptable to all centres;
    provide a plan to achieve the full requirements
    of the centres.
  • Define operating procedures; negotiate agreements
    with centres to put these into place.
  • Define the user support model.
  • WG1-4 will provide an interim report on Dec 9 and
    a final report on Feb 1, 2003; WG5 3 months later.

5
M1.1 (b)
  • Series of evolving pilot services for testing,
    with increasing resources
  • Pilot-1 service: February 1, 2003.
  • 50 machines (CE), 10 TB (SE). Runs middleware
    currently on LCG testbeds. Initial testbed at
    CERN.
  • Add 1 remote site by February 28, 2003.
  • Pilot-2 service: March 15, 2003.
  • 100 machines (CE), 10 TB (SE). CERN service
    will run full prototype of WP4 installation and
    configuration system.
  • Add 1 US site to pilot: March 30, 2003
  • Add 1 Asian site to pilot: April 15, 2003
  • Add 2-3 more EU and US sites: April-May 2003
  • Service includes 6-7 sites: June 1, 2003
  • LCG-1 initial production system: July 2003.
  • 200 machines (CE), 20 TB (SE). Uses full WP4
    system with fully integrated fabric
    infrastructure. Global service has 6-7 sites in
    3 continents.

6
M1.1 (c)
  • Testing, certification, packaging and release of
    software
  • This is the process by which we make the service
    reliable and supportable (production service)
  • Certification, testing, release process defined:
    January 2003.
  • To verify functionality, robustness, etc.
    Essential to provide production service. Process
    is defined for EDG; modify it for LCG.
  • Packaging/configuration mechanism defined: March
    2003.
  • Needed to automate installation and
    configuration. A collaborative activity between
    LCG and grid projects. Requirements gathering in
    progress.
  • Delivery of middleware software packages: March
    1, 2003
  • This is delivery to LCG from the grid middleware
    providers
  • Iterative, incremental release cycle, with major
    functional releases
  • V1.0: June 1, 2003
  • V1.1: October 1, 2003
  • Incremental releases to improve stability,
    robustness, fix problems.

7
M1.1 (d)
  • Set up infrastructure and operational procedures:
    January-June 2003
  • Schedule and details driven by outcome of GDB
    working groups
  • Certificate Authorities and VO management systems
    in place: May 2003
  • Based on existing EU and US inter-operating
    systems
  • Deploy grid services to participating sites
  • As they come online according to WG2 schedule
  • Agreement on responsibilities for management of
    services
  • This is the outcome from WG4: February 1, 2003
  • Resource accounting and reporting procedures set
    up: May 2003
  • Security procedures defined and agreed: June
    2003
  • Incident response and security management

8
M1.1 (e)
  • Set up operations centre and help desk (call
    centre)
  • Identify operations and call centre locations:
    February 1, 2003
  • A call centre to provide operational and helpdesk
    support
  • Distributed across 2 sites initially to provide
    reasonable coverage
  • Monitoring system based on tools used in testbeds
    and recent demonstrations
  • Existing experience in Teragrid and iVDGL,
    DataTAG
  • Needs a problem tracking database; there are
    several candidate systems
  • In place by June 2003

9
M1.1 (f)
  • LCG-1 commissioning and acceptance: June 2003
  • 30-day commissioning period with user productions
    and stress tests, including a 7-day acceptance
    period

10
M1.4 Fully Operational LCG-1 Service, November 2003
  • This comprises the availability of LCG-1 as a
    fully operational and performant 24x7 production
    service. Operation must be sustained for a
    period of 1 month. This service would be used
    for the 5% data challenges of the LHC
    experiments. LCG-1 will be operated
    continuously, evolving in terms of capacity,
    performance and functionality. It includes the
    addition of Regional Centres as they come on-line,
    as defined in GDB Working Group 2.
  • It includes the delivery of the technical service
    specifications and user documentation, and
    deployment/consolidation of an appropriate user
    support infrastructure. It also includes
    incremental releases of middleware to improve
    reliability, robustness, and performance.
  • The service level must be as required for the
    2004 data challenges. The determination and
    acceptance of the milestone should be done with a
    review of the service by representatives of the
    experiments, regional centres, and LCG.

11
L2 Milestones for M1.4
  • Define LCG-1 performance goals: July 2003
  • In concert with experiments and their data
    challenge requirements, set performance goals in
    terms of capacity, throughput, reliability, etc.
    A GDB working group.
  • 10 Regional Centres participating: October 2003
  • WG2 defines the implementation schedule, which may
    be adjusted in July. Add centres one at a time
    until October.
  • LXBATCH service merged into LCG-1: October 2003
  • All resources of LXBATCH will be grid-enabled and
    accessible as part of the LCG-1 service.
  • Milestone release of middleware: October 2003
  • V1.1 release with improved functionality:
    October 2003
  • Review of service: November 2003
  • The LCG-1 service level should be that required
    for the 2004 data challenges. The determination
    and acceptance of achieving the target will be
    done in a review of the service by
    representatives from the experiments, the
    regional centres and LCG.

12
M1.6 Fully Operational LCG-3 Service, January 2005
  • This comprises the construction and commissioning
    of a fully operational full-size prototype
    (LCG-3) of what will be the initial LHC computing
    production service. Operation must be sustained
    24x7 reliably for a period of 1 month.
  • LCG-3 will be used as a proof that the LHC
    computing model will work, including Tier 0, 1, 2
    and 3 regional centres, providing practical
    backup for the computing service TDR. LCG-3 will
    use the LHC Grid toolkit, will have 50% of the
    components required for the 2007 production
    service of CMS or ATLAS, and will be used for the
    20% milestones of the experiments.

13
L2 Milestones for M1.6
  • Define LCG-3: February 2004
  • Functionality, middleware packages
  • Resources, Regional Centre participants
  • Performance goals
  • LCG-3 pilot system available: July 2004
  • Operate in parallel with LCG-1 production
    service. Used for integration and functional
    tests by experiments.
  • Decision on new batch system software (CERN):
    December 2004
  • Following a review of scheduler software
    alternatives
  • Upgrade LCG-1 service to LCG-3
  • December 2004-January 2005. This is a major
    upgrade that can only be done at a quiet time.

14
M1.8 Completion of the Computing Service TDR, June 2005
  • The Computing Service TDR will specify the
    requirements for the Grid that will be used for
    the first production services for the four LHC
    experiments. It will include details of the
    architecture, functionality, capacity,
    performance, throughput and availability.
  • It will include the Regional Centre plans that
    will have been developed to meet these
    requirements, and will provide cost estimates and
    an overall installation and verification
    schedule. It is assumed that the TDR will be
    approved by the LHCC within three months
    following its availability, and may be used to
    provide data for the Memorandum of Understanding
    for Phase 2 of the project.
  • The full process from acquisition to service
    verification is expected to take 12-18 months
    (according to the administrative procedures of
    the Regional Centres). The initial service must
    be in full production by September 2006 (6
    months before data taking). The TDR will
    therefore be approved after the acquisition
    procedures have started, but before orders are
    placed.

15
L2 Milestones for M1.8 - TDR
  • Complete proposals for NSF-ITR and EU-FP6: April
    2003
  • Programs at proposal stage to re-engineer,
    robustify, improve grid middleware
  • Report on comprehensive reviews of grid
    technologies, define strategy for missing
    functionality: July 2003
  • Reviews to identify technology providers,
    capabilities and strategies for LCG-3. Includes
    a plan to provide functions not provided above.
  • Review of status of progress: July 2004
  • Experiments' final analysis models: December
    2003
  • In light of the first 6 months' experience with
    LCG-1, the experiments should provide updated
    analysis models
  • SC2 Review: December 2004
  • Comprehensive review of experience in the
    experiments and at the Regional Centres in
    deploying, operating, and using LCG services.
    Update the requirements and service model for
    deployment and operation of the final system.

16
Timelines
Incremental middleware releases
Incrementally add regional centres ?