1
  • LHC Computing Grid Project (LCG)
  • CERN, European Organisation for Nuclear Research
  • Geneva, Switzerland

Project Status
Les Robertson, LCG Project Leader
EGEE Conference, Den Haag, 26 November 2004
2
Summary
  • Key points about LCG, EGEE and other Grid
    Infrastructures
  • Status & Concerns
  • Planning for LHC Startup

3
LCG Project Activity Areas
Applications: development environment and common libraries, frameworks and tools for the LHC experiments
CERN Fabric: construction and operation of the central LHC computing facility at CERN
Networking: planning the availability of the high-bandwidth network services needed to interconnect the major computing centres used for LHC data analysis
4
Risks and Opportunities
  • LCG and EGEE are combining resources to build an
    operation that is wider in scope and ambition
    than LCG would be able to tackle on its own.
  • LCG has all of its middleware eggs in the EGEE
    basket
  • If we can use the real needs and real resources of the LHC experiments to establish a general science grid infrastructure that is supported long term, we will all benefit -- that is why we are in this project
  • EGEE stops in March 2006! LHC starts in 2007!
  • This is an enormous risk for LCG
  • I am not sure that there are other applications
    that have shown this level of confidence in the
    EGEE project
  • I am sure that the LCG reviewers would not agree
    entirely with some of the views of the EGEE
    reviewers
  • -- the risk we are taking deserves a considerable
    priority from the EGEE project

5
LCG Service Hierarchy
  • Tier-2: ~100 centres in ~40 countries
  • Simulation
  • End-user analysis, batch and interactive

6
Networking
  • Latest estimates are that Tier-1s will need connectivity at 10 Gbps, with 70 Gbps at CERN (see the sizing sketch below)
  • There is no real problem with the technology, as has been demonstrated by a succession of Land Speed Records
  • But LHC will be one of the few applications needing this level of performance as a service on a global scale
  • We have to ensure that there will be an effective international backbone that reaches through the national research networks to the Tier-1s
  • LCG has to be pro-active in working with service
    providers
  • Pressing our requirements and our timetable
  • Exercising pilot services
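A minimal back-of-envelope sketch of what these link speeds mean as sustained byte rates; the 70% efficiency factor is an assumed allowance for protocol and operational overheads, not a project figure:

    # Convert the quoted link speeds (10 Gbps per Tier-1, 70 Gbps at CERN)
    # into sustained payload rates.
    GBIT = 1e9    # bits
    MBYTE = 1e6   # bytes

    def payload_mb_per_s(gbps, efficiency=0.7):
        """Sustained payload (MB/s) a link of `gbps` can realistically carry."""
        return gbps * GBIT * efficiency / 8 / MBYTE

    print(f"10 Gbps Tier-1 link : ~{payload_mb_per_s(10):.0f} MB/s")
    print(f"70 Gbps at CERN     : ~{payload_mb_per_s(70):.0f} MB/s")
    # Roughly 875 MB/s per Tier-1 and 6,125 MB/s aggregate at CERN under these
    # assumptions -- the scale the service challenges later in the talk build towards.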

7
LHC Computing Resources
  • Most of the LHC resources around the world are
    organised as national and regional grid projects,
    integrated into the combined LCG-2/EGEE
    operation
  • There are separate infrastructures in the US
    (Grid-3) and the Nordic countries (NorduGrid)
    that use different middleware
  • The LCG project has a dual role
  • Operating the LCG-2/EGEE grid - a joint LCG-EGEE
    activity
  • Coordinating the wider set of resources available
    to LHC
  • There is an active programme aimed at
    compatibility/inter-working of LCG-2/EGEE and
    Grid3
  • And on-going technical discussions with similar
    aims with NorduGrid
  • → Lack of standards is a major headache for LHC
    experiments
  • In practice, the standard is most likely to be
    set by a
    winning middleware implementation

8
Status & Concerns
9
Grid Deployment - going well
  • The grid deployment process (LCG-2) is working
    well
  • Integration, certification, debugging
  • Distribution and installation
  • Rapid reaction to problems encountered during the LHC experiments' data challenges → incremental releases of LCG-2 → significant improvements in reliability, performance and scalability
  • within the limits of the current architecture
  • Scalability is much better than scheduled or expected a year ago
  • ~90 nodes, 9,000 processors → close to final scale of the LCG grid!
  • Heavily used during the data challenges in 2004
  • lots of real work done for real physicists --
    these are not tests or demos
  • many small sites have contributed to simulation
    runs
  • one experiment (LHCb) has run up to 3,500
    concurrent jobs

10
Grid Deployment - concerns
  • The basic issues of middleware reliability and
    scalability that we were struggling with a year
    ago have been overcome
  • BUT - there are many issues of functionality,
    usability and performance to be
    resolved -- soon
  • Overall job success rate: 60-75% (a rough model of the impact follows this list)
  • Can be tolerated for production work submitted by small teams with automatic job generation and bookkeeping systems
  • Unacceptable for end-user data analysis
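A rough model, assuming independent failures, of why the same success rate is workable with automated resubmission but not for an interactive user; the 99% completion target and the 10-job analysis are illustrative assumptions:

    import math

    RATES = (0.60, 0.75)   # single-attempt success rates quoted above

    def resubmissions_for(success_rate, target=0.99):
        """Attempts needed for a job to complete with probability `target`,
        assuming independent retries handled by a bookkeeping system."""
        return math.ceil(math.log(1 - target) / math.log(1 - success_rate))

    def clean_run_probability(success_rate, n_jobs=10):
        """Chance that an end user's n_jobs all succeed first time, no retries."""
        return success_rate ** n_jobs

    for r in RATES:
        print(f"{r:.0%}: {resubmissions_for(r)} attempts for 99% completion, "
              f"{clean_run_probability(r):.1%} chance a 10-job analysis runs cleanly")
    # At 60% a clean 10-job interactive run happens well under 1% of the time;
    # production tolerates the rate only because resubmission is automated.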

11
  • Urgent to improve operations coordination and
    management
  • EGEE support resources now in place
  • Core operations centres established → CLRC Oxford, IN2P3 Lyon, CNAF Bologna, ASCC Taipei, CERN
  • Global Grid User Support centre → Forschungszentrum Karlsruhe
  • Operations workshop at CERN, 2-4 November
  • The new, improved middleware from EGEE is awaited
    with impatience

12
LCG-2 and Next Generation Middleware
LCG-2 (focus on production, large-scale data handling)
  • The service for the 2004/05 data challenges
  • Provides experience on operating and managing a global grid service -- middleware neutral
  • Continuing, modest development programme driven by data challenge experience
  • Will be supported until gLite is able to replace it (functionality, scaling, reliability, performance)
gLite (focus on analysis)
  • LHC applications and users closely involved in prototyping and development (ARDA/NA4 project)
  • Short development cycles
  • Deployed along with LCG-2 (co-existence)
  • Hope to be able to replace some LCG-2 components at an early stage with gLite components
(Timeline on the slide: LCG-2 as the production product through 2004-2005; gLite moving from prototyping in 2004 to product in 2005)
13
Middleware from EGEE
  • We have a rapidly growing number of sites
    connecting to the LCG-2/EGEE grid -- but there
    are major holes in the functionality, especially
    in data management, and concerns about workload
    management
  • The first gLite prototype was made available in a
    development environment in May (6 weeks after
    EGEE started!)
  • Good experience with this leads to strong pressure for extended access → more users, more data
  • But there are difficulties in getting the product
    out
  • the first pieces are only being delivered to the
    pre-production testbed this month
  • key components will only arrive next year
  • Absolute priority must now be to get the basic
    gLite functionality out on the pre-production
    testbed
  • -- and establish the process of short
    development cycles
  • The LHC experiments have a pressing time-line
    -- I do not want them to be forced to employ
    alternative solutions

14
Planning for LHC Startup
15
Planning for LHC Startup
To what extent will there be experience of the
new middleware before these major decisions are
made?
  • The agreements between the centres that will
    implement the LHC computing environment will be
    mapped out over the next 6-9 months
  • December 2004: experiment requirements and computing models published
  • First quarter 2005: establish resource plans for Tier-0, Tier-1 and major Tier-2s; initial plan for Tier-0/1/2 networking
  • April 2005: formal collaboration framework -- memorandum of understanding
  • July 2005: Technical Design Report -- detailed plan for installation and commissioning of the LHC computing environment

16
Service Challenge Programme to Ramp-up to LHC Startup
  • Dec 04 - Service Challenge 1
  • Basic high-performance data transfer -- 2 weeks sustained
  • CERN + 3 Tier-1s, 500 MB/sec between CERN and Tier-1s
  • Mar 05 - Service Challenge 2
  • Reliable file transfer service
  • mass store (disk) → mass store (disk)
  • CERN + 5 sites, 500 MB/sec between sites, 1 month sustained (see the volume sketch below)
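The data volumes implied by these targets, as a simple arithmetic sketch (rates and durations are from the bullets above; only the conversion to TB is added):

    DAY_SECONDS = 24 * 3600

    def volume_tb(rate_mb_per_s, days):
        """Total data moved at a sustained rate, in TB (1 TB = 1e6 MB)."""
        return rate_mb_per_s * DAY_SECONDS * days / 1e6

    print(f"SC1: 500 MB/s for 2 weeks ~ {volume_tb(500, 14):.0f} TB")
    print(f"SC2: 500 MB/s for 1 month ~ {volume_tb(500, 30):.0f} TB")
    # About 600 TB in two weeks and roughly 1.3 PB in a month -- which is why the
    # challenges exercise mass storage as well as the network.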

17
Service Challenge Programme to Ramp-up to LHC Startup
  • Jul 05 - Service Challenge 3
  • Tier-0/Tier-1 base service -- CERN + 5 Tier-1s, 300 MB/sec, including mass store (disk and tape) -- sustained 1 month
  • + 5 Tier-2 centres at lower bandwidth
  • Preparation for --
  • Tier-0/1 model verification: two experiments concurrently at 50% of nominal data rate

(Timeline: first beams in 2007, full physics run in 2008)
18
Service Challenge Programme to Ramp-up to LHC Startup
  • Apr 06 - Service Challenge 4
  • Tier-0, ALL Tier-1s, major Tier-2s operational at full target data rates (1.2 GB/sec at Tier-0)
  • Preparation for --
  • Tier-0/1/2 full model test -- all experiments
  • 100% of nominal data rate, with processing load scaled to 2006 CPUs
  • sustained 1 month

19
Service Challenge Programme to Ramp-up to LHC Startup
  • Nov 06 - Service Challenge 5
  • Infrastructure ready at ALL Tier-1s, selected Tier-2s
  • Tier-0/1/2 operation -- sustained 1 month
  • Twice the target data rates (~2.5 GB/sec at Tier-0)
  • Preparation for --
  • Feb 07 - ATLAS, CMS, LHCb, ALICE (proton mode)
  • Tier-0/1 100% full model test (see the rate summary below)
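A short summary of the rate ramp across SC3-SC5, expressed as raw network bandwidth out of the Tier-0; the rates are taken from the challenge slides, while the bits-per-byte conversion and the comparison with the ~70 Gbps networking estimate are added, with no protocol overhead applied:

    # Sustained Tier-0 export targets from the service challenge plan (MB/s).
    TARGETS = {
        "SC3 (Jul 05)": 300,
        "SC4 (Apr 06)": 1200,
        "SC5 (Nov 06)": 2500,
    }

    for name, mb_per_s in TARGETS.items():
        gbps = mb_per_s * 8 / 1000   # MB/s -> Gbps, no protocol overhead
        print(f"{name}: {mb_per_s:>5} MB/s ~ {gbps:4.1f} Gbps")
    # The SC5 target of 2.5 GB/s is about 20 Gbps of payload out of CERN, within
    # the ~70 Gbps estimate on the networking slide once overheads and other
    # concurrent traffic are allowed for.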

20
Summary
  • Grid Operation
  • Very good progress during the past year
  • Large scale deployment
  • Real work performed for experiments
  • Much work to be done to improve job success rate
    -- operations management, site discipline,
    middleware
  • Grid Middleware
  • Some of the missing functionality can be provided
    through short term developments of LCG-2
  • But we are looking to the EGEE/gLite work for
    middleware adapted to end user analysis
  • Urgent to deliver the base set of gLite
    components
  • LCG needs a permanent, increasingly stable
    service for experiments to do physics
  • And in addition has a tight schedule of service
    and computing model readiness tests