1
LCG Status
GridPP 8, September 22nd 2003
Tony.Cass@CERN.ch
2
LCG/PEB Work areas
  • Applications
  • Fabrics
  • Grid Technology
  • Grid Deployment

3
Applications
  • SPI
  • POOL
  • SEAL
  • PI
  • Simulation

4
Applications
  • SPI
  • Software Infrastructure solidly in place. The
    different components were covered in depth at
    GridPP 7.
  • Effort in this area reduced, but incremental
    improvements being delivered in response to
    feedback.
  • POOL
  • SEAL
  • PI
  • Simulation

5
Applications
  • SPI
  • POOL
  • First production release delivered on schedule
    in June (a sketch of the persistence idea
    follows this list)
  • Experiment integration now underway
  • production use for CMS Pre-Challenge Production
    milestone met at end July
  • completion of first ATLAS integration milestone
    expected in September.
  • POOL deployment on LCG-1 beginning
  • POOL and SEAL working closely with experiment
    integrators to resolve bugs and issues exposed in
    integration.
  • Lots of them, but this was expected!
  • SEAL
  • PI
  • Simulation
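To make the POOL bullets above concrete, here is a minimal C++ sketch of
the token-based persistence idea: a write hands back a location-independent
token that can later be resolved to re-read the object. PersistencySvc and
Token are illustrative names only, not POOL's actual API; real POOL works
with C++ object dictionaries and file catalogues rather than raw strings.

  // Sketch of token-based object persistence in the style of POOL.
  // All names here are illustrative, not POOL's real interfaces.
  #include <fstream>
  #include <iostream>
  #include <sstream>
  #include <string>

  struct Token {            // location-independent object identifier
      std::string file;     // which store holds the object
      std::string id;       // key of the object inside that store
  };

  class PersistencySvc {
  public:
      // Write a payload and hand back a Token the client can keep.
      Token write(const std::string& file, const std::string& id,
                  const std::string& payload) {
          std::ofstream out(file + "." + id);  // one object per file, for brevity
          out << payload;
          return Token{file, id};
      }
      // Resolve a Token back into the payload it points at.
      std::string read(const Token& t) {
          std::ifstream in(t.file + "." + t.id);
          std::stringstream ss;
          ss << in.rdbuf();
          return ss.str();
      }
  };

  int main() {
      PersistencySvc svc;
      Token t = svc.write("events.db", "evt001", "track pt=4.2 GeV");
      std::cout << svc.read(t) << "\n";        // prints: track pt=4.2 GeV
  }

The point is only the indirection: clients hold tokens rather than file
paths, so the storage behind them can be reorganised without touching
client code.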

6
Applications
  • SPI
  • POOL
  • SEAL
  • Project on track
  • PI
  • Simulation

7
Applications
Release   Planned    Released   Status    Description (goals)
V 0.1.0   14/02/03   26/02/03   internal  Establish dependency between POOL and SEAL;
                                          dictionary support, generation from header files
V 0.2.0   31/03/03   04/04/03   public    Essential functionality sufficient for the other
                                          existing LCG projects (POOL); foundation library,
                                          system abstraction, etc.; plugin management
V 0.3.0   16/05/03   23/05/03   internal  Improved functionality required by POOL; basic
                                          framework base classes
V 1.0.0   30/06/03   18/07/03   public    Essential functionality sufficient to be adopted
                                          by experiments; collection of basic framework
                                          services; scripting support
8
Applications
  • SPI
  • POOL
  • SEAL
  • Project on track
  • Waiting for detailed feedback on current
    functionality from POOL and the experiments
  • Planning to develop newly requested functionality
  • Object whiteboard (transient data store); a
    sketch follows this list
  • Improvements to scripting: LCG dictionary
    integration, ROOT integration
  • Complete support for C++ types in the LCG
    dictionary
  • PI
  • Simulation
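As a rough illustration of the "object whiteboard" item above, this C++17
sketch implements a transient data store: services put and get objects by
name for the duration of one event, and the store is cleared between
events. The class and method names are invented and do not reflect SEAL's
eventual design.

  // Sketch of a transient data store ("object whiteboard").
  #include <any>
  #include <iostream>
  #include <map>
  #include <stdexcept>
  #include <string>

  class Whiteboard {
      std::map<std::string, std::any> store_;  // name -> object
  public:
      template <typename T>
      void put(const std::string& key, T value) {
          store_[key] = std::move(value);      // overwrites silently, for brevity
      }
      template <typename T>
      T& get(const std::string& key) {
          auto it = store_.find(key);
          if (it == store_.end())
              throw std::runtime_error("no object: " + key);
          return std::any_cast<T&>(it->second); // throws on type mismatch
      }
      void clear() { store_.clear(); }          // wipe between events
  };

  int main() {
      Whiteboard wb;
      wb.put<int>("nTracks", 42);
      std::cout << "nTracks = " << wb.get<int>("nTracks") << "\n";
      wb.clear();                               // ready for the next event
  }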

9
Applications
  • SPI
  • POOL
  • SEAL
  • PI
  • Principal initial mandate, a full ROOT
    implementation of AIDA histograms, recently
    completed (a histogram interface sketch follows
    this list)
  • Still a small effort with limited scope, though.
  • Future planning depends on what comes out of the
    ARDA RTAG
  • Architectural Roadmap towards Distributed
    Analysis
  • Reviewing DA activities, HEPCAL II use cases,
    interfaces between Grid, LCG and
    experiment-specific services.
  • Started in September, scheduled to finish in
    October.
  • Simulation
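To illustrate the histogram item above: AIDA defines abstract histogram
interfaces that concrete backends fulfil, and PI's work was a ROOT-backed
implementation of them. Below is a self-contained sketch of that pattern
with a trivial in-memory backend; the interface shown is AIDA-like but
simplified, not the real AIDA API.

  // AIDA-style abstract histogram with a simple in-memory backend.
  #include <iostream>
  #include <vector>

  class IHistogram1D {                     // minimal AIDA-like interface
  public:
      virtual ~IHistogram1D() = default;
      virtual void fill(double x, double weight) = 0;
      virtual double binHeight(int i) const = 0;
  };

  class Histogram1D : public IHistogram1D {
      double lo_, hi_;
      std::vector<double> bins_;           // fixed-width bins, no under/overflow
  public:
      Histogram1D(int nbins, double lo, double hi)
          : lo_(lo), hi_(hi), bins_(nbins, 0.0) {}
      void fill(double x, double w) override {
          if (x < lo_ || x >= hi_) return; // drop out-of-range entries
          int i = static_cast<int>((x - lo_) / (hi_ - lo_) * bins_.size());
          bins_[i] += w;
      }
      double binHeight(int i) const override { return bins_[i]; }
  };

  int main() {
      Histogram1D h(10, 0.0, 100.0);       // 10 bins over [0, 100)
      for (double x : {5.0, 15.0, 15.5, 95.0}) h.fill(x, 1.0);
      std::cout << "bin 1 height = " << h.binHeight(1) << "\n";  // 2 entries
  }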

10
Applications
  • SPI
  • POOL
  • SEAL
  • PI
  • Simulation
  • Physics Validation subproject particularly active
  • pion shower profile for ATLAS improved
  • expect an extensive round of comparisons with
    testbeam data in autumn.
  • ROSE (Revised Overall Simulation Environment)
  • Looking at the generic framework's high-level
    design, implementation approach, and software to
    be reused. Decisions expected in September.

11
Fabrics
  • CC Infrastructure
  • Recosting
  • Management successes
  • RH release cycles

12
Fabrics CC Infrastructure
13
Fabrics CC Infrastructure
14
Fabrics Recosting I
  • Representatives from IT and the 4 LHC experiments
    reviewed the expected equipment cost for LCG
    phase 2.
  • Took into account adjusted requirements from the
    experiments and some slight changes to the
    overall model.
  • Results published in July.

15
Fabrics Recosting II
All units in million CHF
16
Fabrics System Management
  • Overall management suite christened ELFms over
    the summer, with components
  • quattor: EDG/WP4 installation and configuration
  • Lemon: LHC Era Monitoring
  • LEAF: LHC Era Advanced Fabrics
  • quattor thoroughly in control of the CERN fabric
    (a sketch of the declarative model follows this
    list)
  • migration to RH 7.3 managed by quattor in spring
  • LSF 5 migration took 10 minutes in late August
  • across 800 batch nodes; the equivalent migration
    in 2002 took over 3 weeks, with much disruption.
  • EDG/WP4 OraMon repository in production since
    September 1st.
  • State Management System development underway.
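A rough sketch of the declarative model behind a quattor-style tool, as
referenced in the list above: the desired node profile is plain data, and
an agent diffs it against the observed state to derive the actions needed
to converge. The profile keys and values below are invented for
illustration and are not quattor's real schema.

  // Diff a desired node profile against observed state (C++17).
  #include <iostream>
  #include <map>
  #include <string>

  using Profile = std::map<std::string, std::string>;  // key -> value

  // Print the actions needed to move `actual` towards `desired`.
  void converge(const Profile& desired, const Profile& actual) {
      for (const auto& [key, want] : desired) {
          auto it = actual.find(key);
          if (it == actual.end())
              std::cout << "SET    " << key << " = " << want << "\n";
          else if (it->second != want)
              std::cout << "CHANGE " << key << ": " << it->second
                        << " -> " << want << "\n";
      }
  }

  int main() {
      Profile desired = {{"/system/os/version", "rh73"},
                         {"/software/lsf/version", "5.1"}};
      Profile actual  = {{"/system/os/version", "rh61"}};
      converge(desired, actual);
      // SET    /software/lsf/version = 5.1
      // CHANGE /system/os/version: rh61 -> rh73
  }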

17
Fabrics RedHat Release Cycles
  • RedHat are moving to a twin product line
  • Frequent end-user releases with support limited
    to 1 year, but free.
  • Less frequent business releases with long-term
    support, at a cost.
  • Neither product is really adapted to our needs
  • An annual change of system version is too rapid;
    an 18-24 month cycle is more realistic.
  • The cost of the Enterprise server is prohibitive
    for our farms.
  • Move to negotiate with RedHat for a compromise
  • Major labs club together to pay for limited
    support (security patches?) for the end-user
    product for, say, 2 years.
  • Discussions at HEPiX in Vancouver
  • Plus a visit to RedHat?

18
Grid Deployment I
  • Deployment started (with a pre-release tag) in
    July, to the original 10 Tier 1 sites
  • CERN, BNL, CNAF, FNAL, FZK, Lyon, Moscow, RAL,
    Taipei, Tokyo
  • Other sites joined: PIC (Barcelona), Prague,
    Budapest
  • Situation today (18/9/03)
  • 10 sites up: CERN, CNAF, RAL, FZK, FNAL, Moscow,
    Taipei, Tokyo, PIC, Budapest
  • Still working on installation: BNL, Prague, Lyon
    (situation not clear)
  • Other sites currently ready to join
  • Bulgaria, Pakistan, Switzerland, Spanish Tier
    2s, Nikhef, Sweden
  • Official certified LCG-1 release (tag LCG-1.0.0)
    was available on 1 September at 5pm CET
  • Installed at CERN, Taiwan, CNAF, Barcelona and
    Tokyo 24 hours later(!), and at several others
    within a few days

19
Grid Deployment II
  • LCG-1 is
  • VDT 1.1.8-9 (Globus 2.2.4)
  • Information System (MDS)
  • Selected software from EDG 2.0
  • Workload Management System (RB)
  • EDG Data Management (RLS, LRC, ...)
  • GLUE Schema 1.1 with LCG extensions
  • LCG local modifications/additions/fixes, such as
  • Special job managers (LCGLSF, LCGPBS, LCGCONDOR)
    to solve the problem of sharing home directories
    (a staging sketch follows this list)
  • Gatekeeper enhancements (adding some accounting
    and auditing features and the log rotation that
    LCG requires)
  • A number of MDS fixes (also coming from
    NorduGrid)
  • A number of misc. Globus fixes, most of them now
    included in the VDT version LCG is using
  • Some problems remain. Overall, though, an
    impressive improvement in terms of stability.
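As a rough sketch of the shared-home-directory problem mentioned above:
standard job managers assumed the gatekeeper and the worker nodes shared a
home directory, and the LCG job managers instead stage job files
explicitly. The C++17 sketch below shows only the staging idea; the paths
and the run_job.sh payload are invented, and the real job managers are
wrappers around the batch systems, not code like this.

  // Stage a job sandbox to node-local scratch, run, and stage back.
  #include <cstdlib>
  #include <filesystem>
  #include <iostream>

  namespace fs = std::filesystem;

  int main() {
      fs::path shared  = "/home/griduser/job42";  // gatekeeper-visible sandbox
      fs::path scratch = "/scratch/job42";        // node-local working area

      fs::create_directories(scratch);
      fs::copy(shared, scratch, fs::copy_options::recursive |
                                fs::copy_options::overwrite_existing);

      // Run the payload in local scratch, not in the shared home.
      std::string cmd = "cd " + scratch.string() + " && ./run_job.sh";
      int rc = std::system(cmd.c_str());

      // Stage the output sandbox back where the gatekeeper can see it.
      fs::copy(scratch / "output", shared / "output",
               fs::copy_options::recursive |
               fs::copy_options::overwrite_existing);
      std::cout << "job exited with " << rc << "\n";
  }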

20
Grid Deployment III
  • Starting to get experiments testing LCG-1 now
  • Loose Cannons currently running on LCG-1 to
    verify basic functionality
  • Scheduling now with experiments
  • Initially we need to carefully control who does
    what
  • we need to monitor the system as the tests run
    to understand the problems
  • Migrate CMS LCG-0 to LCG-1
  • ATLAS, US_Atlas (want to demonstrate
    interoperability)
  • ALICE continue with the tests started by the
    Loose Cannons
  • LHCb?
  • We are scheduling these tests now; they will
    commence next week
  • Once experiments verify their software on LCG-1
    we must begin to add resources at each site
  • Currently only very basic resources are available

21
Grid Deployment IV
  • Basics are in place, but many tasks must be done
    at high priority to make a real production system
  • Experiment software distribution mechanism
  • Monitors to watch essential system resources on
    essential services (/tmp, etc.); a sketch follows
    this list
  • System cleanup procedures
  • System auditing: must ensure procedures are in
    place
  • Need basic usage accounting in place
  • Need a tool-independent WN installation
    procedure, also for the UI
  • Integration with MSS (setting up a task force)
  • NB: sites with MSS will need to implement
    interfaces
  • Integration with LXBatch (and others)
  • Standard procedures: we will start, but this
    needs a team from the sites and the GOC
  • for setting Runtime Environments
  • Change procedures
  • Operations
  • Incident handling
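A minimal example of the kind of resource monitor the list above calls
for, using POSIX statvfs to warn when a filesystem runs low on free
space. The 10% threshold and the paths checked are arbitrary choices for
illustration.

  // Warn when monitored filesystems drop below a free-space threshold.
  #include <sys/statvfs.h>
  #include <iostream>

  int main() {
      const char* paths[] = {"/tmp", "/var"};
      for (const char* p : paths) {
          struct statvfs vfs;
          if (statvfs(p, &vfs) != 0 || vfs.f_blocks == 0) {
              std::cerr << "cannot stat " << p << "\n";
              continue;
          }
          double freeFrac = static_cast<double>(vfs.f_bavail) / vfs.f_blocks;
          if (freeFrac < 0.10)
              std::cout << "WARNING: " << p << " below 10% free ("
                        << freeFrac * 100 << "%)\n";
          else
              std::cout << p << " ok: " << freeFrac * 100 << "% free\n";
      }
  }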

22
Summary
  • In general, good progress
  • Applications area
  • Fabrics
  • Yes, LCG-1 is delayed
  • but don't forget the vast improvements to the
    overall system driven by the focus on delivering
    a production-quality environment.
  • The UK contribution to this work is extensive
    and much appreciated.