Proposal for changes to CMS Core Software


1
Proposal for changes to CMS Core Software
  • CMS Framework Review Task Force

2
Charge / Mandate (1)
  • Issued by:
  • - David Stickland (CCS)
  • - Paris Sphicas (PRS)
  • We were asked to:
  • List good concepts to be present in a framework
    and event model based on the Tevatron RunII
    experiences
  • Review the current features of the CMS EDM
  • Provide a list of "missing features" in the CMS
    environment

3
Charge / Mandate (2)
  • Establish a set of rules or guidelines for ORCA
    code developers (various physics packages) to
    determine the interaction between them and the
    framework
  • Produce a list of changes that are needed. The
    list can include changes in concept and the
    design of the EDM
  • Produce a possible work-plan along with an
    estimate of the resources required -- to
    implement the changes needed

4
Participants
  • Vincenzo Innocente
  • Bill Tanenbaum
  • Liz Sexton-Kennedy
  • Marc Paterno
  • Jim Kowalkowski
  • Walter Brown
  • Lothar Bauerdick
  • Peter Elmer
  • Lassi Tuura
  • Avi Yagil
  • Ken Bloom

5
Overview
  • We did cover all points of the charge
  • Made a list of important features needed based on
    RunII, part of a draft document
  • Examined their presence in the CMS framework
  • Agreed on a list of missing features!
  • Produced a list of desirable changes to CMS
    framework
  • Will present a possible work plan to implement
    the above (IF decision to do so is reached, and
    the CMS collaboration decides to adopt it)

6
Main issues
  • The following are what we thought a priori to be
    the biggest, stickiest, most contentious issues:
  • Event class, Modules, communication
  • Scheduling (Reco-on-Demand, HLT etc.)
  • Data structure, usage patterns (ROOT, ORCA-Lite)

7
Event Data Model (EDM)
  • There will be one (and only one) Event class
  • Event represents a single crossing
  • Modules (Reconstruction Algos) Communication
  • Modules interact with each other only via the
    Event class
  • Modules shall not interact directly (call) other
    modules.
  • Modules are directly "configurable"
  • - internal details (algorithms) are configured by
    "percolating" ParameterSets to them, from the
    Module that contains them.
  • - no other method of obtaining configuration is
    allowed
  • Event data objects should not depend upon the
    classes that create them.
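
The rules on this slide can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the names `Event`, `Module`, `ParameterSet`, `TrackFinder`, `ElectronFinder` and `runExample` are hypothetical and are not the CMS framework API. The point is that the second module reads what the first one wrote through the Event, and each module gets its configuration only from its ParameterSet.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical names throughout; an illustration, not the CMS framework API.
using ParameterSet = std::map<std::string, double>;

// Plain data products: they know nothing about the modules that create them.
struct Track    { double pt; };
struct Electron { int trackIndex; };

// The single Event class: the only channel between modules.
struct Event {
    std::vector<Track> tracks;
    std::vector<Electron> electrons;
};

class Module {
public:
    explicit Module(const ParameterSet& ps) : ps_(ps) {}
    virtual ~Module() = default;
    virtual void process(Event& ev) = 0;
protected:
    double param(const std::string& key) const { return ps_.at(key); }
private:
    ParameterSet ps_;  // configuration arrives only through the ParameterSet
};

class TrackFinder : public Module {
public:
    using Module::Module;
    void process(Event& ev) override {
        // Stand-in for real reconstruction: two fixed candidates,
        // kept only if they pass the configured threshold.
        const double candidates[] = {0.5, 12.0};
        for (double pt : candidates)
            if (pt >= param("minPt")) ev.tracks.push_back(Track{pt});
    }
};

class ElectronFinder : public Module {
public:
    using Module::Module;
    void process(Event& ev) override {
        // Reads tracks from the Event; never calls TrackFinder directly.
        for (std::size_t i = 0; i < ev.tracks.size(); ++i)
            ev.electrons.push_back(Electron{static_cast<int>(i)});
    }
};

// Runs the two modules in order on a fresh Event.
inline Event runExample(double minPt) {
    ParameterSet trackCfg{{"minPt", minPt}};
    ParameterSet none;
    TrackFinder tf(trackCfg);
    ElectronFinder ef(none);
    Event ev;
    tf.process(ev);
    ef.process(ev);
    return ev;
}
```

Calling `runExample(1.0)` keeps only the 12.0 candidate, so the Event ends up with one track and one electron that refers to it by index.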

8
Order of execution (who does what, when)
  • Based on requirement of HLT configuration and
    performance
  • - Explicit, up-front scheduling of module
    execution (path)
  • - Be able to load all required libs before the
    first event is processed
  • Implicit invocation (reco-on-demand) has merits
    in use-cases (visualization, interactive
    analysis)
  • - Can it be kept as well?
  • - How much does it cost to do so?

9
Scheduling
  • We had a long discussion of scheduling mechanisms
  • Agreed on a system that allows for two mechanisms:
  • - explicit scheduling
  • - implicit invocation (Reconstruction on Demand)
  • In the context of explicit scheduling, the
    framework will prevent implicit invocation
  • We have agreed to put off calculated scheduling;
    it may be reconsidered at a later date.
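
A minimal sketch of how the two mechanisms could coexist, with implicit invocation refused when the job is explicitly scheduled. The names `Scheduler`, `Mode`, `addProducer`, `runPath` and `require` are hypothetical, chosen for this illustration only.

```cpp
#include <functional>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; names are illustrative, not the CMS framework API.
enum class Mode { Explicit, OnDemand };

class Scheduler {
public:
    explicit Scheduler(Mode mode) : mode_(mode) {}

    // Register the producer of a named product.
    void addProducer(const std::string& product, std::function<void()> run) {
        producers_[product] = std::move(run);
    }

    // Explicit scheduling: run the listed producers in the given order.
    void runPath(const std::vector<std::string>& path) {
        for (const auto& product : path) {
            producers_.at(product)();
            done_.insert(product);
        }
    }

    // Request a product. On demand, a missing product triggers its producer;
    // under explicit scheduling that implicit invocation is refused.
    void require(const std::string& product) {
        if (done_.count(product) != 0) return;
        if (mode_ == Mode::Explicit)
            throw std::logic_error("product not scheduled: " + product);
        producers_.at(product)();
        done_.insert(product);
    }

    bool has(const std::string& product) const {
        return done_.count(product) != 0;
    }

private:
    Mode mode_;
    std::map<std::string, std::function<void()>> producers_;
    std::set<std::string> done_;
};
```

In `Mode::OnDemand` the first `require("tracks")` runs the producer and later calls are no-ops; in `Mode::Explicit` the same call throws unless `runPath` already produced the product.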

10
Event Format - Usage Patterns
  1. Bare ROOT. The stored form of objects has to be
    sufficiently simple to allow their use without
    class libraries or additional code.
  2. ROOT with a small set of libraries. This can use
    classes of type "basic elements". We want the
    classes of this second level to be defined by
    CMS. Analysis of data in this format shall not
    require access to external databases.
  3. ORCA-Lite. Within ROOT or Framework with a
    medium set of libraries. This can use classes of
    type "1" and also of type "2". Analysis of data
    in this format shall not require access to
    external databases.
  4. Full ORCA. Within the framework, using the full
    reconstruction program. This includes not only
    the software libraries but also includes full
    connectivity to any external resources.

11
What Does Use of ROOT Look Like?
  • To figure out what the branch structure of the
    data should look like, we discussed what we want
    to use at the ROOT prompt (in case "1").
  • Our example assumes a branch "ele" carrying
    electrons from one algorithm, and a branch "trk"
    carrying tracks from one algorithm.
  • Our example is to plot the pt of the track
    associated with each electron in the electron
    collection
  • t->Draw("trk.pt[ele.tk.id]")
  • Raw data is stored in a ROOT tree as well, but is
    not required to be transparent to Bare ROOT.
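
To make the electron-to-track lookup concrete, here is a plain C++ sketch of the values such a plot would contain for one event. It assumes that each electron stores the position of its track within the "trk" branch; the types `TrkBranch` and `EleBranch` and the function `electronTrackPt` are hypothetical and do not use ROOT.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical flat layout mirroring the "trk" and "ele" branches.
struct TrkBranch { std::vector<double> pt; };
struct EleBranch { std::vector<int> tk_id; };  // index into the trk branch

// Values that would be histogrammed for one event: for each electron,
// the pt of the track it points to. Out-of-range indices are skipped.
inline std::vector<double> electronTrackPt(const TrkBranch& trk,
                                           const EleBranch& ele) {
    std::vector<double> out;
    for (int id : ele.tk_id) {
        if (id >= 0 && static_cast<std::size_t>(id) < trk.pt.size())
            out.push_back(trk.pt[static_cast<std::size_t>(id)]);
    }
    return out;
}
```

Because the link is a plain integer index rather than a pointer, nothing beyond the stored arrays is needed to follow it, which is what the Bare ROOT case requires.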

12
Requirements on ORCA
  • Clear guidelines to Developers
  • For proper interaction with the framework
  • Will be specified and exemplified as a part of
    migration
  • Object structure
  • Persistent objects should be simple, to support
    the required four usage patterns
  • No code should be needed to interpret such basic
    elements
  • The representation in memory and the persistent
    representation should be the same (refit vs.
    puff)
  • ORCA-Lite
  • Code reorganization (de-spaghettization)
  • Split into Analysis and Reconstruction Tiers
  • Data object definitions should be packaged
    separately from the Ana/Reco algorithms.
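
One way to keep persistent objects "simple" is to define them as data-only structs and let the compiler check it. The sketch below is an assumption about what such classes could look like; `TrackData` and `ElectronData` are hypothetical names, and the checks use only the standard `<type_traits>` header.

```cpp
#include <type_traits>

// Hypothetical data-only classes, packaged apart from any algorithm code.
struct TrackData {
    float pt;
    float eta;
    float phi;
    int   charge;
};

struct ElectronData {
    float energy;
    int   trackIndex;  // refers to a track by position, not by pointer
};

// Compile-time checks that the objects stay simple: no virtual functions,
// no user-defined copy logic, and a C-compatible member layout.
static_assert(std::is_trivially_copyable<TrackData>::value,
              "TrackData must be plain data");
static_assert(std::is_standard_layout<TrackData>::value,
              "TrackData must have a C-like layout");
static_assert(std::is_trivially_copyable<ElectronData>::value,
              "ElectronData must be plain data");
static_assert(std::is_standard_layout<ElectronData>::value,
              "ElectronData must have a C-like layout");
```

If someone later adds a virtual function or a non-trivial member to one of these structs, the build fails at the `static_assert` instead of silently changing the stored form.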

13
Additional Issues
  • Non-Event Data (was MetaData)
  • (data sets)
  • (conditions)
  • Plug-in Manager
  • Component Model

14
Non-Event Data: data sets
  • Main objective: dataset book-keeping
  • Events are classified into O(50) Primary
    datasets.
  • These are achieved via 10 intermediate online
    streams.
  • Classification into those, based on trigger
    decision
  • This hierarchy is one of event classification ---
    different types of events, not different kinds of
    contents.
  • Contrast with data tiers, which are defined by
    contents and levels of processing.
  • MUST specify at job config stage if one wants
    AOD, RECO or RAW data

15
Non-Event Data: data sets
  • Dataset
  • - collections of files
  • - collections of events.
  • To define the event sample to be processed by a
    given job, one needs to specify:
  • Event class/type: Primary DataSet (inclusive ele,
    di-jet)
  • Event representation (RECO, AOD)
  • Processing pass (this year's re-reconstruction)
  • N.B. Skims, Tags, E.D. etc. may be identified as
    a different processing pass or with an additional
    DataSet attribute.
  • Event / Run number ranges
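
The items a job has to specify can be collected in one small structure. The sketch below is hypothetical (`SampleSpec`, `EventInfo`, `RunRange`, `Tier` and `selects` are names invented for this illustration) and shows a job-side check of whether an event belongs to the requested sample.

```cpp
#include <string>

// Hypothetical names; the fields follow the items listed on this slide.
enum class Tier { RAW, RECO, AOD };

struct RunRange {
    unsigned first;
    unsigned last;  // inclusive
    bool contains(unsigned run) const { return first <= run && run <= last; }
};

struct SampleSpec {
    std::string primaryDataset;  // event class/type, e.g. "inclusive-ele"
    Tier tier;                   // event representation
    std::string processingPass;  // e.g. a re-reconstruction label
    RunRange runs;
};

struct EventInfo {
    std::string primaryDataset;
    Tier tier;
    std::string processingPass;
    unsigned run;
};

// True if the event belongs to the sample the job asked for.
inline bool selects(const SampleSpec& spec, const EventInfo& ev) {
    return ev.primaryDataset == spec.primaryDataset
        && ev.tier == spec.tier
        && ev.processingPass == spec.processingPass
        && spec.runs.contains(ev.run);
}
```

The dataset name and the tier are separate fields here, matching the distinction between event classification and data contents.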

16
Non-Event Data: conditions
  • Need to handle various configuration data
  • Algorithm configuration data (parameter set)
  • Calibration data, constants database
  • It is important to have full information for
    "officially produced" event data, while still
    supporting development by individuals doing
    something other than "official production."
  • We identify two kinds of differently tracked information:
  • Algorithm configuration data -- we've been
    calling these ParameterSets. They have a central
    authority to issue unique IDs.
  • Calibration, survey, etc. data -- these have a
    different central authority to issue unique IDs.
    A single ID should suffice to specify all such
    data.
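
A sketch of the two separately tracked IDs, kept as distinct types so that they cannot be mixed up. Everything here is hypothetical: the type names, and in particular the toy `ParameterSetRegistry`, which assumes that identical parameter content should map to the same ID. The slide only says that a central authority issues unique IDs.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical sketch: two independent authorities issue the two kinds of
// ID, so each gets its own type.
struct ParameterSetID { std::uint64_t value; };
struct ConditionsID   { std::uint64_t value; };

// What could be recorded with event data so a result can be traced back.
struct Provenance {
    ParameterSetID config;
    ConditionsID conditions;  // a single ID names all calibration data
};

// Toy stand-in for the ParameterSet authority. Assumption: identical
// content always gets the same ID, new content gets the next free one.
class ParameterSetRegistry {
public:
    using Content = std::map<std::string, double>;

    ParameterSetID idFor(const Content& ps) {
        auto it = ids_.find(ps);
        if (it != ids_.end()) return it->second;
        ParameterSetID id{next_++};
        ids_.emplace(ps, id);
        return id;
    }

private:
    std::map<Content, ParameterSetID> ids_;
    std::uint64_t next_ = 1;
};
```

An individual developer's private configuration would simply get an ID of its own, which keeps official and non-official results distinguishable.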

17
Plug-in Manager: Dynamic Loading
  • We propose that the SEAL plug-in manager and
    scheme for dynamic loading should be adopted by
    CMS
  • We think this covers all three required use
    cases:
  • Never load anything during run-time
  • Run-time loading of all required Libs before
    processing starts
  • Loading on-demand
  • No intent to support static build
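
The loading use cases can be illustrated with a small in-process registry. This is not the SEAL plug-in manager API; `PluginManager`, `Plugin`, `NamedPlugin`, `declare`, `preload`, `freeze` and `get` are hypothetical names, and "loading" is modelled as calling a factory rather than opening a shared library.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; this is not the SEAL plug-in manager API.
struct Plugin {
    virtual ~Plugin() = default;
    virtual std::string name() const = 0;
};

// Trivial concrete plug-in used for illustration.
struct NamedPlugin : Plugin {
    explicit NamedPlugin(std::string n) : name_(std::move(n)) {}
    std::string name() const override { return name_; }
private:
    std::string name_;
};

class PluginManager {
public:
    using Factory = std::function<std::unique_ptr<Plugin>()>;

    void declare(const std::string& name, Factory f) {
        factories_[name] = std::move(f);
    }

    // Load everything the job needs before processing starts.
    void preload(const std::vector<std::string>& names) {
        for (const auto& n : names) load(n);
    }

    // After freezing, nothing new is loaded while the job runs.
    void freeze() { frozen_ = true; }

    // Load on first request, unless the manager has been frozen.
    Plugin& get(const std::string& name) {
        if (loaded_.count(name) == 0 && frozen_)
            throw std::runtime_error("late load refused: " + name);
        return load(name);
    }

    std::size_t loadedCount() const { return loaded_.size(); }

private:
    Plugin& load(const std::string& name) {
        auto it = loaded_.find(name);
        if (it == loaded_.end())
            it = loaded_.emplace(name, factories_.at(name)()).first;
        return *it->second;
    }

    bool frozen_ = false;
    std::map<std::string, Factory> factories_;
    std::map<std::string, std::unique_ptr<Plugin>> loaded_;
};
```

Calling `preload` followed by `freeze` gives the "load everything before the first event" behaviour; skipping both gives loading on demand through `get`.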

18
Component Model / Multi-Threading
  • We propose that the SEAL component model should
    be adopted by CMS
  • We propose that services (geometry service,
    calibration service, ...) should use this component
    model
  • We propose strong limits on the use of
    multithreading
  • Non-framework code should not spawn threads
  • Non-framework code should not require explicit
    locking
  • Simple rules (no use of class statics, etc.)
    should be enforced for user code, to allow for
    thread-safe use
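
The "no class statics" rule can be shown with two hypothetical user classes side by side. The sketch does not spawn threads; it only shows which state is shared between instances, which is where a race would occur if the framework ran instances on different threads.

```cpp
// Hypothetical user classes, shown side by side.

// Discouraged: a class static is shared by every instance, so two
// instances running on different threads would race on it.
class CounterWithStatic {
public:
    void process() { ++seen_; }
    static int seen() { return seen_; }
private:
    static int seen_;
};
int CounterWithStatic::seen_ = 0;

// Preferred: state lives in the instance. The framework can give each
// thread its own instance, and user code needs no locking.
class CounterPerInstance {
public:
    void process() { ++seen_; }
    int seen() const { return seen_; }
private:
    int seen_ = 0;
};
```

With two `CounterPerInstance` objects, each keeps its own count; with two `CounterWithStatic` objects, both update the same variable.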

19
Deliverables: Notes (to be submitted by CMS week)
  • Precursor note containing
  • RunII use cases
  • Observations on availability in CMS
  • List of suggested additions
  • Start from internal document
  • - Editor: VI
  • Problem Space note
  • May serve as the introduction to the next note
  • - Editors: Lothar, Avi
  • Workshop Summary note
  • Structured skeleton
  • Incorporating detailed notes
  • - Editors: Marc, Bill
