Persistent User Data using Objectivity - PowerPoint PPT Presentation

About This Presentation
Title:

Persistent User Data using Objectivity


Slides: 17
Provided by: ygap

Transcript and Presenter's Notes

Title: Persistent User Data using Objectivity


1
Persistent User Data using Objectivity
The missing Milestone
  • Vincenzo Innocente
  • CERN/EP/CMC

2
Introduction
  • Last RD45 milestone was about private persistent
    data and classes
  • Although a model was developed and prescriptions
    provided there was no evidence that it would have
    worked in a HEP-experiment production environment
  • In CMS, following and extending the RD45 model,
    we have developed procedures which allow any
    physicist:
  • to develop and test private persistent classes
  • to manage their own private persistent objects

3
HEP Data
  • Environmental data
  • Detector and Accelerator status
  • Calibrations, Alignments
  • Event-Collection Meta-Data
  • (luminosity, selection criteria, ...)
  • Event Data, User Data

Navigation is essential for an effective physics
analysis. Complexity requires coherent access
mechanisms.

4
Requirements
  • Software Development
  • Physics reconstruction developers should be able
    to develop, test and integrate persistent classes
    without interfering with other developments (same
    as for transient classes)
  • End Users should be able to develop and use
    private persistent classes
  • Data
  • Physicists (End Users) should be able to access
    any kind of data without interfering with its
    production
  • Physicists should be able to populate private
    databases, using and referencing common
    objects, without interfering with production
    activities
  • Environment
  • Development and running environment should be the
    same for system (experiment-wide) and user data
  • Access mechanisms should be the same for system
    and user data

5
Technical Solutions
  • FD-Shallow-copy
  • A federation shallow-copy is a local copy of the
    .boot and .FDDB files (ooinstall'ed with
    -nocatalog) with all original database files
    made read-only
  • Development
  • Named schemas (a few, 5 or so) are used to avoid
    interference and ease integration
  • Development and tests are performed against
    fd-shallow-copy
  • Schema is exchanged using
    ooschemadump/ooschemaupgrade
  • Standard scripts (today making use of SCRAM,
    tomorrow integrated into SCRAM) are provided to
    parse DDL
  • A rich middleware of C++ classes, often
    templates, is provided to reduce (to zero?) the
    Objectivity-specific code to be known by
    physicists
  • In particular a user development environment is
    provided to develop concrete-Tags of simple
    structure

6
Technical Solutions
  • Object shallow-copy
  • Local copy with (one-way-)references to
    constituents
  • Object deep-copy
  • Local copy with local copy of constituents
  • Data
  • Users always start with a local
    federation-shallow-copy
  • Events are never modified in place:
    reconstruction always generates a new event
    collection and a new event-data structure with a
    shallow copy of the parent event
  • Users can produce a deep copy of (part of) the
    event for a selected sample and generate a user
    collection
  • Concrete Tags (user private persistent objects)
    can be added to a user collection
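
The difference between an object shallow-copy and an object deep-copy can be sketched in a few lines of Python. This is only an illustration of the idea, not the CMS or Objectivity implementation: the classes Digi and Event and the functions shallow_copy and deep_copy are hypothetical names, and ordinary Python references stand in for persistent one-way references.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class Digi:
    """Hypothetical constituent of an event."""
    channel: int
    adc: int


@dataclass
class Event:
    """Hypothetical event holding references to its constituents."""
    event_id: int
    digis: List[Digi] = field(default_factory=list)


def shallow_copy(event: Event) -> Event:
    # New event object with its own list, but the list entries still
    # point at the parent's constituents (one-way references).
    return Event(event.event_id, list(event.digis))


def deep_copy(event: Event) -> Event:
    # New event object with private copies of all constituents.
    return copy.deepcopy(event)


parent = Event(1, [Digi(10, 250), Digi(11, 97)])
by_reference = shallow_copy(parent)
by_value = deep_copy(parent)
```

In this sketch the parent event is never touched: the shallow copy shares the parent's constituents, while the deep copy can be moved or edited without any link back to the original.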

7
Navigation
  • Top Level
  • User sees and navigates a Unix-like tree
    structure through a C++ or Python API (Shell)
  • Implementation is by Objy naming (root is a
    database system name) or any other
    object-containment mechanism mapped to a
    Unix-like tree by the Shell
  • Collections
  • We use a fully hierarchical composite collection
    system with metadata associated with each
    component
  • It allows sequential and random access with full
    support for fast user selection on MetaData
  • It can be used to organize any kind of object
    that needs indexing but only slow updates
  • Event
  • Navigation in the event structure and from the
    event to the configuration is implemented using
    one-way references (pure ooRefs)

8
Dataset Collection
[Diagram: MetaData User Tag; Run Collection;
Rec Event]
9
User Collection By Reference
User Collections are populated by User Filters.
Multiple User Filters (each populating a
different User Collection) are allowed in a
single ORCA job.
[Diagram: MetaData User Tag; DB Name (physical
location); Context Name; Collection Name; Run
Collection; Original RecEvent]
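
The idea of several User Filters, each populating its own User Collection by reference within one job, can be sketched as follows. This is not ORCA code: RecEvent, run_job, the filter names and the event attributes n_tracks and missing_et are all hypothetical, chosen only to make the example concrete.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class RecEvent:
    """Hypothetical reconstructed event; frozen because the original
    event is never modified."""
    event_id: int
    n_tracks: int
    missing_et: float


def run_job(events: Iterable[RecEvent],
            filters: Dict[str, Callable[[RecEvent], bool]]) -> Dict[str, List[RecEvent]]:
    # One pass over the input; every filter fills its own collection.
    collections: Dict[str, List[RecEvent]] = {name: [] for name in filters}
    for event in events:
        for name, accept in filters.items():
            if accept(event):
                # Store a reference to the original event, not a copy.
                collections[name].append(event)
    return collections


events = [RecEvent(1, 2, 5.0), RecEvent(2, 8, 40.0), RecEvent(3, 12, 3.0)]
user_collections = run_job(events, {
    "many_tracks": lambda e: e.n_tracks >= 8,
    "high_met": lambda e: e.missing_et > 30.0,
})
```

An event selected by two filters appears in both collections as the same object, which mirrors a collection "by reference".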
10
RecApplication I/O
[Diagram: Federation; Dataset Collection or User
Collection; RecReader; Request; Store;
Create/extend User Collections; Histograms, Tags;
Append new Run to a Dataset]
Output Run is a new event collection containing
new data (digis, RecObjs) and a reference to or
replica of the input data.
Output User Collections are unmodified sub-samples
of the input collection.
11
Top Level Event Structure (COBRA5)
[Diagram: Run; Crossing; Trigger; Pile-up;
SimEvent]
12
Raw Event
RawData are identified by the corresponding
ReadOut. RawData belonging to different detectors
are clustered into different containers. The
granularity will be adjusted to optimize I/O
performance. An index at RawEvent level is
used to avoid accessing all containers in
search of a given RawData. A range index at
RawData level could be used for fast
random access in complex detectors.
[Diagram: RawEvent; ReadOut, ReadOut, ...;
RawData, RawData]
Index implemented as an ordered vector of pairs
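
An index kept as an ordered vector of pairs can be searched by bisection, so a lookup does not need to open every container. The sketch below assumes the pairs are (readout id, container name); that layout, the class name RawEventIndex and the sample identifiers are illustrative assumptions, not taken from the CMS code.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple


class RawEventIndex:
    """Hypothetical index: a vector of (readout_id, container) pairs
    kept sorted by readout_id."""

    def __init__(self, pairs: List[Tuple[int, str]]) -> None:
        self._pairs = sorted(pairs)
        self._keys = [key for key, _ in self._pairs]

    def find(self, readout_id: int) -> Optional[str]:
        # Binary search over the ordered keys: O(log n) comparisons.
        i = bisect_left(self._keys, readout_id)
        if i < len(self._keys) and self._keys[i] == readout_id:
            return self._pairs[i][1]
        return None


index = RawEventIndex([(30, "muon"), (10, "tracker"), (20, "ecal")])
container = index.find(20)
```

A sorted vector is compact and cheap to read, at the price of slower insertion, which is acceptable when the index is written once and read many times.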
13
CMS Reconstructed Objects
Reconstructed Objects produced by a given
algorithm are managed by a Reconstructor.
A Reconstructed Object (Track) is split into
several independent persistent objects to allow
their clustering according to their access
patterns (physics analysis, reconstruction,
detailed detector studies, etc.). The top level
object acts as a proxy. Intermediate
reconstructed objects (RHits) are cached by value
into the final objects.
[Diagram: RecEvent; S-Track Reconstructor; S Track
(aod); Track SecInfo (esd); Track Constituents
(rec); Vector of RHits]
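
The proxy idea, where a light top-level object stands in for parts that are stored separately and fetched only when needed, can be sketched as below. This is an assumption-laden illustration: TrackProxy, its part method and the part names "sec_info" and "constituents" are hypothetical, and simple callables stand in for reads from separately clustered persistent objects.

```python
from typing import Any, Callable, Dict, List


class TrackProxy:
    """Hypothetical top-level track: holds a small summary and loads
    the heavier parts on first access only."""

    def __init__(self, summary: Dict[str, float],
                 loaders: Dict[str, Callable[[], Any]]) -> None:
        self.summary = summary          # always available
        self._loaders = loaders         # part name -> function that reads it
        self._cache: Dict[str, Any] = {}
        self.loads: List[str] = []      # record of reads, for illustration

    def part(self, name: str) -> Any:
        if name not in self._cache:
            self.loads.append(name)
            self._cache[name] = self._loaders[name]()
        return self._cache[name]


track = TrackProxy(
    {"pt": 12.5},
    {
        "sec_info": lambda: {"chi2": 1.3},
        "constituents": lambda: [(0.1, 0.2), (0.3, 0.4)],
    },
)
hits = track.part("constituents")
```

An analysis that only needs the summary never triggers a read of the other parts, which is what makes clustering by access pattern pay off.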
14
Re-Reconstruction Clones
[Diagram: Run; Run Id-1; Local Replica; Crossing;
Trigger; Pile-up]
15
Collection By Value
[Diagram: MetaData User Tag; New Owner Name;
DataSet Name; Run Collection]
New RecEvent with new or cloned Digis, RecObjs
16
Physical clustering