Nurcan Ozturk - PowerPoint PPT Presentation

1
Data Discovery Tools, DQ2 Enduser Tools
and Physics Analysis Tools
  • Nurcan Ozturk
  • University of Texas at Arlington
  • SCHOOL ON HEP_at_TR-GRID
  • April 30 - May 2, 2008
  • Turkish Atomic Energy Authority (TAEA), Ankara,
    Turkey

2
Outline
  • Users work-flow for Data Analysis
  • Data Discovery Tools
  • AMI - ATLAS Metadata Interface
  • TAG Browser - ELSSI
  • DQ2 Enduser Tools
  • ATLAS Analysis Model
  • Analysis Model Forum Recommendations
  • Derived Physics Data (DPD)
  • Analyzing the Data (inside or outside Athena)
  • AthenaRootAccess (ARA)
  • EventView

3
Users Work-flow for Data Analysis
Set up the analysis code
Locate the data
Set up the analysis job
Submit to the Grid
Retrieve the results
Analyze the results
4
Data Discovery Tools
5
ATLAS Metadata Interface (AMI)
http://ami3.in2p3.fr:8080/opencms/opencms/AMI/www/index.html
  • AMI is a bookkeeping project.
  • AMI is a generic cataloging system (a database
    application). The majority of datasets currently
    catalogued in AMI are Monte Carlo datasets. AMI
    reads information from the task request system,
    and correlates it with information read from the
    production database.
  • AMI contains the physics metadata for
  • 2008 real data
  • 2008 FDR exercise
  • 2007 Cosmics runs (M5 data)
  • 2006/2007 service challenge datasets
  • StreamTest
  • Data Challenges DC1 and DC2 / Rome Production
    System
  • Combined Test Beam
  • AMI also powers the TagCollector release
    management tool.

6
AMI Tutorial
  • http://ami3.in2p3.fr:8080/opencms/opencms/AMI/www/Tutorial/
  • Or
  • http://ami3.in2p3.fr:8080/opencms/opencms/AMI/www/Tutorial/FastTrackTutorial.html
  • What is AMI?
  • Where does AMI get its Information?
  • How do I search for a dataset?
  • Which information can I get from the result of an
    AMI dataset search?
  • What is the schema of the AMI dataset catalogue?
  • Why can I sometimes not find a dataset when I can
    see its existence in other catalogues?
  • Can I refine the search?
  • Can I simply browse all of the information in
    AMI?
  • Can I bookmark an AMI page?
  • Why doesn't the back button of my browser work?
  • Can I use AMI without going through the web
    interface?
  • How can I extract information from AMI?
  • How do I write to AMI?

7
How Do I Search For A Dataset? Simple Search
Follow the link to the simple search interface
from the tutorial page.
(Screenshot of the simple search form, with a "type here" callout.)
8
Results From Simple Search (1)
(Screenshot of the search results, with callouts marking a pull-down menu and several links.)
9
Results From Simple Search (2)
When you click on the Provenance link, it shows which
version of the Athena software was used to make
evgen/digit/reco.
10
Results From Simple Search (3)
When you click on the DQ2 link, it shows the DQ2 dataset
metadata, the existing replicas of the dataset, and a
link to the PanDA monitor.
11
Results From Simple Search (4)
When you click on the PanDA link, it takes you to the
dataset browser.
12
How Do I Search For A Dataset? Advanced Search
Follow the link to the Advanced search
interface from the tutorial page
13
Results From Advanced Search
14
TAG
  • ATLAS will produce petabytes of data, so a system
    of event-level metadata is needed to quickly
    identify and select the events that are of
    interest for a given analysis. This is provided
    by TAG files and the TAG database.
  • TAG files are built from AOD according to offline
    analysis-style code. TAG files are then loaded
    into the TAG database.
  • TAG files store information about the status of
    each sub-detector, trigger and physics object ID.
  • For instance, for FDR-1 data, TAGs contain:
  • Event information:
  • Run number, event number, luminosity block,
    number of vertices and tracks, primary vertex
    position. (Luminosity has an entry but is not
    filled.)
  • Variables such as the summed cell Et, missing Et
    magnitude, and phi
  • Trigger information: bit masks encode pass and
    pass after prescale for each trigger item/chain
  • Physics objects:
  • Multiplicity of physics objects and the Pt, eta,
    phi for the highest-Pt objects
  • A tightness criterion for e/mu/gamma is included,
    as are b-tag likelihoods and a tau candidate
    likelihood.
  • PhysWords: 32-bit TAG words. For b-physics, for
    instance:
  • Bit 0: HighPtMuonPair, Bit 1: J/Psi candidate,
    Bit 2: Upsilon candidate.
  • See more details for FDR TAGs from a talk by
    James Frost, April Exotics Working Group meeting
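
A 32-bit TAG word of this kind can be tested with ordinary bit operations. The sketch below is only an illustration: the three bit positions are the b-physics ones quoted above, while the helper names (`BPHYS_BITS`, `decode_phys_word`, `has_flag`) and the example value are made up for this note and are not part of any ATLAS package.

```python
# Bit positions as quoted on the slide for the b-physics word.
# The dictionary and function names are illustrative, not ATLAS code.
BPHYS_BITS = {
    0: "HighPtMuonPair",
    1: "J/Psi candidate",
    2: "Upsilon candidate",
}


def has_flag(word, bit):
    """Return True if the given bit is set in the word."""
    return bool(word & (1 << bit))


def decode_phys_word(word, bit_names=BPHYS_BITS):
    """Return the names of all known flags set in a 32-bit TAG word."""
    word &= 0xFFFFFFFF  # keep only the low 32 bits
    return [name for bit, name in sorted(bit_names.items()) if has_flag(word, bit)]


# Hypothetical word with bits 0 and 1 set.
example_word = 0b011
print(decode_phys_word(example_word))
```

Selecting on such a word then reduces to a single mask test per event, which is what makes TAG-level preselection cheap.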

15
How Does TAG Selection Work?
  • Use the TAG file as an input to EventSelector or
    PoolTAGInput.
  • Make sure the matching Pool file (e.g. AOD) is in
    the PoolFileCatalog.
  • Define your query of the TAG content.
  • Run the job.
  • Very flexible
  • Can use the TAG to preselect the events from an
    AOD in which you are interested, passing only
    those to an analysis algorithm.
  • Can use the TAG to write out an AOD (or ESD, RDO)
    of only the selected events.
  • How to learn more? Good tutorials are already
    available:
  • https://twiki.cern.ch/twiki/bin/view/Atlas/FeedBackForTags
  • https://twiki.cern.ch/twiki/bin/view/Atlas/TagForEventSelection
  • https://twiki.cern.ch/twiki/bin/view/Atlas/TagForEventSelection#Building_Tags_Under_12_0_31
    (create tag files)
  • https://twiki.cern.ch/twiki/bin/view/Atlas/PhysicsAnalysisWorkBookTAG
  • https://twiki.cern.ch/twiki/bin/view/Atlas/PhysicsAnalysisWorkBookTAGAnalysis
  • https://twiki.cern.ch/twiki/bin/view/Atlas/TopFdrTag
  • http://twiki.mwt2.org/bin/view/Main/TutorialTag080318
    (All the above links are available from this
    one.)
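
The idea behind TAG-based preselection on this slide can be sketched in a few lines of plain Python: a query runs over small per-event summary records, and only the references to the matching events are handed on to the analysis. This is a toy model, not Athena code. The attribute names (`n_electrons`, `missing_et`), the cut values, and the records themselves are hypothetical.

```python
# Toy TAG records: a small summary per event plus the (run, event) reference
# needed to find the full event in the AOD. All attribute names and values
# are hypothetical placeholders, not real TAG attributes.
tag_records = [
    {"run": 3051, "event": 1, "n_electrons": 0, "missing_et": 35.0},
    {"run": 3051, "event": 2, "n_electrons": 1, "missing_et": 42.0},
    {"run": 3051, "event": 3, "n_electrons": 2, "missing_et": 8.0},
    {"run": 3051, "event": 4, "n_electrons": 1, "missing_et": 25.5},
]


def query(tag):
    """The user's selection, expressed on TAG quantities only."""
    return tag["n_electrons"] >= 1 and tag["missing_et"] > 20.0


def select_events(records, predicate):
    """Return (run, event) references for the records passing the query."""
    return [(r["run"], r["event"]) for r in records if predicate(r)]


def analyse(event_refs):
    """Stand-in for an analysis algorithm that would read the full events."""
    return ["analysed run %d event %d" % ref for ref in event_refs]


selected = select_events(tag_records, query)
for line in analyse(selected):
    print(line)
```

Only the selected references reach `analyse`, so the expensive step (reading full events) is skipped for everything the query rejects.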

16
TAG Browser - ELSSI (1)
  • TAGs are accessed by users via a web interface
    called ELSSI, the ATLAS Event Level Selection
    Service Interface.
  • For FDR-1 data (tutorial): https://atldbdev01.cern.ch/tagservices/tutorial/index.htm
  • For FDR-1 data: https://atldbdev01.cern.ch/tagservices/fdr/index.htm

You need Firefox to see this page, as Jack
Cranshaw informed me.
17
TAG Browser - ELSSI (2)
  • How to use ELSSI:
  • Define a query to select runs, streams, data
    quality, trigger chains, etc.
  • Review the query
  • Execute the query and retrieve the TAG file (a
    ROOT file)

18
DQ2 Enduser Tools
19
The Client Tools to Retrieve Data
  • DQ2 enduser tools
  • Includes the dq2_xxx commands (dq2_ls, dq2_get, etc.)
  • Available to download from
  • https://twiki.cern.ch/twiki/bin/view/Atlas/UsingDQ2Download
  • The setup files are edited to accommodate local
    needs (dq2.sh, setup.sh)
  • Available on AFS at CERN
  • source /afs/cern.ch/project/gd/LCG-share/current/etc/profile.d/grid_env.sh
  • source /afs/cern.ch/atlas/offline/external/GRID/ddm/endusers/setup.sh.CERN
  • gLite UI (User Interface)
  • Includes lcg-cp, egee-gridftp-xxx
  • Available on AFS at CERN
  • source /afs/usatlas.bnl.gov/lcg/current/etc/profile.d/grid_env.sh
  • source /afs/cern.ch/project/gd/LCG-share/current/external/etc/profile.d/grid-env.sh
  • Why the gLite UI may be needed in OSG
  • dq2_put/get may use some gLite commands,
    depending on the site they interact with
    (TiersOfATLASCache.py description): lcg-lg,
    lcg-rf, glite-gridftp-ls, lcg-gt
  • More info
  • https://twiki.cern.ch/twiki/bin/view/Atlas/DDMEndUserTutorial

20
DQ2 Enduser Tools
  • dq2_ls: returns a list of datasets matching a
    given pattern
  • dq2_ls fdr08_run1.0003051.StreamEgamma.merge.AOD.o1_r6_t1
  • dq2_get: copies the files from DQ2 to a local
    area
  • dq2_get rv fdr08_run1.0003051.StreamEgamma.merge.AOD.o1_r6_t1
  • dq2_put: registers datasets to DQ2
  • dq2_poolFCjobO: creates a PoolFileCatalog and
    Athena job options for DQ2 datasets
  • dq2_register: uploads and registers external
    generator input files to DQ2
  • dq2_cleanup: deletes a dataset from a site's
    catalog and storage
  • dq2_sample: copies a portion of an existing
    dataset and registers it to DQ2
  • More info
  • https://twiki.cern.ch/twiki/bin/view/Atlas/UsingDQ2#DQ2_end_user_tools
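
Dataset names such as the one used with dq2_ls above are dot-separated, so they are easy to take apart or match against a wildcard pattern locally. The sketch below is not how the DQ2 tools are implemented; it only uses Python's standard `fnmatch` module on a small in-memory list. The field labels in `FIELDS` are an assumed reading of the name, and the second and third dataset names are invented variations for the example.

```python
from fnmatch import fnmatchcase

# Assumed meaning of the six dot-separated fields (labels are illustrative).
FIELDS = ("project", "run_number", "stream", "step", "data_type", "version")


def parse_dataset_name(name):
    """Split a dataset name into labelled fields."""
    parts = name.split(".")
    if len(parts) != len(FIELDS):
        raise ValueError("expected %d fields, got %d" % (len(FIELDS), len(parts)))
    return dict(zip(FIELDS, parts))


def match_datasets(names, pattern):
    """Return the names matching a shell-style wildcard pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]


names = [
    "fdr08_run1.0003051.StreamEgamma.merge.AOD.o1_r6_t1",  # from the slide
    "fdr08_run1.0003051.StreamMuon.merge.AOD.o1_r6_t1",    # hypothetical
    "fdr08_run1.0003051.StreamEgamma.merge.TAG.o1_r6_t1",  # hypothetical
]

print(parse_dataset_name(names[0]))
print(match_datasets(names, "fdr08_run1.*.StreamEgamma.*"))
```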

21
ATLAS Analysis Model
22
Analysis Model Forum Recommendations on the
Analysis Model
(Diagram of the analysis model, with a callout: includes metadata, simple UserData.)
23
Derived Physics Data - DnPD
  • Primary DPD (D1PD): POOL-based DPD produced by
    the Grid production system. There are expected
    to be O(10) primary DPDs, so the contents will
    not be very specific to any one analysis.
    Compared to the AOD, it is expected to be
    skimmed (keeping only interesting events),
    slimmed (keeping only interesting objects, for
    example electrons and muons), and thinned
    (keeping only the subset of information inside
    objects that is relevant in future steps).
  • An example job options file: AODtoDPD.py (see CVS)
  • Packages in CVS: TopDPDMaker, TauDPDMaker,
    BPhysicsDPDMaker, SUSYDPDMaker
  • Secondary DPD (D2PD): POOL-based DPD with more
    analysis-specific information. Typically, this is
    produced from the primary DPD and may be created
    using an Athena tool like EventView.
  • SimpleThinningExample
  • HighPtViewDPDThinningTutorial
  • Tertiary DPD (D3PD): does not need to be
    POOL-based; it includes flat ntuples.
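
The three reductions named on this slide (skimming, slimming, thinning) act at different levels: events, containers of objects, and the content of each object. A toy illustration on plain Python dictionaries follows. It has nothing to do with the actual DPD-maker packages, and the event layout, container names, and cuts are invented for the example.

```python
# Toy events: each holds containers (lists) of objects (dicts).
# Layout, names and values are invented for illustration only.
events = [
    {"event": 1,
     "electrons": [{"pt": 45.0, "eta": 0.3, "cells": [1, 2, 3]}],
     "muons": [],
     "jets": [{"pt": 60.0, "eta": 1.1, "cells": [4, 5]}]},
    {"event": 2,
     "electrons": [],
     "muons": [],
     "jets": [{"pt": 30.0, "eta": -2.0, "cells": [6]}]},
    {"event": 3,
     "electrons": [],
     "muons": [{"pt": 25.0, "eta": 0.9, "cells": [7]}],
     "jets": []},
]


def skim(evts, keep_event):
    """Skimming: keep only the interesting events."""
    return [e for e in evts if keep_event(e)]


def slim(evts, containers):
    """Slimming: keep only the interesting containers of objects."""
    return [{k: v for k, v in e.items() if k == "event" or k in containers}
            for e in evts]


def thin(evts, fields):
    """Thinning: keep only the chosen fields inside each object."""
    result = []
    for e in evts:
        new_event = {}
        for key, value in e.items():
            if isinstance(value, list):
                new_event[key] = [{f: obj[f] for f in fields if f in obj}
                                  for obj in value]
            else:
                new_event[key] = value
        result.append(new_event)
    return result


def has_lepton(e):
    return len(e["electrons"]) + len(e["muons"]) > 0


dpd = thin(slim(skim(events, has_lepton), ("electrons", "muons")), ("pt", "eta"))
print(dpd)
```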

24
Analyzing the Data
  • Inside Athena
  • Interactive or batch, using C++ or Python code.
  • Needs part of Athena (depends on user needs).
  • Provides full access to all tools and services.
  • Outside Athena: AthenaRootAccess (ARA)
  • CINT, Python, or compiled C++ code.
  • Does not need a full Athena installation
    (expected size 1 GB)
  • Not all classes are available (for example,
    calo cells)
  • Important: both methods use the same files as
    input.

25
ARA - AthenaRootAccess
  • Allows one to read an AOD in ROOT as one would
    read a normal ntuple (without using Athena).
  • The goal is to seamlessly use Athena tools.
  • One can use identical code/tools to run on ESDs,
    AODs, DPDs.
  • The names of the variables in the AOD ROOT tree
    are the same as in the AOD.
  • Limitations:
  • It uses the transient classes and converters of
    the ATLAS software, so a portion of the offline
    software is needed: a 1 GB distribution
    including Athena libraries.
  • Tools and data that need detector description,
    conditions, B-field, etc. cannot be called in
    ARA. However, this type of information can be
    put in UserData in the DPD.
  • Gaudi-based classes (like AlgTools, Services)
    don't work in ARA. Wrapping machinery is needed
    to reuse the code in both Athena and ARA.

26
ARA Examples (1)
  • CINT macros
  • Easy development (change code and run)
  • Run time is slow: 10x that of compiled C++ code
  • Compiled C++ code
  • Slower development (change code, recompile,
    cannot reload libs)
  • Fastest runtime
  • Integrates easily back into Athena
  • Python scripts
  • Easy development (change code, reload and run)
  • A simple example shows a runtime 3x that of
    compiled C++ code
  • May be able to compile Python
  • Integration of developed code into Athena?
  • Examples on Twiki and in the release
  • https://twiki.cern.ch/twiki/bin/view/Atlas/AthenaROOTAccess
  • PhysicsAnalysis/AthenaROOTAccessExamples

27
ARA Examples (2)
  • Available in CVS under
    PhysicsAnalysis/AthenaROOTAccessExamples
  • Need a Python script to open the file and set up
    the transient tree
  • lxplus> get_files AthenaROOTAccess/test.py
  • Compiled C++ example
  • lxplus> root
  • root [0] TPython::Exec("execfile('test.py')")
  • root [1] CollectionTree_trans = (TTree*)gROOT->Get("CollectionTree_trans")
  • root [2] ClusterExample ce // Example class in
    AthenaROOTAccessExamples
  • root [3] ce.plot(CollectionTree_trans)
  • root [4] TruthInfo ti
  • root [5] ti.truth_info(CollectionTree_trans)
  • test.py takes about 20 seconds to load the
    necessary dictionaries
  • One can recompile and then restart from the
    beginning

28
ARA Examples (3)
  • CINT example
  • lxplus> root
  • root [0] TPython::Exec("execfile('test.py')")
  • root [1] CollectionTree_trans = (TTree*)gROOT->Get("CollectionTree_trans")
  • root [2] gROOT->LoadMacro("AthenaROOTAccessExamples/macros/cluster_example.C")
  • root [3] plot(CollectionTree_trans)
  • One can now edit cluster_example.C and re-run
    LoadMacro
  • Python example
  • lxplus> python -i test.py
  • >>> import AthenaROOTAccessExamples.cluster_example
  • >>> AthenaROOTAccessExamples.cluster_example.plot(tt)
  • One can now edit cluster_example.py and re-run
  • >>> reload(AthenaROOTAccessExamples.cluster_example)
  • >>> AthenaROOTAccessExamples.cluster_example.plot(tt)
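
The edit, reload and re-run cycle of the Python example can be tried without any ATLAS software. The sketch below writes a stand-in module to a temporary directory, imports it, edits it, and reloads it. The module name `toy_cluster_example` and its `plot` function are hypothetical stand-ins for cluster_example.py. The slide uses the Python 2 built-in `reload`; in Python 3 the equivalent call is `importlib.reload`.

```python
import importlib
import os
import shutil
import sys
import tempfile

# Do not cache bytecode, so that each reload recompiles the edited source.
sys.dont_write_bytecode = True

workdir = tempfile.mkdtemp()
sys.path.insert(0, workdir)
module_path = os.path.join(workdir, "toy_cluster_example.py")


def write_module(source):
    """Simulate editing the module file in a text editor."""
    with open(module_path, "w") as handle:
        handle.write(source)


# First version of the (hypothetical) analysis module.
write_module("def plot(tree):\n    return 'v1 plot of ' + tree\n")
importlib.invalidate_caches()
toy = importlib.import_module("toy_cluster_example")
first = toy.plot("tt")

# Edit the file, then reload instead of restarting the session.
write_module("def plot(tree):\n    return 'edited plot of ' + tree + '!'\n")
toy = importlib.reload(toy)
second = toy.plot("tt")

print(first)
print(second)

# Clean up the temporary module.
sys.path.remove(workdir)
shutil.rmtree(workdir)
```

The session state (here just the string "tt", in ARA the transient tree) survives the reload, which is why this cycle is faster than the compiled route, where one has to restart from the beginning.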

29
Analysis Frameworks: EventView (1)
  • This framework provides general tools for common
    analysis tasks like
  • particle selection
  • overlap removal
  • observable calculation
  • combinatorics
  • recalibration
  • systematics evaluation
  • generating ntuples
  • Users can perform a great deal of their analyses
    in Athena by chaining and configuring a set of
    these tools and producing an ntuple for further
    analysis in ROOT.
  • Twiki page: https://twiki.cern.ch/twiki/bin/view/Atlas/EventView
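
The "chain of configurable tools" idea can be pictured with a small stand-alone sketch. It is not the EventView API: every function name, the event layout, and the cut values (20 in pt, 0.4 in delta R) are invented for illustration. Each tool takes a view of the event and returns it modified, and the final observables play the role of one ntuple row.

```python
import math


def delta_r(a, b):
    """Angular distance between two objects in (eta, phi)."""
    dphi = abs(a["phi"] - b["phi"])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(a["eta"] - b["eta"], dphi)


def select_particles(min_pt):
    """Configurable tool: keep particles above a pt threshold."""
    def tool(view):
        view["particles"] = [p for p in view["particles"] if p["pt"] >= min_pt]
        return view
    return tool


def remove_overlap(min_dr):
    """Configurable tool: drop particles too close to an earlier kept one."""
    def tool(view):
        kept = []
        for p in view["particles"]:
            if all(delta_r(p, q) >= min_dr for q in kept):
                kept.append(p)
        view["particles"] = kept
        return view
    return tool


def calc_observables():
    """Tool: compute simple observables from the surviving particles."""
    def tool(view):
        view["observables"] = {
            "n_particles": len(view["particles"]),
            "sum_pt": sum(p["pt"] for p in view["particles"]),
        }
        return view
    return tool


def run_chain(tools, view):
    """Run the tools in order on one event view."""
    for tool in tools:
        view = tool(view)
    return view


# Hypothetical event: the first jet overlaps the electron, the second is soft.
event = {"particles": [
    {"type": "electron", "pt": 40.0, "eta": 0.5, "phi": 1.0},
    {"type": "jet", "pt": 38.0, "eta": 0.55, "phi": 1.05},
    {"type": "muon", "pt": 25.0, "eta": -1.2, "phi": -2.0},
    {"type": "jet", "pt": 10.0, "eta": 2.0, "phi": 0.1},
]}

chain = [select_particles(20.0), remove_overlap(0.4), calc_observables()]
ntuple_row = run_chain(chain, event)["observables"]
print(ntuple_row)
```

Changing the analysis then means changing the configuration of the chain (thresholds, order, which tools are included) rather than rewriting the tools themselves.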
30
Analysis Frameworks: EventView (2)
  • Though this style of "modular" analysis usually
    does not require writing C++, the EventView
    framework is completely extensible, so if
    necessary users can easily develop and mix their
    own C++ tools with the common EventView tools and
    share their configurations and tools with other
    collaborators.
  • Most users are introduced to EventView through
    one of the "View" packages (e.g. TopView,
    SusyView, HighPtView), which for the most part
    collect configurations of EventView tools for a
    specific set of analyses and produce a standard
    ntuple output.
  • These users typically start by analyzing the View
    ntuples produced by the various physics working
    groups, and then continue to re-configuring and
    re-running the respective View package if they
    require additional tuning for their specific
    analyses.
  • There are also efforts to evolve (the persistent
    piece of) EventView in the context of
    AthenaROOTAccess.

31
We will practice with the tools during the
tutorial.