A segmented principal component analysis applied to calorimetry information at ATLAS

1
A segmented principal component analysis applied
to calorimetry information at ATLAS
Federal University of Rio de Janeiro - UFRJ
Brazilian Center for Physics Research - CBPF
H. P. Lima Jr, J. M. de Seixas

ACAT 2005 - May 22-27, Zeuthen, Germany

2
Outline

Scenario
Signal processing
Data assembling
Data compaction (segmented principal component
analysis)
Particle discrimination (neural network)
Conclusions

3
Scenario

The ATLAS trigger system comprises three
distinct levels of event selection: LVL1, LVL2
and Event Filter.
From an initial bunch crossing rate of 40 MHz,
the trigger system will select events at a rate
of up to 100 Hz for permanent storage.
LVL1 operates at 40 MHz with reduced granularity
information in order to make a fast decision. It
also defines Regions of Interest (RoI) that will
guide the LVL2 selection process.
At LVL2, complex algorithms operate over full
granularity information, with a maximum latency
of 10 ms.
The three levels of selection use information
provided by the calorimeter system due to its
fast response and the detailed energy deposition
profiles it provides.
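
As a quick check of the scale involved, the rates quoted above imply the overall rejection factor computed below. This is only arithmetic on the numbers from the slide; the constant and function names are illustrative.

```python
# Rates quoted on the slide.
BUNCH_CROSSING_RATE_HZ = 40e6   # 40 MHz bunch crossing rate seen by LVL1
STORAGE_RATE_HZ = 100.0         # up to 100 Hz kept for permanent storage


def rejection_factor(input_rate_hz, output_rate_hz):
    """Number of input events per event that survives the selection."""
    return input_rate_hz / output_rate_hz


overall = rejection_factor(BUNCH_CROSSING_RATE_HZ, STORAGE_RATE_HZ)
print(f"Overall rejection factor: {overall:,.0f}")
```

In other words, roughly one bunch crossing in 400,000 can be kept, which is why the selection has to be both fast and efficient.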

The ATLAS trigger system.
4
Calorimeter system

The ATLAS calorimeter is highly segmented and
presents high granularity.
The proposed system should address 11
subdetector layers of the electromagnetic and
hadronic calorimeters:
Pre-sampler, barrel
EM Calo, barrel, front layer
EM Calo, barrel, middle layer
EM Calo, barrel, back layer
Pre-sampler, endcap
EM Calo, endcap, front layer
EM Calo, endcap, middle layer
EM Calo, endcap, back layer
Hadronic Calo, barrel, layer 0
Hadronic Calo, barrel, layer 1
Hadronic Calo, barrel, layer 2

Cross section of the EM Calorimeter.
5
Signal processing

The proposed signal processing approach will
operate at Level 2, on calorimeter data, in order
to:
Reduce the high computational load due to the
high granularity of the information
Speed up the selection process
Achieve higher particle identification
efficiency (main focus on the electron/jet
channel).
Proposed techniques:
Segmented principal component analysis: in
order to exploit the highly segmented calorimeter
system, data representation is made at the layer
level instead of as a single global random
process representation.
Neural networks for particle identification:
projected data will be concatenated and fed into
a feedforward neural network for electron/jet
discrimination.
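
The layer-level representation described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes that "energy" means the fraction of total variance (the sum of covariance eigenvalues) and extracts components by power iteration with deflation, so that only the standard library is needed. All function names, layer names and data are hypothetical.

```python
import math
import random


def covariance_matrix(samples):
    """Mean vector and sample covariance of a list of equal-length vectors."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    centered = [[s[j] - mean[j] for j in range(d)] for s in samples]
    cov = [[sum(x[i] * x[j] for x in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return mean, cov


def leading_eigenpair(matrix, iterations=1000, seed=0):
    """Power iteration for the dominant eigenpair of a symmetric PSD matrix."""
    d = len(matrix)
    rng = random.Random(seed)
    v = [rng.uniform(0.5, 1.0) for _ in range(d)]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, v
        v = [x / norm for x in w]
    w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
    return sum(v[i] * w[i] for i in range(d)), v


def pca_layer(samples, energy_fraction):
    """Keep the fewest components whose variance reaches energy_fraction."""
    mean, cov = covariance_matrix(samples)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))  # trace = total variance
    components, kept = [], 0.0
    while len(components) < d and kept < energy_fraction * total:
        eigenvalue, v = leading_eigenpair(cov)
        if eigenvalue <= 1e-12 * total:
            break
        components.append(v)
        kept += eigenvalue
        # Deflation: subtract the component that was just extracted.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    return mean, components


def segmented_pca(layers, energy_fraction):
    """Fit one independent PCA per layer (layer name -> list of vectors)."""
    return {name: pca_layer(samples, energy_fraction)
            for name, samples in layers.items()}


def compact_event(event, models):
    """Project each layer of one event and concatenate the projections."""
    compact = []
    for name, (mean, components) in models.items():
        x = event[name]
        compact.extend(
            sum((x[j] - mean[j]) * c[j] for j in range(len(x)))
            for c in components
        )
    return compact


# Toy data: two hypothetical layers with 3 and 2 cells.
layers = {
    "layer_a": [[t, 2.0 * t, 0.0] for t in range(5)],
    "layer_b": [[1.0, 0.0], [-1.0, 0.0], [0.0, 0.1], [0.0, -0.1]],
}
models = segmented_pca(layers, energy_fraction=0.90)
event = {"layer_a": [4.0, 8.0, 0.0], "layer_b": [1.0, 0.0]}
compact = compact_event(event, models)
print(len(compact), "components instead of 5 cells")
```

Fitting each layer separately keeps every covariance matrix small, and the concatenated projections form the input vector for the classifier.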

6
Data assembling

Simulated LVL2 data produced in the Athena
environment were used. They correspond to jets
and two signatures of the Higgs boson in the
following decays: H → 2e⁻2µ and H → 4e⁻.
Two types of data assembling were tested: direct
and ring.
Direct assembling: each data vector is formed
by grouping cells in the order in which they
appear in the RoI layer.
Ring assembling: for each calorimeter layer,
the cell with the highest deposited energy is
identified, and the data vector is formed by
sequentially grouping rings of cells around this
marked cell.
This type of assembling puts in evidence the
energy deposition pattern of the incident
particle, which is an important feature that
makes further classification easier to achieve.
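
The two assembling schemes described above can be sketched as follows. The slide does not define the exact ring geometry. This sketch assumes that ring k holds the cells at Chebyshev distance k from the hottest cell, and that cells within a ring are kept in row-major order. The function names and the toy RoI are hypothetical.

```python
def direct_assembling(roi):
    """Flatten the RoI row by row, keeping the cells in their original order."""
    return [energy for row in roi for energy in row]


def ring_assembling(roi):
    """Order the cells ring by ring around the cell with the highest energy.

    Assumption: ring k contains the cells whose Chebyshev distance to the
    hottest cell equals k.
    """
    n_rows, n_cols = len(roi), len(roi[0])
    # Locate the cell with the highest deposited energy.
    hot_r, hot_c = max(
        ((r, c) for r in range(n_rows) for c in range(n_cols)),
        key=lambda rc: roi[rc[0]][rc[1]],
    )
    rings = {}
    for r in range(n_rows):
        for c in range(n_cols):
            k = max(abs(r - hot_r), abs(c - hot_c))
            rings.setdefault(k, []).append(roi[r][c])
    # Ring 0 (the hottest cell) comes first, then ring 1, ring 2, ...
    return [energy for k in sorted(rings) for energy in rings[k]]


# Toy 3 x 3 RoI with the hottest cell in the centre.
roi = [
    [1, 2, 3],
    [4, 9, 5],
    [6, 7, 8],
]
direct_vector = direct_assembling(roi)
ring_vector = ring_assembling(roi)
print(direct_vector)
print(ring_vector)
```

With ring ordering, the first component is always the hottest cell whatever its position in the RoI, so vectors from different events are aligned on the shower shape rather than on detector coordinates.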

Ring assembling of a 5 x 5 RoI into a data vector of 25 cells (cell 1 has the highest deposited energy), followed by principal component extraction.
7
Data compaction

Due to the high complexity of the calorimeter
system, raw random vectors have up to 3115
components (calorimeter cells).
The following table illustrates the level of
compaction achieved for each subdetector layer,
for different levels of random process energy
preservation.

Number of components retained, by fraction of energy preserved and assembling type:

Subdetector layer               Orig.       82%           85%           90%           95%           98%
                                 dim.   ring direct   ring direct   ring direct   ring direct   ring direct
Pre-sampler, barrel               105      1     18      2     20      3     24      5     31     16     42
EM Calo, barrel, front layer      800      5    166      6    187     11    233     35    309    109    390
EM Calo, barrel, middle layer     400      3     64      3     74      4     95      7    131     29    173
EM Calo, barrel, back layer       200      1     32      2     38      3     52      6     77     27    108
Pre-sampler, endcap                60      1     13      1     14      3     19      5     27     12     36
EM Calo, endcap, front layer      720      3    175      4    194      6    232     15    290     45    350
EM Calo, endcap, middle layer     400      2     36      3     42      4     53      6     75     16    106
EM Calo, endcap, back layer       200      2     15      3     18      5     26     11     42     37     70
Hadronic Calo, barrel, layer 0    100      8     35     12     40     23     50     41     63     60     76
Hadronic Calo, barrel, layer 1     90     15     37     21     41     32     48     45     59     66     72
Hadronic Calo, barrel, layer 2     40     15     19     16     21     19     25     25     30     31     36
TOTAL                            3115     56    610     73    689    113    857    201   1134    448   1459
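
The overall compaction level follows directly from the TOTAL row of the table. The sketch below takes it to be the fraction of the original 3115 components that is discarded, which is an assumed definition. The names are illustrative.

```python
ORIGINAL_DIMENSION = 3115

# Preserved energy (%) -> (ring, direct) component totals from the TOTAL row.
TOTAL_COMPONENTS = {
    82: (56, 610),
    85: (73, 689),
    90: (113, 857),
    95: (201, 1134),
    98: (448, 1459),
}


def compaction_level(kept, original=ORIGINAL_DIMENSION):
    """Fraction of the original components that is discarded."""
    return 1.0 - kept / original


for energy, (ring, direct) in TOTAL_COMPONENTS.items():
    print(
        f"{energy}% energy: ring {compaction_level(ring):.1%}, "
        f"direct {compaction_level(direct):.1%}"
    )
```

At 90% preserved energy, ring assembling keeps 113 of 3115 components, a compaction level above 96%, while direct assembling keeps 857.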
8
Data compaction

The following figures illustrate how much we
gain with ring data assembling.

Note that ring data assembling allows
higher levels of compaction, as expected, since
data vectors are organized according to the
energy deposition pattern. For ring data
assembling, 11 components preserve 90% of the
energy.
9
Particle identification

Particle identification is performed by a simple
three-layer feedforward neural network. All
neurons use the hyperbolic tangent as activation
function.
The input layer receives the calorimeter data
projections, concatenated as a single input
vector.
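
A forward pass through such a network can be sketched as below. The slides do not give the layer sizes or the weights, so the hidden-layer size, the single output neuron and the random weights here are assumptions made only for illustration.

```python
import math
import random


def dense_tanh(inputs, weights, biases):
    """One fully connected layer with hyperbolic tangent activation."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(projections, hidden_w, hidden_b, output_w, output_b):
    """Concatenate the per-layer projections and run input -> hidden -> output."""
    x = [value for projection in projections for value in projection]
    hidden = dense_tanh(x, hidden_w, hidden_b)
    return dense_tanh(hidden, output_w, output_b)[0]


# Hypothetical sizes: 3 + 2 projected components, 4 hidden neurons, 1 output.
rng = random.Random(0)
projections = [[0.3, -1.2, 0.5], [0.8, 0.1]]
n_inputs, n_hidden = 5, 4
hidden_w = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
hidden_b = [0.0] * n_hidden
output_w = [[rng.uniform(-1, 1) for _ in range(n_hidden)]]
output_b = [0.0]

score = forward(projections, hidden_w, hidden_b, output_w, output_b)
print(score)
```

Because the output neuron also uses tanh, the score lies between -1 and 1. One natural convention, assumed here, is to train towards +1 for electrons and -1 for jets and to cut on the score.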

10
Particle identification

Network training was performed with the Resilient
Backpropagation (RPROP) algorithm.
This training algorithm eliminates the harmful
effects of the magnitudes of the partial
derivatives. Only the sign of the derivative is
used to determine the direction of the weight
update.
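
The sign-only update can be illustrated with the sketch below. It follows the common variant without weight backtracking (often called iRprop-) with the usual factors 1.2 and 0.5. Which variant and which parameters were used in this work is not stated, so these are assumptions. The toy objective is also hypothetical.

```python
def sign(x):
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def rprop_step(weights, grads, prev_grads, steps,
               eta_plus=1.2, eta_minus=0.5, step_min=1e-6, step_max=50.0):
    """Update weights in place using only the sign of each partial derivative."""
    for i, g in enumerate(grads):
        if g * prev_grads[i] > 0:
            # Same sign as last time: accelerate.
            steps[i] = min(steps[i] * eta_plus, step_max)
        elif g * prev_grads[i] < 0:
            # Sign change: the minimum was overshot, so shrink the step
            # and skip the update for this weight.
            steps[i] = max(steps[i] * eta_minus, step_min)
            g = 0.0
        weights[i] -= sign(g) * steps[i]
        prev_grads[i] = g


# Toy objective with very different gradient magnitudes per coordinate:
# f(w) = 1000 * (w0 - 3)^2 + 0.001 * (w1 + 2)^2
def gradient(w):
    return [2000.0 * (w[0] - 3.0), 0.002 * (w[1] + 2.0)]


weights = [0.0, 0.0]
prev_grads = [0.0, 0.0]
steps = [0.1, 0.1]
for _ in range(200):
    rprop_step(weights, gradient(weights), prev_grads, steps)
print(weights)
```

Both coordinates approach their minima (3 and -2) at a similar pace even though their gradients differ by six orders of magnitude, which is the property referred to above.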
The first training runs were performed by
randomly splitting the complete available data
set (24068 electrons and 2066 jets) into two
data sets of the same size: training and
testing.
Each training step comprised a random selection
of an electron/jet pair in order to avoid
overtraining on electrons due to the different
statistics.
Preliminary results: 90% efficiency.
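
The splitting and pair-sampling scheme described above can be sketched as follows. The event counts are those quoted on the slide. The integer placeholders standing in for events, the seeds and the function names are assumptions.

```python
import random


def split_in_half(events, seed=0):
    """Randomly split a list of events into two halves (training, testing)."""
    rng = random.Random(seed)
    shuffled = list(events)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def balanced_pairs(electrons, jets, n_steps, seed=0):
    """Yield one randomly chosen (electron, jet) pair per training step."""
    rng = random.Random(seed)
    for _ in range(n_steps):
        yield rng.choice(electrons), rng.choice(jets)


# Placeholder events: integers stand in for the compacted data vectors.
electrons = list(range(24068))
jets = list(range(2066))

train_e, test_e = split_in_half(electrons, seed=1)
train_j, test_j = split_in_half(jets, seed=2)

pairs = list(balanced_pairs(train_e, train_j, n_steps=1000))
print(len(train_e), len(train_j), len(pairs))
```

Every step presents exactly one electron and one jet, so both classes contribute equally to training even though electrons outnumber jets by more than ten to one.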

11
Conclusions

The segmented PCA is a very attractive signal
processing approach to the calorimeter
information at ATLAS. The reasons are the high
segmentation of the subdetectors and their high
granularity.
Ring data assembling, following the energy
deposition pattern, achieved considerably higher
levels of compaction than direct assembling of
the cells of each RoI. Results demonstrate that
a compaction level of more than 96% is achieved
if 90% of the energy is preserved.
Another possible approach under study is the use
of ring sums for data assembling, also making the
energy deposition pattern clear.
The relevance of the principal components will
also be investigated in order to verify the
importance of each component to the neural
classifier.
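
The ring-sum idea mentioned above is still under study and no details are given, so the following is only one possible reading: replace each ring by the sum of its cell energies. As before, ring k is assumed to be the set of cells at Chebyshev distance k from the hottest cell. The function name and the toy RoI are hypothetical.

```python
def ring_sums(roi):
    """Sum the cell energies in each ring around the hottest cell."""
    n_rows, n_cols = len(roi), len(roi[0])
    # Locate the cell with the highest deposited energy.
    hot_r, hot_c = max(
        ((r, c) for r in range(n_rows) for c in range(n_cols)),
        key=lambda rc: roi[rc[0]][rc[1]],
    )
    sums = {}
    for r in range(n_rows):
        for c in range(n_cols):
            k = max(abs(r - hot_r), abs(c - hot_c))
            sums[k] = sums.get(k, 0.0) + roi[r][c]
    return [sums[k] for k in sorted(sums)]


roi = [
    [1, 2, 3],
    [4, 9, 5],
    [6, 7, 8],
]
sums = ring_sums(roi)
print(sums)
```

Under this reading, a 5 x 5 RoI with the hottest cell at its centre would be described by three numbers instead of 25, before any PCA is applied.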