Multivariate Discriminant Analysis applied to classification of νe CC events in MINOS (Update)

Transcript and Presenter's Notes


1
Multivariate Discriminant Analysis applied to
classification of νe CC events in MINOS (Update)
  • Alex Sousa
  • Tufts University
  • MINOS Collaboration Meeting at Fermilab
  • 09/18/2004

2
Changes since Ely
  • Using new MDC sample fully reprocessed with R1.9.
  • Consolidation of analysis framework
  • Developed in C/ROOT, independent of Minossoft
    downstream of ntuple generation.
  • Handling of unsynced Reco and Truth trees:
    corrected some bugs and increased algorithm speed
    and robustness.
  • Easy reading of analysis variables with cuts and
    easy creation of oscillated samples; fixed the
    sample generation problem with ντ handling
  • New Event Display
  • CVS repository of the code available on the Tufts
    MINOS server.

3
Analysis Overview
[Flow diagram with boxes: MDC ntuples, Variables ntuple, Variable generation, Sample selection, Cuts, Computation of results, Event Display, MDA output, SAS input format, Variable selection, MDA classification]

4
Samples
  • Sample contents
  • Constructed from 20 νμ, 9 νe, and 39 ντ MDC
    ntuples processed with release R1.9 on the batch
    farm and on the Tufts server.
  • Visible energy and track length cuts optimized
    through use of decision trees (see Maylis's talk)
  • Mild containment cut eliminates background from
    νμ truncated at the end of the detector
  • Training Sample

5
Samples
  • Test Sample

6
Variable Selection
  • Variable selection is performed using SAS
    Stepwise discriminant procedure
  • Original 77 variables sorted by discriminant
    power
  • 45 variables selected for running on the training
    sample
  • Best results for 18 variables
  • uv_rms
  • plane_n
  • ph_pe
  • nstrip
  • uv_kurt
  • trk_plane_ntrklike
  • e_hit_total
  • ntrack
  • s_hit_trans_ratio
  • shw_nstrip_ratio
  • trk_pe_ratio
  • shw_ph_nstrip
  • max_pe_plane
  • chisq_ndf
  • uv_asym_peak
  • e_hit_long
  • e_hit_trans
  • trk_chi2_ndof
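
As a rough illustration of what "sorted by discriminant power" can mean, the sketch below ranks variables by a one-way ANOVA F statistic (between-class variance over within-class variance). This is only a univariate stand-in: the SAS stepwise procedure adds and removes variables conditionally on those already selected, which this sketch does not attempt. The function names and the toy numbers in the usage lines are hypothetical, not taken from the analysis.

```python
from statistics import mean

def f_statistic(groups):
    """One-way ANOVA F statistic for a single variable.

    groups: list of lists, one list of values per event class.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variance of the class means around the grand mean
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    # Pooled variance of the values around their own class mean
    within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n - k)
    return between / within if within > 0 else float("inf")

def rank_variables(samples):
    """Return variable names sorted by decreasing F statistic.

    samples: dict mapping variable name -> list of per-class value lists.
    """
    scores = {name: f_statistic(groups) for name, groups in samples.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy usage with made-up values for two classes (not real MINOS data)
toy = {
    "ntrack": [[1, 2, 3], [2, 3, 4]],
    "uv_rms": [[1, 2, 3], [4, 5, 6]],
}
ranking = rank_variables(toy)
```

A larger F means the class means are further apart relative to the spread inside each class, so the variable separates the classes better on its own.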

7
Some Selected Variables
8
MDA output (Probability Distributions)
Training Sample
9
Threshold Determination
  • Calculate the training sample figure of merit
    (FOM) for several possible thresholds. Apply the
    threshold corresponding to the highest FOM to the
    test sample classification.
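
The threshold scan described above can be sketched as follows. The slides do not define the figure of merit, so the form S / sqrt(S + B) used here is an assumption, as are the function names and the choice of counting unweighted events.

```python
import math

def figure_of_merit(n_signal, n_background):
    """Assumed FOM: S / sqrt(S + B). Returns 0 when nothing is selected."""
    total = n_signal + n_background
    return n_signal / math.sqrt(total) if total > 0 else 0.0

def best_threshold(signal_probs, background_probs, thresholds):
    """Scan candidate thresholds on the training sample.

    signal_probs / background_probs: MDA signal probabilities of true
    signal and true background training events.
    Returns (threshold, fom) for the threshold with the highest FOM.
    """
    best = None
    for t in thresholds:
        s = sum(1 for p in signal_probs if p > t)
        b = sum(1 for p in background_probs if p > t)
        fom = figure_of_merit(s, b)
        if best is None or fom > best[1]:
            best = (t, fom)
    return best

def classify(probs, threshold):
    """Flag test-sample events whose signal probability exceeds the threshold."""
    return [p > threshold for p in probs]
```

The threshold is chosen on the training sample only and then applied unchanged to the test sample, so the test-sample efficiencies are not biased by the optimization.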

10
Results (Energy Distributions)
Test Sample (no threshold)
[Plots: energy distributions for νe (signal) and for the NC, νμ, and ντ backgrounds]
11
Results (Energy Distributions)
Test Sample (threshold 0.88)
[Plots: energy distributions for NC, νe, νμ, and ντ events]
12
Results (Energy Distributions)
Test Sample (threshold 0.92)
[Plots: energy distributions for νe, NC, νμ, and ντ events]
13
Results (Efficiencies)
Test Sample (no oscillations, no threshold)
[Plots: efficiencies for NC, νe, ντ, and νμ events]
14
Results (Efficiencies)
Test Sample (oscillated, no threshold)
[Plots: efficiencies for NC, νe, νμ, and ντ events]
15
Results (Efficiencies)
Test Sample (threshold 0.88)
[Plots: efficiencies for NC, νe, ντ, and νμ events]
16
Results (Efficiencies)
Test Sample (threshold 0.92)
[Plots: efficiencies for NC, νe, ντ, and νμ events]
17
Results (νe appearance signal)
[Two plots, Test Sample (threshold 0.88) and Test Sample (threshold 0.92), each showing Signal+BG and BG]
18
Results (comparison table)
19
Some events
20
Some events
21
Some events
22
Current and Future Work
  • Investigate discriminating power of cos θ vs. φ
    distributions.
  • Look at other less trivial threshold cuts in the
    probability space. Evaluate method stability in
    multiple randomly generated test samples.
  • Fit signal and background histograms to the test
    sample instead of fixing a threshold.
  • Apply method to the Near Detector MDC files.

23
Fitting (very preliminary)
  • Instead of defining probability thresholds, fit
    the complete MDA probability histograms defined
    on the training sample to the ones obtained from
    classification on the test sample.
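
One simple way to picture such a fit is a two-template linear least-squares fit, where the test-sample probability histogram is modeled as a scaled signal template plus a scaled background template. The slides do not say which fit method was used, so the unweighted least-squares choice, the restriction to two templates, and the function name below are all assumptions for illustration.

```python
def fit_two_templates(data, signal_tmpl, background_tmpl):
    """Least-squares fit of data ~ a * signal_tmpl + b * background_tmpl.

    All three arguments are lists of bin contents with the same binning.
    Returns the scale factors (a, b) from the 2x2 normal equations.
    """
    ss = sum(s * s for s in signal_tmpl)
    bb = sum(b * b for b in background_tmpl)
    sb = sum(s * b for s, b in zip(signal_tmpl, background_tmpl))
    ds = sum(d * s for d, s in zip(data, signal_tmpl))
    db = sum(d * b for d, b in zip(data, background_tmpl))
    det = ss * bb - sb * sb
    if abs(det) < 1e-12:
        raise ValueError("templates are linearly dependent; fit is undetermined")
    a = (ds * bb - db * sb) / det
    b = (db * ss - ds * sb) / det
    return a, b
```

A real analysis would more likely weight bins by their statistical uncertainty (a chi-square or likelihood fit); the unweighted version is kept here only because it has a closed-form solution.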

24
Fitting (very preliminary)
25
Fitting (very preliminary)
26
MDA Procedure
  • Define a set of variables that
    appropriately describes the data sample.
  • Calculate the covariance matrix for each class
  • Determine the Mahalanobis distance to each class
    for each event
  • Compute the probabilities for an event to belong
    to each class (scores).
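
The steps above can be sketched in a few lines. The slides do not state how distances are turned into probabilities, so this sketch assumes Gaussian class densities with a separate covariance matrix per class and equal prior probabilities; all function names are hypothetical, and the small linear solver is included only to keep the example free of external libraries.

```python
import math

def mean_and_cov(rows):
    """Mean vector and sample covariance matrix of a list of events."""
    n, d = len(rows), len(rows[0])
    mu = [sum(r[j] for r in rows) / n for j in range(d)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, cov

def solve_and_logdet(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Also returns log|det a|, accumulated from the pivots.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    logdet = 0.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        if abs(m[p][c]) < 1e-12:
            raise ValueError("singular covariance matrix")
        m[c], m[p] = m[p], m[c]
        logdet += math.log(abs(m[c][c]))
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][k] * x[k] for k in range(i + 1, n))) / m[i][i]
    return x, logdet

def mahalanobis_sq(event, mu, cov):
    """Squared Mahalanobis distance of an event to a class, plus log|cov|."""
    diff = [e - u for e, u in zip(event, mu)]
    x, logdet = solve_and_logdet(cov, diff)
    return sum(d * xi for d, xi in zip(diff, x)), logdet

def class_probabilities(event, training):
    """Probability of the event belonging to each class.

    training: dict mapping class name -> list of training events.
    Assumes Gaussian classes and equal priors (an assumption of this sketch).
    """
    log_scores = {}
    for name, rows in training.items():
        mu, cov = mean_and_cov(rows)
        d2, logdet = mahalanobis_sq(event, mu, cov)
        log_scores[name] = -0.5 * (d2 + logdet)
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    weights = {k: math.exp(v - top) for k, v in log_scores.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}
```

For clarity the class means and covariances are recomputed for every event; in practice they would be computed once from the training sample and reused.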

27
Energy Distributions
Training Sample (no threshold)
28
Energy Distributions
Training Sample (threshold 0.88)
29
Energy Distributions
Training Sample (threshold 0.92)