Multivariate Discriminant Analysis applied to Classification of NC Events in MINOS - Near Data/MC Comparisons

Slides: 18
Provided by: minosdo

Transcript and Presenter's Notes


1
Multivariate Discriminant Analysis applied to
Classification of NC Events in MINOS - Near
Data/MC Comparisons
  • Alex Sousa
  • Oxford University
  • MINOS Collaboration Meeting
  • NC Working Group
  • 09/09/2006

2
Introduction
  • Update on work presented at last Friday's NC phone
    meeting
  • Look at Near Detector Data/MC selection
    comparison
  • Data distributions of input variables
  • Use Near trained discriminant function to
    classify data
  • Selection performance
  • Note: no reweighting applied.

3
Multivariate Discriminant Analysis
  • Define a set of discriminator variables
    that appropriately describes the data sample
  • Calculate the covariance matrix for each class
  • Determine the Mahalanobis distance to each class
    for each event
  • Compute the score for an event to belong to each
    class
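
The four steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: equal prior probabilities for the two classes, a score of the form exp(-D²/2) normalized over classes (the slides do not give the exact score definition), and synthetic two-variable toy data. The function names are hypothetical and are not taken from the analysis code.

```python
import numpy as np

def fit_class(X):
    """Return the mean vector and inverse covariance matrix of one class."""
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    return mean, np.linalg.inv(cov)

def mahalanobis_sq(x, mean, inv_cov):
    """Squared Mahalanobis distance of each event in x (n, d) to one class."""
    d = x - mean
    return np.einsum("ij,jk,ik->i", d, inv_cov, d)

def class_scores(x, classes):
    """Normalized scores exp(-D^2/2) per class (assumption: equal priors)."""
    d2 = np.stack([mahalanobis_sq(x, m, ic) for m, ic in classes], axis=1)
    # Subtract the per-event minimum before exponentiating for stability.
    w = np.exp(-0.5 * (d2 - d2.min(axis=1, keepdims=True)))
    return w / w.sum(axis=1, keepdims=True)

# Synthetic "NC" and "CC" training samples (toy data, not MINOS MC).
rng = np.random.default_rng(0)
nc_train = rng.normal([0.0, 0.0], 1.0, size=(500, 2))
cc_train = rng.normal([4.0, 4.0], 1.0, size=(500, 2))
classes = [fit_class(nc_train), fit_class(cc_train)]

events = np.array([[0.0, 0.0], [4.0, 4.0]])
scores = class_scores(events, classes)  # columns: score(NC), score(CC)
```

Each row of `scores` sums to one, so the two columns can be read as the per-event NC and CC scores used later for the threshold cut.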

4
Analysis Structure Overview
  • Flow diagram; box labels as extracted (connections
    between boxes not recoverable): Sntp ntuples;
    Analysis Ntuples; SAS input format; NCAnalysis
    Reader Module; Efficiency-Purity, Fitting, etc.;
    Cuts; NCExtraction Object; MDA output; Variable
    selection; MDA classification

5
Samples and Cuts
  • Near:
  • Training sample: 209 R1_18_2 (Carrot) Near MC
    files (1.027×10^16 POT/file)
  • Test sample: 209 R1_18_2 (Carrot) Near MC
    files (1.027×10^16 POT/file)
  • No overlapping events
  • Data test sample: 24 December LE-10 sub-run files
    (Run 9554) (~8×10^17 POT)
  • Sample cuts:
  • Vertex contained in Oxford fiducial volume
  • Far Longitudinal: 0.272 m < vtxZ < 13.66 m or
    16.8 m < vtxZ < 28.9 m
  • Far Transverse: vtx XY position at least 0.50 m
    from closest edge
  • Coil hole: vtx XY position more than 0.45 m in
    radius from the center
  • Near Longitudinal: 1.212 m < vtxZ < 4.766 m
  • Near Transverse: vtx XY position at least 0.50 m
    from closest edge (of a partial plane)
  • Event length < 50 planes
  • For data: use IsGoodBeamSnarl()
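
As a rough illustration, the Near Detector cuts listed above could be combined as follows. The cut values are the ones quoted on the slide; the `Event` class, its field names, and `passes_near_cuts` are hypothetical. The distance to the closest partial-plane edge depends on detector geometry that the slides do not give, so it is taken here as a precomputed input.

```python
from dataclasses import dataclass

# Near Detector cut values quoted on the slide.
NEAR_Z_MIN_M = 1.212
NEAR_Z_MAX_M = 4.766
MIN_EDGE_DISTANCE_M = 0.50
MAX_EVENT_LENGTH_PLANES = 50

@dataclass
class Event:
    """Hypothetical event record; field names are illustrative only."""
    vtx_z: float          # vertex z position, metres
    edge_distance: float  # distance to closest partial-plane edge, metres
    n_planes: int         # event length in planes

def passes_near_cuts(ev: Event) -> bool:
    """Apply the longitudinal, transverse and event-length cuts."""
    in_z = NEAR_Z_MIN_M < ev.vtx_z < NEAR_Z_MAX_M
    in_xy = ev.edge_distance >= MIN_EDGE_DISTANCE_M
    is_short = ev.n_planes < MAX_EVENT_LENGTH_PLANES
    return in_z and in_xy and is_short
```

For data, a beam-quality check such as the IsGoodBeamSnarl() call named above would be applied in addition; it is not modelled in this sketch.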

6
Variable Selection
  • Selection performed using Stepwise discriminant
    analysis via SAS
  • Input: 75 variables directly available from
    AnalysisNtuples processing
  • Variables were sorted by discriminating power
  • The 14 best variables were chosen to form the
    multivariate discriminant
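
The selection itself was done in SAS, and the slides do not give its settings. The sketch below only illustrates the general idea of stepwise selection with a simplified stand-in: a forward-only search that, at each step, adds the variable giving the smallest Wilks' lambda (ratio of within-class to total scatter determinants). It has no removal step and no significance threshold, and the function names are hypothetical.

```python
import numpy as np

def wilks_lambda(X, y):
    """det(W) / det(T) for within-class (W) and total (T) scatter matrices."""
    centered = X - X.mean(axis=0)
    total = centered.T @ centered
    within = np.zeros_like(total)
    for label in np.unique(y):
        Xc = X[y == label]
        d = Xc - Xc.mean(axis=0)
        within += d.T @ d
    return np.linalg.det(within) / np.linalg.det(total)

def forward_select(X, y, n_keep):
    """Greedily pick n_keep column indices minimizing Wilks' lambda."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < n_keep:
        best = min(remaining,
                   key=lambda j: wilks_lambda(X[:, selected + [j]], y))
        selected.append(best)
        remaining.remove(best)
    return selected
```

With 75 input columns and `n_keep=14`, this would return 14 column indices in the order they were added, which is one way to read "sorted by discriminating power".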

7
Selected Variables
8
Selected Variables
9
Selected Variables
10
MDA Probability Distributions
  • Using the multivariate discriminant, each event
    is assigned 2 scores or probabilities of
    belonging to the NC or CC group.

11
Prob. Threshold and Eff., Pur.
  • Using a probability threshold cut we can tune
    efficiency and purity values, e.g. Prob(NC) > 0.85
    gives Eff. < 92%, Pur. > 52%

Training Sample
Test Sample
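
A minimal sketch of how efficiency and purity follow from a threshold on Prob(NC), assuming per-event scores and MC truth labels are available as arrays. The function name and the six example events are hypothetical.

```python
import numpy as np

def efficiency_purity(prob_nc, is_nc, threshold):
    """Efficiency and purity of the selection Prob(NC) > threshold."""
    prob_nc = np.asarray(prob_nc, dtype=float)
    is_nc = np.asarray(is_nc, dtype=bool)
    selected = prob_nc > threshold
    n_true_nc = is_nc.sum()
    n_selected = selected.sum()
    n_selected_nc = (selected & is_nc).sum()
    # Efficiency: fraction of true NC events that are selected.
    efficiency = n_selected_nc / n_true_nc if n_true_nc else float("nan")
    # Purity: fraction of selected events that are true NC.
    purity = n_selected_nc / n_selected if n_selected else float("nan")
    return efficiency, purity

# Hypothetical scores and truth labels for six MC events.
prob_nc = [0.95, 0.90, 0.60, 0.88, 0.30, 0.10]
is_nc = [True, True, True, False, False, False]
eff, pur = efficiency_purity(prob_nc, is_nc, threshold=0.85)
```

Raising the threshold generally trades efficiency for purity, which is the tuning the slide refers to.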
12
Summary Table (Near Test)
  • Results obtained when applying the classifier
    obtained from the Far MC training sample to the
    Near MC sample (Note: test sample is 418 files).

Near Test Sample                     NC       CC  Beam νe  NC (%)  CC (%)  Beam νe (%)
Total                            176047  1180485    15241     100     100          100
FidVolOx Cut                      24952    88473     1538      14       8           10
Ev. Length < 50 planes cut        24830    46751     1521      14       4           10
MDA Class. (sig. and bg.)         23639    22131     1485      13       2           10
MDA Class. (Prob(NC) > 0.75)      22737    19704     1458      13       2           10
Divide Near test sample with cuts into 50% Training, 50% NewTest; create new MDA classifier
Total NewTest (with cuts)         12452    23443      792      --      --           --
MDA Class. (Prob(NC) > 0.85)      11445     9861      747      --      --           --
MDA Class. (low efficiency)        6227     3742      583      --      --           --
NC Selection Efficiency 92%, NC Selection Purity 52%
NC Selection Efficiency 50%, NC Selection Purity 59%
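
The quoted efficiency and purity values can be cross-checked against the NewTest counts in the table. The sketch assumes that efficiency is taken relative to the NC total after cuts and that purity counts both CC and beam νe events as background; under those assumptions the table counts reproduce the quoted 92%/52% and 50%/59%.

```python
# Counts from the NewTest rows of the table: (NC, CC, beam nu_e).
total_nc = 12452
selected = {
    "Prob(NC) > 0.85": (11445, 9861, 747),
    "low efficiency": (6227, 3742, 583),
}

results = {}
for name, (nc, cc, nue) in selected.items():
    efficiency = nc / total_nc        # selected NC / all NC after cuts
    purity = nc / (nc + cc + nue)     # selected NC / all selected events
    results[name] = (round(100 * efficiency), round(100 * purity))

# results == {"Prob(NC) > 0.85": (92, 52), "low efficiency": (50, 59)}
```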
13
Summary Table
Near Test Sample                      NC       CC  Beam νe
Test (with cuts)                   12452    23443      792
MDA Class. (Prob(NC) > 0.85)       11445     9861      747
MDA Class. (low efficiency)         6227     3742      583
Total Data (with cuts)             12841 (single total, spanning all columns)
MDA Class. (Prob(NC) > 0.85) Data   7549     4270     ----
NC Selection Efficiency 50%, NC Selection Purity 59%
NC Selection Efficiency 92%, NC Selection Purity 52%
14
Eff., Pur. vs Visible Energy
  • Near NC Selection efficiency and purity as a
    function of visible energy.

Near Training Sample
Near Test Sample
  • CC contamination in lowest energy bin more severe
    than in the Far case
  • Efficiency mostly flat between 0-6 GeV, decreases
    somewhat in high energy tail
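
Efficiency and purity as a function of visible energy can be computed by histogramming the true-NC, selected, and selected-true-NC events in the same energy bins. This is a sketch assuming per-event arrays of visible energy, MC truth, and selection decision; the function name and binning are hypothetical.

```python
import numpy as np

def eff_pur_vs_energy(energy, is_nc, selected, bin_edges):
    """Per-bin NC selection efficiency and purity (NaN for empty bins)."""
    energy = np.asarray(energy, dtype=float)
    is_nc = np.asarray(is_nc, dtype=bool)
    selected = np.asarray(selected, dtype=bool)
    n_true, _ = np.histogram(energy[is_nc], bins=bin_edges)
    n_sel, _ = np.histogram(energy[selected], bins=bin_edges)
    n_sel_nc, _ = np.histogram(energy[selected & is_nc], bins=bin_edges)
    # Suppress warnings from 0/0 in empty bins; those bins become NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        eff = np.where(n_true > 0, n_sel_nc / n_true, np.nan)
        pur = np.where(n_sel > 0, n_sel_nc / n_sel, np.nan)
    return eff, pur
```

Bins with no true NC events (or no selected events) are returned as NaN rather than zero, so that sparsely populated high-energy bins are not mistaken for bins with zero efficiency.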

15
Near NC Selection Distributions
  • Distributions for events from the Near Test/Data
    sample selected as NC.

16
Near NC Selection Distributions
  • Distributions for events from the Near Test
    sample selected as NC.

17
Future Work
  • Implement reweighting before variable selection
    and look for any improvements
  • Limit training set to variables displaying best
    agreement between data/MC
  • Perform classification in with-track and
    without-track samples.
  • Use MDA selection and NCUtils implementation to
    go through the full analysis chain.
  • Data/MC agreement using the MDA method looks
    reasonable for a preliminary attempt, should only
    improve from here on