Signal/Background Discrimination in Particle Physics

1
Signal/Background Discrimination in Particle
Physics
  • Harrison B. Prosper
  • Florida State University
  • SAMSI
  • 8 March, 2006

2
Outline
  • Particle Physics Data
  • Signal/Background Discrimination
  • Summary

3
Particle Physics Data
proton + anti-proton $\to$ positron ($e^{+}$) + neutrino ($\nu$) + Jet1 + Jet2 + Jet3 + Jet4
This event is described by (at least) $3 + 2 + 3 \times 4 = 17$ measured quantities.
4
Particle Physics Data
[Figure labels: 10^6, 1]
H0: Standard Model; H1: Model of the Week
5
Signal/Background Discrimination
  • To minimize misclassification probability,
    compute
  • $p(S \mid x) = \dfrac{p(x \mid S)\, p(S)}{p(x \mid S)\, p(S) + p(x \mid B)\, p(B)}$
  • Every signal/background discrimination method is
    ultimately an algorithm to approximate this
    function, or a mapping thereof.
  • $p(S) / p(B)$ is the prior signal-to-background
    ratio, that is, it is S/B before applying a cut
    to $p(S \mid x)$.
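
A minimal sketch of Bayes' theorem for the signal probability, assuming (purely for illustration) 1-D Gaussian densities for signal and background. The means, the width, and the prior are hypothetical values, not numbers from the talk.

```python
import math

def gauss(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def posterior_signal(x, prior_s=0.5, mu_s=1.0, mu_b=0.0, sigma=1.0):
    """p(S|x) from Bayes' theorem; all default values are illustrative."""
    ps = gauss(x, mu_s, sigma) * prior_s          # p(x|S) p(S)
    pb = gauss(x, mu_b, sigma) * (1.0 - prior_s)  # p(x|B) p(B)
    return ps / (ps + pb)

if __name__ == "__main__":
    for x in (-1.0, 0.5, 2.0):
        print(f"x = {x:+.1f}  p(S|x) = {posterior_signal(x):.3f}")
```

At the midpoint between the two means the densities are equal, so the posterior there reduces to the prior p(S).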

6
Signal/Background Discrimination
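  • Given
  • $D = \{x, y\}$
  • $x = \{x_1, \ldots, x_N\}$, $y = \{y_1, \ldots, y_N\}$
  • of N training examples (events)
  • Infer
  • A discriminant function $f(x, w)$, with parameters
    $w$
  • $p(w \mid x, y) = p(x, y \mid w)\, p(w) / p(x, y)$
  • $\quad = p(y \mid x, w)\, p(x \mid w)\, p(w) / [\, p(y \mid x)\, p(x) \,]$
  • $\quad = p(y \mid x, w)\, p(w) / p(y \mid x)$
  • assuming $p(x \mid w) \to p(x)$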

7
Signal/Background Discrimination
  • A typical likelihood for classification
  • $p(y \mid x, w) = \prod_i f(x_i, w)^{y_i}\, [1 - f(x_i, w)]^{1 - y_i}$
  • where $y_i = 0$ for background events
  • $y_i = 1$ for signal events
  • If $f(x, w)$ is flexible enough, then maximizing
    $p(y \mid x, w)$ with respect to $w$ yields $f = p(S \mid x)$,
    asymptotically.
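
The claim that maximizing this likelihood yields p(S|x) can be checked on a toy problem. The sketch below assumes two unit-width Gaussians (signal mean 1, background mean 0, equal priors), for which p(S|x) is exactly sigmoid(x - 0.5). The model f(x, w) = sigmoid(a x + b) and the plain gradient ascent are illustrative choices, not the method used in the talk.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fit_logistic(x, y, lr=0.5, steps=3000):
    """Maximize the Bernoulli likelihood of f(x, w) = sigmoid(a*x + b)."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        f = sigmoid(a * x + b)
        # Gradient of the mean log-likelihood with respect to (a, b).
        a += lr * np.mean((y - f) * x)
        b += lr * np.mean(y - f)
    return a, b

rng = np.random.default_rng(1)
n = 5000
x = np.concatenate([rng.normal(1.0, 1.0, n),   # signal events, y = 1
                    rng.normal(0.0, 1.0, n)])  # background events, y = 0
y = np.concatenate([np.ones(n), np.zeros(n)])

a, b = fit_logistic(x, y)
# For these toy densities the exact posterior is sigmoid(x - 0.5),
# so the fit should land near a = 1, b = -0.5.
print(f"fitted a = {a:.3f}, b = {b:.3f}")
```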

8
Signal/Background Discrimination
  • However, in a Bayesian calculation it is more
    natural to average
  • $y(x) = \int f(x, w)\, p(w \mid D)\, dw$
  • Questions
  • 1. Do suitably flexible functions f(x, w) exist?
  • 2. Is there a feasible way to do the integral?

9
Answer 1: Yes!
  • Hilbert's 13th problem: Prove a special case of
    the conjecture
  • The following is impossible, in general,
  • $f(x_1, \ldots, x_n) = F(g_1(x_1), \ldots, g_n(x_n))$
  • In 1957, Kolmogorov proved the
    contrary: a function $f: \mathbb{R}^n \to \mathbb{R}$ can be
    represented as follows
  • $f(x_1, \ldots, x_n) = \sum_{i=1}^{2n+1} Q_i\left( \sum_{j=1}^{n} G_{ij}(x_j) \right)$
  • where $G_{ij}$ are independent of $f(\cdot)$

10
Kolmogorov Functions
A neural network is an example of a Kolmogorov
function, that is, a function capable of
approximating arbitrary mappings $f: \mathbb{R}^n \to \mathbb{R}$.
The parameters $w = (u, a, v, b)$ are called weights.
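
The network formula itself is not preserved in this transcript. The sketch below assumes a common single-hidden-layer form, n(x, w) = sigmoid(b + sum_j v_j tanh(a_j + u_j . x)), with weights w = (u, a, v, b); this functional form is an assumption. Under it, an (n_in, n_hidden, 1) network has n_in*n_hidden + 2*n_hidden + 1 weights, which gives 641 for the (14, 40, 1) model class that appears in Example 2.

```python
import numpy as np

def n_params(n_in, n_hidden):
    """Number of weights in an (n_in, n_hidden, 1) network: u, a, v and b."""
    return n_in * n_hidden + n_hidden + n_hidden + 1

def network(x, u, a, v, b):
    """Assumed form: sigmoid(b + sum_j v_j * tanh(a_j + u_j . x))."""
    hidden = np.tanh(a + u @ x)          # u has shape (n_hidden, n_in)
    return 1.0 / (1.0 + np.exp(-(b + v @ hidden)))

rng = np.random.default_rng(0)
n_in, n_hidden = 14, 40
u = rng.normal(size=(n_hidden, n_in))    # input-to-hidden weights
a = rng.normal(size=n_hidden)            # hidden biases
v = rng.normal(size=n_hidden)            # hidden-to-output weights
b = rng.normal()                         # output bias
out = network(rng.normal(size=n_in), u, a, v, b)
print(n_params(n_in, n_hidden), out)
```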
11
Answer 2: Yes!
  • Computational Method
  • Generate a Markov chain (MC) of N points w,
    whose stationary density is $p(w \mid D)$, and average
    over the last M points.
  • Map the problem into that of a particle moving in a
    spatially-varying potential and use methods of
    statistical mechanics to generate states $(p, w)$
    with probability $\exp(-\beta H)$,
  • where H is the Hamiltonian
  • $H = -\log p(w \mid D) + p^2$, with momentum $p$.

12
Hybrid Markov Chain Monte Carlo
  • Computational Method
  • For a fixed H, traverse the space $(p, w)$ using
    Hamilton's equations, which guarantees that all
    points consistent with H will be visited with
    equal probability $\exp(-\beta H)$.
  • To allow exploration of states with differing
    values of H one introduces, periodically, random
    changes to the momentum $p$.
  • Software
  • Flexible Bayesian Modeling by Radford Neal
  • http://www.cs.utoronto.ca/~radford/fbm.software.html
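
A minimal sketch of the hybrid Monte Carlo idea, not the FBM implementation. It assumes the common convention H = -log p(w) + p^2/2 with beta = 1 (the factor 1/2 only rescales the momentum), uses a leapfrog integrator for Hamilton's equations, and adds a Metropolis test to correct for integration error. The 2-D standard normal target, step size, and trajectory length are illustrative stand-ins for p(w|D) and its tuning.

```python
import numpy as np

def hmc(log_p, grad_log_p, w0, n_steps=2000, eps=0.2, n_leap=20, seed=0):
    """Hybrid (Hamiltonian) Monte Carlo with H = -log p(w) + p^2 / 2."""
    rng = np.random.default_rng(seed)
    w = np.atleast_1d(np.asarray(w0, dtype=float))
    chain = []
    for _ in range(n_steps):
        p = rng.normal(size=w.shape)             # periodic random momentum
        h_old = -log_p(w) + 0.5 * (p @ p)
        w_new, p_new = w.copy(), p.copy()
        # Leapfrog integration of Hamilton's equations.
        p_new += 0.5 * eps * grad_log_p(w_new)
        for i in range(n_leap):
            w_new += eps * p_new
            if i < n_leap - 1:
                p_new += eps * grad_log_p(w_new)
        p_new += 0.5 * eps * grad_log_p(w_new)
        h_new = -log_p(w_new) + 0.5 * (p_new @ p_new)
        # Metropolis test: accept with probability min(1, exp(h_old - h_new)).
        if rng.random() < np.exp(min(0.0, h_old - h_new)):
            w = w_new
        chain.append(w.copy())
    return np.array(chain)

# Toy target: standard normal in 2-D, standing in for p(w|D).
chain = hmc(lambda w: -0.5 * (w @ w), lambda w: -w, w0=[3.0, -3.0])
kept = chain[500:]                               # discard the early points
print(kept.mean(axis=0), kept.var(axis=0))
```

Averaging a function over `kept` plays the role of averaging over the last M points of the chain.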

13
Example 1
14
Example 1: 1-D
  • Signal
  • $p\bar{p} \to t\,q\,b$
  • Background
  • $p\bar{p} \to W\,b\,b$
  • NN Model Class
  • (1, 15, 1)
  • MCMC
  • 500 tqb + Wbb events
  • Use last 20 points in a chain of 10,000,
    skipping every 20th

[Figure: distributions of x for Wbb and tqb events]
15
Example 1: 1-D
Dots: $p(S \mid x) = H_S / (H_S + H_B)$, where $H_S$, $H_B$ are 1-D
histograms. Curves: individual NNs $n(x, w_k)$.
Black curve: $\langle n(x, w) \rangle$.
[Figure: curves plotted against x]
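
The histogram-ratio estimate H_S / (H_S + H_B) can be sketched as follows. The Gaussian toy samples and the binning are assumptions made for illustration; they are not the tqb and Wbb samples of the talk.

```python
import numpy as np

rng = np.random.default_rng(2)
signal = rng.normal(1.0, 1.0, 20000)      # toy stand-in for the signal sample
background = rng.normal(0.0, 1.0, 20000)  # toy stand-in for the background sample

edges = np.linspace(-3.0, 4.0, 15)
hs, _ = np.histogram(signal, bins=edges)
hb, _ = np.histogram(background, bins=edges)

total = hs + hb
# Bin-by-bin estimate of p(S|x); empty bins are left as NaN.
p_s = np.full(total.shape, np.nan)
np.divide(hs, total, out=p_s, where=total > 0)

centers = 0.5 * (edges[:-1] + edges[1:])
exact = 1.0 / (1.0 + np.exp(-(centers - 0.5)))   # exact p(S|x) for this toy
print(np.nanmax(np.abs(p_s - exact)))
```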
16
Example 2
17
Example 2: 14-D (Finding Susy!)
Transverse momentum spectra. Signal: black curve.
Signal/Noise: 1/25,000
18
Example 2: 14-D (Finding Susy!)
Missing transverse momentum spectrum (caused
by escape of neutrinos and Susy particles).
Measured quantities: $4 \times (E_T, \eta, \phi) + (E_T, \phi) = 14$
19
Example 2: 14-D (Finding Susy!)
  • Signal
  • 250 $p\bar{p} \to$ gluino, gluino (Susy) events
  • Background
  • 250 $p\bar{p} \to$ top, anti-top events
  • NN Model Class
  • (14, 40, 1) ($w \in$ 641-D parameter space!)
  • MCMC
  • Use last 100 networks in a Markov chain of
    10,000, skipping every 20.

[Figure labels: Likelihood, Prior]
20
Results
Network distribution beyond $n(x) > 0.9$. Assuming
$L = 10\ \mathrm{fb}^{-1}$:

Cut    S                B                $S/\sqrt{B}$
0.90   $5 \times 10^3$  $2 \times 10^6$  3.5
0.95   $4 \times 10^3$  $7 \times 10^5$  4.7
0.99   $1 \times 10^3$  $2 \times 10^4$  7.0
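
The last column can be recomputed from the quoted S and B. Because the inputs are rounded to one significant figure, the results agree with the slide's values only to within about 0.1.

```python
import math

# (cut, S, B) as quoted in the results for L = 10 fb^-1.
rows = [(0.90, 5e3, 2e6), (0.95, 4e3, 7e5), (0.99, 1e3, 2e4)]

def significance(s, b):
    """Simple significance estimate S / sqrt(B)."""
    return s / math.sqrt(b)

for cut, s, b in rows:
    print(f"n(x) > {cut:.2f}:  S/sqrt(B) = {significance(s, b):.1f}")
```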
21
But Does It Really Work?
  • Let
  • $d(x) = N\, p(x \mid S) + N\, p(x \mid B)$
  • be the density of the data, containing 2N
    events, assuming, for simplicity, $p(S) = p(B)$.
  • A properly trained classifier $y(x)$ approximates
  • $p(S \mid x) = p(x \mid S) / [\, p(x \mid S) + p(x \mid B) \,]$
  • Therefore, if the data (signal + background) are
    weighted with $y(x)$, we should recover the signal
    density: $y(x)\, d(x) = N\, p(x \mid S)$.
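
The weighting check can be tried on a toy sample. The sketch below assumes unit-width Gaussian signal and background densities and uses the exact p(S|x) in place of a trained classifier, so it tests the identity rather than any particular network.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000
signal = rng.normal(1.0, 1.0, n)       # toy p(x|S)
background = rng.normal(0.0, 1.0, n)   # toy p(x|B)
data = np.concatenate([signal, background])   # 2N events, p(S) = p(B)

# Exact p(S|x) for this toy stands in for a trained classifier y(x).
y = 1.0 / (1.0 + np.exp(-(data - 0.5)))

edges = np.linspace(-4.0, 5.0, 19)
weighted, _ = np.histogram(data, bins=edges, weights=y)   # data weighted by y(x)
truth, _ = np.histogram(signal, bins=edges)               # true signal histogram

# Largest bin-by-bin deviation, relative to the tallest signal bin.
max_dev = np.max(np.abs(weighted - truth)) / truth.max()
print(max_dev, weighted.sum() / n)
```

The weighted histogram of all 2N events should track the signal-only histogram, and the sum of the weights should be close to N.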

22
But Does It Really Work?
It seems to!
23
Example 3
24
Particle Physics Data, Take 2
  • Two varieties of jet
  • Tagged (Jet 1, Jet 4)
  • Untagged (Jet 2, Jet 3)
  • We are often interested in
  • Pr(Tagged | Jet Variables)

25
Example 3: Tagging Jets
$p(T \mid x) = p(x \mid T)\, p(T) / d(x)$
$d(x) = p(x \mid T)\, p(T) + p(x \mid U)\, p(U)$
$x = (P_T, \eta, \phi)$ (red curve is $d(x)$!)
[Figure: $p(x \mid T)$ or $d(x)$; labels: tagged jet, untagged jet, collision point]
26
Probability Density Estimation
  • Approximate a density by a sum over kernels K(.),
    one placed at each of the N points xi of the
    training sample.
  • h is one or more smoothing parameters adjusted to
    provide the best approximation to the true
    density p(x).
  • If h is too small, the model will be very spiky;
    if h is too large, features of the density p(x)
    will be lost.

27
Probability Density Estimation
  • Why does this work? Consider the limit as $N \to \infty$
    of
  • In the limit $N \to \infty$, the true density $p(x)$ will
    be recovered provided that $h \to 0$ in such a way
    that
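
The equations for this slide are not preserved in the transcript. As an assumption about what they expressed, the standard argument for a d-dimensional kernel estimate with a single smoothing parameter h is sketched below: at fixed h the sample average tends to the kernel-smoothed density, which approaches p(x) as h shrinks, provided the number of points under each kernel still grows.

```latex
\hat{p}(x) = \frac{1}{N}\sum_{i=1}^{N} \frac{1}{h^{d}}
             K\!\left(\frac{x - x_i}{h}\right)
\;\xrightarrow{\,N \to \infty\,}\;
\int \frac{1}{h^{d}} K\!\left(\frac{x - z}{h}\right) p(z)\, dz ,
\qquad
h \to 0, \quad N h^{d} \to \infty .
```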

28
Probability Density Estimation
  • As long as the kernel behaves sensibly in the
    $N \to \infty$ limit, any kernel will do. In practice, the
    most commonly used kernel is the product of 1-D
    Gaussians, one for each dimension i.
  • One advantage of the PDE approximation is that it
    contains very few adjustable parameters:
    basically, the smoothing parameters.
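
A minimal sketch of a probability density estimate with a product of 1-D Gaussian kernels, one smoothing parameter per dimension. The function name `pde`, the 2-D normal toy sample, and the value h = 0.3 are illustrative assumptions.

```python
import numpy as np

def pde(x, sample, h):
    """Kernel density estimate with a product of 1-D Gaussians.

    x: (m, d) query points; sample: (N, d) training points;
    h: (d,) smoothing parameters, one per dimension.
    """
    x, sample, h = (np.asarray(v, dtype=float) for v in (x, sample, h))
    z = (x[:, None, :] - sample[None, :, :]) / h          # shape (m, N, d)
    kernels = np.exp(-0.5 * z**2) / (h * np.sqrt(2.0 * np.pi))
    return kernels.prod(axis=2).mean(axis=1)              # average over the N kernels

rng = np.random.default_rng(4)
sample = rng.normal(size=(5000, 2))                # toy 2-D training sample
grid = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 2.0]])
est = pde(grid, sample, h=[0.3, 0.3])
exact = np.exp(-0.5 * (grid**2).sum(axis=1)) / (2.0 * np.pi)
print(est, exact)
```

The only adjustable quantities are the entries of `h`; shrinking them makes the estimate spikier, enlarging them washes out structure.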

29
Example 3: Tagging Jets
Projections of estimated $p(T \mid x)$ (black curve)
onto the $P_T$, $\eta$ and $\phi$ axes. Blue points: ratio of
blue to red histograms (see slide 25).
30
Example 3: Tagging Jets
Projections of data weighted by $p(T \mid x)$. Recovers
tagged density $p(x \mid T)$.
31
But, How Well Does It Work?
How well do the n-D model and the n-D data
agree? A thought (JL, HBP):
  • 1. Project the model and the data onto the same
    set of randomly directed rays through the origin.
  • 2. Compute some measure of discrepancy for each
    pair of projections.
  • 3. Do something sensible with this set of numbers!!
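
The first two steps can be sketched as follows. The talk leaves the discrepancy measure open; the two-sample Kolmogorov-Smirnov distance used here is one possible choice, and representing the model by a sample drawn from it is a further assumption. All sample sizes and the 3-D Gaussian toys are illustrative.

```python
import numpy as np

def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov distance between 1-D samples a and b."""
    a, b = np.sort(a), np.sort(b)
    pts = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, pts, side="right") / a.size
    cdf_b = np.searchsorted(b, pts, side="right") / b.size
    return np.max(np.abs(cdf_a - cdf_b))

def ray_discrepancies(model, data, n_rays=20, seed=0):
    """Project both n-D samples onto the same random rays; one KS value per ray."""
    rng = np.random.default_rng(seed)
    rays = rng.normal(size=(n_rays, model.shape[1]))
    rays /= np.linalg.norm(rays, axis=1, keepdims=True)   # unit direction vectors
    return np.array([ks_distance(model @ r, data @ r) for r in rays])

rng = np.random.default_rng(5)
model = rng.normal(size=(3000, 3))            # sample drawn from the model (toy)
same = rng.normal(size=(3000, 3))             # data that agree with the model
shifted = rng.normal(size=(3000, 3)) + 0.5    # data that disagree
d_same = ray_discrepancies(model, same)
d_shift = ray_discrepancies(model, shifted)
print(d_same.max(), d_shift.max())
```

What to do with the resulting set of numbers (step 3) is left open here, as it is in the talk.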
32
But, How Well Does It Work?
Projections of $p(T \mid x)$ onto 3 randomly chosen rays
through the origin.
33
But, How Well Does It Work?
Projections of weighted tagged + untagged data
onto the 3 randomly selected rays.
34
Summary
  • Multivariate methods have been applied with
    considerable success in particle physics,
    especially for classification. However, there is
    considerable room for improving our understanding
    of them as well as expanding their domain of
    application.
  • The main challenge is data/model comparison when
    each datum is a point in 120 dimensions. During
    the SAMSI workshop we hope to make some progress
    on the use of projections onto multiple rays.
    This may be an interesting area for collaboration
    between physicists and statisticians.