Perspective on Astrostatistics from Particle Physics
1
Perspective on Astrostatistics from Particle
Physics
  • Louis Lyons
  • Particle Physics, Oxford
  • (CDF experiment, Fermilab)
  • l.lyons@physics.ox.ac.uk

  • SCMA IV

  • Penn State

  • 15th June 2006

2
Topics
  • Basic Particle Physics analyses
  • Similarities between Particles and Astrophysics
    issues
  • Differences
  • What Astrophysicists do particularly well
  • What Particle Physicists have learnt
  • Conclusions

3
Particle Physics
  • What it is
  • Typical experiments
  • Typical data
  • Typical analysis

4
(No Transcript)
5
Typical Experiments

  Experiment    Energy      Beams       Events           Result
  LEP           200 GeV     e+ e-       10^7 Z           Nν = 2.987 ± 0.008
  BaBar/Belle   10 GeV      e+ e-       10^8 B anti-B    CP-violation
  Tevatron      2000 GeV    p anti-p    10^14            SUSY?
  LHC (2007)    14000 GeV   p p                          Higgs?
  K2K           3 GeV       νµ          ~100             ν oscillations

6
(No Transcript)
7
K2K (from KEK to Kamioka): Long-Baseline Neutrino Oscillation Experiment
8
CDF at Fermilab
9
Typical Analysis
  • Parameter determination: dn/dt = (1/τ) exp(−t/τ)
  • Worry about backgrounds, t resolution, t-dependent efficiency
  • 1) Reconstruct tracks
  • 2) Select real events
  • 3) Select wanted events
  • 4) Extract t from decay length L and velocity v
  • 5) Model signal and background
  • 6) Likelihood fit for lifetime and statistical error
  • 7) Estimate systematic error
  • Result: τ ± στ (stat) ± στ (syst)
  • Goodness of fit
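Steps 6) and 7) can be sketched in code. Below is a minimal illustration (not CDF's actual fitting machinery) of an unbinned likelihood fit for a lifetime, assuming a pure exponential with no background, resolution, or efficiency effects; the 68% statistical error comes from scanning for where −ln L rises by 0.5:

```python
import math
import random

random.seed(1)
TRUE_TAU = 2.0
N = 5000
# Toy "selected events": decay times drawn from dn/dt = (1/tau) exp(-t/tau)
times = [random.expovariate(1.0 / TRUE_TAU) for _ in range(N)]

def nll(tau, data):
    """Negative log-likelihood for the exponential lifetime pdf."""
    return len(data) * math.log(tau) + sum(data) / tau

# For a pure exponential, the maximum-likelihood estimate is the sample mean
tau_hat = sum(times) / len(times)
nll_min = nll(tau_hat, times)

# 68% statistical error: scan outwards until -ln L has risen by 0.5
lo = hi = tau_hat
while nll(lo, times) - nll_min < 0.5:
    lo -= 0.001
while nll(hi, times) - nll_min < 0.5:
    hi += 0.001

print(f"tau = {tau_hat:.3f}, 68% interval [{lo:.3f}, {hi:.3f}]")
```

With backgrounds and resolution folded in, the pdf and hence `nll()` would change, but the scan for where −ln L rises by 0.5 stays the same.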

10
Typical Analysis
Hypothesis testing: peak, or statistical fluctuation?
  • Why 5σ?
  • Past experience
  • Look-elsewhere effect
  • Bayes priors

11
Similarities
  • Large data sets: ATLAS event ~ Mbyte; total ~10 Pbytes
  • Experimental resolution
  • Acceptance
  • Systematics
  • Separating signal from background
  • Parameter estimation
  • Testing models (versus alternative?)
  • Search for signals Setting limits or Discovery
  • SCMA and PHYSTAT

12
Differences
  • Bayes or Frequentism?
  • Background
  • Specific Astrophysics issues
  • Time dependence
  • Spatial structures
  • Correlations
  • Non-parametric methods
  • Visualisation
  • Cosmic variance
  • Blind analyses

13
Bayesian versus Frequentism

                            Bayesian                             Frequentist
  Basis of method           Bayes' Theorem → posterior           Uses pdf for data, for fixed parameters
                            probability distribution
  Meaning of probability    Degree of belief                     Frequentist definition
  Prob of parameters?       Yes                                  Anathema
  Needs prior?              Yes                                  No
  Choice of interval?       Yes                                  Yes (except Feldman-Cousins)
  Data considered           Only data you have                   ... + other possible data
  Likelihood principle?     Yes                                  No
14
Bayesian versus Frequentism

                            Bayesian                             Frequentist
  Ensemble of experiments   No                                   Yes (but often not explicit)
  Final statement           Posterior probability distribution   Parameter values: data is likely
  Unphysical/empty ranges   Excluded by prior                    Can occur
  Systematics               Integrate over prior                 Extend dimensionality of frequentist construction
  Coverage                  Unimportant                          Built-in
  Decision making           Yes (uses cost function)             Not useful


15
Bayesianism versus Frequentism

"Bayesians address the question everyone is interested in, by using assumptions no-one believes."

"Frequentists use impeccable logic to deal with an issue of no interest to anyone."
16
Differences
  • Bayes or Frequentism?
  • Background
  • Specific Astrophysics issues
  • Time dependence
  • Spatial structures
  • Correlations
  • Non-parametric methods
  • Visualisation
  • Cosmic variance
  • Blind analyses

17
What Astrophysicists do well
  • Glorious pictures ?
  • Scientists and Statisticians working together
  • Sharing data
  • Making data publicly available
  • Dealing with large data sets
  • Visualisation
  • Funding for Astrostatistics
  • Statistical software

18
Whirlpool Galaxy
Width of Z0 → 3 light neutrino species
19
What Astrophysicists do well
  • Glorious pictures
  • Sharing data
  • Making data publicly available
  • Dealing with large data sets
  • Visualisation
  • Funding for Astrostatistics
  • Statistical software

20
What Particle Physicists now know
  • Δ(ln L) = 0.5 rule
  • Unbinned Lmax and goodness of fit
  • Prob(data | hypothesis) ≠ Prob(hypothesis | data)
  • Comparing 2 hypotheses
  • Δ(χ²) ≠ χ²
  • Bounded parameters: Feldman and Cousins
  • Use correct L (Punzi effect)
  • Blind analyses

21
Δ ln L = −1/2 rule
If L(µ) is Gaussian, the following definitions of σ are equivalent:
  1) RMS of L(µ)
  2) 1/√(−d² ln L/dµ²)
  3) ln L(µ0 ± σ) = ln L(µ0) − 1/2
If L(µ) is non-Gaussian, these are no longer the same.
"Procedure 3) above still gives an interval that contains the true value of
parameter µ with 68% probability" (even a Gaussian L(µ) might not).
Heinrich, CDF note 6438 (see CDF Statistics Committee web page); Barlow, PHYSTAT05
22
COVERAGE
How often does the quoted range for a parameter include the parameter's true value?
N.B. Coverage is a property of the METHOD, not of a particular experimental result.
Coverage can vary with µ.
Study: coverage of different methods for a Poisson parameter µ, from an observation
of the number of events n. Hope: coverage C(µ) = 100% of the nominal value.
23
COVERAGE
If C(µ) = nominal for all true µ: correct coverage
If C(µ) < nominal for some µ: undercoverage (this is serious!)
If C(µ) > nominal for some µ: overcoverage; conservative, loss of rejection power
24
Coverage: L approach (not frequentist)
P(n; µ) = e^(−µ) µ^n / n!  (Joel Heinrich, CDF note 6438)
Interval: −2 ln λ < 1, where λ = P(n, µ) / P(n, µ_best)
UNDERCOVERS
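The undercoverage can be checked directly. The sketch below (an illustration, not Heinrich's note itself) computes the exact coverage of the −2 ln λ < 1 likelihood-ratio interval by summing Poisson probabilities over the accepted n at each true µ:

```python
import math

def log_pois(n, mu):
    # log of the Poisson probability e^-mu mu^n / n!
    if mu == 0.0:
        return 0.0 if n == 0 else float("-inf")
    return -mu + n * math.log(mu) - math.lgamma(n + 1)

def accepted(n, mu):
    """True if mu lies in the interval -2 ln[P(n,mu)/P(n,mu_best)] < 1 (mu_best = n)."""
    return -2.0 * (log_pois(n, mu) - log_pois(n, float(n))) < 1.0

def coverage(mu, n_max=200):
    """Probability that the quoted interval contains the true mu."""
    return sum(math.exp(log_pois(n, mu)) for n in range(n_max) if accepted(n, mu))

for mu in (0.2, 0.51, 1.0, 3.0, 10.0):
    print(f"mu = {mu:5.2f}   coverage = {coverage(mu):.3f}")
```

The printout shows coverage oscillating with µ and dipping far below a nominal 68.3% at some values (e.g. just above µ = 0.5, where n = 0 drops out of the acceptance), which is the sense in which this interval undercovers.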
25
Frequentist central intervals NEVER undercover
(but are conservative at both ends)
26
  • (n − µ)²/µ ≤ 0.1 → 24.8% coverage?
  • NOT frequentist: coverage varies from 0% to 100%

27
Unbinned Lmax and Goodness of Fit?
Find the parameters by maximising L.
So a larger L is better than a smaller L.
So Lmax gives the goodness of fit??
Great? Good? Bad?
[Figure: Monte Carlo distribution of unbinned Lmax (frequency vs Lmax)]
28

Unbinned Lmax and Goodness of Fit?
  • Not necessarily.
  • pdf(data, params): params fixed, data vary
  • L(params, data): data fixed, params vary
  • e.g. p(t; λ) = λ exp(−λt)
    As a pdf of t: maximum at t = 0
    As a likelihood in λ: maximum at λ = 1/t
[Figure: p vs t, and L vs λ]
29
Lmax and Goodness of Fit?
Example 1: Fit an exponential to times t1, t2, t3 ... (Joel Heinrich, CDF 5639)
ln Lmax = −N(1 + ln t_av)
i.e. it depends only on the AVERAGE t, but is INDEPENDENT OF THE DISTRIBUTION
of t (except for ...). (The average t is a sufficient statistic.)
Variation of Lmax in Monte Carlo is due to variations in the samples' average t,
but NOT to better or worse fit.
[Figure: two different pdfs with the same average t give the same Lmax]
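The sufficiency of the average can be seen directly. A tiny sketch, with two made-up samples of equal mean: one vaguely exponential-looking, one obviously not, yet their ln Lmax values are identical.

```python
import math

def ln_lmax(times):
    """ln L at the fitted lifetime tau_hat = mean(t), for an exponential pdf."""
    n = len(times)
    tau_hat = sum(times) / n
    # equals -n * (1 + ln(mean(t)))
    return -n * math.log(tau_hat) - sum(times) / tau_hat

good_shape = [0.11, 0.29, 0.65, 1.20, 2.75]  # roughly exponential-looking
bad_shape = [1.0, 1.0, 1.0, 1.0, 1.0]        # clearly NOT exponential
# Both samples have mean t = 1.0, so ln Lmax is the same for both,
# even though one is a terrible fit to an exponential shape.
print(ln_lmax(good_shape), ln_lmax(bad_shape))
```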


30

Example 2: Lmax and Goodness of Fit?
pdf ∝ 1 + a cos²θ
The pdf (and the likelihood) depends only on cos²θ_i, and so is insensitive to
the sign of cosθ_i. The data can therefore be in very bad agreement with the
expected distribution (e.g. all data with cosθ < 0), and Lmax does not know
about it.
Example of a general principle.
31
Lmax and Goodness of Fit?
Example 3: Fit to a Gaussian with variable µ, fixed σ:
ln Lmax = N(−0.5 ln 2π − ln σ) − 0.5 Σ(x_i − x_av)²/σ²
        = constant − 0.5 N variance(x)/σ²
i.e. Lmax depends only on variance(x), which is not relevant for fitting µ
(µ_est = x_av). A smaller-than-expected variance(x) results in a larger Lmax.
[Figure: a sample with a worse fit but smaller spread has larger Lmax;
a better fit with larger spread has lower Lmax]
32
Transformation properties of pdf and L
  • Lifetime example: dn/dt = λ exp(−λt)
  • Change observable from t to y = √t
  • dn/dy = (dn/dt)(dt/dy) = 2yλ exp(−λy²)
  • So (a) the pdf CHANGES, but
  • (b) ∫ (dn/dt) dt = ∫ (dn/dy) dy over corresponding ranges,
  • i.e. corresponding integrals of the pdf are INVARIANT

33
Now for the likelihood. When the parameter changes from λ to τ = 1/λ:
(a) L does not change: dn/dt = (1/τ) exp(−t/τ), and so L_τ(τ; t) = L_λ(1/τ; t),
because identical numbers occur in the evaluations of the two L's.
BUT (b) ∫ L_τ dτ ≠ ∫ L_λ dλ over corresponding ranges.
So it is NOT meaningful to integrate L. (However, ...)
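Point (b) can be made numeric. A sketch with two illustrative observations t1 = t2 = 1, so that L(λ) = λ² e^(−2λ) and L(τ) = τ^(−2) e^(−2/τ): if integrating L were meaningful, the median of the "normalised L" in τ would sit at 1 over the median in λ. It does not.

```python
import math

# Likelihood for two observed decay times t1 = t2 = 1 (sum of times = 2):
def L_lam(lam):
    return lam ** 2 * math.exp(-2.0 * lam)      # in terms of the rate lam

def L_tau(tau):
    return math.exp(-2.0 / tau) / tau ** 2      # same numbers, tau = 1/lam

def median_of_normalised(f, lo, hi, steps=400_000):
    """Treat f as an (unnormalised) density on [lo, hi]; return its median."""
    dx = (hi - lo) / steps
    xs = [lo + (i + 0.5) * dx for i in range(steps)]
    total = sum(f(x) for x in xs) * dx
    acc = 0.0
    for x in xs:
        acc += f(x) * dx
        if acc >= 0.5 * total:
            return x
    return hi

m_lam = median_of_normalised(L_lam, 1e-6, 200.0)
m_tau = median_of_normalised(L_tau, 1e-6, 200.0)
# If integrating L were meaningful, m_tau would equal 1/m_lam; it does not.
print(m_lam, 1.0 / m_lam, m_tau)
```

The λ median comes out near 1.34 (so 1/m_lam is about 0.75), while the τ median is near 2.9: the "intervals from integrating L" in the two metrics are inconsistent, because no Jacobian connects them.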
34
  • CONCLUSION
  • Integrating L is NOT a recognised statistical procedure
  • It is metric dependent:
    the τ range agrees with τ_pred, but the λ range is inconsistent with 1/τ_pred
  • BUT
  • Could regard it as a black box
  • Make it respectable via L → Bayes posterior:
    Posterior(λ) ∝ L(λ) × Prior(λ), and Prior(λ) can be constant

35
                       pdf(t; λ)                                L(λ; t)
  Value of function    Changes when observable is transformed   INVARIANT under transformation of parameter
  Integral of function INVARIANT under transformation of        Changes when parameter is transformed
                       observable
  Conclusion           Max prob density not very sensible       Integrating L not very sensible
36
P(Data | Theory) ≠ P(Theory | Data)
Theory = male or female. Data = pregnant or not pregnant.
P(pregnant | female) ~ 3%
37
P(Data | Theory) ≠ P(Theory | Data)
Theory = male or female. Data = pregnant or not pregnant.
P(pregnant | female) ~ 3%, but P(female | pregnant) >>> 3%
38
P(Data | Theory) ≠ P(Theory | Data)
HIGGS SEARCH at CERN
Is the data consistent with the Standard Model, or with Standard Model + Higgs?
End of Sept 2000: data not very consistent with the S.M.
Prob(Data | S.M.) < 1%: a valid frequentist statement.
Turned by the press into Prob(S.M. | Data) < 1%, and therefore
Prob(Higgs | Data) > 99%, i.e. "it is almost certain that the Higgs has been seen".
39
p-value ≠ Prob of hypothesis being correct
  • Given data and the null hypothesis H0:
  • Construct a statistic T (e.g. χ²)
  • p-value = probability of T ≥ t_observed, assuming H0 is true
  • If p = 10⁻³, what is the probability that H0 is true?
  • e.g. try to identify a µ in a beam (H0: particle = µ) with π contamination
  • Prob(H0) depends on:
  • a) the similarity of the µ and π masses
  • b) the relative populations of µ and π
  • If N(π) = N(µ), prob(H0) ≈ 0.5
  • If N(π) >> N(µ), prob(H0) → 0
  • If N(π) = 10 N(µ), prob(H0) ≈ 0.1
  • i.e. prob(H0) varies with the p-value, but is not equal to it
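The µ/π example can be made concrete with a two-line Bayes computation (illustrative numbers, not from any real experiment): the same measurement, and hence the same p-value, gives different probabilities for H0 depending on the beam composition.

```python
def prob_h0(prior_mu, likelihood_ratio):
    """Posterior probability that the particle is a mu.

    likelihood_ratio = p(data | mu) / p(data | pi); prior_mu is the beam
    fraction N(mu) / (N(mu) + N(pi)).
    """
    prior_pi = 1.0 - prior_mu
    num = prior_mu * likelihood_ratio
    return num / (num + prior_pi)

# Same measurement (same likelihood ratio, hence same p-value) ...
LR = 1.0  # illustrative: masses so similar the data barely discriminate
print(prob_h0(0.5, LR))        # N(pi) = N(mu):    P(H0) = 0.5
print(prob_h0(1.0 / 11.0, LR)) # N(pi) = 10 N(mu): P(H0) ~ 0.09
```

Varying `prior_mu` while holding the likelihood ratio fixed is exactly the slide's point: the p-value stays put while prob(H0) moves.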

40
p-value ≠ Prob of hypothesis being correct
  • After-Conference-Banquet speech:
  • "Of those results that have been quoted as significant at the 99% level,
    about half have turned out to be wrong!"
  • Supposed to be funny, but in fact it is perfectly OK

41
PARADOX
  • Histogram with 100 bins
  • Fit 1 parameter
  • S_min = χ² with NDF = 99 (expected χ² = 99 ± 14)
  • For our data, S_min(p0) = 90
  • Is p1 acceptable if S(p1) = 115?
  • YES: a very acceptable χ² probability
  • NO: σ_p is defined by S(p0 ± σ_p) = S_min + 1 = 91,
    but S(p1) − S(p0) = 25,
    so p1 is 5σ away from the best value
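Both answers in the paradox can be checked numerically. A sketch that estimates the goodness-of-fit probability by Monte Carlo (so no special chi-square library is needed) and then the parameter-level significance:

```python
import math
import random

random.seed(42)

# Part 1: is S = 115 an acceptable chi^2 for 99 degrees of freedom?
# Estimate P(chi2_99 >= 115) as the fraction of simulated chi^2 values above it.
n_trials = 10000
exceed = sum(
    1 for _ in range(n_trials)
    if sum(random.gauss(0.0, 1.0) ** 2 for _ in range(99)) >= 115.0
)
p_gof = exceed / n_trials
print(f"P(chi2_99 >= 115) ~ {p_gof:.3f}")  # a perfectly ordinary fit (~13%)

# Part 2: but as a statement about the PARAMETER, Delta S = 115 - 90 = 25
n_sigma = math.sqrt(115.0 - 90.0)
print(f"p1 is {n_sigma:.0f} sigma from the best-fit value")
```

Both statements are true at once: p1 gives an acceptable description of the data, yet is hugely disfavoured relative to p0.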

42
(No Transcript)
43
(No Transcript)
44
(No Transcript)
45
(No Transcript)
46
Comparing data with different hypotheses
47
χ² with ν degrees of freedom?
  • 1) ν = (number of data points) − (number of free parameters)?
  • Why only asymptotically (apart from Poisson → Gaussian)?
  • a) Fit a flattish histogram with
    y = N (1 + 10⁻⁶ cos(x − x0)), x0 a free parameter
  • b) Neutrino oscillations: almost degenerate parameters
    y ≈ 1 − A sin²(1.27 Δm² L/E)   (2 parameters)
      ≈ 1 − A (1.27 Δm² L/E)²     (effectively 1 parameter, for small Δm²)

48
χ² with ν degrees of freedom?
  • 2) Is the difference in χ² distributed as χ²?
  • H0 is true. Also fit H1, with k extra parameters,
  • e.g. look for a Gaussian peak on top of a smooth background:
    y = C(x) + A exp{−0.5 ((x − x0)/σ)²}
  • Is χ²(H0) − χ²(H1) distributed as χ² with ν = k = 3?
  • Relevant for assessing whether an enhancement in the data is just a
    statistical fluctuation, or something more interesting
  • N.B. Under H0 (y = C(x)): A = 0 (boundary of the physical region),
    and x0 and σ are undefined

49
Is the difference in χ² distributed as χ²?
Demortier: H0 = quadratic background; H1 = + Gaussian of fixed width
  • Protassov, van Dyk, Connors, ...:
  • H0 = continuum
  • H1 = narrow emission line
  • H1 = wider emission line
  • H1 = absorption line
  • Nominal significance level 5%

50
Is the difference in χ² distributed as χ²?
  • So we need to determine the Δχ² distribution by Monte Carlo
  • N.B.:
  • Determining Δχ² for hypothesis H1 when the data are generated according to
    H0 is not trivial, because there will be lots of local minima
  • If we are interested in a 5σ significance level, this needs lots of MC
    simulations
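A stripped-down sketch of such a Monte Carlo. To keep the fit analytic it fixes the peak position and width (so only the amplitude A remains, with the physical boundary A ≥ 0), rather than the full 3-extra-parameter problem with its local minima. Even in this simplest case, Δχ² is visibly not a χ²₁ variable: under H0 about half the experiments land on the A = 0 boundary with Δχ² exactly zero.

```python
import math
import random

random.seed(7)

# Fixed-shape "peak" template on 20 bins; unit Gaussian errors on each bin.
nbins = 20
peak = [math.exp(-0.5 * ((i - 10.0) / 2.0) ** 2) for i in range(nbins)]

def delta_chi2(y):
    """chi2(H0: y = c) - chi2(H1: y = c + A*peak, with A >= 0)."""
    n = float(nbins)
    sy, sp = sum(y), sum(peak)
    spp = sum(p * p for p in peak)
    spy = sum(p * yi for p, yi in zip(peak, y))
    # H0: the best constant is the mean
    c0 = sy / n
    chi2_h0 = sum((yi - c0) ** 2 for yi in y)
    # H1: unconstrained least squares for (c, A), then impose A >= 0
    A = (n * spy - sp * sy) / (n * spp - sp * sp)
    if A < 0.0:
        return 0.0  # boundary: H1 collapses onto H0, Delta chi2 = 0
    c1 = (sy - A * sp) / n
    chi2_h1 = sum((yi - c1 - A * p) ** 2 for yi, p in zip(y, peak))
    return chi2_h0 - chi2_h1

# Generate many experiments under H0 (flat background only)
trials = [delta_chi2([random.gauss(5.0, 1.0) for _ in range(nbins)])
          for _ in range(20000)]
frac_zero = sum(1 for d in trials if d == 0.0) / len(trials)
print(f"fraction with Delta chi2 = 0: {frac_zero:.3f}")  # roughly one half
```

With x0 and σ also free, each toy experiment additionally needs a global search over the local minima, which is what makes a 5σ determination so expensive.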

51
(No Transcript)
52
(No Transcript)
53
(No Transcript)
54
(No Transcript)
55
(No Transcript)
56
x_obs = −2 now gives an upper limit
57
(No Transcript)
58
(No Transcript)
59
Getting L wrong Punzi effect
  • Giovanni Punzi _at_ PHYSTAT2003
  • Comments on L fits with variable resolution
  • Separate two close signals, when resolution s
    varies event by event, and is different for 2
    signals
  • e.g. 1) Signal 1 1cos2?
  • Signal 2 Isotropic
  • and different parts of detector give
    different s
  • 2) M (or t)
  • Different numbers of tracks ?
    different sM (or st)

60
Punzi Effect
Events characterised by xi and si A events
centred on x 0 B events centred on x
1 L(f)wrong ? f G(xi,0,si) (1-f)
G(xi,1,si) L(f)right ? fp(xi,siA) (1-f)
p(xi,siB)
p(S,T) p(ST) p(T)
p(xi,siA) p(xisi,A) p(siA)
G(xi,0,si)
p(siA) So L(f)right ?f G(xi,0,si) p(siA)
(1-f) G(xi,1,si) p(siB) If p(sA)
p(sB), Lright Lwrong but NOT otherwise
61
Punzi Effect
  • Punzi's Monte Carlo for A: G(x, 0, σ_A), B: G(x, 1, σ_B), f_A = 1/3

                              L_wrong              L_right
    σ_A      σ_B              f_A        σ_f       f_A        σ_f
    1.0      1.0              0.336(3)   0.08      Same
    1.0      1.1              0.374(4)   0.08      0.333(0)   0
    1.0      2.0              0.645(6)   0.12      0.333(0)   0
    1 → 2    1.5 → 3          0.514(7)   0.14      0.335(2)   0.03
    1.0      1 → 2            0.482(9)   0.09      0.333(0)   0

  • 1) L_wrong is OK for p(σ|A) = p(σ|B), but otherwise BIASSED
  • 2) L_right is unbiassed, but L_wrong is biassed (enormously)!
  • 3) L_right gives a smaller σ_f than L_wrong
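The σ_A = 1, σ_B = 2 row of the table is easy to reproduce in miniature. The sketch below is a much smaller MC than Punzi's, with a simple grid search in place of a real minimiser; because σ here takes only the two values 1 and 2, p(σ|A) and p(σ|B) are delta functions, and the "right" fit reduces to counting the σ = 1 events.

```python
import math
import random

random.seed(0)

def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

# A events: x ~ G(0, sigma=1); B events: x ~ G(1, sigma=2); true f_A = 1/3
N = 4000
events = []
for _ in range(N):
    if random.random() < 1.0 / 3.0:
        events.append((random.gauss(0.0, 1.0), 1.0))  # (x_i, sigma_i) for A
    else:
        events.append((random.gauss(1.0, 2.0), 2.0))  # (x_i, sigma_i) for B

def nll_wrong(f):
    """L_wrong: ignores p(sigma|A) != p(sigma|B); both terms use the event's sigma."""
    return -sum(math.log(f * gauss_pdf(x, 0.0, s) + (1 - f) * gauss_pdf(x, 1.0, s))
                for x, s in events)

# Wrong L: grid-search the minimum of the negative log-likelihood
f_wrong = min((k / 200.0 for k in range(1, 200)), key=nll_wrong)

# Right L: sigma_i identifies the class exactly here, so the fit is a count
f_right = sum(1 for _, s in events if s == 1.0) / N

print(f"f_wrong = {f_wrong:.3f}   f_right = {f_right:.3f}")
```

`f_right` comes out near the true 1/3 while `f_wrong` is pulled far upward, in line with the 0.645 entry of the table.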

62
Explanation of the Punzi bias
σ_A = 1, σ_B = 2
[Figure: actual distribution vs fitting function, for A events (σ = 1) and
B events (σ = 2), as functions of x]
N_A/N_B is variable in the fit, but the same value is used for A and B events.
The fit gives an upward bias for N_A/N_B because (i) that is much better for
the A events, and (ii) it does not hurt too much for the B events.
63
Another scenario for the Punzi problem: PID
A = π, B = K; variables M, TOF
Originally: the positions of the peaks are constant;
the K peak approaches the π peak at large momentum.
  • σ_i variable, (σ_i)_A ≠ (σ_i)_B
  • σ_i constant, p_K ≠ p_π
COMMON FEATURE: Separation / Error ≠ constant
Where else??
MORAL: Beware of event-by-event variables whose pdfs do not appear in L
64
Avoiding the Punzi Bias
  • Include p(σ|A) and p(σ|B) in the fit
  • (But then, for example, particle identification may be determined more by
    the momentum distribution than by the PID variable)
  • OR
  • Fit each range of σ_i separately, and add: Σ_i (N_A)_i → (N_A)_total,
    and similarly for B
  • The incorrect method using L_wrong uses a weighted average of the (f_A)_j,
    assumed to be independent of j
  • Talk by Catastini at PHYSTAT05

65
BLIND ANALYSES
  • Why blind analysis? Selections, corrections,
    method
  • Methods of blinding
  • Add random number to result
  • Study procedure with simulation only
  • Look at only first fraction of data
  • Keep the signal box closed
  • Keep MC parameters hidden
  • Keep fraction visible for each bin hidden
  • After the analysis is unblinded, ...
  • Luis Alvarez's suggestion re the discovery of free
    quarks

66
Conclusions
  • Common problems: scope for learning from each other
  • Large data sets
  • Separating signal
  • Testing models
  • Signal / Discovery
  • Targetted workshops, e.g.:
  • SAMSI: Jan-May 2006; Banff: July 2006
    (limits with nuisance params; significance tests; signal-bgd separation)
  • Summer schools: Spain, FNAL, ...
  • Thanks to Steffen Lauritzen, Lance Miller, Subir
    Sarkar, SAMSI, Roberto Trotta.

67
  • Excellent Conference
  • Very big THANK YOU
  • to Jogesh and Eric !