1
Probability: Frequentist versus Bayesian, and why it matters
  • Roger Barlow
  • Manchester University
  • 11th February 2008

2
Outline
  • Different definitions of probability: frequentist and Bayesian
  • Measurements: the definitions usually give the same results
  • Differences in dealing with:
  • Non-Gaussian measurements
  • Small-number counting
  • Constrained parameters
  • Difficulties for both
  • Conclusions and recommendations

3
What is Probability?
  • A is some possible event or fact.
  • What is P(A)?
  • Classical: an intrinsic property
  • Frequentist: the limit as N → ∞ of N(A)/N
  • Bayesian: my degree of belief in A

What do we mean by P(A)?
4
Classical (Laplace and others)
  • Symmetry factor
  • Coin: 1/2
  • Cards: 1/52
  • Dice: 1/6
  • Roulette: 1/37
  • Equally likely outcomes

"The probability of an event is the ratio of the number of cases favourable to it, to the number of all cases possible, when nothing leads us to expect that any one of these cases should occur more than any other, which renders them, for us, equally possible." (Théorie analytique des probabilités)
Extend to more complicated systems of several
coins, many cards, etc.
5
Classical Probability breaks down
  • Can't handle continuous variables
  • Bertrand's paradox: given a circle with an inscribed equilateral triangle, if we draw a chord at random, what is the probability that it is longer than the side of the triangle?
  • Answer: 1/3 or 1/2 or 1/4

Cannot enumerate equally likely cases in a unique way
6
Frequentist Probability (von Mises, Fisher)
(Diagram: the ensemble of everything, with A as a subset)
  • Limit of frequency
  • P(A) = the limit as N → ∞ of N(A)/N
  • This was a property of the classical definition, now promoted to become a definition itself
  • P(A) depends not just on A but on the ensemble, which must be specified.
  • This leads to two surprising features

7
Feature 1: There can be many ensembles
  • Probabilities belong to the event and the ensemble
  • Insurance company data shows P(death) for 40-year-old male clients is 1.4% (example due to von Mises)
  • Does this mean a particular 40-year-old German has a 98.6% chance of reaching his 41st birthday?
  • No. He belongs to many ensembles:
  • German insured males
  • German males
  • Insured nonsmoking vegetarians
  • German insured male racing drivers
  • Each of these gives a different number. All equally valid.

8
Feature 2: Unique events have no ensemble
  • Some events are unique. Consider: "It will probably rain tomorrow."
  • There is only one tomorrow (Tuesday 12th February). There is NO ensemble. P(rain) is either 0/1 = 0 or 1/1 = 1
  • Strict frequentists cannot say "It will probably rain tomorrow".
  • This presents severe social problems.

9
Circumventing the limitation
  • A frequentist can say: "The statement 'It will rain tomorrow' has a 70% probability of being true", by assembling an ensemble of statements and ascertaining that 70% (say) are true.
  • (E.g. weather forecasts with a verified track record)
  • Say "It will rain tomorrow" with 70% confidence
  • For unique events, confidence-level statements replace probability statements.

10
Bayesian (Subjective) Probability
  • P(A) is a number describing my degree of belief in A
  • 1 = certain belief; 0 = total disbelief
  • Can be calibrated against simple classical probabilities.
  • P(A) = 0.5 means I would be indifferent given the choice of betting on A or betting on a coin toss.
  • A can be anything: death, rain, horse races, the existence of SUSY
  • Very adaptable. But there is no guarantee my P(A) is the same as your P(A). Subjective = unscientific?

11
Bayes Theorem
  • General (uncontroversial) form:
  • P(A|B) P(B) = P(A and B) = P(B|A) P(A)
  • P(A|B) = P(B|A) P(A) / P(B)
  • P(B) can be written as P(B|A) P(A) + P(B|not A) (1 − P(A))
  • Examples:
  • People: P(Artist|Beard) = P(Beard|Artist) P(Artist) / P(Beard)
  • π/K Cherenkov counter: P(π|signal) = P(signal|π) P(π) / P(signal) = 0.9 × 0.5 / (0.9 × 0.5 + 0.01 × 0.5) = 0.989
  • Medical diagnosis: P(disease|symptom) = P(symptom|disease) P(disease) / P(symptom)
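A quick check of this arithmetic; a minimal sketch, where reading 0.9 as P(signal|π), 0.01 as P(signal|K) and 0.5 as the priors is my interpretation of the slide's numbers:

```python
# Bayes' theorem with a two-hypothesis denominator:
# P(A|data) = P(data|A) P(A) / [P(data|A) P(A) + P(data|not A) (1 - P(A))]

def bayes_posterior(p_data_given_a, p_a, p_data_given_not_a):
    p_data = p_data_given_a * p_a + p_data_given_not_a * (1.0 - p_a)
    return p_data_given_a * p_a / p_data

# Cherenkov example: P(signal|pi) = 0.9, P(signal|K) = 0.01, P(pi) = 0.5
print(bayes_posterior(0.9, 0.5, 0.01))  # 0.989
```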
12
Bayes Theorem
  • Bayesian form:
  • P(Theory|Data) = P(Data|Theory) P(Theory) / P(Data)
  • Theory may be an event (e.g. rain tomorrow)
  • Or a parameter value (e.g. the Higgs mass); then P(Theory) is a function
  • P(MH|Data) = P(Data|MH) P(MH) / P(Data)
  • P(Theory) is the prior; P(Theory|Data) is the posterior.
13
Measurements: Bayes at work
  • Result: value x. Theoretical true value: μ. P(μ|x) ∝ P(x|μ) P(μ)
  • The prior is generally taken as uniform
  • Ignore normalisation problems
  • Construct a theory of measurements: the prior of the second measurement is the posterior of the first
  • P(x|μ) is often Gaussian, but can be anything (Poisson, etc.)
  • For a Gaussian measurement and a uniform prior, get a Gaussian posterior

14
Aside: Objective Bayesian statistics
  • Attempt to lay down rules for the choice of prior
  • Uniform is not enough. Uniform in what?
  • Suggestion (Jeffreys): uniform in a variable for which the expected Fisher information ⟨−d² ln L/dθ²⟩ is constant (statisticians call this a flat prior).
  • Has not met with general agreement: different measurements of the same quantity have different objective priors

15
Measurement and Frequentist probability
  • MT = 174 ± 3 GeV. What does it mean?
  • For true value μ, the probability (density) for a result x is (for the usual Gaussian measurement)
  • P(x|μ,σ) = (1/σ√(2π)) exp(−(x−μ)²/2σ²)
  • For a given μ, the probability that x lies within μ ± σ is 68%. This does not mean that, for a given x, the inverse probability that μ lies within x ± σ is 68%
  • P(x|μ,σ) cannot be used as a probability for μ.
  • (It is called the likelihood function for μ, given x.)

MT = 174 ± 3 GeV. Is there a 68% probability that MT lies between 171 and 177 GeV? No. MT is unique: it is either in the range or outside. (Soon we'll know.) But x ± 3 does bracket MT 68% of the time: the statement "MT lies between 171 and 177 GeV" has a 68% probability of being true. MT lies between 171 and 177 GeV, with 68% confidence.
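A minimal simulation of the frequentist statement above: for any fixed true value μ, the random interval x ± σ brackets μ about 68% of the time (the value 174 below is purely illustrative):

```python
# Check that [x - sigma, x + sigma] brackets a fixed true mu ~68% of
# the time. The true value here is illustrative, not a claim about MT.
import random

mu, sigma, n_trials = 174.0, 3.0, 100_000
hits = sum(1 for _ in range(n_trials)
           if abs(random.gauss(mu, sigma) - mu) < sigma)
print(hits / n_trials)  # ~0.683
```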
16
Pause for breath
  • For Gaussian measurements of quantities with no constraints/objective prior knowledge, the same results are given by:
  • Frequentist confidence intervals
  • Bayesian posteriors from uniform priors
  • A frequentist and a simple Bayesian will report the same outcome from the same raw data, except one will say "confidence" and the other "probability". They mean something different, but such concerns can be left to the philosophers.

17
Frequentist confidence intervals beyond the simple Gaussian
  • Select a confidence level value CL and a strategy
  • From P(x|μ) choose a construction (functions x1(μ), x2(μ)) for which
  • P(x ∈ [x1(μ), x2(μ)]) ≥ CL for all μ
  • Given a measurement X, make the statement
  • μ ∈ [μLO, μHI] @ CL
  • where X = x2(μLO), X = x1(μHI)
  • (Neyman technique)

18
Confidence Belt
(Plot: confidence belt, measurement x horizontal, true value μ vertical)
Constructed horizontally, such that the probability of a result lying inside the belt is 68% (or whatever). Read vertically, using the measurement.
Example: proportional Gaussian, σ = 0.1μ, i.e. the apparatus measures with 10% accuracy. Result (say) 100.0: μLO = 90.91, μHI = 111.1
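A minimal sketch of inverting this belt, assuming the 68% central construction x1(μ) = 0.9μ, x2(μ) = 1.1μ that follows from σ = 0.1μ (the function name is mine):

```python
# Invert the Neyman belt at a measurement X: the quoted interval is
# [mu_LO, mu_HI] with X = x2(mu_LO) = 1.1*mu_LO and X = x1(mu_HI) = 0.9*mu_HI.

def neyman_interval(X, frac=0.1):
    return X / (1.0 + frac), X / (1.0 - frac)

print(neyman_interval(100.0))  # (90.909..., 111.111...), as on the slide
```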
19
Bayesian Proportional Gaussian
  • Likelihood function: C exp(−½(μ−100)²/(0.1μ)²)
  • Integration gives C = 0.03888
  • 68% (central) limits, with 16% of the posterior in each tail:
  • 92.6 and 113.8
  • Different techniques give different answers
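A numeric sketch of that integration (the grid range is mine; a flat prior in μ is assumed, as on the slide, with 16% cut from each tail):

```python
# Posterior ~ likelihood for a flat prior in mu; normalise numerically
# and read off the central 68% interval from the cumulative distribution.
import numpy as np

mu = np.linspace(60.0, 180.0, 200_001)
post = np.exp(-0.5 * ((mu - 100.0) / (0.1 * mu)) ** 2)
cdf = np.cumsum(post)
cdf /= cdf[-1]
print(mu[np.searchsorted(cdf, 0.16)],   # roughly 92.6
      mu[np.searchsorted(cdf, 0.84)])   # roughly 113.8
```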
20
Small number counting experiments
  • Poisson distribution: P(r|λ) = e^(−λ) λ^r / r!
  • For large λ can use the Gaussian approximation, but not for small λ
  • Frequentists: choose CL. Just use one curve to give an upper limit
  • The discrete observable makes smooth curves into ugly staircases
  • Observe n. Quote the upper limit λHI from solving
  • Σ(r=0..n) P(r|λHI) = Σ(r=0..n) e^(−λHI) λHI^r / r! = 1 − CL
  • Translation: n is small, so λ can't be very large. If the true value is λHI (or higher), then the chance of a result this small (or smaller) is only (1 − CL) (or less)

21
Frequentist Poisson Table
  • Upper limits
  • n     90%    95%    99%
  • 0     2.30   3.00   4.61
  • 1     3.89   4.74   6.64
  • 2     5.32   6.30   8.41
  • 3     6.68   7.75  10.05
  • 4     7.99   9.15  11.60
  • 5     9.27  10.51  13.11
  • .....

22
Bayesian limits from small number counts
  • P(r|λ) = e^(−λ) λ^r / r!
  • With a uniform prior this gives the posterior for λ
  • (Plot: posterior P(λ) for results r = 0, 1, 2 and 6)
  • Read off intervals...
23
Upper limits
  • Upper limit from n events:
  • ∫(0..λHI) e^(−λ) λ^n/n! dλ = CL
  • Repeated integration by parts:
  • Σ(r=0..n) e^(−λHI) λHI^r/r! = 1 − CL
  • Same as the frequentist limit
  • This is a coincidence! The lower-limit formula is not the same
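A numeric check of the coincidence: the regularised incomplete gamma function gammainc(n+1, x) equals the Bayesian integral on the left, while the Poisson CDF gives the frequentist sum:

```python
# For n = 3 the table gives a 90% upper limit of 6.68; the Bayesian
# flat-prior integral there is 0.90 and the frequentist sum is 0.10.
from scipy.special import gammainc
from scipy.stats import poisson

n, lam_hi = 3, 6.68
print(gammainc(n + 1, lam_hi))   # ~0.90  (Bayesian integral = CL)
print(poisson.cdf(n, lam_hi))    # ~0.10  (frequentist sum = 1 - CL)
```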

24
Result depends on the prior
  • Example: 90% CL limit from 0 events
  • Prior flat in μ: upper limit 2.30
  • Prior flat in √μ: upper limit 1.65
25
Which is right?
  • The Bayesian method is generally easier, conceptually and in practice
  • The frequentist method is truly objective. Bayesian probability is a personal degree of belief. This does not worry biologists, but should worry physicists.
  • Ambiguity appears in Bayesian results, as different priors give different answers, though with enough data these differences vanish
  • Checking for robustness under a change of prior is a standard statistical technique, generally ignored by physicists
  • A uniform prior is not a sufficient answer. Uniform in what?

26
Problems for Frequentists: adding a background
m = S + b
  • Frequentist method (b known, m measured, S wanted):
  • Find the range for m
  • Subtract b to get the range for S
  • Examples:
  • See 5 events, background 1.2
  • 95% upper limit: 10.5 → 9.3
  • See 5 events, background 5.1
  • 95% upper limit: 10.5 → 5.4
  • See 5 events, background 10.6
  • 95% upper limit: 10.5 → −0.1 ??
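A sketch of the three examples, reusing the Poisson upper-limit solver from above; the final subtraction is exactly the naive recipe being criticised:

```python
from scipy.optimize import brentq
from scipy.stats import poisson

def upper_limit(n, cl=0.95):
    return brentq(lambda lam: poisson.cdf(n, lam) - (1.0 - cl), 1e-9, 50.0)

for b in (1.2, 5.1, 10.6):
    print(b, round(upper_limit(5) - b, 1))   # 9.3, 5.4 and -0.1
```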

27
S < −0.1? What's going on?
  • If N < b we know that there is a downward fluctuation in the background. (Which happens.)
  • But there is no way of incorporating this information without messing up the ensemble
  • The really strict frequentist procedure is to go ahead and publish.
  • We know that 5% of 95% CL statements are wrong; this is one of them
  • Suppressing this publication would bias the global results

28
Similar problems
  • The expected number of events must be non-negative
  • The mass of an object must be non-negative
  • The mass-squared of an object must be non-negative
  • The Higgs mass from EW fits must be bigger than the LEP2 limit of 114 GeV
  • 3 solutions:
  • Publish a clearly crazy result
  • Use the Feldman-Cousins technique
  • Switch to Bayesian analysis

29
m = S + b for Bayesians
  • No problem!
  • The prior for m is uniform for m ≥ b (i.e. S ≥ 0), zero below
  • Multiply and normalise as before
  • Posterior ∝ Likelihood × Prior
  • Read off confidence levels by integrating the posterior
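A minimal sketch of this recipe for the earlier "see 5 events" examples; it assumes a prior flat in m above b, and uses the incomplete gamma function as the CDF of the truncated posterior e^(−m) m^n:

```python
# Posterior ~ exp(-m) m^n for m >= b, zero below; the 95% limit on
# S = m - b comes from the normalised posterior integral.
from scipy.optimize import brentq
from scipy.special import gammainc

def bayes_upper_limit_S(n, b, cl=0.95):
    norm = 1.0 - gammainc(n + 1, b)
    cdf = lambda m: (gammainc(n + 1, m) - gammainc(n + 1, b)) / norm
    return brentq(lambda m: cdf(m) - cl, b, b + 50.0) - b

for b in (1.2, 5.1, 10.6):
    print(b, bayes_upper_limit_S(5, b))   # always >= 0, by construction
```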


30
Another Aside: Coverage
  • Given P(x|μ), an ensemble of possible measurements xi, and some confidence-level algorithm, coverage is how often μLO ≤ μ ≤ μHI is true.
  • Isn't that just the confidence level? Not quite.
  • Discrete observables may mean the confidence belt is not exact: move on the side of caution
  • Other nuisance parameters may need to be taken account of, again erring on the side of caution
  • Coverage depends on μ. For a frequentist it must never be less than the CL (undercoverage). It may be more (overcoverage); this is to be minimised, but is not crucial
  • For a Bayesian, coverage is technically irrelevant, but in practice useful
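A rough coverage scan for the 90% frequentist Poisson upper limit of slide 20 (the true values below are arbitrary; the overcoverage comes from the discreteness of n):

```python
# Fraction of simulated experiments with lam <= lam_HI(n); it should
# never fall below 0.90, and is generally above it.
import numpy as np
from scipy.optimize import brentq
from scipy.stats import poisson

limit = [brentq(lambda lam: poisson.cdf(n, lam) - 0.10, 1e-9, 400.0)
         for n in range(200)]                  # precompute lam_HI(n)

rng = np.random.default_rng(1)
for lam in (0.5, 1.0, 2.5, 5.0):
    n = rng.poisson(lam, 100_000)
    print(lam, np.mean([lam <= limit[k] for k in n]))
```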

31
Bayesian pitfall (Heinrich and others)
  • Observe n events from a Poisson with mean εS + b
  • Channel strength S unknown: flat prior
  • Efficiency × luminosity ε: from a sub-experiment with a flat prior
  • Background b: from a sub-experiment with a flat prior
  • Investigated coverage, and all OK
  • Partition into classes (e.g. different run periods):
  • Coverage falls!

32
What's happening
  • The problem is due to the efficiency × lumi priors
  • ε → ε1, ε2, ε3 ...
  • A uniform density in all N components means P(ε) ∝ ε^(N−1)
  • Solve by taking priors P(εi) ∝ 1/εi
  • (Arguments that you should have done so in the first place: the Jeffreys prior)
(Diagram: the (ε1, ε2) plane)
33
Another example: the Unitarity Triangle
  • Measure the CKM angle α by measuring B → ππ decays (charged and neutral, branching ratios and CP asymmetries): 6 quantities.
  • Many different parametrisations have been suggested
  • Uniform priors in different parametrisations give different results, from each other and from a frequentist analysis (according to CKMfitter; disputed by UTfit)
  • For a complex number z = x + iy = re^(iθ), a flat prior in x and y is not the same as a flat prior in r and θ

34
Loss of ambiguities?
  • Toy example:
  • Measure X = (θ + φ)² = 1.00 ± 0.07, Y = φ² = 1.10 ± 0.07
  • θ is interesting, but φ is a nuisance parameter
  • Clearly a 4-fold ambiguity: θ + φ = ±1, so θ ≈ 0 or ±2
  • Frequentists stop there
  • Bayesians integrate over φ and get a peak at θ ≈ 0 double that at θ ≈ ±2; the feature persists whatever the prior used for φ.
  • Is this valid? (A real example would be more subtle)
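A numeric sketch under the reconstruction above (the grid ranges and the flat prior in φ are my choices): marginalising merges the two solutions near θ = 0 into a single peak with about twice the area of each peak near ±2.

```python
import numpy as np

theta = np.linspace(-3.0, 3.0, 601)
phi = np.linspace(-3.0, 3.0, 601)
T, P = np.meshgrid(theta, phi, indexing="ij")
# Gaussian likelihoods for X = (theta+phi)^2 and Y = phi^2
L = (np.exp(-0.5 * (((T + P) ** 2 - 1.00) / 0.07) ** 2)
     * np.exp(-0.5 * ((P ** 2 - 1.10) / 0.07) ** 2))
marg = L.sum(axis=1)                        # integrate out phi, flat prior

area_0 = marg[np.abs(theta) < 1.0].sum()    # the merged peak near theta = 0
area_2 = marg[theta > 1.0].sum()            # the single peak near theta = +2
print(area_0 / area_2)                      # ~2
```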

35
Conclusions
  • Frequentist statistics cannot do everything
  • Bayesian statistics can be dangerous. Choose between:
  • 1) Never use it
  • 2) Use it only if the frequentist method has problems
  • 3) Use it only with care and expert guidance, and always check for robustness under different priors
  • 4) Use it as an investigative tool to explore possible interpretations
  • 5) Just plug in the package and write down the results
  • But always know what you are doing, and say what you are doing.

36
Backup slides
37
Incorporating constraints: Poisson
  • Work with the total source strength (s + b), which you know is greater than the background b
  • Need to solve Σ(r=0..n) e^(−(s+b)) (s+b)^r/r! = (1 − CL) × Σ(r=0..n) e^(−b) b^r/r!
  • The formula is not as obvious as it looks.
38
Feldman-Cousins Method
Works by attacking what looks like a different problem...
(by Feldman and Cousins, mostly)
39
Feldman-Cousins: m = s + b. b is known, N is measured, s is what we're after
  • Physicists tend to quote an upper limit when the result is insignificant, but a two-sided interval when it is significant. This is called 'flip-flopping', and it is BAD because it wrecks the whole design of the confidence belt
  • Suggested solution:
  • 1) Construct belts at the chosen CL as before
  • 2) Find a new ranking strategy to determine what's inside and what's outside

(Plot: 1-sided 90% and 2-sided 90% confidence belts)
40
Feldman-Cousins Ranking
  • First idea (almost right):
  • Sum/integrate over the range of (s+b) values with the highest probabilities for this observed N
  • (advantage: this gives the shortest interval)
  • Glitch: suppose N is small (a low fluctuation).
  • P(N|s+b) will be small for any s, and never gets counted
  • Instead, compare to the 'best' probability for this N, at s = N − b (or s = 0 if N < b), and rank on that ratio
  • Such a construction does an automatic flip-flop:
  • N ≈ b: single-sided limit (upper bound) for s
  • N >> b: 2-sided limits for s
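A compact sketch of the construction (the grid step, ranges and cut-offs are mine). For n = 0 observed with b = 3.0 it reproduces the published 90% Feldman-Cousins interval of roughly [0, 1.08]:

```python
# For each s: rank n by R = P(n|s+b) / P(n|s_best+b), s_best = max(0, n-b);
# accept n in decreasing R until the summed probability reaches CL;
# the interval is the set of s whose acceptance region contains n_obs.
from scipy.stats import poisson

def fc_interval(n_obs, b, cl=0.90, s_max=15.0, steps=1500, n_max=50):
    ns = range(n_max)
    accepted = []
    for i in range(steps + 1):
        s = s_max * i / steps
        probs = [poisson.pmf(n, s + b) for n in ns]
        best = [poisson.pmf(n, max(0.0, n - b) + b) for n in ns]
        order = sorted(ns, key=lambda n: probs[n] / best[n], reverse=True)
        total, accept = 0.0, set()
        for n in order:                  # fill the belt in rank order
            accept.add(n)
            total += probs[n]
            if total >= cl:
                break
        if n_obs in accept:
            accepted.append(s)
    return min(accepted), max(accepted)

print(fc_interval(0, b=3.0))   # ~(0.0, 1.08)
```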

41
How it works
  • Has to be computed for the appropriate value of the background b. (Sounds complicated, but there is lots of software around.)
  • As n increases, the interval flips from 1-sided to 2-sided, but in such a way that the probability of being in the belt is preserved

(Plot: the Feldman-Cousins belt in the (n, s) plane)
This means that sensible 1-sided limits are quoted instead of nonsensical 2-sided limits!
42
Arguments against using Feldman-Cousins
  • Argument 1:
  • It takes control out of the hands of the physicist. You might want to quote a 2-sided limit for an expected process, an upper limit for something weird
  • Counter-argument:
  • This is the virtue of the method. That control invalidates the conventional technique. The physicist can use their discretion over the CL. In rare cases it is permissible to say "We set a 2-sided limit, but we're not claiming a signal".

43
Feldman-Cousins: Argument 2
  • Argument 2:
  • If zero events are observed by two experiments, the one with the higher background b will quote the lower limit. This is unfair to hardworking physicists
  • Counter-argument:
  • An experiment with a higher background has to be lucky to get zero events. Luckier experiments will always quote better limits. Averaging over luck, lower values of b get lower limits to report.

Example: you reward a good student with a lottery ticket which has a 10% chance of winning 10. A moderate student gets a ticket with a 1% chance of winning 20. They both win. Were you unfair?
44
3. Including Systematic Errors
m = aS + b
  • m is the predicted number of events
  • S is the (unknown) signal source strength. Probably a cross-section or branching ratio or decay rate
  • a is an acceptance/luminosity factor, known with some (systematic) error
  • b is the background rate, known with some (systematic) error

45
3.1 Full Bayesian
  • Assume priors:
  • for S (uniform?)
  • for a (Gaussian?)
  • for b (Poisson or Gaussian?)
  • Write down the posterior P(S, a, b).
  • Integrate over all a, b to get the marginalised P(S)
  • Read off the desired limits by integration
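A minimal numeric sketch of this recipe with assumed inputs (n = 5 observed, a = 1.0 ± 0.1 Gaussian, b = 3.0 ± 0.5 Gaussian, uniform prior in S; all numbers are illustrative, not from the slides):

```python
# Build P(S, a, b) on a grid, marginalise over a and b, then integrate
# the posterior in S for a 95% upper limit.
import numpy as np
from scipy.stats import norm, poisson

S = np.linspace(0.0, 25.0, 501)
a = np.linspace(0.6, 1.4, 81)
b = np.linspace(1.0, 5.0, 81)
SS, AA, BB = np.meshgrid(S, a, b, indexing="ij")

post = (poisson.pmf(5, AA * SS + BB)       # likelihood, m = aS + b
        * norm.pdf(AA, 1.0, 0.1)           # Gaussian prior for a
        * norm.pdf(BB, 3.0, 0.5))          # Gaussian prior for b
marg = post.sum(axis=(1, 2))               # marginalised posterior P(S)
cdf = np.cumsum(marg) / marg.sum()
print(S[np.searchsorted(cdf, 0.95)])       # 95% upper limit on S
```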

46
3.2 Hybrid Bayesian
  • Assume priors:
  • for a (Gaussian?)
  • for b (Poisson or Gaussian?)
  • Integrate over all a, b to get the marginalised P(r|S)
  • Read off the desired limits by Σ(r=0..n) P(r|S) = 1 − CL, etc.
  • Done approximately for small errors (Cousins and Highland). Shows that limits are pretty insensitive to σa, σb
  • Done numerically for general errors (RB: Java applet on the SLAC web page). Includes 3 priors (for a) that give slightly different results

47
3.3-3.9
  • Extend Feldman-Cousins
  • Profile likelihood: use P(S) = P(n, S, a_max, b_max), where a_max, b_max give the maximum for this S, n
  • Empirical Bayes
  • And more
  • Results are being compared as an outcome of the Banff workshop

48
Summary
  • The straight frequentist approach is objective and clean, but sometimes gives crazy results
  • The Bayesian approach is valuable, but has problems. Check for robustness under the choice of prior
  • Feldman-Cousins deserves more widespread adoption
  • Lots of work is still going on
  • This will all be needed at the LHC