Stefan Arnborg, KTH - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
DD2447, DD3342, spring 2011
Statistical Methods in Applied Computer Science
http://www.nada.kth.se/stefan
Stefan Arnborg, KTH
2
SYLLABUS
  • Common statistical models and their use
  • Bayesian, testing, and fiducial statistical philosophy
  • Hypothesis choice
  • Parametric inference
  • Non-parametric inference
  • Elements of regression
  • Clustering
  • Graphical statistical models
  • Prediction and retrodiction, Chapman-Kolmogorov formulation
  • Evidence theory, estimation and combination of evidence
  • Support Vector Machines and kernel methods
  • Vovk/Gammerman hedged prediction technology
  • Stochastic simulation, Markov Chain Monte Carlo
  • Variational Bayes
3
LEARNING GOALS
After successfully taking this course, you will be able to:
  • motivate the use of uncertainty management and statistical
    methodology in computer science applications, as well as
    the main methods in use,
  • account for algorithms used in the area and use the
    standard tools,
  • critically evaluate the applicability of these methods in
    new contexts, and design new applications of uncertainty
    management,
  • follow research and development in the area.
4
GRADING
DD2447: Bologna grades. Grades are E-A during 2009.
70% of the homeworks and a very short oral discussion
of them give grade C. Less gives F-D. For higher grades,
essentially all homeworks should be turned in on time.
Alternative assignments will be substituted for
those homeworks you miss. For grade B you must
pass one Master's test; for grade A you must do
two Master's tests or a project with some
research content.
DD3342: Pass/Fail. Research-level project, or deeper
study of part of the course.
8
Applications of Uncertainty everywhere
  • Medical Imaging/Research (Schizophrenia)
  • Land Use Planning
  • Environmental Surveillance and Prediction
  • Finance and Stock
  • Marketing into Google
  • Robot Navigation and Tracking
  • Security and Military
  • Performance Tuning

9
Some Master's projects using this syllabus
(subset)
  • Recommender system for Spotify
  • Behavior of mobile phone users
  • Recommender system for book club
  • Recommender for job search site
  • Computations in evolutionary genetics
  • Gene hunting
  • Psychiatry: genes, anatomy, personality
  • Command and control: situation awareness
  • Diagnosing drilling problems
  • Speech, Music,

11
Aristotle: Logic
Logic as a semi-formal system was created by
Aristotle, probably inspired by the practice of his time
in mathematical arguments. There is no record of
Aristotle himself applying logic, but the Elements
of Euclid probably derive from Aristotle's
illustrations of the logical method.
What role does logic have in computer science?
12
Visualization
  • Visualize data in such a way that the important
    aspects are obvious - A good visualization
    strikes you as a punch between your eyes (Tukey,
    1970)
  • Pioneered by Florence Nightingale, first female
    member of the Royal Statistical Society, known for her
    polar area diagrams (a variant of the pie chart) and
    for performance metrics

13
Probabilistic approaches
  • Bayes: Probability conditioned by observation
  • Cournot: An event with very small probability
    will not happen.
  • Vapnik-Chervonenkis: VC-dimension and
    PAC, distribution-independence
  • Kolmogorov/Vovk: A sequence is random if it
    cannot be compressed

14
Peirce: Abduction and uncertainty
Aristotle's induction, generalizing from
particulars, is considered invalid by strict
deductionists. Peirce made the concept clear, or
at least confused on a higher level. Abduction
is verification by finding a plausible
explanation. It is a key process in scientific progress.
15
Sherlock Holmes: common sense inference
The techniques used by Sherlock are modeled on Conan
Doyle's professor in medical school, who followed
the methodological tradition of Hippocrates and
Galen. Abductive reasoning, first spelled out by
Peirce, is found in 217 instances in the
Sherlock Holmes adventures, 30 of them in the
first novel, A Study in Scarlet.
16
Thomas Bayes, amateur mathematician
If we have a probability model of the world, we
know how to compute probabilities of
events. But is it possible to learn about the
world from events we see? Bayes' proposal was
forgotten but rediscovered by Laplace.
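
As a minimal sketch of learning about the world from observed events, the snippet below applies Bayes' rule to a finite set of hypotheses. The coin hypotheses, their names, and the numbers are invented for illustration and are not from the slides.

```python
def posterior(prior, likelihood):
    """Return P(h | observation) for each hypothesis h via Bayes' rule.

    prior:      dict hypothesis -> P(h)
    likelihood: dict hypothesis -> P(observation | h)
    """
    joint = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(joint.values())  # P(observation)
    if evidence == 0:
        raise ValueError("observation is impossible under every hypothesis")
    return {h: p / evidence for h, p in joint.items()}


# Hypothetical example: is a coin fair, or biased with P(heads) = 0.8?
prior = {"fair": 0.5, "biased": 0.5}
# Likelihood of the observation "three heads in a row" under each hypothesis.
likelihood = {"fair": 0.5 ** 3, "biased": 0.8 ** 3}
post = posterior(prior, likelihood)
```

After three heads the posterior shifts towards the biased hypothesis, although the prior treated both hypotheses as equally plausible.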
17
Antoine Augustin Cournot (1801-1877). Pioneer in
stochastic processes, market theory and
structural post-modernism. Predicted the demise of the
academic system due to discourses of
administration and excellence (cf. Readings).
  • An alternative to Bayes' method, hypothesis
    testing, is based on Cournot's Bridge: an
    event with very small probability will not happen
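
A small sketch of how this principle turns into a test: compute the probability, under the hypothesis, of an outcome at least as extreme as the observed one, and reject the hypothesis if that probability is very small. The one-sided binomial test, the function names, and the threshold `alpha=0.01` are illustrative assumptions, not taken from the slides.

```python
from math import comb


def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def reject(n, k, p=0.5, alpha=0.01):
    """Reject the hypothesis 'success probability is p' if seeing k or more
    successes in n trials would be an event of probability below alpha."""
    return binomial_upper_tail(n, k, p) < alpha


# Hypothetical data: 18 heads in 20 tosses of a supposedly fair coin.
tail = binomial_upper_tail(20, 18, 0.5)
```

Here `tail` equals 211/2^20, about 0.0002, so the fair-coin hypothesis is rejected at the assumed threshold, while 12 heads in 20 tosses would not lead to rejection.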

18
Kolmogorov and randomness
Andrei Kolmogorov (1903-1987) is the mathematician
best known for shaping probability theory into a
modern axiomatized theory. His axioms of
probability tell how probability measures are
defined, also on infinite and infinite-dimensional
event spaces and complex product
spaces. Kolmogorov complexity characterizes a
random string by the smallest size of a
description of it. It is used to explain the Vovk/Gammerman
scheme of hedged prediction, and also in MDL
(Minimum Description Length) inference.
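
Kolmogorov complexity itself is not computable, but a general-purpose compressor gives a rough upper bound on description length. The sketch below uses `zlib` from the Python standard library as such a stand-in; the choice of compressor and the two test strings are assumptions made for illustration only.

```python
import random
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (a crude complexity proxy)."""
    return len(zlib.compress(data, 9)) / len(data)


# A highly regular string: short description ("repeat '01' 5000 times").
regular = b"01" * 5000

# Pseudo-random bytes: no pattern that zlib can exploit.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(10000))

ratios = {"regular": compression_ratio(regular), "noisy": compression_ratio(noisy)}
```

The regular string shrinks to a small fraction of its length, while the pseudo-random bytes do not compress at all, which mirrors the idea that a random sequence has no description shorter than itself.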
19
Normative claim of Bayesianism
  • EVERY type of uncertainty should be treated as
    probability
  • This claim is controversial and not universally
    accepted: Fisher (1922), Cramér, Zadeh, Dempster,
    Shafer, Walley (1999)
  • Students encounter many approaches to uncertainty
    management and identify weaknesses in
    foundational arguments.

20
Foundations for Bayesian Inference
  • Bayes' method, the first documented method based on
    probability: plausibility of an event depends on
    observation (Bayes' rule)
  • Bayes' rule: organizing principle for uncertainty
  • Parameter and observation spaces can be extremely
    complex, priors and likelihoods also.
  • MCMC: current approach, often but not always
    applicable (difficult when the posterior has many
    local maxima separated by low-density regions)
  • Variational Bayes: approximate the posterior by a
    factorized function; the result is also approximate.
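
The difficulty with separated local maxima can be seen in a minimal random-walk Metropolis sketch. The bimodal target, the step sizes, and all function names below are illustrative assumptions; this is one simple MCMC variant, not the course's reference implementation.

```python
import math
import random


def target(x, sep=5.0, sd=0.5):
    """Unnormalised bimodal density: equal mixture of N(-sep, sd) and N(+sep, sd)."""
    return (math.exp(-0.5 * ((x + sep) / sd) ** 2)
            + math.exp(-0.5 * ((x - sep) / sd) ** 2))


def metropolis(n_steps, step_sd, x0=-5.0, seed=0):
    """Random-walk Metropolis sampler with Gaussian proposals."""
    rng = random.Random(seed)
    x, samples = x0, []
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step_sd)            # symmetric proposal
        if rng.random() < target(y) / target(x):   # accept with prob min(1, ratio)
            x = y
        samples.append(x)
    return samples


def fraction_right(samples):
    """Share of samples in the right-hand mode; the true value is 0.5."""
    return sum(s > 0 for s in samples) / len(samples)


narrow = fraction_right(metropolis(20000, step_sd=0.5))  # small local moves
wide = fraction_right(metropolis(20000, step_sd=5.0))    # moves can jump modes
```

With small steps the chain started in the left mode never crosses the low-density gap, so it reports a posterior with one mode only; with wide steps it visits both modes, at the cost of many rejected proposals.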

21
Showcase application: PET camera
Camera geometry, noise, film, scene regularity
and also any other camera or imaging device
22
PET camera
Posterior from likelihood and prior, where
  D: film, count by detector pair j
  X: radioactivity in voxel i
  a: camera geometry
Inference about Y gives the posterior; its mean is
often a good picture
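
The formula on this slide did not survive transcription. As a hedged reconstruction, one common way to write such a model is shown below. The Poisson count model and the indexing a_{ji} (probability that an emission in voxel i is registered by detector pair j) are assumptions, and X is used for the unknown radioactivity throughout.

```latex
% Likelihood: counts per detector pair, given radioactivity X and geometry a
D_j \mid X \;\sim\; \mathrm{Poisson}\Big(\sum_i a_{ji} X_i\Big)

% Posterior: likelihood times prior, up to normalisation
p(X \mid D) \;\propto\; p(X) \prod_j p\Big(D_j \,\Big|\, \sum_i a_{ji} X_i\Big)

% Reconstructed picture: posterior mean
\hat{X} \;=\; \mathbb{E}\,[\,X \mid D\,]
```

The prior p(X) is where scene regularity would enter, for example by favouring images in which neighbouring voxels have similar values.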
23
Sinogram and reconstruction
Tumour
Fruit fly, Drosophila family (X-ray)
26
WIRED on Total Information Awareness
The WIRED (Dec 2, 2002) article "Total Info System Totally
Touchy" discusses the Total Information
Awareness system. Quotes: "People have to
move and plan before committing a terrorist act.
Our hypothesis is their planning process has a
signature." Jan Walker, Pentagon spokeswoman, in
Wired, Dec 2, 2002. "What's alarming is the
danger of false positives based on incorrect
data," Herb Edelstein, in Wired, Dec 2, 2002.
27
Combination of evidence
In Bayes' method, evidence is the likelihood of the
observation.
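
A minimal sketch of combining evidence in the Bayesian setting: if observations are conditionally independent given the hypothesis (an assumption made here), their likelihoods multiply, and updating in one batch agrees with updating one observation at a time. Hypothesis names and numbers are invented for illustration.

```python
from math import prod


def combine(prior, *likelihoods):
    """Posterior over hypotheses after several conditionally independent
    observations, each given as a dict hypothesis -> P(observation | h)."""
    unnorm = {h: prior[h] * prod(lk[h] for lk in likelihoods) for h in prior}
    total = sum(unnorm.values())
    return {h: v / total for h, v in unnorm.items()}


# Hypothetical prior and two pieces of evidence.
prior = {"a": 0.5, "b": 0.5}
evidence_1 = {"a": 0.9, "b": 0.2}
evidence_2 = {"a": 0.3, "b": 0.6}

batch = combine(prior, evidence_1, evidence_2)
sequential = combine(combine(prior, evidence_1), evidence_2)
```

Both routes give the same posterior, so the order in which evidence arrives does not matter under this independence assumption.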
28
Particle filter: general tracking
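
The slide body was not transcribed. As an illustration of the idea, here is a minimal bootstrap particle filter for a one-dimensional target that drifts with Gaussian noise and is observed with Gaussian noise. The motion model, noise levels, particle count, and function names are all assumptions chosen for the sketch.

```python
import math
import random


def simulate(n_steps, drift=1.0, process_sd=0.5, obs_sd=1.0, seed=1):
    """Generate true states and noisy observations of a drifting target."""
    rng = random.Random(seed)
    x, states, observations = 0.0, [], []
    for _ in range(n_steps):
        x += drift + rng.gauss(0.0, process_sd)
        states.append(x)
        observations.append(x + rng.gauss(0.0, obs_sd))
    return states, observations


def particle_filter(observations, n_particles=1000, drift=1.0,
                    process_sd=0.5, obs_sd=1.0, seed=0):
    """Return the posterior-mean estimate of the state after each observation."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    estimates = []
    for y in observations:
        # Predict: push every particle through the motion model.
        particles = [x + drift + rng.gauss(0.0, process_sd) for x in particles]
        # Weight: likelihood of the observation given each particle.
        weights = [math.exp(-0.5 * ((y - x) / obs_sd) ** 2) for x in particles]
        total = sum(weights)
        if total == 0.0:  # all particles far from y: fall back to uniform weights
            weights, total = [1.0] * n_particles, float(n_particles)
        estimates.append(sum(w * x for w, x in zip(weights, particles)) / total)
        # Resample: draw a new particle set in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


states, observations = simulate(50)
estimates = particle_filter(observations)
mean_abs_error = sum(abs(e - s) for e, s in zip(estimates, states)) / len(states)
```

The predict, weight, and resample steps correspond to the prediction and update steps of recursive Bayesian filtering, with the posterior represented by a cloud of samples.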
29
Chapman Kolmogorov version of Bayes rule
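
The formula on this slide was not transcribed. A standard way to write the recursion, under the assumption of a Markov state x_t observed through data d_t (this notation is not from the slides), is:

```latex
% Prediction step (Chapman-Kolmogorov equation)
p(x_t \mid d_{1:t-1}) \;=\; \int p(x_t \mid x_{t-1})\, p(x_{t-1} \mid d_{1:t-1})\, \mathrm{d}x_{t-1}

% Update step (Bayes' rule with the new observation d_t)
p(x_t \mid d_{1:t}) \;\propto\; p(d_t \mid x_t)\, p(x_t \mid d_{1:t-1})
```

Running the same integral backwards in time gives retrodiction, that is, inference about earlier states from later data.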
30
Berry and Linoff have eloquently stated their
preferences with the often quoted
sentence: "Neural networks are a good choice for
most classification problems when the results of
the model are more important than
understanding how the model works." Neural
networks typically give the right answer
33
1950-1980: The age of rationality. Let us
describe the world with a mathematical model and
compute the best way to manage it! This is a
large Bayesian network, a popular statistical
model
34
Ed Jaynes devoted a large part of his career to
promoting Bayesian inference. He also championed
the use of Maximum Entropy in physics. Outside
physics, he met resistance from people who
had already invented other methods. Why should
statistical mechanics say anything about our
daily human world?
35
Robust Bayes
  • Priors and likelihoods are convex sets of
    probability distributions (Berger, de Finetti,
    Walley, ...): imprecise probability
  • Every member of the posterior is a parallel
    combination of one member of the likelihood and one
    member of the prior.
  • For decision making, Jaynes recommends using
    the member of the posterior with maximum entropy
    (Maxent estimate).
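
A rough sketch of these bullets for a finite hypothesis space: each convex set is represented here by a short list of its members (an assumption; a real convex set has infinitely many), every prior is combined with every likelihood, and the posterior member with the largest entropy is selected. All names and numbers are illustrative.

```python
from itertools import product
from math import log


def bayes(prior, likelihood):
    """Combine one prior with one likelihood over the same hypotheses."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    z = sum(joint)
    return [j / z for j in joint]


def entropy(dist):
    """Shannon entropy in nats; terms with zero probability contribute 0."""
    return -sum(p * log(p) for p in dist if p > 0)


def robust_posterior(priors, likelihoods):
    """One posterior per (prior, likelihood) pair."""
    return [bayes(p, l) for p, l in product(priors, likelihoods)]


def maxent_member(posteriors):
    """The posterior member with maximum entropy."""
    return max(posteriors, key=entropy)


# Hypothetical sets over two hypotheses.
priors = [[0.5, 0.5], [0.8, 0.2]]
likelihoods = [[0.9, 0.1], [0.6, 0.4]]
posteriors = robust_posterior(priors, likelihoods)
choice = maxent_member(posteriors)
```

In this example the selected member is the least committed of the four posteriors, the one closest to uniform.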

36
SVM and Kernel method
Based on Vapnik-Chervonenkis learning theory.
Separate classes by a wide-margin
hyperplane classifier, or enclose data points
between close parallel hyperplanes for
regression, possibly after a non-linear mapping to a
high-dimensional space. The only assumption is point
exchangeability.
37
Classify with hyperplanes
Frank Rosenblatt (1928-1971). Pioneering work
in classifying by hyperplanes in high-dimensional
spaces. Criticized by Minsky and Papert, since real
classes are not normally linearly separable. ANN
research was taken up again in the 1980s, with
non-linear mappings to get improved
separation. Predecessor to SVM/kernel methods.
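
A minimal sketch of hyperplane classification in the spirit of Rosenblatt's perceptron learning rule, together with the kind of counterexample behind the linear-separability criticism. The function names, the epoch limit, and the AND/XOR toy data are assumptions for illustration.

```python
def score(w, b, x):
    """Signed position of x relative to the hyperplane w.x + b = 0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_perceptron(points, labels, epochs=100):
    """Perceptron rule: on each mistake, move the hyperplane towards the point.
    Labels are +1 or -1. Stops early once an epoch has no mistakes."""
    w, b = [0.0] * len(points[0]), 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            if y * score(w, b, x) <= 0:  # misclassified or on the boundary
                w = [wi + y * xi for wi, xi in zip(w, x)]
                b += y
                mistakes += 1
        if mistakes == 0:
            break
    return w, b


def predict(w, b, x):
    return 1 if score(w, b, x) > 0 else -1


points = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_labels = [-1, -1, -1, 1]   # linearly separable
xor_labels = [-1, 1, 1, -1]    # not linearly separable

w_and, b_and = train_perceptron(points, and_labels)
w_xor, b_xor = train_perceptron(points, xor_labels)
and_ok = all(predict(w_and, b_and, x) == y for x, y in zip(points, and_labels))
xor_ok = all(predict(w_xor, b_xor, x) == y for x, y in zip(points, xor_labels))
```

The AND data is learned perfectly, while no hyperplane can classify all four XOR points, which is why a non-linear mapping to another space is needed before separating.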
38
Find parallel hyperplanes
Classification. Red: true separating
plane. Blue: wide-margin separation in sample.
Classify by the plane between the blue planes.
39
SVM and Kernel method
40
Vovk/Gammerman Hedged predictions
  • Based on Kolmogorov complexity or non-conformance
    measure
  • In classification, each prediction comes with
    confidence
  • Asymptotically, misclassifications appear
    independently and with probability 1-confidence.
  • Only assumption is exchangeability
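
A small sketch of a prediction with confidence in this style, for one-dimensional data. The non-conformance measure (distance to the nearest example with the same label) and the toy data are assumptions chosen for brevity; confidence is taken as one minus the second-largest p-value.

```python
def nonconformance(i, bag):
    """Distance from example i to its nearest neighbour with the same label."""
    xi, yi = bag[i]
    dists = [abs(xi - xj) for j, (xj, yj) in enumerate(bag) if j != i and yj == yi]
    return min(dists) if dists else float("inf")


def p_values(train, x, labels):
    """For each candidate label, the fraction of examples that conform
    no better than the new example when it is given that label."""
    result = {}
    for y in labels:
        bag = train + [(x, y)]
        scores = [nonconformance(i, bag) for i in range(len(bag))]
        result[y] = sum(s >= scores[-1] for s in scores) / len(bag)
    return result


def predict_with_confidence(train, x, labels):
    """Return (predicted label, confidence)."""
    p = p_values(train, x, labels)
    ranked = sorted(labels, key=lambda y: p[y], reverse=True)
    return ranked[0], 1.0 - p[ranked[1]]


# Hypothetical training data: two well-separated classes.
train = [(0, "a"), (1, "a"), (2, "a"), (3, "a"),
         (50, "b"), (51, "b"), (52, "b"), (53, "b")]
label, confidence = predict_with_confidence(train, 1.5, ["a", "b"])
```

For the new point 1.5 the label "a" gets p-value 1 and "b" gets 1/9, so the prediction is "a" with confidence 8/9; more training examples would allow higher confidence levels.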