Optimal predictions in everyday cognition
Tom Griffiths (Brown University), Josh Tenenbaum (MIT)
Predicting the future
Results
Optimality and Bayesian inference
The effects of prior knowledge
Strategy: examine the influence of prior knowledge in an inductive problem we solve every day.
Many people believe that perception is optimal
What should we use as the prior, p(t_total)? Gott (1993) used the uninformative prior p(t_total) ∝ 1/t_total, which yields a simple prediction rule: t* = 2t.
- How often is Google News updated?
- t = time since last update
- t_total = time between updates
- What should we guess for t_total given t?
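Gott's rule can be checked numerically: under the prior p(t_total) ∝ 1/t_total and the random-sampling likelihood p(t | t_total) = 1/t_total, the posterior median lands at t* = 2t. A minimal sketch (the elapsed time t = 34 is just an illustrative value):

```python
import numpy as np

t = 34.0  # observed elapsed time, e.g. minutes since the last update

# Uninformative prior p(t_total) ∝ 1/t_total and likelihood p(t | t_total) = 1/t_total
# (for t_total >= t), so the posterior is ∝ 1/t_total**2 on [t, ∞).
grid = np.linspace(t, 1e6, 2_000_000)   # truncate the improper support at a large value
posterior = 1.0 / grid**2
dx = grid[1] - grid[0]
posterior /= posterior.sum() * dx       # normalize numerically

# Posterior median: smallest t* with P(t_total < t* | t) >= 0.5
cdf = np.cumsum(posterior) * dx
t_star = grid[np.searchsorted(cdf, 0.5)]
print(t_star)  # close to 2 * t = 68, Gott's doubling rule
```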
[Figure legend: people, empirical prior]
but cognition is not.
[Figure legend: parametric prior, Gott's rule]
More generally
Nonparametric priors
t ≈ 4000 years, t* ≈ 8000 years
- You encounter a phenomenon that has existed for t units of time. How long will it continue into the future? (i.e., what's t_total?)
- We could replace time with any other variable that ranges from 0 to some unknown upper limit
Predicting everyday events
You arrive at a friend's house, and see that a cake has been in the oven for 34 minutes. How long will it be in the oven?

People make good predictions despite the complex distribution.
- This seems like a good strategy
- You meet someone who is 35 years old. How long will they live? 70 years seems reasonable
- But, it's not so simple
- You meet someone who is 78 years old. How long will they live?
- You meet someone who is 6 years old. How long will they live?
In particular, there is controversy over whether people's inferences follow Bayes' rule.
Everyday prediction problems
- You read about a movie that has made $60 million to date. How much money will it make in total?
- You see that something has been baking in the oven for 34 minutes. How long until it's ready?
- You meet someone who is 78 years old. How long will they live?
- Your friend quotes to you from line 17 of his favorite poem. How long is the poem?
- You see taxicab 107 pull up to the curb in front of the train station. How many cabs are in this city?
No direct experience
The effects of priors
You learn that in ancient Egypt, there was a great flood in the 11th year of a pharaoh's reign. How long did he reign?
h = hypothesis, d = data
How long did the typical pharaoh reign in ancient Egypt? People identify the form of the prior, but are mistaken about the parameters.
Bayes' rule,

p(h | d) ∝ p(d | h) p(h),

indicates how a rational agent should update beliefs about hypotheses h in light of data d. Several results suggest people do not combine prior probabilities with data correctly (e.g., Tversky & Kahneman, 1974).
Bayesian inference
p(t_total | t) ∝ p(t | t_total) p(t_total)

Assuming random sampling, the likelihood is p(t | t_total) = 1/t_total (for t ≤ t_total, and 0 otherwise).
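As a sketch of how this update works with a prior estimated from data, the snippet below builds a histogram prior from synthetic samples (a stand-in for real movie-runtime data; all numbers here are assumptions) and computes the posterior over t_total given t = 55 minutes:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "empirical prior": a synthetic sample of movie runtimes in
# minutes (illustrative stand-in, not real measurements)
samples = rng.normal(100, 20, size=5_000).clip(40, 200)

# Turn the sample into a discrete prior over t_total via a histogram
edges = np.arange(40, 201)
prior, _ = np.histogram(samples, bins=edges, density=True)
grid = 0.5 * (edges[:-1] + edges[1:])   # bin centers

t = 55.0  # a movie has been running for 55 minutes

# p(t_total | t) ∝ p(t | t_total) p(t_total), with p(t | t_total) = 1/t_total
# for t_total >= t (random sampling), and 0 otherwise
likelihood = np.where(grid >= t, 1.0 / grid, 0.0)
posterior = likelihood * prior
posterior /= posterior.sum()

# Predict with the posterior median
t_star = grid[np.searchsorted(np.cumsum(posterior), 0.5)]
print(t_star)
```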
Conclusions
Evaluating human predictions
A puzzle
posterior probability ∝ likelihood × prior
- Different domains with different priors:
  - a movie has made $60 million (power-law)
  - your friend quotes from line 17 of a poem (power-law)
  - you meet a 78 year old man (Gaussian)
  - a movie has been running for 55 minutes (Gaussian)
  - a U.S. congressman has served for 11 years (Erlang)
- Prior distributions derived from actual data
- Use 5 values of t for each scenario
- People predict t_total
- A total of 350 participants and ten scenarios
- People produce accurate predictions for the duration and extent of everyday events
- People have strong prior knowledge:
  - form of the prior (power-law or exponential)
  - distribution given that form (parameters)
  - non-parametric distribution when necessary
- Reveals a surprising correspondence between probabilities in the mind and in the world, and suggests that people do use prior probabilities in making inductive inferences
- If they do not use priors, how do people
  - predict the future
  - infer causal relationships
  - identify the work of chance
  - assess similarity and make generalizations
  - learn languages and concepts
  - and solve other inductive problems?
- Drawing strong conclusions from limited data requires using prior knowledge
What is the best guess for t_total? (call it t*)
Not the maximal value of p(t_total | t) (that's just t* = t).
We use the posterior median: P(t_total < t* | t) = 0.5.
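The posterior median makes different predictions under different prior forms, which is what lets the experiment discriminate them: with a power-law prior the prediction scales multiplicatively with t, while with a Gaussian prior it stays near the prior mean. A sketch with illustrative (not fitted) parameters:

```python
import numpy as np

def posterior_median(t, prior_pdf, upper=10_000.0, n=500_000):
    """Posterior median t* given elapsed time t, with likelihood p(t|t_total) = 1/t_total."""
    grid = np.linspace(t, upper, n)          # support: t_total >= t
    post = prior_pdf(grid) / grid            # posterior ∝ prior × likelihood
    cdf = np.cumsum(post)
    cdf /= cdf[-1]                           # normalize the discrete CDF
    return grid[np.searchsorted(cdf, 0.5)]   # P(t_total < t* | t) = 0.5

# Illustrative prior families (parameters are assumptions, not fitted to data):
powerlaw = lambda x: x ** -1.5                             # e.g., movie grosses
gaussian = lambda x: np.exp(-0.5 * ((x - 75) / 16) ** 2)   # e.g., lifespans

results = {t: (posterior_median(t, powerlaw),
               posterior_median(t, gaussian))
           for t in [10.0, 40.0, 70.0]}
for t, (p_star, g_star) in results.items():
    print(f"t={t}: power-law t*={p_star:.1f}, Gaussian t*={g_star:.1f}")
# Power-law predictions grow in proportion to t (here t* ≈ 1.59 t);
# Gaussian predictions stay near the prior mean of 75.
```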
[Figure: posterior distribution p(t_total | t) over t_total, with the observed t and the posterior median t* marked]