Overview PowerPoint PPT Presentation


Title: Overview


1
Predictive Tests
2
Overview
  • Introduction
  • Some theoretical issues
  • The failings of human intuitions in prediction
  • Issues in formal prediction
  • Inference from class membership: The individual
    versus group problem (and its only solution)
  • Some predictive tests
  • Some wider implications: Nonlinear predictions in
    science and psychometrics

3
Predictive Tests
  • Many tests are used to make predictions, of
    levels of achievement or success, or of
    likelihood of recidivism, or diagnostic category
  • Two kinds of predictions:
  • Categorical: Predict which category this subject
    will fall into (diagnosis, occupation)
  • Numerical: Predict the value of a relevant
    numerical variable (GPA, economic return to company)

4
The failings of human intuition
  • We have already seen many ways in which humans
    succumb to errors in numerical reasoning
  • Kahneman & Tversky asked subjects about areas of
    graduate specialization, collecting three
    judgments: base rate estimates, estimates (from a
    description) of similarity to other students in
    each field, and predictive estimates (also from a
    description)

5
Results
  • Similarity and prediction correlate at 0.97
  • Similarity and base rates correlate at -0.65
  • What does this result remind you of?
  • What do these subjects need to be taught?

6
6 Errors discussed by Kahneman & Tversky
  • Representativeness error: Assumes predictions are
    not different from assessments of similarity
  • Insufficient regression error: People fail to
    take into account that when predictive validity
    is less than perfect, correlations between
    predictors and performance should be < 1
  • Central tendency error: Subjects making judgments
    tend to avoid extremes, and compress their
    judgments into a smaller range than the
    phenomenon being judged
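
The insufficient regression error can be made concrete with standardized scores. The following is a minimal sketch, assuming a simple one-predictor linear model in which the best prediction of the criterion z-score is the predictor z-score multiplied by the predictive validity r. The function name and the example values (a validity of 0.3, an applicant 2 SDs above the mean) are hypothetical and do not come from the slides.

```python
def regressed_prediction(z_predictor: float, validity: float) -> float:
    """Least-squares prediction of a criterion z-score from a predictor z-score.

    Assumes both variables are standardized and linearly related.
    """
    if not -1.0 <= validity <= 1.0:
        raise ValueError("validity must be a correlation between -1 and 1")
    return validity * z_predictor


# Hypothetical applicant who scores 2 SDs above the mean on the predictor.
z_test = 2.0
intuitive = z_test  # the intuitive prediction simply matches the predictor
regressed = regressed_prediction(z_test, validity=0.3)

print(f"intuitive prediction: {intuitive:+.2f} SD")
print(f"regressed prediction: {regressed:+.2f} SD")
```

With imperfect validity, the statistically appropriate prediction sits much closer to the mean than the intuitive one.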

7
6 Errors discussed by Kahneman & Tversky
  • Discounting of prior probabilities: Human
    predictors will throw out base rate info for
    almost any reason
  • Overweighting of coherence: There is greater
    confidence in predictions based on consistent
    input than inconsistent input with the same
    average (i.e. two B's is better than a B C for
    predicting a B average)
  • Overweighting of extremes: Confidence in judgment
    is over-weighted at extremes, especially positive
    extremes (J-shaped confidence function)

8
What do we need to make good predictions?
  • We need three pieces of information
  • 1.) Base rates
  • 2.) Relevant predictors in the individual case
  • 3.) Bounds on accuracy (cutting scores)
  • Kahneman & Tversky's experimental evidence
    (previous slides) shows that subjects usually fail
    to weight any of these three properly
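
One way to see how the three pieces of information fit together is Bayes' rule. This is a sketch under stated assumptions: the base rate, the hit rate and the false-alarm rate at a given cutting score are all hypothetical numbers chosen for illustration, and the function name is not from the slides.

```python
def posterior_probability(base_rate: float, hit_rate: float,
                          false_alarm_rate: float) -> float:
    """P(category | positive test result) by Bayes' rule.

    base_rate        : P(category) in the population
    hit_rate         : P(positive | category) at the chosen cutting score
    false_alarm_rate : P(positive | not category) at the same cutting score
    """
    true_positives = base_rate * hit_rate
    false_positives = (1.0 - base_rate) * false_alarm_rate
    return true_positives / (true_positives + false_positives)


# Hypothetical values: a rare category and a fairly accurate predictor.
p = posterior_probability(base_rate=0.05, hit_rate=0.80, false_alarm_rate=0.20)
print(f"P(category | positive) = {p:.3f}")
```

Even with a predictor that is right 80% of the time, a low base rate keeps the posterior probability well below one half, which is the information that subjects who ignore base rates discard.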

9
What can we infer from class membership?
  • Some commentators have suggested that inference
    from class membership is inherently fallacious
  • e.g. 25% of first-degree relatives of those
    diagnosed with malignant melanoma (skin cancer)
    will also develop melanoma
  • I am a first-degree relative of a person
    diagnosed with melanoma, so I take my odds of
    developing the disease to be 25%
  • Critics of the inference say: No, it is either 0%
    (I don't develop the disease) or 100% (I do),
    i.e. group probabilities don't apply to
    individuals

10
Do group probabilities apply to individuals?
  • Meehl's response: "If nothing is rationally
    inferable from membership in a class, no
    empirical prediction is ever possible"
  • The argument is a re-statement of the necessity
    of inference: even in the case of predicting
    individual behavior from that individual's data,
    we need to consider the pattern over past data
  • Moreover, the claim of 'certainty' is philosophical,
    not real: in the absence of knowing which group
    you are in, there is only probability, not
    knowledge
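
The practical force of this response can be illustrated with a scoring rule. The sketch below uses expected squared error (the Brier score), which is a technique brought in here for illustration and not one named in the slides. It assumes a base rate of 0.25, as in the melanoma example, and compares the group-based forecast with the "either 0 or 100" forecasts.

```python
def expected_squared_error(forecast: float, base_rate: float) -> float:
    """Expected Brier score of a constant probability forecast.

    With probability base_rate the outcome is 1, otherwise it is 0.
    """
    return base_rate * (1.0 - forecast) ** 2 + (1.0 - base_rate) * forecast ** 2


BASE_RATE = 0.25  # group rate from the melanoma example

for forecast in (0.0, 0.25, 1.0):
    error = expected_squared_error(forecast, BASE_RATE)
    print(f"forecast {forecast:.2f}: expected squared error {error:.4f}")

# Search a grid of forecasts for the one with the lowest expected error.
grid = [i / 100 for i in range(101)]
best = min(grid, key=lambda f: expected_squared_error(f, BASE_RATE))
print(f"best forecast on the grid: {best:.2f}")
```

Someone who does not know which outcome they will have does best, in expectation, by adopting the group probability.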

11
Some Predictive Tests: Standardized admission
tests
  • Thanks to Lily Tsui for these GRE slides
  • Scholastic Aptitude Tests (SAT, GREs) are highly
    reliable tests developed to painstaking
    psychometric standards
  • The general GRE has four sections: verbal
    (including reading comprehension), quantitative
    (including chart comprehension), analytical, and
    a random test section
  • The subject test has 215 multiple choice
    questions
  • On psychology: 40% experimental/natural science,
    43% social science, 17% general
  • The test is timed and corrected for guessing

12
Sample Verbal Questions
  • Analogies
  • ETERNAL : END
  • a. precursory : beginning
  • b. grammatical : sentence
  • c. implausible : credibility
  • d. invaluable : worth
  • e. frenetic : movement

13
Sample Verbal Questions
  • Sentence Completions
  • Museums, which house many paintings and
    sculptures, are good places for students of
    _____.
  • a. art
  • b. science
  • c. religion
  • d. dichotomy
  • e. democracy

14
Sample Verbal Questions
  • Antonyms
  • MALADROIT
  • a. ill-willed
  • b. dexterous
  • c. cowardly
  • d. enduring
  • e. sluggish

15
Sample Quantitative Questions
  • Quantitative Comparison
  • Column A: y - 6    Column B: -3
  • If y > 2
  • a. the quantity in column A is always greater
  • b. the quantity in column B is always greater
  • c. the quantities are always equal
  • d. It cannot be determined from the information
    given

16
Sample Quantitative Questions
  • Problem Solving
  • The sum of x distinct integers greater than zero
    is less than 75. What is the greatest possible
    value of x ?
  • a. 8
  • b. 9
  • c. 10
  • d. 11
  • e. 12
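
The reasoning behind this item is that the smallest possible sum of x distinct positive integers is 1 + 2 + ... + x, so the question reduces to finding the largest x for which that sum stays below 75. A minimal sketch of that arithmetic follows; the function name is hypothetical.

```python
def max_distinct_positive_integers(limit: int) -> int:
    """Largest count of distinct positive integers whose sum is below limit.

    The smallest sum for a given count uses 1, 2, 3, ..., so keep adding
    the next integer while the running total stays strictly under limit.
    """
    count, total = 0, 0
    while total + (count + 1) < limit:
        count += 1
        total += count
    return count


answer = max_distinct_positive_integers(75)
print(answer, sum(range(1, answer + 1)))
```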

17
Sample Analytical Questions
  • A pastry shop will feature 5 desserts (V, W, X, Y,
    Z) to be served Monday through Friday, one dessert
    a day, in an order that conforms to the following
    restrictions:
  • Y must be served before V.
  • X and Y must be served on consecutive days.
  • Z may not be the second dessert to be served.
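
Analytical items of this kind amount to a small constraint-satisfaction problem. As a sketch, the three restrictions above can be checked against every possible serving order by brute force. The slide does not state the question that followed the setup, so the code only lists the orders that satisfy the restrictions; the names used are hypothetical.

```python
from itertools import permutations

DESSERTS = "VWXYZ"


def is_valid(order) -> bool:
    """Check one Monday-to-Friday serving order against the restrictions."""
    day = {dessert: i for i, dessert in enumerate(order)}  # 0 = Monday
    return (
        day["Y"] < day["V"]                  # Y is served before V
        and abs(day["X"] - day["Y"]) == 1    # X and Y on consecutive days
        and day["Z"] != 1                    # Z is not the second dessert
    )


valid_orders = ["".join(p) for p in permutations(DESSERTS) if is_valid(p)]
print(f"{len(valid_orders)} of 120 possible orders satisfy the restrictions")
```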

18
Reliability
  • Within-test reliability is 0.9
  • Test re-test reliability is not so good: Repeat
    test takers for both tests show an average score
    gain of 20-30 points
  • This may move a student by a large amount: more
    than 10 percentiles
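
The size of the percentile shift can be checked with a normal approximation. This is a rough sketch only: it assumes scores are normally distributed with a mean of 500 and an SD of 100, values the slides do not state for the GRE, and it starts the student at the mean, where a given point gain moves the percentile most.

```python
from statistics import NormalDist


def percentile_shift(score: float, gain: float,
                     mean: float = 500.0, sd: float = 100.0) -> float:
    """Change in percentile rank when a score rises by gain points.

    mean and sd are assumed values for the score scale, not reported ones.
    """
    dist = NormalDist(mu=mean, sigma=sd)
    return 100.0 * (dist.cdf(score + gain) - dist.cdf(score))


for gain in (20, 30):
    shift = percentile_shift(500, gain)
    print(f"gain of {gain} points from the mean: {shift:.1f} percentiles")
```

Under these assumptions a 30-point gain near the middle of the distribution is worth more than 10 percentiles, while the same gain far out in the tails would move the percentile much less.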

19
Predictive Validity
  • In one meta-analysis, Sternberg and Williams
    point out that empirical validities of the
    GRE vary somewhat by field
  • Correlations between various combinations of
    GRE scores and grad school performance are only
    between 0.25 and 0.35, and only marginally better
    (0.4) if you include undergraduate grades

20
Correlations of GRE Scores
21
Construct Validity
  • Is the GRE getting at anything related to
    graduate school?
  • What about motivation, creativity, devotion,
    conscientiousness, and other aspects that make a
    successful graduate student?
  • Some complaints
  • Graduate assignments require that students
    develop research skills, but GRE does not test
    this
  • GRE is timed but real life is rarely timed
  • GRE is individualised but real work usually
    involves collaboration

22
Why is the GRE so popular?
  • Because it is in the public eye
  • Since average scores for admissions on tests such
    as the GRE are published, there is pressure on
    schools to keep the average scores of the
    students that they accept high so that they can
    remain competitive with other institutions in
    the public eye
  • One strength of the GRE is that it has a specific
    regression equation for each college, i.e. it can
    predict future performance at a particular
    college independently
  • Because there is relatively little variation in
    applicants' reference letters and undergraduate
    GPA, GRE scores are one of the main sources of the
    variation that is needed to rank applicants

23
Some Predictive Tests: The SAT
  • SAT: r = 0.4 with university GPA
  • By comparison, high school grades: r = 0.48
  • Together, r = 0.55
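
The combined figure is smaller than the two separate correlations might suggest because the predictors overlap. The sketch below uses the standard formula for the multiple correlation of two predictors. The intercorrelation between SAT scores and high school grades is not reported on the slide; the value of 0.30 is an assumption, chosen only because it approximately reproduces the combined figure.

```python
from math import sqrt


def multiple_r(r1: float, r2: float, r12: float) -> float:
    """Multiple correlation of a criterion with two predictors.

    r1, r2 : correlation of each predictor with the criterion
    r12    : correlation between the two predictors
    """
    if abs(r12) >= 1.0:
        raise ValueError("predictors must not be perfectly correlated")
    r_squared = (r1 ** 2 + r2 ** 2 - 2 * r1 * r2 * r12) / (1 - r12 ** 2)
    return sqrt(r_squared)


R_SAT, R_HS = 0.40, 0.48   # validities from the slide
ASSUMED_R12 = 0.30         # assumed predictor intercorrelation, not reported

print(f"uncorrelated predictors: R = {multiple_r(R_SAT, R_HS, 0.0):.2f}")
print(f"assumed r12 of {ASSUMED_R12}:     R = {multiple_r(R_SAT, R_HS, ASSUMED_R12):.2f}")
```

The more the two predictors correlate with each other, the less the second one adds to the first.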

24
Can you beat the standards?
  • Notwithstanding the huge industry waiting to take
    money from anxious high school students, studying
    for the SAT doesn't help much
  • SAT coaching increases scores by about 15 points,
    which is 0.15 SDs
  • Repeat testing increases it a little less, about
    12 points or 0.12 SDs

25
Some Predictive Tests: Professional tests
  • Professional school tests (MCAT, LSAT)
  • MCAT: r = low .80s
  • LSAT: r > 0.9
  • There is relatively little evidence of validity
  • They predict performance about as well as
    undergraduate GPA alone: r = 0.25 - 0.3

26
Some Predictive Tests: The Strong Interest
Inventory
  • The Strong (1927) Interest Inventory
    (Strong-Campbell, 1981): a widely used test of
    interests as predictors of professional aptitude
  • Empirically constructed with concurrent validity,
    comparing each vocational group to the overall
    average
  • Has 325 items, 162 scales covering 85 occupations
  • Reliability is high
  • 0.9 test/retest over weeks; 0.6-0.7 over years,
    unless they were old (25 years!) at first test,
    in which case 0.8 even after 20 years
  • Does not predict success or satisfaction in a
    profession
  • Does predict likelihood of entering and remaining
    in a profession: chances of 50% that a person
    will end up in a profession most strongly
    predicted (A score), and only 12% that he will
    end in one least predicted (C score)

27
Prediction in scientific psychology
  • Prediction and scientific explanation are related
  • We admire Newton's laws precisely because they
    are accurate in predicting real phenomena
  • Many cognitive models in psychology are purely
    descriptive: they fail to make an effort to
    predict how a person will perform on unseen
    stimuli
  • There are many ways to do so, if you have
    sufficient variation in predictors: multiple
    regression, neural networks, 'cheap' methods
    (e.g. best single predictor)

28
What is a linear relation?
  • Things are linearly related if they change in
    direct proportion to each other: when one goes up
    or down at a constant rate, so does the other
  • Things are non-linearly related if changes in one
    are not mirrored by analogous changes in the
    other
  • Many biological systems are non-linear
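
A small numerical example shows why the distinction matters for prediction. In this sketch the data are made up for illustration: one outcome depends linearly on x and the other depends on x squared. The Pearson correlation, which measures only linear association, captures the first relation perfectly and misses the second entirely, even though both outcomes are fully determined by x.

```python
from math import sqrt


def pearson_r(xs, ys) -> float:
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


xs = [-3, -2, -1, 0, 1, 2, 3]
linear = [2 * x + 1 for x in xs]   # changes in direct proportion to x
quadratic = [x * x for x in xs]    # perfectly predictable, but not linear

print(f"linear relation:    r = {pearson_r(xs, linear):.2f}")
print(f"quadratic relation: r = {pearson_r(xs, quadratic):.2f}")
```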

29
Example: Predicting lexical decision RTs
  • Lexical decision (time to decide if a string is
    a word or not) is a simple task to perform
  • Many well-specified variables can be calculated
    for words: frequency, similarity to other words,
    frequency of components
  • This allows for predictive testing: How well can
    we predict how long it will take (average
    reaction time, RT) to reach a decision about
    wordness?
  • We used 35 predictors, and a non-linear method of
    combining them (genetic programming) to predict
    average RTs

30
(No Transcript)
31
Some lessons about scientific prediction
  • Models can 'cheat' by using variance in the input
    data set that does not transfer to unseen data:
    you must test your predictions on unseen data
  • Some models that appear very good may be very good
    precisely because they are very good at using
    this 'within-set' variation
  • Very simple (3-variable) non-linear models may do
    as well as or better than much more complex
    models, especially linear models, and may exclude
    highly-correlated variables
  • Different measures of successful prediction may
    yield quite different results (e.g. test
    correlation versus 0.5 SD correlation)
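
The first lesson can be demonstrated with a deliberately extreme toy case. This sketch is not the genetic programming method used in the lexical decision study. It compares a "memorizer" that stores every training example with a simple fitted line, on made-up data where the outcome is 2x plus alternating noise of +1 and -1. All names and numbers are hypothetical.

```python
def mse(predict, xs, ys) -> float:
    """Mean squared error of a prediction function on a data set."""
    return sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_line(xs, ys):
    """Ordinary least-squares line through the training data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(xs, ys):
    """Lookup table of seen cases; falls back to the training mean."""
    table = dict(zip(xs, ys))
    fallback = sum(ys) / len(ys)
    return lambda x: table.get(x, fallback)


train_x = [1, 2, 3, 4, 5, 6]
train_y = [3, 3, 7, 7, 11, 11]          # 2x with noise of +1, -1, +1, ...
test_x = [1.5, 2.5, 3.5, 4.5, 5.5]      # unseen inputs
test_y = [3, 5, 7, 9, 11]               # 2x without noise

for name, fit in (("memorizer", fit_memorizer), ("line", fit_line)):
    model = fit(train_x, train_y)
    print(f"{name:9s} train MSE {mse(model, train_x, train_y):.3f}  "
          f"test MSE {mse(model, test_x, test_y):.3f}")
```

The memorizer looks perfect on the data it was built from and is far worse than the line on unseen data, which is why only the test-set figure says anything about predictive power.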

32
Prediction in psychometrics
  • A test was designed to measure the construct of
    geekiness: the extent to which a person is a
    geek.
  • This test was validated against a self-rating on
    a Likert scale.
  • The test consisted of 76 questions.
  • We split the data into two parts: a validation
    set and a test set
  • The validation set contained 59 subjects.
  • The test set contained 30 subjects.

33
Prediction in psychometrics
                        Development Set   Test Set
  Summed score               0.54           0.59
  Multiple regression        0.70           0.20
  GP                         0.89           0.56
  • The estimate produced by non-linear means is
    about as good at predicting scores on unseen
    tests as using the summed score.
  • However, the GP equation used a non-linear
    combination of responses to only 12 of the 76
    test questions in its prediction!
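
The drop from the development set to the test set can be read off the reported figures directly. The short sketch below takes the six values from the table on this slide and computes the shrinkage for each method; the variable names are hypothetical.

```python
# (development set, test set) values as reported on the slide
results = {
    "Summed score": (0.54, 0.59),
    "Multiple regression": (0.70, 0.20),
    "GP": (0.89, 0.56),
}

# Shrinkage: how much of the development-set figure is lost on unseen data.
shrinkage = {method: round(dev - test, 2)
             for method, (dev, test) in results.items()}

best_on_development = max(results, key=lambda m: results[m][0])
best_on_test = max(results, key=lambda m: results[m][1])

for method, loss in shrinkage.items():
    print(f"{method:20s} shrinkage {loss:+.2f}")
print(f"best on development set: {best_on_development}")
print(f"best on test set:        {best_on_test}")
```

Multiple regression loses the most when moved to unseen data, and the ranking of methods on the development set is not the ranking on the test set.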

34
Prediction in psychometrics
  • The take-home message: Linear assumptions may be
    very limiting
  • More predictive power may sometimes (perhaps
    often) be obtained by dropping the assumptions
    of linear relations between predictors and the
    quality to be predicted