Optical illusion ? - PowerPoint PPT Presentation

About This Presentation
Title:

Optical illusion ?

Description:

Title: Packing Densities of Permutations Author: Walter Stromquist Last modified by: Walter Stromquist Created Date: 11/24/2003 12:56:23 AM Document presentation format – PowerPoint PPT presentation

Slides: 52
Provided by: walte47
Transcript and Presenter's Notes

Title: Optical illusion ?


1
  • Optical illusion ?
  • Correlation ( r or R or ρ )
  • -- One-number summary of the strength of a
    relationship
  • -- How to recognize
  • -- How to compute
  • Regressions
  • -- Any model has predicted values and residuals.
  • (Do we always want a model with small residuals?)
  • -- Regression lines
  • --- how to use
  • --- how to compute
  • -- The regression effect
  • (Why did Galton call these things regressions?)
  • -- Pitfalls: Outliers
  • -- Pitfalls: Extrapolation
  • -- Conditions for a good regression

2
Which looks like a stronger relationship?
3
Optical Illusion ?
4
Kinds of Association
  • Positive vs. Negative
  • Strong vs. Weak
  • Linear vs. Non-linear

5
CORRELATION
  • CORRELATION
  • (or, the CORRELATION COEFFICIENT)
  • measures the strength of a linear relationship.
  • If the relationship is non-linear, it measures
    the strength of the linear part of the
    relationship. But then it doesn't tell the whole
    story.
  • Correlation can be positive or negative.

6

[Figure: two panels, labeled correlation .97 and correlation .71]
7

[Figure: Y plotted against X in two panels, labeled correlation .97 and correlation .71]
8

[Figure: Y plotted against X; both panels labeled correlation .97]
9

[Figure: two panels, labeled correlation .24 and correlation .90]
10

[Figure: two panels, labeled correlation .50 and correlation 0]
11
Computing correlation
  • Replace each variable with its standardized
    version.
  • Take an average of the products ( x_i times y_i ) of the standardized values

12
Computing correlation
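r = (sum of all the products) / (n - 1)
r is also written R, or Greek ρ (rho).
Divide by n - 1 or by n ? Either gives the same r, as long as the standard deviations use the same divisor.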
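The two-step recipe (standardize, then average the products) can be sketched in Python. This is a minimal sketch, not the presenter's code: the function names `standardize` and `correlation` and the data are made up for illustration, and the n - 1 divisor is an assumed convention.

```python
import math

def standardize(values):
    """Convert values to z-scores using the sample SD (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def correlation(xs, ys):
    """Sum the products of paired z-scores, then divide by n - 1."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(round(r, 3))
```

Using n in both the standard deviation and the final division would give the same value of r, because the divisors cancel.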
13
Good things about correlation
  • It's symmetric ( correlation of x and y means the
    same as correlation of y and x )
  • It doesn't depend on scale or units
  • -- adding a constant to either variable, or multiplying
    either variable by a positive constant, doesn't change r
    ( multiplying by a negative constant flips its sign )
  • -- of course not: r depends only on the
    standardized versions
  • r is always in the range from -1 to 1
  • +1 means perfect positive correlation: dots
    on a line
  • -1 means perfect negative correlation: dots
    on a line
  • 0 means no relationship, OR no linear
    relationship
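These properties can be checked numerically. The sketch below is illustrative only: the helper `corr` (an algebraically equivalent shortcut for averaging standardized products), the variable names, and the data are all assumptions, not part of the slides.

```python
import math

def corr(xs, ys):
    """Correlation via sums of deviations (equivalent to averaging z-score products)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up data for illustration only.
heights_in = [60, 62, 65, 70, 72]
weights_lb = [110, 120, 140, 155, 180]
r = corr(heights_in, weights_lb)

heights_cm = [2.54 * h for h in heights_in]        # change of units
weights_shifted = [w + 1000 for w in weights_lb]   # add a constant
heights_negated = [-h for h in heights_in]         # multiply by -1

symmetric = math.isclose(r, corr(weights_lb, heights_in))
unit_free = math.isclose(r, corr(heights_cm, weights_lb))
shift_free = math.isclose(r, corr(heights_in, weights_shifted))
sign_flips = math.isclose(-r, corr(heights_negated, weights_lb))
print(symmetric, unit_free, shift_free, sign_flips)
```

The last check shows the one exception to scale-invariance: multiplying a variable by a negative constant reverses the sign of r while leaving its size unchanged.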

14
Bad things about correlation
  • Sensitive to outliers
  • Misses non-linear relationships
  • Doesn't imply causality
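The first two weaknesses are easy to reproduce with tiny made-up data sets. This is a hedged sketch: the helper `corr` and all numbers are invented for illustration.

```python
import math

def corr(xs, ys):
    """Correlation via sums of deviations from the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# A perfect but non-linear relationship: y = x squared on symmetric x values.
xs = [-2, -1, 0, 1, 2]
r_parabola = corr(xs, [x ** 2 for x in xs])   # 0.0: no linear part at all

# Five points on a line, then the same points with one outlier.
line_x = [1, 2, 3, 4, 5]
r_clean = corr(line_x, [1, 2, 3, 4, 5])
r_outlier = corr(line_x, [1, 2, 3, 4, -10])   # a single bad point makes r negative

print(r_parabola, r_clean, r_outlier)
```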

15
Made-up Examples
[Figure: axes labeled STATE AVE SCORE and PERCENT TAKING SAT]
16
Made-up Examples
[Figure: axes labeled IQ and SHOE SIZE]
17
Made-up Examples
[Figure: axes labeled JUDGES' IMPRESSION and BAKING TEMP (ticks at 250, 350, 450)]
18
Made-up Examples
[Figure: axes labeled LIFE EXPECTANCY and GDP PER CAPITA]
19
Observed Values, Predictions, and Residuals
resp. var.
explanatory variable
22
Observed Values, Predictions, and Residuals
Observed value
Predicted value
resp. var.
Residual = observed - predicted
explanatory variable
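As a small illustration of these three quantities, the sketch below computes a residual for each observation. The model `predict`, its coefficients, and the data are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def predict(x, intercept=1.0, slope=2.0):
    """Hypothetical model: predicted response for explanatory value x."""
    return intercept + slope * x

# Made-up (explanatory value, observed response) pairs.
observations = [(1, 3.5), (2, 4.5), (3, 7.5)]

residuals = []
for x, observed in observations:
    predicted = predict(x)
    residual = observed - predicted   # positive when the point lies above the model
    residuals.append(residual)
    print(x, observed, predicted, residual)
```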
23
Linear models and non-linear models
  • Model A: y = a + bx + error
  • Model B: y = a x^(1/2) + error
  • Model B has smaller errors. Is it a better model?

24
  • aa opas asl poasie aaslkf 4-9043578
  • y 453209)_(_n (LKH lj)()(
    error
  • This model has even smaller errors. In fact,
    zero errors.
  • Tradeoff: small errors vs. complexity.
  • (We'll only consider linear models.)

27
About Lines
  • y = mx + b

[Figure: line labeled "slope m" and "b"]
28
About Lines
  • y = mx + b

[Figure: line labeled with slope m and y intercept b]
30
About Lines
  • y = mx + b
  • y = b + mx

[Figure: line labeled "slope m" and "b"]
31
About Lines
  • y = mx + b
  • y = b + mx
  • y = α + βx
  • y = β0 + β1x

32
About Lines
  • y = mx + b
  • y = b + mx
  • y = α + βx
  • y = β0 + β1x
  • y = b0 + b1x

33
About Lines
  • y = mx + b
  • y = b + mx
  • y = α + βx
  • y = β0 + β1x
  • y = b0 + b1x

[Figure: line labeled with slope b1 and y intercept b0]
35
Computing the best-fit line
  • In STANDARDIZED scatterplot
  • -- goes through origin
  • -- slope is r
  • In ORIGINAL scatterplot
  • -- goes through point of means
  • -- slope is r σY / σX
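The recipe for the original scatterplot can be sketched directly: compute r, scale it by the ratio of standard deviations to get the slope, and choose the intercept so the line passes through the point of means. The function name `best_fit_line` and the data are made up for illustration, and sample standard deviations (divisor n - 1) are an assumed convention.

```python
import math

def best_fit_line(xs, ys):
    """Return (intercept, slope) of the line through the point of means
    with slope r * sd_y / sd_x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # r is the average product of the standardized values.
    r = sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
            for x, y in zip(xs, ys)) / (n - 1)
    slope = r * sd_y / sd_x
    intercept = mean_y - slope * mean_x   # forces the line through the point of means
    return intercept, slope

# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
intercept, slope = best_fit_line(xs, ys)
print(round(intercept, 3), round(slope, 3))
```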

36

5 5.68 5 4.74 5 5.73 8 6.89



37
The Regression Effect
  • A preschool program attempts to boost children's
    reading scores.
  • Children are given a pre-test and a post-test.
  • Pre-test: mean score 100, SD 10
  • Post-test: mean score 100, SD 10
  • The program seems to have no effect.

38
  • A closer look at the data shows a surprising
    result:
  • Children who were below average on the pre-test
    tended to gain about 5-10 points on the post-test
  • Children who were above average on the pre-test
    tended to lose about 5-10 points on the
    post-test.

39
  • A closer look at the data shows a surprising
    result:
  • Children who were below average on the pre-test
    tended to gain about 5-10 points on the post-test
  • Children who were above average on the pre-test
    tended to lose about 5-10 points on the
    post-test.
  • Maybe we should provide the program only for
    children whose pre-test scores are below average?

40
  • Fact:
  • In most test-retest and analogous situations, the
    bottom group on the first test will on average
    tend to improve, while the top group on the first
    test will on average tend to do worse.
  • Other examples:
  • Students who score high on the midterm tend on
    average to score high on the final, but not as
    high.
  • An athlete who has a good rookie year tends to
    slump in his or her second year. ("Sophomore
    jinx", "Sports Illustrated Jinx")
  • Tall fathers tend to have sons who are tall,
    but not as tall. (Galton's original example!)

42
  • It works the other way, too:
  • Students who score high on the final tend to
    have scored high on the midterm, but not as high.
  • Tall sons tend to have fathers who are tall,
    but not as tall.
  • Students who did well on the post-test showed
    improvements, on average, of 5-10 points, while
    students who did poorly on the post-test dropped
    an average of 5-10 points.

43
  • Students can do well on the pre-test:
  • -- because they are good readers, or
  • -- because they get lucky.
  • The good readers, on average, do exactly as well
    on the post-test. The lucky group, on average,
    score lower.
  • Students can get unlucky, too, but fewer of that
    group are among the high-scorers on the pre-test.
  • So the top group on the pre-test, on average,
    tends to score a little lower on the post-test.
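The skill-plus-luck explanation can be simulated. The sketch below assumes a simple model that is not from the slides: each score is a stable skill level plus independent luck on each test, with made-up spreads, and the program itself has no effect at all.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

def simulate_student():
    """Assumed model: score = stable skill + independent luck on each test."""
    skill = random.gauss(100, 7)
    pre = skill + random.gauss(0, 7)
    post = skill + random.gauss(0, 7)
    return pre, post

def mean(values):
    return sum(values) / len(values)

students = [simulate_student() for _ in range(20000)]

# Split by pre-test score and compare the average change in each group.
high = [(pre, post) for pre, post in students if pre > 100]
low = [(pre, post) for pre, post in students if pre <= 100]

high_change = mean([post - pre for pre, post in high])
low_change = mean([post - pre for pre, post in low])
overall_change = mean([post - pre for pre, post in students])

print(round(high_change, 2), round(low_change, 2), round(overall_change, 2))
```

Under these assumptions the above-average group drops and the below-average group gains, while the overall average barely moves. Splitting on the post-test instead of the pre-test would show the same pattern in reverse.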

44
Extrapolation
  • Interpolation: Using a model to estimate Y
    for an X value within the range on which the
    model was based.
  • Extrapolation: Estimating based on an X value
    outside the range.

45
Extrapolation
  • Interpolation: Using a model to estimate Y
    for an X value within the range on which the
    model was based.
  • Extrapolation: Estimating based on an X value
    outside the range.
  • Interpolation: Good. Extrapolation: Bad.
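In code, the distinction amounts to checking a new X value against the range of the data used to fit the model. The sketch below is illustrative: the function `predict_with_range_check`, its parameters, and the numbers are hypothetical.

```python
def predict_with_range_check(x, intercept, slope, x_min, x_max):
    """Return (prediction, kind), where kind says whether x lies inside
    the range of X values the line was fitted on."""
    kind = "interpolation" if x_min <= x <= x_max else "extrapolation"
    return intercept + slope * x, kind

# Hypothetical line fitted on X values between 1 and 5.
inside = predict_with_range_check(4, intercept=2.0, slope=0.5, x_min=1, x_max=5)
outside = predict_with_range_check(40, intercept=2.0, slope=0.5, x_min=1, x_max=5)
print(inside)
print(outside)
```

The arithmetic is identical in both cases; only the label changes, which is why extrapolated values look just as authoritative as interpolated ones.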

46
Nixon's Graph: Economic Growth
[Figure built up over slides 46-49, adding the labels "Start of Nixon Adm.", "Now", and "Projection"]
50
Conditions for regression
  • Straight enough condition (linearity)
  • Errors are mostly independent of X
  • Errors are mostly independent of anything else
    you can think of
  • Errors are more-or-less normally distributed

51
  • How to test the quality of a regression:
  • Plot the residuals.
  • Pattern: bad. No pattern: good.
  • R^2
  • How sure are you of the coefficients?
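The first and third checks can be sketched as follows. This assumes the usual definition of R^2 as one minus the ratio of residual variation to total variation; the function name `residuals_and_r_squared`, the data, and the already-fitted coefficients are made up for illustration.

```python
import math

def residuals_and_r_squared(xs, ys, intercept, slope):
    """Return the residuals of a fitted line and its R^2 (1 - SSE / SST)."""
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    mean_y = sum(ys) / len(ys)
    sse = sum(e ** 2 for e in residuals)        # variation left unexplained
    sst = sum((y - mean_y) ** 2 for y in ys)    # total variation in y
    return residuals, 1 - sse / sst

# Made-up data with its least-squares line y = 2.2 + 0.6x.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
residuals, r2 = residuals_and_r_squared(xs, ys, intercept=2.2, slope=0.6)

# Inspect the residuals against x: any visible pattern is a warning sign.
for x, e in zip(xs, residuals):
    print(x, round(e, 2))
print(round(r2, 3))
```

For a least-squares line with one explanatory variable, this R^2 equals the square of the correlation r.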