# Optical illusion ? - PowerPoint PPT Presentation

Title:

## Optical illusion ?

Description:

### Title: Packing Densities of Permutations Author: Walter Stromquist Last modified by: Walter Stromquist Created Date: 11/24/2003 12:56:23 AM Document presentation format – PowerPoint PPT presentation

Slides: 52
Provided by: walte47
Transcript and Presenter's Notes

Title: Optical illusion ?

1
• Optical illusion?
• Correlation (r or R or ρ)
• -- One-number summary of the strength of a relationship
• -- How to recognize
• -- How to compute
• Regressions
• -- Any model has predicted values and residuals.
• (Do we always want a model with small residuals?)
• -- Regression lines
• --- how to use
• --- how to compute
• -- The regression effect
• (Why did Galton call these things regressions?)
• -- Pitfalls: Outliers
• -- Pitfalls: Extrapolation
• -- Conditions for a good regression

2
Which looks like a stronger relationship?
3
Optical Illusion ?
4
Kinds of Association
• Positive vs. Negative
• Strong vs. Weak
• Linear vs. Non-linear

5
CORRELATION
• CORRELATION (or, the CORRELATION COEFFICIENT) measures the strength of a linear relationship.
• If the relationship is non-linear, it measures the strength of the linear part of the relationship. But then it doesn't tell the whole story.
• Correlation can be positive or negative.

6

[Figure: two scatterplots, labeled correlation .97 and correlation .71]
7

[Figure: two scatterplots of Y against X, labeled correlation .97 and correlation .71]
8

[Figure: scatterplot of Y against X; both labels read correlation .97]
9

[Figure: two scatterplots, labeled correlation .24 and correlation .90]
10

[Figure: two scatterplots, labeled correlation .50 and correlation 0]
11
Computing correlation
• Replace each variable with its standardized version.
• Take an average of the products ($x_i$ times $y_i$).

12
Computing correlation
$r = \dfrac{\text{sum of all the products}}{n - 1 \ \text{or}\ n\,?}$
Written r, or R, or Greek ρ (rho).
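The two-step recipe (standardize, then average the products) can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's own code: the data values and the function names `standardize` and `correlation` are made up for the example. The `ddof` parameter switches between dividing by n-1 (`ddof=1`) and by n (`ddof=0`). Either choice gives the same r, as long as the same divisor is used for the standard deviations and for the average of the products.

```python
from math import sqrt

def standardize(values, ddof=1):
    """Return z-scores: subtract the mean, divide by the standard deviation."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - ddof))
    return [(v - mean) / sd for v in values]

def correlation(xs, ys, ddof=1):
    """Average of the products of standardized values (divisor n - ddof)."""
    zx = standardize(xs, ddof)
    zy = standardize(ys, ddof)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - ddof)

# Hypothetical data, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_sample = correlation(xs, ys, ddof=1)      # divide by n - 1 throughout
r_population = correlation(xs, ys, ddof=0)  # divide by n throughout
print(round(r_sample, 4), round(r_population, 4))
```

Both calls print the same value, because the divisor cancels between the standard deviations and the final average.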
13
Good things about correlation
• It's symmetric (correlation of x and y means the same as correlation of y and x)
• It doesn't depend on scale or units
• -- adding a constant to either variable, or multiplying either variable by a positive constant, doesn't change r
• -- of course not: r depends only on the standardized versions
• r is always in the range from -1 to 1
• -- 1 means perfect positive correlation: dots on a line
• -- -1 means perfect negative correlation: dots on a line
• -- 0 means no relationship, OR no linear relationship
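These properties are easy to check numerically. The sketch below uses made-up heights and weights (all values and variable names are hypothetical) and confirms that r is unchanged when the variables are swapped, when units change, or when a constant is added. It also shows one caveat: multiplying a variable by a negative constant flips the sign of r.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation computed from sums of squared deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical heights (inches) and weights (pounds).
heights_in = [60, 62, 65, 68, 70, 73]
weights_lb = [115, 120, 140, 150, 165, 180]

r = correlation(heights_in, weights_lb)
r_swapped = correlation(weights_lb, heights_in)            # symmetry
r_metric = correlation([2.54 * h for h in heights_in],     # inches -> cm
                       [0.4536 * w for w in weights_lb])   # pounds -> kg
r_shifted = correlation([h + 100 for h in heights_in], weights_lb)
r_flipped = correlation([-h for h in heights_in], weights_lb)

print(r, r_swapped, r_metric, r_shifted, r_flipped)
```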

14
• Sensitive to outliers
• Misses non-linear relationships
• Doesn't imply causality
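The first two weaknesses can be demonstrated with tiny made-up data sets (every number below is hypothetical). In the first, y is completely determined by x, yet r is 0 because the relationship is not linear. In the second, a single outlier turns a weak correlation into one that looks very strong.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation computed from sums of squared deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Misses non-linear relationships: y = x squared on a symmetric range of x.
xs_curve = [-3, -2, -1, 0, 1, 2, 3]
ys_curve = [x * x for x in xs_curve]
r_curve = correlation(xs_curve, ys_curve)

# Sensitive to outliers: the same weak data with and without one extreme point.
xs_weak = [1, 2, 3, 4, 5, 6]
ys_weak = [3, 1, 4, 2, 5, 3]
r_without_outlier = correlation(xs_weak, ys_weak)
r_with_outlier = correlation(xs_weak + [20], ys_weak + [20])

print(r_curve, r_without_outlier, r_with_outlier)
```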

15
[Scatterplot: STATE AVE SCORE vs. PERCENT TAKING SAT]
16
[Scatterplot: IQ vs. SHOE SIZE]
17
[Scatterplot: JUDGES' IMPRESSION vs. BAKING TEMP (axis ticks at 250, 350, 450)]
18
[Scatterplot: LIFE EXPECTANCY vs. GDP PER CAPITA]
19
Observed Values, Predictions, and Residuals
resp. var.
explanatory variable
22
Observed Values, Predictions, and Residuals
Observed value
Predicted value
resp. var.
Residual = observed - predicted
explanatory variable
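As a small worked example of these three terms, the sketch below takes a few made-up observations and a made-up linear model (the model y = 1 + 2x and all data values are assumptions for illustration only) and computes the predicted value and the residual for each point.

```python
# Hypothetical observations of a response variable at four x values.
xs = [1, 2, 3, 4]
observed = [3.5, 4.5, 7.5, 8.5]

def predict(x):
    """Hypothetical model, chosen only for illustration: y = 1 + 2x."""
    return 1 + 2 * x

predicted = [predict(x) for x in xs]

# Residual = observed - predicted, one per data point.
residuals = [obs - pred for obs, pred in zip(observed, predicted)]

for x, obs, pred, res in zip(xs, observed, predicted, residuals):
    print(f"x={x}  observed={obs}  predicted={pred}  residual={res}")
```

A positive residual means the point lies above the model's prediction; a negative residual means it lies below.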
23
Linear models and non-linear models
• Model A: $y = a + bx + \text{error}$
• Model B: $y = a + x^{1/2} + \text{error}$
• Model B has smaller errors. Is it a better model?

24
• aa opas asl poasie aaslkf 4-9043578
• y 453209)_(_n (LKH lj)()(
error
• This model has even smaller errors. In fact, zero errors.
• Tradeoff: small errors vs. complexity.
• (We'll only consider linear models.)
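One concrete way to get a zero-error model, offered here as an illustration rather than as the model on the slide, is to pass a polynomial of degree n-1 through all n data points. The sketch below compares its residuals with those of a straight line on made-up data; the function names and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

def interpolating_polynomial(xs, ys, x):
    """Lagrange form of the degree n-1 polynomial through all n points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

# Hypothetical data that roughly follows a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(xs, ys)
line_residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
poly_residuals = [y - interpolating_polynomial(xs, ys, x)
                  for x, y in zip(xs, ys)]

print("line residuals:", [round(e, 3) for e in line_residuals])
print("poly residuals:", [round(e, 3) for e in poly_residuals])
```

The polynomial reproduces every data point exactly, but it needs five coefficients to describe five points, where the line needs two. That is the complexity side of the tradeoff.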

25
(No Transcript)
26
(No Transcript)
27
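• $y = mx + b$

[Graph of a line, labeled: slope m; b]
28
• $y = mx + b$

[Graph of a line, labeled: slope m; y intercept b]
29
• $y = mx + b$

[Graph of a line, labeled: slope m; b]
30
• $y = mx + b$
• $y = b + mx$

[Graph of a line, labeled: slope m; b]
31
• $y = mx + b$
• $y = b + mx$
• $y = \alpha + \beta x$
• $y = \beta_0 + \beta_1 x$

32
• $y = mx + b$
• $y = b + mx$
• $y = \alpha + \beta x$
• $y = \beta_0 + \beta_1 x$
• $y = b_0 + b_1 x$

33
• $y = mx + b$
• $y = b + mx$
• $y = \alpha + \beta x$
• $y = \beta_0 + \beta_1 x$
• $y = b_0 + b_1 x$

[Graph of a line, labeled: slope $b_1$; y intercept $b_0$]
34
• $y = mx + b$
• $y = b + mx$
• $y = \alpha + \beta x$
• $y = \beta_0 + \beta_1 x$
• $y = b_0 + b_1 x$

[Graph of a line, labeled: slope $b_1$; y intercept $b_0$]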
35
Computing the best-fit line
• In STANDARDIZED scatterplot
• -- goes through origin
• -- slope is r
• In ORIGINAL scatterplot
• -- goes through point of means
• -- slope is $r \, \sigma_Y / \sigma_x$
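The recipe for the original scatterplot translates directly into code: compute r, take the slope as r times the ratio of the standard deviations, and choose the intercept so that the line passes through the point of means. The sketch below is a minimal illustration on made-up data; the function name `best_fit_line` and the data values are assumptions.

```python
from math import sqrt

def best_fit_line(xs, ys):
    """Return (b0, b1, r) for the line y = b0 + b1 * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # r is the average product of the standardized values.
    r = sum(((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
            for x, y in zip(xs, ys)) / (n - 1)
    b1 = r * sd_y / sd_x          # slope in the original units
    b0 = mean_y - b1 * mean_x     # forces the line through the point of means
    return b0, b1, r

# Hypothetical data, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1, r = best_fit_line(xs, ys)
print(f"y = {b0:.2f} + {b1:.2f} x   (r = {r:.4f})")
```

Using the line is then a single evaluation, for example `b0 + b1 * 3.5` to predict y at x = 3.5.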

36

5 5.68 5 4.74 5 5.73 8 6.89

37
The Regression Effect
• A preschool program attempts to boost children's reading scores.
• Children are given a pre-test and a post-test.
• Pre-test: mean score 100, SD 10
• Post-test: mean score 100, SD 10
• The program seems to have no effect.

39
• A closer look at the data shows a surprising result:
• Children who were below average on the pre-test
tended to gain about 5-10 points on the post-test
• Children who were above average on the pre-test
tended to lose about 5-10 points on the
post-test.
• Maybe we should provide the program only for
children whose pre-test scores are below average?

40
• Fact:
• In most test-retest and analogous situations, the bottom group on the first test will on average tend to improve, while the top group on the first test will on average tend to do worse.
• Other examples:
• Students who score high on the midterm tend on average to score high on the final, but not as high.
• An athlete who has a good rookie year tends to slump in his or her second year. ("Sophomore jinx," "Sports Illustrated jinx")
• Tall fathers tend to have sons who are tall, but not as tall. (Galton's original example!)

41
(No Transcript)
42
• It works the other way, too:
• Students who score high on the final tend to
have scored high on the midterm, but not as high.
• Tall sons tend to have fathers who are tall,
but not as tall.
• Students who did well on the post-test showed
improvements, on average, of 5-10 points, while
students who did poorly on the post-test dropped
an average of 5-10 points.

43
• Students can do well on the pre-test:
• -- because they are good readers, or
• -- because they get lucky.
• The good readers, on average, do exactly as well
on the post-test. The lucky group, on average,
score lower.
• Students can get unlucky, too, but fewer of that
group are among the high-scorers on the pre-test.
• So the top group on the pre-test, on average,
tends to score a little lower on the post-test.
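The skill-plus-luck explanation can be checked with a small simulation. The sketch below assumes each score is a stable skill component plus fresh luck on each test, with both parts normally distributed (SD 7 each, so total SD is close to 10). Those numbers, the sample size, and the variable names are assumptions for illustration. The program has no effect at all in this simulation, yet the groups still appear to move.

```python
import random

rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
N = 20000               # hypothetical number of children

# Assumed model: score = stable skill + fresh luck on each test.
skill = [rng.gauss(100, 7) for _ in range(N)]
pre = [s + rng.gauss(0, 7) for s in skill]
post = [s + rng.gauss(0, 7) for s in skill]

def mean_change(pairs):
    """Average (post - pre) over a group of (pre, post) pairs."""
    return sum(after - before for before, after in pairs) / len(pairs)

pairs = list(zip(pre, post))
top_on_pre = [p for p in pairs if p[0] > 100]
bottom_on_pre = [p for p in pairs if p[0] < 100]
top_on_post = [p for p in pairs if p[1] > 100]

print("everyone:          ", round(mean_change(pairs), 2))
print("above avg on pre:  ", round(mean_change(top_on_pre), 2))
print("below avg on pre:  ", round(mean_change(bottom_on_pre), 2))
print("above avg on post: ", round(mean_change(top_on_post), 2))
```

The overall change is near zero, the group that was above average on the pre-test drops, and the group that was below average gains. Selecting on the post-test instead reverses the direction, which is the "other way" effect. The size of the shift depends on the assumed mix of skill and luck.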

45
Extrapolation
• Interpolation: using a model to estimate Y for an X value within the range on which the model was based.
• Extrapolation: estimating based on an X value outside the range.
• Interpolation: good. Extrapolation: bad.
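A short numerical sketch shows why extrapolation is risky. Here the true relationship is assumed to be y = sqrt(x), a choice made purely for illustration, and a straight line is fitted to x values between 1 and 25. The line does tolerably inside that range and badly far outside it.

```python
from math import sqrt

def fit_line(xs, ys):
    """Least-squares line; returns (intercept, slope)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope

# Hypothetical data: the true (curved) relationship is y = sqrt(x).
xs = [1, 4, 9, 16, 25]
ys = [sqrt(x) for x in xs]

b0, b1 = fit_line(xs, ys)

def predict(x):
    return b0 + b1 * x

# Interpolation: x = 12 lies inside the fitted range [1, 25].
interpolation_error = abs(predict(12) - sqrt(12))
# Extrapolation: x = 400 lies far outside the fitted range.
extrapolation_error = abs(predict(400) - sqrt(400))

print("error at x=12: ", round(interpolation_error, 2))
print("error at x=400:", round(extrapolation_error, 2))
```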

46
Nixon's Graph: Economic Growth
47
Nixon's Graph: Economic Growth
Start of Nixon Adm.
48
Nixon's Graph: Economic Growth
Start of Nixon Adm.
Now
49
Nixon's Graph: Economic Growth
Start of Nixon Adm.
Projection
Now
50
Conditions for regression
• Straight enough condition (linearity)
• Errors are mostly independent of X
• Errors are mostly independent of anything else
you can think of
• Errors are more-or-less normally distributed

51
• How to test the quality of a regression:
• Plot the residuals.
• Pattern: bad. No pattern: good.
• $R^2$
• How sure are you of the coefficients?
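The first two checks can be sketched numerically. The code below fits a least-squares line to made-up data (all values and names are hypothetical), lists the residuals that one would plot against x, and computes $R^2$ as one minus the ratio of residual variation to total variation. For a regression line with a single explanatory variable, $R^2$ equals the square of the correlation r, which the last line confirms.

```python
from math import sqrt

# Hypothetical data, made up for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mean_x, mean_y = sum(xs) / n, sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

b1 = sxy / sxx                 # least-squares slope
b0 = mean_y - b1 * mean_x      # intercept through the point of means
r = sxy / sqrt(sxx * syy)      # correlation

# Residuals: these are what one would plot against x to look for a pattern.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

ss_residual = sum(e ** 2 for e in residuals)
r_squared = 1 - ss_residual / syy   # share of variation explained by the line

print("residuals:", [round(e, 2) for e in residuals])
print("R^2:", round(r_squared, 4), " r^2:", round(r ** 2, 4))
```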