1
Lecture 2 - Alternate Correlation Procedures
  • EPSY 640
  • Texas A&M University

2
CORRELATION MEASURES FOR VARIOUS SCALES OF MEASUREMENT
  • X Nominal (dichotomous) with Y:
      Nominal (dichotomous): phi coefficient, Yule's Q, tetrachoric
      Nominal (polychotomous): Goodman's lambda (λ) association, Tschuprow's T
      Ordinal: rank-biserial
      Interval or ratio: point-biserial, biserial, Nagelkerke R-square (logistic)
  • X Nominal (polychotomous) with Y:
      Nominal (polychotomous): Pearson association, Tschuprow's T, Cramer's C, or reduce to dichotomous
      Ordinal: Kruskal-Wallis based statistic
      Interval or ratio: R-square
  • X Ordinal with Y:
      Ordinal: Spearman, Kendall's tau
      Interval or ratio: R-squared
  • X Interval or Ratio with Y:
      Interval or ratio: Pearson r
3
Dichotomous-Dichotomous Case - PHI COEFFICIENT
  • For a 2 x 2 table of frequencies a, b, c, d, the phi coefficient can be written as
  • rphi = (ad - bc) / [(a+c)(b+d)(a+b)(c+d)]^1/2

                  MINORITY STATUS
GENDER              a     b
                    c     d
4
Dichotomous-Dichotomous Case - PHI COEFFICIENT
  • rphi = (7x10 - 2x2) / [(7+2)(2+10)(7+2)(2+10)]^1/2
  •      = (70 - 4) / [9x12x9x12]^1/2
  •      = 66/108 = .611
  • Pearson r = .157/(.5071 x .5071) = .611

                 Political affiliation     Row total
Gender                 7         2              9
                       2        10             12
Column total           9        12       Total 21
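As a check on the arithmetic above, the phi formula can be applied directly to the four cell counts. This is an illustrative Python sketch (assuming NumPy is available), not part of the original lecture:

```python
import numpy as np

# 2 x 2 cell counts from the slide (rows = gender, columns = political affiliation)
a, b = 7, 2
c, d = 2, 10

# rphi = (ad - bc) / [(a+b)(c+d)(a+c)(b+d)]^1/2
rphi = (a * d - b * c) / np.sqrt((a + b) * (c + d) * (a + c) * (b + d))
print(round(rphi, 3))  # 0.611, matching the slide
```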
5
CHI-SQUARE

χ² = Σ(i=1 to I) Σ(j=1 to J) n.j (nij/n.j - ni./n)² / (ni./n)

   = 9(7/9 - 9/21)²/(9/21) + 12(2/12 - 9/21)²/(9/21)
     + 9(2/9 - 12/21)²/(12/21) + 12(10/12 - 12/21)²/(12/21)
   = 2.56 + 1.92 + 1.92 + 1.44
   = 7.84

PEARSON ASSOCIATION
P = [χ² / (χ² + n)]^1/2 = [7.84/28.84]^1/2 = .52

TSCHUPROW'S T
T = [χ² / (n[(r-1)(c-1)]^1/2)]^1/2 = [7.84/(21 x 1 x 1)]^1/2 = .61
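These quantities can be verified with SciPy's chi-square routine; the sketch below (assuming SciPy is installed) reproduces χ², the Pearson association, and Tschuprow's T for the 2 x 2 table above:

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[7, 2],
                  [2, 10]])
n = table.sum()
r, c = table.shape

# Pearson chi-square without Yates' continuity correction
chi2, p, dof, expected = chi2_contingency(table, correction=False)

pearson_P = np.sqrt(chi2 / (chi2 + n))                          # Pearson association
tschuprow_T = np.sqrt(chi2 / (n * np.sqrt((r - 1) * (c - 1))))  # Tschuprow's T
print(round(chi2, 2), round(pearson_P, 2), round(tschuprow_T, 2))  # 7.84 0.52 0.61
```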
6
SPSS Crosstabs procedure
  • Select Analyze/ Descriptive Statistics/
    Crosstabs
  • Select Row and Column variables for the two
    nominal variables
  • Under Statistics, select the options you want,
    such as Chi-square and the various nominal
    association measures

7
(No Transcript)
8
(No Transcript)
9
[Figure: bivariate normal distribution cut into quadrants (0,0), (0,1), (1,0), (1,1) by thresholds on each axis]
TETRACHORIC ASSUMPTIONS - underlying normality of the observed dichotomies
10
Table 3.4  Computation of tetrachoric correlation

        n11 = 70     n12 = 20
        n21 = 20     n22 = 100          n = 210

ux = height of normal curve for proportion 90/210 = U(.4290)
     The z-score for .429 is -.18
     The ux for U(.4290) (requires a stat table or the SPSS function PDF.NORMAL) = .3637
uy = height of normal curve for proportion 90/210 = U(.4290) = .3637

rtet = [(70 x 100) - (20 x 20)] / (.3637 x .3637 x 210²)
     = 6600 / (.132 x 210²)
     = 1.13                             not a good estimate!
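The Table 3.4 approximation can be reproduced with SciPy's normal distribution functions; this is a sketch under the slide's conventions (evaluating the density at the proportion itself, as the slide's PDF.NORMAL value suggests), not a general-purpose tetrachoric routine:

```python
from scipy.stats import norm

# Cell counts from Table 3.4
n11, n12, n21, n22 = 70, 20, 20, 100
n = n11 + n12 + n21 + n22              # 210
p = 90 / n                             # marginal proportion = .429 (rows and columns match here)

u = norm.pdf(p)                        # ~.3637, the value used on the slide
# (Another common choice is the ordinate at the cut point, norm.pdf(norm.ppf(p)) ~ .393.)

rtet = (n11 * n22 - n12 * n21) / (u * u * n ** 2)
print(round(u, 4), round(rtet, 2))     # 0.3637 1.13 -- out of range, hence not a good estimate
```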
11
[Figure: normal density height]
12
[Figure: normal density height]
13
POINT-BISERIAL CORRELATION
  • Y = score on an interval measure (e.g., test score)
  • x = 0 or 1 (grouping, e.g., gender)

rpb = [(Ȳ1. - Ȳ0.) / sy] x [n1 n0 / (n(n-1))]^1/2
14
Descriptive Statistics                 Mean      SD      N
GENDER                                 .4667     .51     15     Covariance = 1.846
READING COMPREHENSION
   0 (boys)                            68.26     18.39    8
   1 (girls)                           75.19     11.11    7
   Total                               71.49     15.32   15

rpb = [(75.19 - 68.26) / 15.32] x [8 x 7 / (15 x 14)]^1/2 = .233

Table 3.5  Calculation of point-biserial correlation coefficient for First Grade
reading comprehension of boys and girls
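The point-biserial value in Table 3.5 can be recovered from the summary statistics alone; the Python sketch below (assuming NumPy) simply restates the formula, and with raw scores scipy.stats.pointbiserialr would give the same result:

```python
import numpy as np

# Summary statistics from Table 3.5
mean_girls, mean_boys = 75.19, 68.26   # group means on reading comprehension
sy = 15.32                             # SD of all 15 scores
n1, n0 = 7, 8                          # girls (coded 1) and boys (coded 0)
n = n1 + n0

# rpb = [(mean_1 - mean_0) / sy] x [n1 n0 / (n(n-1))]^1/2
rpb = (mean_girls - mean_boys) / sy * np.sqrt(n1 * n0 / (n * (n - 1)))
print(round(rpb, 3))                   # ~0.234; the slide reports .233
```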
15
POINT BISERIAL CORRELATION
[Figure: scatterplot of Y scores for the two groups (M and F), with each group's mean marked]
16
Dichotomous (Normal)-Interval Case
  • biserial correlation
  • rbis = [(Ȳ1. - Ȳ0.) / sy] x [n1 n0 / (ux n²)]
  • where ux = height of the normal curve for proportion n1/(n0 + n1)

17
rbis = [(75.19 - 68.26) / 15.32] x [8 x 7 / (.3675 x 15²)]
     = .306
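A corresponding sketch for the biserial coefficient is below (assuming SciPy). Note that computing the ordinate as the normal density at the cut point for 7/15 gives about .398, somewhat larger than the .3675 used on the slide, so the resulting coefficient differs slightly from the slide's .306:

```python
from scipy.stats import norm

# Quantities from the Table 3.5 example
mean_1, mean_0 = 75.19, 68.26          # girls (1) and boys (0)
sy = 15.32
n1, n0 = 7, 8
n = n1 + n0

# Height of the normal curve at the cut point for proportion n1/(n0 + n1)
u = norm.pdf(norm.ppf(n1 / n))         # ~.398 (the slide's tabled value is .3675)

rbis = (mean_1 - mean_0) / sy * (n1 * n0 / (u * n ** 2))
print(round(rbis, 3))                  # ~.283 with u above; .306 with u = .3675
```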
18
BISERIAL CORRELATION
[Figure: scatterplot of Y scores for the two groups (M and F), with each group's mean marked]
19
RANK-RANK DATA
  • 1. DATA ARE INTERVAL OR RATIO
  • Transformed to ranks because of odd distribution
  • 2. DATA ARE ORDINAL, NO INTERVAL INFORMATION
    AVAILABLE
  • USE SPEARMAN CORRELATION (Pearson formula used on
    the ranks - no ties assumed)

20
Rank distribution of real estate price per square foot in Manhattan (Battery to Central Park location)

[Map figure: price-per-foot ranks plotted by location; the relative positions shown are only approximate, due to typeface limitations, but all are ordered correctly]

This results in the following ranks:
location: 17 20 23 18  7  1  8 21 23 14 24  6 11  5  3 12  9 19  4 16 10 22  2 15 27 26 13
price:     8  4  1  6 12 21 22  2  5  9  3 17 24.5 27 19.5 11 19.5 10 13 16 15  7 26 24.5 18 14 23

Computation of rank correlation coefficient:
rrank = sxy / (sx sy) = -.647
rSpearman = -.640 (from SPSS, Version 13)

Table 3.7  Computation of rank correlation for Real Estate location in Manhattan with Price Per Square Foot
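The Spearman value reported from SPSS can be checked with SciPy, assuming the location and price ranks pair up in the order listed in Table 3.7:

```python
from scipy.stats import spearmanr

# Ranks from Table 3.7 (pairing assumed from the listed order)
location = [17, 20, 23, 18, 7, 1, 8, 21, 23, 14, 24, 6, 11, 5, 3,
            12, 9, 19, 4, 16, 10, 22, 2, 15, 27, 26, 13]
price = [8, 4, 1, 6, 12, 21, 22, 2, 5, 9, 3, 17, 24.5, 27, 19.5,
         11, 19.5, 10, 13, 16, 15, 7, 26, 24.5, 18, 14, 23]

rho, p_value = spearmanr(location, price)
print(round(rho, 3))   # the slide reports rSpearman = -.640 from SPSS
```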
21
Least squares estimation
  • The best estimate will be the one for which the sum of
    squared differences between each score and the
    estimate is smallest among all possible linear
    unbiased estimates (BLUE, or best linear unbiased
    estimate).

22
Least squares estimation
  • The ei are the errors or disturbances. They represent in this
    case the part of the y score not predictable from x:
  • ei = yi - b1 xi - b0
  • The sum of squares for errors follows:
  • SSe = Σ(i=1 to n) ei²

23
[Figure: scatterplot of y on x with the fitted regression line; the vertical distances e from each point to the line are the errors]
SSe = Σ ei²
24
Matrix representation of least squares estimation.
  • We can represent the regression model in matrix form:
  • y = Xβ + e

25
Matrix representation of least squares estimation
      y          X         β        e

    | y1 |    | 1 x1 |             | e1 |
    | y2 |    | 1 x2 |   | β0 |    | e2 |
    | y3 |  = | 1 x3 | x | β1 |  + | e3 |
    | y4 |    | 1 x4 |             | e4 |
    | .  |    | 1 .  |             | .  |
    | .  |    | 1 .  |             | .  |
    | .  |    | 1 .  |             | .  |

26
Matrix representation of least squares estimation
  • y = Xb + e
  • The least squares criterion is satisfied by the
    following matrix equation:
  • b = (X'X)⁻¹X'y
  • The term X' is called the transpose of the X
    matrix. It is the matrix turned on its side. When
    X'X is multiplied together, the result is a 2 x 2 matrix:
  •     | n     Σxi  |
  •     | Σxi   Σxi² |
  • Note all the information here: sample size, mean (sum of
    scores), variance (squared scores)
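The matrix solution b = (X'X)⁻¹X'y is easy to demonstrate with NumPy; the data below are hypothetical, purely for illustration:

```python
import numpy as np

# Hypothetical data (not from the lecture)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])

# Design matrix: a column of 1s for the intercept, then x
X = np.column_stack([np.ones_like(x), x])

print(X.T @ X)   # 2 x 2 matrix [[n, sum(x)], [sum(x), sum(x**2)]]

# b = (X'X)^-1 X'y  (np.linalg.lstsq is preferred numerically, but this mirrors the slide)
b = np.linalg.inv(X.T @ X) @ (X.T @ y)
print(b)         # [b0, b1]
```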

27
SUMS OF SQUARES computational equivalents
  • SSe = (n - 2) s²e
  • SSreg = Σ (b1 xi + b0 - ȳ.)²
  • SSy = SSreg + SSe

28
SUMS OF SQUARES - Venn Diagram
[Venn diagram: the overlap of SSx and SSy is SSreg; the part of SSy outside the overlap is SSe]
Fig. 8.3 Venn diagram for linear regression with one predictor and one outcome measure
29
SUMS OF SQUARES - ANOVA Table
  SOURCE    df      Sum of Squares    Mean Square     F
  x         1       SSreg             SSreg / 1       (SSreg/1) / (SSe/(n-2))
  e         n-2     SSe               SSe / (n-2)
  Total     n-1     SSy               SSy / (n-1)
  Table 8.1 Regression table for Sums of Squares
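The sums of squares and the F ratio in Table 8.1 can be computed directly; the sketch below continues the hypothetical data used above (again assuming NumPy and SciPy):

```python
import numpy as np
from scipy.stats import f as f_dist

# Hypothetical data (not from the lecture)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 2.9, 4.2, 4.8, 6.1])
n = len(y)

b1 = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

SSy = ((y - y.mean()) ** 2).sum()
SSreg = ((y_hat - y.mean()) ** 2).sum()
SSe = ((y - y_hat) ** 2).sum()                 # SSy = SSreg + SSe

F = (SSreg / 1) / (SSe / (n - 2))              # as in Table 8.1
p = f_dist.sf(F, 1, n - 2)
print(round(SSreg, 3), round(SSe, 3), round(F, 1), round(p, 4))
```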

30
  • Rupley and Willson (1997) studied the
    relationship between word recognition and reading
    comprehension for 200 six- and seven-year-olds
    using a national sample of students that mirrored
    the U.S. census of 1980. The mean for Word
    Recognition was 100, SD = 15, and the mean for
    Reading Comprehension was 23.16, SD = 14.74. The
    regression analysis is reported in the table below:
  • Dep. Var.: Reading Comprehension
  SOURCE            df     Sum of Squares    Mean Square    F         Prob.    R²
  Word recognition   1       34316.55          34316.55     763.17    .001     .794
  error            198        8903.23             44.97
  total            199       43219.78                       se = 6.71

31
Two variable linear regression: Which direction?
  • Regression equations:
  • y = xb1 x + xb0
  • x = yb1 y + yb0
  • Regression coefficients (the leading subscript denotes the predictor):
  • xb1 = rxy sy / sx
  • yb1 = rxy sx / sy

32
Two variable linear regression
  • y = b1 x + b0
  • If the correlation coefficient is calculated,
    then b1 can be calculated from the equation
    above:
  • b1 = rxy sy / sx
  • The intercept, b0, follows by placing the means
    for x and y into the equation above and solving:
  • b0 = ȳ. - (rxy sy / sx) x̄.
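Given the correlation and the descriptive statistics, the slope and intercept follow immediately; a small sketch with hypothetical values, for illustration only:

```python
# Hypothetical summary statistics
rxy = 0.60
sx, sy = 2.0, 10.0
x_bar, y_bar = 5.0, 50.0

b1 = rxy * sy / sx          # slope for predicting y from x
b0 = y_bar - b1 * x_bar     # intercept from the two means
print(b1, b0)               # 3.0 35.0
```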

33
Two variable linear regression
[Figure: the two regression lines plotted in the x-y plane, one with slope xb1 = rxy sy / sx (predicting y from x) and one with slope yb1 = rxy sx / sy (predicting x from y)]
Fig. 8.1 Slopes for two regression representations of Pearson correlation
34
Three variable linear regression
  • y = b1 x1 + b2 x2 + b0
  • Two predictors; all variables may be correlated
    with each other
  • Exact equations exist to compute b1 and b2, but for
    more than two predictors the matrix-form normal
    equations must be used (see the sketch below)
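For two (or more) predictors the normal equations from the matrix slide still apply, now with a wider X; a hypothetical NumPy sketch:

```python
import numpy as np

# Hypothetical data with two correlated predictors (not from the lecture)
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 2.8, 6.0, 5.5, 8.9, 8.2])

X = np.column_stack([np.ones_like(x1), x1, x2])   # intercept, x1, x2

# Least squares solution of y = Xb + e (equivalent to b = (X'X)^-1 X'y)
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)   # [b0, b1, b2]
```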

35
Three variable linear regression
  • Path model representation - unstandardized

[Path diagram: x1 and x2, joined by a curved arrow for their covariance, each point to y with unstandardized paths b1 and b2; the error term e also points to y]
36
Three variable linear regression
  • Path model representation - standardized

[Path diagram: x1 and x2, joined by a curved arrow for their correlation r12, each point to y with standardized paths β1 and β2; the error term e also points to y]
37
SUMS OF SQUARES - Venn Diagram
[Venn diagram: SSy overlaps both SSx1 and SSx2; the combined overlap is SSreg, and the part of SSy outside it is SSe]
Fig. 8.3 Venn diagram for linear regression with two predictors and one outcome measure