Correlation is a statistical technique that describes the degree of relationship between two variables when you have bivariate data.

Provided by: Abdul62
1
EDUCATIONAL STATISTICS
EDU5950 WEEK12
  • Correlation is a statistical technique that
    describes the degree of relationship between two
    variables when you have bivariate data.
  • A bivariate distribution is required for
    correlation and regression techniques.
  • A bivariate distribution is a joint distribution
    of two variables, the individual scores of which
    are paired in some logical way.
  • A bivariate distribution may show positive
    correlation, negative correlation, or zero
    correlation.

2
  • In a case of a positive correlation between two
    variables, high measurements on one variable tend
    to be associated with high measurements on the
    other variable, and low measurements on one
    variable with low measurements on the other.
  • Table 1 illustrates this relationship: tall
    fathers tend to have sons who grow up to be tall
    men.
  • Short fathers tend to have sons who grow up to be
    short men.
  • If the relationship were perfect (which, of
    course, is ridiculous), then it would be possible
    to predict without error the adult height of an
    unborn son simply by measuring his father.

5
  • In Figure 1, each point represents a pair of
    scores, the height of a father and the height of
    his son.
  • Such an array of points is called a scatterplot.
  • The line that runs through the points is called a
    regression line.
  • It is a line of best fit.
  • When there is perfect correlation (r = 1.00), all
    points fall exactly on the regression line.

6
  • When a correlation is negative, increases in one
    variable are accompanied by decreases in the
    other variable (an inverse relationship).
  • With negative correlation, the regression line
    goes from the upper left corner of the graph to
    the lower right corner.
  • As you may recall, such lines have a negative
    slope.
  • Although some correlation coefficients are
    positive and some are negative, one is not more
    valuable than the other.

7
  • The algebraic sign simply tells you the direction
    of the relationship (which is important when you
    are describing how the variables are related).
  • The absolute size of r, however, tells you the
    degree of the relationship.
  • A strong relationship (either positive or
    negative) is usually more valuable than a weaker
    one.
  • A zero correlation means there is no linear
    relationship between two variables.

8
  • High and low scores on the two variables are not
    associated in any predictable manner.
  • Figure 2 shows a scatterplot that produces a zero
    correlation coefficient.
  • When r = 0, the regression line is horizontal at
    a height of $\bar{Y}$, the mean of Y.
  • This makes sense: if r = 0, then your best
    estimate of Y for any value of X is $\bar{Y}$.
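
A zero correlation rules out only a linear relationship. As a minimal sketch, the made-up data below (not from the presentation) follow Y = X squared exactly, yet r still comes out as zero, so the best linear estimate of Y is simply the mean of Y. The function name `pearson_r` and the data are assumptions chosen for illustration; standard deviations are computed with N in the denominator.

```python
import math


def pearson_r(xs, ys):
    """Pearson r computed as the mean product of paired z scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Standard deviations with N (not N - 1) in the denominator.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return sum(
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(xs, ys)
    ) / n


# Hypothetical data: a perfect but non-linear (U-shaped) relationship.
xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]

r = pearson_r(xs, ys)
mean_y = sum(ys) / len(ys)

print("r is zero (within rounding):", abs(r) < 1e-9)
print("best linear estimate of Y for any X:", mean_y)
```

The positive and negative cross-products cancel, which is why a scatterplot is worth checking before concluding that two variables are unrelated.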

10
  • A correlation coefficient provides a quantitative
    way to express the degree of relationship that
    exists between two variables.
  • The definition formula is
      $r = \frac{\sum z_X z_Y}{N}$
  • where r = Pearson product-moment correlation
    coefficient
  • $z_X$ = a z score for variable X
  • $z_Y$ = the corresponding z score for variable Y
  • N = number of pairs of X and Y values
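
The definition above, r as the sum of the products of paired z scores divided by N, can be sketched in a few lines. The five data pairs and the function names are hypothetical, chosen only so the arithmetic is easy to follow. The sketch assumes the z scores use a standard deviation with N in the denominator, which is what makes this formula return Pearson r exactly.

```python
import math


def z_scores(values):
    """Convert raw scores to z scores using the N-denominator SD."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values]


def r_definition(xs, ys):
    """r = sum(z_X * z_Y) / N, where N is the number of pairs."""
    zx = z_scores(xs)
    zy = z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)


# Hypothetical paired scores (not from the presentation).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = r_definition(xs, ys)
print(round(r, 4))  # 0.7746
```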

11
  • Because researchers often use means and standard
    deviations when telling the story of the data,
    this formula (the blanched formula) is used by
    many:
      $r = \frac{\frac{\sum XY}{N} - \bar{X}\,\bar{Y}}{S_X S_Y}$
  • where X, Y = paired observations
  • XY = product of each X value multiplied by its
    paired Y value
  • $\bar{X}$ = mean of variable X
  • $\bar{Y}$ = mean of variable Y
  • $S_X$ = standard deviation of variable X
  • $S_Y$ = standard deviation of variable Y
  • N = number of pairs of observations
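
When the means and standard deviations are already in hand, r can be computed from them plus the mean of the XY products. The sketch below assumes the form r = (ΣXY/N − mean of X × mean of Y) / (S_X × S_Y), with both standard deviations computed using N in the denominator (`statistics.pstdev`). The data and the name `r_blanched` are hypothetical.

```python
import statistics


def r_blanched(xs, ys):
    """r from means, N-denominator standard deviations, and mean of XY."""
    n = len(xs)
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    s_x = statistics.pstdev(xs)  # population SD (divides by N)
    s_y = statistics.pstdev(ys)
    mean_xy = sum(x * y for x, y in zip(xs, ys)) / n
    return (mean_xy - mean_x * mean_y) / (s_x * s_y)


# Hypothetical paired scores (not from the presentation).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = r_blanched(xs, ys)
print(round(r, 4))  # 0.7746
```

Swapping in the sample standard deviation (N − 1 in the denominator) would give a slightly different, incorrect value, so the choice of standard deviation matters here.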

12
  • With the raw score formula, you calculate r from
    the raw scores without computing means and
    standard deviations.
  • The formula is
      $r = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^2 - (\sum X)^2][N\sum Y^2 - (\sum Y)^2]}}$
  • Remember that N is the number of pairs of values.
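
A raw-score calculation needs only five sums (ΣX, ΣY, ΣXY, ΣX², ΣY²) and N. The sketch below assumes the usual raw-score form, r = [NΣXY − (ΣX)(ΣY)] / sqrt([NΣX² − (ΣX)²][NΣY² − (ΣY)²]); the data and the name `r_raw_score` are hypothetical.

```python
import math


def r_raw_score(xs, ys):
    """Pearson r from raw scores, without means or standard deviations."""
    n = len(xs)  # number of PAIRS, not number of individual scores
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator


# Hypothetical paired scores (not from the presentation).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# For these data: sum X = 15, sum Y = 20, sum XY = 66,
# sum X^2 = 55, sum Y^2 = 86, so r = 30 / sqrt(50 * 30).
r = r_raw_score(xs, ys)
print(round(r, 4))  # 0.7746
```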

16
  • The basic interpretation of r is probably
    familiar to you at this point.
  • A correlation coefficient measures the degree of
    linear relationship between two variables of a
    bivariate distribution.
  • What qualifies as a large correlation
    coefficient? What is small?
  • Jacob Cohen proposed that the question be
    answered by calculating an effect size index (d),
    and he designated d values of .20, .50, and .80
    as small, medium, and large, respectively.

17
  • In a similar way, Cohen addressed the question of
    calculating an effect size index for correlation
    coefficients.
  • Small: r = .10
  • Medium: r = .30
  • Large: r = .50
  • The correlation coefficient is also the basis of
    the coefficient of determination, which tells the
    proportion of variance that two variables in a
    bivariate distribution have in common.
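
Cohen's benchmarks for r can be turned into a small lookup. This is only a sketch: treating .10, .30, and .50 as lower cutoffs, and labelling anything under .10 as "below small", are assumptions made here for illustration rather than part of Cohen's guidelines. The function name `cohen_label` is hypothetical. The sign is ignored because only the absolute size of r indicates the degree of relationship.

```python
def cohen_label(r):
    """Label the size of a correlation using Cohen's benchmarks for r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)  # direction does not affect the strength label
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium"
    if size >= 0.10:
        return "small"
    return "below small"  # assumed label; not one of Cohen's categories


for r in (0.05, 0.10, -0.35, 0.62):
    print(r, cohen_label(r))
```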

18
  • The coefficient of determination is calculated by
    squaring r; it is never negative and always lies
    between 0 and 1.
  • coefficient of determination = $r^2$
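
A short sketch shows why the sign of r drops out when it is squared. The r values are made up for illustration and the function name is hypothetical.

```python
def coefficient_of_determination(r):
    """Proportion of variance two variables share: r squared."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    return r * r


# Hypothetical correlations: equal strength, opposite direction.
shared_pos = coefficient_of_determination(0.5)
shared_neg = coefficient_of_determination(-0.5)
print(shared_pos, shared_neg)  # 0.25 0.25
```

Under this reading, r = .50 and r = −.50 both mean that the two variables have 25% of their variance in common.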

19
  • Linear regression is a technique that uses the
    data to write an equation for a straight line
    and then uses that equation to make predictions.
  • Often predictions are based on an assumption that
    the relationship between two variables is linear.
  • The formula for a straight line is
      $Y = mX + b$
  • where
    - Y and X are variables representing scores on
      the Y and X axes
    - m = slope of the line
    - b = intercept of the line with the Y axis
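
The roles of the slope and the intercept are easy to see numerically. In this sketch the values m = 0.5 and b = 2.0 are arbitrary, and the helper names are hypothetical: the intercept is the Y value at X = 0, and the slope is the change in Y per one-unit change in X between any two points on the line.

```python
def line_y(x, m, b):
    """Y value on the straight line Y = mX + b."""
    return m * x + b


def slope_between(p1, p2):
    """Slope recovered from two points: change in Y over change in X."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)


m, b = 0.5, 2.0  # arbitrary illustrative slope and intercept
points = [(x, line_y(x, m, b)) for x in (0, 2, 4)]

print(points)                               # [(0, 2.0), (2, 3.0), (4, 4.0)]
print(slope_between(points[0], points[2]))  # 0.5
```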

20
  • Least square method is
  • Consider these data:
    - Father's height: 64 in
    - First daughter's height: 66 in
    - Second daughter's height: 64 in
  • Draw a regression line using the least squares
    method.

21
  • There is an error for each person on the
    scatterplot.
  • The least squares method creates a straight line
    such that the sum of the squares of the errors is
    a minimum.
  • The least squares method produces numerical values
    for the slope and the intercept. With these
    values you can write the equation for a straight
    line; this line is the one that best fits the
    data.
  • In statistics, the regression equation is
      $\hat{Y} = a + bX$

22
  • where $\hat{Y}$ = Y value predicted from a
    particular X value
  • a = point at which the regression line
    intersects the Y axis
  • b = slope of the regression line
  • X = X value for which you wish to predict a Y
    value
  • Note: In a correlation problem, the symbol Y can
    be assigned to either variable, but in a
    regression equation, Y is assigned to the
    variable you wish to predict.

23
  • You need the values for a and b, which are called
    regression coefficients; they can be calculated
    from any bivariate set of data.
  • To calculate b,
      $b = r \frac{S_Y}{S_X}$
  • To calculate a,
      $a = \bar{Y} - b\bar{X}$
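
As a minimal sketch of the whole procedure, the code below computes the regression coefficients and makes one prediction. It assumes the standard forms b = r(S_Y / S_X) and a = mean of Y − b × mean of X, with standard deviations that use N in the denominator. The data are made up (they are not the SPM scores from the presentation) and the function names are hypothetical. The last lines check the least squares property: nudging a or b away from the computed values makes the sum of squared errors larger.

```python
import math


def regression_coefficients(xs, ys):
    """Return (a, b) for the regression equation Y-hat = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    r = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)
    ) / (n * sd_x * sd_y)
    b = r * sd_y / sd_x        # slope
    a = mean_y - b * mean_x    # intercept with the Y axis
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Sum of squared differences between observed and predicted Y."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical paired scores: X is the predictor, Y the variable to predict.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = regression_coefficients(xs, ys)
predicted = a + b * 6  # predicted Y for a new X value of 6
print(round(a, 2), round(b, 2), round(predicted, 2))  # 2.2 0.6 5.8

best = sum_squared_errors(xs, ys, a, b)
worse = [
    sum_squared_errors(xs, ys, a + da, b + db)
    for da, db in ((0.1, 0), (-0.1, 0), (0, 0.1), (0, -0.1))
]
print(all(best < w for w in worse))  # True
```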

24
  • Let us use the data from the earlier correlation
    problem to predict math scores from oral test
    scores in SPM.

25
  • Entering these regression coefficient values into
    the regression equation produces a formula that
    predicts math scores from oral test scores.
