Is the Association Statistically Significant Session 16 - PowerPoint PPT Presentation

Slides: 26

Provided by: hollyb

Transcript and Presenter's Notes

1
Is the Association Statistically Significant
Session 16
2
Tests of Statistical Significance

Level      Measure of association     Test of significance
Nominal    Lambda                     t test
Nominal    Phi                        χ²
Nominal    Contingency Coefficient    χ²
Nominal    Cramér's V                 χ²
Ordinal    Gamma                      t test
Ordinal    Somers' d                  t test
Ordinal    Tau-b, Tau-c               t test
Interval   Pearson's r                t test

3
Chi-Square Test: Statistical Significance for
Nominal and Ordinal Level Variables

To be used when:
Variables are not interval level
Variables are not normally distributed

4
One-Way Chi-Square: Distribution of Values Across
a Single Variable

Are variations across cell frequencies chance
variations or are they a pattern?
Null hypothesis is that there is no pattern of
distribution. In other words, cases are
distributed evenly across cells.
Alternative hypothesis is that categories vary.
Distribution is not the same across all categories

We have two sets of cell frequencies
The cell frequencies that correspond with the
null hypothesis
The observed cell frequencies
How large is the discrepancy between these two
sets of values?

6
Formula

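$$\chi^2 = \sum \frac{(f_o - f_e)^2}{f_e}$$
where fo is an observed cell frequency and fe is the expected cell frequency under the null hypothesis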
The closer fo is to fe, the smaller the value of
the chi-square test
The larger the discrepancy, the larger the value
of the chi-square test
A larger value means we are more likely to reject
the null and say that there is a pattern

Degrees of freedom are
df = k - 1
where k = the number of categories
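
The one-way test above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's own calculation: the function name `one_way_chi_square` and the counts are hypothetical, and the expected frequency is taken to be the total divided by the number of categories, as the even-distribution null hypothesis implies.

```python
def one_way_chi_square(observed):
    """Return (chi-square statistic, degrees of freedom) for one variable.

    Assumes the null hypothesis of an even distribution, so every
    category has the same expected frequency: total cases / k.
    """
    k = len(observed)
    expected = sum(observed) / k
    # Sum of (fo - fe)^2 / fe over all categories.
    chi_sq = sum((fo - expected) ** 2 / expected for fo in observed)
    return chi_sq, k - 1


# Hypothetical counts for four categories (illustrative, not from the slides).
observed_counts = [30, 20, 25, 25]
chi_sq, df = one_way_chi_square(observed_counts)
print(chi_sq, df)
```

With these made-up counts the statistic is small relative to the usual 0.05 critical value for 3 degrees of freedom (7.815), so the null hypothesis of an even distribution would not be rejected.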

8
Two-Way Chi-Square: Distribution of Values Across
Two Variables

Used to compare two frequency distributions; in
other words, a crosstab
Null hypothesis is that the two variables are
unrelated: cases are distributed across cells in
proportion to the row and column totals.
Alternative hypothesis is that there is variation
in the distribution of values of one variable
across categories of the other

9
Formulas and Calculations

Expected frequency for null hypothesis is based
on marginal values
Formula for the chi-square test is the same
Degrees of freedom
df = (r - 1)(c - 1)
where r = the number of rows and c = the number of columns
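
A rough sketch of the two-way calculation follows. It assumes the standard way of deriving expected frequencies from the marginals (row total times column total, divided by the grand total), which the slide mentions but does not spell out. The function name `two_way_chi_square` and the 2 x 2 crosstab are hypothetical.

```python
def two_way_chi_square(table):
    """Return (chi-square, df, expected) for a crosstab given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count under the null: row total * column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Same formula as the one-way test, summed over every cell.
    chi_sq = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi_sq, df, expected


# Hypothetical 2 x 2 crosstab (illustrative, not from the slides).
crosstab = [[30, 20],
            [20, 30]]
chi_sq_2way, df_2way, expected_2way = two_way_chi_square(crosstab)
print(chi_sq_2way, df_2way)
```

Here every expected cell is 25, and the statistic exceeds the usual 0.05 critical value for 1 degree of freedom (3.841), so this invented table would lead us to reject the null.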

Median test (pp. 302-305): skip this

11
Linear Regression

Provides a way to evaluate the influence of one
independent variable on the dependent variable,
controlling for the influence of other variables

12
Statistics Produced by Regression

a: the constant
y-hat: the predicted values of y given certain
values of the independent variables
e: the error, the discrepancy between the actual
observed value of y and the predicted value of y,
the slush factor
beta coefficients: the influence on y of a
one-unit change in x
standardized beta coefficients: put the
independent variables in the same metric

13
Statistics Produced by Regression

t tests: the statistical significance of the
influence of x on y
p values: the probability of observing a beta
coefficient at least as large as the one found
if the true influence were 0
R-squared: how well the model fits the data, or,
the percentage of variation in y explained by the
variation in the independent variables

The Adjusted R-squared, or coefficient of multiple
determination: Collectively, urbanization,
population growth, and GDP explain 79 percent of
the variation in female literacy rate.

For each one percent increase in the percentage
of people living in cities, female literacy
increases by 0.61, or six tenths of one percent.
For each one percent increase in annual
population growth, the female literacy rate
decreases by 13.7 percent.
Gross domestic product per capita does not have a
statistically significant influence on female
literacy
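
To make the interpretation concrete, here is a small sketch that applies the two reported coefficients to a change in the predictors. It is only an illustration: the slide gives neither the constant nor the GDP coefficient, so the hypothetical function `predicted_change` computes a change in predicted literacy while holding GDP per capita fixed, not a literacy level.

```python
# Coefficients as reported on the slide.
B_URBAN = 0.61        # change in female literacy per one-unit rise in urbanization
B_POP_GROWTH = -13.7  # change in female literacy per one-unit rise in population growth


def predicted_change(delta_urban, delta_pop_growth):
    """Predicted change in female literacy for given changes in the predictors.

    Assumes GDP per capita is held constant. The constant term drops out
    because this is a difference between two predictions.
    """
    return B_URBAN * delta_urban + B_POP_GROWTH * delta_pop_growth


# Hypothetical scenarios (illustrative, not from the slides).
print(predicted_change(10, 0))    # urbanization up 10 units, growth unchanged
print(predicted_change(0, 0.5))   # growth up half a unit, urbanization unchanged
```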

16
Why Use Multivariate Analysis - Regression?

Descriptive Statistics: one variable
Measures of Association: two variables
Multivariate Analysis: three variables or more

17
Why Multivariate Analysis - Regression?

To identify spurious relationships
To correctly specify relationships
To thoroughly describe a process
The goal of research is to explain why variables
vary

18
The Basic Regression Model

$$Y = a + bX + e$$
Y is the observed value of the dependent variable
a is the expected value of Y when X = 0 (a
baseline value)
b is the slope: steep when X has a strong
influence on Y (in other words, b is larger)
e is the amount of variation in Y that can't be
explained by X
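
The pieces of the model can be estimated from data. The sketch below assumes ordinary least squares, the usual estimation method for this model, although the slides do not name it. The function name `fit_line` and the five data points are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares estimates of a (constant) and b (slope) in Y = a + bX + e."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Co-variation of X and Y, and variation of X, around their means.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


# Hypothetical observations (illustrative, not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)

# e for each case: observed Y minus the Y predicted by a + bX.
errors = [y - (a + b * x) for x, y in zip(xs, ys)]
print(a, b)
print(errors)
```

For a least-squares line that includes a constant, the errors sum to zero apart from floating-point rounding, which is a quick sanity check on the fit.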

19
The Regression Line?

The regression line is drawn to minimize the
distance (in ordinary least squares, the sum of
squared vertical distances) between the line and
the plotted points, which are the observed values
of the dependent variable
The regression line represents predicted values
(predicted by the equation) and the points
represent actual observed values

Using the slope coefficient (b), the actual
values of X, and the value of a, we can plot the
regression line.
The error term, e, is the vertical distance
between the regression line and the location of
the actual observed points, the values of the
dependent variable.

21
Assumptions for Regression

Both the independent and the dependent variables
are measured at the interval level
The relationship is linear
Variables must be normally distributed or sample
must be large
Sample must be random for tests of statistical
significance

22
The Significance of the Errors

Back to the proportionate reduction in error
The errors should be as small as possible
We use the average value of Y to guess
We compare this to our guess using the value of X
for that observation
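
The comparison of the two guesses can be sketched as follows. This assumes errors are measured as squared differences and that the second guess comes from a least-squares line, neither of which the slide states explicitly. The function name and data are hypothetical.

```python
def proportionate_reduction_in_error(xs, ys):
    """Share of the error in guessing Y that is removed by using X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
    a = y_bar - b * x_bar
    # Error when every guess is the average value of Y.
    e1 = sum((y - y_bar) ** 2 for y in ys)
    # Error when each guess uses the value of X for that observation.
    e2 = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return (e1 - e2) / e1


# Hypothetical observations (illustrative, not from the slides).
pre = proportionate_reduction_in_error([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(pre)
```

Under these assumptions the result is the same quantity as the R-squared of the one-variable regression.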

23
Pearson's, Regression, and the Coefficient of
Determination

Coefficient of Determination is also known as the
R-squared. And if there is more than one
independent variable it's the adjusted R-squared.
Enough names for ya? The adjusted R-squared
takes the number of independent variables into
consideration. Kind of like degree of difficulty
in diving and gymnastics.
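
The "degree of difficulty" adjustment can be written out. The slides do not give a formula, so the sketch below uses the standard textbook definition of adjusted R-squared. The function name `adjusted_r_squared` and the example values are hypothetical.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted R-squared for n observations and k independent variables.

    Standard definition: 1 - (1 - R^2) * (n - 1) / (n - k - 1).
    """
    if n - k - 1 <= 0:
        raise ValueError("need more observations than independent variables plus one")
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Hypothetical model: R-squared of 0.80 from 50 cases.
print(adjusted_r_squared(0.80, n=50, k=3))
print(adjusted_r_squared(0.80, n=50, k=10))  # same fit, more variables, lower value
```

For the same R-squared and sample size, adding independent variables lowers the adjusted value, which is the penalty the diving analogy describes.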

The R-squared value is the percent of variation
in the dependent variable that is explained by
the independent variables, collectively.

25
The Other Statistics

T-score and p-value: the statistical
significance of the individual coefficients. In
other words, is the influence of this independent
variable on the dependent variable statistically
different from 0?
The beta coefficient: the magnitude of the
influence of X on Y. The amount of change in Y
for a one-unit change in X.
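
As a rough illustration of where the t-score comes from, the sketch below divides the slope by its standard error for a one-variable regression. It assumes the usual least-squares formulas, which the slides do not show. The function name `slope_t_score` and the data are hypothetical, and the p-value step (comparing t with a t distribution on n - 2 degrees of freedom) is left out.

```python
import math


def slope_t_score(xs, ys):
    """Return (b, standard error of b, t-score) for a one-variable regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / s_xx
    a = y_bar - b * x_bar
    # Sum of squared errors around the regression line.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the slope: residual variance (n - 2 df) over variation in X.
    std_error = math.sqrt(sse / (n - 2) / s_xx)
    return b, std_error, b / std_error


# Hypothetical observations (illustrative, not from the slides).
slope, se, t_score = slope_t_score([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(slope, se, t_score)
```

The larger the t-score in absolute value, the less plausible it is that the true influence of X on Y is 0.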
