
Inferential statistics

- Hypothesis testing

The goals of quantitative research

- We usually want to make some statements about how things are related
  - Does Sesame Street teach kids to read?
  - Does watching TV news lead to support for the Iraq War?
  - Do young men enjoy horror films more than young women do?

Studying relationships quantitatively

- We use statistics to draw conclusions from quantitative data
- To look for relationships among variables, we want to determine
  - Whether a relationship exists in the sample data
  - Whether that relationship can be generalized to the population we sampled from

Does a population characteristic differ from some expectation?

- For example, do Telecommunications students (population) spend more than 5 hours a week playing video games?
- Take a sample of Tel students and ask them how many hours per week they play video games.
- If the mean is higher than 5, you would go on to see whether the difference you found was statistically significant.
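As a rough illustration of the last step, the sketch below computes a one-sample t statistic for the video game question. The ten survey answers are made up for illustration, the function name `one_sample_t` is hypothetical, and the critical value is the one-tailed .05 entry for 9 degrees of freedom from a standard t table.

```python
import math
import statistics

def one_sample_t(sample, expected_mean):
    """Return (t statistic, degrees of freedom) for a one-sample t-test."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)        # uses n - 1 in the denominator
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - expected_mean) / standard_error, n - 1

# Hypothetical answers from 10 Tel students (hours of video games per week).
hours = [4, 6, 7, 5, 8, 9, 6, 7, 5, 8]
t_stat, df = one_sample_t(hours, expected_mean=5)

# One-tailed critical value for df = 9 at the .05 level (from a t table).
CRITICAL_T = 1.833
significant = t_stat > CRITICAL_T
print(f"t = {t_stat:.2f}, df = {df}, significant at .05: {significant}")
```

With these made-up answers the sample mean is 6.5 hours, and the statistic compares that 1.5-hour gap to the standard error of the mean.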

Differences between two groups

- You may want to see if two groups are different on some characteristic
- If the difference between the two groups is large compared to the variance within the groups, then there will be a statistically significant difference between them

For example

- Do male students spend more time watching sci-fi than do female students?
- Ask a sample of students, some male and some female, how many hours they spend watching sci-fi per week
- Determine the mean number of hours male and female students watch sci-fi

To test for a difference

- If the mean number of sci-fi viewing hours differs quite a bit between males and females relative to the variation among males and among females, then you would conclude that there is a real difference within the population (not just in your sample)
- You accept your sample estimate of the difference between males and females as the best estimate of the population difference
- NOTE: There are statistics to help make this judgment
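One such statistic is the two-sample t statistic, which divides the difference between the group means by a standard error built from the variation within the groups. The sketch below is a minimal version that assumes equal variances (the pooled form). All viewing hours are invented, and `pooled_t` is a hypothetical helper name.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Return (t statistic, degrees of freedom) for two independent groups,
    assuming both groups share the same population variance."""
    n_a, n_b = len(group_a), len(group_b)
    mean_diff = statistics.mean(group_a) - statistics.mean(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / standard_error, n_a + n_b - 2

# Hypothetical weekly sci-fi hours: little variation within each group.
males = [5, 7, 6, 8, 9]
females = [2, 4, 3, 5, 6]
t_clear, df = pooled_t(males, females)

# Same group means (7 and 4), but much more variation within each group.
males_noisy = [1, 13, 4, 10, 7]
females_noisy = [0, 8, 1, 7, 4]
t_noisy, _ = pooled_t(males_noisy, females_noisy)

CRITICAL_T = 2.306   # two-tailed .05 critical value for df = 8 (t table)
print(f"low within-group variance:  t = {t_clear:.2f}")
print(f"high within-group variance: t = {t_noisy:.2f}")
```

Both comparisons have the same 3-hour difference in means. Only the first exceeds the critical value, because the within-group variation is small relative to that difference.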

[Figure: distributions of sci-fi viewing hours for females and for males. X axis: hours viewing; Y axis: number of viewers. The mean sci-fi viewing hours among females and among males are marked.]


Statistics for comparisons among groups

- t-test
  - preferred where two groups can be compared based on some hypothesis
- ANOVA
  - comparison among multiple groups
  - allows for factorial designs
  - good at dealing with interactions
- These are the main statistics for analyzing experimental data

Example: Effects of gender and commercial message

- We need to choose between cell phone commercials
- We want to know which commercial to use, whether men or women spend more on cell phones, and whether the commercials have different effects by gender

Main effect of gender

Main effect of commercial

Interaction
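The two main effects and the interaction can be read off the cell means of a 2 x 2 design. The sketch below only computes those descriptive quantities; an ANOVA would go on to test whether each one is statistically significant. The spending figures and the commercial labels "A" and "B" are invented, and the marginal means are simple averages of cell means, which assumes equal cell sizes.

```python
import statistics

# Hypothetical monthly cell phone spending (dollars) in each cell of a
# gender x commercial design, three participants per cell.
spending = {
    ("men", "A"): [40, 50, 60],
    ("men", "B"): [60, 70, 80],
    ("women", "A"): [70, 80, 90],
    ("women", "B"): [50, 60, 70],
}
cell_mean = {cell: statistics.mean(values) for cell, values in spending.items()}

def marginal_mean(factor_index, level):
    """Average the cell means that share one level of one factor."""
    return statistics.mean(
        m for cell, m in cell_mean.items() if cell[factor_index] == level
    )

# Main effect of gender: women vs. men, averaged over commercials.
gender_effect = marginal_mean(0, "women") - marginal_mean(0, "men")

# Main effect of commercial: B vs. A, averaged over gender.
commercial_effect = marginal_mean(1, "B") - marginal_mean(1, "A")

# Interaction: does the B-vs-A difference change with gender?
interaction = ((cell_mean[("men", "B")] - cell_mean[("men", "A")])
               - (cell_mean[("women", "B")] - cell_mean[("women", "A")]))

print(gender_effect, commercial_effect, interaction)
```

In this made-up data neither commercial is better on average, yet commercial B works better for men and commercial A works better for women. A pattern like that shows up only in the interaction.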

Relationships among variables

- When both your variables can take many values, you may want to generate a scatterplot
- Values on one variable are represented on the X axis
- Values on the other variable are represented on the Y axis


- The relationship between the variables is estimated by fitting a line to the data
- The line is fitted in a particular way that provides the least total error or distance between the points and the line


Measures of association

- If we want to know to what extent two variables are related, we look at a measure of association
- Different measures are used depending upon whether the data were collected using nominal, ordinal, interval or ratio scale
- Parametric
- Nonparametric
- Two nominal: Chi-square (non-parametric)
- Two ordinal: Spearman's rho
- Two interval: Pearson's r

Chi-square

- Because the data provide no direction or distance on the scales they use, Chi-square is based on the percentage of subjects found in each cell of a contingency table
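In practice the statistic is computed by comparing the count observed in each cell with the count that would be expected if the two variables were unrelated. The sketch below is a minimal version of that calculation. The 2 x 2 table is invented, `chi_square` is a hypothetical helper name, and 3.841 is the .05 critical value for 1 degree of freedom from a chi-square table.

```python
def chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency
    table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

# Hypothetical counts. Rows: male, female. Columns: prefers sci-fi, prefers other.
observed_counts = [[30, 20],
                   [20, 30]]
chi2, df = chi_square(observed_counts)

CRITICAL_CHI2 = 3.841   # .05 level, df = 1 (chi-square table)
print(f"chi-square = {chi2:.2f}, df = {df}, significant: {chi2 > CRITICAL_CHI2}")
```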


Covariance among variables

- Correlation
- Pearson's r
- r² (coefficient of determination)
  - how much of variance in one variable can be accounted for by variance in another
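A minimal sketch of both quantities, computed directly from their definitions. The variable names and the five data pairs are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariation of x and y divided by the
    square root of the product of their separate sums of squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(ss_x * ss_y)

# Hypothetical paired measurements on five people.
tv_hours = [1, 2, 3, 4, 5]
knowledge_score = [2, 1, 4, 3, 5]

r = pearson_r(tv_hours, knowledge_score)
r_squared = r ** 2   # share of variance in one variable accounted for by the other
print(f"r = {r:.2f}, r squared = {r_squared:.2f}")
```

For this invented data r is 0.8, so about 64 percent of the variance in one variable is accounted for by the other.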


Linear regression

- Minimizes the total squared distances between individual data points and a constructed regression line
- y = ax + b
- allows for prediction of the behavior of the dependent variable
- preferred to simple correlation
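The slope a and intercept b that minimize the total squared distance have a closed-form solution for a single predictor. The sketch below assumes that simple case. The data and the helper names `fit_line` and `squared_error` are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y = a*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - a * mean_x      # the fitted line passes through the two means
    return a, b

def squared_error(a, b, xs, ys):
    """Total squared vertical distance between the points and the line."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data: x is the independent variable, y the dependent one.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 4, 7, 6]

a, b = fit_line(xs, ys)
predicted_at_6 = a * 6 + b       # prediction for a new x value

best = squared_error(a, b, xs, ys)
nudged = squared_error(a + 0.1, b, xs, ys)   # any other line does worse
print(f"y = {a:.1f}x + {b:.1f}; error {best:.2f} vs {nudged:.2f} for a nudged slope")
```

The prediction step is what simple correlation does not offer: r says how strongly the variables are related, while the fitted line says what value of y to expect for a given x.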


The problem with outliers

- Extreme cases (outliers) can unduly influence measures of association
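A small numerical illustration, using invented data: five cases with no linear relationship at all, and the same five cases plus one extreme case. The `pearson_r` helper is a plain implementation of Pearson's r written for this sketch.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations around the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_variation / math.sqrt(ss_x * ss_y)

# Hypothetical cases with no linear relationship between x and y.
xs = [1, 2, 3, 4, 5]
ys = [3, 1, 2, 1, 3]
r_without_outlier = pearson_r(xs, ys)

# The same cases plus a single extreme case at (20, 20).
r_with_outlier = pearson_r(xs + [20], ys + [20])

print(f"without outlier: r = {r_without_outlier:.2f}")
print(f"with outlier:    r = {r_with_outlier:.2f}")
```

One added case moves the correlation from zero to above .9, which is why it is worth inspecting a scatterplot before trusting a coefficient.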


Multiple variables

- We can statistically analyze the relations among multiple variables at the same time
- Multiple correlation
- Multiple regression
- Control for multiple variables
- Unique contribution of a single variable
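To give a feel for what "controlling for" a variable means, the sketch below uses the first-order partial correlation, a simpler technique than multiple regression that removes the influence of a single third variable. The three pairwise correlations and the variable names are invented for illustration.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with z held constant
    (first-order partial correlation)."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical pairwise correlations among three variables.
r_news_support = 0.5   # TV news viewing and support for a policy
r_news_age = 0.5       # TV news viewing and age
r_support_age = 0.5    # support for the policy and age

r_controlled = partial_correlation(r_news_support, r_news_age, r_support_age)
print(f"raw r = {r_news_support:.2f}, controlling for age = {r_controlled:.2f}")
```

With these made-up values the relationship shrinks from .50 to about .33 once age is held constant. What remains is closer to the unique contribution of news viewing.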

Statistical significance

- Because you have sampled from a population, your results may have occurred largely by chance
- Researchers usually want to be certain that their findings are unlikely to be a result of chance
- But you can't entirely eliminate chance, so you set a limit on how much chance you are willing to accept

- So, you set a p level
  - .05
  - Fewer than 5 samples in 100 would generate a result this unlikely by chance alone
  - 95 in 100 samples would generate a mean estimate within a designated distance of the sample estimate
- If your findings meet this criterion, they are statistically significant
- Other common significance levels are .01 or .001
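One way to see what "samples in 100" means is an exact permutation test, a technique swapped in here because it needs no distribution tables. If group membership made no difference, every way of splitting the pooled scores into two groups would be equally likely, so we can count how often a split produces a difference at least as large as the observed one. The data and the function name `exact_permutation_p` are hypothetical.

```python
from itertools import combinations
from statistics import mean

def exact_permutation_p(group_a, group_b):
    """Share of all possible regroupings whose mean difference is at least
    as large (in absolute value) as the difference actually observed."""
    pooled = group_a + group_b
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    as_extreme = 0
    regroupings = 0
    for indices in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in indices)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:    # tolerance for floating-point rounding
            as_extreme += 1
        regroupings += 1
    return as_extreme / regroupings

# Hypothetical weekly sci-fi viewing hours.
males = [5, 7, 6, 8, 9]
females = [2, 4, 3, 5, 6]

p_value = exact_permutation_p(males, females)
ALPHA = 0.05
print(f"p = {p_value:.4f}, significant at {ALPHA}: {p_value < ALPHA}")
```

Here 10 of the 252 possible regroupings give a difference of 3 hours or more, about 4 in 100, so the result clears the .05 criterion but would not clear .01.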


- Statistically significant findings are not necessarily theoretically significant
- If you have a large sample, a relatively small effect size may be statistically significant
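The sketch below holds a small effect fixed and changes only the sample size. The mean difference, standard deviation, and sample sizes are invented. The helper `two_group_t` assumes two equal-sized groups with a common standard deviation, and 1.96 is the large-sample two-tailed .05 critical value.

```python
import math

def two_group_t(mean_diff, sd, n_per_group):
    """Test statistic for a difference between two equal-sized groups
    that share the same standard deviation."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    return mean_diff / standard_error

MEAN_DIFF = 0.2    # hypothetical difference in weekly viewing hours
SD = 2.0           # hypothetical within-group standard deviation
effect_size = MEAN_DIFF / SD   # standardized difference, the same in both cases

t_small_sample = two_group_t(MEAN_DIFF, SD, n_per_group=25)
t_large_sample = two_group_t(MEAN_DIFF, SD, n_per_group=10_000)

CRITICAL = 1.96    # approximate two-tailed .05 critical value for large samples
print(f"effect size = {effect_size:.2f}")
print(f"n = 25 per group:     t = {t_small_sample:.2f}")
print(f"n = 10,000 per group: t = {t_large_sample:.2f}")
```

The effect is a tenth of a standard deviation in both cases. Only the larger sample pushes the statistic past the critical value, which says nothing about whether a difference that small matters theoretically.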

So

- We often want to determine whether variables are related in a population
- We use appropriate sample statistics to determine whether they are related in our sample
- We use inferential statistics to evaluate whether we think what we found in our sample is true of the larger population