Basic Quantitative Methods in the Social Sciences (AKA Intro Stats)

Slides: 73

Provided by: dho47

1
Basic Quantitative Methods in the Social
Sciences (AKA Intro Stats)

02-250-01
Lecture 8

2
Is the Difference Between Two Means Significant?

A group of 100 14-year-old boys take a general
math test. Their mean score is 85.
A group of 100 14-year-old girls take the same
math test. Their mean score is 80.
Can we conclude that 14-year-old boys are better at
math than 14-year-old girls?

3
Two sample t-tests

Examine differences between the means of:
Independent samples
Related samples
Two different t-test formulas!
So, if you want to compare the means of two
samples, you must carefully consider whether your
samples are related. What do I mean by that?

4
When are Samples Related?

When knowing one member of a PAIR of scores tells
you something about the other member.
Examples:
Pre-treatment vs. post-treatment depression
scores: each person in the study contributes a
pair of scores.
Are men more dissatisfied with their marriages
than women? Each couple would be a pair. Why
wouldn't they be considered independent?

5
Independent Sample t-tests

The most frequently used procedure for testing
whether the means of two independent groups
could plausibly have come from the same
population.
Underlying assumptions:
Each raw score is independent of all others.
The groups of scores are random samples from
normally distributed populations.
The groups of scores are random samples from
populations with equal variances.

6
But isn't it enough to just examine the two
sample means?

Of course not!
If you compute means for two samples, they will
almost always differ to some degree. The job of
the t-test is to see whether they differ by
chance or whether the difference is real and
reliable.
Stated differently: do the means differ simply
because of the variability between the scores?

7
Example.

A group of 100 14-year old boys take a general
math test. Their mean score is 85.
A group of 100 14-year old girls take the same
math test. Their mean score is 80.
Why is there a 5 point difference between these
two sample means?

8
A couple of possibilities

The difference could be due to the fact that
14-year-olds naturally differ from each other to
some degree in math ability, such that two
samples drawn from a population of 14-year-olds
will always differ, no matter how you slice it
(boys vs. girls, brown hair vs. black hair,
etc.).
If this is true, the differences between the
scores of individual 14-year-olds are due to
individual differences, and therefore the
difference between means of random samples is a
reflection of these individual differences.

9
Or...

The difference could be due to the fact that boys
are better than girls at math.
If this is true:
a sample of 14-year-old boys will still have
variation in their scores (due to individual
differences in the sample of boys)
a sample of 14-year-old girls will also have
variation in their scores (due to individual
differences in the sample of girls)
but overall, the difference between the two
samples will be larger than the variability
generated by individual differences (within the
two samples combined; that is, individual
differences in 14-year-olds in general).

10
In this context, the t-test can be thought of as...

A ratio: between-group differences divided by
within-group differences (i.e., variability).
The larger the numerator gets (between-group
differences), the larger the t value is.
The larger the denominator gets (within-group
variability), the smaller the t value is.

11
The Ratio Explained

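Between-group differences are measured by the
difference between the sample means,
$\bar{X}_1 - \bar{X}_2$.
Within-group variability is measured by the
calculated pooled variance, $s^2_{pooled}$ (i.e.,
the group 1 variance and group 2 variance are
pooled together).

12
Stated Algebraically...

$t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{s^2_{pooled}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$

13
Continued...

Why isn't the numerator
$(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)$?
Because the expected value of $\mu_1 - \mu_2$
is zero under the null hypothesis (i.e., if the
null hypothesis is retained, $\bar{X}_1 - \bar{X}_2$
would not differ significantly from zero).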
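
To make the ratio idea concrete, here is a minimal Python sketch. It assumes the standard pooled-variance form of the independent-samples t, that is, the difference between means divided by sqrt(pooled variance x (1/n1 + 1/n2)). It reuses the boys/girls means from the opening example (85 vs. 80, n = 100 each). The standard deviations (5 and 25) are hypothetical values chosen only for illustration; the lecture does not report them.

```python
import math

def pooled_t(mean1, mean2, var1, var2, n1, n2):
    """Independent-samples t from summary statistics (pooled variance)."""
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error

# Same 5-point difference between means, two HYPOTHETICAL levels of
# within-group variability (these SDs are not from the lecture).
t_small_spread = pooled_t(85, 80, 5 ** 2, 5 ** 2, 100, 100)
t_large_spread = pooled_t(85, 80, 25 ** 2, 25 ** 2, 100, 100)

print(round(t_small_spread, 2))  # about 7.07
print(round(t_large_spread, 2))  # about 1.41
```

With the numerator fixed at 5 points, the larger within-group spread shrinks t, which is the point of the ratio.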

14
An Example

It has been reported that employment interviewers
spend more time talking to applicants who are
hired than they do with those who are not hired.
To determine whether this is true, we will
analyze the data on the next slide:
the duration of interview (in minutes) taken
from a random selection of candidates and then
categorized on the basis of whether or not they
were hired.

15
Two Sample t-tests: Making the Right Steps

Two sample t-test for independent samples

1) Null hypothesis: H0: no difference between means
2) Sample: random sample of 49 applicants
3) Significance level: α = .05

16
Two Sample Tests Calculating T

Two sample t-test for independent samples

17
Two Sample t-tests Basic Components

Two sample t-test for independent samples

18
Two Sample t-tests Basic Components
(continued again)

Two sample t-test for independent samples

19
Two Sample t-tests Basic Components (and yet
again)

Two sample t-test for independent samples

20
The pooled (or combined) Variance -

Two sample t-test for independent samples

21
The pooled (or combined) Variance -

Two sample t-test for independent samples

22
The pooled (or combined) Variance -

Two sample t-test for independent samples

23
Doing the t for two independent samples -

Two sample t-test for independent samples

24
Finally!

$t_{obt} = 9.444$
With $\alpha = .05$ and $df = n_1 + n_2 - 2 = 47$,
$t_{crit} = 1.676$ (remember, look at $df = 50$ since
that's the next highest up); therefore the null
hypothesis is rejected. Those who were hired
were interviewed for a significantly longer
period of time than those who were rejected.

25
Confidence Intervals

We can use a confidence interval to determine the
plausible range of the difference between
independent means, that is, the range within
which the true difference probably falls.
95% C.I. $= (\bar{X}_1 - \bar{X}_2) \pm t_{.05}\, s_{\bar{X}_1 - \bar{X}_2}$

26
Confidence Intervals continued..

If we use a confidence interval as an alternative
way to test the null hypothesis that the two
samples do not differ, then the expected value
for $\mu_1 - \mu_2$ is zero.
If the 95% confidence interval overlaps zero,
then zero is within the plausible range of
values, which have a 0.95 probability of
occurrence. In other words, the observed
$\bar{X}_1 - \bar{X}_2$ is not significantly
different from 0.00, and the null hypothesis
should be retained.

27
Same example, but with the CI approach

It has been reported that employment interviewers
spend more time talking to applicants who are
hired than they do with those who are not hired.
To determine whether this is true, we will
analyze the data on the next slide:
the duration of interview (in minutes) taken
from a random selection of candidates and then
categorized on the basis of whether or not they
were hired.

28
Two Sample t-test: Data Summary

Two sample t-test for independent samples

1) Null hypothesis: H0: no difference between means
2) Sample: random sample of 49 applicants
3) Significance level: α = .05

29
So...

95% CI $= (24.7308 - 18) \pm 2.009(.7127)$
$= 6.7308 \pm 1.4318$
$= 5.30$ to $8.16$
Since the interval does not overlap with zero,
the null hypothesis is rejected.
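
The arithmetic for this example can be checked with a short Python sketch. All four inputs are the values reported on the slides (the two means, the standard error .7127, and the two-tailed critical value 2.009); the variable names are illustrative only.

```python
# Values reported on the slides for the interview-duration example.
mean_hired = 24.7308
mean_not_hired = 18.0
std_error = 0.7127          # standard error of the difference between means
t_crit_two_tailed = 2.009   # df = 50 row of the t table, alpha = .05

diff = mean_hired - mean_not_hired
t_obt = diff / std_error
margin = t_crit_two_tailed * std_error
ci_lower, ci_upper = diff - margin, diff + margin

# The null hypothesis is rejected when 0 falls outside the interval.
rejects_null = not (ci_lower <= 0 <= ci_upper)

print(round(t_obt, 3))                          # 9.444
print(round(ci_lower, 2), round(ci_upper, 2))   # 5.3 8.16
print(rejects_null)                             # True
```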

30
Another Example

John, a 21-year-old University of Windsor student,
meets Jennifer, a 20-year-old University of
Windsor student, at Faces. After striking up a
conversation, he gets her phone number. He goes
home with the following dilemma: how many days
should he wait to call her?
30 male Intro to Psych students and 30 female
Intro to Psych students were asked to indicate
how many days they thought John should wait
before calling. The following is a summary of
real data collected last year:

31
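Males: $n_1 = 30$, $\Sigma X_1 = 87$, $\Sigma X_1^2 = 259$
$s_1^2 = (259 - (87)^2/30)/29 = 0.2310$
Females: $n_2 = 30$, $\Sigma X_2 = 65$, $\Sigma X_2^2 = 183$
$s_2^2 = (183 - (65)^2/30)/29 = 1.4540$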
32
And...

$s^2_{pooled} = 48.8650 / 58 = 0.8425$

33
So...

H0: There will be no difference between how long
the male and female students say John should wait
before calling Jennifer.
H1: The male students will say to wait longer
than the female students will say to wait
(one-tailed test): $\mu_{males} > \mu_{females}$

34
Finally!

$t_{obt} = 3.093$
With $\alpha = .05$ and $df = n_1 + n_2 - 2 = 58$,
$t_{crit} = 1.660$; therefore the null hypothesis
is rejected. The male students thought that John
should wait significantly longer than the female
students thought he should wait before calling
Jennifer.
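
As a check on the steps of this example, the sketch below recomputes the two variances, the pooled variance, and t from the sums shown on the data slide (sum of X = 87 and 65, sum of X squared = 259 and 183, n = 30 per group). It assumes the usual computational formula for the sample variance and the pooled-variance t; the function and variable names are illustrative.

```python
import math

def variance_from_sums(sum_x, sum_x_sq, n):
    """Sample variance via the computational formula (SS / (n - 1))."""
    return (sum_x_sq - sum_x ** 2 / n) / (n - 1)

n1 = n2 = 30
sum_m, sum_sq_m = 87, 259   # male students
sum_f, sum_sq_f = 65, 183   # female students

var_m = variance_from_sums(sum_m, sum_sq_m, n1)
var_f = variance_from_sums(sum_f, sum_sq_f, n2)
pooled_var = ((n1 - 1) * var_m + (n2 - 1) * var_f) / (n1 + n2 - 2)
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_obt = (sum_m / n1 - sum_f / n2) / std_error

print(round(var_m, 4), round(var_f, 4))  # about 0.231 and 1.454
print(round(pooled_var, 4))              # about 0.8425
print(round(t_obt, 2))                   # about 3.09
```

The result agrees with the slide's 3.093 up to rounding of the intermediate values.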

35
Confidence Interval Approach

To solve the last problem using the 95%
Confidence Interval Approach, use the C.I.
formula with $t_{.05} = 1.984$.
Why is $t_{.05} = 1.984$?
Conclusion: since 0 is outside the interval,
reject H0.

36
Work On It! Example 1

A public school board is considering eliminating
music and art classes from an elementary school
curriculum. Researcher Z wants to prove that
students who take these classes get into less
trouble than their peers who do not take music
and art classes (to be able to persuade the
school board not to cut the classes). He randomly
selects 10 students who are currently taking
music and art and 10 students who are not taking
music and art. He then asks the students' parents
to rate them on the Trouble Scale from 1 to 10
(where 1 = no trouble and 10 = in trouble all the
time).

37
Example 1 continued

The data he collected are as follows:
Ratings for students taking music and art:
2, 4, 1, 5, 3, 3, 4, 2, 5, 1
Ratings for students not taking music or art:
4, 2, 3, 4, 6, 2, 3, 6, 4, 3
Test the researcher's hypothesis at the .05 level
of significance (state hypotheses, variables,
complete all calculations, interpret the results).
Test the hypothesis using the confidence interval
approach at the .05 level of significance.
Should the school board reconsider?

38

H0: Kids taking music and art classes will get
into the same amount of trouble as kids not
taking music and art classes.
Ha: Kids taking music and art classes will get
into less trouble than kids not taking music and
art classes.
IV: classes taken (art/music, no art/music)
DV: Trouble Scale rating

39
Data
X1    X1²    X2    X2²
2     4      4     16
4     16     2     4
1     1      3     9
5     25     4     16
3     9      6     36
3     9      2     4
4     16     3     9
2     4      6     36
5     25     4     16
1     1      3     9
ΣX1 = 30    ΣX1² = 110    ΣX2 = 37    ΣX2² = 155
40
$t_{crit}$ ($df = 18$, $\alpha = .05$) $= 1.734$; retain
H0. There is no difference in the amount of trouble
kids get into between those in music and art and
those not.
41
Because 0 is within the interval, we retain the
null hypothesis. The researcher will not have
proof to present to the school board, so they will
not reconsider based on his findings.
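
The worked solution for this exercise can be sketched in Python from the raw ratings listed above. The one-tailed critical value 1.734 is from the slide; the two-tailed value 2.101 (df = 18, alpha = .05) is taken from a standard t table and does not appear in this transcript, so treat it as an assumption. Because Ha predicts a lower mean for the music and art group, the observed t is compared with the negative critical value.

```python
import math
import statistics

music_art = [2, 4, 1, 5, 3, 3, 4, 2, 5, 1]
no_music_art = [4, 2, 3, 4, 6, 2, 3, 6, 4, 3]

n1, n2 = len(music_art), len(no_music_art)
mean_diff = statistics.mean(music_art) - statistics.mean(no_music_art)
pooled_var = ((n1 - 1) * statistics.variance(music_art)
              + (n2 - 1) * statistics.variance(no_music_art)) / (n1 + n2 - 2)
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_obt = mean_diff / std_error

T_CRIT_ONE_TAILED = 1.734   # from the slide (df = 18, alpha = .05)
T_CRIT_TWO_TAILED = 2.101   # assumption: standard t table, not in the slides

# Ha predicts LESS trouble in the music/art group, so reject only if
# t_obt falls at or below the negative critical value.
reject_h0 = t_obt <= -T_CRIT_ONE_TAILED

margin = T_CRIT_TWO_TAILED * std_error
ci = (mean_diff - margin, mean_diff + margin)

print(round(t_obt, 3))                    # about -1.076
print(reject_h0)                          # False
print(round(ci[0], 2), round(ci[1], 2))   # about -2.07 and 0.67
```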
42
Work On It! Example 2

Researcher Q thinks that students who are in
their fourth year of undergraduate studies get
better grades than students who are in their first
year of undergraduate studies. At the end of the
academic year, she randomly selects 20 students
who just finished their fourth year and 20
students who just finished their first year and
asks them their sessional GPAs. Summary data are
presented on the next slide.
Test the researcher's hypothesis at the .01 level
of significance using both the t-test and the
confidence interval approach.

43
Fourth year First Year
44

H0: Students in fourth year will have the same
GPAs as students in first year.
Ha: Students in fourth year will have higher GPAs
than students in first year.
IV: year of university (fourth, first)
DV: GPA

45
$t_{crit}$ ($df = 38$, $\alpha = .01$) $= 2.423$; reject
H0. Fourth-year students had better GPAs than
first-year students.
46
Because 0 is outside of the interval, we reject
the null hypothesis. Fourth-year students had
better GPAs than first-year students.
47
t-Tests for Related Samples: Examining Difference
Scores

When the two samples are related, we do not focus
on the scores themselves, but on the difference
scores of each of the pairs.
So, the difference (D) is obtained for each pair
of scores by subtracting the second score from
the first in each pair.
Difference scores are then summed ($\Sigma D$), and then
the mean difference ($\bar{D}$) is obtained.

48
Related-samples t-test

A researcher wanted to know whether St. John's
Wort is effective as a treatment for depression.
He recruited a sample of 20 depressed people.
Before taking St. John's Wort, participants were
asked to fill out the Beck Depression Inventory
to see how depressed they were.
Participants then each took 2 tablets of St.
John's Wort each day for 6 weeks.
At 6 weeks, participants filled out the Beck
Depression Inventory again.

49
The data (scores > 20 = depression)
#    Pre   Post   D        #    Pre   Post   D
1    26    20     6        11   32    26     6
2    22    15     7        12   22    24     -2
3    20    10     10       13   24    17     7
4    25    24     1        14   23    12     11
5    36    30     6        15   30    10     20
6    36    26     10       16   32    24     8
7    24    18     6        17   30    18     12
8    22    18     4        18   27    20     7
9    25    20     5        19   24    20     4
10   22    19     3        20   19    19     0

$\Sigma D = 131$, $\bar{D} = 131 / 20 = 6.55$
50
H0

We want to know if the mean difference score is
significantly greater than zero.
Stated differently, our null hypothesis is that
the mean difference score generated by these
samples does not differ from a population mean
difference score of 0. If this null hypothesis
is rejected, it means that the mean difference
score is large enough to confidently say that the
difference between scores is significant.

51
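Related Samples t-Formula
$t = \dfrac{\bar{D} - \mu_D}{s_{\bar{D}}}$, where $s_{\bar{D}} = \dfrac{s_D}{\sqrt{n}}$
Since $\mu_D = 0$, $t = \dfrac{\bar{D}}{s_{\bar{D}}}$
Degrees of Freedom Note!! df = the number of
pairs - 1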
52
Let's try it!
$s_{\bar{D}} = s_D / \sqrt{n} = 4.7736 / 4.4721 = 1.0674$
Since $\mu_D = 0$,
$t_{obt} = 6.55 / 1.0674 = 6.136$; $t_{crit}$ (1-tailed)
$= 1.729$. So, H0 is rejected.
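
The same result can be reproduced from the raw Pre and Post scores with a short Python sketch. It assumes the related-samples t is the mean difference divided by its standard error (the standard deviation of the difference scores over the square root of the number of pairs), which is what the numbers on this slide imply. The variable names are illustrative.

```python
import math
import statistics

pre = [26, 22, 20, 25, 36, 36, 24, 22, 25, 22,
       32, 22, 24, 23, 30, 32, 30, 27, 24, 19]
post = [20, 15, 10, 24, 30, 26, 18, 18, 20, 19,
        26, 24, 17, 12, 10, 24, 18, 20, 20, 19]

# D = first score minus second score for each pair.
diffs = [a - b for a, b in zip(pre, post)]
n = len(diffs)

mean_d = statistics.mean(diffs)     # mean difference
sd_d = statistics.stdev(diffs)      # sample SD of the difference scores
se_d = sd_d / math.sqrt(n)          # standard error of the mean difference
t_obt = mean_d / se_d               # mu_D = 0 under H0

T_CRIT_ONE_TAILED = 1.729           # from the slide (df = 19, alpha = .05)

print(sum(diffs), mean_d)           # 131 6.55
print(round(sd_d, 4), round(se_d, 4))
print(round(t_obt, 3), t_obt > T_CRIT_ONE_TAILED)
```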
53
Interpretation

Post-treatment depression scores are
significantly lower than pre-treatment depression
scores.
St. John's Wort seems to work for reducing
depression symptoms.

54
Underlying Assumptions of the Related Samples
t-test

The raw scores within each group are independent
of others within the group.
The groups of scores are random samples from
normally distributed populations.
The groups of scores are random samples from
populations with equal variances.

55
Why Use Difference Scores for Related Samples?

Difference scores allow us to avoid problems
regarding variability between subjects.
Difference scores allow us to control for
extraneous variables.
Difference scores allow us to use fewer
participants. It's easier to get 20 people to do
something twice than to get 40 people to do it
once.

56
Order and Carry-over effects

Sometimes the very order in which treatments are
administered affects performance.
The effect of previous trials (conditions) can
carry over to affect the next trial.
Example: IQ tests are administered before and
after participants eat a smart bar.

57
Another Example

A researcher went to Silver City and stood
outside the doors of Titanic at the end of the
movie, and stopped 10 couples, asking them to
rate how cheesy the movie was on a scale from 1
(not cheesy at all) to 10 (dripping with cheese).
The question: did the men find the movie to be
cheesier than the women?

58
The Data
$\Sigma D = 30.2$, $\bar{D} = 3.02$

$t_{obt} = 3.02 / .9424 = 3.205$; $t_{crit} = 1.833$. So,
H0 is rejected.
59
Confidence Intervals

We can use a confidence interval to find the
plausible range of the mean difference, that is,
the range within which the true mean difference
probably falls.
The CI for the mean difference:
95% C.I. $= \bar{D} \pm t_{.05}\, s_{\bar{D}}$
If the interval overlaps 0, then 0 is within the
plausible range of values, which have a .95
probability of occurrence. (The null hypothesis
would be retained.)

60
So...

$3.02 \pm 2.262(.9424)$
$= 3.02 \pm 2.1317$
95% C.I. $= .89$ to $5.15$
The interval does not overlap 0, so the null
hypothesis is rejected.

61
Work On It!

Researcher X thinks that, in families with 2
children, the older sibling would be more
outgoing than the younger sibling. He randomly
selects 10 families and rates how outgoing each
child is on the Outgoing Scale (from 1 to 10,
where 1 = very shy and 10 = the most
outgoing). The data are presented on the next
slide.
Test the researcher's hypothesis using both the
test statistic and confidence interval approaches.
Make sure you state the hypotheses, variables,
etc.

62
Work On It Example 1 Data
Family   Oldest Sib Rating   Youngest Sib Rating
1        5                   2
2        7                   4
3        9                   3
4        5                   6
5        4                   1
6        9                   5
7        3                   6
8        10                  7
9        7                   5
10       8                   2
63

H0: Older siblings are as outgoing as younger
siblings.
Ha: Older siblings are more outgoing than younger
siblings.
IV: sibling (older and younger)
DV: Outgoing Scale rating
This test is one-tailed at α = .05.

64
Work On It Example 1 Data
Family   Oldest Sib Rating   Youngest Sib Rating   D     D²
1        5                   2                     3     9
2        7                   4                     3     9
3        9                   3                     6     36
4        5                   6                     -1    1
5        4                   1                     3     9
6        9                   5                     4     16
7        3                   6                     -3    9
8        10                  7                     3     9
9        7                   5                     2     4
10       8                   2                     6     36
ΣD = 26    ΣD² = 138
65
$t_{crit}$ ($df = 9$) $= 1.833$, so reject H0. Older
siblings are more outgoing than younger siblings.
66
Confidence Interval Approach
Since 0 is outside the interval, we reject
H0. Older siblings are more outgoing than younger
siblings.
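
The calculations behind this answer can be sketched in Python using the column totals from the data slide (sum of D = 26, sum of D squared = 138). The sketch assumes the computational formula for the standard deviation of the difference scores. The one-tailed critical value 1.833 is from the slide, and 2.262 is the two-tailed df = 9 value used in the Titanic confidence interval earlier in the lecture.

```python
import math

older = [5, 7, 9, 5, 4, 9, 3, 10, 7, 8]
younger = [2, 4, 3, 6, 1, 5, 6, 7, 5, 2]

diffs = [o - y for o, y in zip(older, younger)]
n = len(diffs)
sum_d = sum(diffs)                      # 26
sum_d_sq = sum(d * d for d in diffs)    # 138

mean_d = sum_d / n
# Computational formula: s_D = sqrt((sum D^2 - (sum D)^2 / n) / (n - 1))
sd_d = math.sqrt((sum_d_sq - sum_d ** 2 / n) / (n - 1))
se_d = sd_d / math.sqrt(n)
t_obt = mean_d / se_d

T_CRIT_ONE_TAILED = 1.833   # df = 9, alpha = .05, from the slide
T_CRIT_TWO_TAILED = 2.262   # df = 9, value used in the lecture's earlier CI

margin = T_CRIT_TWO_TAILED * se_d
ci = (mean_d - margin, mean_d + margin)

print(round(t_obt, 2), t_obt > T_CRIT_ONE_TAILED)   # about 2.94, True
print(round(ci[0], 2), round(ci[1], 2))             # about 0.6 and 4.6
```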
67
Work On It Example 2

Bob wants the employees in his retail store to
increase the number of sales they make each day.
He randomly selects 8 employees and records their
number of sales per day. He then sends them to a
sales seminar (and hopes they will make more
sales after the completion of the seminar). When
they return to work, he again records their
number of sales per day. The data are presented
on the next slide.
Did the sales seminar work? Test using both the
test statistic and confidence interval approaches
(at the .01 level of significance)

68
Work On It Example 2 Data
Employee   # of sales before seminar   # of sales after seminar
1          2                           2
2          5                           7
3          12                          9
4          6                           9
5          8                           13
6          9                           11
7          7                           8
8          9                           14
69

H0: The sales staff will sell the same amount
before the seminar as after.
Ha: The sales staff will sell more after the
seminar.
IV: assessment time (before and after seminar)
DV: number of sales per day
This test is one-tailed at α = .01.

70
Employee   # of sales before seminar   # of sales after seminar   D     D²
1          2                           2                          0     0
2          5                           7                          -2    4
3          12                          9                          3     9
4          6                           9                          -3    9
5          8                           13                         -5    25
6          9                           11                         -2    4
7          7                           8                          -1    1
8          9                           14                         -5    25
ΣD = -15    ΣD² = 77
71
$t_{crit}$ ($df = 7$) $= -2.998$, so retain H0. The sales
seminar did not work.
72
Confidence Interval Approach
Since 0 is within the interval, we retain H0. The
sales seminar did not work.
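
Both conclusions for this exercise can be checked with a small Python sketch. D is taken as before minus after, as in the data table, so an improvement shows up as a negative t and is compared with the slide's negative critical value (-2.998). The two-tailed critical value 3.499 (df = 7, alpha = .01) comes from a standard t table and is not shown in this transcript, so treat it as an assumption. The helper name `paired_t` is illustrative.

```python
import math
import statistics

def paired_t(first, second):
    """Return (mean difference, standard error, t) for paired scores."""
    diffs = [a - b for a, b in zip(first, second)]
    mean_d = statistics.mean(diffs)
    se_d = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_d, se_d, mean_d / se_d

before = [2, 5, 12, 6, 8, 9, 7, 9]
after = [2, 7, 9, 9, 13, 11, 8, 14]

mean_d, se_d, t_obt = paired_t(before, after)

T_CRIT_ONE_TAILED = -2.998   # from the slide (df = 7, alpha = .01)
T_CRIT_TWO_TAILED = 3.499    # assumption: standard t table, not in the slides

reject_h0 = t_obt <= T_CRIT_ONE_TAILED
margin = T_CRIT_TWO_TAILED * se_d
ci_99 = (mean_d - margin, mean_d + margin)

print(mean_d, round(t_obt, 2))                  # -1.875, about -2.01
print(reject_h0)                                # False
print(round(ci_99[0], 2), round(ci_99[1], 2))   # about -5.14 and 1.39
```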
