Suppose we have a population of adult men with a mean height of 71 inches and standard deviation of 2.5 inches. We also have a population of adult women with a mean height of 65 inches and standard deviation of 2.3 inches. Assume heights are


Slides: 41

Provided by: Plano254

Transcript and Presenter's Notes

1
Suppose we have a population of adult men
with a mean height of 71 inches and
standard deviation of 2.5 inches. We also have a
population of adult women with a mean height of
65 inches and standard deviation of 2.3 inches.
Assume heights are normally distributed.
Suppose we take a random sample of 30 men and a
random sample of 25 women from their respective
populations and calculate the difference in their
mean heights (men's mean height - women's mean height). If we
did this many times, what would the distribution
of differences be like?
On the next slide we will investigate this
distribution.
2
Randomly take one of the sample means for the
males and one of the sample means for the females
and find the difference in mean heights.
$\sigma_M = 2.5$
$\sigma_F = 2.3$
3

Heights Continued . . .
Describe the sampling distribution of the
difference in mean heights between men and women.
What is the probability that the difference in
mean heights of a random sample of 30 men and a
random sample of 25 women is less than 5 inches?

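The sampling distribution is normally distributed
with mean $\mu_M - \mu_F = 71 - 65 = 6$ inches and
standard deviation $\sqrt{2.5^2/30 + 2.3^2/25} \approx 0.648$ inches.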
4
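1. $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
2. $\sigma^2_{\bar{x}_1 - \bar{x}_2} = \dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$ and $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$
The variance of the differences is the sum of the
variances.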
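The heights question can be checked numerically. The following is a minimal Python sketch, assuming only the population values stated on the slides; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Population values stated on the slides
mu_m, sigma_m, n_m = 71.0, 2.5, 30   # men
mu_f, sigma_f, n_f = 65.0, 2.3, 25   # women

# Mean and standard deviation of the sampling distribution of
# (sample mean for men - sample mean for women)
mean_diff = mu_m - mu_f
sd_diff = math.sqrt(sigma_m**2 / n_m + sigma_f**2 / n_f)

# Probability that the difference in sample means is less than 5 inches
prob_less_than_5 = NormalDist(mean_diff, sd_diff).cdf(5.0)

print(f"mean = {mean_diff:.1f}, sd = {sd_diff:.3f}, P(diff < 5) = {prob_less_than_5:.4f}")
```

The standard deviation comes from adding the two variances of the sample means, not the two standard deviations.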
5
When two random samples are independently
selected and n1 and n2 are both large or the
population distributions are (at least
approximately) normal, the distribution of
$z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$
is described (at least approximately) by the
standard normal (z) distribution.
We must know $\sigma_1$ and $\sigma_2$ in order to use this
procedure.
If $\sigma_1$ and $\sigma_2$ are unknown we must use t
distributions.
6
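Two-Sample t Test for Comparing Two Populations

Null Hypothesis $H_0: \mu_1 - \mu_2 =$ hypothesized value
The hypothesized value is often 0, but there are
times when we are interested in testing for a
difference that is not 0.
Test Statistic
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - \text{hypothesized value}}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
The appropriate df for the two-sample t test is
$df = \dfrac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}}$
where $V_1 = \dfrac{s_1^2}{n_1}$ and $V_2 = \dfrac{s_2^2}{n_2}$
The computed number of df should be truncated to
an integer.

A conservative estimate of the P-value can be
found by using the t-curve with the number of
degrees of freedom equal to the smaller of
$(n_1 - 1)$ or $(n_2 - 1)$.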
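The test statistic and df for this procedure can be sketched in a few lines of Python. The function name `two_sample_t` and the summary statistics in the example call are made up for illustration; they do not come from the slides.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2, hypothesized=0.0):
    """Return (t, truncated df, conservative df) for the two-sample t test."""
    v1 = sd1**2 / n1   # V1 = s1^2 / n1
    v2 = sd2**2 / n2   # V2 = s2^2 / n2
    t = ((mean1 - mean2) - hypothesized) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    # Truncate the computed df; the conservative choice is min(n1 - 1, n2 - 1)
    return t, math.floor(df), min(n1 - 1, n2 - 1)

# Hypothetical summary statistics, for illustration only
t, df, df_conservative = two_sample_t(10.0, 2.0, 16, 8.0, 3.0, 9)
print(f"t = {t:.3f}, df = {df}, conservative df = {df_conservative}")
```

The conservative df is never larger than the computed df, so a P-value based on it is never smaller.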
7
Two-Sample t Test for Comparing Two Populations
Continued . . .

Null Hypothesis $H_0: \mu_1 - \mu_2 =$ hypothesized value

Alternative Hypothesis: P-value
$H_a: \mu_1 - \mu_2 >$ hypothesized value: Area under the appropriate t curve to the right of the computed t
$H_a: \mu_1 - \mu_2 <$ hypothesized value: Area under the appropriate t curve to the left of the computed t
$H_a: \mu_1 - \mu_2 \neq$ hypothesized value: 2(area to the right of computed t) if t is positive, or 2(area to the left of computed t) if t is negative
8
Another Way to Write Hypothesis Statements
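When the hypothesized value is 0, we can rewrite
these hypothesis statements
$H_0: \mu_1 - \mu_2 = 0$ as $H_0: \mu_1 = \mu_2$
$H_a: \mu_1 - \mu_2 < 0$ as $H_a: \mu_1 < \mu_2$
$H_a: \mu_1 - \mu_2 > 0$ as $H_a: \mu_1 > \mu_2$
$H_a: \mu_1 - \mu_2 \neq 0$ as $H_a: \mu_1 \neq \mu_2$
Be sure to define BOTH $\mu_1$ and $\mu_2$!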
9
Two-Sample t Test for Comparing Two Populations
Continued . . .

Assumptions
The two samples are independently selected random
samples from the populations of interest
The sample sizes are large (generally 30 or
larger) or the population distributions are (at
least approximately) normal.
When comparing two treatment groups, use the
following assumptions
Individuals or objects are randomly assigned to
treatments (or vice versa)
The sample sizes are large (generally 30 or
larger) or the treatment response distributions
are approximately normal.

Are women still paid less than men for comparable
work? A study was carried out in which
salary data was collected from a random
sample of men and from a random sample of women
who worked as purchasing managers and who were
subscribers to Purchasing magazine. Annual
salaries (in thousands of dollars) appear below
(the actual sample sizes were much larger). Use
$\alpha = .05$ to determine if there is convincing
evidence that the mean annual salary for male
purchasing managers is greater than the mean
annual salary for female purchasing managers.
Men 81 69 81 76 76 74 69 76 79 65
Women 78 60 67 61 62 73 71 58 68 48
State the hypotheses
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 > 0$
Where $\mu_1$ = mean annual salary for male purchasing
managers and $\mu_2$ = mean annual salary for female
purchasing managers

If we had defined $\mu_1$ as the mean salary for
female purchasing managers and $\mu_2$ as the mean
salary for male purchasing managers, then the
correct alternative hypothesis would be the
difference in the means is less than 0.
11

Salary War Continued . . .
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 > 0$
Where $\mu_1$ = mean annual salary for male purchasing
managers and $\mu_2$ = mean annual salary for female
purchasing managers
Verify the assumptions
Assumptions
1) Given two independently selected random samples
of male and female purchasing managers.
Even though these are samples from subscribers of
Purchasing magazine, the authors of the study
believed it was reasonable to view the samples
as representative of the populations of interest.

2) Since the sample sizes are small, we must
determine if it is plausible that the salary
distributions for each of the two populations are
approximately normal. Since the boxplots are
reasonably symmetrical with no outliers, it is
plausible that the population distributions are
approximately normal.
12

Salary War Continued . . .
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 > 0$
Where $\mu_1$ = mean annual salary for male purchasing
managers and $\mu_2$ = mean annual salary for female
purchasing managers
Compute the test statistic and P-value
Test Statistic
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} \approx 3.11$
To find the P-value, first find the appropriate
df.
Truncate (round down) this value: df = 15.
Now find the area to the right of t = 3.11 in the
t-curve with df = 15.
P-value $\approx$ .004, $\alpha = .05$
Since the P-value < $\alpha$, we reject $H_0$. There is
convincing evidence that the mean salary for male
purchasing managers is higher than the mean
salary for female purchasing managers.

What potential type of error could we have made with
this conclusion?
Type I
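The salary calculation can be reproduced with the Python standard library. This is a sketch, not the slides' own method: `t_upper_tail` is a hypothetical helper that approximates the area under the t curve with Simpson's rule, standing in for a t table or calculator.

```python
import math
from statistics import mean, variance

men = [81, 69, 81, 76, 76, 74, 69, 76, 79, 65]
women = [78, 60, 67, 61, 62, 73, 71, 58, 68, 48]

def t_upper_tail(t, df, steps=2000):
    """Area to the right of t under the t curve (Simpson's rule on the density)."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: coef * (1 + x * x / df) ** (-(df + 1) / 2)
    h = abs(t) / steps
    total = pdf(0.0) + pdf(abs(t))
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t if t >= 0 else 0.5 + area_0_to_t

# V1 = s1^2 / n1 and V2 = s2^2 / n2 (variance() is the sample variance)
v1 = variance(men) / len(men)
v2 = variance(women) / len(women)

t_stat = ((mean(men) - mean(women)) - 0) / math.sqrt(v1 + v2)
df = math.floor((v1 + v2) ** 2 / (v1**2 / (len(men) - 1) + v2**2 / (len(women) - 1)))
p_value = t_upper_tail(t_stat, df)   # Ha: mu1 - mu2 > 0, so use the right tail

print(f"t = {t_stat:.2f}, df = {df}, P-value = {p_value:.4f}")
```

The result should agree with the slide's t of about 3.11, df of 15, and P-value of about .004.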
13
The Two-Sample t Confidence Interval for the
Difference Between Two Population or Treatment
Means

The general formula for a confidence interval for
$\mu_1 - \mu_2$ when
1) The two samples are independently selected random
samples from the populations of interest
2) The sample sizes are large (generally 30 or
larger) or the population distributions are (at
least approximately) normal.
is
$(\bar{x}_1 - \bar{x}_2) \pm (t \text{ critical value})\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}$
The t critical value is based on
$df = \dfrac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}}$
where $V_1 = \dfrac{s_1^2}{n_1}$ and $V_2 = \dfrac{s_2^2}{n_2}$
df should be truncated to an integer.

For a comparison of two treatments, use the
following assumptions
1) Individuals or objects are randomly assigned to
treatments (or vice versa)
2) The sample sizes are large (generally 30 or
larger) or the treatment response distributions
are approximately normal.
14
In a study on food intake after sleep
deprivation, men were randomly assigned to
one of two treatment groups. The experimental
group were required to sleep only 4 hours on each
of two nights, while the control group were
required to sleep 8 hours on each of two nights.
The amount of food intake (Kcal) on the day
following the two nights of sleep was measured.
Compute a 95% confidence interval for the true
difference in the mean food intake for the two
sleeping conditions.
4-hour sleep 3585 4470 3068 5338 2221 4791 4435 3099
4-hour sleep 3187 3901 3868 3869 4878 3632 4518
8-hour sleep 4965 3918 1987 4993 5220 3653 3510 3338
8-hour sleep 4100 5792 4547 3319 3336 4304 4057
Find the mean and standard deviation for each
treatment.
15
Food Intake Study Continued . . .
Verify the assumptions.
Assumptions
1) Men were randomly assigned to two treatment groups
2) The assumption of normal response
distributions is plausible because both boxplots
are approximately symmetrical with no outliers.
16
Food Intake Study Continued . . .
Calculate the interval.
(-814.1, 523.6), for the 4-hour mean minus the 8-hour mean
Interpret the interval in context.
We are 95% confident that the true difference in
the mean food intake for the two sleeping
conditions is between -814.1 Kcal and 523.6 Kcal.
Based upon this interval, is there a significant
difference in the mean food intake for the two
sleeping conditions?
No, since 0 is in the confidence interval, there
is not convincing evidence that the mean food
intakes for the two sleep conditions are different.
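The food intake interval can be recomputed as follows. This is a sketch under stated assumptions: `t_cdf` and `t_critical` are hypothetical helpers (Simpson's rule plus bisection) that stand in for a t table. Because df is truncated to an integer here, the endpoints may differ slightly from the slide's -814.1 and 523.6.

```python
import math
from statistics import mean, variance

four_hour = [3585, 4470, 3068, 5338, 2221, 4791, 4435, 3099,
             3187, 3901, 3868, 3869, 4878, 3632, 4518]
eight_hour = [4965, 3918, 1987, 4993, 5220, 3653, 3510, 3338,
              4100, 5792, 4547, 3319, 3336, 4304, 4057]

def t_cdf(t, df, steps=2000):
    """P(T <= t) for t >= 0, by Simpson's rule on the t density."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    pdf = lambda x: coef * (1 + x * x / df) ** (-(df + 1) / 2)
    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

def t_critical(conf_level, df):
    """Bisection for the t value that captures conf_level in the middle."""
    target = 1 - (1 - conf_level) / 2
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n1, n2 = len(four_hour), len(eight_hour)
v1 = variance(four_hour) / n1
v2 = variance(eight_hour) / n2
df = math.floor((v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1)))

diff = mean(four_hour) - mean(eight_hour)          # 4-hour minus 8-hour
margin = t_critical(0.95, df) * math.sqrt(v1 + v2)
lower, upper = diff - margin, diff + margin

print(f"df = {df}, interval = ({lower:.1f}, {upper:.1f})")
```

An interval that contains 0 matches the slide's conclusion that the data do not show a convincing difference.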
17
Pooled t Test

Used when the variances of the two populations
are equal ($\sigma_1 = \sigma_2$)
Combines information from both samples to create
a pooled estimate of the common variance which
is used in place of the two sample standard
deviations
Is not widely used due to its sensitivity to any
departure from the equal variance assumption

P-values computed using the pooled t procedure
can be far from the actual P-value if the
population variances are not equal.
When the population variances are equal, the
pooled t procedure is better at detecting
deviations from H0 than the two-sample t test.
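The slides do not show the pooled formulas. For reference, the usual textbook form is sketched below; the symbol $s_p^2$ for the pooled variance is an assumption about notation, not taken from the presentation.

```latex
s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{(\bar{x}_1 - \bar{x}_2) - \text{hypothesized value}}
         {\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}},
\qquad
df = n_1 + n_2 - 2
```

The pooled variance is a weighted average of the two sample variances, with weights given by each sample's degrees of freedom.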
18
Suppose that an investigator wants to determine
if regular aerobic exercise improves blood
pressure. A random sample of people who jog
regularly and a second random sample of people
who do not exercise regularly are selected
independently of one another. Can we conclude
that the difference in mean blood pressure is
attributed to jogging? What about other factors
like weight? One way to avoid these
difficulties would be to pair subjects by
weight then assign one of the pair to jogging
and the other to no exercise.
19
Summary of the Paired t test for Comparing Two
Population or Treatment Means

Null Hypothesis $H_0: \mu_d =$ hypothesized value
Where $\mu_d$ is the mean of the differences in the
paired observations
The hypothesized value is usually 0, meaning
that there is no difference.
Test Statistic
$t = \dfrac{\bar{x}_d - \text{hypothesized value}}{s_d / \sqrt{n}}$
Where n is the number of sample differences and
$\bar{x}_d$ and $s_d$ are the mean and standard deviation of
the sample differences. This test is based on
df = n - 1.
Alternative Hypothesis: P-value
$H_a: \mu_d >$ hypothesized value: Area to the right of calculated t
$H_a: \mu_d <$ hypothesized value: Area to the left of calculated t
$H_a: \mu_d \neq$ hypothesized value: 2(area to the right of t) if t is positive, or 2(area to the left of t) if t is negative
20
Summary of the Paired t test for Comparing Two
Population or Treatment Means Continued . . .

Assumptions
The samples are paired.
The n sample differences can be viewed as a
random sample from a population of differences.
The number of sample differences is large
(generally at least 30) or the population
distribution of differences is (at least
approximately) normal.

21
Is this an example of paired samples?

An engineering association wants to see if there
is a difference in the mean annual salary for
electrical engineers and chemical engineers. A
random sample of electrical engineers is surveyed
about their annual income. Another random sample
of chemical engineers is surveyed about their
annual income.

No, there is no pairing of individuals; you have
two independent samples.
22
Is this an example of paired samples?

A pharmaceutical company wants to test its new
weight-loss drug. Before giving the drug to
volunteers, company researchers weigh each
person. After a month of using the drug, each
person's weight is measured again.

Yes, you have two observations on each
individual, resulting in paired data.
23

Can playing chess improve your memory? In a
study, students who had not previously played
chess participated in a program in which they
took chess lessons and played chess daily for 9
months. Each student took a memory test before
starting the chess program and again at the end
of the 9-month period.

Student 1 2 3 4 5 6 7 8 9 10 11 12
Pre-test 510 610 640 675 600 550 610 625 450 720 575 675
Post-test 850 790 850 775 700 775 700 850 690 775 540 680
Difference -340 -180 -210 -100 -100 -225 -90 -225 -240 -55 35 -5
First, find the differences pre-test minus
post-test.
State the hypotheses.
$H_0: \mu_d = 0$
$H_a: \mu_d < 0$
Where $\mu_d$ is the mean memory
score difference between students with no chess
training and students who have completed chess
training
If we had subtracted Post-test minus Pre-test,
then the alternative hypothesis would be the mean
difference is greater than 0.
24

Playing Chess Continued . . .

$H_0: \mu_d = 0$
$H_a: \mu_d < 0$
Where $\mu_d$ is the mean memory score difference
between students with no chess training and
students who have completed chess training
Verify assumptions
Assumptions
1) Although the sample of students is not a random
sample, the investigator believed that it was
reasonable to view the 12 sample differences as
representative of all such differences.
2) A boxplot of the differences is approximately
symmetrical with no outliers so the assumption of
normality is plausible.
25

Playing Chess Continued . . .

$H_0: \mu_d = 0$
$H_a: \mu_d < 0$
Where $\mu_d$ is the mean memory score difference
between students with no chess training and
students who have completed chess training
Compute the test statistic and P-value.
Test Statistic
$t = \dfrac{\bar{x}_d - 0}{s_d / \sqrt{n}} \approx \dfrac{-144.58}{109.74 / \sqrt{12}} \approx -4.56$
P-value $\approx$ 0, df = 11, $\alpha = .05$
State the conclusion in context.
Since the P-value < $\alpha$, we reject $H_0$. There is
convincing evidence to suggest that the mean
memory score after chess training is higher than
the mean memory score before training.
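The paired t statistic for the chess data can be checked with a short Python sketch. Instead of computing a P-value, it compares t with the one-tailed critical value 1.796 (a standard t-table entry for df = 11 and a tail area of .05). That is an equivalent rejection-region check, not the P-value approach used on the slide.

```python
import math
from statistics import mean, stdev

pre = [510, 610, 640, 675, 600, 550, 610, 625, 450, 720, 575, 675]
post = [850, 790, 850, 775, 700, 775, 700, 850, 690, 775, 540, 680]

# Differences are pre-test minus post-test, as on the slide
diffs = [a - b for a, b in zip(pre, post)]

n = len(diffs)
mean_d = mean(diffs)
sd_d = stdev(diffs)                      # sample standard deviation of the differences
t_stat = (mean_d - 0) / (sd_d / math.sqrt(n))
df = n - 1

T_CRIT_05_DF11 = 1.796                   # t-table value: df = 11, one-tailed area .05
reject_h0 = t_stat < -T_CRIT_05_DF11     # Ha: mu_d < 0, so reject for very negative t

print(f"mean = {mean_d:.2f}, sd = {sd_d:.2f}, t = {t_stat:.2f}, df = {df}, reject H0: {reject_h0}")
```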
26
Paired t Confidence Interval for $\mu_d$

When
The samples are paired.
The n sample differences can be viewed as a
random sample from a population of differences.
The number of sample differences is large
(generally at least 30) or the population
distribution of differences is (at least
approximately) normal.
the paired t interval for $\mu_d$ is
$\bar{x}_d \pm (t \text{ critical value})\dfrac{s_d}{\sqrt{n}}$
Where df = n - 1

Playing Chess Revisited . . .

Student 1 2 3 4 5 6 7 8 9 10 11 12
Pre-test 510 610 640 675 600 550 610 625 450 720 575 675
Post-test 850 790 850 775 700 775 700 850 690 775 540 680
Difference -340 -180 -210 -100 -100 -225 -90 -225 -240 -55 35 -5
Compute a 90% confidence interval for the mean
difference in memory scores before chess training
and the memory scores after chess training.
We are 90% confident that the true mean
difference in memory scores before chess training
and the memory scores after chess training is
between -201.5 and -87.69.
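The arithmetic behind this interval can be written out as below. The values $\bar{x}_d \approx -144.58$ and $s_d \approx 109.74$ are computed from the 12 differences, and 1.796 is the t-table critical value for df = 11 at 90% confidence; the slide itself does not show these intermediate values.

```latex
\bar{x}_d \pm t^{*}\,\frac{s_d}{\sqrt{n}}
  = -144.58 \pm 1.796 \cdot \frac{109.74}{\sqrt{12}}
  \approx -144.58 \pm 56.90
  \approx (-201.5,\ -87.7)
```

Both endpoints are negative, which is consistent with rejecting $H_0$ in the paired t test.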
28
Large-Sample Inferences Concerning the Difference
Between Two Population or Treatment Proportions
29
Some people seem to think that duct tape can fix
anything . . . even remove warts!

Investigators at Madigan Army Medical Center
tested using duct tape to remove warts versus the
more traditional freezing treatment.
Suppose that the duct tape treatment will
successfully remove 50% of warts and that the
traditional freezing treatment will successfully
remove 60% of warts.

30
$p_{freeze}$ = the true proportion of warts that are
successfully removed by freezing; $p_{freeze} = .6$
$p_{tape}$ = the true proportion of warts that are
successfully removed by using duct tape; $p_{tape} = .5$
Randomly take one of the sample proportions for
the freezing treatment and one of the sample
proportions for the duct tape treatment and find
the difference.
31
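When performing a hypothesis test, we will use
the null hypothesis that $p_1$ and $p_2$ are equal. We
will not know the common value for $p_1$ and $p_2$, so
we estimate it with the combined sample proportion
$\hat{p}_c = \dfrac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$.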
32
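Summary of Large-Sample z Test for $p_1 - p_2 = 0$

Null Hypothesis $H_0: p_1 - p_2 = 0$
Test Statistic
$z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_c(1 - \hat{p}_c)}{n_1} + \dfrac{\hat{p}_c(1 - \hat{p}_c)}{n_2}}}$
where $\hat{p}_c$ is the combined sample proportion (total number of successes divided by total sample size)
Alternative Hypothesis: P-value
$H_a: p_1 - p_2 > 0$: area to the right of calculated z
$H_a: p_1 - p_2 < 0$: area to the left of calculated z
$H_a: p_1 - p_2 \neq 0$: 2(area to the right of z) if z is positive, or 2(area to the left of z) if z is negative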

33
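Another Way to Write Hypothesis Statements
$H_0: p_1 - p_2 = 0$ can be written as $H_0: p_1 = p_2$
$H_a: p_1 - p_2 > 0$ can be written as $H_a: p_1 > p_2$
$H_a: p_1 - p_2 < 0$ can be written as $H_a: p_1 < p_2$
$H_a: p_1 - p_2 \neq 0$ can be written as $H_a: p_1 \neq p_2$
Be sure to define both $p_1$ and $p_2$!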
34
Summary of Large-Sample z Test for $p_1 - p_2 = 0$
Continued . . .

Assumption
The samples are independently chosen random
samples or treatments were assigned at random to
individuals or objects

35
Investigators at Madigan Army Medical Center
tested using duct tape to remove warts. Patients
with warts were randomly assigned to either the
duct tape treatment or to the more traditional
freezing treatment. Those in the duct tape group
wore duct tape over the wart for 6 days, then
removed the tape, soaked the area in water, and
used an emery board to scrape the area. This
process was repeated for a maximum of 2 months or
until the wart was gone. The data
follow. Do these data suggest that
freezing is less successful than duct tape in
removing warts?
Treatment                   n     Number with wart successfully removed
Liquid nitrogen freezing    100   60
Duct tape                   104   88
36
Duct Tape Continued . . .
$H_0: p_1 - p_2 = 0$
$H_a: p_1 - p_2 < 0$
Where $p_1$ is the true proportion of warts that
would be successfully removed by freezing and $p_2$
is the true proportion of warts that would be
successfully removed by duct tape
Assumptions
1) Subjects were randomly assigned to the two treatments.
37
Duct Tape Continued . . .
$H_0: p_1 - p_2 = 0$
$H_a: p_1 - p_2 < 0$
P-value $\approx$ 0, $\alpha = .01$
Since the P-value < $\alpha$, we reject $H_0$. There is
convincing evidence to suggest the proportion of
warts successfully removed is lower for freezing
than for the duct tape treatment.
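The z statistic behind this conclusion is not shown in the transcript. A minimal Python sketch, assuming the large-sample z test with a combined sample proportion and using the counts from the table, is:

```python
import math
from statistics import NormalDist

n1, x1 = 100, 60    # liquid nitrogen freezing: sample size, warts removed
n2, x2 = 104, 88    # duct tape: sample size, warts removed

p1_hat = x1 / n1
p2_hat = x2 / n2
p_combined = (x1 + x2) / (n1 + n2)   # combined estimate of the common proportion under H0

se = math.sqrt(p_combined * (1 - p_combined) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se
p_value = NormalDist().cdf(z)        # Ha: p1 - p2 < 0, so use the left tail

print(f"z = {z:.2f}, P-value = {p_value:.5f}")
```

A z value this far below 0 gives a left-tail area that rounds to 0, which matches the slide's P-value.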
38
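A Large-Sample Confidence Interval for $p_1 - p_2$

When
The samples are independently chosen random
samples or treatments were assigned at random to
individuals or objects
a large-sample confidence interval for $p_1 - p_2$ is
$(\hat{p}_1 - \hat{p}_2) \pm (z \text{ critical value})\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$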

The article "Freedom of What?" (Associated Press,
February 1, 2005) described a study in which high
school students and high school teachers were
asked whether they agreed with the following
statement: "Students should be allowed to report
controversial issues in their student newspapers
without the approval of school authorities." It
was reported that 58% of students surveyed and
39% of teachers surveyed agreed with the
statement. The two samples (10,000 high school
students and 8000 high school teachers) were
selected from schools across the country.
Compute a 90% confidence interval for the
difference in proportion of students who
agreed with the statement and the
proportion of teachers who agreed
with the statement.

Newspaper Problem Continued . . .

1) Assume that it is reasonable to regard these
two samples as being independently selected and
representative of the populations of interest.
We are 90% confident that the difference in
proportion of students who agreed with the
statement and the proportion of teachers who
agreed with the statement is between .178 and
.202.
Based on this confidence interval, does there
appear to be a significant difference in
proportion of students who agreed with the
statement and the proportion of teachers who
agreed with the statement? Explain.
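The interval for the newspaper problem can be reproduced with a short Python sketch, assuming the large-sample interval for $p_1 - p_2$ with students as sample 1 and teachers as sample 2:

```python
import math
from statistics import NormalDist

p1_hat, n1 = 0.58, 10_000   # students who agreed
p2_hat, n2 = 0.39, 8_000    # teachers who agreed

z_crit = NormalDist().inv_cdf(0.95)   # 90% confidence leaves 5% in each tail
se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

diff = p1_hat - p2_hat
lower, upper = diff - z_crit * se, diff + z_crit * se

print(f"90% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```

The interval does not use a combined proportion; that estimate is only used when testing the null hypothesis that the two proportions are equal.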
