Suppose we have a population of adult men with a mean height of 71 inches and standard deviation of 2.5 inches. We also have a population of adult women with a mean height of 65 inches and standard deviation of 2.3 inches. Assume heights are - PowerPoint PPT Presentation

Slides: 41
Provided by: Plano254

Transcript and Presenter's Notes



1
Suppose we have a population of adult men
with a mean height of 71 inches and
standard deviation of 2.5 inches. We also have a
population of adult women with a mean height of
65 inches and standard deviation of 2.3 inches.
Assume heights are normally distributed.
Suppose we take a random sample of 30 men and a
random sample of 25 women from their respective
populations and calculate the difference in their
mean heights (men's mean height - women's mean height). If we
did this many times, what would the distribution
of differences be like?
On the next slide we will investigate this
distribution.
2
Randomly take one of the sample means for the
males and one of the sample means for the females
and find the difference in mean heights.
$\sigma_M = 2.5$
$\sigma_F = 2.3$
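The repeated-sampling idea on these two slides can be tried out with a short simulation. This is only an illustrative sketch, not part of the original presentation: the number of repetitions (5000) and the random seed are arbitrary assumptions, and only the Python standard library is used.

```python
import random
import statistics

random.seed(1)  # arbitrary seed (assumption) so the run is repeatable

MU_M, SIGMA_M, N_M = 71, 2.5, 30   # men: population mean, SD, sample size
MU_F, SIGMA_F, N_F = 65, 2.3, 25   # women: population mean, SD, sample size
REPS = 5000                        # number of repeated sample pairs (assumption)

diffs = []
for _ in range(REPS):
    # Draw one sample of men and one of women from normal populations.
    men = [random.gauss(MU_M, SIGMA_M) for _ in range(N_M)]
    women = [random.gauss(MU_F, SIGMA_F) for _ in range(N_F)]
    # Record the difference in sample means (men minus women).
    diffs.append(sum(men) / N_M - sum(women) / N_F)

sim_mean = statistics.mean(diffs)
sim_sd = statistics.stdev(diffs)
print(f"mean of simulated differences: {sim_mean:.3f}")
print(f"SD of simulated differences:   {sim_sd:.3f}")
```

With enough repetitions, the simulated mean should settle near 71 - 65 = 6 inches and the simulated standard deviation near the value given by the sum-of-variances rule.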
3
  • Heights Continued . . .
  • Describe the sampling distribution of the
    difference in mean heights between men and women.
  • What is the probability that the difference in
    mean heights of a random sample of 30 men and a
    random sample of 25 women is less than 5 inches?

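The sampling distribution is normally distributed
with mean $\mu_{\bar{x}_M - \bar{x}_F} = 71 - 65 = 6$ inches and standard deviation
$\sigma_{\bar{x}_M - \bar{x}_F} = \sqrt{\frac{2.5^2}{30} + \frac{2.3^2}{25}} \approx 0.648$ inches.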
4
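1. $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2$
2. $\sigma^2_{\bar{x}_1 - \bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$ and $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$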
The variance of the differences is the sum of the
variances.
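For the heights question (the probability that the difference in sample means is less than 5 inches), the rule that variances add can be applied directly. A minimal sketch using Python's standard-library `NormalDist`; the variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Population values and sample sizes from the heights example.
mu_men, sigma_men, n_men = 71, 2.5, 30
mu_women, sigma_women, n_women = 65, 2.3, 25

# Mean of the difference in sample means: difference of the population means.
mu_diff = mu_men - mu_women

# SD of the difference: square root of the SUM of the two variances of the means.
sd_diff = sqrt(sigma_men**2 / n_men + sigma_women**2 / n_women)

# P(difference in sample means < 5) under the normal sampling distribution.
p_less_than_5 = NormalDist(mu_diff, sd_diff).cdf(5)

print(f"mean = {mu_diff}, SD = {sd_diff:.3f}, P(diff < 5) = {p_less_than_5:.4f}")
```

The probability comes out at roughly 0.06, so a difference below 5 inches would be fairly unusual for samples of these sizes.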
5
When two random samples are independently
selected and $n_1$ and $n_2$ are both large or the
population distributions are (at least
approximately) normal, the distribution of
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
is described (at least approximately) by the
standard normal (z) distribution.
We must know $\sigma_1$ and $\sigma_2$ in order to use this
procedure.
If $\sigma_1$ and $\sigma_2$ are unknown, we must use t
distributions.
6
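Two-Sample t Test for Comparing Two Populations
  • Null Hypothesis H0: $\mu_1 - \mu_2$ = hypothesized value
  • Test Statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - \text{hypothesized value}}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
  • The appropriate df for the two-sample t test is
    $df = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}}$
    where $V_1 = \frac{s_1^2}{n_1}$ and $V_2 = \frac{s_2^2}{n_2}$
  • The computed number of df should be truncated to
    an integer.

The hypothesized value is often 0, but there are
times when we are interested in testing for a
difference that is not 0.
A conservative estimate of the P-value can be
found by using the t-curve with the number of
degrees of freedom equal to the smaller of $(n_1 - 1)$
or $(n_2 - 1)$.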
7
Two-Sample t Test for Comparing Two Populations
Continued . . .
  • Null Hypothesis H0: $\mu_1 - \mu_2$ = hypothesized value

Alternative Hypothesis: P-value
Ha: $\mu_1 - \mu_2 >$ hypothesized value: Area under the appropriate t curve to the right of the computed t
Ha: $\mu_1 - \mu_2 <$ hypothesized value: Area under the appropriate t curve to the left of the computed t
Ha: $\mu_1 - \mu_2 \neq$ hypothesized value: 2(area to the right of the computed t) if t is positive, or 2(area to the left of the computed t) if t is negative
8
Another Way to Write Hypothesis Statements
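When the hypothesized value is 0, we can rewrite
these hypothesis statements:
  • H0: $\mu_1 - \mu_2 = 0$ becomes H0: $\mu_1 = \mu_2$
  • Ha: $\mu_1 - \mu_2 < 0$ becomes Ha: $\mu_1 < \mu_2$
  • Ha: $\mu_1 - \mu_2 > 0$ becomes Ha: $\mu_1 > \mu_2$
  • Ha: $\mu_1 - \mu_2 \neq 0$ becomes Ha: $\mu_1 \neq \mu_2$

Be sure to define BOTH $\mu_1$ and $\mu_2$!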
9
Two-Sample t Test for Comparing Two Populations
Continued . . .
  • Assumptions
  • The two samples are independently selected random
    samples from the populations of interest
  • The sample sizes are large (generally 30 or
    larger) or the population distributions are (at
    least approximately) normal.
  • When comparing two treatment groups, use the
    following assumptions
  • Individuals or objects are randomly assigned to
    treatments (or vice versa)
  • The sample sizes are large (generally 30 or
    larger) or the treatment response distributions
    are approximately normal.

10
  • Are women still paid less than men for comparable
    work? A study was carried out in which
    salary data were collected from a random
    sample of men and from a random sample of women
    who worked as purchasing managers and who were
    subscribers to Purchasing magazine. Annual
    salaries (in thousands of dollars) appear below
    (the actual sample sizes were much larger). Use
    $\alpha = .05$ to determine if there is convincing
    evidence that the mean annual salary for male
    purchasing managers is greater than the mean
    annual salary for female purchasing managers.

Men: 81 69 81 76 76 74 69 76 79 65
Women: 78 60 67 61 62 73 71 58 68 48

State the hypotheses
  • H0: $\mu_1 - \mu_2 = 0$
  • Ha: $\mu_1 - \mu_2 > 0$

Where $\mu_1$ = mean annual salary for male purchasing
managers and $\mu_2$ = mean annual salary for female
purchasing managers
If we had defined $\mu_1$ as the mean salary for
female purchasing managers and $\mu_2$ as the mean
salary for male purchasing managers, then the
correct alternative hypothesis would be that the
difference in the means is less than 0.
11
  • Salary War Continued . . .
  • H0: $\mu_1 - \mu_2 = 0$
  • Ha: $\mu_1 - \mu_2 > 0$

Verify the assumptions
1) Given two independently selected random samples
of male and female purchasing managers.
Even though these are samples from subscribers of
Purchasing magazine, the authors of the study
believed it was reasonable to view the samples
as representative of the populations of interest.
2) Since the sample sizes are small, we must
determine if it is plausible that the salary
distributions for each of the two populations are
approximately normal. Since the boxplots of the two
samples are reasonably symmetrical with no outliers,
it is plausible that the population distributions are
approximately normal.
12
  • Salary War Continued . . .
  • H0: $\mu_1 - \mu_2 = 0$
  • Ha: $\mu_1 - \mu_2 > 0$

Compute the test statistic and P-value
  • Test Statistic: $t = \frac{(74.6 - 64.6) - 0}{\sqrt{\frac{5.40^2}{10} + \frac{8.62^2}{10}}} \approx 3.11$
    (the sample means are 74.6 and 64.6, and the sample standard deviations are about 5.40 and 8.62)
To find the P-value, first find the appropriate
df. The df formula gives about 15.1.
Truncate (round down) this value: df = 15.
Now find the area to the right of t = 3.11 under the
t-curve with df = 15.
  • P-value $\approx$ .004, $\alpha$ = .05
  • Since the P-value < $\alpha$, we reject H0. There is
    convincing evidence that the mean salary for male
    purchasing managers is higher than the mean
    salary for female purchasing managers.

What potential type of error could we have made with
this conclusion?
Type I
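The salary calculation can be reproduced with a short script. This is a sketch under stated assumptions rather than the presenter's own method: the helper names `welch_t`, `t_pdf`, and `t_upper_tail` are hypothetical, and because the Python standard library has no t distribution, the right-tail area is approximated by Simpson's rule on the t density.

```python
import math
import statistics

men = [81, 69, 81, 76, 76, 74, 69, 76, 79, 65]
women = [78, 60, 67, 61, 62, 73, 71, 58, 68, 48]

def welch_t(sample1, sample2, hypothesized=0.0):
    """Two-sample t statistic and truncated df (unequal variances)."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1   # V1 = s1^2 / n1
    v2 = statistics.variance(sample2) / n2   # V2 = s2^2 / n2
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    t = (diff - hypothesized) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, math.floor(df)                 # df is truncated to an integer

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > t), integrating the density over [0, |t|] with Simpson's rule."""
    a = abs(t)
    h = a / steps                            # steps must be even
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 - area if t >= 0 else 0.5 + area

t_stat, df = welch_t(men, women)
p_value = t_upper_tail(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df}, one-sided P-value = {p_value:.4f}")
```

The alternative hypothesis here is "greater than", so only the right-tail area is used; a two-sided test would double it.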
13
The Two-Sample t Confidence Interval for the
Difference Between Two Population or Treatment
Means
  • The general formula for a confidence interval for
    $\mu_1 - \mu_2$ when
  • The two samples are independently selected random
    samples from the populations of interest
  • The sample sizes are large (generally 30 or
    larger) or the population distributions are (at
    least approximately) normal.
  • is $(\bar{x}_1 - \bar{x}_2) \pm (t \text{ critical value}) \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
  • The t critical value is based on
    $df = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}}$
    where $V_1 = \frac{s_1^2}{n_1}$ and $V_2 = \frac{s_2^2}{n_2}$
  • df should be truncated to an integer.

For a comparison of two treatments, use the
following assumptions: 1) Individuals or objects
are randomly assigned to treatments (or vice
versa). 2) The sample sizes are large (generally
30 or larger) or the treatment response
distributions are approximately normal.
14
In a study on food intake after sleep
deprivation, men were randomly assigned to
one of two treatment groups. The experimental
group were required to sleep only 4 hours on each
of two nights, while the control group were
required to sleep 8 hours on each of two nights.
The amount of food intake (Kcal) on the day
following the two nights of sleep was measured.
Compute a 95% confidence interval for the true
difference in the mean food intake for the two
sleeping conditions.
4-hour sleep: 3585 4470 3068 5338 2221 4791 4435 3099
              3187 3901 3868 3869 4878 3632 4518
8-hour sleep: 4965 3918 1987 4993 5220 3653 3510 3338
              4100 5792 4547 3319 3336 4304 4057
Find the mean and standard deviation for each
treatment.
15
Food Intake Study Continued . . .
Verify the assumptions.
  • Assumptions
1) Men were randomly assigned to two treatment groups.
2) The assumption of normal response
distributions is plausible because both boxplots
are approximately symmetrical with no outliers.
16
Food Intake Study Continued . . .
Calculate the interval.
Interpret the interval in context.
We are 95% confident that the true difference in
the mean food intake for the two sleeping
conditions is between -814.1 Kcal and 523.6 Kcal.
Based upon this interval, is there a significant
difference in the mean food intake for the two
sleeping conditions?
No, since 0 is in the confidence interval, there
is not convincing evidence that the mean food
intake for the two sleep conditions is different.
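The food-intake interval can be checked with a few lines of Python. A minimal sketch, assuming a t critical value of 2.052 read from a t table for 95% confidence with the truncated df of 27. Endpoints may differ from the slide's by a fraction of a Kcal if the slide's values came from software that keeps the fractional df.

```python
import math
import statistics

four_hour = [3585, 4470, 3068, 5338, 2221, 4791, 4435, 3099,
             3187, 3901, 3868, 3869, 4878, 3632, 4518]
eight_hour = [4965, 3918, 1987, 4993, 5220, 3653, 3510, 3338,
              4100, 5792, 4547, 3319, 3336, 4304, 4057]

T_CRIT = 2.052  # assumption: table value for 95% confidence, df = 27

n1, n2 = len(four_hour), len(eight_hour)
v1 = statistics.variance(four_hour) / n1    # V1 = s1^2 / n1
v2 = statistics.variance(eight_hour) / n2   # V2 = s2^2 / n2

# Degrees of freedom from the two-sample formula, truncated to an integer.
df = math.floor((v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1)))

# Point estimate (4-hour minus 8-hour) plus or minus the margin of error.
diff = statistics.mean(four_hour) - statistics.mean(eight_hour)
margin = T_CRIT * math.sqrt(v1 + v2)
lower, upper = diff - margin, diff + margin

print(f"df = {df}, interval = ({lower:.1f}, {upper:.1f}) Kcal")
```

Because the interval runs from a negative to a positive value, it contains 0, which is what the slide's conclusion relies on.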
17
Pooled t Test
  • Used when the variances of the two populations
    are equal ($\sigma_1 = \sigma_2$)
  • Combines information from both samples to create
    a pooled estimate of the common variance which
    is used in place of the two sample standard
    deviations
  • Is not widely used due to its sensitivity to any
    departure from the equal variance assumption

P-values computed using the pooled t procedure
can be far from the actual P-value if the
population variances are not equal.
When the population variances are equal, the
pooled t procedure is better at detecting
deviations from H0 than the two-sample t test.
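The slides do not show the pooled formula, so the sketch below uses the standard textbook pooled-variance expression, $s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$, with df = $n_1 + n_2 - 2$. Reusing the salary data from the earlier example is purely an illustration; nothing in the presentation says the equal-variance assumption holds for those data.

```python
import math
import statistics

# Salary data (thousands of dollars) from the earlier example, reused as an illustration.
men = [81, 69, 81, 76, 76, 74, 69, 76, 79, 65]
women = [78, 60, 67, 61, 62, 73, 71, 58, 68, 48]

n1, n2 = len(men), len(women)
s1_sq = statistics.variance(men)     # sample variance, group 1
s2_sq = statistics.variance(women)   # sample variance, group 2

# Pooled estimate of the (assumed) common variance: a df-weighted average.
pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# Pooled t statistic for H0: mu1 - mu2 = 0.
diff = statistics.mean(men) - statistics.mean(women)
t_pooled = diff / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
df_pooled = n1 + n2 - 2

print(f"pooled variance = {pooled_var:.2f}, t = {t_pooled:.2f}, df = {df_pooled}")
```

With equal sample sizes the pooled and two-sample t statistics coincide, so here the two procedures differ only in their degrees of freedom (18 versus 15).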
18
Suppose that an investigator wants to determine
if regular aerobic exercise improves blood
pressure. A random sample of people who jog
regularly and a second random sample of people
who do not exercise regularly are selected
independently of one another. Can we conclude
that the difference in mean blood pressure is
attributed to jogging? What about other factors
like weight? One way to avoid these
difficulties would be to pair subjects by
weight then assign one of the pair to jogging
and the other to no exercise.
19
Summary of the Paired t test for Comparing Two
Population or Treatment Means
  • Null Hypothesis H0: $\mu_d$ = hypothesized value
  • Test Statistic: $t = \frac{\bar{x}_d - \text{hypothesized value}}{s_d / \sqrt{n}}$
  • Where n is the number of sample differences and
    $\bar{x}_d$ and $s_d$ are the mean and standard deviation of
    the sample differences. This test is based on df
    = n - 1.
  • Alternative Hypothesis: P-value
  • Ha: $\mu_d >$ hypothesized value: Area to the right of
    calculated t
  • Ha: $\mu_d <$ hypothesized value: Area to the left of
    calculated t
  • Ha: $\mu_d \neq$ hypothesized value: 2(area to the right
    of t) if t is positive,
    or 2(area to the left of t) if t is negative

Where $\mu_d$ is the mean of the differences in the
paired observations
The hypothesized value is usually 0, meaning
that there is no difference.
20
Summary of the Paired t test for Comparing Two
Population or Treatment Means Continued . . .
  • Assumptions
  • The samples are paired.
  • The n sample differences can be viewed as a
    random sample from a population of differences.
  • The number of sample differences is large
    (generally at least 30) or the population
    distribution of differences is (at least
    approximately) normal.

21
Is this an example of paired samples?
  • An engineering association wants to see if there
    is a difference in the mean annual salary for
    electrical engineers and chemical engineers. A
    random sample of electrical engineers is surveyed
    about their annual income. Another random sample
    of chemical engineers is surveyed about their
    annual income.

No, there is no pairing of individuals; you have
two independent samples.
22
Is this an example of paired samples?
  • A pharmaceutical company wants to test its new
    weight-loss drug. Before giving the drug to
    volunteers, company researchers weigh each
    person. After a month of using the drug, each
    person's weight is measured again.

Yes, you have two observations on each
individual, resulting in paired data.
23
  • Can playing chess improve your memory? In a
    study, students who had not previously played
    chess participated in a program in which they
    took chess lessons and played chess daily for 9
    months. Each student took a memory test before
    starting the chess program and again at the end
    of the 9-month period.

Student 1 2 3 4 5 6 7 8 9 10 11 12
Pre-test 510 610 640 675 600 550 610 625 450 720 575 675
Post-test 850 790 850 775 700 775 700 850 690 775 540 680
Difference -340 -180 -210 -100 -100 -225 -90 -225 -240 -55 35 -5
First, find the differences: pre-test minus
post-test.
State the hypotheses.
H0: $\mu_d = 0$
Ha: $\mu_d < 0$
Where $\mu_d$ is the mean memory score difference
(pre-test minus post-test) between students with no chess
training and students who have completed chess
training
If we had subtracted post-test minus pre-test,
then the alternative hypothesis would be that the mean
difference is greater than 0.
24
  • Playing Chess Continued . . .

H0: $\mu_d = 0$
Ha: $\mu_d < 0$
Where $\mu_d$ is the mean memory score difference
between students with no chess training and
students who have completed chess training
Verify assumptions
1) Although the sample of students is not a random
sample, the investigator believed that it was
reasonable to view the 12 sample differences as
representative of all such differences.
2) A boxplot of the differences is approximately
symmetrical with no outliers, so the assumption of
normality is plausible.
25
  • Playing Chess Continued . . .

H0: $\mu_d = 0$
Ha: $\mu_d < 0$
Where $\mu_d$ is the mean memory score difference
between students with no chess training and
students who have completed chess training
Compute the test statistic and P-value.
Test Statistic: $t = \frac{-144.58 - 0}{109.74 / \sqrt{12}} \approx -4.56$
(the 12 differences have mean $\bar{x}_d \approx -144.58$ and standard deviation $s_d \approx 109.74$)
P-value $\approx$ 0, df = 11, $\alpha$ = .05
State the conclusion in context.
Since the P-value < $\alpha$, we reject H0. There is
convincing evidence to suggest that the mean memory
score after chess training is higher than the mean
memory score before training.
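The paired t statistic for the chess data can be computed directly from the pre-test and post-test scores. A minimal sketch using only the standard library; the one-sided critical value of 1.796 (df = 11, 5% level) is taken from a t table and is an assumption of this illustration rather than something shown on the slides.

```python
import math
import statistics

pre = [510, 610, 640, 675, 600, 550, 610, 625, 450, 720, 575, 675]
post = [850, 790, 850, 775, 700, 775, 700, 850, 690, 775, 540, 680]

# Paired data: work with the within-student differences (pre minus post).
diffs = [a - b for a, b in zip(pre, post)]

n = len(diffs)
mean_d = statistics.mean(diffs)      # x-bar_d
sd_d = statistics.stdev(diffs)       # s_d
df = n - 1

# Paired t statistic for H0: mu_d = 0.
t_stat = (mean_d - 0) / (sd_d / math.sqrt(n))

T_CRIT_ONE_SIDED = 1.796             # assumption: table value, df = 11, alpha = .05
reject_h0 = t_stat < -T_CRIT_ONE_SIDED   # left-tailed test (Ha: mu_d < 0)

print(f"mean diff = {mean_d:.2f}, sd = {sd_d:.2f}, t = {t_stat:.2f}, df = {df}")
print("reject H0" if reject_h0 else "fail to reject H0")
```

Comparing t with the critical value leads to the same decision as comparing the P-value with the significance level.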
26
Paired t Confidence Interval for md
  • When
  • The samples are paired.
  • The n sample differences can be viewed as a
    random sample from a population of differences.
  • The number of sample differences is large
    (generally at least 30) or the population
    distribution of differences is (at least
    approximately) normal.
  • the paired t interval for $\mu_d$ is $\bar{x}_d \pm (t \text{ critical value}) \frac{s_d}{\sqrt{n}}$
  • Where df = n - 1

27
  • Playing Chess Revisited . . .

Student 1 2 3 4 5 6 7 8 9 10 11 12
Pre-test 510 610 640 675 600 550 610 625 450 720 575 675
Post-test 850 790 850 775 700 775 700 850 690 775 540 680
Difference -340 -180 -210 -100 -100 -225 -90 -225 -240 -55 35 -5
Compute a 90% confidence interval for the mean
difference in memory scores before chess training
and the memory scores after chess training.
$-144.58 \pm 1.796 \cdot \frac{109.74}{\sqrt{12}}$, which gives (-201.5, -87.69)
We are 90% confident that the true mean
difference in memory scores before chess training
and the memory scores after chess training is
between -201.5 and -87.69.
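The 90% paired interval can be reproduced from the twelve differences. A short sketch, assuming the t critical value 1.796 (df = 11, 90% confidence) from a t table; variable names are illustrative.

```python
import math
import statistics

# Pre-test minus post-test differences for the 12 students.
diffs = [-340, -180, -210, -100, -100, -225, -90, -225, -240, -55, 35, -5]

T_CRIT = 1.796  # assumption: table value for 90% confidence, df = 11

n = len(diffs)
mean_d = statistics.mean(diffs)
std_err = statistics.stdev(diffs) / math.sqrt(n)   # s_d / sqrt(n)

# Paired t interval: mean difference plus or minus the margin of error.
lower = mean_d - T_CRIT * std_err
upper = mean_d + T_CRIT * std_err

print(f"90% CI for mu_d: ({lower:.1f}, {upper:.2f})")
```

Both endpoints are negative, which agrees with the earlier test's conclusion that post-test scores tend to be higher than pre-test scores.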
28
Large-Sample Inferences Concerning the Difference
Between Two Population or Treatment Proportions
29
Some people seem to think that duct tape can fix
anything . . . even remove warts!
  • Investigators at Madigan Army Medical Center
    tested using duct tape to remove warts versus the
    more traditional freezing treatment.
  • Suppose that the duct tape treatment will
    successfully remove 50% of warts and that the
    traditional freezing treatment will successfully
    remove 60% of warts.

30
$p_{freeze}$ = the true proportion of warts that are
successfully removed by freezing; $p_{freeze} = .6$
$p_{tape}$ = the true proportion of warts that are
successfully removed by using duct tape; $p_{tape} = .5$
Randomly take one of the sample proportions for
the freezing treatment and one of the sample
proportions for the duct tape treatment and find
the difference.
31
When performing a hypothesis test, we will use
the null hypothesis that p1 and p2 are equal. We
will not know the common value for p1 and p2.
2.
32
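Summary of Large-Sample z Test for $p_1 - p_2 = 0$
  • Null Hypothesis H0: $p_1 - p_2 = 0$
  • Test Statistic: $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{\hat{p}_c(1 - \hat{p}_c)}{n_1} + \frac{\hat{p}_c(1 - \hat{p}_c)}{n_2}}}$
    where $\hat{p}_c = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$ is the combined estimate of the common proportion
  • Alternative Hypothesis: P-value
  • Ha: $p_1 - p_2 > 0$: area to the right of
    calculated z
  • Ha: $p_1 - p_2 < 0$: area to the left of
    calculated z
  • Ha: $p_1 - p_2 \neq 0$: 2(area to the right of z) if
    z is positive, or
    2(area to the left of z) if z is negative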

33
Another Way to Write Hypothesis statements
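  • H0: $p_1 - p_2 = 0$ can also be written H0: $p_1 = p_2$
  • Ha: $p_1 - p_2 > 0$ can also be written Ha: $p_1 > p_2$
  • Ha: $p_1 - p_2 < 0$ can also be written Ha: $p_1 < p_2$
  • Ha: $p_1 - p_2 \neq 0$ can also be written Ha: $p_1 \neq p_2$

Be sure to define both $p_1$ and $p_2$!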
34
Summary of Large-Sample z Test for $p_1 - p_2 = 0$
Continued . . .
  • Assumption
  • The samples are independently chosen random
    samples or treatments were assigned at random to
    individuals or objects

35
Investigators at Madigan Army Medical Center
tested using duct tape to remove warts. Patients
with warts were randomly assigned to either the
duct tape treatment or to the more traditional
freezing treatment. Those in the duct tape group
wore duct tape over the wart for 6 days, then
removed the tape, soaked the area in water, and
used an emery board to scrape the area. This
process was repeated for a maximum of 2 months or
until the wart was gone. The data
follow. Do these data suggest that
freezing is less successful than duct tape in
removing warts?
Treatment                  n     Number with wart successfully removed
Liquid nitrogen freezing   100   60
Duct tape                  104   88
36
Duct Tape Continued . . .
H0: $p_1 - p_2 = 0$
Ha: $p_1 - p_2 < 0$
Where $p_1$ is the true proportion of warts that
would be successfully removed by freezing and $p_2$
is the true proportion of warts that would be
successfully removed by duct tape
Assumptions 1) Subjects were
randomly assigned to the two treatments.
37
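Duct Tape Continued . . .
H0: $p_1 - p_2 = 0$
Ha: $p_1 - p_2 < 0$
Test Statistic: with $\hat{p}_1 = 60/100 = .60$, $\hat{p}_2 = 88/104 \approx .8462$, and combined estimate $\hat{p}_c = 148/204 \approx .7255$,
$z = \frac{.60 - .8462}{\sqrt{.7255(.2745)\left(\frac{1}{100} + \frac{1}{104}\right)}} \approx -3.94$
P-value $\approx$ 0, $\alpha$ = .01
Since the P-value < $\alpha$, we reject H0. There is
convincing evidence to suggest the proportion of
warts successfully removed is lower for freezing
than for the duct tape treatment.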
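The duct tape test can be worked through numerically. A minimal sketch using the standard-library `NormalDist`; it applies the usual large-sample z statistic with a combined (pooled) estimate of the common proportion, and the variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Counts from the study: group 1 = liquid nitrogen freezing, group 2 = duct tape.
n1, x1 = 100, 60
n2, x2 = 104, 88

p1_hat = x1 / n1
p2_hat = x2 / n2

# Under H0 the two proportions are equal, so estimate the common value
# by combining both samples.
p_c = (x1 + x2) / (n1 + n2)

# Large-sample z statistic for H0: p1 - p2 = 0.
z = (p1_hat - p2_hat) / sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))

# Ha: p1 - p2 < 0, so the P-value is the area to the LEFT of z.
p_value = NormalDist().cdf(z)

print(f"p1_hat = {p1_hat:.3f}, p2_hat = {p2_hat:.3f}, z = {z:.2f}, P-value = {p_value:.5f}")
```

The P-value is far below .01, which is why the slide reports it as approximately 0.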
38
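A Large-Sample Confidence Interval for $p_1 - p_2$
  • When
  • The samples are independently chosen random
    samples or treatments were assigned at random to
    individuals or objects
  • a large-sample confidence interval for $p_1 - p_2$ is
    $(\hat{p}_1 - \hat{p}_2) \pm (z \text{ critical value}) \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$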

39
  • The article "Freedom of What?" (Associated Press,
    February 1, 2005) described a study in which high
    school students and high school teachers were
    asked whether they agreed with the following
    statement: "Students should be allowed to report
    controversial issues in their student newspapers
    without the approval of school authorities." It
    was reported that 58% of students surveyed and
    39% of teachers surveyed agreed with the
    statement. The two samples (10,000 high school
    students and 8000 high school teachers) were
    selected from schools across the country.
  • Compute a 90% confidence interval for the
    difference between the proportion of students who
    agreed with the statement and the
    proportion of teachers who agreed
    with the statement.

40
  • Newspaper Problem Continued . . .

1) Assume that it is reasonable to regard these
two samples as being independently selected and
representative of the populations of interest.
We are 90% confident that the difference between the
proportion of students who agreed with the
statement and the proportion of teachers who
agreed with the statement is between .178 and
.202.
Based on this confidence interval, does there
appear to be a significant difference between the
proportion of students who agreed with the
statement and the proportion of teachers who
agreed with the statement? Explain.
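The newspaper interval can be verified with a short calculation. A minimal sketch using the standard-library `NormalDist` to obtain the z critical value for 90% confidence; unlike the hypothesis test, the interval uses the two sample proportions separately rather than a combined estimate.

```python
from math import sqrt
from statistics import NormalDist

p1_hat, n1 = 0.58, 10_000   # students who agreed, sample size
p2_hat, n2 = 0.39, 8_000    # teachers who agreed, sample size

# z critical value leaving 5% in each tail (90% confidence), about 1.645.
z_crit = NormalDist().inv_cdf(0.95)

# Standard error of the difference in sample proportions (unpooled).
std_err = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

diff = p1_hat - p2_hat
lower = diff - z_crit * std_err
upper = diff + z_crit * std_err

print(f"90% CI for p1 - p2: ({lower:.3f}, {upper:.3f})")
```

The whole interval lies above 0, so the data point to a higher rate of agreement among students than among teachers.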