Title: Testing statistical hypotheses about $\mu$ when $\sigma$ is not known: the one-sample t-test
1. Testing statistical hypotheses about $\mu$ when $\sigma$ is not known: the one-sample t-test
- Minium, Clarke, & Coladarci, Chapter 13
2. Testing statistical hypotheses about $\mu$ when $\sigma$ is not known
- We have made use of the z distribution (the standard normal distribution) to do hypothesis testing and to compute confidence intervals, for example:
- For hypothesis testing we can calculate $z = (\bar{X} - \mu) / \sigma_{\bar{X}}$
- and determine if z exceeds our z-score cutoff (e.g., 1.64)
- For confidence intervals we can calculate
- Confidence interval = Mean $\pm$ 1.96($\sigma_{\bar{X}}$)
- These two calculations require knowing $\sigma$ so that we can calculate $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
- What if we didn't know $\sigma$ and had to estimate it from our sample?
- What should we use to estimate $\sigma$?
- Would we be able to substitute this estimate (estimated $\sigma_{\bar{X}}$) into our formulas and have everything work out the same way?
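For comparison with what follows, the two known-$\sigma$ calculations can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the population mean, $\sigma$, sample size, and sample mean are hypothetical numbers chosen to keep the arithmetic simple.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical numbers, chosen only for illustration.
mu = 100.0           # population mean under H0
sigma = 15.0         # population standard deviation, assumed KNOWN here
n = 25               # sample size
sample_mean = 106.0  # observed sample mean

# Standard error of the mean: sigma / sqrt(n)
sigma_xbar = sigma / sqrt(n)

# z statistic for the observed sample mean
z = (sample_mean - mu) / sigma_xbar

# One-tailed cutoff for alpha = .05 (about 1.64)
z_cutoff = NormalDist().inv_cdf(0.95)
reject_h0 = z > z_cutoff

# 95% confidence interval: mean +/- 1.96 * standard error
z_95 = NormalDist().inv_cdf(0.975)  # about 1.96
ci = (sample_mean - z_95 * sigma_xbar, sample_mean + z_95 * sigma_xbar)

print(f"z = {z:.2f}, cutoff = {z_cutoff:.3f}, reject H0: {reject_h0}")
print(f"95% CI: {ci[0]:.2f} to {ci[1]:.2f}")
```

Every line here depends on `sigma`, which is exactly the quantity we will not have in practice.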
3. Sampling Distributions -- biased and unbiased estimates
- We might think that $\sigma$ can be estimated by $S = \sqrt{\sum (X - \bar{X})^2 / n}$
- but it turns out that this is not quite the case.
- $S$ is said to be a biased estimator of $\sigma$
- We know that if we repeatedly choose samples of size n and compute the mean ($\bar{X}$) for each sample, we will create the sampling distribution of means.
- The mean of the sampling distribution of means ($\mu_{\bar{X}}$) will equal that of the original distribution ($\mu$).
- In this sense the sample mean ($\bar{X}$) is an unbiased estimate of $\mu$ because on average $\bar{X} = \mu$.
4. Sampling Distributions -- biased and unbiased estimates
- If we repeatedly choose samples of size n and compute the variance ($S^2$, with n in the denominator) for each sample,
- we will create a distribution of variances.
- The mean of the distribution of variances will not equal the variance of the original distribution ($\sigma^2$).
- In this sense the sample variance ($S^2$) is a biased estimate of $\sigma^2$ because on average $S^2 \neq \sigma^2$ (it tends to underestimate $\sigma^2$).
5. Sampling Distributions -- biased and unbiased estimates
- But, if we repeatedly choose samples of size n and compute the variance as $s^2 = \sum (X - \bar{X})^2 / (n - 1)$
- we will create a distribution of variances, and the mean of the distribution of $s^2$ (computed with n-1 in the denominator) will equal the variance of the original distribution ($\sigma^2$).
- We refer to n-1 as the number of degrees of freedom (df), i.e., df = n-1
- In this sense the sample variance computed with n-1 in the denominator ($s^2$) is an unbiased estimate of $\sigma^2$ because on average $s^2 = \sigma^2$.
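The claims about biased and unbiased estimates can be checked with a small simulation. This is a minimal sketch, assuming a hypothetical normal population with mean 50 and standard deviation 10 (variance 100) and samples of size 5; those values, the seed, and the variable names are illustrative choices, not part of the original material.

```python
import random
from statistics import fmean, pvariance, variance

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 50.0, 10.0  # hypothetical population; sigma^2 = 100
N = 5                   # small samples make the bias easy to see
REPS = 20000            # number of repeated samples

means, big_s2, small_s2 = [], [], []
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    means.append(fmean(sample))
    big_s2.append(pvariance(sample))   # S^2: n in the denominator
    small_s2.append(variance(sample))  # s^2: n - 1 in the denominator

avg_mean = fmean(means)
avg_big_s2 = fmean(big_s2)
avg_small_s2 = fmean(small_s2)

print(f"average sample mean: {avg_mean:.2f} (mu = {MU})")
print(f"average S^2 (n):     {avg_big_s2:.1f} (sigma^2 = {SIGMA ** 2})")
print(f"average s^2 (n - 1): {avg_small_s2:.1f} (sigma^2 = {SIGMA ** 2})")
```

With n = 5, the average of $S^2$ should land near $(n-1)/n \times 100 = 80$, while the average of $s^2$ should land near 100.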
6. Sampling Distributions -- biased and unbiased estimates
- Therefore, going back to our first question,
- What should we use to estimate $\sigma$?
- The answer is we should first compute an unbiased estimate of $\sigma^2$ as $s^2 = \sum (X - \bar{X})^2 / (n - 1)$
- And then we should compute an estimate of $\sigma_{\bar{X}}$ as $s_{\bar{X}} = s / \sqrt{n}$
- where $s = \sqrt{s^2}$
- NOTE: Be able to
- explain the difference between $s^2$ and $S^2$
- explain the difference between $s$ and $s_{\bar{X}}$
- explain the difference between $s_{\bar{X}}$ and $\sigma_{\bar{X}}$
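The quantities in this list can be computed side by side. The sketch below uses a hypothetical sample of eight ages (in months) at first steps; the data are made up so that the arithmetic comes out in round numbers.

```python
from math import sqrt

# Hypothetical sample of ages (in months) at first steps.
sample = [11, 14, 9, 12, 15, 10, 13, 12]
n = len(sample)

x_bar = sum(sample) / n
ss = sum((x - x_bar) ** 2 for x in sample)  # sum of squared deviations

S2 = ss / n           # biased variance (n in the denominator)
s2 = ss / (n - 1)     # unbiased variance estimate (df = n - 1)
s = sqrt(s2)          # estimate of sigma
s_xbar = s / sqrt(n)  # estimated standard error of the mean

print(f"mean = {x_bar}, S^2 = {S2}, s^2 = {s2}, s = {s}, s_xbar = {s_xbar:.4f}")
```

Here $s$ describes the spread of individual scores, while $s_{\bar{X}}$ describes the spread of sample means, so it is smaller by a factor of $\sqrt{n}$.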
7. The t-distribution
- What we know already
- The distribution of sample means has the following parameters: $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma / \sqrt{n}$
- Thus any particular mean can be converted to a z score in the following way: $z = (\bar{X} - \mu) / \sigma_{\bar{X}}$
- producing the standard normal distribution or, in other words, a sampling distribution of z-scores.
9. The t-distribution
- Gosset raised the following question:
- How would things change if we divided by $s_{\bar{X}}$ rather than $\sigma_{\bar{X}}$ itself, that is, if we computed $t = (\bar{X} - \mu) / s_{\bar{X}}$?
- Would the resulting sampling distribution be normal?
- The answer is no: in general, t-scores are not normally distributed.
- But the distribution of t-scores is symmetric
- and has a mean of zero
- The exact form of the t-distribution depends on n
- For large n the t-distribution approaches the standard normal distribution (the z-distribution), but for small n the t-distribution becomes leptokurtic (heavier tails than the normal distribution).
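The heavier tails can be seen by simulation. The sketch below draws repeated samples from a hypothetical standard normal population and counts how often the z-score (using the true $\sigma$) and the t-score (using the estimate $s$) fall beyond $\pm 1.96$; the population, seed, and sample sizes are illustrative assumptions.

```python
import random
from math import sqrt

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA = 0.0, 1.0  # hypothetical normal population
REPS = 10000          # repeated samples per sample size

def tail_fractions(n):
    """Return the fractions of |z| and |t| scores beyond 1.96 for samples of size n."""
    z_out = t_out = 0
    for _ in range(REPS):
        sample = [random.gauss(MU, SIGMA) for _ in range(n)]
        x_bar = sum(sample) / n
        s = sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
        z = (x_bar - MU) / (SIGMA / sqrt(n))  # divides by the true standard error
        t = (x_bar - MU) / (s / sqrt(n))      # divides by the estimated standard error
        z_out += abs(z) > 1.96
        t_out += abs(t) > 1.96
    return z_out / REPS, t_out / REPS

results = {n: tail_fractions(n) for n in (5, 30, 100)}
for n, (z_frac, t_frac) in results.items():
    print(f"n = {n:3d}: |z| > 1.96 in {z_frac:.3f} of samples, |t| > 1.96 in {t_frac:.3f}")
```

The z fraction should stay near .05 at every n, while the t fraction should be clearly larger for n = 5 and drift toward .05 as n grows.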
10. The t-distribution
The good news is that the t-distribution has a definite form; we know this because of smart people like Gosset.
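For reference, the "definite form" is usually written as the density below. This is the standard textbook expression for the t-distribution with $\nu = n - 1$ degrees of freedom, offered as a sketch of what is meant rather than as the author's own notation.

```latex
f(t) = \frac{\Gamma\!\left(\frac{\nu + 1}{2}\right)}
            {\sqrt{\nu \pi}\;\Gamma\!\left(\frac{\nu}{2}\right)}
       \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu + 1}{2}},
\qquad \nu = n - 1
```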
(You don't have to remember this)
12. The t-distribution
We use the t-distribution in the same way we use the z-distribution: we find $t_\alpha$ in the same way we find $z_\alpha$.
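In practice $t_\alpha$ is read from a t-table, but it can also be computed. The sketch below is one way to do it with only the standard library: it integrates the t density numerically (Simpson's rule) and searches for the cutoff by bisection. The function names `t_pdf`, `upper_tail`, and `t_crit` are hypothetical helpers written for this illustration.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    const = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return const * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0, via Simpson's rule on the interval [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t  # by symmetry, half the area lies above 0

def t_crit(alpha, df):
    """Positive t value that cuts off an upper tail of area alpha.

    Intended for table-like alphas (e.g. .05, .025, .01), not extreme tails.
    """
    lo, hi = 0.0, 1.0
    while upper_tail(hi, df) > alpha:  # widen until the tail is small enough
        hi *= 2
    for _ in range(60):                # bisection on the tail area
        mid = (lo + hi) / 2
        if upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (4, 15, 30, 120):
    print(f"df = {df:3d}: one-tailed .05 cutoff = {t_crit(0.05, df):.3f}")
```

As df grows, the cutoffs shrink toward the z cutoff of about 1.645, which matches the earlier point that the t-distribution approaches the standard normal for large n.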
14. The one-sample t-test
- Consider our baby/supercharged vitamin example again
- Let's say you know that on average the first steps are taken at 14 months ($\mu = 14$)
- but you don't know $\sigma$
15. The one-sample t-test
- Assume you've chosen a sample of 16 babies and given them vitamins that you expect will speed their development and hence lead them to walk at an earlier age.
- $H_0$: $\mu_V = \mu_{NV}$
- $H_1$: $\mu_V < \mu_{NV}$
- Set $\alpha = .05$ and perform a one-tailed t-test.
- The comparison distribution is the t-distribution
- with df = (n - 1) = 15, and $t_\alpha = -1.753$
- Find the mean ($\bar{X}$) and standard deviation ($s$) of the sample
16. The one-sample t-test
- The sample mean is 12 and the standard deviation is $s = 3$
- Calculate $s_{\bar{X}} = s / \sqrt{16} = 3/4 = .75$
- Calculate $t = (12 - 14) / .75 = -2.67$
- $t = -2.67$, which is less than $t_\alpha = -1.753$
- Therefore, reject $H_0$.
- Conclude that the vitamins had a statistically significant effect.
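The steps of this test can be sketched directly from the summary statistics. The numbers are the ones given above; the critical value is typed in from the t-table rather than computed, and the variable names are illustrative.

```python
from math import sqrt

mu_0 = 14.0       # known average age (months) at first steps
n = 16            # number of babies in the sample
x_bar = 12.0      # sample mean
s = 3.0           # sample standard deviation (n - 1 in the denominator)
t_alpha = -1.753  # one-tailed cutoff from the t-table, df = 15, alpha = .05

df = n - 1
s_xbar = s / sqrt(n)         # estimated standard error of the mean
t = (x_bar - mu_0) / s_xbar  # observed t statistic
reject_h0 = t < t_alpha      # lower-tail test: reject if t falls below the cutoff

print(f"df = {df}, s_xbar = {s_xbar:.2f}, t = {t:.2f}, reject H0: {reject_h0}")
# prints: df = 15, s_xbar = 0.75, t = -2.67, reject H0: True
```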
18. Confidence intervals when $\sigma$ is estimated by $s$
- Confidence intervals about a mean
- If the sample mean is 12, the standard deviation is $s = 3$, and $n = 16$, then
- Calculate $s_{\bar{X}} = s / \sqrt{16} = 3/4 = .75$
- There are 16 - 1 = 15 df
- $t_\alpha = 2.131$ (two-tailed, for a 95% CI)
- CI = $12 \pm 2.131(.75)$ = 10.402 to 13.598
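The interval can be sketched the same way. The critical value used here, 2.131, is the tabled two-tailed .05 value for df = 15; it is typed in as a constant, and the variable names are illustrative.

```python
from math import sqrt

x_bar = 12.0       # sample mean
s = 3.0            # sample standard deviation
n = 16             # sample size
t_crit_95 = 2.131  # two-tailed .05 cutoff from the t-table, df = 15

s_xbar = s / sqrt(n)          # estimated standard error: 0.75
margin = t_crit_95 * s_xbar   # half-width of the interval
lower, upper = x_bar - margin, x_bar + margin

print(f"95% CI: {lower:.3f} to {upper:.3f}")
# prints: 95% CI: 10.402 to 13.598
```

Because 14 months lies outside this interval, the interval agrees with the decision to reject $H_0$, although the test above was one-tailed and the interval is two-sided.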
19. Additional Points
- Assumption of population normality
- Sample t-ratios follow the t-distribution exactly only if the samples have been randomly selected from a population of observations that itself has the normal shape
- Levels of significance versus p values
- We can use the t-table to set our cutoff values, but it does not provide the exact p-value for all possible t-values for all dfs.
- However, computer programs can do this.
- e.g., the TDIST spreadsheet function, which is analogous to the NORMDIST function
- TDIST(x, degrees_freedom, tails)
- x is the numeric value at which to evaluate the distribution.
- degrees_freedom is an integer indicating the number of degrees of freedom.
- tails specifies the number of distribution tails to return. If tails = 1, TDIST returns the one-tailed distribution. If tails = 2, TDIST returns the two-tailed distribution.
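To make the idea of an exact p-value concrete, here is a minimal sketch of a function with the same argument order as TDIST. It is a hypothetical stand-in written for this illustration (named `tdist_sketch`), not the spreadsheet's implementation; it integrates the t density numerically and, as a design choice of this sketch, accepts only non-negative x.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    const = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return const * (1 + x * x / df) ** (-(df + 1) / 2)

def tdist_sketch(x, degrees_freedom, tails, steps=2000):
    """Tail probability beyond x (x >= 0); tails is 1 or 2."""
    if x < 0 or tails not in (1, 2):
        raise ValueError("x must be >= 0 and tails must be 1 or 2")
    # Simpson's rule for the area under the density between 0 and x.
    h = x / steps
    total = t_pdf(0.0, degrees_freedom) + t_pdf(x, degrees_freedom)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, degrees_freedom)
    one_tail = 0.5 - total * h / 3  # area above x, by symmetry about 0
    return tails * one_tail

# The vitamin example: observed t = -2.67 with df = 15, so pass |t| = 2.67.
p_one = tdist_sketch(2.67, 15, 1)
p_two = tdist_sketch(2.67, 15, 2)
print(f"one-tailed p = {p_one:.4f}, two-tailed p = {p_two:.4f}")
```

The one-tailed p-value falls between .005 and .01, consistent with the table: 2.67 lies between the df = 15 cutoffs for those two levels (2.947 and 2.602).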