Title: Simple Random Sampling
1. Simple Random Sampling: basic ideas
  1. "Random" refers to the method of selecting a sample rather than to the particular sample selected. It refers to the process rather than the outcome of the process.
  2. The random selection process determines the selection probability.
  3. The inverse of the selection probability plays the key role in linking sample data to the population quantities.
2. Two types of random sampling
  1. Sampling with replacement: duplicated selection allowed
  2. Sampling without replacement: duplicated selection not allowed
  - In practice we use without-replacement sampling. What are the reasons?
3. Two definitions of simple random sampling
  1. Each element has an equal chance of being selected.
  2. Each possible sample has an equal chance of being selected.
4. The first definition appears to be valid with with-replacement sampling. Is it valid with without-replacement sampling?
5. Consider a case of taking a without-replacement sample of n = 2 from a population of N = 3 (a, b, c). The probability of selecting any particular element in the first draw is 1/3. What is the probability of selecting any particular element in the second draw? You might think that it is 1/2 because one element is already selected. But 1/2 is a conditional probability, since the second draw of any particular element is possible only if that element was not selected in the first draw. The probability of not selecting a particular element in the first draw is 1 - (1/3) = 2/3. Multiplying 1/2 by 2/3 gives 1/3, showing that the probability of selecting any particular element is the same at each draw.
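The argument above can be checked by listing every ordered outcome of the two draws. The following is a minimal sketch, assuming the 6 ordered outcomes are equally likely (each has probability 1/3 x 1/2); the helper name `draw_probability` is made up for this illustration.

```python
from fractions import Fraction
from itertools import permutations

population = ["a", "b", "c"]

# All ordered outcomes of drawing 2 of the 3 elements without replacement.
# Each outcome has probability (1/3) * (1/2) = 1/6, so they are equally likely.
ordered_draws = list(permutations(population, 2))

def draw_probability(element, position):
    """Probability that `element` is the one picked at draw `position` (0-based)."""
    hits = sum(1 for draw in ordered_draws if draw[position] == element)
    return Fraction(hits, len(ordered_draws))

first_draw = {e: draw_probability(e, 0) for e in population}
second_draw = {e: draw_probability(e, 1) for e in population}

print("first draw: ", first_draw)
print("second draw:", second_draw)
```

Both dictionaries hold 1/3 for every element, which is the unconditional probability discussed above.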
6. Is the selection probability the same with both sampling procedures?
7. Consider the case of selecting 2 of 3. With without-replacement sampling, there are 3 possible samples: (a, b), (a, c), and (b, c). Note that each element is included 2 times in the 3 possible samples, so the probability that any element is included in the sample is 2/3. In general, it is n/N, the sampling ratio. With with-replacement sampling, there are 9 possible samples: (a, a), (a, b), (b, a), (b, b), (a, c), (c, a), (b, c), (c, b), and (c, c). Note that each element appears 6 times across the 9 possible samples when (a, a) is counted as two appearances of a, so the expected number of times any element is selected is 6/9 = 2/3. Again, this equals n/N, the sampling ratio.
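The two tallies can be reproduced by enumeration. This is a sketch only; `appearances` is a hypothetical helper that counts every occurrence of an element, so (a, a) contributes two appearances of a, which is how the count of 6 out of 9 arises.

```python
from fractions import Fraction
from itertools import combinations, product

population = ["a", "b", "c"]
n = 2

without_repl = list(combinations(population, n))   # unordered, no repeats: 3 samples
with_repl = list(product(population, repeat=n))    # ordered, repeats allowed: 9 samples

def appearances(samples, element):
    """Total number of times `element` occurs across all samples, counting duplicates."""
    return sum(sample.count(element) for sample in samples)

for e in population:
    ratio_without = Fraction(appearances(without_repl, e), len(without_repl))
    ratio_with = Fraction(appearances(with_repl, e), len(with_repl))
    print(e, ratio_without, ratio_with)
```

Every element gives 2/3 under both procedures, matching n/N = 2/3.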
8. Without-replacement sampling is approximately equivalent to with-replacement sampling when we take a sample from a large population. Why?
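One way to approach the question is to compute how often a with-replacement sample contains no duplicate at all, since such a sample is also a valid without-replacement sample. The sketch below assumes equal selection probabilities at every draw; the function name and the population sizes tried are illustrative choices, not part of the slides.

```python
def prob_no_duplicates(N, n):
    """Chance that n with-replacement draws from N elements are all distinct."""
    prob = 1.0
    for k in range(n):
        # The (k+1)-th draw must avoid the k elements already drawn.
        prob *= (N - k) / N
    return prob

for N in (6, 100, 10_000, 1_000_000):
    print(N, round(prob_no_duplicates(N, 3), 6))
```

For N = 6 and n = 3 the value is 120/216, and it approaches 1 as N grows with n fixed, so duplicates become too rare to matter.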
9. Total number of possible samples
  - Without-replacement sample: combination, T = N! / (n! (N - n)!)
  - With-replacement sample: permutation, T = N^n
10. Example: consider the illustrative example on page 55.
  N = 6 (number not immunized: 4, 5, 3, 3, 7, and 8)
  Mean = 5
  Population total = 30
  Variance = 22/6 = 3.667
  n = 3, sampling rate (selection probability) = 3/6 = 1/2
  Inverse of 1/2 = 2, so the population total (x') can be estimated by doubling the sample total.
  With the without-replacement sample, T = 20.
  With the with-replacement sample, T = 216.
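The figures in this example can be recomputed from the six values. A short sketch using only the Python standard library follows; the variable names are chosen for this illustration.

```python
from math import comb
from statistics import mean, pvariance

values = [4, 5, 3, 3, 7, 8]   # number not immunized for each of the N = 6 elements
N, n = len(values), 3

pop_mean = mean(values)            # 5
pop_total = sum(values)            # 30
pop_variance = pvariance(values)   # divisor N: 22/6, about 3.667

expansion = N / n                  # inverse of the selection probability 3/6
samples_without = comb(N, n)       # number of unordered samples without replacement
samples_with = N ** n              # number of ordered samples with replacement

print(pop_mean, pop_total, round(pop_variance, 3))
print(expansion, samples_without, samples_with)
```

Note that `pvariance` divides by N, which matches the 22/6 shown above; `statistics.variance` would divide by N - 1 instead.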
11. Sampling distribution: with-replacement sample
  Of the 216 possible samples, 120 contain no duplications, 90 have one element selected twice (two duplications), and 6 have one element selected three times (three duplications).
  - Samples with no duplications: each of the 20 possible samples listed in Table 3.1 will appear 6 times because there are 6 ways of ordering 3 elements in the sample (20 x 6 = 120).
  - Samples with three duplications: each element of the population can be selected three times. The estimated population totals from the 6 such samples are x' = 24, 30, 18, 18, 42, 48.
12. Samples with two duplications: each of the following 6 patterns can occur 15 times (6 x 15 = 90). The numbers 1 to 6 are element labels, in the order the values 4, 5, 3, 3, 7, 8 are listed.
  1. 1, 1, _ : there are 5 ways to fill the blank spot; the resulting estimates of the population total are x' = 26, 22, 22, 30, 32, and each of these can occur 3 times, since the blank spot can appear in three different positions (5 x 3 = 15)
  2. 2, 2, _ : x' = 28, 26, 26, 34, 36, each 3 times
  3. 3, 3, _ : x' = 20, 22, 18, 26, 28, each 3 times
  4. 4, 4, _ : x' = 20, 22, 18, 26, 28, each 3 times
  5. 5, 5, _ : x' = 36, 38, 34, 34, 44, each 3 times
  6. 6, 6, _ : x' = 40, 42, 38, 38, 46, each 3 times
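The split of the 216 ordered samples into 120, 90, and 6 can be confirmed by brute force. In this sketch the elements are labelled 1 to 6, and `pattern` is a hypothetical helper that returns how many times the most frequent element occurs in a sample.

```python
from collections import Counter
from itertools import product

N, n = 6, 3
elements = range(1, N + 1)   # element labels 1..6

def pattern(sample):
    """Largest number of times any single element occurs in the sample."""
    return max(Counter(sample).values())

# Classify every ordered with-replacement sample by its duplication pattern.
counts = Counter(pattern(s) for s in product(elements, repeat=n))

print("no duplications:   ", counts[1])
print("two duplications:  ", counts[2])
print("three duplications:", counts[3])
```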
13. Sampling distribution: the distribution of the 216 estimates of the population total (x') is

  Estimate (x')   Frequency (f)   Probability (f/T)
  18                8             .037037
  20               12             .055556
  22               18             .083333
  24               13             .060185
  26               21             .097222
  28               27             .125000
  30               28             .129630
  32               21             .097222
  34               15             .069444
  36               18             .083333
  38               15             .069444
  40                9             .041667
  42                4             .018519
  44                3             .013889
  46                3             .013889
  48                1             .004630
  Total           216            1.000000
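The frequency table can be regenerated by enumerating all 216 ordered samples and doubling each sample total. This is a sketch under the setup of the example (N = 6, n = 3, expansion factor N/n = 2); the variable names are illustrative.

```python
from collections import Counter
from itertools import product

values = [4, 5, 3, 3, 7, 8]
N, n = len(values), 3
expansion = N // n   # exact here because 6 / 3 = 2

# Estimate of the population total for every ordered with-replacement sample.
freq = Counter(expansion * sum(s) for s in product(values, repeat=n))
T = sum(freq.values())

for x in sorted(freq):
    print(f"{x:>3} {freq[x]:>4} {freq[x] / T:.6f}")
print("Total", T)
```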
14. Mean (expected value) = 30
  Variance = 44
  - The variance can also be obtained by the formula at the bottom of page 56, ignoring the finite population correction factor.
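Both numbers can be checked against the enumerated distribution. The formula on page 56 is not reproduced in these notes, so the sketch below assumes it has the usual form for an estimated total, N^2 x (population variance) / n, once the finite population correction is dropped.

```python
from itertools import product
from statistics import mean, pvariance

values = [4, 5, 3, 3, 7, 8]
N, n = len(values), 3

# Estimated population total for each of the 216 ordered with-replacement samples.
estimates = [N * sum(s) / n for s in product(values, repeat=n)]

dist_mean = mean(estimates)
dist_var = pvariance(estimates)                # variance over all 216 estimates
formula_var = N**2 * pvariance(values) / n     # assumed formula, no fpc

print(dist_mean, dist_var, round(formula_var, 6))
```

The enumerated variance and the assumed formula agree at 44, and the mean equals the population total of 30.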
15. Sampling distribution: without-replacement sample
  The total number of possible samples is 20. The sampling distribution is shown in Table 3.2 on page 56. The mean (expected value) is 30 and the variance is 26.4. Note that without-replacement sampling gives a smaller sampling variance than with-replacement sampling.
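The same check works for the 20 without-replacement samples. Table 3.2 is not reproduced here, so the sketch rebuilds the estimates directly; it also assumes the finite population correction takes the common form (N - n)/(N - 1) when the population variance uses divisor N.

```python
from itertools import combinations
from statistics import mean, pvariance

values = [4, 5, 3, 3, 7, 8]
N, n = len(values), 3

# combinations() selects by position, so the two elements with value 3 stay distinct.
estimates = [N * sum(s) / n for s in combinations(values, n)]

dist_mean = mean(estimates)
dist_var = pvariance(estimates)

fpc = (N - n) / (N - 1)                           # assumed form of the correction
formula_var = N**2 * pvariance(values) / n * fpc  # with-replacement variance times fpc

print(len(estimates), dist_mean, round(dist_var, 6), round(formula_var, 6))
```

The correction factor 3/5 is what shrinks the with-replacement variance of 44 to 26.4.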
16. Estimation of variance from sample variance
  The variance formula in Box 3.2 is expressed in terms of the population variance. In order to estimate the sampling variance from the sample, the formula is modified to include the sample variance, as shown in Box 3.1.
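Neither box is reproduced in these notes. As a hedged illustration, the sketch below assumes the sample-based estimate of the variance of the estimated total has the standard form N^2 x (1 - n/N) x s^2 / n, where s^2 is the sample variance with divisor n - 1. The function name is made up for the example.

```python
from itertools import combinations
from statistics import mean, variance

values = [4, 5, 3, 3, 7, 8]
N, n = len(values), 3

def estimated_variance_of_total(sample):
    """Assumed estimator: N^2 * (1 - n/N) * s^2 / n, with s^2 using divisor n - 1."""
    s2 = variance(sample)
    return N**2 * (1 - n / N) * s2 / n

# One variance estimate per possible without-replacement sample.
per_sample = [estimated_variance_of_total(s) for s in combinations(values, n)]
average_estimate = mean(per_sample)

print(round(min(per_sample), 3), round(max(per_sample), 3), round(average_estimate, 6))
```

Individual samples give quite different estimates, but their average over the 20 possible samples equals the sampling variance of 26.4, which is what makes the sample-based formula usable in practice.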
17. Estimation of a ratio (ratio of two random variables)
  The sample ratio is a biased estimate of the population ratio, but the bias is usually very small.
18. Consider the following example. The population consists of three elements a, b, and c.

  Element   Number of prof. staff   Number with MPH
  (a)        3                      1
  (b)        3                      2
  (c)        6                      3
  Total     12                      6

  If we take a sample of n = 2, then we have three possible samples:

  Possible sample   Prof. staff   With MPH   r (ratio)
  (a, b)            (3, 3)        (1, 2)     3/6 = 0.50
  (a, c)            (3, 6)        (1, 3)     4/9 = 0.44
  (b, c)            (3, 6)        (2, 3)     5/9 = 0.56
  Total                                      1.50

  The mean of the three sample ratios is 1.50/3 = 0.50, which in this small example coincides with the population ratio 6/12 = 0.5. In general the two need not agree: the sample ratio is a biased estimate of the population ratio, but the bias is usually very small.
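A table like this is easy to check by enumeration, and exact fractions avoid rounding in the r column. The sketch below uses the three elements from the example; the dictionary layout and the helper `ratio` are illustrative choices.

```python
from fractions import Fraction
from itertools import combinations

# (professional staff, staff with MPH) for each element of the population
population = {"a": (3, 1), "b": (3, 2), "c": (6, 3)}

def ratio(elements):
    """Ratio of staff with MPH to professional staff over the given elements."""
    staff = sum(population[e][0] for e in elements)
    mph = sum(population[e][1] for e in elements)
    return Fraction(mph, staff)

sample_ratios = {s: ratio(s) for s in combinations(population, 2)}
population_ratio = ratio(population)
mean_of_ratios = sum(sample_ratios.values()) / len(sample_ratios)

for sample, r in sample_ratios.items():
    print(sample, r, round(float(r), 2))
print("population ratio:", population_ratio, " mean of sample ratios:", mean_of_ratios)
```

The bias of the sample ratio is the difference between `mean_of_ratios` and `population_ratio`; the same function can be rerun on other small populations to see how large that gap gets.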