Sampling Design

1
Chapter 2
  • Sampling Design

2
How do we gather data?
  • Surveys
  • Opinion polls
  • Interviews
  • Studies
      • Observational
          • Retrospective (past)
          • Prospective (future)
      • Experiments

3
Population
  • the entire group of individuals that we want
    information about

4
Census
  • a complete count of the population

5
Why would we not use a census all the time?
  • Not necessarily accurate
  • Very expensive
  • Perhaps impossible
  • If using destructive sampling, you would destroy
    the population
      • Breaking strength of soda bottles
      • Lifetime of flashlight batteries
      • Safety ratings for cars

Look at the U.S. census: it has a huge amount of
error in it, plus it takes a long time to compile
the data, making the data obsolete by the time we
get it!
Suppose you wanted to know the average weight of
the white-tail deer population in Texas. Would
it be feasible to do a census?
Since taking a census of any population takes
time, censuses are VERY costly to do!
6
Sample
  • A part of the population that we actually examine
    in order to gather information
  • Use sample to generalize to population

7
Sampling design
  • refers to the method used to choose the sample
    from the population

8
Sampling frame
  • a list of every individual in the population

9
Simple Random Sample (SRS)
Suppose we were to take an SRS of 50 BGHS
students: put each student's name in a hat,
then randomly select 50 names from the hat. Each
student has the same chance to be selected!
  • consists of n individuals from the population
    chosen in such a way that
      • every individual has an equal chance of being
        selected
      • every set of n individuals has an equal chance
        of being selected

Not only does each student have the same chance
to be selected but every possible group of 50
students has the same chance to be selected!
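
The names-in-a-hat idea can be sketched in Python. This is only an illustration: the list of 500 numbered students is a made-up sampling frame, and the helper name simple_random_sample is hypothetical. The standard library's random.sample draws without replacement, so every set of 50 names is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals; every set of n is equally likely."""
    rng = random.Random(seed)          # seed only makes the demo repeatable
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical sampling frame: 500 numbered students (not real BGHS data).
students = [f"Student {i}" for i in range(1, 501)]

srs = simple_random_sample(students, 50, seed=1)
print(len(srs), "students selected, for example:", srs[:3])
```

Drawing without replacement matters here: a name pulled from the hat cannot be pulled again.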
10
Stratified random sample
Homogeneous groups are groups whose members are
alike with respect to some characteristic.
Suppose we were to take a stratified random
sample of 50 BGHS students. Since students are
already divided by grade level, the grade levels
can be our strata. Then randomly select 25 seniors
and randomly select 25 juniors.
  • population is divided into homogeneous groups
    called strata
  • an SRS is pulled from each stratum
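
A minimal sketch of the seniors-and-juniors example, assuming a made-up frame of (name, grade) pairs. The function name stratified_sample and the class sizes are hypothetical; the point is only that the frame is split into strata first and an SRS is then taken inside each stratum.

```python
import random

def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Split the frame into strata, then take an SRS inside each stratum."""
    rng = random.Random(seed)
    strata = {}
    for person in frame:
        strata.setdefault(stratum_of(person), []).append(person)
    # One independent SRS per stratum, with the requested size for that stratum.
    return {name: rng.sample(members, n_per_stratum[name])
            for name, members in strata.items()}

# Hypothetical frame: 240 seniors and 260 juniors (made-up counts).
frame = ([(f"Senior {i}", "senior") for i in range(1, 241)]
         + [(f"Junior {i}", "junior") for i in range(1, 261)])

picked = stratified_sample(frame, lambda p: p[1],
                           {"senior": 25, "junior": 25}, seed=2)
print({grade: len(people) for grade, people in picked.items()})
```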

11
Systematic random sample
Suppose we want to do a systematic random sample
of BGHS students. Number a list of students.
(Ex.: if there are approximately 500 students
and we want a sample of 50, then 500/50 = 10.)
Select a number between 1 and 10 at random.
That student will be the first student chosen;
then choose every 10th student from there.
  • select sample by following a systematic approach
  • randomly select where to begin
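
The 500/50 = 10 example can be sketched as follows. The numbered list stands in for the student list, and systematic_sample is a hypothetical helper name. The only random step is the starting position; everything after that follows the fixed interval.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start within the first interval, then every k-th item."""
    rng = random.Random(seed)
    k = len(frame) // n              # interval: 500 // 50 = 10
    start = rng.randint(1, k)        # random position from 1 to k, inclusive
    positions = range(start, len(frame) + 1, k)
    return [frame[pos - 1] for pos in positions][:n]

# Hypothetical numbered list of 500 students.
numbered_students = list(range(1, 501))

chosen = systematic_sample(numbered_students, 50, seed=3)
print(chosen[:5])
```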

12
Cluster Sample
Suppose we want to do a cluster sample of BGHS
students. One way to do this would be to
randomly select 10 classrooms during 2nd period.
Sample all students in those rooms!
  • based upon location
  • randomly pick a location and sample everyone there
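
A minimal sketch of the classroom example, assuming 40 made-up rooms of 25 students each. The name cluster_sample is hypothetical. Note that the random selection applies to the rooms, not to individual students: everyone in a chosen room is sampled.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose whole clusters and keep every member of each."""
    rng = random.Random(seed)
    chosen_rooms = rng.sample(sorted(clusters), n_clusters)
    return {room: list(clusters[room]) for room in chosen_rooms}

# Hypothetical 2nd-period classrooms: 40 rooms with 25 students each.
rooms = {f"Room {r}": [f"R{r}-S{s}" for s in range(1, 26)]
         for r in range(1, 41)}

sampled_rooms = cluster_sample(rooms, 10, seed=4)
total = sum(len(students) for students in sampled_rooms.values())
print(len(sampled_rooms), "rooms,", total, "students")
```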

13
Multistage sample
To use a multistage approach to sampling BGHS
students, we could first divide 2nd period
classes by level (AP/pre-AP, Accelerated,
Regular, etc.) and randomly select 4 second
period classes from each group. Then we could
randomly select 5 students from each of those
classes. The selection process is done in stages!
  • select successively smaller groups within the
    population in stages
  • SRS used at each stage
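
The two stages described above can be sketched like this. The class names, the number of classes per level, and the class sizes are all made up for illustration, and multistage_sample is a hypothetical helper. Stage 1 picks classes within each level; stage 2 picks students within each chosen class.

```python
import random

def multistage_sample(classes_by_level, classes_per_level,
                      students_per_class, seed=None):
    """Stage 1: SRS of classes within each level. Stage 2: SRS of students."""
    rng = random.Random(seed)
    sample = []
    for level in sorted(classes_by_level):
        classes = classes_by_level[level]
        for class_name in rng.sample(sorted(classes), classes_per_level):
            sample.extend(rng.sample(classes[class_name], students_per_class))
    return sample

# Hypothetical data: 3 levels, 8 second-period classes each, 20 students each.
levels = ["AP/pre-AP", "Accelerated", "Regular"]
classes_by_level = {
    level: {f"{level} class {c}": [f"{level} c{c} s{s}" for s in range(1, 21)]
            for c in range(1, 9)}
    for level in levels
}

multistage = multistage_sample(classes_by_level, 4, 5, seed=5)
print(len(multistage), "students selected")
```

With 3 levels, 4 classes per level, and 5 students per class, the sample has 3 x 4 x 5 = 60 students.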

14
SRS
  • Advantages
      • Unbiased
      • Easy
  • Disadvantages
      • Large variance
      • May not be representative
      • Must have sampling frame (list of population)

15
Stratified
  • Advantages
      • More precise unbiased estimator than SRS
      • Less variability
      • Cost reduced if strata already exist
  • Disadvantages
      • Difficult to do if you must divide the population
        into strata yourself
      • Formulas for SD and confidence intervals are more
        complicated
      • Need sampling frame
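
The "less variability" claim can be checked with a small simulation. This is a sketch under stated assumptions: the population is made up, with two equal-sized strata whose typical values differ a lot, and the stratified sample takes 25 from each stratum so that the plain sample mean is the appropriate estimate. The function and variable names are hypothetical.

```python
import random
import statistics

def spread_of_sample_means(draw_sample, repeats, rng):
    """Standard deviation of the sample mean over many repeated samples."""
    means = [statistics.mean(draw_sample(rng)) for _ in range(repeats)]
    return statistics.stdev(means)

rng = random.Random(6)
# Hypothetical population: two equal-sized strata with different centers.
stratum_a = [rng.gauss(50, 5) for _ in range(250)]
stratum_b = [rng.gauss(80, 5) for _ in range(250)]
population = stratum_a + stratum_b

def draw_srs(rng):
    return rng.sample(population, 50)

def draw_stratified(rng):
    # Equal strata sizes and equal allocation, so the plain mean is fair.
    return rng.sample(stratum_a, 25) + rng.sample(stratum_b, 25)

srs_spread = spread_of_sample_means(draw_srs, 1000, rng)
stratified_spread = spread_of_sample_means(draw_stratified, 1000, rng)
print("SRS spread:", round(srs_spread, 2),
      "stratified spread:", round(stratified_spread, 2))
```

The gain comes from the strata being homogeneous inside and different from each other. If the strata were similar to each other, the two designs would give about the same spread.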

16
Systematic Random Sample
  • Advantages
      • Unbiased
      • Ensures that the sample is distributed across
        the population
      • More efficient, cheaper, etc.
  • Disadvantages
      • Large variance
      • Can be confounded by trend or cycle
      • Formulas are complicated

17
Cluster Samples
  • Advantages
      • Unbiased
      • Cost is reduced
      • A full sampling frame may not be available (and is
        not needed)
  • Disadvantages
      • Clusters may not be representative of population
      • Formulas are complicated

18
Identify the sampling design
  • 1) The Educational Testing Service (ETS) needed a
    sample of colleges. ETS first divided all
    colleges into groups of similar types (small
    public, small private, etc.). Then they randomly
    selected 3 colleges from each group.

Stratified random sample
19
Identify the sampling design
  • 2) A county commissioner wants to survey people
    in her district to determine their opinions on a
    particular law up for adoption. She decides to
    randomly select blocks in her district and then
    survey all who live on those blocks.

Cluster sampling
20
Identify the sampling design
  • 3) A local restaurant manager wants to survey
    customers about the service they receive. Each
    night the manager randomly chooses a number
    between 1 and 10. He then gives a survey to that
    customer, and to every 10th customer after them,
    to fill out before they leave.

Systematic random sampling
21
Random digit table
Numbers can be read across, vertically, or
diagonally.
The following is part of the random digit table
found in Table B of your textbook:
Row 1: 4 5 1 8 5 0 3 3 7 1 2 4 2 5 5 8 0 4 5 7 0 3 8 9 9 3 4 3 5 0 6 3
  • each entry is equally likely to be any of the 10
    digits
  • digits are independent of each other

22
Suppose your population consisted of these 20
people:
   1) Aidan      6) Fred      11) Kathy     16) Paul
   2) Bob        7) Gloria    12) Lori      17) Shawnie
   3) Chico      8) Hannah    13) Matthew   18) Tracy
   4) Doug       9) Israel    14) Nan       19) Uncle Sam
   5) Edward    10) Jung      15) Opus      20) Vernon
Use the following random digits to select a
sample of five from these people.
Row 1: 45 18 05 13 71 24 15 58 01 57 03 89 93 43 50 63
We will need to use two-digit random numbers,
ignoring any number greater than 20. Start with
Row 1 and read across.
45: Ignore.
18: Tracy
05: Edward
13: Matthew
71: Ignore.
24: Ignore.
15: Opus
58: Ignore.
01: Aidan
Stop when five people are selected. So my sample
would consist of Aidan, Edward, Matthew, Opus,
and Tracy.
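
The table-reading procedure can be sketched in Python. Two assumptions are worth flagging: the helper name select_with_digit_table is hypothetical, and the digit row used here takes the sixth pair as 24 (as in the Table B excerpt on the random digit table slide) so that four pairs are ignored and the result agrees with the sample stated on this slide. The sketch also skips repeated labels, which the slide does not need to address because none occur.

```python
def select_with_digit_table(digits, people, n):
    """Read two-digit labels left to right; skip invalid labels and repeats."""
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        label = int(digits[i:i + 2])
        if 1 <= label <= len(people) and label not in chosen:
            chosen.append(label)
        if len(chosen) == n:
            break                     # stop once n people are selected
    return [people[label - 1] for label in chosen]

people = ["Aidan", "Bob", "Chico", "Doug", "Edward", "Fred", "Gloria",
          "Hannah", "Israel", "Jung", "Kathy", "Lori", "Matthew", "Nan",
          "Opus", "Paul", "Shawnie", "Tracy", "Uncle Sam", "Vernon"]

# Assumed Row 1, read as pairs: 45 18 05 13 71 24 15 58 01 57 03 89 93 43 50 63
row_1 = "45180513712415580157038993435063"

digit_sample = select_with_digit_table(row_1, people, 5)
print(digit_sample)  # ['Tracy', 'Edward', 'Matthew', 'Opus', 'Aidan']
```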
23
Bias
  • ERROR
  • favors certain outcomes

Anything that causes the data to be wrong! It
might be attributed to the researchers, the
respondent, or to the sampling method!
24
Sources of Bias
  • things that can cause bias in your sample
  • cannot do anything with bad data

25
Voluntary response
  • People choose to respond
  • Usually only people with very strong opinions
    respond

An example would be the surveys in magazines that
ask readers to mail in the survey. Other
examples are call-in shows, American Idol,
etc. Remember, the respondents select themselves
to participate in the survey!
Remember: the way to identify voluntary
response is self-selection!!
26
Convenience sampling
The data obtained by a convenience sample will be
biased; however, this method is often used for
survey results reported in newspapers and
magazines!
  • Ask people who are easy to ask
  • Produces biased results

An example would be stopping friendly-looking
people in the mall to survey. Another example is
the surveys left on tables at restaurants - a
convenient method!
27
Undercoverage
  • some groups of population are left out of the
    sampling process

Suppose you take a sample by randomly selecting
names from the phone book: some groups will not
have the opportunity of being selected!
28
Nonresponse
Because of huge telemarketing efforts in the past
few years, telephone surveys have a MAJOR problem
with nonresponse!
  • occurs when an individual chosen for the sample
    can't be contacted or refuses to cooperate
  • telephone surveys: 70% nonresponse

People are chosen by the researchers, BUT refuse
to participate. NOT self-selected! This is
often confused with voluntary response!
One way to help with the problem of nonresponse
is to make follow-up contact with the people who
are not home when you first contact them.
29
Response bias
Suppose we wanted to survey high school students
on drug abuse and we used a uniformed police
officer to interview each student in our sample.
Would we get honest answers?
  • occurs when the behavior of the respondent or
    interviewer causes bias in the sample
  • wrong answers

Response bias occurs when, for some reason
(the interviewer's or the respondent's fault),
you get incorrect answers.
30
Wording of the Questions
The level of vocabulary should be appropriate for
the population you are surveying.
Questions must be worded as neutrally as possible
to avoid influencing the response.
  • wording can influence the answers that are given
  • connotation of words
  • use of big words or technical words

31
Source of Bias?
1) Before the presidential election of 1936 (FDR
against Republican Alf Landon), the magazine
Literary Digest predicted that Landon would win
the election in a 3-to-2 victory, based on a
survey of 10 million people. George Gallup
surveyed only 50,000 people and predicted that
Roosevelt would win. The Digest's survey came
from magazine subscribers, car owners, telephone
directories, etc.
Undercoverage: since the Digest's survey came
from car owners, etc., the people selected were
mostly from high-income families and thus mostly
Republican! (Other answers are possible.)
32
2) Suppose that you want to estimate the total
amount of money spent by students on textbooks
each semester at WKU. You collect register
receipts from students as they leave the
bookstore during lunch one day.
Convenience sampling (an easy way to collect
data) or undercoverage (students who buy books
from online bookstores are not included).
33
3) To find the average value of a home in Bowling
Green, one averages the price of homes that are
listed for sale with a realtor.
Undercoverage: leaves out homes that are not for
sale or homes that are listed with different
realtors. (Other answers are possible.)
34
Homework
  • 5.1-5, 7, 9-11, 13