Sampling - PowerPoint PPT Presentation


About This Presentation




Slides: 69
Provided by: soc60


Transcript and Presenter's Notes

Title: Sampling

Polls predicting 1992 U.S. presidential election
Polls predicting 1996 U.S. presidential election
How many interviews did it take to estimate the
behavior of 90 million voters?
  • Fewer than 2,000

The History of Sampling
  • In 1920, the Literary Digest mailed postcards to
    people in 6 states, asking whom they were
    planning to vote for in the presidential
    election.
  • The Digest correctly predicted that Harding would
    be elected.
  • In the elections that followed, the Literary
    Digest expanded the size of its poll and made
    correct predictions in 1924, 1928, and 1932.

The History of Sampling
  • In 1936, the Literary Digest conducted its most
    ambitious poll: 10 million ballots were sent to
    people listed in telephone directories and on
    lists of automobile owners.
  • Over 2 million responded, giving the Republican
    contender, Alf Landon, a 57 to 43 percent
    landslide over the incumbent, President
    Franklin Roosevelt.
  • Election results: Roosevelt won 61 percent of the vote.

The History of Sampling
  • Problem: a 22 percent return rate.
  • Part of the answer lay in
    the sampling frame used by the Digest: telephone
    subscribers and automobile owners.
  • Such a design selected a disproportionately
    wealthy sample.
  • The sample effectively excluded poor people, and
    poor people predominantly voted for
    Roosevelt's New Deal recovery program during the
    Depression.

The History of Sampling
  • In the same year (1936), George Gallup correctly
    predicted that Roosevelt would beat Landon.
  • Gallup's success in 1936 hinged on his use of
    quota sampling, which is based on knowledge of
    the characteristics of the population being
    sampled. People are selected to match the
    population characteristics.
  • Using quota sampling, Gallup successfully
    predicted the presidential winner in 1940 and
    1944.

The History of Sampling
  • In 1948, Gallup mistakenly picked Thomas Dewey
    over incumbent President Harry Truman.
  • Factors that accounted for the 1948 failure:
  • (1) Most of the pollsters stopped polling in
    early October despite a steady trend toward Truman
    during the campaign.
  • (2) Undecided voters went disproportionately
    for Truman.
  • (3) Unrepresentativeness of the sample (resulting
    from quota sampling).

The History of Sampling
  • The quota sampling technique requires that the
    researcher know something about the total
    population.
  • For national political polls, such information
    came primarily from census data.
  • By 1948, however, WWII had produced a massive
    movement from country to city, radically changing
    the character of the U.S. population, while Gallup
    still relied on 1940 census data. City dwellers tended
    to vote Democratic; hence the over-representation
    of rural voters underestimated the number of
    Democratic votes.

Population and Sample element
  • Element: An element is the unit about which
    information is collected and that provides the
    basis of analysis.
  • EX) people, families, corporations
  • Usually the same as the unit of analysis
  • Population: The entire group of individuals that
    we want information about is called the
    population. A population is the theoretically
    specified aggregation of study elements.
  • Sample: A sample is the part of the population that we
    actually examine in order to gather information.

Defining the target population
  • It is vitally important to define the
    target population carefully so that the proper source from which
    the data are to be collected can be identified.
  • Question: "To whom do we want to talk?" What or
    who will be observed? Answer these questions
    in terms of the tangible characteristics of the
    population: (1) definition of the element, (2) time
    referent for the study.
  • EX) females between ages 12 and 50.

Defining the study population
  • Study population: A study population is the
    aggregation of elements from which the sample is
    actually selected.
  • Lists of elements are usually somewhat incomplete.

Sampling units
  • A sampling unit is that element or set of
    elements considered for selection in some stage
    of sampling.
  • In a simple single-stage sample, the sampling
    units are the same as the elements and are
    probably the units of analysis.
  • EX) passengers on a passenger list: sampling
    unit = element
  • In a multistage sample, different sampling units
    are used at different stages.
  • EX) an airline could first select flights as
    the sampling unit, then select certain passengers
    on the previously selected flights.
  • Primary sampling units (PSUs): flights
  • Secondary sampling units: passengers

Observation unit
  • An observation unit, or unit of data collection,
    is an element or aggregation of elements from
    which information is collected.
  • EX) A researcher may interview heads of households
    (the observation units) to collect information
    about all members of the households (the units of
    analysis).

Sampling Design
Sample designs
  • Nonprobability samples
    • Voluntary response
    • Convenience
    • Judgment
    • Quota
    • Snowball
  • Probability samples
    • Simple random
    • Systematic
    • Stratified
      • Proportionate
      • Disproportionate
    • Cluster
    • Multistage

There are no appropriate statistical techniques
for measuring random sampling error from a
nonprobability sample. Thus projecting the data
beyond the sample is statistically inappropriate.

Nonprobability Sampling
  • Social research is often conducted in situations
    where you can't select the kinds of probability
    samples used in large-scale social surveys.
  • Lack of a population list: Suppose you wanted to
    study homelessness. There is no list of all
    homeless individuals, nor are you likely to
    create such a list.

Voluntary Response Sample
  • A voluntary response sample consists of people
    who choose themselves by responding to a general
    appeal. Voluntary response samples are biased
    because people with strong opinions, especially
    negative opinions, are most likely to respond.
  • EX) radio station call-in polls meant to reflect
    public opinion.

Convenience Sampling
  • Convenience sampling (haphazard or accidental
    sampling): relying on available subjects
  • EX) man-on-the-street interviews, talking to friends
    about their political sentiments
  • EX) a professor uses students as a sample
  • EX) every tenth student entering the university
  • EX) surveying overseas Chinese for international

Convenience Sampling
  • Advantages: very low cost, extensively used, no
    need for a list of the population.
  • It is justified only if the researcher wants to
    study the characteristics of people passing the
    sampling point at specified times, or if less
    risky sampling methods are not feasible.

Convenience Sampling
  • Problems:
  • (1) There is no way of knowing if those included are
    representative.
  • (2) Variability and bias of estimates cannot be
    measured or controlled.
  • (3) Projecting the results beyond the specific
    sample is inappropriate.
  • It should be used only in exploratory designs to
    generate ideas and insights.
  • You should alert readers to the risks associated
    with this method.

Judgment Samples (Purposive Samples)
  • Hand-picked sample elements, believed to be
    representative of the population of interest
  • EX) a fashion manufacturer regularly selects a
    sample of key accounts that it believes are
    capable of providing the information to predict
    what will sell in the fall.
  • EX) the Dow Jones Industrial Average selects 30
    blue-chip stocks out of 1,800 stocks. It is highly
    correlated with other NYSE indicators of
    daily percentage price changes.
  • EX) representative communities in U.S.
    presidential elections.
  • EX) CPI (Consumer Price Index)

Snowball sample
  • Locate an initial set of respondents. These
    individuals are then used as informants to
    identify others with the desired characteristics.
  • Appropriate when the members of a special
    population are difficult to locate.

Snowball sample
  • EX) surveying users of an unusual product: a study
    among deaf people of a product that would allow deaf
    people to communicate over the telephone.
  • EX) the homeless, gangsters, migrant
    workers, undocumented immigrants.
  • EX) network studies, HIV
  • Bias: a person who is known to someone has a
    higher probability of being similar to the first
    person.

Quota samples
  • Sample elements are selected in such a way that
    the proportion of the sample elements possessing
    a certain characteristic is approximately the
    same as the proportion with that characteristic
    in the population.
  • Establishing a characteristics matrix: What
    proportion of the target population is male and
    female? What proportions of each gender fall into
    various age categories, educational levels, ethnic
    groups?
  • Once such a matrix has been created and a
    relative proportion assigned to each cell in the
    matrix, you collect data from people having all
    the characteristics of a given cell.
  • All the persons in a given cell are then assigned
    a weight appropriate to their portion of the
    total population.
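
The weighting step can be sketched as follows. This is a minimal illustration, not a procedure given in the slides: the function name `quota_weights`, the 2 x 2 matrix, and every number in it are hypothetical. Each cell's weight is taken to be its population share divided by its share of the completed interviews.

```python
def quota_weights(population_share, sample_counts):
    """Return a weight per cell: population share divided by sample share."""
    n = sum(sample_counts.values())
    return {
        cell: share / (sample_counts[cell] / n)
        for cell, share in population_share.items()
    }

# Hypothetical characteristics matrix (gender x age group); all numbers are made up.
population_share = {
    ("female", "under 40"): 0.30,
    ("female", "40 and over"): 0.22,
    ("male", "under 40"): 0.28,
    ("male", "40 and over"): 0.20,
}
# Hypothetical number of completed interviews in each cell.
sample_counts = {
    ("female", "under 40"): 40,
    ("female", "40 and over"): 20,
    ("male", "under 40"): 25,
    ("male", "40 and over"): 15,
}

weights = quota_weights(population_share, sample_counts)
for cell, w in weights.items():
    print(cell, round(w, 2))
```

A cell that is over-represented among the interviews gets a weight below 1, and an under-represented cell gets a weight above 1.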

Quota samples
  • Problems:
  • The sample could be far off with respect to other
    important characteristics.
  • The quota frame must be accurate, and it is often
    difficult to get up-to-date information for this
    purpose.

Quota samples
  • Biases may exist in the selection of sample
    elements within a given cell. The interviewer has
    a quota to achieve. The actual choice of elements
    is left to the discretion of the individual field
    worker. Interviewers are prone to follow certain
    tendencies:

Quota samples
  • toward those who are similar to the interviewer
    (they are more likely to be interviewed),
  • toward the accessible (first floors, airline
    terminals, business districts, college campuses),
  • toward households with children, excluding working
  • against workers in manufacturing (service and
  • against extremes of income. EX) "mansions" were
    skipped because the interviewer did not feel
    comfortable knocking on doors that were answered
    by servants,
  • against the less educated, against low-status

Probability sample
  • A probability sample is a sample chosen by
    chance. We must know what samples are possible
    and what chance, or probability, each possible
    sample has.

Probability sampling offers two advantages
  • First, probability samples, although never
    perfectly representative, are typically more
    representative than other types of samples
    because the biases previously discussed are
    avoided.
  • Second, and more important, probability theory
    permits us to estimate the accuracy or
    representativeness of the sample.

Types of Sampling Designs
  • Simple Random Sampling
  • Systematic Sampling
  • Stratified Sampling
  • Cluster Sampling

Simple random sample
  • A simple random sample (SRS) of size n consists
    of n individuals from the population chosen in
    such a way that every set of n individuals has an
    equal chance to be the sample actually selected.

Simple Random Sampling
  • Simple random sampling is the basic sampling
    method assumed in the statistical computations of
    social research.
  • Establish a sampling frame.
  • Assign a single number to each element in the
    list, not skipping any number in the process.
  • Generate a series of random numbers to select the
    elements.
  • Simple random sampling is seldom used in practice.
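
The three steps (frame, numbering, random numbers) can be sketched in a few lines. This is a minimal sketch, not the slides' own procedure: the function name `simple_random_sample`, the student identifiers, and the seed are hypothetical. It uses the standard library's `random.Random.sample`, which draws without replacement so that every set of n numbers is equally likely.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements so that every set of n elements is equally likely."""
    rng = random.Random(seed)
    # Number the elements 1..N without skipping any number.
    numbered = {i: element for i, element in enumerate(frame, start=1)}
    # Generate n distinct random numbers between 1 and N.
    chosen_numbers = rng.sample(range(1, len(frame) + 1), n)
    return [numbered[i] for i in chosen_numbers]

# Hypothetical sampling frame of 500 students.
frame = [f"student_{i:03d}" for i in range(1, 501)]
sample = simple_random_sample(frame, 25, seed=1)
print(sample[:5])
```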

Systematic Sampling
  • A systematic sample with a random start: a
    procedure in which an initial starting point is
    selected by a random process, and then every kth
    number on the list is selected.
  • Sampling interval: the number of population
    elements between the units selected for the
    sample.
  • Sampling interval = population size / sample size
  • Sampling ratio = sample size / population size
  • Systematic sampling is virtually identical to
    simple random sampling. If the list of elements is
    indeed randomized before sampling, one might
    argue that a systematic sample drawn from that
    list is in fact a simple random sample.
  • Systematic sampling is much easier to conduct.
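
A systematic sample with a random start might look like the sketch below. The function name `systematic_sample`, the 1,000-element list, and the seed are hypothetical, and the sketch assumes the sample size does not exceed the population size.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Select every k-th element after a random start."""
    rng = random.Random(seed)
    k = len(frame) // n          # sampling interval = population size / sample size
    start = rng.randrange(k)     # random start within the first interval
    return [frame[start + i * k] for i in range(n)]

frame = list(range(1, 1001))     # hypothetical list of 1,000 numbered elements
sample = systematic_sample(frame, 100, seed=7)
sampling_ratio = len(sample) / len(frame)   # sample size / population size
print(sample[:5], sampling_ratio)
```

With 1,000 elements and a sample of 100, the sampling interval is 10 and the sampling ratio is 1/10.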

Problem of periodicity
  • The arrangement of elements in the list can make
    systematic sampling unwise.
  • EX) collecting retail sales information every
    seventh day (always a Monday)
  • EX) apartment numbers
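
The retail example can be made concrete with a small sketch. The sales figures are invented for illustration (weekdays 100, Saturdays 150, Sundays 180), and the helper `every_kth` is a hypothetical name. An interval of 7 lines up with the weekly cycle, so the sample sees only one day of the week.

```python
# 52 weeks of hypothetical daily sales; day 0 is a Monday.
# Weekdays sell 100 units, Saturdays 150, Sundays 180 (made-up numbers).
WEEKDAY_SALES = [100, 100, 100, 100, 100, 150, 180]
daily_sales = [WEEKDAY_SALES[day % 7] for day in range(52 * 7)]

def every_kth(values, interval, start):
    """Systematic selection: take values[start], values[start + interval], ..."""
    return values[start::interval]

true_mean = sum(daily_sales) / len(daily_sales)

# Interval 7 starting on day 0 selects Mondays only.
mondays = every_kth(daily_sales, interval=7, start=0)
monday_mean = sum(mondays) / len(mondays)

# Interval 10 does not line up with the weekly cycle, so it hits every weekday.
weekdays_hit = {day % 7 for day in range(0, len(daily_sales), 10)}

print(round(true_mean, 1), monday_mean, len(weekdays_hit))
```

The Monday-only sample misses the weekend peak entirely, however many weeks it covers.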

Stratified Random Sampling
  • The parent population is divided into mutually
    exclusive and exhaustive subsets.
  • A simple random sample of elements is chosen
    independently from each group or subset.
  • The aim is to organize the population into homogeneous
    subsets (strata) and to select the appropriate number of
    elements from each.

Stratified Random Sampling
  • Sampling error can be reduced in two ways:
  • (1) Increasing the sample size.
  • (2) A homogeneous population produces samples
    with smaller sampling errors than does a
    heterogeneous population.
  • The logic of stratified sampling: rather than
    selecting your sample from the total population
    at large, you ensure that appropriate numbers of
    elements are drawn from homogeneous subsets of
    that population.

Stratified Random Sampling
  • EX) urban and rural groups differ widely in
    attitudes toward energy conservation, while members
    within each group hold very similar attitudes.
  • EX) divide the university by college class
    (freshmen, sophomores, juniors, seniors)
  • In selecting stratification variables, you should
    be concerned primarily with those that are
    presumably related to variables that you want to
    represent accurately, such as sex, education,
    geographic location, etc.
  • EX) estimate income stratified by educational
    level

Example 3.17
  • ASCAP (American Society of Composers, Authors and Publishers)
  • Radio stations are stratified by type of
    community (metropolitan, rural), geographic
    location, and the size of the license fee paid to
    ASCAP, which reflects the size of the audience.

Stratified Random Sampling
  • The more homogeneous the elements within strata,
    the smaller the sampling error.
  • The investigator should divide the population
    into strata so that the elements within any given
    stratum are as similar in value as possible and
    the values between any two strata are as
    disparate as possible.
  • In the limit, if the investigator is successful
    in partitioning the population so that the
    elements in each stratum are exactly equal, there
    will be no error associated with the estimate of
    the population parameters.

Increased precision of stratified samples
  • EX) N = 1,000
  • Mean = 5(.2) + 10(.3) + 20(.5) = 14; variance = 39
  • Suppose that a researcher was able to
    partition the total population so that all the
    elements with a value of 5 were in one stratum, those
    with a value of 10 were in the second, and those
    with a value of 20 were in the third.
  • Take a proportionate stratified sample of n = 10.
  • Or select a sample of n = 3, and calculate the
    weighted average.
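
The example can be checked numerically. The sketch below builds the population of 1,000 from the stated proportions (.2, .3, .5) and compares the two stratified estimates with the population mean. The function names and the seed are hypothetical. Because each stratum is perfectly homogeneous, both stratified estimates equal the population mean whatever the random draws, while a simple random sample of 10 generally does not.

```python
import random
from statistics import mean, pvariance

# Population from the example: N = 1,000 with values 5, 10 and 20
# in proportions .2, .3 and .5.
strata = {5: [5] * 200, 10: [10] * 300, 20: [20] * 500}
population = [x for members in strata.values() for x in members]

pop_mean = mean(population)
pop_var = pvariance(population)

def proportionate_stratified_mean(strata, n, rng):
    """Draw from each stratum in proportion to its size, then average."""
    total = sum(len(members) for members in strata.values())
    draws = []
    for members in strata.values():
        n_h = n * len(members) // total   # exact for this example (2, 3, 5)
        draws.extend(rng.sample(members, n_h))
    return mean(draws)

def weighted_mean_one_per_stratum(strata, rng):
    """Draw one element per stratum and weight it by the stratum's share."""
    total = sum(len(members) for members in strata.values())
    return sum(rng.choice(members) * len(members) / total
               for members in strata.values())

rng = random.Random(0)
strat_mean = proportionate_stratified_mean(strata, 10, rng)   # n = 10
weighted = weighted_mean_one_per_stratum(strata, rng)         # n = 3
srs_mean = mean(rng.sample(population, 10))                   # for contrast

print(pop_mean, pop_var, strat_mean, weighted, srs_mean)
```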

Proportional stratified sample
  • Proportional stratified sample: the number of
    sampling units drawn from each stratum is in
    proportion to the relative population size of
    that stratum.
  • Method 1: (1) Sort the population into discrete groups. (2)
    On the basis of the relative proportion of the
    population represented by a given group, select
    a number of elements from that group constituting the
    same proportion of your desired sample size.
  • Method 2: (1) Group elements and then put the groups together
    in a continuous list. (An ordered list, if it has no
    periodicity, is sometimes better than a randomized
    list: implicit stratification in systematic
    sampling.) (2) Select a systematic sample from the
    entire list.
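
The allocation step of the first method can be sketched as below. The function name `proportional_allocation` and the class sizes are hypothetical, and the largest-remainder rule used to settle rounding is one reasonable choice rather than something the slides specify.

```python
def proportional_allocation(stratum_sizes, n):
    """Split a sample of size n across strata in proportion to their sizes."""
    total = sum(stratum_sizes.values())
    # Integer share for each stratum, rounded down.
    alloc = {h: n * size // total for h, size in stratum_sizes.items()}
    # Hand any units lost to rounding to the strata with the largest remainders.
    remainders = {h: n * size % total for h, size in stratum_sizes.items()}
    leftover = n - sum(alloc.values())
    for h in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        alloc[h] += 1
    return alloc

# Hypothetical university of 10,000 students, stratified by college class.
class_sizes = {"freshmen": 3000, "sophomores": 2500,
               "juniors": 2500, "seniors": 2000}
allocation = proportional_allocation(class_sizes, 200)
print(allocation)
```

A simple random sample of the allocated size would then be drawn independently within each class.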

Disproportionate stratified sampling
  • Balancing the two criteria of strata size and
    strata variability. Strata exhibiting more
    variability are sampled more than proportionately
    to their relative size; those strata that are
    very homogeneous are sampled less than
    proportionately.
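
One common way to balance size and variability is Neyman allocation, which makes each stratum's sample size proportional to its size times its standard deviation. The slides do not name this rule, so treat the sketch below as one possible illustration. The function name, the strata, and the standard deviations are all hypothetical, and the simple rounding used here is not guaranteed to sum exactly to n in general.

```python
def neyman_allocation(sizes, std_devs, n):
    """Allocate n in proportion to (stratum size x stratum standard deviation)."""
    products = {h: sizes[h] * std_devs[h] for h in sizes}
    total = sum(products.values())
    return {h: round(n * products[h] / total) for h in sizes}

# Two hypothetical strata of equal size but unequal variability.
sizes = {"urban": 500, "rural": 500}
std_devs = {"urban": 10, "rural": 30}
allocation = neyman_allocation(sizes, std_devs, 100)
print(allocation)
```

A proportionate design would take 50 from each stratum; here the more variable stratum receives three times as many interviews as the homogeneous one.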

Multistage cluster sampling
  • Used when it is either impossible or impractical
    to compile an exhaustive list of the elements
    composing the target population.
  • EX) census blocks: sample blocks, then sample
    households, then sample individuals.
  • EX) sampling high school students in Taiwan directly
    requires the entire student list. With cluster
    sampling, no initial listing of students is required.

Multistage cluster sampling
  • The price of this efficiency: a less accurate sample. A
    simple random sample drawn from a population list
    is subject to a single sampling error, but a
    two-stage cluster sample is subject to two
    sampling errors. (EX) selecting a sample of
    disproportionately wealthy city blocks, plus a
    sample of disproportionately wealthy households
    within those blocks.)
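
A two-stage design can be sketched as follows: first sample clusters (blocks), then sample elements (households) within the chosen clusters. The function name `two_stage_sample`, the block and household identifiers, and the sample sizes are hypothetical. Each stage is a separate random draw, which is where the two sampling errors come from.

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: sample elements within each one."""
    rng = random.Random(seed)
    chosen_clusters = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for cluster in chosen_clusters:
        members = clusters[cluster]
        k = min(n_per_cluster, len(members))
        sample.extend((cluster, member) for member in rng.sample(members, k))
    return sample

# Hypothetical city: 40 blocks with 30 households each.
blocks = {
    f"block_{b:02d}": [f"household_{b:02d}_{h:02d}" for h in range(30)]
    for b in range(40)
}
sample = two_stage_sample(blocks, n_clusters=5, n_per_cluster=4, seed=3)
print(sample[:4])
```

Only the five selected blocks need a household list; no list of all 1,200 households is ever compiled.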

Comparisons of sampling techniques
Sampling Bias
  • A sample is biased if it is obtained by a method
    that favors the selection of elementary units
    having particular characteristics.

Sampling Error or Error of Estimation

Error in survey research
  • Random sampling error
  • Systematic (nonsampling) error
    • Respondent error
      • Nonresponse error
        • Self-selection bias
      • Response bias
        • Deliberate falsification
        • Unconscious misrepresentation
        • Acquiescence bias
        • Extremity bias
        • Interviewer bias
        • Auspices bias
        • Social desirability bias
        • Contamination by others
    • Administrative error
      • Data processing error
      • Sample selection error
      • Interviewer error
      • Interviewer cheating

Random Sampling Error
  • A statistical fluctuation that occurs because of
    chance variation in the elements selected for a
    sample.
  • Can be estimated.
  • Can be reduced by increasing the sample size.
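
For a simple random sample, the size of this error can be estimated with the usual large-sample formula for a proportion. The sketch below assumes a simple random sample from a large population, a 95 percent confidence level (z = 1.96), and the worst case p = 0.5; the function name `margin_of_error` is hypothetical.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion from an SRS of size n."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (500, 2000, 8000):
    print(n, round(100 * margin_of_error(n), 1), "percentage points")
```

Under these assumptions a sample of 2,000 gives a margin of roughly 2 percentage points, and quadrupling the sample size only halves it.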

Systematic Error (nonsampling error)
  • Results from some imperfect aspect of the research design
  • or from a mistake in the execution of the research.
  • A sample bias exists when the results of a sample
    show a consistent tendency to deviate in one
    direction from the true value of the population
    parameter.
  • Two general categories:
  • (1) Respondent error: nonresponse error, response
    bias
  • (2) Administrative error

Non-response error
  • The statistical difference between a survey that
    includes only those who responded and a survey
    that also includes those who failed to respond.
  • Nonrespondent: a person who is not contacted or
    who refuses to cooperate
  • 1. Not-at-home: married women
  • 2. Refusal: a person who is unwilling to
    participate

Non-response error
  • To identify the extent of nonresponse error,
    business researchers often select a sample of
    nonrespondents who are then recontacted
    (callback or follow-up).
  • Comparing the demographics of the sample with the
    demographics of the target population is one means
    of inspecting for possible bias.
  • EX) sample from the educational or personnel

Self-selection bias
  • EX) Who is more likely to respond to a customer
    satisfaction survey left on the dining table?
  • EX) PC software: expert views on the degree of "user
    friendliness" might be more critical.
  • Self-selection biases the survey because it
    allows extreme positions to be over-represented
    while those who are indifferent are
    under-represented.

Deliberate falsification
  • To appear intelligent: EX) price of a good,
    reluctant to say "can't remember".
  • To conceal personal information: EX) income, political
  • To avoid embarrassment: EX) sexual behaviors,
  • Out of boredom: to get rid of the interviewer
  • Reluctance to express negative feelings: EX) in an
    employee survey, to safeguard their job
  • To please the interviewer.
  • "Average man" hypothesis: to conform to their
    perception of the average person. EX) number of
    hours worked.

Unconscious Misrepresentation
  • In the absence of a strong preference, respondents
    will choose answers that justify their behavior
  • EX) Which PC is better? In-flight survey
    concerning aircraft preference
  • Misunderstanding the question
  • Never having thought about the question
  • EX) buying intention, quitting intention
  • Having forgotten the exact details
  • EX) When was the last time you? How many times did

Acquiescence bias
  • A tendency to agree with all questions or to
    indicate a positive connotation: yea-saying (or nay-saying)
  • EX) Japanese respondents may not wish to contradict others
  • Particularly prominent for ideas previously
    unfamiliar to the respondents

Extremity bias (or avoiding extreme positions)
  • Consistently low or high scores are given to
    every question.
  • EX) student evaluation of the class.

Interviewer bias
  • Bias due to the influence of the interviewer
    (mere presence)
  • Providing the "right" answer to please the interviewer
  • Appearing intelligent and wealthy to save face.
  • The interviewer's age, sex, tone of voice, facial
    expressions, or other nonverbal characteristics.
  • Will the interviewer's gender make a difference when
    asking certain questions?
  • The interviewer shortens or rephrases questions

Auspices bias
  • Bias in the responses of subjects caused by the
    respondents being influenced by the organization
    conducting the study.

Social desirability bias
  • Bias in the responses of subjects caused by the
    respondent's desire, either conscious or
    unconscious, to gain prestige or to appear in a
    different social role.
  • EX) inflated income
  • EX) Have you ever been fired from a job?
  • EX) Do you have roaches in your home?
  • EX) How many times do you brush your teeth per day?
  • Likelihood of social desirability bias:
    face-to-face > telephone > mail

Contamination by others
  • EX) completing a question on satisfaction with the
    family (marital) relationship in the presence
    of a spouse.

Administrative error
  • Data processing error
  • Sample selection error: unlisted telephone
    respondents; stopping respondents during daytime
    hours in a shopping center excludes working women;
    the wrong household member answers the phone; etc.
  • Interviewer error: checking the wrong response, being unable to
    write fast enough to record answers, selective
    perception: taking liberties in interpreting
    questions; specific words may unconsciously be
  • Interviewer cheating (deliberate subversion):
    filling in the answers to certain questions, or skipping
    questions, in order to finish the interview as
    soon as possible.
  • Remedy: mini re-interviews. A percentage of
    respondents will be called upon to verify the data.

What can be done to reduce error
  • Questionnaire design: to reduce response bias
  • Sampling: to control random sampling error
  • Interviewer training
  • Use rule-of-thumb estimates for systematic error:
    based on the results of other studies (areas),
    create benchmark figures or standards of
    comparison.
  • EX) 1/2 of those who say they will definitely buy
    within the next three months actually do make a
    purchase. For durables: 1/3. "Will probably buy"
    a durable: no actual purchase.