Statistical Issues in Census Taking - PowerPoint PPT Presentation

Slides: 33
Provided by: barbara177

Transcript and Presenter's Notes


1
Statistical Issues in Census Taking  
  • Introduction
  •   Census taking had its beginning in ancient
    times in Babylonia, China, Egypt, Palestine, and
    Rome.
  • The word "census" comes from the Latin word
    "censere," meaning "to tax" or "to
    value."

2
  • We need current and accurate information to
    run our democracy. Our founding fathers
    recognized this when they made the census
    an integral part of the American Constitution.
  • Today, census taking is an important
    statistical function of government in all
    countries. This function has steadily grown
    more complex as the needs for data have
    increased.

3
  • The U.S. Bureau of the Census and other statistical
    agencies contributed significantly to the
    development of statistical theory and methods,
    especially in probability sampling and
    statistical analysis.
  • Sampling was introduced into the 1950 census
    for collecting socioeconomic data and for
    evaluating the quality of census data.
  • Sampling may play a bigger role in future
    census taking.

4
  • Methods of census in history of statistics
  • 1. Enumeration Method (U.S. Constitution, Article
    1, Section 2)
  • "Representatives and direct Taxes shall be
    apportioned among the several States which may be
    included within this Union, according to their
    respective Numbers, which shall be determined by
    adding to the whole Number of free Persons,
    including those bound to Service for a Term of
    Years, and excluding Indians not taxed,
    three-fifths of all other Persons. The actual
    Enumeration shall be made within three Years
    after the first Meeting of the Congress of the
    United States, and within every subsequent Term
    of ten Years, in such Manner as they shall by Law
    direct."
  • The Constitution was adopted in 1787 and the
    first census was taken in 1790.

5
  • 2. Estimation Method - proposed by Laplace in 1786 and
    employed in 1802 as a method of estimating
    population (Stigler, History of Statistics, pp.
    163-164)
  • Take a census in a few carefully selected
    communities and determine the ratio of population
    to births.
  • Multiply this ratio by the number of births in
    France in the past year, taken from the birth registers
    (which were considered to be quite accurate), to
    estimate the total population.
  • This is a forerunner of modern "ratio estimation"
    methods.
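
Laplace's procedure can be sketched in a few lines of Python. The function name and all figures below are hypothetical and chosen only to make the arithmetic easy to follow; they are not Laplace's data.

```python
def ratio_estimate(sample_population, sample_births, national_births):
    """Estimate a national population from a population-to-birth ratio.

    The ratio is measured in a few enumerated communities and then
    applied to the national count of registered births.
    """
    if sample_births <= 0:
        raise ValueError("sample_births must be positive")
    ratio = sample_population / sample_births  # persons per birth in the sample
    return ratio * national_births


# Hypothetical inputs: 280,000 people and 10,000 births in the sampled
# communities, and 1,000,000 registered births nationwide.
estimate = ratio_estimate(280_000, 10_000, 1_000_000)
print(f"Estimated population: {estimate:,.0f}")
```

The estimate is only as good as the assumption that the sampled communities have the same population-to-birth ratio as the country as a whole.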

6
  • Uses of census data
  • 1.  Reapportionment of political representation
  • Only a simple head count is needed. As a
    result of the 1990 census, 8 states gained and
    13 states lost at least one congressional seat.
  • 2. Statistical uses (information about the
    state)
  • James Madison urged the first Congress to
    collect additional information about the people
    in the first census as a guide to future
    legislation. He proposed that white people be
    classified by gender and white males by age, and
    that a count be made of people employed in each
    occupation.

7
  • The first Census Act of 1790 specified
    collection of data on the name of the head of
    the family and the number of persons in each
    household of the following description: free
    white males 16 years old and upward; free white
    males under 16 years; free white females; all
    other free persons; and slaves. Madison's
    suggestion relating to occupational
    information was deleted and did not appear
    again until 1820.

8
  • Public health agencies rely on census
    statistics for their planning and evaluation
    activities. Economic and social policy analysis
    utilizes census information. Researchers in
    demography and epidemiology also make use of
    census data. Businesses make extensive use of
    census data for formulating their marketing
    strategies and organizing their future plans.

9
  • Problems with the past Censuses
  • 1. Costs have escalated dramatically.
  • The 1990 census cost $2.6 billion, which
    represents an increase of 65% (in constant 1990
    dollars) over the $1.67 billion in 1980. GAO
    projected $4.8 billion for the 2000 census.

10
  • 2. Accuracy not improved.
  • Coverage of the 1990 census got worse: the net
    undercount was estimated as
  • 2.1% (5.3 million persons) by the Post
    Enumeration Survey (PES), and 1.8% (4.7
    million persons) by demographic analysis.

11
  • The 4.4 percentage-point difference in the 1990
    undercount between Blacks and non-Blacks was the
    highest since the Bureau began estimating coverage
    in 1940.
  • The 1990 census contained 14.1 million gross
    errors, including 9.7 million persons missed
    and 4.4 million persons double-counted.
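
A short Python check shows how the gross and net figures relate. It assumes that double counts are the only erroneous inclusions offsetting the omissions, which the slides do not state explicitly.

```python
# Figures quoted on the slides, in persons.
missed = 9_700_000
double_counted = 4_400_000

# Gross error adds both kinds of mistake; net undercount lets them offset.
gross_errors = missed + double_counted
net_undercount = missed - double_counted

print(f"Gross errors:   {gross_errors:,}")
print(f"Net undercount: {net_undercount:,}")
```

Under that assumption the net figure equals the 5.3 million net undercount attributed to the PES, while the gross figure shows how much error a net rate can hide.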

12
  • 3. Undercounting adjustment not made
  •   Several unsuccessful lawsuits were filed to
    compel the Bureau to adjust the 1980 census.
    During the 1980s, the Bureau carried out a
    research program to improve its undercount
    estimation and adjustment procedures and
    designed a PES for 1990 in preparation to
    adjust the 1990 census.

13
  • Public images of statistics
  • 1. Lies, damned lies, and statistics (Mark
    Twain)
  • 2. Statistics may be faked
  • "In no event may sampling or other statistical
    procedures be used in determining the total
    population by states." (HR3589, sponsored by
    Congressman Thomas Petri, R-Wisconsin)
  •   "Sampling isn't counting votes could be
    statistically manufactured by bureaucrats"
    (Wall Street Journal, editorial, May 22, 1997)

14
  • Preparing for the future censuses
  •  1. Statistical estimation will be an important
    part of census methodology
  • The Bureau tested this statistical
    methodology extensively during the 1980s
    (Mississippi in 1986, Los Angeles in 1986,
    North Dakota in 1987, Missouri and Washington
    in 1988, and an extended PES in 1990).
  • The Bureau worked with external advisory
    groups (National Academy of Sciences panels,
    ASA Blue Ribbon panel).

15
  • 2. Fundamental reform in the making
  • Scrapping the long form in 2010 and replacing it
    with a monthly household survey designed to provide
    annually updated data (American Community
    Survey).
  • Modifying the census to respond to social
    changes in the living arrangements of Americans,
    reexamining the de jure method of enumeration,
    and changing some of its residency rules.

16
 
Application of Sampling Methodology in Census
  • Post Enumeration Survey Design - 1990
  • 1. Sample design - single-stage, stratified cluster
    sampling
  • Sampling units were block clusters (a block or a
    collection of blocks).
  • 5,290 block clusters were taken, which contained
    about 380,000 people in 166,000 households.

17
  • Results of PES
  • Occupied housing units: 143,818
  • Interviews with household member: 134,808 (93.7%)
  • Proxy interviews: 6,745 (4.7%)
  • Non-interviews: 2,265 (1.6%)
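
The percentages in the PES results are shares of the occupied housing units. The following Python snippet recomputes them from the counts; the dictionary keys are labels chosen here for illustration.

```python
# Counts taken from the PES results listed above.
occupied_units = 143_818
outcomes = {
    "household member interview": 134_808,
    "proxy interview": 6_745,
    "non-interview": 2_265,
}

# The three outcomes should account for every occupied unit.
assert sum(outcomes.values()) == occupied_units

# Express each outcome as a percentage of occupied units, to one decimal.
shares = {
    name: round(100 * count / occupied_units, 1)
    for name, count in outcomes.items()
}
for name, share in shares.items():
    print(f"{name}: {share}%")
```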

18
  • Two samples are designated
  • P-sample - the persons found by the PES in the
    selected clusters
  • E-sample - the persons found by the census
    in the selected clusters

19
  • 2. Post-stratification
  • Based on geography, race, place, housing
    tenure, age, and sex; resulted in 1,392
    post-strata
  • 3. Estimation within each post-stratum
  • Estimate the number of erroneous census
    enumerations
  • Estimate the total population by the
    dual-system estimator, matching the P-sample
    and E-sample (see Chapter 14, Section 6, page
    443)

20
  • Data preparation
  • 1. Determine the matching status in the P-sample
  • Matched with persons in the E-sample (computer
    match and manual match)
  • Non-match (not enumerated in the census)
  • Unresolved cases:
  • Non-Hispanic white 1.6%
  • Black 2.5%
  • Hispanic 2.5%
  • Asian 2.0%

21
  • 2. Determine erroneous enumerations (follow-up of
    unmatched cases)
  • Types of error: duplicates, fictitious
    enumerations, children born after Census Day,
    people who died before Census Day, people counted
    in the wrong location, insufficient
    information.
  • Non-error (missed by the PES)
  • Unresolved cases:
  • Non-Hispanic white 0.7%
  • Black 2.1%
  • Hispanic 1.8%
  • Asian 1.3%

22
  • Estimation procedure
  •   Allocate the P and E sample data to
    post-strata
  •   Imputation of unresolved cases by using a
    logistic regression model and weight
    adjustment.

23
  • a. Unresolved cases in P-sample
  •   Calculate predicted probability of resolution
  • The sample weights of resolved cases are
    inflated
  • by multiplying
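
The multiplier itself did not survive in this transcript. A common choice, assumed here and not confirmed by the slides, is the reciprocal of the predicted probability of resolution, so that resolved cases also stand in for similar unresolved ones. The function name, field names, and probabilities below are hypothetical; in practice the probabilities would come from the fitted logistic regression.

```python
def inflate_resolved_weights(cases):
    """Return adjusted weights: weight / p_resolved for resolved cases.

    Unresolved cases get weight 0 because the resolved cases now
    represent them. This inverse-probability rule is an assumption.
    """
    adjusted = []
    for case in cases:
        if not case["resolved"]:
            adjusted.append(0.0)
            continue
        p = case["p_resolved"]
        if not 0 < p <= 1:
            raise ValueError("p_resolved must be in (0, 1]")
        adjusted.append(case["weight"] / p)
    return adjusted


# Hypothetical cases with made-up predicted probabilities.
cases = [
    {"weight": 100.0, "resolved": True, "p_resolved": 0.8},
    {"weight": 100.0, "resolved": True, "p_resolved": 0.5},
    {"weight": 100.0, "resolved": False, "p_resolved": 0.5},
]
adjusted = inflate_resolved_weights(cases)
print(adjusted)
```

Cases that were harder to resolve (lower predicted probability) receive the larger inflation.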

24
  • b. Unresolved cases in E-sample
  • A similar adjustment is made as above.
  • Estimation (dual-system estimator)
  • - weighted E-sample total
  • - weighted total of erroneous
    enumerations
  • - weighted P-sample total
  • - weighted total of P-sample
    matches

25
  • Estimated total population -
  •    Net undercounting -
  •   Adjustment factor -
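
The formulas on these slides were lost in extraction. As a rough guide, the sketch below uses the standard capture-recapture form of the dual-system estimator, built from the four weighted totals named on the previous slide. The function names and numbers are hypothetical, and the Bureau's production estimator includes further terms (for example, whole-person imputations) that are omitted here.

```python
def dual_system_estimate(e_total, erroneous, p_total, p_matches):
    """Correct census enumerations divided by the P-sample match rate."""
    if p_total <= 0 or p_matches <= 0:
        raise ValueError("P-sample totals must be positive")
    correct_enumerations = e_total - erroneous
    match_rate = p_matches / p_total  # share of PES persons found in the census
    return correct_enumerations / match_rate


def net_undercount_rate(estimate, census_count):
    """Net undercount as a fraction of the estimated true population."""
    return (estimate - census_count) / estimate


def adjustment_factor(estimate, census_count):
    """Multiplier that scales the census count up (or down) to the estimate."""
    return estimate / census_count


# Hypothetical weighted totals for one post-stratum.
census_count = 10_000  # taken equal to the weighted E-sample total here
estimate = dual_system_estimate(
    e_total=10_000, erroneous=400, p_total=9_800, p_matches=9_310
)
print(f"Estimated population: {estimate:,.1f}")
print(f"Net undercount rate:  {net_undercount_rate(estimate, census_count):.2%}")
print(f"Adjustment factor:    {adjustment_factor(estimate, census_count):.4f}")
```

With these made-up inputs the match rate is 95%, so 9,600 correct enumerations imply a population a little over 10,100.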

26
  • Adjustment and estimation procedure
  •  1. Smoothing of adjustment factors in 1,392
    post-strata by Bayesian regression approach
  •   The "pre-smoothed" variance based on a
    Poisson distribution was used as a basis for
    smoothing.
  •   Much controversy attended the use of such
    models in the court debate.
  • True uncertainty was difficult to assess
  • Analytical solution vs. alternative design
    (the smallest post-stratum had only 8 persons).
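
The general idea of smoothing can be illustrated with a simple shrinkage rule: each raw adjustment factor is pulled toward a regression prediction, more strongly when its sampling variance is large. This is a generic empirical-Bayes-style sketch under assumed inputs, not the Bureau's actual model; the function name and all numbers are hypothetical.

```python
def smooth_factor(raw_factor, sampling_variance, predicted_factor, model_variance):
    """Shrink a raw adjustment factor toward a regression prediction.

    The weight on the raw factor is model_variance divided by
    (model_variance + sampling_variance), so noisy post-strata are
    shrunk the most.
    """
    if sampling_variance < 0 or model_variance < 0:
        raise ValueError("variances must be non-negative")
    total = sampling_variance + model_variance
    if total == 0:
        return raw_factor
    weight_on_raw = model_variance / total
    return weight_on_raw * raw_factor + (1 - weight_on_raw) * predicted_factor


# Hypothetical small post-stratum: noisy raw factor, so heavy shrinkage.
noisy = smooth_factor(1.10, sampling_variance=0.0009,
                      predicted_factor=1.02, model_variance=0.0001)
# Hypothetical large post-stratum: no sampling noise, so no shrinkage.
precise = smooth_factor(1.10, sampling_variance=0.0,
                        predicted_factor=1.02, model_variance=0.0001)
print(round(noisy, 4), round(precise, 4))
```

The result depends directly on the variance inputs, which is why the choice of the "pre-smoothed" variance drew so much scrutiny.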

27
  • 2. Synthetic estimation
  • Census counts in all areas were adjusted within
    post-strata
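
Synthetic estimation can be sketched as follows: every area applies the same post-stratum adjustment factors to its own census counts, whatever its local conditions. The function name, post-stratum labels, and numbers are hypothetical.

```python
def synthetic_estimate(area_counts, factors):
    """Adjusted count for one area.

    area_counts maps post-stratum -> census count within the area;
    factors maps post-stratum -> adjustment factor estimated for the
    post-stratum as a whole.
    """
    return sum(count * factors[stratum] for stratum, count in area_counts.items())


# Hypothetical adjustment factors for two post-strata.
factors = {"A": 1.02, "B": 1.05}
# Hypothetical census counts for one small area, split by post-stratum.
area_counts = {"A": 1_000, "B": 200}

adjusted = synthetic_estimate(area_counts, factors)
print(f"Census count: {sum(area_counts.values()):,}  Adjusted: {adjusted:,.0f}")
```

The method assumes the undercount rate is the same everywhere within a post-stratum, which is why heterogeneous post-strata are a concern.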

28
  • Criticisms of methods
  • 1. The smoothing model unjustified
  • 2. The post-strata were possibly too
    heterogeneous, e.g., 12 age-sex groups: 0-9,
    10-19, 20-29, 30-44, 45-64, 65+
  • Later reduced to 7 age-sex groups: 0-17 (both
    sexes), 18-29 (male, female), 30-49 (male,
    female), 50+ (male, female) - the revision
    produced 357 post-strata.

29
  • 3. Possible biases in estimation (bias in the
    address file used on Census Day, matching error,
    coding error, whole-person imputation, error in
    computer editing).
  • Some modifications have been introduced: new
    manual matching, modified matching rules,
    and a corrected computer editing routine.

30
References
  • General historical
  • 1. Desrosières, A. (translation by Naish, C.)
    (1998), The Politics of Large Numbers: A History
    of Statistical Reasoning, Cambridge, Mass.:
    Harvard University Press.
  • 2. Stigler, S.M. (1986), The History of
    Statistics: The Measurement of Uncertainty before
    1900, Cambridge, Mass.: Harvard University Press.

31
  • Non-technical
  • 1. Edmonston, B., and Schultze, C. (1994),
    Modernizing the U.S. Census, Washington, DC:
    National Academy Press.
  • 2. Steffey, D.L., and Bradburn, N.M. (1994),
    Counting People in the Information Age,
    Washington, DC: National Academy Press.
  • 3. Duncan, J.W., and Shelton, W.C. (1992), "U.S.
    Government Contribution to Probability Sampling
    and Statistical Analysis," Statistical Science
    7(3): 320-338.

32
  • Technical
  • 1. Mulry, M.H., and Spencer, B.D. (1991), "Total
    Error in PES Estimates of Population," J of
    American Statistical Association 86(416): 839-863.
  • 2. Hogan, H. (1993), "The 1990 Post-Enumeration
    Survey: Operations and Results," J of American
    Statistical Association 88(423): 1047-1060.
  • 3. Breiman, L. (1994), "The 1991 Census
    Adjustment: Undercount or Bad Data?" Statistical
    Science 9(4): 458-537.