IPUMS-International: A Restricted Access Web-Site Providing Anonymized, Integrated Census Microdata for Social Science and Policy Research. Robert McCaa, Steven Ruggles, Matt Sobek (University of Minnesota) and Albert Esteve (Centre d'Estudis Demogràfics)

1
IPUMS-International: A Restricted Access
Web-Site Providing Anonymized, Integrated
Census Microdata for Social Science and Policy
Research
Robert McCaa, Steven Ruggles, Matt Sobek
(University of Minnesota) and Albert Esteve
(Centre d'Estudis Demogràfics)
2
Overview: access, privacy and confidentiality for
integrated census microdata of 40 countries
  1. Goals and accomplishments
  2. Confidentiality and privacy protections: legal,
    administrative, technical
  3. Data cleaning, construction, and integration
  4. Access: custom-tailored extracts (not whole
    datasets); users and uses
  5. Summary: new directions, 6 strengths and an
    aspiration

3
1. Goals and accomplishments
4
Four Goals
  • 1. Inventory the world's census microdata
  • 2. Preserve endangered microdata:
    a. contract preservation with repositories
    b. deposit copies in at least two archives: National
    Statistical Organization and ... WHO
  • 3. Integrate datasets of authorized countries
    using UNSD and other standards
  • 4. Disseminate extracts of the database to approved
    researchers without charge (copy to each NSI)

5
Accomplishments
Minnesota Population Center, University of Minnesota
Principal investigators (historians): Steven Ruggles, Robert McCaa
www.ipums.org/international
1998: First agreement signed
1999: Funding authorized
2002: First data release, 7 countries: China, Colombia, France, Kenya, Mexico, USA, Vietnam
2003: Regional projects: Latin America, Europe
6
Preserve data and documentation
UN Demographic Center for Latin America (CELADE,
Santiago, Chile): 3000 microdata tapes preserved
7
Integration projects: 40 partners (Table 1, August 16, 2003)
World Region: Official Statistical Authority
Africa: Ghana, Kenya, Madagascar
Americas: Argentina, Brazil, Chile, Colombia, Costa Rica, Dominican Republic, Ecuador, El Salvador, Guatemala, Honduras, Mexico, Nicaragua, Panama, Paraguay, Peru, Venezuela, USA
Asia: China, Tajikistan, Turkmenistan, Vietnam
Europe: Austria, Belarus, Bulgaria, Czech Republic, France, Germany, Greece, Hungary, Netherlands, Portugal, Romania, Russia, Slovenia, Spain, the United Kingdom
Middle East: Israel, Palestinian Authority
8
Data Access: first release, May 2002
7 countries, 23 samples, 60 million person records
USA: 1960, 1970, 1980, 1990, 2000
China: 1982
Colombia: 1964, 1973, 1985, 1993
France: 1962, 1968, 1975, 1982, 1990
Kenya: 1989, 1999
Mexico: 1960, 1970, 1990, 2000
Vietnam: 1989, 1999
9
2. Confidentiality and privacy protections
10
Confidentiality and privacy protections: changing
perceptions
  • Growing recognition that anonymized census
    microdata samples do not violate national
    legislation on statistical confidentiality and
    privacy
  • International Monetary Fund's General Data
    Dissemination System: 52 countries with uniform
    standards
  • All enforce strict standards of statistical
    confidentiality
  • All prohibit disclosure of information which may
    identify individuals or entities
  • In 2000, 37 of the 52 countries disseminated
    anonymized census microdata samples

11
Confidentiality protections, IPUMSI: legal,
administrative, and technical
  • Dissemination agreement between the University of
    Minnesota and each National Statistical Institute
  • Uniform 10-point protocol: ownership, use,
    authorization, restrictions, confidentiality,
    security, publication, violations, sharing, and
    arbitration
  • Conditional use license between the University of
    Minnesota and each researcher
  • Permission to use restricted access microdata, 3
    criteria: research need, research competence,
    and agreement to conditions of use
  • Technical data protection measures
  • Specific to each country

12
Confidentiality protections, IPUMSI: technical
  • Technical data protection measures
  • Adopt sample size according to national norms
  • Suppress detailed geography
  • Top- and bottom-code continuous variables
  • Suppress dates (birth, migration, marriage,
    etc.)
  • Swap (recode) place of enumeration for a small
    fraction of households
  • Randomly order households within administrative
    units
  • No semi-automatic procedures (e.g., μ-Argus)
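
Three of the measures listed above (top and bottom coding, swapping place of enumeration, and random ordering of households) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the field names (`hh_id`, `place`, `head_age`), the thresholds, the swap fraction and the toy data are hypothetical, and this is not the procedure actually used by IPUMS-International.

```python
import random

def top_bottom_code(value, low, high):
    """Clamp a continuous value so extremes collapse into the end categories."""
    return max(low, min(high, value))

def swap_places(households, fraction, rng):
    """Exchange place codes between randomly paired households.

    Roughly `fraction` of households take part; the input list is not modified.
    A pair drawn from the same place is left effectively unchanged.
    """
    out = [dict(h) for h in households]
    n_pairs = int(len(out) * fraction / 2)
    chosen = rng.sample(range(len(out)), 2 * n_pairs)
    for a, b in zip(chosen[0::2], chosen[1::2]):
        out[a]["place"], out[b]["place"] = out[b]["place"], out[a]["place"]
    return out

def shuffle_within_places(households, rng):
    """Randomly order households inside each place so file order carries no information."""
    groups = {}
    for h in households:
        groups.setdefault(h["place"], []).append(h)
    ordered = []
    for place in sorted(groups):
        rng.shuffle(groups[place])
        ordered.extend(groups[place])
    return ordered

# Hypothetical toy data: ten households in two places, with the age of the head.
households = [
    {"hh_id": i, "place": "A" if i < 5 else "B", "head_age": 20 + 9 * i}
    for i in range(10)
]
rng = random.Random(0)  # fixed seed only so the sketch is repeatable
protected = shuffle_within_places(swap_places(households, 0.2, rng), rng)
for h in protected:
    h["head_age"] = top_bottom_code(h["head_age"], 15, 90)
```

In this sketch swapping is applied before shuffling, so a swapped household is reordered along with its new place; whether a real pipeline orders the steps this way is not stated in the slides.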

13
Only serious researchers need apply (Table 2)
Institutional affiliations (Europe and Canada):
Cardiff University; Demographic Studies Center, University A. of Barcelona; Department of Statistics, University of Florence; INED Paris; Institut d'études politiques de Paris; Institut français de recherche en Afrique (IFRA); Ministry of Economic Development and Trade; Novosibirsk State Technical University; University College London; Department of Demography, University of Montreal; Queen's University; Simon Fraser University; Statistics Canada, Library and information centre; University of Toronto
Institutional affiliations (Africa, Asia, Latin America):
African Population and Health Research Center; Centro de Investigación y Docencia Económicas; Hong Kong University of Science and Technology; National University of Singapore; The University of Nairobi; The World Bank; Universidad Externado de Colombia; Universidad Pedagógica Experimental Libertador; World Agro-Forestry Centre; World Health Organization
Official oversight boards cited by approved users:
Commission Nationale Information et Liberté; Comité National d'Éthique; Institutional Review Board (research on human subjects); IRD scientific commission (Conseil Scientifique); ISA and its research committees RC28 and RC33; National Committees for Research Ethics, Norway; USA Federal Code Title 13 / Title 26 / Title 5; Vice-décanat à la recherche, l'éthique
Funding agencies:
Canadian Foundation for Innovation; Council for the Development of Social Science Research in Africa; Economic and Social Research Council, UK; National Science Foundation; National Institutes of Health; Norwegian University Development Aid Funding; Rockefeller Foundation; Wellcome Trust
14
3. Data enhancements and integration
15
Data Enhancements
  • Data quality and enhancements: added value
  • Clean data to eliminate duplicate records
  • Conduct internal consistency checks
  • Impute missing, inconsistent values
  • Constructed variables to facilitate analysis
  • Pointer variables for Mothers, Fathers, Spouses
  • Family and household variables
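
As a rough illustration of two of the enhancements above, the following Python sketch drops exact duplicate records and constructs a simple mother pointer within a household. The field names (`pernum`, `relate`, `sex`, `momloc`) and the linking rule are assumptions for this example; the rule is deliberately simplistic (a child of the head is linked to the female head or female spouse) and the actual IPUMS-International procedures are not described in the slides.

```python
def drop_duplicates(records):
    """Keep the first occurrence of each fully identical record."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def add_mother_pointer(household):
    """Return copies of the person records with a 'momloc' pointer added.

    'momloc' holds the person number of the probable mother, or 0 if none.
    Simplified, hypothetical rule: only children of the head are linked.
    """
    mother = next(
        (p["pernum"] for p in household
         if p["sex"] == "F" and p["relate"] in ("head", "spouse")),
        0,
    )
    return [
        dict(p, momloc=mother if p["relate"] == "child" else 0)
        for p in household
    ]

# Hypothetical household with one duplicated person record.
raw = [
    {"pernum": 1, "relate": "head", "sex": "M"},
    {"pernum": 2, "relate": "spouse", "sex": "F"},
    {"pernum": 3, "relate": "child", "sex": "F"},
    {"pernum": 3, "relate": "child", "sex": "F"},
]
linked = add_mother_pointer(drop_duplicates(raw))
```

A pointer of this kind lets an analyst attach a mother's characteristics to each child's record without restructuring the file.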

16
Integration (not standardization)
  • Adopt uniform coding schemes, nomenclatures and
    classifications
  • United Nations Statistics Division (Principles
    and Recommendations)
  • UNESCO (ISCED)
  • International Labor Office (ISCO-88)
  • Composite coding scheme: 2 simple, but seemingly
    contradictory rules (Table 3, next slide)
  • Retain original detail
  • Harmonize each digit

17
Composite coding scheme: Employment Status
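
One way to read the two rules ("retain original detail" and "harmonize each digit") is a multi-digit code whose leading digit is comparable across all samples while later digits keep whatever detail a given census offers. The Python sketch below assumes that reading. The sample names, national codes and composite codes are all invented for illustration and are not the codes from Table 3.

```python
# Hypothetical crosswalk: (sample, national code) -> composite code.
# First digit: category comparable across every sample.
# Second digit: extra detail where the source census provides it (0 = none).
CROSSWALK = {
    ("country_a", "E"): "10",  # employed, no further detail
    ("country_a", "U"): "20",  # unemployed
    ("country_a", "N"): "30",  # inactive, no further detail
    ("country_b", "1"): "11",  # employed, at work
    ("country_b", "2"): "12",  # employed, temporarily absent
    ("country_b", "3"): "20",  # unemployed
    ("country_b", "4"): "31",  # inactive, student
    ("country_b", "5"): "32",  # inactive, retired
}

def composite(sample, national_code):
    """Translate a national code into the composite scheme."""
    return CROSSWALK[(sample, national_code)]

def general(composite_code, digits=1):
    """Truncate to the leading digits to get the cross-sample comparable category."""
    return composite_code[:digits]

# Detail is retained for country_b, yet both samples agree at the first digit.
detailed = composite("country_b", "2")
comparable = general(detailed) == general(composite("country_a", "E"))
```

Under this reading the rules stop being contradictory: an analyst comparing countries truncates to the first digit, while a single-country study uses the full code.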
18
Integration Work Plan
  • Assemble microdata and documentation (MPC, NSI)
  • Develop samples to minimize confidentiality risk
    and maximize robustness (MPC or NSI partner)
  • Design national integration plan (NSI,
    consultants): census-by-census,
    concept-by-concept, code-by-code
  • Write integrated documentation (MPC, partners)
  • Program integration (MPC)

19
Standard: UN/Eurostat Principles and Recs...
Census documentation compiled for Colombian
microdata
Photos from Colombia integration project,
February-March, 2000: 4 experts from DANE (census
office), 7 academics (3 universities)
20
4. Access
21
Data Access: web-based extraction system
  • Password protected to make and retrieve
    extracts
  • Researcher selects:
  • countries,
  • censuses,
  • cases/sub-populations,
  • variables, and
  • sample densities
  • Extract engine queues request, generates extract
  • Researcher retrieves extract via web
  • NO CDs, original codes, or complete datasets
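
The request-queue-retrieve flow described above can be sketched in a few lines of Python. This is an illustrative model only: the class names, the `keep_every` parameter standing in for sample density, the approved-user check standing in for password protection, and the toy records are all assumptions, not the actual IPUMS-International extract engine.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class ExtractRequest:
    user: str
    samples: list                # (country, census year) pairs
    variables: list              # variable names to include
    case_filter: dict = field(default_factory=dict)  # sub-population selection
    keep_every: int = 1          # sample density: keep every n-th selected case

class ExtractEngine:
    def __init__(self, database, approved_users):
        self.database = database
        self.approved_users = set(approved_users)
        self.pending = deque()   # queued requests, first in, first out
        self.ready = {}          # user -> list of finished extracts

    def submit(self, request):
        if request.user not in self.approved_users:
            raise PermissionError("user is not an approved researcher")
        self.pending.append(request)

    def process_next(self):
        request = self.pending.popleft()
        rows = []
        for sample in request.samples:
            cases = [r for r in self.database.get(sample, [])
                     if all(r.get(k) == v for k, v in request.case_filter.items())]
            # Only the requested variables leave the database.
            rows += [{name: r[name] for name in request.variables}
                     for r in cases[::request.keep_every]]
        self.ready.setdefault(request.user, []).append(rows)

    def retrieve(self, user):
        return self.ready.pop(user, [])

# Invented toy records; only the sample label mirrors the released list.
database = {("Kenya", 1989): [
    {"age": 30, "sex": "F", "district": "d1"},
    {"age": 41, "sex": "M", "district": "d1"},
    {"age": 25, "sex": "F", "district": "d2"},
    {"age": 60, "sex": "F", "district": "d2"},
]}
engine = ExtractEngine(database, approved_users={"researcher_1"})
engine.submit(ExtractRequest("researcher_1", [("Kenya", 1989)],
                             ["age", "sex"], {"sex": "F"}, keep_every=2))
engine.process_next()
extracts = engine.retrieve("researcher_1")
```

Because the engine returns only the selected variables and cases, a researcher never receives a complete dataset, which matches the last bullet above.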

22
5. Regional initiatives and summary
23
IPUMS-Latin America, 2003-2007: 16 countries,
500m. people
  • Scope: Latin American census microdata,
    1960-present
  • Work Plan:
  • 2001-2: Sign licensing agreements with official
    agencies
  • 2002-3: Obtain funding from U.S. NIH
  • 2004: Develop/translate microdata and metadata
  • Country expert teams design national
    integrations
  • 2005: MPC/expert teams design regional
    integration
  • 2006: MPC integrates microdata and metadata
  • 2007: MPC disseminates to bona fide researchers
    who show need and agree to conditions of use.

24
ICM-Europe: 14 national teams
25
Summary: 6 strengths and an aspiration
  • Uniform legal authorization
  • Access restricted to scientists with need
  • Experienced integration teams
  • Proven web-based distribution system
  • High user satisfaction
  • Sustainability: MPC, ICPSR, WHO
  • Aspiration: 90 countries, 90% of the world's
    population by 2010

26
Additional information at http://www.ipums.org/international
Contact: Robert McCaa, rmccaa_at_umn.edu