1
National Center for Health Statistics Record
Linkage Program
  • Christine S. Cox,
  • Chief, Special Projects Branch (SPB)
  • Office of Analysis & Epidemiology (OAE)
  • NCHS Data Users Conference
  • August 12, 2008
  • U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
  • Centers for Disease Control and Prevention
  • National Center for Health Statistics


2
Overview
  • NCHS Record Linkage Program
  • Analytic Issues and Tools
  • Comparative Analysis of Public vs Restricted
    Linked Mortality Files
  • Accessing the Restricted-use Linked Data

3
NCHS Record Linkage Program
  • Links survey data with data collected from
    administrative records
  • Designed to maximize the scientific value of the
    NCHS population-based surveys
  • Examine factors that influence chronic disease,
    disability, health care utilization, morbidity,
    and mortality

4
Why Do Linkage?
  • Augments available information for major
    diseases, risk factors, and health service
    utilization
  • Links exposures to outcomes
  • Provides longitudinal component to survey data
  • Reduces cost burden
  • Re-contacting survey respondents for follow-up
    information can be expensive
  • Increases accuracy and detail of data collected

5
How Records are Linked
6
Research Potential of NCHS Linked Data
  • Aging
  • Risk factors for poor health outcomes (hip
    fractures, stroke, etc.)
  • Disability
  • Effects of chronic illness and obesity on
    disability and mortality
  • Disparities
  • Mortality patterns by race/ethnicity or
    socioeconomic status
  • Health services
  • Functional impairment and health care costs
  • Methodologic Studies
  • Validation of self-reports vs. administrative
    records
  • Genetics
  • Genetic variants and health outcomes

7
Record Linkage Activities
  • Mortality
  • National Death Index
  • Social Security Retirement and Disability
  • Data from the Retirement, Survivors, Disability
    Insurance (RSDI) and Supplemental Security Income
    (SSI) programs
  • Medicare enrollment and payments
  • Enrollment and claims data

8
NCHS Linked Mortality Data Files
  • Children included

9
Number of Deaths by Survey
  • NHIS and LSOA II have mortality follow-up through
    12/31/2002.
  • NHEFS, NHANES II and III have mortality follow-up
    through 12/31/2000.

10
Public-use Linked Mortality Files
  • In 2007, released public-use files with a limited
    amount of perturbed data and reduced number of
    mortality variables
  • NHIS 1986-2000
  • NHANES III
  • LSOA II
  • Study comparing analyses from public-use and
    restricted-use linked mortality files
    demonstrated similar results
  • Lochner et al. Am. J. Epidemiol. 2008;168:336-344

11
Mortality Data Elements
  • Vital status
  • Date of death or follow-up time
  • Underlying cause of death
  • Multiple cause of death
  • Age at death
  • Age last presumed alive
  • only available on restricted-use files

12
Research Potential of Linked Mortality Data
  • Excess Deaths Associated with Underweight,
    Overweight, and Obesity. KM Flegal, BI Graubard,
    DF Williamson, MH Gail. JAMA, 2005;293:1861-1867.
  • Living and Dying in the USA: Behavioral, Health,
    and Social Differentials of Adult Mortality. RG
    Rogers, CB Nam, RA Hummer. 2000.
  • Suicide among male veterans: a prospective
    population-based study. MS Kaplan, N Huguet, BH
    McFarland, JT Newsom. J Epidemiol Community
    Health, 2007;61:619-624.

13
NCHS Linked Medicare Data Files
14
Medicare Linkage
  • Medicare enrollment and claims data for the years
    1991-2000
  • Denominator file
  • MEDPAR Inpatient hospitalization
  • MEDPAR Skilled nursing facility (SNF)
  • Hospital outpatient
  • Home Health Agency (HHA)
  • Hospice
  • Carrier (physician/supplier Part B file)
  • Durable Medical Equipment (DMERC)
  • Next data release (1999-2007)
  • All of the above files
  • Chronic Conditions Warehouse
  • Medicare Part D (Prescription Drugs)

15
Summary Medicare Data File
  • Summary Medicare Enrollment and Claims Files
    (SMEC) for 1991-2000
  • Enrollment information from the Denominator file
    plus summary variables of claims and payments
  • Variables modeled after MCBS cost and use files
  • Total reimbursements per year
  • Total number of claims by Medicare record type
  • Summary of charges by Medicare record type
  • Termination status and reason for termination
  • Monthly HMO enrollment
  • Medicare status code (i.e. Part A, B or both)

16
Research Potential of Linked Medicare Data
  • Examine risk factors for health conditions
  • Examine reliability of survey data
  • Compare survey reported Medicare enrollment to
    Medicare claims records
  • Examine survey report of disability with program
    participation eligibility criteria
  • Examine disparities in Medicare service
    utilization

17
NCHS Linked SSA Data Files
18
Social Security Linkage
  • Old Age, Survivor, Disability Income
  • Master Beneficiary Record (MBR), 1962 - 2003
  • Program eligibility, benefit amount, payment
    status, dual entitlement
  • Payment History Update System (PHUS), 1984-2003
  • Benefit payment amounts, including withholding
    information for Medicare Part B premiums
  • Supplemental Security Income
  • Supplemental Security Record (SSR), 1974 - 2003
  • Program eligibility, benefit information, and
    payment status

19
Research Potential of Linked Social Security Data
  • Examine reliability of survey information for SSA
    program participation and benefits
  • Compare the health characteristics of early
    retirees (age 62) to those who postpone benefits
  • Policy analysis using validated survey data
  • Predicting the number of people who will become
    disabled based upon survey reported health
    conditions
  • Determining whether current disability
    entitlement funding levels will be adequate as
    the population ages

20
Future Linkage Activities
  • Linkage of 1999-2004 Medicaid enrollment and
    claims data to 1999-2004 NHIS and NHANES
  • NCHS series report comparing the mortality
    experience of the 1986-2000 National Health
    Interview Survey Participants with the U.S.
    population

21
Overview
  • NCHS Record Linkage Program
  • Analytic Issues and Tools
  • Comparative Analysis of Public vs Restricted
    Linked Mortality Files
  • Accessing the Restricted-use Linked Data

22
National Center for Health Statistics Record
Linkage Program: Analytic Issues and Tools
  • Kimberly A. Lochner, SPB, OAE
  • NCHS Data Users Conference
  • August 12, 2008
  • U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
  • Centers for Disease Control and Prevention
  • National Center for Health Statistics

23
Analytic Issues Overview
  • Linkage eligibility
  • Linkage match status
  • Combining survey years for the linked mortality
    files
  • Changes in surveys or administrative data over
    time
  • Issues with administrative data

24
Mortality Analytic Issues
  • Eligibility status
  • Sample weights
  • Combining survey years for the linked mortality
    files
  • Variance estimation
  • Changes over time
  • ICD-9 and ICD-10 codes
  • Most of these issues apply only to the NHIS
    Linked Mortality Files

26
Eligibility Status
  • What determines eligibility for mortality
    follow-up?
  • Age
  • Non-adult survey respondents are INELIGIBLE
  • Future linkages will include children
  • Sufficient data for matching
  • Lack of identifying data makes a respondent INELIGIBLE
  • Drop INELIGIBLE survey respondents
  • Variable indicating eligibility status on files

27
Mortality Ineligibility: Lack of Matching Data
(adults only)
28
Eligibility Status
  • Ineligibility a problem for NHIS
  • Created new sample weights to account for
    ineligibility due to insufficient identifying
    data
  • Original NHIS sample weights (WTFA)
  • New NHIS sample weights (WGT_NEW)
  • Only for core/person files
  • Recommend using WGT_NEW
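
The eligibility and weighting advice above can be sketched in a few lines. This is a minimal illustration, not NCHS code: the records are made up, the flag name ELIGIBLE and the field name dead are hypothetical, and only the weight names WTFA and WGT_NEW come from the slide.

```python
# Made-up respondent records. ELIGIBLE and dead are hypothetical field names;
# WTFA and WGT_NEW are the weight names mentioned on the slide.
respondents = [
    {"id": 1, "ELIGIBLE": True,  "WTFA": 1200.0, "WGT_NEW": 1350.0, "dead": False},
    {"id": 2, "ELIGIBLE": False, "WTFA": 1100.0, "WGT_NEW": None,   "dead": None},
    {"id": 3, "ELIGIBLE": True,  "WTFA": 900.0,  "WGT_NEW": 1010.0, "dead": True},
    {"id": 4, "ELIGIBLE": True,  "WTFA": 1500.0, "WGT_NEW": 1640.0, "dead": False},
]


def weighted_death_proportion(records, weight_name="WGT_NEW"):
    """Weighted share of deaths among respondents eligible for follow-up."""
    # Drop INELIGIBLE respondents before any mortality analysis.
    eligible = [r for r in records if r["ELIGIBLE"]]
    total_weight = sum(r[weight_name] for r in eligible)
    death_weight = sum(r[weight_name] for r in eligible if r["dead"])
    return death_weight / total_weight


proportion = weighted_death_proportion(respondents)
print(f"Weighted proportion deceased: {proportion:.4f}")
```

This gives a point estimate only. Standard errors still need the survey design variables and software that handles complex sample designs.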

29
Combining Survey Years
  • NHIS linked mortality files cover two design
    periods (1986-1994 and 1995-2000)
  • Follow guidelines on pooling NHIS years
  • http://www.cdc.gov/nchs/nhis/methods.htm
  • Created new stratum and psu variables for NHIS
    Linked Mortality files to allow combining across
    NHIS design years
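
One step commonly described in guidance on pooling survey years is to divide each sample weight by the number of years pooled, so that weighted totals approximate an average annual population rather than a multi-year sum. The sketch below assumes that convention. The records and the field names year, weight, and pooled_weight are hypothetical, and the NHIS pooling guidelines remain the authority on what applies to a given set of years.

```python
from collections import defaultdict

# Made-up person records from three pooled survey years.
records = [
    {"year": 1997, "weight": 1000.0},
    {"year": 1997, "weight": 1500.0},
    {"year": 1998, "weight": 1200.0},
    {"year": 1998, "weight": 1400.0},
    {"year": 1999, "weight": 2600.0},
]


def pool_weights(rows, weight_name="weight"):
    """Return copies of rows with the weight divided by the number of pooled years."""
    n_years = len({r["year"] for r in rows})
    return [dict(r, pooled_weight=r[weight_name] / n_years) for r in rows]


pooled = pool_weights(records)

# Check: the pooled total equals the mean of the annual weighted totals.
annual_totals = defaultdict(float)
for r in records:
    annual_totals[r["year"]] += r["weight"]
mean_annual_total = sum(annual_totals.values()) / len(annual_totals)
pooled_total = sum(r["pooled_weight"] for r in pooled)
print(mean_annual_total, pooled_total)
```

Weight adjustment does not replace the combined stratum and PSU variables, which are still needed for variance estimation across design periods.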

30
Changes in Data Over Time
  • ICD-9 (deaths 1979-1998) and ICD-10 (deaths
    1999 to present) cover linked mortality files
  • Use both sets of codes to obtain full counts of
    cause-specific deaths
  • Individual codes (ICD_9REV, ICD_10REV)
  • Recodes
  • UCOD_282 (ICD-9)
  • UCOD_72 (ICD-9)
  • UCOD_34 (ICD-9)
  • UCOD_358 (ICD-10)
  • UCOD_113 - recodes deaths before 1999 using
    ICD-10 guidelines
  • Refer to vital statistics report on ICD
    comparability
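
As a rough sketch of why both code sets are needed, the example below counts diabetes deaths across the ICD-9 and ICD-10 eras. The death records are made up, and storing codes as undotted strings in fields named ICD_9REV and ICD_10REV is an assumption about layout, not a description of the actual files. The code ranges used (ICD-9 250, ICD-10 E10-E14) are the standard diabetes mellitus categories.

```python
ICD10_DIABETES = {"E10", "E11", "E12", "E13", "E14"}


def is_diabetes_death(record):
    """True if the underlying cause is diabetes under either ICD revision."""
    icd9 = record.get("ICD_9REV")
    icd10 = record.get("ICD_10REV")
    if icd9:   # death coded under ICD-9 (1979-1998)
        return icd9.startswith("250")
    if icd10:  # death coded under ICD-10 (1999 onward)
        return icd10[:3] in ICD10_DIABETES
    return False


# Made-up records; exactly one revision is populated per death.
deaths = [
    {"year": 1995, "ICD_9REV": "2500", "ICD_10REV": None},
    {"year": 1997, "ICD_9REV": "4109", "ICD_10REV": None},
    {"year": 2000, "ICD_9REV": None, "ICD_10REV": "E119"},
    {"year": 2001, "ICD_9REV": None, "ICD_10REV": "I219"},
    {"year": 2002, "ICD_9REV": None, "ICD_10REV": "E149"},
]

diabetes_deaths = sum(is_diabetes_death(d) for d in deaths)
icd10_only = sum(
    1 for d in deaths if d["ICD_10REV"] and d["ICD_10REV"][:3] in ICD10_DIABETES
)
print(diabetes_deaths, icd10_only)
```

Checking only one revision would miss the deaths coded under the other. Counts for the same cause can still shift between revisions, which is what the comparability report addresses.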

31
Medicare Analytic Issues
  • Eligibility status
  • Eligible but not matched
  • Death
  • Linked but no Medicare data
  • Managed care enrollment
  • Non-covered services
  • Gaps in coverage
  • Issues with Medicare data files
  • See the NCHS-CMS linkage web page under
    Analytic/Programming Support

32
Medicare Ineligible Population and Linkage Rates
(65+ years)
33
Ineligibles and Non-Matches
  • Must be excluded from your sample
  • Identify using the variable (CMS_MATCH) on the
    Feasibility Study Data files

34
Identifying Deaths
  • Survey participants interviewed before the
    availability of linked Medicare files could have
    died before 1991
  • E.g. NHEFS, NHANES II or NHANES III respondents
    interviewed in Phase I (1988-91)
  • Persons may die during study period and cease to
    have Medicare records
  • Enrolled in Medicare in 1991 but died before 2000

35
Identifying Deaths
  • Survey respondents who died before 1991 (e.g.
    from NHANES) can be identified by merging
    mortality information from the Linked Mortality
    files
  • Needed to create analytic sample
  • Persons who died during 1991-2000 should no
    longer have Medicare records after date of death
  • Look for a CMS date of death (DOD) on each of the
    Denominator or SMEC files (1991 to 2000)
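
The sample-building steps described on the last three slides might look like the sketch below. It is illustrative only: the respondent IDs and death years are made up, coding CMS_MATCH as 1 for a successful link is an assumption, and the death-year lookup stands in for information merged from the Linked Mortality files.

```python
MEDICARE_START_YEAR = 1991  # first year covered by the linked Medicare files

# Respondent ID -> CMS_MATCH value (assumed coding: 1 = linked, 0 = not linked).
cms_match = {101: 1, 102: 1, 103: 0, 104: 1}

# Respondent ID -> year of death from the linked mortality data (None = alive).
death_year_by_id = {101: None, 102: 1989, 103: None, 104: 1996}


def build_analytic_sample(match_status, death_years):
    """Keep linked respondents who were alive when Medicare coverage begins."""
    sample = []
    for respondent_id, matched in match_status.items():
        if matched != 1:
            continue  # ineligible or non-matched: exclude
        death_year = death_years.get(respondent_id)
        if death_year is not None and death_year < MEDICARE_START_YEAR:
            continue  # died before 1991, so no Medicare records are possible
        sample.append(respondent_id)
    return sample


analytic_sample = build_analytic_sample(cms_match, death_year_by_id)
print(analytic_sample)
```

Respondent 104 stays in the sample but should contribute no Medicare records after the 1996 death, which is where the CMS date of death becomes useful.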

36
Linked but no Medicare data
  • No denominator file because
  • Loss of entitlement during 1991-2000
  • Deceased prior to 1991
  • CMS record keeping inconsistencies
  • No claims data
  • Not utilizing Medicare in 1991-2000
  • No reimbursable claims
  • CMS record keeping inconsistencies

37
No Denominator Record
  • Lack of denominator record can affect your
    analytic sample. Why?
  • Can't determine managed care enrollment
  • In general, managed care enrollees are excluded
    from sample (more on this to come)

38
Managed Care Enrollment
  • Medicare does not receive claims for
    beneficiaries enrolled in managed care plans
    (HMO)
  • Do not have complete information on payments or
    services received
  • Could miss health events that are being counted
    based upon submitted claims
  • Complex issue. Refer to ResDAC
  • http://www.resdac.umn.edu/

39
How managed care enrollees affect your research
depends upon your question
  • Studies on reimbursements/charges
  • Option may be to exclude those with any managed
    care enrollment because you don't have complete
    information on payments or services received
  • Studies on health outcomes/events
  • Option may be to exclude those with any managed
    care enrollment because you could miss events
  • Option may be to censor observations at time of
    first HMO enrollment
  • Other methods for addressing HMO enrollment
    possible depending upon research question
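
The censoring option can be sketched as follows, assuming a monthly HMO enrollment indicator is available for each person. The function name and the input layout (one boolean per month of follow-up) are hypothetical, chosen for illustration.

```python
def censor_at_first_hmo(hmo_by_month, event_month=None):
    """Return (months_observed, event_observed).

    hmo_by_month: list of booleans, one per follow-up month, True when the
        person was enrolled in managed care that month.
    event_month: 0-based index of the month the health event occurred,
        or None if no event appears in the claims.
    """
    end = len(hmo_by_month)
    first_hmo = next((i for i, in_hmo in enumerate(hmo_by_month) if in_hmo), None)
    if first_hmo is not None:
        end = first_hmo  # claims are incomplete from this month onward
    if event_month is not None and event_month < end:
        return event_month + 1, True  # event seen while claims were complete
    return end, False  # censored at first HMO month or at end of follow-up


no_hmo = [False] * 12
hmo_from_month_5 = [False] * 5 + [True] * 7

print(censor_at_first_hmo(no_hmo, event_month=7))
print(censor_at_first_hmo(hmo_from_month_5, event_month=9))
print(censor_at_first_hmo(hmo_from_month_5, event_month=2))
```

An event recorded after the first HMO month is treated as unobserved here, because events in that period could equally have been missed.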

40
Services not covered in Medicare 1991-2000 files
  • Outpatient prescription drugs
  • Routine physical and dental exams
  • Dentures
  • Eyeglasses
  • Out-of-pocket expenses for Medicare beneficiaries
    (e.g. deductibles, coinsurance)

41
SSA Analytic Issues
  • Eligibility status
  • Eligible but not matched
  • Linked but no benefit history data
  • Records are extracted from files designed for
    program administration - not for research

42
SSA Ineligible Population and Linkage Rates
43
Ineligibles and Non-Matches
  • Must be excluded from your sample
  • Identify using the variable (SSA_MATCH) on the
    Feasibility Study Data files

44
Linked but no SSA Data
  • Linkage is to SSA NUMIDENT file
  • Linked to NUMIDENT file but may not be eligible
    for Social Security benefits
  • Not age eligible for retirement
  • Defer retirement benefits because working
    full-time
  • Not eligible for Social Security

45
Issues with Administrative Data
  • Administrative data updates
  • Payment history updates
  • Previously denied claims may be overridden
  • Changes to type of benefit status
  • Individuals receiving disability (DI) switch to
    retirement (R) benefits at age 65 in RSDI program
  • Complicated data
  • File layouts are complex, e.g. each MBR record
    has 2 parts
  • Calculation of benefits not straightforward, e.g.
    SSI benefits come from both federal and state
    programs

46
Final Tips
  • Read relevant documentation !!!
  • Survey file layouts and detailed notes
  • Linkage methodology reports
  • Sample SAS and STATA input statements for
    public-use linked mortality files
  • Analytic guidelines
  • Consult basic program information
  • CMS: http://www.cms.gov
  • ResDAC: http://www.resdac.umn.edu (Medicare)
  • SSA: http://www.ssa.gov and
    http://www.ssa.gov/regulations/index.htm

47
Final Tips
  • Determine NCHS public-use files needed
  • Determine RDC linked files needed
  • Determine feasibility of research question based
    upon successfully linked respondents
  • Public-use Feasibility Study Data files available
    indicating whether respondent was linked to
    Medicare or SSA data and whether there is a
    record on the various Medicare and/or SSA files
  • Match status (SSA_MATCH, CMS_MATCH)

48
Overview
  • NCHS Record Linkage Program
  • Analytic Issues and Tools
  • Comparative Analysis of Public vs Restricted
    Linked Mortality Files
  • Accessing the Restricted-use Linked Data

49
National Center for Health Statistics Record
Linkage Program: Comparative Analysis of the
Public-use and Restricted-use Linked Mortality
Files
  • Kimberly A. Lochner, SPB, OAE
  • NCHS Data Users Conference
  • August 12, 2008
  • U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
  • Centers for Disease Control and Prevention
  • National Center for Health Statistics
50
Objectives
  • Present an overview of the newly available
    public-use linked mortality files
  • National Health Interview Survey (NHIS) 1986 to
    2000
  • Third National Health and Nutrition Examination
    Survey (NHANES III)
  • The Second Longitudinal Study of Aging (LSOA II)
  • Demonstrate the analytic comparability between
    the public-use and restricted-use versions of the
    linked mortality files

51
Background
  • Mortality follow-up studies are a major focus of
    NCHS record linkage activities
  • NCHS linked mortality files created in 2004 made
    available through NCHS Research Data Center (RDC)
  • Protects confidentiality of survey participants
  • May minimize access to highly utilized data
    sources

52
Background
  • NCHS plan for public-use linked mortality files
    included
  • Releasing a reduced number of key mortality
    variables
  • Perturbing date or cause of death for select
    records
  • Determining that survey participants could not be
    reidentified
  • Comparing the analytic utility of the public-use
    file to the restricted-use file

53
Public-use Linked Mortality Files
  • NHIS (1986-2000)
  • Each NHIS year is a nationally representative
    survey of the civilian non-institutionalized U.S.
    population
  • Questionnaire content
  • Basic socio-demographic characteristics
  • Health conditions and utilization
  • Health status, health care services, and behavior
  • Mortality follow-up through December 2002

54
Public-use Linked Mortality Files
  • NHANES III (1988-1994)
  • Includes survey and examination information
    designed to assess the health and nutritional
    status of U.S. adults and children.
  • Study content
  • Basic socio-demographic characteristics
  • Medical and dental examinations
  • Laboratory tests
  • Environmental exposures
  • Mortality follow-up through December 2000

55
Public-use Linked Mortality Files
  • LSOA II
  • Prospective survey of persons 70 years of age and
    over at the time of their baseline interview
    (1994 NHIS)
  • Follow-up interviews in 1997-98 and 1999-00
  • Questionnaire content
  • Basic socio-demographic characteristics
  • Health conditions, functional health status and
    disability
  • Health care utilization
  • Mortality follow-up through December 2002

56
Data Elements: NHIS Linked Mortality Files
  • MCOD flags only for diabetes, hypertension, and
    hip fracture
  • Available on the public-use NHIS survey data
    files

57
Data Elements: NHANES III Linked Mortality Files
  • MCOD flags only for diabetes, hypertension, and
    hip fracture
  • Available on the public-use NHANES III survey
    data files

58
Data Elements: LSOA II Linked Mortality Files
  • MCOD flags only for diabetes, hypertension, and
    hip fracture
  • Available on the public-use LSOA II survey data
    files

59
Comparative Analyses
60
Statistical Methods
  • Compared mean follow-up times and distributions
    for select causes of death
  • Compared the mortality risk for a standard set of
    socio-demographic covariates for all-cause as
    well as cause-specific mortality
  • Cox proportional hazard models
  • SUDAAN to take into account complex survey design

61
Analytic Samples
  • Eligible for mortality follow-up
  • At least 25 years of age at the time of the
    survey interview
  • Non-Hispanic white, non-Hispanic black, or
    Hispanic
  • Non-missing values for cause of death or other
    covariates

62
Covariates
  • Socio-demographic characteristics reported
    at time of interview and taken from public-use
    survey data files
  • Age
  • Sex
  • Race and ethnicity
  • Educational attainment
  • Marital status (except NHANES III)
  • Region of the country (except NHANES III)

63
Outcomes
  • All-cause and cause-specific mortality
  • Cause-specific deaths based on underlying cause
    of death from the ICD-10 113 grouped recode
  • Duration of follow-up calculated from time of
    interview until death or censored at end of the
    follow-up period
  • Restricted-use files use complete information on
    interview and death month, day, and year
  • Public-use files use less detailed information on
    timing of death, some of which is perturbed
  • NHIS/LSOA II use interview year and death year
    only
  • NHANES III use person-time follow-up provided on
    the file
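
To illustrate how the two follow-up definitions can differ, the sketch below computes duration both ways for one made-up respondent. The function names are hypothetical, and the year-difference formula is an assumed reading of "interview year and death year only", not the documented NCHS calculation. The end of follow-up (December 31, 2002) matches the NHIS and LSOA II follow-up period.

```python
from datetime import date


def followup_years_exact(interview, death, end_of_followup):
    """Follow-up in years from full dates (restricted-use style)."""
    if death is not None and death <= end_of_followup:
        end = death
    else:
        end = end_of_followup  # censored at end of follow-up
    return (end - interview).days / 365.25


def followup_years_from_years(interview_year, death_year, end_year):
    """Follow-up in whole years from calendar years only (public-use style)."""
    if death_year is not None and death_year <= end_year:
        end = death_year
    else:
        end = end_year  # censored at end of follow-up
    return end - interview_year


# Made-up respondent: interviewed mid-1990, died in early 1998.
exact = followup_years_exact(date(1990, 6, 15), date(1998, 3, 10), date(2002, 12, 31))
coarse = followup_years_from_years(1990, 1998, 2002)
print(f"exact: {exact:.2f} years, year-only: {coarse} years")
```

Differences of a fraction of a year per person are expected with the coarser definition.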

64
NHIS Results
  • Sample (n = 897,232)
  • Deaths (n = 114,264)
  • 11.8% weighted
  • Follow-up (mean)
  • Restricted-use: 8.6 years
  • Public-use: 8.7 years

65
NHIS Linked Mortality Files: Cause-specific
Deaths
66
NHIS Linked Mortality Files: Relative Hazards
for All-Cause Mortality
  • Note: Models also adjusted for marital status and
    region of the country.

67
NHIS Linked Mortality Files: Relative Hazards
for Homicide Mortality
  • Note: Models are restricted to Non-Hispanic
    Whites and Blacks (n = 802,307).
  • Models also adjusted for marital status
    and region of the country

68
NHANES III Results
  • Sample (n = 16,048)
  • Deaths (n = 3,209)
  • 12.1% weighted
  • Follow-up (mean)
  • Restricted-use: 104.1 months
  • Public-use: 103.8 months

69
NHANES III Linked Mortality Files:
Cause-specific Deaths
70
NHANES III Linked Mortality File: Relative
Hazards for All-Cause Mortality
71
NHANES III Linked Mortality File: Relative
Hazards for Cerebrovascular Mortality
  • Note: Models restricted to Non-Hispanic Whites
    and Blacks (n = 11,985).


72
LSOA II Results
  • Sample (n = 8,867)
  • Deaths (n = 3,671)
  • 41.4% weighted
  • Follow-up (mean)
  • Restricted-use: 4.4 years
  • Public-use: 4.4 years

73
LSOA II Linked Mortality Files: Cause-specific
Deaths
74
LSOA II Linked Mortality File: Relative Hazards
for All-Cause Mortality
  • Note: Models also adjusted for marital status and
    region of the country.

75
LSOA II Linked Mortality File: Relative Hazards
for Cancer Mortality
  • Note: Models restricted to Non-Hispanic Whites
    (n = 7,586).
  • Models also adjusted for region of the
    country.

76
Conclusions
  • Public-use linked mortality files yield results
    similar to those from the restricted-use data
  • Public-use and restricted-use files yield similar
    hazard ratios and confidence intervals,
    particularly for common causes of death
  • Results for less common causes of death remain
    consistent, although there tends to be less
    agreement in the estimates

77
Conclusions
  • Caution is urged for analyses of very rare causes
    of death or small population subgroups
  • Users of the public-use linked mortality files
    may request to verify their results through the
    NCHS Research Data Center

78
Public-use Linked Mortality Files Can Be
Downloaded
  • http://www.cdc.gov/nchs/data_access/data_linkage_activities.htm

79
Acknowledgements
  • American Journal of Epidemiology 2008;
    168(3):336-344
  • SPB data linkage team
  • Stephanie Bartee
  • Jim Brittain
  • Cordell Golden
  • Donna Miller
  • Gloria Wheatcroft

80
Overview
  • NCHS Record Linkage Program
  • Analytic Issues and Tools
  • Comparative Analysis of Public vs Restricted
    Linked Mortality Files
  • Accessing the Restricted-use Linked Data

81
NCHS Record Linkage Activities: Accessing
Restricted Linked Data at the NCHS Research Data
Center
  • Christine Cox
  • NCHS Data Users Conference
  • August 12, 2008
  • U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
  • Centers for Disease Control and Prevention
  • National Center for Health Statistics
82
Why can't you just give me the data?
  • NCHS does not own the linked administrative
    data
  • NCHS data confidentiality rules prohibit the
    release of potentially identifiable data;
    special considerations apply to the protection
    of linked data
  • The RDC is the only option for access to
    restricted-use data files

83
Research Data Center
  • The RDC is an organizational unit located at NCHS
    headquarters in Hyattsville, MD
  • Provides access to restricted-use data files

84
Restricted Data Files Include
  • Linked administrative data
  • Medicare
  • SSA
  • Restricted-use linked mortality files
  • Detailed geographic data or contextual data
  • Census tract and state/county level data
  • EPA air pollution data

85
What to Expect?
  • To gain access to NCHS restricted data, users must:
  • Submit a research proposal
  • Sign an affidavit of confidentiality
  • Promise not to use any method to attempt to
    identify respondents

86
What to Expect?
  • How long for a proposal to be reviewed?
  • Usually within 2 weeks, if proposing to use
    public-use survey data with the linked data
  • Up to 1-2 months, if proposing to use non-public
    survey data with the linked data

87
Access Methods
  • Once approved, three methods to access restricted
    data
  • On-site: use local computing resources in the
    NCHS RDC, Hyattsville, MD
  • Remote: submit programs electronically to be
    executed in the RDC with output returned by email
  • Census RDC: access NCHS data using any one of the
    nine Census RDCs.
  • For all methods of access, restricted data files
    remain in RDC and output is inspected for
    disclosure violations

88
On-Site Access Method
  • On-site Facilities
  • Four user workstations, expandable as needed
  • Pentium IV computers
  • Windows XP
  • SAS, STATA, SUDAAN, LIMDEP, SPSS, Watcom Fortran
    77, HLM
  • No removable media
  • Secure printer
  • Open only during normal working hours
  • RDC staff constructs necessary data files,
    including merged user data

89
Remote Access Method
  • RDC staff constructs necessary data files,
    including merged user data
  • SAS programs only, including SAS-callable SUDAAN
    (certain procedures and functions not allowed)
  • Both submitted programs and output undergo a
    programmed disclosure limitation review
  • Ability to submit analytical computer programs
    via email from anywhere in the world, with access
    available 24 hours/day

90
Census RDC Access Method
  • 9 Census RDCs
  • Los Angeles, Berkeley, Boston, Durham,
  • Ann Arbor, Ithaca, NYC, Chicago, DC
  • Separate Census research proposal is not needed
  • May have to follow additional security
    requirements at Census Bureau facilities

91
User Fees: Linked Data Access
92
Proposal Requirements
  • Proposal is evaluated by review committee
  • Review criteria
  • Scientific and technical feasibility
  • Availability of RDC resources
  • Disclosure risk for restricted information
  • The extent to which project is in accordance with
    the mission of NCHS
  • Special note: NCHS does not try to determine if
    proposals are duplicative

93
Proposal Requirements: Helpful Tips
  • Be clear about research and data requirements
    (helps to determine feasibility of project)
  • Clearly identify the sample to be included in the
    analytic file
  • Provide data dictionaries for both
  • Public-use data
  • Restricted-use data
  • Provide examples of expected output

94
Visit the RDC at http://www.cdc.gov/nchs/rd/rdc.htm
or email rdca@cdc.gov
95
Where to get Help?
  • RDC website contains
  • Proposal Checklist
  • Sample Proposal
  • List of available restricted data files
  • Detail on Census RDC locations and contact
    information
  • FAQs regarding proposal review process, on-site
    procedures, area information and contact
    information
  • Email: rdca@cdc.gov

96
Questions?