Title: Longitudinal Analysis of Health Care Data in Large Populations: Data Configurations and Methods
2. Longitudinal Analysis of Health Care Data in Large Populations: Data Configurations and Methods
Daniel Gilden, JEN Associates, Inc., Cambridge, Massachusetts, USA
3. Data Applications: What is the process we are supporting?
- The life cycle of applied research
- Research
- Policy development/strategic planning
- Active program management
- Evaluation of impact of policies/therapies
- Research...
4. What Do We Measure?
- People
- personal and family demographics
- geographic and personal environment
- economic status
- diseases
- disabilities
- utilization of medical services/therapies
- healthcare provider affiliations
5. Where is the Data?
- Birth and death records
- Public health disease registries
- Personal survey data
- Economic information by local area
- Administrative data
- cost reports
- aggregate purchasing information
- therapy level payment records
- enrollment data
6. How do we Understand What is in the Data?
- Turning data into information
- the poorer the data source, the more statistical manipulation will be required
- sample extrapolation
- insufficiently specified models
- the denser the source, the easier the analytic problem, but
- costly data preparation
- complex analytic file production
7. Matching the Data to the Research or the Research to the Data?
- Can the data support the unit of analysis?
- program
- population
- person
- Are the exposure and outcome measures available?
8. Everybody Has Time, But Can We Handle It?
- Interacting time with the unit of analysis multiplies both the data processing challenge and the analytic opportunities
- Cross-sectional
- standardized time bucket for simple comparative analyses
- Longitudinal
- following trajectories over time, why?
- to understand the past and predict the future
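The contrast between the two views can be sketched in a few lines of Python. This is an illustration only: the record layout, the sample values, and the function names `cross_sectional` and `longitudinal` are assumptions made for the example, not part of the system described in these slides.

```python
from collections import defaultdict

# Hypothetical claim records: (person_id, month as "YYYY-MM", paid amount).
claims = [
    ("p1", "2003-01", 100.0),
    ("p1", "2003-02", 120.0),
    ("p2", "2003-01", 50.0),
    ("p2", "2003-03", 80.0),
]

def cross_sectional(records, start, end):
    """Total paid per person inside one standardized time bucket."""
    totals = defaultdict(float)
    for person, month, paid in records:
        # "YYYY-MM" strings sort chronologically, so string comparison is safe.
        if start <= month <= end:
            totals[person] += paid
    return dict(totals)

def longitudinal(records):
    """Month-by-month paid amounts per person, in time order."""
    series = defaultdict(lambda: defaultdict(float))
    for person, month, paid in records:
        series[person][month] += paid
    return {person: sorted(months.items()) for person, months in series.items()}

snapshot = cross_sectional(claims, "2003-01", "2003-03")
trajectories = longitudinal(claims)
```

The snapshot collapses time into a single number per person, which is enough for simple comparisons. The trajectory keeps the time dimension, which costs more to build and store but is what allows change to be followed.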
9. Standardized Snapshot Measure Using Claims, Cost Reports or Pharmacy Purchasing Records
10. Expenditure Trend from Treatment Claims
50% Increase in Monthly Expenditures Over 3 Years
11. Population Trajectories in Costs and Utilization Rates
Non-Schizophrenia Dx: Number of Users and Costs Per User Driving Cost Increase
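Because total cost equals the number of users times the cost per user, a change in total cost can be split into a part due to more users, a part due to higher cost per user, and an interaction term. The sketch below shows that arithmetic. The function name `decompose_cost_change` and the input figures are invented for illustration and are not taken from the presentation's data.

```python
def decompose_cost_change(users_0, cost_per_user_0, users_1, cost_per_user_1):
    """Split the change in total cost between two periods into its drivers.

    total = users * cost_per_user, so the change is exactly
    d_users * cost_0 + users_0 * d_cost + d_users * d_cost.
    """
    d_users = users_1 - users_0
    d_cost = cost_per_user_1 - cost_per_user_0
    return {
        "users_effect": d_users * cost_per_user_0,
        "cost_per_user_effect": users_0 * d_cost,
        "interaction": d_users * d_cost,
        "total_change": users_1 * cost_per_user_1 - users_0 * cost_per_user_0,
    }

# Hypothetical figures: 100 users at 50.0 each, rising to 120 users at 60.0 each.
parts = decompose_cost_change(100, 50.0, 120, 60.0)
```

The three components always sum to the total change, so the split can be checked directly against the observed expenditure trend.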
12. Person-Level Report of Evolution in Therapy Re-fills
13. Data System Configurations
- Each example implies a different data infrastructure, from the least to most complex, but all the profiles are reasonable starting points for research
- Match the question to the data; do not let your eyes become bigger than your resources
- The important point is to start and not wait for better systems or more detailed data... systems grow organically
14. Steps to Developing a Research Data Infrastructure
- Data Inventory
- what is currently available
- Access model
- how many users and how deep the yield
- Analytic method selection
- analysis type determines resources
- Hardware and software follow the methods
- Design for economy - planned profligacy
- The human element - where's the talent?
15. Data Structures
- Vertical Data Archives
- fixed length
- a single record per observation
- research area related data fields
- Horizontal Analytic Records
- aggregated to the unit of analysis
- summary variables
- time oriented arrays
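One way to picture the relationship between the two structures is a small Python sketch that collapses vertical records into one horizontal record per person. The field names (`person_id`, `service_date`, `paid`), the sample records, and the helper names are assumptions made for this example; the slides do not specify a layout.

```python
from datetime import date

# Hypothetical vertical archive: a single fixed-layout record per observation.
vertical = [
    {"person_id": "p1", "service_date": date(2003, 1, 15), "paid": 40.0},
    {"person_id": "p1", "service_date": date(2003, 3, 2), "paid": 60.0},
    {"person_id": "p2", "service_date": date(2003, 2, 20), "paid": 25.0},
]

def month_index(d, start):
    """Whole calendar months elapsed between the study start and date d."""
    return (d.year - start.year) * 12 + (d.month - start.month)

def to_horizontal(records, start, n_months):
    """Aggregate vertical records to one analytic record per person."""
    people = {}
    for rec in records:
        idx = month_index(rec["service_date"], start)
        if not 0 <= idx < n_months:
            continue  # record falls outside the study window
        row = people.setdefault(rec["person_id"], {
            "total_paid": 0.0,                    # summary variable
            "n_records": 0,                       # summary variable
            "paid_by_month": [0.0] * n_months,    # time-oriented array
        })
        row["total_paid"] += rec["paid"]
        row["n_records"] += 1
        row["paid_by_month"][idx] += rec["paid"]
    return people

horizontal = to_horizontal(vertical, date(2003, 1, 1), 3)
```

Here the unit of analysis is the person. The same pass could aggregate to a program or a population by changing the key used in `setdefault`.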
16. System Design Goals
- Rational data structures minimize hardware and software requirements and reduce analysis time
- Reusable data structures and methods
- Select a data analysis strategy and stay with it, re-use data, re-use methods, never reinvent the wheel
- Design a data update and expansion strategy in advance to minimize disruptions and data damage
17. Sample Configuration: Typical US Source Data
- Large US State, 2.5 million Medicaid Beneficiaries
- Three Years, 800 million treatment records
- Monthly Enrollment Denominator
- Integrated and linked Pharmacy, Physician, Inpatient, Post-acute Care, Long Term and Chronic Care
- Payments, diagnoses, therapies
- Linked to regional economic profiles
- 2.5 million person level summary records
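A monthly enrollment denominator lets payments be expressed per enrolled person rather than as raw totals. The sketch below shows one possible way to build such a denominator from enrollment spans. The span format, the sample values, and the function names are hypothetical and are not drawn from the source data.

```python
# Hypothetical enrollment spans: (person_id, first enrolled month index,
# last enrolled month index), with month 0 as the first study month.
enrollment = [("p1", 0, 2), ("p2", 1, 2), ("p3", 0, 1)]

# Hypothetical total payments for each study month.
paid_by_month = [300.0, 300.0, 500.0]

def monthly_denominator(spans, n_months):
    """Count the people enrolled in each study month."""
    counts = [0] * n_months
    for _person, first, last in spans:
        # Clip each span to the study window before counting.
        for m in range(max(first, 0), min(last, n_months - 1) + 1):
            counts[m] += 1
    return counts

def paid_per_enrollee(paid, denominator):
    """Payments divided by enrollment, month by month (None if nobody is enrolled)."""
    return [p / d if d else None for p, d in zip(paid, denominator)]

denominator = monthly_denominator(enrollment, 3)
rates = paid_per_enrollee(paid_by_month, denominator)
```

For a sense of scale, 800 million treatment records across 2.5 million beneficiaries averages 320 records per beneficiary over the three years.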
18. Successful System Configuration
- 1 Tera-Byte Disk capacity
- 1 Giga-Byte RAM
- 1 Off the shelf Dell PC, Windows XP
- SAS license for large database steps and statistical analysis
- Oracle license for storage of output tables
- Supports three researchers, not necessarily skilled programmers
- Access model is deep but narrow
19. Results: Performance
- Interaction between vertical and horizontal data structures, four examples revisited
- Snapshot of drugs by Psycho-active (PA) category: 2 minutes
- 3 years of PA Rx payments: 3 minutes
- PA prescribing in schizophrenia population identified from physician records: 12 minutes
- Time relative Rx refills for populations with new schizophrenia: 15 minutes
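The "time relative" refill measure can be read as aligning each person's pharmacy fills to their own starting point, such as the date of a first diagnosis, rather than to the calendar. The following is a minimal sketch under that reading. The index dates, fill records, 30-day window, and the function name `relative_fill_counts` are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical index dates: first recorded diagnosis per person.
index_dates = {"p1": date(2003, 3, 10), "p2": date(2003, 6, 1)}

# Hypothetical pharmacy fills: (person_id, fill date).
fills = [
    ("p1", date(2003, 2, 1)),   # before the index date, ignored
    ("p1", date(2003, 3, 15)),
    ("p1", date(2003, 4, 20)),
    ("p2", date(2003, 6, 1)),
    ("p2", date(2003, 8, 5)),
]

def relative_fill_counts(index_dates, fills, window_days=30, n_windows=3):
    """Count fills per person in consecutive windows following the index date."""
    counts = {person: [0] * n_windows for person in index_dates}
    for person, fill_date in fills:
        if person not in index_dates:
            continue  # person is not in the study population
        offset = (fill_date - index_dates[person]).days
        if offset < 0:
            continue  # fill happened before the index date
        window = offset // window_days
        if window < n_windows:
            counts[person][window] += 1
    return counts

refill_profile = relative_fill_counts(index_dates, fills)
```

Each person ends up with an array of the same length, indexed by time since their own index date, so people diagnosed in different calendar months can be compared directly.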