Title: RECORD LINKAGE 201: VISION FOR DATA INTEGRATION TO ACTION AND IMPLEMENTATION
1 RECORD LINKAGE 201: VISION FOR DATA INTEGRATION TO ACTION AND IMPLEMENTATION
- Russell S. Kirby, Ph.D., M.S., F.A.C.E.
- Department of Maternal and Child Health
- School of Public Health
- University of Alabama at Birmingham
2 Objectives
- Place record linkage in a broad framework for planning, analysis, and public health action
- Focus on key issues in planning, implementation, evaluation, and utilization of record linkage projects with administrative public health databases
- Avoid falling asleep listening to a boring presentation right after lunch
4 What is Record Linkage?
- If we assume there is a single record as well as a file of records and all records relate to some entities: persons, businesses, addresses, etc. . . . Record linkage is the operation that, using the identifying information contained in the single record, seeks another record in the file referring to the same entity.
- Ivan Fellegi, Statistics Canada
5 A Long History
- Based on this definition, record linkage has been around for a long time!
- In public health, modern methods date back only to the 1960s, and their broad use is truly a phenomenon of the 1990s into the present decade.
6 Population Health Informatics
- Record linkage should not be undertaken as an end unto itself.
- Rather, projects should be done within a broad informatics context, with scientifically sound strategies. Data quality issues should be a paramount concern at all steps in the record linkage process.
- Ideally, record linkage should be done within the context of a theoretical framework and a research study design.
7 CONCEPTUAL FRAMEWORK FOR POPULATION HEALTH
[Figure: population health framework with boxes for Genetic Endowment, Physical Environment, Social Environment, Individual Response (Behavior, Biology), Health Care, Disease, Prosperity, and Well-Being, plus the label Traditional Medical Model of Health Care]
Source: modified from Evans RG, Barer ML, Marmor TR, Eds. Why are some people healthy and others not? New York: Aldine de Gruyter, 1994.
8 Maternal and Child Health
- Most of our databases represent administrative data
- Most of these data focus on aspects of disease processes or systems of care (traditional medical model)
- While some of our databases are population-based, some are program-based (and by no means are all public health programs population-based)
9 COMPONENTS OF AN IDEAL STATEWIDE PERINATAL DATABASE: 1. Linkages relating to the index pregnancy
Maternity/ Newborn/ Postpartum Hospital Data
Death Certificates (linked to age 14)
Certificates of Live Birth and Fetal Death
Perinatal Risk Assessment (8, 20, 36 Wks)
NICU Discharge Data
Cancer Registry Cases (under age 15)
a.k.a. prenatal care data
MSAFP Data
Fetal/Infant Mortality Review
Child Fatality Review
Clinical Genetics Database
Newborn Screening Database
Birth Certificate Linkage
Blood Lead Screening Registry
Risk Assessment Linkage
Hospital/NICU Data Linkage
Birth to Three IDEA Part H
Infant Hearing Screening Registry
Death Certificate Linkage
Clinical Genetics Data Linkage
Screening Data Linkage
Immunization Database
MSAFP Data Linkage
R. S. Kirby, Version 5/30/02
BDS/High Risk Linkage
11 COMPONENTS OF AN IDEAL STATEWIDE PERINATAL DATABASE: 2. Linkages across pregnancies
Birth Certificate of Mother
Birth Certificate of Index Child
a. Sibship studies involving risk factors from a previous pregnancy, or prospective outcomes conditional on the index pregnancy. This can also apply to pedigrees, and to educational records across family members.
b. Intergenerational effects of pregnancy outcomes.
c. Linkages within maternal sibships across generations.
d. These approaches apply equally to hospital discharge data.
R. S. Kirby, Version 4/2/07
12 COMPONENTS OF AN IDEAL STATEWIDE PERINATAL DATABASE: 3. Linkages between mother and pregnancy
Certificates of Live Birth and Fetal Death
Death Certificates
Hospital Discharge Survey
Routine linkage to identify maternal and reproductive deaths among women of child-bearing age (10-49), conducted among deaths occurring within 42 or 90 days, or within one year, of termination of the index pregnancy. If spontaneous abortion and/or induced termination data are collected with personal identifiers, these events should also be routinely linked with death certificates, as should hospital discharge (in-patient or emergency room) records for deaths occurring to women with ICD-9-CM or CPT-4 codes relating to reproductive health.
R. S. Kirby, Version 8/19/96
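Once pregnancy records and death certificates are linked, the time windows named on this slide can be applied mechanically. The following Python sketch is illustrative only: the record layout, the IDs, and the `death_window` helper are assumptions, not part of any vital statistics system.

```python
from datetime import date

# Hypothetical linked pairs: (woman_id, pregnancy_end_date, death_date).
linked = [
    ("W1", date(2001, 3, 1), date(2001, 3, 30)),
    ("W2", date(2001, 3, 1), date(2001, 5, 15)),
    ("W3", date(2001, 3, 1), date(2001, 12, 1)),
    ("W4", date(2001, 3, 1), date(2003, 1, 1)),
]

def death_window(pregnancy_end, death):
    """Classify a death by days elapsed since the end of the index pregnancy."""
    days = (death - pregnancy_end).days
    if days < 0:
        return "check dates"        # death precedes pregnancy end: review the link
    if days <= 42:
        return "within 42 days"
    if days <= 90:
        return "within 90 days"
    if days <= 365:
        return "within one year"
    return "outside window"

windows = {wid: death_window(end, dod) for wid, end, dod in linked}
```

Deaths classified as "check dates" or "outside window" are not discarded here; they are candidates for review of the link itself.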
13 COMPONENTS OF AN IDEAL STATEWIDE PERINATAL DATABASE: 4. Routine automated geocoding of addresses to latitude-longitude coordinates
- All vital statistics records should be geocoded by place of residence.
- All health facilities should be geocoded by location.
- For mortality and injury studies, data sufficient to identify the location where the death or injury occurred should be recorded in the documentation, and this location should also be geocoded.
- Routine geocoding is an automated, computer-assisted process; the time required to do it diminishes with the implementation of a prospective system in which address files are continually corrected and updated.
R. S. Kirby, Version 8/19/96
14 COMPONENTS OF AN IDEAL STATEWIDE PERINATAL DATABASE: 5. Linkages for child health, growth and development
Link with Birth Defects Surveillance
CSHCN Database
R. S. Kirby, Version 7/7/03
15 NICU Discharge Data
Hospital Discharge Data
Medicaid
Reports of Communicable Diseases
WIC
Newborn Metabolic Screening
DDS and CSHCN
Certificates Of Live Birth
Newborn Hearing Screening
Blood Lead Screening
Birth Defects Surveillance Data
Immunization Registry
Early Intervention (Birth to 3)
Child Abuse And Neglect/ Child Protective Services
16 Elvis Presley on Love
- You don't know what you've got,
- until you LOSE it . . .
17 Kirby on Data in Databases
- You don't know what you've got,
- until you USE it . . .
18 RECORD LINKAGE: Who, What, Why, When, Where, How?
- Which question is primary?
19 RECORD LINKAGE: Why?
- What is the purpose of the study?
- Does a record linkage make sense?
- would a simple numerator/denominator analysis suffice?
- can the linkage be conducted in a manner that supports the use of the resultant database for other projects?
- is a record linkage technically feasible?
- is a record linkage necessary?
20 RECORD LINKAGE: How?
- Manual versus automated linkage
- The theoretical basis for record linkage
- deterministic methods
- probabilistic methods
- The need for identifiers
- Record linkage with names and dates
- Software: buy a specialized package, use a statistical software package, or develop your own?
- Statistical evaluation of linkage results is imperative, regardless of the method
21 RECORD LINKAGE: Who?
- What personnel should do the linkage?
- dedicated linkage specialists?
- statisticians/programmers/analysts?
- Should linkage staff be subjected to personality profiles?
- What cases/events qualify for the linkage?
22 RECORD LINKAGE: What?
- What databases should be linked?
- What are the functional relationships between the records in each of the candidate datasets? Are they sufficient to answer the research question?
- How does the linkage support the programmatic/research needs for which the linkage was proposed?
- Is there a plan for data warehousing or systematic data integration?
23 RECORD LINKAGE: Where?
- Where should the linkage be done?
- statistical agency?
- epidemiological agency?
- university research center?
- contract to vendor?
- Don't forget the importance of spatial identifiers
- consider geocoding as another aspect of record linkage
24 RECORD LINKAGE: When?
- How often should linkages be done?
- The periodicity of routine linkages is predicated on the programmatic need for timeliness and reporting, e.g.
- infant deaths: link immediately
- hospital discharges and birth certificates: quarterly or annually may be appropriate
- linkages to support passive case-finding registries: periodicity defined by registry needs
25 With all this in mind . . .
- Let's review some perspectives from the experts on how to do record linkage with public health databases.
26 TOP TEN LIST: TEN BEST WAYS TO DO BAD PUBLIC HEALTH RECORD LINKAGE
With apologies to David Letterman, and thanks for editorial assistance to Elizabeth Kirby and for their insights to the following Internet contributors: Kim Hauser, University of South Florida; Phil Klein, Wisconsin Department of Workforce Development; Richard Miller, Wisconsin Bureau of Health Information; Mark Fulcomer, Pennsylvania; Kate Kvale, Wisconsin Division of Public Health; Patrick Remington, Department of Population Health, University of Wisconsin; Russel Rickard, Colorado Department of Health and Environment; Melissa Adams, University of Alabama at Birmingham; Phil Cross, NY State Congenital Malformations Registry
R.S. Kirby, December 2002
27 Just have someone else do the linkage for you, then use the "don't ask, don't tell" method perfected by the military. That way, what you don't know doesn't hurt you!
- Anonymous correspondent, summer of 2002
28 Top Ten List: Ten Best Ways to Do Bad Public Health Record Linkage
Number 10
All for one and one for all
Always trust the Social Security Number in the
database as the correct Social Security Number
for that individual. If there are duplicate
Social Security Numbers for obviously different
individuals (based on age, gender or other
conflicting information between persons),
randomly select just one. Use the latest state
lottery results to obtain the random numbers.
29 Top Ten List: Ten Best Ways to Do Bad Public Health Record Linkage
Number 9
The Shell Game
Change the linkage identifier every time you recreate the data set. This keeps your data users guessing, plus they can't refer to specific records based on the linkage identifier. This ensures confidentiality! If the vital statistics agency refuses to allow birth certificate numbers to be used, generate your own unique identifier based on the record's physical location in the input file. Overwrite this field each time the dataset is accessed. Compiling the final analysis file should be a snap!
30 Top Ten List: Ten Best Ways to Do Bad Public Health Record Linkage
Number 8
A rose is a rose is a rose is . . .
It doesn't matter if you get twins matched
correctly across files, since they are identical
anyway. If subjects share a genotype, this
should entitle you to share a link.
31 Ma'am, you can have any color car you want, so long as it's black
32 Top Ten List: Ten Best Ways to Do Bad Public Health Record Linkage
Number 7
What you get is what you see
If a variable is listed in a data dictionary it is safe to assume you can use it for linking. After all, it has always been collected, and in exactly the same manner, for the time period and geographical area related to your study. This rule of thumb holds especially for race/ethnicity, educational attainment, and all disease, procedure, and billing fields.
33 Top Ten List: Ten Best Ways to Do Bad Public Health Record Linkage
Number 6
If it runs, don't fix it
Always strive to develop computer algorithms that overmatch. High match percentages are impressive and will also save staff time. Corollary: There should never be a need to physically examine any of the source documents used in the linkage process.
34 Top Ten List: Ten Best Ways to Do Bad Public Health Record Linkage
Number 5
What, me worry?
Checking for duplicate records just slows down the process; this is a step that can be eliminated. Instead, simply verify that the output dataset contains the same number of records as the largest input file. Then, proceed to conduct the analyses.
35 Top Ten List: Ten Best Ways to Do Bad Public Health Record Linkage
Number 4
The quality goes in, before the name
Don't bother to check for name changes. It
doesn't happen often enough to change your
statistics. This is especially true for women,
children who are adopted or in foster care, or
the rare family that speaks Spanish or other
languages, or comes from a culture where surnames
are listed first.
36 Top Ten List: Ten Best Ways to Do Bad Public Health Record Linkage
Number 3
Only you, and you alone . . .
There is only one valid and reliable record linkage strategy: your own. Never test or evaluate it, and by all means never subject the computer algorithm to scrutiny by others!
37 Top Ten List: Ten Best Ways to Do Bad Public Health Record Linkage
Number 2
Black and white, or shades of gray
Deterministic linkages must be correct; after all, they are based on EXACT matches. Why settle for a complicated probabilistic matching procedure, when you can be certain?
38 Top Ten List: Ten Best Ways to Do Bad Public Health Record Linkage
Number 1
Bread and Roses
Spend months of staff time discussing whether to
do record linkages. Be sure to include the
department attorneys, the HIPAA privacy
consultant, and the division directors for each
program dataset to be included. Assume that the
project will take weeks to a month at most, and
that once completed, next year it can be run as
an overnight computer job.
39 KEY ISSUES
- Why link?
- To link, or not to link? . . . or
- I link, therefore I am?
- Defining the nature of the problem
- What is the purpose?
- What do the records in each dataset represent?
- What will we do with the results?
40 Why Link? (select the best answer)
- We cannot answer the research or policy question without linking the databases.
- We have to under the terms of our grant or cooperative agreement.
- Integrating record linkage into the routine data management process of our program enables us to assess the program's effectiveness and efficiency on a continual basis.
41 Why Not Link? (select the best answer)
- Lack of funding.
- Staff don't have training.
- Necessary hardware/software/data storage unavailable.
- Bureaucratic inertia.
- Turf battles between programs.
- Question doesn't warrant linkage.
- Some of the above?
- All of the above?
43 First Steps
- Before conducting a record linkage, carefully examine the broad informatics, program and research context.
- Above all else, consider the purpose of the linkage project in relation to the planned approach and other potential uses of the resulting linked dataset.
- Hint: if you only ask the people on your team about other potential uses, the uses identified will mostly fall within the same frame of reference as your own approach.
- Let's carefully explore the question of linking birth certificates and Medicaid pregnancy claims data.
44 What do the records represent?
- Medicaid claims database
- Pregnant women/mothers
- Women who are not pregnant or may be pregnant (including the elderly)
- Infants and children
- Men (what a concept!)
- Birth certificates
- Live births
- Fetal deaths
45 What do the records represent (continued)?
- Some questions to consider
- What records are included in the claims database? Are there systematic exclusions (e.g. global bills for Medicaid managed care recipients)? Does the database include only paid claims?
- Are there records in the Medicaid database that may not represent prenatal services?
- Are there potentially multiple records per patient in the claims database?
- What is the purpose of the linkage?
46 What do the records represent (continued)?
- Some questions to consider
- Is the focus of the study on mothers, infants, or mother-infant dyads?
- How do the concepts of residence and occurrence affect the likelihood that an event will be included in either database?
- What is the relationship between Medicaid eligibility and utilization?
- What a priori expectations are there concerning which records will and will not match?
47 Some possible purposes of the linkage
- Link all Medicaid-eligible pregnant women with their birth outcomes?
- Link all Medicaid-paid deliveries with their birth certificates?
- Link all Medicaid-eligible pregnant women with their infants (or all Medicaid-eligible infants with their mothers)?
- Create a proxy measure for socio-economic status for vital statistics analyses?
- Create Medicaid pregnancy episodes of care records?
- Other purposes?
48 Issues with residence and occurrence in the context of linking Medicaid and vital statistics records
- Vital statistics datasets include all resident and occurrence events in the state, thanks to the VSCP-NAPHSIS interstate exchange agreement. This includes live births, fetal deaths, and deaths, but does not extend to non-vital statistics records.
- Medicaid program data are generally state-specific, and state residence is part of the eligibility requirement. A woman who gives birth in your state, but is a resident of another state, may have been a Medicaid participant there, but you'll never know. Some states have special programs governing reimbursement for Medicaid services provided by physicians/health care facilities in other states.
- If records fail to match, might it be due to differences in reporting requirements and eligibility?
49 What is the relationship between eligibility and utilization?
- Remember that eligibility data are just that; unfortunately, some persons who meet eligibility requirements never apply or get signed up, while others do, but never access the services for which they are eligible.
- Consider linking your eligibility data with service utilization data not only to find out which clients actually used the program, but also for the insights you might gain from the utilization data themselves.
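A minimal Python sketch of the eligibility-versus-utilization comparison, assuming the two files have already been linked to a shared client ID. The IDs and variable names are hypothetical.

```python
# Hypothetical client IDs after linkage to a common identifier.
eligible = {"A01", "A02", "A03", "A04"}   # eligibility file
utilized = {"A02", "A04", "A09"}          # service utilization (claims) file

used_program = eligible & utilized        # eligible and used services
never_used = eligible - utilized          # signed up, but no services found
unexplained = utilized - eligible         # services without an eligibility record

summary = {
    "used_program": sorted(used_program),
    "never_used": sorted(never_used),
    "unexplained": sorted(unexplained),
}
```

The third group is often the most informative: services with no matching eligibility record may reflect linkage failures rather than program behavior.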
50 What records will and won't match?
- Some pregnancies involving Medicaid-eligible women or Medicaid pregnancy claims do not result in live births.
- Fetal deaths
- Spontaneous or induced abortions
- Some Medicaid-eligible women may not have been residents of the study area at the time of the vital event.
- Over-reliance on unique identifiers (SSNs, service IDs) can lead to both mis-matched and unmatched records.
- What's in a name, anyway?
51 Record Linkage Methods
- Generally there are two classes of linkage methodologies
- Deterministic linkage methods
- Probabilistic linkage methods
52 Linking data deterministically
53 Which variables are common to both datasets?
Note: This section is based on the use of SAS.
54 A Word of Caution
- On the previous slide, mention was made of using SAS.
- If you plan to do record linkage using Microsoft Access without complex Visual Basic code, DON'T! The same applies to other relational database software.
- Linkages based solely on straightforward JOINs will allow significant error to remain in your matched results.
55 An Even Stronger Word of Caution
- If you plan to conduct a deterministic match using a single identifying variable, or requiring a match on that variable together with others, DON'T!
- A good example of this is the Social Security Number.
- On the other hand, once you have linked records, assigning a common identifier to both datasets will facilitate future data processing.
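One way to see why a single identifier is risky is to look for identifier values shared by records that disagree on other fields. The Python sketch below is an illustration under assumed field names (`ssn`, `dob`, `sex`) and made-up values; it is not a complete identifier audit.

```python
from collections import defaultdict

# Hypothetical records; all values are fabricated for illustration.
records = [
    {"id": 1, "ssn": "111-11-1111", "dob": "1975-02-03", "sex": "F"},
    {"id": 2, "ssn": "111-11-1111", "dob": "1975-02-03", "sex": "F"},
    {"id": 3, "ssn": "222-22-2222", "dob": "1980-07-09", "sex": "F"},
    {"id": 4, "ssn": "222-22-2222", "dob": "1999-01-15", "sex": "M"},
]

def conflicting_ssns(rows, check_fields=("dob", "sex")):
    """Return SSNs that appear with more than one distinct dob/sex profile."""
    profiles = defaultdict(set)
    for row in rows:
        profiles[row["ssn"]].add(tuple(row[f] for f in check_fields))
    return sorted(ssn for ssn, seen in profiles.items() if len(seen) > 1)

conflicts = conflicting_ssns(records)
```

Records 1 and 2 share an SSN and agree on the check fields, so they are plausibly the same person; records 3 and 4 share an SSN but conflict, so a match on SSN alone would link two different people.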
56 And now, back to our regular program . . .
57 Mother's information
- Birth Certificate     Newborn Screen
- Birth_mom_legal       Screen_mom_legal_last
- Birth_mom_mid         Screen_mom_mid
- Birth_mom_first       Screen_mom_first
- Birth_mother_dob      Screen_mother_dob
58 Infant's information
- Birth Certificate     Newborn Screen
- Birth_child_last      Screen_child_last
- Birth_child_mid       Screen_child_mid
- Birth_child_first     Screen_child_first
- Birth_gender          Screen_gender
- Birth_child_dob       Screen_date
59 Other information
- There could also be related fields that don't specifically identify the individual, but are useful for record linkage
- Birth Certificate     Newborn Screen
- Birth_zip_code        Screen_zip_code
- Birth_hosp            Screen_hosp
60 Missing data
- Look for missing data in linkage variables.
- What do you do when you find it?
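A first look at missingness can be as simple as counting blank or absent values per linkage variable. This Python sketch uses the birth certificate variable names from the preceding slides; the rows and the `missing_counts` helper are hypothetical.

```python
def missing_counts(rows, fields):
    """Count rows where each field is absent, None, or blank."""
    counts = {f: 0 for f in fields}
    for row in rows:
        for f in fields:
            value = row.get(f)
            if value is None or str(value).strip() == "":
                counts[f] += 1
    return counts

# Hypothetical birth certificate rows.
births = [
    {"Birth_mom_first": "ANNA", "Birth_mother_dob": "1980-01-02", "Birth_zip_code": "35205"},
    {"Birth_mom_first": "", "Birth_mother_dob": "1978-05-06", "Birth_zip_code": None},
    {"Birth_mom_first": "MARIA", "Birth_mother_dob": " ", "Birth_zip_code": "35233"},
    {"Birth_mom_first": "JOAN", "Birth_mother_dob": "1985-09-10", "Birth_zip_code": ""},
]

counts = missing_counts(births, ["Birth_mom_first", "Birth_mother_dob", "Birth_zip_code"])
```

What to do with the affected records (exclude the variable from early passes, or route the records to a later, looser pass) remains an analyst decision.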
61 Duplicate records
- Look for records that share the same values for your vector of matching variables.
- What do you do when you find records that share these values?
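A small Python sketch of the duplicate check: group records by the vector of matching variables and report any vector that occurs more than once. The rows and the `duplicate_keys` helper are hypothetical.

```python
from collections import Counter

def duplicate_keys(rows, fields):
    """Return each combination of matching-variable values seen more than once."""
    tally = Counter(tuple(row[f] for f in fields) for row in rows)
    return {key: n for key, n in tally.items() if n > 1}

# Hypothetical rows; the two SMITH records could be twins or a true duplicate.
births = [
    {"Birth_child_last": "SMITH", "Birth_child_dob": "2002-01-05"},
    {"Birth_child_last": "SMITH", "Birth_child_dob": "2002-01-05"},
    {"Birth_child_last": "JONES", "Birth_child_dob": "2002-02-10"},
]

dups = duplicate_keys(births, ["Birth_child_last", "Birth_child_dob"])
```

A shared key is not proof of a duplicate record. Multiple births are the obvious counterexample, which is why the flagged groups need review rather than automatic deletion.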
62 Ranking of linkage variables
- Which variables are the best variables?
- How much data are missing in each variable?
- What do you know about the variables?
- How do you decide?
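One crude way to start ranking candidate variables is to profile each for completeness and for how many distinct values it takes. The Python sketch below is a heuristic under assumed field names and made-up rows; it does not replace subject-matter knowledge about how each variable was collected.

```python
def profile(rows, fields):
    """Rank fields: most complete first, ties broken by more distinct values."""
    n = len(rows)
    out = []
    for f in fields:
        present = [r.get(f) for r in rows if r.get(f) not in (None, "")]
        out.append({
            "field": f,
            "pct_missing": 100.0 * (n - len(present)) / n,
            "distinct": len(set(present)),
        })
    return sorted(out, key=lambda p: (p["pct_missing"], -p["distinct"]))

# Hypothetical rows.
births = [
    {"Birth_child_dob": "2002-01-05", "Birth_gender": "F", "Birth_zip_code": "35205"},
    {"Birth_child_dob": "2002-01-06", "Birth_gender": "M", "Birth_zip_code": ""},
    {"Birth_child_dob": "2002-02-10", "Birth_gender": "F", "Birth_zip_code": "35209"},
    {"Birth_child_dob": "2002-03-11", "Birth_gender": "M", "Birth_zip_code": "35233"},
]

ranking = [p["field"] for p in profile(births, ["Birth_gender", "Birth_zip_code", "Birth_child_dob"])]
```

Here date of birth ranks first (complete, four distinct values), sex second (complete, but only two values), and ZIP code last because a quarter of its values are missing.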
63 The art of creating a linkage algorithm
- Use the most discriminating combination of variables first
- Loosen criteria as you go along
64 The art of creating a linkage algorithm
Most strict criteria
Linkage step 1
Linkage step 2
Linkage step 3
Least strict criteria
Linkage step
65 Create id in data set
- Allows you to easily merge back with original data
- Easy as:
    data new;
      set old;
      id = _n_;   /* sequential record number within the input file */
    run;
66 Sort by chosen linkage variables
- What happens when you don't use BY variables? For example:
    DATA LINKED;
      MERGE BCERT MED;
    RUN;
- Be sure you unduplicate the output file (i.e., NODUPKEY option in PROC SORT)
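In SAS, a MERGE with no BY statement performs a one-to-one merge: observations are paired by their position in each file, not by any identifying value. The following Python sketch is a loose analogy, with hypothetical records, contrasting positional pairing with pairing on a key.

```python
from itertools import zip_longest

# Hypothetical files, each sorted by name; MED has no record for ADAMS.
bcert = [{"name": "ADAMS"}, {"name": "BAKER"}, {"name": "CHAN"}]
med = [{"name": "BAKER", "claim": "M-17"}, {"name": "CHAN", "claim": "M-42"}]

# Positional pairing (analogous to merging without BY variables).
positional = [
    (b["name"] if b else None, m["name"] if m else None)
    for b, m in zip_longest(bcert, med)
]

# Key-based pairing (analogous to merging BY name).
med_by_name = {m["name"]: m for m in med}
keyed = [
    (b["name"], med_by_name[b["name"]]["claim"])
    for b in bcert
    if b["name"] in med_by_name
]
```

With positional pairing, ADAMS is paired with BAKER's claim and BAKER with CHAN's; the output has the right number of rows and no error is raised, which is what makes the mistake easy to miss.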
67 Merge by chosen linkage variables
- Create data set with only linked records
- Keep track of the link level (the level of linkage at which records matched)
- Don't discard records that fail to match at each step
- Consider allowing full replacement prior to running each new iteration
68 Re-merge to get unlinked datasets
- Unlinked data sets contain only variables from that data set
- Unlinked records sent to next level of linkage algorithm
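The stepwise procedure on the last few slides (strictest criteria first, record the link level, pass unlinked records to the next step) can be sketched in Python as follows. This is an illustration under stated assumptions: every record carries an `id`, only unambiguous one-to-one key matches are accepted, keys with missing values are never used, and the sample data are made up. The variable names follow the earlier slides.

```python
def index_by(records, fields):
    """Group records by their key on the given fields, skipping missing values."""
    idx = {}
    for rec in records:
        key = tuple(rec.get(f) for f in fields)
        if None in key or "" in key:
            continue                      # never link on a missing value
        idx.setdefault(key, []).append(rec)
    return idx

def link_in_passes(file_a, file_b, passes):
    """Run linkage passes in order; each pass is (fields_a, fields_b)."""
    links, rest_a, rest_b = [], list(file_a), list(file_b)
    for level, (fields_a, fields_b) in enumerate(passes, start=1):
        idx_a, idx_b = index_by(rest_a, fields_a), index_by(rest_b, fields_b)
        for key, recs_a in idx_a.items():
            recs_b = idx_b.get(key, [])
            if len(recs_a) == 1 and len(recs_b) == 1:   # unambiguous pairs only
                links.append({"a_id": recs_a[0]["id"],
                              "b_id": recs_b[0]["id"],
                              "link_level": level})
        linked_a = {l["a_id"] for l in links}
        linked_b = {l["b_id"] for l in links}
        rest_a = [r for r in rest_a if r["id"] not in linked_a]   # unlinked go on
        rest_b = [r for r in rest_b if r["id"] not in linked_b]
    return links, rest_a, rest_b

births = [
    {"id": "B1", "Birth_mom_legal": "SMITH", "Birth_mother_dob": "1980-01-02", "Birth_zip_code": "35205"},
    {"id": "B2", "Birth_mom_legal": "JONES", "Birth_mother_dob": "1978-05-06", "Birth_zip_code": "35209"},
    {"id": "B3", "Birth_mom_legal": "LEE", "Birth_mother_dob": "1985-09-10", "Birth_zip_code": "35233"},
]
screens = [
    {"id": "S1", "Screen_mom_legal_last": "SMITH", "Screen_mother_dob": "1980-01-02", "Screen_zip_code": "35205"},
    {"id": "S2", "Screen_mom_legal_last": "JONES", "Screen_mother_dob": "1978-05-06", "Screen_zip_code": "35244"},
    {"id": "S3", "Screen_mom_legal_last": "GARCIA", "Screen_mother_dob": "1990-11-12", "Screen_zip_code": "35222"},
]
passes = [
    (("Birth_mom_legal", "Birth_mother_dob", "Birth_zip_code"),
     ("Screen_mom_legal_last", "Screen_mother_dob", "Screen_zip_code")),
    (("Birth_mom_legal", "Birth_mother_dob"),
     ("Screen_mom_legal_last", "Screen_mother_dob")),
]

links, unlinked_births, unlinked_screens = link_in_passes(births, screens, passes)
```

B1 links to S1 at level 1; B2 links to S2 only at level 2, after ZIP code is dropped from the criteria; B3 and S3 remain unlinked and are kept for investigation.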
69 Last step
- Combine all linked data sets
- Investigate unlinked records
- Look for systematic errors responsible for non-linking
- Look for biases
- Evaluate quality of links in linked records
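One simple screen for systematic non-linking is to compare the proportion of records linked across subgroups. The Python sketch below assumes each record has an `id` and a grouping field (here `Birth_hosp`); the data and the helper are hypothetical.

```python
from collections import defaultdict

def link_rate_by_group(records, linked_ids, group_field):
    """Proportion of records linked within each value of group_field."""
    totals, linked = defaultdict(int), defaultdict(int)
    for rec in records:
        group = rec[group_field]
        totals[group] += 1
        if rec["id"] in linked_ids:
            linked[group] += 1
    return {group: linked[group] / totals[group] for group in totals}

# Hypothetical birth records and the set of IDs that linked.
births = [
    {"id": "B1", "Birth_hosp": "H01"},
    {"id": "B2", "Birth_hosp": "H01"},
    {"id": "B3", "Birth_hosp": "H02"},
    {"id": "B4", "Birth_hosp": "H02"},
]
rates = link_rate_by_group(births, {"B1", "B2", "B3"}, "Birth_hosp")
```

A markedly lower rate for one hospital, region, or demographic group suggests a reporting or data-entry difference that could bias analyses of the linked file.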
70 Probabilistic Record Linkage
- Uses probabilities to determine whether a pair of records refers to the same individual
- Calculates weights to quantify the likelihood that a pair of records is a true match
- Computationally intensive: each record in one dataset is compared with every record in the other dataset
- Probabilistic weights may be either non-specific or value-specific
71 General (Non-Specific) Weights
- Agreement on a specific variable
- Example
- Agreement on date of birth receives a higher weight than agreement on sex
- Disagreement on sex receives a higher penalty than disagreement on date of birth
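The slides do not give a formula for these weights. A common formulation, from the Fellegi-Sunter framework, derives them from two probabilities per variable: m, the chance the variable agrees among true matches, and u, the chance it agrees among non-matches. The Python sketch below assumes that formulation; the m and u values are illustrative guesses, not estimates from any dataset.

```python
import math

def general_weights(m, u):
    """Return (agreement weight, disagreement weight) as base-2 log ratios."""
    agree = math.log2(m / u)
    disagree = math.log2((1 - m) / (1 - u))
    return agree, disagree

# Assumed values: date of birth rarely agrees by chance; sex agrees by
# chance about half the time but is rarely recorded wrongly.
PARAMS = {"date_of_birth": (0.95, 0.003), "sex": (0.98, 0.5)}

weights = {field: general_weights(m, u) for field, (m, u) in PARAMS.items()}
```

Under these assumptions, agreement on date of birth earns a much larger positive weight than agreement on sex, while disagreement on sex earns the larger penalty, matching the pattern described on the slide.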
72 Value Specific Weights
- Agreement on a specific value of the variable being compared
- Example: Comparing initials using value-specific weights
- Agreement on initial Z receives higher weight than agreement on initial S
- Disagreement on initial S receives higher penalty than disagreement on Z
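A value-specific agreement weight can be approximated by replacing the overall chance-agreement probability with the relative frequency of the particular value, so that rare values earn more credit. The Python sketch below covers agreement weights only and assumes a single m probability of 0.95 and a made-up distribution of initials.

```python
import math
from collections import Counter

def value_specific_agreement_weights(values, m=0.95):
    """Agreement weight per value: log2(m / relative frequency of the value)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: math.log2(m / (n / total)) for value, n in counts.items()}

# Hypothetical file in which S is a common initial and Z a rare one.
initials = ["S"] * 90 + ["Z"] * 10

weights = value_specific_agreement_weights(initials)
```

In practice the frequencies would come from the files being linked or from a reference population, and the resulting weights are only as good as those frequency estimates.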
73 Benefit of Weights
- Weights objectively reflect our confidence in a match
- Individual choice in cutting off low weights
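To make the cutoff decision concrete, the Python sketch below sums per-field weights for a candidate pair and classifies the total against two thresholds, with a review zone in between. The weights, the thresholds, and the field names are all assumptions chosen for illustration; where to place the cutoffs is exactly the analyst's choice the slide refers to.

```python
# Hypothetical (agreement, disagreement) weights per field.
WEIGHTS = {
    "dob": (8.3, -4.3),
    "sex": (1.0, -4.6),
    "last_name": (6.5, -3.0),
}

def pair_weight(rec_a, rec_b, weights=WEIGHTS):
    """Sum agreement or disagreement weights; missing values contribute nothing."""
    total = 0.0
    for field, (agree, disagree) in weights.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a in (None, "") or b in (None, ""):
            continue
        total += agree if a == b else disagree
    return total

def classify(total, lower=0.0, upper=12.0):
    """Assumed cutoffs: at or above upper is a link, at or below lower is not."""
    if total >= upper:
        return "link"
    if total <= lower:
        return "non-link"
    return "review"

a = {"dob": "1980-01-02", "sex": "F", "last_name": "SMITH"}
same = {"dob": "1980-01-02", "sex": "F", "last_name": "SMITH"}
renamed = {"dob": "1980-01-02", "sex": "F", "last_name": "JONES"}
other = {"dob": "1990-11-12", "sex": "M", "last_name": "GARCIA"}

results = [classify(pair_weight(a, b)) for b in (same, renamed, other)]
```

The pair that disagrees only on last name lands in the review zone, which is where a name change would typically show up.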
74 Probabilistic Linkage Methods
- Some SAS programmers write their own probabilistic code
- Software packages
- Very expensive
- Difficult to use
- Some applications are available as freeware or shareware
75 Choosing Probabilistic Software
- Links: same as LinkPro, but freeware
- Link the King
- Link Plus (CDC: http://www.cdc.gov/cancer/npcr/tools/registryplus/lp.htm)
- FEBRL: also freeware and open source; has a steep learning curve
76 Linkage Evaluation
- A significant advantage of probabilistic methods is that evaluation of the linkage results is an explicit step in the methodology.
- The analyst must determine what level of tolerance will be applied for acceptance of a matched pair of records.
77 Document, Document, Document
- Even if you plan to remain in your current job for the next 30 years, the importance of careful documentation in programs, output, data dictionaries, and reports cannot be stressed strongly enough.
- Retain statistical program logs, keep track of the provenance of input datasets, and document all decisions made concerning methods and their application.
78 Data Warehouses
- Be wary of warehouses, lest you fall into the trap of believing they are all things to all people.
- More specifically:
- When linkages within the warehouse are made solely on the basis of unique identifiers, caveat emptor.
- Always ask the question of how the linkages for the warehouse were done, and more importantly, for what purpose.
79 Data Warehouses (continued)
- The term "data warehouse" means different things to different people.
- For some, it's a perfect one-to-many/many-to-one linkage repository
- For others, it's a library of databases containing records of unknown or untested relationship to one another
- For still others, it is a "Swiss cheese" data cube in which some regions are fully populated and linked across data sources, while others contain data measured at differing levels of aggregation, while others contain unlinked records, while still others are empty
80 And finally, one more time . . .
81 Evaluate before you analyze
- Don't assume the linkage has been done correctly, whether you did it yourself or it was done by someone else.
- Each time the linkage is done the results must be evaluated, whether you use deterministic or probabilistic linkage algorithms.
- Compare values on non-linkage variables as well as those used to conduct the linkage, across all observations in the dataset.
- Create pairwise linkage scores and throw out linkages between records that don't meet your minimum criteria.
- If you publish reports or submit manuscripts, it is imperative that information on how the linkage was done and how the results were evaluated prior to analysis be included in your methods.
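A minimal Python sketch of a pairwise check on linked records: count agreements on fields that were not used to make the link, and set aside pairs that fall below a minimum. The field pairs follow the variable names from the earlier slides; the records, the scoring rule, and the minimum of 2 are assumptions.

```python
CHECK_FIELDS = [
    ("Birth_hosp", "Screen_hosp"),
    ("Birth_zip_code", "Screen_zip_code"),
    ("Birth_gender", "Screen_gender"),
]

def agreement_score(rec_a, rec_b, field_pairs=CHECK_FIELDS):
    """Count non-missing check fields on which the two records agree."""
    return sum(
        1 for fa, fb in field_pairs
        if rec_a.get(fa) not in (None, "") and rec_a.get(fa) == rec_b.get(fb)
    )

def screen_links(pairs, min_score):
    """Split linked pairs into those meeting the minimum score and those not."""
    kept, rejected = [], []
    for a, b in pairs:
        target = kept if agreement_score(a, b) >= min_score else rejected
        target.append((a["id"], b["id"]))
    return kept, rejected

# Hypothetical linked pairs.
pairs = [
    ({"id": "B1", "Birth_hosp": "H01", "Birth_zip_code": "35205", "Birth_gender": "F"},
     {"id": "S1", "Screen_hosp": "H01", "Screen_zip_code": "35205", "Screen_gender": "F"}),
    ({"id": "B2", "Birth_hosp": "H02", "Birth_zip_code": "35209", "Birth_gender": "M"},
     {"id": "S2", "Screen_hosp": "H07", "Screen_zip_code": "35244", "Screen_gender": "F"}),
]

kept, rejected = screen_links(pairs, min_score=2)
```

Rejected pairs are worth reading individually before they are dropped, since a low score can reflect a move or a data-entry error rather than a false link.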
83 But we've always done it this way . . . (or, close enough for government work)
- Why do the linkages once a year?
- Consider building linkages into the routine processing of records as they are filed or reported.
- Even if linkages are done annually, consider creating a database in which links across subjects can cross reporting years. This can result in a self-correcting feedback loop that enables additional unmatched records to be linked later on the basis of more current information.
84 THE TEN COMMANDMENTS OF RECORD LINKAGE
With apologies to Mel Brooks, and thanks for editorial assistance to Elizabeth Kirby and for their insights to the following Internet contributors: Jane Lazar, Boston University; Craig Mason, University of Maine; Russel Rickard, Colorado Department of Health and Environment; Greg Alexander, University of Alabama at Birmingham
R.S. Kirby, November 2003
85 The Ten Commandments of Record Linkage
Number 10
Thou shalt not taketh the name of thine software in vain.
86 The Ten Commandments of Record Linkage
Number 9
Thou shalt not covet thy neighbor's database, yet neither should thou hoardeth thine database.
87 The Ten Commandments of Record Linkage
Number 8
Know thy purpose (in doing record linkage).
88 The Ten Commandments of Record Linkage
Number 7
Thou shalt not merge without BY variable(s).
89 The Ten Commandments of Record Linkage
Number 6
Thou shalt checketh thine statistical software log before thou proceedeth to thy next step or process.
90 The Ten Commandments of Record Linkage
Number 5
Thou shalt protect the privacy of those whose information is recorded in thy databases, even as thou useth their personal identifiers to conduct thine linkage analyses.
91 The Ten Commandments of Record Linkage
Number 4
Thou shalt not bear false witness against the inconsistent values of variables common to two datasets, nor because thou faileth to evaluate thine linkage results.
92 The Ten Commandments of Record Linkage
Number 3
Know thy data.
93 The Ten Commandments of Record Linkage
Number 2
Thou shalt not underestimate the complexity, time commitment, and staffing required to conduct thine record linkage projects, nor shalt thou overestimate the time needed to conduct thine analyses.
94 The Ten Commandments of Record Linkage
Number 1
Thou shalt show humility to others, even to those who doubted that the tasks thou hast accomplished could be done.
95 The life which is unexamined is not worth living
96 The database which is unexamined is not worth analyzing
97 Contact Information
- Russell S. Kirby, PhD, MS, FACE
- Department of Maternal and Child Health, School of Public Health, University of Alabama at Birmingham
- Email: rkirby_at_uab.edu
- Telephone: 205-934-2985
99 What is reality?
100 CONTROLLING THE URGE TO MERGE: DIAGNOSIS AND TREATMENT OF A NEW CLINICAL PSYCHOSIS AFFECTING PUBLIC HEALTH WORKERS AND RESEARCHERS
- Russell S. Kirby, Ph.D., M.S., F.A.C.E.
- Originally described Dec. 1996, revised at UAB Nov. 2002
102 Impulse-Control Disorders Not Elsewhere Classified (269)
- 312.34 Intermittent Explosive Disorder (269)
- 312.32 Kleptomania (269)
- 312.33 Pyromania (270)
- 312.31 Pathological Gambling (271)
- 312.39 Trichotillomania (272)
- 312.35 Urge to Merge (272)
- 312.30 Impulse-Control Disorder NOS (272)
103 Impulse-Control Disorders Not Elsewhere Classified (269)
- 312.35 Urge to Merge
- A. Recurrent failure to resist impulses to link public health and/or clinical medical records that result in ill-conceived, often unscientific linkage strategies and linked files which may be inappropriate for the research purposes for which they were created.
- B. The urge to merge manifested by researchers and analysts is often stimulated by external forces (administrators) but is grossly out of proportion to any precipitating bureaucratic stressors.
- C. The urge to merge is not better accounted for by Conduct Disorder, Manic Episode, Substance Dependence, or Antisocial Personality Disorder.
104 Some clinical features of the urge to merge psychosis
- subject observed constantly mumbling about the need for a unique identifier
- subject suffers from multiple tools disorders (see DSM-4R for diagnostic criteria), e.g.
- if Access doesn't work, subject tries SAS
- if direct importation doesn't work, subject converts files to spreadsheets first, then into statistical file formats
105 Some clinical features of the urge to merge psychosis (continued)
- subject given to making grandiose statements, e.g.
- if you can't drill down, then roll up
- linked files are data rich and information poor
- electronic data rules; paper is for illiterates
- subject often forgets why research projects are being done, as the linkage task becomes both primary and primal
106 If this is you . . .
- There is hope.
- Join the national community of LA (Linkers Anonymous) and practice its iterative twelve-step algorithm.
- Talk to your colleagues and co-workers; in time, they may come to understand, or at least become more tolerant.
- Remember, you don't have to go it alone!