DIMACS Working Group on Privacy / Confidentiality of Health Data


Slides: 52
Provided by: quint2


Transcript and Presenter's Notes

Title: DIMACS Working Group on Privacy / Confidentiality of Health Data

  • DIMACS Working Group on Privacy / Confidentiality
    of Health Data
  • Rutgers University Center
  • Piscataway, New Jersey
  • December 10-12, 2003

Health Care Databases under HIPAA: Statistical
Approaches to De-identification of Protected
Health Information
  • Judith E. Beach, Ph.D., Esq.
  • Associate General Counsel, Regulatory Affairs
  • Chief Privacy Officer
  • Chair, Council on Data Protection and Council on
    Research Ethics

  • 1. Evolution of De-identification Standards in
    HIPAA Privacy Regulation
  • 2. De-identification Standards for Health
    Information in Research
  • a. Safe Harbor
  • b. Statistician Method
  • HIPAA Provisions
  • Quintiles Experience and Methodology
  • c. Limited Data Set
  • 3. Preemption of State Laws on De-identification
    Standards for Health Information
  • 4. Health Information Privacy - Cases and
    Controversies

Evolution of De-Identification Standards in
HIPAA Privacy Regulation
Federal Policy: De-Identification of Health
Information
  • Government's intent: to balance stringent
    standards with enough flexibility that they are
    not a disincentive to using or disclosing
    de-identified health information, wherever
    possible.
  • De-identified health data is one of the best
    mechanisms for avoiding wrongful disclosure of
    Protected Health Information (PHI).
  • See Draft (05/27/03) DHHS Policy and Procedure
    Manual, De-Identification Policy d11 (effective
    date 6/1/03) - applies to DHHS agencies'
    HIPAA-covered health care components and internal
    business associates

Federal Policy: Use of De-identified Health Data
Rather than PHI for Research
  • "We [HHS] expressed the hope that covered
    entities, their business associates and others
    would make greater use of de-identified health
    information . . . when it is sufficient for the
    research purpose and that such practice would
    reduce the burden and the confidentiality
    concerns that result from the use of individually
    identifiable health information for some of these
    purposes." HHS, in final privacy rule, 65 Fed.
    Reg. at 82543 (Dec. 28, 2000), citing proposed
    privacy rule of Nov. 3, 1999

HIPAA's Jurisdiction
  • Individually Identifiable Health Information
  • A subset of health information, including
    demographic information, that identifies the
    individual or with respect to which there is a
    reasonable basis to believe the information can
    be used to identify the individual
  • Protected health information (PHI)
  • Means individually identifiable health
    information (IIHI = Health Information +
    Identifier) that is transmitted or maintained
    electronically, or transmitted or maintained in
    any other form or medium
  • An investigator who submits health claims would
    be a HIPAA covered entity (CE)
  • CE + Health Information + Identifier = PHI
  • CE + Identifier - Health Information = NOT PHI
  • Health Information + Identifier - CE = NOT PHI
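
The three combinations on this slide reduce to a single conjunction: information is PHI only when it is health information, carries an identifier, and is held by a covered entity. A minimal Python sketch of that rule follows. The `Record` type and its three flag names are hypothetical, chosen for illustration only, and the sketch ignores every other condition in the Privacy Rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    # All three field names are illustrative assumptions, not regulatory terms.
    held_by_covered_entity: bool
    has_health_information: bool
    has_identifier: bool


def is_phi(record: Record) -> bool:
    """Return True only when all three conditions from the slide hold."""
    return (
        record.held_by_covered_entity
        and record.has_health_information
        and record.has_identifier
    )
```

Dropping any one of the three flags makes `is_phi` return `False`, which mirrors the two "NOT PHI" lines of the slide.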

De-identification Standards for Health
Information in Research
De-identified Health Information
  • Definition: health information that does not
    identify an individual and with respect to which
    there is no reasonable basis to believe that the
    information can be used to identify an
    individual. 45 CFR 164.514(a)
  • The Privacy Rule permits de-identification of PHI
    so that such information may be used and
    disclosed freely, without being subject to the
    Privacy Rule's requirements.
  • Once de-identified, the data is out of the
    Privacy Rule.

HIPAA De-identification Standards
  • Two methods for the de-identification of health
    information
  • Safe Harbor -- remove 18 specified identifiers
    - intended to provide a simple, definitive method
    for de-identifying health information with
    protection from litigation
  • Statistician Method -- retain some of the 18
    identifiers specified in the safe harbor; the
    standard is met if a person with
    appropriate knowledge of and experience with
    generally accepted statistical and scientific
    principles and methods, e.g., a biostatistician,
    determines and documents that the risk of
    re-identification is very small.
  • 45 CFR 164.514

Limited Data Set
  • Final rule added another method requiring
    removal of facial identifiers -- Limited Data
    Set
  • Under confidentiality agreements - for research,
    public health, and health care operations
  • Regarded as PHI - NOT de-identified
  • Therefore, still subject to Privacy Rule
    requirements such as the minimum necessary rule.

Safe Harbor Method
Safe Harbor
  • Covered entities must remove all of a list of 18
    enumerated identifiers and have no actual
    knowledge that the information remaining could be
    used alone or in combination to identify a
    subject of the information.
  • The identifiers to be removed include
  • direct identifiers such as name, address, SSN
  • indirect identifiers such as birth date,
    admission and discharge dates, and five-digit zip
    code
  • 45 CFR 164.514(b)(2)

Safe Harbor
  • The safe harbor does allow for the disclosure of
  • All geographic subdivisions no smaller than a
    State, as well as the initial three digits of a
    zip code
  • IF the geographic unit formed by combining all
    zip codes with the same initial three digits
    contains more than 20,000 people
  • Age, if less than 90, as well as gender,
    ethnicity and other demographic information not
    listed.

Safe Harbor's 18 Identifiers
  • Names
  • All geographic subdivisions smaller than a State,
    including street address, city, county, precinct,
    zip code, and their equivalent geocodes
  • Except for the initial three digits of a zip code
    if, according to the currently available data from
    the Bureau of the Census
  • The geographic unit formed by combining all zip
    codes with the same three initial digits contains
    more than 20,000 people, and
  • The initial three digits of a zip code for all
    such geographic units containing 20,000 or fewer
    people are changed to 000
  • All elements of dates (except year) for dates
    directly related to an individual, including
  • birth date, admission date, discharge date, date
    of death
  • and all ages over 89 and all elements of dates
    (including year) indicative of such age, except
    that such ages and elements may be aggregated
    into a single category of age 90 or older
  • Telephone numbers
  • Fax numbers
  • Electronic mail addresses
  • Social security numbers
  • Medical record numbers
  • Health plan beneficiary numbers
  • Account numbers
  • Certificate/license numbers
  • Vehicle identifiers and serial numbers, including
    license plate numbers
  • Device identifiers and serial numbers
  • Web Universal Resource Locators (URLs)
  • Internet Protocol (IP) address numbers
  • Biometric identifiers, including finger and voice
    prints
  • Full face photographic images and any comparable
    images, and
  • Any other unique identifying number,
    characteristic, or code.
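
The zip code and age rules in this list are mechanical enough to sketch in code. The Python example below is an illustration under stated assumptions, not an official implementation: the function names, the `zip3_population` lookup table, and the `"90+"` label are all hypothetical, and a prefix missing from the table is treated conservatively as if its population were zero.

```python
from typing import Mapping

# Threshold taken from the slide: the 3-digit area must hold MORE than 20,000 people.
SAFE_HARBOR_ZIP3_MIN_POP = 20_000


def safe_harbor_zip(zip5: str, zip3_population: Mapping[str, int]) -> str:
    """Keep the first three zip digits only if the area is populous enough."""
    zip3 = zip5[:3]
    # Assumption: unknown prefixes count as population 0 and are masked.
    if zip3_population.get(zip3, 0) > SAFE_HARBOR_ZIP3_MIN_POP:
        return zip3
    return "000"


def safe_harbor_age(age: int) -> str:
    """Aggregate all ages over 89 into a single '90 or older' category."""
    return "90+" if age >= 90 else str(age)
```

For example, with a hypothetical table `{"089": 150_000, "036": 12_000}`, the zip code `08901` keeps its prefix `089`, while `03601` is masked to `000`.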

Sources of Authority
  • In the Privacy Rule Preamble, HHS recognizes two
    sources of authority as to what constitutes such
    principles and methods for de-identification
    adequate for posting a de-identified database on
    the Internet. 65 Fed. Reg. at 82,709-82,710 (Dec.
    28, 2000)
  • Paper 22: Statistical Policy Working Paper
    22, Report on Statistical Disclosure Limitation
    Methodology
  • The Checklist: Checklist on Disclosure
    Potential of Proposed Data Releases - intended
    primarily for use in the development of
    public-use data products.

Safe Harbor
  • BUT many researchers and other groups have
    complained that the Safe Harbor renders the
    de-identified data virtually useless for
    research, so that the result will be MORE research
    using PHI.
  • No dates of service, no patient initials, no date
    of birth
  • Can have deltas such as number of patient
    visits over time
  • However, the safe harbor was NOT designed for
    research, but to provide an approved method of
    de-identification for any purpose by any covered
    entity, regardless of sophistication.
  • For instance, such de-identified data would be
    deemed to be safely posted on the Internet.

Statistician Method
Statistician Method
  • For this method, the covered entity
  • must remove all direct identifiers
  • reduce the number of variables on which a match
    might be made
  • should limit the distribution of records through
    a data use agreement or restricted access
  • 65 Fed. Reg. at 82,709-710 (Dec. 28, 2000)

Opinion of Statistician
  • Statistician must
  • determine that there is a very small risk of
    re-identification
  • after applying generally accepted statistical
    and scientific principles and methods for
    rendering information not individually
    identifiable
  • document the methods and results of the analysis
    that justify such determination.
  • 45 CFR 164.514(b)(1)

Statistician Method
  • This method has been generally ignored by covered
    entities
  • Who prefer a safe harbor approach, with "safe"
    being the operative word.
  • Consider the Statistician alternative as too

Statistician Method: Quintiles Experience
  • An expert statistician calculated the statistical
    likelihood of re-identification IF all 18 safe
    harbor identifiers were removed, that is, the
    de-identification probability.
  • Then, the statistician calculated the likelihood
    of re-identification if certain dates of service
    of medical or pharmacy claims were retained
  • And rather than age or year of birth, which is
    allowed in the safe harbor, the month and year of
    birth were included.

Statistician's Opinion
  • This calculated number, the de-identification
    probability, served as a benchmark of a "very
    small" risk of re-identification against which
    the statistician method would be compared.
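
The slides do not say how the statistician computed either probability. As a stand-in, the Python sketch below uses a simple and commonly used proxy, the share of records that are unique on a chosen set of quasi-identifiers, and compares a candidate release against a benchmark value. The function names and the field names in the usage note are hypothetical, and this proxy is an assumption for illustration, not the method described in the presentation.

```python
from collections import Counter
from typing import Mapping, Sequence


def uniqueness_rate(
    records: Sequence[Mapping[str, object]],
    quasi_identifiers: Sequence[str],
) -> float:
    """Fraction of records whose quasi-identifier combination occurs exactly once."""
    if not records:
        return 0.0
    keys = [tuple(record[q] for q in quasi_identifiers) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)


def meets_benchmark(candidate_rate: float, benchmark_rate: float) -> bool:
    """A candidate release passes if it is no riskier than the benchmark."""
    return candidate_rate <= benchmark_rate
```

In use, one would compute `uniqueness_rate` once on a safe-harbor version of the data (for example on `zip3`, `birth_year`, `sex`) to obtain the benchmark, then again on the candidate release that retains extra fields, and pass both values to `meets_benchmark`.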

Analysis: Comparison of Both Methods
  • To ensure the statistical likelihood of
    re-identification was comparable to that of the
    calculated safe harbor benchmark, the following
    data fields were made stricter than permitted
    by the safe harbor
  • For all patients older than 85 years of age
    (rather than 90), the year of their birth was
    modified to make them all 85 years old.
  • All five-digit patient zip codes were truncated to
    the first 3 digits and further merged so that no
    resulting 3-digit code has a total population of
    less than 200,000.
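
These two stricter rules can be sketched as follows. The age rule is a plain top-code at 85. The slides do not describe how 3-digit codes were merged, so the greedy merge below, which walks the codes in sorted order and closes a group once it reaches 200,000 people, is an assumption made for illustration; a real merge would more likely respect geographic adjacency. All names are hypothetical.

```python
from typing import Dict, List, Mapping

AGE_CAP = 85             # from the slide: patients older than 85 are recorded as 85
MIN_GROUP_POP = 200_000  # from the slide: no merged code below this population


def top_code_age(age: int, cap: int = AGE_CAP) -> int:
    """Record every age above the cap as the cap itself."""
    return min(age, cap)


def merge_zip3(
    zip3_population: Mapping[str, int],
    min_pop: int = MIN_GROUP_POP,
) -> Dict[str, str]:
    """Map each 3-digit code to the label of a merged group of at least min_pop people.

    Assumption: codes are merged greedily in sorted order; each group is
    labeled with its first code.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    current_pop = 0
    for zip3 in sorted(zip3_population):
        current.append(zip3)
        current_pop += zip3_population[zip3]
        if current_pop >= min_pop:
            groups.append(current)
            current, current_pop = [], 0
    if current:
        if groups:
            # Fold an undersized tail into the previous group.
            groups[-1].extend(current)
        else:
            # The whole table is below min_pop; the caller must handle this case.
            groups.append(current)
    return {zip3: group[0] for group in groups for zip3 in group}
```

With a hypothetical table `{"100": 250_000, "101": 50_000, "102": 180_000, "103": 30_000}`, code `100` stands alone and codes `101`, `102` and `103` are all reported as `101`.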

Factors Considered by Statistician
  • In the analysis, the statistician pointed out the
    following
  • The de-identified data received is conveyed under
    a confidentiality agreement, which specifically
    prohibits re-identification or further disclosure
    of the data except in statistically aggregated
    form
  • The database is maintained on a physically and
    technically secure, password-protected server.

Statistician's Opinion
  • "Applying generally accepted statistical and
    scientific principles and methods for rendering
    information not individually identifiable, . .
    . I conclude that the risk is very small that the
    information . . . could be used, alone or in
    combination with other reasonably available
    information, by an anticipated recipient to
    identify an individual who is a subject of the
    information. . . . In practice the actual
    reidentification probabilities are much, much
    lower . . . arguably de minimis."

Statistician Method
  • It is clear that most persons who have reviewed
    the Privacy Rule have failed to appreciate the
    significance of the statistician opinion to
    de-identification, and, instead, have focused
    almost exclusively on the "safe harbor."
  • In particular, many have failed to understand the
    importance of the "restricted access" as it
    relates to the statistician opinion approach to
    de-identification.

Ensuring HIPAA Compliance
All data handled is de-identified using a unique
patient identifier that is irreversibly encrypted.
[Diagram labels: Patient identifiable electronic
healthcare claims (standard health claims data
fields); Data Encryption Process; Data Warehouse;
De-identified data (zip 3 digit, DOB modified)]
Upon completion of the de-identification process
a unique patient identifier is created, which is
irreversibly encrypted.
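
The slides do not name the algorithm behind the irreversibly encrypted patient identifier. One common way to get a stable, non-reversible identifier is a keyed hash; the Python sketch below uses HMAC-SHA256 from the standard library as a stand-in. The function name, the choice of input fields, and the normalization step are all assumptions for illustration.

```python
import hashlib
import hmac
from typing import Sequence


def pseudonymous_patient_id(fields: Sequence[str], secret_key: bytes) -> str:
    """Derive a stable identifier from identifying fields with a keyed hash.

    Assumption: fields are normalized by trimming whitespace and lowercasing,
    so trivial formatting differences map to the same identifier.
    """
    message = "|".join(part.strip().lower() for part in fields).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()
```

The same patient always yields the same identifier, so claims can be linked over time, while the original fields cannot be read back from the identifier. The secret key must be kept away from data recipients, since anyone holding it could recompute identifiers for known individuals.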
Core Data Elements
July 98 - to date
Jan 98 - to date
Note: Payor Type not available on all records
Physician Demographics
  • Specialty
  • Region
  • Number of years in practice
  • Prescribing volume
  • Type of practice
  • Number of HMO / PPO / IPA affiliations
  • patient volume by insurance type
  • Physician race
  • Physician age

Patient Characteristics
  • Location of contact
  • Height and weight
  • Age
  • Gender
  • Race
  • Blood pressure
  • Cholesterol levels (total, HDL, LDL,
  • Insurance type
  • Physician reimbursement method (fee-for-service
    vs. capitation)
  • Smoker or non-smoker

Disease Entities
  • Visits (with and without drugs)
  • Visits per physician per year
  • Total patients seeking treatment
  • Newly diagnosed patients
  • Visit type (first vs. subsequent)
  • Referrals and referring specialty
  • Severity of condition
  • Tests ordered or completed during visit
  • Existing medical conditions not treated
  • Number of times seen and days since last visit
  • Number of patient drug requests for condition

Treatment Regimens
  • Dosage form, strength and signa
  • Formulary impact
  • Quantity prescribed and number of refills (mean
    and frequency)
  • Weighted diagnosis value
  • Dispensing instructions
  • Occurrences per physician per year
  • Therapy type
  • New
  • First-line versus adjunct therapy
  • Drug replacement and reason
  • Continued

Treatment Regimens
  • Desired action
  • Concomitant drugs (to treat same diagnosis)
  • Concurrent drugs (regardless of diagnosis)
  • Drug issuance
  • Sample days of therapy (mean and frequency)
  • Prescribed days of therapy (mean and frequency)
  • Daily average consumption (DACON)
  • Non-drug therapy

Limited Data Set (LDS)
HHS Solution: Limited Data Set
  • For research, public health, or health care
    operations purposes
  • Authorization not required
  • A limited data use agreement must be in place
    between the covered entity and the recipient of
    the limited data set (LDS). 45 CFR 164.514(e)
  • Data Use Agreements would only be needed for
    those public health, research, or health care
    operation uses and disclosures that are not
    otherwise permitted by federal or state laws.
    See Draft (05/27/03) DHHS Policy and Procedure
    Manual De-Identification Policy d11

  • Regarded as PHI, that is, not de-identified data
    and, therefore subject to requirements for
    protection of PHI such as
  • Prohibits re-identification or any attempt to
    contact individuals by recipient
  • BUT re-identification code permitted for covered
    entity
  • Subject to minimum necessary standards
  • BUT no accounting of disclosures or IRB approval

Limited Data Set Specifications
  • May be useful for records-based research such as
    epidemiological and other population research
  • But may NOT be useful for patient recruitment
  • Because re-identification of individuals, or any
    attempt to contact them, by a third party is
    prohibited, even by a researcher (without IRB or
    internal privacy board approval), unless the
    contact is made by the Covered Entity or the
    Covered Entity's Workforce.

LDS: Remove 16 Identifiers
  • Name
  • Postal address information (other than city,
    state, zip code)
  • Telephone number
  • Fax number
  • E-mail address
  • Social Security Number
  • Medical record / prescription numbers
  • Health plan beneficiary numbers
  • Account numbers
  • Certificate / license numbers
  • Vehicle identity / serial numbers
  • Device numbers
  • Web URL
  • IP address
  • Biometric identifiers (e.g., fingerprints,
    retinal scans)
  • Full face and similar photographic images

45 CFR 164.514(e)(2)
LDS: Retain Indirect Identifiers
  • Five-digit zip code
  • Dates of service (e.g., admission / discharge)
  • Dates of birth and death
  • Geographic subdivision (e.g., state, county,
    city, precinct), but not street address
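
Building a limited data set is, in effect, a field filter: drop the 16 direct identifiers and keep everything else, including the indirect identifiers listed here. A minimal Python sketch follows, assuming records are flat dictionaries. Every field name below is hypothetical; a real dataset would need its own mapping from columns to the 16 categories.

```python
from typing import Any, Dict, Mapping

# One hypothetical field name per identifier category on the "Remove 16" slide.
LDS_DIRECT_IDENTIFIER_FIELDS = frozenset({
    "name",
    "street_address",
    "telephone",
    "fax",
    "email",
    "ssn",
    "medical_record_number",
    "health_plan_beneficiary_number",
    "account_number",
    "certificate_license_number",
    "vehicle_identifier",
    "device_identifier",
    "url",
    "ip_address",
    "biometric_identifier",
    "full_face_photo",
})


def to_limited_data_set(record: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of the record without the direct-identifier fields."""
    return {
        field: value
        for field, value in record.items()
        if field not in LDS_DIRECT_IDENTIFIER_FIELDS
    }
```

A record with `name`, `ssn`, `zip5`, `admission_date` and `birth_date` would come out with only the last three fields. The result is still PHI and still requires a data use agreement.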

Statistical Method for Dummies
  • Limited Data Set . . .
  • the Statistician Method made easy.

Preemption of State Laws on De-identification
Standards for Health Information
Preemption of De-identification Standards - A View
  • HIPAA Statute and privacy regulation
  • Preemption of state law only if
  • The provision of state law relates to the privacy
    of individually identifiable health information
  • HIPAA Statute § 1178 and 45 CFR 160.202

Preemption of State Law: HIPAA Statute
  • Health information considered identifiable and,
    therefore, subject to all requirements of the rule
    ONLY if there is a reasonable basis to believe
    that the information can be used to identify the
    individual
  • Exception to preemption - when states can assert
    contrary and more stringent definition of
    individually identifiable health information
  • But exception analysis does not apply to
    de-identified data

Preemption: De-identification Standards
  • Thus, states would be preempted from enforcing a
    standard for de-identification that exceeds the
    reasonable basis definition of individually
    identifiable health information as established in
    the HIPAA statute.
  • Note: in response to Quintiles' written request,
    HHS responded by revising the preemption section
    of the Rule to refer to individually identifiable
    health information rather than merely health
    information

Privacy Cases and Controversies: De-identified
Health Databases
U.S. Controversy
  • Quintiles Transnational Corp. v. WebMD
  • No demonstrable violation of HIPAA or other
    privacy law by transmission and aggregation of
    deidentified health data
  • Inhibits additional state regulation of national
    electronic data system
  • Order of Judge Terrence Boyle.
  • Re de-identified data: the Dormant Commerce
    Clause prevents the individual states from
    regulating the interstate transmission of data.
  • No. 5:01-CV-180-BO(3), U.S. E.D.N.C., Western
    Division

UK Controversy
  • Regina v. Department of Health, Ex Parte Source
    Informatics Ltd., Judge Latham, 4 All ER 185, May
    29, 1999, Case No. CO/4490/97, Queen's Bench
  • Judge Latham dismissed applicants' application
    for a Declaration that a policy document issued
    in March 1996 by the Department of Health The
    Protection and Use of Health Information.

UK: Source Informatics Overturned on Appeal
  • Court of Appeal: Simon Brown, Aldous and
    Schiemann LJJ, 21 December 1999
  • Where a patient's identity was protected, it
    would not be a breach of confidence for general
    practitioners and pharmacists to disclose to a
    third party, without the patient's consent, the
    information contained in the patient's
    prescription form for marketing research

UK: Health and Social Care Bill, Clause 65
  • Department of Health included language in the
    Health and Social Care Bill that would have
    essentially reinstated the lower court's opinion
    (Judge Latham's)
  • After heavy lobbying in the House of Lords
    against Clause 65, the language was defeated.

The key is . . .
Safeguarding protected health information by
encouraging use of federal standards for
de-identification of health data for clinical