Dealing with confidential research information: anonymisation techniques and other measures to enable using and sharing research data

Provided by: Vee43

1
Dealing with confidential research information:
anonymisation techniques and other measures to
enable using and sharing research data
  • Data Management and Sharing workshop
  • Leeds and Essex, 11 March 2008

2
Using and sharing confidential research data
  • Requires a combination of:
    • discussing consent and confidentiality with
      participants / respondents (dialogue)
    • anonymisation of data
    • user access restrictions
      • e.g. a researchers-only use licence with a
        confidentiality agreement
  • What is required depends on:
    • nature of research
    • planned data uses
  • What is required is study-specific

3
Identity disclosure
  • A person's identity can be disclosed through:
    • direct identifiers
      • name, address, postcode, telephone number, voice,
        picture, etc.
      • usually NOT essential research information
        (administrative)
    • indirect identifiers: possible disclosure in
      combination with other information
      • occupation, geography, unique or exceptional
        values / characteristics, etc.
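
One way to see how indirect identifiers disclose identity in combination is to count how many records share each combination of values. The sketch below is illustrative only: the field names, the sample respondents and the threshold `k` are assumptions, not part of the workshop material.

```python
from collections import Counter

def risky_combinations(records, quasi_identifiers, k=2):
    """Return combinations of indirect identifiers shared by fewer than k records."""
    counts = Counter(
        tuple(record[name] for name in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}

# Hypothetical respondents; none of these values come from a real study.
respondents = [
    {"occupation": "teacher", "ward": "North", "age_band": "30-39"},
    {"occupation": "teacher", "ward": "North", "age_band": "30-39"},
    {"occupation": "vet", "ward": "North", "age_band": "30-39"},
    {"occupation": "teacher", "ward": "South", "age_band": "60-69"},
]

# Combinations held by a single respondent are candidates for aggregation.
flagged = risky_combinations(respondents, ["occupation", "ward", "age_band"])
for combo, n in flagged.items():
    print(combo, n)
```

Each variable may look harmless on its own; the combination that occurs only once is what would need generalising or restricting.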

4
Why anonymise data?
  • Ethical reasons
    • protect identity (sensitive, illegal,
      confidential info)
    • disguise research location
  • Commercial reasons
  • Legal reasons: protect personal data (Data
    Protection Act, DPA)

5
Essential points
  • Never disclose personal data (unless specific
    consent has been given)
  • Reasonable / appropriate level of anonymity
  • Maintain maximum meaningful info
  • Where possible replace rather than remove
  • Identifying info may provide context, so do not
    over-anonymise
  • Re-users of data have the same legal and ethical
    obligation to NOT disclose personal and
    confidential information

6
Anonymising quantitative data
  • Remove direct identifiers or replace them with a code
    • e.g. names, addresses, institutions
  • Reduce the precision of variables through aggregation
    • e.g. postcode sector vs full postcode, birth year
      vs date of birth, occupational categories
  • Generalise the meaning of text
    • e.g. occupational expertise
  • Restrict upper / lower ranges to hide outliers
    • e.g. income or age
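
The techniques on this slide can be sketched for a single survey record as follows. This is a minimal illustration under stated assumptions: the field names, the `R001`-style codes, the income ceiling and the sample record are all hypothetical.

```python
from datetime import date

def postcode_sector(postcode):
    """Reduce a full UK postcode to its sector, e.g. 'AB12 3CD' -> 'AB12 3'."""
    compact = postcode.replace(" ", "").upper()
    outward, inward = compact[:-3], compact[-3:]
    return f"{outward} {inward[0]}"

def top_code(value, ceiling):
    """Cap a numeric value at a ceiling so extreme outliers are hidden."""
    return min(value, ceiling)

def anonymise_record(record, codes, income_ceiling=100_000):
    """Return an anonymised copy of one record (hypothetical field names)."""
    # Replace the direct identifier with a stable code rather than removing it.
    code = codes.setdefault(record["name"], f"R{len(codes) + 1:03d}")
    return {
        "respondent": code,
        "postcode_sector": postcode_sector(record["postcode"]),
        "birth_year": record["birth_date"].year,
        "income": top_code(record["income"], income_ceiling),
    }

codes = {}  # name -> code mapping; holds identifiers, so keep it with the original data
raw = {
    "name": "Jane Example",
    "postcode": "AB12 3CD",
    "birth_date": date(1970, 5, 17),
    "income": 250_000,
}
anonymised = anonymise_record(raw, codes)
print(anonymised)
```

Reusing the same `codes` dictionary across records keeps the replacement consistent, so one respondent always receives the same code.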

7
Relational data
  • Extra care is needed: combinations of related
    datasets, or a dataset in combination with
    publicly available info, can disclose information
    • e.g. businesses studied are mapped in a publication

8
Geo-referenced data
  • Point data may reveal position of individuals,
    organisations, businesses, etc.
  • Remove point coordinates: loss of all
    geographical info
  • Reduce precision: replace point coordinates with
    a line or polygon covering a larger area
    • km² area, postcode district, ward, road, etc.
  • Reduce precision: replace point coordinates with a
    meaningful variable typifying the geographical
    position
    • catchment area, poverty index, population density
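
As a rough sketch of reducing spatial precision, a point can be replaced by the grid cell that contains it. The example assumes projected coordinates in metres (for instance British National Grid eastings and northings); the function name and the sample point are hypothetical.

```python
def snap_to_grid(easting, northing, cell_size=1000):
    """Replace a point with the south-west corner of the grid cell containing it."""
    return (easting // cell_size) * cell_size, (northing // cell_size) * cell_size

point = (429_873, 434_512)  # hypothetical easting/northing in metres
print(snap_to_grid(*point))  # (429000, 434000)
```

A larger `cell_size` gives more protection and less geographical detail; whether a 1 km cell is coarse enough depends on how many units of study fall inside each cell.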

9
Anonymising qualitative data
  • Removing or aggregating identifiers in text can
    distort data, make them unusable and unreliable /
    misleading
  • Avoid blanking out information
  • Use pseudonyms or codes
  • Consistency
  • Plan replacements at the start (where possible)
    • e.g. anonymise during transcription, or highlight
      sensitive info for later anonymising
    • Exception: longitudinal studies, anonymise when
      data collection is complete
  • Bracket replacements for clarity
  • XML mark-up for anonymisation can be used (TEI
    tag)
    • e.g. <seg type="anonymised">Mary</seg>
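
Consistent pseudonym replacement in a transcript, marked either with brackets or with the `<seg type="anonymised">` mark-up mentioned above, might look like the sketch below. The function name, the sample sentence and the pseudonyms are assumptions made for illustration.

```python
import re
from collections import Counter

def pseudonymise(text, replacements, tei=False):
    """Replace each listed name with its pseudonym and count the replacements."""
    if not replacements:
        return text, Counter()
    # Longest names first so a full name is matched before a shorter one inside it.
    names = sorted(replacements, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    counts = Counter()

    def mark(match):
        original = match.group(0)
        counts[original] += 1
        pseudonym = replacements[original]
        if tei:
            return f'<seg type="anonymised">{pseudonym}</seg>'
        return f"[{pseudonym}]"

    # A single pass means a pseudonym is never itself replaced again.
    return pattern.sub(mark, text), counts

# Hypothetical transcript fragment and replacement plan.
transcript = "Mary said the farm near Penrith was closed. Mary's brother agreed."
plan = {"Mary": "Anna", "Penrith": "Town A"}

bracketed, counts = pseudonymise(transcript, plan)
tagged, _ = pseudonymise(transcript, plan, tei=True)
print(bracketed)
print(tagged)
```

Simple name matching of this kind only catches identifiers that are already in the replacement plan, so a manual read-through would still be needed.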

10
Tips
  • Always consider anonymisation together with
    consent agreements and user access restrictions
  • Regulating / restricting user access may offer a
    better solution than anonymising
  • Remove, mask, change identifiers
  • Maintain maximum information
  • Create log of all anonymisations
  • Be consistent in the anonymisation techniques
    used throughout the study, publications, etc.
  • Keep copy of original data
  • Plan at start of research, not at the end
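
A log of anonymisations can be as simple as one CSV row per replacement. The sketch below is one possible layout, not a prescribed format; the column names and entries are hypothetical. Because such a log contains the original values, it would need the same protection as the original data.

```python
import csv
import io

def write_anonymisation_log(entries, handle):
    """Write one CSV row per anonymisation that was applied."""
    writer = csv.DictWriter(
        handle, fieldnames=["file", "original", "replacement", "technique"]
    )
    writer.writeheader()
    writer.writerows(entries)

# Hypothetical log entries.
entries = [
    {"file": "interview01.txt", "original": "Mary",
     "replacement": "[Anna]", "technique": "pseudonym"},
    {"file": "survey.csv", "original": "full postcode",
     "replacement": "postcode sector", "technique": "aggregation"},
]

buffer = io.StringIO()  # stands in for a real file opened with newline=""
write_anonymisation_log(entries, buffer)
print(buffer.getvalue())
```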

11
Sources
  • Economic and Social Data Service (ESDS)
    guidelines, UK Data Archive
  • Clark, A. 2006. Anonymising research data. NCRM
    Working Paper Series 7/06. ESRC National Centre
    for Research Methods.
  • http://www.ncrm.ac.uk/research/outputs/publications/WorkingPapers/2006/0706_anonymising_research_data.pdf
  • Timescapes meetings discussions

12
Exercises / scenarios
  • Anonymising qualitative data
    • Foot and mouth study, Cumbria 2001-2003 (5407)
    • Conflicts and violence in prison (4596)
  • Anonymising quantitative data: Labour Force
    Survey
  • Confidential relational and geo-referenced data:
    British Household Panel Survey