Data protection: issues about anonymisation and dissemination for researchers. The French experience

1
Data protection: issues about anonymisation and
dissemination for researchers. The French
experience
  • Roxane Silberman
  • CCDSHS/Réseau Quetelet
  • CESSDA Workshop, Athens, 11-12 October 2006

2
Introduction
  • Contribute to the discussion
  • The French experience over the last 20 years
  • Impact of institutional arrangements
  • Impact of pressure and space for negotiations
  • Changes in the contexts
  • Old and new questions
  • Role of the Data Archives in a new world

3
A specific context
  • Importance of the National Institute for
    Statistics (Insee): a lot of surveys
  • A specificity of Insee: a scientific dimension
    (a research department)
  • Funding academic surveys is not a tradition in
    France
  • Socio-political surveys
  • Some specialized research institutes: INED,
    INSERM
  • High pressure upon Insee to get access and
    questions of equal treatment between researchers
  • Less experience in sharing data

4
Different experiences of anonymisation
  • Very different experiences and ways to deal with
    anonymisation in the Réseau Quetelet
  • Centre Maurice Halbwachs (ex-Lasmas): access to
    public statistics
  • INED: making surveys (in relation with Insee but
    as a research institute) and disseminating its
    own surveys
  • CDSP: disseminating socio-political surveys

5
Main issues
  • The French legal framework: its evolution and
    the impact of the European directive
  • Changes and differences in practices
  • Current questions and current negotiations

6
I. The French legal framework
7
The French implementation
  • Four sources
  • Statistical law
  • Privacy protection law
  • Archives law
  • Law about information
  • Changes over the last 30 years
  • Some conflicts between these four sources

8
A. Statistical laws
  • Importance of the implementation of the
    statistical law
  • Two regulations: for surveys, for administrative
    purposes
  • The 1951 statistical law
  • The CERFA procedure for administrative data

9
The 1951 law
  • The 1951 law defines the rules in collecting
    statistical data for the state (obligation,
    coordination, statistical secret)
  • Personal data and business data
  • Promulgation by the Ministry of Economy, under
    the control of the CNIS (National council for
    statistical information) which includes the
    social partners (and researchers), and with
    authorization of the CNIL for individual data
  • The 1986 addition allows Insee to request
    administrative databases

10
Statistical secret
  • Formal interpretation
  • - statistical secret: no dissemination
  • - liberal interpretation: no dissemination of
    non-anonymised data
  • Exception for business data (assumption:
    business data cannot be anonymised):
    dissemination through a committee (Comité du
    secret) including business representatives

11
Recent changes
  • 2004: update of the 1951 law
  • Enlargement of the role of the Comité du secret
    to give access to business administrative data
  • One researcher in the Comité du secret

12
The CERFA procedure
  • For administrative purposes
  • Visas
  • Right of access for citizens through CADA
  • Under the control of the administration
  • The 1986 addition to the 1951 law gave Insee
    the right to use administrative data

13
Other changes in the law
  • 2004
  • 1978 law and the European directive
  • Compatibility of research aims (also history
    and statistics)

14
B. The 1978 law: data protection
  • Protection of individual data
  • One of the first laws in Europe (the first wave,
    before the implementation of the 1995 European
    directive)
  • Linked to the SAFARI episode in the context of
    the computing revolution (1974)
  • Had an impact on the statistical law: the
    Ministry must have the advice of the CNIL for
    Insee surveys, as well as the 1986 addition in
    order to allow Insee to get administrative data
  • Private and public regimes differ: declaration
    or authorization (cf. the SAFARI episode)

15
Researchers
  • Nothing specific for researchers
  • No reactions from researchers in 1978
  • Additional chapter in 1994 for epidemiologists
    (with a specific research ethics committee)
  • Difficult relations between researchers and the
    CNIL
  • Complex relations between Insee and CNIL

16
The impact of the 1995 European directive
  • France had a rather restrictive law; no need to
    add much in order to apply the directive (France
    was late in updating the law)
  • But the European directive was seized as an
    opportunity by the statisticians and the
    researchers to establish the compatibility of
    statistical and research aims with the initial
    aim of the data collection
  • Joint lobbying of the statisticians and the
    researchers
  • The 2004 update of the 1978 law introduces the
    compatibility of the initial aim with
    historical, statistical and scientific aims
    (impact on long-term storage and on
    dissemination and reuse)

17
But
  • Moves from nominative data to direct or indirect
    identification
  • Some extension in the definition of the
    sensitive variables: impact for collecting
    data but also for access
  • Same regime for private and public data:
    declaration, except for sensitive data; more
    pressure on researchers
  • Still no statistician in the CNIL

18
C. The law about the Archives
  • Establishes very long delays for access to
    archives but also an obligation to store data
    (including individual data)
  • A tendency to open archives

19
D. Law about information
  • In line with European directive
  • Says nothing about researchers
  • But may impact dissemination of data (see Insee
    and open access to surveys on the website)

20
First conclusions
  • Different laws with potential conflicts on right
    of access
  • CADA, CNIL, Archives
  • Complexity: interrelation between the CNIL (data
    protection) and Insee (statistical law): who
    is deciding what in terms of access for
    researchers?

21
II. Practices
22
Differences in practices and evolutions
  • Different periods
  • Different contexts: research / public statistics
  • Qualitative and quantitative

23
A first period: the end of the 70s and the
beginning of the 80s
  • A general feature: France was late in setting up
    a Data Archives (1986)
  • Socio-political surveys: CIDSP (Grenoble)
  • Insee and other statistical bodies: only
    individual access through personal relations, but
    anonymisation is not a main issue; very
    different arguments but mainly no real discussion
    on the topic, and individual exceptions
  • Research institutes such as Ined: no culture of
    sharing data
  • Sharing data is the issue, not anonymisation

24
Second period: 1986-1998
  • Some progress in sharing data
  • First collective agreement with Insee and other
    statistical departments
  • Access restricted to the CNRS (not the
    universities)
  • Anonymisation is not a main issue: a liberal
    interpretation of the statistical law, right to
    disseminate anonymised data, mostly direct
    identification
  • Commercial issues are more important
  • First discussion in INED about sharing data

25
Third period: 1999-2002
  • Changes in the level of anonymisation: a
    decision from CNIL and Insee for the 1999 census:
    no more dissemination at the level of small
    geographical units, no dissemination of sensitive
    variables (less detail on nationality and
    country of birth)
  • Huge impact for geographers (but also for urban
    managers and municipalities)
  • Difficulties for specific geographical
    aggregations
  • Difficulties for longitudinal analysis
  • Difficulties for contextual analysis

26
Negotiation with Insee and other statistical
departments
  • Other problems
  • - access for universities
  • - access to other statistical departments
  • - costs
  • General negotiation
  • A committee in charge of a national policy for
    social sciences research, with Insee and other
    statistical departments: a place to discuss and
    negotiate (CCDSHS)

27
A rather liberal situation
  • A lot of individual micro-data available for all
    researchers (also available for other countries)
  • Access to business surveys through the Comité du
    secret (see composition): few refusals
  • Unequal access to administrative data

28
but growing concerns with anonymisation
  • Mainly, not much progress on Census
    dissemination: a complex system with different
    products offered to replace the less
    anonymised sample; tabulation but no
    modelling; time-consuming
  • (NB: at the same time, urban managers were more
    successful)
  • Anonymisation of geographical levels became the
    general rule for all surveys
  • New restrictions: 1) sensitive variables:
    nationality, country of birth, spoken languages
  • 2) indirect identification: not
    only geographical precision but also professions,
    income

29
in all sectors
  • Indirect identification becomes an issue also for
    research institutes such as Ined
  • New pressure on individual researchers who were
    unaware of these issues: notifying surveys to
    the CNIL, asking for authorizations, paying
    attention to confidentiality

30
Lobbying to change the 1978 law
  • Difficult relations with the CNIL (no
    statisticians)
  • Discussion with the CNIL
  • Joint lobbying: research and Insee
  • The 2004 update of the 1978 law opens new space
    to negotiate

31
but not the only problem
  • The statistical law
  • The argument of the statistical secret comes back
    in a different way (responsibility, possible
    sanctions, response rate)
  • Changes in the statistical law? Status for
    researchers?

32
in a new context
  • New and powerful statistical tools that demand
    all data
  • More administrative data that can also be merged
    with surveys
  • More panels: difficult to anonymise

33
Difference in practices for anonymisation
  • Differences between surveys
  • Differences between institutions (Insee,
    statistical departments, governmental agencies,
    research institutes, individual researchers):
    different knowledge about the laws, different
    interpretations, different practices about
    indirect identification
  • Discussion about indirect identification with the
    CNIL

34
but also some access
  • Access through CNIL and Comité du secret
  • Individual contracts under the responsibility of
    the statistical department or the governmental
    agency for administrative data
  • Access through CADA (even for newspapers) and
    National Archives
  • Impact of the law about information
  • A specific treatment: the research unit in Insee

35
III. Current discussions and new negotiation
  • Very different practices and situations
  • High pressure from researchers
  • Space in the law

36
Two directions
  • Research files
  • Safe centers

37
Research files
  • General idea: an intermediate level between
    anonymised data and safe centers
  • Two levels: public files (now on the Insee
    website) and research files
  • A negotiation: Insee / CNIL / Ministry of Research
  • A general authorization from the CNIL that will
    make it possible to discuss more detailed files
    with Insee
  • The Data Archives will be in charge of
    dissemination and will have the responsibility
  • Relies on organization and trust

38
Safe centers
  • For more sensitive data
  • Also to merge datasets
  • Business data will go in the safe centers
  • Census: not very clear at the moment
  • Different options and questions are currently
    under discussion
  • Safe centers or remote access
  • Role of the researchers
  • Role of the Data Archives

39
Conclusions 1
  • A different world: different data (panels,
    administrative data, merged datasets), powerful
    statistical tools, more concern about
    confidentiality
  • Space for negotiation in the law and between the
    laws
  • Discussion and collective pressure are effective
  • But need for organization: different levels of
    dissemination

40
Conclusions 2
  • Role of the Data Archives in this new world?
  • Importance of information and documentation in a
    distributed system
  • More discussion about indirect identification
  • More discussion about sensitive variables
  • More discussion about the choice between
    anonymisation and different levels and systems
    of dissemination
  • Data Archives as an actor in the negotiations