Security Methods for Statistical Databases by Karen Goodwin - PowerPoint PPT Presentation

Transcript and Presenter's Notes


1
Security Methods for Statistical Databases
by Karen Goodwin
2
Introduction
  • Statistical databases containing medical
    information are often used for research
  • Some of this data is protected by law to
    safeguard patient privacy
  • Proper security precautions must be implemented
    to comply with the law and to respect the
    sensitivity of the data

3
Accuracy vs. Confidentiality
  • Accuracy: researchers want to extract accurate
    and meaningful data
  • Confidentiality: patients, the law, and database
    administrators require that patient privacy and
    the confidentiality of patient information be
    maintained

4
Laws
  • Health Insurance Portability and Accountability
    Act (HIPAA) Privacy Rule
  • Covered organizations must comply by April 14,
    2003
  • Designed to improve efficiency of healthcare
    system by using electronic exchange of data and
    maintaining security
  • Covered entities (health plans, healthcare
    clearinghouses, healthcare providers) may not use
    or disclose protected information except as
    permitted or required
  • The Privacy Rule establishes a "minimum
    necessary" standard, which requires covered
    entities to evaluate their current practices and
    security precautions

5
HIPAA Compliance
  • Companies offer third-party certification of
    covered entities
  • Such companies check the covered entity and its
    associated companies for compliance with HIPAA
  • This can help with rapid implementation of, and
    compliance with, HIPAA regulations

6
Types of Statistical Databases
  • Static: a static database is created once and
    never changes
  • Example: U.S. Census
  • Dynamic: a dynamic database changes continuously
    to reflect real-time data
  • Example: most online research databases

7
Security Methods
  • Access Restriction
  • Query Set Restriction
  • Microaggregation
  • Data Perturbation
  • Output Perturbation
  • Auditing
  • Random Sampling

8
Access Restriction
  • Databases normally have different access levels
    for different types of users
  • User IDs and passwords are the most common
    methods for restricting access
  • In a medical database:
  • Doctors/healthcare representatives: full access
    to information
  • Researchers: access only to partial information
    (e.g. aggregate information)
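The access levels above can be sketched as follows. This is a minimal illustration, not a real access-control system: the records, the role names "doctor" and "researcher", and the function fetch are all hypothetical.

```python
from statistics import mean

# Hypothetical records, used only for illustration.
RECORDS = [
    {"patient": "A", "age": 34, "cholesterol": 180},
    {"patient": "B", "age": 51, "cholesterol": 220},
    {"patient": "C", "age": 47, "cholesterol": 200},
]


def fetch(role, field):
    """Return data for `field` according to the caller's access level."""
    values = [record[field] for record in RECORDS]
    if role == "doctor":
        # Full access: individual values are returned.
        return values
    if role == "researcher":
        # Partial access: only an aggregate is returned.
        return mean(values)
    raise PermissionError(f"role {role!r} has no access")
```

In this sketch a researcher calling fetch("researcher", "cholesterol") receives only the average, while a doctor receives the individual values.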

9
Query Set Restriction
  • A query-set size control sets a minimum number
    of records that must be in the result set
  • Allows the query results to be displayed only if
    the size of the query set satisfies the condition
  • Setting a minimum query-set size can help protect
    against the disclosure of individual data

10
Query Set Restriction
  • Let K represent the minimum number of records
    that must be present in the query set
  • Let R represent the size of the query set
  • The query result can only be displayed if
  • K ≤ R
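A minimal sketch of this check, assuming records are stored as dictionaries. The function name restricted_mean and its parameters are hypothetical and chosen only for illustration.

```python
def restricted_mean(records, predicate, field, k):
    """Return the mean of `field` over the query set, or None if refused.

    The answer is released only when K <= R, where R is the size of
    the query set (the records that satisfy `predicate`).
    """
    query_set = [record for record in records if predicate(record)]
    r = len(query_set)
    if r < k:
        # Query set too small: answering could disclose individual data.
        return None
    return sum(record[field] for record in query_set) / r
```

With K = 2, a query that matches a single patient is refused, while a query that matches two or more patients is answered.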

11
Query Set Restriction
12
Microaggregation
  • Raw (individual) data is grouped into small
    aggregates before publication
  • Each individual value is replaced by the average
    value of its group
  • The most similar records are grouped together to
    preserve data accuracy
  • Helps to prevent disclosure of individual data
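The grouping and averaging steps can be sketched for a single numeric attribute as follows. This assumes that "similar" means "close in value", so the values are sorted and cut into groups of k. The function name microaggregate is hypothetical, and real microaggregation methods may group records differently.

```python
def microaggregate(values, k):
    """Replace each value with the average of its group of similar values.

    Values are sorted so that neighbours are similar, then cut into
    groups of k. A short trailing group is merged into the previous one
    so that no group has fewer than k members (when len(values) >= k).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [order[i:i + k] for i in range(0, len(order), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)

    result = [0.0] * len(values)
    for group in groups:
        average = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = average
    return result
```

For example, microaggregate([10, 52, 12, 50, 11, 51], 3) forms the groups {10, 11, 12} and {50, 51, 52}, so each value is published as 11.0 or 51.0, and the overall total is unchanged.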

13
Microaggregation
  • National Agricultural Statistics Service (NASS)
    publishes data about farms
  • To protect against data disclosure, data is only
    released at the county level
  • Farms in each county are averaged together to
    retain as much accuracy as possible while still
    protecting against disclosure

14
Microaggregation
15
Microaggregation
16
Data Perturbation
  • Perturbed data is raw data with noise added
  • Pro: with a perturbed database, if data is
    accessed without authorization, the true value
    is not disclosed
  • Con: data perturbation runs the risk of
    presenting biased data
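A minimal sketch of data perturbation, assuming zero-mean Gaussian noise. The slides do not specify a noise distribution, and the function name perturb_data and its scale and seed parameters are hypothetical.

```python
import random


def perturb_data(values, scale, seed=None):
    """Return a perturbed copy of `values` for storage or release.

    Each value receives independent zero-mean Gaussian noise with
    standard deviation `scale`. The raw values are not modified.
    """
    rng = random.Random(seed)
    return [value + rng.gauss(0.0, scale) for value in values]
```

Because the stored values themselves are noisy, every statistic computed from them inherits the noise, which is the source of the bias risk noted above.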

17
Data Perturbation
18
Output Perturbation
  • Instead of the raw data being transformed as in
    Data Perturbation, only the output or query
    results are perturbed
  • The bias problem is less severe than with data
    perturbation
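For contrast with data perturbation, here is a minimal sketch in which the stored data stays exact and only the query result is perturbed. Zero-mean Gaussian noise is an assumption, and the function name perturbed_mean and its parameters are hypothetical.

```python
import random


def perturbed_mean(values, scale, rng=None):
    """Answer a mean query with noise added to the result only.

    The stored `values` are left untouched; zero-mean Gaussian noise
    with standard deviation `scale` is added to the true mean.
    """
    rng = rng or random.Random()
    true_mean = sum(values) / len(values)
    return true_mean + rng.gauss(0.0, scale)
```

Each answer is computed from the true data, so noise is added once per result rather than once per stored record.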

19
Output Perturbation
[Diagram; surviving labels: Query, Results]
20
Auditing
  • Auditing is the process of keeping track of all
    queries made by each user
  • Usually done with up-to-date logs
  • Each time a user issues a query, the log is
    checked to see if the user is querying the
    database maliciously
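A minimal sketch of a query log with one illustrative check. The rule used here (refuse a query whose query set differs from one of the user's earlier query sets by only a few records) is an assumption made for this example, not a rule given in the slides. The class name QueryAuditor and its parameters are hypothetical.

```python
class QueryAuditor:
    """Keep a per-user log of answered query sets and vet new queries."""

    def __init__(self, min_difference):
        # Assumed rule: two query sets from the same user must be
        # identical or differ by at least this many records.
        self.min_difference = min_difference
        self.log = {}  # user -> list of frozensets of record ids

    def allow(self, user, query_set_ids):
        """Return True and log the query if it passes the check."""
        new = frozenset(query_set_ids)
        history = self.log.setdefault(user, [])
        for old in history:
            difference = len(new ^ old)  # records in exactly one set
            if 0 < difference < self.min_difference:
                # Comparing the two answers could isolate a few records.
                return False
        history.append(new)
        return True
```

The log grows with every answered query and must be scanned on each new one, which illustrates the per-query processing overhead that auditing incurs.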

21
Random Sampling
  • Only a sample of the records meeting the
    requirements of the query is used
  • Must maintain consistency by giving exactly the
    same results for the same query
  • Weakness: logically equivalent queries can
    result in different query sets
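One way to obtain a consistent sample is to make the selection a deterministic function of the query and the record. The sketch below assumes a hash of the query text and the record id; this is an illustrative choice, not necessarily the mechanism the slides have in mind. The function names in_sample and sampled_query_set and the rate parameter are hypothetical.

```python
import hashlib


def in_sample(query_text, record_id, rate):
    """Decide deterministically whether a record is sampled for a query."""
    key = f"{query_text}|{record_id}".encode()
    digest = hashlib.sha256(key).digest()
    # Map the first 6 bytes of the hash to a fraction in [0, 1).
    fraction = int.from_bytes(digest[:6], "big") / 2**48
    return fraction < rate


def sampled_query_set(query_text, matching_ids, rate=0.8):
    """Return the sampled subset of the records matching the query."""
    return [rid for rid in matching_ids if in_sample(query_text, rid, rate)]
```

Repeating the same query text always selects the same records. A logically equivalent query written differently (for example "age > 40" versus "not age <= 40") hashes differently and will generally select a different sample, which is the weakness noted above.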

22
Comparison Methods
The following criteria are used to determine the
most effective methods of statistical database
security:
  • Security: possibility of exact disclosure,
    partial disclosure, robustness
  • Richness of Information: amount of
    non-confidential information eliminated, bias,
    precision, consistency
  • Costs: initial implementation cost, processing
    overhead per query, user education

23
A Comparison of Methods
Method                 Security      Richness of Information  Costs
Query-set Restriction  Low           Low (1)                  Low
Microaggregation       Moderate      Moderate                 Moderate
Data Perturbation      High          High-Moderate            Low
Output Perturbation    Moderate      Moderate-Low             Low
Auditing               Moderate-Low  Moderate                 High
Sampling               Moderate      Moderate-Low             Moderate

(1) Quality is low because a lot of information can
be eliminated if the query does not meet the
requirements
24
Sources
  • This presentation is posted on
    http://www.cs.jmu.edu/users/aboutams
  • Adam, Nabil R., and Wortmann, John C.
    Security-Control Methods for Statistical
    Databases: A Comparative Study. ACM Computing
    Surveys, Vol. 21, No. 4, December 1989
    (http://delivery.acm.org/10.1145/80000/76895/p515-adam.pdf?key176895key21947043301collportaldlACMCFID4702747CFTOKEN83773110)
  • Official HIPAA (http://cms.hhs.gov/hipaa/)
  • Bernstein, Stephen W. Impact of HIPAA on
    BioTech/Pharma Research: Rules of the Road
    (http://www.privacyassociation.org/docs/3-02bernstein.pdf)
  • Service Bureau 3rd Party Testing
    (http://hipaatesting.com/service_bureau.html)