Agenda - PowerPoint PPT Presentation

Slides: 16
Provided by: gebh150
Learn more at: http://schul-web.org

Transcript and Presenter's Notes

Title: Agenda


1
Agenda
  • What is (Web) data mining? And what does it have
    to do with privacy? A simple view
  • Examples of data mining and "privacy-preserving
    data mining"
  • Association-rule mining (privacy-preserving AR
    mining)
  • Collaborative filtering (privacy-preserving
    collaborative filtering)
  • A second look at ... privacy
  • A second look at ... Web / data mining
  • The goal: More than modelling and hiding.
    Towards a comprehensive view of Web mining and
    privacy: threats, opportunities and solution
    approaches.
  • An outlook: Data mining for privacy

2
Privacy Problems: Example 1
  • Technical background of the problem:
  • The dataset allows for Web mining (e.g., which
    search queries lead to which site choices),
  • but it violates k-anonymity (e.g. "Lilburn" → a
    likely k = number of inhabitants of Lilburn)
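
The k-anonymity check mentioned above can be sketched as follows. This is a minimal illustration, assuming that k is the size of the smallest group of records sharing the same quasi-identifier values; the records, the attribute names and the function `k_anonymity` are hypothetical and not taken from the dataset the slide refers to.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values on all quasi-identifier attributes (0 if no records)."""
    groups = Counter(
        tuple(record[attr] for attr in quasi_identifiers)
        for record in records
    )
    return min(groups.values(), default=0)


# Hypothetical search-log records, invented for illustration only.
search_log = [
    {"user": 1, "town": "Lilburn", "topic": "gardening"},
    {"user": 2, "town": "Lilburn", "topic": "gardening"},
    {"user": 3, "town": "Lilburn", "topic": "dogs"},
    {"user": 4, "town": "Atlanta", "topic": "dogs"},
    {"user": 5, "town": "Atlanta", "topic": "dogs"},
]

k_town = k_anonymity(search_log, ["town"])
k_town_topic = k_anonymity(search_log, ["town", "topic"])
```

In this toy data the town alone leaves every user hidden among at least one other user, while adding a second attribute isolates a single record, which is the kind of narrowing the slide warns about.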

3
Where do people live who will buy the Koran soon?
Privacy Problems: Example 2
  • Technical background of the problem:
  • A mashup of different data sources:
  • Amazon wishlists
  • Yahoo! People (addresses)
  • Google Maps
  • Each has insufficient k-anonymity; combined, they
    allow for attribute matching and thereby inferences
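
Attribute matching across data sources can be sketched as a join on shared attributes. The example below is a hedged illustration only: the two tables, their field names and the helper `match_on` are assumptions made for this sketch and do not reproduce the actual mashup or any real service's data format.

```python
def match_on(left, right, keys):
    """Link rows of two tables that agree on all attributes in `keys`."""
    index = {}
    for row in right:
        index.setdefault(tuple(row[k] for k in keys), []).append(row)
    matches = []
    for row in left:
        for other in index.get(tuple(row[k] for k in keys), []):
            # Merge both rows; the linked record now carries both
            # the wish and the street address.
            matches.append({**other, **row})
    return matches


# Hypothetical, invented records.
wishlists = [
    {"name": "A. Example", "city": "Springfield", "wish": "Book X"},
    {"name": "B. Sample", "city": "Shelbyville", "wish": "Book Y"},
]
addresses = [
    {"name": "A. Example", "city": "Springfield", "street": "1 Main St"},
    {"name": "C. Other", "city": "Springfield", "street": "2 Oak Ave"},
]

linked = match_on(wishlists, addresses, ["name", "city"])
```

Neither table reveals where the owner of a given wish lives, but the linked record does, and a street address is all a map service needs.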

4
Predicting political affiliation from Facebook
profile and link data (1): Most Conservative
Traits
Privacy Problems: Example 3

Trait Name   Trait Value                          Weight Conservative
Group        george w bush is my homeboy          45.88831329
Group        college republicans                  40.51122488
Group        texas conservatives                  32.23171423
Group        bears for bush                       30.86484689
Group        kerry is a fairy                     28.50250433
Group        aggie republicans                    27.64720818
Group        keep facebook clean                  23.653477
Group        i voted for bush                     23.43173116
Group        protect marriage one man one woman   21.60830487

Lindamood et al. 09; Heatherly et al. 09
5
Predicting political affiliation from Facebook
profile and link data (2): Most Liberal Traits
per Trait Name

Trait Name            Trait Value                Weight Liberal
activities            amnesty international      4.659100601
Employer              hot topic                  2.753844959
favorite tv shows     queer as folk              9.762900035
grad school           computer science           1.698146579
hometown              mumbai                     3.566007713
Relationship Status   in an open relationship    1.617950632
religious views       agnostic                   3.15756412
looking for           whatever i can get         1.703651985

Lindamood et al. 09; Heatherly et al. 09
6
"Privacy-preserving Web mining" example: find
patterns, unlink personal data
  • Volvo S40 website targets people in their 20s
  • Are visitors in their 20s or 40s?
  • Which demographic groups like/dislike the
    website?
  • An example of the "Randomization Approach" to
    PPDM
  • R. Agrawal and R. Srikant, "Privacy Preserving
    Data Mining", SIGMOD 2000.

7
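Randomization Approach Overview
[Figure: original records (e.g. 50 | 40K | ..., 30 | 70K | ...) each pass
through a Randomizer and become randomized records (e.g. 65 | 20K | ...,
25 | 60K | ...). From the randomized records, the distribution of Age and
the distribution of Salary are reconstructed; these feed the Data Mining
Algorithms, which produce the Model.]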
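
The pipeline in this overview can be sketched in a few lines. This is a simplified, discrete illustration and not the procedure from the cited paper: it assumes integer ages, uniform integer noise of known width, and an iterative Bayesian update over a fixed set of candidate ages. The function names, the noise width and the synthetic ages are all assumptions of this sketch.

```python
import random


def randomize(values, halfwidth, rng):
    """Add independent uniform integer noise in [-halfwidth, halfwidth]."""
    return [v + rng.randint(-halfwidth, halfwidth) for v in values]


def reconstruct(randomized, support, halfwidth, iterations=30):
    """Estimate the distribution of the original values from randomized ones.

    Starts from a uniform guess over `support` and repeatedly applies
    Bayes' rule: each randomized value w spreads one unit of probability
    over the candidate originals within `halfwidth` of w, in proportion
    to the current estimate.
    """
    estimate = {x: 1.0 / len(support) for x in support}
    n = len(randomized)
    for _ in range(iterations):
        updated = dict.fromkeys(support, 0.0)
        for w in randomized:
            window = [x for x in range(w - halfwidth, w + halfwidth + 1)
                      if x in estimate]
            total = sum(estimate[x] for x in window)
            for x in window:
                updated[x] += estimate[x] / total / n
        estimate = updated
    return estimate


rng = random.Random(0)
# Synthetic visitors: half in their 20s, half in their 40s.
ages = ([rng.randrange(20, 30) for _ in range(300)]
        + [rng.randrange(40, 50) for _ in range(300)])
noisy_ages = randomize(ages, halfwidth=10, rng=rng)
age_distribution = reconstruct(noisy_ages, range(0, 101), halfwidth=10)
```

Only `noisy_ages` would leave the visitor's machine; the miner works with `age_distribution`, an aggregate estimate, rather than with any individual's true age.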
8
Seems to work well!
9
What is collaborative filtering?
  • "People like what people like them like"
  • regardless of support and confidence
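
For reference, support and confidence are the two standard measures of an association rule A → B: support is the fraction of transactions containing all items of the rule, and confidence is the fraction of transactions containing A that also contain B. The transactions below are invented for this sketch.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of the whole rule divided by support of its antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(baskets, {"bread", "milk"})
rule_confidence = confidence(baskets, {"bread"}, {"milk"})
```

Association-rule mining keeps only rules that pass thresholds on both measures; collaborative filtering, as the slide notes, makes recommendations without such thresholds.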

10
User-based Collaborative Filtering
  • Idea: People who agreed in the past are likely to
    agree again
  • To predict a user's opinion of an item, use the
    opinions of similar users
  • Similarity between users is decided by looking at
    their overlap in opinions on other items
  • Next step: build a model of user types → "global
    model" rather than "local patterns" as mining
    result
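
The idea above can be sketched as a small user-based predictor. The similarity measure used here (one over one plus the mean absolute rating difference on co-rated items) is an assumption chosen for readability; the slides do not specify a measure, and practical systems often use Pearson correlation or cosine similarity instead. All names and ratings are hypothetical.

```python
def similarity(ratings_a, ratings_b):
    """Similarity from overlap in opinions: 1.0 for identical ratings on
    the shared items, approaching 0 as ratings diverge, 0.0 if no overlap."""
    shared = ratings_a.keys() & ratings_b.keys()
    if not shared:
        return 0.0
    mean_gap = sum(abs(ratings_a[i] - ratings_b[i]) for i in shared) / len(shared)
    return 1.0 / (1.0 + mean_gap)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        weight = similarity(ratings[user], other_ratings)
        weighted_sum += weight * other_ratings[item]
        weight_total += weight
    return weighted_sum / weight_total if weight_total else None


# Hypothetical ratings on a 1-5 scale.
ratings = {
    "alice": {"a": 5, "b": 1},
    "bob": {"a": 5, "b": 1, "c": 4},
    "carol": {"a": 1, "b": 5, "c": 2},
}

alice_on_c = predict(ratings, "alice", "c")
```

Because bob agreed with alice on every shared item and carol disagreed, the prediction for alice lands much closer to bob's rating of "c" than to carol's.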

11
1. Privacy as confidentiality: "the right to be
let alone" and to hide data
Data
Is this all there is to privacy?
12
2. Privacy as control: informational
self-determination
Data

Don't do THIS!
  • e.g. data privacy: "the right of the individual
    to decide what information about himself should
    be communicated to others and under what
    circumstances" (Westin, 1970)
  • behind much of data-protection legislation (see
    Eleni Kosta's talk)

13
Discussion item: What is this an example of?
Tracing anonymous edits in Wikipedia
http://wikiscanner.virgil.gr/
14
Method: Attribute matching
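
One way such matching could work for the Wikipedia example is sketched below, assuming the matched attribute is the IP address logged with an anonymous edit and that a table of organisations' IP ranges is available. The slide itself only names the method, so this mechanism is an assumption; the ranges (reserved documentation addresses), organisation names and edits are invented.

```python
import ipaddress


def owner_of(ip, ranges):
    """Return the owner of the first range containing `ip`, else None."""
    address = ipaddress.ip_address(ip)
    for network, owner in ranges.items():
        if address in ipaddress.ip_network(network):
            return owner
    return None


# Hypothetical range table using reserved documentation networks.
ip_ranges = {
    "192.0.2.0/24": "Example Corp",
    "198.51.100.0/24": "Example University",
}
# Hypothetical anonymous edits.
anonymous_edits = [
    {"article": "Article A", "ip": "192.0.2.17"},
    {"article": "Article B", "ip": "203.0.113.5"},
]

attributed = [
    {**edit, "owner": owner_of(edit["ip"], ip_ranges)}
    for edit in anonymous_edits
]
```

The edit stays anonymous at the level of the person, but the matched attribute still ties it to an organisation.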
15
Results (an example)