Anonymized social networks

Slides: 26

Provided by: Kan116

Learn more at: http://www.mathcs.emory.edu

1
Anonymized social networks

Wherefore Art Thou R3579X? Anonymized Social
Networks, Hidden Patterns, and Structural
Steganography

2
What is a social network?

A social network occurs anywhere there is social
interaction between people.
Examples include email, instant messaging,
Facebook, blog trackbacks, and coauthor networks.

3
Coauthor Network
4
Uses of mining social networks

The structure of social networks can be
interesting

How are friendships usually structured? Are there
hubs, such as Heather, who connect separate
networks? How many degrees of Kevin Bacon? We
can investigate these questions if we have the
data to mine.
5
Email

For our examples, we will use a network of emails
sent between users.
How do we protect users' privacy while still
releasing the data for research?

[Figure: vertices John and Mary joined by a directed edge]
6
Anonymization Techniques

Remove any identifying information, such as names
and other attributes.
Randomly rename the vertices.

[Figure: vertices relabeled R3579X and R73313]
7
Anonymization Techniques

Convert directed edges to undirected edges. This
hides which user contacted which and makes the
graph harder to attack.

[Figure: vertices R3579X and R73313 joined by an undirected edge]
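
The two anonymization steps on the last two slides can be sketched in a few lines of Python. This is only an illustration: the function name `anonymize`, the sample `emails` list, and the "R" plus five digits label format are assumptions made for the example, not details from the presentation.

```python
import random

def anonymize(directed_edges, seed=None):
    """Relabel vertices with random IDs and drop edge direction."""
    rng = random.Random(seed)
    names = sorted({v for edge in directed_edges for v in edge})
    # Draw distinct random labels; the "R" + digits format is illustrative only.
    ids = rng.sample(range(10_000, 100_000), len(names))
    label = {name: f"R{i}" for name, i in zip(names, ids)}
    # A frozenset forgets which endpoint was the sender.
    released = {frozenset((label[u], label[v]))
                for u, v in directed_edges if u != v}
    return released, label

# Hypothetical email log: (sender, recipient) pairs.
emails = [("John", "Mary"), ("Mary", "John"), ("Mary", "Zoe")]
released, secret_mapping = anonymize(emails, seed=0)
```

Only `released` would be published. `secret_mapping` stays with the data owner, and the attacks on the following slides try to recover parts of it.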
8
Compromising privacy

Let's say you want to know if two vertices are
connected on the graph.
All the identifying info has been removed, so how
do we do it?

9
Active Attacks!

An active attack involves the adversary creating
vertices in the graph before the graph is
released.
The adversary creates edges between the
vertices in a pattern that it can recognize
later, once the graph is released.

10
Walk-Based Attack

We create k new vertices, with k around 2 log n,
where n is the total number of vertices.
We give each new vertex between d0 and d1 edges
to the other vertices in the graph.
Then, we randomly create edges between the new
vertices, each with independent probability 1/2.
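
A minimal sketch of this construction, assuming the graph is stored as a dict mapping each vertex to a set of neighbours. The function name `plant_subgraph`, the base-2 logarithm, and the rule that consecutive new vertices are always joined (so the planted vertices form a path that a search can follow) are assumptions of this sketch. In a real attack some of the external edges would go to the targeted users rather than to random vertices.

```python
import math
import random

def plant_subgraph(graph, d0, d1, seed=None):
    """Add k attacker vertices to `graph` (dict: vertex -> set of neighbours).

    Returns the names of the new vertices in creation order.
    """
    rng = random.Random(seed)
    existing = list(graph)
    n = len(existing)
    k = math.ceil(2 * math.log2(n))   # "around 2 log n"; base 2 is an assumption
    new = [f"k{i + 1}" for i in range(k)]
    for x in new:
        graph[x] = set()

    def link(u, v):
        graph[u].add(v)
        graph[v].add(u)

    # External edges: each new vertex gets between d0 and d1 links into the graph.
    for x in new:
        for v in rng.sample(existing, rng.randint(d0, d1)):
            link(x, v)

    # Internal edges: each pair of new vertices is joined with probability 1/2.
    for i in range(k):
        for j in range(i + 1, k):
            # Assumption: consecutive vertices are always joined, forming a path.
            if j == i + 1 or rng.random() < 0.5:
                link(new[i], new[j])
    return new

# Hypothetical 16-vertex ring standing in for the real network.
graph = {i: {(i - 1) % 16, (i + 1) % 16} for i in range(16)}
attackers = plant_subgraph(graph, d0=1, d1=3, seed=1)
```

The attacker would also record each new vertex's degree and the internal edges, since those are the properties used to find the subgraph again after release.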

11
Algorithm

Given the graph, how do we find the subgraph that
we created?
Create a search tree, pruning the tree based on
the properties of our subgraph, such as the
degrees of our new vertices.
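
One way to picture the pruned search is a depth-first search that extends a candidate sequence one vertex at a time and drops it as soon as a degree or an edge does not match. The sketch below is an illustration, not the paper's exact procedure: it assumes consecutive planted vertices are adjacent, and the names `find_subgraph`, `degrees`, and `internal` as well as the toy `released` graph are made up for the example.

```python
def find_subgraph(graph, degrees, internal):
    """Return every vertex sequence that matches the planted subgraph.

    graph    : dict vertex -> set of neighbours (the released graph)
    degrees  : degrees[i] is the total degree of the i-th planted vertex
    internal : set of frozenset({i, j}) index pairs that were joined
    """
    k = len(degrees)
    matches = []

    def consistent(path, v):
        i = len(path)
        if len(graph[v]) != degrees[i]:
            return False          # prune: wrong degree
        # Edges back to earlier vertices must match the planted pattern exactly.
        return all((u in graph[v]) == (frozenset((j, i)) in internal)
                   for j, u in enumerate(path))

    def extend(path):
        if len(path) == k:
            matches.append(list(path))
            return
        # Assumes planted vertices i and i+1 are adjacent.
        candidates = graph if not path else graph[path[-1]]
        for v in candidates:
            if v not in path and consistent(path, v):
                path.append(v)
                extend(path)
                path.pop()

    extend([])
    return matches

# Toy released graph; the labels are arbitrary.
released = {
    "WER": {"DFG", "ZXCV", "ASDF", "QWER"},
    "DFG": {"WER", "UYT", "ZXCV"},
    "UYT": {"DFG", "ASDF", "QWER", "BNM", "JKL"},
    "ZXCV": {"WER", "DFG"},
    "ASDF": {"WER", "UYT"},
    "QWER": {"WER", "UYT"},
    "BNM": {"UYT", "JKL"},
    "JKL": {"UYT", "BNM"},
}
matches = find_subgraph(released, [4, 3, 5],
                        {frozenset((0, 1)), frozenset((1, 2))})
```

In this toy graph only one sequence survives the pruning, `["WER", "DFG", "UYT"]`. The attack relies on the planted pattern being distinctive enough that this holds in the real graph as well.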

12
Are Mary and John connected?
[Figure: graph with vertices John, Mike, Mary, Zoe, and Tom]
13
Are Mary and John connected?
[Figure: graph of John, Mike, Mary, Zoe, and Tom with new vertices k1 to k5 added]
16
Graph is released
[Figure: released graph with anonymized labels ZXCV, ASDF, WER, DFG, UYT, QWER, ASD, HGF, BNM, JKL]
17
We identify our subgraph
[Figure: released graph with k1 to k5 identified; remaining labels ZXCV, ASDF, QWER, BNM, JKL]
18
Yes, they're connected
[Figure: John and Mary identified next to k1 to k5; ASDF, BNM, and JKL remain anonymous]
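
Once the planted vertices are found, the targets can be located through the edges the attacker created to them. The sketch below assumes that each target was linked to its own distinct set of planted vertices, so that it is the only outside vertex with exactly that set of attacker neighbours. The function `locate_targets`, the `links` bookkeeping, and the toy graph are hypothetical.

```python
def locate_targets(graph, attackers, links):
    """Map each target name to its label in the released graph.

    attackers : recovered attacker vertices, in planting order
    links     : target name -> set of attacker indices it was linked to
    """
    planted = set(attackers)
    found = {}
    for name, wanted in links.items():
        wanted_labels = {attackers[i] for i in wanted}
        hits = [v for v, nbrs in graph.items()
                if v not in planted and (nbrs & planted) == wanted_labels]
        # Accept the answer only when it is unambiguous.
        found[name] = hits[0] if len(hits) == 1 else None
    return found

# Toy released graph; WER, DFG, UYT play the role of the recovered subgraph.
released = {
    "WER": {"DFG", "ZXCV"},
    "DFG": {"WER", "UYT", "ZXCV", "QWER"},
    "UYT": {"DFG", "QWER", "BNM"},
    "ZXCV": {"WER", "DFG", "QWER"},
    "QWER": {"DFG", "UYT", "ZXCV"},
    "BNM": {"UYT"},
}
where = locate_targets(released, ["WER", "DFG", "UYT"],
                       {"John": {0, 1}, "Mary": {1, 2}})
connected = where["Mary"] in released[where["John"]]
```

Here John resolves to `ZXCV` and Mary to `QWER`, and the released graph has an edge between them, so `connected` is `True`.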
19
Analysis

The paper proves that the search tree does not
grow too large, so the algorithm performs well.
It also proves that, with high probability, the
subgraph is unique, so that we don't identify the
wrong subgraph.

20
Experimental attack

The authors simulate an attack on LiveJournal
friendship links. They create the accounts on the
website, make the connections, and then crawl the
site and anonymize the data.
The network has 4.4 million nodes and 77 million
edges.

21
Results
22
Cut-Based Attack

Only needs about sqrt(log n) new nodes to attack
the graph.
However, although it takes fewer nodes, it's much
more computationally intensive and less practical
in the real world.
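
To get a feel for the difference, the two growth rates can be evaluated at the LiveJournal size quoted earlier. This is rough arithmetic only: it assumes a base-2 logarithm and ignores the constant factors hidden in both bounds.

```python
import math

n = 4_400_000                  # LiveJournal node count quoted in the slides
log_n = math.log2(n)           # assumption: logarithm taken base 2
walk_nodes = 2 * log_n         # walk-based attack: about 2 log n
cut_nodes = math.sqrt(log_n)   # cut-based attack: about sqrt(log n)
print(f"walk-based: ~{walk_nodes:.0f} new nodes, cut-based: ~{cut_nodes:.0f}")
```

Under these assumptions the walk-based bound comes out at roughly 44 new nodes and the cut-based bound at roughly 5.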

23
Cut-based Results
24
Passive Attack

It's a lot like an active attack, except you
don't create new nodes. Instead, you collaborate
with your friends and find yourselves in the
graph.
However, because you did not specifically target
certain people, you may not be able to identify
other people when you find yourself.