Title: Vanish: Increasing Data Privacy with Self-Destructing Data
1 Vanish: Increasing Data Privacy with Self-Destructing Data
- Roxana Geambasu, Yoshi Kohno, Amit A. Levy, and Henry M. Levy
- University of Washington
- Slides by Gal Motika
2 Outline
- Motivating Problem
- Goals
- Distributed Hash Tables (DHTs)
- How Vanish Works
- Availability and Performance Analysis
- Security Analysis
3 Motivation: Data Lives Forever
- How can Ann delete her sensitive email?
- She doesn't know where all the copies are.
- Services may retain data long after the user tries to delete it.
[Figure: Ann sends a sensitive email to Carla; multiple copies of the sensitive message are shown.]
4 Motivation: Data Lives Forever
[Figure: Some time after Ann sends the sensitive email to Carla, an attacker mounts a retroactive attack on archived data.]
5 Self-Destructing Data Model
[Figure: a sensitive email sent as self-destructing data with a timeout.]
- VDO (Vanish Data Object): email, Facebook message, text message.
- Until the timeout, the VDO is readable.
- After the timeout, all copies become permanently unreadable.
- Even for attackers who obtain an archived copy and the user's keys.
6 Assumptions
- The VDO will be used to encapsulate data that is only of value to the user for a limited time.
- Every message has a known timeout.
- Users are connected to the Internet when interacting with VDOs.
- Early destruction is preferable to information exposure.
7 Goals
- A VDO must expire automatically and without any explicit action.
- The VDO should be accessible until timeout.
- Leverage existing infrastructures.
- The system must not require the use of dedicated secure hardware.
- The system should not introduce new privacy risks to the users.
8 Distributed Hash Tables (DHTs)
- A distributed, peer-to-peer (P2P) storage network consisting of multiple participating nodes.
- Data is stored as (index, value) pairs.
- Supports lookup, get, and store operations.
9 Key DHT-related Insights
- Huge scale: millions of nodes.
- Geographic distribution: nodes are located in over 190 countries.
- Decentralization: individually owned, no single point of trust.
- Constant evolution: DHTs evolve naturally and dynamically over time as new nodes constantly join and old nodes leave.
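10 Data Encapsulation
[Figure: Ann calls Encapsulate(data, timeout). Vanish computes C = E_K(data), splits the key K into shares k1, k2, k3, ..., kN using secret sharing (M of N), and stores the shares at random indexes in the DHT. The resulting Vanish Data Object, VDO = {C, L}, goes from Ann to Carla.]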
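The "Secret Sharing (M of N)" step can be illustrated with a minimal sketch. It assumes Shamir's scheme over a prime field; the slide does not name the scheme, and the field size, function names, and the 7-of-10 parameters below are illustrative choices only.

```python
import secrets

# Assumed field: a Mersenne prime far larger than any 128-bit key.
PRIME = 2**521 - 1


def split_secret(secret: int, m: int, n: int) -> list[tuple[int, int]]:
    """Split `secret` into n shares such that any m of them recover it."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    # Random polynomial of degree m - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(m - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares: list[tuple[int, int]]) -> int:
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = secrets.randbits(128)               # stand-in for the data key K
shares = split_secret(key, m=7, n=10)     # k1 .. kN
assert recover_secret(shares[:7]) == key  # any 7 of the 10 shares suffice
assert recover_secret(shares[3:]) == key
```

With fewer than M shares the interpolation yields an unrelated value, which is the property the timeout mechanism relies on.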
11 Data Decapsulation
[Figure: Carla calls Decapsulate(VDO = {C, L}). Vanish fetches the key shares from the random indexes in the DHT; even though some shares are missing (X), the remaining shares are enough for secret sharing (M of N) to rebuild K, and the result is data = D_K(C).]
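One way to picture how the "random indexes" can be found again at decapsulation time is to derive them from L. The sketch below assumes that index i is SHA-1(L || i) and uses a plain dictionary as a stand-in for the DHT; both choices, the function names, and the placeholder byte strings used as key shares are hypothetical and are not taken from the slides.

```python
import hashlib
import secrets


def derive_indexes(access_key: bytes, n: int) -> list[int]:
    """Derive n pseudorandom 160-bit indexes from the access key L.

    Assumption: index i is SHA-1(L || i); any keyed PRNG would serve.
    """
    return [
        int.from_bytes(hashlib.sha1(access_key + i.to_bytes(4, "big")).digest(), "big")
        for i in range(n)
    ]


def store_shares(dht: dict[int, bytes], access_key: bytes, shares: list[bytes]) -> None:
    """Encapsulation side: put share i at the i-th derived index."""
    for index, share in zip(derive_indexes(access_key, len(shares)), shares):
        dht[index] = share


def fetch_shares(dht: dict[int, bytes], access_key: bytes, n: int, m: int) -> list[tuple[int, bytes]]:
    """Decapsulation side: return (share number, share) pairs, or fail if fewer than m remain."""
    found = [
        (i, dht[index])
        for i, index in enumerate(derive_indexes(access_key, n))
        if index in dht
    ]
    if len(found) < m:
        raise LookupError(f"only {len(found)} of {n} shares left; need {m}")
    return found


dht = {}                                  # stand-in for the real DHT
L = secrets.token_bytes(16)               # access key carried inside the VDO
store_shares(dht, L, [f"share-{i}".encode() for i in range(10)])

del dht[derive_indexes(L, 10)[1]]         # one share is lost (an X in the figure)
print(len(fetch_shares(dht, L, n=10, m=7)))
```

Because the indexes depend only on L, anyone holding the VDO can locate the shares, and nobody needs to remember where they were stored.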
12 Data Timeout
- The DHT loses key pieces over time:
- Natural churn: nodes crash or leave the DHT.
- Built-in timeout: DHT nodes purge data periodically.
- Key loss makes all data copies permanently unreadable.
[Figure: Over time, key shares disappear from the DHT (marked X); once too few remain for secret sharing (M of N), data = D_K(C) can no longer be computed.]
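The link between share loss and readability can be made concrete with a small calculation. The sketch assumes that every share survives independently with the same probability, which is a simplification that the slides do not state; the 7-of-10 parameters are illustrative only.

```python
from math import comb


def p_readable(n: int, m: int, p_share: float) -> float:
    """Probability that at least m of n shares survive.

    Assumption: each share survives independently with probability p_share.
    """
    return sum(
        comb(n, k) * p_share**k * (1 - p_share) ** (n - k)
        for k in range(m, n + 1)
    )


# Illustrative numbers only: 10 shares, 7 needed.
for p in (0.95, 0.7, 0.3):
    print(f"p_share={p:.2f} -> P(readable)={p_readable(10, 7, p):.4f}")
```

As the per-share survival probability falls, the chance of still holding M shares drops sharply, which is the behavior the timeout depends on.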
13 The Vuze DHT
- Each node has a 160-bit ID based on its IP and port. The ID determines the index ranges that the node will store.
- To store an (index, value) pair, a client looks up the 20 nodes with IDs closest to the specified index.
- Entries in a node's cache are republished every 30 minutes to the other 19 closest nodes.
- Nodes remove from their caches all values whose store timestamp is more than 8 hours old.
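The replica selection and purge rules above can be sketched as follows. Two points are assumptions and not statements from the slides: the node ID is computed as SHA-1 over "ip:port", and "closest" is measured with Kademlia-style XOR distance. All names are hypothetical.

```python
import hashlib

REPLICAS = 20                 # nodes that hold each (index, value) pair
MAX_AGE_SECONDS = 8 * 3600    # values older than 8 hours are purged


def node_id(ip: str, port: int) -> int:
    """160-bit node ID. Assumption: SHA-1 over "ip:port"."""
    return int.from_bytes(hashlib.sha1(f"{ip}:{port}".encode()).digest(), "big")


def closest_nodes(node_ids: list[int], index: int, count: int = REPLICAS) -> list[int]:
    """IDs closest to `index`. Assumption: Kademlia-style XOR distance."""
    return sorted(node_ids, key=lambda nid: nid ^ index)[:count]


def purge(cache: dict[int, tuple[bytes, float]], now: float) -> None:
    """Drop every cached value whose store timestamp is over 8 hours old."""
    expired = [i for i, (_, stored_at) in cache.items() if now - stored_at > MAX_AGE_SECONDS]
    for index in expired:
        del cache[index]


nodes = [node_id(f"10.0.{i // 256}.{i % 256}", 6881) for i in range(1000)]
index = int.from_bytes(hashlib.sha1(b"some share").digest(), "big")
replicas = closest_nodes(nodes, index)

cache = {index: (b"key share", 0.0)}      # value stored at time 0
purge(cache, now=9 * 3600)                # nine hours after the store
print(len(replicas), len(cache))          # prints: 20 0
```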
14 Availability Evaluation
- Pushed 1,000 VDO shares to pseudorandom indices in the Vuze DHT and then polled them back.
- Repeated this experiment 100 times over a 3-day period.
- Used the standard 8-hour Vuze timeout.
15 Availability Evaluation (Cont.)
- N = 50 and a threshold of 90% are recommended for high availability.
16 Performance Evaluation
- Encryption/decryption time is negligible.
- The DHT component accounts for over 99% of the execution time.
- The encapsulation/decapsulation times were measured.
17 Security Analysis
- The attacker may have access to the sender's computer, the email provider, or the DHT.
- The key shares are unlikely to remain in the DHT long after the timeout.
- After the timeout, many of the hosting nodes will have long since disappeared or changed their IDs.
- Even for legal authorities, it will be difficult to reconstruct the lost data.
- The relevant attacks are therefore those that can be carried out before the timeout.
18 Strategy (1): Decapsulate VDO Prior to Expiration
- An attacker might try to obtain a copy of the VDO and revoke its privacy prior to its expiration.
- Example: an email provider that proactively decapsulates all VDO emails in real time.
- Defense: encapsulate VDOs in traditional encryption schemes, such as PGP or GPG.
19 Strategy (2): Sniff User's Internet Connection
- An attacker sniffs the data users push into or retrieve from the DHT.
- Example: an ISP or employer.
- Defense:
- Encrypt DHT communications between nodes.
- Compose with Tor to tunnel one's interactions with a DHT through remote machines.
- The man-in-the-middle attack is not solved.
20 Strategy (3): Integrate into DHT
- The attacker integrates itself into the DHT in order to create copies of all data that it is asked to store.
- The attacker intercepts internal DHT lookup procedures and then issues get requests of its own for the learned indices.
- Standard DHT attacks (Sybil, Eclipse) are handled by the Vuze DHT (the ID is based on the IP) or by changing the Vuze client.
21 Experimental Methodology
- The experiment cannot be run on the real DHT, because the attacker needs to acquire as many nodes as possible.
- DHTs of 1,000, 2,000, 4,500, and 8,000 nodes were tested.
- Churn (node death and birth) is modeled by a Poisson distribution with a median lifetime of 2 hours.
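A rough feel for what a 2-hour median lifetime means can be obtained from the sketch below. It assumes that node deaths follow a memoryless (Poisson) process, so that lifetimes are exponentially distributed with that median; it models node deaths only, not births, and it is not the simulator used in the experiments.

```python
import math
import random

MEDIAN_LIFETIME_HOURS = 2.0
RATE = math.log(2) / MEDIAN_LIFETIME_HOURS   # exponential rate with that median


def survival_probability(hours: float) -> float:
    """P(a node is still alive after `hours`) = 2 ** (-hours / median)."""
    return math.exp(-RATE * hours)


def simulate_survivors(nodes: int, hours: float, rng: random.Random) -> int:
    """Sample one lifetime per node and count those exceeding `hours`."""
    return sum(rng.expovariate(RATE) > hours for _ in range(nodes))


rng = random.Random(0)   # fixed seed so that runs are repeatable
for t in (2, 4, 8):
    alive = simulate_survivors(1000, t, rng)
    print(f"after {t} h: expected {survival_probability(t):.4f}, simulated {alive / 1000:.3f}")
```

Under this assumption half of the nodes are gone after 2 hours and only 1/16 of them (6.25%) are still present after 8 hours.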
22 Store Sniffing Attack
- The adversary saves all of the index-to-value mappings it receives from peers:
- Via store messages.
- Via replication (every 30 minutes to the 20 closest nodes).
- The attacker compromised 5% of a 1,000-node DHT.
23 Store Sniffing Attack: Attacker Sizes
- None of the 1,000 tested VDOs was compromised.
- Parameters: N = 150, 2-hour churn.
24 Lookup Sniffing Attack
- Lookup requests pass through multiple nodes.
- The attacker can fetch the value of the searched index.
- Defense: look up a different index that maps to the same node ID.
- For 1M nodes and 160-bit indices, the first 20 bits are the ID of the node.
- On lookup, randomize the last 80 bits, so that it will be impossible for the attacker to get the key.
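The defense above can be sketched as a small bit manipulation: keep the high-order bits of the real index, which are enough to reach the same nodes, and replace the low 80 bits with fresh randomness for each lookup. The function names are hypothetical, and the sketch covers only how the lookup target is built, not the DHT traffic itself.

```python
import secrets

INDEX_BITS = 160
RANDOM_BITS = 80     # low-order bits replaced on every lookup


def obfuscated_lookup_target(index: int, random_bits: int = RANDOM_BITS) -> int:
    """Keep the high-order bits of `index` and randomize the low-order ones."""
    prefix = (index >> random_bits) << random_bits
    return prefix | secrets.randbits(random_bits)


def shared_prefix_len(a: int, b: int, bits: int = INDEX_BITS) -> int:
    """Number of leading bits on which two indexes agree."""
    return bits - (a ^ b).bit_length()


real_index = secrets.randbits(INDEX_BITS)
target = obfuscated_lookup_target(real_index)
# The target always agrees with the real index on at least the top 80 bits.
print(shared_prefix_len(real_index, target) >= INDEX_BITS - RANDOM_BITS)   # prints: True
```

Nodes along the lookup path see only the obfuscated target, so they cannot issue a get for the real index.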
25 Conclusions
- Vanish causes sensitive information, such as emails, files, or text messages, to irreversibly self-destruct.
- Without any action on the user's part.
- Without any centralized or trusted system.
- Vanish is robust against adversarial attacks.
- Limitations: in Vuze, the fixed data timeout presents a challenge for a self-destructing data system.
26 Questions?