Title: Experience with an Object Reputation System for Peer-to-Peer File Sharing
1Experience with an Object Reputation System for
Peer-to-Peer File Sharing
- Kevin Walsh, Emin Gün Sirer
- Cornell University
Gayatri Swamynathan, CURRENT Group Meeting
4/28/06
2The Problem
- Pollution in P2P networks
- Wasted resources
- Mislabeled content
- Mal-content (viruses and Trojans)
3Typical Approaches for Object Reputation
- Determine peer trustworthiness
- Firsthand experience limited
- Assume a shared file is an endorsement of the file contents
- Not reliable
4The Need
- A trustworthiness metric that is robust,
predictive, and invariant under changing network
conditions and peer resource constraints
- Credence: a decentralized object reputation and ranking system for large P2P networks
5The Proposed Solution
- Clients explicitly label files as authentic or polluted
- Reputation scores for peers are computed from a statistical measure of the reliability of each peer's past voting habits
- Flow-based trust computation
6Paper Outline
- Overview of the approach
- All about votes
- Evaluation
7Overview of the approach
- Goal: to distinguish between authentic content and polluted content
- Users VOTE on objects based on their judgment
- Users COLLECT votes to evaluate the authenticity of the object they are querying
- Users EVALUATE votes from peers to determine the credibility of those peers from their own perspective
- Users have realistic incentives to participate honestly in the reputation system
8It's All About Voting
9Votes in Credence
- Three requirements
- All clients must agree on a common syntax and semantics
- Votes must be simple to encode and decode, yet expressive
- Semantics must be faithful to the user's intentions when voting
10Votes in Credence
- A vote is a signed tuple (H, S, T), signed under K
- H: File content hash
- S: Statement about the file
- T: Timestamp
- K: Certificate
- Statement
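The vote tuple above could be modeled roughly as follows. This is a minimal sketch, not the Credence wire format: the field names, the `|`-separated encoding, and the use of HMAC in place of a certificate-backed public-key signature are all assumptions made for illustration.

```python
import hashlib
import hmac
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    file_hash: str   # H: hash of the file content
    statement: str   # S: statement about the file
    timestamp: int   # T: when the vote was cast
    key_id: str      # K: hypothetical stand-in for the voter's certificate

    def payload(self) -> bytes:
        # Assumed canonical encoding of (H, S, T) that gets signed.
        return f"{self.file_hash}|{self.statement}|{self.timestamp}".encode()


def sign_vote(vote: Vote, secret: bytes) -> str:
    # HMAC is only a placeholder; a real system would use a public-key
    # signature that other peers can check against the certificate K.
    return hmac.new(secret, vote.payload(), hashlib.sha256).hexdigest()


def verify_vote(vote: Vote, signature: str, secret: bytes) -> bool:
    return hmac.compare_digest(sign_vote(vote, secret), signature)


content = b"example file bytes"
vote = Vote(hashlib.sha1(content).hexdigest(), "authentic", int(time.time()), "peer-A")
sig = sign_vote(vote, b"demo-secret")
```

Because the signature covers H, S, and T together, changing any one of them (for example flipping the statement) invalidates the vote.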
11Statements - Examples
- Statements are useful for thwarting relabeling attacks
- E.g., if the metadata is changed, negative votes evaluated in the new context might apply positively to the modified metadata
12Collecting and Storing Votes
- Vote-gather query
- Reactive, pull-based
- Highest-weight votes are sent if a given peer stores multiple votes
- Vote database
- Other votes received
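One way a peer might answer a vote-gather query is sketched below. The `VoteDatabase` class, its method names, and the `limit` parameter are hypothetical; the slide only states that the highest-weight votes are sent when a peer stores several.

```python
from collections import defaultdict


class VoteDatabase:
    """Hypothetical local store: file hash -> list of (weight, voter, label)."""

    def __init__(self):
        self._votes = defaultdict(list)

    def store(self, file_hash, voter, label, weight):
        self._votes[file_hash].append((weight, voter, label))

    def answer_query(self, file_hash, limit=2):
        # Pull-based: only respond when asked, returning the stored votes
        # with the highest weight first, truncated to `limit` entries.
        stored = self._votes.get(file_hash, [])
        return sorted(stored, key=lambda v: v[0], reverse=True)[:limit]


db = VoteDatabase()
db.store("h1", "A", +1, 0.9)
db.store("h1", "B", -1, 0.2)
db.store("h1", "C", +1, 0.6)
reply = db.answer_query("h1")
```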
13Weighing Votes
- Simple tabulation would be vulnerable to malicious voters
- The Credence client assigns a trust weight to each vote, then takes a weighted average to estimate the object's reputation
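A weighted average of this kind might look like the sketch below. The encoding of votes as +1 (authentic) and -1 (polluted), the normalization by the sum of absolute weights, and the default weight of 0 for unknown voters are assumptions, not details taken from the slides.

```python
def estimate_reputation(votes, trust):
    """Estimate an object's reputation in [-1, 1].

    votes: iterable of (voter, value), value +1 (authentic) or -1 (polluted).
    trust: dict mapping voter -> weight; unknown voters get weight 0.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for voter, value in votes:
        w = trust.get(voter, 0.0)
        weighted_sum += w * value
        total_weight += abs(w)
    return weighted_sum / total_weight if total_weight else 0.0


votes = [("A", +1), ("B", -1), ("C", -1)]
trust = {"A": 0.9, "B": 0.1, "C": 0.1}
simple_tally = sum(value for _, value in votes)
estimate = estimate_reputation(votes, trust)
```

In this example the simple tally is negative because two of three peers voted "polluted", while the weighted estimate is positive because the one trusted peer voted "authentic".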
14Computing Correlations
- Compare the shared voting history for each pair of peers
- Correlation coefficient $\theta$ obtained by comparing conflicting votes and agreeing votes
- $\theta = \dfrac{p - ab}{\sqrt{a(1-a)\,b(1-b)}}$
- $a$, $b$: fraction of votes from A, B with positive intention
- $p$: fraction of pairs that agree, with both votes having positive intention
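The correlation coefficient can be computed directly from two peers' voting histories. The sketch below assumes votes are stored as a dict from file hash to a boolean (True for a positive vote) and that a, b, and p are all measured over the files both peers voted on; returning `None` when the coefficient is undefined is also an assumption.

```python
import math


def correlation(votes_a, votes_b):
    """Correlation of two peers' votes over the files both voted on."""
    shared = votes_a.keys() & votes_b.keys()
    n = len(shared)
    if n == 0:
        return None  # no overlapping voting history
    a = sum(votes_a[f] for f in shared) / n                    # positive fraction for A
    b = sum(votes_b[f] for f in shared) / n                    # positive fraction for B
    p = sum(votes_a[f] and votes_b[f] for f in shared) / n     # both positive
    denom = math.sqrt(a * (1 - a) * b * (1 - b))
    if denom == 0:
        return None  # undefined when a peer votes the same way on every shared file
    return (p - a * b) / denom


peer_a = {"f1": True, "f2": True, "f3": False, "f4": False}
agrees = dict(peer_a)
disagrees = {f: not v for f, v in peer_a.items()}
```

Here `correlation(peer_a, agrees)` is 1.0 and `correlation(peer_a, disagrees)` is -1.0, matching the intuition that perfectly agreeing and perfectly opposing peers sit at the two extremes.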
15Flow-based Peer Reputation
- Used in absence of overlapping set of votes
- Leverage strong correlations with other peers transitively
- Gossip-based exchange of locally computed correlation coefficients
- Reliability?
- Request all inputs from neighbor to recompute
- Signatures
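To illustrate the idea of using correlations transitively, the sketch below takes the strongest multiplicative path between two peers within a hop limit. This is a deliberately simplified stand-in and not the flow-based computation used by Credence; the graph layout, the `max_hops` parameter, and the choice to follow only positive correlations are all assumptions.

```python
def transitive_trust(graph, source, target, max_hops=3):
    """Strongest product of positive correlations along any short path.

    graph: dict mapping peer -> {neighbor: correlation in [-1, 1]}.
    """
    best = 0.0
    stack = [(source, 1.0, {source})]
    while stack:
        node, strength, seen = stack.pop()
        for neighbor, corr in graph.get(node, {}).items():
            if neighbor in seen or corr <= 0:
                continue  # skip cycles and non-positive correlations
            combined = strength * corr
            if neighbor == target:
                best = max(best, combined)
            elif len(seen) < max_hops:
                stack.append((neighbor, combined, seen | {neighbor}))
    return best


graph = {
    "A": {"B": 0.8, "C": 0.5},
    "B": {"D": 0.9},
    "C": {"D": 0.6},
}
trust_a_to_d = transitive_trust(graph, "A", "D")
```

Peer A has no direct correlation with D, but inherits an estimate of about 0.72 through B, the stronger of the two available paths.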
16Evaluation Overview
- First P2P reputation system
- 10,000 downloads since March 2005
- 2 crawlers collected snapshots of the network structure
- Data compiled from about 1,200 Credence clients, 39,000 votes, and 84,000 files
17Experiences: Graph Structure
18Credence Reputation Graph
19Classification of Credence users
20Local and Transitive Correlations
21Local and Transitive Correlations
22Assimilation of new clients
23Experiences: Files in Credence
24Decoys and Artifacts
25File Voting Popularity
26Voting is independent of sharing
and can even contradict sharing!
27Experiences: Performance
28Resistance to Decoys and Collusion
29Ranking Performance
- Credence is able to estimate file authenticity based on votes in the client's local vote database, and to modify the search result order accordingly
- Legacy ranking based on file popularity is no better than a random ordering in about 10% of cases
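Reordering search results by estimated authenticity could be as simple as the sketch below. The function name, the use of file hashes as result identifiers, and the neutral score of 0 for files with no votes are assumptions for illustration.

```python
def rerank(results, reputation):
    """Order search results by estimated reputation, highest first.

    results: file hashes in the legacy (popularity-based) order.
    reputation: dict mapping file hash -> estimate in [-1, 1].
    Files without an estimate are treated as neutral (0.0); the sort is
    stable, so ties keep their legacy order.
    """
    return sorted(results, key=lambda h: reputation.get(h, 0.0), reverse=True)


legacy_order = ["f1", "f2", "f3", "f4"]
reputation = {"f1": -0.8, "f3": 0.9}
ranked = rerank(legacy_order, reputation)
```

The likely-polluted file `f1` drops from first to last, while files with no votes stay between the trusted and distrusted results.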
30Responses to Attack
- Naïve attacker
- Random attacker
- Rational attacker
- Whitewashing attack
31Credence Overhead
- Inbound traffic: a highly active client that has cast 250 votes and been informed of 10,000 other votes receives approximately 100 bytes per second of additional background traffic in Credence
- Outbound traffic depends on the popularity of the client's votes, the client's reputation, and its Gnutella connectivity
- Processing overhead
32Thank you