FRIENDS: File Retrieval In a dEcentralized Network Distribution System - PowerPoint PPT Presentation

About This Presentation
Title:

FRIENDS: File Retrieval In a dEcentralized Network Distribution System

Description:

Illegal file sharing has drawn negative publicity. P2P systems ... Kazaa, Grokster, Morpheus. Hierarchy of nodes like DNS. Bandwidth is not well distributed ... – PowerPoint PPT presentation

Number of Views:52
Avg rating:3.0/5.0
Slides: 17
Provided by: cuis
Learn more at: https://cseweb.ucsd.edu
Category:

less

Transcript and Presenter's Notes

Title: FRIENDS: File Retrieval In a dEcentralized Network Distribution System


1
FRIENDS File Retrieval In a dEcentralized
Network Distribution System
  • Steven Huang, Kevin Li
  • Computer Science and Engineering
  • University of California, San Diego

2
FRIENDS
  • Motivation
  • Related works
  • Initial approaches
  • Implementation
  • Future work

3
Peer-to-peer File Sharing
  • Illegal file sharing has drawn negative publicity
  • P2P systems have nice features
  • Low barrier of entry
  • Aggregates computation and storage on large scale
  • Robust, secure

4
Early P2P File Distribution Systems
  • Napster (http//www.napster.com)
  • Uses central database
  • reliability issues
  • Gnutella (http//www.gnutella.com)
  • Uses broadcast
  • Not scalable
  • Kazaa, Grokster, Morpheus
  • Hierarchy of nodes like DNS
  • Bandwidth is not well distributed
  • Reliability issues

5
More P2P File Distribution Systems
  • Freenet Clarke, 2000
  • Focus on anonymity, files get lost
  • BitTorrent (http//www.bittorrent.com)
  • Focus on distributing file to many peers
  • Not fully decentralized, requires tracker

6
Related Works
  • Recent popularity has resulted in lots of
    research in P2P
  • Focus is on scalability, balance, flexibility
  • Overlay Networks
  • CAN RATNASAMY, 2001
  • Chord STOICA, 2001
  • Pastry ROWSTRON, 2001
  • Tapestry HILDRUM, 2002
  • Object Integrity unchecked

7
Related Works
  • MD5
  • Replication
  • Reputation Management
  • Distributed EigenTrust Kamvar, 2003
  • Distributed Hash Tables

8
Research Question
  • How can a fully decentralized P2P file
    distribution network support the verification of
    object integrity in a hostile environment?

9
Initial Approach
  • Global table of file names with hash values
  • Not decentralized, not scalable
  • Use time of object entry as indicator of file
    validity
  • Malicious nodes can tamper with time
  • Bootstrap to physically close node using IP
    address
  • Avoid multiple malicious nodes connecting to each
    other
  • Reduces overall bandwidth
  • Scalability issues, limited view of network
  • Use a single MD5 value per file
  • Waste a lot of bandwidth downloading 99 of an
    invalid file

10
FRIENDS System Implementation
  • Built using MACEDON
  • MACEDON hides low level details allowing us to
    focus on higher level research questions and
    reducing code
  • Built protocol on top of Chord
  • Pros Scalable, de-centralized, supports
    distributed index
  • Cons Overlay network lacks bandwidth
    optimizations
  • Focus is on proof of concept rather than
    performance

11
FRIENDS System Details
  • Verification needs to be done client size since
    anyone can be an adversary
  • Each unique object in the system has a set of MD5
    hash value associated with it
  • Files are broken up into 512kB sized chunks, with
    an associated MD5 value
  • There is also an MD5 of all the MD5s which serves
    to reduce bootstrapping costs
  • After downloading a portion of a file, run the
    object through the given hash function to verify
    the data is correct
  • Reputation system limits downloads from malicious
    nodes

12
Distributed Hash Table
  • Use a Distributed Hash Table to store the global
    table of object to hash value.
  • Upon entering the system, the node will calculate
    the hash values of any new objects it wishes to
    add to the system
  • Hashing greenday basketcase sends file
    information to node for greenday and node for
    basketcase

13
Can I trust you?
  • Malicious users can target the system in a number
    of ways
  • Incorrect routing checks can be made to ensure
    progress
  • Hosting invalid files MD5 hashes ensure that
    the file being retrieved is the file desired
  • Corrupted Hash Table solved with replication
    across multiple nodes
  • Malicious nodes can still affect a cooperative
    node by wasting its time and bandwidth

14
Blacklisting
  • Keep a rating of nodes that have hosted
    downloaded objects.
  • Each successfully hashed object increases that
    nodes rating (1)
  • Failed hash decreases it (-5)
  • Filter potential sources against personal
    reputation records
  • Friends trust friends
  • Increase rating of nodes trusted by trusted nodes
  • Exponential back-off

15
Future Work
  • Replication
  • Square-root replication Lv, 2002 Optimal
    number of replications 1/? (Svqi)2 where qi is
    the request frequency of object I and ? is the
    average number of replicas per site
  • Bandwidth comparison tests using ModelNET
  • Improve file name indexing
  • Exploit locality to optimize bandwidth

16
Questions?
Write a Comment
User Comments (0)
About PowerShow.com