1
BeeHive: Exploiting Power Law Query Distributions for O(1) Lookup Performance in P2P Overlays
  • Venugopalan Ramasubramanian (Rama) and Emin Gün Sirer

2
introduction
  • distributed peer-to-peer overlay networks
  • decentralized
  • self-organized
  • distributed hash tables (DHTs)
  • store/lookup interface
  • unstructured overlays: Freenet, Gnutella, Kazaa
  • poor lookup performance: accuracy and latency

3
structured overlays
lookup performance
  CAN                                          O(d N^(1/d))
  Chord, Kademlia, Pastry, Tapestry, Viceroy   O(log N)
  de Bruijn graphs (Koorde)                    O(log N / log log N)
  Kelips                                       O(1)
  • latency sensitive applications
  • domain name service (DNS) and web access

              median    mean
overlay RTT   81.9 ms   202 ms
DNS lookup    112 ms    256 ms
4
overview of beehive
  • general replication framework
  • proactive replication based on the popularity of objects
  • exploits the structure of DHTs
  • goals
  • performance: O(1) amortized lookup time
  • scalability: minimize the number of replicas and reduce storage,
    bandwidth, and network load
  • adaptivity: promptly respond to changes in popularity (flash crowds)
  • mutable objects: quickly disseminate object updates to all replicas

5
prefix matching DHTs
(figure: prefix-matching lookup for object 0121, starting at node 2012)
6
key intuition
(figure: lookup path 2012 → 0021 → 0112 → 0122, matching one more prefix digit of object 0121 at each hop)
by replicating popular objects more and unpopular
objects less, the overall lookup performance can
be tuned to any desired constant efficiently.
7
popularity based replication
  • levels of replication: 0, 1, 2, ...
  • at level i, an object is replicated on the nodes that share i matching
    prefix digits with it
  • replicated on N/b^i nodes
  • i-hop lookup latency
  • lower level → greater replication (see the sketch after this slide)
  • default level ⌈log_b N⌉
  • replicated only at the home node
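As a rough illustration of the level-to-replica-count relationship above (a sketch of my own, not code from the paper), the expected number of replicas at level i follows directly from N and b:

    # replicas implied by a replication level in a prefix-matching DHT (illustrative)
    def replicas_at_level(n_nodes: int, base: int, level: int) -> float:
        """An object at level i is stored on roughly N / b^i nodes."""
        return n_nodes / base ** level

    # example values: N = 10,000 nodes, base b = 32 (as in the later example slide)
    for i in range(4):
        print(i, replicas_at_level(10_000, 32, i))
    # level 0 -> 10000 (every node), level 1 -> 312.5,
    # level 2 -> ~9.8, level 3 -> < 1 (effectively the home node only)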

8
analytical model
  • optimization problem
  • minimize the total number of replicas, s.t.
  • average lookup performance ≤ C
  • zipf-like power-law query distributions
  • popularity of the i-th most popular object ∝ 1/i^α (a small
    illustration follows this slide)
  • DNS requests: α = 0.91 (MIT trace)
  • web requests:

trace   DEC    UPisa   FuNet   UCB    Quest   NLANR
alpha   0.83   0.84    0.84    0.83   0.88    0.90
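To make the Zipf parameter concrete, here is a small sketch (mine, not from the paper) of the fraction of queries captured by the k most popular of m objects when the popularity of rank i is proportional to 1/i^α:

    # fraction of queries that target the k most popular of m objects (Zipf, parameter alpha)
    def zipf_top_k_fraction(k: int, m: int, alpha: float) -> float:
        total = sum(1.0 / i ** alpha for i in range(1, m + 1))
        top = sum(1.0 / i ** alpha for i in range(1, k + 1))
        return top / total

    # with alpha = 0.91 (MIT DNS trace) and 300,000 names, the top 1% of names
    # accounts for roughly half of all queries
    print(zipf_top_k_fraction(3_000, 300_000, 0.91))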
9
optimization problem
  • minimize (number of replicas):
    x_0 + x_1/b + x_2/b^2 + ... + x_(K-1)/b^(K-1)
  • such that (average lookup time is at most C hops):
    x_0^(1-α) + x_1^(1-α) + x_2^(1-α) + ... + x_(K-1)^(1-α) ≥ K - C
  • and
    x_0 ≤ x_1 ≤ x_2 ≤ ... ≤ x_(K-1) ≤ 1
  • b: base; K = log_b(N)
  • x_j: fraction of objects replicated at level j or lower

10
optimal solution
d = b^((1-α)/α)
K' (the replication level at which x_i reaches 1) is determined by setting
x_(K'-1) ≤ 1, i.e.,
    d^(K'-1) (K' - C) / (1 + d + ... + d^(K'-1)) ≤ 1    (typically K' is 2 or 3)
optimal replicas per node:
    (1 - 1/b) (K' - C)^(1/(1-α)) / (1 + d + ... + d^(K'-1))^(α/(1-α)) + 1/b^K'
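The sketch below (my own reading of the formulas above, not the authors' code) evaluates the closed-form solution. The per-level fractions x_i follow from the optimization on the previous slide, and the rule for picking K' — the largest level whose optimal fraction stays ≤ 1 — is my assumption.

    # evaluate the closed-form solution above (illustrative sketch, not the authors' code)
    def beehive_fractions(alpha: float, b: int, C: float, K: int) -> list[float]:
        """Return x_0..x_{K-1}, the fraction of objects replicated at level i or lower."""
        d = b ** ((1 - alpha) / alpha)
        # assumption: K' is the largest level (<= K) whose optimal x_{K'-1} stays <= 1
        k_prime = 1
        for k in range(1, K + 1):
            denom = sum(d ** j for j in range(k))
            if d ** (k - 1) * (k - C) / denom <= 1:
                k_prime = k
        denom = sum(d ** j for j in range(k_prime))
        xs = [(d ** i * (k_prime - C) / denom) ** (1 / (1 - alpha)) for i in range(k_prime)]
        return xs + [1.0] * (K - k_prime)   # objects are not replicated beyond level K'

    # exact values depend on how K' and C are chosen and may differ from the example slide
    print(beehive_fractions(alpha=0.9, b=32, C=1, K=3))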
11
example
  • b = 32
  • C = 1
  • α = 0.9
  • N = 10,000
  • M = 1,000,000
  • x_0 = 0.001102 → 1102 objects
  • x_1 = 0.0519 → 51,900 objects
  • x_2 = 1
  • total storage: 3700 objects per node
  • total storage for Kelips: M/√N = 10,000 objects per node
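As a quick check of the storage arithmetic in this example (a sketch of my own), per-node storage is M times the objective x_0 + x_1/b + x_2/b^2:

    # per-node storage implied by the example's replication fractions
    M, b, N = 1_000_000, 32, 10_000
    x = [0.001102, 0.0519, 1.0]                    # x_0, x_1, x_2 from this slide
    per_node = M * sum(xi / b ** i for i, xi in enumerate(x))
    print(round(per_node))                         # ~3700 objects per node
    print(M / N ** 0.5)                            # Kelips: M / sqrt(N) = 10,000 objects per node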

12
performance vs overhead trade off
13
analytical model
  • configurable target lookup performance
  • continuous range
  • even better with proximity routing
  • minimizing number of replicas provides storage as
    well as bandwidth efficiency
  • K is an upper bound on the lookup performance of a successful query
  • assumptions
  • homogeneous object sizes
  • infrequent updates

14
beehive replication protocol
  • aggregation phase
  • popularity of objects, zipf parameter
  • local measurement and limited aggregation
  • analysis phase
  • apply the analytical model (see the sketch after this slide)
  • locally change replication level
  • replication phase
  • push new replicas to nodes one hop away
  • remove old replicas no longer required
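A minimal sketch of the analysis step (my own illustration under assumptions, not the protocol's actual code): given the target fractions x_i from the analytical model and an object's locally estimated popularity rank, pick its replication level.

    # hypothetical analysis step: map an object's popularity rank to a replication level
    def choose_level(rank: int, total_objects: int, xs: list[float]) -> int:
        """xs[i] is the target fraction of objects at level i or lower (x_0 <= ... <= x_{K-1})."""
        fraction = rank / total_objects            # rank 1 = most popular object
        for level, x in enumerate(xs):
            if fraction <= x:
                return level                       # popular enough for this (more replicated) level
        return len(xs)                             # otherwise stay at the home-node level

    # e.g. the 500th most popular of 1,000,000 objects with the example fractions
    print(choose_level(500, 1_000_000, [0.001102, 0.0519, 1.0]))   # -> 0: replicate everywhere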

15
beehive replication protocol
home node
0 1 2
E
L 3
0 1
0 1
0 1
E
B
I
L 2
0
0
0
0
0
0
0
0
0
L 1
A
B
C
D
E
F
G
H
I
16
beehive replication protocol
  • periodic packets to nodes in routing table
  • asynchronous and independent
  • exploit structure of underlying DHT
  • replication packet sent by node A to each node B
    in level i of routing table
  • node B pushes new replicas to A and tells A which
    replicas to remove
  • fluctuations in estimated popularity
  • aging to prevent sudden changes
  • hysteresis to limit thrashing (see the sketch after this slide)
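One plausible way to realize the aging and hysteresis mentioned above (an assumed sketch; the paper does not prescribe these exact rules or constants):

    # hypothetical aging + hysteresis for popularity estimates (constants are assumptions)
    DECAY = 0.5        # exponential aging factor per aggregation round
    PROMOTE = 1.2      # require the estimate to clearly exceed the threshold before adding replicas
    DEMOTE = 0.8       # and to fall clearly below it before removing them

    def age_estimate(old_rate: float, new_count: float) -> float:
        """Blend the running popularity estimate with the latest local measurement."""
        return DECAY * old_rate + (1 - DECAY) * new_count

    def next_level(current_level: int, rate: float, threshold: float) -> int:
        """Change the replication level only when the rate clearly crosses the threshold."""
        if rate > threshold * PROMOTE:
            return max(current_level - 1, 0)   # more popular -> lower level, more replicas
        if rate < threshold * DEMOTE:
            return current_level + 1           # less popular -> higher level, fewer replicas
        return current_level                   # inside the hysteresis band: no change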

17
mutable objects
  • version number
  • proactive propagation to all nodes
  • home node sends to nodes in level i of its routing table
  • level i nodes send to level i+1 nodes
  • lazy propagation
  • replication phase
  • handles updates missed due to node joins and leaves
    (a version-check sketch follows this slide)
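A minimal sketch (assumed, not the paper's code) of the version-number check a node might apply when replicas are exchanged, so that stale copies get replaced:

    # hypothetical version-number reconciliation between two copies of an object
    from dataclasses import dataclass

    @dataclass
    class Replica:
        object_id: str
        version: int
        data: bytes

    def reconcile(local: Replica, incoming: Replica) -> Replica:
        """Keep whichever copy carries the higher version number."""
        assert local.object_id == incoming.object_id
        return incoming if incoming.version > local.version else local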

18
implementation
  • pastry (FreePastry 1.3)
  • replication also on leaf-set nodes
  • combine pastry heart-beat and routing table
    maintenance with beehive aggregation and
    replication packets
  • prototype DNS name server and resolver
  • UDP-based; serves DNS A (address) queries
  • fall back to legacy DNS
  • home node detects and propagates updates

19
evaluation DNS application
  • DNS survey
  • queried 594,059 unique domain names
  • TTL distribution: 95% < 1 day
  • rate of change of entries: 0.13 per day
  • MIT DNS trace, 4-11 December 2000
  • 4 million queries for 300,000 distinct names
  • zipf parameter 0.91
  • setup
  • simulation mode on a single node
  • 1024 nodes, 40,960 distinct objects
  • 7 queries per second from the MIT trace
  • 0.8 per day rate of change

20
evaluation lookup performance
(figure: lookup latency in hops vs. time in hours (0-40 h) for Pastry, PC-Pastry, and Beehive)
21
evaluation overhead
storage: average number of replicas per node
  Pastry       40
  Beehive     380
  PC-Pastry   420
  Kelips     1280
(figure: object transfers vs. time in hours (0-40 h) for PC-Pastry and Beehive)
22
evaluation flash crowds
(figure: lookup latency in hops vs. time in hours (32-80 h) for Pastry, PC-Pastry, and Beehive after a popularity reversal)
popularity reversal: a complete inversion of the popularity rank of all objects
23
evaluation zipf parameter change
24
conclusions
  • general replication framework to achieve O(1)
    lookup performance efficiently
  • structured overlays with uniform fanout
  • properties
  • high performance
  • adaptivity
  • improved availability and resilience to failures
  • well-suited for a cooperative domain name service
    application