A Network Gossip Algorithm using Lifted Markov Chain - PowerPoint PPT Presentation

About This Presentation

Description:

Joint work with Kyomin Jung and Devavrat Shah. Laboratory of Information and Decision Systems ... Find a fast mixing & slim lifted MC. Pseudo lifting ... – PowerPoint PPT presentation

Slides: 19
Provided by: jinwo6

Transcript and Presenter's Notes

1
A Network Gossip Algorithm using Lifted Markov Chain
  • Jinwoo Shin
  • Joint work with Kyomin Jung and Devavrat Shah
  • Laboratory of Information and Decision Systems
  • Massachusetts Institute of Technology
  • ICASSP, Apr. 22nd, 2009

2
Motivation
  • Networks in the 20th century
    - Telephone networks
    - Internet
  • Next-generation networks?
    - Peer-to-peer networks
    - Social networks
    - Wireless sensor networks
    - Mobile (ad hoc) networks

3
Motivation
  • Characteristics of these networks?
    - Lack of (reliable) infrastructure
      ⇒ Algorithm should be distributed (i.e. local message-passing)
    - Dynamic topology
      ⇒ Algorithm should be robust against network changes
    - Resource constraints
      ⇒ Algorithm should be simple and efficiently computable
  • These requirements naturally lead to Gossip algorithms.

4
Distributed Averaging
  • Setup
    - A network graph $G(V,E)$ with $|V| = n$
    - Initial value $x_i$ at each node $i$
    - Goal: design a Gossip algorithm that finds the average
      $x_{ave} = \frac{1}{n} \sum_i x_i$ at all nodes
  • Applications
    - distributed linear estimation, distributed load balancing,
      distributed consensus, synchronization, etc.

5
Gossip Algorithm for Averaging
  • Linear distributed iterations
    - a popular approach since Tsitsiklis (1984)
    - setting it up needs a Markov chain P s.t.
      ⇒ P is graph conformant, i.e. $P(i,j) > 0$ only if $(i,j) \in E$.
      ⇒ P has the uniform stationary distribution, i.e. $\mathbf{1}^T = \mathbf{1}^T P$.
    - updating mechanism
      ⇒ $x_i(t)$ is the value of node i at time t, and initially $x_i(0) = x_i$
      ⇒ node j transfers $P(i,j)\, x_j(t)$ to its neighbor i at time t
      ⇒ $x_i(t+1) = \sum_j P(i,j)\, x_j(t)$
    - for every node i, $x_i(t)$ converges to $x_{ave}$
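The update rule on this slide can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the 5-node ring, the lazy transition matrix (stay with probability 1/2, move to each neighbor with probability 1/4) and the helper names `lazy_ring_chain` and `gossip_average` are assumptions chosen for the example.

```python
import numpy as np

def lazy_ring_chain(n):
    """Hypothetical example chain: a doubly stochastic walk on an n-node ring."""
    P = np.zeros((n, n))
    for i in range(n):
        P[i, i] = 0.5               # stay put
        P[i, (i + 1) % n] = 0.25    # clockwise neighbor
        P[i, (i - 1) % n] = 0.25    # counter-clockwise neighbor
    return P

def gossip_average(P, x0, steps):
    """Iterate x_i(t+1) = sum_j P(i,j) x_j(t) for the given number of steps."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = P @ x
    return x

P = lazy_ring_chain(5)
x = gossip_average(P, [1, 2, 3, 4, 5], steps=200)
print(x)  # every entry ends up close to the average of the initial values
```

Because the columns of P sum to one, the total of the values is preserved at every step, and because the rows sum to one, the all-equal vector is a fixed point. Together these make the average the only possible limit.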

6
Mixing time of MC
  • The mixing time of an MC is the time it takes to approach
    stationarity from the worst initial state.
  • The running time until convergence is the mixing time H of P.
  • $|E|$ communications happen in each iteration.
  • The total number of operations is therefore $H|E|$.
  • The key issue is finding a graph conformant MC P that has the
    smallest mixing time H and the uniform stationary distribution.
  • Boyd et al. (2004) showed that the fastest mixing symmetric one
    can be found using SDP.
  • Question: What about the non-symmetric one?

7
Example
[Figure: two Markov chains, E1 and E2, with edge labels 1/2, 1/n and 1-1/n]
E2 is a lifting of E1
⇒ Symmetric chain: $H = O(n^2)$
⇒ Non-symmetric chain: $H = O(n \log n)$
⇒ Observed by Diaconis et al. (1997)
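The gap between the two chains can be checked numerically. The sketch below is an assumption-laden illustration rather than the exact chain drawn on the slide: it uses one common variant of a lifted ring walk (keep the current direction with probability 1-1/n, turn around with probability 1/n, then step), an odd ring size n = 101 to avoid periodicity, a total variation threshold of 1/4, and a start at state 0. The function names are made up for the example.

```python
import numpy as np

def ring_walk(n):
    """Symmetric walk on the ring: step to either neighbor with prob. 1/2."""
    P = np.zeros((n, n))
    for i in range(n):
        P[i, (i + 1) % n] = 0.5
        P[i, (i - 1) % n] = 0.5
    return P

def lifted_ring_walk(n):
    """Assumed lifted chain on 2n states.

    Index i is (node i, clockwise); index n + i is (node i, counter-clockwise).
    """
    Q = np.zeros((2 * n, 2 * n))
    for i in range(n):
        Q[i, (i + 1) % n] = 1 - 1 / n          # keep moving clockwise
        Q[i, n + (i - 1) % n] = 1 / n          # turn around, step back
        Q[n + i, n + (i - 1) % n] = 1 - 1 / n  # keep moving counter-clockwise
        Q[n + i, (i + 1) % n] = 1 / n          # turn around, step forward
    return Q

def mixing_time(P, eps=0.25, max_steps=100_000):
    """First t at which the walk started at state 0 is eps-close (in total
    variation) to the uniform distribution; None if never within max_steps."""
    m = P.shape[0]
    mu = np.zeros(m)
    mu[0] = 1.0
    for t in range(1, max_steps + 1):
        mu = mu @ P
        if 0.5 * np.abs(mu - 1.0 / m).sum() <= eps:
            return t
    return None

n = 101
P, Q = ring_walk(n), lifted_ring_walk(n)
t_sym, t_lift = mixing_time(P), mixing_time(Q)
print("symmetric:", t_sym, "lifted:", t_lift)
```

Both chains have the uniform stationary distribution, and collapsing the two copies of each node in `Q` gives back `P`, which is the sense in which `Q` is a lifting of `P` here. On this instance the lifted walk reaches the threshold in far fewer steps than the symmetric one.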
8
Lifting a Markov chain
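  • Definition: $\hat{P}$ is a lifted Markov chain of $P$ if there
    exists a many-to-one function $f: \hat{V} \to V$ s.t.
    - $\hat{V}, V$ are the sets of states of $\hat{P}, P$ respectively.
    - $Q(x,y) = \sum_{\hat{x} \in f^{-1}(x),\, \hat{y} \in f^{-1}(y)} \hat{Q}(\hat{x},\hat{y})$,
      where $Q(x,y) = \pi(x) P(x,y)$ and $\pi$ is a stationary
      distribution of $P$ ($\hat{Q}$ is defined likewise for $\hat{P}$)
  • Informally, $\hat{P}$ is a lifted Markov chain of $P$ if
    - a node of $\hat{P}$ is a copy of a node of $P$
    - two copies have a positive transition prob. only if their
      original nodes have one
    - the lifted MC can simulate the original stationary distribution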

9
Previous results
  • Diaconis, Holmes and Neal (1997)
    - found a fast mixing lifted MC for the ring-like graph (E2).
  • Chen, Lovász and Pak (1999)
    1) for a general P, constructed a non-symmetric lifted MC that
       has mixing time $1/\Phi(P)$, where $\Phi(P)$ is the
       conductance of P.
       ⇒ via lifting, the mixing time can be reduced to as little as
         its square root, since $1/\Phi(P) \le H \le 1/\Phi^2(P)$
    2) showed that this is the fastest mixing lifted MC, and that a
       symmetric lifted MC cannot reduce the mixing time.
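The slides use the conductance without spelling it out. Assuming the standard definition, $\Phi(P) = \min_{S:\, \pi(S) \le 1/2} \sum_{i \in S, j \notin S} \pi(i) P(i,j) / \pi(S)$, a brute-force sketch for tiny chains looks as follows. The function name `conductance` and the 6-node ring are illustrative choices, and the subset enumeration is exponential in n, so this is only for toy examples.

```python
from itertools import combinations

def conductance(P, pi):
    """Brute-force conductance of a small chain P with stationary distribution pi."""
    n = len(pi)
    best = float("inf")
    for size in range(1, n):
        for S in combinations(range(n), size):
            mass = sum(pi[i] for i in S)
            if mass > 0.5 + 1e-12:      # only sets with pi(S) <= 1/2
                continue
            inside = set(S)
            # ergodic flow leaving S
            flow = sum(pi[i] * P[i][j]
                       for i in S for j in range(n) if j not in inside)
            best = min(best, flow / mass)
    return best

# Symmetric walk on a 6-node ring (hypothetical example).
n = 6
P = [[0.5 if (j - i) % n in (1, n - 1) else 0.0 for j in range(n)]
     for i in range(n)]
pi = [1.0 / n] * n

phi = conductance(P, pi)
print("Phi =", phi, " 1/Phi =", 1 / phi, " 1/Phi^2 =", 1 / phi ** 2)
```

For this ring the minimizing set is a contiguous arc of three nodes, which gives a conductance of 1/3.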

10
Back to averaging
  • Recall the performance guarantees
    - the running time: $H$
    - the total number of operations: $H|E|$
  • Averaging using the lifted MC
    - Implementable by running multiple threads at each node.
    - The lifted MC reduces the running time.
    - However, the total number of operations may not be reduced,
      due to the increased size $|E|$ of the lifted MC.
  • Revised goal
    - Find a fast mixing and slim lifted MC
11
Pseudo lifting
  • We introduce a new notion of lifting, which gives better
    performance than the old one for averaging.
  • Basic motivation: similar to the original lifting, but allow a
    small fractional loss in preserving the original stationary
    distribution
  • Definition: same as the original lifting except for the last
    condition
    - previous condition: [formula missing]
    - new condition: there exists a copy j of i s.t. [formula missing]

12
Our result L1
  • CLP (1999) lifting
    - Mixing time: $1/\Phi(P)$
    - Size: $n^2/\Phi(P)$
    - $D \le 1/\Phi(P)$
  • For a general P, we construct a pseudo lifted MC
    - its mixing time: $D$, where $D$ is the diameter of the
      underlying graph of P
    - its size: $nD$
  • The mixing time $D$ is the optimal one achievable for a given
    underlying graph topology.
  • Example: Barbell graph
    ⇒ $D = O(1)$, but $1/\Phi(P) = O(n)$, which is a lower bound on
      the mixing time of the original lifted MC
    ⇒ The pseudo lifting is better than the original one.

13
Main idea for L1
  • The essence of CLP
    - consider the complete graph, which mixes in one step, and
      embed its topology into the original graph.
    - need to solve the uniform multi-commodity flow problem to
      maintain the properties of lifting.
  • Our main ideas
    1) we use a sparse expander graph instead of the complete graph,
       which reduces the size.
    2) by allowing a loss in preserving the original stationary
       distribution, we relax the uniform multi-commodity flow
       problem, which reduces the mixing time.

14
Our result L2
  • Doubling dimension
    - Definition: the doubling dimension $\rho$ of a metric space is
      the least $k$ s.t. any ball of radius $R$ can be covered by
      $2^k$ balls of radius $R/2$.
    - Example: the k-dimensional grid has $\rho = k$.
  • For a general P, we construct a pseudo lifted MC
    - its mixing time: $D$
    - its size: $D n^{1-1/(\rho+1)}$, where $\rho$ is the doubling
      dimension of the underlying graph of P.
  • Graphs which need lifting usually have small doubling dimension.

15
Back to averaging
  • Implementable using the L2 construction
    - For every node i, there exists a copy j of i such that
      $x_j(t)$ converges to $(1-\epsilon)\, x_{ave}$.
    - It is possible to estimate $x_{ave}$ by $x_j(t)/(1-\epsilon)$
  • The performance
    - the running time: $D$
    - the total number of operations: $D^2 n^{1-1/(\rho+1)}$

16
Example: 2-d Grid
  • Consider a random walk on the 2-dimensional grid graph with n
    nodes.
  • Before lifting
    - Mixing time: $O(n)$
    - Size: $n$
  • After lifting using L2 (L1, CLP)
    - Mixing time: $n^{0.5}$ ($n^{0.5}$, $n^{0.5}$)
    - Size: $n^{1.17}$ ($n^{1.5}$, $n^{2.5}$)
  • Benefit of lifting for averaging
    - Running time: $n \to n^{0.5}$
    - Total number of operations: $n^2 \to n^{1.67}$
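The exponents on this slide follow from the size and operation-count formulas stated on the earlier slides. The short check below tracks only the exponent of n, ignoring constants. It assumes $D = n^{1/2}$ and doubling dimension 2 for the 2-d grid, and takes $1/\Phi(P) = n^{1/2}$ as implied by the CLP numbers on the slide. Variable names are illustrative.

```python
from fractions import Fraction as F

rho = 2            # doubling dimension of the 2-d grid
d = F(1, 2)        # exponent of D: D = n^(1/2)
inv_phi = F(1, 2)  # exponent of 1/Phi(P), assumed from the slide's CLP numbers

size_L2 = d + 1 - F(1, rho + 1)      # D * n^(1 - 1/(rho+1))
size_L1 = 1 + d                      # n * D
size_CLP = 2 + inv_phi               # n^2 / Phi(P)
ops_before = 1 + 1                   # H * |E| with H = n and |E| of order n
ops_L2 = 2 * d + 1 - F(1, rho + 1)   # D^2 * n^(1 - 1/(rho+1))

for name, e in [("size L2", size_L2), ("size L1", size_L1),
                ("size CLP", size_CLP), ("ops before", F(ops_before)),
                ("ops L2", ops_L2)]:
    print(f"{name}: n^{float(e):.2f}")
```

The L2 size exponent is 7/6 and the L2 operation exponent is 5/3, which round to the 1.17 and 1.67 quoted above.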

17
Summary - Comparison of results
⇒ $D \le 1/\Phi(P)$

18
Conclusion
  • The Markov chain is one of the fundamental tools for designing
    network algorithms.
    - Mixing time decides the running time.
  • Lifting reduces the mixing time, but increases the size.
    - We construct a lifting with the best possible mixing time and
      small size.
  • Thank you!