A Network Gossip Algorithm using Lifted Markov Chain

Transcript and Presenter's Notes

1
A Network Gossip Algorithm using Lifted Markov Chain

Jinwoo Shin
Joint work with Kyomin Jung and Devavrat Shah
Laboratory for Information and Decision Systems
Massachusetts Institute of Technology
ICASSP, Apr. 22nd, 2009

2
Motivation

Networks of the 20th century
- Telephone networks
- Internet
Next-generation networks?
- Peer-to-peer networks
- Social networks
- Wireless sensor networks
- Mobile (ad-hoc) networks

3
Motivation

Characteristics of these networks?
- Lack of (reliable) infrastructure
  -> The algorithm should be distributed (i.e. use local message-passing)
- Dynamic topology
  -> The algorithm should be robust against network changes
- Resource constraints
  -> The algorithm should be simple and efficiently computable
These requirements naturally lead to gossip algorithms.

4
Distributed Averaging

Setup
- A network graph $G = (V, E)$ with $|V| = n$
- Initial value $x_i$ at node $i$
- Goal: design a gossip algorithm to find the average $x_{ave} = \frac{1}{n} \sum_{i \in V} x_i$ at all nodes
Applications
- Distributed linear estimation, distributed load balancing, distributed consensus, synchronization, etc.

5
Gossip Algorithm for Averaging

Linear distributed iterations
- A popular approach since Tsitsiklis (1984)
- Setting: it needs a Markov chain $P$ s.t.
  - $P$ is graph conformant, i.e. $P(i,j) > 0$ only if $(i,j) \in E$
  - $P$ has the uniform stationary distribution, i.e. $\mathbf{1}^T = \mathbf{1}^T P$
- Updating mechanism
  - $x_i(t)$ is the value of node $i$ at time $t$, and initially $x_i(0) = x_i$
  - Node $j$ transfers $P(i,j) x_j(t)$ to its neighbor $i$ at time $t$
  - $x_i(t+1) = \sum_j P(i,j) x_j(t)$
- For every node $i$, $x_i(t)$ converges to $x_{ave}$
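
The update rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the ring topology, the lazy weights (1/2 to stay, 1/4 to each neighbor), and the names `ring_weights` and `gossip_average` are assumptions, chosen so that the matrix is graph conformant and has the uniform stationary distribution.

```python
def ring_weights(n):
    """Lazy random walk on an n-node ring (assumed example, n >= 3).

    P[i][j] > 0 only for j == i or j adjacent to i, and every row and
    column sums to 1, so the uniform distribution is stationary.
    """
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        P[i][i] = 0.5
        P[i][(i - 1) % n] = 0.25
        P[i][(i + 1) % n] = 0.25
    return P


def gossip_average(P, x, steps):
    """Iterate x_i(t+1) = sum_j P(i, j) * x_j(t) for the given number of steps."""
    n = len(x)
    for _ in range(steps):
        x = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x


initial = [float(i) for i in range(8)]  # x_i(0) = i, so x_ave = 3.5
final = gossip_average(ring_weights(8), initial, steps=200)
```

Each iteration only combines a node's value with those of its ring neighbors, which is what makes the scheme a local message-passing algorithm; the sum of all values is preserved at every step.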

6
Mixing time of MC

The mixing time of an MC is the time to approach stationarity from a worst initial state.
The running time for convergence is the mixing time $H$ of $P$.
-> $|E|$ communications happen in each iteration.
The total number of operations is $H|E|$.
The key issue is finding a graph conformant MC $P$ that has the smallest mixing time $H$ and the uniform stationary distribution.
Boyd et al. (2004) showed that the fastest mixing symmetric one can be found using SDP.
Question: how about the non-symmetric one?
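
For concreteness, the mixing time can be written out as follows. This is one common formalization in terms of total variation distance, assumed here for illustration; the slides do not state which definition of mixing time is used, and the symbols $\mu_i^t$ (the distribution after $t$ steps started from state $i$) and $\varepsilon$ are not from the slides.

```latex
H(\varepsilon) = \min \Bigl\{ t \ge 0 : \max_{i \in V} \tfrac{1}{2} \sum_{j \in V} \bigl| \mu_i^t(j) - \tfrac{1}{n} \bigr| \le \varepsilon \Bigr\},
\qquad
\text{total number of operations} = H \cdot |E|
```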

7
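Example

[Figure: two Markov chains on a ring, E1 and E2, with transition probabilities labeled 1/2, 1/n, and 1-1/n.]
E2 is a lifting of E1
- $H = O(n^2)$ for the symmetric chain
- $H = O(n \log n)$ for the non-symmetric chain
-> Observed by Diaconis et al. (1997)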
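
The gap between the two chains can be observed numerically with a small sketch. The exact transition rule of E2 is not recoverable from the transcript, so the rule used here is an assumption: each ring node has a clockwise and a counter-clockwise copy, the walk keeps its direction with probability 1 - 1/n and reverses it with probability 1/n, and it moves one step in either case. The threshold `eps = 0.25` and all function names are likewise illustrative.

```python
def tv_to_uniform(p):
    """Total variation distance between p and the uniform distribution."""
    u = 1.0 / len(p)
    return 0.5 * sum(abs(v - u) for v in p)


def step_ring(p):
    """One step of the symmetric walk: move to either neighbor with prob. 1/2."""
    n = len(p)
    return [0.5 * p[(i - 1) % n] + 0.5 * p[(i + 1) % n] for i in range(n)]


def step_lifted(p):
    """One step of the assumed lifted walk on states (i, d), stored at p[2*i + k].

    k = 0 is the clockwise copy (d = +1), k = 1 the counter-clockwise copy
    (d = -1). From (i, d): keep direction and move to (i + d, d) with
    prob. 1 - 1/n, or reverse and move to (i - d, -d) with prob. 1/n.
    """
    n = len(p) // 2
    keep, flip = 1.0 - 1.0 / n, 1.0 / n
    q = [0.0] * (2 * n)
    for i in range(n):
        prev_node, next_node = (i - 1) % n, (i + 1) % n
        # (i, +1) is reached from (i-1, +1) by keeping or from (i-1, -1) by reversing.
        q[2 * i] = keep * p[2 * prev_node] + flip * p[2 * prev_node + 1]
        # (i, -1) is reached from (i+1, -1) by keeping or from (i+1, +1) by reversing.
        q[2 * i + 1] = keep * p[2 * next_node + 1] + flip * p[2 * next_node]
    return q


def mixing_steps(step, p0, eps=0.25, max_steps=100_000):
    """Smallest t at which the distribution started from p0 is within eps of uniform."""
    p = p0
    for t in range(max_steps + 1):
        if tv_to_uniform(p) <= eps:
            return t
        p = step(p)
    raise RuntimeError("did not mix within max_steps")


n = 31  # odd, so the symmetric walk is aperiodic
# Both chains look the same from every start state (up to rotation and
# reflection of the ring), so a single start state covers the worst case.
h_ring = mixing_steps(step_ring, [1.0] + [0.0] * (n - 1))
h_lifted = mixing_steps(step_lifted, [1.0] + [0.0] * (2 * n - 1))
print(h_ring, h_lifted)
```

For this ring size the lifted chain needs fewer steps than the symmetric one; rerunning with larger odd `n` gives a rough feel for how the two mixing times scale.
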
8
Lifting a Markov chain

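Definition: $\hat{P}$ is a lifted Markov chain of $P$ if there exists a many-to-one function $f: \hat{V} \to V$ s.t.
- $\hat{V}$, $V$ are the sets of states of $\hat{P}$, $P$ respectively.
- $Q(i,j) = \sum_{x \in f^{-1}(i),\, y \in f^{-1}(j)} \hat{Q}(x,y)$, where $Q(i,j) = \pi(i) P(i,j)$, $\hat{Q}(x,y) = \hat{\pi}(x) \hat{P}(x,y)$, and $\pi$ is a stationary distribution of $P$ (likewise $\hat{\pi}$ of $\hat{P}$).
Informally, $\hat{P}$ is a lifted Markov chain of $P$ if
- a node of $\hat{P}$ is a copy of a node of $P$
- two copies have a positive transition probability only if their original nodes have one
- the lifted MC can simulate the original stationary distribution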
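
The lifting property can be checked mechanically on the ring example. The sketch below assumes the flow-based condition of Chen, Lovász and Pak: the stationary flow $\pi(i) P(i,j)$ of the original chain must equal the sum of the lifted flows between the copies of $i$ and the copies of $j$. The transition rule of the lifted ring chain (keep direction with probability 1 - 1/n, reverse with probability 1/n, always moving one step) and all function names are assumptions made for illustration.

```python
def ring_chain(n):
    """E1: symmetric walk on an n-node ring (n >= 3)."""
    P = {}
    for i in range(n):
        P[(i, (i + 1) % n)] = 0.5
        P[(i, (i - 1) % n)] = 0.5
    return P


def lifted_ring_chain(n):
    """Assumed E2: states (i, d) with direction d = +1 or -1."""
    P_hat = {}
    for i in range(n):
        for d in (1, -1):
            P_hat[((i, d), ((i + d) % n, d))] = 1.0 - 1.0 / n   # keep direction
            P_hat[((i, d), ((i - d) % n, -d))] = 1.0 / n         # reverse direction
    return P_hat


def projected_flow(P_hat, pi_hat, f):
    """Sum the lifted flows pi_hat(x) * P_hat(x, y) over the copies of each pair."""
    Q = {}
    for (x, y), prob in P_hat.items():
        key = (f(x), f(y))
        Q[key] = Q.get(key, 0.0) + pi_hat[x] * prob
    return Q


def is_lifting(P, pi, P_hat, pi_hat, f, tol=1e-12):
    """Check that the original flow matches the projected lifted flow on every pair."""
    Q = {edge: pi[edge[0]] * prob for edge, prob in P.items()}
    Q_hat = projected_flow(P_hat, pi_hat, f)
    pairs = set(Q) | set(Q_hat)
    return all(abs(Q.get(e, 0.0) - Q_hat.get(e, 0.0)) <= tol for e in pairs)


n = 5
pi = {i: 1.0 / n for i in range(n)}                                   # uniform on the ring
pi_hat = {(i, d): 1.0 / (2 * n) for i in range(n) for d in (1, -1)}  # uniform on the copies
ok = is_lifting(ring_chain(n), pi, lifted_ring_chain(n), pi_hat, lambda x: x[0])
```

Here the many-to-one function simply forgets the direction, mapping the copy `(i, d)` back to node `i`.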

9
Previous results

Diaconis, Holmes and Neal (1997)
- Found a fast mixing lifted MC for the ring-like graph (E2).
Chen, Lovász and Pak (1999)
1) For a general $P$, constructed a non-symmetric lifted MC that has mixing time $1/\Phi(P)$, where $\Phi(P)$ is the conductance of $P$.
   -> Via lifting, the mixing time can be reduced up to its square root, since $1/\Phi(P) \lesssim H \lesssim 1/\Phi^2(P)$.
2) Showed that this is the fastest mixing lifted MC, and that a symmetric lifted MC cannot reduce the mixing time.

10
Back to averaging

Recall the performance guarantees
- The running time is $H$
- The total number of operations is $H|E|$
Averaging using the lifted MC
- Implementable by running multiple threads per node.
- The lifted MC reduces the running time.
- However, the total number of operations may not be reduced, due to the increased size $|E|$ of the lifted MC.
Revised goal
- Find a fast mixing and slim lifted MC

11
Pseudo lifting

We introduce a new notion of lifting, which gives better performance than the old one for averaging.
Basic motivation: similar to the original lifting, but allow a small fractional loss in preserving the original stationary distribution.
Definition: same as the original lifting except for the last condition
- Previous condition: [formula missing in the transcript]
- New condition: there exists a copy $j$ of $i$ s.t. [formula missing in the transcript]

12
Our result L1

CLP (1999) lifting
- Mixing time: $1/\Phi(P)$
- Size: $n^2/\Phi(P)$
- $D \lesssim 1/\Phi(P)$

For a general $P$, we construct a pseudo lifted MC
- Its mixing time is $D$, where $D$ is the diameter of the underlying graph of $P$
- Its size is $nD$
The mixing time $D$ is the optimal one achievable for a given underlying graph topology.
Example: Barbell graph
- $D = O(1)$, but $1/\Phi(P) = O(n)$, which is a lower bound on the mixing time of the original lifted MC
-> The pseudo lifting is better than the original one.
13
Main idea for L1

The essence of CLP
- Consider the complete graph, which mixes in one step, and embed its topology into the original graph.
- Need to solve the uniform multi-commodity flow problem to maintain the properties of lifting.
Our main ideas
1) We use a sparse expander graph instead of the complete graph, which reduces the size.
2) By allowing a loss in preserving the original stationary distribution, we relax the uniform multi-commodity flow problem, which reduces the mixing time.

14
Our result L2

Doubling dimension
- Definition: the doubling dimension $\rho$ of a metric space is the least $k$ s.t. any ball of radius $R$ can be covered by $2^k$ balls of radius $R/2$.
- Example: the $k$-dimensional grid has $\rho = k$.
For a general $P$, we construct a pseudo lifted MC
- Its mixing time is $D$
- Its size is $D n^{1 - 1/(\rho+1)}$, where $\rho$ is the doubling dimension of the underlying graph of $P$.
Graphs which need lifting usually have small doubling dimension.

15
Back to averaging

Implementable using the L2 construction
- For every node $i$, there exists a copy $j$ of $i$ such that $x_j(t)$ converges to $(1-\varepsilon) x_{ave}$.
- It is possible to estimate $x_{ave}$ by $x_j(t)/(1-\varepsilon)$.
The performance
- The running time is $D$
- The total number of operations is $D^2 n^{1 - 1/(\rho+1)}$

16
Example: 2-d Grid

Consider a random walk on the 2-dimensional grid graph with $n$ nodes.
Before lifting
- Mixing time: $O(n)$
- Size: $n$
After lifting using L2 (L1, CLP)
- Mixing time: $n^{0.5}$ ($n^{0.5}$, $n^{0.5}$)
- Size: $n^{1.17}$ ($n^{1.5}$, $n^{2.5}$)
Benefit of lifting for averaging
- Running time: $n \to n^{0.5}$
- Total number of operations: $n^2 \to n^{1.67}$
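
The exponents on this slide follow from the size and mixing-time formulas of the earlier slides, and the arithmetic can be reproduced with exact fractions. In this sketch the inputs $D = n^{1/2}$, $1/\Phi(P) = n^{1/2}$ and $\rho = 2$ for the 2-d grid are read off the slide's numbers rather than derived, and the function name `lifting_exponents` is hypothetical.

```python
from fractions import Fraction


def lifting_exponents(diam, inv_cond, rho):
    """Exponents of n for (mixing time, size) of each construction.

    diam:     exponent of the diameter, D = n**diam
    inv_cond: exponent of the inverse conductance, 1/Phi(P) = n**inv_cond
    rho:      doubling dimension of the underlying graph
    """
    return {
        "CLP": (inv_cond, 2 + inv_cond),                  # 1/Phi(P), n^2/Phi(P)
        "L1": (diam, 1 + diam),                            # D, n*D
        "L2": (diam, diam + 1 - Fraction(1, rho + 1)),     # D, D*n^(1 - 1/(rho+1))
    }


half = Fraction(1, 2)
grid = lifting_exponents(diam=half, inv_cond=half, rho=2)
l2_mixing, l2_size = grid["L2"]
l2_total_ops = l2_mixing + l2_size  # running time times size, as exponents of n
```

With these inputs the L2 size exponent is 7/6 (about 1.17) and the total-operations exponent is 5/3 (about 1.67), matching the rounded figures on the slide.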

17
Summary - Comparison of results
-> $D \lesssim 1/\Phi(P)$

18
Conclusion

The Markov chain is one of the fundamental tools for designing network algorithms.
- The mixing time decides the running time.
Lifting reduces the mixing time, but increases the size.
- We construct a lifting with the best possible mixing time and small size.
Thank you!
