RTG: A Recursive Realistic Graph Generator using Random Typing

1
RTG: A Recursive Realistic Graph Generator using Random Typing
  • Leman Akoglu and Christos Faloutsos
  • Carnegie Mellon University

2
Outline
  • Motivation
  • Problem Definition
  • Related Work
  • A Little History
  • Proposed Model
  • Experimental Results
  • Conclusion

3
Motivation - 1
  • Complex graphs (WWW, computer, biological, social networks, etc.) exhibit many common properties:
    - power laws
    - small and shrinking diameter
    - community structure
    - ...
  • How can we produce synthetic but realistic graphs?

http://www.aharef.info/static/htmlgraph/
4
Motivation - 2
  • Why do we need synthetic graphs?
    - Simulation
    - Sampling/Extrapolation
    - Summarization/Compression
    - Motivation to understand pattern-generating processes

5
Problem Definition
  • Discover a graph generator that is:
  • G1. simple: the more intuitive the better!
  • G2. realistic: outputs graphs that obey all laws
  • G3. parsimonious: requires few parameters
  • G4. flexible: able to produce the cross-product of un/weighted, un/directed, uni/bipartite graphs
  • G5. fast: generation should take time linear in the size of the output graph

6
Outline
  • Motivation
  • Problem Definition
  • Related Work
  • A Little History
  • Proposed Model
  • Experimental Results
  • Conclusion

7
Related Work
  • 1. Graph Properties
    → What we want to match
  • 2. Graph Generators
    → What has been proposed earlier

8
Related Work 1: Graph Properties
9
Related Work 2: Graph Generators
  • Erdős-Rényi (ER) model [Erdős, Rényi '60]
  • Small-world model [Watts, Strogatz '98]
  • Preferential Attachment [Barabási, Albert '99]
  • Winners don't take all [Pennock et al. '02]
  • Forest Fire model [Leskovec, Faloutsos '05]
  • Butterfly model [McGlohon et al. '08]

10
Related Work 2: Graph Generators
  • Model some static graph property
  • Neglect dynamic properties
  • Cannot produce weighted graphs.
  • Erdős-Rényi (ER) model [Erdős, Rényi '60]
  • Small-world model [Watts, Strogatz '98]
  • Preferential Attachment [Barabási, Albert '99]
  • Winners don't take all [Pennock et al. '02]
  • Forest Fire model [Leskovec, Faloutsos '05]
  • Butterfly model [McGlohon et al. '08]

11
Related Work 2: Graph Generators
  • Random dot-product graphs [Kraetzl, Nickel '05] [Young, Scheinerman '07]
  • Utility-based models [Fabrikant et al. '02] [Even-Bar et al. '07] [Laoutaris '08]
  • Kronecker graphs [Leskovec et al. '07] [Akoglu et al. '08]

12
Related Work 2: Graph Generators
  • Produces only undirected graphs
  • Cannot produce weighted graphs.
  • Requires quadratic time
  • Random dot-product graphs [Kraetzl, Nickel '05] [Young, Scheinerman '07]
  • Utility-based models [Fabrikant et al. '02] [Even-Bar et al. '07] [Laoutaris '08]
  • Kronecker graphs [Leskovec et al. '07] [Akoglu et al. '08]

13
Related Work 2: Graph Generators
  • Produces only undirected graphs
  • Cannot produce weighted graphs.
  • Requires quadratic time
  • Random dot-product graphs [Kraetzl, Nickel '05] [Young, Scheinerman '07]
  • Utility-based models [Fabrikant et al. '02] [Even-Bar et al. '07] [Laoutaris '08]
  • Kronecker graphs [Leskovec et al. '07] [Akoglu et al. '08]
  • Hard to analyze

14
Related Work 2: Graph Generators
  • Produces only undirected graphs
  • Cannot produce weighted graphs.
  • Requires quadratic time
  • Random dot-product graphs [Kraetzl, Nickel '05] [Young, Scheinerman '07]
  • Utility-based models [Fabrikant et al. '02] [Even-Bar et al. '07] [Laoutaris '08]
  • Kronecker graphs [Leskovec et al. '07] [Akoglu et al. '08]
  • Hard to analyze
  • Multinomial/Lognormal distrib.
  • Fixed number of nodes

15
Outline
  • Motivation
  • Problem Definition
  • Related Work
  • A Little History
  • Proposed Model
  • Experimental Results
  • Conclusion

16
A Little History - 1
  • Zipf, 1932
  • In many natural languages, the rank r and the frequency f_r of words follow a power law:
  • f_r ∝ 1/r

17
A Little History - 2
  • Mandelbrot, 1953
  • Humans optimize avg. information per unit
    transmission cost.

18
A Little History - 2
  • Miller, 1957
  • A monkey types randomly on a keyboard
  • → Distribution of words follows a power law.
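
Miller's observation can be tried out with a few lines of code. The sketch below is only an illustration under assumptions that are not on the slide: `k` equiprobable letter keys, a space key hit with probability `q`, and empty words (two spaces in a row) being discarded. The function names are hypothetical.

```python
import random
from collections import Counter

def random_typing_words(num_chars, k=4, q=0.2, seed=0):
    """Type num_chars keystrokes: space with prob. q, else one of k equiprobable letters."""
    rng = random.Random(seed)
    letters = "abcdefghijklmnopqrstuvwxyz"[:k]   # assumes k <= 26
    words, current = [], []
    for _ in range(num_chars):
        if rng.random() < q:                      # space key ends the current word
            if current:                           # assumption: drop empty words
                words.append("".join(current))
                current = []
        else:
            current.append(rng.choice(letters))
    return words

def rank_frequency(words):
    """Word counts sorted in decreasing order, so index 0 is rank 1."""
    return sorted(Counter(words).values(), reverse=True)

words = random_typing_words(200_000)
freqs = rank_frequency(words)
```

Plotting `freqs` against rank on log-log axes is one way to inspect the shape of the distribution; with equiprobable keys the curve is expected to look like a staircase rather than a smooth line.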

19
A Little History - 2
  • Conrad and Mitzenmacher, 2004
  • Same relation still holds when keys have
    unequal probabilities.

20
Outline
  • Motivation
  • Problem Definition
  • Related Work
  • A Little History
  • Proposed Model
  • Experimental Results
  • Conclusion

21
Preliminary Model 1: RTG-IE (RTG with Independent Equiprobable keys)
22
Preliminary Model 1: RTG-IE (RTG with Independent Equiprobable keys)
  • Lemma 1. W is super-linear on N (power law)
  • Lemma 2. W is super-linear on E (power law)
  • Lemma 3. In(out)-weight W_n of node n is super-linear on in(out)-degree d_n (power law)
Please find the proofs in the paper.
23
Graph Properties
24
Preliminary Model 1: RTG-IE (RTG with Independent Equiprobable keys)
  • Lemma 1. W is super-linear on N (power law)
  • Lemma 2. W is super-linear on E (power law)
  • Lemma 3. In(out)-weight W_n of node n is super-linear on in(out)-degree d_n (power law)
Properties: L05. Densification PL; L11. Weight PL; L10. Snapshot PL
Please find the proofs in the paper.
25
Advantages of the Preliminary Model 1
  • G1 - Intuitive
  • G1 - Easy to implement
  • G2 - Realistic: provably follows several rules
  • G3 - Handful of parameters: k, q, W
  • G5 - Fast: generating a random sequence of characters
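
As a rough illustration of how little code the preliminary model needs, here is a minimal sketch of RTG-IE using the parameters k, q, W named above. The details are assumptions for illustration, not taken from the slides: each multi-edge is formed by typing a source label and then a destination label independently, and an empty label (space typed first) is kept as a node. The names `type_label` and `rtg_ie` are hypothetical.

```python
import random
from collections import Counter

def type_label(rng, keys, q):
    """Type one node label: hit equiprobable keys until the space key (prob. q) is hit."""
    label = []
    while rng.random() >= q:          # with prob. 1 - q, type another letter
        label.append(rng.choice(keys))
    return "".join(label)             # assumption: "" is allowed as a node label

def rtg_ie(k, q, W, seed=0):
    """Generate W multi-edges; returns a Counter mapping (src, dst) to edge weight."""
    rng = random.Random(seed)
    keys = [chr(ord("a") + i) for i in range(k)]   # assumes k <= 26
    edges = Counter()
    for _ in range(W):
        src = type_label(rng, keys, q)
        dst = type_label(rng, keys, q)             # typed independently of src
        edges[(src, dst)] += 1                     # duplicates become edge weight
    return edges

graph = rtg_ie(k=5, q=0.2, W=10_000)
num_edges = len(graph)                             # E: distinct (src, dst) pairs
num_nodes = len({n for pair in graph for n in pair})
```

The work is one keystroke per loop iteration, which is the intuition behind the "fast" claim: generation is just a random character sequence.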

26
Problems of the Preliminary Model 1
  • 1- Multinomial degree distributions

27
Problems of the Preliminary Model 1
  • 2- No homophily, no community structure
  • → Node i connects to any node j with prob. d_i d_j independently, rather than connecting to similar nodes.

28
Preliminary Model 2: RTG-IU (RTG with Independent Un-equiprobable keys)
Solution to Problem 1 [Conrad and Mitzenmacher, 2004]
29
Proposed Model: RTG (Random Typing Graphs)
  • Solution to Problem 2: 2D keyboard
  • Generate source-destination labels in one shot.
  • Pick one of the nine keys randomly.

30
Proposed Model: RTG (Random Typing Graphs)
  • Solution to Problem 2: 2D keyboard
  • Repeat recursively.
  • Terminate each label when the space key is typed on each dimension (dark blue).

31
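Proposed Model: RTG (Random Typing Graphs)
  • Solution to Problem 2: 2D keyboard
  • How do we choose the keys?
  • Independent model does not yield community structure!
[Figure: 2D keyboard with independent key probabilities; recoverable cell labels: p_a p_b, p_a q, p_b p_a, p_b p_b, p_b q, q q, q p_a, q p_b]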
32
Proposed Model: RTG (Random Typing Graphs)
  • Solution to Problem 2: 2D keyboard
  • Boost probability of diagonal keys and decrease probability of off-diagonal ones (0 < β < 1: imbalance factor)

33
Proposed Model: RTG (Random Typing Graphs)
  • Solution to Problem 2: 2D keyboard
  • Boost probability of diagonal keys and decrease probability of off-diagonal ones (0 < β < 1: imbalance factor)
  • Favoring of diagonal keys creates homophily.

34
Proposed Model
  • Parameters
  • k: Number of keys
  • q: Probability of hitting the space key S
  • W: Number of multi-edges in output graph G
  • β: imbalance factor
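
The 2D keyboard with these four parameters can be sketched as follows. This is a sketch under stated assumptions, not the authors' code: letter keys are taken to be equiprobable, each off-diagonal cell gets β p_i p_j, and each diagonal cell absorbs the mass removed from its row so that row and column marginals stay p_i. Once a label has hit space, further keystrokes on that dimension are ignored. Keys are integer ids, with id k standing for space; `keyboard_2d` and `rtg` are hypothetical names.

```python
import random
from collections import Counter
from itertools import accumulate

def keyboard_2d(k, q, beta):
    """Cells and probabilities of a (k+1) x (k+1) keyboard; index k is the space key."""
    p = [(1.0 - q) / k] * k + [q]                 # assumption: equiprobable letter keys
    cells, weights = [], []
    for i in range(k + 1):
        for j in range(k + 1):
            if i == j:
                w = p[i] - beta * p[i] * (1.0 - p[i])   # diagonal: boosted
            else:
                w = beta * p[i] * p[j]                  # off-diagonal: shrunk by beta
            cells.append((i, j))
            weights.append(w)
    return cells, weights

def rtg(k, q, W, beta, seed=0):
    """Generate W multi-edges; returns a Counter mapping (src, dst) label tuples to weight."""
    rng = random.Random(seed)
    cells, weights = keyboard_2d(k, q, beta)
    cum = list(accumulate(weights))
    edges = Counter()
    for _ in range(W):
        labels = ([], [])                         # source label, destination label
        done = [False, False]
        while not all(done):
            stroke = rng.choices(cells, cum_weights=cum)[0]
            for dim in (0, 1):
                if done[dim]:
                    continue                      # this label already ended
                if stroke[dim] == k:
                    done[dim] = True              # space typed on this dimension
                else:
                    labels[dim].append(stroke[dim])
        edges[(tuple(labels[0]), tuple(labels[1]))] += 1
    return edges

graph = rtg(k=2, q=0.2, W=5_000, beta=0.3)        # k=2 gives the nine-key keyboard
```

With `beta=1.0` the cell probabilities reduce to the independent product p_i p_j, which corresponds to the preliminary model without homophily.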

35
Proposed Model
Up to this point, we discussed directed, weighted and unipartite graphs. Generalizations:
  - Undirected graphs: ignore edge directions; edge generation is symmetric.
  - Unweighted graphs: ignore duplicate edges.
  - Bipartite graphs: key sets on source and destination labels are different.
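
These generalizations can be pictured as simple post-processing of a multi-edge list. The sketch below assumes the generator output is a `Counter` mapping `(src, dst)` to a weight; that representation and the function names are illustrative assumptions. Note that the bipartite variant here only tags the two sides to keep them disjoint, which approximates, rather than reproduces, using different key sets during generation.

```python
from collections import Counter

def to_undirected(edges):
    """Ignore directions: (u, v) and (v, u) are merged into one unordered pair."""
    out = Counter()
    for (u, v), w in edges.items():
        out[tuple(sorted((u, v)))] += w
    return out

def to_unweighted(edges):
    """Ignore duplicate edges: keep each distinct (src, dst) pair once."""
    return set(edges)

def to_bipartite(edges):
    """Assumption: tag each side so source and destination node sets are disjoint."""
    return Counter({(("src", u), ("dst", v)): w for (u, v), w in edges.items()})

sample = Counter({("a", "b"): 2, ("b", "a"): 3, ("a", "a"): 1})
undirected = to_undirected(sample)
unweighted = to_unweighted(sample)
bipartite = to_bipartite(sample)
```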
36
Outline
  • Motivation
  • Problem Definition
  • Related Work
  • A Little History
  • Proposed Model
  • Experimental Results
  • Conclusion

37
Experimental Results
  • How does RTG model real graphs?
  • Blognet: a social network of blogs based on citations
    → undirected, unweighted and unipartite
    → N = 27,726; E = 126,227 over 80 time ticks.
  • Com2Cand: the U.S. electoral campaign donations network from organizations to candidates
    → directed, weighted ($ amounts) and bipartite
    → N = 23,191; E = 877,721; W = 4,383,105,580 over 29 time ticks.

38
Experimental Results
  • Blognet vs. RTG
[Figure panels: Blognet, RTG; axes: count vs. degree]
L01. Power-law degree distribution [Faloutsos et al. '99, Kleinberg et al. '99, Chakrabarti et al. '04, Newman '04]
39
Experimental Results
  • Blognet vs. RTG
[Figure panels: Blognet, RTG; axes: count vs. triangles]
L02. Triangle Power Law (TPL) [Tsourakakis '08]
40
Experimental Results 1
  • Blognet vs. RTG
[Figure panels: Blognet, RTG; axes: λ_rank vs. rank]
L03. Eigenvalue Power Law (EPL) [Siganos et al. '03]
41
Graph Properties
42
Experimental Results 1
  • Blognet vs. RTG
[Figure panels: Blognet, RTG; axes: edges vs. nodes]
L05. Densification Power Law (DPL) [Leskovec et al. '05]
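
A densification plot of this kind is usually summarized by the slope of edges against nodes on log-log axes. The helper below is a generic least-squares sketch for estimating that slope from (nodes, edges) snapshots; it is not the fitting procedure used in the talk, the function name is hypothetical, and the snapshot numbers are made up purely to exercise the code.

```python
import math

def fit_power_law_exponent(xs, ys):
    """Least-squares slope of log(y) against log(x), i.e. alpha in y ∝ x**alpha."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# Synthetic snapshots that follow E = 2 * N**1.5 exactly (illustrative values only).
nodes = [10, 100, 1000, 10000]
edges = [2 * n ** 1.5 for n in nodes]
alpha = fit_power_law_exponent(nodes, edges)
```

The same helper could be applied to any of the other power-law plots, for example total weight against edges.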
43
Experimental Results
  • Blognet vs. RTG

L06. Small and shrinking diameter [Albert and Barabási '99, Leskovec et al. '05]
44
Experimental Results
  • Blognet vs. RTG
[Figure panels: Blognet, RTG; axes: size vs. time]
L07. Constant size 2nd and 3rd connected components [McGlohon et al. '08]
45
Experimental Results 1
  • Blognet vs. RTG
[Figure panels: Blognet, RTG; axes: λ_1 vs. edges]
L08. Principal Eigenvalue Power Law (λ1PL) [Akoglu et al. '08]
46
Experimental Results 1
  • Blognet vs. RTG
[Figure panels: Blognet, RTG; axes: entropy vs. resolution]
L09. Bursty/self-similar edge/weight additions [Gomez and Santonja '98, Gribble et al. '98, Crovella and Bestavros '99, McGlohon et al. '08]
47
Graph Properties
48
Experimental Results 2
  • Com2Cand vs. RTG
[Figure panels: Com2Cand, RTG; axes: diameter vs. time, size vs. time]
49
Experimental Results 2
  • Com2Cand vs. RTG
[Figure panels: Com2Cand, RTG; axes: λ_1 vs. edges, λ_rank vs. rank]
50
Experimental Results 2
  • Com2Cand vs. RTG
[Figure panels: Com2Cand, RTG; axes: count vs. in-degree, entropy vs. resolution]
51
Experimental Results 2
  • Com2Cand vs. RTG
[Figure panels: Com2Cand, RTG; axes: in-weight ($ amount) vs. in-degree (checks)]
L10. Snapshot Power Law (SPL) [McGlohon et al. '08]
52
Experimental Results 2
  • Com2Cand vs. RTG
[Figure panels: Com2Cand, RTG; axes: total weight vs. edges]
L11. Weight Power Law (WPL) [McGlohon et al. '08]
53
Graph Properties
54
Experimental Results
  • On modularity [Girvan and Newman '02]
  • Modularity decreases with increasing β
  • [Plot annotations: "more community structure"; "No significant modularity" (RTG-IE)]
55
Graph Properties
56
Experimental Results
  • On complexity
  • Computation time grows linearly with increasing W: 2M multi-edges in 7 seconds.
[Figure axes: time (ms) vs. multi-edges]
57
Outline
  • Motivation
  • Problem Definition
  • Related Work
  • A Little History
  • Proposed Model
  • Experimental Results
  • Conclusion

58
Conclusion 1
  • Our model is:
  • G1. simple and intuitive: few lines of code
  • G2. realistic: graphs that obey all eleven properties in real graphs
  • G3. parsimonious: only a handful of parameters
  • G4. flexible: can generate weighted/unweighted, directed/undirected, unipartite/bipartite graphs and any combination of those
  • G5. fast: linear in the size of the output graph

59
Conclusion 2
  • We showed that RTG mimics real graphs well.

60
Contact
Leman Akoglu: www.cs.cmu.edu/lakoglu, lakoglu_at_cs.cmu.edu
Christos Faloutsos: www.cs.cmu.edu/christos, christos_at_cs.cmu.edu
61
A Little History - 3
  • The infinite monkey theorem
  • A monkey typing randomly on a keyboard for an infinite amount of time will almost surely type a given text, such as the complete works of William Shakespeare.

62
Proposed Model
  • Burstiness and Self-similarity
  • If each step is a time tick, weight additions are uniform!
  • Start with a uniform interval
  • Recursively subdivide weight additions to each half, quarter, and so on, according to the bias b > 0.5
  • b-fraction of the additions happen in one half and the remaining in the other.
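
The recursive subdivision above can be sketched in a few lines. The version below makes two assumptions that the slide does not state: the half that receives the b-fraction is chosen by a fair coin flip at every split, and amounts are kept as floats rather than rounded to whole edges. The function name `b_model` and its parameters are hypothetical.

```python
import random

def b_model(total, levels, b=0.7, seed=0):
    """Spread `total` weight additions over 2**levels time ticks with bias b."""
    rng = random.Random(seed)
    series = [float(total)]                 # start: one uniform interval
    for _ in range(levels):
        split = []
        for amount in series:
            big, small = b * amount, (1.0 - b) * amount
            if rng.random() < 0.5:          # assumption: random side gets the b-fraction
                split.extend((big, small))
            else:
                split.extend((small, big))
        series = split                      # halves, then quarters, and so on
    return series

ticks = b_model(total=1000, levels=4, b=0.7)
```

Setting `b=0.5` gives back the uniform case; values closer to 1 concentrate the additions in fewer ticks, which is what makes the resulting series bursty.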

63
Related Work: Graph Properties
Static
  • Unweighted
    - L01. Power-law degree distribution [Faloutsos et al. '99, Kleinberg et al. '99, Chakrabarti et al. '04, Newman '04]
    - L02. Triangle Power Law (TPL) [Tsourakakis '08]
    - L03. Eigenvalue Power Law (EPL) [Siganos et al. '03]
    - L04. Community structure [Flake et al. '02, Girvan and Newman '02]
  • Weighted
    - L10. Snapshot Power Law (SPL) [McGlohon et al. '08]
Dynamic
  • Unweighted
    - L05. Densification Power Law (DPL) [Leskovec et al. '05]
    - L06. Small and shrinking diameter [Albert and Barabási '99, Leskovec et al. '05]
    - L07. Constant size 2nd and 3rd connected components [McGlohon et al. '08]
    - L08. Principal Eigenvalue Power Law (λ1PL) [Akoglu et al. '08]
    - L09. Bursty/self-similar edge/weight additions [Gomez and Santonja '98, Gribble et al. '98, Crovella and Bestavros '99, McGlohon et al. '08]
  • Weighted
    - L11. Weight Power Law (WPL) [McGlohon et al. '08]