A Framework For Community Identification in Dynamic Social Networks

Slides: 33
Provided by: victo69
Learn more at: https://www.cs.kent.edu
Transcript and Presenter's Notes


1
A Framework For Community Identification in
Dynamic Social Networks
  • Chayant Tantipathananandh
  • Tanya Berger-Wolf
  • David Kempe
  • Presented by Victor Lee

2
Outline of Presentation
  • The Challenge: Dynamic Social Networks
  • Framework and Problem Formulation
  • Individual and Group Colorings
  • Group Coloring Heuristics
  • Experimental Results
  • Future Directions

3
The Problem
  • Many well-known approaches to identify
    communities in social networks
    • Graph Partitioning
    • Clustering
    • Various measures of closeness or density
  • But, these approaches generally assume static
    networks
  • Most social networks are dynamic

4
Dynamic Social Networks
  • Social Networks change over time
  • Membership changes
  • Interaction changes
  • Most community identification techniques
  • Use a single snapshot
  • Or use time-averaged measurements
  • Lose important information

5
Importance of Dynamic Information
[Figure: Network 1 and Network 2, each showing which of individuals A, B, and C interact at time steps T1 through T6]
  • Networks 1 and 2 have the same average
    characteristics, but
  • Network 1 shows an oscillation
  • Network 2 suggests that C joins the community

6
Proposal
  • New framework for modeling social networks over
    time
  • Algorithms and Heuristics to identify dynamic
    communities
  • Experiments to verify the concept and the
    computational performance

7
Problem Formulation
  • Given
    • A set of individuals
    • A sequence of snapshot observations
  • Find
    • A best-fit set of time-varying communities C(t)
    • Best-fit time-varying community membership for
      each individual
  • Approach
    • Combinatorial optimization
    • Graph coloring

8
Model: Individuals and Groups
  • Set of individuals $X = \{i_1, i_2, \ldots, i_n\}$
  • Sequence of observations $\langle P_1, P_2, \ldots, P_T \rangle$
  • Discrete time
  • Record interaction between individuals
  • The set of individuals interacting at time t
    defines a group.
  • If A interacts with B, and B interacts with
    C, then A, B, and C belong to the same group
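
The transitive grouping rule above can be sketched as a connected-components computation over one snapshot. This is a minimal illustration, not code from the presentation: the function name `groups_from_interactions` and the choice to treat an individual with no interactions as a singleton group are assumptions.

```python
from collections import defaultdict

def groups_from_interactions(individuals, interactions):
    """Return the groups (connected components) of a single snapshot.

    `interactions` is a list of (a, b) pairs observed at one time step.
    Assumption: an individual with no interactions forms a singleton group.
    """
    adjacency = defaultdict(set)
    for a, b in interactions:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = set()
    groups = []
    for start in individuals:
        if start in seen:
            continue
        # Depth-first search collects everyone reachable from `start`.
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(frozenset(component))
    return groups

# A interacts with B, and B with C, so A, B, and C share a group.
snapshot = groups_from_interactions(
    ["A", "B", "C", "D"], [("A", "B"), ("B", "C")]
)
print(snapshot)
```

Because interaction is treated as transitive, A and C end up in the same group even though no direct A-C interaction was recorded.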

9
Group vs Community
  • Snapshot Graph
  • Individual is a vertex
  • Interaction is an edge
  • Group is a connected subgraph
  • Assumption: interaction is sufficiently limited
    so that the graph is not connected (we have
    disjoint groups)
  • Group ≠ Community
  • Groups capture observed interaction at a point in
    time
  • Communities extend over time

10
Graphing the Observations
  • Each time slice is one observation
  • Edges within a time slice show observed
    interaction at time t
  • Add edges joining all observations of the same
    individual
  • No edges between groups from one time to another

[Figure: observation graph with individual vertices and group vertices]
11
Refine the Problem
  • A community appears as a sequence of groups, of
    at most one group per time slice.
  • Tasks
    • Assign each group to a community (color the
      group vertices)
    • Assign each individual to a community, for each
      time step (color individual vertices)
  • More Assumptions
    • Individuals belong to one community at a time
    • Individuals don't change community frequently
    • Individuals frequently appear in their community
12
Cost Model
  • Quantify a good community identification
  • Assign costs to undesirable behavior
  • I-cost $\alpha$: when an individual changes color.
  • G-costs
    • $\beta_1$: when an individual is absent from its
      community.
    • $\beta_2$: when an individual is present in a
      different community.
  • C-cost $\gamma$: for each color that an individual uses
  • Find a coloring with minimum cost

13
Coloring Choices and Costs
At time T3, C temporarily changes its interaction.
[Figure: Coloring 1 and Coloring 2 of individuals A, B, C, and D over time steps T1 through T4]
  • Coloring 1: C changes community and then changes
    back.
  • Cost: $2\alpha$ (plus $\gamma$ if this color hasn't
    been used before)
  • Coloring 2: C stays in its original community and
    just visits.
  • Cost: $\beta_1 + \beta_2$
  • Optimal coloring depends on the comparison
    $(\beta_1 + \beta_2) < (2\alpha + \gamma)$ or $(2\alpha)$
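
As a worked illustration of that comparison, the following uses two sets of cost values that are assumed for the example and do not come from the slides. It takes the case where the visited community's color is new to C, so the C-cost applies.

```latex
\begin{align*}
\text{Assume } \alpha = 1,\ \beta_1 = \beta_2 = 1,\ \gamma = 1: \quad
  & \beta_1 + \beta_2 = 2 \;<\; 2\alpha + \gamma = 3
  \ \Rightarrow\ \text{Coloring 2 (C visits) is cheaper} \\
\text{Assume } \alpha = 0.25,\ \beta_1 = \beta_2 = 1,\ \gamma = 0.25: \quad
  & \beta_1 + \beta_2 = 2 \;>\; 2\alpha + \gamma = 0.75
  \ \Rightarrow\ \text{Coloring 1 (C switches) is cheaper}
\end{align*}
```

Lowering the switching costs relative to the visiting costs flips the preferred coloring, which is how the cost parameters express how fluid community membership is expected to be.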

14
Finding Optimal Colorings
  • Finding the optimal solution is NP-hard
  • Partition the problem
  • Find an optimal set of communities
  • Find optimal assignment of individuals to
    communities
  • If Phase 1 (Group Coloring) is completed first
  • Phase 2 is reduced from $O(2^N)$ to $O(2^G)$, where
    $N$ = # of individuals, $G$ = # of groups
  • The cost incurred by one individual's coloring
    is independent of the colors chosen by others.

15
Independence of Individual Color Choice
  • Proof
  • Cost of an individual's behavior = A (I-cost)
    + B (G-cost) + C (C-cost)
  • Costs are assessed individually
  • I-cost = $\alpha \times$ (# of color changes)
  • G-cost = $\beta_1 \times$ (# absences from its group)
    $+\ \beta_2 \times$ (# visits to other groups)
  • C-cost = $\gamma \times$ (# of colors that an individual
    uses)
  • So, we can solve for each individual one at a
    time.
  • Moreover, we can assess cost incrementally, from
    time $t$ to time $t+1$

16
Individual Coloring Algorithm
  • $C$ = set of all colors observed to be used by an
    individual $i$
  • $F(t) = \{ S \subseteq C : 1 \le |S| \le t \}$ = all possible
    subsets of colors up to time $t$
  • $G(t, x)$ = G-cost to use color $x$ at time $t$
  • $I(t, x, y)$ = I-cost to use color $x$ at time $t-1$ and
    color $y$ at time $t$
  • $C(x, R)$ = C-cost to use color $x$ when color set $R$
    has been used
  • $\Gamma(t, S, x)$ = min. cost at time $t$, using color $x$,
    with color set $S$ used
  • At time 1: $\Gamma(1, \{x\}, x) = G(1, x)$
  • At time $t$: $\Gamma(t, S, x) = G(t, x) + \min \left[ \Gamma(t-1, R, y) + I(t, y, x) + C(x, R) \right]$
    over all $R$ and $y$, where $R \in F(t-1)$, $y \in R$,
    $R \cup \{x\} = S$
i-cost: changing color; g-cost: wrong group; c-cost: new color
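
The dynamic program on this slide can be sketched as follows, for one individual and a fixed group coloring. This is a hedged illustration rather than the presenters' code: the name `optimal_individual_cost`, the input layout, and the exact form of the G-cost (absence costs β1, plus β2 if the individual was seen in a differently colored group) are assumptions. Following the slide's base case, no C-cost is charged for the first color.

```python
def optimal_individual_cost(observed, colors, alpha, beta1, beta2, gamma):
    """Minimum cost of coloring one individual across all time steps.

    observed[t] is the color of the group the individual was seen in at
    step t, or None if the individual was not observed (assumed layout).
    The table maps (color set S, current color x) to the minimum cost.
    """
    def g_cost(t, x):
        seen = observed[t]
        if seen == x:
            return 0.0
        # Absent from own community, and possibly visiting another one.
        return beta1 + (beta2 if seen is not None else 0.0)

    # Base case: one color used so far, only the G-cost is charged.
    table = {(frozenset([x]), x): g_cost(0, x) for x in colors}

    for t in range(1, len(observed)):
        nxt = {}
        for (used, y), cost in table.items():
            for x in colors:
                candidate = (
                    cost
                    + g_cost(t, x)
                    + (alpha if x != y else 0.0)       # I-cost: color change
                    + (gamma if x not in used else 0.0)  # C-cost: new color
                )
                key = (used | {x}, x)
                if candidate < nxt.get(key, float("inf")):
                    nxt[key] = candidate
        table = nxt

    return min(table.values())

# C is seen with the blue group except for one step with the red group.
seen_by_c = ["blue", "blue", "red", "blue"]
print(optimal_individual_cost(seen_by_c, ["red", "blue"], 1, 1, 1, 1))
print(optimal_individual_cost(seen_by_c, ["red", "blue"], 0.25, 1, 1, 0.25))
```

The table holds at most one entry per (color set, color) pair, so its size is exponential in the number of colors but independent of the number of time steps.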
17
Optimal Individual Coloring
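  • Given a group coloring, the minimum cost of
    coloring the individual $i$ is
    $\min \{ \Gamma(T, S, x) : S \in F(T),\ x \in S \}$, where
    $\Gamma(T, S, x)$ is the minimum cost at time $T$ using
    color $x$ with color set $S$ used
  • Time complexity is $O(n T C^2 2^C)$
  • Space requirement is $O(C\, 2^C)$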
  • If the number of groups C is not large, the
    complexity is tractable.

18
Optimal Group Coloring
  • Determine the best mapping of groups at time $t$ to
    groups at time $t+1$
  • Groups that are mapped across time are part of
    the same community and have the same color
  • A coloring is good if most individuals can retain
    their color from step to step.

19
Bipartite Matching Heuristic
  • Matching Graph
  • For each pair of groups $g$, $g'$ at times $t$, $t' = t+1$,
    add a weighted edge from $v_{g,t}$ to $v_{g',t'}$
  • Weight = $|g \cap g'|$ (similarity of $g$ to $g'$)
  • Find the maximum weight bipartite matching
  • Evaluation
  • Weights i-cost more than g-cost
  • Performs well if membership is fairly stable
  • No long range perspective
  • More efficient heuristics?

i-cost: changing color; g-cost: wrong group; c-cost: new color
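
A minimal sketch of the matching step between two consecutive time slices, with edge weight equal to the overlap of the two groups. To stay self-contained it finds the maximum-weight matching by brute force over permutations, which is only practical for a handful of groups and stands in for a proper matching algorithm. The name `best_matching` is hypothetical.

```python
from itertools import permutations

def best_matching(groups_t, groups_next):
    """Maximum-weight bipartite matching between two time slices.

    Weight of an edge is the number of shared members. Brute force:
    tries every assignment, so use it only on small inputs.
    """
    # Pad the smaller side with empty groups so both sides have equal size.
    size = max(len(groups_t), len(groups_next))
    left = list(groups_t) + [frozenset()] * (size - len(groups_t))
    right = list(groups_next) + [frozenset()] * (size - len(groups_next))

    best_weight, best_perm = -1, ()
    for perm in permutations(range(size)):
        weight = sum(len(left[i] & right[j]) for i, j in enumerate(perm))
        if weight > best_weight:
            best_weight, best_perm = weight, perm

    # Keep only pairs of real groups that actually share members.
    pairs = [
        (i, j)
        for i, j in enumerate(best_perm)
        if i < len(groups_t) and j < len(groups_next) and left[i] & right[j]
    ]
    return best_weight, pairs

groups_t = [{"A", "B"}, {"C", "D"}]
groups_next = [{"C", "D", "E"}, {"A", "B"}]
print(best_matching(groups_t, groups_next))
```

Matched pairs would receive the same color; a group left unmatched would start a new community.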
20
Greedy Heuristics for Group Coloring
  • Approach: Maximize pairwise similarity between
    groups, for all pairs of groups over all
    timesteps
  • Jaccard's index: $\mathrm{Jac}(g, g') = |g \cap g'| \,/\, |g \cup g'|$
  • Weighted for temporal proximity:
    $\mathrm{Jac}_\Delta(g, g') = \mathrm{Jac}(g, g') \,/\, |t - t'|$

overlap between g and g', scaled to size of g and
g'
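
The two similarity measures can be sketched in a few lines. The temporal weighting is written here as division by the time gap, which is one reading of the slide and should be treated as an assumption; the function names are hypothetical.

```python
def jaccard(g, h):
    """Jaccard index: size of the overlap divided by size of the union."""
    union = g | h
    return len(g & h) / len(union) if union else 0.0

def time_weighted_jaccard(g, t, h, t_prime):
    """Jaccard index scaled down by the time gap between the two groups.

    Assumption: similarity is divided by |t - t'|, so closer time steps
    count more. Groups from the same time step are not compared.
    """
    if t == t_prime:
        raise ValueError("groups must come from different time steps")
    return jaccard(g, h) / abs(t - t_prime)

g = {"A", "B", "C"}
h = {"B", "C", "D"}
print(jaccard(g, h))                       # 2 shared of 4 total -> 0.5
print(time_weighted_jaccard(g, 1, h, 3))   # 0.5 / 2 -> 0.25
```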
21
Greedy Heuristics for Group Coloring
  • Greedy Heuristic 1 (time is not a factor)
    • Construct a square similarity matrix of size
      (# groups) × (# groups)
    • Using agglomerative clustering
  • Greedy Heuristic 2 (look backwards in time)
    • For $t = 1$ to $T$ do
    • Match most similar pairs $g$, $g'$ for any time
      $t' < t$
    • If similarity = 0 or all colors have been used,
      add a new color
  • Greedy Heuristic 3 (look back the shortest
    interval)
    • Like Heuristic 2, but use only the time $t'$
      closest to $t$ at which some similarity$(g, g') > 0$
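
Greedy Heuristic 2 can be sketched as below. The slide leaves details open, so several choices here are assumptions: similarity is the Jaccard index divided by the time gap, candidate pairs are processed in descending similarity, and a color is used by at most one group per time slice. The name `greedy_backward_coloring` is hypothetical.

```python
def greedy_backward_coloring(snapshots):
    """Color groups by matching each one to similar groups at earlier times.

    snapshots[t] is a list of member sets, one per group at time step t.
    Returns colors[t][k], the integer color of group k at time step t.
    """
    colors = []
    next_color = 0
    for t, groups in enumerate(snapshots):
        # Collect every (similarity, group index, earlier color) candidate.
        candidates = []
        for k, g in enumerate(groups):
            for tp in range(t):
                for kp, gp in enumerate(snapshots[tp]):
                    union = g | gp
                    sim = len(g & gp) / len(union) / (t - tp) if union else 0.0
                    if sim > 0:
                        candidates.append((sim, k, colors[tp][kp]))

        # Most similar pairs first; each group and each color is used once.
        candidates.sort(key=lambda c: -c[0])
        assigned, used = {}, set()
        for sim, k, color in candidates:
            if k not in assigned and color not in used:
                assigned[k] = color
                used.add(color)

        # Groups with no usable match start a new community.
        row = []
        for k in range(len(groups)):
            if k not in assigned:
                assigned[k] = next_color
                next_color += 1
            row.append(assigned[k])
        colors.append(row)
    return colors

snapshots = [
    [{"A", "B"}, {"C", "D"}],
    [{"A", "B", "C"}, {"D"}],
    [{"A", "B"}, {"C", "D"}],
]
print(greedy_backward_coloring(snapshots))
```

In this example C's one-step visit to the other group does not create a new community: both communities keep their colors across all three time steps.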

22
Experiment 1: Verify the Framework
  • Does the framework capture the intuitive concept
    of dynamic community?
  • Procedure
  • Construct small, synthetic datasets
  • Use exhaustive search to get a truly optimal
    coloring

23
Experiment 1A: Assembly Line
  • At each time step, 1 member leaves and 1 enters a
    group, resulting in a complete membership change
    in 3 steps.
  • Results change as costs change. (A) favors
    stable membership. (B) allows for more fluid
    membership.

24
Experiment 1B: Dutiful Children
  • 2, 3, and 4 are Children. 0 and 1 are Parents
    that visit a different child each timestep.
  • Results: Framework succeeds at detecting the
    individual children as well as the visitation
    pattern.

25
Experiment 2: Quality of Heuristic Results
  • Do the heuristics obtain colorings similar to
    those of an exhaustive search?
  • Procedure
  • Re-test the synthetic datasets using the various
    heuristics

Results: At least one heuristic method obtains
the same coloring and total cost as exhaustive
search
26
Experiment 3: Real-World Datasets
  • Do the framework and heuristics together obtain
    expected results using real-world datasets?

27
Experiment 3A: Southern Women
  • Eighteen women in 1933 in Natchez, Mississippi
  • Tracks their attendance at 14 social events

28
Experiment 3A: Prior Results
  • Twenty one analyses (1941 to 2001) all show
    similar results
  • Two clear communities
  • The membership of individuals 8, 9, and 16 is
    less certain.

29
Experiment 3A: Results
  • Detects 4 communities, which are subsets of the
    traditional 2 communities
  • Individuals 6 and 10 change membership over time
  • By adjusting cost factors, the results of most of
    the 21 prior analyses can be duplicated

30
Experiment 3B: Grevy's Zebra
  • 28-member zebra herd observed 44 times over 3
    months in 2002
  • The graph to the left shows the aggregate
    interaction.
  • Temporal information is lost.

31
Experiment 3B: Results
  • Inferred communities agree with manual results
    obtained by biologists.
  • 4 stable communities
  • Some short-lived communities and some visiting

32
Conclusions
  • We present a framework for identifying
    communities in dynamic social networks
  • The framework produces meaningful results
    compared to traditional methods
  • Heuristic methods produce near-optimal solutions
  • Future Directions
  • Develop an approximation algorithm which
    guarantees the quality of the result
  • Investigate scalability over network size and
    time
  • Relax assumptions about interaction and dynamics