Title: Introduction to Social Network Analysis theory and its application to CSCL
1Introduction to Social Network Analysis theory
and its application to CSCL
- Christophe Reffay (France)
- Alejandra Martínez-Monés (Spain)
2Outline
- Introduction: historical examples
- Milgram, Moreno, Granovetter, Burt, Durkheim
- Social capital (e.g. guanxi), Leaders, Facilitators
- Graphs & Sociograms: definition & properties
- What is it interesting for?
- Building sociograms in practice
3The world is so small
Fine! But how small exactly?
4Example 1: Milgram, 1967
- Target person
- Name
- Trader
- Lives in Boston
[Figure: map showing Nebraska and Boston]
- Starting senders
- 100 live in Boston
- 96 live in Nebraska
- 100 traders in Nebraska
Of the 296 documents, only 217 were actually sent.
5Only 64 documents reach the target
6So what is the size of the world?
- Geographically directed: 6.1 intermediaries
- Professionally directed: 4.6 intermediaries
- Globally
- For the USA: 5 intermediaries
- (Kochen, 1989) shows the stability of this result
=> The 6 degrees of separation
7The New Small-World Experiment (bigger, faster,
and less expensive)
- (Dodds, Muhamad, and Watts, 2002)
- Very similar to Milgram's experiment, but web-based
- Initial results
- 60,000 senders
- 19 targets
- 171 countries
- 380 chains completed (worse attrition than Milgram)
- Median chain length ranges from 5 (for the same country) to 7 (for different countries)
8Conversely: how many people do we know?
- Granovetter (1976) defines this relation: a person that we have met and with whom contact has been established
- Ithiel de Sola Pool (1978): 100-day experiment => 3500 persons every 20 years
- Freeman & Thomson (1989): phone book (305/112147) => 5520
9Strong vs weak ties (Granovetter 1973)
- Strong ties
- (longevity, emotional intensity, intimacy, reciprocity)
- => Transitivity => Cliques (highly connected sub-groups)
- Strong ties: connections inside the clique
- Weak ties: bridges between cliques
10Holism
- An individual acts according to the rules of the group he or she belongs to
- Weak: rules are internalised through socialisation (Ronald Stuart Burt)
- Strong: structure determines action (Émile Durkheim)
11The strength of weak ties (Granovetter)
- Social capital (guanxi)
- Have a lot of contacts
- To be able to go beyond the clique by activating bridges
- Easier to find a good house/job/friend/collaborator
- Nan Lin (1982) shows the efficiency of weak ties in a Milgram-like experiment (Men/Women x Black/White)
12Moreno's experiments (1943)
- Pupils' relations in the classroom
- Pupils of various age ranges
- Gender study
- "If you could choose freely, which (2) children would you like to have as direct neighbours?"
- Main results
- At <age> => pupils tend to <?>
- 6-8 years old => mix
- 8-13 years old => separate
- 13-15 years old => mix
- 15-17 years old => separate
13Outline
- Introduction: historical examples
- Graphs & Sociograms: definition & properties
- Graph types, nodes, edges, relationships
- Density, Distance, Path, Radius, Diameter
- Centrality, Betweenness, Cohesion
- What is it interesting for?
- Building sociograms in practice
14Graphs & Sociograms
See http://en.wikipedia.org/wiki/Social_network
15What does SNA deal with?
- Types of the graph
- 1-mode, 2-mode
- Directed/Undirected
- Weighted (valued) graphs
- Measures
- Density, Diameter, Distances, In/Out degree,
- Structural Cohesion, Betweenness Centrality
- Sub graphs
- Cliques
- Clusters
16Graph types: examples
- A set of nodes linked by edges/ties
- Examples (one-mode): nodes are homogeneous
- Networks (cities, roads)
- Communities of peers (persons, relationships)
- Examples (two-mode): 2 distinct node types
- Collaborative activities (actors-objects, actions)
- Affiliation network (actors-structure, affiliation)
- Directed graphs: ties are oriented
- Valued graphs: ties have weights
17Sociogram: individuals and relationships
- Nodes are generally individuals (workers, actors, learners, teachers, tutors)
- Edges are relationships between individuals (communication, resource sharing, service exchanges, friendship, family links, ...)
- Example
- One-mode
- Directed
- Unvalued
Generalisation: nodes can be groups or corporations
18One-mode or Two-mode networks
Two-mode
One-mode
- All nodes are of the same type
19Directed vs Undirected graphs
Edges are oriented
Edges are not oriented
- Directed relationship, meaning "has sent some message to"
- Undirected relationship, meaning "has exchanged some message with"
20Weighted (valued) vs Unvalued graphs
Edges have values
=> Transforms the meaning of the relationship!
Edges have no value
21Conclusion: 8 possible network types
One-mode (one node type):
- One-Mode Directed Valued
- One-Mode Undirected Valued
- One-Mode Directed Unvalued
- One-Mode Undirected Unvalued
Two-mode (two node types):
- Two-Mode Directed Valued
- Two-Mode Undirected Valued
- Two-Mode Directed Unvalued
- Two-Mode Undirected Unvalued
22Network types: transformations allowed
- Two-Mode => One-Mode
- Directed => Undirected
- Valued => Unvalued
23Two-Mode => One-Mode
Strategy: decide what a shared resource represents for the relationships between (blue) nodes.
- Do blue nodes share any orange resource? => Unvalued
- How many orange resources do blue nodes share? => Valued
[Figure: projected network of blue nodes with edge values 1, 2, 2, 1, 1, 1]
NB: As done in some CSCL research, we can also transform the two-mode network into a network of orange nodes (resource network)
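The two questions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `affiliations` data and the function name `project_two_mode` are hypothetical, and pairs sharing no resource are simply left out of the result.

```python
from itertools import combinations

def project_two_mode(affiliations):
    """Project a two-mode network onto its actor ("blue") nodes.

    `affiliations` maps each actor to the set of resources ("orange" nodes)
    it is linked to. The result maps each pair of actors sharing at least
    one resource to the number of resources they share.
    """
    weights = {}
    for a, b in combinations(sorted(affiliations), 2):
        shared = len(affiliations[a] & affiliations[b])
        if shared > 0:
            weights[(a, b)] = shared
    return weights

# Hypothetical data: three learners and the documents they worked on.
affiliations = {
    "ann": {"doc1", "doc2"},
    "bob": {"doc1", "doc2", "doc3"},
    "cat": {"doc3"},
}

valued = project_two_mode(affiliations)  # "How many resources are shared?"
unvalued = set(valued)                   # "Is any resource shared?"
```

Swapping the roles of actors and resources in `affiliations` would give the resource network mentioned in the NB.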
24Directed => Undirected
Strategy: decide whether or not you require edges in both directions.
- Are nodes connected (one tie is enough)?
- Are nodes connected with reciprocal edges?
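Both strategies can be sketched as follows. The function name `symmetrize`, the rule names `"any"` and `"reciprocal"`, and the sample arcs are assumptions made for this illustration, not terms from a particular tool.

```python
def symmetrize(arcs, rule="any"):
    """Turn directed arcs (source, target) into undirected edges.

    rule="any":        one tie in either direction is enough.
    rule="reciprocal": an edge is kept only if ties exist in both directions.
    """
    if rule not in ("any", "reciprocal"):
        raise ValueError("rule must be 'any' or 'reciprocal'")
    arcs = set(arcs)
    edges = set()
    for source, target in arcs:
        if source == target:
            continue  # ignore self-loops
        if rule == "any" or (target, source) in arcs:
            edges.add(frozenset((source, target)))
    return edges

# Hypothetical messages: a and b wrote to each other, b wrote to c only.
arcs = [("a", "b"), ("b", "a"), ("b", "c")]
one_tie_is_enough = symmetrize(arcs, "any")
reciprocal_only = symmetrize(arcs, "reciprocal")
```

The reciprocal rule always yields a subset of the edges produced by the "one tie is enough" rule.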
25Valued => Unvalued
Strategy: only ties with value > Threshold are considered
Threshold = 5
Threshold = 8
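A minimal sketch of this thresholding step, assuming a strict comparison (value > threshold) and a hypothetical set of valued edges:

```python
def dichotomize(valued_edges, threshold):
    """Keep only the ties whose value is strictly above the threshold."""
    return {edge for edge, value in valued_edges.items() if value > threshold}

# Hypothetical valued network: number of messages exchanged per pair.
valued_edges = {("a", "b"): 9, ("a", "c"): 6, ("b", "c"): 3}

ties_above_5 = dichotomize(valued_edges, 5)
ties_above_8 = dichotomize(valued_edges, 8)
```

Raising the threshold can only remove ties, so the choice of threshold directly shapes the resulting unvalued network.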
26Useful definitions & measures on graphs
- Density of the graph
- Degree, In-degree, Out-degree
- Path, Geodesic distance, Diameter
- Centrality indices (for nodes)
- Degree centrality
- Betweenness centrality
- Closeness centrality
- Ego-net
- Cliques
27Density (of edges) for an undirected graph
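Eff. = effective (existing) edges, Poss. = possible edges, d = Eff. / Poss.
- Eff. = 0, Poss. = 10, d = 0
- Eff. = 2, Poss. = 10, d = 0.2
- Eff. = 4, Poss. = 10, d = 0.4
- Eff. = 8, Poss. = 10, d = 0.8
- Eff. = 10, Poss. = 10, d = 1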
28Density (of edges) for a directed graph
Reciprocal edges count twice (there are twice as many possible edges)
- Eff. = 0, Poss. = 20, d = 0
- Eff. = 4, Poss. = 20, d = 0.2
- Eff. = 8, Poss. = 20, d = 0.4
- Eff. = 16, Poss. = 20, d = 0.8
- Eff. = 20, Poss. = 20, d = 1
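The density values on the last two slides can be recomputed with a short function. This is a sketch assuming a graph of 5 nodes without self-loops (which gives 10 possible undirected edges and 20 possible directed ones); the function name `density` is chosen for the illustration.

```python
def density(n_nodes, n_edges, directed=False):
    """Density = effective edges / possible edges (no self-loops)."""
    possible = n_nodes * (n_nodes - 1)   # ordered pairs of distinct nodes
    if not directed:
        possible //= 2                   # each unordered pair counted once
    if possible == 0:
        return 0.0
    return n_edges / possible

undirected_example = density(5, 4)               # 4 / 10
directed_example = density(5, 8, directed=True)  # 8 / 20
```

The same number of ties therefore gives half the density once direction is taken into account.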
29Degree in an undirected graph
- For a node, Degree = number of edges
Degree is one of the centrality measures => also called Degree Centrality
NB: These indices have a normalized version where the number of ties is divided by the maximum number of possible ties
30In- & Out-degree in a directed graph
- In-degree: number of edges coming in to the node
- Out-degree: number of edges coming out of the node
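As an illustration of these definitions, the sketch below counts in- and out-degrees from a list of arcs and normalises them by the maximum number of possible ties (n - 1). The node names and arcs are hypothetical.

```python
def in_out_degrees(nodes, arcs):
    """Return (in_degree, out_degree) dictionaries for a directed graph."""
    in_deg = {n: 0 for n in nodes}
    out_deg = {n: 0 for n in nodes}
    for source, target in arcs:
        out_deg[source] += 1
        in_deg[target] += 1
    return in_deg, out_deg

def normalized(degrees):
    """Divide each degree by the maximum possible number of ties (n - 1)."""
    max_ties = len(degrees) - 1
    if max_ties <= 0:
        return {n: 0.0 for n in degrees}
    return {n: d / max_ties for n, d in degrees.items()}

# Hypothetical forum: an arc (x, y) means "x has sent a message to y".
nodes = ["a", "b", "c", "d"]
arcs = [("a", "b"), ("a", "c"), ("b", "c"), ("d", "a")]
in_deg, out_deg = in_out_degrees(nodes, arcs)
```

In an undirected graph the two counts coincide, which is the plain degree of the previous slide.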
31Path: sequence of edges connecting 2 nodes
Example in a directed graph
[Figure: directed graph with nodes A to J]
- From A->E: 2 possible paths
- (A B C E)
- or
- (A B D E)
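The two paths listed above can be enumerated programmatically. In this sketch the arc list is an assumption reconstructed from those two paths only (the figure contains more nodes), and `simple_paths` is a name chosen for the illustration.

```python
def simple_paths(arcs, start, goal):
    """List every path from start to goal that visits no node twice."""
    successors = {}
    for source, target in arcs:
        successors.setdefault(source, []).append(target)

    def walk(path):
        node = path[-1]
        if node == goal:
            yield path
            return
        for nxt in successors.get(node, []):
            if nxt not in path:          # never revisit a node
                yield from walk(path + [nxt])

    return list(walk([start]))

# Arcs assumed from the two paths given on the slide.
arcs = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "E"), ("D", "E")]
paths_a_to_e = simple_paths(arcs, "A", "E")
```

Because the arcs are directed, no path exists in the opposite direction: `simple_paths(arcs, "E", "A")` returns an empty list.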
32Path: example in an undirected graph
[Figure: undirected graph with nodes A to J]
- From D to E: 2 possible paths
- (D E)
- or
- (D B C E)
33Diameter of the graph
- Diameter: the longest distance in the graph, i.e. the maximal geodesic distance between any pair of nodes
What is the diameter of this graph?
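A minimal sketch of this definition, assuming an undirected, connected graph stored as a dictionary of neighbour sets. Geodesic distances come from a breadth-first search; the `chain` graph is a hypothetical example, not the graph of the slide.

```python
from collections import deque

def geodesic_distances(adjacency, source):
    """Shortest-path length (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def diameter(adjacency):
    """Largest geodesic distance between any pair of nodes."""
    longest = 0
    for node in adjacency:
        dist = geodesic_distances(adjacency, node)
        if len(dist) < len(adjacency):
            raise ValueError("graph is not connected: diameter is undefined")
        longest = max(longest, max(dist.values()))
    return longest

# Hypothetical chain a - b - c - d.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
chain_diameter = diameter(chain)
```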
34Betweenness centrality
- Number of shortest paths passing through the node
Directed graph
Undirected graph
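One common way to compute this score is sketched below for an undirected, unweighted graph. It follows the Brandes accumulation scheme, which is a standard technique and not necessarily what the tools shown in these slides use. When several shortest paths link the same pair, each path counts for a fraction of that pair. The `star` graph is hypothetical.

```python
from collections import deque

def betweenness(adjacency):
    """Betweenness centrality of every node (undirected, unweighted graph)."""
    score = {v: 0.0 for v in adjacency}
    for s in adjacency:
        # Breadth-first search from s: count shortest paths (sigma)
        # and remember the predecessors of each node on those paths.
        order = []
        preds = {v: [] for v in adjacency}
        sigma = {v: 0 for v in adjacency}
        dist = {v: -1 for v in adjacency}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adjacency[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adjacency}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was visited from both of its ends.
    return {v: value / 2 for v, value in score.items()}

# Hypothetical star: every shortest path between leaves passes through c.
star = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
star_scores = betweenness(star)
```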
35Another example for betweenness
Highest score
Lowest score
- Source: http://en.wikipedia.org/wiki/Centrality
36Closeness centrality
- Scoring the closeness of one node to all others
Directed graph
Undirected graph
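A sketch of two usual closeness scores for an undirected graph, under these assumptions: closeness is taken as (n - 1) divided by the sum of geodesic distances to all other nodes, which requires a connected graph, and harmonic closeness is the sum of inverse distances, which also works when some nodes are unreachable. The `chain` graph is hypothetical.

```python
from collections import deque

def distances_from(adjacency, source):
    """Geodesic distance from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def closeness(adjacency, node):
    """(n - 1) / sum of distances from node to all the others."""
    dist = distances_from(adjacency, node)
    if len(dist) < len(adjacency):
        raise ValueError("closeness needs every node to be reachable")
    total = sum(dist.values())
    return (len(adjacency) - 1) / total if total else 0.0

def harmonic_closeness(adjacency, node):
    """Sum of 1 / distance over the nodes reachable from node."""
    dist = distances_from(adjacency, node)
    return sum(1 / d for other, d in dist.items() if other != node)

# Hypothetical chain a - b - c - d: inner nodes are closer to everyone.
chain = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
closeness_b = closeness(chain, "b")
closeness_a = closeness(chain, "a")
```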
37Eigenvector centrality
- Scoring the importance of a node in the network
- (takes into account the importance of connected nodes)
Undirected graph
NB: Eigenvector centrality is normally calculated for undirected graphs only; for directed graphs it is not well defined in general
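The idea that a node is important when its neighbours are important can be sketched with a power iteration. This is a generic numerical technique, not necessarily the one used by the tools in these slides. Assumptions: an undirected, connected graph with at least one edge; scores are rescaled so that the highest is 1; the matrix A + I is iterated instead of A, which has the same eigenvectors but avoids oscillation on bipartite graphs such as the hypothetical star below.

```python
def eigenvector_centrality(adjacency, iterations=200, tolerance=1e-10):
    """Approximate eigenvector centrality by power iteration on A + I."""
    score = {v: 1.0 for v in adjacency}
    for _ in range(iterations):
        # New score = own score + sum of the neighbours' scores.
        new = {v: score[v] + sum(score[w] for w in adjacency[v])
               for v in adjacency}
        largest = max(new.values())
        new = {v: value / largest for v, value in new.items()}
        change = max(abs(new[v] - score[v]) for v in adjacency)
        score = new
        if change < tolerance:
            break
    return score

# Hypothetical star: the centre is linked to three leaves.
star = {"c": {"x", "y", "z"}, "x": {"c"}, "y": {"c"}, "z": {"c"}}
star_scores = eigenvector_centrality(star)
```

For this star the exact leaf score is 1/sqrt(3) of the centre's score, which the iteration approaches within the tolerance.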
38The structure as a constraint
Net A: nodes 1 to 8, density DA = 9/28 = 0.321
Net B: nodes 1 to 8, density DB = 9/28 = 0.321
Do nodes 4 and 5 have the same role in nets A and B?
39Measures for the network B
[Figure panels: Betweenness, Degree, Eigenvector, Closeness, 2-Local Eigenvector, Harmonic Closeness]
40Measures for the network A
[Figure panels: Betweenness, Degree, Eigenvector, Closeness, 2-Local Eigenvector, Harmonic Closeness]
41Ego-net: the network of ego (undirected)
- Ego: the selected node
- Alters (neighbours): distance(Ego, Alter) = 1
- Ties between ego and alters
- Ties between alters
Ego-net (x34)
Ego-net (x38)
Whole network
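The four ingredients listed above (ego, alters, ego-alter ties, alter-alter ties) can be extracted as follows. This is a sketch for an undirected graph stored as a dictionary of neighbour sets; the `network` data is hypothetical.

```python
def ego_net(adjacency, ego):
    """Return (members, edges) of the ego-net of `ego`.

    Members are ego and its alters (nodes at distance 1). Edges are all
    ties among members: ego-alter ties and alter-alter ties.
    """
    members = {ego} | set(adjacency[ego])
    edges = {
        frozenset((u, v))
        for u in members
        for v in adjacency[u]
        if v in members
    }
    return members, edges

# Hypothetical whole network: e knows a and b, who know each other;
# b also knows c, and c knows d.
network = {
    "e": {"a", "b"},
    "a": {"e", "b"},
    "b": {"e", "a", "c"},
    "c": {"b", "d"},
    "d": {"c"},
}
members, edges = ego_net(network, "e")
```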
42Ego-net (considering edge direction)
Layout / Ego network (new)
Alter->Ego (x38)
Ego->Alter (x38)
Alter<->Ego (x38)
43Detection/building of sub-regions (NetDraw)
- Components: simply connected sub-groups
- K-cores or K-cliques: members of a k-core are each connected to at least k other members; members of a k-clique are each connected to the k-1 others
- Choosing the number of sub-regions
- Hiclus of geodesic distance (Hierarchical Cluster)
- Factions
- Girvan-Newman Clustering (min, max clusters)
- Polished?
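Components, the first kind of sub-region in this list, are simple to detect. The sketch below assumes an undirected graph stored as a dictionary of neighbour sets; the sample data is hypothetical.

```python
def connected_components(adjacency):
    """Split the nodes into groups within which every node is reachable."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            # Visit the neighbours that have not been collected yet.
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components

# Hypothetical network: a and b exchange messages, c is isolated.
network = {"a": {"b"}, "b": {"a"}, "c": set()}
components = connected_components(network)
```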
44Components
45This results in breaking the component
46K-cores
47Cliques or K-cliques
- Clique: maximal subset where all nodes are connected to each other
- K-clique: clique with K members
How many cliques?
=> 6 cliques
Which ones?
- One 5-clique
- One 4-clique
- One 3-clique
- Three 2-cliques
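Counting cliques by hand quickly becomes tedious. One standard way to list all maximal cliques is the Bron-Kerbosch algorithm, sketched here in its basic form without pivoting; it is offered as an illustration, not as the method used by a particular tool. The graph is hypothetical and stored as a dictionary of neighbour sets.

```python
def maximal_cliques(adjacency):
    """List all maximal cliques of an undirected graph (Bron-Kerbosch)."""
    cliques = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(set(clique))  # nothing can extend it: maximal
            return
        for v in list(candidates):
            expand(clique | {v},
                   candidates & adjacency[v],
                   excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(frozenset(), frozenset(adjacency), frozenset())
    return cliques

# Hypothetical graph: a triangle a-b-c, with d attached to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
found = sorted(sorted(clique) for clique in maximal_cliques(graph))
sizes = sorted(len(clique) for clique in found)  # the K of each K-clique
```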
48Hierarchical Clusters
6 clusters
8 clusters
2 clusters
4 clusters
49Why are they called Hierarchical clusters?
Source (in French): http://asi.insa-rouen.fr/enseignement/siteUV/dm/Cours/clustering.pdf
50Outline
- Introduction: historical examples
- Graphs & Sociograms: definition & properties
- What is it interesting for?
- Building sociograms in practice
- Conception, transformation, tools & formats
51General view
52Data representation for graphs
- Graph: a set of nodes connected by edges.
- Defining separately
- The set of nodes and
- The set of edges
- Giving the adjacency matrix
- Square matrix defining connections (values) between all pairs of nodes
Nodes: Gl1, Gl2, Gl4, Gn1, Gn2
Edges: (Gn2,Gl4), (Gl4,Gn1)
Adjacency matrix:
      Gl1  Gl2  Gl4  Gn1  Gn2
Gl1    0    0    0    0    0
Gl2    0    0    0    0    0
Gl4    0    0    0    1    1
Gn1    0    0    1    0    0
Gn2    0    0    1    0    0
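The two representations are equivalent: the matrix above can be rebuilt from the node and edge sets. The sketch below uses the nodes and edges of this slide; the function name `adjacency_matrix` and its `directed` flag are choices made for the illustration.

```python
def adjacency_matrix(nodes, edges, directed=False):
    """Build a square 0/1 matrix from a node list and an edge list."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        if not directed:
            matrix[index[v]][index[u]] = 1  # undirected: symmetric matrix
    return matrix

nodes = ["Gl1", "Gl2", "Gl4", "Gn1", "Gn2"]
edges = [("Gn2", "Gl4"), ("Gl4", "Gn1")]
matrix = adjacency_matrix(nodes, edges)

for node, row in zip(nodes, matrix):
    print(node, *row)
```

For a valued graph the 1 would be replaced by the tie value; for a directed graph the matrix is no longer symmetric.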
53Format/languages and tools
- Graph languages
- Text based
- Ucinet, Dot, TGF
- XML based
- GraphML, RDF
- GXL, GML, XGMML
- Table based
- DB, CSV, XLS
- Image formats
- Static image
- .ps, .gif, .raw, .ppm, .bmp, .jpg, .png, ...
- Vector graphics/graph
- .svg, .emf, .gexf, .tif
- Interactive scripts
- JavaScript
Input
SNA Tools
Output(Visualisation)
Output(transformation)
Output(other)
Other information: index computation, sub-graph detection
54Tools formats interoperability