Dr. C. Lee Giles - PowerPoint PPT Presentation

About This Presentation
Title:

Dr. C. Lee Giles

Description:

Title: Introductory Lecture Author: CIS Last modified by: PSU Created Date: 9/24/2012 1:17:10 PM Document presentation format: On-screen Show (4:3) Company – PowerPoint PPT presentation

Slides: 129
Provided by: CIS136
Transcript and Presenter's Notes

Title: Dr. C. Lee Giles


1
IST 511 Information Management Information and
Technology Networks and Social Networks
  • Dr. C. Lee Giles
  • David Reese Professor, College of Information
    Sciences and Technology
  • Professor of Computer Science and Engineering
  • Professor of Supply Chain and Information Systems
  • The Pennsylvania State University, University
    Park, PA, USA
  • giles_at_ist.psu.edu
  • http://clgiles.ist.psu.edu

Special thanks to P. Tsaparas, P. Baldi, P.
Frasconi, P. Smyth, Michael Kearns, James Moody,
Anna Nagurney
2
Last Time
  • What is AI
  • Definitions
  • Theories/hypotheses
  • Why do we care
  • Impact on information science
  • Techniques used in information science
  • Wide variety of subfields

3
Today
  • What are networks
  • Definitions
  • Theories
  • Social networks
  • Why do we care
  • Impact on information science

4
Tomorrow
  • Topics used in IST
  • Machine learning
  • Text Information retrieval
  • Linked information and search
  • Encryption
  • Probabilistic reasoning
  • Digital libraries
  • Others?

5
Theories in Information Sciences
  • Enumerate some of these theories in this course.
  • Issues
  • Unified theory?
  • Domain of applicability
  • Conflicts
  • Theories here are mostly algorithmic
  • Quality of theories
  • Occam's razor
  • Subsumption of other theories
  • Theories of networks

6
Networked Life
  • Physical, social, biological, etc
  • Hybrids
  • Static vs dynamic
  • Local vs global
  • Measurable and reproducible

7
  • Points: power stations
  • Operated by companies
  • Connections embody business relationships
  • Food for thought
  • 2003 Northeast blackout

North American Power Grid
8
  • Purely biological network
  • Links are physical
  • Interaction is electrical

The Human Brain
9
The Premise of Networked Life
  • It makes sense to study these diverse networks
    together.
  • The Commonalities
  • Formation (distributed, bottom-up, organic, ...)
  • Structure (individuals, groups, overall
    connectivity, robustness)
  • Decentralization (control, administration,
    protection, ...)
  • Strategic Behavior (economic, free riding,
    Tragedies of the Commons)
  • An Emerging Science
  • Examining apparent similarities between many
    human and technological systems and organizations
  • Importance of network effects in such systems
  • How things are connected matters greatly
  • Details of interaction matter greatly
  • The metaphor of viral spread
  • Dynamics of economic and strategic interaction
  • Qualitative and quantitative can be very subtle
  • A revolution of measurement, theory, and breadth
    of vision

10
Who's Doing All This?
  • Computer & Information Scientists
  • Understand and design complex, distributed
    networks
  • View competitive decentralized systems as
    economies
  • Social Scientists, Behavioral Psychologists,
    Economists
  • Understand human behavior in simple settings
  • Revised views of economic rationality in humans
  • Theories and measurement of social networks
  • Physicists and Mathematicians
  • Interest and methods in complex systems
  • Theories of macroscopic behavior (phase
    transitions)
  • All parties are interacting and collaborating

11
Examples
  • Theories
  • Apps in all areas

12
The Networked Nature of Society
  • Networks as a collection of pairwise relations
  • Examples of (un)familiar and important networks
  • social networks
  • content networks
  • technological networks
  • biological networks
  • economic networks
  • The distinction between structure and dynamics

A network-centric overview of modern society.
13
  • Points are still machines but are associated
    with people
  • Links are still physical but may depend on
    preferences
  • Interaction: content exchange
  • Food for thought: free riding

Gnutella Peers
14
  • A purely technological network?
  • Points are physical machines
  • Links are physical wires
  • Interaction is electronic
  • What more is there to say?

Internet, Router Level
15
  • Points: sovereign nations
  • Links: exchange volume
  • A purely virtual network

Foreign Exchange
16
Contagion, Tipping and Networks
  • Epidemic as metaphor
  • The three laws of Gladwell
  • Law of the Few (connectors in a network)
  • Stickiness (power of the message)
  • Power of Context
  • The importance of psychology
  • Perceptions of others
  • Interdependence and tipping
  • Paul Revere, Sesame Street, Broken Windows, the
    Appeal of Smoking, and Suicide Epidemics

17
Graph & Network Theory
  • Networks of vertices and edges
  • Graph properties
  • cliques, independent sets, connected components,
    cuts, spanning trees,
  • social interpretations and significance
  • Special graphs
  • bipartite, planar, weighted, directed, regular,
  • Computational issues at a high level

18
What is a network?
  • Network: a collection of entities that are
    interconnected with links.
  • people that are friends
  • computers that are interconnected
  • web pages that point to each other
  • proteins that interact

19
Graphs
  • In mathematics, networks are called graphs, the
    entities are nodes, and the links are edges
  • Graph theory starts in the 18th century, with
    Leonhard Euler
  • The problem of Königsberg bridges
  • Since then graphs have been studied extensively.

Academic genealogy
20
Networks in the past
  • Graphs have been used in the past to model
    existing networks (e.g., networks of highways,
    social networks)
  • usually these networks were small
  • such a network can be studied directly: visual
    inspection can reveal a lot of information

21
Networks now
  • More and larger networks appear
  • Products of technological advancement
  • e.g., Internet, Web
  • Result of our ability to collect more, better,
    and more complex data
  • e.g., gene regulatory networks
  • Networks of thousands, millions, or billions of
    nodes
  • impossible to visualize

22
The internet map
23
Understanding large graphs
  • What are the statistics of real life networks?
  • Can we explain how the networks were generated?

24
Measuring network properties
  • Around 1999:
  • Watts and Strogatz, Collective dynamics of
    'small-world' networks
  • Faloutsos et al., On power-law relationships of
    the Internet topology
  • Kleinberg et al., The Web as a graph
  • Barabási and Albert, Emergence of scaling in
    random networks

25
Real network properties
  • Most nodes have only a small number of neighbors
    (degree), but there are some nodes with very high
    degree (power-law degree distribution)
  • scale-free networks
  • If a node x is connected to y and z, then y and z
    are likely to be connected
  • high clustering coefficient
  • Most nodes are just a few edges away on average.
  • small world networks
  • Networks from very diverse areas (from internet
    to biological networks) have similar properties
  • Is it possible that there is a unifying
    underlying generative process?

26
Generating random graphs
  • Classic graph theory model (Erdős–Rényi)
  • each edge is generated independently with
    probability p
  • Very well studied model, but
  • most vertices have about the same degree
  • the probability of two nodes being linked is
    independent of whether they share a neighbor
  • the average paths are short

27
Modeling real networks
  • Real life networks are not random
  • Can we define a model that generates graphs with
    statistical properties similar to those in real
    life?
  • a flurry of models for random graphs

28
Processes on networks
  • Why is it important to understand the structure
    of networks?
  • Epidemiology: viruses propagate much faster in
    scale-free networks
  • Vaccination of random nodes does not work, but
    targeted vaccination is very effective
  • Random sampling can be dangerous!

29
The basic random graph model
  • The measurements on real networks are usually
    compared against those on random networks
  • The basic $G_{n,p}$ (Erdős–Rényi) random graph model
  • $n$: the number of vertices
  • $0 \le p \le 1$
  • for each pair $(i,j)$, generate the edge $(i,j)$
    independently with probability $p$
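
The generation rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from the lecture: the function names `gnp_random_graph` and `average_degree` and the `seed` parameter are assumptions made here.

```python
import random
from itertools import combinations

def gnp_random_graph(n, p, seed=None):
    """Return the edge list of a G(n, p) random graph on vertices 0..n-1."""
    rng = random.Random(seed)
    # Every unordered pair (i, j) becomes an edge independently with probability p.
    return [(i, j) for i, j in combinations(range(n), 2) if rng.random() < p]

def average_degree(n, edges):
    # Each undirected edge adds one to the degree of both of its endpoints.
    return 2 * len(edges) / n

edges = gnp_random_graph(n=1000, p=0.01, seed=42)
print(average_degree(1000, edges))  # close to z = (n - 1) * p, i.e. roughly np
```

Because each of the n(n-1)/2 pairs is an independent coin flip, the degrees concentrate around the mean, which is the property the later slides contrast with power-law networks.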

30
Degree distributions
  • $f_k$: fraction of nodes with degree $k$
  • $p(k)$: probability that a randomly selected node
    has degree $k$
  • Problem: find the probability distribution that
    best fits the observed data

[Figure: frequency $f_k$ plotted against degree $k$]

31
Power-law distributions
  • The degree distributions of most real-life
    networks follow a power law
  • Right-skewed/Heavy-tail distribution
  • there is a non-negligible fraction of nodes that
    has very high degree (hubs)
  • scale-free: no characteristic scale, average is
    not informative
  • In stark contrast with the random graph model!
  • Poisson degree distribution, $z = np$
  • highly concentrated around the mean
  • the probability of very high degree nodes is
    exponentially small

$p(k) = C k^{-\alpha}$
32
Power-law signature
  • Power-law distribution gives a line in the
    log-log plot
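  • $\alpha$: power-law exponent (typically $2 \le \alpha \le 3$)

$\log p(k) = -\alpha \log k + \log C$
[Figure: frequency versus degree, and log frequency versus log degree, where the power law appears as a line with slope $-\alpha$]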
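
As a rough sketch of how the log-log signature can be checked numerically, the snippet below fits a straight line to $\log p(k)$ against $\log k$ by ordinary least squares. The data are synthetic and noise-free, and the values C = 0.5 and alpha = 2.5 are assumptions chosen for illustration. On real, noisy degree data a plain least-squares fit is only a quick diagnostic.

```python
import math

def loglog_slope(ks, pks):
    """Least-squares slope and intercept of log p(k) against log k."""
    xs = [math.log(k) for k in ks]
    ys = [math.log(p) for p in pks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Synthetic data p(k) = C * k^(-alpha) with assumed C = 0.5 and alpha = 2.5.
ks = range(1, 101)
pks = [0.5 * k ** -2.5 for k in ks]
slope, intercept = loglog_slope(ks, pks)
print(-slope, math.exp(intercept))  # recovers alpha and C up to rounding error
```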
33
Examples of degree distribution for power laws
Taken from Newman 2003
34
A random graph example
35
Exponential distribution
  • Observed in some technological or collaboration
    networks
  • Identified by a line in the log-linear plot

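$p(k) = \lambda e^{-\lambda k}$
$\log p(k) = -\lambda k + \log \lambda$
[Figure: log frequency versus degree, where the exponential appears as a line with slope $-\lambda$]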
36
Average/Expected degree
  • For random graphs, $z = np$
  • For power-law distributed degree
  • if $\alpha > 2$, it is a constant
  • if $\alpha \le 2$, it diverges

37
Maximum degree
  • For random graphs, the maximum degree is highly
    concentrated around the average degree z
  • For power law graphs

38
Collective Statistics (M. Newman 2003)
39
Clustering coefficient
  • In graph theory, a clustering coefficient is a
    measure of the degree to which nodes in a graph
    tend to cluster together.
  • Evidence suggests that in most real-world
    networks, and in particular social networks,
    nodes tend to create tightly knit groups
    characterized by a relatively high density of
    ties (Holland and Leinhardt, 1971; Watts and
    Strogatz, 1998).
  • In real-world networks, the likelihood that two
    neighbors of a node are themselves tied tends to
    be greater than the average probability of a tie
    randomly established between two nodes (Holland
    and Leinhardt, 1971; Watts and Strogatz, 1998).

40
Clustering (Transitivity) coefficient
  • Measures the density of triangles (local
    clusters) in the graph
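  • Two different ways to measure it, $C^{(1)}$ and $C^{(2)}$
  • $C^{(1)}$, the ratio of the means:
    $C^{(1)} = \frac{\sum_i (\text{triangles centered at node } i)}{\sum_i (\text{triples centered at node } i)}$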

41
Example: undirected graph
[Figure: five-node graph with edges 1-2, 1-3, 2-3, 3-4, 3-5]
Triangles: one each centered at nodes 1, 2, 3
Triples: none centered at nodes 4, 5
  node 1: 213
  node 2: 123
  node 3: 134, 135, 234, 235, 132, 435
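
The two clustering coefficients can be computed for this five-node example with a short script. The formula images did not survive in the transcript, so the sketch assumes the standard definitions from Newman (2003): $C^{(1)}$ is total triangles over total triples, and $C^{(2)}$ is the average of the per-node ratios, with nodes of degree below two assigned a coefficient of 0. The edge list is read off the triples listed on the slide.

```python
from fractions import Fraction
from itertools import combinations

# Edges of the example: a triangle 1-2-3 plus nodes 4 and 5 attached to node 3.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5)]

neighbors = {}
for u, v in edges:
    neighbors.setdefault(u, set()).add(v)
    neighbors.setdefault(v, set()).add(u)

def triples_centered_at(i):
    # Connected triples centered at i: unordered pairs of i's neighbors.
    k = len(neighbors[i])
    return k * (k - 1) // 2

def triangles_centered_at(i):
    # Triples centered at i whose two end nodes are themselves linked.
    return sum(1 for a, b in combinations(neighbors[i], 2) if b in neighbors[a])

def local_clustering(i):
    t = triples_centered_at(i)
    # Assumed convention: nodes with fewer than two neighbors get C_i = 0.
    return Fraction(triangles_centered_at(i), t) if t else Fraction(0)

nodes = sorted(neighbors)
# C1: ratio of the means (all triangles over all triples).
c1 = Fraction(sum(triangles_centered_at(i) for i in nodes),
              sum(triples_centered_at(i) for i in nodes))
# C2: mean of the ratios (average of the per-node coefficients).
c2 = sum(local_clustering(i) for i in nodes) / len(nodes)
print(c1, c2)  # 3/8 13/30
```

The two values differ because $C^{(2)}$ gives the low-degree nodes 1 and 2 the same weight as the hub, node 3.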
42
Clustering (Transitivity) coefficient
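  • Clustering coefficient for node $i$:
    $C_i = \frac{\text{triangles centered at node } i}{\text{triples centered at node } i}$
  • $C^{(2)}$, the mean of the ratios: $C^{(2)} = \frac{1}{n} \sum_i C_i$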

43
Example
  • The two clustering coefficients give different
    measures
  • $C^{(2)}$ increases with nodes with low degree

[Figure: five-node example graph with nodes 1 to 5]
44
Collective Statistics (M. Newman 2003)
45
Clustering coefficient for random graphs
  • The probability of two of your neighbors also
    being neighbors is p, independent of local
    structure
  • clustering coefficient $C = p$
  • when $z$ is fixed, $C = z/n = O(1/n)$

46
The C(k) distribution
  • The C(k) distribution is supposed to capture the
    hierarchical nature of the network
  • when constant: no hierarchy
  • when power-law: hierarchy

$C(k)$: average clustering coefficient of nodes with degree $k$
[Figure: $C(k)$ plotted against degree $k$]
47
Milgram's small world experiment
  • Letters were handed out to people in Nebraska to
    be sent to a target in Boston
  • People were instructed to pass on the letters to
    someone they knew on first-name basis
  • The letters that reached the destination followed
    paths of length around 6
  • Six degrees of separation (play of John Guare)
  • Also
  • The Kevin Bacon game
  • The Erdös number
  • Small world project: http://smallworld.columbia.edu/index.html

48
Measuring the small world phenomenon
  • $d_{ij}$: length of the shortest path between $i$ and $j$
  • Diameter
  • Characteristic path length
  • Harmonic mean
  • Also, distribution of all shortest paths
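
The formulas for these measures are missing from the transcript. The sketch below assumes the usual definitions: the diameter is the largest $d_{ij}$, the characteristic path length is the average of $d_{ij}$ over all pairs, and the harmonic mean is the reciprocal of the average of $1/d_{ij}$. The function names and the small path graph are hypothetical, and the code assumes an undirected, connected graph.

```python
from collections import deque
from itertools import combinations

def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def path_statistics(adj):
    """Diameter, characteristic path length, and harmonic mean distance."""
    dist = {v: bfs_distances(adj, v) for v in adj}
    # One entry per unordered pair; raises KeyError if the graph is disconnected.
    d = [dist[i][j] for i, j in combinations(adj, 2)]
    diameter = max(d)
    characteristic = sum(d) / len(d)
    harmonic = len(d) / sum(1 / x for x in d)
    return diameter, characteristic, harmonic

# Hypothetical example: the path graph 0 - 1 - 2 - 3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(path_statistics(adj))
```

One reason to prefer the harmonic form is that it stays well defined on disconnected graphs if unreachable pairs are given $1/d_{ij} = 0$; the sketch does not implement that extension.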

49
Collective Statistics (M. Newman 2003)
50
Is the path length enough?
  • Random graphs have diameter
  • $d = \log n / \log\log n$ when $z = \omega(\log n)$
  • Short paths should be combined with other
    properties
  • ease of navigation
  • high clustering coefficient

51
Degree correlations
  • Do high degree nodes tend to link to high degree
    nodes?
  • Pastor-Satorras et al.
  • plot the mean degree of the neighbors as a
    function of the degree

52
Degree correlations
  • Newman
  • compute the correlation coefficient of the
    degrees of the two endpoints of an edge
  • assortative/disassortative
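
A minimal sketch of the edge-based correlation coefficient, assuming an undirected graph given as an edge list. The function name and the star-graph example are hypothetical. Each edge is counted in both directions so that the two "ends" are treated symmetrically; a positive value indicates assortative mixing and a negative value disassortative mixing.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every undirected edge in both directions.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean = sum(xs) / n  # xs and ys hold the same multiset, so they share a mean
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return cov / var  # raises ZeroDivisionError when all degrees are equal

# Hypothetical star graph: the hub links only to leaves, so mixing is disassortative.
star = [(0, 1), (0, 2), (0, 3), (0, 4)]
print(degree_assortativity(star))  # -1.0
```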

53
Collective Statistics (M. Newman 2003)
54
Connected components
  • For undirected graphs, the size and distribution
    of the connected components
  • is there a giant component?
  • For directed graphs, the size and distribution of
    strongly and weakly connected components

55
Network Resilience
  • Study how the graph properties change when
    performing random or targeted node deletions
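
One property commonly tracked in such studies is the size of the largest connected component. The sketch below, with a hypothetical hub-and-spoke network and an assumed function name, contrasts deleting an arbitrary leaf with a targeted deletion of the highest-degree node.

```python
def largest_component_size(adj, removed=frozenset()):
    """Size of the largest connected component after deleting `removed` nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best

# Hypothetical network: node 0 is a hub linked to the leaves 1..9.
adj = {0: list(range(1, 10)), **{i: [0] for i in range(1, 10)}}
hub = max(adj, key=lambda v: len(adj[v]))   # targeted deletion picks the hub

print(largest_component_size(adj))          # 10: intact network
print(largest_component_size(adj, {1}))     # 9: losing one leaf barely matters
print(largest_component_size(adj, {hub}))   # 1: losing the hub shatters the network
```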

56
Social Networks
  • A social network is a social structure of people,
    related (directly or indirectly) to each other
    through a common relation or interest
  • Social network analysis (SNA) is the study of
    social networks to understand their structure and
    behavior

(Source: Freeman, 2000)
57
Social Network Theory
  • Metrics of social importance in a network
  • degree, closeness, between-ness, clustering
  • Local and long-distance connections
  • SNT universals
  • small diameter
  • clustering
  • heavy-tailed distributions
  • Models of network formation
  • random graph models
  • preferential attachment
  • affiliation networks
  • Examples from society, technology and fantasy

58
The Web as a Network
  • Empirical web structure and components
  • Web and blog communities
  • Web search
  • hubs and authorities
  • the PageRank algorithm
  • The Main Streets and dark alleys of the web

The algorithmic and social implications of
network structure.
59
Towards Rationality: Emergence of Global from
Local
  • Beyond the dynamics of transmission
  • Context, motivation and influence
  • The madness/wisdom of crowds
  • thresholds and cascades
  • mathematical models of tipping
  • the market for lemons
  • private preferences and global segregation

60
Interdependent Security and Networks
  • Security investment and Tragedies of the Commons
  • Catastrophic events: you can only die once
  • Fire detectors, airline security, Arthur
    Andersen, ...

Blending network, behavior and dynamics.
61
Network Economics
  • Buying and selling on a network
  • Modeling constraints on trading partners
  • Local imbalances of supply and demand
  • Preferential attachment, price variation, and the
    distribution of wealth

The effects of network structure on economic
outcomes.
62
Modern Financial Markets
  • Stock market networks
  • correlation of returns
  • Market microstructure
  • limit and market orders
  • order books and electronic crossing networks
  • network, connectivity and data issues
  • Quantitative trading
  • VWAP trading, market making
  • limit order power laws
  • Herd behavior in trading
  • Economic theory and financial markets
  • Behavioral economics and finance
  • Impacts of the Internet on financial markets

A study of the network that runs the world.
63
Definition of Social Networks
  • A social network is a set of actors that may
    have relationships with one another. Networks can
    have few or many actors (nodes), and one or more
    kinds of relations (edges) between pairs of
    actors. (Hanneman, 2001)

64
History (based on Freeman, 2000)
  • 17th century: Spinoza developed first model
  • 1937: J.L. Moreno introduced sociometry; he also
    invented the sociogram
  • 1948: A. Bavelas founded the group networks
    laboratory at MIT; he also specified centrality
65
Social Networking
  • Large number of sites available throughout the
    world

66
History (based on Freeman, 2000)
  • 1949: A. Rapoport developed a probability-based
    model of information flow
  • 50s and 60s: distinct research by individual
    researchers
  • 70s: field of social network analysis emerged
  • New features in graph theory: more general
    structural models
  • Better computer power: analysis of complex
    relational data sets

67
Foundations Theory
Structural Analysis: from method and metaphor to
theory and substance.
H. White: "The presently existing, largely
categorical descriptions of social structure have
no solid theoretical grounding; furthermore,
network concepts may provide the only way to
construct a theory of social structure." (p. 25)
Integration of large-scale social systems
Form vs. content
68
Introduction
  • Social network analysis is
  • a set of relational methods for systematically
    understanding and identifying connections among
    actors. SNA
  • is motivated by a structural intuition based on
    ties linking social actors
  • is grounded in systematic empirical data
  • draws heavily on graphic imagery
  • relies on the use of mathematical and/or
    computational models.
  • Social Network Analysis embodies a range of
    theories relating types of observable social
    spaces and their relation to individual and group
    behavior.

69
Introduction
What are social relations?
A social relation is anything that links two
actors. Examples include: Kinship, Co-membership,
Friendship, Talking with, Love, Hate, Exchange,
Trust, Coauthorship, Fighting
70
Introduction
What properties of relations are studied?
The substantive topics cross all areas of
sociology. But we can identify types of
questions that social network researchers ask:
1) Social network analysts often study
relations as systems. That is, what is of
interest is how the pattern of relations among
actors affects individual behavior or system
properties.
71
Introduction
High Schools as Networks
72
(No Transcript)
73
(No Transcript)
74
Introduction
Why do Networks Matter?
Local vision
75
Introduction
Why do Networks Matter?
Local vision
76
Representation of Social Networks
  • Matrices
  • Graphs

Ann
Sue
Nick
Rob
77
Graphs - Sociograms (based on Hanneman, 2001)
  • Labeled circles represent actors
  • Line segments represent ties
  • Graph may represent one or more types of
    relations
  • Each tie can be directed or show co-occurrence
  • Arrows represent directed ties

78
Graphs - Sociograms (based on Hanneman, 2001)
  • Strength of ties
  • Nominal
  • Signed
  • Ordinal
  • Valued

79
Visualization Software: Krackplot
80
Connections
  • Size  
  • Number of nodes
  • Density
  • Number of ties that are present vs the number of
    ties that could be present
  • Out-degree
  • Sum of connections from an actor to others
  • In-degree
  • Sum of connections to an actor
  • Diameter
  • Greatest geodesic (least) distance between any
    actor and another
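
The first four measures can be read directly off a list of directed ties. In the sketch below only the actor names come from the deck; the ties themselves are hypothetical, and density is computed against the N(N-1) possible directed ties.

```python
# Hypothetical ties (who sends a tie to whom); only the names appear in the deck.
ties = {"Ann": ["Sue", "Rob"], "Sue": ["Ann"], "Nick": ["Ann", "Sue"], "Rob": []}

size = len(ties)                                        # number of nodes
tie_count = sum(len(targets) for targets in ties.values())
density = tie_count / (size * (size - 1))               # present / possible directed ties

# Out-degree: ties an actor sends. In-degree: ties an actor receives.
out_degree = {actor: len(targets) for actor, targets in ties.items()}
in_degree = {actor: 0 for actor in ties}
for targets in ties.values():
    for target in targets:
        in_degree[target] += 1

print(size, tie_count, density)
print(out_degree)
print(in_degree)
```

The in-degrees and out-degrees always sum to the same total, because every tie has exactly one sender and one receiver.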

81
Some Measures of Distance
  • Walk (path)
  • A sequence of actors and relations that begins
    and ends with actors
  • Geodesic distance (shortest path)
  • The number of relations in the shortest possible
    walk from one actor to another
  • Maximum flow
  • The number of different actors in the
    neighborhood of a source that lead to pathways to
    a target

82
Some Measures of Power (based on Hanneman, 2001)
  • Degree (indegree, outdegree)
  • Sum of connections from or to an actor
  • Closeness centrality
  • Distance of one actor to all others in the
    network
  • Betweenness centrality
  • Number that represents how frequently an actor is
    between other actors' geodesic paths
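
These three measures can be sketched for a small undirected, connected graph. The definitions used here are assumptions, since the slide gives none: closeness is taken as the reciprocal of an actor's total distance to all others, and betweenness sums, over every pair of other actors, the fraction of their geodesics that pass through the actor. The function names and the star example are hypothetical.

```python
from collections import deque

def bfs(adj, s):
    """Distances from s and the number of geodesics from s to each vertex."""
    dist, sigma = {s: 0}, {s: 1}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]  # geodesics to w that pass through u
    return dist, sigma

def centralities(adj):
    info = {v: bfs(adj, v) for v in adj}
    degree = {v: len(adj[v]) for v in adj}
    closeness = {v: 1 / sum(info[v][0].values()) for v in adj}
    betweenness = {v: 0.0 for v in adj}
    nodes = list(adj)
    for i, s in enumerate(nodes):
        for t in nodes[i + 1:]:
            d_st, sigma_st = info[s][0][t], info[s][1][t]
            for v in nodes:
                if v in (s, t):
                    continue
                # v lies on an s-t geodesic exactly when the distances add up.
                if info[s][0][v] + info[v][0][t] == d_st:
                    betweenness[v] += info[s][1][v] * info[v][1][t] / sigma_st
    return degree, closeness, betweenness

# Hypothetical star: actor 0 sits between every pair of the other actors.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
degree, closeness, betweenness = centralities(star)
print(degree[0], closeness[0], betweenness[0])
```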

83
Cliques and Social Roles (based on Hanneman,
2001)
  • Cliques
  • Sub-set of actors
  • More closely tied to each other than to actors
    who are not part of the sub-set
  • Social roles
  • Defined by regularities in the patterns of
    relations among actors

84
SNA applications
  • Many new unexpected applications plus many of the
    old ones
  • Marketing
  • Advertising
  • Economic models and trends
  • Political issues
  • Organization
  • Services to social network actors
  • Travel guides
  • Jobs
  • Advice
  • Human capital analysis and predictions
  • Medical
  • Epidemiology
  • Defense (terrorist networks)

85
Examples of Applications (based on Freeman, 2000)
  • Visualizing networks
  • Studying differences of cultures and how they can
    be changed
  • Intra- and interorganizational studies
  • Spread of illness, especially HIV

86
Foundations Data
The units of interest in a network are the
combined sets of actors and their relations. We
represent actors with points and relations with
lines.
Actors are referred to variously as: nodes,
vertices, actors, or points
Relations are referred to variously as: edges,
arcs, lines, ties
Example
[Figure: example network with nodes a, b, c, d, e]
87
Foundations Data
  • Social network data consist of two linked
    classes of data
  • a) Nodes: information on the individuals (actors,
    nodes, points, vertices)
  • Network nodes are most often people, but can be
    any other unit capable of being linked to another
    (schools, countries, organizations,
    personalities, etc.)
  • The information about nodes is what we usually
    collect in standard social science research
    demographics, attitudes, behaviors, etc.
  • Often includes dynamic information about when the
    node is active
  • b) Edges: information on the relations among
    individuals (lines, edges, arcs)
  • Records a connection between the nodes in the
    network
  • Can be valued or binary, directed (arcs) or
    undirected (edges)
  • One-mode (direct ties between actors) or two-mode
    (actors share membership in an organization)
  • Includes the times when the relation is active
  • Graph theory notation: $G(V,E)$

88
Foundations Data
In general, a relation can be: (1) binary or
valued; (2) directed or undirected.
The social process of interest will often
determine what form your data take. Almost all
of the techniques and measures we describe can be
generalized across data format.
89
Foundations Data and social science
Global-Net
90
Foundations Data
We can examine networks across multiple levels:
1) Ego-network
- Have data on a respondent (ego) and the people
they are connected to (alters). Example:
terrorist networks
- May include estimates of connections among
alters
2) Partial network
- Ego networks plus some amount of tracing to
reach contacts of contacts
- Something less than full account of connections
among all pairs of actors in the relevant
population
- Example: CDC contact tracing data
91
Foundations Data
We can examine networks across multiple levels:
  • 3) Complete or Global data
  • - Data on all actors within a particular
    (relevant) boundary
  • - Never exactly complete (due to missing data),
    but boundaries are set
  • Example: coauthorship data among all writers in
    the social sciences, friendships among all
    students in a classroom

92
Foundations Graphs
Working with pictures. No standard way to draw a
sociogram: which are equal?
93
Foundations Graphs
Network visualization helps build intuition, but
you have to keep the drawing algorithm in mind.
Tree-based layouts: most effective for very
sparse, regular graphs. Very useful when
relations are strongly directed, such as
organization charts, internet connections, ...
Spring-embedder layouts: most effective with
graphs that have a strong community structure
(clustering, etc.). Provide a very clear
correspondence between social distance and
plotted distance.
Two images of the same network
94
Foundations Graphs
Network visualization helps build intuition, but
you have to keep the drawing algorithm in mind.
Spring-embedder layouts
Tree-based layouts
Two images of the same network
95
Foundations Graphs
Network visualization helps build intuition, but
you have to keep the drawing algorithm in mind.

Hierarchy / tree models: use optimization
routines to add meaning to the Y-axis of the
plot. This makes it possible to easily see who
is most central because of who is on the top of
the figure. Usually includes some routine for
minimizing line-crossing.

Spring embedder layouts: work on an analogy to a
physical system. Ties connecting a pair have
springs that pull them together. Unconnected
nodes have springs that push them apart. The
resulting image reflects the balance of these two
features. This usually creates a correspondence
between physical closeness and network distance.
96
Foundations Graphs
97
Foundations Graphs
Using colors to code attributes makes it simpler
to compare attributes to relations. Here we can
assess the effectiveness of two different
clustering routines on a school friendship
network.
98
Foundations Graphs
As networks increase in size, the effectiveness
of a point-and-line display diminishes: you run
out of plotting dimensions. You can still gain
insight by treating the overlap that results
from a space-based layout as information. Here
you see the clustering evident in movie
co-starring for about 8000 actors.
99
Foundations Graphs
This figure contains over 29,000 social science
authors. The two dense regions reflect different
topics.
100
Foundations Graphs
As networks increase in size, the effectiveness
of a point-and-line display diminishes, because
you simply run out of plotting dimensions. I've
found that you can still get some insight by
using the overlap that results from a
space-based layout as information. This figure
contains over 29,000 social science authors. The
two dense regions reflect different topics.
101
Foundations Graphs and time
Adding time to social networks is also
complicated: you run out of space to put time in
most network figures. One solution: animate the
network - make a movie! Here we see streaming
interaction in a classroom, where the teacher
(yellow square) has trouble maintaining
order. Made with the SoNIA software program
(McFarland and Bender-deMoll).
102
Foundations Methods
Graphs are cumbersome to work with analytically,
though there is a great deal of good work to be
done on using visualization to build network
intuition. Recommendation: use layouts that
optimize on the feature you are most interested
in.
103
A graph is vertices and edges
  • A graph is vertices joined by edges
  • i.e. A set of vertices V and a set of edges E
  • A vertex is defined by its name or label
  • An edge is defined by the two vertices which it
    connects, plus optionally
  • An order of the vertices (direction)
  • A weight (usually a number)
  • Two vertices are adjacent if they are connected
    by an edge
  • A vertex's degree is the number of its edges

104
Directed graph (digraph)
  • Each edge is an ordered pair of vertices, to
    indicate direction
  • Lines become arrows
  • The indegree of a vertex is the number of
    incoming edges
  • The outdegree of a vertex is the number of
    outgoing edges

[Figure: weighted directed graph on vertices E, M, B, L, P with edge weights 210, 450, 190, 60, 200, 130]
105
Traversing a graph (1)
  • A path between two vertices exists if you can
    traverse along edges from one vertex to another
  • A path is an ordered list of vertices
  • length: the number of edges in the path
  • cost: the sum of the weights on each edge in the
    path
  • cycle: a path that starts and finishes at the
    same vertex
  • An acyclic graph contains no cycles

106
Traversing a graph (2)
  • Undirected graphs are connected if there is a
    path between any pair of vertices
  • Digraphs are usually either densely or sparsely
    connected
  • Densely: the ratio of number of edges to number
    of vertices is large
  • Sparsely: the above ratio is small

[Figure: graph on vertices E, M, B, L, P]
107
Two graph representations: adjacency matrix and
adjacency list
  • Adjacency matrix
  • n vertices need an n x n matrix (where n = |V|,
    i.e. the number of vertices in the graph) - can
    store as an array
  • Each position in the matrix is 1 if the two
    vertices are connected, or 0 if they are not
  • For weighted graphs, the position in the matrix
    is the weight
  • Adjacency list
  • For each vertex, store a linked list of adjacent
    vertices
  • For weighted graphs, include the weight in the
    elements of the list
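
Both representations can be built from the same edge list. In this sketch the vertex labels come from the deck's example, but the edges and weights are hypothetical, since the figure did not survive. A Python list of `(neighbor, weight)` pairs stands in for the linked list mentioned above.

```python
vertices = ["E", "M", "B", "L", "P"]
index = {name: i for i, name in enumerate(vertices)}

# Hypothetical weighted, undirected edges (not taken from the slide's figure).
edges = [("E", "M", 5), ("M", "B", 3), ("B", "L", 2), ("L", "P", 4)]

n = len(vertices)
matrix = [[0] * n for _ in range(n)]        # n x n array, 0 means "no edge"
adj_list = {name: [] for name in vertices}  # one list of neighbors per vertex

for u, v, w in edges:
    # Undirected: record the edge in both directions.
    matrix[index[u]][index[v]] = w
    matrix[index[v]][index[u]] = w
    adj_list[u].append((v, w))
    adj_list[v].append((u, w))

for row in matrix:
    print(row)
print(adj_list)
```

For a directed graph, only the `u -> v` assignment would be kept, and the matrix would no longer be symmetric.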

108
Representing an unweighted, undirected graph
(example)
[Figure: adjacency matrix and adjacency list with vertex indices 0 = E, 1 = M, 2 = B, 3 = L, 4 = P]
109
Representing a weighted, undirected graph
(example)
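[Figure: adjacency matrix and adjacency list with vertex indices 0 = E, 1 = M, 2 = B, 3 = L, 4 = P and edge weights 210, 450, 190, 60, 200, 130]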
110
Representing an unweighted, directed graph
(example)
[Figure: adjacency matrix and adjacency list with vertex indices 0 = E, 1 = M, 2 = B, 3 = L, 4 = P]
111
Comparing the two representations
  • Space complexity
  • Adjacency matrix is $O(|V|^2)$
  • Adjacency list is $O(|V| + |E|)$
  • E is the number of edges in the graph
  • Static versus dynamic representation
  • An adjacency matrix is a static representation
    the graph is built in one go, and is difficult
    to alter once built
  • An adjacency list is a dynamic representation
    the graph is built incrementally, thus is more
    easily altered during run-time

112
Algorithms involving graphs
  • Graph traversal
  • Shortest path algorithms
  • In an unweighted graph shortest length between
    two vertices
  • In a weighted graph smallest cost between two
    vertices
  • Minimum Spanning Trees
  • Using a tree to connect all the vertices at
    lowest total cost

113
Graph traversal algorithms
  • When traversing a graph, we must be careful to
    avoid going round in circles!
  • We do this by marking the vertices which have
    already been visited
  • Breadth-first search uses a queue to keep track
    of which adjacent vertices might still be
    unprocessed
  • Depth-first search keeps trying to move forward
    in the graph, until reaching a vertex with no
    outgoing edges to unmarked vertices
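
Both traversals can be sketched over an adjacency-list dictionary. The function names and the four-vertex digraph are hypothetical. Marking a vertex when it is first discovered is what prevents the cycle d -> a from causing an endless loop.

```python
from collections import deque

def breadth_first(adj, start):
    """Visit vertices level by level, using a queue of discovered vertices."""
    marked = {start}
    order = [start]
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in marked:
                marked.add(w)
                order.append(w)
                queue.append(w)
    return order

def depth_first(adj, start, marked=None, order=None):
    """Keep moving forward until no unmarked neighbor remains, then back up."""
    marked = set() if marked is None else marked
    order = [] if order is None else order
    marked.add(start)
    order.append(start)
    for w in adj[start]:
        if w not in marked:
            depth_first(adj, w, marked, order)
    return order

# Hypothetical digraph containing the cycle a -> b -> d -> a.
adj = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["a"]}
print(breadth_first(adj, "a"))  # ['a', 'b', 'c', 'd']
print(depth_first(adj, "a"))    # ['a', 'b', 'd', 'c']
```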

114
Shortest path (unweighted)
  • The problem Find the shortest path from a vertex
    v to every other vertex in a graph
  • The unweighted path measures the number of edges,
    ignoring the edges' weights (if any)

115
Shortest unweighted path: simple algorithm
For a vertex $v$, $d_v$ is the distance between a
starting vertex and $v$
  • 1 Mark all vertices with $d_v = \infty$
  • 2 Select a starting vertex $s$, and set $d_s = 0$, and
    set shortest = 0
  • 3 For all vertices $v$ with $d_v$ = shortest, scan
    their adjacency lists for vertices $w$ where $d_w$ is
    infinity
  • For each such vertex $w$, set $d_w$ to shortest + 1
  • 4 Increment shortest and repeat step 3, until
    there are no such vertices $w$
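
The four steps translate almost line for line into Python. This is a sketch of the simple version described on the slide, which rescans every vertex at each level; the function name and example graph are hypothetical. A queue-based breadth-first search computes the same distances with less rescanning.

```python
import math

def unweighted_shortest_paths(adj, s):
    d = {v: math.inf for v in adj}   # step 1: every distance starts at infinity
    d[s] = 0                         # step 2: the start vertex is at distance 0
    shortest = 0
    while True:
        found = False
        for v in adj:                # step 3: vertices at the current distance
            if d[v] == shortest:
                for w in adj[v]:
                    if d[w] == math.inf:
                        d[w] = shortest + 1
                        found = True
        if not found:                # step 4: stop once no new vertex was reached
            return d
        shortest += 1

# Hypothetical digraph; vertex "f" cannot be reached from "a".
adj = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": [], "f": ["a"]}
print(unweighted_shortest_paths(adj, "a"))
```

Vertices that are never reached keep the distance infinity, which doubles as an "unreachable" marker.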

116
Foundations Build a socio-matrix
From pictures to matrices
Undirected, binary
Directed, binary
117
Foundations Methods
From matrices to lists
Arc List
a b
b a
b c
c b
c d
c e
d c
d e
e c
e d
Adjacency List
a: b
b: a c
c: b d e
d: c e
e: c d
118
Foundations Basic Measures
Basic Measures. For greater detail,
see http://www.analytictech.com/networks/graphtheory.htm
Volume
The first measure of interest is the simple
volume of relations in the system, known as
density, which is the average relational value
over all dyads. Under most circumstances, it is
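calculated as
$\Delta = \frac{\sum_{i \ne j} X_{ij}}{N(N-1)}$
where $X$ is the adjacency matrix and $N$ is the number of actors (directed ties).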
119
Foundations Basic Measures
Volume
At the individual level, volume is the number of
relations, sent or received, equal to the row and
column sums of the adjacency matrix.
Node   In-Degree   Out-Degree
a      1           1
b      2           1
c      1           3
d      2           0
e      1           2
Mean   7/5         7/5
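
The row and column sums can be checked with a few lines of Python. The sociomatrix below is hypothetical: the original figure is not in the transcript, so the ties were chosen only to reproduce the degrees in the table. The sketch assumes the convention that `X[i][j] = 1` means actor i sends a tie to actor j, so row sums give out-degree and column sums give in-degree.

```python
actors = ["a", "b", "c", "d", "e"]

# Hypothetical sociomatrix consistent with the degree table (rows send, columns receive).
X = [
    [0, 1, 0, 0, 0],  # a -> b
    [0, 0, 1, 0, 0],  # b -> c
    [0, 1, 0, 1, 1],  # c -> b, d, e
    [0, 0, 0, 0, 0],  # d sends nothing
    [1, 0, 0, 1, 0],  # e -> a, d
]

out_degree = {a: sum(row) for a, row in zip(actors, X)}        # row sums
in_degree = {a: sum(col) for a, col in zip(actors, zip(*X))}   # column sums

n = len(actors)
total_ties = sum(map(sum, X))
mean_degree = total_ties / n
density = total_ties / (n * (n - 1))  # average tie value over all ordered dyads

print(in_degree, out_degree)
print(mean_degree, density)
```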
120
Foundations Data
Basic Measures
Reachability
Indirect connections are what make networks
systems. One actor can reach another if there is
a path in the graph connecting them.
[Figure: example graphs with nodes a, b, c, d, e, f]
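
Reachability for every pair of actors can be derived from the adjacency matrix. The sketch below uses Warshall's transitive-closure algorithm, which is one standard way to do this and is not named in the lecture; the function name and the three-actor chain are hypothetical.

```python
def reachability(matrix):
    """Warshall's algorithm: reach[i][j] is True if some path leads from i to j."""
    n = len(matrix)
    reach = [[bool(x) for x in row] for row in matrix]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # A path i -> k followed by a path k -> j gives a path i -> j.
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    return reach

# Hypothetical directed chain a -> b -> c (rows send, columns receive).
chain = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
reach = reachability(chain)
print(reach[0][2])  # True: a reaches c indirectly through b
print(reach[2][0])  # False: no path leads back from c to a
```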
121
Foundations Basic Matrix Operations
One of the key advantages to storing networks as
matrices is that we can use all of the tools from
linear algebra on the socio-matrix. Some of the
basic matrix manipulations that we use are as
follows:
  • Definition
  • A matrix is any rectangular array of numbers. We
    refer to the matrix dimension as the number of
    rows and columns

(5 x 5)
(5x2)
(5x1)
122
Foundations Basic Matrix Operations
Matrix operations work on the elements of the
matrix in particular ways. To do so, the
matrices must be conformable, meaning that their
sizes allow the operation. For addition (+),
subtraction (-), or elementwise multiplication
(*), both matrices must have the same number of
rows and columns. For these operations, each cell
of the result is the operation applied to the
corresponding cell values.
A (entries in row order): 1 3 4 7 2 5
B (entries in row order): 2 3 7 1 0 4
A + B: 3 6 11 8 2 9
A - B: -1 0 -3 6 2 1
A * B (elementwise): 2 9 28 7 0 20
Multiplication by a scalar, 3A: 3 9 12 21 6 15
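The cell-by-cell operations can be checked with a short sketch. The transcript does not preserve the shape of A and B, so the 3 x 2 layout used here is an assumption; the helper name is also hypothetical.

```python
import operator

# Values from the slide, laid out as 3 x 2 matrices (shape assumed).
A = [[1, 3], [4, 7], [2, 5]]
B = [[2, 3], [7, 1], [0, 4]]

def elementwise(op, left, right):
    """Apply op to corresponding cells; both matrices must be the same size."""
    same_rows = len(left) == len(right)
    if not same_rows or any(len(r) != len(s) for r, s in zip(left, right)):
        raise ValueError("matrices are not conformable")
    return [[op(x, y) for x, y in zip(r, s)] for r, s in zip(left, right)]

total = elementwise(operator.add, A, B)
difference = elementwise(operator.sub, A, B)
product = elementwise(operator.mul, A, B)
scaled = [[3 * x for x in row] for row in A]  # multiplication by a scalar
```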
123
Matrix properties
  • Addition contributes to the actors' relations
  • Multiplication sums over a trait.
  • Negative values can occur
  • (friend, don't care, enemy) = (1, 0, -1)
  • Interpret operations carefully

124
Foundations Basic Matrix Operations
The transpose (written ' or T) of a matrix reverses the
row and column dimensions: (A^T)_ij = A_ji. So an M x
N matrix becomes an N x M matrix.
[ a b ] T
[ c d ]    =  [ a c e ]
[ e f ]       [ b d f ]
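A small sketch of the transpose for a matrix stored as a list of rows. The function name is an assumption, and the letters stand in for arbitrary cell values.

```python
def transpose(matrix):
    """Return the matrix with rows and columns swapped."""
    return [list(column) for column in zip(*matrix)]

M = [["a", "b"],
     ["c", "d"],
     ["e", "f"]]      # 3 x 2
M_t = transpose(M)    # 2 x 3
```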

125
Foundations Basic Matrix Operations
The matrix multiplication (x) of two matrices
involves all elements of the matrix, and will
often result in a matrix of new dimensions. In
general, to be conformable, the inner dimensions
of the two matrices must match. So A(3x2) x B(2x3) =
C(3x3), but A(3x3) x B(2x3) is not defined
(actually a tensor). Substantively, adding
names to the dimensions will help us keep track
of what the resulting multiplications mean. So
multiplying (send x receive) x (send x receive) =
(send x receive), giving us the two-step
distances (the sender's recipients' receivers).
126
Foundations Basic Matrix Operations
The multiplication of two matrices A(m x n) and B(n x q)
results in C(m x q)
[ a b ]   [ e f ]   [ ae+bg  af+bh ]
[ c d ] x [ g h ] = [ ce+dg  cf+dh ]

[ a b ]                 [ ag+bj  ah+bk  ai+bl ]
[ c d ] x [ g h i ]  =  [ cg+dj  ch+dk  ci+dl ]
[ e f ]   [ j k l ]     [ eg+fj  eh+fk  ei+fl ]
 (3x2)     (2x3)                (3x3)
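The row-by-column rule can be sketched in plain Python as follows. The numeric matrices are illustrative values chosen for this example, and the function name is an assumption.

```python
def matmul(left, right):
    """Multiply an m x n matrix by an n x q matrix, giving an m x q matrix."""
    inner = len(right)
    if any(len(row) != inner for row in left):
        raise ValueError("inner dimensions do not match")
    columns = list(zip(*right))
    # Cell (i, j) is the sum of products of row i of left and column j of right.
    return [[sum(x * y for x, y in zip(row, column)) for column in columns]
            for row in left]

A = [[1, 3], [4, 7], [2, 5]]    # 3 x 2
B = [[2, 7, 0], [3, 1, 4]]      # 2 x 3
C = matmul(A, B)                # 3 x 3
```

Trying to multiply a 3 x 3 matrix by a 2 x 3 matrix raises an error in this sketch, mirroring the conformability requirement on the inner dimensions.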
127
Foundations Basic Matrix Operations
The powers (square, cube, etc.) of a matrix are
just the matrix times itself that many
times: A^2 = AA and A^3 = AAA. We often use
matrix multiplication to find types of people one
is tied to, since the 1 in the adjacency matrix
effectively captures just the people each row is
connected to.
128
Foundations Data
Basic Measures
Reachability
The distance from one actor to another is the
shortest path between them, known as the geodesic
distance. If there is at least one path
connecting every pair of actors in the graph, the
graph is connected and is called a component.
Two paths are independent if they only have the
two end-nodes in common. If a graph has two
independent paths between every pair, it is
biconnected, and called a bicomponent. Similarly
for three paths, four, etc.
129
Foundations Data
Calculate reachability through matrix
multiplication. (see p.162 of WF)
Total of directed walks for power n
Minimal distance from one node to another
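One way to sketch both ideas: entry (i, j) of the n-th power of the adjacency matrix counts the directed walks of length n from i to j, and the smallest n for which that entry is positive is the distance from i to j. The matrix is the 5-node directed X used elsewhere in the lecture; the function names and the use of None for unreachable pairs are assumptions.

```python
def matmul(left, right):
    """Plain row-by-column product of two square matrices."""
    columns = list(zip(*right))
    return [[sum(x * y for x, y in zip(row, column)) for column in columns]
            for row in left]

def geodesic_distances(adjacency):
    """Distance = smallest power n whose (i, j) entry is positive."""
    n = len(adjacency)
    dist = [[0 if i == j else None for j in range(n)] for i in range(n)]
    power = [row[:] for row in adjacency]      # starts as A^1
    for length in range(1, n):                 # a shortest path has < n steps
        for i in range(n):
            for j in range(n):
                if dist[i][j] is None and power[i][j] > 0:
                    dist[i][j] = length
        power = matmul(power, adjacency)       # move on to A^(length + 1)
    return dist

X = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
]
walks_of_length_2 = matmul(X, X)
distances = geodesic_distances(X)
```

Pairs left as None are unreachable; node d, for example, sends no ties and so reaches no one.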
130
Foundations Data
Mixing patterns
Matrices make it easy to look at mixing patterns:
connections among types of nodes. Simply
multiply an indicator of category by the
adjacency matrix.
[Figure: example graph with nodes a through f in two categories, B and G]
Ties   to B   to G
B      4      2
G      2      6
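A sketch of the indicator-matrix calculation, computing C^T A C where C has one column per category. The figure itself is not preserved in the transcript, so the edges and category assignments below are hypothetical, chosen only so that the counts match the slide (4, 2, 2, 6).

```python
def matmul(left, right):
    columns = list(zip(*right))
    return [[sum(x * y for x, y in zip(row, column)) for column in columns]
            for row in left]

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

nodes = ["a", "b", "c", "d", "e", "f"]
category = {"a": "B", "b": "B", "c": "B", "d": "G", "e": "G", "f": "G"}
# Hypothetical undirected edges: 2 within B, 3 within G, 2 between groups.
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d"), ("a", "f")]

index = {name: k for k, name in enumerate(nodes)}
A = [[0] * len(nodes) for _ in nodes]
for u, v in edges:
    A[index[u]][index[v]] = 1
    A[index[v]][index[u]] = 1   # undirected: store both directions

groups = ["B", "G"]
# Indicator matrix: one row per node, one column per category.
C = [[1 if category[name] == g else 0 for g in groups] for name in nodes]

mixing = matmul(matmul(transpose(C), A), C)   # rows send, columns receive
```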
131
Foundations Data
Matrix manipulations allow you to look at
direction of ties, and distinguish symmetric
from asymmetric ties.
To transform an asymmetric graph to a symmetric
graph, add it to its transpose.
X            X^T          X + X^T
0 1 0 0 0    0 1 0 0 0    0 2 0 0 0
1 0 0 0 0    1 0 1 0 0    2 0 1 0 0
0 1 0 1 1    0 0 0 0 1    0 1 0 1 2
0 0 0 0 0    0 0 1 0 1    0 0 1 0 1
0 0 1 1 0    0 0 1 0 0    0 0 2 1 0

Max Sym      Min Sym
0 1 0 0 0    0 1 0 0 0
1 0 1 0 0    1 0 0 0 0
0 1 0 1 1    0 0 0 0 1
0 0 1 0 1    0 0 0 0 0
0 0 1 1 0    0 0 1 0 0
Interpretation?
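The three symmetrizations can be sketched as below, using the matrix X from this slide. The function name is an assumption. One common reading, offered here as an interpretation rather than the lecture's stated answer: a 2 in X + X^T marks a reciprocated tie, Max Sym keeps a tie if it exists in either direction, and Min Sym keeps only reciprocated ties.

```python
def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

def combine(op, left, right):
    """Apply op cell by cell to two matrices of the same size."""
    return [[op(x, y) for x, y in zip(r, s)] for r, s in zip(left, right)]

X = [
    [0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
]
X_t = transpose(X)

summed = combine(lambda x, y: x + y, X, X_t)   # X + X^T
max_sym = combine(max, X, X_t)                 # tie in either direction
min_sym = combine(min, X, X_t)                 # tie in both directions
```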
132
Graphs / matrices
  • Analysis of social structure
  • Visualization tools
  • Other methods
  • Statistics
  • Power laws
  • Bayesian graphs

133
Global web: Winners take all
  • Pr(page has k inlinks) ∝ k^(-γ), γ ≈ 2.1
  • Popular few receive disproportionate share of
    links
  • More traffic, higher prob. of SE indexing, higher SE ranking
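To get a feel for how skewed such a distribution is, here is a rough sketch of a power law with the exponent 2.1 quoted on the slide, normalized over a finite range of inlink counts. The cutoff of 1000 and the function name are assumptions made only for illustration.

```python
def power_law_pmf(exponent, max_k):
    """Probabilities proportional to k^(-exponent) for k = 1 .. max_k."""
    weights = [k ** -exponent for k in range(1, max_k + 1)]
    total = sum(weights)
    return [w / total for w in weights]

pmf = power_law_pmf(2.1, 1000)

# Probability of exactly one inlink, and of 100 or more inlinks.
p_one = pmf[0]
p_hundred_plus = sum(pmf[99:])
```

Under these assumptions most pages sit at the low end while a small fraction has very many inlinks, which is the "winners take all" pattern named in the slide title.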

134
Category-specific web: Winners don't (quite)
take all
  • All US company homepages
  • Histogram with exponentially growing buckets
    (constant on log scale)
  • Strong deviation from pure power law
  • Unimodal (approx. log-normal) body, power law tail
  • Less skewed: many fare well against mode

Pennock, Giles, et al., PNAS 2002
135
Applications of Networked Life
  • Social structure in organizations
  • Economic and business behavior
  • Epidemiology
  • Information discovery
  • Design and robustness of networks

136
SNA disciplines
  • More diverse than expected!
  • Sociology
  • Political Science
  • Business
  • Economics
  • Sciences
  • Computer science
  • Information science
  • Others?

137
Citation Graph - social network
Bibliographic coupling
cites
Paper
cocitation
is cited by
Note that journal citations nearly always refer
to earlier work.
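Both relations can be computed from a citation matrix, as a sketch under the usual definitions: two papers are bibliographically coupled when they cite the same work, and co-cited when the same later paper cites both. The four-paper matrix below is hypothetical.

```python
def matmul(left, right):
    columns = list(zip(*right))
    return [[sum(x * y for x, y in zip(row, column)) for column in columns]
            for row in left]

def transpose(matrix):
    return [list(column) for column in zip(*matrix)]

# cites[i][j] = 1 if paper i cites paper j (hypothetical data).
cites = [
    [0, 0, 1, 1],   # paper 0 cites papers 2 and 3
    [0, 0, 1, 1],   # paper 1 cites papers 2 and 3
    [0, 0, 0, 1],   # paper 2 cites paper 3
    [0, 0, 0, 0],   # paper 3 cites nothing
]

# coupling[i][j]: number of references shared by papers i and j.
coupling = matmul(cites, transpose(cites))
# cocitation[i][j]: number of papers that cite both i and j.
cocitation = matmul(transpose(cites), cites)
```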
138
Graphical Analysis of Hyperlinks on the Web -
social network?
This page links to many other pages (hub)
Many pages link to this page (authority)
[Figure: example link graph with pages 1 through 6]
139
PageRank as social network analysis
  • PageRank (social rank) flowing from pages to
    the pages they cite.

[Figure: example rank values (.1, .09) flowing along links]
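The idea of rank flowing from a page to the pages it links to can be sketched with a simple iteration. This is an illustrative version only: the damping factor 0.85, the iteration count, the treatment of pages without outgoing links, and the four-page example are all assumptions, not details from the lecture.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Iteratively pass each page's rank along its outgoing links."""
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, targets in links.items():
            if targets:
                # Split this page's rank evenly over the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical link structure: page -> pages it links to.
links = {1: [2, 3], 2: [3], 3: [1], 4: [3]}
ranks = pagerank(links)
```

In this example page 3, which three pages link to, ends up with the highest rank, and page 4, which no page links to, keeps only the baseline share.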
140
SN on the web - services
  • A social network service uses software to build
    online social networks for communities of people
    who share interests and activities or who are
    interested in exploring the interests and
    activities of others. (Wikipedia)
  • Friending
  • Facebook
  • MySpace
  • LinkedIn
  • Second Life
  • This will only increase!
  • Large complex, heterogeneous networks
  • Latour's actor-network model
  • Different entities connect actors
  • Coauthorship network connected by papers

141
example networks of people and articles (e.g.,
citation and co-authorship networks)
this image is from the system ReferralWeb by
Henry Kautz et al. at AT&T Research
http://foraker.research.att.com/refweb/version2/RefWeb.html
142
SNA and the Web 2.0
  • Wikis
  • Blogs
  • Folksonomies
  • Collaboratories
  • What next?

143
Computational SNA Models
  • New models are emerging
  • Very large network analysis is possible!
  • Deterministic - algebraic
  • Early models still useful
  • Statistical
  • Descriptive using many features
  • Diameter, betweenness, etc.
  • Probabilistic graphs
  • Generative
  • Creates SNA based on agency, documents,
    geography, etc.
  • Community discovery and prediction

144
Graphical models
  • Modeling the document generation

Existing three generative models. Three
variables in the generation of documents are
considered (1) authors (2) words and (3)
topics (latent variable)
145
Theories used in SNA
  • Graph/network
  • Heterogeneous graphs
  • Hypergraphs
  • Probabilistic graphs
  • Economics/game theory
  • Optimization
  • Visualization/HCI
  • Actor/Network
  • Many more

146
Future of social networks?
  • Top End User Predictions for 2010 - Gartner
  • By 2012, Facebook will become the hub for social
    networks integration and Web socialization.
  • Internet marketing will be regulated by 2015,
    controlling more than $250 billion in Internet
    marketing spending worldwide.
  • By 2014, more than three billion of the world's
    adult population will be able to transact
    electronically via mobile and Internet
    technology.
  • By 2015, context will be as influential to mobile
    consumer services and relationships as search
    engines are to the Web.
  • By 2013, mobile phones will overtake PCs as the
    most common Web access device worldwide.

147
Open questions
  • Scalability
  • Data acquisition and data rights
  • Search (socialnetworkrank?)
  • CollabSeer
  • Trust
  • Heterogeneous network analysis
  • Business models!

148
What next?
  • More personal search
  • Mobile search
  • Specialty search
  • Freshness search
  • ?

"Search as a problem is only 5% solved." Udi
Manber (first Yahoo, then Amazon, now Google)
149
Social networks vs social networking
  • Social networks are links of actors and their
    relationships usually represented as a graph or
    network
  • Social networking is the actual implementation of
    social networks in the digital world or media
  • A social network service focuses on building and
    reflecting of social networks or social relations
    among people, e.g., who share interests and/or
    activities. A social network service essentially
    consists of a representation of each user (often
    a profile), his/her social links, and a variety
    of additional services. Most social network
    services are web based and provide means for
    users to interact over the internet, such as
    e-mail and instant messaging.

150
Facebook vs Google
151
Web 2.0
  • A perceived second generation of web development
    and design, that aims to facilitate
    communication, secure information sharing,
    interoperability, and collaboration on the World
    Wide Web.
  • Web 2.0 concepts have led to the development and
    evolution of web-based communities, hosted
    services, and applications such as
    social-networking sites, video-sharing sites,
    wikis, blogs, and folksonomies.

152
Social Media
  • Information content created by people using
    highly accessible and scalable publishing
    technologies that is intended to facilitate
    communications, influence and interaction with
    peers and with public audiences, typically via
    the Internet and mobile communications networks.
  • The term most often refers to activities that
    integrate technology, telecommunications and
    social interaction, and the construction of
    words, pictures, videos and audio.
  • Businesses also refer to social media as
    user-generated content (UGC) or
    consumer-generated media (CGM).

153
Social Media on Web 2.0
  • Multimedia
  • Photo-sharing Flickr
  • Video-sharing YouTube
  • Audio-sharing imeem
  • Entertainment
  • Virtual Worlds Second Life
  • Online Gaming World of Warcraft
  • News/Opinion
  • Social news Digg, Reddit
  • Reviews Yelp, epinions
  • Communication
  • Microblogs Twitter, Pownce
  • Events Evite
  • Social Networking Services
  • Facebook, LinkedIn, MySpace

154
Top 10 Websites
  • Google
  • Microsoft
  • Yahoo!
  • AOL
  • Wikimedia
  • eBay
  • CBS
  • Fox Interactive Media
  • Amazon Sites
  • Facebook
  • Source: comScore, as of August 2008

155
Top 10 Websites - 2009
156
Top 10 Websites - Google
Feb 2011
But not everyone agrees
157
But not everyone agrees
158
Top 10 Social Media Websites
Other opinions
159
Top Websites MPM
160
Social Network Service
  • A social network service focuses on building
    online communities of people who share interests
    and/or activities, or who are interested in
    exploring the interests and activities of others.
  • Most social network services are web based and
    provide a variety of ways for users to interact,
    such as e-mail and instant messaging services.

161
Once Popular Social Networking Sites by Location
  • North America
  • MySpace and Facebook, Nexopia (mostly in Canada)
  • South and Central America
  • Orkut, Facebook and Hi5
  • Europe
  • Bebo, Facebook, Hi5, MySpace, Tagged, Xing and
    Skyrock
  • Asia and Pacific
  • Friendster, Orkut, Xiaonei and Cyworld

162
Usage of Social Network
163
Types of Social Networking Services
  • Profile-based
  • Content-based
  • White-label
  • Multi-user Virtual environments
  • Mobile
  • Micro-blogging
  • Social Search

164
Profile-based
  • Primarily organized around members' profile pages:
    pages that mainly consist of information about
    an individual member, including the person's
    picture and details of interests, likes and
    dislikes.
  • Bebo, Facebook and MySpace are all good examples
    of profile-based services.

165
Content-based
  • In these services, the user's profile remains an
    important way of organizing connections, but
    plays a secondary role to the posting of content.
  • Examples of content-based communities
  • YouTube.com for video sharing
  • Last.fm, in which the content is arranged by
    software that monitors and represents the music
    that users listen to.
  • Photo-sharing sites like Flickr.com.

166
White-label
  • Offer some group-building functionality, which
    allows users to form mini-communities within
    sites.
  • Example sites
  • PeopleAggregator
  • Ning
  • Colexn
  • Seem to be losing popularity

167
Multi-user Virtual Environment
  • Allow users to interact with each other's avatars
  • Users have profile cards, but their functional
    profiles are the characters they customize or
    build and control. Friends lists are usually
    private and not publicly shared or displayed.
  • Example Sites
  • Second Life
  • World of Warcraft

168
Mobile Social Networking
  • Many social networking sites, for example MySpace
    and Twitter, offer mobile phone versions of their
    services, allowing members to interact with their
    friends via their phones.
  • mobile-led and mobile-only communities include
    profiles and media-sharing just as with web-based
    social networking services.
  • MYUBO, for example, allows users to share and
    view video over mobile networks
  • MobiSNA, mobile video social networking

169
Micro-blogging
  • Allows you to publish short (140 characters,
    including spaces) messages publicly or within
    contact groups.
  • These services are designed to work as mobile
    services, but are popularly used on the web as
    well.
  • Example Services
  • Twitter
  • Jaiku

170
Social Search
  • Social search engines are an important web
    development which utilise the popularity of
    social networking services.
  • There are various kinds of social search engine,
    but sites like Wink and Spokeo generate results
    by searching across the public profiles of
    multiple social networking sites, allowing the
    creation of web-based dossiers on individuals.
  • This type of people search cuts across the
    traditional boundaries of social networking site
    membership, although any data retrieved should
    already be in the public domain.

171
Things you can do in a Social Network
  • Communicating with existing networks, making and
    developing friendships/contacts
  • Representing yourself online, creating and developing
    an online presence
  • Viewing content/finding information
  • Creating and customizing profiles
  • Authoring and uploading content
  • Adding and sharing content
  • Posting messages, public and private
  • Collaborating with other people

172
Future of social networks?
  • Tribes - Seth Godin
  • Internet mobbing
  • Will there be social networking wars?
  • Google
  • Facebook
  • LinkedIn
  • MySpace
  • Friendster
  • Build your own Ning
  • Borg

173
What we've covered
  • Networks
  • Physical, social, biological, etc
  • Hybrids
  • Static vs dynamic
  • Local vs global
  • Measurable and reproducible
  • Social networking

174
Questions
  • Role of networks in information science?
  • Is certain social networking bad for society?
  • Hurting our culture
  • Future of social networking?