Title: Exponential Random Graph Models (ERGM)
1 Exponential Random Graph Models (ERGM)
- Michael Beckman
- PAD777
- April 9, 2010
2 Introduction
- The purpose of ERGM, in a nutshell, is to describe parsimoniously the local selection forces that shape the global structure of a network.
- ERGM may then be used to understand a particular phenomenon or to simulate new random realizations of networks that retain the essential properties of the original. (Hunter et al. 2008)
- General characteristics of ERGM
- Single observation rather than successive waves
- Change statistics compare observed network to random realizations
- Still computes Markov or Markov-like statistics
- Can model both structural and attribute parameters
- Assumptions and constraints are important to estimations
- Improved SEs even where pseudolikelihood produces acceptable estimates
- Goodness of fit statistics are reliable
- Significant move towards true stochastic modeling of networks
3 Agenda
- Wasserman and Robins (2005) An Introduction to Random Graphs, Dependence Graphs, and p*
- Snijders (2002) Markov chain Monte Carlo estimation of ERGM
- Robins et al. (2007) Recent developments in exponential random graph (p*) models for social networks
- Hunter et al. (2008) A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks
- Morris et al. (2008) Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects
- Andrew (2009) Regional integration through contracting networks
4 Wasserman and Robins - Intro
- Wasserman and Robins (2005) An Introduction to Random Graphs, Dependence Graphs, and p*
- Historic development of p* distribution for Markov random graphs
- Frank and Strauss 1986
- Strauss and Ikeda 1990 (estimation of distribution parameters)
- Wasserman and Pattison 1996 (extend parameter assumptions)
- Wasserman and Robins 2005: family of models from dependence graphs
- Versus approximate autologistic regression (pseudo-likelihood)
- Standard network notation
- r = 1: single relation, dichotomous data
- Random variables, assumed interdependent
- Can use multivariate or valued relations
- Dependence graphs allow testing for independent elements in matrix X
5 Wasserman and Robins - Intro
- Model parameters estimated from three new arrays: converse, composition, and intersection of measured relations
6 Wasserman and Robins - Intro
- Consider the observed network as one realization from the set of all possible configurations
- Dependence graphs help distinguish among possible distributions, by identifying ties that are statistically independent
- Dependence graph: graph whose nodes are the random variables (possible ties) and whose edges signify pairs of random variables that are assumed to be conditionally dependent
7 Wasserman and Robins - Intro
- Three classes of dependence graphs
- Bernoulli: assumption of conditional independence for each pair of ties
- Empty graph, due to complete independence
- Conditional uniform distribution
- Dyadic dependence: assumes all dyads are statistically independent
- Dependence graph has an edge only between the two ties of each dyad (Xij and Xji)
- Basis for p1 model of Holland and Leinhardt (1977, 1981)
- General dependence graph: arbitrary edge set with general probability distribution; basis for p*
8 Wasserman and Robins - Intro
- Markov graphs and p*
- Any two relational ties associated if they involve same actor
- Observed network considered a realization x of random array X
- Dependence graph D is characterized by its complete subgraphs, or cliques
- Hammersley-Clifford theorem characterizes Pr(X = x) in the form of an exponential family of distributions
- Set of non-zero parameters depends on maximal cliques
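For reference, the exponential-family form that the Hammersley-Clifford theorem yields is commonly written as below. This is a sketch in standard notation rather than the slide's own formula: the symbols \kappa (normalizing constant), \lambda_A (parameter attached to clique A), and the sum over cliques A of D are assumptions about notation.

```latex
\Pr(X = x) \;=\; \frac{1}{\kappa}
  \exp\!\Big( \sum_{A \subseteq D} \lambda_A \prod_{(i,j) \in A} x_{ij} \Big),
\qquad
\kappa \;=\; \sum_{x'} \exp\!\Big( \sum_{A \subseteq D} \lambda_A \prod_{(i,j) \in A} x'_{ij} \Big)
```

Here \lambda_A is non-zero only when A is a clique of the dependence graph, which is why the maximal cliques determine the set of parameters.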
9 Wasserman and Robins - Intro
- The number of parameters to estimate can overwhelm the model, so constraints are needed
- Impose dependence assumptions on parameters
- Homogeneity, i.e., isomorphic configurations (e.g., MAN dyads) share a parameter
- Higher-order configurations typically set to zero (stars, triads, etc.)
- Constrained social settings
- Exact differentiation of log likelihood is mathematically challenging
- Pseudolikelihood measures of fit problematic
- MCMC: model degeneracy may be a problem
- MCMC is normally preferred; improved algorithms are available and/or being developed
10 Snijders MCMC Estimation
- Snijders (2002) Markov chain Monte Carlo Estimation of ERGM
- Random graph is a Markov graph if number of nodes is fixed, and non-incident edges are independent conditional upon rest of graph
- Exponential family of probability functions (p*): P{Y = y} = exp(θ'u(y) - ψ(θ))
- Where y is the adjacency matrix of a digraph, the sufficient statistic u(y) is any vector of statistics of the digraph, θ is the parameter vector, and ψ(θ) is the normalizing constant
- Pseudolikelihood estimator is not a function of the complete sufficient statistic u(Y), so not a suitable estimator
- Dahmström and Dahmström (1993) proposed MCMC
11 Snijders MCMC Estimation
- Random graph is a Markov graph if number of nodes is fixed, and non-incident edges are independent conditional upon rest of graph
- Gibbs sampling: each element Yij is redrawn from its conditional distribution, one element per draw, with all other elements left unchanged
- Assumes convergence as t → ∞
- Conditional distribution toggles between Yij = 1 and Yij = 0
- Can result in severe convergence problems
- Model may not simulate effects properly, or
- May result in an explosion of ties after significant stasis
- Bi-modal distribution results, consisting of high-density and low-density states or regimes
- Regime is defined as a subset of the outcome space
- Other regimes are possible (besides bi-modal)
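The Gibbs updating step can be sketched as follows. This is a minimal illustration, assuming a digraph model with only edge and reciprocity statistics; the function names and parameter values are hypothetical and not taken from Snijders (2002).

```python
import math
import random

def gibbs_sample_digraph(n, theta_edges, theta_recip, n_steps, seed=0):
    """Gibbs sampler for a digraph ERGM with edge and reciprocity statistics."""
    rng = random.Random(seed)
    y = [[0] * n for _ in range(n)]           # start from the empty digraph
    for _ in range(n_steps):
        i, j = rng.sample(range(n), 2)        # random ordered pair with i != j
        # Change statistics for setting Y_ij = 1 versus 0, all else fixed:
        # the edge count changes by 1, the mutual-dyad count by y_ji.
        log_odds = theta_edges + theta_recip * y[j][i]
        p_one = 1.0 / (1.0 + math.exp(-log_odds))
        y[i][j] = 1 if rng.random() < p_one else 0
    return y

def density(y):
    """Share of the n(n-1) possible directed ties that are present."""
    n = len(y)
    return sum(map(sum, y)) / (n * (n - 1))

# Illustrative parameter values only.
y = gibbs_sample_digraph(n=20, theta_edges=-1.0, theta_recip=1.0, n_steps=50_000)
print(round(density(y), 3))
```

Because this particular model is dyad-independent, the chain settles quickly; the convergence problems listed above arise once dependence terms such as twostars or triangles are added.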
12 Snijders MCMC Estimation
- Reciprocity p* model: edges and reciprocity
- Assumes dyadic independence
- Probabilities calculated for MAN
- Independence assumption precludes the explosion effect
- Twostar p* model: edges and out-twostars
- Rows in adjacency matrix are statistically independent
- If total number of ties is fixed, number of out-twostars is a linear function of out-degree variance
- Combined reciprocity and twostar p* model: density, reciprocity, out-twostar
- Transforms digraph into its complement
- Changes Yij to (1 - Yij)
- Density must be set to 0.5
- Simulates graphs equal to, less than or greater than 0.5 density
- Can result in the explosion effect
- In effect, results are determined by initial state (high or low density)
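The link between out-twostars and out-degree variance can be checked numerically. The sketch below assumes the population variance (dividing by n) and a small random digraph chosen purely for illustration; the helper names are hypothetical.

```python
import random

def out_twostars(y):
    """Count out-twostars: pairs of ties sent by the same node."""
    return sum(d * (d - 1) // 2 for d in map(sum, y))

def twostars_from_variance(y):
    """Same count, written as a linear function of the out-degree variance."""
    n = len(y)
    degrees = [sum(row) for row in y]
    m = sum(degrees)                                  # total number of ties
    mean = m / n
    var = sum((d - mean) ** 2 for d in degrees) / n   # population variance
    # sum_i C(d_i, 2) = (1/2) sum d_i^2 - (1/2) sum d_i, and sum d_i^2 = n (var + mean^2)
    return 0.5 * n * var + 0.5 * n * mean ** 2 - 0.5 * m

rng = random.Random(1)
n = 8
y = [[int(i != j and rng.random() < 0.3) for j in range(n)] for i in range(n)]
print(out_twostars(y), twostars_from_variance(y))
```

With n and the number of ties m held fixed, only the variance term changes from graph to graph, which is the linear relationship the slide refers to.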
13 Snijders MCMC Estimation
- Gibbs sampling algorithm
- For every two outcomes, there is a positive probability to go from one outcome to the other in finite steps, but
- It is possible one regime is dominant, so that the sojourn time in one regime before switching to the other is practically infinite, so
- Initial state determines outcome with 0.5 probability (coin toss)
- Three problems arise
- Bi-modal distribution is undesirable for single network observation
- Convergence with two regimes can be so slow that generating a random draw is practically impossible
- Expected values of sufficient statistics are extremely sensitive to parameter values, causing instability of estimation
- Other iteration procedures have been proposed and tested
14 Snijders MCMC Estimation
- Detailed balance technique
- Set of all adjacency matrices Yg
- Results in unique stationary distribution
- Small updating steps: one element Yij per step, as with Gibbs sampling
- Cell being updated is random, rather than deterministic
- Referred to as mixing, versus cycling
- Metropolis-Hastings algorithm: changes Yij to (1 - Yij), all other ties constant
- Updates more frequently than Gibbs, so more efficient
- Dyadic or triplet updating steps: update several elements per step
- Dyad or triplets chosen randomly
- Groupwise updating
- Slower to converge
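Detailed balance for the single-toggle Metropolis-Hastings step can be verified by brute force on a very small graph. The sketch below assumes an undirected 4-node graph with edge and triangle statistics and made-up parameter values; it enumerates all 64 graphs and checks that the probability flow between each pair of neighbouring states is symmetric.

```python
import itertools
import math

NODES = 4
DYADS = list(itertools.combinations(range(NODES), 2))   # 6 undirected dyads
THETA = (-0.5, 0.8)                                     # (edges, triangles), illustrative

def stats(state):
    """Sufficient statistics u(y) = (number of edges, number of triangles)."""
    edges = {d for d, bit in zip(DYADS, state) if bit}
    triangles = sum(
        1 for a, b, c in itertools.combinations(range(NODES), 3)
        if {(a, b), (a, c), (b, c)} <= edges
    )
    return (len(edges), triangles)

def weight(state):
    """Unnormalized probability exp(theta' u(y))."""
    return math.exp(sum(t * u for t, u in zip(THETA, stats(state))))

def mh_transition(state, k):
    """Toggle dyad k: return the proposed state and the transition probability."""
    proposal = state[:k] + (1 - state[k],) + state[k + 1:]
    accept = min(1.0, weight(proposal) / weight(state))
    return proposal, accept / len(DYADS)     # dyad k is picked with probability 1/6

states = list(itertools.product((0, 1), repeat=len(DYADS)))
kappa = sum(weight(s) for s in states)       # normalizing constant by enumeration

max_violation = 0.0
for s in states:
    for k in range(len(DYADS)):
        s2, p_forward = mh_transition(s, k)
        _, p_backward = mh_transition(s2, k)
        flow_forward = weight(s) / kappa * p_forward
        flow_backward = weight(s2) / kappa * p_backward
        max_violation = max(max_violation, abs(flow_forward - flow_backward))
print(max_violation)
```

Enumeration is feasible only for toy graphs, since the number of states doubles with every dyad; that is the reason simulation is needed for networks of realistic size.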
15 Snijders MCMC Estimation
- Large updating steps: update Yij from 0 to 1 or vice versa in blocks
- Biggest step is converting graph to its complement (inversion)
- Satisfies the detailed balance equation
- May be appropriate for bimodal distributions
- Inversion may reduce variance in estimation (conditioning)
- Fixed density: only digraphs with given number of ties are drawn
- Random undirected graphs: applied to half matrix of unique elements
- ML estimation not easily applied to exponential random graphs, due to problematic calculation for complex models
- Pseudolikelihood estimates can be good, but standard errors are too low
- Markov chain Monte Carlo estimates
- Monte Carlo simulation of Markov graph estimates moments
- Moments are used to estimate parameter effects for a neighborhood
16 Snijders MCMC Estimation
- MCMC Newton-Raphson algorithm and Robbins-Monro algorithm are similar
- Robbins-Monro algorithm: three phases
- Phase 1: estimate diagonal matrix using derivative at initial parameter estimate
- Phase 2: iteratively determines provisional estimation values, leads quickly to solution of moment equation
- Large steps can lead to instability
- Phase 3: parameter value is kept constant, then large number of steps used to check validity of moment equation
- Use of MC with Robbins-Monro yields, in theory, convergence with probability 1
- Snijders recommends use of inversion steps for models with triplet counts
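The three phases can be sketched on the simplest possible case. This is a heavily simplified illustration, not Snijders' implementation: it assumes an edges-only model (where exact sampling stands in for MCMC), a scalar scaling factor instead of a diagonal matrix, and a plain 1/n gain sequence. All names and values are hypothetical.

```python
import math
import random

def simulate_edge_count(theta, n_dyads, rng):
    """Draw u(Y) for an edges-only model; exact sampling stands in for MCMC."""
    p = 1.0 / (1.0 + math.exp(-theta))
    return sum(rng.random() < p for _ in range(n_dyads))

def robbins_monro(u_obs, n_dyads, theta0=0.0, n_iter=2000, seed=0):
    rng = random.Random(seed)
    # Phase 1 (simplified): scaling from the derivative of E[u(Y)] at theta0,
    # which for this model equals n_dyads * p * (1 - p).
    p0 = 1.0 / (1.0 + math.exp(-theta0))
    scale = n_dyads * p0 * (1.0 - p0)
    # Phase 2: stochastic approximation with decreasing gain 1 / n.
    theta = theta0
    for n in range(1, n_iter + 1):
        u_sim = simulate_edge_count(theta, n_dyads, rng)
        theta -= (1.0 / n) * (u_sim - u_obs) / scale
    # Phase 3: hold theta fixed and check the moment equation E[u(Y)] = u_obs.
    sims = [simulate_edge_count(theta, n_dyads, rng) for _ in range(2000)]
    mean = sum(sims) / len(sims)
    sd = math.sqrt(sum((s - mean) ** 2 for s in sims) / (len(sims) - 1))
    return theta, (mean - u_obs) / sd

# 10 nodes, undirected: 45 dyads, of which 15 are observed ties (made-up data).
theta_hat, t_ratio = robbins_monro(u_obs=15, n_dyads=45)
print(theta_hat, math.log(15 / 30), t_ratio)
```

For this model the maximum likelihood estimate is known in closed form (the log-odds of the observed density), so the printed values give a direct check on the stochastic approximation.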
17 Robins et al Recent Developments
- Robins et al. (2007) Recent developments in exponential random graph (p*) models for social networks
- Technically, MCMC estimation does not converge when the model is degenerate or near degenerate
- Problem is more acute as network size grows larger
- Inclusion of suitable constraints on parameters allows for estimation
- Parameters then provide information on structural effects
- Recall from Snijders: problem of bimodal distribution/model degeneration
- Gradual increase in triangle parameter does not lead to gradual increase in graph triangulation, so inclusion of star/triangle parameters does not overcome problem
18 Robins et al Recent Developments
19 Robins et al Recent Developments
- Inclusion of higher-order structures
- Alternating k-stars
- Alternating k-triangles
- Alternating independent two-paths
- Alternating k-stars: technically the only structure that is still a Markov random graph
- Assumption allows stars up to (n-1)
- Recall in previous models, higher-order stars normally set to 0
- In alternating k-star, higher-order stars are allowed
- Impact of higher-order stars is gradually diminished
- Essentially, there is weighting of structure from simple to complex
- Allows for interesting inference regarding network structure
- Positive parameter indicates hubs in node structure
- Negative parameter indicates smaller variance in degree (decentralized)
20 Robins et al Recent Developments
- Interpreting alternating k-star models
- Positive parameter: tendency toward large number of low-degree nodes, and small number of high-degree nodes
- Node degree may become saturated
- Increase in popularity plateaus: additional ties do not add value
- Indicative of a loose core-periphery structure
- Alternation between positive and negative signs helps prevent the graph distribution from being forced to empty or complete graphs (à la Snijders et al. 2006)
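The alternating k-star statistic can be computed directly from a degree sequence. The sketch below assumes the usual definition (k-star counts with alternating signs, each divided by a further power of a weight lam) and compares it with the equivalent geometrically weighted degree form. The degree sequence and the value lam = 2 are made up for illustration.

```python
from math import comb

def alternating_kstar(degrees, lam=2.0):
    """Sum over k = 2..n-1 of (-1)^k S_k / lam^(k-2), with S_k = sum_i C(d_i, k)."""
    n = len(degrees)
    total = 0.0
    for k in range(2, n):
        s_k = sum(comb(d, k) for d in degrees)   # number of k-stars in the graph
        total += (-1) ** k * s_k / lam ** (k - 2)
    return total

def alternating_kstar_closed_form(degrees, lam=2.0):
    """Equivalent form as a geometrically weighted function of the degrees."""
    return lam ** 2 * sum((1 - 1 / lam) ** d + d / lam - 1 for d in degrees)

degrees = [5, 3, 3, 2, 1, 1, 1]   # hypothetical degree sequence for 7 nodes
print(alternating_kstar(degrees), alternating_kstar_closed_form(degrees))
```

The closed form makes the diminishing weight visible: each additional tie at an already high-degree node adds less to the statistic than the one before it.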
21 Robins et al Recent Developments
- Alternating k-triangles introduces conditional dependence
- In short, two possible edges in a graph, Yrs and Yuv, for distinct nodes r, s, u, v, are assumed to be conditionally dependent if Ysu = Yrv = 1.
- In other words, if the two possible edges in the graph were actually observed, they would create a 4-cycle.
- Defines social circuit dependence
- Chance of Yrs is conditionally dependent on presence of Yuv
- Snijders et al. (2006) combine k-triangles with Markov dependence
- K-triangle is combination of k individual triangles that share one edge (base)
- Ties from the base's endpoints to the shared partner nodes are the triangle sides
- Conditionally dependent structure, IF either Markov configuration (shared node), or social circuit configuration (4-cycle)
22 Robins et al Recent Developments
23 Robins et al Recent Developments
- Interpreting k-triangles
- Positive parameter provides evidence of transitivity effects
- Also can suggest core-periphery structure, but due to triangulation rather than popularity influence
- More of a structural effect than an attribute effect
- I.e., outcome of the triangulation process
- Alternating k-twopaths
- Lower order structure
- Combine with k-triangles
- Distinguish tendency to form ties at base versus side of triangle
- Side edges absent base edges indicates precondition to transitivity
- Presence of base edge indicates transitive closure
- Combination of parameters can indicate pressure towards closure
24 Robins et al Recent Developments
- Other possible parameters
25 Robins et al Recent Developments
- Estimating parameters
- MCMC is preferred method, when available
- When model converges, simulation produces distribution of graphs in which observed graph is typical for all effects
- Reliable standard errors
- Snijders et al. (2006) conditioned on edges
- No density parameter
- Diminishes degeneracy problem with moderate impact on other parameters
- Robins et al. find that, at least for smaller networks, conditioning on edges may not be needed
26 Robins et al Recent Developments
- Modeling with SIENA
- Output of estimates, standard error, t-ratio for estimate (how well model converges)
- t-ratio close to zero: good convergence of model
- Large ratios may indicate model has not converged, or is degenerate
- For non-degenerate models, absolute value of less than 0.1 is converged
- Other tests in SIENA
- Hysteresis analysis
- Simulate from estimates and compare with observed graph
- Modeling with statnet
- Newton-Raphson algorithm
- Fewer simulation runs, then weights graphs for estimating
- Incorporates advances from Metropolis-Hastings
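The convergence t-ratio compares each observed statistic with its distribution in graphs simulated from the estimates. The sketch below assumes the usual definition (mean of simulated values minus observed value, divided by the standard deviation of the simulated values); the numbers are invented and the function is not part of SIENA.

```python
import statistics

def convergence_t_ratios(observed, simulated):
    """Per statistic: (mean of simulated - observed) / sd of simulated."""
    ratios = []
    for k, obs in enumerate(observed):
        column = [row[k] for row in simulated]
        ratios.append((statistics.mean(column) - obs) / statistics.stdev(column))
    return ratios

observed = [30, 13]                                   # hypothetical (edges, triangles)
simulated = [[29, 11], [31, 13], [30, 12], [32, 14], [28, 10]]
ratios = convergence_t_ratios(observed, simulated)
converged = all(abs(t) < 0.1 for t in ratios)         # the 0.1 rule of thumb
print(ratios, converged)
```

In this made-up example the edge count is reproduced on average, but the simulated graphs contain too few triangles, so the second ratio falls outside the 0.1 threshold.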
27 Robins et al Recent Developments
28 Robins et al Recent Developments
Comparing pseudolikelihood to MCMC: UCINET datasets, SIENA modeling
29 Hunter et al Package to Fit
- Hunter et al. (2008) A Package to Fit, Simulate and Diagnose Exponential-Family Models for Networks
- Implementing ERGM in R/statnet
- Specify ERGM
- Approximate/exact MLE
- Goodness of fit tests
- The purpose of ERGM, in a nutshell, is to describe parsimoniously the local selection forces that shape the global structure of a network.
- ERGM may then be used to understand a particular phenomenon or to simulate new random realizations of networks that retain the essential properties of the original.
30 Hunter et al Package to Fit
- Implementing ERGM in R/statnet: variables
- Endogenous: result of structure
- Exogenous: attribute based (can serve as predictors)
- Attributes can be treated as functions of nodal covariates
- Statistics depend on attribute and relationship information
- Change statistics: recall we are comparing conditional distribution toggled between Yij = 1 and Yij = 0 (or some other Markov configuration)
- Particular choice g() of statistics
- Particular network y
- Particular pair of nodes (i,j)
- Seed can be specified for reproducibility
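A change statistic is the difference in g(y) between the network with tie (i,j) present and the same network with it absent; the conditional log-odds of the tie is then the parameter vector times that difference. The sketch below assumes an undirected network with g = (edges, triangles); the adjacency structure, parameter values, and function names are hypothetical and not the statnet API.

```python
import math

def change_statistics(adj, i, j):
    """g(y with i-j present) - g(y with i-j absent) for g = (edges, triangles)."""
    # Adding i-j adds one edge and closes one triangle per shared partner.
    shared_partners = len((adj[i] - {j}) & (adj[j] - {i}))
    return (1, shared_partners)

def conditional_tie_probability(adj, i, j, theta):
    """P(Y_ij = 1 | rest of network) = logistic(theta' * change statistics)."""
    log_odds = sum(t * d for t, d in zip(theta, change_statistics(adj, i, j)))
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical 5-node network stored as neighbour sets.
adj = {0: {1, 2, 4}, 1: {0, 2, 4}, 2: {0, 1, 3}, 3: {2}, 4: {0, 1}}
theta = (-1.0, 0.5)                                   # illustrative (edges, triangles)

print(change_statistics(adj, 0, 1))                   # (1, 2): nodes 2 and 4 are shared
print(change_statistics(adj, 3, 4))                   # (1, 0): no shared partners
print(conditional_tie_probability(adj, 0, 1, theta))  # 0.5, since -1.0 + 0.5 * 2 = 0
```

The three ingredients on the slide appear as the three inputs: the choice of statistics is fixed inside `change_statistics`, and the network y and the pair (i,j) are passed in as arguments.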
31 Hunter et al Package to Fit
- Dyadic independence models
- Dyadic independence term
- Term in an ERGM for which the change statistic for any dyad (i,j) can be calculated without any knowledge of y
- Dyadic independence ERGM
- All terms in the model are dyadic independence terms
- This model is purely stochastic
- For undirected models, unconditional or marginal probability is allowed
- Important to distinguish between dyadic and linear independence
- Linear dependencies can arise with either form above
- Implications for model specification
- Statnet eliminates/allows for elimination of statistics as needed
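In a dyadic independence ERGM each tie is an independent Bernoulli draw, so a network can be simulated without MCMC. The sketch below assumes an undirected model with an edges term and a homophily (attribute match) term; the attribute, parameter values, and function names are hypothetical.

```python
import itertools
import math
import random

def tie_probability(theta_edges, theta_match, same_group):
    """Logistic tie probability; same_group is True when both nodes share the attribute."""
    return 1.0 / (1.0 + math.exp(-(theta_edges + theta_match * same_group)))

def sample_network(groups, theta_edges, theta_match, seed=0):
    """Draw every dyad independently; no knowledge of other ties is needed."""
    rng = random.Random(seed)
    edges = set()
    for i, j in itertools.combinations(range(len(groups)), 2):
        p = tie_probability(theta_edges, theta_match, groups[i] == groups[j])
        if rng.random() < p:
            edges.add((i, j))
    return edges

groups = ["a"] * 15 + ["b"] * 15          # hypothetical nodal attribute
edges = sample_network(groups, theta_edges=-2.0, theta_match=1.5)
within = sum(groups[i] == groups[j] for i, j in edges)
print(len(edges), within)
```

With a positive match parameter, ties within a group are more likely than ties between groups, even though there are slightly fewer within-group dyads in this setup.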
32 Hunter et al Package to Fit
- Dyadic dependence models
- Dyads that do not share a node are conditionally independent
- Analogous to nearest neighbor
- Homogeneity condition may be added as a constraint
- All isomorphic networks have same probability
- Problems with model as previously discussed
- Correctives suggested
- Combine terms (endogenous and exogenous)
- Specify triad-based curved exponential family terms
- Geometrically weighted degree (GWD)
- Geometrically weighted edgewise shared partner (GWESP)
- Geometrically weighted dyadwise shared partner (GWDSP)
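As one concrete case, the GWESP statistic weights the edgewise shared partner counts EP_i(y) so that each additional shared partner contributes less. The sketch below assumes the standard form with a decay parameter; it is a plain Python illustration on a made-up network, not the statnet implementation.

```python
import itertools
import math

def edgewise_shared_partners(adj):
    """EP_i(y): number of edges whose endpoints have exactly i shared partners."""
    counts = {}
    for a, b in itertools.combinations(sorted(adj), 2):
        if b in adj[a]:
            shared = len(adj[a] & adj[b])
            counts[shared] = counts.get(shared, 0) + 1
    return counts

def gwesp(adj, decay):
    """exp(decay) * sum_i [1 - (1 - exp(-decay))^i] * EP_i(y)."""
    ratio = 1.0 - math.exp(-decay)
    return math.exp(decay) * sum(
        (1.0 - ratio ** i) * count
        for i, count in edgewise_shared_partners(adj).items()
    )

# Hypothetical 5-node undirected network stored as neighbour sets.
adj = {0: {1, 2, 4}, 1: {0, 2, 4}, 2: {0, 1, 3}, 3: {2}, 4: {0, 1}}
print(edgewise_shared_partners(adj))
print(gwesp(adj, 0.0), gwesp(adj, math.log(2)))
```

With decay 0 the statistic reduces to the number of edges that have at least one shared partner; larger decay values give more credit to edges with several shared partners.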
33 Hunter et al Package to Fit
- Curved exponential family model
34 Hunter et al Package to Fit
- Estimation and goodness of fit
- Parameters
- Edges
- Homophily term for grade
- Main effect for sex
- P. 23
35 Morris et al Specification of ERGM
- Morris et al. (2008) Specification of Exponential-Family Random Graph Models: Terms and Computational Aspects
- Where Hunter et al. focused more on theory and statistical formulas, Morris et al. provide basic instruction on implementing ERGM in R/statnet
- Commands for basic effects, nodal attributes, relational attributes, structural configurations, higher-order configurations, actor-specific effects, constraints
- Tips to fine-tune algorithm and processing
- Appendix A, Table of Model Terms, provides quick reference for what terms are appropriate to a particular model
- I.e., directed/undirected, bipartite, dyadic independence, etc.
36 Morris et al Specification of ERGM
- Constraints
- Model must include space of all possible networks
- Some networks are bipartite: communication between but never within groups of nodes
- The ergm package automatically implements these constraints as needed
37 Andrew Regional Integration
- Andrew (2009) Regional integration through contracting networks
- Research question: Under what conditions do local governments choose to contract for services, or enter into regional agreements for the provision of services?
- Two hypotheses are advanced
- Bonding hypothesis: in the presence of uncertainty and complexity of interjurisdictional activities, a highly dense network structure will emerge over time
- Bridging hypothesis: for interjurisdictional activities involving high asset specificity, a sparse, core-periphery network is anticipated
- Institutional collective action framework: transaction cost analysis, enforcement and monitoring, free-rider problem
38 Andrew Regional Integration
- Bonding: local officials attracted to interjurisdictional, voluntary cooperation agreements
- Flexible, non-binding, fosters norm of reciprocity
- Can be constrained by local politics and coordination costs
- Bridging: in asset-specific dilemma, local officials likely to choose strategic partner
- May produce services in-house
- Induce competition to attenuate opportunism of central actor
- Expected to contract with partner who already has ties with other jurisdictions
39 Andrew Regional Integration
- Research Design
- Contractual ties among law enforcement community in Orlando-Kissimmee
- Five waves from 1986 to 2003
- 66 total actors
- List of goods and services derived from International City/County Management Association surveys
- Studying one metropolitan area controls for geographic variation and allows for in-depth analysis of regional integration
40 Andrew Regional Integration
41 Andrew Regional Integration
- Parameters
- Transitive triads
- Geodesic distance-2
- Covariate effects
- Importance of level of government, where municipality is coded 1 and higher level government is treated as benchmark
- Importance of professionalism, indicated by accreditation
- Both coded as dummy variables, treated as control variables
- Homophily effect
- Rate parameters were all positive and significant
- t-ratio less than 0.3, indicating no problems with convergence (?)
42 Andrew Regional Integration
P.392
43 Andrew Regional Integration
P.392