Introduction to - PowerPoint PPT Presentation

Slides: 105
Provided by: jwm
Learn more at: https://people.duke.edu
1
Introduction to Social Network Analysis
Columbia University, April 2007
James Moody, Duke University
2
Introduction
  • Introduction
  • Social Network data
  • Basic data elements
  • Network data sources
  • Local (ego) Network Analysis
  • Introduction
  • Network Composition
  • Network Structure
  • Local Network Models
  • Complete Network Analysis
  • Exploratory Analysis
  • Network Connections
  • Network Macro Structure
  • Stochastic Network Analyses
  • Social Network Software Review
  • Work through examples

3
Introduction
We live in a connected world
"To speak of social life is to speak of the
association between people: their associating in
work and in play, in love and in war, to trade or
to worship, to help or to hinder. It is in the
social relations men establish that their
interests find expression and their desires
become realized." Peter M. Blau, Exchange and
Power in Social Life, 1964
"If we ever get to the point of charting a whole
city or a whole nation, we would have a picture
of a vast solar system of intangible structures,
powerfully influencing conduct, as gravitation
does in space. Such an invisible structure
underlies society and has its influence in
determining the conduct of society as a
whole." J.L. Moreno, New York Times, April 13,
1933
These patterns of connection form a social space
that can be seen in multiple contexts.
4
Introduction
Source: Linton Freeman, "See you in the funny
pages," Connections, 23, 2000, 32-42.
5
Introduction
High Schools as Networks
8
Introduction
And yet, standard social science analysis methods
do not take this space into account. "For the
last thirty years, empirical social research has
been dominated by the sample survey. But as
usually practiced, ... the survey is a
sociological meat grinder, tearing the individual
from his social context and guaranteeing that
nobody in the study interacts with anyone else in
it." Allen Barton, 1968 (quoted in Freeman
2004)
Moreover, the complexity of the relational
world makes it impossible to identify social
connectivity using only our intuition. Social
Network Analysis (SNA) provides a set of tools to
empirically extend our theoretical intuition of
the patterns that compose social structure.
9
Introduction
Why do Networks Matter?
Local vision
10
Introduction
Why do Networks Matter?
Local vision
11
Introduction
  • Social network analysis is a set of relational
    methods for systematically understanding and
    identifying connections among actors. SNA:
  • is motivated by a structural intuition based on
    ties linking social actors
  • is grounded in systematic empirical data
  • draws heavily on graphic imagery
  • relies on the use of mathematical and/or
    computational models.
  • Social Network Analysis embodies a range of
    theories relating types of observable social
    spaces and their relation to individual and group
    behavior.

12
Introduction Key Questions
  • Social Network analysis lets us answer questions
    about social interdependence. These include
  • Networks as Variables approaches
  • Are kids with smoking peers more likely to smoke
    themselves?
  • Do unpopular kids get in more trouble than
    popular kids?
  • Are people with many weak ties more likely to
    find a job?
  • Do central actors control resources?
  • Networks as Structures approaches
  • What generates hierarchy in social relations?
  • What network patterns spread diseases most
    quickly?
  • How do role sets evolve out of consistent
    relational activity?
  • We don't want to draw this line too sharply:
    emergent role positions can affect individual
    outcomes in a variable way, and variable
    approaches constrain relational activity.

13
1. Introduction and Background
  • Why networks matter
  • Intuitive: information travels through contacts
    between actors, which can reflect a power
    distribution or influence attitudes and
    behaviors. Our understanding of social life
    improves if we account for this social space.
  • Less intuitive: patterns of inter-actor contact
    can have effects on the spread of goods or
    power dynamics that could not be seen by focusing
    only on individual behavior.

14
Social Network Data
The units of interest in a network are the
combined sets of actors and their relations. We
represent actors with points and relations with
lines.
Actors are referred to variously as nodes,
vertices, actors, or points.
Relations are referred to variously as edges,
arcs, lines, or ties.
Example (figure: a graph on five nodes labeled a, b, c, d, e)
15
Social Network Data Basic Data Elements
  • Social Network data consists of two linked
    classes of data
  • a) Nodes: Information on the individuals (actors,
    nodes, points, vertices)
  • Network nodes are most often people, but can be
    any other unit capable of being linked to another
    (schools, countries, organizations,
    personalities, etc.)
  • The information about nodes is what we usually
    collect in standard social science research
    demographics, attitudes, behaviors, etc.
  • Often includes dynamic information about when the
    node is active
  • b) Edges: Information on the relations among
    individuals (lines, edges, arcs)
  • Records a connection between the nodes in the
    network
  • Can be valued or binary, directed (arcs) or
    undirected (edges)
  • One-mode (direct ties between actors) or two-mode
    (actors share membership in an organization)
  • Includes the times when the relation is active
  • Graph theory notation: G(V,E)
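
The two linked classes of data described above can be held in a small container that mirrors G(V,E). This is a minimal sketch only: the class names `Edge` and `Network`, the `smokes` attribute, and the example ties are hypothetical illustrations, not part of the original slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Edge:
    sender: str
    receiver: str
    value: float = 1.0     # 1.0 stands in for a binary tie
    directed: bool = True  # False marks an undirected edge


@dataclass
class Network:
    nodes: dict = field(default_factory=dict)  # V: node id -> attribute dict
    edges: list = field(default_factory=list)  # E: list of Edge records

    def add_node(self, node_id, **attributes):
        self.nodes[node_id] = attributes

    def add_edge(self, sender, receiver, value=1.0, directed=True):
        # Both endpoints must already be members of V.
        if sender not in self.nodes or receiver not in self.nodes:
            raise KeyError("both endpoints must be nodes in the network")
        self.edges.append(Edge(sender, receiver, value, directed))


g = Network()
for name, smokes in [("a", True), ("b", False), ("c", False)]:
    g.add_node(name, smokes=smokes)            # node-level (attribute) data
g.add_edge("a", "b")                           # directed, binary arc
g.add_edge("b", "c", value=3.0, directed=False)  # undirected, valued edge
```

Keeping node attributes and edge records in separate structures matches the split between standard survey variables and relational data.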

16
Social Network Data Basic Data Elements
In general, a relation can be (1) Binary or
Valued (2) Directed or Undirected
The social process of interest will often
determine what form your data take. Almost all
of the techniques and measures we describe can be
generalized across data format.
17
Social Network Data Basic Data Elements
In general, a relation can be (1) Binary or
Valued (2) Directed or Undirected
(Figure: example graph on nodes a, b, c, d, e)
Directed, Multiplex categorical edges
The social process of interest will often
determine what form your data take. Almost all
of the techniques and measures we describe can be
generalized across data format.
18
Social Network Data Basic Data Elements Levels
of analysis
Global-Net
19
Social Network Data Basic Data Elements Levels
of analysis
We can examine networks across multiple levels:
1) Ego-network
- Have data on a respondent (ego) and the people
they are connected to (alters). Example: 1985 GSS module
- May include estimates of connections among alters
2) Partial network
- Ego networks plus some amount of tracing to
reach contacts of contacts
- Something less than a full account of
connections among all pairs of actors in the
relevant population
- Example: CDC contact tracing data for STDs
20
Social Network Data Basic Data Elements Levels
of analysis
We can examine networks across multiple levels
  • 3) Complete or Global data
  • - Data on all actors within a particular
    (relevant) boundary
  • - Never exactly complete (due to missing data),
    but boundaries are set
  • Example: Coauthorship data among all writers in
    the social sciences, friendships among all
    students in a classroom

21
Social Network Data Graph Layout
A good network drawing allows viewers to come
away from the image with an almost immediate
intuition about the underlying structure of the
network being displayed. However, because there
are multiple ways to display the same
information, and standards for doing so are few,
the information content of a network display can
be quite variable.
Consider the 4 graphs drawn at right. After
asking yourself what intuition you gain from each
graph, click on the screen.
Now trace the actual pattern of ties. You will
see that these 4 graphs are exactly the same.
22
Social Network Data Graph Layout
Network visualization helps build intuition, but
you have to keep the drawing algorithm in mind.
Here we show the same graphs with two different
techniques
Spring embedder layouts
Tree-Based layouts
(Fair - poor)
(good)
Tree-based layouts: most effective for very
sparse, regular graphs. Very useful when relations
are strongly directed, such as organization charts
or internet connections.
Spring embedder layouts: most effective with
graphs that have a strong community structure
(clustering, etc.). Provide a very clear
correspondence between social distance and plotted
distance.
Two images of the same network
23
Social Network Data Graph Layout
Another example
Spring embedder layouts
Tree-Based layouts
(poor)
(good)
Two layouts of the same network
24
Social Network Data
Basic Data Structures
In general, graphs are cumbersome to work with
analytically, though there is a great deal of
good work to be done on using visualization to
build network intuition. I recommend using
layouts that optimize on the feature you are most
interested in. The two I use most are a
hierarchical layout and a force-directed layout.
We'll see some examples of best practice after
getting a little more familiar with data
structure.
25
Social Network Data
Basic Data Structures
From pictures to matrices
Undirected, binary
Directed, binary
26
Social Network Data
Basic Data Structures
From matrices to lists
Arc List / Adjacency List
Arc list (sender receiver):
a b
b a
b c
c b
c d
c e
d c
d e
e c
e d
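
Moving between the list and matrix forms is mechanical. The sketch below takes the arc list from this slide and rebuilds both an adjacency matrix and an adjacency list; the variable names are illustrative choices, not taken from the slides.

```python
# Arc list from the slide: one (sender, receiver) pair per arc.
arcs = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "b"), ("c", "d"),
        ("c", "e"), ("d", "c"), ("d", "e"), ("e", "c"), ("e", "d")]

# Fix a node ordering so rows and columns of the matrix line up.
nodes = sorted({n for arc in arcs for n in arc})
index = {n: i for i, n in enumerate(nodes)}

# Adjacency matrix: cell [i][j] = 1 if node i sends a tie to node j.
matrix = [[0] * len(nodes) for _ in nodes]
for sender, receiver in arcs:
    matrix[index[sender]][index[receiver]] = 1

# Adjacency list: each node followed by the nodes it sends ties to.
adj_list = {n: [] for n in nodes}
for sender, receiver in arcs:
    adj_list[sender].append(receiver)

for node in nodes:
    print(node, " ".join(adj_list[node]))
```

Because every arc here appears in both directions, the resulting matrix is symmetric, which is what an undirected, binary graph looks like in matrix form.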
27
Social Network Data Basic Data Elements Modes
Social network data are substantively divided by
the number of modes in the data.
1-mode data represent edges based on direct
contact between actors in the network. All the
nodes are of the same type (people, organizations,
ideas, etc.). Examples: communication, friendship,
giving orders, sending email. There are no
constraints on connections between classes of
nodes.
1-mode data are usually singly reported (each
person reports on their friends), but you can use
multiple-informant data, which is more common in
child development research (Cairns and Cairns).
28
Social Network Data Basic Data Elements Modes
Social network data are substantively divided by
the number of modes in the data.
2-mode data represent nodes from two separate
classes, where all relations cross classes.
Examples:
  • People as members of groups
  • People as authors on papers
  • Words used often by people
  • Events in the life history of people
The two modes of the data represent a duality: you
can project the data as people connected to people
through joint membership in a group, or groups to
each other through common membership.
N-mode data generalize the constraint on ties
between classes to N groups.
29
Social Network Data Basic Data Elements Modes
Breiger 1974 - Duality of Persons and Groups
Argument
Metaphor: people intersect through their
associations, which defines (in part) their
individuality.
The Duality argument is that relations among
groups imply relations among individuals.
30
Social Network Data Basic Data Elements Modes
Bipartite networks imply a constraint on the
mixing, such that ties only cross classes. Here
we see a tie connecting each woman with the party
she attended (Davis data)
31
Social Network Data Basic Data Elements Modes
Bipartite networks imply a constraint on the
mixing, such that ties only cross classes. Here
we see a tie connecting each woman with the party
she attended (Davis data)
32
Social Network Data Basic Data Elements Modes
By projecting the data, one can look at the
memberships shared between people or the members
common to groups. This is the person-to-person
projection of the 2-mode data.
33
Social Network Data Basic Data Elements Modes
By projecting the data, one can look at the
memberships shared between people or the members
common to groups. This is the group-to-group
projection of the 2-mode data.
34
Social Network Data Basic Data Elements Modes
Working with two-mode data
A person-to-group adjacency matrix is
rectangular, with persons down rows and groups
across columns.
Each column is a group, each row a person, and
the cell = 1 if the person in that row belongs to
that group. You can tell how many groups two
people both belong to by comparing the rows:
identify every place where both rows = 1, sum
them, and you have the overlap.
A =
     1  2  3  4  5
  A  0  0  0  0  1
  B  1  0  0  0  0
  C  1  1  0  0  0
  D  0  1  1  1  1
  E  0  0  1  0  0
  F  0  0  1  1  0
35
Social Network Data Basic Data Elements Modes
Working with two-mode data
Compare persons A and F
Person A is in 1 group, Person F is in two
groups, and they are in no groups together.
Or persons D and F
Person D is in 4 groups, Person F is in two
groups, and they are in 2 groups together.
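
The row comparison on this slide can be checked directly. The following sketch uses the person-to-group matrix from the slides; the helper names `n_groups` and `overlap` are hypothetical.

```python
# Person-to-group membership rows from the slides (groups 1..5 across).
A = {
    "A": [0, 0, 0, 0, 1],
    "B": [1, 0, 0, 0, 0],
    "C": [1, 1, 0, 0, 0],
    "D": [0, 1, 1, 1, 1],
    "E": [0, 0, 1, 0, 0],
    "F": [0, 0, 1, 1, 0],
}


def n_groups(person):
    """Number of groups the person belongs to (row sum)."""
    return sum(A[person])


def overlap(p, q):
    """Number of groups both persons belong to: places where both rows = 1."""
    return sum(x * y for x, y in zip(A[p], A[q]))


print(n_groups("A"), n_groups("F"), overlap("A", "F"))  # 1 2 0
print(n_groups("D"), n_groups("F"), overlap("D", "F"))  # 4 2 2
```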
36
Social Network Data Basic Data Elements Modes
Working with two-mode data
Similarly for Groups
Group 1 has 2 members, group 2 has 2 members, and
they overlap by 1 member (C).
37
Social Network Data Basic Data Elements Modes
Working with two-mode data
In general, you can get the overlap for any pair
of groups / persons by summing the multiplied
elements of the corresponding rows/columns of the
persons-to-groups adjacency matrix. That is
Groups-to-Groups
Persons-to-Persons
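
Written out, "summing the multiplied elements" takes the following form. This is a reconstruction from the verbal description, with some assumed notation: A_{ik} = 1 if person i belongs to group k, and N_p and N_g (symbols introduced here, not in the slides) denote the number of persons and groups.

```latex
% Persons-to-persons overlap: multiply rows i and j of A element by element, then sum.
P_{ij} = \sum_{k=1}^{N_g} A_{ik} A_{jk}

% Groups-to-groups overlap: multiply columns k and l of A element by element, then sum.
G_{kl} = \sum_{i=1}^{N_p} A_{ik} A_{il}
```

For the example matrix, P for persons D and F is 0·0 + 1·0 + 1·1 + 1·1 + 1·0 = 2, matching the two groups they share.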
38
Social Network Data Basic Data Elements Modes
Working with two-mode data
One can get either projection easily with a
little matrix multiplication. First define AT as
the transpose of A (simply reverse the rows and
columns). If A is of size P x G, then AT will be
of size G x P.
39
Social Network Data Basic Data Elements Modes
A (6x5) =
     1  2  3  4  5
  A  0  0  0  0  1
  B  1  0  0  0  0
  C  1  1  0  0  0
  D  0  1  1  1  1
  E  0  0  1  0  0
  F  0  0  1  1  0
AT (5x6) =
     A  B  C  D  E  F
  1  0  1  1  0  0  0
  2  0  0  1  1  0  0
  3  0  0  0  1  1  1
  4  0  0  0  1  0  1
  5  1  0  0  1  0  0
P = A(AT)
G = AT(A)
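
Both projections can be computed with a few lines of plain matrix arithmetic. The sketch below uses the example matrix from the slides; the helper functions `transpose` and `matmul` are written out here for illustration and stand in for whatever matrix routine your software provides.

```python
def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def matmul(x, y):
    """Plain matrix product; the number of columns in x must equal rows in y."""
    y_cols = list(zip(*y))
    return [[sum(a * b for a, b in zip(row, col)) for col in y_cols] for row in x]


# Persons A..F down the rows, groups 1..5 across the columns.
A = [
    [0, 0, 0, 0, 1],  # A
    [1, 0, 0, 0, 0],  # B
    [1, 1, 0, 0, 0],  # C
    [0, 1, 1, 1, 1],  # D
    [0, 0, 1, 0, 0],  # E
    [0, 0, 1, 1, 0],  # F
]
AT = transpose(A)

P = matmul(A, AT)  # 6x6 person-to-person overlaps
G = matmul(AT, A)  # 5x5 group-to-group overlaps

for row in P:
    print(row)
```

The diagonal of P gives each person's number of memberships (for example, 4 for person D), and the off-diagonal cells give shared memberships (2 for D and F).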
40
Social Network Data Basic Data Elements Modes
Theoretically, these two equations define what
Breiger means by duality: "With respect to the
membership network, ... persons who are actors in
one picture (the P matrix) are with equal
legitimacy viewed as connections in the dual
picture (the G matrix), and conversely for
groups." (p. 87)
The resulting network:
1) Is always symmetric
2) Has a diagonal that tells you how many
groups (persons) a person (group) belongs to
(has)
In practice, most network software (UCINET,
PAJEK) will do all of these operations. It is
also simple to do the matrix multiplication in
programs like SAS, SPSS, or R.
41
Social Network Data Network Data Sources
Existing data sources
  • Existing Sources of Social Network Data
  • There are lots of network data archived. Check
    INSNA for a listing. The PAJEK data page
    includes a number of exemplars for large-scale
    networks.
  • 2-Mode Data
  • One can construct networks from many different
    data sources if you want to work with 2-mode
    data. Any list can be so transformed.
  • Director interlocks
  • Protest event participation
  • Authors on papers
  • Words in documents
  • 1-Mode Data
  • Local Network data
  • Fairly common, because it is easy to collect from
    sample surveys.
  • GSS, NHSL, Urban Inequality Surveys, etc.
  • Pay attention to the question asked
  • Key features are (a) number of people named and
    (b) whether alters are able to nominate each
    other.

42
Social Network Data Network Data Sources
Existing data sources
  • Existing Sources of Social Network Data
  • 1-Mode Data
  • Partial network data
  • Much less common, because cost goes up
    significantly once you start tracing to contacts.
  • Snowball data: start with focal nodes and trace
    to contacts
  • CDC style data on sexual contact tracing
  • Limited snowball samples
  • Colorado Springs drug users data
  • Genealogy data
  • Small-world network samples
  • Limited boundary data: select data within a
    limited bound
  • Cross-national trade data
  • Friendships within a classroom
  • Family support ties

43
Social Network Data Network Data Sources
Existing data sources
  • Existing Sources of Social Network Data
  • 1-Mode Data
  • Complete network data
  • Significantly less common and never perfect.
  • Start by defining a theoretically relevant
    boundary
  • Then identify all relations among nodes within
    that boundary
  • Co-sponsorship patterns among legislators
  • Friendships within strongly bounded settings
    (sororities, schools)
  • Examples
  • Add Health on adolescent friendships
  • Hallinan data on within-school friendships
  • McFarland's data on verbal interaction
  • Electronic data on citations or coauthorship (see
    Pajek data page)
  • See INSNA home page for many small-scale networks

44
Social Network Data Network Data Sources
Collecting network data
Boundary Specification Problem: Network methods
describe positions in relevant social fields,
where flows of particular goods are of interest.
As such, boundaries are a fundamentally
theoretical question about what you think matters
in the setting of interest. See Marsden (19xx)
for a good review of the boundary specification
problem. In general, there are usually relevant
social foci that bound the relevant social field.
We expect that social relations will be very
clumpy. Consider the example of friendship ties
within and between a high school and a junior high.
45
Social Network Data Network Data Sources
Collecting network data
  • Network data collection can be time consuming. It
    is better (I think) to have breadth over depth.
    Having detailed information on <50% of the sample
    will make it very difficult to draw conclusions
    about the general network structure.
  • Question format
  • a) If you ask people to recall names (an open list
    format), fatigue will result in under-reporting
  • b) If you ask people to check off names from a full
    list, you can often get over-reporting
  • c) It is common to limit people to a small number
    of nominations (5). This will bias network
    measures, but is sometimes the best choice to
    avoid fatigue.
  • d) Concrete relational indicators (who did you
    talk to?) are better than attitudes that are
    harder to define (who do you like?)

46
Social Network Data Network Data Sources
Collecting network data
Boundary Specification Problem
While students were given the option to name
friends in the other school, they rarely do. As
such, the school likely serves as a strong
substantive boundary
47
Social Network Data Network Data Sources
Collecting network data
  • Local Network data
  • When using a survey, it is common to use an
    ego-network module.
  • First part: a name generator question to elicit
    a list of names
  • Second part: working through the list of names to
    get information about each person named
  • Third part: asking about relations among each
    person named.

GSS Name Generator: "From time to time, most
people discuss important matters with other
people. Looking back over the last six months --
who are the people with whom you discussed
matters important to you? Just tell me their
first names or initials."
  • Why this question?
  • Only time for one question
  • Normative pressure and influence likely travels
    through strong ties
  • Similar to best friend or other strong tie
    generators
  • Note there are significant substantive problems
    with this name generator

48
Social Network Data Network Data Sources
Collecting network data
  • Electronic Small World name generator

49
Social Network Data Network Data Sources
Collecting network data
Local Network data: The second part usually asks
a series of questions about each person.
GSS example: "Is (NAME) Asian, Black, Hispanic,
White or something else?"
ESWP example
This adds N x (number of attributes) questions
to the survey.
50
Social Network Data Network Data Sources
Collecting network data
Local Network data The third part usually asks
about relations among the alters. Do this by
looping over all possible combinations. If you
are asking about a symmetric relation, then you
can limit your questions to the n(n-1)/2 cells of
one triangle of the adjacency matrix
GSS: "Please think about the relations between the
people you just mentioned. Some of them may be
total strangers in the sense that they wouldn't
recognize each other if they bumped into each
other on the street. Others may be especially
close, as close or closer to each other as they
are to you. First, think about NAME 1 and NAME 2.
A. Are NAME 1 and NAME 2 total strangers? B. Are
they especially close? PROBE: As close or closer
to each other as they are to you?"
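
The "loop over all possible combinations" step for a symmetric relation can be sketched as below. The alter names and the question wording template are hypothetical placeholders; the point is only that n alters yield n(n-1)/2 pair questions.

```python
from itertools import combinations


def alter_pair_questions(alters):
    """One question per unordered pair of alters (one triangle of the matrix)."""
    return [
        f"Are {first} and {second} total strangers?"
        for first, second in combinations(alters, 2)
    ]


alters = ["Ann", "Bob", "Cam", "Dee", "Eli"]  # hypothetical names from the name generator
questions = alter_pair_questions(alters)

n = len(alters)
print(len(questions), n * (n - 1) // 2)  # both are 10 for five alters
```

For an asymmetric relation you would instead ask about ordered pairs, which doubles the count to n(n-1); `itertools.permutations(alters, 2)` generates those.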
51
Social Network Data Network Data Sources
Collecting network data
Local Network data The third part usually asks
about relations among the alters. Do this by
looping over all possible combinations. If you
are asking about a symmetric relation, then you
can limit your questions to the n(n-1)/2 cells of
one triangle of the adjacency matrix
52
Social Network Data Network Data Sources
Collecting network data
  • Snowball Samples
  • Snowball samples work much the same as
    ego-network modules, and if time allows I
    recommend asking at least some of the basic
    ego-network questions, even if you plan to sample
    (some of) the people your respondent names.
  • Start with a name generator, then any demographic
    or relational questions.
  • Have a sample strategy
  • Random Walk designs (Klovdahl)
  • Strong tie designs
  • All names designs
  • Get contact information from the people named
  • Snowball samples are very effective at providing
    network context around focal nodes. New work on
    Respondent Driven Sampling (RDS) makes it
    possible to get good representation even with
    initially biased seed nodes.

http://www.respondentdrivensampling.org/reports/RDSrefs.htm
53
Social Network Data Network Data Sources
Collecting network data
Snowball Samples
54
Social Network Data Network Data Sources
Collecting network data
  • Complete Network data
  • Data collection is concerned with all relations
    within a specified boundary.
  • Requires sampling every actor in the population
    of interest (all kids in the class, all nations
    in the alliance system, etc.)
  • The network survey itself can be much shorter,
    because you are getting information from each
    person (so ego does not report on alters).
  • Two general formats
  • Recall surveys (Name all of your best friends)
  • Check-list formats Give people a list of names,
    have them check off those with whom they have
    relations.

55
Social Network Data Network Data Sources
Collecting network data
  • Complete network surveys require a process that
    lets you link answers to respondents.
  • You cannot have anonymous surveys.
  • Recall
  • Need ID numbers and a roster to link, or hand-code
    names to find matches
  • Checklists
  • Need a roster for people to check through

56
Social Network Data Network Data Sources
Collecting network data
  • Complete network surveys require a process that
    lets you link answers to respondents.
  • Typically you have a number of data tradeoffs
  • Limited number of responses.
  • Eases survey construction and coding, lowers
    density and degree, which affects nearly every
    other system-level measure.
  • Some evidence that people try to fill all of the
    slots.
  • Name check-off roster (names down a row or on
    screen, relations as check-boxes).
  • Easy in small settings or CADI, but encourages
    over-response.
  • The Amy Willis Problem.
  • Open recall list.
  • Very difficult cognitively, requires an extra
    name-matching step in analysis.
  • Think carefully about what you want to learn from
    your survey items.

57
Social Network Data Network Data Sources Missing
Data
Whatever method is used, data will always be
incomplete. What are the implications for
analysis?
Example 1. People can name friends out of
sample, but there is no way to match them (Add Health).
(Figure: two panels, each showing Ego nodes with ties to nodes labeled M and Out.)
If the true network looks like the first panel,
you cannot distinguish it from the second.
58
Social Network Data Network Data Sources Missing
Data
Example 2. Node population: 2-step
neighborhood of Actor X. Relational population:
Any connection among all nodes.
(Figure: block adjacency matrix with rows and columns ordered as the focal actor (F), 1-step, 2-step, and 3-step contacts; blocks are labeled Full, Full (0), or Unknown (UK).)
59
Social Network Data Network Data Sources Missing
Data
Example 3. Node population: 2-step neighborhood
of Actor X. Relational population: Trace, plus
all connections among 1-step contacts.
(Figure: block adjacency matrix for the focal actor (F) and contacts 1 to 3 steps away; blocks are labeled Full, Full (0), or Unknown (UK).)
60
Social Network Data Network Data Sources Missing
Data
Example 4. Node population: 2-step neighborhood
of Actor X. Relational population: Only tracing
contacts.
(Figure: block adjacency matrix with rows and columns ordered as the focal actor (F), 1-step, 2-step, and 3-step contacts; blocks are labeled Full, Full (0), or Unknown (UK).)
61
Social Network Data Network Data Sources Missing
Data
Example 5. Node population: 2-step neighborhood
from 3 focal actors. Relational population: All
relations among actors.
(Figure: block adjacency matrix with rows and columns ordered as Focal, 1-Step, 2-Step, and 3-Step actors; blocks are labeled Full, Full (0), or Unknown (UK).)
62
Social Network Data Network Data Sources Missing
Data
Example 6. Node population: 1-step neighborhood
from 3 focal actors. Relational population: Only
relations from focal nodes.
(Figure: block adjacency matrix with rows and columns ordered as Focal, 1-Step, 2-Step, and 3-Step actors; blocks are labeled Full, Full (0), or Unknown (UK).)
63
Social Network Data Network Data Sources Missing
Data
Summary: Data collection design and missing data
affect the information at hand to draw
conclusions about the system. Everything we do
from now on is built on some manipulation of the
observed adjacency matrix, so we want to
understand what are valid and invalid conclusions
due to systematic distortions in the data.
Statistical modeling tools hold promise.
We can build models of networks that account for
missing data: we are able to fix the
structural zeros in our models by treating them as
given. This then lets us infer to the world of
all graphs with that same missing data structure.
These models are very new, and not widely
available yet.
64
Local Network Analysis Introduction
  • Local network analysis uses data from a simple
    ego-network survey. These might include
    information on relations among ego's contacts,
    but often not. Questions include:

Population Mixing: The extent to which one type
of person is tied to another type of person (race
by race, etc.)
Local Network Composition: Peer behavior;
Cultural milieu; Opportunities or resources in the
network; Social support
Local Network Structure: Network size; Density;
Holes and constraint; Concurrency
Dyadic behavior: Frequency of contact;
Interaction content; Specific exchange behaviors;
Dyadic similarity
65
Local Network Analysis Introduction
  • Advantages
  • Cost: data are easy to collect and can be sampled
  • Methods are relatively simple extensions of
    common variable-based methods social scientists
    are already familiar with
  • Provides information on the local network
    context, which is often the primary substantive
    interest.
  • Can be used to describe general features of the
    global network context
  • Population mixing, concurrency, exchange
    frequency, etc.
  • Disadvantages
  • Treats each local network as independent, which
    is false.
  • The poor performance of number of partners for
    predicting STD spread is a clear example.
  • Impossible to account for how position in a
    larger context affects local network
    characteristics (popular with whom?).
  • If structure matters, ego-networks are strongly
    constrained to limit the information you can get
    on overall structure

66
Local Network Analysis Introduction
Local
67
Local Network Analysis Introduction
Global
68
Local Network Analysis Network Composition
Perhaps the simplest network question is what
types of alters does ego interact with?
Network composition refers to the distribution of
types of people in your network.
  • Networks tend to be more homogeneous than the
    population. Using the GSS, Marsden reports
    heterogeneity in Age, Education, Race and Gender.
    He finds that:
  • Age distribution is fairly wide, almost evenly
    distributed, though lower than the population at
    large
  • Homogeneous by education (30% differ by less than
    a year, on average)
  • Very homogeneous with respect to race (96% are
    single race)
  • Heterogeneous with respect to gender

69
Local Network Analysis Network Composition
Claude Fischer's book To Dwell Among Friends is
a classic study of urbanism that makes good use
of local network data.
Age heterogeneity varies by ego's age and across
urban settings.
70
Local Network Analysis Network Composition
Claude Fischer's book To Dwell Among Friends is
a classic study of urbanism that makes good use
of local network data.
Marital composition similarly varies across
respondents and settings.
71
Local Network Analysis Network Composition
Calculating network composition using GSS style
data.
Generally you have a separate variable for each
alter characteristic, and you can construct items
by summing over the relevant variables. You
would, for example, have variables on the age of
each alter, such as:
Age_alt1  age_alt2  age_alt3  age_alt4  age_alt5
15        35        20        12        .
You get the mean age, then, with a statement such
as:
meanage = mean(Age_alt1, age_alt2, age_alt3, age_alt4, age_alt5)
Be sure you know how the program you use (SAS,
SPSS) deals with missing data.
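
To make the missing-data caution concrete, here is a minimal Python sketch of the same calculation using the example values above. It assumes the missing fifth alter is coded as `None` and that missing alters should simply be skipped; the function name `mean_skip_missing` is hypothetical, and your own package may handle missing values differently.

```python
def mean_skip_missing(values):
    """Mean over observed values only; returns None if nothing was observed."""
    observed = [v for v in values if v is not None]
    if not observed:
        return None
    return sum(observed) / len(observed)


# One respondent's alter ages; the fifth alter is missing (the "." above).
respondent = {
    "age_alt1": 15,
    "age_alt2": 35,
    "age_alt3": 20,
    "age_alt4": 12,
    "age_alt5": None,
}

meanage = mean_skip_missing(respondent.values())
print(meanage)  # 20.5, the mean of the four observed ages
```

Dividing by the number of observed alters rather than by five is the design choice that matters: treating the missing value as zero would pull the mean down to 16.4.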
72
Local Network Analysis Network Composition
Calculating local network information from global
network data
  • We often want to construct local-level measures
    from global-level data. This involves a number
    of steps and opens more opportunities than
    GSS-style data.
  • 1) Define the local neighborhood
  • Distance (1-step, 2-steps, what?)
  • Direction of tie
  • Sent, Received, or both?
  • 2) Pull the relevant alters
  • 3) Match the alters to the variables of interest
  • Once you decide on a type of tie, you need to get
    the information of interest in a form similar to
    that in the example above.
  • A number of programs do this for you
    automatically (SPAN, R, etc.)
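
The three steps in this list can be sketched for a 1-step neighborhood as follows. This is an illustration under stated assumptions, not how SPAN or any other package does it: the arc list, the `smokes` attribute, and the function names `alters` and `alter_values` are all hypothetical.

```python
# Hypothetical directed nominations (sender, receiver) and a node attribute.
arcs = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
smokes = {"a": 0, "b": 1, "c": 1, "d": 0}


def alters(ego, arcs, direction="sent"):
    """Steps 1 and 2: define the 1-step neighborhood and pull the alters."""
    sent = {r for s, r in arcs if s == ego}
    received = {s for s, r in arcs if r == ego}
    if direction == "sent":
        return sent
    if direction == "received":
        return received
    if direction == "both":
        return sent | received
    raise ValueError("direction must be 'sent', 'received', or 'both'")


def alter_values(ego, arcs, attribute, direction="sent"):
    """Step 3: match each alter to the variable of interest."""
    return [attribute[a] for a in sorted(alters(ego, arcs, direction)) if a in attribute]


values = alter_values("a", arcs, smokes, direction="both")
print(values)                     # attribute values for alters b, c, d
print(sum(values) / len(values))  # proportion of ego a's alters who smoke
```

The resulting list has the same shape as one respondent's row of alter variables in GSS-style data, so the same composition measures apply.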

73
Local Network Analysis Network Composition
An example network: All senior males from a small
(n = 350) public HS.
SPAN will do this for you
74
Local Network Analysis Network Composition
  • Common composition measures
  • Level measures
  • Mean of a given attribute (average income of
    alters)
  • Proportion with a particular attribute
    (proportion who smoke)
  • Counts (number of peers who have had sex)
  • Dispersion measures
  • Heterogeneity index (Racial heterogeneity)
  • Index of dissimilarity
  • Standard Deviation
  • Absolute value of the differences
  • Variable range of values
  • Composition measures for multiple variables
    simultaneously
  • Average correlation across all alters
  • Euclidean / Mahalanobis distance measures
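
A few of the level and dispersion measures listed above can be computed from one ego's list of alter attributes. In this sketch the alter data are invented, and the heterogeneity index is assumed to be one common choice, 1 minus the sum of squared category proportions; the slides do not say which index is intended.

```python
from collections import Counter


def proportion(values, target):
    """Level measure: share of alters with a particular attribute value."""
    return sum(1 for v in values if v == target) / len(values)


def heterogeneity(values):
    """Dispersion measure (assumed form): 1 - sum of squared category proportions."""
    n = len(values)
    counts = Counter(values)
    return 1 - sum((c / n) ** 2 for c in counts.values())


# Hypothetical attributes for one ego's four alters.
alter_smokes = [1, 0, 1, 1]
alter_race = ["White", "White", "Black", "Hispanic"]

print(sum(alter_smokes))             # count of alters who smoke: 3
print(proportion(alter_smokes, 1))   # proportion who smoke: 0.75
print(heterogeneity(alter_race))     # 1 - (0.25 + 0.0625 + 0.0625) = 0.625
```

The index is 0 when all alters share one category and rises toward 1 as alters spread evenly over more categories.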

75
Local Network Analysis Network Mixing
A common interest in network research is
identifying how likely persons of one category
are to interact with people of another
category. Examples:
Race mixing: how likely are people of one race to
interact with people of another?
Sexual activity mixing: are people with many
partners likely to associate with each other?
Neighborhood / location mixing: are people likely
to name friends from the same neighborhood?
These questions can be answered by
cross-classifying the category of the nominator
with the category of the nominated in a mixing
matrix.
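
The cross-classification can be sketched as a simple tally over an arc list. The nominations and categories below are invented for illustration, and the layout (nominator category on the rows, nominated category on the columns) is an assumption of this sketch.

```python
from collections import Counter


def mixing_matrix(arcs, category):
    """Count ties by (category of nominator, category of nominated)."""
    counts = Counter(
        (category[s], category[r])
        for s, r in arcs
        if s in category and r in category  # skip ties to unmatched nodes
    )
    cats = sorted(set(category.values()))
    table = [[counts[(row, col)] for col in cats] for row in cats]
    return cats, table


# Hypothetical nominations and node categories.
race = {"a": "White", "b": "White", "c": "Black", "d": "Black"}
arcs = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"), ("d", "c")]

cats, table = mixing_matrix(arcs, race)
print(cats)
for label, row in zip(cats, table):
    print(label, row)
```

With directed nominations the table need not be symmetric: here one White-to-Black nomination is not returned.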
76
Local Network Analysis Network Mixing
Race mixing in one of the Add Health schools
77
Local Network Analysis Network Mixing
            White   Black   Hispan   Asian   Mix/Other
White        1099     128       53       0         231
Black          97   10218     1032       0         539
Hispanic       54     961      104       1          91
Asian           0       0        0       0           0
Mix/Other     191     560       66       0         106
78
Local Network Analysis Network Mixing
  • Working with mixing matrices
  • Group segregation index (Freeman 1972)
  • Associations between rows and columns (valued
    relations)
  • Assortative mixing
  • Correlations or Q
  • Log-linear models
  • Assessing chance levels depends on the data
    available. If you have full network data, you can
    look at density between groups; without it, you
    can only focus on the sheer volume of ties
    (with no information on the size of the target
    groups).
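To illustrate the difference between volume and density, here is a small sketch with hypothetical full-network data. The groups `X` and `Y`, the ties, and the helper names are assumptions for illustration only.

```python
# Hypothetical full-network data: group membership and directed ties.
group = {"a": "X", "b": "X", "c": "X", "d": "Y", "e": "Y"}
ties = [("a", "b"), ("b", "c"), ("a", "d"), ("d", "e"), ("e", "d")]


def volume(src, dst):
    """Number of ties sent from group src to group dst."""
    return sum(1 for s, r in ties if group[s] == src and group[r] == dst)


def density(src, dst):
    """Observed ties divided by the number of possible directed ties."""
    n_src = sum(1 for g in group.values() if g == src)
    n_dst = sum(1 for g in group.values() if g == dst)
    possible = n_src * (n_src - 1) if src == dst else n_src * n_dst
    return volume(src, dst) / possible


vol_xx, vol_yy = volume("X", "X"), volume("Y", "Y")
den_xx, den_yy = density("X", "X"), density("Y", "Y")
```

Both groups send the same number of in-group ties, but the smaller group is far denser, which is the information lost when only volume is available.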

79
Local Network Analysis Network Structure
  • While network structure data are limited, there
    are a number of features that can be of interest,
    assuming you have data on the relations among
    ego's contacts.
  • Basic arguments
  • Structural amplification: that some feature of
    the arrangement of ties amplifies any peer effect
    of network composition (see Haynie's paper)
  • Network range effects: that being connected to
    a diverse set of alters who are not connected
    to each other provides profitable returns.
    Granovetter's Strength of Weak Ties, Burt's
    Structural Holes
  • Familiar to students of social theory as the
    Tertius Gaudens argument from Simmel
  • In both cases, we use the pattern of ties
    surrounding ego to characterize the local
    structure. We start with volume measures, then
    move on to more complex pattern measures.

80
Local Network Analysis Network Structure volume
Network Size
Mean size: 1985 = 2.9, 2004 = 2.1
"From time to time, most people discuss important
matters with other people. Looking back over the
last six months, who are the people with whom you
discussed matters important to you? Just tell me
their first names or initials." IF LESS THAN 5
NAMES MENTIONED, PROBE: "Anyone else?"
81
Local Network Analysis Network Structure volume
Network Size by:
  • Age: Drops with age at an increasing rate.
    Elderly have few close ties.
  • Education: Increases with education. College
    degree 1.8 times larger.
  • Sex (Female): No gender differences on network
    size.
  • Race: African Americans' networks are smaller
    (2.25) than White networks (3.1).
82
Local Network Analysis Network Structure volume
What does Fischer have to say about the size of
local nets (by context)?
83
Local Network Analysis Network Structure volume
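Density is the average value of the relation
among all pairs of nodes: $D = T / (N(N-1)/2)$,
where T is the number of ties and N is the number
of nodes. Density is usually calculated over
the alters in the network.
[Figure: respondent R with alters 1-5]
$D = 5 / ((5 \times 4)/2) = 5 / 10 = 0.5$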
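The same calculation as a short sketch. The particular ties among the five alters are hypothetical (the figure's ties are not recoverable from the transcript); only the counts, five alters and five ties, follow the slide.

```python
from itertools import combinations

# Hypothetical ego network: five alters and the undirected ties among them
# (the respondent is left out, as in the calculation above).
alter_ids = [1, 2, 3, 4, 5]
alter_ties = {(1, 2), (1, 3), (2, 3), (3, 4), (4, 5)}

possible_pairs = list(combinations(alter_ids, 2))   # N(N-1)/2 pairs
observed = sum(1 for pair in possible_pairs if pair in alter_ties)
density = observed / len(possible_pairs)
```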
84
Local Network Analysis Network Structure volume
What does Fischer have to say about the density
of local nets (by context)?
85
Local Network Analysis Network Structure volume
GSS Density
86
Local Network Analysis Network Structure volume
  • In general, dense networks should be more
    cohesive and we would expect that goods will
    flow through the network more efficiently
  • Social support and peer influence, for example,
    should be stronger in dense networks
  • Density is a volume measure, however, and can
    mask significant structural differences

These two networks have the same density but very
different structures. Most network analysis
programs will calculate ego-network density
directly.
87
Local Network Analysis Network Structure Weak
Ties Structural Holes
The Strength of Weak Ties: In a classic
article, Granovetter (1973) argues that for many
purposes (such as getting a job), the most useful
network contacts are through weak ties. This
is because weak ties connect you to a more
diverse set of alters, increasing the range of
your network. Your strong ties tend to be tied
to each other, making them redundant for the
purposes of bringing information. Essentially
this argument works on a spurious relation. The
key value of weak ties is not in the weak
affective bond, but in the structural location of
the ties. We can measure this directly, and Ron
Burt provides a series of measures for doing so.
88
Local Network Analysis Network Structure Weak
Ties Structural Holes
[Figure: Number of Non-Redundant Contacts plotted against Number of Contacts, with regions labeled Maximum Efficiency, Decreasing Efficiency, Increasing Efficiency, and Minimum Efficiency]
89
Local Network Analysis Network Structure Weak
Ties Structural Holes
Effective Size
Conceptually, the effective size is the number of
people ego is connected to, minus the redundancy
in the network; that is, it reduces to the
non-redundant elements of the network.
Effective size = Size - Redundancy:
$\text{Effective size}_i = \sum_j \left[ 1 - \sum_q p_{iq} m_{jq} \right], \quad q \neq i, j$
where j indexes all of the people that ego i has
contact with, and q is every third person other
than i or j. The quantity ($\sum_q p_{iq} m_{jq}$) inside the
brackets is the level of redundancy between ego
and a particular alter, j.
90
Local Network Analysis Network Structure Weak
Ties Structural Holes
Effective Size
$p_{iq}$ is the proportion of actor i's relations that
are spent with q.
[Figure: the five-node example network, nodes 1-5]
Adjacency   1   2   3   4   5
1           0   1   1   1   1
2           1   0   0   0   1
3           1   0   0   0   0
4           1   0   0   0   1
5           1   1   0   1   0
91
Local Network Analysis Network Structure Weak
Ties Structural Holes
Effective Size
$m_{jq}$ is the marginal strength of contact j's
relation with contact q, which is j's interaction
with q divided by j's strongest interaction with
anyone. For a binary network, the strongest link
is always 1, and thus $m_{jq}$ reduces to 0 or 1
(whether j is connected to q or not). The sum of
the product $p_{iq} m_{jq}$ measures the portion of i's
relation with j that is redundant to i's relation
with other primary contacts.
92
Local Network Analysis Network Structure Weak
Ties Structural Holes
Effective Size
Working with 1 as ego, we get the following
redundancy levels:
P       1     2     3     4     5
1      .00   .25   .25   .25   .25
2      .50   .00   .00   .00   .50
3      1.0   .00   .00   .00   .00
4      .50   .00   .00   .00   .50
5      .33   .33   .00   .33   .00
PM1jq   1     2     3     4     5
1      ---   ---   ---   ---   ---
2      ---   .00   .00   .00   .25
3      ---   .00   .00   .00   .00
4      ---   .00   .00   .00   .25
5      ---   .25   .00   .25   .00
Redundancy = 1; Effective size = 4 - 1 = 3
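The worked example for ego 1 can be checked with a short sketch that applies the effective-size definition directly to the example adjacency matrix. The function name `effective_size` is a hypothetical helper, and the code assumes a symmetric network so that every alter has at least one tie.

```python
# Binary adjacency matrix of the five-node example (node k is index k-1).
A = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 0],
]


def effective_size(adj, i):
    """Sum over alters j of 1 - sum_q p_iq * m_jq, with q not in {i, j}."""
    n = len(adj)
    row_total = sum(adj[i])
    total = 0.0
    for j in range(n):
        if j == i or adj[i][j] == 0:
            continue
        strongest = max(adj[j])  # j's strongest tie (1 in a binary network)
        redundancy = sum(
            (adj[i][q] / row_total) * (adj[j][q] / strongest)
            for q in range(n)
            if q != i and q != j
        )
        total += 1 - redundancy
    return total


ego1 = effective_size(A, 0)
```

For ego 1 the per-alter redundancies are .25, 0, .25 and .5, which sum to the redundancy of 1 reported above.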
93
Local Network Analysis Network Structure Weak
Ties Structural Holes
Effective Size
When you work it out, in a binary network,
redundancy reduces to the average degree of ego's
alters, not counting ties with ego. Since
the average degree is simply another way to say
density, we can calculate redundancy as 2t/n,
where t is the number of ties (not counting
ties to ego) and n is the number of people in the
network (not counting ego). Meaning that
effective size = n - 2t/n.
UCINET, STRUCTURE, SPAN and PAJEK all calculate
effective size.
94
Local Network Analysis Network Structure Weak
Ties Structural Holes
Efficiency is simply effective size divided by
observed size. Taken from each ego's point of
view, efficiency in this network would be:
Ego   Size   Effective Size   Efficiency
1      4         3               .75
2      2         1               .50
3      1         1              1.00
4      2         1               .50
5      3         1.67            .55
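A sketch of the efficiency calculation for every ego in the example network, using the binary-network shortcut effective size = n - 2t/n. The helper name `efficiency_row` is hypothetical, and the code assumes no isolates (n > 0).

```python
# Binary adjacency matrix of the five-node example (node k is index k-1).
A = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 0],
]


def efficiency_row(adj, i):
    """Return (size, effective size, efficiency) using n - 2t/n."""
    contacts = [j for j, tie in enumerate(adj[i]) if tie and j != i]
    n = len(contacts)
    # t = ties among ego's contacts, each undirected tie counted once
    t = sum(adj[a][b] for k, a in enumerate(contacts) for b in contacts[k + 1:])
    effective = n - 2 * t / n
    return n, effective, effective / n


rows = [efficiency_row(A, i) for i in range(len(A))]
```

Each entry of `rows` lines up with one row of the table (ego 1 first).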
95
Local Network Analysis Network Structure Weak
Ties Structural Holes
Constraint
Conceptually, constraint refers to how much room
you have to negotiate or exploit potential
structural holes in your network.
"...opportunities are constrained to the extent
that (a) another of your contacts q, in whom you
have invested a large portion of your network
time and energy, has (b) invested heavily in a
relationship with contact j." (p. 54)
96
Local Network Analysis Network Structure Weak
Ties Structural Holes
Constraint
$C_{ij}$ = Direct investment ($p_{ij}$) + Indirect
investment ($\sum_q p_{iq} p_{qj}$)
97
Local Network Analysis Network Structure Weak
Ties Structural Holes
Constraint
Given the P matrix, you can get indirect
constraint ($\sum_q p_{iq} p_{qj}$) by simply squaring the matrix:
PP      1      2      3      4      5
1      ...   .083   .000   .083   .250
2     .165    ...   .125   .290   .125
3     .000   .250    ...   .250   .250
4     .165   .290   .125    ...   .125
5     .330   .083   .083   .083    ...
P       1     2     3     4     5
1      .00   .25   .25   .25   .25
2      .50   .00   .00   .00   .50
3      1.0   .00   .00   .00   .00
4      .50   .00   .00   .00   .50
5      .33   .33   .00   .33   .00
98
Local Network Analysis Network Structure Weak
Ties Structural Holes
Constraint
Total constraint between any two people then is
$C = (P + P^2)^{2}$
where P is the normalized adjacency matrix, $P^2$ is
the matrix product PP, and the outer exponent
means to square the elements of the matrix.
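A minimal sketch of this calculation on the five-node example, in plain Python so the matrix product and the element-wise square are both explicit. Summing each row to get an ego-level total is an assumption about how the dyadic values are aggregated.

```python
# Binary adjacency matrix of the five-node example (node k is index k-1).
A = [
    [0, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 0, 1, 0],
]
n = len(A)

# P: row-normalized adjacency (share of i's ties invested in j).
P = [[A[i][j] / sum(A[i]) for j in range(n)] for i in range(n)]

# P2[i][j] = sum_q p_iq * p_qj (matrix product); q = i and q = j add
# nothing because the diagonal of P is zero.
P2 = [[sum(P[i][q] * P[q][j] for q in range(n)) for j in range(n)] for i in range(n)]

# Square each element of (P + P2); the diagonal is left at zero.
C = [[(P[i][j] + P2[i][j]) ** 2 if i != j else 0.0 for j in range(n)] for i in range(n)]

# Assumed aggregation: total constraint on each ego is its row sum.
total_constraint = [sum(row) for row in C]
```

For ego 1 this gives dyadic constraints of about .11, .06, .11 and .25, for a total of about .53.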
99
Local Network Analysis Network Structure Weak
Ties Structural Holes
Hierarchy
Conceptually, hierarchy (for Burt) is really the
extent to which constraint is concentrated in a
single actor. It is calculated as
Note: this measure says nothing about the
direction of ties; it's not about asymmetry.
100
Local Network Analysis Network Structure Weak
Ties Structural Holes
Hierarchy
Constraint on ego 1, by alter:
Alter             2     3     4     5    Total C
C                .11   .06   .11   .25    .53
C_1j / (C/N)     .83   .46   .83   1.9
H = .514
101
Local Network Analysis Network Structure Weak
Ties Structural Holes
Burt (2004) AJS 110: 349-399
102
Local Network Analysis Network Structure Weak
Ties Structural Holes
Burt (2004) AJS 110: 349-399
103
Local Network Analysis Network Structure Weak
Ties Structural Holes
Burt (2004) AJS 110: 349-399
104
Local Network Analysis Local Network Models
Modeling Issues
  • Local Network modeling issues
  • Case independence
  • In very clustered settings, the alters that each
    person names will overlap. This will lead to
    non-independence among the cases.
  • If you have enough cases or over time data, you
    can use random or fixed effect models
  • If you know the names of alters, you can link
    them to build in a direct network autocorrelation
    effect.
  • Small network effects
  • Be aware of the size of your networks.
    Substantively, having a 50% white network means
    something different in a net of size 2 vs. a net
    of size 10. I often suggest interactions to
    check for these kinds of effects.
  • Dealing with isolates
  • Isolated nodes have no network alters, so none of
    these measures apply. Depending on the context,
    you can either leave them out of the analysis, or
    use interaction terms to selectively apply the
    measures of interest.

105
Local Network Analysis Local Network Models
Modeling Issues
  • Selection
  • That some unobserved factor, z, creates both
    friendships and the outcome of interest.
  • Endogeneity
  • That the causal order of peer relations and
    outcomes is reversed. Peers do not cause Y, but
    Y causes friendship relations

106
Local Network Analysis Local Network Models
Modeling Issues
Selection
  • What do we know about how friendships form?
  • Opportunity / focal factors
  • - Being members of the same group
  • - In the same class
  • - On the same team
  • - Members of the same church
  • Structural Relationship factors
  • - Reciprocity
  • - Social Balance
  • Behavior Homophily
  • - Smoking
  • - Drinking

107
Local Network Analysis Local Network Models
Modeling Issues
Selection
How to correct this problem?
  • Essentially, this is an omitted variable problem,
    and the obvious solution has been to identify as
    many potentially relevant alternative variables
    as you can find.
  • Sensitivity measures (see Ken Frank's work here)
  • Propensity score matching
  • Individual-level fixed effect models
  • Substantively you only look at change in Y as a
    function of change in X, holding constant
    (because dummied out) any individual level
    effect.
  • This works, but it's drastic. Any endogenous
    effect of networks on the self is essentially
    removed.

108
Local Network Analysis Local Network Models
Modeling Issues
Endogeneity
Estimated: $Y = b_0 + b_1 P + e$, where P = some
peer function. But the actual model may really
be $P = b_0 + b_1 f(Y) + e$.
109
Local Network Analysis Local Network Models
Modeling Issues
Endogeneity
Does it matter?
Algebraically, the relation between y and p should
be a direct translation of the coefficients,
since
The statistical problem of endogeneity is that
when you estimate b1 in the reversed model, it does
not equal 1/b1 from the original model, because of
our assumptions about x, and hence e.
There are other models that make different
assumptions, where this direction is irrelevant.
But they are uncommon and hard to work with in
the multivariate context.
(see Joel H. Levine, Exceptions are the Rule, for
a full discussion of this)
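One way to see why the two estimates are not reciprocals is the simplest bivariate least-squares case. This is a sketch under that assumption (the slides do not say which estimator is meant); the subscripts $y \mid p$ and $p \mid y$ are notation introduced here for the two directions of the regression.

```latex
\hat{b}_{y \mid p} = \frac{\operatorname{cov}(y,p)}{\operatorname{var}(p)}, \qquad
\hat{b}_{p \mid y} = \frac{\operatorname{cov}(y,p)}{\operatorname{var}(y)}, \qquad
\hat{b}_{y \mid p}\,\hat{b}_{p \mid y}
  = \frac{\operatorname{cov}(y,p)^2}{\operatorname{var}(p)\,\operatorname{var}(y)} = r^2
```

The product of the two slopes is $r^2$, so one is the reciprocal of the other only when y and p are perfectly correlated; with any error in the relation, reversing the direction changes the estimate.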
110
Local Network Analysis Local Network Models
Modeling Issues
Possible solutions
  • Theory: Given what we know about how friendships
    form, is it reasonable to assume a bi-directional
    cause? That is, work through the meeting,
    socializing, etc. process and ask whether it
    makes sense that Y is a cause of P.
  • Models
  • Time Order. We are on somewhat firmer ground if
    P precedes Y in time.
  • - Simultaneous Equation Models. Model both the
    friendship pattern and the outcome of interest
    simultaneously. Difficult to identify
    instruments or to specify orders that do not
    logically make the model inestimable.

111
Local Network Analysis Local Network Models Peer
influence example
  • Haynie asks whether peers matter for delinquent
    behavior, focusing on
  • a) the distinction between selection and
    influence
  • b) the effect of friendship structure on peer
    influence
  • Two basic theories underlie her work
  • a) Hirschi's Social Control Theory
  • Social bonds constrain otherwise criminal
    behavior
  • The theory itself is largely ambivalent toward
    direction of network effects
  • b) Sutherland's Differential Association
  • Behavior is the result of internalized
    definitions of the situation
  • The effect of peers is through communication of
    the appropriateness of particular behaviors
  • Haynie adds to these the idea that the structural
    context of the network can boost the effect of
    peers (a) so transmission is more effective in
    locally dense networks and (b) the effect of
    peers is stronger on central actors.

112
Local Network Analysis Local Network Models Peer
influence example
113
Local Network Analysis Local Network Models Peer
influence example