1
Using the Social Network Data From Add Health
James Moody
Sunbelt Social Networks Conference, February 13, 2001,
New Orleans
2
  • Introduction: What and Why
  • Background to Add Health
  • Levels of Network Data
  • Composition and Pattern
  • Networks on both sides of the equation
  • Network Data Structures
  • Adjacency Matrices
  • Adjacency Lists
  • Network Data in Add Health
  • In-School Friendship Nominations
  • In-Home Friendship Nominations
  • Constructing Networks
  • Total Networks
  • Local Networks
  • Peer Groups

3
History of the National Longitudinal Study of
Adolescent Health, better known as Add Health:
a program project designed by J. Richard Udry
and Peter S. Bearman, and funded by grant
HD31921 from the National Institute of Child
Health and Human Development to the Carolina
Population Center, University of North Carolina
at Chapel Hill, with cooperative funding
participation by the following agencies: the
National Cancer Institute; the National Institute
on Alcohol Abuse and Alcoholism; the National
Institute on Deafness and Other Communication
Disorders; the National Institute on Drug Abuse;
the National Institute of General Medical
Sciences; the National Institute of Mental
Health; the Office of AIDS Research, NIH; the
Office of the Director, NIH; the National Center
for Health Statistics, Centers for Disease Control
and Prevention, HHS; the Office of Minority Health,
Centers for Disease Control and Prevention, HHS;
the Office of the Assistant Secretary for Planning
and Evaluation, HHS; and the National Science
Foundation.
4
  • Initially proposed as an adolescent version of
    the National Health and Social Life Study
    (Laumann et al.), known as the Teen Sex study.
  • Jesse Helms's crew decided that asking teens about
    sexual behavior was inappropriate, and the study
    had the dubious distinction of being the only
    study ever explicitly outlawed.
  • Fortunately, the same legislation stipulated that
    NIH fund a national health survey, and from the
    ashes of Teen Sex, Add Health was born.
  • Funded at $24M for the first 4 years, Add Health
    was designed to provide a comprehensive image of
    the state of adolescent health and the behaviors
    that affect adolescent health.

5
The Add Health Design: Adolescents in Social
Context
[Diagram. Labels: Individual (Attributes, Attitudes,
Behavior, Capacities, Health Status); Family; Dyadic
Relations; Peers and Networks; School Context;
Community/Neighborhood (School, Neighborhood, Community
Characteristics, Health Service). Data sources shown:
In-School, In-Home, Parent In-Home (Parenting, Family
Data, Relations Between Family and Adolescent),
Saturated Sample, Contextual Database.]
6
Substantive Domains Covered in the Add Health
Design
Family
  • Detailed Household Roster
  • Family Structure
  • Parental Interview
  • Sibling relations
  • Parental behaviors
  • Multiple observations in the same family
  • Parents' knowledge of adolescent's activities
    and friends
  • Adolescent assessment of parents' expectations and
    rule behavior
  • Twin Design
Relations, Peers & Nets
  • Population sample in schools provides complete
    network images
  • Constructed network data
  • Friendship nomination files
  • Romantic relation characteristics
  • Real and Ideal
  • Relationship timing and duration
  • Information from both sides of the relation in
    many cases
  • Peer assessment of peer activity, not just
    respondent assessment
Community
  • GIS links for spatial analysis
  • Contextual data at the Block Group, City, County
    and State level
  • Topics include: Population, Vital Statistics,
    Group Quarters, Households, Income, Poverty,
    Education, Labor Force, Housing, At Risk Children,
    Health Care, STD Levels, Crime, Religion,
    Elections, Social Welfare
Individual
  • Demographics
  • Detailed, multiple race/ethnic categories
  • Immigrant status
  • Socio-Economic Status
  • Health Status
  • Nutrition
  • STD & Sexual Behavior
  • Exposure
  • Emotional
  • Physical
  • Insurance/Access
  • Daily Activities
  • Exercise
  • TV/Hobbies
  • Academic Exposure
  • Subjects taught
  • Sexual knowledge
  • Future Expectations
  • Risk taking activity

7
Sampling Structure for Add Health
School Sampling Frame: QED
Sampling Frame of Adolescents and Parents: N =
100,000 (100 to 4,000 per pair of schools)
Ethnic Samples
High Educ Black
Disabled Sample
Saturation Samples from 16 Schools
Genetic Samples
Cuban
8
(No Transcript)
9
The National Longitudinal Study of Adolescent
Health: Demographic Sub-Sample Sizes
Core Sample: 12,104

Race        Total    Male  Female
White       8,467   4,075   4,392
Black       2,384   1,092   1,292
Hispanic    1,456     708     748

Sub-sample sizes by race, sex, household type (Two = two parents, One = one parent), and grade:

           White Male  White Female    Black Male  Black Female    Hisp. Male  Hisp. Female
Parents    Two    One    Two    One    Two    One    Two    One    Two    One    Two    One
Total     3150    760   3325    842    496    466    569    593    477    184    505    189
7th        484    108    552    126     92     75     88     98     62     28     76     26
8th        522    115    526    137     93     93     93    102     82     18     78     33
9th        543    135    574    156     73     84     95     97     79     34     78     38
10th       536    141    540    151     80     80     99     99     94     37     92     23
11th       581    132    551    125     79     64     91     96     87     29     89     31
12th       445    108    524    117     67     50     87     84     58     30     80     30
10
Deductive Disclosure Risks
Start with 536 White, Male, 10th Graders in
Two-parent Households
Who are Jewish: 10
And have no siblings: 1
Start with 484 White, Male, 7th Graders in
Two-parent Households
Who have ever been held back a grade in
school: 87
And play basketball: 5
And smoke: 1
11
Deductive Disclosure Risks
Start with 87 Black, Female, 12th Graders in
Two-parent Households
Who have never been held back: 77
And smoke regularly: 5
And have 2 siblings: 1
And are Catholic: 1
12
Deductive Disclosure Risks
Start with 98 Black, Female, 7th Graders in
One-parent Households
Who are Baptist: 41
  And have no siblings: 9
    And play basketball: 1
  And have one sibling: 13
    And smoke: 1
  And have more than one sibling: 19
    And were born in April: 1
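The narrowing shown on these slides can be sketched in a few lines of Python. This is only an illustration: the records, field names, and the `narrow` helper are hypothetical toy values, not Add Health data or variable names. It shows how each added attribute shrinks the matching cell until a single person remains.

```python
# Toy records (hypothetical, NOT Add Health data): one dict per respondent.
records = [
    {"religion": "Baptist", "siblings": 0, "basketball": True},
    {"religion": "Baptist", "siblings": 0, "basketball": False},
    {"religion": "Baptist", "siblings": 1, "basketball": False},
    {"religion": "Catholic", "siblings": 0, "basketball": True},
    {"religion": "Baptist", "siblings": 2, "basketball": True},
]


def narrow(rows, filters):
    """Apply filters one at a time; return the cell size after each step."""
    sizes = [len(rows)]
    for key, value in filters:
        rows = [r for r in rows if r[key] == value]
        sizes.append(len(rows))
    return sizes


# Each extra attribute makes the remaining group smaller.
sizes = narrow(
    records,
    [("religion", "Baptist"), ("siblings", 0), ("basketball", True)],
)
print(sizes)
```

A final cell size of 1 means the combination of attributes singles out one respondent, which is the disclosure risk the slides describe.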
13
Levels of Network Data
[Diagram: ego shown at three levels of network data:
Best Friends, Local Network, and Peer Group]
14
(No Transcript)
15
(No Transcript)
16
Measuring Network Context: Patterns
  • Pattern measures capture some feature of the
    distribution of relations across nodes in the
    network. These include:
  • Density: proportion of all possible ties actually
    made
  • Reciprocity: likelihood that, given a tie from i
    to j, there will also be a tie from j to i
  • Transitivity: extent to which friends of friends
    are also friends
  • Hierarchy: Is there a status order to
    nominations? How is it patterned?
  • Clustering: Are there significant groups? How
    so?
  • Segregation: Do attributes (such as race) and
    nominations correspond?
  • Distance: How many steps separate the average
    pair of persons in the school? Is this larger or
    smaller than expected?
  • Block models: What is the implied role structure
    underlying patterns of relations?
  • These features (usually) require having
    nomination data from each person in the network.
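
As a rough illustration of the first three pattern measures, the sketch below computes density, reciprocity, and transitivity from a small directed adjacency matrix. The function names and the toy matrix are assumptions made for this example, and the transitivity definition used (share of two-step paths i to j to k that are closed by a direct tie from i to k) is one common choice, not necessarily the one used in the presentation.

```python
def density(adj):
    """Share of all possible directed ties (excluding self-ties) that exist."""
    n = len(adj)
    ties = sum(adj[i][j] for i in range(n) for j in range(n) if i != j)
    return ties / (n * (n - 1))


def reciprocity(adj):
    """Share of ties i->j for which the tie j->i also exists."""
    n = len(adj)
    sent = [(i, j) for i in range(n) for j in range(n) if i != j and adj[i][j]]
    if not sent:
        return 0.0
    return sum(1 for i, j in sent if adj[j][i]) / len(sent)


def transitivity(adj):
    """Share of two-paths i->j->k (all distinct) closed by a tie i->k."""
    n = len(adj)
    two_paths = closed = 0
    for i in range(n):
        for j in range(n):
            if i == j or not adj[i][j]:
                continue
            for k in range(n):
                if k in (i, j) or not adj[j][k]:
                    continue
                two_paths += 1
                closed += adj[i][k]
    return closed / two_paths if two_paths else 0.0


# Toy 4-person network: row i, column j is 1 if i nominates j.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(density(adj), reciprocity(adj), transitivity(adj))
```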

17
Measuring Network Context: Composition
  • Composition measures capture characteristics of
    the population of people within a given network
    level. These include:
  • Heterogeneity: How dispersed are actors with
    respect to a given attribute?
  • Means: What is the mean GPA of ego's friends? How
    likely is it that most of ego's friends will go
    to college?
  • Dispersion: What is the age range of people ego
    hangs out with?
  • These features can often be measured from the
    simple ego network.
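
A minimal sketch of composition measures on a simple ego network follows. The IDs, attribute names, and values are invented for illustration. Heterogeneity is computed here as 1 minus the sum of squared category shares, which is one common index and an assumption on our part; the slides do not say which index was used.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy data: ego's nominations and attributes of the nominees.
nominations = {"ego": ["a", "b", "c"]}
attrs = {
    "a": {"gpa": 3.0, "age": 15, "race": "White"},
    "b": {"gpa": 3.5, "age": 16, "race": "Black"},
    "c": {"gpa": 2.5, "age": 17, "race": "White"},
}


def friend_values(ego, key):
    """Attribute values for the friends ego named (skips unmatched friends)."""
    return [attrs[f][key] for f in nominations[ego] if f in attrs]


def friend_mean(ego, key):
    vals = friend_values(ego, key)
    return mean(vals) if vals else None


def friend_range(ego, key):
    vals = friend_values(ego, key)
    return max(vals) - min(vals) if vals else None


def friend_heterogeneity(ego, key):
    """1 - sum of squared category shares (0 means all friends are alike)."""
    vals = friend_values(ego, key)
    if not vals:
        return None
    counts = Counter(vals)
    return 1 - sum((c / len(vals)) ** 2 for c in counts.values())


print(friend_mean("ego", "gpa"))
print(friend_range("ego", "age"))
print(friend_heterogeneity("ego", "race"))
```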

18
Analysis with Social Network Data
  • Networks as Dependent Variables
  • Interest is in explaining the observed patterns
    of relations.
  • Examples:
  • Why are some schools segregated and others not?
  • What accounts for differences in hierarchy across
    schools?
  • What accounts for homophily in friendship choice?
  • Tools:
  • Descriptive tools to capture properties
  • Standard analysis tools at the level of networks
    to explain the measures
  • p* and other specialized network statistical and
    simulation models

19
Analysis with Social Network Data
  • Networks as Independent Variables
  • Interest is in explaining behavior with network
    context (peer influence / context models)
  • Examples:
  • Is ego's probability of smoking related to the
    smoking levels of those he/she hangs out with?
    (compositional context)
  • Is the transition to first intercourse affected
    by the peer context?
  • Are isolated students more likely to carry
    weapons to school than those in dense peer
    groups? (positional context)
  • Tools:
  • Depends on the dependent variable
  • Peer influence models
  • Dyad models
  • Contextual models, with network level as nested
    context (students within peer groups)

20
Network Data Structures
[Diagram: one graph shown alongside its adjacency
matrix, arc list, and node list]
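To make the three representations concrete, here is a small sketch that stores one toy directed graph as an arc list and converts it to an adjacency matrix and a node list. The node names and helper functions are hypothetical, and reading "node list" as each node followed by the nodes it sends ties to is an assumption about the slide's labels.

```python
# Toy directed graph (hypothetical): an arc list of (sender, receiver) pairs.
nodes = ["a", "b", "c", "d"]
arcs = [("a", "b"), ("a", "c"), ("b", "a"), ("c", "d")]


def arcs_to_matrix(nodes, arcs):
    """Adjacency matrix: entry [i][j] is 1 if node i sends a tie to node j."""
    index = {name: i for i, name in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for sender, receiver in arcs:
        matrix[index[sender]][index[receiver]] = 1
    return matrix


def arcs_to_node_list(nodes, arcs):
    """Node list: each node mapped to the nodes it sends ties to."""
    out = {name: [] for name in nodes}
    for sender, receiver in arcs:
        out[sender].append(receiver)
    return out


matrix = arcs_to_matrix(nodes, arcs)
node_list = arcs_to_node_list(nodes, arcs)
for name, row in zip(nodes, matrix):
    print(name, row)
print(node_list)
```

The matrix needs n × n cells regardless of how many ties exist, while the arc list and node list grow only with the number of ties, which is why list formats are usually preferred for large, sparse networks.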
21
Network Analysis Programs
  • 1) UCINET
  • General network analysis program, runs in Windows
  • Good for computing measures of network topography
    for single nets
  • Input-output of data is a little clunky, but
    workable
  • Not optimal for large networks
  • Available from:
  • Analytic Technologies
  • Borgatti_at_mediaone.net
  • 2) STRUCTURE
  • A General Purpose Network Analysis Program
    providing Sociometric Indices, Cliques,
    Structural and Role Equivalence, Density Tables,
    Contagion, Autonomy, Power and Equilibria in
    Multiple Network Systems
  • DOS interface with somewhat awkward syntax
  • Great for role and structural equivalence models
  • Manual is a very nice, substantive introduction
    to network methods
  • Available from a link at the INSNA web site:
  • http://www.heinz.cmu.edu/project/INSNA/soft_inf.html

22
Network Analysis Programs
  • 3) NEGOPY
  • Program designed to identify cohesive sub-groups
    in a network, based on the relative density of
    ties
  • DOS-based program; data must be in arc-list
    format
  • Moving the results back into an analysis program
    is difficult
  • Available from:
  • William D. Richards
  • http://www.sfu.ca/richards/Pages/negopy.htm
  • 4) PAJEK
  • Program for analyzing and plotting very large
    networks
  • Intuitive Windows interface
  • Used for all of the real data plots in this
    presentation
  • Mainly a graphics program, but is expanding its
    analytic capabilities
  • Free
  • Available from

23
Network Analysis Programs
  • 5) Cyram NetMiner for Windows: a new
    exploratory tool for networks
  • 6) SPAN: SAS Programs for Analyzing Networks
    (Moody, ongoing)
  • A collection of IML and Macro programs that
    allow one to:
  • a) create network data structures from the Add
    Health nominations
  • b) import/export data to/from the other network
    programs
  • c) calculate measures of network pattern and
    composition
  • d) analyze network models
  • Allows one to work with multiple, large networks
  • Easy to move from creating measures to analyzing
    data

24
Network Data Collected in Add Health
In-School Network Data
  • Complete network data collected in every school
  • Each student was asked to name up to 5 male and 5
    female friends
  • These data provide the basic information needed
    to construct network context measures.
  • Due to response rates, we computed data on 129 of
    the 144 total schools.
  • Variables are named MF<n>AID for male friends and
    FF<n>AID for female friends (n = 1 to 5).

25
[Slide showing the survey instrument]
26
Network Data Collected in Add Health
In-School Network Data
  • Nomination Categories
  • Matchable people inside ego's school or sister
    school
  • People who were present that day: ID starts
    with 9 and is in the sample
  • People who were absent that day: ID starts
    with 9, but is not in the school sample
  • People in ego's school, but not on the directory:
    nomination appears as 99999999
  • People in ego's sister school, but not on the
    directory: nomination appears as 88888888
  • People not in ego's school or the sister school:
    nomination appears as 77777777
  • Other special codes: nomination appears as
    99959995
  • Nominator Categories
  • Matchable nominator: person who was on the
    roster; ID starts with 9
  • Unmatchable nominator
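
The nomination categories above can be sketched as a small classifier. This is an illustration only: the function name, the returned labels, and the example IDs are hypothetical, and only the four special codes come from the slide. Special codes are checked first because two of them also begin with 9.

```python
# Special nomination codes from the slide; the labels are paraphrases.
SPECIAL_CODES = {
    "99999999": "in ego's school, not on directory",
    "88888888": "in sister school, not on directory",
    "77777777": "not in ego's school or sister school",
    "99959995": "other special code",
}


def classify_nomination(code, sampled_ids):
    """Assign one nomination ID to a category (labels are illustrative)."""
    code = str(code)
    # Check special codes first: 99999999 and 99959995 also start with 9.
    if code in SPECIAL_CODES:
        return SPECIAL_CODES[code]
    if code.startswith("9"):
        if code in sampled_ids:
            return "matchable, present (in sample)"
        return "matchable, absent (not in sample)"
    return "unrecognized"


# Hypothetical IDs, used only to exercise the function.
sampled_ids = {"90000001", "90000002"}
for code in ["90000001", "90000009", "99999999", "77777777", "12345678"]:
    print(code, "->", classify_nomination(code, sampled_ids))
```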

27
Network Data Collected in Add Health
In-School Network Data
Example 1. Ego is a matchable person in the
school
[Diagram: True Network compared with Observed Network.
Nodes are labeled Ego, M, Un, and Out.]
28
Network Data Collected in Add Health
In-School Network Data
Example 2. Ego is not on the school roster
[Diagram: True Network compared with Observed Network.
Nodes are labeled M and Un.]
29
Network Data Collected in Add Health
In-School Network Data
30
Network Data Collected in Add Health
In-School Network Data
31
Network Data Collected in Add Health
In-Home Network Data
  • Network data were collected in both the Wave 1 and
    Wave 2 surveys
  • There were two procedures
  • Saturated Settings
  • Attempted to survey every student from the
    In-School sample.
  • 2 large schools, and 10 small schools.
  • Was supposed to replicate the in-school design
    exactly.
  • Unsaturated Settings
  • Each person was only asked to name one other
    person
  • In both cases, the design was not always carried
    out. As such, some of the students in the
    saturated settings were allowed to name only one
    male and one female friend, while some students
    who were in the non-saturated settings were asked
    to nominate a full slate of 5 and 5.

32
Network Data Collected in Add Health
In-Home Network Data
  • Data Usage Notes
  • Romantic Relation Overlap
  • For the W1 and W2 friendship data, any friendship
    that was also a romantic relation was recoded to
    55555555, to protect the romantic relation
    nominations.
  • Bad Machine on Wave 2 Data
  • Data from one school in Wave 2 seem to be
    corrupted. We have no way to show this for
    certain, but it seems that machines 200065 or
    200106 gave incorrect data. We suspect this
    because almost everyone who used these two
    machines nominated the same person multiple
    times. This results in one person having an
    abnormally large in-degree.
  • All nominations are now valid
  • Unlike the in-school data, IDs starting with
    something other than 9 can be nominated.
  • Same out-of-sample special codes
  • All other special codes for these data are the
    same as in the in-school data.
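
One way to screen for the repeated-nomination pattern described above is sketched below. The respondent and nominee IDs are hypothetical, and the choice to count each nominator only once toward in-degree is our assumption about a reasonable cleaning rule, not a documented Add Health procedure. Special codes are skipped because they can legitimately repeat.

```python
from collections import Counter

# Special codes mentioned in the slides; these may legitimately repeat.
SPECIAL_CODES = {"55555555", "77777777", "88888888", "99999999", "99959995"}


def repeated_nominations(nominations):
    """Return {respondent: {nominee: count}} for nominees named more than once."""
    flagged = {}
    for ego, alters in nominations.items():
        counts = Counter(a for a in alters if a not in SPECIAL_CODES)
        repeats = {alter: n for alter, n in counts.items() if n > 1}
        if repeats:
            flagged[ego] = repeats
    return flagged


def in_degree(nominations):
    """Count distinct nominators per nominee; repeats by one person count once."""
    degree = Counter()
    for ego, alters in nominations.items():
        for alter in set(alters) - SPECIAL_CODES:
            degree[alter] += 1
    return degree


# Hypothetical toy nominations.
nominations = {
    "r1": ["p1", "p1", "p1", "77777777", "77777777"],
    "r2": ["p1", "p2"],
    "r3": ["55555555", "p2"],
}
flagged = repeated_nominations(nominations)
degree = in_degree(nominations)
print(flagged)
print(dict(degree))
```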

33
Network Data Collected in Add Health
In-Home Network Data
Descriptive Statistics for Saturated Settings
34
Constructing Network Measures
Total Network
To construct the social network from the
nomination data, we need to integrate each
person's nominations with every other nomination.
Methods:
1) Export the nomination data to construct the
network in another program. MOST of the other
programs require you to pre-process the data a
great deal before they can read them. As such, it
is usually easier to create the files in SAS
first, then bring them into UCINET or some such
program.
2) Construct the network in SAS. The best way to
do this is to combine IML and the MACRO language.
SAS IML lets you work with matrices in a (fairly)
straightforward language; the SAS MACRO language
makes it easy to work with all of the schools at
once. Programs already set up to do this are
available in SPAN.
35
Constructing Network Measures
Adjacency Matrices
The key to analyzing / measuring the total
network is constructing either an adjacency
matrix or an adjacency list. These data
structures allow you to directly identify both
the people ego nominates and the people that
nominate ego. Thus, the first step in any
network analysis will be to construct the
adjacency matrix.
To do this you need to:
1) Identify the universe of possible people in
the network. This is usually the same as the set
of people that you have sampled. However, if you
want to include ties to non-sampled people, you
may make the universe include all people named by
anyone.
2) Create a blank matrix with n rows and n
columns.
3) Loop over all respondents, placing a value in
the column that corresponds to the persons they
nominate. This can be binary (named or not) or
valued (number of activities they do with alter).
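The three steps above can be sketched in Python as follows. This is a minimal illustration with a binary matrix, not the SPAN/SAS implementation; the function name, the `include_nonsampled` flag, and the toy IDs are assumptions introduced for the example.

```python
def build_adjacency(nominations, include_nonsampled=False):
    """Build a binary adjacency matrix from {respondent: [nominees]}."""
    # Step 1: identify the universe (sampled respondents by default).
    universe = list(nominations)
    if include_nonsampled:
        seen = set(universe)
        for alters in nominations.values():
            for alter in alters:
                if alter not in seen:
                    seen.add(alter)
                    universe.append(alter)
    index = {pid: i for i, pid in enumerate(universe)}

    # Step 2: create a blank n-by-n matrix.
    n = len(universe)
    adj = [[0] * n for _ in range(n)]

    # Step 3: loop over respondents and mark each person they nominate.
    for ego, alters in nominations.items():
        for alter in alters:
            if alter in index and alter != ego:
                adj[index[ego]][index[alter]] = 1
    return universe, adj


# Hypothetical toy nominations; "x" was named but not sampled.
nominations = {"a": ["b", "c"], "b": ["a"], "c": ["x"]}
universe, adj = build_adjacency(nominations)
print(universe, adj)

# Row i lists whom i nominates; column j lists who nominates j.
nominators_of_a = [universe[i] for i in range(len(universe)) if adj[i][0]]
print(nominators_of_a)
```

Reading across a row gives ego's sent nominations and reading down a column gives received nominations, which is why the matrix makes both directions directly available.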
36
Constructing Network Measures
Local Networks.
  • To create and calculate measures based only on
    the people ego nominates, you can work directly
    from the nomination list (you don't need to
    construct the adjacency matrix).
  • To create and calculate measures based on the
    received or reciprocated ties, you need to have a
    list of people who nominate ego, which is easiest
    to get given the adjacency matrix.
  • To calculate positional measures (density,
    reciprocity, etc.) all you need is the nomination
    data.
  • To calculate compositional data, you need both
    the nomination data and matching attribute data.
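
As a small illustration of the sent/received distinction, the sketch below finds ego's sent, received, and reciprocated ties by scanning a nomination list directly. The function name and toy IDs are hypothetical. Scanning every respondent's list for each ego is simple but slow for many egos, which is one reason the adjacency matrix is the easier route for received ties.

```python
def local_ties(nominations, ego):
    """Return ego's sent, received, and reciprocated ties as sets."""
    sent = set(nominations.get(ego, []))
    # Received ties require checking everyone else's nomination list.
    received = {
        other
        for other, alters in nominations.items()
        if other != ego and ego in alters
    }
    return {
        "sent": sent,
        "received": received,
        "reciprocated": sent & received,
    }


# Hypothetical toy nominations.
nominations = {
    "ego": ["a", "b"],
    "a": ["ego"],
    "b": ["c"],
    "c": ["ego"],
}
ties = local_ties(nominations, "ego")
print(ties)
```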

37
Constructing Network Measures
Peer Groups.
Identifying cohesive peer groups requires first
specifying what a cohesive peer group is.
Potential definitions could be:
a) all people within k steps of ego (extended
ego-network)
b) a set of people who interact with each other
often (relative density)
c) a set of people with a particular pattern of
ties (a closed loop, for example)
UCINET, STRUCTURE, NEGOPY and SPAN all provide
methods for identifying cohesive groups. They all
differ on the underlying definition of what
constitutes a group. The FACTIONS algorithm in
UCINET and NEGOPY's algorithm use relative
density. The CROWD algorithm in SPAN uses a
combination of relative density and pattern.
Once you have constructed the adjacency matrix,
you can export to these other programs fairly
easily. However, most of them are QUITE time
consuming to run (FACTIONS, for example, is a
bear), so be sure you have identified exactly
what you want before you start processing.
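Definition (a), all people within k steps of ego, can be sketched with a breadth-first search. This is only an illustration of that one definition, not the FACTIONS, NEGOPY, or CROWD algorithms. Treating nominations as undirected ties is an assumption made here, and the function name and toy IDs are hypothetical.

```python
from collections import deque


def within_k_steps(nominations, ego, k):
    """Return {person: distance} for everyone within k steps of ego."""
    # Assumption: a nomination in either direction counts as a tie.
    neighbors = {}
    for sender, alters in nominations.items():
        for receiver in alters:
            neighbors.setdefault(sender, set()).add(receiver)
            neighbors.setdefault(receiver, set()).add(sender)

    dist = {ego: 0}
    queue = deque([ego])
    while queue:
        current = queue.popleft()
        if dist[current] == k:
            continue  # do not expand beyond k steps
        for nxt in neighbors.get(current, ()):
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    del dist[ego]
    return dist


# Hypothetical toy nominations: a chain ego-a-b-c plus d, who names ego.
nominations = {"ego": ["a"], "a": ["b"], "b": ["c"], "d": ["ego"]}
group = within_k_steps(nominations, "ego", 2)
print(group)
```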
38
Constructing Network Measures
Peer Group Characteristics.
Identifying Cohesive Sub-Groups
  • Cohesion: The group is difficult to separate; the
    connection of the group does not depend on one
    relation or person.
  • Groupness: Relative to the rest of the network,
    a cohesive sub-group has high relational
    volume.
  • Inclusion: Some people are not in groups, while
    others bridge groups.

39
Examples of Peer Groups within Add Health High
Schools: Crowds Algorithm
40
Observed Clustering within Adolescent Social
Networks
Network Characteristics of Sub-Groups
  • On average, 65% of a school's adolescents are in
    cohesive sub-groups.
  • 87% of all relations are within sub-groups.
  • The average sub-group has 22 members.
  • The average diameter for a sub-group is 3 steps.
  • The mean segregation index is .96 (1 = Complete,
    0 = Random)

41
Observed Clustering within Adolescent Social
Networks
Distribution of Characteristic within groups,
relative to school distribution
42
Constructing Network Data: School Level
43
Constructing Network Data: School Level
Inter-Group Relations
44
Analysis Using Network Data: Nets as Dependent
Variable: Racial Segregation
[Figure: Same-race friendship preference by racial
heterogeneity. Y-axis: Same Race Friendship Preference
(b1); x-axis: Racial Heterogeneity. One labeled point:
Countryside h.s.]
45
Analysis Using Network Data: Nets as Dependent
Variable: Modeling the Network
Network Model Coefficients, In-School Networks
46
Analysis Using Network Data: Nets as Independent
Variable: Suicide
Relational Structures and Forms of Suicide

                      Regulation
                      Low         High
Integration   High    Anomic      Altruistic
              Low     Egoistic    Fatalistic
47
Analysis Using Network Data: Nets as Independent
Variable: Suicide
Measuring Isolation and Anomie.
48
Analysis Using Network Data: Nets as Independent
Variable: Suicide
49
Analysis Using Network Data: Nets as Independent
Variable: Weapons
[Figure: Probability of Carrying a Weapon by Race and
Gender. Y-axis: probability of carrying a weapon (0 to
0.14); x-axis: Race/Ethnicity (White, Black, Hispanic,
Asian, Native American, Other); separate series for
Males and Females.]
a) Figure represents predicted probabilities from
model 6 of table 5, holding all other variables
at the full sample mean.
50
Analysis Using Network Data: Nets as Independent
Variable: Weapons
[Figure: Network Effects on Weapon Carrying. Y-axis:
probability of carrying a weapon to school (0 to 0.18);
x-axis: character of peer context, from Positive to
Negative; labeled series: Peer Group Deviance, Social
Outsiders, School Oriented Peer Group.]
51
Analysis Using Network Data: Nets as Independent
Variable: Sexual Debut
52
Analysis Using Network Data: Nets as Independent
Variable: Pregnancy
53
Wave III Respondents
  • Wave II participants
  • Main sample plus special samples
  • Aged 18-25
  • Partners of original participants
  • 2,000 couples

54

How is what happens in adolescence
related to what happens in young
adulthood?

The influence of adolescent contexts
on young adult outcomes
55
Additional Content of Wave III
  • AHPVT
  • Social security number
  • Longitude and latitude
  • College context
  • Physical measurements
  • Biomarkers
  • Network Transitions
56
Special Features
  • CASI event history calendar
  • Preloaded data from Waves I and II
  • Re-interviews with STI-positive individuals
  • Binge-drinking sample
  • High school transcript data
57
Wave III Questionnaire Content
  • Family relationships
  • Relationships
  • Friends
  • Pregnancies and births
  • Education
  • Delinquency and violence
  • Work experience
  • Involvement with criminal justice system
  • General health
  • Tobacco, alcohol, drugs, suicide
  • Mental health
  • Mentoring
  • Illnesses, disabilities
  • Civic participation
  • Marriage/cohabitation
  • Religion and spirituality
  • Sexual experiences and STDs
  • Gambling