1
The statistical analysis of personal network data
  • Part I Cross-sectional analysis
  • Part II Dynamic analysis

2
A word about quantitative and qualitative
approaches
  • Quantitative and qualitative approaches play
    complementary roles in personal network analysis
  • A qualitative pilot study can help to identify
    important predictors; qualitative analyses can
    also provide insight into the sources of error
    and temporal instability
  • Quantitative analyses are crucial for determining
    the statistical effect of characteristics:
    individuals do not know how, for example, their
    own constant characteristics influence their
    network.

3
In summary, types of information collected with
Egonet
  • Information about the respondent (ego), e.g., age,
    sex, nationality
  • Information about the associates (alters) to whom
    ego is connected (e.g., age, sex, nationality)
  • Information about the ego-alter pairs (e.g.,
    closeness, frequency and/or means of contact,
    time of knowing, geographic distance, whether
    they discuss a certain topic, type of relation
    such as family, friend, neighbour, workmate)
  • Information about the relations among alters as
    perceived by ego (simply whether they are related
    or not, or strong/weak/no relation)

4
The statistical analysis of personal versus
sociocentric networks: what are the differences?
  • Whereas sociocentric network researchers often
    (yet not always) concentrate on a single network,
    personal network researchers typically
    investigate a sample of networks.
  • The dependency structure of sociocentric networks
    is complex and therefore requires specialized
    social network software. Personal network
    researchers, who often make little use of the
    data on alter-alter relations, deal with a
    simpler dependency structure...

5
Personal network data have a multilevel
structure
  • E.g., with a sample of 20 respondents and data
    collected on 45 alters per respondent, we have
    900 dyads in total

[Figure: diagram with the labels "ego" and "alter"]
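
A minimal sketch of this two-level layout, assuming a long-format table with one record per ego-alter dyad. The field names ego_id and alter_id are hypothetical; only the counts (20 egos, 45 alters each) come from the slide.

```python
# Hypothetical long-format layout: one record per ego-alter dyad.
N_EGOS = 20     # respondents (level 2), as on the slide
N_ALTERS = 45   # alters named by each respondent (level 1)

dyads = [
    {"ego_id": ego, "alter_id": alter}
    for ego in range(1, N_EGOS + 1)
    for alter in range(1, N_ALTERS + 1)
]

print(len(dyads))  # 900
```

Every dyad record carries the identifier of the ego it is nested in, which is what a multilevel model needs in order to separate the two levels.
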
6
Three types of analysis have been used in past
research
  • Type I: Aggregated analysis
  • Type II: Disaggregated analysis
    (not okay, forget about it quickly!)
  • Type III: Multilevel analysis

7
Type I: Aggregated analysis
  • First, aggregate all information to the
    ego level
  • Compositional variables (aggregated
    characteristics of alters or ego-alter
    relations), e.g., percentage of women, average
    age of the alters, average time of knowing,
    average closeness
  • Structural variables (aggregated characteristics
    of alter-alter relations), e.g., network size,
    density of the network, betweenness, number of
    isolates, cliques
  • Then use standard statistical procedures to, e.g.:
  • Describe the network composition or structure, or
    compare them across populations
  • Explain the networks (network as a dependent
    variable)
  • Relate the networks to some variable of interest
    (network as an explanatory variable)
  • This is statistically correct provided that you
    are aware of your level of analysis
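
The aggregation step can be sketched as follows. This is an illustration under assumed inputs, not the Egonet export format: the record layout, the toy values, and the function name aggregate are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical dyad records: (ego_id, alter_sex, closeness on a 1-5 scale)
dyads = [
    (1, "f", 4), (1, "m", 2), (1, "f", 5),
    (2, "m", 3), (2, "m", 1),
]
# Hypothetical alter-alter ties per ego, as undirected pairs of alter indices
alter_ties = {1: {(0, 1), (1, 2)}, 2: set()}


def aggregate(dyads, alter_ties):
    """Return one record of compositional and structural variables per ego."""
    by_ego = defaultdict(list)
    for ego_id, sex, closeness in dyads:
        by_ego[ego_id].append((sex, closeness))

    result = {}
    for ego_id, alters in by_ego.items():
        n = len(alters)
        possible = n * (n - 1) / 2  # number of alter-alter pairs
        n_ties = len(alter_ties.get(ego_id, ()))
        result[ego_id] = {
            "size": n,
            "pct_women": 100 * sum(s == "f" for s, _ in alters) / n,
            "mean_closeness": mean(c for _, c in alters),
            "density": n_ties / possible if possible else 0.0,
        }
    return result


ego_level = aggregate(dyads, alter_ties)
```

The result has one row per ego, so standard procedures (t-tests, regression, cluster analysis) can then be applied at the network level.
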

8
Example: An effect at the network level cannot be
interpreted at the tie level
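
To make the point concrete, here is a small constructed example (the numbers are invented for illustration and are not the data shown on the slide). Across networks, average contact frequency and average closeness rise together, yet within every network the more frequently contacted alters are the less close ones.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical data: per network, (contact frequency, closeness) for each tie
networks = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(3, 5), (4, 4), (5, 3)],
    "C": [(5, 7), (6, 6), (7, 5)],
}

# Network level: correlate the network means
mean_x = [mean(x for x, _ in ties) for ties in networks.values()]
mean_y = [mean(y for _, y in ties) for ties in networks.values()]
r_between = pearson(mean_x, mean_y)

# Tie level: center each tie on its own network mean, then pool
dx, dy = [], []
for ties in networks.values():
    mx = mean(x for x, _ in ties)
    my = mean(y for _, y in ties)
    dx += [x - mx for x, _ in ties]
    dy += [y - my for _, y in ties]
r_within = pearson(dx, dy)

print(r_between, r_within)  # 1.0 -1.0
```
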
10
Type II: Disaggregated analysis
  • Disaggregated analysis of dyadic relations (e.g.,
    running a linear regression analysis on the 900
    alters) is not statistically correct, even though
    it has been done (e.g., Wellman et al., 1997;
    Suitor et al., 1997)
  • Observations of alters are not statistically
    independent as is assumed by standard statistical
    procedures
  • Standard errors are underestimated, and
    consequently significance is overestimated
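
A rough way to see how much information is lost to this dependence is the Kish design effect for equally sized clusters, 1 + (m - 1) x ICC. The sketch below assumes an intraclass correlation of 0.10 purely for illustration; the slides do not report one. The formula strictly applies to the standard error of a mean, so treat it as an indication rather than an exact correction for regression coefficients.

```python
def design_effect(cluster_size, icc):
    """Kish design effect for equally sized clusters: 1 + (m - 1) * ICC."""
    return 1 + (cluster_size - 1) * icc


n_dyads, alters_per_ego = 900, 45
icc = 0.10  # assumed for illustration only

deff = design_effect(alters_per_ego, icc)
n_effective = n_dyads / deff   # effective number of independent observations
se_inflation = deff ** 0.5     # factor by which naive standard errors are too small

print(round(deff, 2), round(n_effective, 1), round(se_inflation, 2))
```

Under this assumption, the 900 dyads carry about as much information as 167 independent observations.
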

11
Type III: Multilevel analysis
  • Multilevel analysis of dyadic relations
  • Multilevel analysis is a generalization of linear
    regression in which the variance in outcome
    variables can be analyzed at multiple
    hierarchical levels. In our case, alters (level
    1) are nested within egos / networks (level 2),
    hence variance is decomposed into variance
    between and within networks.
  • Software: e.g., MLwiN, HLM, VARCL
  • Dependent variable: some characteristic of the
    dyadic relation (e.g., strength of tie), i.e.,
    networks as the dependent variables. Note:
    special multilevel models have been developed for
    discrete dependent variables.
  • Explanatory variables can be (among others):
  • characteristics of ego (level 2),
  • characteristics of alters (level 1),
  • characteristics of the ego-alter pairs (level 1).
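
As a sketch, the simplest such model is a random-intercept model for alter i nested in ego j. The notation is an assumption introduced here, not taken from the slides: x_ij stands for an alter or ego-alter characteristic, z_j for an ego characteristic.

```latex
\begin{aligned}
y_{ij} &= \gamma_{00} + \gamma_{10}\, x_{ij} + \gamma_{01}\, z_{j} + u_{0j} + e_{ij}, \\
u_{0j} &\sim N(0, \tau^{2}), \qquad e_{ij} \sim N(0, \sigma^{2}), \\
\rho &= \frac{\tau^{2}}{\tau^{2} + \sigma^{2}}
\end{aligned}
```

Here tau^2 is the variance between networks, sigma^2 the variance within networks, and rho the intraclass correlation, i.e., the share of the total variance located at the ego level.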

12
For a good article on the possibilities of
multilevel analysis of personal networks (incl. a
quick comparison with aggregated and
disaggregated types of analysis), see:
  • Van Duijn, M. A. J., Van Busschbach, J. T., &
    Snijders, T. A. B. (1999). Multilevel analysis of
    personal networks as dependent variables. Social
    Networks, 21, 187-209.

13
In summary, cross-sectional analysis...
The two types of analysis, even when focusing on
the same variable, address different types of
questions:
  • Multilevel analysis: e.g., what predicts the
    strength of ties?
  • Aggregated analysis: e.g., what predicts the
    average strength of ties in personal networks?
14
Illustration of Type I (aggregated analysis): The
case of migrants in Spain
  • We collected information on about 300 migrants in
    Catalonia with Egonet (in 2004-2005), from four
    countries of origin
  • For each respondent, information was collected
    about:
  • Ego (country of origin, years of residence in
    Spain, sex, age, marital status, level of
    education, etc.)
  • Alters (country of origin, country of living,
    etc.)
  • Ego-alter pairs (closeness, tie strength, type of
    relation, etc.)
  • Relations among alters

15
Illustration: The case of migrants in Spain
  • Our research questions were:
  • Can we distinguish different types of personal
    networks (profiles) among migrants?
  • Can the type of personal network be predicted by
    the years of residence of a migrant?
  • If so, do years of residence still predict
    network profiles when controlling for other
    important background characteristics?

16
Method
  • For each personal network (excluding ego), we
    first calculated compositional and structural
    characteristics (aggregate level)
  • Then, we used the following statistical
    procedures to analyse the 286 valid cases:
  • K-means cluster analysis based on various network
    characteristics (see next slide), to identify
    homogeneous groups of networks (network
    profiles)
  • ANOVA to see whether profiles differ in years of
    residence
  • Multinomial logistic regression to predict
    profile membership from years of residence,
    controlling for the background variables age,
    sex, country of origin, and employment
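
The clustering step can be sketched with a plain implementation of Lloyd's k-means algorithm on standardized variables. This is not the SPSS procedure used in the study: the choice of the first k rows as initial centroids is a simplification, and the six toy networks (proportion of Spanish alters, density) are invented for illustration.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Convert every column to z-scores (population standard deviation)."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, max_iter=100):
    """Lloyd's algorithm; the first k rows serve as initial centroids."""
    centroids = [list(r) for r in rows[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each row joins its nearest centroid
        new_labels = [
            min(range(k), key=lambda c: sq_dist(r, centroids[c])) for r in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels, centroids


# Hypothetical networks: (proportion of Spanish alters, density)
rows = [
    [0.10, 0.90], [0.80, 0.20], [0.15, 0.80],
    [0.85, 0.10], [0.12, 0.85], [0.90, 0.15],
]
labels, centroids = kmeans(standardize(rows), k=2)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

Standardizing first matters because k-means works on Euclidean distances, so variables on wider scales (e.g., a 7-point contact scale versus a proportion) would otherwise dominate the partition.
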

17
K-means cluster analysis (SPSS)
  • Based on the following network variables (all
    standardized):
  • 1. Proportion of alters whose country of origin
    is Spain
  • 2. Proportion of fellow migrants
  • 3. Density
  • 4. Network betweenness centralization
  • 5. Number of clusters (subgroups) within the
    network
  • 6. Subgroup homogeneity regarding living in Spain
  • 7. Average frequency of contact (7-point scale)
  • 8. Average closeness (5-point scale)
  • 9. Proportion of family in the network

18
Results of the cluster analysis
  • The five-cluster solution was the most
    interpretable and reasonably balanced
  • Cluster sizes:
  • Profile 1, the scarce network: N = 54
  • Profile 2, the dense family network: N = 28
  • Profile 3, the multiple subgroups network: N = 73
  • Profile 4, the two worlds connected network: N = 75
  • Profile 5, the embedded network: N = 50
  • Characteristics that contributed most to the
    cluster partition:
  • density
  • homogeneity of the subgroups regarding living in
    Spain
  • percentage of Spanish in the network

19
Description of profiles
20
Profile 1. Scarce network
Color: country of origin (white = foreign, black =
Spain). Size: country of living (large = Spain,
small = other country).
21
Description of profiles
22
Profile 2. Dense family network
Color: country of origin (white = foreign, black =
Spain). Size: country of living (large = Spain,
small = other country).
23
Description of profiles
24
Profile 3. Multiple subgroups network
Color: country of origin (white = foreign, black =
Spain). Size: country of living (large = Spain,
small = other country).
25
Description of profiles
26
Profile 4. Two worlds connected
Color: country of origin (white = foreign, black =
Spain). Size: country of living (large = Spain,
small = other country).
27
Description of profiles
28
Profile 5. Embedded network
Color: country of origin (white = foreign, black =
Spain). Size: country of living (large = Spain,
small = other country).
29
Is the partition related to years of residence?
(ANOVA in SPSS)
  • Overall: F(4, 2.67) = 6.634, p < .001
  • Per profile: there are two homogeneous subsets
    that differ significantly in years of residence:
    profiles 1 and 2 versus profiles 3, 4, and 5.
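
For readers who want to see what the overall F statistic measures, here is a minimal one-way ANOVA computed by hand. The three small groups of years of residence are invented for illustration; they are not the study data, and the function name is hypothetical.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of the observations around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical years of residence for three network profiles
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
f_stat, df1, df2 = one_way_anova(groups)
print(f_stat, df1, df2)  # 21.0 2 6
```

A large F means the profiles differ more in mean years of residence than the spread within profiles would lead one to expect; post hoc comparisons then identify which profiles form homogeneous subsets.
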
30
Is the partition also related to years of
residence when controlled for background
characteristics?
  • Multinomial logistic regression (SPSS)
  • Age and employment status did not have
    significant effects
  • Sex and country of origin, however, significantly
    influenced profile membership; e.g.,
    Senegambians had a higher probability of having a
    dense family network than others.
  • However, even when controlling for these
    background characteristics, years of residence
    still predicts cluster membership.
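
In general form, a multinomial logistic model of this kind compares each profile k with a reference profile K. The notation is assumed here for illustration: years_i is the years of residence of respondent i, and x_i collects the background variables.

```latex
\log \frac{P(Y_i = k)}{P(Y_i = K)}
  = \beta_{0k} + \beta_{1k}\,\mathrm{years}_i
  + \boldsymbol{\beta}_{2k}^{\top}\mathbf{x}_i,
  \qquad k = 1, \dots, K - 1
```

Years of residence "still predicts" membership when the coefficients beta_1k remain jointly significant with x_i in the model.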

31
Conclusion of our illustration
  • The network profiles give valuable information
    about adaptation to a host country
  • The scarce network and the dense family network
    seem to be transitional networks, whereas the
    other three seem more settled.

32
But...
  • In order to investigate whether the networks of
    migrants really follow a certain pattern of
    change (or multiple patterns depending on for
    example country of origin or entry situation), we
    need a longitudinal model.

33
... and what about the analysis of alter-alter
relations?
  • Most researchers are only interested in
    alter-alter relations to say something about the
    structure of the personal networks of respondents:
  • Use structural measures (density, betweenness,
    number of cliques, etc.) in an aggregated analysis
  • Apply triad census analysis (Kalish & Robins,
    2006)
  • If you're interested in predicting who is related
    to whom (among the alters):
  • Specify an Exponential Random Graph Model (ERGM)
    for each network and then run a meta-analysis over
    the results (cf. Lubbers, 2003; Lubbers &
    Snijders, 2007)

34
ERGMs
  • ERGMs are available in, among others, the
    software StOCNET (where you can find SIENA as
    well)
  • Dependent variable: whether alters are related or
    not
  • Independent variables: characteristics of alters,
    the relation alters have with ego, the
    alter-alter pair, and endogenous network
    characteristics such as transitivity (in the
    meta-analysis, characteristics of ego can be
    added as well)
  • Type of analysis: apply a common ERGM to each
    network (leaving ego out), then run a
    meta-analysis (cf. Lubbers, 2003; Snijders &
    Baerveldt, 2003; Lubbers & Snijders, 2007).
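
The second step, combining per-network estimates, can be sketched with simple inverse-variance (fixed-effect) pooling. This is a deliberately basic stand-in and not necessarily the procedure of the cited studies, which may also model variation between networks. The estimates and standard errors below are invented.

```python
def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted mean of per-network estimates and its SE."""
    weights = [1 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = (1 / sum(weights)) ** 0.5
    return pooled, pooled_se


# Hypothetical transitivity estimates from three personal networks
estimates = [0.40, 0.60, 0.50]
std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = pool_fixed_effect(estimates, std_errors)
print(round(pooled, 3), round(pooled_se, 3))  # 0.467 0.067
```

Networks whose parameters are estimated more precisely receive more weight; ego characteristics could be brought in by regressing the per-network estimates on them.
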

35
Part II. Dynamic analysis
  • How do personal networks change over time?
  • Data on personal networks are collected in two or
    more waves in a panel study

36
Interest in dynamic analysis
  • Networks at one point in time are snapshots, the
    results of an untraceable history (Snijders)
  • E.g., personal communities in Toronto (Wellman et
    al.)
  • Changes following a focal life event (individual
    level)
  • E.g., transition from high school to university
    (Degenne & Lebeaux, 2005); childbearing, moving,
    return to school in midlife (Suitor & Keeton,
    1997); retirement (Van Tilburg, 1992); marriage
    (Kalmijn et al., 2003); divorce (Terhell, Broese
    Van Groenou, & Van Tilburg, 2007); widowhood
    (Morgan, Neal, & Carder, 2000); migration (Molina
    et al.)
  • Broader studies of social change: social and
    cultural changes in countries with dramatic
    institutional changes
  • E.g., post-communism in Finland, Russia (Lonkila,
    1998), and Eastern Germany (Völker & Flap, 1995)

37
Types of dynamic personal network research
(networks as dependent variables)
  • Feld et al. (2007), Field Methods, 19, 218-236

43
Illustration: The case of migrants in Spain
  • Migrants in Catalonia (Barcelona, Vic, Girona).
  • We collected information about the personal
    networks of about 300 migrants (in 2004-2005).
  • Sample of 90 individuals for the second wave (1.5
    to 2 years later on average).
  • The questionnaire at t2 was identical to that at
    t1, but supplemented with questions about the
    changes, such as about alters who disappeared
    from the network
  • For the present illustration, we focus on
    Argentinean migrants only (a subset of the
    interviews, N = 22).

44
Type 1: Persistence of ties with alters across
time
  • Dependent variable: whether or not a tie persists
    to a subsequent time (dichotomous)
  • Explanatory variables: characteristics of ego,
    alter, the ego-alter pair, and the situation,
    especially in combination with the initial
    characteristics of the relationship
  • Type of analysis: logistic multilevel analysis
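
A random-intercept version of such a logistic multilevel model can be written as follows. The notation is assumed here, not taken from the slides: y_ij = 1 if the tie between ego j and alter i persists, x_ij is a tie or alter characteristic, and z_j is an ego characteristic.

```latex
\operatorname{logit} P(y_{ij} = 1)
  = \gamma_{00} + \gamma_{10}\, x_{ij} + \gamma_{01}\, z_{j} + u_{0j},
  \qquad u_{0j} \sim N(0, \tau^{2})
```

The random intercept u_0j lets some respondents have generally more stable networks than others, which accounts for the dependence among alters of the same ego.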

45
Illustration of Type 1: The case of migrants in Spain
  • Cases: 900 alters nested within 20 respondents
  • Descriptive: how persistent are ties over time?
  • 53% of these alters were nominated again in Wave
    2 (N = 473), whereas 47% of the nominations were
    not repeated (N = 427).
  • Explanatory: what predicts the persistence of
    ties over time?
  • Logistic multilevel analysis (see Table 1)

46
Table 1. Regression coefficients and standard
errors (in brackets) of the logistic multilevel
regression model predicting persistence of ties
(N = 900).
47
Additionally: Differences between dissolved and
new ties
  • Are the new ties qualitatively better than the
    broken ones?
  • Alters newly nominated in Wave 2 were somewhat
    more frequently contacted (3.2 versus 2.8 on the
    frequency of contact scale; t = 5.32, df = 888,
    p < .001) and somewhat closer (2.9 versus 2.4 on
    closeness; t = 3.70, df = 888, p < .001) than
    the alters who were not nominated again in
    Wave 2.
  • Furthermore, new relations were somewhat more
    often family members (18%) than relations that
    were broken (12%; χ² = 6.03, df = 1, p < .05).
    Involution?

48
Type 2: Changes in characteristics of persistent
ties across time
  • Dependent variable: change in some characteristic
    of the relationship (e.g., change in strength of
    tie)
  • Explanatory variables: characteristics of ego,
    alter, the ego-alter pair, and the situation,
    especially in combination with the initial
    characteristics of the relationship
  • Type of analysis: multilevel analysis

49
Illustration of Type 2: The case of migrants in
Spain
  • Cases: 473 persistent ties
  • Descriptive:
  • There was a fair amount of change in frequency of
    contact (M_t1 = 3.50, M_t2 = 2.94; t = 8.231,
    df = 472, p < .05) and less change in closeness
    in stable ties (M_t1 = 3.68, M_t2 = 3.87;
    t = -4.065, df = 472, p < .05)
  • Explanatory:
  • Multilevel analysis (see Table 2).
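
The reported degrees of freedom (472 for 473 ties) are consistent with a paired comparison of each tie at t1 and t2. A minimal sketch of such a paired t statistic follows, using invented scores. Note that it ignores the nesting of ties within respondents, which is why the explanatory step uses multilevel analysis.

```python
from statistics import mean, stdev


def paired_t(first, second):
    """Paired t statistic (first minus second) and its degrees of freedom."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / n ** 0.5)
    return t, n - 1


# Hypothetical frequency-of-contact scores for five persistent ties
freq_t1 = [4, 3, 5, 4, 2]
freq_t2 = [3, 3, 4, 2, 2]

t_stat, df = paired_t(freq_t1, freq_t2)
print(round(t_stat, 3), df)  # 2.138 4
```

A positive t here means that contact was more frequent at t1 than at t2, matching the direction of the means on the slide.
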

50
Table 2. Regression coefficients and standard
errors (in brackets) of the multilevel
regression model predicting changes in frequency
of contact and closeness in stable ties
(N = 473).
p < .05
51
Type 3: Changes in the size of the network across
time
  • Dependent variable: change in the number of ties
    in the personal network
  • Explanatory variables: characteristics of ego, of
    the set of alters, and the situation, especially
    in combination with the initial characteristics
    of the network
  • Type of analysis: regression analysis

52
Illustration of Type 3: The case of migrants in
Spain
  • The size of the network was fixed at 45 alters in
    both waves, so this type of analysis cannot be
    illustrated with our data.

53
Type 4: Changes in overall network
characteristics across time
  • Dependent variable: change in a compositional or
    structural variable (e.g., percentage of alters
    with higher education, density of the network)
  • Explanatory variables: characteristics of ego, of
    the set of alters, and the situation, especially
    in combination with the initial characteristics
    of the network
  • Type of analysis: regression analysis

54
Illustration of Type 4: The case of migrants in
Spain
  • Cases: 22 respondents.
  • The network stability of the 22 respondents was
    on average 53% (SD = 13.6) and varied between
    29% and 76% among respondents.
  • How do the composition and structure of the
    networks (the stable and unstable parts together)
    change over time?
  • Descriptive: overall, the network characteristics
    hardly changed over time (Table 3). The only
    characteristics that differed significantly
    between Waves 1 and 2 were average closeness and
    betweenness, both of which increased slightly
    over the years.
  • Explanatory: these changes could not be predicted
    by ego characteristics (using a regression
    analysis at the ego level); the most important
    predictor of the change was the value of the
    variable at t1 (regression to the mean).
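
Network stability, as used above, can be read as the share of Wave 1 alters who are nominated again in Wave 2. A minimal sketch under that reading, with invented name lists and a hypothetical function name:

```python
def network_stability(wave1_alters, wave2_alters):
    """Percentage of Wave 1 alters who are nominated again in Wave 2."""
    w1, w2 = set(wave1_alters), set(wave2_alters)
    return 100 * len(w1 & w2) / len(w1)


# Hypothetical nominations of one respondent in both waves
wave1 = ["Ana", "Luis", "Marta", "Pablo"]
wave2 = ["Ana", "Marta", "Sofia", "Diego"]

stability = network_stability(wave1, wave2)
print(stability)  # 50.0
```

In practice the matching of alters across waves has to rely on stable identifiers rather than on names as typed, since spelling differences would otherwise count as turnover.
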

55
Table 3. Means and standard deviations of the
compositional variables of the personal networks
at t1 and t2 (N = 22), correlations between the
two waves, and t-tests of differences between the
two waves.
p < .05
56
Conclusions from the illustration
  • There is quite some instability in the personal
    relations of Argentinean immigrants in Catalonia,
    most importantly in their peripheral relations
  • Relational characteristics predict the
    persistence of ties, whereas demographic
    characteristics of ego affect the flux and flow
    within their persistent ties
  • These quantitative analyses suggest that
    important changes in the number of active
    contacts and/or changes in ties (from 30% to 70%)
    are compatible with overall stability in network
    composition.

57
Further analyses
  • We will investigate (based on all 90 respondents)
    whether persons with different network profiles
    at t1 have different patterns of changes in their
    networks, indicating different ways of
    assimilation to Spain.

58
So what about the dynamics of alter-alter
relations?
  • ... Let's propose a Type 5?

59
Type 5: Changes in ties among alters across time
  • Dependent variable: whether alters make new ties
    or break existing ties with other alters across
    time
  • Independent variables: characteristics of alters,
    the relation alters have with ego, the
    alter-alter pair, and endogenous network
    characteristics such as transitivity (in the
    meta-analysis, characteristics of ego can be
    added as well)
  • Type of analysis: apply a common SIENA model to
    each network (leaving ego out), then run a
    meta-analysis (cf. Lubbers, 2003; Snijders &
    Baerveldt, 2003; Lubbers & Snijders, 2007). A
    multilevel version of SIENA is on the agenda.

60
SIENA makes assumptions which seem to be violated
in personal networks
  • It is assumed that people act
    strategically/rationally within the network, so
    the network should make sense to them and they
    should know who the alters are
  • Thoughts on strategic behaviour and robustness:
  • Strategic behaviour among alters also occurs in
    personal networks, e.g., befriending the friends
    of friends.
  • In sociocentric networks, people are also
    influenced by others outside the networks (e.g.
    out-of-school friends).
  • In large sociocentric networks (e.g., an
    organisation), people do not know all alters
    either.

61
Illustration of Type 5: Changes in ties among
alters across time
  • We are currently applying SIENA to each case
  • In a meta-analysis, we can then investigate
    whether for example a significant tendency of
    transitivity among alters is related to more
    stability in the relations between ego and the
    alters

62
Case study: Norma's network at t1
63
Case study: Norma's network at t2
64
Case study: Norma's network at t2 (new contacts
depicted in red)
65
Case study: SIENA analysis of Norma's network
  • In Norma's network, there are 62 actors (28
    stable actors, 17 who come and 17 who go). Of the
    378 pairs of stable actors, 292 are not related
    at any moment, 64 are related at both moments, 15
    only at t1 and 7 only at t2.
  • Statistical results: the following effects were
    significant (apart from degree):
  • Similarity in the frequency of contact between
    alters: if two alters had about the same
    frequency of contact with ego, they had a higher
    probability of having a relation themselves.
  • Transitivity: if A and B are related, and B and C
    as well, then it is likely that A and C also
    become related (but note that A and C already
    had a transitive relation via the invisible
    ego!).
  • Alter is family of ego or not: the family members
    of ego have a lower tendency to contact other
    alters than the other network members.
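
The counts in this case study can be checked with a few lines of arithmetic: 28 stable actors form 28 x 27 / 2 = 378 pairs, and the four tie categories add up to that number. The variable names are illustrative; the densities are derived here and are not reported on the slide.

```python
from math import comb

stable, arriving, leaving = 28, 17, 17
total_actors = stable + arriving + leaving        # 62

pairs_of_stable_actors = comb(stable, 2)          # 378
never, both, only_t1, only_t2 = 292, 64, 15, 7
assert never + both + only_t1 + only_t2 == pairs_of_stable_actors

ties_t1 = both + only_t1                          # 79 ties among stable actors at t1
ties_t2 = both + only_t2                          # 71 ties among stable actors at t2
density_t1 = ties_t1 / pairs_of_stable_actors
density_t2 = ties_t2 / pairs_of_stable_actors

print(total_actors, pairs_of_stable_actors, ties_t1, ties_t2)
print(round(density_t1, 3), round(density_t2, 3))
```

Only 22 of the 378 pairs changed state between the two waves (15 dissolved, 7 created), so the model is estimated from relatively few changes.
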

66
Sources of change in (personal) networks
  • Unreliability due to measurement error
  • Inherent instability
  • Systemic change
  • External change
  • Leik & Chalkley (1997), Social Networks, 19, 63-74

67
Sources of change in (personal) networks
  • Unreliability due to measurement error
  • Inherent instability
  • Systemic change
  • External change
  • Researchers should consider the potential impact
    of measurement error and inherent instability on
    the substantive conclusions! E.g., plan a pilot
    study, supplement with qualitative analyses,
    calculate test-retest reliability of network and
    scales of closeness etc.

Error sources
68
Conclusion
  • Multiple statistical methods for personal network
    research, depending on your research interest
  • Combining several methods probably gives greatest
    insight

69
  • Thanks!
  • My e-mail: MirandaJessica.Lubbers_at_uab.es