Title: E-Health (EHL)
1. Summary of day 1
- Introduction to Knowledge Mapping and Social Network Analysis (SNA)
- The importance of asking the right question
Questions?
2. Day 2
3. Session 4: Data collection
Steeve Ebener, WHO
Manila, Philippines 28-30 April 2008
4. Content
- Which data to collect
- Source of the data
- Ego vs Full network
- Ego network data collection
- Full network data collection
- Data collection instruments
- Data issues
- Example of the DTTB project
5. Which data to collect?
- Several types
- data about the respondent (demographics)
- network specific questions (see previous session)
- any additional information which might be useful for the analysis
- case of the Ego networks
- example of the DTTB project
6. Which data to collect?
Data about the respondent (demographics)
Examples
- income
- education
- location (birth, assignment,...)
- gender
- age
- ethnicity
- religion
- occupation
- ...
Valente
7. Which data to collect?
Other data of importance for the analysis
Examples
- DTTB project
- Access to internet
- Area of expertise they have been looking for
- Number of experts they knew before being deployed
- Electricity availability
- Name of the referral facility
- What to improve in the DTTB program
- time and reason of contact
- Cultural values
- Competency areas
- Use of different types of media (online, printed,
face-to-face) for getting information.
Experts directory
8. Source of the data
- Secondary data (often 2-mode)
- Memberships in groups
- Facebook "networks"
- Participation in events
- Listserv threads
- Voting records
- Text analysis
- Other
- Email records, purchase/sale records, marriage records, etc. (patient records?)
Borgatti
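Secondary data such as group memberships are 2-mode: they link people to groups or events rather than people to people. The following is a minimal sketch, assuming a made-up membership table (all names and groups are hypothetical), of how such data could be projected onto a person-by-person network by counting shared memberships. It is one possible illustration, not a method prescribed by the slides.

```python
from collections import defaultdict
from itertools import combinations


def project_two_mode(memberships):
    """Turn {person: set_of_groups} into {(person_a, person_b): shared_count}."""
    # First invert the table: which people belong to each group?
    members_by_group = defaultdict(set)
    for person, groups in memberships.items():
        for group in groups:
            members_by_group[group].add(person)

    # Every pair of people in the same group gains one unit of tie strength.
    ties = defaultdict(int)
    for members in members_by_group.values():
        for a, b in combinations(sorted(members), 2):
            ties[(a, b)] += 1
    return dict(ties)


# Hypothetical example data (not from the DTTB project)
memberships = {
    "Ana": {"listserv", "workshop"},
    "Ben": {"listserv"},
    "Cora": {"listserv", "workshop"},
}
ties = project_two_mode(memberships)
```

Here Ana and Cora share two memberships, so their tie is stronger than the ties involving Ben, who shares only the listserv with each of them.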
9. Source of the data
- Primary data
- Experiments
- Rumor planting
- Observation
- Western Electric Hawthorne plant studies
- Survey/Census/Logbook
- Telephone, web, paper, etc
- example of the DTTB project
Borgatti
10. Ego vs full network
- Ego Network data collection
- interview of the respondents (egos) to ask about their contacts (alters)
- The alters are not interviewed
- One ego's alters are not matched up with other egos or their alters
- Collect a lot of (perceived) info on the alters
- Full Network data collection ("regular" SNA)
- interview of the respondents and their contacts
- Generally do not collect info on the alters
Borgatti
11. Ego vs full network
Borgatti
12. Ego Network data collection
- Characterize the relationship with each alter
- Optionally obtain ego's perception of which alters have ties with other alters
- Connections between egos or between alters of different egos are not collected
Modified from Borgatti
13. Full Network data collection
- Survey vs Census
- in a survey only part of the network is interviewed
- sampling issue (still in its infancy)
- data analyzed but not graphed
- in a census the whole network is interviewed
- saturation sampling
- The most typical kind of network methodology
- Usually what people think of when we say networks
- These data can be graphed and analyzed using matrix methods
Modified from Borgatti and Valente
14. Full Network data collection
- Survey - Sampling
- Fixed probability (e.g. random sampling)
- Fine with Ego Network
- Ok for some complete network studies
- Adaptive sample (e.g. snowball sampling)
Red nodes are interviewed alters; blue nodes are not interviewed
- Stances
- Nominalist/etic (least delusional approach)
- Realist/emic (best used for true groups)
- Combination
Modified from Borgatti and Valente
15. Full Network data collection
Survey sampling (Stance)
Borgatti
16. Full Network data collection
- Census
- Unknown network
- Ask for the help of one or several of the members
- Use the snowball approach to identify all the members
- Known network
- Interview all the members
Avoid surveys when possible!
Borgatti and Valente
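The snowball approach for an unknown network can be sketched as a wave-by-wave search: start from one or several known members, ask each for their contacts, then interview the newly named people in turn. The code below is a minimal sketch under stated assumptions: `ask_contacts` is a hypothetical stand-in for an actual interview, and the nomination data are invented for illustration.

```python
from collections import deque


def snowball(seeds, ask_contacts, max_waves=None):
    """Return everyone identified by repeatedly interviewing newly named people.

    ask_contacts(person) stands in for an interview and returns the names
    that person mentions. max_waves limits how many waves of interviews
    are carried out (None means continue until nobody new is named).
    """
    found = set(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        person, wave = queue.popleft()
        if max_waves is not None and wave >= max_waves:
            continue  # identified, but not interviewed
        for contact in ask_contacts(person):
            if contact not in found:
                found.add(contact)
                queue.append((contact, wave + 1))
    return found


# Hypothetical nominations: who names whom when interviewed
nominations = {"A": ["B", "C"], "B": ["A", "D"], "C": [], "D": ["E"], "E": []}

all_members = snowball(["A"], lambda p: nominations.get(p, []))
one_wave = snowball(["A"], lambda p: nominations.get(p, []), max_waves=1)
```

With no wave limit the search continues until no new names appear, which is the saturation idea behind a census. With `max_waves=1` only the seed is interviewed, so the people it names are identified but never asked about their own contacts.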
17. Data collection instruments
Ethical Issues
- Respondents cannot be anonymous
- Snowball sampling asks respondents to name others
- "Bill says that you inject illegal drugs with him, can we talk to you?"
- Missing data are troublesome
- Might put the attention on the wrong issue
- Results may be wrong
- Non-participants are still included, as they are mentioned by others
- And participants are like informers
- Outputs ideally show individual level data
- Pushes boundary of the professional
- Deceptively powerful
- SNA is still unknown, looks like research
- Quid pro quo arrangements with research sites
- Management might hire/fire based on results
Borgatti
18. Data collection instruments
Ethical Issues
Need to find out what the potential ethical threats are, and to whom
- In academic setting
- In management setting
- In mixed situations
- In national security setting
- ...
Need to address them
Borgatti
19. Data collection instruments
Ethical Issues
- Consent form, disclosure contract
- Anonymizing (not releasing demographic data for example)
- Address non-participation
- Aggregating, categorizing
- Avoid quid pro quos
- Find the right time to perform the analysis (not before a restructuring for example)
- Protect data (e.g. theft)
- Organization consulting (who gets to see the data?, professional debriefing, ...)
- ...
Borgatti
20. Data collection instruments
Ethical issues
Truly informed consent form
Borgatti
21. Data collection instruments
Ethical issues
3-way disclosure contract
- For research done in organizations
- Signed by management, the researchers, and each participant
- Clearly identifies what will be done with the data
Borgatti
22. Data collection instruments
Ethical issues
The questionnaire might also need approval from the ethics committee
Confidentiality reminder on the questionnaire
Borgatti
23. Data collection instruments
Which instrument?
- Data about the respondent (demographics)
- Network specific questions
- Any additional information which might be useful for the analysis
Questionnaire form(s) if performed once
Want to do it over time?
Logbook
DTTB project
24. Data collection instruments
Questionnaire format
General questionnaire rules also apply here (e.g. importance of the order of the questions)
Some specific issues
- Aided (rosters) vs unaided (open-ends)
- Tick, rate or rank?
- Across (multigrids) or down (separate questions)
- Paper or electronic
Borgatti
25. Data collection instruments
Closed-ended vs open-ended: roster of names or just blank lines?
- Closed-ended (aided)
- requires bounded list
- Can be impractical for large networks
- Each alter has equal chance of choice
- Open-ended (unaided)
- Subject to recall errors
- can limit number of choices made
- (more effort, limited space)
Use hybrid designs otherwise
Borgatti
26. Data collection instruments
Hybrid Questionnaire
- Paper questionnaire with a separate booklet containing the name directory
- Web version questionnaire with drop-down menus
Hybrid designs are useful in large networks
It is important to use a unique ID to guard against misspelled names
Borgatti
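The role of a unique ID can be illustrated with a small lookup that maps names as written by respondents back to roster IDs. This is a minimal sketch: the roster, the IDs and the fuzzy-matching fallback (Python's standard `difflib`) are assumptions added for illustration, not part of the original material, and any fuzzy match would still need manual checking in practice.

```python
import difflib


def normalize(name):
    """Lower-case, drop periods and collapse repeated spaces."""
    return " ".join(name.lower().replace(".", " ").split())


def build_lookup(roster):
    """roster maps unique ID -> official name; returns normalized name -> ID."""
    return {normalize(name): pid for pid, name in roster.items()}


def resolve(written_name, lookup, cutoff=0.8):
    """Return the roster ID for a name as written, or None if no close match."""
    key = normalize(written_name)
    if key in lookup:
        return lookup[key]
    # Fallback: closest roster name above the similarity cutoff (an assumption)
    close = difflib.get_close_matches(key, list(lookup), n=1, cutoff=cutoff)
    return lookup[close[0]] if close else None


# Hypothetical roster (names and IDs are made up)
roster = {"P01": "Maria Santos", "P02": "Jose Reyes"}
lookup = build_lookup(roster)

exact = resolve("maria  santos", lookup)
misspelled = resolve("Maria Santoz", lookup)
unknown = resolve("Unknown Person", lookup)
```

Once every written name is resolved to an ID, all later analysis can use the IDs only, so two spellings of the same person do not end up as two different nodes.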
27. Data collection instruments
Tick, rate or rank?
- Ask respondent for yes/no decisions or quantitative assessment?
- Yes/no are easier on respondents (therefore reliable, believable)
- Yes/no "much" faster to administer
- But yes/no provides no discrimination among levels; ratings provide more nuance
- A series of binaries can replace one quantitative rating
- Instead of "How often do you see each person?"
- 1 = once a year, 2 = once a month, 3 = once a week, etc.
- Use three questions (in this order)
- Who do you see at least once a year?
- Who do you see at least once a month?
- Who do you see at least once a week?
Borgatti
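The answers to the three "at least once a ..." questions can be recombined into the 1/2/3 rating at analysis time. A minimal sketch, assuming each question yields the set of names ticked by one respondent (the names are hypothetical); taking the highest ticked level for inconsistent answers is an assumption of this sketch.

```python
def frequency_rating(at_least_yearly, at_least_monthly, at_least_weekly):
    """Combine three yes/no questions into one ordinal rating per alter.

    Each argument is the set of names ticked for that question.
    Returns {name: rating} with 1 = yearly, 2 = monthly, 3 = weekly.
    """
    ratings = {}
    questions = (at_least_yearly, at_least_monthly, at_least_weekly)
    for rating, ticked in enumerate(questions, start=1):
        for name in ticked:
            # Keep the highest frequency level the respondent ticked
            ratings[name] = max(ratings.get(name, 0), rating)
    return ratings


# Hypothetical answers from one respondent
ratings = frequency_rating(
    at_least_yearly={"Ana", "Ben", "Cora"},
    at_least_monthly={"Ana", "Ben"},
    at_least_weekly={"Ana"},
)
```

People who are never ticked simply do not appear in the result, which corresponds to "no tie" at any frequency.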
28. Data collection instruments
Tick, rate or rank?
- Users asked to rank others in terms of communication frequency over 3 weeks
- The person most communicated with was ranked in the top 4 only 52% of the time
- Accuracy of predictions of the next month's communication was the same, who you like who you talk to
- All studies taken together had similar results
- Studies in other fields corroborate
Ranking can give unreliable results!
Borgatti
29. Data collection instruments
Repeated Roster vs MultiGrid
Borgatti
30. Data collection instruments
Paper or electronic?
- Paper medium
- Reliable
- Reassuring to respondents
- Possible errors in data entry
- Data entry is time-consuming
- ...
- Electronic (PDA, Web)
- Span distances, time zones
- harder to lose
- Fewer data handling errors
- Possible lower response rate
- Emailed documents (e.g. Excel file) vs online survey instrument
- ...
Choose what is best suited to the situation
Borgatti
31. Data collection instruments
Design Considerations
- Network questionnaires can be fun but are usually time-consuming and might generate anxiety
- Providing value
- Treating respondents with respect
- Attractive formatting
- Cloak in authority and importance
- Do not forget that multiple, similar relational questions risk respondent fatigue and annoyance
- Who do you give advice to?
- Who do you give information to?
- Who do you give guidance to?
- Who do you counsel?
Aggregation to larger categories, such as affective and instrumental, can work well
Borgatti
32. Data issues
Information accuracy
- Response strategies and biases
- Appearing central
- Recalling important people more than others
- Hiding illicit relationships
- Cognitive cultural schemas
- Recall biased towards normal, frequent, logical
- Role schemas filter perceptions, learning
Can't do much about it but need to be aware of it
Ethnographic sandwich
Borgatti
33. Data issues
Unexpected asymmetry
- A claims to have sex with B, but B does not claim to have sex with A
- The relation is logically symmetric, but empirically asymmetric
- can be an error in the recall, strategic response
- Sometimes asymmetry is the point
- Logically symmetric data may be symmetrized
- if either A or B mentions the other, it's a tie
- Caution: can't symmetrize logically non-symmetric relations (e.g. gives advice to), unless
- changing the meaning of tie
- you have asked the question both ways (who do you give advice to?, who gives advice to you?)
Borgatti
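The "if either A or B mentions the other, it's a tie" rule can be sketched as follows for a logically symmetric relation. This is a minimal illustration with invented reports; listing the one-sided claims separately is an addition of this sketch, useful when the asymmetry itself is of interest.

```python
def symmetrize_union(reported):
    """reported: set of (respondent, named_person) pairs.

    Returns undirected ties as sorted pairs: a tie exists if either
    side mentioned the other. Self-nominations are ignored.
    """
    return {tuple(sorted(pair)) for pair in reported if pair[0] != pair[1]}


def unreciprocated(reported):
    """Return the reports that the named person did not confirm."""
    return {(a, b) for (a, b) in reported if (b, a) not in reported}


# Hypothetical reports: A names B, while B and C name each other
reported = {("A", "B"), ("B", "C"), ("C", "B")}

ties = symmetrize_union(reported)
one_sided = unreciprocated(reported)
```

This rule should only be applied to relations that are symmetric by definition; applying it to "gives advice to" would change the meaning of the tie.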
34. Data issues
Missing data
- Quick and dirty
- For logically symmetric relations
- if B-A tie is missing, substitute A-B tie
- For logically non-symmetric relations, ask questions both ways (see previous slide)
- Bayesian imputation methods (not addressed here)
Borgatti
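The quick and dirty substitution for logically symmetric relations can be sketched on an adjacency matrix in which missing answers are stored as `None`. The matrix below is a hypothetical example in which the second respondent did not return the questionnaire; the data layout is an assumption of this sketch.

```python
def fill_missing_symmetric(matrix):
    """Fill missing cells of a logically symmetric relation.

    matrix[i][j] is 1 (tie reported by i), 0 (no tie) or None (missing).
    A missing B-A cell is replaced by the reported A-B cell when available.
    Returns a new matrix; the input is left unchanged.
    """
    n = len(matrix)
    filled = [row[:] for row in matrix]
    for i in range(n):
        for j in range(n):
            if filled[i][j] is None and matrix[j][i] is not None:
                filled[i][j] = matrix[j][i]
    return filled


# Hypothetical data: the second respondent (row index 1) did not answer
reported = [
    [0, 1, 0],
    [None, 0, None],
    [0, 1, 0],
]
filled = fill_missing_symmetric(reported)
```

Cells that are missing in both directions stay `None`, so they remain visible as gaps rather than being silently treated as "no tie".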
35. Example of the DTTB Project
- Ego or Full network data collection?
- Known network
- 2 Batches (18 in Batch 21 and 16 in Batch 22)
- A DTTB doctor can contact another DTTB doctor or an outside expert. Further contacts made by outside experts are not captured
- mixed Ego (contact with experts) and full network (among DTTBs) analysis
- use of expert information form
- Question: who do you contact for medical expertise?
- Expertise network analysis
- Data collected over a period of 9 months
- time dimension
Mixed approach
36. Example of the DTTB Project
Questionnaires and forms
Questionnaire Batches 21-22
Questionnaire previous Batches
Experts contact form
37. Example of the DTTB Project
Logbook