Title: Session 3. Quantitative data: how are they collected and used? Qualitative data: qualitative research methods and designs
1 Session 3. Quantitative data: how are they collected and used? Qualitative data: qualitative research methods and designs
2 Session objectives
- By the end of this session you will be able to
  - Identify the main types of quantitative data collection methods
  - Describe the differences between surveillance, monitoring and research data
  - Identify some common terms used in epidemiology
  - Identify the five types of quantitative data
  - Understand some common epidemiological terms
  - Describe some measurements of health status
  - Understand some basic measures of risk
  - Understand why sample size is important
3 Main types of quantitative data collection methods
- Quantitative data have different characteristics from qualitative data
- To be valid and objective, data must be collected
  - Systematically
  - From a complete or truly representative sample
  - In the same way by all collectors
- NB We will be exploring quantitative data and data collection techniques later in the course
4 Characteristics of quantitative data
- Quantitative data can be either counted or measured
- They can be used
  - To describe and summarise
  - To compare between groups
  - To estimate / predict effect sizes
5 What can quantitative data tell us?
- Epidemiologic triad
  - Who, When, Where
- The size of the problem
- To some extent Why
- But not
  - How it feels to be in an affected group
  - Whether interventions are appreciated, and why
  - What the experience of the people involved was
6 Types of quantitative data
- Continuous (Ratio)
  - Continuous
  - Discrete
- Ordinal
- Interval
- Binomial
- Categorical (nominal)
7 Continuous (ratio) data
- Regular and ordered numerical data
- Discrete or continuous
- Has a true zero
- Precise relationship between numbers, e.g. 10 is twice 5
  - Number in family (discrete)
  - Age (continuous)
- Concept of normal distribution
  - Mean = average
  - Median = middle
  - Mode = most common
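The three summary measures above can be computed with Python's standard `statistics` module. This is a minimal sketch; the ages are invented for illustration and are not data from the session.

```python
import statistics

# Hypothetical ages in years for nine children (illustrative only).
ages = [2, 3, 3, 4, 5, 5, 5, 7, 11]

mean_age = statistics.mean(ages)      # average: sum divided by count
median_age = statistics.median(ages)  # middle value of the sorted data
mode_age = statistics.mode(ages)      # most common value

print(mean_age, median_age, mode_age)
```

For this made-up sample the three measures happen to coincide; in a skewed sample they would separate.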
8 Frequency Distribution
If there are enough observations, and if observations are random/systematic (ie not biased)
In normal distributions: Mean = Median = Mode
9 Confidence Intervals (eg, 95%)
Range of values between which we are 95% confident that the true value in the population lies. If we sample a population randomly and without bias 100 times, the true average value will lie within the CIs about 95 times (and outside them about 5 times)
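The repeated-sampling reading of a 95% confidence interval can be checked with a small simulation. This is a sketch under stated assumptions: the population mean, standard deviation, sample size and number of repeated surveys are all hypothetical, and the interval uses the normal approximation (mean plus or minus 1.96 standard errors).

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

TRUE_MEAN = 50.0    # hypothetical true population value
TRUE_SD = 10.0      # hypothetical population standard deviation
SAMPLE_SIZE = 100   # people per survey (assumed)
N_SURVEYS = 1000    # number of repeated unbiased surveys (assumed)

def ci95(sample):
    """Approximate 95% CI for a mean: mean +/- 1.96 * standard error."""
    centre = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / len(sample) ** 0.5
    return centre - 1.96 * standard_error, centre + 1.96 * standard_error

hits = 0
for _ in range(N_SURVEYS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    low, high = ci95(sample)
    if low <= TRUE_MEAN <= high:
        hits += 1

coverage = hits / N_SURVEYS
print(f"{coverage:.1%} of intervals contained the true mean")
```

The printed share should be close to, but not exactly, 95%.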
10 Ordinal data
- Data points are in a specific order, and the order denotes a meaning
  - Placement in a race
  - House numbers in a street
  - A, B, C
  - Monday, Tuesday, etc
11 Interval data
- Scaled measure with even and known distance between data-points, BUT
  - 0 does not mean nothing
  - Although the distance between data-points is regular, it is not continuous
  - 10 is NOT twice 5
  - e.g. temperature scale (30°C is not twice as hot as 15°C)
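The temperature example can be made concrete by converting to Kelvin, a scale that does have a true zero. This is an illustrative sketch; the helper name `celsius_to_kelvin` is introduced here and is not part of the session material.

```python
def celsius_to_kelvin(celsius):
    """Kelvin has a true zero, so ratios of Kelvin values are meaningful."""
    return celsius + 273.15

naive_ratio = 30 / 15  # 2.0, but misleading on an interval scale
true_ratio = celsius_to_kelvin(30) / celsius_to_kelvin(15)

print(naive_ratio, round(true_ratio, 3))
```

On the Kelvin scale 30°C is only about 5% "hotter" than 15°C, which is why ratios should not be taken on interval data.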
12 Binomial data
- Two categories, which can be used for stratification (sub-analysis)
- Mutually exclusive
  - Yes/no
  - In/out
  - Male/female
  - Dead/alive
13 Categorical data
- Variables have discrete sub-categories, which can be used for stratification (sub-analysis)
- Mutually exclusive
- Sub-categories have no relationship to one another
14 The differences between surveillance, monitoring and research data
- Provide a framework for understanding the structure of health status in populations
- For being able to describe the health of defined populations at a point in time
- For being able to compare the same population at different times
- For being able to compare different populations
15 Data collections and data quality
- Is a particular data collection a result of routine activity or a special effort?
  - Surveillance
  - Surveys
  - Monitoring
  - Research
- Uses and misuses
16 Surveillance and surveys
- Surveillance
  - Routine data collections
  - Notifications (eg of communicable diseases)
  - Monitoring (specific conditions, eg air quality, RTAs)
  - Special events (eg adverse drug reactions, outbreaks, terrorist activities)
  - Registries (births, cancers, immunisation)
- Surveys
  - Specific data collections
  - Epidemiological studies
17 Routine repository data collections
- Completeness of repository data collections depends on source
- Generally excellent-to-fairly-complete
  - Laboratory notifications (usually automated notifications)
  - Research data (vested interest)
  - Routine event-based (eg births and deaths)
  - Routine (legislated) notifiable diseases
- Generally poor-or-quality-not-known
  - Special interest group
18 Data quality
- Depends on several important factors
  - Collection forms: should not be complicated, esp for large collections (KISS principle)
  - Timeliness: data collection should happen ASAP after event
  - Accuracy: data recorder should
    - be capable of recording/conveying relevant detail
    - understand the purpose of data collection
19 Counting numbers and causes
- Cultural differences in the ways diseases and deaths are understood and reported
- Diagnostic capacity varies, even within one State
- ICD-10 coding systems are an attempt at international standardisation
- Some jurisdictions use DRGs (as a part of health care funding mechanism), but the codes are not transportable
20 Epidemiological Indices
- Numerator and denominator
- Incidence and prevalence
- Rate
  - demographic
    - birth
    - growth
    - mortality
  - incidence
  - attack
  - prevalence
  - case fatality
- Ratio and proportion
- Burden of disease studies
21 Numerator and Denominator
- Numerator
  - Number of new events/cases during a specified period; the upper portion of the fraction used to calculate a rate or ratio
- Denominator
  - The total population at risk; the lower portion of the fraction used to calculate a risk, ratio or rate
22 Rates
- Incidence rate
  - = New cases in specified period / cumulative person-years of observation
- Cumulative incidence
  - = New cases in specified period / total persons disease free at start
- (Point) prevalence rate
  - = people with disease during a (usually short) time / total population under study
- Example: Household study, 6524 children <5. Did child have diarrhoea on day of survey or during previous 15 days?
  - 982 had diarrhoea
  - Period prevalence = 982/6524 = 15.1%
  - (Barros, Victora, Forsberg et al, Bull WHO, 69(1):59-65, 1991)
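The three rate definitions above are simple divisions, sketched below. The function names are introduced here for illustration. The prevalence figures (982 of 6524 children) come from the example above; the cohort figures used for the two incidence measures are hypothetical.

```python
def period_prevalence(people_with_disease, population_studied):
    """People with the disease during the period / population under study."""
    return people_with_disease / population_studied

def cumulative_incidence(new_cases, disease_free_at_start):
    """New cases in the period / persons disease free at the start."""
    return new_cases / disease_free_at_start

def incidence_rate(new_cases, person_years):
    """New cases in the period / cumulative person-years of observation."""
    return new_cases / person_years

# Household study from the text: 982 of 6524 children had diarrhoea.
diarrhoea_prevalence = period_prevalence(982, 6524)
print(f"Period prevalence: {diarrhoea_prevalence:.1%}")

# Hypothetical cohort: 30 new cases among 1000 people followed for 2850 person-years.
cohort_risk = cumulative_incidence(30, 1000)
cohort_rate = incidence_rate(30, 2850)
print(f"Cumulative incidence: {cohort_risk:.1%}")
print(f"Incidence rate: {cohort_rate * 1000:.1f} per 1000 person-years")
```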
23 Mortality (Deaths)
- Deaths are a commonly used measure of disease
- Mortality rates are high for important diseases in developing countries, where a large proportion of cases are fatal, so deaths are quite a good measure of disease there (not so true in developed countries)
- Deaths from communicable diseases are generally high in developing countries, and much lower in western countries
24 Measures of morbidity and mortality
- Morbidity is a general term which refers to incidence of disease, including
  - The illnesses experienced
  - The number of people who are ill, and
  - The length of time they are ill for
- Attack rate
  - The proportion of people in a population who develop a specific disease
  - = Number of cases / total population (usually per 100,000)
- Case fatality rate
  - The proportion of cases of a specified disease who die as a direct result of that disease
  - = deaths from disease in a specified time period and place / total cases of the disease in the time period and place
- The death rate
  - An estimate of the proportion of the population who die during a specified period (usually 1 year; may be a different time period, eg during an outbreak)
  - = deaths in a specified time period and place / population in the same area at risk of dying
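The three measures defined above differ only in what goes into the numerator and denominator. The sketch below uses invented outbreak figures; the function names and the choice of "per 1,000" for the death rate are assumptions made for illustration.

```python
def attack_rate(cases, population, per=100_000):
    """Cases of a specific disease per `per` people in the population."""
    return cases * per / population

def case_fatality_rate(deaths_from_disease, total_cases):
    """Proportion of cases of the disease who die from it."""
    return deaths_from_disease / total_cases

def death_rate(deaths, population_at_risk, per=1_000):
    """Deaths from any cause per `per` people at risk of dying."""
    return deaths * per / population_at_risk

# Hypothetical figures for one town over one year (illustrative only).
population = 5_000
cases = 250
deaths_from_disease = 10
deaths_all_causes = 40

town_attack_rate = attack_rate(cases, population)
town_cfr = case_fatality_rate(deaths_from_disease, cases)
town_death_rate = death_rate(deaths_all_causes, population)

print(f"Attack rate: {town_attack_rate:.0f} per 100,000")
print(f"Case fatality rate: {town_cfr:.1%}")
print(f"Death rate: {town_death_rate:.1f} per 1,000")
```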
25 Demographic growth
- (Crude) Birth rate
  - = Live births in a geographical area in a given year / mid-year or average population in the same area
- Population growth rate
  - = (Live births - deaths) in a geographical area in a given year / mid-year or average population in the same area
- Migration
  - inbound
  - outbound
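The two demographic formulas above can be sketched as follows. All figures are hypothetical. The optional `net_migration` argument (inbound minus outbound) is an assumption added here to show where migration would enter; with its default of 0 the function matches the births-minus-deaths formula as stated.

```python
def crude_birth_rate(live_births, mid_year_population, per=1_000):
    """Live births in a year per `per` people in the mid-year population."""
    return live_births * per / mid_year_population

def population_growth_rate(live_births, deaths, mid_year_population,
                           net_migration=0, per=1_000):
    """(Births - deaths [+ net migration]) per `per` mid-year population."""
    return (live_births - deaths + net_migration) * per / mid_year_population

# Hypothetical district (illustrative only).
births, deaths, mid_year_pop = 3_000, 1_200, 150_000

birth_rate = crude_birth_rate(births, mid_year_pop)
natural_growth = population_growth_rate(births, deaths, mid_year_pop)
growth_with_migration = population_growth_rate(
    births, deaths, mid_year_pop, net_migration=300)

print(birth_rate, natural_growth, growth_with_migration)
```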
26 Ratio and Proportion
- Ratio
  - One quantity divided by another
  - Many types, some have special rules
  - Risk ratio, rate ratio, relative risk, odds ratio
- Proportion
  - Numerator expressed as a
    - Decimal fraction (eg 0.2)
    - Vulgar fraction (eg 1/5)
    - Percentage (eg 20%)
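The three ways of writing the same proportion can be shown with Python's `fractions` module. This is a small illustration using the 20-in-100 example; nothing here goes beyond the figures in the list above.

```python
from fractions import Fraction

proportion = Fraction(20, 100)  # automatically reduced to lowest terms

as_decimal = float(proportion)
as_vulgar_fraction = str(proportion)
as_percentage = f"{float(proportion):.0%}"

print(as_decimal, as_vulgar_fraction, as_percentage)
```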
27 Burden of disease studies
- Adjusted measures
- PYLL (Potential years of life lost)
  - A measure of the impact of deaths from disease on society: the years which would have been lived had the disease not happened
- QALYs (Quality adjusted life years)
  - Overall life expectancy reduced by the years lost due to chronic disease, disability etc
- DALYs (Disability adjusted life years)
  - Life expectancy adjusted for long-term disability, taking treatments into account; widely used but criticised as they are based in part on guesswork
28 Some common epidemiological terms describing the measurements of health status
- Indicators: what are they?
- Epidemiological indicators
- Economic indicators
- The DALY
- The QALY
- Some qualitative health indicators
- Social indicators
29 Health indicators
- Can a common measure be used as an advocacy tool?
- Epidemiological indicators
- Economic indicators
- Social indicators
- Fatal and non-fatal indices
30 Epidemiological indicators
- Numerator and denominator
- Incidence and prevalence
- Rate
  - Demographic vital statistics (birth, population growth and change, mortality)
- Ratios and proportions
  - incidence
  - attack
  - prevalence
  - case fatality
- Premature mortality: (potential) years of life lost ((P)YLL) is the standard measure
  - What standard to use?
31 Economic indicators
- Burden of disease studies
- Health Expectancies
  - eg disability free life expectancy (DFLE)
- Health Gaps
  - DALY (DALY = YLL + YLD)
  - QALY
- Quantified burden of disease
  - Death
  - Disability
  - Years lost due to disability (YLD)
- Non-fatal disease measurement
  - Card sort
  - Visual analogue
  - Time trade off
  - Standard gamble
  - Person trade off
32 The DALY
- First used extensively in the World Bank's World Development Report 1993: Investing in Health (see World Bank website)
- A summary measure of population health
- Measures fatal and non-fatal outcomes
- Allows estimates of health impact / effectiveness
- 1 DALY = 1 lost year of healthy life
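A DALY total combines a fatal component (years of life lost, YLL) and a non-fatal component (years lost due to disability, YLD). The sketch below uses a deliberately simplified form, with no discounting or age weighting, and every figure, including the disability weight, is hypothetical. It is an illustration of the idea, not the method of any particular burden of disease study.

```python
def years_of_life_lost(deaths, remaining_life_expectancy):
    """YLL: deaths multiplied by years each person would otherwise have lived."""
    return deaths * remaining_life_expectancy

def years_lost_to_disability(cases, disability_weight, duration_years):
    """YLD: cases x severity weight (0 = full health, 1 = death) x duration."""
    return cases * disability_weight * duration_years

# Hypothetical disease in one population (illustrative only).
yll = years_of_life_lost(deaths=10, remaining_life_expectancy=30)
yld = years_lost_to_disability(cases=200, disability_weight=0.2, duration_years=5)
dalys = yll + yld  # healthy years lost to death plus to disability

print(yll, yld, dalys)
```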
33 The QALY
- Another way of summarising population health
- Common measure for fatal and non-fatal outcomes
- Covers more than the DALY, includes some impacts of disability
- Estimates of health impact / effectiveness possible
34 Qualitative health indicators
- Opportunity - resilience, ability to withstand stress, reserve
- Perceptions - satisfaction, self-rating
- Social function - participation, limitations in social roles, contact, intimacy
- Psychological function - happiness, reasoning capacity, distress, behavior
- Physical function - mobility, sleep, performance, fatigue
- Impairment - symptoms, signs, diagnosis, physiological measures
35 Social health indicators
- Housing / shelter
- Food
- Education
- Employment
36 Basic measures of risk
- Higher incidence in a group in the population
  - Basic rate
- Higher exposure rate in disease group than comparison group
  - Risk ratio, odds ratio
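Both measures named above can be read off a 2x2 table of exposure against disease. The sketch below uses invented counts; the cell labels a, b, c, d follow a common textbook layout and are an assumption, not notation from the session.

```python
def risk_ratio(a, b, c, d):
    """a, b = cases and non-cases among the exposed;
    c, d = cases and non-cases among the unexposed."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed

def odds_ratio(a, b, c, d):
    """Odds of disease in the exposed divided by odds in the unexposed."""
    return (a * d) / (b * c)

# Hypothetical study: 30 of 100 exposed and 10 of 100 unexposed became cases.
rr = risk_ratio(30, 70, 10, 90)
odds = odds_ratio(30, 70, 10, 90)

print(round(rr, 2), round(odds, 2))
```

In this made-up table the exposed group has three times the risk, while the odds ratio is somewhat larger, as is usual when the disease is not rare.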
37 Exercise 3: Data types
38 Exercise 3: Data types (Answers)
39 Sampling techniques
- How are populations sampled in order to study them?
- How do you know how many people to sample?
- A sample should reflect the characteristics of the population from which it is drawn
- 2 basic kinds of sampling techniques
  - Based on probability sampling (random)
  - Based on convenience sampling (purposive)
40 Sample Size Calculations
A range of values within which we are 95% sure the true value is situated
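One way to see why sample size matters is to ask how many people are needed for the 95% range to be a given width. The sketch below uses the common normal-approximation formula for a single proportion under simple random sampling, n = z² p (1 - p) / d². That formula is brought in here as an assumption for illustration (it is not stated in the session), and the inputs are hypothetical.

```python
import math

def sample_size_for_proportion(expected_proportion, margin_of_error, z=1.96):
    """Smallest n giving a 95% CI of +/- margin_of_error around a proportion,
    assuming simple random sampling from a large population."""
    n = (z ** 2 * expected_proportion * (1 - expected_proportion)
         / margin_of_error ** 2)
    return math.ceil(n)

# Expected coverage of 50%, which is the most demanding case.
n_for_5_points = sample_size_for_proportion(0.5, 0.05)
n_for_10_points = sample_size_for_proportion(0.5, 0.10)

print(n_for_5_points, n_for_10_points)
```

Halving the acceptable margin of error roughly quadruples the number of people who must be sampled.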
41 Sampling techniques
- A sample ought to possess all the characteristics of the population from which it is drawn to be fully representative of this population
- Some common sampling methods
  - Total population
  - Simple random
  - Systematic
  - Cluster
  - Stratified and matched (one sample per stratum or group to be compared)
  - Latin squares
42 Total population
- All members of a population are identifiable
- All members are contactable
- Possible with notifiable diseases (eg meningococcal disease, Q fever) and other events (eg childbirth)
- Less successful with common diseases
- Unlikely to be successful without population-based registries
43 Simple Random Sampling
- Each member of the population (or sampling unit) has an equal chance of being selected
- Requires a sampling frame
- Required sample is randomly chosen one-by-one from numbers representing each unit
- Often costly and time consuming
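Drawing a simple random sample from a numbered sampling frame can be sketched with the standard `random` module. The frame size and sample size below are hypothetical.

```python
import random

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical sampling frame: every unit in the population has a number.
sampling_frame = list(range(1, 501))
SAMPLE_SIZE = 25

# random.sample draws without replacement, each unit equally likely.
simple_random_sample = random.sample(sampling_frame, SAMPLE_SIZE)

print(sorted(simple_random_sample))
```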
44 Systematic Sampling
- Population is listed systematically, say by number or alphanumerically
- Sampling fraction (k) is calculated by dividing the population (N) by the sample size (n)
  - eg N = 18,000
  - n = 900
  - k = 20
- The first member of the sample is selected by choosing a random number between 1 and the sampling fraction (1 and 20)
- Every 20th member is then sampled until the list is exhausted
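The steps above can be sketched directly, using the N, n and k values from the example. The assumption that list positions run from 1 to N is made here for illustration.

```python
import random

random.seed(11)  # fixed seed so the sketch is repeatable

N = 18_000   # population size (from the example above)
n = 900      # required sample size
k = N // n   # sampling fraction: every k-th member is taken

# Random start between 1 and k, then every k-th position on the list.
start = random.randint(1, k)
systematic_sample = list(range(start, N + 1, k))

print(k, start, len(systematic_sample))
```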
45 Cluster Sampling
- The population is grouped into clusters or primary sampling units (PSUs)
- A PSU could be a village, a district, a refugee camp, a school or hospital, or any well-defined group of persons or households
- You do need to know the population size of each PSU
- Clusters (for WHO studies usually 30) are chosen in Stage 1 with a probability according to size
- In Stage 2, an equal number of sampled persons or households is chosen randomly in each cluster
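The two stages can be sketched as follows. Stage 1 here uses systematic selection along a cumulative population list, which is one common way of choosing clusters with probability according to size; it is offered as an assumption, not as the specific procedure of any WHO study. Village names, population sizes and the Stage 2 sample size of 7 are all invented.

```python
import random
from bisect import bisect_left
from itertools import accumulate

random.seed(3)  # fixed seed so the sketch is repeatable

# Hypothetical PSUs (villages) with made-up population sizes.
psu_sizes = {f"village_{i:02d}": random.randint(200, 3000) for i in range(1, 61)}
N_CLUSTERS = 30   # the usual number for WHO studies, per the text above
PER_CLUSTER = 7   # assumed Stage 2 sample size per cluster

# Stage 1: step along the cumulative population list at a fixed interval,
# so larger PSUs are more likely to be hit (a very large PSU can be hit twice).
names = list(psu_sizes)
cumulative = list(accumulate(psu_sizes[name] for name in names))
interval = cumulative[-1] / N_CLUSTERS
start = random.uniform(0, interval)
chosen = []
for i in range(N_CLUSTERS):
    point = start + i * interval
    index = min(bisect_left(cumulative, point), len(names) - 1)
    chosen.append(names[index])

# Stage 2: the same number of units drawn at random within each chosen PSU.
stage2 = {name: random.sample(range(1, psu_sizes[name] + 1), PER_CLUSTER)
          for name in chosen}

print(len(chosen), "clusters selected")
```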
46 Stratified and matched sampling
- Strata / matching criteria are identified prior to commencement
- May be structured to reflect the population, or may be deliberately constructed to collect more from specific groups
- Strata groups which will be compared have equal numbers
47 Accuracy - the aim of every study
48 Confidence Intervals
Vaccination coverage: Town A 20% ± 10%, Town B 45% ± 10%, Town C 60% ± 10%. Different?
[Figure: confidence intervals for Towns A, B and C on a 0 to 100% coverage axis]
49 Is there a difference between 2 groups? (two proportions?)
Look at confidence intervals
- (a) No difference detected
- (b) Definite difference detected
- (c) Not sure: test (do χ² or t-test, calculate p value)
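The "look at confidence intervals" rule can be sketched as a simple overlap check. The town figures come from the vaccination coverage slide, read here as an estimate with a margin of plus or minus 10 percentage points; that reading is an assumption. The helper names are introduced for illustration only.

```python
def interval(estimate, margin):
    """Lower and upper limits of estimate +/- margin."""
    return (estimate - margin, estimate + margin)

def intervals_overlap(first, second):
    """True if the two (low, high) ranges share any values."""
    return first[0] <= second[1] and second[0] <= first[1]

# Vaccination coverage in percent, assumed to be estimate +/- 10 points.
towns = {"A": interval(20, 10), "B": interval(45, 10), "C": interval(60, 10)}

a_vs_b = intervals_overlap(towns["A"], towns["B"])
b_vs_c = intervals_overlap(towns["B"], towns["C"])
a_vs_c = intervals_overlap(towns["A"], towns["C"])

for pair, overlap in (("A vs B", a_vs_b), ("B vs C", b_vs_c), ("A vs C", a_vs_c)):
    verdict = "not sure, do a formal test" if overlap else "difference detected"
    print(pair, verdict)
```

Non-overlapping intervals point to a difference; overlapping intervals leave the question open and call for a formal test.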
50 Part two: Qualitative data
- By the end of this session you will be able to
- Understand some common qualitative data collection methods: how and when they are used
  - Focus groups
  - Nominal group techniques
  - Delphi methods, in-depth interviews
  - Visual materials and unobtrusive methods
- Action research in community health settings
- Combined quantitative and qualitative methods: when, how and why
51 What can qualitative data tell us?
- The experience of the disease/health event
  - How it feels to be in an affected group
  - Whether interventions are appreciated, and why
  - What the experience of the people involved was
- And to some extent Why
- But nothing about the epidemiologic triad
  - Who is affected
  - When did it happen
  - Where do they fit in to their community
- Nor about the size of the problem
52 How are they used?
- To understand people's perceptions and meanings of a situation or process
- To delineate attitudes
- To explain practices and processes
53 When are they used?
- To inform background to study
- To identify common experiences
- To frame quantitative questions
- To enrich quality of data
- To understand some quantitative results needing further exploration
- Sometimes, to understand causality in terms of the above
54 Theoretical Frameworks
55 Some commonly used qualitative data collection methods
- Interactive
  - Interviews
  - In-depth interviews
  - Focus groups
  - Nominal group techniques
  - Participant observation
  - Delphi methods
  - Diary
- Unobtrusive
  - Observation
  - Visual materials
  - Secondary analysis of existing data
56 Interviews
- Face to face, telephone, and internet-based techniques
- Structured <<< semi-structured >>> unstructured
- Structured: questionnaire (large surveys needing consistent approach)
  - Interviewee = Respondent
- Semi-structured: interview schedule, aide-memoire (small scale studies)
- Unstructured: conversation geared towards research topic (small scale studies)
- Semi- and un-structured interviews are designed to elicit meaning and context to responses
  - Interviewee = Informant
57 In-depth interviews
- Dynamic process
- Theme list rather than interview schedule
- Often take 1-2 hours
- Questions may evolve as the interviews progress and the researcher develops and integrates understanding
- Researcher can use prompts and go back to previous questions if it helps understanding
- Needs a skilled interviewer; taped if possible and transcribed
58 Focus groups
- Groups of 4-10 people based on a common experience
  - eg New mothers, women at work, bus drivers
- Held usually at neutral location
- Usually take no more than an hour
- Should be a relaxed and friendly process
- Theme list is the basis for discussion
- Not much time for evolution
- Needs a skilled moderator and scribe
- Taped if possible
- Transcribed later for analysis
59 Nominal group techniques
- People recruited on the basis of particular characteristics of interest, often quite complicated to identify, access and recruit
  - Eg IDUs, NESB people, mothers of children with a handicap
- Data collection may be singly or in groups
- Used to canvass variety and common features of personal opinions, feelings, meanings
- Does not identify strength of feeling
60 (Participant) observation
- Ethnographic technique
- Researcher may, or may not, belong to the studied group
- Used to record ongoing processes and reflect on meanings of practice
  - Studies of sick care
  - Studies of food handling
  - Studies of infant rearing
- Can be useful in recording qualitative information of unknown scope in a systematic way
61 Observation
- Ethnographic technique
- Researcher may or may not belong to the studied group
- Used to record ongoing processes and reflect on meanings of practice
  - Studies of sick care
  - Studies of food handling
  - Studies of infant rearing
62 Delphi methods
- Recruitment of representative group
- Series of sets of questions with increasingly structured framework
  - Eg NPHP core public health functions
63 Diary methods
- Systematic recording of reflections of a process
- Reflections used as the raw data for further analysis
- Introspective data collection; however, analysis can be illuminating
- Think of some famous historical diaries and how they tell us stories
64 Visual and published materials
- Paintings and drawings; graffiti
  - Often the author is identifiable and contactable
- Photographs
  - Treatment of various subgroups, e.g. asylum seekers
- Advertisements
  - e.g. doctors and depressed patients study
- Printed materials, eg newspaper articles
  - Risk communication study
- Web materials, esp website chatrooms
65 Secondary analysis
- Meta-analysis and other syntheses of sets of studies
- Some researchers return to pre-existing datasets (their own and other people's) and reanalyse them using new frameworks and insights
- Handle with care: why were the data originally collected?
66 Action research in community health settings
- With respect to a specific research question, action research combines
  - Published research to identify the relevant issues
  - A focus group of stakeholders to identify key issues, needs, concerns
- Brings the findings of both arms back to the group
- The group themselves organise what action to take with respect to the problem
67 Combined methods and triangulation
- Triangulation is a powerful way of collecting data where an outcome measure may need to be verified
- MCH Nurses' administration of immunisation
  - Coverage study
  - ACIR
  - Vaccine use
  - Users study
  - MCH opinion
68 Recruitment and Sampling techniques (1)
- Probability and non-probability sampling
- Sample size
  - Depends on study objectives, generally 20-30
  - Notion of saturation
  - If any statistical analysis is to be done, minimum of 5 participants for each cell, usually at least 30
69 Recruitment and Sampling techniques (2)
- Theoretical (grounded theory)
- Where are the data and where will you find them
- Quota
- Snowball
- Purposive (judgemental)
- Convenience
70 Qualitative data analysis
- Grounded theory (thematic analysis)
- Content analysis
- Discourse analysis
- Semiotics / post-structuralism
71 Ethical considerations
- Today some methods are becoming harder to undertake because of ethical constraints around
  - Privacy issues
  - Observation techniques
  - Recruitment techniques
72 References
- Good summaries to be found in
  - Kerr, Taylor and Heard (eds). Handbook of Public Health Methods. McGraw Hill, 1998
  - Liamputtong and Ezzy. Qualitative Research Methods. OUP, 2004