Title: GIS in Spatial Epidemiology: small area studies of exposure-outcome relationships
1 GIS in Spatial Epidemiology: small area studies of exposure-outcome relationships
- Robert Haining
- Department of Geography
- University of Cambridge
2
- Spatial epidemiology
- Some definitions
- Geographical correlation studies
- Framework for analysis
- Problems with small area analysis
- Reasons for conducting small area analysis
- Good practice
- Regression models
- Reference to a case study
- Data issues
- Statistical modelling
3
- Spatial epidemiology is concerned with describing and understanding spatial variation in disease risk.
- Individual level data
- Counts for small areas.
- Recent developments owe much to
- Geo-referenced health and population data
- Computing advances
- Development of GIS
- Statistical methodology.
4 Geographical correlation studies
- These studies typically involve examining geographical variations in exposure to environmental variables (air, water, soil, etc.) and their association with health outcomes, whilst controlling for other relevant factors using regression.
5 Framework for analysis
- Population is unevenly distributed geographically
- People move around (day-to-day movements; longer term movements, including migration)
- People possess relevant individual characteristics (age, sex, genetic make-up, lifestyle, etc.)
- People live in communities
6 Problems with small area analyses
- Frequency and quality of population data (e.g. Census every 10 years)
- Spatial compatibility of different data sets
- Availability of data on population movements
- Measuring population exposure to the environmental variable
- Environmental impacts are often likely to be quite small (relative to, for example, lifestyle effects) and there may be serious confounding effects
- Cannot estimate strength of an association
- Ecological (or aggregation) bias.
7 Reasons for conducting small area analysis
- Provides a qualitative answer about the existence of an association (e.g. between environmental variable and health outcome)
- May provide evidence that can be followed up in other ways.
8 Good practice (Richardson 1992)
- Allow for heterogeneity of exposure
- Use well defined population groups
- Use survey data to help obtain good exposure data
- Allow for latency times
- Allow for population movement effects
9 Regression model specification
$O_i$ denotes the number of cases for area $i$, $i = 1, \dots, n$. If the outcome is rare, it is typically assumed that $O_i$ is Poisson distributed with parameter $\lambda_i$. The expected value of $O_i$ is written
\[ E[O_i] = \lambda_i = E_i r_i, \qquad i = 1, \dots, n, \]
where $r_i$ is the unknown area-specific relative risk in area $i$, and $E_i$ is the expected number of cases for area $i$ given the size of the population and its age and sex composition.
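As an illustration of how the expected count E_i might be obtained, the sketch below applies indirect standardisation: reference rates for each age-sex stratum are multiplied by the area's population in that stratum and summed. The strata, rates, area names and counts are all hypothetical, and the crude ratio O_i / E_i is shown only as a naive estimate of the relative risk r_i, not as the model-based estimate discussed in these slides.

```python
# Hypothetical reference rates (cases per person per year) by (age band, sex).
reference_rates = {
    ("45-64", "M"): 0.004,
    ("45-64", "F"): 0.002,
    ("65+", "M"): 0.020,
    ("65+", "F"): 0.012,
}

# Hypothetical small areas: population by stratum and observed cases O_i.
areas = {
    "ED_A": {
        "pop": {("45-64", "M"): 500, ("45-64", "F"): 500,
                ("65+", "M"): 100, ("65+", "F"): 150},
        "observed": 8,
    },
    "ED_B": {
        "pop": {("45-64", "M"): 200, ("45-64", "F"): 250,
                ("65+", "M"): 300, ("65+", "F"): 400},
        "observed": 10,
    },
}


def expected_cases(pop_by_stratum, rates):
    """E_i: sum over strata of reference rate times area population."""
    return sum(rates[stratum] * n for stratum, n in pop_by_stratum.items())


def crude_relative_risk(observed, expected):
    """Naive estimate of r_i as the ratio O_i / E_i."""
    return observed / expected


for name, area in areas.items():
    e_i = expected_cases(area["pop"], reference_rates)
    r_i = crude_relative_risk(area["observed"], e_i)
    print(f"{name}: O={area['observed']}, E={e_i:.2f}, O/E={r_i:.2f}")
```

With small expected counts the ratio O_i / E_i is unstable, which is one motivation for the regression and random effects models that follow.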
10
\[ \ln \lambda_i = \ln E_i + \ln r_i \]
\[ \ln \lambda_i = \ln E_i + \alpha + \beta_1 X_{1,i} + \beta_2 X_{2,i} + \dots + \beta_k X_{k,i} + \gamma Z_i \]
- This defines a Poisson regression model where $\alpha$ is the intercept parameter, and $\beta_1, \beta_2, \dots, \beta_k$ and $\gamma$ are regression parameters. $\ln E_i$ is an offset.
- The area-specific relative risk at $i$ is associated with attributes of the population $X_1, \dots, X_k$ and the environmental exposure $Z$ at $i$.
- Adjustment for overdispersion is necessary because of population heterogeneity at the scale of the individual small areas (see, for example, Manton and Stallard 1981).
- Allowance for data uncertainty arising from the use of sample data
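To make the role of the offset concrete, the sketch below fits a reduced version of the model with only an intercept and a single exposure term, ln(lambda_i) = ln(E_i) + intercept + slope * Z_i, by Newton-Raphson on the Poisson log-likelihood. It then computes a Pearson chi-square dispersion statistic as a simple check for overdispersion. The data are hypothetical, the population covariates X are omitted for brevity, and this is not the estimation procedure used in the case study.

```python
import math


def fit_poisson_offset(observed, expected, z, n_iter=50, tol=1e-10):
    """Fit ln(lambda_i) = ln(E_i) + intercept + slope * z_i by Newton-Raphson."""
    intercept = math.log(sum(observed) / sum(expected))  # start at overall O/E
    slope = 0.0
    for _ in range(n_iter):
        lam = [e * math.exp(intercept + slope * zi) for e, zi in zip(expected, z)]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(o - l for o, l in zip(observed, lam))
        g1 = sum((o - l) * zi for o, l, zi in zip(observed, lam, z))
        # 2x2 information matrix.
        h00 = sum(lam)
        h01 = sum(l * zi for l, zi in zip(lam, z))
        h11 = sum(l * zi * zi for l, zi in zip(lam, z))
        det = h00 * h11 - h01 * h01
        # Newton step: information matrix inverse times score.
        d_intercept = (h11 * g0 - h01 * g1) / det
        d_slope = (h00 * g1 - h01 * g0) / det
        intercept += d_intercept
        slope += d_slope
        if abs(d_intercept) + abs(d_slope) < tol:
            break
    return intercept, slope


def pearson_dispersion(observed, expected, z, intercept, slope):
    """Pearson chi-square divided by residual degrees of freedom (n - 2)."""
    lam = [e * math.exp(intercept + slope * zi) for e, zi in zip(expected, z)]
    chi2 = sum((o - l) ** 2 / l for o, l in zip(observed, lam))
    return chi2 / (len(observed) - 2)


# Hypothetical data: observed cases, expected cases and exposure per area.
obs = [12, 9, 20, 31, 18, 40]
exp_cases = [10.0, 11.0, 15.0, 22.0, 14.0, 25.0]
exposure = [0.5, 0.2, 1.1, 1.6, 0.9, 2.0]

a_hat, g_hat = fit_poisson_offset(obs, exp_cases, exposure)
phi = pearson_dispersion(obs, exp_cases, exposure, a_hat, g_hat)
print(f"intercept={a_hat:.3f}, slope={g_hat:.3f}, dispersion={phi:.2f}")
```

A dispersion value well above 1 would suggest extra-Poisson variation of the kind the overdispersion adjustment is meant to handle.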
11 A short case study I: Data Issues and GIS
- Demographic, social and economic data
- Pre-2001 Census
- Enumeration Districts (EDs)
- Wards.
- 2001 Census
- Output Areas (OAs)
- Super Output Areas (SOAs)
- Health data (heart disease, stroke; mortality, admissions)
- Individual records geo-referenced to ED
- Postcoded counts
- Environmental data (NOx, PM10, CO)
- Gridded
12
- Problem: obtain a measure of air pollution exposure at the ED level.
13 Step 1: Measuring NOx exposure. The Indic-Airviro model
14 Average annual mean pollution levels 1994-9 (excluding 1998): a) NOx (µg/m3) b) PM10 (µg/m3)
15 Comparing modelled and monitored values for NOx.
16 Step 2: Transferring the gridded data to the ED framework. Areal Interpolation (i): area weighting
17 Areal Interpolation (from grid to EDs) (ii): point in polygon, ED centroid
18 Areal Interpolation (from grid to EDs) (iii): point in polygon, weighted PostPoint
21 Weighted PostPoint and ED centroid exposure measures are very similar; areal weighting is different
22 Weighted PostPoint differs from both ED centroid and areal weighting
23 Where all three methods will give the same or similar results
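The three grid-to-ED transfer methods compared above (area weighting, ED centroid, weighted PostPoint) can be sketched on a toy grid to show why they may disagree. This is a minimal illustration under simplifying assumptions: the ED is an axis-aligned rectangle rather than a real polygon, grid cells are 1 km squares, and the point weights stand in for whatever count is attached to each postcode point (for example, residential addresses). All names and values are hypothetical.

```python
CELL = 1.0  # assumed grid cell size in km

# grid[row][col] covers x in [col, col + 1) and y in [row, row + 1).
grid = [
    [10.0, 20.0],
    [30.0, 40.0],
]


def cell_value(grid, x, y):
    """Pollution value of the grid cell containing the point (x, y)."""
    return grid[int(y // CELL)][int(x // CELL)]


def area_weighted(grid, zone):
    """Mean of cell values weighted by each cell's overlap area with the zone."""
    xmin, ymin, xmax, ymax = zone
    total = 0.0
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            w = max(0.0, min(xmax, (c + 1) * CELL) - max(xmin, c * CELL))
            h = max(0.0, min(ymax, (r + 1) * CELL) - max(ymin, r * CELL))
            total += value * w * h
    return total / ((xmax - xmin) * (ymax - ymin))


def centroid_value(grid, zone):
    """Value of the single cell containing the zone centroid."""
    xmin, ymin, xmax, ymax = zone
    return cell_value(grid, (xmin + xmax) / 2, (ymin + ymax) / 2)


def weighted_points(grid, points):
    """Weighted mean of cell values at (x, y, weight) population points."""
    total_weight = sum(w for _, _, w in points)
    return sum(cell_value(grid, x, y) * w for x, y, w in points) / total_weight


# Hypothetical ED (xmin, ymin, xmax, ymax) lying inside the grid, and
# hypothetical postcode points clustered in one corner of it.
ed = (0.5, 0.5, 2.0, 1.5)
postcode_points = [(0.6, 0.6, 30), (1.5, 0.7, 10)]

print("area weighted:  ", area_weighted(grid, ed))
print("ED centroid:    ", centroid_value(grid, ed))
print("weighted points:", weighted_points(grid, postcode_points))
```

In this toy case the population sits in the low-pollution corner of the ED, so the point-based estimate is lower than the area-weighted one. Where the pollution surface is flat across an ED, all three methods return the same value.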
24 Step 3: Making allowance for population movements
- 1. Long term population movement
- Sheffield Health and Illness Prevalence study
- 12,239 representative individuals aged 18-94 tracked from 1994-2002
- 1491 died; 1572 left Sheffield.
- Of the 9176 remaining:
- 70% did not move
- 23% made 1 move
- 5% made 2 moves
- Just over 1% made 3 moves
- Under 1% made 4 or more moves.
- => Significant risk of misclassification of exposure level.
25
- 2. Short term population movement
26 Spatially smoothed CO: average of the annual mean pollution levels (1994-1999, excluding 1998) for Sheffield enumeration districts (µg/m3): (i) 1km (ii) 2km (iii) 4km
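The slides do not state how the smoothing was carried out. One plausible reading, sketched below purely as an assumption, is that each ED's value is replaced by the unweighted mean over all EDs whose centroids lie within the given distance (1, 2 or 4 km), as a rough proxy for exposure during day-to-day movement. The centroids and pollution values are hypothetical.

```python
import math


def smooth_by_radius(centroids, values, radius_km):
    """Replace each area's value by the mean over areas within radius_km.

    The area itself is always included, since its distance to itself is 0.
    """
    smoothed = []
    for xi, yi in centroids:
        neighbours = [
            v
            for (xj, yj), v in zip(centroids, values)
            if math.hypot(xi - xj, yi - yj) <= radius_km
        ]
        smoothed.append(sum(neighbours) / len(neighbours))
    return smoothed


# Hypothetical ED centroids (km) and annual mean pollution values.
ed_centroids = [(0.0, 0.0), (1.5, 0.0), (3.5, 0.0)]
ed_values = [100.0, 200.0, 600.0]

for radius in (1.0, 2.0, 4.0):
    print(radius, smooth_by_radius(ed_centroids, ed_values, radius))
```

Larger radii pull every ED towards the city-wide mean, so exposure contrasts between areas shrink as the assumed range of movement grows. A population-weighted mean would be a natural variant.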
27 Comparing indoor and outdoor air pollution exposure
- People spend between 75% and 90% of their time indoors.
- Indoor pollution levels depend not only on outdoor emissions but on housing conditions (cooking, heating, ventilation, etc.).
- Evidence on relationship between indoor and outdoor pollution levels
29 Statistical modelling issues
\[ \ln \lambda_i = \ln E_i + \alpha + \beta_1 X_{1,i} + \beta_2 X_{2,i} + \dots + \beta_k X_{k,i} + \gamma Z_i \]
- 1. Overdispersion linked to spatially correlated missing covariates.
- 2. Sampling errors where data are based on surveys (e.g. lifestyle data).
- Fitted spatially structured random effects models in WinBUGS (MCMC estimation) to handle overdispersion
- Used posterior densities for some of the lifestyle covariates (e.g. smoking prevalence)
- WinBUGS output sent to GIS to map model output (e.g. area-specific risks).
30 Map of excess relative risks of coronary heart disease. An area $i$ is considered to have excess relative risk when 97.5% of the simulated values of the relative risk of area $i$ ($r_i$) are greater than 1.
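The excess-risk rule described above amounts to a posterior exceedance probability computed from the MCMC draws for each area. The sketch below shows that calculation on simulated draws; the area names and the lognormal draws are hypothetical stand-ins for real WinBUGS output, and "at least 97.5%" is assumed as the reading of the cut-off.

```python
import random


def exceedance_probability(samples, threshold=1.0):
    """Share of simulated relative-risk values strictly above the threshold."""
    return sum(1 for s in samples if s > threshold) / len(samples)


def has_excess_risk(samples, cutoff=0.975):
    """Flag an area when at least `cutoff` of its draws exceed 1."""
    return exceedance_probability(samples) >= cutoff


# Hypothetical posterior draws of r_i for two areas.
random.seed(0)
draws = {
    "ED_A": [random.lognormvariate(0.4, 0.15) for _ in range(2000)],
    "ED_B": [random.lognormvariate(0.05, 0.20) for _ in range(2000)],
}

for name, samples in draws.items():
    p = exceedance_probability(samples)
    print(f"{name}: P(r > 1) = {p:.3f}, excess risk: {has_excess_risk(samples)}")
```

The resulting flags are the kind of per-area output that can be joined back to the ED boundaries in a GIS for mapping.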
31 References
- P. Brindley, R. Maheswaran, T. Pearson, S. Wise and R. Haining (2004) Using modelled outdoor air pollution data for health surveillance. In R. Maheswaran and M. Craglia (eds) GIS in Public Health Practice. Taylor and Francis, London, pp. 125-149.
- P. Brindley, S. Wise, R. Maheswaran and R. Haining (2005) The effect of alternative representations of population location on the areal interpolation of air pollution exposure. Computers, Environment and Urban Systems, 29, 455-469.
- R. Maheswaran, R. Haining, P. Brindley, J. Law, T. Pearson and N. Best (2006) Outdoor NOx and stroke mortality: adjusting for small area level smoking prevalence using a Bayesian approach. Statistical Methods in Medical Research, 15, 499-516.