Title: Part B: Spatial Autocorrelation and regression modelling
1Chapter 5
- Part B: Spatial Autocorrelation and regression modelling
2Autocorrelation
- Time series correlation model
- x_{t,1}, t = 1,2,3,...,n-1 and x_{t,2}, t = 2,3,4,...,n
3Spatial Autocorrelation
- Correlation coefficient
- x_i, i = 1,2,3,...,n; y_i, i = 1,2,3,...,n
- Time series correlation model
- x_{t,1}, t = 1,2,3,...,n-1 and x_{t,2}, t = 2,3,4,...,n
- Mean values
Lag 1 autocorrelation - large n
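The lag 1 idea can be sketched in a few lines of Python: correlate the series with itself shifted by one step. This is a minimal illustration, not the slide's own formula (which did not survive extraction). The function names and the sample series are hypothetical; the "large n" version assumes both sub-series share the overall mean and variance.

```python
import numpy as np

def lag1_autocorrelation(x):
    """Pearson correlation between x_1..x_{n-1} and x_2..x_n (separate means)."""
    x = np.asarray(x, dtype=float)
    head, tail = x[:-1], x[1:]
    dh, dt = head - head.mean(), tail - tail.mean()
    return float((dh * dt).sum() / np.sqrt((dh ** 2).sum() * (dt ** 2).sum()))

def lag1_autocorrelation_large_n(x):
    """Large-n approximation: one overall mean and one overall sum of squares."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    return float((d[:-1] * d[1:]).sum() / (d ** 2).sum())

series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # hypothetical short series
r_exact = lag1_autocorrelation(series)
r_approx = lag1_autocorrelation_large_n(series)
```

For such a short series the two versions differ noticeably; they converge as n grows.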
4Spatial Autocorrelation
- Classical statistical model assumptions
- Independence vs dependence in time and space
- Tobler's first law
- All things are related, but nearby things are more related than distant things
- Spatial dependence and autocorrelation
- Correlation and Correlograms
5Spatial Autocorrelation
- Covariance and autocovariance
- Lags: fixed or variable interval
- Correlograms and range
- Stationary and non-stationary patterns
- Outliers
- Extending concept to spatial domain
- Transects
- Neighbourhoods and distance-based models
6Spatial Autocorrelation
- Global spatial autocorrelation
- Dataset issues: regular grids; irregular lattice (zonal) datasets; point samples
- Simple binary coded regular grids: use of joins counts
- Irregular grids and lattices: extension to x,y,z data representation
- Use of x,y,z model for point datasets
- Local spatial autocorrelation
- Disaggregating global models
7Spatial Autocorrelation
A. Completely separated pattern (+ve)
B. Evenly spaced pattern (-ve)
C. Random pattern
8Spatial Autocorrelation
- Joins count
- Binary coding
- Edge effects
- Double counting
- Free vs non-free sampling
- Expected values (free sampling)
- 1-1: 15/60, 0-0: 15/60, 0-1 or 1-0: 30/60
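A joins count can be sketched as follows. The expected values quoted above (15, 15 and 30 joins out of 60) are consistent with free sampling at p = 0.5 on a 6 x 6 grid with rook's-move joins, which has 60 joins; the grid size, the two test patterns and the function names are assumptions for illustration, since the slide's figure is not reproduced here. Each join is counted once, which avoids double counting.

```python
import numpy as np

def joins_count(grid):
    """Count 1-1, 0-0 and 0-1 joins under rook's-move adjacency, each join once."""
    g = np.asarray(grid, dtype=int)
    # Pair every cell with its right-hand and its lower neighbour
    pairs = [(g[:, :-1], g[:, 1:]), (g[:-1, :], g[1:, :])]
    j11 = sum(int(((a == 1) & (b == 1)).sum()) for a, b in pairs)
    j00 = sum(int(((a == 0) & (b == 0)).sum()) for a, b in pairs)
    j01 = sum(int((a != b).sum()) for a, b in pairs)
    return j11, j00, j01

def expected_joins_free(n_joins, p):
    """Expected 1-1, 0-0 and 0-1 counts under free sampling with P(1) = p."""
    q = 1.0 - p
    return n_joins * p * p, n_joins * q * q, 2.0 * n_joins * p * q

separated = np.zeros((6, 6), dtype=int)
separated[:, 3:] = 1                              # left half 0, right half 1
checker = np.indices((6, 6)).sum(axis=0) % 2      # evenly spaced (chessboard)

counts_separated = joins_count(separated)
counts_checker = joins_count(checker)
expected = expected_joins_free(60, 0.5)
```

The separated pattern gives far more like-to-like joins than expected, and the chessboard gives none, which is the contrast between positive and negative autocorrelation.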
9Spatial Autocorrelation
A. Completely separated (+ve)
B. Evenly spaced (-ve)
C. Random
10Spatial Autocorrelation
- Joins count: some issues
- Multiple z-scores
- Binary or k-class data
- Rook's move vs other moves
- First order lag vs higher orders
- Equal vs unequal weights
- Regular grids vs other datasets
- Global vs local statistics
- Sensitivity to model components
11Spatial Autocorrelation
- Irregular lattice (x,y,z) and adjacency tables
Cell data (blank positions are not part of the lattice)
        col 1   col 2   col 3
row 1            4.55    5.54
row 2    2.24   -5.15    9.02
row 3    3.10   -4.39   -2.09
row 4            0.46   -3.06
Cell coordinates (row,col)
1,1  1,2  1,3
2,1  2,2  2,3
3,1  3,2  3,3
4,1  4,2  4,3
x,y,z view
x  y      z
1  2   4.55
1  3   5.54
2  1   2.24
2  2  -5.15
2  3   9.02
3  1   3.10
3  2  -4.39
3  3  -2.09
4  2   0.46
4  3  -3.06
Cell numbering
      3   7
  1   4   8
  2   5   9
      6  10
Adjacency matrix: total number of 1s = 26
12Spatial Autocorrelation
- Spatial (auto)correlation coefficient
- Coordinate (x,y,z) data representation for cells
- Spatial weights matrix (binary or other), W = {w_ij}
- From last slide: Σ w_ij = 26
- Coefficient formulation desirable properties
- Reflects co-variation patterns
- Reflects adjacency patterns via weights matrix
- Normalised for absolute cell values
- Normalised for data variation
- Adjusts for number of included cells in totals
13Spatial Autocorrelation
14Spatial Autocorrelation
Moran I = 10 × 16.19 / (26 × 196.68) = 0.0317 ≈ 0
A. Computation of variance/covariance-like quantities, matrix C
B. C*W: adjustment by multiplication by the weighting matrix, W
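The Moran I figure above can be reproduced from the ten-cell lattice listed earlier. The sketch below assumes rook's-move binary weights (which yield the 26 adjacency entries) and the usual form I = n Σ w_ij (z_i - m)(z_j - m) / (Σ w_ij × Σ (z_i - m)^2), where m is the mean; the function names are illustrative.

```python
import numpy as np

# Cell values keyed by (row, col), copied from the x,y,z listing in the slides
cells = {(1, 2): 4.55, (1, 3): 5.54, (2, 1): 2.24, (2, 2): -5.15, (2, 3): 9.02,
         (3, 1): 3.10, (3, 2): -4.39, (3, 3): -2.09, (4, 2): 0.46, (4, 3): -3.06}

def rook_weights(coords):
    """Binary weights: 1 where two cells share an edge, 0 otherwise."""
    n = len(coords)
    w = np.zeros((n, n))
    for i, (r1, c1) in enumerate(coords):
        for j, (r2, c2) in enumerate(coords):
            if abs(r1 - r2) + abs(c1 - c2) == 1:
                w[i, j] = 1.0
    return w

def morans_i(z, w):
    """Global Moran's I for values z and weights matrix w."""
    z = np.asarray(z, dtype=float)
    d = z - z.mean()
    c = np.outer(d, d)              # step A: covariance-like matrix C
    cw = c * w                      # step B: element-wise adjustment by W
    return len(z) * cw.sum() / (w.sum() * (d ** 2).sum())

coords = list(cells)
z = np.array([cells[k] for k in coords])
W = rook_weights(coords)
moran = morans_i(z, W)
```

Here `W.sum()` is 26, the sum of squared deviations is about 196.68 and the weighted cross-product sum is about 16.19, matching the terms in the slide.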
15Spatial Autocorrelation
- Moran's I
- Modification for point data
- Replace weights matrix with distance bands, width h
- Pre-normalise z values by subtracting means
- Count number of other points in each band, N(h)
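One way to read the point-data modification is to compute Moran's I separately for each distance band, using a binary weight of 1 for pairs whose separation falls in the band. The sketch below makes that assumption; the exact normalisation used in the original slide is not recoverable, so treat the formula, the function name and the sample points as illustrative only.

```python
import numpy as np

def moran_correlogram(xy, z, band_width, n_bands):
    """Return (upper band limit, N(h), I(h)) for each distance band."""
    xy = np.asarray(xy, dtype=float)
    z = np.asarray(z, dtype=float)
    d = z - z.mean()                                   # pre-normalise z
    dist = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=2))
    cross = np.outer(d, d)
    ss = (d ** 2).sum()
    result = []
    for b in range(n_bands):
        lo, hi = b * band_width, (b + 1) * band_width
        in_band = (dist > lo) & (dist <= hi)           # excludes i == j
        n_h = int(in_band.sum())                       # ordered pairs in band
        i_h = len(z) * cross[in_band].sum() / (n_h * ss) if n_h else float("nan")
        result.append((hi, n_h, i_h))
    return result

# Hypothetical points along a line with a steady trend in z
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
values = [1, 2, 3, 4, 5, 6]
correlogram = moran_correlogram(points, values, band_width=1.0, n_bands=2)
```

Plotting I(h) against h gives the correlogram; the index typically falls as the lag distance grows.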
16Spatial Autocorrelation
Figure panels: source data points; lag distance bands, h; correlogram
17Spatial Autocorrelation
- Geary C
- Co-variation model uses squared differences rather than products
- Similar approach is used in geostatistics
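A sketch of Geary's C, assuming the standard form C = (n - 1) Σ w_ij (z_i - z_j)^2 / (2 Σ w_ij × Σ (z_i - m)^2). Values below 1 suggest positive autocorrelation and values above 1 suggest negative autocorrelation. The four-cell transect is a hypothetical example.

```python
import numpy as np

def gearys_c(z, w):
    """Geary's C: based on squared differences between neighbouring values."""
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    diff2 = (z[:, None] - z[None, :]) ** 2
    d = z - z.mean()
    return (len(z) - 1) * (w * diff2).sum() / (2.0 * w.sum() * (d ** 2).sum())

# Four cells in a row; each cell is adjacent to its immediate neighbours
W4 = np.array([[0, 1, 0, 0],
               [1, 0, 1, 0],
               [0, 1, 0, 1],
               [0, 0, 1, 0]], dtype=float)

c_trend = gearys_c([1, 2, 3, 4], W4)          # smooth trend: similar neighbours
c_alternating = gearys_c([1, -1, 1, -1], W4)  # alternating: dissimilar neighbours
```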
18Spatial Autocorrelation
- Extending SA concepts
- Distance formula weights vs bands
- Lattice models with more complex neighbourhoods and lag models (see GeoDa)
- Disaggregation of SA index computations (row-wise) with/without row standardisation (LISA)
- Significance testing
- Normal model
- Randomisation models
- Bonferroni/other corrections
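The row-wise disaggregation and the randomisation test can be sketched together. This is one common formulation, not necessarily the one used in the original slides: the local terms are scaled so that they sum to the global Moran's I, no row standardisation is applied, and the pseudo p-value is one-sided (testing for positive SA). The transect data and all names are hypothetical.

```python
import numpy as np

def local_morans_i(z, w):
    """Row-wise terms of Moran's I; their sum equals the global index."""
    z = np.asarray(z, dtype=float)
    w = np.asarray(w, dtype=float)
    d = z - z.mean()
    return len(z) * d * (w @ d) / (w.sum() * (d ** 2).sum())

def randomisation_p_value(z, w, n_perm=999, seed=0):
    """One-sided pseudo p-value for positive SA under random relabelling."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z, dtype=float)
    observed = local_morans_i(z, w).sum()
    simulated = np.array([local_morans_i(rng.permutation(z), w).sum()
                          for _ in range(n_perm)])
    return (int((simulated >= observed).sum()) + 1) / (n_perm + 1)

# Hypothetical transect of 8 cells with a steady trend
n = 8
W_line = np.zeros((n, n))
for i in range(n - 1):
    W_line[i, i + 1] = W_line[i + 1, i] = 1.0
z_line = np.arange(1.0, n + 1)

local_i = local_morans_i(z_line, W_line)
global_i = local_i.sum()
p_value = randomisation_p_value(z_line, W_line)
bonferroni_alpha = 0.05 / n      # per-cell threshold when testing all n local values
```

The Bonferroni line shows the simplest correction: when each of the n local statistics is tested, the significance threshold is divided by n.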
19Regression modelling
- Simple regression: a statistical perspective
- One (or more) dependent (response) variables
- One or more independent (predictor) variables
- Linear regression is linear in coefficients
- Vector/matrix form often used
- Over-determined equations: least squares
20Regression modelling
- Ordinary Least Squares (OLS) model
- Minimise sum of squared errors (or residuals)
- Solved for coefficients by matrix expression
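The matrix expression referred to here is, in its standard form, β = (X'X)^-1 X'y. A minimal sketch, with hypothetical data and function name:

```python
import numpy as np

def ols_fit(X, y):
    """Solve the normal equations (X'X) beta = X'y; return beta and residuals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.linalg.solve(X.T @ X, X.T @ y)
    residuals = y - X @ beta
    return beta, residuals

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.0])          # hypothetical observations
X = np.column_stack([np.ones_like(x), x])        # intercept column plus predictor
beta, residuals = ols_fit(X, y)
```

Solving the normal equations with `np.linalg.solve` avoids forming the inverse explicitly; for poorly conditioned design matrices `np.linalg.lstsq` is usually the safer choice.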
21Regression modelling
- OLS models and assumptions
- Model simplicity and parsimony
- Model over-determination, multi-collinearity and variance inflation
- Typical assumptions
- Data are independent random samples from an underlying population
- Model is valid and meaningful (in form and statistical)
- Errors are iid
- Independent; no heteroskedasticity; common distribution
- Errors are distributed N(0,σ²)
22Regression modelling
- Spatial modelling and OLS
- Positive spatial autocorrelation is the norm, hence dependence between samples exists
- Datasets often non-Normal >> transformations may be required (Log, Box-Cox, Logistic)
- Samples are often clustered >> spatial declustering may be required
- Heteroskedasticity is common
- Spatial coordinates (x,y) may form part of the modelling process
23Regression modelling
- OLS vs GLS
- OLS assumes no co-variation
- Solution
- GLS models co-variation
- y ~ N(μ,C) where C is a positive definite covariance matrix
- y = Xβ + u where u is a vector of random variables (errors) with mean 0 and variance-covariance matrix C
- Solution
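The two solutions can be compared in code. The sketch assumes the standard estimators, OLS: β = (X'X)^-1 X'y and GLS: β = (X'C^-1 X)^-1 X'C^-1 y, so OLS is the special case C = I. The exponential covariance matrix and the data are hypothetical.

```python
import numpy as np

def gls_fit(X, y, C):
    """Generalised least squares with error covariance matrix C."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Ci_X = np.linalg.solve(C, X)      # C^-1 X without forming the inverse
    Ci_y = np.linalg.solve(C, y)      # C^-1 y
    return np.linalg.solve(X.T @ Ci_X, X.T @ Ci_y)

pos = np.arange(6.0)                                  # sample positions on a line
C = np.exp(-np.abs(pos[:, None] - pos[None, :]))      # covariance decays with distance
X = np.column_stack([np.ones_like(pos), pos])
y = np.array([1.0, 3.2, 4.9, 7.1, 9.0, 10.8])         # hypothetical observations

beta_gls = gls_fit(X, y, C)
beta_ols = gls_fit(X, y, np.eye(len(y)))              # C = I reduces GLS to OLS
```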
24Regression modelling
- GLS and spatial modelling
- y ~ N(μ,C) where C is a positive definite covariance matrix (C must be invertible)
- C may be modelled by inverse distance weighting, contiguity (zone) based weighting, explicit covariance modelling
- Other models
- Binary data: Logistic models
- Count data: Poisson models
25Regression modelling
- Choosing between models
- Information content perspective and AIC
- where n is the sample size, k is the number of parameters used in the model, and L is the likelihood function
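The formula itself did not survive extraction. Assuming the standard definitions, AIC = -2 ln L + 2k, and the small-sample corrected form AICc = -2 ln L + 2k n/(n - k - 1), a sketch is given below; the Gaussian log-likelihood helper is an added convenience for least squares fits, not something stated in the slides.

```python
import math

def aic(log_likelihood, k):
    """Akaike Information Criterion; smaller values indicate a preferred model."""
    return -2.0 * log_likelihood + 2.0 * k

def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    return -2.0 * log_likelihood + 2.0 * k * n / (n - k - 1)

def gaussian_log_likelihood(rss, n):
    """Maximised log-likelihood of a Normal-error model with residual sum of squares rss."""
    return -0.5 * n * (math.log(2.0 * math.pi) + math.log(rss / n) + 1.0)

example_aic = aic(-10.0, 3)
example_aicc = aicc(-10.0, 3, 10)
```

As n grows relative to k the correction term vanishes and AICc approaches AIC.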
26Regression modelling
- Some regression terminology
- Simple linear
- Multiple
- Multivariate
- SAR
- CAR
- Logistic
- Poisson
- Ecological
- Hedonic
- Analysis of variance
- Analysis of covariance
27Regression modelling
- Spatial regression: trend surfaces and residuals (a form of ESDA)
- General model
- y - observations, f( , , ) - some function, (x1,x2) - plane coordinates, w - attribute vector
- Linear trend surface plot
- Residuals plot
- 2nd and 3rd order polynomial regression
- Goodness of fit measures: coefficient of determination
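A trend surface of a given order can be fitted by ordinary least squares on polynomial terms of the plane coordinates. The sketch below assumes no additional attribute vector w and uses hypothetical grid data; the function names are illustrative.

```python
import numpy as np

def trend_surface_design(x1, x2, order=1):
    """Design matrix with columns x1^p * x2^q for all p + q <= order."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    cols = [x1 ** p * x2 ** q
            for p in range(order + 1) for q in range(order + 1 - p)]
    return np.column_stack(cols)

def fit_trend_surface(x1, x2, y, order=1):
    """Return coefficients, residuals and the coefficient of determination."""
    y = np.asarray(y, dtype=float)
    A = trend_surface_design(x1, x2, order)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ beta
    r_squared = 1.0 - (residuals ** 2).sum() / ((y - y.mean()) ** 2).sum()
    return beta, residuals, r_squared

# Hypothetical observations on a 4 x 3 grid lying exactly on a plane
gx, gy = np.meshgrid(np.arange(4.0), np.arange(3.0))
x1, x2 = gx.ravel(), gy.ravel()
y = 1.0 + 2.0 * x1 - 3.0 * x2

beta, residuals, r_squared = fit_trend_surface(x1, x2, y, order=1)
```

For order 1 the columns are ordered 1, x2, x1. Mapping the residuals is the ESDA step: any spatial pattern left in them suggests the trend order is too low or that spatial autocorrelation remains.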
28Regression modelling
- Regression and spatial autocorrelation (SA)
- Analyse the data for SA
- If SA significant then
- Proceed and ignore SA, or
- Permit the coefficient, β, to vary spatially (GWR), or
- Modify the regression model to incorporate the SA
30Regression modelling
- Geographically Weighted Regression (GWR)
- Coefficients, β, allowed to vary spatially, β(t)
- Model
- Coefficients determined by examining neighbourhoods of points, t, using distance decay functions (fixed or adaptive bandwidths)
- Weighting matrix, W(t), defined for each point
- Solution
- GLS
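A minimal GWR sketch: at each point t, a weighted least squares fit is computed with weights that decay with distance from t, giving β(t) = (X'W(t)X)^-1 X'W(t)y. The Gaussian decay function and the fixed bandwidth are assumptions chosen for illustration; adaptive bandwidths and other kernels are equally possible. The data are hypothetical.

```python
import numpy as np

def gwr_coefficients(coords, X, y, bandwidth):
    """Local coefficient vector beta(t) at every observation point t."""
    coords = np.asarray(coords, dtype=float)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    betas = np.empty((len(y), X.shape[1]))
    for t, point in enumerate(coords):
        dist = np.sqrt(((coords - point) ** 2).sum(axis=1))
        w = np.exp(-0.5 * (dist / bandwidth) ** 2)   # diagonal of W(t)
        XtW = X.T * w                                # same as X' @ diag(w)
        betas[t] = np.linalg.solve(XtW @ X, XtW @ y)
    return betas

# Hypothetical 5 x 5 grid of sample points with one predictor
gx, gy = np.meshgrid(np.arange(5.0), np.arange(5.0))
coords = np.column_stack([gx.ravel(), gy.ravel()])
predictor = coords[:, 0] + 2.0 * coords[:, 1]
X = np.column_stack([np.ones(len(coords)), predictor])
y = 2.0 + 3.0 * predictor                            # spatially constant relationship

local_betas = gwr_coefficients(coords, X, y, bandwidth=1.5)
```

Because the relationship in this example does not vary over space, every local fit returns the same coefficients; with real data the rows of `local_betas` would be mapped to show how the coefficients vary.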
31Regression modelling
- Geographically Weighted Regression
- Sensitivity: model, decay function, bandwidth, point/centroid selection
- ESDA: mapping of surface, residuals, parameters and SEs
- Significance testing
- Increased apparent explanation of variance
- Effective number of parameters
- AICc computations
32Regression modelling
- Geographically Weighted Regression
- Count data: GWPR
- use of offsets
- Fitting by ILSR methods
- Presence/Absence data: GWLR
- True binary data
- Computed binary data - use of re-coding, e.g. thresholding
- Fitting by ILSR methods
34Regression modelling
- Regression and spatial autocorrelation (SA)
- Modify the regression model to incorporate the SA, i.e. produce a Spatial Autoregressive model (SAR)
- Many approaches including
- SAR: e.g. pure spatial lag model, mixed model, spatial error model etc.
- CAR: a range of models that assume the expected value of the dependent variable is conditional on the (distance weighted) values of neighbouring points
- Spatial filtering: e.g. OLS on spatially filtered data
35Regression modelling
- SAR models
- Pure spatial lag
- Re-arranging
- MRSA model
Equation annotations: spatial weights matrix; autoregression parameter; linear regression added
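The equations on this slide were lost in extraction. In the usual notation the pure spatial lag model is y = ρWy + ε, which re-arranges to y = (I - ρW)^-1 ε, and the mixed (MRSA) model adds a linear regression term: y = ρWy + Xβ + ε. The sketch below generates data from these forms; the row-standardised weights, parameter values and names are assumptions for illustration.

```python
import numpy as np

def row_standardise(w):
    """Scale each row of the weights matrix to sum to 1 (no isolated cells assumed)."""
    w = np.asarray(w, dtype=float)
    return w / w.sum(axis=1, keepdims=True)

def simulate_sar_lag(W, rho, eps, X=None, beta=None):
    """Solve (I - rho W) y = X beta + eps; with X omitted this is the pure lag model."""
    n = W.shape[0]
    rhs = eps if X is None else X @ beta + eps
    return np.linalg.solve(np.eye(n) - rho * W, rhs)

# Hypothetical transect of 6 cells
n = 6
adjacency = np.zeros((n, n))
for i in range(n - 1):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
W = row_standardise(adjacency)

rng = np.random.default_rng(1)
eps = rng.normal(size=n)                       # iid errors
X = np.column_stack([np.ones(n), np.arange(n, dtype=float)])
beta = np.array([1.0, 0.5])
rho = 0.4                                      # autoregression parameter

y_pure = simulate_sar_lag(W, rho, eps)
y_mixed = simulate_sar_lag(W, rho, eps, X, beta)
```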
36Regression modelling
- SAR models
- Spatial error model
- Substituting and re-arranging
Equation annotations: linear regression + spatial error; iid error vector; spatial weighted error vector; linear regression (global); iid error vector; SAR lag; local trend
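The slide's equations are missing, but the annotations match the standard spatial error derivation, sketched here under that assumption. The symbol λ for the autoregression parameter of the error term is a conventional choice, not taken from the source.

```latex
\begin{aligned}
y &= X\beta + \varepsilon
  && \text{(linear regression + spatial error)} \\
\varepsilon &= \lambda W \varepsilon + u
  && \text{(spatially weighted error vector + iid error vector)} \\
y - X\beta &= \lambda W (y - X\beta) + u
  && \text{(substituting } \varepsilon = y - X\beta\text{)} \\
y &= X\beta + \lambda W y - \lambda W X \beta + u
  && \text{(re-arranging)}
\end{aligned}
```

In the final line, Xβ is the global linear regression, λWy is the SAR lag, λWXβ is the local trend and u is the iid error vector.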
37Regression modelling
- CAR models
- Standard CAR model
- Local weights matrix: distance or contiguity
- Variance
- Different models for W and M provide a range of CAR models
Equation annotations: autoregression parameter; weighted mean for neighbourhood of i; expected value at i
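The CAR equations did not survive extraction. A commonly used formulation, which fits the annotations above, is sketched here as an assumption rather than as the slide's exact content; μ denotes the mean (trend) values and M a diagonal matrix of conditional variances.

```latex
\begin{aligned}
E\left(y_i \mid y_j,\ j \neq i\right)
  &= \mu_i + \rho \sum_{j \neq i} w_{ij}\,(y_j - \mu_j) \\
\operatorname{Var}(y) &= (I - \rho W)^{-1} M
\end{aligned}
```

Here the left-hand side is the expected value at i, ρ is the autoregression parameter and the sum is the weighted mean for the neighbourhood of i.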
38Regression modelling
- Spatial filtering
- Apply a spatial filter to the data to remove SA effects
- Model the filtered data
- Example
Equation annotation: spatial filter
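The example equation is missing from the extracted text. One simple filter, assumed here for illustration, is y* = (I - ρW)y, applied with a known or previously estimated ρ; the filtered variables can then be modelled by OLS. The data, weights and parameter value below are hypothetical.

```python
import numpy as np

def spatial_filter(v, W, rho):
    """Apply the filter (I - rho W) to a vector or matrix of observations."""
    v = np.asarray(v, dtype=float)
    return v - rho * (W @ v)

# Hypothetical transect of 6 cells with row-standardised contiguity weights
n = 6
adjacency = np.zeros((n, n))
for i in range(n - 1):
    adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
W = adjacency / adjacency.sum(axis=1, keepdims=True)

rho = 0.4
rng = np.random.default_rng(2)
eps = rng.normal(size=n)
y = np.linalg.solve(np.eye(n) - rho * W, eps)   # data with built-in spatial dependence

y_filtered = spatial_filter(y, W, rho)          # recovers the iid component
```

In practice ρ is unknown and has to be estimated first, so the filtered data are only approximately free of spatial autocorrelation.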