2. Topics:
• Concepts and definitions of area pattern analysis
• Concepts of spatial autocorrelation
• Spatial autocorrelation statistics
3. Area pattern analysis: Concepts and definitions
What is an “area”?
1. Natural areas: self-defining; their boundaries are defined by the phenomenon itself (e.g. a lake or a land cover class).
6. 2. Imposed areas: area objects are imposed by human beings, such as countries, states, and counties. Boundaries are defined independently of any phenomenon, and attribute values are enumerated by surveys or censuses.
Yau Tsim Mong District:
Yau Tsim District and Mong Kok District merged in 1994
8. 3. Raster: space is divided into small regular
grid cells.
10. - In raster data, individual cells, not actual ground
objects, are the basic areal unit.
- Raster data are generally used to represent
continuous phenomena.
11. Planar enforcement
# Planar enforcement means that all the space on a map must be filled and that any point must fall in exactly one polygon; that is, polygons must not overlap (e.g., differing soil types cannot overlap).
# Planar enforcement implies that the phenomenon being represented is conceptualized as a field.
16. Area = Polygon
A polygon is a two-dimensional surface stored as a sequence of points defining its exterior bounding ring and 0 or more interior rings. Polygons by definition are always simple. Most often they define parcels of land, water bodies, and other features that have a spatial extent.
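The ring-based definition of a polygon can be made concrete with a small sketch. It assumes a polygon is held as one exterior ring plus a list of interior rings (holes), each a list of (x, y) points; the function names and the parcel and pond coordinates are hypothetical, chosen only for illustration.

```python
def ring_area(ring):
    """Unsigned area of a closed ring [(x, y), ...] using the shoelace formula."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to close the ring.
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_area(exterior, interiors=()):
    """Area of the exterior ring minus the area of every interior ring (hole)."""
    return ring_area(exterior) - sum(ring_area(r) for r in interiors)


# Hypothetical parcel: a 10 x 10 square with a 2 x 2 pond cut out of it.
parcel = [(0, 0), (10, 0), (10, 10), (0, 10)]
pond = [(4, 4), (6, 4), (6, 6), (4, 6)]
area = polygon_area(parcel, [pond])
print(area)  # 96.0
```

The interior ring is what lets one polygon represent, for example, a land parcel that surrounds a water body without the two overlapping.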
17. Modifiable Areal Unit Problem (MAUP)
[Figure: two map panels, each labelled 8%]
18. Modifiable Areal Unit Problem (MAUP)
[Figure: map panels, each labelled "Illness rate = ?%"]
27. What is statistical “correlation”?
A correlation measures the relationship between any two variables X and Y, but not between spurious pairs of variables like these.
Spatial autocorrelation: Concept
29. Spatial autocorrelation: 1. Tobler’s Law
The confirmation of Tobler’s first law of geography:
Everything is related to everything else, but near
things are more related than distant things.
# Spatial autocorrelation helps us understand the degree to which one object is similar to other nearby objects.
# Spatial autocorrelation measures how similar nearby objects are to one another in comparison with objects that are farther apart.
Spatial autocorrelation: Four ways to describe it
30. Spatial autocorrelation
Spatial: on a map
Auto: self
Correlation: degree of relative similarity
Positive spatial autocorrelation: similar values cluster together on a map (e.g., elevation)
Negative spatial autocorrelation: dissimilar (different) values lie next to each other on a map (e.g., a 5 by 5 checkerboard)
31. Positive spatial autocorrelation
- high values surrounded by nearby high values
- intermediate values surrounded by nearby intermediate values
- low values surrounded by nearby low values
[Figure: 2002 population density, Puerto Rico]
32. Negative spatial autocorrelation
- high values surrounded by nearby low values
- intermediate values surrounded by nearby intermediate values
- low values surrounded by nearby high values
[Figure: grocery store density (competition for space)]
33. 2. Based on similarity
The degree to which characteristics at one location are similar
to (or different from) those nearby.
Similar to = positive spatial autocorrelation
Different from (dissimilar) = negative spatial autocorrelation
Positive spatial autocorrelation is much more common than negative. Why?
Example: the diffusion of an innovation through an agricultural community, where farmers are more likely to adopt new techniques that their neighbors have already used with success.
35. 3. Based on probability
A measure of the extent to which the occurrence of an event in one geographic unit (polygon) makes the occurrence of a similar event in a neighboring unit more probable or less probable.
- Do you recognize this from earlier discussion? It’s the same concept as clustered, random, dispersed!
Dispersed (uniform) pattern: high negative spatial autocorrelation
Random pattern: no spatial autocorrelation
Clustered pattern: high positive spatial autocorrelation
36. 4. Using correlation
Correlation of a variable with itself through space.
The correlation between an observation’s value on a variable and the values of nearby observations on the same variable.
Correlation = “similarity”, “association”, or “relationship”
[Figure: scatter diagram of crime rate in an area against crime rate in nearby areas]
37. Spatial autocorrelation: shows the association or relationship between the same variable in “near-by” areas.
Standard statistics: shows the association or relationship between two different variables.
[Figure: two scatter diagrams in which each point is a geographic location. Standard statistics: education against income. Spatial autocorrelation: education against education “next door”, in neighboring or nearby areas.]
38. Spatial autocorrelation – Methods
- To explore how spatial patterns in a set of polygons change
over time.
- Significant implications for the use of statistical techniques in
analyzing spatial data.
Fundamental assumption of standard statistical techniques: the sample observations are randomly selected and therefore independent of each other.
True?
39. Why is spatial autocorrelation important?
Two reasons:
1. Spatial autocorrelation is important because it implies
the existence of a spatial process.
- Why are near-by areas similar to each other?
- Why do high income people live “next door” to each other?
- These are GEOGRAPHICAL questions.
• They are about location
2. It invalidates most traditional statistical inference tests.
- If spatial autocorrelation (SA) exists, then the results of standard statistical inference tests may be incorrect.
- We need to use spatial statistical inference tests.
[Figure: diagram with the labels Processes, Create, Pattern, Population, Infer, Sample]
40. Why are standard statistical tests wrong?
• Statistical tests are based on the assumption that
the values of observations in each sample are
independent of one another.
• Spatial autocorrelation violates this assumption
- samples taken from nearby areas are related to each other and are NOT independent.
- values near each other are similar in magnitude, which implies a relationship between nearby observations.
41. What is statistical “correlation”?
[Figure: scatter plot of Y against X showing positive correlation]
42. What is statistical “correlation”?
[Figure: scatter plot of Y against X with correlation coefficient r = 1]
43. What is statistical “correlation”?
[Figure: scatter plot of Y against X with correlation coefficient r = -1]
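The two limiting cases of the correlation coefficient, r = 1 and r = -1, can be checked numerically. The sketch below assumes r is the usual Pearson product-moment coefficient; the function name and the sample data are illustrative only.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
r_up = pearson_r(x, [2 * v + 1 for v in x])     # points on a rising straight line
r_down = pearson_r(x, [10 - 3 * v for v in x])  # points on a falling straight line
print(r_up, r_down)
```

Any exact straight-line relationship gives r = 1 or r = -1 depending on the sign of the slope; scatter around the line pulls r toward 0.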
50. What is “spatial autocorrelation”?
It is a measure of the degree to which a set of spatial features and their associated data values tend to be clustered together in space (positive spatial autocorrelation) or dispersed (negative spatial autocorrelation).
54. What is “spatial autocorrelation”?
Identification of SPATIAL events
Quantitative nature of data set: understand whether events are similar or dissimilar by defining the intensity of the spatial process, that is, how strongly a variable is expressed in space.
Geometric nature of data set: conceptualise spatial relationships, i.e. the distance at which events influence each other (distance band).
56. Spatial autocorrelation: Methods
Measurement based on adjacency/contiguity
- If zone j is next to zone i, it receives a weight of 1
- otherwise it receives a weight of 0.
[Figure: contiguity for hexagons, irregular zones, and the rook and queen cases on a square grid]
Rook: sharing a border
Queen: sharing a border or a point
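The rook and queen rules can be sketched for a regular grid of square cells. This is a minimal illustration, assuming cells are addressed by (row, column); the names `ROOK`, `QUEEN`, `neighbors` and `weight` are hypothetical helpers, not part of any library.

```python
# Offsets to the cells that share an edge (rook) or an edge or corner (queen).
ROOK = [(-1, 0), (1, 0), (0, -1), (0, 1)]
QUEEN = ROOK + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def neighbors(row, col, n_rows, n_cols, offsets):
    """Cells contiguous to (row, col) under the given rule, clipped to the grid."""
    result = []
    for d_row, d_col in offsets:
        r, c = row + d_row, col + d_col
        if 0 <= r < n_rows and 0 <= c < n_cols:
            result.append((r, c))
    return result


def weight(i, j, n_rows, n_cols, offsets):
    """1 if zone j is next to zone i, otherwise 0."""
    return 1 if j in neighbors(i[0], i[1], n_rows, n_cols, offsets) else 0


centre_rook = neighbors(1, 1, 3, 3, ROOK)    # 4 neighbours
centre_queen = neighbors(1, 1, 3, 3, QUEEN)  # 8 neighbours
corner_rook = neighbors(0, 0, 3, 3, ROOK)    # 2 neighbours
print(len(centre_rook), len(centre_queen), len(corner_rook))
```

Diagonal cells are the only ones on which the two rules disagree: `weight((0, 0), (1, 1), 3, 3, ROOK)` is 0, while the same call with `QUEEN` is 1.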
57. Measuring contiguity: lagged contiguity
Should we include second order contiguity?
[Figure: 1st order (nearest neighbor) and 2nd order (next nearest neighbor) contiguity for the hexagon, rook and queen cases]
58. Spatial weights matrix for Rook case
The matrix contains:
- 1 if two units share a border
- 0 if they do not share a border
4 areal units (common borders: A-B, A-C, B-D, C-D):
A B
C D
Associated 4x4 geographic connectivity/weights matrix W:
    A  B  C  D
A   0  1  1  0
B   1  0  0  1
C   1  0  0  1
D   0  1  1  0
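The 4x4 rook matrix for units A to D can be rebuilt programmatically as a check. The sketch assumes the four units sit in a 2 x 2 arrangement (A and B on top, C and D below); the `position` mapping and `shares_border` helper are illustrative names.

```python
labels = ["A", "B", "C", "D"]
# Assumed layout: A B on the top row, C D on the bottom row.
position = {"A": (0, 0), "B": (0, 1), "C": (1, 0), "D": (1, 1)}


def shares_border(p, q):
    """Rook contiguity: the cells differ by exactly one step along a row or column."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1]) == 1


# W[i][j] is 1 when units i and j share a border, otherwise 0.
W = [[1 if shares_border(position[a], position[b]) else 0 for b in labels]
     for a in labels]

# Row-standardised version: each row is divided by its sum, so a spatial lag
# computed with it is the average of the neighbouring values.
W_row_std = [[w / sum(row) for w in row] for row in W]

for label, row in zip(labels, W):
    print(label, row)
```

The matrix is symmetric with a zero diagonal, because contiguity is mutual and a unit is not counted as its own neighbour.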
59. Joint count statistics
For binary (also called dichotomous) variables, areas on a map can be either white (W) or black (B).
[Figure: clustered, random and dispersed patterns]
60. Joint count statistics: Binary variables!
The variable is treated at the nominal level: only the presence (B) or the absence (W) of a specific thematic property is considered. The property itself can be defined:
• At the nominal level, as a particular category or a set of categories, for example the presence of a socio-economic category or urbanization level (urban/rural).
• At the ordinal level, as a class (a rank) or a set of classes, for example the presence of the best agricultural soil classes (arable/nonarable).
• At the interval and ratio levels, as an interval of values, for example the presence of a significant rate of criminality (low/high).
61. Joint count statistics
• The most basic statistic for area pattern analysis of binary variables.
• Joint: two areas sharing a common edge or boundary.
[Figure: rook’s case and queen’s case]
62. Joint count statistics
For any choropleth map, we can count the number and types of joints that exist (BB, BW, WW).
A choropleth map is a thematic map in which polygons (areas) are shaded or patterned using different colors in proportion to the measurement of the statistical variable being displayed on the map, such as population density or GDP.
Rook’s case (4 x 4 grid):
Vertical joints: 4 columns * 3 per column = 12
Horizontal joints: 4 rows * 3 per row = 12
Total joints: 12 + 12 = 24
BB: 6
BW: 11
WW: 7
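Counting BB, BW and WW joints under the rook's case can be sketched as follows. The 4 x 4 grid below is hypothetical (it is not the map on the slide, so its counts differ from 6/11/7); it serves only to show that the three counts always add up to the 24 joints of a 4 x 4 grid.

```python
# Hypothetical 4 x 4 binary map: B = black, W = white.
grid = [
    "BBWW",
    "BBWW",
    "BWWW",
    "WWBB",
]


def joint_counts(grid):
    """Count BB, BW and WW joints between edge-sharing cells (rook's case)."""
    counts = {"BB": 0, "BW": 0, "WW": 0}
    n_rows, n_cols = len(grid), len(grid[0])
    for r in range(n_rows):
        for c in range(n_cols):
            # Look only right and down so that every joint is counted once.
            for d_row, d_col in ((0, 1), (1, 0)):
                rr, cc = r + d_row, c + d_col
                if rr < n_rows and cc < n_cols:
                    # Sorting makes "WB" and "BW" the same joint type.
                    pair = "".join(sorted(grid[r][c] + grid[rr][cc]))
                    counts[pair] += 1
    return counts


counts = joint_counts(grid)
print(counts, sum(counts.values()))
```

Swapping the two offsets for the eight queen offsets (and still counting each pair once) would give the queen's case.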
63. Joint count statistics
Number of joints by type (B: black; W: white):
Pattern                     BB    WW    BW
Positive autocorrelation    94    94    22
Negative autocorrelation    49    49    112
Random                      ?     ?     ?
64. In an independent random process:
Expected number of BB joints: $J_{BB} = kp^2$
Expected number of WW joints: $J_{WW} = kq^2$
Expected number of BW joints: $J_{BW} = 2kpq$
where k = total number of joints
p = probability of an area being B
q = probability of an area being W
With k = 210 and p = q = 0.5:
$J_{BB} = kp^2 = 210(0.5)^2 = 52.5$
$J_{WW} = kq^2 = 210(0.5)^2 = 52.5$
$J_{BW} = 2kpq = 2(210)(0.5)(0.5) = 105$
Observed joints in the random pattern: BB = 56, WW = 47, BW = 107
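The expected joint counts under an independent random process, and the comparison of observed with expected BW joints, can be sketched in a few lines. The function names are illustrative; the numbers are the ones used on the slides (k = 210, p = q = 0.5, and the observed BW counts 22 and 112).

```python
def expected_joints(k, p):
    """Expected BB, WW and BW joints for k joints when P(B) = p and P(W) = 1 - p."""
    q = 1 - p
    return {"BB": k * p ** 2, "WW": k * q ** 2, "BW": 2 * k * p * q}


def compare_bw(observed_bw, expected_bw):
    """Direction suggested by comparing observed with expected BW joints."""
    if observed_bw < expected_bw:
        return "positive"
    if observed_bw > expected_bw:
        return "negative"
    return "random"


expected = expected_joints(210, 0.5)
print(expected)                          # {'BB': 52.5, 'WW': 52.5, 'BW': 105.0}
print(compare_bw(22, expected["BW"]))    # positive
print(compare_bw(112, expected["BW"]))   # negative
```

The comparison only gives a direction. An observed count close to the expected one, such as 107 against 105, is consistent with a random pattern, and deciding that formally requires a significance test rather than a strict inequality.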
65. Joint count statistics
Observed $J_{BW}$ < Expected $J_{BW}$: positive autocorrelation
Observed $J_{BW}$ = Expected $J_{BW}$: random
Observed $J_{BW}$ > Expected $J_{BW}$: negative autocorrelation
66. Joint count statistics method
Major limitations:
1. Works with binary data only
2. Applies to area data only
3. Joint counting is tedious and error-prone
4. Computation of the test statistic is complicated
69. (a): Moran’s I
• The most common measure of spatial autocorrelation
• Use for points or polygons
- Joint Count statistic only for polygons
• Use for a continuous variable (any value)
- Joint Count statistic only for binary variable (1,0)
• Varies on a scale from -1 to +1
-1: high negative spatial autocorrelation (uniform/dispersed pattern)
0: no spatial autocorrelation (random pattern)
+1: high positive spatial autocorrelation (clustered pattern)
• It can also be used as an index for dispersed/random/clustered patterns.
70. Moran’s I vs. Correlation Coefficient r
Correlation Coefficient r
Relationship between two variables
[Figure: scatter plots for education and income and for price and quantity, one with r = 0.71 and one with r = -0.71]
Moran’s I
Involves one variable only;
Correlation between a variable, X, and the “spatial lag” of X formed by averaging all the values of X for the neighboring polygons
[Figure: scatter plots for crime rate against crime in nearby areas and for grocery store density against grocery store density nearby, one with r = 0.71 and one with r = -0.71]
71. Moran’s Index (Moran’s I)
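$$I = \frac{N \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\left( \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \right) \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
where N = n is the number of areal units, $w_{ij}$ is the spatial weight between units i and j, and $\bar{x}$ is the mean of x.
(O'Sullivan and Unwin, 2003)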
72. Moran’s Index (Moran’s I)
(a)            (b)
 5  7 11        5  7 13
 6 10 13        8 16 14
 8 14 16       10 11  6

                      (a)         (b)
Mean                  10          10
Standard deviation    3.807887    3.807887
Variance              14.5        14.5
Moran’s I             0.5532      0.0575
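The two 3 x 3 examples can be recomputed from the Moran's I formula. The sketch below assumes rook contiguity with row-standardised weights (each neighbour of a cell with m neighbours gets weight 1/m); the slides do not state the weighting scheme, but this assumption reproduces the tabulated values 0.5532 and 0.0575. Function names are illustrative.

```python
def rook_neighbors(r, c, n_rows, n_cols):
    """Edge-sharing neighbours of cell (r, c) inside the grid."""
    cells = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in cells if 0 <= i < n_rows and 0 <= j < n_cols]


def morans_i(grid):
    """Moran's I for a grid, using row-standardised rook weights (an assumption)."""
    n_rows, n_cols = len(grid), len(grid[0])
    n = n_rows * n_cols
    mean = sum(v for row in grid for v in row) / n
    dev = [[v - mean for v in row] for row in grid]
    cross, s0 = 0.0, 0.0
    for r in range(n_rows):
        for c in range(n_cols):
            nbrs = rook_neighbors(r, c, n_rows, n_cols)
            w = 1.0 / len(nbrs)  # row-standardised weight
            for i, j in nbrs:
                cross += w * dev[r][c] * dev[i][j]
                s0 += w          # running sum of all weights
    sum_sq = sum(d * d for row in dev for d in row)
    return (n / s0) * cross / sum_sq


grid_a = [[5, 7, 11], [6, 10, 13], [8, 14, 16]]
grid_b = [[5, 7, 13], [8, 16, 14], [10, 11, 6]]
i_a, i_b = morans_i(grid_a), morans_i(grid_b)
print(round(i_a, 4), round(i_b, 4))  # 0.5532 0.0575
```

Both grids contain the same nine values, so mean and variance are identical; only the arrangement differs, and that is what Moran's I picks up.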
73. Moran’s Index (Moran’s I)
1. This index measures spatial autocorrelation based on both feature locations and feature values simultaneously.
2. Given a set of spatial features and an associated attribute, it evaluates whether the pattern expressed is clustered, dispersed, or random.
3. Its results are relatively easy to interpret:
+1 is indicative of perfect clustering
-1 is indicative of perfect dispersion
0 is indicative of zero spatial autocorrelation (random)
74. Test statistic for normal frequency distribution
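[Figure: normal curve of the test statistic Z centred on 0, with 2.5% rejection regions below -1.96 and above 1.96; an observed Z of 2.54 falls in the upper region, so the null is rejected at 5%]
Null Hypothesis: no spatial autocorrelation.
Moran’s I = 0
Alternative Hypothesis: spatial autocorrelation exists.
Moran’s I ≠ 0
Reject the Null Hypothesis if the Z test statistic is > 1.96 (or < -1.96)
- if there were no spatial autocorrelation in the population, there would be less than a 5% chance of obtaining a Z value this extreme.
- we conclude, at the 5% significance level (95% confidence), that spatial autocorrelation exists.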
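The decision rule for the Z test can be sketched as below. It assumes the test statistic follows a standard normal distribution under the null hypothesis, as on the slide; the function names are illustrative, and 2.54 is the example Z value shown in the figure.

```python
from math import erfc, sqrt


def two_sided_p_value(z):
    """P(|Z| >= |z|) for a standard normal Z, i.e. both tails combined."""
    return erfc(abs(z) / sqrt(2))


def reject_null(z, critical=1.96):
    """Reject 'no spatial autocorrelation' at the 5% level when |z| > 1.96."""
    return abs(z) > critical


z = 2.54
p = two_sided_p_value(z)
print(round(p, 4), reject_null(z))
```

A p-value below 0.05 and |Z| > 1.96 are two ways of stating the same decision for a two-sided test at the 5% level.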
75. [Figure: five example patterns with their Moran’s I values]
I = -1.00: extreme negative
I = -0.393: negative
I = 0.00: random (independent)
I = +0.293: positive
I = +1.00: extreme positive
76. Moran’s Index (Moran’s I)
Moran’s I shows the similarity of nearby features through the I value (-1 to 1), but does not indicate if the clustering is for high values or low values.
[Figure: two example maps: I = -0.12, slightly dispersed; I = 0.26, clustered]
77. Moran Scatter Plots
Moran’s I can be interpreted as the correlation between a variable, X, and the “spatial lag” of X formed by averaging all the values of X for the neighboring polygons.
We can then draw a scatter diagram between these two variables (in standardized form): X and lag-X (or W_X).
Fit a least squares “best fit” line to the points. The slope of this regression line is Moran’s I.
(will discuss Regression next week)
[Figure: for each X_i, Lag X_i is the average of the values in the neighboring polygons]
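The statement that the slope of the regression line in the Moran scatter plot equals Moran's I can be checked numerically. The sketch assumes rook neighbours, a spatial lag defined as the plain average of the neighbouring standardised values, and grid (a) from the earlier 3 x 3 example (Moran's I = 0.5532); the helper names are hypothetical.

```python
from math import sqrt

grid = [[5, 7, 11], [6, 10, 13], [8, 14, 16]]  # grid (a) from the example
n_rows, n_cols = len(grid), len(grid[0])
values = [v for row in grid for v in row]
n = len(values)
mean = sum(values) / n
sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

# Standardised values (z-scores), kept in grid layout.
z = [[(v - mean) / sd for v in row] for row in grid]


def lag(r, c):
    """Average z-score of the rook neighbours of cell (r, c)."""
    cells = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    vals = [z[i][j] for i, j in cells if 0 <= i < n_rows and 0 <= j < n_cols]
    return sum(vals) / len(vals)


xs = [z[r][c] for r in range(n_rows) for c in range(n_cols)]
ys = [lag(r, c) for r in range(n_rows) for c in range(n_cols)]

# Ordinary least squares slope of lag-X on X (with an intercept).
mean_x, mean_y = sum(xs) / n, sum(ys) / n
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
print(round(slope, 4))  # 0.5532
```

The equality holds because the standardised X has mean zero, so the least squares slope reduces to the ratio that defines Moran's I under row-standardised weights.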
78. Moran’s scatter plot
Q1: High/High, positive SA
Q2: Low/High, negative SA
Q3: Low/Low, positive SA
Q4: High/Low, negative SA
79. Q1 (values [+], nearby values [+]): H-H
Q3 (values [-], nearby values [-]): L-L
Locations of positive spatial association (“I’m similar to my neighbors”).
Q2 (values [-], nearby values [+]): L-H
Q4 (values [+], nearby values [-]): H-L
Locations of negative spatial association (“I’m different from my neighbors”).
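Assigning a location to one of the four quadrants only needs the sign of its standardised value and the sign of its spatial lag. The sketch below is illustrative: the function name and the sample (value, lag) pairs are hypothetical, and treating an exact zero as "high" is an arbitrary assumption for the boundary case.

```python
def quadrant(z, lag_z):
    """Moran scatter plot quadrant for a standardised value and its spatial lag.

    Assumption: a value of exactly 0 is treated as 'high'.
    """
    high, lag_high = z >= 0, lag_z >= 0
    if high and lag_high:
        return "Q1 H-H"   # positive spatial association
    if not high and lag_high:
        return "Q2 L-H"   # negative spatial association
    if not high and not lag_high:
        return "Q3 L-L"   # positive spatial association
    return "Q4 H-L"       # negative spatial association


# Hypothetical (value, lag) pairs in standardised units.
samples = [(1.2, 0.8), (-0.9, 0.4), (-1.5, -0.7), (0.6, -1.1)]
labels = [quadrant(z, lag_z) for z, lag_z in samples]
print(labels)  # ['Q1 H-H', 'Q2 L-H', 'Q3 L-L', 'Q4 H-L']
```

Q1 and Q3 hold locations that resemble their neighbours, Q2 and Q4 those that differ from them.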
80. Example 1
- Scatter plot of X vs. Lag-X;
- The slope of the regression line is Moran’s I
[Figure: Moran scatter plot (X against Lag-X) of population density in Puerto Rico, Moran’s I = 0.49, with map areas marked “high surrounded by high” and “low surrounded by low”]
81. Example 2
(a)            (b)
 5  7 11        5  7 13
 6 10 13        8 16 14
 8 14 16       10 11  6

                      (a)         (b)
Mean                  10          10
Standard deviation    3.807887    3.807887
Variance              14.5        14.5
Moran’s I             0.5532      0.0575
82. (b): Geary’s Index (C)
A measure of spatial autocorrelation that attempts to determine whether adjacent observations of the same phenomenon are correlated.
$$C = \frac{(N-1) \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - x_j)^2}{2 \left( \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \right) \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
• The value of Geary's C typically lies between 0 and 2.
• 1 means no spatial autocorrelation.
• Values lower than 1 demonstrate increasing positive spatial autocorrelation, whilst values higher than 1 illustrate increasing negative spatial autocorrelation.
0: strong positive spatial autocorrelation
1: no spatial autocorrelation
2: strong negative spatial autocorrelation
83. Geary’s Index (C)
• Calculation is similar to Moran’s I
- For Moran’s I, the cross-product is based on the deviations from the mean for the two location values: $(x_i - \bar{x})(x_j - \bar{x})$.
- For Geary’s C, the corresponding term uses the actual values themselves at each location, as the squared difference between them: $(x_i - x_j)^2$.
$$I = \frac{N \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\left( \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \right) \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
$$C = \frac{(N-1) \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - x_j)^2}{2 \left( \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \right) \sum_{i=1}^{n} (x_i - \bar{x})^2}$$
• Geary's C is inversely related to Moran's I, but it is not identical.
• Interpretation is very different, essentially the opposite!
- Geary’s C varies on a scale from 0 to 2
• Moran's I is a measure of global spatial autocorrelation, while Geary's C is more sensitive to local spatial autocorrelation.
• Can convert to a -/+1 scale by calculating C* = 1 - C.
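Geary's C can be computed for the same two 3 x 3 grids used in the Moran's I example. This is a sketch under an assumption the slides do not state: binary rook weights (1 for cells sharing an edge, 0 otherwise). The function name is illustrative, and the resulting values are not taken from the slides.

```python
def gearys_c(grid):
    """Geary's C for a grid, using binary rook weights (an assumption)."""
    n_rows, n_cols = len(grid), len(grid[0])
    n = n_rows * n_cols
    mean = sum(v for row in grid for v in row) / n
    sq_diff, s0 = 0.0, 0.0
    for r in range(n_rows):
        for c in range(n_cols):
            for i, j in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= i < n_rows and 0 <= j < n_cols:
                    sq_diff += (grid[r][c] - grid[i][j]) ** 2  # w_ij = 1
                    s0 += 1.0                                  # sum of weights
    sum_sq = sum((v - mean) ** 2 for row in grid for v in row)
    return (n - 1) * sq_diff / (2 * s0 * sum_sq)


grid_a = [[5, 7, 11], [6, 10, 13], [8, 14, 16]]
grid_b = [[5, 7, 13], [8, 16, 14], [10, 11, 6]]
c_a, c_b = gearys_c(grid_a), gearys_c(grid_b)
print(round(c_a, 4), round(c_b, 4))          # about 0.3678 and 0.9138
print(round(1 - c_a, 4), round(1 - c_b, 4))  # C* = 1 - C
```

Grid (a) gives a C well below 1 (positive spatial autocorrelation) and grid (b) a C close to 1 (nearly none), the same ordering that Moran's I gave, read on the inverted scale.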
84. Local measure of spatial autocorrelation
• Global statistics – identify and measure the pattern of the entire study area.
- Do not indicate where specific patterns occur!
• Local statistics – identify variation across the study area, focusing on individual features and their relationships to nearby features (i.e. specific areas of clustering).
85. Local Indicators of Spatial Association (LISA)
The statistic is calculated for each areal unit in the data.
For each polygon, the index is calculated based on neighboring polygons with which it shares a border.
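One commonly used LISA statistic is the local Moran's I, which gives one value per areal unit. The slides do not give its formula, so the sketch below rests on assumptions: the local form I_i = z_i * (average z of the neighbours) / m2, with z the deviation from the mean and m2 the mean squared deviation, rook neighbours, and row-standardised weights. The data are grid (a) from the earlier 3 x 3 example.

```python
def local_morans_i(grid):
    """Local Moran's I for every cell, row-standardised rook weights (assumed)."""
    n_rows, n_cols = len(grid), len(grid[0])
    n = n_rows * n_cols
    mean = sum(v for row in grid for v in row) / n
    dev = [[v - mean for v in row] for row in grid]
    m2 = sum(d * d for row in dev for d in row) / n  # mean squared deviation
    local = []
    for r in range(n_rows):
        row_out = []
        for c in range(n_cols):
            cells = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            nbrs = [dev[i][j] for i, j in cells
                    if 0 <= i < n_rows and 0 <= j < n_cols]
            lag = sum(nbrs) / len(nbrs)  # average deviation of the neighbours
            row_out.append(dev[r][c] * lag / m2)
        local.append(row_out)
    return local


grid_a = [[5, 7, 11], [6, 10, 13], [8, 14, 16]]
local = local_morans_i(grid_a)
mean_local = sum(v for row in local for v in row) / 9
print(round(mean_local, 4))  # 0.5532, the global Moran's I of grid (a)
```

Under these assumptions the local values average to the global Moran's I, which is what makes them a decomposition of the global statistic: large positive local values mark cells that sit among similar neighbours, such as the high corner at the bottom right.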