Lecture 7 Area Objects and Spatial Autocorrelation.pptx

GEOG2120
INTRODUCTORY
SPATIAL ANALYSIS
Lecture 7 Area Objects and
Spatial Autocorrelation
Contact details for Dr Ran:
Room 1032, Jockey Club Tower,
Centennial Campus

Topics:
• Concepts and definitions of area pattern analysis
• Concepts of spatial autocorrelation
• Spatial autocorrelation statistics

Area pattern analysis: Concepts and definitions
What is an “area”?
1. Natural areas: self-defining, their
boundaries are defined by the phenomenon
itself (e.g. lake, land cover)

2. Imposed areas: area objects are imposed by
human beings, such as countries, states,
counties etc. Boundaries are defined
independently of any phenomenon, and
attribute values are enumerated by surveys
or censuses.
Yau Tsim Mong District:
Yau Tsim District and Mong Kok District merged in 1994

7
2010 2011

3. Raster: space is divided into small regular
grid cells.

- In raster data, individual cells, not actual ground
objects, are the basic areal unit.
- Raster data are generally used to represent
continuous phenomena.

Planar enforcement
# Planar enforcement means that all the space on a map must be
filled and that any point must fall in one polygon alone; that is,
polygons must not overlap (e.g., differing soil types cannot overlap).
# Planar enforcement implies that the phenomenon being
represented is conceptualized as a field.


Area = Polygon
A polygon is a two-dimensional surface stored as a sequence of
points defining its exterior bounding ring and 0 or more interior
rings. Polygons by definition are always simple. Most often they
define parcels of land, water bodies, and other features that
have a spatial extent.

Modifiable Areal Unit Problem (MAUP)
[Figure: zoning examples labelled “8%”, “8%” and “Illness rate = ?%” for each zone]
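To make the MAUP concrete, here is a minimal Python sketch. All numbers are hypothetical (they are not the values from the slide): eight small blocks with invented populations and illness cases are grouped into two zones in two different ways. The zone-level illness rates change even though the underlying block data do not.

```python
# Hypothetical block-level data: (population, illness cases) for 8 blocks.
# These numbers are invented for illustration only.
blocks = [(100, 2), (100, 4), (100, 20), (100, 18),
          (100, 3), (100, 5), (100, 6), (100, 14)]

def zone_rates(blocks, zoning):
    """Return the illness rate (%) of each zone.

    zoning: list of lists of block indices, one inner list per zone.
    """
    rates = []
    for zone in zoning:
        pop = sum(blocks[i][0] for i in zone)
        cases = sum(blocks[i][1] for i in zone)
        rates.append(100.0 * cases / pop)
    return rates

# Two ways of drawing the zone boundaries over the same blocks.
zoning_1 = [[0, 1, 2, 3], [4, 5, 6, 7]]
zoning_2 = [[0, 1, 4, 5], [2, 3, 6, 7]]

overall = 100.0 * sum(c for _, c in blocks) / sum(p for p, _ in blocks)
rates_1 = zone_rates(blocks, zoning_1)
rates_2 = zone_rates(blocks, zoning_2)

print("overall rate (%):", overall)
print("zoning 1 rates (%):", rates_1)
print("zoning 2 rates (%):", rates_2)
```

The overall rate is the same in both cases; only the choice of areal units changes the picture.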

19

Area Pattern Analysis – Concepts and Definitions
Scale is very important !!!

Focusing on attribute data

Spatial autocorrelation: Concept
Random Clustered Scattered
23

The First Law of
Geography
Waldo Tobler

Everything is related to
everything else, but
near things are more
related than distant
things. --Waldo Tobler

Spatial
Autocorrelation
The single most important
concept in Geography and GIS!

What is statistical “correlation”?
A correlation
measures the
relationship
between any two
variables X and Y.
But not any spurious
pairs of variables like these.

Simultaneity ≠ Causality

Spatial autocorrelation: 1. Tobler’s Law
The confirmation of Tobler’s first law of geography:
Everything is related to everything else, but near
things are more related than distant things.
# Spatial autocorrelation helps us understand the degree
to which one object is similar to other nearby objects.
# Spatial autocorrelation measures how similar close
objects are to one another.
Spatial autocorrelation: Four ways to describe it

Spatial: on a map
Auto: self
Correlation: degree of relative similarity
Positive: similar values cluster together on a map (e.g., elevation)
Negative: dissimilar (different) values are found next to each other on a map
(e.g., a 5 by 5 checkerboard)
Positive spatial autocorrelation
Example: 2002 population density, Puerto Rico
- high values surrounded by nearby high values
- intermediate values surrounded by nearby intermediate values
- low values surrounded by nearby low values

Negative spatial autocorrelation
Example: grocery store density (competition for space)
- high values surrounded by nearby low values
- intermediate values surrounded by nearby intermediate values
- low values surrounded by nearby high values

2. Based on similarity
The degree to which characteristics at one location are similar
to (or different from) those nearby.
Similar to = positive spatial autocorrelation
Different from (dissimilar) = negative spatial autocorrelation
Positive spatial autocorrelation is much more
common than negative. Why?
Example: the diffusion of an innovation
through an agricultural community, where
farmers are more likely to adopt new
techniques that their neighbors have
already used with success.
(see Lecture 4)

Spatial autocorrelation exists everywhere!
Pollution monitoring Satellite image
Household sampling Agricultural experiment

Dispersed (uniform) pattern: high negative spatial autocorrelation
Random pattern: no spatial autocorrelation
Clustered pattern: high positive spatial autocorrelation
3. Based on probability
Measure of the extent to which the occurrence of an event
in one geographic unit (polygon) makes more probable, or
less probable, the occurrence of a similar event in a
neighboring unit.
- Do you recognize this from earlier discussion?
It’s the same concept as clustered, random, dispersed!

4. Using correlation
Correlation of a variable with itself through space.
The correlation between an observation’s value on a variable
and the value of near-by observations on the same variable.
Correlation = “similarity”, “association”, or “relationship”
[Scatter diagram: crime rate in an area vs. crime rate in near-by areas]

Standard statistics: shows the association or relationship
between two different variables.
[Scatter diagram: education vs. income]
Spatial autocorrelation: shows the association or relationship
between the same variable in “near-by” areas.
[Scatter diagram: education vs. education “next door” (in neighboring
or near-by areas); each point is a geographic location]

Spatial autocorrelation – Methods
- To explore how spatial patterns in a set of polygons change
over time.
- Significant implications for the use of statistical techniques in
analyzing spatial data.
Fundamental assumption: the sample observations are
randomly selected and therefore independent of each other.
True?

Why is spatial autocorrelation important?
Two reasons:
1. Spatial autocorrelation is important because it implies
the existence of a spatial process.
- Why are near-by areas similar to each other?
- Why do high income people live “next door” to each other?
- These are GEOGRAPHICAL questions.
• They are about location
2. It invalidates most traditional statistical inference
tests.
- If SA exists, then the results of standard statistical inference
tests may be incorrect (wrong!)
- We need to use spatial statistical inference tests
[Figure: processes create patterns; a sample is used to infer the population]

Why are standard statistical tests wrong?
• Statistical tests are based on the assumption that
the values of observations in each sample are
independent of one another.
• Spatial autocorrelation violates this
- samples taken from nearby areas are related to each
other and are NOT independent.
Values near each other are
similar in magnitude.
Implies a relationship between
nearby observations

[Figures: scatter plots of Y against X showing positive correlation;
correlation coefficient r = 1; r = -1; r = 0.82]

The Z factor:

CORRELATION ≠ CAUSATION
Stochastic Deterministic

Correlation, Causation and Implication
CORRELATION
CAUSATION IMPLICATION

What is “spatial autocorrelation”?
It is a measure of the degree to which a set of spatial features
and their associated data values tend to be clustered together in
space (positive spatial autocorrelation) or dispersed (negative
spatial autocorrelation).

[Figure: three example patterns: positive spatial autocorrelation,
negative spatial autocorrelation, no (zero) spatial autocorrelation]
Is this pattern the result of a random process?
What is the level of “relatedness” of this pattern?

Identification of SPATIAL events
Quantitative nature of data set
Understand if events are similar or dissimilar
by defining the intensity of the spatial
process, and how strongly a variable is expressed
across space.
Geometric nature of data set
Conceptualise spatial relationships – at which
distance events influence each other
(distance band).

Measuring spatial autocorrelation
Adjacency
(Contiguity)
Distance

Spatial autocorrelation: Methods
Measurement based on adjacency/contiguity
- If zone j is next to zone i, it receives a weight of 1
- otherwise it receives a weight of 0.
Hexagons Irregular
Rook Queen
Sharing a border or boundary
Rook: sharing a border
Queen: sharing a border or a point

Measuring contiguity: lagged contiguity
Should we include second order contiguity?
[Figure: 1st order (nearest neighbor) and 2nd order (next nearest
neighbor) contiguity for the hexagon, rook and queen cases]

Spatial weights matrix for Rook case
Matrix contains a:
- 1 if share a border
- 0 if do not share a border
4 areal units, arranged as:
  A B
  C D
Associated geographic connectivity/weights matrix (4x4), where 1 = common border:
W =
     A  B  C  D
  A  0  1  1  0
  B  1  0  0  1
  C  1  0  0  1
  D  0  1  1  0
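The contiguity rules above can be sketched in a few lines of Python. This is a minimal illustration for a regular grid only; the function name `contiguity_matrix` and the row-by-row cell numbering are assumptions made for this example, not part of the lecture. For the 2 x 2 layout A B / C D it reproduces the rook matrix W shown above.

```python
def contiguity_matrix(n_rows, n_cols, queen=False):
    """Binary contiguity matrix for a regular grid, cells numbered row by row."""
    n = n_rows * n_cols
    W = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # a cell is not its own neighbour
                    if not queen and dr != 0 and dc != 0:
                        continue  # rook: ignore corner-only (diagonal) contacts
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        W[i][rr * n_cols + cc] = 1
    return W

# 2 x 2 layout: A B on the top row, C D on the bottom row.
W_rook = contiguity_matrix(2, 2)
W_queen = contiguity_matrix(2, 2, queen=True)
for label, row in zip("ABCD", W_rook):
    print(label, row)
```

Under the queen rule every pair of cells in a 2 x 2 grid touches at least at a point, so `W_queen` has a 1 everywhere except on the diagonal.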

Joint count statistics
(more commonly written “join count statistics”)
For binary (also called dichotomous) variables,
areas on a map are either white (W) or black (B).
clustering random dispersed

Joint count statistics: Binary variables!
At the nominal level, only the presence (B) or the
absence (W) of a specific thematic property is
considered.
• At the nominal level, a particular category or a set of
categories, for example the presence of a socio-economic
category or urbanization level (urban/rural).
• At the ordinal level, a class (a rank) or a set of classes, for
example the presence of the best agricultural soil classes
(arable/nonarable).
• At the interval and ratio levels, an interval of values, for
example the presence of a significant rate of criminality
(low/high).

• The most basic statistic for area pattern analysis of
binary variables.
• Joint: two areas sharing a common edge or boundary.
Rook’s case Queen’s case

For any choropleth map, we can count the number and types
of joints that exist (BB, BW, WW).
A choropleth map is a thematic map in which
polygons (areas) are shaded or patterned using
different colors in proportion to the measurement
of the statistical variable being displayed on the
map, such as population density or GDP.
Rook’s case (4 x 4 grid):
Vertical joints: 4 columns x 3 per column = 12
Horizontal joints: 4 rows x 3 per row = 12
Total joints: 12 + 12 = 24
BB: 6
BW: 11
WW: 7
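Counting joints by hand is easy to get wrong, so here is a small Python sketch of the rook's-case count. The 4 x 4 black/white pattern below is hypothetical (the pattern on the slide is not recoverable from the text), and `join_counts` is a name chosen for this example. The total of 24 joints matches the 12 vertical plus 12 horizontal joints of any 4 x 4 grid.

```python
def join_counts(grid):
    """Count BB, BW and WW joints under the rook's case.

    grid: list of equal-length strings; each character is "B" or "W".
    """
    counts = {"BB": 0, "BW": 0, "WW": 0}
    n_rows, n_cols = len(grid), len(grid[0])
    for r in range(n_rows):
        for c in range(n_cols):
            # Look only right and down so that each joint is counted once.
            for rr, cc in ((r, c + 1), (r + 1, c)):
                if rr < n_rows and cc < n_cols:
                    pair = grid[r][c] + grid[rr][cc]
                    key = "BW" if pair in ("BW", "WB") else pair
                    counts[key] += 1
    return counts

# Hypothetical 4 x 4 pattern (not the one on the slide).
grid = ["BBWW",
        "BBWW",
        "WWBB",
        "WWBB"]
counts = join_counts(grid)
print(counts, "total:", sum(counts.values()))
```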

Joint numbers for three example patterns (B: black; W: white):
Pattern                      BB    WW    BW
Positive autocorrelation     94    94    22
Negative autocorrelation     49    49   112
Random                        ?     ?     ?

In an independent random process:
Expected number of BB joints: $J_{BB} = kp^2$
Expected number of WW joints: $J_{WW} = kq^2$
Expected number of BW joints: $J_{BW} = 2kpq$
where k = total number of joints
p = probability of an area being B
q = probability of an area being W
With k = 210 and p = q = 0.5:
$J_{BB} = kp^2 = 210(0.5)^2 = 52.5$
$J_{WW} = kq^2 = 210(0.5)^2 = 52.5$
$J_{BW} = 2kpq = 2(210)(0.5)(0.5) = 105$
Observed joint numbers for the random pattern:
BB: 56
WW: 47
BW: 107

Observed JBW < Expected JBW: Positive autocorrelation
Observed JBW = Expected JBW: Random
Observed JBW > Expected JBW: Negative autocorrelation
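The expected counts and the BW comparison rule can be sketched in Python as follows. The function names are chosen for this example; the inputs (k = 210, p = 0.5, and the observed BW counts 22, 112 and 107) are the ones quoted on the slides. The comparison is only the rule of thumb stated above, not a significance test.

```python
def expected_joins(k, p):
    """Expected BB, WW and BW joint counts under an independent random process."""
    q = 1.0 - p
    return {"BB": k * p ** 2, "WW": k * q ** 2, "BW": 2 * k * p * q}

def compare_bw(observed_bw, expected_bw):
    """Apply the rule of thumb to the BW count (no significance test)."""
    if observed_bw < expected_bw:
        return "fewer BW than expected: leans positive"
    if observed_bw > expected_bw:
        return "more BW than expected: leans negative"
    return "as expected: random"

expected = expected_joins(k=210, p=0.5)
print(expected)  # {'BB': 52.5, 'WW': 52.5, 'BW': 105.0}

for label, bw in (("positive example", 22),
                  ("negative example", 112),
                  ("random example", 107)):
    print(label, bw, "->", compare_bw(bw, expected["BW"]))
```

Note that the random example (107 observed against 105 expected) differs from the expectation by only two joints. Deciding whether such a difference is meaningful requires a test statistic, which is why the raw comparison should be read with care.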
65

Joint count statistics method
Major limitations:
1. Works with binary data only
2. Applies to area data only
3. Joint counting is tedious and
error-prone
4. Computation of test statistic is
complicated and formidable

Measuring similarity of nearby features
Values on the map: 3, 3, 5, 4, 6, 3
Geary’s C (target value minus neighbouring value):
3 – 3 = 0
3 – 5 = -2
3 – 4 = -1
Moran’s I (deviations from the mean):
Mean = 24 / 6 = 4
Target – Mean = 3 – 4 = -1
Neighbour – Mean:
3 – 4 = -1
5 – 4 = 1
4 – 4 = 0

(a): Moran’s I
• The most common measure of spatial autocorrelation
• Use for points or polygons
- Joint Count statistic only for polygons
• Use for a continuous variable (any value)
- Joint Count statistic only for binary variable (1,0)
• Varies on a scale from -1 to +1
  -1: high negative spatial autocorrelation (uniform/dispersed pattern)
   0: no spatial autocorrelation (random pattern)
  +1: high positive spatial autocorrelation (clustered pattern)
• It can also be used as an index for dispersed/random/
clustered patterns.

Moran’s I vs. Correlation Coefficient r
Correlation Coefficient r
Relationship between two variables
Moran’s I
Involves one variable only;
Correlation between variable, X, and the “spatial lag” of X formed
by averaging all the values of X for the neighboring polygons
[Figure: four scatter diagrams. For r: Education vs. Income, and
Price vs. Quantity. For Moran’s I: Crime rate vs. crime in nearby
area, and Grocery store density vs. grocery store density nearby.
Coefficients shown on the panels: 0.71 and -0.71.]

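Moran’s Index (Moran’s I)
$$I = \frac{N}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
where N (= n) is the number of areal units, w_ij is the spatial weight
between units i and j, and x̄ is the mean of x.
(O'Sullivan and Unwin, 2003)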

Grid (a):
 5  7 11
 6 10 13
 8 14 16
Grid (b):
 5  7 13
 8 16 14
10 11  6

                      (a)         (b)
Mean                  10          10
Standard deviation    3.807887    3.807887
Variance              14.5        14.5
Moran’s I             0.5532      0.0575
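The two Moran’s I values in the table can be reproduced with a short Python sketch. The slide does not say which weights were used; the values 0.5532 and 0.0575 come out when rook contiguity is combined with row-standardised weights (each cell's neighbours share a total weight of 1), so that choice is an assumption of this example. The function names are also chosen for illustration.

```python
def rook_neighbours(n_rows, n_cols):
    """Map each cell index to the indices of the cells sharing an edge with it."""
    nbrs = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cells = []
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    cells.append(rr * n_cols + cc)
            nbrs[r * n_cols + c] = cells
    return nbrs

def morans_i(values, nbrs):
    """Moran's I with row-standardised weights w_ij = 1 / (neighbours of i)."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]          # deviations from the mean
    s0 = 0.0                                # sum of all weights
    cross = 0.0                             # sum of w_ij * z_i * z_j
    for i, js in nbrs.items():
        w = 1.0 / len(js)
        for j in js:
            s0 += w
            cross += w * z[i] * z[j]
    return (n / s0) * cross / sum(d * d for d in z)

# Grids (a) and (b), listed row by row.
grid_a = [5, 7, 11, 6, 10, 13, 8, 14, 16]
grid_b = [5, 7, 13, 8, 16, 14, 10, 11, 6]
nbrs = rook_neighbours(3, 3)
i_a = morans_i(grid_a, nbrs)
i_b = morans_i(grid_b, nbrs)
print(round(i_a, 4), round(i_b, 4))  # 0.5532 0.0575
```

Both grids contain the same nine numbers, so their mean and variance are identical; only the arrangement differs, and that is what Moran’s I picks up.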

1. This index measures spatial
autocorrelation based on both
feature locations and feature values
simultaneously.
2. Given a set of spatial features and
an associated attribute, it evaluates
whether the pattern expressed is
clustered, dispersed, or random.
3. Its results are relatively easy to interpret:
+1 is indicative of perfect clustering
-1 is indicative of perfect dispersion
0 is indicative of zero spatial
autocorrelation (random)

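Test statistic for normal frequency distribution
Null Hypothesis: no spatial autocorrelation.
Moran’s I = 0 (strictly, the expected value under the null is
-1/(n - 1), which approaches 0 as n grows)
Alternative Hypothesis: spatial autocorrelation exists.
Moran’s I ≠ 0
Reject the Null Hypothesis at the 5% level if Z > 1.96 or Z < -1.96
(2.5% in each tail).
- If there were no spatial autocorrelation, a Z value this extreme
would occur less than 5% of the time.
- We conclude, at the 95% confidence level, that spatial
autocorrelation exists.
[Figure: normal curve of Z centred on 0, with 2.5% rejection regions
below -1.96 and above 1.96; a marked value of 2.54 lies in the upper
region: reject null]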
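The decision rule can be written out as a small Python sketch. It assumes, as the slide does, that the test statistic Z follows a standard normal distribution under the null hypothesis; the function names are chosen for this example, and 2.54 is the value marked on the slide's figure (taken here to be an example Z score).

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic z."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def reject_null(z, critical=1.96):
    """True if z falls in either 2.5% tail (5% significance level)."""
    return abs(z) > critical

for z in (0.5, 1.96, 2.54, -2.54):
    print(z, round(two_sided_p(z), 4), reject_null(z))
```

A Z of 2.54 lies beyond 1.96, so the null hypothesis of no spatial autocorrelation is rejected at the 5% level; a Z of 0.5 is not.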

[Figure: example patterns and their Moran’s I values]
I = -1.00: extreme negative
I = -0.393: negative
I = 0.00: random, independent
I = +0.293: positive
I = +1.00: extreme positive

Moran’s I shows
the similarity of
nearby features
through the I
value (-1 to 1), but
does not indicate
if the clustering is
for high values or
low values.
I= -0.12, slightly dispersed
I= 0.26, clustered

Moran Scatter Plots
Moran’s I can be interpreted as the correlation between
variable, X, and the “spatial lag” of X formed by averaging
all the values of X for the neighboring polygons.
We can then draw a scatter diagram between these two
variables (in standardized form): X and lag-X (or W_X).
The slope of the least squares “best fit” regression line
through the points is Moran’s I
(will discuss Regression next week).
[Figure: Lag X_i is the average of X over the polygons neighboring polygon i]
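The claim that the regression slope equals Moran’s I can be checked with a short Python sketch. It uses grid (a) from the earlier 3 x 3 example and assumes rook neighbours with the lag taken as a plain average of the neighbours (row-standardised weights); the helper names are chosen for this illustration.

```python
def spatial_lag(z, nbrs):
    """Average of the neighbours' values for every location."""
    return [sum(z[j] for j in nbrs[i]) / len(nbrs[i]) for i in range(len(z))]

def ols_slope(x, y):
    """Slope of the least squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Rook neighbours for a 3 x 3 grid, cells numbered row by row (0..8).
nbrs = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
        3: [0, 4, 6], 4: [1, 3, 5, 7], 5: [2, 4, 8],
        6: [3, 7], 7: [4, 6, 8], 8: [5, 7]}

x = [5, 7, 11, 6, 10, 13, 8, 14, 16]      # grid (a), row by row
mean = sum(x) / len(x)
sd = (sum((v - mean) ** 2 for v in x) / len(x)) ** 0.5
z = [(v - mean) / sd for v in x]          # standardised X
lag_z = spatial_lag(z, nbrs)              # lag-X
slope = ols_slope(z, lag_z)
print(round(slope, 4))  # 0.5532
```

The slope matches the Moran’s I reported for grid (a). With other weighting schemes the equality between the slope and Moran’s I does not hold in general, which is why the averaging (row-standardised) lag is used here.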

Moran’s scatter plot
Q1 (values [+], nearby values [+]): H-H, positive SA
Q3 (values [-], nearby values [-]): L-L, positive SA
Locations of positive spatial association
(“I’m similar to my neighbors”).
Q2 (values [-], nearby values [+]): L-H, negative SA
Q4 (values [+], nearby values [-]): H-L, negative SA
Locations of negative spatial association
(“I’m different from my neighbors”).
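The quadrant labels can be assigned mechanically from the sign of each value (relative to the mean) and the sign of its spatial lag. The sketch below uses grid (b) from the earlier 3 x 3 example with rook neighbours and an averaged lag, both assumptions of this illustration; points that fall exactly on an axis are reported separately because the slides do not say how to treat them.

```python
def quadrant(z_i, lag_i):
    """Label a location by the sign of its value and of its neighbours' average."""
    if z_i > 0 and lag_i > 0:
        return "HH"          # Q1
    if z_i < 0 and lag_i > 0:
        return "LH"          # Q2
    if z_i < 0 and lag_i < 0:
        return "LL"          # Q3
    if z_i > 0 and lag_i < 0:
        return "HL"          # Q4
    return "on an axis"      # value or lag exactly equal to the mean

# Rook neighbours for a 3 x 3 grid, cells numbered row by row (0..8).
nbrs = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
        3: [0, 4, 6], 4: [1, 3, 5, 7], 5: [2, 4, 8],
        6: [3, 7], 7: [4, 6, 8], 8: [5, 7]}

x = [5, 7, 13, 8, 16, 14, 10, 11, 6]      # grid (b), row by row
mean = sum(x) / len(x)
z = [v - mean for v in x]                 # deviations from the mean
lag = [sum(z[j] for j in nbrs[i]) / len(nbrs[i]) for i in range(len(x))]
labels = [quadrant(a, b) for a, b in zip(z, lag)]

for value, label in zip(x, labels):
    print(value, label)
```

Grid (b) mixes H-H, L-L and L-H locations, which is consistent with its Moran’s I being close to zero.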

Example 1: Population density in Puerto Rico
- Scatter plot of X vs. Lag-X;
- The slope of the regression
line is Moran’s I
Moran’s I = 0.49
[Figure: Moran scatter plot of X against Lag-X, with “high surrounded
by high” and “low surrounded by low” regions marked]

Example 2: the two 3 x 3 grids (a) and (b) shown earlier
(mean 10, standard deviation 3.807887, variance 14.5 for both;
Moran’s I = 0.5532 for (a) and 0.0575 for (b))

(b): Geary’s Index (C)
• A measure of spatial autocorrelation: an attempt to determine
whether adjacent observations of the same phenomenon are correlated.
• The value of Geary's C typically lies between 0 and 2.
• 1 means no spatial autocorrelation.
• Values lower than 1 indicate increasing positive spatial
autocorrelation, whilst values higher than 1 indicate
increasing negative spatial autocorrelation.
0: positive spatial autocorrelation
1: no spatial autocorrelation
2: negative spatial autocorrelation

$$C = \frac{(N-1)\,\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i - x_j)^2}{2\left(\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\right)\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
• Calculation is similar to Moran’s I
- For Moran’s I, the cross-product is based on the deviations from the mean
for the two location values: (x_i - x̄)(x_j - x̄).
- For Geary’s C, the cross-product is replaced by the squared difference
between the actual values at the two locations: (x_i - x_j)^2.

Moran’s I:
$$I = \frac{N}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Geary’s C:
$$C = \frac{(N-1)\,\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i - x_j)^2}{2\left(\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\right)\sum_{i=1}^{n} (x_i - \bar{x})^2}$$
Geary’s Index (C)
• Geary's C is inversely related to Moran's I, but it is not
identical.
• Interpretation is very different, essentially the opposite!
- Geary’s C varies on a scale from 0 to 2
• Moran's I is a measure of global spatial autocorrelation, while
Geary's C is more sensitive to local spatial autocorrelation.
• Can be converted to a -1 to +1 scale by calculating C* = 1 – C.
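As a rough illustration of Geary’s C, the sketch below applies the formula to the two 3 x 3 grids (a) and (b) from the Moran’s I example. The slides give no Geary values for these grids, so the results are this example's own; binary rook weights and the function names are assumptions made here.

```python
def rook_matrix(n_rows, n_cols):
    """Binary rook contiguity matrix for a grid, cells numbered row by row."""
    n = n_rows * n_cols
    W = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    W[r * n_cols + c][rr * n_cols + cc] = 1
    return W

def gearys_c(values, W):
    """Geary's C: weighted squared differences relative to the overall variation."""
    n = len(values)
    mean = sum(values) / n
    s0 = sum(sum(row) for row in W)                     # sum of all weights
    num = sum(W[i][j] * (values[i] - values[j]) ** 2
              for i in range(n) for j in range(n))
    den = sum((v - mean) ** 2 for v in values)
    return (n - 1) * num / (2.0 * s0 * den)

W = rook_matrix(3, 3)
grid_a = [5, 7, 11, 6, 10, 13, 8, 14, 16]
grid_b = [5, 7, 13, 8, 16, 14, 10, 11, 6]
c_a = gearys_c(grid_a, W)
c_b = gearys_c(grid_b, W)
print("C:", round(c_a, 4), round(c_b, 4))
print("C* = 1 - C:", round(1 - c_a, 4), round(1 - c_b, 4))
```

Grid (a) gives a C well below 1 (positive spatial autocorrelation), while grid (b) gives a C close to 1, in line with the ordering of the Moran’s I values for the same grids.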

Local measure of spatial autocorrelation
• Global statistics – identify and
measure the pattern of the entire
study area.
- Do not indicate where specific
patterns occur!
• Local statistics – identify variation
across the study area, focusing on
individual features and their
relationships to nearby features
(i.e. specific areas of clustering).

Local Indicators of Spatial Association (LISA)
The statistic is calculated
for each areal unit in the
data.
For each polygon, the index
is calculated based on
neighboring polygons with
which it shares a border.
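A minimal Python sketch of one common LISA statistic, the local Moran’s I, is given below. It is an illustration under stated assumptions rather than the method used for the lecture's maps: neighbours are cells sharing a border (rook), weights are row-standardised, and the data are grid (a) from the earlier 3 x 3 example. With these choices the local values average to the global Moran’s I.

```python
def local_morans_i(values, nbrs):
    """Local Moran's I for each unit: z_i times the average z of its neighbours,
    scaled by the mean squared deviation."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    m2 = sum(d * d for d in z) / n
    local = []
    for i in range(n):
        lag = sum(z[j] for j in nbrs[i]) / len(nbrs[i])
        local.append(z[i] * lag / m2)
    return local

# Rook neighbours for a 3 x 3 grid, cells numbered row by row (0..8).
nbrs = {0: [1, 3], 1: [0, 2, 4], 2: [1, 5],
        3: [0, 4, 6], 4: [1, 3, 5, 7], 5: [2, 4, 8],
        6: [3, 7], 7: [4, 6, 8], 8: [5, 7]}

grid_a = [5, 7, 11, 6, 10, 13, 8, 14, 16]
local = local_morans_i(grid_a, nbrs)
global_i = sum(local) / len(local)

for value, li in zip(grid_a, local):
    print(value, round(li, 3))
print("mean of local values:", round(global_i, 4))  # 0.5532
```

The largest local values occur in the low corner (5) and the high corner (16), which is where similar values sit next to each other; a global statistic alone would not show this.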
85

Example: raw data compared with LISA [figure]

GEOG2120
Next week . . .
- Correlation and Regression
