1. Regional Income Convergence: A Spatial
Analysis Approach
Prepared by César R. Sobrino
Universidad del Turabo
November 27, 2017
1 / 76 Prepared by César R. Sobrino Regional Income Convergence: A Spatial Analysis App
2. Outline
1 Regression Analysis
OLS regression
Assumptions and Tests
2 Spatial Econometrics
Spatial Dependence & Spatial Heterogeneity
Spatial Matrix (W) and Moran’s I statistic
3 Income Convergence
σ-convergence
β -convergence and speed of convergence (θ)
4 GeoDa
Managing shapefiles
Creating Ws and Moran’s I statistic
Regression
3. OLS: Ordinary Least Squares
Parameters
The coefficients in an equation that determine the
exact mathematical relation among the variables
(growth rate and initial income)
Unknowns.
Parameter estimation
The process of finding estimates of the numerical
values of the parameters of an equation
4. OLS
OLS
The general purpose of linear regression is to find a
(linear) relationship between the dependent variable
and a set of explanatory variables.
The data can be cross-sectional or time series.
Bivariate form
Y = a + bX + ε
Intercept parameter (a) gives the value of Y where the
regression line crosses the Y-axis (the value of Y when X is
zero).
Slope parameter (b) gives the change in Y associated
with a one-unit change in X: ∆Y/∆X
5. OLS
Two objectives:
Find a good match (fit) between a + bX and observed
values of Y ( a and b are the regression coefficients).
Discover which of the explanatory variables (Xs)
contribute significantly to the linear relationship
OLS accomplishes both stated objectives in an optimal
fashion according to some criteria, and is referred to as
a Best Linear Unbiased Estimator (BLUE).
OLS estimates (a and b) are found by minimizing the sum
of the squared prediction errors (hence least squares).
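The least-squares idea can be sketched in a few lines of Python. This is only an illustration under the bivariate form Y = a + bX + ε; the function names and the data are made up for the example and are not part of the original slides.

```python
def ols_bivariate(x, y):
    """Return (a, b) that minimize the sum of squared errors of y = a + b*x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("x must vary across observations")
    b = s_xy / s_xx          # slope: change in Y per one-unit change in X
    a = y_bar - b * x_bar    # intercept: value of Y when X is zero
    return a, b


def sum_squared_errors(x, y, a, b):
    """Sum of squared prediction errors for the line a + b*x."""
    return sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up data lying roughly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.2, 8.8]
a_hat, b_hat = ols_bivariate(x, y)
```

Any other line, for example one with a slightly different slope, gives a larger sum of squared errors than the pair returned by `ols_bivariate`.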
6. OLS
The OLS regression line (the red one) is the line that minimizes the
sum of the squared prediction errors.
7. OLS
In order to obtain the BLUE property and to be able to
make statistical inferences about the population
parameters (a and b) by means of your estimates (â
and b̂), you need to make certain assumptions about
the random part of the regression equation (the
random error ε).
Two of these assumptions are crucial to obtain the
unbiasedness and efficiency of the OLS estimates.
8. OLS
Assumptions
The random error (ε) has mean zero (there is no
systematic misspecification or bias in the regression
equation).
Expected value: E(ε) = 0
If E(ε) = 0 does not hold, estimators are biased
The random error terms are uncorrelated and have a
constant variance (they are homoskedastic).
Variance: E(εε′) = σ²I
If E(εε′) = σ²I does not hold, either
autocorrelation or heteroskedasticity is present, so
estimators are inefficient.
9. Hypothesis Tests
Null hypothesis H0: a = 0 or H0: b = 0.
Alternative hypothesis H1: a ≠ 0 or H1: b ≠ 0.
If you reject H0, the parameter (a or b) is
statistically different from zero.
10. Individual statistical significance
Must determine if there is sufficient statistical evidence
to indicate that Y is truly related to X (i.e., b ≠ 0).
Even if b = 0, it is possible that the sample will
produce an estimate b̂ that is different from zero.
Test for statistical significance using t-tests or p-values.
11. Individual significance - t-Test
First determine the level of significance (0.1%, 1%, 5%,
10%)
Probability of finding a parameter estimate to be
statistically different from zero when, in fact, it is zero
(alpha). α = 0.001, 0.01, 0.05, or 0.1, respectively.
Probability of a Type I Error (alpha).
1 – level of significance (alpha) = level of confidence
t-ratio is computed as t = b̂/S_b,
where S_b is the standard error of the estimate b̂
Use t-table to choose critical t-value with n – k
degrees of freedom for the chosen level of significance
n = number of observations
k = number of parameters estimated.
12. Individual significance-t-Test
If the absolute value of the t-ratio is greater than the
critical t, the parameter estimate is statistically
significant at the given level of significance.
As a rule of thumb, if the t-ratio (in absolute value) is 2 or
larger, you can reject H0 at the 5% level.
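As a small illustration of the t-ratio and the rule of thumb, the sketch below uses the slope and standard error reported for LINC29 in the 1994-29 OLS output later in this deck. The function names are hypothetical, and the default critical value of 2 is the rule of thumb, not an exact value from a t-table.

```python
def t_ratio(estimate, std_error):
    """t-ratio: parameter estimate divided by its standard error."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    return estimate / std_error


def is_significant(t, critical_t=2.0):
    """Compare |t| with a critical value (2.0 is only the rule of thumb)."""
    return abs(t) >= critical_t


# Slope and standard error of LINC29 from the 1994-29 OLS output.
t_linc29 = t_ratio(-0.732026, 0.0322159)
reject_h0 = is_significant(t_linc29)
```

The computed ratio is about -22.72, far beyond 2 in absolute value, so H0: b = 0 is rejected.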
13. Individual significance - p-Values
Treat as statistically significant only those parameter
estimates with p-values smaller than the maximum
acceptable significance level. p-value gives exact level
of significance.
Also the probability of finding significance when none
exists
Significance levels (alpha)
α = 0.001, or 0.1% significance level
α = 0.01, or 1% significance level
α = 0.05, or 5% significance level
α = 0.1, or 10% significance level
E.g. if p-value = 0.00001, you reject H0 at the 0.1%
significance level; if p-value = 0.08, you reject H0 at the
10% significance level; and if p-value = 0.14, you
cannot reject H0 at the 10% significance level.
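The p-value rule above can be written as a short helper. This is a sketch: the function name is made up, and the set of levels is the one listed on this slide.

```python
SIGNIFICANCE_LEVELS = (0.001, 0.01, 0.05, 0.1)


def strictest_rejection_level(p_value, levels=SIGNIFICANCE_LEVELS):
    """Return the smallest listed alpha at which H0 is rejected, or None."""
    for alpha in sorted(levels):
        if p_value < alpha:
            return alpha
    return None  # H0 cannot be rejected even at the largest level


examples = {p: strictest_rejection_level(p) for p in (0.00001, 0.08, 0.14)}
```

For the three p-values of the example, the helper returns 0.001, 0.1 and `None` respectively.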
14. Joint significance -F-test
Used to test for significance of overall regression
equation
Compare F-statistic to critical F-value from F-table
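Two degrees of freedom: k – 1 (numerator) and n – k (denominator)
Level of significance
If the F-statistic exceeds the critical F (about 4 at the 5% level
for a bivariate regression with a moderate sample), the regression
equation overall is statistically significant at the
specified level of significance.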
15. Coefficient of Determination: R²
R² measures the percentage of total variation in the
dependent variable (Y) that is explained by the
regression equation
Ranges from 0 to 1
A high R² indicates Y and X are highly correlated
E.g. R² = 0.8 means that 80% of the variation in Y is
explained by the regression equation.
16. Spatial Analysis: Motivation
Diagnosis
The assumptions of normal, homoskedastic and
uncorrelated error terms that lead to the BLUE
property of OLS estimators are not necessarily
satisfied by real models and data.
When dealing with spatial data you must give special
attention to the possibility that the errors or the
variables (Xs) in the model show spatial
dependence.
17. Spatial Analysis: Motivation
Why is spatial autocorrelation (dependence)
important?
We need to examine the influence of spatial
autocorrelation on the inferences that may be
drawn from statistical tests.
As these inferences are based on independence
assumptions (OLS assumptions), the presence of
spatial autocorrelation is likely to bias any resultant
inference.
Dependence among the error terms makes OLS
estimates inefficient: Spatial Error (SEM).
An omitted spatial lag of the dependent variable makes OLS
estimates biased, and thus inferences based on
the regression model will be incorrect: Spatial Lag
(SAR).
18. Spatial Analysis: Motivation
Applied work in regional science (economics, health,
demographics, etc.) makes use of spatial data.
Spatial data: Data collected with reference to location.
Administrative spatial units (states, districts,
counties, etc.).
Functional regions (E.g. labour market regions).
Points in space (E.g. cities, municipalities, plants) .
Using spatial data, model estimation, hypothesis
testing and prediction have to allow for spatial
effects.
19. Spatial Dependence
Lack of independence among spatial data.
Observations at location i depend on other
observations at locations j (j ≠ i).
Spatial dependence is associated with the notion of
relative space (location)
Neighbouring regions are expected to be more alike
than arbitrary regions.
Spatial dependence is expected to diminish with
increasing distance.
Spatial dependence is multidirectional by nature.
Time-series dependence is unidirectional.
20. Spatial Dependence: Causes
Nuisance:
The delineation of spatial units is somewhat arbitrary.
Spatial data are usually collected for administrative
units (states, districts, counties, etc.).
If the correspondence between the spatial scale of a
phenomenon under study and the delineation of the
spatial units of observation is not strong,
measurement errors are to be expected.
OLS models can be corrected by including a spatial
error specification in the model (SEM).
21. Spatial Dependence: Causes
Substantive:
Interaction and dependence at the regional level may
itself be a modelling problem because it generates
model bias.
Location and distance are important forces at work in
human geography and market activity, e.g. spatial
spillovers, hierarchy of places, etc.
This can be corrected by including an explicit spatial
lag term as an explanatory variable in the model
(SAR).
22. Spatial Heterogeneity
It refers to varying economic relationships or
disturbances over space.
A different relationship may hold for every spatial
unit. This situation characterizes the case of structural
instability.
In case of structural instability, the regression
coefficients are not constant across the spatial units.
E.g. Sample: 35,000 homes sold within the last 5 years
in Lucas County, Ohio.
3 distinct distributions, with low-priced homes nearest
to the Central Business District (CBD) and high-priced
homes farthest away from the CBD.
This suggests different relationships may be at work
to describe home prices in different locations.
23. Spatial Weight Matrix (W)
Quantify location for analyzing spatial effects
Contiguity (neighbourhood)
The relative location among spatial units. Usually
established from a map.
Units near each other should reflect a greater degree of spatial
dependence than those more distant from each
other. For spatial heterogeneity, relationships may
be similar for neighbouring units.
Distance
Latitude and longitude allow us to calculate distances
from any point in space.
Spatial dependence will decline with distance.
For spatial heterogeneity, closer units should
exhibit similar relationships.
24. W
In a regular grid, neighbours can be defined in a
number of ways. Among others, you may find:
Contiguity: by analogy with the game of chess, rook contiguity, bishop
contiguity and queen contiguity are distinguished.
Distance: inverse distance raised to a power.
25. W: Rook contiguity
A spatial unit is a neighbour of another unit if both
areas share a common edge (side). In the next figure,
the units B1, B2, B3 and B4 are neighbours of unit A
according to the rook criterion.
B2
B1 A B3
B4
26. W:Queen contiguity
A spatial unit is a neighbour of another unit if both
areas share a common edge or vertex. In the next
figure the units B1, B2, B3 and B4 as well as C1, C2,
C3 and C4 are neighbours of unit A according to the
queen criterion.
C1 B2 C2
B1 A B3
C3 B4 C4
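The rook, bishop and queen criteria on a regular grid can be sketched as follows. The function name and the (row, column) indexing are assumptions made for the illustration; GeoDa builds contiguity from the polygons of a shapefile rather than from grid indices.

```python
def grid_neighbours(row, col, n_rows, n_cols, criterion="queen"):
    """Neighbours of cell (row, col) on a regular grid under a chess criterion."""
    rook_steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]       # shared edge
    bishop_steps = [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # shared vertex only
    if criterion == "rook":
        steps = rook_steps
    elif criterion == "bishop":
        steps = bishop_steps
    elif criterion == "queen":
        steps = rook_steps + bishop_steps                 # edge or vertex
    else:
        raise ValueError("criterion must be 'rook', 'bishop' or 'queen'")
    neighbours = []
    for d_row, d_col in steps:
        r, c = row + d_row, col + d_col
        if 0 <= r < n_rows and 0 <= c < n_cols:           # stay inside the grid
            neighbours.append((r, c))
    return neighbours


# Unit A sits in the centre of a 3 x 3 grid, as in the figures.
rook = grid_neighbours(1, 1, 3, 3, "rook")     # the four B cells
queen = grid_neighbours(1, 1, 3, 3, "queen")   # the B cells plus the C cells
```

For the centre cell, the rook criterion yields four neighbours and the queen criterion eight, matching units B1 to B4 and C1 to C4 in the figures.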
27. W:Distance-based spatial weight matrix
Spatial interaction will decline with increasing distance
due to increasing geographical impediments.
Nearer regions have a greater potential influence.
Power function: $W_{ij} = 1/d_{ij}^{\gamma}$ for $i \neq j$, where
γ is a power parameter
$W_{ij}$: element of matrix W at row i and column j
$d_{ij}$: distance between region i and region j
The distances, $d_{ij}$, are usually measured between the
centres of the regions (latitude and longitude).
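A minimal sketch of an inverse-distance weight matrix follows. It assumes planar (x, y) centroid coordinates and straight-line distances; with latitude and longitude a great-circle distance may be more appropriate. The function name and the coordinates are made up for the example.

```python
import math


def inverse_distance_weights(centroids, gamma=1.0):
    """Build W with W[i][j] = 1 / d_ij**gamma for i != j and 0 on the diagonal."""
    n = len(centroids)
    w = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(centroids):
        for j, (xj, yj) in enumerate(centroids):
            if i == j:
                continue  # a region is not its own neighbour
            d = math.hypot(xi - xj, yi - yj)  # straight-line distance
            if d == 0:
                raise ValueError("two regions share the same centroid")
            w[i][j] = 1.0 / d ** gamma
    return w


# Hypothetical centroids: distances are 5 and 10 from the first region.
centroids = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
w1 = inverse_distance_weights(centroids, gamma=1.0)
w2 = inverse_distance_weights(centroids, gamma=2.0)
```

A larger γ makes the weights fall off faster with distance, so distant regions matter less.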
29. Testing Spatial Autocorrelation
Moran’s I statistic: test for spatial dependence.
Pearson correlation: $\rho_{xy} = \dfrac{S_{xy}}{S_x S_y}$,
where $S_{xy}$ is the covariance between x and y, $S_x$ is the
standard deviation of x, and $S_y$ is the standard
deviation of y.
Covariance formula:
$$S_{xy} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{n - 1},$$
then
$$\rho_{xy} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{S_x S_y (n - 1)}$$
30. Moran’s I - statistic
Similarities between units i and j are calculated as the
product of the deviations of $x_i$ (variable of
interest) and $x_j$ (spatial lag) from the overall mean ($\bar{x}$),
divided by the sample variance. This ratio has to be
adjusted for the spatial weights used.
$$I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} W_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} W_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
where $x_i$ is the i-th observation, n is the sample size,
and $W_{ij}$ is the spatial weight between i and j.
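The formula can be checked on a toy example. The sketch below is a direct transcription of the definition in plain Python; the function name, the four-region chain and the values are made up, and GeoDa's own implementation (with its permutation inference) is not reproduced here.

```python
def morans_i(values, weights):
    """Moran's I for n values and an n x n spatial weight matrix."""
    n = len(values)
    if n < 2 or len(weights) != n or any(len(row) != n for row in weights):
        raise ValueError("weights must be an n x n matrix matching values")
    mean = sum(values) / n
    dev = [v - mean for v in values]                  # deviations from the mean
    total_weight = sum(sum(row) for row in weights)   # sum of all W_ij
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))  # weighted cross-products
    squares = sum(d * d for d in dev)
    if total_weight == 0 or squares == 0:
        raise ValueError("weights and values must not be degenerate")
    return (n / total_weight) * (cross / squares)


# Four regions in a row; each region neighbours the one next to it.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
i_trend = morans_i([1.0, 2.0, 3.0, 4.0], chain)          # similar neighbours
i_alternating = morans_i([1.0, -1.0, 1.0, -1.0], chain)  # dissimilar neighbours
```

A smooth trend across neighbours gives a positive I, while values that alternate between neighbours give a negative I.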
31. Moran’s I - statistic
The expected value of Moran’s I statistic: −1/(n − 1)
E.g. if n = 48 regions ⇒ −1/(48 − 1) = −0.0213, which is
close to zero, meaning no spatial autocorrelation.
Then, H0: I = 0 and H1: I ≠ 0.
A standardized matrix bounds I between -1 and 1.
-1 means perfect clustering of dissimilar values (perfect
dispersion).
0 is no autocorrelation (perfect randomness)
1 means perfect clustering of similar values (spatial
autocorrelation).
32. Spatial Lag (SAR)
1 OLS regression: Y = a + bX + ε
2 SAR (including W): Y = ρWY + a + bX + ε
3 Y = (I − ρW)⁻¹a + (I − ρW)⁻¹bX + (I − ρW)⁻¹ε
4 Where ρ is a scalar parameter that indicates the effect
of the dependent variable in the neighbors on Y in the
focal area, I is the identity matrix, the intercept is (I − ρW)⁻¹a,
the slope is (I − ρW)⁻¹b, and the error term is (I − ρW)⁻¹ε
5 GeoDa reports 3) and ρ
6 Omitting ρWY leads to biased estimates, and thus
inferences based on an OLS model will be incorrect
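The step from equation 2) to equation 3) can be spelled out as follows. Here $I$ denotes the n × n identity matrix, so $(I - \rho W)$ is the matrix counterpart of $1 - \rho W$, and the derivation assumes that $(I - \rho W)$ is invertible.

```latex
\begin{aligned}
Y &= \rho W Y + a + bX + \varepsilon \\
Y - \rho W Y &= a + bX + \varepsilon \\
(I - \rho W)\,Y &= a + bX + \varepsilon \\
Y &= (I - \rho W)^{-1} a + (I - \rho W)^{-1} bX + (I - \rho W)^{-1} \varepsilon
\end{aligned}
```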
33. Spatial Error Model (SEM)
1 OLS regression: Y = a + bX + ε
2 SEM (including W): Y = a + bX + ε and ε = λWε + µ
3 Y = a + bX + (I − λW)⁻¹µ
4 Where λ is the autoregressive coefficient, µ is
another error term, I is the identity matrix, the intercept is a,
the slope is b, and the error term is (I − λW)⁻¹µ
5 GeoDa reports 3) and λ
6 Omitting λWε leaves the estimates unbiased but gives
biased standard errors; consequently, t-tests and
p-values will be misleading.
34. Income Convergence
Robert Solow (1956): “Capital should flow from
countries with a high capital-to-output ratio to
countries with a low capital-to-output ratio.”
“Poor” countries/regions/states should have higher
growth rates.
“Rich” countries/regions/states should have lower
growth rates.
The analysis using regions is called Regional Income
Convergence.
35. Sigma Convergence, (σ- convergence)
It refers to the decreasing dispersion of a variable across regions over time.
This is measured by the coefficient of variation (CV),
which gives the standard deviation relative to the
mean (the standard deviation divided by the mean).
Since the CV is mean-standardized, it controls for
increasing averages over time and can be directly
compared across different variables.
When the CV of real per capita income across regions
falls over time, there is σ-convergence.
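The σ-convergence check can be sketched as below. The function names are hypothetical, and the use of the sample standard deviation (divisor n − 1) is an assumption, since the slides do not say which version is used.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (sample SD is an assumption)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("the CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


def shows_sigma_convergence(cv_by_period):
    """True when the CV falls from each period to the next."""
    return all(later < earlier
               for earlier, later in zip(cv_by_period, cv_by_period[1:]))


cv_example = coefficient_of_variation([1.0, 2.0, 3.0])  # SD 1, mean 2
# CVs of log income for 1929, 1945 and 1994 from the descriptive statistics
# table later in this deck.
converging = shows_sigma_convergence([0.06, 0.03, 0.01])
```

With the CVs from the descriptive statistics table (0.06, 0.03, 0.01), the check confirms a falling dispersion.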
36. Beta Convergence, (β- convergence)
It considers the mobility of countries (regions).
It is defined as a negative correlation between the
position of individual countries (regions) at the
beginning of an observation period and the changes or
growth rates over this period.
It assumes that growth from a low base is faster than
growth from high levels.
37. β- convergence
OLS regression model
LINC1i − LINC0i = a + βLINC0i + ε0i
Where:
LINC1i is the final(1) per capita income for region i
in logs.
LINC0i is the initial(0) per capita income for region i
in logs.
LINC1i − LINC0i is the growth rate between the
final year and the initial year.
L stands for logs
ε0i is an error term
38. β values
−1 < β < 0 (β ∈ ]−1, 0[) and significant means
β-convergence.
β > 0 (β ∈ ]0, +∞[) and significant means “divergence”
β = 0: neither “convergence” nor “divergence”
β not significant: neither “convergence” nor
“divergence”
39. Convergence rate (θ)
θ = ln(1 + β)/(−k)
Where
k is the number of years between the two periods (E.g. k = 1945 − 1929 = 16)
E.g. if β = −0.2 and k = 16:
θ = ln(−0.2 + 1)/(−16)
θ = ln(0.8)/(−16)
θ ≈ −0.2231/−16 ≈ 0.0139 or 1.4% (speed of
convergence).
This means that regions converge at a speed of 1.4
percent per year.
Note: ln(1) = 0 and ln(0) does not exist, so if β = −1,
θ does not exist, and if β = 0, θ = 0.
The logarithm is not defined for negative arguments either, so θ does not exist for β < −1.
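The convergence-rate formula can be wrapped in a small helper. This is a sketch with a made-up function name; it simply applies θ = ln(1 + β)/(−k) and refuses values of β for which the logarithm is undefined.

```python
import math


def convergence_rate(beta, years):
    """Speed of convergence: theta = ln(1 + beta) / (-years)."""
    if years <= 0:
        raise ValueError("years must be positive")
    if beta <= -1:
        raise ValueError("theta is undefined for beta <= -1")
    return math.log(1 + beta) / (-years)


# The example from this slide: beta = -0.2 over k = 16 years.
theta = convergence_rate(-0.2, 16)
theta_percent = round(100 * theta, 1)
```

For β = −0.2 and k = 16 the helper gives θ of roughly 0.0139, that is, about 1.4 percent per year.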
40. Rey and Montuori(1998) (R & M)
This is an article on regional income convergence
Their data include 48 states and four years
(1929, 1945, 1946, and 1994). They
included neither Hawaii nor Alaska.
Three periods: 1929-94, 1929-45, and 1946-94. For each
period, they run a cross-sectional analysis.
41. GeoDa
GeoDa is a free and open source software tool
for spatial data analysis.
You may download it from
http://geodacenter.github.io/download.html
Shapefiles (shp) are the most commonly used files.
A shapefile stores nontopological geometry and
attribute information for the spatial features in a data
set. It includes an ID variable to identify regions.
A shapefile actually consists of several files: the main geometry
file (shp), an index file (shx), a database table (dbf) and,
usually, a projection file (prj).
42. GeoDa
For your research paper, first you have to choose a
country and gather your data in Excel (or in an Open
Office spreadsheet).
Download Open Office from
https://www.openoffice.org/download/index.html
Later, look for a shapefile of the regions of that
country. This link is helpful:
http://www.gadm.org/country
Open that shapefile in GeoDa.
Create the variables that you will use: Table/Add
variable and set integer, 10 length, and 3 decimals.
GeoDa will create empty columns.
Click “Table” and select (if you need to do it) the
regions that you will use. Do not include isolated
regions such as islands.
43. GeoDa
Save as a new shapefile (create a new directory).
GeoDa automatically creates the dbf, shx, and prj files.
To add your data to the new shapefile, you have to
open the new dbf file using Open Office
(spreadsheet/international/OK).
Check the correspondence between the regions of the
new dbf file and the regions of your data. The order of
your data has to be the same as the order of the new dbf
file.
Copy your data and paste it into the new dbf file. Fill
the empty columns.
Save (Keep the current format)
44. GeoDa: US states
Open the new shapefile with your gathered data
The variables that I have gathered are:
INC29: 1929 real per capita income
INC45: 1945 real per capita income
INC46: 1946 real per capita income
INC94: 1994 real per capita income
Three different sample periods 1929-94, 1929-45, and,
1946-94
The initial year is 1929, the break year is 1945, and the
final year is 1994.
45. GeoDa: Calculating variables
Creating variables in logs: Table/Add Variables
LINC29, LINC45, LINC46, and LINC94. GeoDa
will create empty columns.
Table/Variable calculation/univariate: set “LINC29”,
operator “log (base e)” and variable “INC29”. Do the
same for the other variables.
Creating growth rates: Table/Add Variables
dI94I29, dI45I29, and dI94I46. GeoDa will create
empty columns.
Table/Variable calculation/bivariate: set “dI94I29”,
variable “LINC94”, operator “subtract”, variable
“LINC29”. Do the same for the other growth rates.
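For readers who want to cross-check the GeoDa calculations outside the program, the same two steps (natural logs, then the log difference) can be sketched in Python. The region names and income figures below are invented for the illustration and are not the data used in this deck.

```python
import math

# Hypothetical incomes for two made-up regions (not the deck's data).
regions = {
    "Region A": {"INC29": 500.0, "INC94": 20000.0},
    "Region B": {"INC29": 900.0, "INC94": 22000.0},
}


def log_growth(initial_income, final_income):
    """Return (log initial income, log final income, log difference)."""
    log_initial = math.log(initial_income)   # "log (base e)" in GeoDa
    log_final = math.log(final_income)
    return log_initial, log_final, log_final - log_initial  # "subtract"


for row in regions.values():
    row["LINC29"], row["LINC94"], row["dI94I29"] = log_growth(
        row["INC29"], row["INC94"]
    )
```

In this made-up example the initially poorer region has the larger log difference, which is the pattern β-convergence predicts.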
46. GeoDa US states - Descriptive Statistics
Click on Explore/Boxplot
47. US states: Exploring β-convergence
Explore/
Scatter plot/
1994-29
Y: “dI94I29”, X: “LINC29”, OK
1945-29
Y: “dI45I29”, X: “LINC29”, OK
1994-46
Y: “dI94I46”, X: “LINC46”, OK
You should get negative relationships
48. US states: Exploring β- convergence
[Figure: three scatter plots, panels 1994-29, 1945-29 and 1994-46. X: initial income; Y: growth rate.]
At first glance, β-convergence holds.
49. GeoDa: Exploring Spatial Dependence
Map/Quantile Map/5 to check if there are spatial
patterns. Do you find any?
[Figure: quantile maps of 1929, 1945 and 1994 per capita income.]
50. GeoDa: Creating W and Moran scatterplots
Spatial Matrix
Tools/Weights manager/create
Weights File ID variable: “GEOID”.
Your shapefile must have one ID variable
Queen Contiguity
Create/Save
Moran’s I
Space/
Univariate Moran’s I/
Set the variable you want to analyze/
Set W/Queen
The scatterplot enables you to assess how similar a
spatial unit is to its neighbors.
51. GeoDa: Moran scatterplot- state per capita income
X: the value of the variable for each spatial unit; Y: the weighted
average or spatial lag of the corresponding observation on the X axis.
[Figure: Moran scatterplots for 1929, 1945 and 1994.]
They show spatial dependence because there is a
positive correlation (See page 146, R & M)
52. GeoDa US states - Exploring data
The CV of real per capita income in logs across US
states falls over time, so σ-convergence holds
According to Moran’s I, data shows spatial
dependence.
Table: Descriptive Statistics
          Mean    Median    SD      CV      Moran’s I
LINC29    6.35    6.38      0.38    0.06    0.65
LINC45    7.02    7.03      0.23    0.03    0.57
LINC94    9.96    9.95      0.13    0.01    0.35
53. GeoDa: OLS regression
See slide 37 and R & M page 148, equation 4
Click on Regression
Dependent variable/ growth rate (E.g. dI94I29 )
Independent variable/ initial income (E.g. LINC29 )
Weights file/
Classic: This will run classical OLS regression with
spatial dependence diagnostics, click Run.
Three regressions:
1 dI94I29i = a + βLINC29i + ε29i
2 dI45I29i = a + βLINC29i + ε29i
3 dI94I46i = a + βLINC46i + ε46i
Where i = 1, 2, 3, ..., 48
54. GeoDa: OLS regression-Outcomes: 1994-29
SUMMARY OF OUTPUT: ORDINARY LEAST SQUARES ESTIMATION
Dependent Variable : DIN94IN29 Number of Observations: 48
Mean dependent var : 3.61054 Number of Variables : 2
S.D. dependent var : 0.284673 Degrees of Freedom : 46
R-squared : 0.918195 F-statistic : 516.314
Adjusted R-squared : 0.916417 Prob(F-statistic) : 1.20184e-26
Sum squared residual: 0.318208 Log likelihood : 52.281
Sigma-square : 0.00691757 Akaike info criterion : -100.562
S.E. of regression : 0.0831719 Schwarz criterion : -96.8195
Sigma-square ML : 0.00662934
S.E of regression ML: 0.0814207
-----------------------------------------------------------------------------
Variable      Coefficient    Std.Error    t-Statistic    Probability
-----------------------------------------------------------------------------
CONSTANT          8.25684     0.204832        40.3104        0.00000
LINC29          -0.732026    0.0322159       -22.7225        0.00000
-----------------------------------------------------------------------------
REGRESSION DIAGNOSTICS
MULTICOLLINEARITY CONDITION NUMBER 34.095520 (Extreme Multicollinearity)
TEST ON NORMALITY OF ERRORS
TEST DF VALUE PROB
Jarque-Bera 2 1.0399 0.59456
DIAGNOSTICS FOR HETEROSKEDASTICITY
RANDOM COEFFICIENTS
TEST DF VALUE PROB
Breusch-Pagan test 1 0.0012 0.97181
Koenker-Bassett test 1 0.0013 0.97079
DIAGNOSTICS FOR SPATIAL DEPENDENCE FOR WEIGHT MATRIX : nuevoq
(row-standardized weights)
TEST MI/DF VALUE PROB
Moran's I (error) 0.1509 1.9658 0.04932
Lagrange Multiplier (lag) 1 3.5538 0.05941
Robust LM (lag) 1 1.9997 0.15733
Lagrange Multiplier (error) 1 2.1903 0.13888
Robust LM (error) 1 0.6362 0.42511
Lagrange Multiplier (SARMA) 2 4.1900 0.12307
============================== END OF REPORT ================================
55. GeoDa: OLS regression-Outcomes: 1945-29
SUMMARY OF OUTPUT: ORDINARY LEAST SQUARES ESTIMATION
Dependent Variable : DIN45IN29 Number of Observations: 48
Mean dependent var : 0.674542 Number of Variables : 2
S.D. dependent var : 0.17481 Degrees of Freedom : 46
R-squared : 0.831930 F-statistic : 227.696
Adjusted R-squared : 0.828276 Prob(F-statistic) : 1.96164e-19
Sum squared residual: 0.246528 Log likelihood : 58.4065
Sigma-square : 0.0053593 Akaike info criterion : -112.813
S.E. of regression : 0.0732072 Schwarz criterion : -109.071
Sigma-square ML : 0.00513599
S.E of regression ML: 0.0716658
-----------------------------------------------------------------------------
Variable      Coefficient    Std.Error    t-Statistic    Probability
-----------------------------------------------------------------------------
CONSTANT          3.39038     0.180291        18.8051        0.00000
LINC29          -0.427882    0.0283561       -15.0896        0.00000
-----------------------------------------------------------------------------
REGRESSION DIAGNOSTICS
MULTICOLLINEARITY CONDITION NUMBER 34.095520
TEST ON NORMALITY OF ERRORS
TEST DF VALUE PROB
Jarque-Bera 2 0.2160 0.89762
DIAGNOSTICS FOR HETEROSKEDASTICITY
RANDOM COEFFICIENTS
TEST DF VALUE PROB
Breusch-Pagan test 1 1.9185 0.16602
Koenker-Bassett test 1 2.2058 0.13749
SPECIFICATION ROBUST TEST
TEST DF VALUE PROB
White 2 2.3107 0.31495
DIAGNOSTICS FOR SPATIAL DEPENDENCE FOR WEIGHT MATRIX : nuevoq
(row-standardized weights)
TEST MI/DF VALUE PROB
Moran's I (error) 0.3815 4.3930 0.00001
Lagrange Multiplier (lag) 1 11.0500 0.00089
Robust LM (lag) 1 2.3441 0.12576
Lagrange Multiplier (error) 1 14.0018 0.00018
Robust LM (error) 1 5.2958 0.02138
Lagrange Multiplier (SARMA) 2 16.3459 0.00028
============================== END OF REPORT ================================
56. GeoDa: OLS regression-Outcomes: 1994-46
SUMMARY OF OUTPUT: ORDINARY LEAST SQUARES ESTIMATION
Data set : nuevo
Dependent Variable : DIN94IN46 Number of Observations: 48
Mean dependent var : 2.91979 Number of Variables : 2
S.D. dependent var : 0.163036 Degrees of Freedom : 46
R-squared : 0.727570 F-statistic : 122.85
Adjusted R-squared : 0.721647 Prob(F-statistic) : 1.39578e-14
Sum squared residual: 0.347588 Log likelihood : 50.1615
Sigma-square : 0.00755626 Akaike info criterion : -96.3229
S.E. of regression : 0.0869268 Schwarz criterion : -92.5805
Sigma-square ML : 0.00724142
S.E of regression ML: 0.0850965
-----------------------------------------------------------------------------
Variable      Coefficient    Std.Error    t-Statistic    Probability
-----------------------------------------------------------------------------
CONSTANT          7.07005     0.374654        18.8709        0.00000
LINC46          -0.589693    0.0532032       -11.0838        0.00000
-----------------------------------------------------------------------------
REGRESSION DIAGNOSTICS
MULTICOLLINEARITY CONDITION NUMBER 59.704369
TEST ON NORMALITY OF ERRORS
TEST DF VALUE PROB
Jarque-Bera 2 0.5390 0.76376
DIAGNOSTICS FOR HETEROSKEDASTICITY
RANDOM COEFFICIENTS
TEST DF VALUE PROB
Breusch-Pagan test 1 1.5535 0.21262
Koenker-Bassett test 1 1.4503 0.22849
SPECIFICATION ROBUST TEST
TEST DF VALUE PROB
White 2 1.6639 0.43519
DIAGNOSTICS FOR SPATIAL DEPENDENCE
FOR WEIGHT MATRIX : nuevoq (row-standardized weights)
TEST MI/DF VALUE PROB
Moran's I (error) 0.3141 3.6646 0.00025
Lagrange Multiplier (lag) 1 10.4680 0.00121
Robust LM (lag) 1 2.5955 0.10717
Lagrange Multiplier (error) 1 9.4918 0.00206
Robust LM (error) 1 1.6193 0.20319
Lagrange Multiplier (SARMA) 2 12.0873 0.00237
============================== END OF REPORT ================================
57. GeoDa: OLS regression-Outcomes
With these outputs, you should be able to complete R
& M Table 2
You may find R²s, AICs (Akaike Information
Criterion), βs, and p-values.
Convergence rate (θ) is calculated using β (See slide 39)
Tests for spatial dependence: Robust LM (lag and
error) and Moran’s I (error).
Breusch-Pagan Test (test for Heteroskedasticity).
AIC: Value for model selection
59. GeoDa: Diagnostic Tests
Heteroskedasticity: When regression errors do not
have a constant variance over all observations.
Breusch-Pagan Test:
H0: homoskedasticity ; H1: heteroskedasticity
Multicollinearity: High correlation between Xs
Condition number > 30 is considered suspect
Condition number = 1 means a lack of multicollinearity
Non-normal errors: Most regression models assume
normally distributed errors
Jarque-Bera Test:
H0: normal errors ; H1: non-normal errors
AIC: Calculate AIC for each model with the same
data set, and the “best” model is the one with the
minimum AIC value.
If the p-value is greater than 0.1, you cannot reject H0
60. GeoDa: OLS vs SAR & SEM
GeoDa reports Moran’s I (error), LM (lag), LM (error),
Robust LM (lag), and, Robust LM (error)
Moran’s I (error) is an extension of Moran’s I statistic
to measure spatial autocorrelation in regression
models. It is useful for detecting spatial dependence, but
it does not allow you to discriminate between SAR and
SEM.
H0: OLS ; H1: Spatial dependence
LM (error): H0: OLS ; H1: SEM
LM (lag): H0: OLS ; H1: SAR
If LMs are significant (H0 is rejected), focus on robust
tests.
If the p-value is greater than 0.1, you cannot reject H0
61. GeoDa: OLS vs SAR & SEM
Robust LM (error): H0: OLS ; H1: SEM
Robust LM (lag): H0: OLS ; H1: SAR
If both robust tests are significant, choose the model whose
test is more significant (smaller p-value).
If the p-value is greater than 0.1, you cannot reject H0.
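The decision rule on the last two slides can be written as a small function. This is a sketch under stated assumptions, not GeoDa's own logic: the function name and arguments are hypothetical, the 0.1 threshold follows the slides, and two branches are assumptions not spelled out in the slides (when only one LM test is significant, that model is chosen; when the robust tests tie in significance status, the smaller p-value wins).

```python
def choose_model(p_lm_lag: float, p_lm_error: float,
                 p_rlm_lag: float, p_rlm_error: float,
                 alpha: float = 0.10) -> str:
    """Return "OLS", "SAR", or "SEM" from LM and robust LM p-values."""
    lag_sig = p_lm_lag <= alpha
    err_sig = p_lm_error <= alpha

    if not lag_sig and not err_sig:
        return "OLS"   # no evidence of spatial dependence
    if lag_sig and not err_sig:
        return "SAR"   # assumption: a single significant LM test decides
    if err_sig and not lag_sig:
        return "SEM"   # assumption: a single significant LM test decides

    # Both LM tests reject H0: look at the robust versions.
    rlag_sig = p_rlm_lag <= alpha
    rerr_sig = p_rlm_error <= alpha
    if rlag_sig and not rerr_sig:
        return "SAR"
    if rerr_sig and not rlag_sig:
        return "SEM"
    # Both (or neither) robust tests significant: take the smaller p-value.
    return "SAR" if p_rlm_lag < p_rlm_error else "SEM"


# Hypothetical p-values, for illustration only.
print(choose_model(0.40, 0.35, 0.50, 0.60))
print(choose_model(0.01, 0.001, 0.45, 0.02))
```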
62. Interpretation of OLS outcomes
The results provide strong support for β-convergence.
Coefficients are highly significant and negative.
R² is above 0.7 in all three samples.
The convergence rate is 2% per year over the entire sample,
3.5% in the first sub-sample, and 2% in the second.
Moran’s I statistic (MI) provides very strong evidence
(see p-value) of spatial dependence.
Robust tests point to the presence of spatial error
(SEM) rather than spatial lag (SAR).
The Breusch-Pagan test for heteroskedasticity is not
significant in any of the sub-samples, so we omit
further consideration of the spatial heterogeneity
models.
63. GeoDa : SAR
See slide 32 and R & M page 150, equation 9
Click on Regression
Dependent variable/ growth rate (E.g. dI94I29 )
Independent variable/ initial income (E.g. LINC29 )
Weights file/
Spatial lag
Three regressions:
1 dI94I29_i = a + ρ (W dI94I29)_i + β LINC29_i + ε29_i
2 dI45I29_i = a + ρ (W dI45I29)_i + β LINC29_i + ε29_i
3 dI94I46_i = a + ρ (W dI94I46)_i + β LINC46_i + ε46_i
where i = 1, 2, 3, ..., 48
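The spatially lagged term in these regressions is, for each region, a weighted average of its neighbors' growth rates. A minimal sketch, assuming a row-standardized contiguity matrix; the four-region layout and the growth values are hypothetical and serve only to show the mechanics.

```python
def row_standardize(w):
    """Scale each row of a weights matrix so that it sums to 1."""
    out = []
    for row in w:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def spatial_lag(w, y):
    """Compute W y: the weighted average of neighbors' values per region."""
    return [sum(wij * yj for wij, yj in zip(row, y)) for row in w]


# Hypothetical contiguity for four regions arranged in a line: 1-2-3-4.
contiguity = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
W = row_standardize(contiguity)

growth = [0.9, 0.7, 0.5, 0.3]  # hypothetical growth rates
lagged = spatial_lag(W, growth)
print(lagged)
```

With row standardization, a region with a single neighbor simply inherits that neighbor's value, while interior regions receive the mean of their neighbors.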
64. GeoDa: SAR -Outcomes: 1994-29
SUMMARY OF OUTPUT: SPATIAL LAG MODEL - MAXIMUM LIKELIHOOD ESTIMATION
Data set : nuevo
Spatial Weight : nuevoq
Dependent Variable : DIN94IN29 Number of Observations: 48
Mean dependent var : 3.61054 Number of Variables : 3
S.D. dependent var : 0.284673 Degrees of Freedom : 45
Lag coeff. (Rho) : 0.153427
R-squared : 0.924712 Log likelihood : 54.1372
Sq. Correlation : - Akaike info criterion : -102.274
Sigma-square : 0.00610122 Schwarz criterion : -96.6607
S.E of regression : 0.0781103
---------------------------------------------------------------------
Variable        Coefficient     Std.Error       z-value   Probability
---------------------------------------------------------------------
W_DIN94IN29        0.153427     0.0776567       1.97571       0.04819
CONSTANT            7.21331      0.560658       12.8658       0.00000
LINC29            -0.655089     0.0491648      -13.3243       0.00000
---------------------------------------------------------------------
REGRESSION DIAGNOSTICS
DIAGNOSTICS FOR HETEROSKEDASTICITY
RANDOM COEFFICIENTS
TEST DF VALUE PROB
Breusch-Pagan test 1 1.1563 0.28223
DIAGNOSTICS FOR SPATIAL DEPENDENCE
SPATIAL LAG DEPENDENCE FOR WEIGHT MATRIX : nuevoq
TEST DF VALUE PROB
Likelihood Ratio Test 1 3.7124 0.05401
============================== END OF REPORT ================================
65. GeoDa: SAR-Outcomes: 1945-29
SUMMARY OF OUTPUT: SPATIAL LAG MODEL - MAXIMUM LIKELIHOOD ESTIMATION
Data set : nuevo
Spatial Weight : nuevoq
Dependent Variable : DIN45IN29 Number of Observations: 48
Mean dependent var : 0.674542 Number of Variables : 3
S.D. dependent var : 0.17481 Degrees of Freedom : 45
Lag coeff. (Rho) : 0.295355
R-squared : 0.865881 Log likelihood : 63.2943
Sq. Correlation : - Akaike info criterion : -120.589
Sigma-square : 0.00409851 Schwarz criterion : -114.975
S.E of regression : 0.0640196
---------------------------------------------------------------------
Variable        Coefficient     Std.Error       z-value   Probability
---------------------------------------------------------------------
W_DIN45IN29        0.295355     0.0974027        3.0323       0.00243
CONSTANT            2.64263      0.300699       8.78829       0.00000
LINC29            -0.341486     0.0388813      -8.78278       0.00000
---------------------------------------------------------------------
REGRESSION DIAGNOSTICS
DIAGNOSTICS FOR HETEROSKEDASTICITY
RANDOM COEFFICIENTS
TEST DF VALUE PROB
Breusch-Pagan test 1 3.4138 0.06465
DIAGNOSTICS FOR SPATIAL DEPENDENCE
SPATIAL LAG DEPENDENCE FOR WEIGHT MATRIX : nuevoq
TEST DF VALUE PROB
Likelihood Ratio Test 1 9.7755 0.00177
============================== END OF REPORT ================================
66. GeoDa:SAR-Outcomes: 1994-46
SUMMARY OF OUTPUT: SPATIAL LAG MODEL - MAXIMUM LIKELIHOOD ESTIMATION
Data set : nuevo
Spatial Weight : nuevoq
Dependent Variable : DIN94IN46 Number of Observations: 48
Mean dependent var : 2.91979 Number of Variables : 3
S.D. dependent var : 0.163036 Degrees of Freedom : 45
Lag coeff. (Rho) : 0.350822
R-squared : 0.783057 Log likelihood : 54.8668
Sq. Correlation : - Akaike info criterion : -103.734
Sigma-square : 0.00576651 Schwarz criterion : -98.1199
S.E of regression : 0.0759375
---------------------------------------------------------------------
Variable        Coefficient     Std.Error       z-value   Probability
---------------------------------------------------------------------
W_DIN94IN46        0.350822       0.11406       3.07577       0.00210
CONSTANT            5.02565      0.731279       6.87241       0.00000
LINC46            -0.445107     0.0651113       -6.8361       0.00000
---------------------------------------------------------------------
REGRESSION DIAGNOSTICS
DIAGNOSTICS FOR HETEROSKEDASTICITY
RANDOM COEFFICIENTS
TEST DF VALUE PROB
Breusch-Pagan test 1 4.2002 0.04042
DIAGNOSTICS FOR SPATIAL DEPENDENCE
SPATIAL LAG DEPENDENCE FOR WEIGHT MATRIX : nuevoq
TEST DF VALUE PROB
Likelihood Ratio Test 1 9.4106 0.00216
============================== END OF REPORT ================================
67. GeoDa : SEM
See slide 33 and R & M page 149, equation 8
Click on Regression
Dependent variable/ growth rate (E.g. dI94I29 )
Independent variable/ initial income (E.g. LINC29 )
Weights file/
Spatial error
Three regressions:
1 dI94I29_i = a + β LINC29_i + ε29_i ; ε29_i = λ (W ε29)_i + µ29_i
2 dI45I29_i = a + β LINC29_i + ε29_i ; ε29_i = λ (W ε29)_i + µ29_i
3 dI94I46_i = a + β LINC46_i + ε46_i ; ε46_i = λ (W ε46)_i + µ46_i
where i = 1, 2, 3, ..., 48
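For reference, the SEM specification can be stacked into matrix form, which makes explicit how the spatially correlated error enters the growth equation. This is a standard algebraic rearrangement that assumes I − λW is invertible; y, x, and ι (a vector of ones) are notation introduced here for the stacked growth rates, log initial incomes, and the intercept column.

```latex
\begin{aligned}
y &= a\,\iota + \beta x + \varepsilon, \qquad \varepsilon = \lambda W \varepsilon + \mu \\
(I - \lambda W)\,\varepsilon &= \mu \;\Longrightarrow\; \varepsilon = (I - \lambda W)^{-1}\mu \\
y &= a\,\iota + \beta x + (I - \lambda W)^{-1}\mu
\end{aligned}
```

With λ = 0 the error term reduces to µ and the model collapses to the OLS specification.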
68. GeoDa: SEM: 1994-29
SUMMARY OF OUTPUT: SPATIAL ERROR MODEL - MAXIMUM LIKELIHOOD ESTIMATION
Data set : nuevo
Spatial Weight : nuevoq
Dependent Variable : DIN94IN29 Number of Observations: 48
Mean dependent var : 3.610542 Number of Variables : 2
S.D. dependent var : 0.284673 Degrees of Freedom : 46
Lag coeff. (Lambda) : 0.254318
R-squared : 0.922600 R-squared (BUSE) : -
Sq. Correlation : - Log likelihood : 53.223613
Sigma-square : 0.00627236 Akaike info criterion : -102.447
S.E of regression : 0.0791982 Schwarz criterion : -98.7048
---------------------------------------------------------------------
Variable        Coefficient     Std.Error       z-value   Probability
---------------------------------------------------------------------
CONSTANT            8.15606      0.231367       35.2516       0.00000
LINC29            -0.716327     0.0363589      -19.7016       0.00000
LAMBDA             0.254318      0.182314       1.39494       0.16303
---------------------------------------------------------------------
REGRESSION DIAGNOSTICS
DIAGNOSTICS FOR HETEROSKEDASTICITY
RANDOM COEFFICIENTS
TEST DF VALUE PROB
Breusch-Pagan test 1 0.2873 0.59194
DIAGNOSTICS FOR SPATIAL DEPENDENCE
SPATIAL ERROR DEPENDENCE FOR WEIGHT MATRIX : nuevoq
TEST DF VALUE PROB
Likelihood Ratio Test 1 1.8853 0.16973
============================== END OF REPORT ================================
69. GeoDa: SEM: 1945-29
SUMMARY OF OUTPUT: SPATIAL ERROR MODEL - MAXIMUM LIKELIHOOD ESTIMATION
Data set : nuevo
Spatial Weight : nuevoq
Dependent Variable : DIN45IN29 Number of Observations: 48
Mean dependent var : 0.674542 Number of Variables : 2
S.D. dependent var : 0.174810 Degrees of Freedom : 46
Lag coeff. (Lambda) : 0.580250
R-squared : 0.883041 R-squared (BUSE) : -
Sq. Correlation : - Log likelihood : 64.766993
Sigma-square : 0.0035741 Akaike info criterion : -125.534
S.E of regression : 0.0597838 Schwarz criterion : -121.792
---------------------------------------------------------------------
Variable        Coefficient     Std.Error       z-value   Probability
---------------------------------------------------------------------
CONSTANT            3.21562      0.216223       14.8718       0.00000
LINC29             -0.39988     0.0338626      -11.8089       0.00000
LAMBDA              0.58025      0.131908        4.3989       0.00001
---------------------------------------------------------------------
REGRESSION DIAGNOSTICS
DIAGNOSTICS FOR HETEROSKEDASTICITY
RANDOM COEFFICIENTS
TEST DF VALUE PROB
Breusch-Pagan test 1 3.0394 0.08127
DIAGNOSTICS FOR SPATIAL DEPENDENCE
SPATIAL ERROR DEPENDENCE FOR WEIGHT MATRIX : nuevoq
TEST DF VALUE PROB
Likelihood Ratio Test 1 12.7210 0.00036
============================== END OF REPORT ================================
70. GeoDa:SEM: 1994-46
SUMMARY OF OUTPUT: SPATIAL ERROR MODEL - MAXIMUM LIKELIHOOD ESTIMATION
Data set : nuevo
Spatial Weight : nuevoq
Dependent Variable : DIN94IN46 Number of Observations: 48
Mean dependent var : 2.919792 Number of Variables : 2
S.D. dependent var : 0.163036 Degrees of Freedom : 46
Lag coeff. (Lambda) : 0.433936
R-squared : 0.776745 R-squared (BUSE) : -
Sq. Correlation : - Log likelihood : 53.732096
Sigma-square : 0.00593429 Akaike info criterion : -103.464
S.E of regression : 0.0770343 Schwarz criterion : -99.7218
---------------------------------------------------------------------
Variable        Coefficient     Std.Error       z-value   Probability
---------------------------------------------------------------------
CONSTANT            6.64014      0.432628       15.3484       0.00000
LINC46            -0.529052     0.0613691      -8.62082       0.00000
LAMBDA             0.433936      0.157919       2.74785       0.00600
---------------------------------------------------------------------
REGRESSION DIAGNOSTICS
DIAGNOSTICS FOR HETEROSKEDASTICITY
RANDOM COEFFICIENTS
TEST DF VALUE PROB
Breusch-Pagan test 1 1.5605 0.21160
DIAGNOSTICS FOR SPATIAL DEPENDENCE
SPATIAL ERROR DEPENDENCE FOR WEIGHT MATRIX : nuevoq
TEST DF VALUE PROB
Likelihood Ratio Test 1 7.1413 0.00753
============================== END OF REPORT ================================
71. GeoDa: SAR & SEM-Outcomes
With these outputs, you should be able to complete R
& M Table 3
You will find the R²s, AICs, βs, λs, ρs, and p-values.
The convergence rate (θ) is calculated from β (see slide 39).
72. Reporting SAR & SEM outcomes
Table: Spatial Dependence Models
Model specification    AIC         β (p-value)       λ, ρ p-value    LM test p-value
1929-94
Spatial error (ML)     -102.447    -0.716 (0.000)    0.163           0.169
Spatial lag (ML)       -102.274    -0.655 (0.000)    0.048           0.054
1929-45
Spatial error (ML)     -125.534    -0.400 (0.000)    0.000           0.000
Spatial lag (ML)       -120.589    -0.341 (0.000)    0.002           0.002
1946-94
Spatial error (ML)     -103.464    -0.529 (0.000)    0.006           0.008
Spatial lag (ML)       -103.734    -0.445 (0.000)    0.002           0.002
Convergence rate (θ) based on the spatial error (ML) estimates
Period     θ
1929-94    0.019
1929-45    0.032
1946-94    0.016
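The θ values can be reproduced from the SEM β estimates. The sketch below assumes that the formula referenced on slide 39 is θ = −ln(1 + β) / T, where T is the number of years in the period and the dependent variable is total log growth over the period; that formula and the function name are assumptions here, while the β values come from the SEM reports in this deck.

```python
import math


def convergence_rate(beta: float, years: int) -> float:
    """Annual convergence rate implied by beta (assumed formula)."""
    return -math.log(1.0 + beta) / years


# (beta from the SEM reports, length of the period in years)
sem_estimates = {
    "1929-94": (-0.716327, 1994 - 1929),
    "1929-45": (-0.39988, 1945 - 1929),
    "1946-94": (-0.529052, 1994 - 1946),
}

rates = {period: convergence_rate(beta, years)
         for period, (beta, years) in sem_estimates.items()}

for period, theta in rates.items():
    print(f"{period}: theta = {theta:.3f}")
```

Under this assumed formula, the three rates round to the values reported in the table.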
73. Interpretation of SEM & SAR outcomes
For SEM, as expected, the AIC indicates that the fit of
each of the three spatial models is superior to OLS.
The βs are significant and negative, but they differ from the OLS
coefficients.
OLS suffers from misspecification due to omitted
spatial dependence.
The coefficients on the error (λ) and lag (ρ) terms are
significant in both sub-samples. For the full sample, only
ρ is significant.
The LM test indicates that no spatial dependence
remains in SAR and SEM.
Including spatial dependence reduces the convergence
rates (θs).
The convergence rate is 1.9% per year over the entire sample,
3.2% in the first sub-sample, and 1.6% in the second.
74. Research paper
Choose a country (e.g. Canada, South Korea,
Portugal, Mexico, etc.), find historical data on the
real per capita GDP (or personal income, or GSP (Gross
State Product)) of its
states/provinces/municipalities/regions, and carry out an
income convergence analysis of them (β-convergence
and σ-convergence).
In addition, choose an event that was
important for the economy of that country (e.g. a new
constitution, an improvement in the power system, a
strong devaluation of its currency, a natural disaster, a
war, etc.), so that you can choose "four years" as
Rey and Montouri did.
75. Research paper
Sections: introduction, literature, methodology (data),
outcomes (with explanations), and conclusions.
Your scholarly work must report:
Exploring β-convergence (Slide 48).
Exploring spatial dependence (Slide 49).
Moran scatterplots (Slide 51).
Exploring data: σ-convergence & Moran’s I (Slide
52).
OLS outcomes, convergence rates, & spatial diagnostic
tests (Table slide 58).
SAR & SEM outcomes and convergence rates (Table
slide 72).
76. References
1 Anselin, L. (2005). Exploring Spatial Data with
GeoDa™: A Workbook.
2 LeSage, J. and Pace, R. K. (2009). Introduction to
Spatial Econometrics. Taylor & Francis, Boca Raton.
3 Rey, S. J. and Montouri, B. D. (1999). “US Regional
Income Convergence: A Spatial Econometric
Perspective”, Regional Studies, 33, 143-156.
4 Solow, R. M. (1956). “A Contribution to the Theory of
Economic Growth”, The Quarterly Journal of
Economics, 70(1), 65-94.