来源:https://geodacenter.github.io/GeoDaSpace/
本文将主要介绍使用空间计量Python、Stata、R软件来复制实现
Technical Aspects of Implementing GMM Estimation of the Spatial Error Model in PySAL and GeoDaSpace∗ by Luc Anselin, Pedro V. Amaral and Daniel Arribas-Bel中空间误差模型的GMM估计的技术方面的论文结果。
1
Stata代码汇总
代码为
/* This routine assumes the following files are in the current Stata working directory: */
/* - NAT.shp */
/* - NAT.dbf */
/* Create row-standardized Queen weights matrix */
shp2dta using NAT, database(nat) coordinates(natxy) genid(ids) gencentroids(nat_c) replace
use nat
spmat contiguity w using natxy, norm(row) id(ids) replace
/* Table 2 - Spatial error model with exogenous variables and homoskedasticity (GM_Error_Hom): */
spivreg HR90 RD90 UE90, el(w) id(ids)
/* Table 3 - Spatial error model with endogenous variables and homoskedasticity (GM_Endog_Error_Hom): */
spivreg HR90 RD90 (UE90 = FP89), el(w) id(ids)
/* Table 4 - Combo model with homoskedasticity (GM_Combo_Hom): */
spivreg HR90 RD90 UE90, dl(w) el(w) id(ids)
/* Table 5 - Spatial error model with exogenous variables and heteroskedasticity (GM_Error_Het): */
spivreg HR90 RD90 UE90, el(w) id(ids) het
/* Table 6 - Spatial error model with endogenous variables and heteroskedasticity (GM_Endog_Error_Het): */
spivreg HR90 RD90 (UE90 = FP89), el(w) id(ids) het
/* Table 7 - Combo model with heteroskedasticity (GM_Combo_Het): */
spivreg HR90 RD90 UE90, el(w) dl(w) id(ids) het
2
Python部分
首先导入相关库,代码为
import numpy as np
import pysal as ps
然后打开数据库并创建变量作为numpy数组,代码为:
db = ps.open('nat.dbf','r')
其中,为dbf格式数据,r表示打开方式为可读
Regression 1: use all defaults, i.e., A1='het' changed by LA to A1='hom_sc' as in Drukker
相关代码为:
reg1 = ps.spreg.GM_Error_Hom(y,x,w,name_y='HR90',name_x = x_names, name_w = 'nat_queen.gal',name_ds='NAT')
然后打印上述结果
相关代码为:
结果为:
Regression 1a: use A1 matrix as in Anselin with A1='het'
相关代码为:
reg1a = ps.spreg.GM_Error_Hom(y,x,w,A1='het',name_y='HR90',name_x = x_names, name_w = 'nat_queen.gal',name_ds='NAT')
print reg1a.summary
结果为:
Regression 1b: use A1 matrix as in Anselin, without scaling factor, A1='hom'
相关代码为:
reg1b = ps.spreg.GM_Error_Hom(y,x,w,A1='hom',name_y='HR90',name_x = x_names, name_w = 'nat_queen.gal',name_ds='NAT')
print reg1b.summary
结果为:
Regression 1c: use A1 with scaling factor, but force estimation as 2SLS (as in Stata)
Need to use base class to force passing a constant term, since the user class puts constant in x
相关代码为:
ones = np.ones(y.shape)
reg1c = ps.spreg.error_sp_hom.BaseGM_Endog_Error_Hom(y,ones,yend=x, q=x, w=w, A1='hom_sc')
print reg1c.betas
print map(np.sqrt, reg1c.vm.diagonal())
结果为:
1.2 Endogenous
建立新的外生、内生和工具变量
相关代码为:
xex = x[:,0]
xex.shape = (len(hr90),1)
yend = x[:,1]
yend.shape = (len(hr90),1)
fp89 = db.by_col("FP89")
q = array(fp89)
q.shape = (len(fp89),1)
reg2 = ps.spreg.GM_Endog_Error_Hom(y,xex,yend,q,w,name_y="HR90",name_x=["RD90"],name_yend=["UE90"],name_q=["FP89"],\
name_w="natqueen.gal",name_ds="nat.shp")
print reg2.summary
Error hom model with endogeneity -- A1 = 'het'
reg2a = ps.spreg.GM_Endog_Error_Hom(y,xex,yend,q,w,A1='het',name_y="HR90",name_x=["RD90"],name_yend=["UE90"],name_q=["FP89"],\
name_w="natqueen.gal"
,name_ds="nat.shp")
print reg2a.summary
结果为:
1.3 Combo Model¶
Error hom model -- combo -- instrument lag order set to 2
相关代码为:
reg3 = ps.spreg.GM_Combo_Hom(y,x,w=w,w_lags=2,name_y='HR90',name_x = x_names, name_w = 'nat_queen.gal',name_ds='NAT')
print reg3.summary
reg3a = ps.spreg.GM_Combo_Hom(y,x,w=w,name_y='HR90',name_x = x_names, name_w = 'nat_queen.gal',name_ds='NAT')
print reg3a.summary
reg3b = ps.spreg.GM_Combo_Hom(y,x,w=w,w_lags=2,A1='het',name_y='HR90',name_x = x_names, name_w = 'nat_queen.gal',name_ds='NAT')
print reg3b.summary
结果为:
2 Heteroskedasticity¶
2.1 Exogenous only¶
相关代码为:
reg4 = ps.spreg.GM_Error_Het(y,x,w,name_y="HR90",name_x=x_names,name_w="natqueen.gal",name_ds="nat.shp")
print reg4.summary
Step1c to True as in Arraiz et al (same as sphet1)
reg4a = ps.spreg.GM_Error_Het(y,x,w,step1c=True,name_y="HR90"
,name_x=x_names,name_w="natqueen.gal",name_ds="nat.shp")
print reg4a.summary
reg4b = ps.spreg.error_sp_het.BaseGM_Endog_Error_Het(y,ones,yend=x, q=x, w=w)
print reg4b.betas
结果为:
2.2 Endogenous
相关代码为:
reg5 = ps.spreg.GM_Endog_Error_Het(y,xex,yend,q,w,name_y="HR90",name_x=["RD90"],name_yend=["UE90"],name_q=["FP89"],\
name_w="natqueen.gal",name_ds="nat.shp")
print reg5.summary
reg5a = ps.spreg.GM_Endog_Error_Het(y,xex,yend,q,w,step1c=True,name_y="HR90",name_x=["RD90"],name_yend=["UE90"],name_q=["FP89"],\
name_w="natqueen.gal",name_ds="nat.shp")
print reg5a.summary
结果为:
2.3 Combo model
相关代码为:
reg6 = ps.spreg.GM_Combo_Het(y,x,w=w,w_lags=2,name_y='HR90',name_x = x_names, name_w = 'nat_queen.gal',name_ds='NAT')
print reg6.summary
reg6a = ps.spreg.GM_Combo_Het(y,x,w=w,name_y='HR90',name_x = x_names, name_w = 'nat_queen.gal',name_ds='NAT')
print reg6a.summary
reg6c = ps.spreg.GM_Combo_Het(y,x,w=w,w_lags=2,step1c=True,name_y='HR90',name_x = x_names, name_w = 'nat_queen.gal',name_ds='NAT')
print reg6c.summary
结果为:
3
R代码汇总
%load_ext rmagic
GMM Comparison
This document reproduces the R results in the paper *Technical Aspects of Implementing GMM Estimation of the Spatial Error Model in PySAL and GeoDaSpace∗ by Luc Anselin, Pedro V. Amaral and Daniel Arribas-Bel.
The main library used is sphet
, in its stable version (available from CRAN) and its bleeding edge development version (available from R-forge here). In this document, we use the development version as available from the master trunk on Novemer 28th., 2012.
In addition, the following dependencies are required:
Load the dependency libraries (note loading them will produce several output lines, keep scrolling down).
%%R
library(spdep, quiet=T)
library(sem)
library(foreign)
Loading required package
: lattice
Attaching package: ‘lattice’
The following object(s) are masked from ‘package:boot’:
melanoma
Loading required package: foreign
Checking rgeos availability: FALSE
Note: when rgeos is not available, polygon geometry computations in maptools depend on gpclib,
which has a restricted licence. It is disabled by default;
to enable gpclib, type gpclibPermit()
deldir 0.0-21
Loading required package: matrixcalc
Data
To replicate the results, you will need the
NAT
dataset in your machine. Throughout the document, we assume you have both the dbf
file and a queen contiguity gal
file created from the shapefile in your working directory. If this is the case, the path for the files should just be their names:
%%R
gal.path 'NAT_queen.gal'
dbf.path 'NAT.dbf'
NAT_queen.gal
is not included but you can easily create it in PySAL
or in OpenGeoDa. Once they are both in place, load them into the session:
%%R
w read.gal(gal.path, override.id=TRUE)
w nb2listw(w)
dbf read.dbf(dbf.path)
sphet2
code
The code in this section reproduces all the results in the paper reported to come from the development version of sphet
(codenamed sphet2
througout the paper).
Next, source sphet2
. First specify the local path in your machine where the folder with the code is:
%%R
sphet2.path "/home/dani/code/R/sphet/pkg/R/"
If the files are in the same directory as this document, simply replace the path by ""
(that is: sphet2.path""
). Now you can load sphet2
code:
%%R
source(paste(sphet2.path,
"circular.R", sep=''))
source(paste(sphet2.path, "dist_functions.R", sep=''))
source(paste(sphet2.path, "distance.R", sep=''))
source(paste(sphet2.path, "gm_functions.R", sep=''))
source(paste(sphet2.path, "gs2slshet.R", sep=''))
source(paste(sphet2.path, "gwt2dist.R", sep=''))
source(paste(sphet2.path, "ivreg.R", sep=''))
source(paste(sphet2
.path, "kernelsfun.R", sep=''))
source(paste(sphet2.path, "listw2dgCMatrix.R", sep=''))
source(paste(sphet2.path, "Omega.R", sep=''))
source(paste(sphet2.path, "s2slshac.R", sep
=''))
source(paste(sphet2.path, "spreg.R", sep=''))
source(paste(sphet2.path, "summary.sphet.R", sep=''))
source(paste(sphet2.path, "twostagels.R", sep=''))
source(paste(sphet2.path, "utilities.R", sep=''))
Models
Table 2 (GMErrorHom)
%%R
model spreg(HR90 ~ RD90 + UE90, data = dbf,listw = w, model = "error" )
print(summary(
model))
Generalized stsls
Call:
spreg(formula = HR90 ~ RD90 + UE90, data = dbf, listw = w, model = "error")
Residuals:
Min.
1st Qu. Median Mean 3rd Qu. Max.
-18.400 -3.420 -0.662 0.019 2.530 67.100
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
(Intercept) 6.676229 0.349809 19.0853 <2e-16 ***
RD90 3.944978 0.155334 25.3967 <2e-16 ***
UE90 -0.077044 0.047125 -1.6349 0.1021
rho 0.414905 0.019389 21.3987 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Table 3 (GMEndogError_Hom)
%%R
model spreg(HR90 ~ RD90, endog = dbf$UE90
, instruments = cbind(dbf$FP89), data = dbf,listw = w, model = "error",verbose=TRUE)
print(summary(model))
function: 169.7589 lambda: 0.2
function: 169.7588 lambda: 0.2
function: 81.68172 lambda: 0.9
function: 81.67946 lambda: 0.8999915
function: 47.50417 lambda: 0.7771504
function: 47.50421 lambda: 0.7771506
function:
1411.72 lambda: -0.2228496
function: 21.65544 lambda: 0.6771504
function: 21.6555 lambda: 0.6771507
function: 3.706638 lambda: 0.4771504
function: 3.706629 lambda: 0.4771505
function: 1.78937 lambda: 0.5402429
function: 1.789369 lambda: 0.5402429
function: 1.441354 lambda: 0.5235599
function: 1.441354 lambda: 0.5235599
function:
1.438217 lambda: 0.5217685
function: 1.438217 lambda: 0.5217686
function: 1.438211 lambda: 0.5218404
function: 1.438211 lambda: 0.5218404
function: 1.438211 lambda: 0.5218402
function: 1.438211 lambda: 0.5218405
function: 1.438211 lambda: 0.5218398
function: 1.438211 lambda: 0.5218402
function: 0.004362049 lambda: 0.5218402
function:
0.00436205 lambda: 0.5218402
function: 0.003979579 lambda: 0.4779536
function: 0.003979577 lambda: 0.4779536
function: 0.00374883 lambda: 0.4952313
function: 0.00374883 lambda: 0.4952313
function: 0.003748192 lambda: 0.4943871
function: 0.003748192 lambda: 0.4943871
function: 0.003748192 lambda: 0.4943604
function: 0.003748192 lambda: 0.4943605
function:
0.003748192 lambda: 0.4943605
function: 0.003748192 lambda: 0.4943614
function: 0.003748192 lambda: 0.4943595
function: 0.003748192 lambda: 0.4943605
Generalized stsls
Call:
spreg(formula = HR90 ~ RD90, data = dbf, listw = w, endog = dbf$UE90,
instruments = cbind(dbf$FP89), model = "error", verbose = TRUE)
Residuals:
Min.
1st Qu. Median Mean 3rd Qu. Max.
-32.700 -4.100 0.141 0.034 3.950 65.400
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
(Intercept) 21.060572 1.538510 13.689 < 2.2e-16 ***
RD90 8.241952 0.488835 16.860 < 2.2e-16 ***
endogenous -2.243751 0.228954 -9.800 < 2.2e-16 ***
rho 0.494360 0.021735 22.745 <
2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Table 4 (GMComboHom)
%%R
model spreg(HR90 ~
RD90 + UE90, data = dbf,listw = w, het = TRUE, model = "error" )
print(summary(model))
Generalized stsls
Call:
spreg(formula = HR90 ~ RD90 + UE90,
data = dbf, listw = w, model = "error",
het = TRUE)
Residuals:
Min. 1st Qu. Median Mean 3rd Qu. Max.
-18.400 -3.410
-0.664 0.019 2.530 67.100
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
(Intercept) 6.658555 0.474454 14.0341 <2e-16 ***
RD90 3.941664 0.259899
15.1661 <2e-16 ***
UE90 -0.074496 0.061092 -1.2194 0.2227
rho 0.474047 0.023665 20.0314 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05
‘.’ 0.1 ‘ ’ 1
Table 6 (GMEndogError_Het)
%%R
model spreg(HR90 ~ RD90, endog = dbf$UE90, instruments = dbf$FP89, data = dbf,listw = w, het = TRUE, model = "error")
print
(summary(model))
Generalized stsls
Call:
spreg(formula = HR90 ~ RD90, data = dbf, listw = w, endog = dbf$UE90,
instruments = dbf$FP89, model = "error", het =
TRUE)
Residuals:
Min. 1st Qu. Median Mean 3rd Qu. Max.
-32.700 -4.100 0.146 0.035 3.950 65.400
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
(Intercept) 21.028779 2.562926 8.2050 2.306e-16 ***
RD90 8.237610 0.781683 10.5383 < 2.2e-16 ***
endogenous -2.239152 0.390154 -5.7392 9.515e-09
***
rho 0.466672 0.029767 15.6777 < 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Table 7 (GMComboHet)
%%R
model spreg(HR90 ~ RD90 + UE90, data = dbf,listw = w, het = TRUE, model = "sarar" )
print(summary(model))
Generalized stsls
Call
:
spreg(formula = HR90 ~ RD90 + UE90, data = dbf, listw = w, model = "sarar",
het = TRUE)
Residuals:
Min. 1st Qu. Median Mean 3rd
Qu. Max.
-18.500 -3.470 -0.704 0.015 2.540 67.000
Coefficients:
Estimate Std. Error t-value Pr(>|t|)
(Intercept) 6.940607 0.859998
8.0705 7.001e-16 ***
RD90 4.007396 0.326093 12.2891 < 2.2e-16 ***
UE90 -0.095717 0.066417 -1.4412 0.1495
lambda -0.022001 0.087638 -0.2511 0.8018
rho 0.558391 0.050685 11.0168
< 2.2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1