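〖Abstract〗 Spatially varying coefficient (SVC) modeling is popular in applied science, but its computational burden is heavy, especially when the multiscale property of the SVCs is taken into account. Against this background, this study develops a Moran eigenvector-based spatially varying coefficient (M-SVC) modeling approach that estimates multiscale SVC models efficiently. Estimation is accelerated through (1) rank reduction, (2) pre-compression, and (3) sequential likelihood maximization. Steps (1) and (2) eliminate the sample size N from the likelihood function; after these steps, the cost of likelihood maximization is independent of N. Step (3) further accelerates the likelihood maximization, so that multiscale SVC models can be estimated even when the number of SVCs, K, is large. The M-SVC approach is compared with geographically weighted regression (GWR) through Monte Carlo simulation experiments. The simulation results show that, when N is large, the proposed approach is far faster than GWR, even though it estimates 2K parameters numerically whereas GWR estimates only 1 parameter numerically. The proposed approach is then applied to a land price analysis as an illustration. The developed SVC estimation approach is implemented in the R package "spmoran".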
〖Original〗 Murakami, D. and Griffith, D.A. (2019) ‘Spatially varying coefficient modeling for large datasets: Eliminating N from spatial regressions’, Spatial Statistics, 30, pp. 39–64. Available at: https://doi.org/10.1016/j.spasta.2019.02.003.
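1. Introduction
Spatial statistical methods proposed in geostatistics (e.g., Cressie, 1993 [8]) and spatial econometrics (e.g., LeSage and Pace, 2009 [36]) typically require inverting a dense covariance matrix, which has a computational complexity of O(N^3), where N is the sample size. These methods are therefore unsuitable for large samples.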
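To make the cubic cost concrete, the following minimal Python sketch (not taken from the paper) factorizes a dense covariance matrix with a textbook Cholesky routine and counts the inner-loop multiply-subtract operations. The exponential kernel, the range and nugget values, the random point locations, and all function names are illustrative assumptions rather than anything specified in the source.

```python
import math
import random


def exponential_covariance(coords, range_param=0.2, nugget=1e-6):
    """Dense N x N covariance matrix; the exponential kernel is an assumed example."""
    n = len(coords)
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist = math.hypot(coords[i][0] - coords[j][0],
                              coords[i][1] - coords[j][1])
            cov[i][j] = math.exp(-dist / range_param)
        cov[i][i] += nugget  # small nugget (assumed) for numerical stability
    return cov


def cholesky_with_count(a):
    """Return lower-triangular L with L L^T = a, plus the inner-loop operation count."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    ops = 0
    for j in range(n):
        for i in range(j, n):
            s = a[i][j]
            for k in range(j):
                s -= low[i][k] * low[j][k]
                ops += 1  # one multiply-subtract
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low, ops


def run(n, seed=0):
    """Factorize the covariance of n hypothetical random points in the unit square."""
    rng = random.Random(seed)
    coords = [(rng.random(), rng.random()) for _ in range(n)]
    cov = exponential_covariance(coords)
    low, ops = cholesky_with_count(cov)
    # Largest entry-wise gap between L L^T and the original matrix.
    err = max(
        abs(sum(low[i][k] * low[j][k] for k in range(n)) - cov[i][j])
        for i in range(n)
        for j in range(n)
    )
    return ops, err


ops_small, err_small = run(40)
ops_large, err_large = run(80)
print("N=40 operations:", ops_small)
print("N=80 operations:", ops_large)
print("growth factor when N doubles:", ops_large / ops_small)
```

In this routine the inner-loop count equals (N^3 - N)/6, so doubling N multiplies the work by roughly eight. This is the cubic growth that makes methods built on dense covariance matrices impractical for large samples.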
[1] Anselin, L., 2010. Thirty years of spatial econometrics. Pap. Reg. Sci. 89(1), 3–25.
[2] Arbia, G., 2014. Pairwise likelihood inference for spatial regressions estimated on very large datasets. Spat. Stat. 7, 21–39.
[3] Banerjee, S., Gelfand, A. E., Finley, A. O., Sang, H., 2008. Gaussian predictive process models for large spatial data sets. J. R. Stat. Soc. Series B Stat. Methodol. 70 (4), 825–848.
[4] Bates, D. M., 2010. lme4: Mixed-effects modeling with R. http://lme4.r-forge.r-project.org/book.
[5] Burden, S., Cressie, N., Steel, D.G., 2015. The SAR model for very large datasets: a reduced rank approach. Econom. 3 (2), 317–338.
[7] Cahill, M., Mulligan, G., 2007. Using geographically weighted regression to explore local crime patterns. Soc. Sci. Comput. Rev. 25 (2), 174–193.
[8] Cressie, N., 1993. Statistics for Spatial Data. John Wiley & Sons, New York.
[9] Cressie, N., Johannesson, G., 2008. Fixed rank kriging for very large spatial data sets. J. R. Stat. Soc. Series B Stat. Methodol. 70 (1), 209–226.
[10] Datta, A., Banerjee, S., Finley, A.O., Gelfand, A.E., 2016. Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets. J. Am. Stat. Assoc. 111 (514), 800–812.
[11] Debarsy, N., Yang, Y., 2018. Editorial for the special issue entitled: New advances in spatial econometrics: Interactions matter. Reg. Sci. Urban Econ. DOI: 10.1016/j.regsciurbeco.2018.02.004.
[12] Dong, G., Nakaya, T., Brunsdon, C., 2018. Geographically weighted regression models for ordinal categorical response variables: An application to geo-referenced life satisfaction data. Comput. Environ. Urban Syst. 70, 35–42.
[13] Dray, S., Legendre, P., Peres-Neto, P.R., 2006. Spatial modelling: a comprehensive framework for principal coordinate analysis of neighbour matrices (PCNM). Ecol. Model. 196 (3–4), 483–493.
[14] Drineas, P., Mahoney, M.W., 2005. On the Nyström method for approximating a Gram matrix for improved kernel-based learning. J. Mach. Learn. Res. 6, 2153–2175.
[15] Farber, S., Páez, A., 2007. A systematic investigation of cross-validation in GWR model estimation: empirical analysis and Monte Carlo simulations. J. Geogr. Syst. 9 (4), 371–396.
[16] Finley, A.O., 2011. Comparing spatially-varying coefficients models for analysis of ecological data with non–stationary and anisotropic residual dependence. Methods Ecol. Evol. 2 (2), 143–154.
[17] Finley, A.O., Banerjee, S., MacFarlane, D.W., 2011. A hierarchical model for quantifying forest variables over large heterogeneous landscapes with uncertain forest areas. J. Am. Stat. Assoc. 106 (493), 31–48.
[18] Fotheringham, A.S., Brunsdon, C., Charlton, M., 2002. Geographically Weighted Regression: The Analysis of Spatially Varying Relationships. John Wiley & Sons, Chichester, UK.
[19] Fotheringham, A.S., Yang, W., Kang, W., 2017. Multiscale Geographically Weighted Regression (MGWR). Ann. Am. Assoc. Geogr. 107 (6), 1247–1265.
[20] Furrer, R., Genton, M.G., Nychka, D., 2006. Covariance tapering for interpolation of large spatial datasets. J. Comput. Graph. Stat. 15 (3), 502–523.
[21] Gelfand, A.E., Kim, H.J., Sirmans, C.F., Banerjee, S., 2003. Spatial modeling with spatially varying coefficient processes. J. Am. Stat. Assoc. 98 (462), 378–396.
[22] Geniaux, G., Martinetti, D., 2018. A new method for dealing simultaneously with spatial autocorrelation and spatial heterogeneity in regression models. Reg. Sci. Urban Econ. DOI: 10.1016/j.regsciurbeco.2017.04.001.
[23] Goodchild, M.F., 2004. The validity and usefulness of laws in geographic information science and geography. Ann. Assoc. Am. Geogr. 94 (2), 300–303.
[24] Griffith, D.A., 2000. Eigenfunction properties and approximations of selected incidence matrices employed in spatial analyses. Linear Algebra Appl. 321 (1–3), 95–112.
[25] Griffith, D.A., 2003. Spatial Autocorrelation and Spatial Filtering: Gaining Understanding through Theory and Scientific Visualization. Springer, Berlin.
[26] Griffith, D.A., 2004. Extreme eigenfunctions of adjacency matrices for planar graph employed in spatial analyses. Linear Algebra Appl. 388, 201–219.
[27] Griffith, D.A., 2008. Spatial-filtering-based contributions to a critique of geographically weighted regression (GWR). Environ. Plann. A 40 (11), 2751–2769.
[28] Griffith, D.A., 2015. Approximation of Gaussian spatial autoregressive models for massive regular square tessellation data. Int. J. Geogr. Inf. Sci. 29 (12), 2143–2173.
[29] Griffith, D.A., Chun, Y., 2014. Spatial autocorrelation and spatial filtering. In: Fischer, M.M., Nijkamp, P. (eds.). Handbook of Regional Science. Springer, Berlin, Heidelberg, pp. 1477–1507.
[30] Harris, P., Fotheringham, A. S., Juggins, S., 2010a. Robust geographically weighted regression: a technique for quantifying spatial relationships between freshwater acidification critical loads and catchment attributes. Ann. Assoc. Am. Geogr. 100 (2), 286–306.
[31] Harris, R., Singleton, A., Grose, D., Brunsdon, C., Longley, P., 2010b. Grid enabling geographically weighted regression: A case study of participation in higher education in England. Trans. GIS 14 (1), 43–61.
[32] Heaton, M.J., Datta, A., Finley, A., Furrer, R., Guhaniyogi, R., Gerber, F., Gramacy, R.B., Hammerling, D., Katzfuss, M., Lindgren, F., Nychka, D.W., Sun, F., Zammit-Mangion, A., 2017. A case study competition among methods for analyzing large spatial data. Arxiv, 1710.05013.
[33] Helbich, M., Griffith, D.A., 2016. Spatially varying coefficient models in real estate: Eigenvector spatial filtering and alternative approaches. Comput. Environ. Urban Syst. 57, 1–11.
[34] Kelejian, H.H., Prucha, I.R., 1998. A generalized spatial two-stage least squares procedure for estimating a spatial autoregressive model with autoregressive disturbances. J. Real Estate Finance Econ. 17 (1), 99–121.
[35] LeSage, J.P., Pace, R.K., 2007. A matrix exponential spatial specification. J. Econom. 140 (1), 190–214.
[36] LeSage, J.P., Pace, R.K., 2009. Introduction to Spatial Econometrics. Chapman and Hall/CRC, New York.
[37] Li, M., Bi, W., Kwok, J.T., Lu, B.L., 2015. Large-scale Nyström kernel matrix approximation using randomized SVD. IEEE Trans. Neural Netw. Learn. 26 (1), 152–164.
[38] Lindgren, F., Rue, H., Lindström, J., 2011. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. J. R. Stat. Soc. Series B Stat. Methodol. 73 (4), 423–498.
[39] Lu, B., Harris, P., Charlton, M., Brunsdon, C., 2014. The GWmodel R package further topics for exploring spatial heterogeneity using geographically weighted models. Geo-spatial Inform. Sci. 17 (2), 85–101.
[40] Lu, B., Brunsdon, C., Charlton, M., Harris, P., 2017. Geographically weighted regression with parameter-specific distance metrics. Int. J. Geogr. Inf. Sci. 31 (5), 982–998.
[41] Lu, B., Yang, W., Ge, Y., Harris, P., 2018. Improvements to the calibration of a geographically weighted regression with parameter-specific distance metrics and bandwidths. Comput. Environ. Urban Syst. DOI: 10.1016/j.compenvurbsys.2018.03.012.
[42] Moran, P. A., 1950. Notes on continuous stochastic phenomena. Biometrika 37 (1/2), 17–23.
[43] Murakami, D., 2018. spmoran: An R package for Moran's eigenvector-based spatial regression analysis. Arxiv, 1703.04467.
[44] Murakami, D., Griffith, D.A., 2015. Random effects specifications in eigenvector spatial filtering: a simulation study. J. Geogr. Syst. 17 (4), 311–331.
[45] Murakami, D., Lu, B., Harris, P., Brunsdon, C., Charlton, M., Nakaya, T., Griffith, D.A., 2017. The importance of scale in spatially varying coefficient modeling. Arxiv, 1709.08764.
[46] Murakami, D., Griffith, D.A., 2018. Eigenvector spatial filtering for large data sets: fixed and random effects approaches. Geogr. Anal. DOI: 10.1111/gean.12156.
[47] Murakami, D., Yoshida, T., Seya, H., Griffith, D.A., Yamagata, Y., 2017. A Moran coefficient-based mixed effects approach to investigate spatially varying relationships. Spat. Stat. 19, 68–89.
[49] Nychka, D., Bandyopadhyay, S., Hammerling, D., Lindgren, F., Sain, S., 2015. A multiresolution Gaussian process model for the analysis of large spatial datasets. J. Comput. Graph. Stat. 24 (2), 579–599.
[50] Oshan, T.M., Fotheringham, A.S., 2018. A comparison of spatially varying regression coefficient estimates using geographically weighted and spatial‐filter‐based techniques. Geogr. Anal. 50 (1), 53–75.
[51] Sang, H., Huang, J. Z., 2012. A full scale approximation of covariance functions for large spatial data sets. J. R. Stat. Soc. Series B Stat. Methodol. 74 (1), 111–132.
[53] Smirnov, O., Anselin, L., 2001. Fast maximum likelihood estimation of very large spatial autoregressive models: a characteristic polynomial approach. Comput. Stat. Data Anal. 35 (3), 301–319.
[54] Stein, M.L., 2014. Limitations on low rank approximations for covariance matrices of spatial data. Spat. Stat. 8, 1–19.
[55] Tiefelsdorf, M., Griffith, D.A., 2007. Semiparametric filtering of spatial autocorrelation: the eigenvector approach. Environ. Plann. A 39 (5), 1193.
[56] Tobler, W.R., 1970. A computer movie simulating urban growth in the Detroit region. Econ. Geogr. 46, 234–240.
[57] Tran, H.T., Nguyen, H.T., Tran, V.T., 2016. Large-scale geographically weighted regression on Spark. Proceedings of the 2016 International Conference on Knowledge and Systems Engineering (KSE), 127–132.
[58] Wheeler, D.C., Calder, C.A., 2007. An assessment of coefficient accuracy in linear regression models with spatially varying coefficients. J. Geogr. Syst. 9 (2), 145–166.
[59] Wheeler, D.C., Tiefelsdorf, M., 2005. Multicollinearity and correlation among local regression coefficients in geographically weighted regression. J. Geogr. Syst. 7 (2), 161–187.
[60] Wheeler, D.C., Waller, L., 2009. Comparing spatially varying coefficient models: case study examining violent crime rates and their relationships to alcohol outlets and illegal drug arrests. J. Geogr. Syst. 11 (1), 1–22.
[61] Yang, W., 2014. An extension of geographically weighted regression with flexible bandwidths. PhD Thesis. University of St Andrews.
[62] Zhang, K., Kwok, J.T., 2010. Clustered Nyström method for large scale manifold learning and dimension reduction. IEEE Trans. Neural Netw. 21 (10), 1576–1587.