Required current contributed CRAN packages:

I am running R 4.1.2, with recent update.packages().

needed <- c("sphet", "coda", "spatialreg", "spData", "sf")

Script

Script and data at https://github.com/rsbivand/ECS530_h21/raw/main/ECS530_211118.zip. Download to suitable location, unzip and use as basis.

Background to spatial econometrics

Antecedents

In the same way that Fujita, Krugman, and Venables (1999) begin their study of the spatial economy by looking at the antecedents of their subject, it is helpful to place spatial econometrics in its temporal and academic context. This context is sufficiently different from the contemporary setting that it may be hard to grasp the background for many of the features of spatial econometrics that came into being during its earlier years. Indeed, the ranges of topics that were studied in economics in the 1960’s and 1970’s differ markedly from those in focus today. If we can sketch the context within which spatial econometrics was created, and its methods developed, we should be able to illuminate choices made then which influence our understanding and application of spatial econometric methods.

Critics of the practice of spatial econometrics, such as Gibbons and Overman (2012), appear to overlook these antecedents, and consequently judge the potential of the field on a partial, perhaps anachronistic, understanding, viewing phenomena with a history in ahistorical way. Since we are attempting to provide an introduction to applied spatial econometrics, we need to throw light on the original motivations and concerns of the first scholars engaged in the field. Luc Anselin (2010) indicates clearly and repeatedly [L. Anselin (1988b); anselin:06; anselin:10a] that we should acknowledge Spatial Econometrics by Paelinck and Klaassen (1979) of the Netherlands Economic Institute as our starting point, and so celebrates thirty years of spatial econometrics in 2009. This firm confirmation of the importance of Jean Paelinck’s contributions as scholar and community-builder is fully justified. We should then turn to the motivations given in Paelinck and Klaassen (1979) to indicate which contextual factors were of importance at that time, and the breadth of the academic communities with which they were in contact.

In a recent short commentary, Paelinck (2013) recalls his conviction, expressed in 1967, that “early econometric exercises \(\ldots\) relating only variables possessing the same regional index \(\ldots\) were inadequate to represent the correct spatial workings of the economy, which would then be reflected in the policy outcomes.” A year before, Paelinck (2012) points to salient isomorphisms linking spatial regression models, simultaneous equation models and input-output models; these were known of and discussed in the early formative period of spatial econometrics. We will return in subsequent chapters to the ways in which spatial regression models may be specified, but for now, a simple presentation of these isomorphisms as perceived in the early period is sufficient:

\[ {\mathbf y} = {\mathbf A}{\mathbf y} + {\mathbf X}{\mathbf b} + {\mathbf \varepsilon} \]

is a spatial regression model where \({\mathbf A}\) is a matrix expressing the mutual first order spatial dependencies between regions — the similarity of this form and the Koyck distributed lag model is striking [Koyck (1954); Klein (1958); Griliches (1967)};

\[ {\mathbf A}{\mathbf y} + {\mathbf X}{\mathbf b} = {\mathbf \varepsilon} \]

is a simultaneous equation model where \({\mathbf A}\) is a matrix expressing the dependencies between the equations; and:

\[ {\mathbf y} = {\mathbf A}{\mathbf y} + {\mathbf f} \]

is an input-output model where \({\mathbf A}\) is a matrix of sectoral input-output coefficients, and \({\mathbf f}\) is final demand.

Input-output models, simultaneous equation models, and the importance of policy outcomes were all known intimately at the Netherlands Economic Institute at this time, and elsewhere among applied economists. The isomporhisms flowed from the known to the unknown, from the stuff of contemporary research and policy advice to doubts about the calibration of aspatial models, and on to what became termed spatial econometrics. If we compare these topics with those described for Regional Science by Boyce (2004), we can see the outlines of research priorities at the time: including urban and regional models for planning, regional and interregional input-output models, transport and location models. During the 1960s and 1970s, many of these models were enhanced — matching needs for policy advice — to cover environmental questions, adding natural resources as inputs and pollution to outputs. Paelinck’s co-author in a key paper in spatial econometrics (Hordijk and Paelinck 1976), went on to work in environmental management and research.

Reading Paelinck and Klaassen (1979), we see that the programme of research into the space economy undertaken at the Netherlands Economic Institute led first to the publication of Paelinck and Nijkamp (1975), and then to Klaassen, Paelinck, and Wagenaar (1979), published in the same year as Spatial Econometrics. All three books were published in the same series and appear to reflect the core concerns of economists at the Institute doing research on regionalised national macro-economic models. The direct link to Jan Tinbergen is evident in the account of the context of economic research in the Netherlands given by Theil (1964). If we take Paelinck at his word, he and his colleagues were aware that an aspatial regionalisation of national accounts, of input-output models, or transport models, might prejudice policy advice and outcomes through inadequate and inappropriate calibration.

Klaassen, Paelinck, and Wagenaar (1979) is mainly concerned with model construction, while about a third of Paelinck and Nijkamp (1975) is devoted to input-output analysis. Both books show sustained concern for economic measurement, especially of national accounts data, intersectoral transactions, and many other topics. Considerable attention is also paid to the data collection units, be they sectors or regions. The need to attempt to define regions that match the underlying economic realities was recognised clearly, and a key part of Paelinck and Nijkamp (1975) is devoted to regionalisation, and the distinction between functional regions and homogeneous regional classifications is made. In the motivation for spatial econometric models given in Paelinck and Klaassen (1979), consumption and investment in a region are modelled as depending on income both in the region itself and in its contiguous neighbours, termed a “spatial income-generating model.” It became important to be able to calibrate planning models of this kind to provide indications of the possible outcomes of alternative policy choices, hence the need for spatial econometrics.

Economic planning was widespread in Europe at the time, and was also central in the development of Regional Science, in particular input-output models; as Boyce (2004) recounts, Walter Isard worked closely with Wassily Leontief. Operational and planning motivations for applied economics were unquestioned, as economists in the post-war period saw their role, beyond educating young economists, as providing rational foundations for economic policy. It is worth noting that Jean Paelinck participated actively in the Association de Science Régionale De Langue Française, becoming president in 1973–1976. The first president of the association was François Perroux, who had founded it with Walter Isard in 1961 (Bailly, Derycke, and Torre 2012).

Until the 1980s, it was not at all unusual to publish original results in other languages than English. French spatial economic research, for example Ponsard (1983), while making little impact in Anglophone countries, was widely used in teaching and research elsewhere (Billot and Thisse 1992). They contrast, though, the “word wizardry of François Perroux with the rigour of Claude Ponsard” (Billot and Thisse 1992), echoing the views expressed by Drèze (1964) with regard to the work of Perroux. Even if we accept that “word wizardry” deserves more rigour and recasting in normative and empirically testable forms, it is also part of the context within which spatial econometrics came into being. A reading of Perroux (1950) is worthwhile, because it not only gives the reader a vignette of the context in the post-war period, but also provides a discussion of economic space, as opposed to banal, unreflected space — mere position — that has largely disappeared from our considerations.

The title of the journal: Regional and Urban Economics, Operational Methods , founded by Jean Paelinck in 1971, and which was renamed as Regional Science and Urban Economics in 1975 (Boyce 2004), points to the perceived importance of “operational methods,” a version of the term “operational theory and method” used in the title of Paelinck and Nijkamp (1975). Spatial econometrics does not seem to have come into being as a set of estimation techniques as such, as perhaps we might think today, but rather as an approach addressing open research questions both in space economy and in the enhancement of interregional models to be used in offering policy advice.

Were motivations of this kind common during the 1960s and early 1970s? Not only was the spread of Regional Science extensive and firmly established (Boyce 2004), but public bodies were concerned to regionalise economic measurement and policy advice (Graham and Romans 1971). In Britain, Environment and Planning was started in 1969 with Alan Wilson as founding editor and published by Pion; he was assistant director at the Centre for Environmental Studies at this time before moving to the University of Leeds. In a recently published lecture series, Wilson (2012) cites Paelinck and Nijkamp (1975) as giving principles for contributions from economics to urban and regional analysis (Wilson 2000). The papers presented at annual Regional Science meetings were published in a series by Pion; the first number in the series included contributions by Granger (1969) and A. D. Cliff and Ord (1969).

In a contribution to a panel session at the 2006 annual meeting of the American Association of Geographers (co-panelists Luc Anselin and Daniel Griffith), Keith Ord pointed to the continued relevance of Granger’s remarks at the meeting almost forty years earlier (J. Keith Ord 2010); we will return to these concerns below. As noted by R. S. Bivand (2008), communities of researchers working in and near mathematical and theoretical geography was more integrated in the pre-internet and pre-photocopier age than one might expect, with duplicated working papers prepared using stencils circulating rapidly between collaborating academic centres. Knowledge of the preliminary results of other researchers then fed through into rapid innovation in an exciting climate for those with access to these meetings and working papers.

There was considerable overlap between quantitative geography and regional science, so that work like A. D. Cliff and Ord (1969) is cited by Hordijk (1974), and was certainly known at the Netherlands Economic Institute several years earlier. Although it has not been possible to find out who participated in the August 1968 conference of the British and Irish Section of the Regional Science Association at which A. D. Cliff and Ord (1969) was read, it was not unusual for members of other sections to be present, and to return home with bundles of duplicated papers. Up to the 1990s, presenters at conferences handed out copies of their papers, and conference participants posted home parcels of these hand-outs, indexed using the conference programme.

Leslie Hepple was among the more thorough scholars working on the underpinnings of spatial econometrics prior to the publication of Paelinck and Klaassen (1979). His wide-ranging review (Hepple 1974) is cited by Bartels and Hordijk (1977), again demonstrating the close links between those working in this field. We will be returning to the review paper, and to Hepple (1976), which studies methods of estimation for spatial econometrics models in some depth, building on and extending J. K. Ord (1975).

Hepple (1974), like A. D. Cliff and Ord (1973), saw no distinction between spatial statistics and the antecedents to spatial econometrics. Obviously, spatial econometrics was strongly influenced by the research tasks undertaken by regional and urban economists and regional scientists. As D. A. Griffith and Paelinck (2007) point out, spatial statistics and spatial econometrics continue to share most topics of interests, with each also possessing shorter lists of topics that have been of less concern to the other. They advocate a “non-standard” spatial econometrics, which is inclusive to wider concerns. It seems appropriate in this context to mention the somewhat heterodox position taken by McMillen (2010), who draws attention to the crucial issue of functional form, which he argues may well lie behind observed spatial autocorrelation; we will return to this in later chapters.

Implementing methods in econometrics

Having described some of the contextual issues suggesting how specific research concerns influenced how spatial econometrics came into being, it may now be helpful to turn to broader econometrics. It will become clearer that the research concerns of broader econometrics at that time, apart from spatial interdependence and interaction, were generally similar to those of spatial econometrics. Indeed, econometrics and the provision of national economic data for analysis, modelling, and for the provision of policy advice were intimately linked, as were mathematical economics and econometrics. The place of econometrics within economics, as Sandmo (2011) shows, was a matter of some contention from the very beginning.

Both Morgan (1990) and Qin (1993) conclude their historical accounts of the beginnings of econometrics in pessimistic ways. Econometrics had begun by addressing a range of research topics, including the expression of economic theories in mathematical form, the building of operational econometric models, the exploration of testing and estimation techniques, and statistical data preparation. “By the 1950s the founding ideal of econometrics, the union of mathematical and statistical economics into a truly synthetic economics, had collapsed” (Morgan 1990). The flavour of Paelinck and Klaassen (1979) seems somewhat anachronistic compared to late 1950s and early 1960s “textbook” econometrics. One might conclude that while spatial econometrics was aligned with Haavelmo’s probabilistic revolution (Qin 1993), it retained, like Haavelmo, adherence to the founding ideals [Bjerkholt (2007); hendry+johansen:12].

It appears from the research programme culminating in Paelinck and Klaassen (1979) that spatial econometrics could be seen as “creative juggling in which theory and data came together to find out about the real world” (Morgan 1990). Morgan (1990) and Qin (1993) indicate that, from the 1950s, the depth of econometrics was flattened , with “[D]ata taken less seriously as a source of ideas and information for econometric models, and the theory-development role of applied econometrics was downgraded to the theory-testing role” (Morgan 1990). The history of econometric ideas is of assistance in illuminating key components of what has come to be the practice of spatial econometrics:

With Haavelmo, the profession seemed to have arrived at a consensus: a statistical model of plural economic causes and errors in the relations. But in the meantime, the old scientific ideal had also changed, from a deterministic to a probabilistic view of the way the world worked. This left the explanatory level of Haavelmo’s model open to new doubts. Was it really based on underlying random behaviour of the economic variables (as in contemporary evolutionary biology and quantum mechanics), or was it, after all, only a convenient way of formally dealing with inference in the non-experimental framework? (Morgan 1990).

The study of the history of econometrics has been actively promoted by David Hendry. It is desirable to motivate our choice of the term “applied spatial econometrics”; to do this we turn to Hendry (2009), the introductory chapter in the “Applied Econometrics” volume of the Palgrave Handbook of Econometrics. If we can shed light on how “applied econometrics” is understood, then we may be able to provide adequate underpinnings for what is to follow. This, however, turns out to be hard to do, as Hendry challenges simple definitions:

At the superficial level, “Applied Econometrics” is “any application of econometrics, as distinct from theoretical econometrics. \(\ldots\) Some applied econometricians would include any applications involving analyses of”real economic data” by econometric methods, making “Applied econometrics” synonymous with empirical econometrics. However, such a view leads to demarcation difficulties from applied economics on the one hand, and applied statistics on the other. \(\ldots\) Outsiders might have thought that “Applied Econometrics” was just the application of econometrics to data, but that is definitely not so \(\ldots\) Rather, the notion of mutual penetration dominates, but as a one-way street. Economic theory comes first, almost mandatorially. Perhaps this just arises from a false view of science, namely that theory precedes evidence, \(\ldots\) (Hendry 2009).

He butresses his views using the history of econometrics to caricature empirical econometric research in the light of his view that applied economics has a “false” view of science:

\(\ldots\) cumulative critiques \(\ldots\) led to an almost monolithic approach to empirical econometric research: first postulate an individualistic, intertemporal optimization theory; next derive a model therefrom; third, find data with the same names as the theory variables; then select a recipe from the econometrics cookbook that appropriately blends the model and the data, or if necessary, develop another estimator; finally report the newly forged economic law. \(\ldots\) Instead of progress, we find fashions, cycles and “schools” in research. \(\ldots\) At about the same time that a priori theory-based econometrics became dominant, data measurement and quality issues were also relegated as a central component of empirical publications (Hendry 2009).

Hendry’s scepticism with regard to the practice of applied econometrics finds ample support in the proposal by Anderson et al. (2008) that econometric results be acknowledged only to the extent that they are open for replication. They point out that theoretical results in economics (and econometrics) are much easier to check, and that referees routinely question formal proofs, because the relevant equations are included in journal submissions. Further contributions to the discussion on reproducible econometric research results have been made by Koenker and Zeileis (2009) and Yalta and Yalta (2010). As we will see later on, spatial econometrics is in the fortunate position of having relatively many open source software implementations.

In order to complete this brief review of the antecedents to spatial econometrics, it makes sense to point to descriptions of the development of econometric software. Both in a handbook chapter (Renfro 2009a), and in book form (Renfro 2009b), we can benefit from Charles Renfro’s experience and insights. He does admit to expressing distinct views, but they are backed both by experience and evidence. One concern is that econometric software development has been seen less and less as academic achievement, but rather as a practical, technical concern only:

To take an interest in programming and econometric software development would seem therefore to be the graveyard of any academic economist’s professional ambitions, justifiable only as a minimally diverting hobby, spoken of only to trusted colleagues. (Renfro 2009b)

He is not alone in drawing attention to the view that economics suffers from a skewed distribution of attention to the actual components of knowledge creation, placing theory firmly in first place, with commensurately much less weight given to measurement, data preparation and programming. As he says, applied economics researchers would benefit from a more even distribution of academic acknowledgement to those who, in terms of the division of labour within the discipline, develop software:

As a software developer, the econometrician who incorporates new theoretical econometric results may therefore be faced with the often-difficult task of not only evaluating the relevance, hence the operational validity of the theoretical solution, but also implementing these results in a way that is contextually meaningful. This operationally focused econometrician consequently not only needs to understand the theoretical advances that are made but also to exercise independent judgment in the numerical implementation of new techniques, for in fact neither are blueprints provided nor are the building blocks prefabricated. (Renfro 2009b)

Since this division of labour is arguably more uneven in economics than in other subjects, it leads to the restriction of research questions that applied economists can address to those that match estimation functions available in software implementations for which they have licences and which they know how to use:

One of the consequences of this specialization has been to introduce an element of user dependence, now grown to a sufficiently great degree that for many economists whichever set of econometric operations can be performed by their choice of program or programs has for them in effect become the universal set. (Renfro 2009b)

This deplorable situation has arisen over time, and certainly did not characterise the research context when spatial econometrics came into being. At that time, graduate students simply regarded learning to program, often in Fortran, as an essential part of their preparation as researchers. This meant that researchers were moch “closer” to their tools, and could adapt them to suit their research needs. Nowadays, few economists feel confident as programmers despite the fact that modern high-level languages such as Matlab, Python, or R are easy to learn and very flexible, and many econometric and statistical software applications offer scripting languages (such as SAS, Stata, SPSS and specifically econometric programs).

The links between methodological advance and the evolution of spatial economic theory are only touched upon in Luc Anselin (2010) — in that sense, his review is concerned with theoretical spatial econometrics (statistical methods) rather than applied spatial econometrics (economic models). Over time, applied spatial econometrics has tended to become synonymous with regression modelling applied to spatial data where spatial autocorrelation and spatial heterogeneity in particular are present and need to be accommodated. Its treatment of spatial effects reflects the growing “legitimization of space and geography” (Luc Anselin 2010) in the quantitative social sciences more generally. But the subfield perhaps needs to be more than that if it is to justify its separate identity from spatial statistics and fully justify its “econometric” label. A close link with mainstream economic theory would seem essential in order to provide economic legitimacy to models (systems of equations) within which geography and spatial relationships have been, in economic terms, rigorously embedded (Fingleton 2000). (Haining 2014)

Introduction

  • Beyond software functionality, the existence of communities of interest becomes important

  • These extend the choices available, and bring in points of view across applied scientific domains and in modern computation, for example interworking across C++, Python and Julia with R

  • For example, the HSAR package [Dong and Harris (2015); dongetal:15] uses C++ extensively, rather than R sparse matrix packages

  • The model being introduced into INLA for non-MCMC Bayesian inference is another example of cooperation with other applied statisticians

  • The R packages spatialreg, sphet (Roger S. Bivand and Piras 2015) and for spatial panel models splm (Millo and Piras 2012) provide implementations of many model fitting functions for Gaussian dependent variables

  • These also use shared methods in spdep for computing impacts, and for MC inference on impacts

  • There are a range of fitting functions for non-Gaussian dependent variables in McSpatial, spatialprobit (Wilhelm and Matos 2013) and ProbitSpatial (Martinetti and Geniaux 2017), and these are also available using the ´slm´ model being introduced into INLA (Gómez-Rubio, Bivand, and Rue 2021)

  • Work is continuing on providing Bayesian inference functions, through MCMC and otherwise; the HSAR package is an example

  • See R. Bivand, Millo, and Piras (2021) for a recent survey

Spatial Regression

Even though it may be tempting to focus on interpreting the map pattern of an area support response variable of interest, the pattern may largely derive from covariates (and their functional forms), as well as the respective spatial footprints of the variables in play. Spatial autoregressive models in two dimensions began without covariates and with clear links to time series (Whittle 1954). Extensions included tests for spatial autocorrelation in linear model residuals, and models applying the autoregressive component to the response or the residuals, where the latter matched the tests for residuals (A. Cliff and Ord 1972; A. D. Cliff and Ord 1973). These “lattice” models of areal data typically express the dependence between observations using a graph of neighbours in the form of a contiguity matrix.

A division has grown up, possibly unhelpfully, between scientific fields using conditional autoregressive (CAR) models (Besag 1974), and simultaneous autoregressive models (SAR) (J. K. Ord 1975; Hepple 1976). Although CAR and SAR models are closely related, these fields have found it difficult to share experience of applying similar models, often despite referring to key work summarising the models (Ripley 1981, 1988; Cressie 1993). Ripley gives the SAR variance as (1981, 89):

\[ {\rm Var}_S = \sigma^ 2(I-\lambda W_S)^{-1} (I-\lambda W_S^{\rm T})^{-1} \] where \(\lambda\) is a spatial autocorrelation parameter and \(W_S\) is a nonsingular matrix that represents spatial dependence. The CAR variance is:

\[ {\rm Var}_C = \sigma^ 2(I-\lambda W_C)^{-1} \] where and \(W_C\) is a symmetric and strictly positive definite matrix

More recent books expounding the theoretical bases for modelling with areal data simply point out the similarities in relevant chapters (Gaetan and Guyon 2010; Lieshout 2019); the interested reader is invited to consult these sources for background information and examples using the functions described below.

Of course, handling a spatial correlation structure in a generalised least squares model or a (generalised) linear or nonlinear mixed effects model such as those provided in the nlme and many other packages does not have to use a graph of neighbours (Pinheiro and Bates 2000). These models are also spatial regression models, using functions of the distance between observations, and fitted variograms to model the spatial autocorrelation present; such models have been held to yield a clearer picture of the underlying processes (Wall 2004), building on geostatistics. For example, the glmmTMB package successfully uses this approach to spatial regression (Brooks et al. 2017). Here we will only consider spatial regression using spatial weights, chiefly as implemented in the spatialreg package recently split out of the spdep package which had grown unnecessarily large, covering too many aspects of spatial dependence.

Recent development: (R. Bivand, Millo, and Piras 2021)

Spatial regression with spatial weights

Spatial autoregression models using spatial weights matrices were described in some detail using maximum likelihood estimation some time ago (A. D. Cliff and Ord 1973, 1981). A family of models were elaborated in spatial econometric terms extending earlier work, and in many cases using the simultaneous autoregressive framework and row standardization of spatial weights (L. Anselin 1988a). The simultaneous and conditional autoregressive frameworks can be compared, and both can be supplemented using case weights to reflect the relative importance of different observations (Waller and Gotway 2004).

Here we shall use the Boston housing data set, which has been restructured and furnished with census tract boundaries (R. Bivand 2017). The original data set used 506 census tracts and a hedonic model to try to estimate willingness to pay for clean air. The response was constructed from counts of ordinal answers to a 1970 census question about house value; the response is left and right censored in the census source. The key covariate was created from a calibrated meteorological model showing the annual nitrogen oxides (NOX) level for a smaller number of model output zones. The numbers of houses responding also varies by tract and model output zone. There are several other covariates, some measured at the tract level, some by town only, where towns broadly correspond to the air pollution model output zones.

library(sf)
## Linking to GEOS 3.10.1, GDAL 3.4.0, PROJ 8.2.0
library(spatialreg)
## Loading required package: spData
## Loading required package: Matrix
boston_506 <- st_read(system.file("shapes/boston_tracts.shp", package="spData")[1])
## Reading layer `boston_tracts' from data source 
##   `/home/rsb/lib/r_libs/spData/shapes/boston_tracts.shp' using driver `ESRI Shapefile'
## Simple feature collection with 506 features and 36 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -71.52311 ymin: 42.00305 xmax: -70.63823 ymax: 42.67307
## Geodetic CRS:  NAD27
nb_q <- spdep::poly2nb(boston_506)
lw_q <- spdep::nb2listw(nb_q, style="W")

We can start by reading in the 506 tract data set from spData, and creating a contiguity neighbour object and from that again a row standardized spatial weights object. If we examine the median house values, we find that they have been assigned as missing values, and that 17 tracts are affected.

table(boston_506$censored)
## 
##  left    no right 
##     2   489    15
summary(boston_506$median)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    5600   16800   21000   21749   24700   50000      17

Next, we can subset to the remaining 489 tracts with non-censored house values, and the neighbour object to match. The neighbour object now has one observation with no neighbours.

boston_506$CHAS <- as.factor(boston_506$CHAS)
boston_489 <- boston_506[!is.na(boston_506$median),]
nb_q_489 <- spdep::poly2nb(boston_489)
lw_q_489 <- spdep::nb2listw(nb_q_489, style="W", zero.policy=TRUE)

The NOX_ID variable specifies the upper level aggregation, letting us aggregate the tracts to air pollution model output zones. We can create aggregate neighbour and row standardized spatial weights objects, and aggregate the NOX variable taking means, and the CHAS Charles River dummy variable for observations on the river.

agg_96 <- list(as.character(boston_506$NOX_ID))
boston_96 <- aggregate(boston_506[, "NOX_ID"], by=agg_96, unique)
nb_q_96 <- spdep::poly2nb(boston_96)
lw_q_96 <- spdep::nb2listw(nb_q_96)
boston_96$NOX <- aggregate(boston_506$NOX, agg_96, mean)$x
boston_96$CHAS <- aggregate(as.integer(boston_506$CHAS)-1, agg_96, max)$x

The response is aggregated using the weightedMedian() function in matrixStats, and midpoint values for the house value classes. Counts of houses by value class were punched to check the published census values, which can be replicated using weightedMedian() at the tract level. Here we find two output zones with calculated weighted medians over the upper census question limit of USD 50,000, and remove them subsequently as they also are affected by not knowing the appropriate value to insert for the top class by value.

nms <- names(boston_506)
ccounts <- 23:31
for (nm in nms[c(22, ccounts, 36)]) {
  boston_96[[nm]] <- aggregate(boston_506[[nm]], agg_96, sum)$x
}
br2 <- c(3.50,  6.25,  8.75, 12.50, 17.50, 22.50, 30.00, 42.50, 60.00)*1000
counts <- as.data.frame(boston_96)[, nms[ccounts]]
f <- function(x) matrixStats::weightedMedian(x=br2, w=x, interpolate=TRUE)
boston_96$median <- apply(counts, 1, f)
is.na(boston_96$median) <- boston_96$median > 50000
summary(boston_96$median)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##    9009   20417   23523   25263   30073   49496       2

Before subsetting, we aggregate the remaining covariates by weighted mean using the tract population counts punched from the census (R. Bivand 2017).

boston_94 <- boston_96[!is.na(boston_96$median),]
nb_q_94 <- spdep::subset.nb(nb_q_96, !is.na(boston_96$median))
lw_q_94 <- spdep::nb2listw(nb_q_94, style="W")

We now have two data sets each at the lower, census tract level and the upper, air pollution model output zone level, one including the censored observations, the other excluding them.

The original model related the log of median house values by tract to the square of NOX values, including other covariates usually related to house value by tract, such as aggregate room counts, aggregate age, ethnicity, social status, distance to downtown and to the nearest radial road, a crime rate, and town-level variables reflecting land use (zoning, industry), taxation and education (R. Bivand 2017). This structure will be used here to exercise issues raised in fitting spatial regression models, including the presence of multiple levels.

form <- formula(log(median) ~ CRIM + ZN + INDUS + CHAS + I((NOX*10)^2) + I(RM^2) + 
                  AGE + log(DIS) + log(RAD) + TAX + PTRATIO + I(BB/100) + 
                  log(I(LSTAT/100)))

Before moving to presentations of issues raised in fitting spatial regression models, it is worth making a few further points. A recent review of spatial regression in a spatial econometrics setting is given by Kelejian and Piras (2017); note that their usage is to call the spatial coefficient of the lagged response \(\lambda\) and that of the lagged residuals \(\rho\), the reverse of other usage (L. Anselin 1988a; James P. LeSage and Pace 2009); here we use \(\rho_{\mathrm{Lag}}\) for the spatial coefficient in the spatial lag model, and \(\rho_{\mathrm{Err}}\) for the spatial error model. One interesting finding is that relatively dense spatial weights matrices may downweight model estimates, suggesting that sparser weights are preferable (Tony E. Smith 2009). Another useful finding is that the presence of residual spatial autocorrelation need not bias the estimates of variance of regression coefficients, provided that the covariates themselves do not exhibit spatial autocorrelation (T. E. Smith and Lee 2012). In general, however, the footprints of the spatial processes of the response and covariates may not be aligned, and if covariates and the residual are autocorrelated, it is likely that the estimates of variance of regression coefficients will be biassed downwards if attempts are not made to model the spatial processes.

In trying to model these spatial processes, we may choose to model the spatial autocorrelation in the residual with a spatial error model (SEM).

\[ {\mathbf y} = {\mathbf X}{\mathbf \beta} + {\mathbf u}, \qquad {\mathbf u} = \rho_{\mathrm{Err}} {\mathbf W} {\mathbf u} + {\mathbf \varepsilon}, \] where \({\mathbf y}\) is an \((N \times 1)\) vector of observations on a response variable taken at each of \(N\) locations, \({\mathbf X}\) is an \((N \times k)\) matrix of covariates, \({\mathbf \beta}\) is a \((k \times 1)\) vector of parameters, \({\mathbf u}\) is an \((N \times 1)\) spatially autocorrelated disturbance vector, \({\mathbf \varepsilon}\) is an \((N \times 1)\) vector of independent and identically distributed disturbances and \(\rho_{\mathrm{Err}}\) is a scalar spatial parameter.

If the processes in the covariates and the response match, we should find little difference between the coefficients of a least squares and a SEM, but very often they diverge, suggesting that a Hausman test for this condition should be employed (R. Pace and LeSage 2008). This may be related to earlier discussions of a spatial equivalent to the unit root and cointegration where spatial processes match (Fingleton 1999).

A model with a spatial process in the response only is termed a spatial lag model (SLM, often SAR - spatial autoregressive) (James P. LeSage and Pace 2009).

\[ {\mathbf y} = \rho_{\mathrm{Lag}} {\mathbf W}{\mathbf y} + {\mathbf X}{\mathbf \beta} + {\mathbf \varepsilon}, \] where \(\rho_{\mathrm{Lag}}\) is a scalar spatial parameter.

Work reviewed by Mur and Angulo (2006) on the Durbin model; the Durbin model adds the spatially lagged covariates to the covariates included in the spatial lag model, giving a spatial Durbin model (SDM) with different processes in the response and covariates:

\[ {\mathbf y} = \rho_{\mathrm{Lag}} {\mathbf W}{\mathbf y} + {\mathbf X}{\mathbf \beta} + {\mathbf W}{\mathbf X}{\mathbf \gamma} + {\mathbf \varepsilon}, \] where \({\mathbf \gamma}\) is a \((k' \times 1)\) vector of parameters. \(k'\) defines the subset of the intercept and covariates, often \(k' = k-1\) when using row standardised spatial weights and omitting the spatially lagged intercept.

This permits the spatial processes to be viewed and tested for as a Common Factor (Burridge 1981; R. S. Bivand 1984). The inclusion of spatially lagged covariates lets us check whether the same spatial process is manifest in the response and the covariates (SEM), whether they are different processes, or whether no process is detected. The Common Factor is present when \({\mathbf \gamma} = - \rho_{\mathrm{Lag}} {\mathbf \beta}\):

\[ {\mathbf y} = \rho_{\mathrm{Lag}} {\mathbf W}{\mathbf y} + {\mathbf X}{\mathbf \beta} - \rho_{\mathrm{Lag}} {\mathbf W}{\mathbf X} {\mathbf \beta} + {\mathbf \varepsilon}, \qquad ({\mathbf I} - \rho_{\mathrm{Lag}} {\mathbf W}){\mathbf y} = ({\mathbf I} - \rho_{\mathrm{Lag}} {\mathbf W}) {\mathbf X}{\mathbf \beta} + {\mathbf \varepsilon}, \] where \({\mathbf I}\) is the \(N \times N\) identity matrix, and \({\mathbf I} - \rho_{\mathrm{Lag}} {\mathbf W}\) is the Common Factor:

\[ \qquad {\mathbf y} = {\mathbf X}{\mathbf \beta} + ({\mathbf I} - \rho_{\mathrm{Lag}} {\mathbf W})^{-1} {\mathbf \varepsilon}, \qquad {\mathbf y} = {\mathbf X}{\mathbf \beta} + {\mathbf u}, \qquad {\mathbf u} = \rho_{\mathrm{Err}} {\mathbf W} {\mathbf u} + {\mathbf \varepsilon}, \]

If we extend this family with processes in the covariates and the residual, we get a spatial error Durbin model (SDEM). If it is chosen to admit a spatial process in the residuals in addition to a spatial process in the response, again two models are formed, a general nested model (GNM) nesting all the others, and a model without spatially lagged covariates (SAC, also known as SARAR - Spatial AutoRegressive-AutoRegressive model). If neither the residuals nor the residual are modelled with spatial processes, spatially lagged covariates may be added to a linear model, as a spatially lagged X model (SLX) (Elhorst 2010; Roger S. Bivand 2012; J. P. LeSage 2014; Halleck Vega and Elhorst 2015).

We can write the general nested model (GNM) as:

\[ {\mathbf y} = \rho_{\mathrm{Lag}} {\mathbf W}{\mathbf y} + {\mathbf X}{\mathbf \beta} + {\mathbf W}{\mathbf X}{\mathbf \gamma} + {\mathbf u}, \qquad {\mathbf u} = \rho_{\mathrm{Err}} {\mathbf W} {\mathbf u} + {\mathbf \varepsilon}, \]

This may be constrained to the double spatial coefficient model SAC/SARAR by setting \({\mathbf \gamma} = 0\), to the spatial Durbin (SDM) by setting \(\rho_{\mathrm{Err}} = 0\), and to the error Durbin model (SDEM) by setting \(\rho_{\mathrm{Lag}} = 0\). Imposing more conditions gives the spatial lag model (SLM) with \({\mathbf \gamma} = 0\) and \(\rho_{\mathrm{Err}} = 0\), the spatial error model (SEM) with \({\mathbf \gamma} = 0\) and \(\rho_{\mathrm{Lag}} = 0\), and the spatially lagged X model (SLX) with \(\rho_{\mathrm{Lag}} = 0\) and \(\rho_{\mathrm{Err}} = 0\).

Although making predictions for new locations for which covariates are observed was raised as an issue some time ago, it has many years to make progress in reviewing the possibilities (R. S. Bivand 2002; Goulard, Laurent, and Thomas-Agnan 2017). The prediction methods for SLM, SDM, SEM, SDEM, SAC and GNM models fitted with maximum likelihood were contributed as a Google Summer of Coding project by Martin Gubri. This work, and work on similar models with missing data (Suesse 2018) is also relevant for exploring censored median house values in the Boston data set. Work on prediction also exposed the importance of the reduced form of these models, in which the spatial process in the response interacts with the regression coefficients in the SLM, SDM, SAC and GNM models.

The consequence of these interactions is that a unit change in a covariate will only impact the response as the value of the regression coefficient if the spatial coefficient of the lagged response is zero. Where it is non-zero, global spillovers, impacts, come into play, and these impacts should be reported rather than the regression coefficients (James P. LeSage and Pace 2009; Elhorst 2010; Roger S. Bivand 2012; J. P. LeSage 2014; Halleck Vega and Elhorst 2015). Local impacts may be reported for SDEM and SLX models, using linear combination to calculate standard errors for the total impacts of each covariate (sums of coefficients on the covariates and their spatial lags).

This can be seen from the GNM data generation process:

\[ ({\mathbf I} - \rho_{\mathrm{Err}} {\mathbf W})({\mathbf I} - \rho_{\mathrm{Lag}} {\mathbf W}){\mathbf y} = ({\mathbf I} - \rho_{\mathrm{Err}} {\mathbf W})({\mathbf X}{\mathbf \beta} + {\mathbf W}{\mathbf X}{\mathbf \gamma}) + {\mathbf \varepsilon}, \] re-writing:

\[ {\mathbf y} = ({\mathbf I} - \rho_{\mathrm{Lag}} {\mathbf W})^{-1}({\mathbf X}{\mathbf \beta} + {\mathbf W}{\mathbf X}{\mathbf \gamma}) + ({\mathbf I} - \rho_{\mathrm{Lag}} {\mathbf W})^{-1}({\mathbf I} - \rho_{\mathrm{Err}} {\mathbf W})^{-1}{\mathbf \varepsilon}. \] There is interaction between the \(\rho_{\mathrm{Lag}}\) and \({\mathbf \beta}\) (and \({\mathbf \gamma}\) if present) coefficients. This can be seen from the partial derivatives: \(\partial y_i / \partial x_{jr} = (({\mathbf I} - \rho_{\mathrm{Lag}} {\mathbf W})^{-1} ({\mathbf I} \beta_r + {\mathbf W} \gamma_r))_{ij}\). This dense matrix \(S_r({\mathbf W}) = (({\mathbf I} - \rho_{\mathrm{Lag}} {\mathbf W})^{-1} ({\mathbf I} \beta_r + {\mathbf W} \gamma_r))\) expresses the direct impacts (effects) on its principal diagonal, and indirect impacts in off-diagonal elements.

Current work in the spatialreg package is focused on refining the handling of spatially lagged covariates using a consistent Durbin= argument taking either a logical value or a formula giving the subset of covariates to add in spatially lagged form. There is a speculation that some covariates, for example some dummy variables, should not be added in spatially lagged form. This then extends to handling these included spatially lagged covariates appropriately in calculating impacts. This work applies to cross-sectional models fitted using MCMC or maximum likelihood, and will offer facilities to spatial panel models.

It is worth mentioning the almost unexplored issues of functional form assumptions, for which flexible structures are useful, including spatial quantile regression presented in the McSpatial package (McMillen 2013). There are further issues with discrete response variables, covered by some functions in McSpatial, and in the spatialprobit and ProbitSpatial packages (Wilhelm and Matos 2013; Martinetti and Geniaux 2017); the MCMC implementations of the former are based on LeSage and Pace (2009). Finally, Wagner and Zeileis (2019) show how an SLM model may be used in the setting of recursive partitioning, with an implementation using spatialreg::lagsarlm() in the lagsarlmtree package.

Estimators

The review of cross-sectional maximum likelihood and generalized method of moments (GMM) estimators in spatialreg and sphet for spatial econometrics style spatial regression models by Bivand and Piras (2015) is still largely valid. In the review, estimators in these R packages were compared with alternative implementations available in other programming languages elsewhere. The review did not cover Bayesian spatial econometrics style spatial regression. More has changed with respect to spatial panel estimators described in Millo and Piras (2012), but will not be covered here.

Maximum likelihood

For models with single spatial coefficients (SEM and SDEM using errorsarlm(), SLM and SDM using lagsarlm()), the methods initially described by Ord (1975) are used. The following table shows the functions that can be used to estimate the models described above using maximum likelihood.

model model name maximum likelihood estimation function
SEM spatial error errorsarlm(..., Durbin=FALSE, ...)
SEM spatial error spautolm(..., family="SAR", ...)
SDEM spatial Durbin error errorsarlm(..., Durbin=TRUE, ...)
SLM spatial lag lagsarlm(..., Durbin=FALSE, ...)
SDM spatial Durbin lagsarlm(..., Durbin=TRUE, ...)
SAC spatial autoregressive combined sacsarlm(..., Durbin=FALSE, ...)
GNM general nested sacsarlm(..., Durbin=TRUE, ...)

The estimating functions errorsarlm() and lagsarlm() take similar arguments, where the first two, formula= and data= are shared by most model estimating functions. The third argument is a listw spatial weights object, while na.action= behaves as in other model estimating functions if the spatial weights can reasonably be subsetted to avoid observations with missing values. The weights= argument may be used to provide weights indicating the known degree of per-observation variability in the variance term - this is not available for lagsarlm().

args(errorsarlm)
## function (formula, data = list(), listw, na.action, weights = NULL, 
##     Durbin, etype, method = "eigen", quiet = NULL, zero.policy = NULL, 
##     interval = NULL, tol.solve = .Machine$double.eps, trs = NULL, 
##     control = list()) 
## NULL
args(lagsarlm)
## function (formula, data = list(), listw, na.action, Durbin, type, 
##     method = "eigen", quiet = NULL, zero.policy = NULL, interval = NULL, 
##     tol.solve = .Machine$double.eps, trs = NULL, control = list()) 
## NULL

The Durbin= argument replaces the earlier type= and etype= arguments, and if not given is taken as FALSE. If given, it may be FALSE, TRUE in which case all spatially lagged covariates are included, or a one-sided formula specifying which spatially lagged covariates should be included. The method= argument gives the method for calculating the log determinant term in the log likelihood function, and defaults to "eigen", suitable for moderately sized data sets. The interval= argument gives the bounds of the domain for the line search using stats::optimize() used for finding the spatial coefficient. The tol.solve() argument, passed through to base::solve(), was needed to handle data sets with differing numerical scales among the coefficients which hindered inversion of the variance-covariance matrix; the default value in base::solve() used to be much larger. The control= argument takes a list of control values to permit more careful adjustment of the running of the estimation function.

The spautolm() function also fits spatial regressions with the spatial process in the residuals, and takes a possibly poorly named family= argument, taking the values of "SAR" for the simultaneous autoregressive model (like errorsarlm()), "CAR" for the conditional autoregressive model, and "SMA" for the spatial moving average model.

args(spautolm)
## function (formula, data = list(), listw, weights, na.action, 
##     family = "SAR", method = "eigen", verbose = NULL, trs = NULL, 
##     interval = NULL, zero.policy = NULL, tol.solve = .Machine$double.eps, 
##     llprof = NULL, control = list()) 
## NULL

The sacsarlm() function may take second spatial weights and interval arguments if the spatial weights used to model the two spatial processes in the SAC and GNM specifications differ. By default, the same spatial weights are used. By default, stats::nlminb() is used for numerical optimization, using a heuristic to choose starting values.

args(sacsarlm)
## function (formula, data = list(), listw, listw2 = NULL, na.action, 
##     Durbin, type, method = "eigen", quiet = NULL, zero.policy = NULL, 
##     tol.solve = .Machine$double.eps, llprof = NULL, interval1 = NULL, 
##     interval2 = NULL, trs1 = NULL, trs2 = NULL, control = list()) 
## NULL

Where larger data sets are used, a numerical Hessian approach is used to calculate the variance-covariance matrix of coefficients, rather than an analytical asymptotic approach.

Apart from spautolm() which returns an "spautolm" object, the model fitting functions return "Sarlm" objects. Standard methods for fitted models are provided, such as summary():

args(getS3method("summary", "Sarlm"))
## function (object, correlation = FALSE, Nagelkerke = FALSE, Hausman = FALSE, 
##     adj.se = FALSE, ...) 
## NULL

The Nagelkerke= argument permits the return of a value approximately corresponding to a coefficient of determination, although the summary method anyway provides the value of stats::AIC() because a stats::logLik() method is provided for "sarlm" and "spautolm" objects. If the "sarlm" object is a SEM or SDEM, the Hausman test may be performed by setting Hausman=TRUE to see whether the regression coefficients are sufficiently like least squares coefficients, indicating absence of mis-specification from that source. As an example, we may fit SEM and SDEM to the 94 and 489 observation Boston data sets, and present the Hausman test results:

eigs_489 <- eigenw(lw_q_489)
SDEM_489 <- errorsarlm(form, data=boston_489, listw=lw_q_489, Durbin=TRUE, 
                       zero.policy=TRUE, control=list(pre_eig=eigs_489))
SEM_489 <- errorsarlm(form, data=boston_489, listw=lw_q_489, 
                      zero.policy=TRUE, control=list(pre_eig=eigs_489))
cbind(data.frame(model=c("SEM", "SDEM")), 
      rbind(broom::tidy(Hausman.test(SEM_489)), 
            broom::tidy(Hausman.test(SDEM_489))))[,1:4]
##   model statistic      p.value parameter
## 1   SEM   51.9862 2.827192e-06        14
## 2  SDEM   48.6551 6.481654e-03        27

Here we are using the control= list argument to pass through pre-computed eigenvalues for the default "eigen" method. Both test results for the 489 tract data set suggest that the regression coefficients do differ, perhaps that the footprints of the spatial processes do not match. Likelihood ratio tests of the spatial models against their least squares equivalents show that spatial process(es) are present, but we find that neither SEM not SDM are adequate representations.

cbind(data.frame(model=c("SEM", "SDEM")), 
      rbind(broom::tidy(LR1.Sarlm(SEM_489)), 
            broom::tidy(LR1.Sarlm(SDEM_489))))[,c(1, 4:6)]
##   model statistic p.value parameter
## 1   SEM  198.4133       0         1
## 2  SDEM  159.3798       0         1

For the 94 air pollution model output zones, the Hausman tests find little difference between coefficients, but this is related to the fact that the SEM and SDEM models add little to least squares or SLX using likelihood ratio tests.

eigs_94 <- eigenw(lw_q_94)
SDEM_94 <- errorsarlm(form, data=boston_94, listw=lw_q_94, Durbin=TRUE,
                      control=list(pre_eig=eigs_94))
SEM_94 <- errorsarlm(form, data=boston_94, listw=lw_q_94, control=list(pre_eig=eigs_94))
cbind(data.frame(model=c("SEM", "SDEM")), 
      rbind(broom::tidy(Hausman.test(SEM_94)), 
            broom::tidy(Hausman.test(SDEM_94))))[, 1:4]
##   model statistic   p.value parameter
## 1   SEM 15.657083 0.3347612        14
## 2  SDEM  9.205152 0.9994394        27
cbind(data.frame(model=c("SEM", "SDEM")), rbind(broom::tidy(LR1.Sarlm(SEM_94)), broom::tidy(LR1.Sarlm(SDEM_94))))[,c(1, 4:6)]
##   model statistic   p.value parameter
## 1   SEM 2.5933581 0.1073126         1
## 2  SDEM 0.2158487 0.6422213         1

We can use spatialreg::LR.sarlm() to apply a likelihood ratio test between nested models, but here choose lmtest::lrtest(), which gives the same results, preferring models including spatially lagged covariates.

broom::tidy(lmtest::lrtest(SEM_489, SDEM_489))
## Warning in tidy.anova(lmtest::lrtest(SEM_489, SDEM_489)): The following column
## names in ANOVA output were not recognized or transformed: X.Df, LogLik
## # A tibble: 2 × 5
##    X.Df LogLik    df statistic   p.value
##   <dbl>  <dbl> <dbl>     <dbl>     <dbl>
## 1    16   273.    NA      NA   NA       
## 2    29   311.    13      74.4  1.23e-10
broom::tidy(lmtest::lrtest(SEM_94, SDEM_94))
## Warning in tidy.anova(lmtest::lrtest(SEM_94, SDEM_94)): The following column
## names in ANOVA output were not recognized or transformed: X.Df, LogLik
## # A tibble: 2 × 5
##    X.Df LogLik    df statistic    p.value
##   <dbl>  <dbl> <dbl>     <dbl>      <dbl>
## 1    16   59.7    NA      NA   NA        
## 2    29   81.3    13      43.2  0.0000421

The SLX model is fitted using least squares, and also returns a log likelihood value, letting us test whether we need a spatial process in the residuals.

args(lmSLX)
## function (formula, data = list(), listw, na.action, weights = NULL, 
##     Durbin = TRUE, zero.policy = NULL) 
## NULL

In the tract data set we obviously do:

SLX_489 <- lmSLX(form, data=boston_489, listw=lw_q_489, zero.policy=TRUE)
broom::tidy(lmtest::lrtest(SLX_489, SDEM_489))
## Warning in modelUpdate(objects[[i - 1]], objects[[i]]): original model was of
## class "SlX", updated model is of class "Sarlm"
## Warning in tidy.anova(lmtest::lrtest(SLX_489, SDEM_489)): The following column
## names in ANOVA output were not recognized or transformed: X.Df, LogLik
## # A tibble: 2 × 5
##    X.Df LogLik    df statistic   p.value
##   <dbl>  <dbl> <dbl>     <dbl>     <dbl>
## 1    28   231.    NA       NA  NA       
## 2    29   311.     1      159.  1.55e-36

but in the output zone case we do not.

SLX_94 <- lmSLX(form, data=boston_94, listw=lw_q_94)
broom::tidy(lmtest::lrtest(SLX_94, SDEM_94))
## Warning in modelUpdate(objects[[i - 1]], objects[[i]]): original model was of
## class "SlX", updated model is of class "Sarlm"
## Warning in tidy.anova(lmtest::lrtest(SLX_94, SDEM_94)): The following column
## names in ANOVA output were not recognized or transformed: X.Df, LogLik
## # A tibble: 2 × 5
##    X.Df LogLik    df statistic p.value
##   <dbl>  <dbl> <dbl>     <dbl>   <dbl>
## 1    28   81.2    NA    NA      NA    
## 2    29   81.3     1     0.216   0.642

This outcome is sustained also when we use the counts of house units by tract and output zones as weights:

SLX_94w <- lmSLX(form, data=boston_94, listw=lw_q_94, weights=units)
SDEM_94w <- errorsarlm(form, data=boston_94, listw=lw_q_94, Durbin=TRUE, weights=units,
                       control=list(pre_eig=eigs_94))
broom::tidy(lmtest::lrtest(SLX_94w, SDEM_94w))
## Warning in modelUpdate(objects[[i - 1]], objects[[i]]): original model was of
## class "SlX", updated model is of class "Sarlm"
## Warning in tidy.anova(lmtest::lrtest(SLX_94w, SDEM_94w)): The following column
## names in ANOVA output were not recognized or transformed: X.Df, LogLik
## # A tibble: 2 × 5
##    X.Df LogLik    df statistic p.value
##   <dbl>  <dbl> <dbl>     <dbl>   <dbl>
## 1    28   97.5    NA    NA      NA    
## 2    29   98.0     1     0.917   0.338

Generalized method of moments

The estimation methods used for fitting SLM and the spatially lagged response part of SARAR models are based on instrumental variables, using the spatially lagged covariates (usually including lags of lags too) as instruments for the spatially lagged response (Piras 2010). This, and the use of spatially lagged covariates in the moment conditions for models including a spatial process in the residuals, means that including the spatially lagged covariates in the initial model is challenging, and functions in the sphet package do not provide a Durbin= argument. This makes it harder to accommodate data with multiple spatial process footprints. However, as Kelejian and Piras show in their recent review (2017), these approaches have other benefits, such as being able to instrument variables suspected of suffering from endogeneity or measurement error. Let us first compare the ML and GMM estimates of the SEM regression coefficients for rescaled squared NOX values. We use the sphet::spreg() wrapper function, and fit a SEM model. Extracting the matching rows from the summary objects for these models, we can see that the values are not dissimilar, despite the difference in estimation methods.

SEM1_94 <- sphet::spreg(form, data=boston_94, listw=lw_q_94, model="error")
res <- rbind(summary(SEM_94)$Coef["I((NOX * 10)^2)",], 
             summary(SEM1_94)$CoefTable["I((NOX * 10)^2)",])
rownames(res) <- c("ML", "GMM")
res
##         Estimate Std. Error   z value     Pr(>|z|)
## ML  -0.009564967  0.0026099 -3.664879 0.0002474555
## GMM -0.009744770  0.0026113 -3.731770 0.0001901391

Using sphet::spreg(), we can instrument the rescaled squared NOX variable, dropping it first from the formula, next creating the rescaled squared NOX variable as a column in the "sf" object, extracting a matrix of coordinates from the centroids of the output zones, and creating a one-sided instrument formula from a second order polynomial in the coordinates (here improperly, as they are not projected) and mean distances to downtown and radial roads. The endog= argument takes a one-sided formula for the variables to be modelled in the first stage of the model. Had we re-run the original air pollution model many times under slightly varying scenarios, we could have used an ensemble of NOX loadings to yield its distribution by output zone. Because this is not possible, we assume that the measurement error can be captured by using selected instruments. Unfortunately, the NOX regression coefficient estimate from the second stage has fallen substantially in absolute size, although the sign is unchanged.

formiv <- update(form, . ~ . - I((NOX*10)^2))
boston_94$NOX2 <- (boston_94$NOX*10)^2
suppressWarnings(ccoords <- st_coordinates(st_centroid(st_geometry(boston_94))))
iform <- formula(~poly(ccoords, degree=2) + DIS + RAD)
SEM1_94iv <- sphet::spreg(formiv, data=boston_94, listw=lw_q_94, endog = ~NOX2, 
                   instruments=iform, model="error")
summary(SEM1_94iv)$CoefTable["NOX2",]
##     Estimate   Std. Error      t-value     Pr(>|t|) 
## -0.001974509  0.005664411 -0.348581556  0.727403476

Handling measurement error in this or similar ways is one of the benefits of GMM estimation methods, although here the choice of instruments was somewhat arbitrary.

Markov chain Monte Carlo

The Spatial Econometrics Library is part of the extensive Matlab code repository at https://www.spatial-econometrics.com/ and documented in LeSage and Pace (2009). The Google Summer of Coding project in 2011 by Abhirup Mallik mentored by Virgilio Gómez-Rubio yielded translations of some of the model fitting functions for SEM, SDEM, SLM, SDM, SAC and GNM from the Matlab code. These have now been added to spatialreg as spBreg_err(), spBreg_lag() and spBreg_sac() with Durbin= arguments to handle the inclusion of spatially lagged covariates. As yet, heteroskedastic disturbances are not accommodated. The functions return "mcmc" objects as specified in the coda package, permitting the use of tools from coda for handling model output. The default burnin is 500 draws, followed by default 2000 draws returned, but these values and many others may be set through the control= list argument. Fitting the SDEM model for the output zones takes about an order of magnitude longer than using ML, but there is more work to do subsequently, and this difference scales more in the number of samples than covariates or observations.

system.time(SDEM_94B <- spBreg_err(form, data=boston_94, listw=lw_q_94, Durbin=TRUE))
##    user  system elapsed 
##   1.818   0.001   1.835
system.time(SDEM_489B <- spBreg_err(form, data=boston_489, listw=lw_q_489, Durbin=TRUE, zero.policy=TRUE))
##    user  system elapsed 
##   2.496   0.004   2.521

Most time in the ML case using eigenvalues is taken by log determinant setup and optimization, and by dense matrix asymptotic standard errors if chosen (always chosen with default "eigen" log determinant method):

t(errorsarlm(form, data=boston_94, listw=lw_q_94, Durbin=TRUE)$timings[,2])
##      set_up eigen_set_up eigen_opt coefs eigen_hcov eigen_se
## [1,]  0.004        0.005     0.006 0.003      0.002    0.001
t(errorsarlm(form, data=boston_489, listw=lw_q_489, Durbin=TRUE, zero.policy=TRUE)$timings[,2])
##      set_up eigen_set_up eigen_opt coefs eigen_hcov eigen_se
## [1,]  0.008        0.053     0.025 0.005      0.094    0.192

while in the MCMC case, the default use of Spatial Econometric Toolbox gridded log determinants and obviously sampling takes most time:

t(attr(SDEM_94B, "timings")[ , 3])
##      set_up SE_classic_set_up complete_setup sampling finalise
## [1,]  0.003             0.245          0.027    1.501    0.057
t(attr(SDEM_489B, "timings")[ , 3])
##      set_up SE_classic_set_up complete_setup sampling finalise
## [1,]  0.005              0.42          0.069    1.977     0.05

However, as we will see shortly, inference from model impacts may need sampling (where the spatially lagged response is included in the model), and in the case of MCMC models, the samples have already been drawn.

Handling the log determinant term

We will briefly touch on selected implementation details that applied users of spatial regression models would be wise to review. The handling of the log determinant term applies to all such users, while impacts are restricted to those employing spatial econometrics style models either including the spatially lagged response or including spatially lagged covariates.

It has been known for over twenty years that the sparse matrix representation of spatial weights overcomes the difficulties of fitting models with larger numbers of observations using maximum likelihood and MCMC where the log determinant term comes into play (R. K. Pace and Barry 1997a, 1997c, 1997d, 1997b). During the development of these approaches in model fitting functions in spatialreg, use was first made of C code also used in the S-PLUS SpatialStats module (Kaluzny et al. 1998), then SparseM which used a compressed sparse row form very similar to "nb" and "listw" objects. This was followed by the use of spam and Matrix methods, both of which mainly use compressed sparse column representations. Details are provided in Bivand, Hauke and Kossowski (2013).

The domain of the spatial coefficient(s) is given by the interval= argument to model fitting functions, and returned in the fitted object:

SEM_94$interval
## [1] -1.527257  1.000000

This case is trivial, because the upper bound is unity by definition, because of the use of row standardization. The interval is the inverse of the range of the eigenvalues of the weights matrix:

1/range(eigs_94)
## [1] -1.527257  1.000000

Finding the interval within which to search for the spatial coefficient is trivial for smaller data sets, but more complex for larger ones. It is possible to use heuristics implemented in lextrW() (D. Griffith, Bivand, and Chun 2015):

1/c(lextrW(lw_q_94))
##  lambda_n  lambda_1 
## -1.527257  1.000000

or RSpectra::eigs() after coercion to a Matrix package compressed sparse column representation:

W <- as(lw_q_94, "CsparseMatrix")
1/Re(c(RSpectra::eigs(W, k=1, which="SR")$values, 
       RSpectra::eigs(W, k=1, which="LR")$values))
## [1] -1.527257  1.000000

Why are we extracting the real part of the values returned by eigs()? Since nb_q_94 is symmetric, the row-standardized object lw_q_94 is symmetric by similarity, a result known since Ord (1975); consequently, we can take the real part without concern. Had the underlying neighbour relationships not been symmetric, we should be more careful.

The baseline log determinant term as given by Ord (1975) for a coefficient value proposed in sampling or during numerical optimization; this extract matches the "eigen" method (with or without control=list(pre_eig=...)"):

coef <- 0.5
sum(log(1 - coef * eigs_94))
## [1] -2.867292

Using sparse matrix functions from Matrix, the LU decomposition can be used for asymmetric matrices; this extract matches the "LU" method:

I <- Diagonal(nrow(boston_94))
LU <- lu(I - coef * W)
dU <- abs(diag(slot(LU, "U")))
sum(log(dU))
## [1] -2.867292

and Cholesky decomposition for symmetric matrices, with similar.listw() used to handle asymmetric weights that are similar to symmetric. The default value od super allows the underlying Matrix code to choose between supernodal or simplicial decomposition; this extract matches the "Matrix_J" method:

W <- as(similar.listw(lw_q_94), "CsparseMatrix")
super <- as.logical(NA)
cch <- Cholesky((I - coef * W), super=super)
c(2 * determinant(cch, logarithm = TRUE)$modulus)
## [1] -2.867292

The "Matrix" and "spam_update" methods are to be preferred as they pre-compute the fill-reducing permutation of the decomposition since the weights do not change for different values of the coefficient.

Maximum likelihood model fitting functions in spatialreg and splm use jacobianSetup() to populate env= environment with intermediate objects needed to find log determinants during optimization. Passing environments to objective functions is efficient because they are passed by reference rather than value. The con= argument passes through the populated control list, containing default values unless the key-value pairs were given in the function call (pre_eig= is extracted separately). The which= argument is 1 by default, but will also take 2 in SAC and GNM models. HSAR uses mcdet_setup() to set up Monte Carlo approximation terms.

args(jacobianSetup)
## function (method, env, con, pre_eig = NULL, trs = NULL, interval = NULL, 
##     which = 1) 
## NULL

For each value of coef, the do_ldet() function returns the log determinant, using the values stored in environment env=:

args(do_ldet)
## function (coef, env, which = 1) 
## NULL

As yet the Bayesian models are limited to control argument ldet_method="SE_classic" at present, using "LU" to generate a coarse grid of control argument nrho=200L log determinant values in the interval, spline interpolated to a finer grid of length control argument interpn=2000L, from which griddy Gibbs samples are drawn. It is hoped to add facilities to choose alternative methods in the future. This would offer possibilities to move beyond griddy Gibbs, but using gridded log determinant values seems reasonable at present.

Impacts

Global impacts have been seen as crucial for reporting results from fitting models including the spatially lagged response (SLM, SDM, SAC. GNM) for over ten years (James P. LeSage and Pace 2009). Extension to other models including spatially lagged covariates (SLX, SDEM) has followed (Elhorst 2010; Roger S. Bivand 2012; Halleck Vega and Elhorst 2015). In order to infer from the impacts, linear combination may be used for SLX and SDEM models. For SLM, SDM, SAC and GNM models fitted with maximum likelihood or GMM, the variance-covariance matrix of the coefficients is available, and can be used to make random draws from a multivariate Normal distribution with mean set to coefficient values and variance to the estimated variance-covariance matrix. For these models fitted using Bayesian methods, draws are already available. In the SDEM case, the draws on the regression coefficients of the unlagged covariates represent direct impacts, and draws on the coefficients of the spatially lagged covariates represent indirect impacts, and their by-draw sums the total impacts.

Impacts are calculated using model object class specific impacts() methods, here taking the method for "sarlm" objects as an example. In the sphet package, the impacts method for "gstsls" uses the spatialreg impacts() framework, as does the splm package for "splm" fitted model objects. impacts() methods require either a tr= - a vector of traces of the power series of the weights object typically computed with trW() or a listw= argument. If listw= is given, dense matrix methods are used. The evalues= argument is experimental, does not yet work for all model types, and takes the eigenvalues of the weights matrix. The R= argument gives the number of samples to be taken from the fitted model. The Q= permits the decomposition of impacts to components of the power series of the weights matrix (James P. LeSage and Pace 2009).

args(getS3method("impacts", "Sarlm"))
## function (obj, ..., tr = NULL, R = NULL, listw = NULL, evalues = NULL, 
##     useHESS = NULL, tol = 1e-06, empirical = FALSE, Q = NULL) 
## NULL

The summary method for the output of impacts() methods where inference from samples was requested by default uses the summary() method for "mcmc" objects defined in the coda package. It can instead report just matrices of standard errors, z-values and p-values by setting zstats= and short= to TRUE.

args(getS3method("summary", "LagImpact"))
## function (object, ..., zstats = FALSE, short = FALSE, reportQ = NULL) 
## NULL

Since sampling is not required for inference for SLX and SDEM models, linear combination is used for models fitted using maximum likelihood; results are shown here for the air pollution variable only. The literature has not yet resolved the question of how to report model output, as each covariate is now represented by three impacts. Where spatially lagged covariates are included, two coefficients are replaced by three impacts.

sum_imp_94_SDEM <- summary(impacts(SDEM_94))
rbind(Impacts=sum_imp_94_SDEM$mat[5,], SE=sum_imp_94_SDEM$semat[5,])
##               Direct     Indirect        Total
## Impacts -0.012764045 -0.018454532 -0.031218577
## SE       0.002354762  0.004717734  0.005303962

The impacts from the same model fitted by MCMC are very similar:

sum_imp_94_SDEM_B <- summary(impacts(SDEM_94B))
rbind(Impacts=sum_imp_94_SDEM_B$mat[5,], SE=sum_imp_94_SDEM_B$semat[5,])
##               Direct     Indirect        Total
## Impacts -0.012791651 -0.017728819 -0.030520470
## SE       0.002460167  0.005119155  0.006077588

as also are those from the SLX model. In the SLX and SDEM models, the direct impacts are the consequences for the response of changes in air pollution in the same observational entity, and the indirect (local) impacts are the consequences for the response of changes in air pollution in neighbouring observational entities.

sum_imp_94_SLX <- summary(impacts(SLX_94))
rbind(Impacts=sum_imp_94_SLX$mat[5,], SE=sum_imp_94_SLX$semat[5,])
##               Direct     Indirect        Total
## Impacts -0.012766473 -0.018743765 -0.031510238
## SE       0.002796966  0.005564633  0.006114058

In contrast to local indirect impacts in SLX and SDEM models, global indirect impacts are found in models including the spatially lagged response. For purposes of exposition, let us fit an SLM:

SLM_489 <- lagsarlm(form, data=boston_489, listw=lw_q_489, zero.policy=TRUE)

Traces of the first m= matrices of the power series in the spatial weights are pre-computed to speed up inference from samples from the fitted model, and from existing MCMC samples (James P. LeSage and Pace 2009). The traces can also be used in other contexts too, so their pre-computation may be worthwhile anyway. The type= argument is "mult" by default, but may also be set to "MC" for Monte Carlo simulation or "moments" using a space-saving looping algorithm.

args(trW)
## function (W = NULL, m = 30, p = 16, type = "mult", listw = NULL, 
##     momentsSymmetry = TRUE) 
## NULL
W <- as(lw_q_489, "CsparseMatrix")
tr_489 <- trW(W)
str(tr_489)
##  num [1:30] 0 90.8 29.4 42.1 26.5 ...
##  - attr(*, "timings")= Named num [1:2] 0.173 0.18
##   ..- attr(*, "names")= chr [1:2] "user.self" "elapsed"
##  - attr(*, "type")= chr "mult"
##  - attr(*, "n")= int 489

In this case, the spatial process in the response is not strong, so the global indirect impacts (here for the air pollution variable) are weak.

SLM_489_imp <- impacts(SLM_489, tr=tr_489, R=2000)
SLM_489_imp_sum <- summary(SLM_489_imp, short=TRUE, zstats=TRUE)
res <- rbind(Impacts=sapply(SLM_489_imp$res, "[", 5), SE=SLM_489_imp_sum$semat[5,])
colnames(res) <- c("Direct", "Indirect", "Total")
res
##               Direct      Indirect        Total
## Impacts -0.005930731 -1.014744e-05 -0.005940879
## SE       0.001048725  1.010022e-04  0.001052947

Of more interest is trying to reconstruct the direct and total impacts using dense matrix methods; the direct global impacts are the mean of the diagonal of the dense impacts matrix, and the total global impacts are the sum of all matrix elements divided by the number of observations. The direct impacts agree, but the total impacts differ slightly.

coef_SLM_489 <- coef(SLM_489)
IrW <- Diagonal(489) - coef_SLM_489[1] * W
S_W <- solve(IrW)
S_NOX_W <- S_W %*% (diag(489) * coef_SLM_489[7])
c(Direct=mean(diag(S_NOX_W)), Total=sum(S_NOX_W)/489)
##       Direct        Total 
## -0.005930731 -0.005940858

This bare-bones approach corresponds to using the listw= argument, and as expected gives the same output.

sapply(impacts(SLM_489, listw=lw_q_489), "[", 5)
##        direct      indirect         total 
## -5.930731e-03 -1.014744e-05 -5.940879e-03

The experimental evalues= approach which is known to be numerically exact by definition gives the same results as the matrix power series trace approach, so the slight difference may be attributed to the consequences of inverting the spatial process matrix.

sapply(impacts(SLM_489, evalues=eigs_489), "[", 5)
##        direct      indirect         total 
## -5.930731e-03 -1.014744e-05 -5.940879e-03

Impacts are crucial to the interpretation of Durbin models including spatially lagged covariates and models including the spatially lagged response. Tools to calculate impacts and their inferential bases are now available, and should be employed, but as yet some implementation details are under development and ways of presenting results in tabular form have not reached maturity.

Predictions

We will use the predict() method for "sarlm" objects to double-check impacts, here for the pupil-teacher ratio (PTRATIO). The method was re-written by Martin Gubri based on Goulard, Laurent and Thomas-Agnan (2017). The pred.type= argument specifies the prediction strategy among those presented in the article.

args(getS3method("predict", "sarlm"))
## function (object, newdata = NULL, listw = NULL, pred.type = "TS", 
##     all.data = FALSE, zero.policy = NULL, legacy = TRUE, legacy.mixed = FALSE, 
##     power = NULL, order = 250, tol = .Machine$double.eps^(3/5), 
##     spChk = NULL, ...) 
## NULL

First we’ll increment PTRATIO by one to show that, using least squares, the mean difference between predictions from the incremented new data and fitted values is equal to the regression coefficient.

nd_489 <- boston_489
nd_489$PTRATIO <- nd_489$PTRATIO + 1
OLS_489 <- lm(form, data=boston_489)
fitted <- predict(OLS_489)
nd_fitted <- predict(OLS_489, newdata=nd_489)
all.equal(unname(coef(OLS_489)[12]), mean(nd_fitted - fitted))
## [1] TRUE

In models including the spatially lagged response, and when the spatial coefficient in different from zero, this is not the case in general, and is why we need impacts() methods. The difference here is not great, but neither is it zero, and needs to be handled.

fitted <- predict(SLM_489)
## This method assumes the response is known - see manual page
nd_fitted <- predict(SLM_489, newdata=nd_489, listw=lw_q_489, pred.type="TS",
                     zero.policy=TRUE)
all.equal(unname(coef_SLM_489[13]), mean(nd_fitted - fitted))
## [1] "Mean relative difference: 0.001783081"

In the Boston tracts data set, 17 observations of median house values, the response, are censored. Using these as an example and comparing some pred.type= variants for the SDEM model and predicting out-of-sample, we can see that there are differences, suggesting that this is a fruitful area for study. There have been a number of alternative proposals for handling missing variables (Gómez-Rubio, Bivand, and Rue 2015; Suesse 2018). Another reason for increasing attention on prediction is that it is fundamental for machine learning approaches, in which prediction for validation and test data sets drives model specification choice. The choice of training and other data sets with dependent spatial data remains an open question, and is certainly not as simple as with independent data.

Here, we’ll list the predictions for the censored tract observations using three different prediction types, taking the exponent to get back to the USD median house values.

nd <- boston_506[is.na(boston_506$median),]
t0 <- exp(predict(SDEM_489, newdata=nd, listw=lw_q, pred.type="TS", zero.policy=TRUE))
suppressWarnings(t1  <- exp(predict(SDEM_489, newdata=nd, listw=lw_q, pred.type="KP2",
                                    zero.policy=TRUE)))
suppressWarnings(t2  <- exp(predict(SDEM_489, newdata=nd, listw=lw_q, pred.type="KP5",
                                    zero.policy=TRUE)))
data.frame(fit_TS=t0[,1], fit_KP2=c(t1), fit_KP5=c(t2),
           censored=boston_506$censored[as.integer(attr(t0, "region.id"))])
##        fit_TS   fit_KP2   fit_KP5 censored
## 13  23912.450 29477.240 28147.151    right
## 14  28125.588 27001.192 28516.159    right
## 15  30553.380 36184.007 32476.053    right
## 17  18518.448 19620.801 18878.244    right
## 43   9563.613  6816.721  7561.353     left
## 50   8371.200  7196.445  7383.139     left
## 312 51477.038 53300.877 54172.605    right
## 313 45920.630 45823.263 47094.833    right
## 314 44195.663 44585.510 45361.371    right
## 317 43427.407 45706.682 45442.461    right
## 337 39878.563 42071.842 41126.925    right
## 346 44708.480 46693.655 46107.827    right
## 355 48187.620 49067.627 48910.651    right
## 376 42881.289 45883.193 44965.674    right
## 408 44293.862 44615.008 45669.977    right
## 418 38210.776 43374.808 41913.830    right
## 434 41647.159 41690.222 42397.968    right

The ACS US census tracts data set

Let us read in again the ACS data set with over 70,000 observations:

library(sf)
df_tracts <- st_read("df_tracts.gpkg")
## Reading layer `df_tracts' from data source 
##   `/home/rsb/und/ecs530/2021/df_tracts.gpkg' using driver `GPKG'
## Simple feature collection with 71353 features and 28 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -124.7332 ymin: 24.5447 xmax: -66.9499 ymax: 49.38436
## Geodetic CRS:  NAD83
set.ZeroPolicyOption(TRUE)
## [1] FALSE
spdep::set.ZeroPolicyOption(TRUE)
## [1] FALSE
nb_subset <- readRDS("nb_subset.rds")
lw <- spdep::nb2listw(nb_subset, style="W")

In the original study, logarithms of the median income CV were thought to be driven by a slightly broader set of covariates. Here we take logs of the vacancy rate, the elderly rate, the Black and Hispanic rates, the population density, and group quarters population (institutional like prisons, non-institutional like student accommodation https://www.census.gov/newsroom/releases/archives/2010_census/cb11-tps13.html).

tr_form <- log(med_inc_cv) ~ log1p(vacancy_rate) + log1p(old_rate) + log1p(black_rate) + log1p(hisp_rate) + log1p(group_pop) + log1p(dens)

The available covariates are not the same as in the original article, but our focus here is to show that larger data sets can be used in analysis:

lm_mod <- lm(tr_form, data=df_tracts)
summary(lm_mod)
## 
## Call:
## lm(formula = tr_form, data = df_tracts)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -3.14822 -0.29445  0.01038  0.30068  2.99263 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -6.2284926  0.0355078 -175.41   <2e-16 ***
## log1p(vacancy_rate)  0.4039100  0.0248345   16.26   <2e-16 ***
## log1p(old_rate)      0.4069179  0.0040408  100.70   <2e-16 ***
## log1p(black_rate)    0.5478924  0.0110139   49.75   <2e-16 ***
## log1p(hisp_rate)     0.5854893  0.0121825   48.06   <2e-16 ***
## log1p(group_pop)     0.0216907  0.0007901   27.45   <2e-16 ***
## log1p(dens)          0.0354277  0.0018286   19.37   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4526 on 71346 degrees of freedom
## Multiple R-squared:  0.2296, Adjusted R-squared:  0.2295 
## F-statistic:  3543 on 6 and 71346 DF,  p-value: < 2.2e-16

The least squares residuals show strong positive spatial autocorrelation:

spdep::lm.morantest(lm_mod, lw)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = tr_form, data = df_tracts)
## weights: lw
## 
## Moran I statistic standard deviate = 50.637, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     1.117328e-01    -6.347763e-05     4.874291e-06

Adding the spatially lagged covariates in SLX model form might make sense:

SLX <- lmSLX(tr_form, data=df_tracts, listw=lw, zero.policy=TRUE)
summary(SLX)
## 
## Call:
## lm(formula = formula(paste("y ~ ", paste(colnames(x)[-1], collapse = "+"))), 
##     data = as.data.frame(x), weights = weights)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -3.14224 -0.29424  0.01015  0.30057  2.94089 
## 
## Coefficients:
##                          Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)             -6.091034   0.054824 -111.102  < 2e-16 ***
## log1p.vacancy_rate.      0.204452   0.039726    5.147 2.66e-07 ***
## log1p.old_rate.          0.402543   0.004422   91.026  < 2e-16 ***
## log1p.black_rate.        0.493457   0.027068   18.230  < 2e-16 ***
## log1p.hisp_rate.         0.607203   0.033922   17.900  < 2e-16 ***
## log1p.group_pop.         0.019748   0.000834   23.678  < 2e-16 ***
## log1p.dens.              0.010350   0.004203    2.463  0.01380 *  
## lag.log1p.vacancy_rate.  0.374812   0.050936    7.359 1.88e-13 ***
## lag.log1p.old_rate.     -0.016907   0.006436   -2.627  0.00861 ** 
## lag.log1p.black_rate.    0.053985   0.030672    1.760  0.07840 .  
## lag.log1p.hisp_rate.    -0.039247   0.037392   -1.050  0.29390    
## lag.log1p.group_pop.     0.012803   0.001529    8.374  < 2e-16 ***
## lag.log1p.dens.          0.033688   0.004742    7.105 1.22e-12 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4521 on 71340 degrees of freedom
## Multiple R-squared:  0.2315, Adjusted R-squared:  0.2314 
## F-statistic:  1791 on 12 and 71340 DF,  p-value: < 2.2e-16

The level of residual spatial autocorrelation is effectively unchanged, perhaps suggesting that the spatial footprint of the autocorrelation process(es) may not be well-represented by the chosen definition of tract neighbours:

spdep::lm.morantest(SLX, lw)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = formula(paste("y ~ ", paste(colnames(x)[-1],
## collapse = "+"))), data = as.data.frame(x), weights = weights)
## weights: lw
## 
## Moran I statistic standard deviate = 50.563, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     1.115597e-01    -7.287566e-05     4.874367e-06

The linear combinations of unlagged and lagged covariates give these impacts, many pulling the same way:

summary(spatialreg::impacts(SLX))
## Impact measures (SlX, estimable, n-k):
##                         Direct    Indirect      Total
## log1p(vacancy_rate) 0.20445207  0.37481174 0.57926381
## log1p(old_rate)     0.40254315 -0.01690728 0.38563587
## log1p(black_rate)   0.49345676  0.05398505 0.54744181
## log1p(hisp_rate)    0.60720345 -0.03924693 0.56795651
## log1p(group_pop)    0.01974814  0.01280282 0.03255096
## log1p(dens)         0.01035030  0.03368823 0.04403853
## ========================================================
## Standard errors:
##                           Direct    Indirect       Total
## log1p(vacancy_rate) 0.0397260202 0.050935722 0.032647455
## log1p(old_rate)     0.0044223032 0.006435572 0.006346559
## log1p(black_rate)   0.0270685196 0.030671611 0.012643255
## log1p(hisp_rate)    0.0339216058 0.037391531 0.013710963
## log1p(group_pop)    0.0008340256 0.001528895 0.001507331
## log1p(dens)         0.0042030102 0.004741785 0.002155113
## ========================================================
## Z-values:
##                        Direct  Indirect    Total
## log1p(vacancy_rate)  5.146553  7.358524 17.74300
## log1p(old_rate)     91.025678 -2.627160 60.76298
## log1p(black_rate)   18.229913  1.760098 43.29912
## log1p(hisp_rate)    17.900198 -1.049621 41.42353
## log1p(group_pop)    23.678098  8.373907 21.59510
## log1p(dens)          2.462592  7.104546 20.43444
## 
## p-values:
##                     Direct     Indirect   Total     
## log1p(vacancy_rate) 2.6532e-07 1.8585e-13 < 2.22e-16
## log1p(old_rate)     < 2.22e-16 0.0086101  < 2.22e-16
## log1p(black_rate)   < 2.22e-16 0.0783911  < 2.22e-16
## log1p(hisp_rate)    < 2.22e-16 0.2938925  < 2.22e-16
## log1p(group_pop)    < 2.22e-16 < 2.22e-16 < 2.22e-16
## log1p(dens)         0.013794   1.2073e-12 < 2.22e-16
summary(df_tracts$tot_pop)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##     501    2925    4028    4283    5341   37452
quantile(df_tracts$group_pop, seq(0, 1, 0.1))
##    0%   10%   20%   30%   40%   50%   60%   70%   80%   90%  100% 
##     0     0     0     0     2     7    16    40    91   187 19496
plot(df_tracts$group_pop, df_tracts$group_pop/df_tracts$tot_pop)

plot(df_tracts$tot_pop, df_tracts$group_pop/df_tracts$tot_pop)

Weighted least squares use the range of the weights to assign tracts with larger populations less residual variance, but here the group quarters populations play in too:

lm_modw <- lm(tr_form, data=df_tracts, weights=tot_pop)
summary(lm_modw)
## 
## Call:
## lm(formula = tr_form, data = df_tracts, weights = tot_pop)
## 
## Weighted Residuals:
##      Min       1Q   Median       3Q      Max 
## -165.025  -18.290    0.634   18.582  206.188 
## 
## Coefficients:
##                       Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         -6.2252852  0.0349852 -177.94   <2e-16 ***
## log1p(vacancy_rate)  0.5567749  0.0272130   20.46   <2e-16 ***
## log1p(old_rate)      0.4053828  0.0040674   99.67   <2e-16 ***
## log1p(black_rate)    0.5058137  0.0113856   44.43   <2e-16 ***
## log1p(hisp_rate)     0.5708457  0.0114385   49.91   <2e-16 ***
## log1p(group_pop)     0.0239794  0.0007418   32.33   <2e-16 ***
## log1p(dens)          0.0336080  0.0017827   18.85   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 28.5 on 71346 degrees of freedom
## Multiple R-squared:  0.2204, Adjusted R-squared:  0.2203 
## F-statistic:  3361 on 6 and 71346 DF,  p-value: < 2.2e-16
spdep::lm.morantest(lm_modw, lw)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = tr_form, data = df_tracts, weights = tot_pop)
## weights: lw
## 
## Moran I statistic standard deviate = 49.407, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     1.090160e-01    -6.281631e-05     4.874287e-06

We can try the same with other models, such as SLX, but again with little change:

SLXw <- lmSLX(tr_form, weights=tot_pop, data=df_tracts, listw=lw, zero.policy=TRUE)
summary(SLXw)
## 
## Call:
## lm(formula = formula(paste("y ~ ", paste(colnames(x)[-1], collapse = "+"))), 
##     data = as.data.frame(x), weights = weights)
## 
## Weighted Residuals:
##      Min       1Q   Median       3Q      Max 
## -164.762  -18.250    0.628   18.602  212.170 
## 
## Coefficients:
##                           Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)             -6.2010197  0.0542376 -114.331  < 2e-16 ***
## log1p.vacancy_rate.      0.2900525  0.0426211    6.805 1.02e-11 ***
## log1p.old_rate.          0.3983487  0.0044799   88.919  < 2e-16 ***
## log1p.black_rate.        0.4647722  0.0276407   16.815  < 2e-16 ***
## log1p.hisp_rate.         0.5933632  0.0325690   18.219  < 2e-16 ***
## log1p.group_pop.         0.0212821  0.0007877   27.019  < 2e-16 ***
## log1p.dens.              0.0168140  0.0040944    4.107 4.02e-05 ***
## lag.log1p.vacancy_rate.  0.4325027  0.0518089    8.348  < 2e-16 ***
## lag.log1p.old_rate.     -0.0019333  0.0065644   -0.295    0.768    
## lag.log1p.black_rate.    0.0407380  0.0311804    1.307    0.191    
## lag.log1p.hisp_rate.    -0.0310037  0.0358891   -0.864    0.388    
## lag.log1p.group_pop.     0.0164220  0.0014939   10.993  < 2e-16 ***
## lag.log1p.dens.          0.0231145  0.0046590    4.961 7.02e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 28.45 on 71340 degrees of freedom
## Multiple R-squared:  0.2228, Adjusted R-squared:  0.2226 
## F-statistic:  1704 on 12 and 71340 DF,  p-value: < 2.2e-16
spdep::lm.morantest(SLXw, lw)
## 
##  Global Moran I for regression residuals
## 
## data:  
## model: lm(formula = formula(paste("y ~ ", paste(colnames(x)[-1],
## collapse = "+"))), data = as.data.frame(x), weights = weights)
## weights: lw
## 
## Moran I statistic standard deviate = 49.483, p-value < 2.2e-16
## alternative hypothesis: greater
## sample estimates:
## Observed Moran I      Expectation         Variance 
##     1.091760e-01    -7.215477e-05     4.874344e-06
summary(spatialreg::impacts(SLXw))
## Impact measures (SlX, estimable, n-k):
##                         Direct     Indirect      Total
## log1p(vacancy_rate) 0.29005246  0.432502675 0.72255514
## log1p(old_rate)     0.39834869 -0.001933313 0.39641538
## log1p(black_rate)   0.46477224  0.040737989 0.50551022
## log1p(hisp_rate)    0.59336317 -0.031003701 0.56235946
## log1p(group_pop)    0.02128214  0.016422044 0.03770419
## log1p(dens)         0.01681396  0.023114529 0.03992849
## ========================================================
## Standard errors:
##                           Direct    Indirect       Total
## log1p(vacancy_rate) 0.0426210645 0.051808861 0.033980108
## log1p(old_rate)     0.0044798951 0.006564424 0.006325152
## log1p(black_rate)   0.0276406965 0.031180367 0.012944562
## log1p(hisp_rate)    0.0325690346 0.035889146 0.012845410
## log1p(group_pop)    0.0007876827 0.001493860 0.001457527
## log1p(dens)         0.0040943629 0.004658979 0.002114653
## ========================================================
## Z-values:
##                        Direct   Indirect    Total
## log1p(vacancy_rate)  6.805378  8.3480444 21.26406
## log1p(old_rate)     88.919201 -0.2945138 62.67286
## log1p(black_rate)   16.814780  1.3065269 39.05194
## log1p(hisp_rate)    18.218629 -0.8638740 43.77902
## log1p(group_pop)    27.018677 10.9930292 25.86861
## log1p(dens)          4.106612  4.9612865 18.88181
## 
## p-values:
##                     Direct     Indirect   Total     
## log1p(vacancy_rate) 1.0078e-11 < 2.22e-16 < 2.22e-16
## log1p(old_rate)     < 2.22e-16 0.76837    < 2.22e-16
## log1p(black_rate)   < 2.22e-16 0.19137    < 2.22e-16
## log1p(hisp_rate)    < 2.22e-16 0.38766    < 2.22e-16
## log1p(group_pop)    < 2.22e-16 < 2.22e-16 < 2.22e-16
## log1p(dens)         4.0151e-05 7.0028e-07 < 2.22e-16

With a larger number of observations, using eigenvalues to calculate the Jacobian is not possible, so we use a pre-computed Cholesky decomposition:

SEM <- spatialreg::errorsarlm(tr_form, data=df_tracts, listw=lw, method="Matrix", zero.policy=TRUE)
apply(SEM$timings, 2, sum)
## user.self   elapsed 
##     4.857     5.050
summary(SEM, Hausman=TRUE, Nagelkerke=TRUE)
## 
## Call:spatialreg::errorsarlm(formula = tr_form, data = df_tracts, listw = lw, 
##     method = "Matrix", zero.policy = TRUE)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -3.161891 -0.287439  0.010315  0.296182  2.979651 
## 
## Type: error 
## Regions with no neighbours included:
##  12562 26251 28654 28696 31900 32259 42437 45571 45614 51434 57159 57306 68844 68898 68901 69293 69294 
## Coefficients: (asymptotic standard errors) 
##                        Estimate  Std. Error  z value  Pr(>|z|)
## (Intercept)         -6.25184400  0.03703706 -168.800 < 2.2e-16
## log1p(vacancy_rate)  0.35134711  0.02829437   12.418 < 2.2e-16
## log1p(old_rate)      0.41108547  0.00418162   98.308 < 2.2e-16
## log1p(black_rate)    0.54672194  0.01341739   40.747 < 2.2e-16
## log1p(hisp_rate)     0.59289677  0.01501856   39.478 < 2.2e-16
## log1p(group_pop)     0.02045655  0.00080153   25.522 < 2.2e-16
## log1p(dens)          0.03120040  0.00220509   14.149 < 2.2e-16
## 
## Lambda: 0.25388, LR test value: 2082.7, p-value: < 2.22e-16
## Approximate (numerical Hessian) standard error: 0.0079519
##     z-value: 31.927, p-value: < 2.22e-16
## Wald statistic: 1019.3, p-value: < 2.22e-16
## 
## Log likelihood: -43642.84 for error model
## ML residual variance (sigma squared): 0.19674, (sigma: 0.44356)
## Nagelkerke pseudo-R-squared: 0.25173 
## Number of observations: 71353 
## Number of parameters estimated: 9 
## AIC: 87304, (AIC for lm: 89384)
## Hausman test: 171.45, df: 7, p-value: < 2.22e-16

We can also use single-chain MCMC with default 2500 draws, of which 500 are treated as burn-in and discarded; much of the time is spent on pre-computing a coarse grid of LU-decomposition Jacobian terms:

set.seed(1)
BSEM <- spatialreg::spBreg_err(tr_form, data=df_tracts, listw=lw, zero.policy=TRUE)
apply(attr(BSEM, "timings")[,c(1,3)], 2, sum)
## user.self   elapsed 
##   115.859   118.196

As we know, the model MCMC output is held in an "mcmc" object defined in coda:

library(coda)
summary(BSEM)
## 
## Iterations = 501:2500
## Thinning interval = 1 
## Number of chains = 1 
## Sample size per chain = 2000 
## 
## 1. Empirical mean and standard deviation for each variable,
##    plus standard error of the mean:
## 
##                         Mean        SD  Naive SE Time-series SE
## (Intercept)         -6.25240 0.0368774 8.246e-04      8.246e-04
## log1p(vacancy_rate)  0.35138 0.0288813 6.458e-04      6.458e-04
## log1p(old_rate)      0.41115 0.0041649 9.313e-05      9.313e-05
## log1p(black_rate)    0.54719 0.0137979 3.085e-04      3.085e-04
## log1p(hisp_rate)     0.59254 0.0154012 3.444e-04      3.444e-04
## log1p(group_pop)     0.02045 0.0007997 1.788e-05      1.929e-05
## log1p(dens)          0.03117 0.0022222 4.969e-05      4.969e-05
## lambda               0.25291 0.0053490 1.196e-04      1.196e-04
## sige                 0.20019 0.0012118 2.710e-05      2.710e-05
## 
## 2. Quantiles for each variable:
## 
##                         2.5%      25%      50%      75%    97.5%
## (Intercept)         -6.32307 -6.27707 -6.25300 -6.22796 -6.17848
## log1p(vacancy_rate)  0.29587  0.33131  0.35145  0.37149  0.40597
## log1p(old_rate)      0.40290  0.40839  0.41116  0.41394  0.41910
## log1p(black_rate)    0.51979  0.53798  0.54742  0.55630  0.57370
## log1p(hisp_rate)     0.56259  0.58228  0.59234  0.60257  0.62334
## log1p(group_pop)     0.01887  0.01990  0.02045  0.02099  0.02201
## log1p(dens)          0.02684  0.02966  0.03119  0.03271  0.03541
## lambda               0.24262  0.24962  0.25263  0.25663  0.26363
## sige                 0.19800  0.19944  0.20018  0.20088  0.20234

Diagnostic tests from coda can also be used:

raftery.diag(BSEM, r=0.01)
## 
## Quantile (q) = 0.025
## Accuracy (r) = +/- 0.01
## Probability (s) = 0.95 
##                                                            
##                      Burn-in  Total Lower bound  Dependence
##                      (M)      (N)   (Nmin)       factor (I)
##  (Intercept)         2        930   937          0.993     
##  log1p(vacancy_rate) 2        950   937          1.010     
##  log1p(old_rate)     2        930   937          0.993     
##  log1p(black_rate)   2        913   937          0.974     
##  log1p(hisp_rate)    2        892   937          0.952     
##  log1p(group_pop)    2        930   937          0.993     
##  log1p(dens)         2        950   937          1.010     
##  lambda              1        1190  937          1.270     
##  sige                2        930   937          0.993
plot(BSEM, trace=FALSE)

We can also use weights with SEM and conditioned SEM models:

SEMw <- spatialreg::errorsarlm(tr_form, weights=tot_pop, data=df_tracts, listw=lw, method="Matrix", zero.policy=TRUE)
apply(SEMw$timings, 2, sum)
## user.self   elapsed 
##     4.884     4.986
summary(SEMw, Hausman=TRUE, Nagelkerke=TRUE)
## 
## Call:spatialreg::errorsarlm(formula = tr_form, data = df_tracts, listw = lw, 
##     weights = tot_pop, method = "Matrix", zero.policy = TRUE)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -3.1721397 -0.2875481  0.0096908  0.2949229  2.9755875 
## 
## Type: error 
## Regions with no neighbours included:
##  12562 26251 28654 28696 31900 32259 42437 45571 45614 51434 57159 57306 68844 68898 68901 69293 69294 
## Coefficients: (asymptotic standard errors) 
##                        Estimate  Std. Error  z value  Pr(>|z|)
## (Intercept)         -6.25080288  0.03640540 -171.700 < 2.2e-16
## log1p(vacancy_rate)  0.49017557  0.03116976   15.726 < 2.2e-16
## log1p(old_rate)      0.41003992  0.00422318   97.093 < 2.2e-16
## log1p(black_rate)    0.50494215  0.01384960   36.459 < 2.2e-16
## log1p(hisp_rate)     0.57340744  0.01406260   40.775 < 2.2e-16
## log1p(group_pop)     0.02242876  0.00075294   29.788 < 2.2e-16
## log1p(dens)          0.02993784  0.00213535   14.020 < 2.2e-16
## 
## Lambda: 0.24966, LR test value: 2032, p-value: < 2.22e-16
## Approximate (numerical Hessian) standard error: 0.0038204
##     z-value: 65.349, p-value: < 2.22e-16
## Wald statistic: 4270.5, p-value: < 2.22e-16
## 
## Log likelihood: -44437.41 for error model
## ML residual variance (sigma squared): 780.56, (sigma: 27.938)
## Nagelkerke pseudo-R-squared: 0.23488 
## Number of observations: 71353 
## Number of parameters estimated: 9 
## AIC: 88893, (AIC for lm: 90923)
## Hausman test: 208.19, df: 7, p-value: < 2.22e-16

and its Durbin variant:

SDEM <- spatialreg::errorsarlm(tr_form, data=df_tracts, listw=lw, method="Matrix", Durbin=TRUE, zero.policy=TRUE)
apply(SDEM$timings, 2, sum)
## user.self   elapsed 
##     9.584     9.952
summary(SDEM, Nagelkerke=TRUE)
## 
## Call:spatialreg::errorsarlm(formula = tr_form, data = df_tracts, listw = lw, 
##     Durbin = TRUE, method = "Matrix", zero.policy = TRUE)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -3.153522 -0.286920  0.010042  0.295821  2.932170 
## 
## Type: error 
## Regions with no neighbours included:
##  12562 26251 28654 28696 31900 32259 42437 45571 45614 51434 57159 57306 68844 68898 68901 69293 69294 
## Coefficients: (asymptotic standard errors) 
##                            Estimate  Std. Error  z value  Pr(>|z|)
## (Intercept)             -6.16017866  0.06471718 -95.1861 < 2.2e-16
## log1p(vacancy_rate)      0.22011019  0.03746441   5.8752 4.224e-09
## log1p(old_rate)          0.40323063  0.00432342  93.2666 < 2.2e-16
## log1p(black_rate)        0.50755974  0.02561676  19.8136 < 2.2e-16
## log1p(hisp_rate)         0.62902991  0.03218041  19.5470 < 2.2e-16
## log1p(group_pop)         0.01986287  0.00080835  24.5722 < 2.2e-16
## log1p(dens)              0.00833251  0.00399191   2.0873   0.03686
## lag.log1p(vacancy_rate)  0.33910367  0.05237609   6.4744 9.519e-11
## lag.log1p(old_rate)     -0.00961209  0.00682877  -1.4076   0.15925
## lag.log1p(black_rate)    0.03009720  0.03046327   0.9880   0.32316
## lag.log1p(hisp_rate)    -0.06474559  0.03688723  -1.7552   0.07922
## lag.log1p(group_pop)     0.01440312  0.00167140   8.6174 < 2.2e-16
## lag.log1p(dens)          0.03495989  0.00475808   7.3475 2.021e-13
## 
## Lambda: 0.25272, LR test value: 2070, p-value: < 2.22e-16
## Approximate (numerical Hessian) standard error: 0.0042256
##     z-value: 59.807, p-value: < 2.22e-16
## Wald statistic: 3576.8, p-value: < 2.22e-16
## 
## Log likelihood: -43560.04 for error model
## ML residual variance (sigma squared): 0.19631, (sigma: 0.44307)
## Nagelkerke pseudo-R-squared: 0.25347 
## Number of observations: 71353 
## Number of parameters estimated: 15 
## AIC: 87150, (AIC for lm: 89218)
summary(spatialreg::impacts(SDEM))
## Impact measures (SDEM, estimable, n):
##                          Direct     Indirect      Total
## log1p(vacancy_rate) 0.220110186  0.339103674 0.55921386
## log1p(old_rate)     0.403230628 -0.009612094 0.39361853
## log1p(black_rate)   0.507559744  0.030097197 0.53765694
## log1p(hisp_rate)    0.629029914 -0.064745592 0.56428432
## log1p(group_pop)    0.019862873  0.014403116 0.03426599
## log1p(dens)         0.008332506  0.034959895 0.04329240
## ========================================================
## Standard errors:
##                           Direct    Indirect       Total
## log1p(vacancy_rate) 0.0374644083 0.052376090 0.040633564
## log1p(old_rate)     0.0043234188 0.006828774 0.007484115
## log1p(black_rate)   0.0256167606 0.030463269 0.016154547
## log1p(hisp_rate)    0.0321804058 0.036887235 0.017557548
## log1p(group_pop)    0.0008083488 0.001671399 0.001803419
## log1p(dens)         0.0039919138 0.004758078 0.002749267
## ========================================================
## Z-values:
##                        Direct   Indirect    Total
## log1p(vacancy_rate)  5.875181  6.4743984 13.76236
## log1p(old_rate)     93.266613 -1.4075870 52.59387
## log1p(black_rate)   19.813580  0.9879832 33.28208
## log1p(hisp_rate)    19.546985 -1.7552303 32.13913
## log1p(group_pop)    24.572157  8.6174018 19.00057
## log1p(dens)          2.087346  7.3474828 15.74689
## 
## p-values:
##                     Direct     Indirect   Total     
## log1p(vacancy_rate) 4.2238e-09 9.5191e-11 < 2.22e-16
## log1p(old_rate)     < 2.22e-16 0.15925    < 2.22e-16
## log1p(black_rate)   < 2.22e-16 0.32316    < 2.22e-16
## log1p(hisp_rate)    < 2.22e-16 0.07922    < 2.22e-16
## log1p(group_pop)    < 2.22e-16 < 2.22e-16 < 2.22e-16
## log1p(dens)         0.036857   2.0206e-13 < 2.22e-16

and weighted:

SDEMw <- spatialreg::errorsarlm(tr_form, weights=tot_pop, data=df_tracts, listw=lw, method="Matrix", Durbin=TRUE, zero.policy=TRUE)
apply(SDEMw$timings, 2, sum)
## user.self   elapsed 
##     8.398     8.563
summary(SDEMw, Hausman=TRUE, Nagelkerke=TRUE)
## 
## Call:spatialreg::errorsarlm(formula = tr_form, data = df_tracts, listw = lw, 
##     weights = tot_pop, Durbin = TRUE, method = "Matrix", zero.policy = TRUE)
## 
## Residuals:
##        Min         1Q     Median         3Q        Max 
## -3.1665752 -0.2883039  0.0096818  0.2960586  2.9234058 
## 
## Type: error 
## Regions with no neighbours included:
##  12562 26251 28654 28696 31900 32259 42437 45571 45614 51434 57159 57306 68844 68898 68901 69293 69294 
## Coefficients: (asymptotic standard errors) 
##                            Estimate  Std. Error  z value  Pr(>|z|)
## (Intercept)             -6.26779399  0.06438165 -97.3537 < 2.2e-16
## log1p(vacancy_rate)      0.29917270  0.04034320   7.4157 1.210e-13
## log1p(old_rate)          0.40182015  0.00435840  92.1944 < 2.2e-16
## log1p(black_rate)        0.48896909  0.02617479  18.6809 < 2.2e-16
## log1p(hisp_rate)         0.61656157  0.03094209  19.9263 < 2.2e-16
## log1p(group_pop)         0.02144233  0.00076196  28.1408 < 2.2e-16
## log1p(dens)              0.01417781  0.00388774   3.6468 0.0002655
## lag.log1p(vacancy_rate)  0.40922588  0.05309595   7.7073 1.288e-14
## lag.log1p(old_rate)      0.00219764  0.00697219   0.3152 0.7526096
## lag.log1p(black_rate)    0.00270443  0.03096953   0.0873 0.9304128
## lag.log1p(hisp_rate)    -0.06315358  0.03539952  -1.7840 0.0744198
## lag.log1p(group_pop)     0.01860324  0.00163288  11.3929 < 2.2e-16
## lag.log1p(dens)          0.02522840  0.00467896   5.3919 6.972e-08
## 
## Lambda: 0.24893, LR test value: 2029.7, p-value: < 2.22e-16
## Approximate (numerical Hessian) standard error: 0.0056695
##     z-value: 43.906, p-value: < 2.22e-16
## Wald statistic: 1927.8, p-value: < 2.22e-16
## 
## Log likelihood: -44328.16 for error model
## ML residual variance (sigma squared): 778.22, (sigma: 27.897)
## Nagelkerke pseudo-R-squared: 0.23722 
## Number of observations: 71353 
## Number of parameters estimated: 15 
## AIC: 88686, (AIC for lm: 90714)
## Hausman test: 67.692, df: 13, p-value: 2.1293e-09
summary(spatialreg::impacts(SDEMw))
## Impact measures (SDEM, estimable, n):
##                         Direct     Indirect      Total
## log1p(vacancy_rate) 0.29917270  0.409225875 0.70839857
## log1p(old_rate)     0.40182015  0.002197637 0.40401779
## log1p(black_rate)   0.48896909  0.002704431 0.49167353
## log1p(hisp_rate)    0.61656157 -0.063153582 0.55340799
## log1p(group_pop)    0.02144233  0.018603237 0.04004557
## log1p(dens)         0.01417781  0.025228403 0.03940622
## ========================================================
## Standard errors:
##                          Direct    Indirect       Total
## log1p(vacancy_rate) 0.040343204 0.053095951 0.042489080
## log1p(old_rate)     0.004358399 0.006972192 0.007509624
## log1p(black_rate)   0.026174793 0.030969533 0.016508152
## log1p(hisp_rate)    0.030942095 0.035399521 0.016387297
## log1p(group_pop)    0.000761965 0.001632879 0.001739686
## log1p(dens)         0.003887738 0.004678957 0.002686349
## ========================================================
## Z-values:
##                        Direct    Indirect    Total
## log1p(vacancy_rate)  7.415690  7.70728964 16.67249
## log1p(old_rate)     92.194431  0.31520029 53.80000
## log1p(black_rate)   18.680915  0.08732554 29.78368
## log1p(hisp_rate)    19.926303 -1.78402361 33.77055
## log1p(group_pop)    28.140837 11.39290899 23.01884
## log1p(dens)          3.646803  5.39188581 14.66906
## 
## p-values:
##                     Direct     Indirect   Total     
## log1p(vacancy_rate) 1.2101e-13 1.2879e-14 < 2.22e-16
## log1p(old_rate)     < 2.22e-16 0.75261    < 2.22e-16
## log1p(black_rate)   < 2.22e-16 0.93041    < 2.22e-16
## log1p(hisp_rate)    < 2.22e-16 0.07442    < 2.22e-16
## log1p(group_pop)    < 2.22e-16 < 2.22e-16 < 2.22e-16
## log1p(dens)         0.00026552 6.9722e-08 < 2.22e-16

Finally, using the development version of sphet:

library(sphet)
## 
## Attaching package: 'sphet'
## The following object is masked from 'package:spatialreg':
## 
##     impacts
system.time(SEM_GMM <- spreg(tr_form, data=df_tracts, listw=as(lw, "CsparseMatrix"), model="error", het=FALSE))
##    user  system elapsed 
##   1.933   0.148   2.102
summary(SEM_GMM)
## 
##  Generalized stsls
## 
## Call:
## spreg(formula = tr_form, data = df_tracts, listw = as(lw, "CsparseMatrix"), 
##     model = "error", het = FALSE)
## 
## Residuals:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -3.15222 -0.29469  0.01032 -0.00021  0.30086  3.00213 
## 
## Coefficients:
##                        Estimate  Std. Error  t-value  Pr(>|t|)    
## (Intercept)         -6.24575367  0.03720516 -167.873 < 2.2e-16 ***
## log1p(vacancy_rate)  0.35157200  0.02843557   12.364 < 2.2e-16 ***
## log1p(old_rate)      0.41042586  0.00419882   97.748 < 2.2e-16 ***
## log1p(black_rate)    0.54682738  0.01352155   40.441 < 2.2e-16 ***
## log1p(hisp_rate)     0.59271015  0.01514450   39.137 < 2.2e-16 ***
## log1p(group_pop)     0.02040826  0.00080207   25.444 < 2.2e-16 ***
## log1p(dens)          0.03112745  0.00222134   14.013 < 2.2e-16 ***
## rho                  0.26259335  0.00477093   55.040 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

with its Durbin variant:

system.time(SDEM_GMM <- spreg(tr_form, data=df_tracts, listw=as(lw, "CsparseMatrix"), model="error", het=FALSE, Durbin=TRUE))
##    user  system elapsed 
##   2.228   0.173   2.428
summary(SDEM_GMM)
## 
##  Generalized stsls
## 
## Call:
## spreg(formula = tr_form, data = df_tracts, listw = as(lw, "CsparseMatrix"), 
##     model = "error", het = FALSE, Durbin = TRUE)
## 
## Residuals:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -3.14258 -0.29470  0.00949 -0.00075  0.29963  2.94460 
## 
## Coefficients:
##                            Estimate  Std. Error  t-value  Pr(>|t|)    
## (Intercept)             -6.13268960  0.06558835 -93.5027 < 2.2e-16 ***
## log1p(vacancy_rate)      0.22111422  0.03740353   5.9116 3.388e-09 ***
## log1p(old_rate)          0.40241917  0.00432891  92.9608 < 2.2e-16 ***
## log1p(black_rate)        0.50792821  0.02557387  19.8612 < 2.2e-16 ***
## log1p(hisp_rate)         0.62901587  0.03212850  19.5781 < 2.2e-16 ***
## log1p(group_pop)         0.01983835  0.00080826  24.5445 < 2.2e-16 ***
## log1p(dens)              0.00826926  0.00398581   2.0747   0.03802 *  
## lag_log1p(vacancy_rate)  0.34382296  0.05251777   6.5468 5.879e-11 ***
## lag_log1p(old_rate)     -0.01195826  0.00688147  -1.7377   0.08226 .  
## lag_log1p(black_rate)    0.03012242  0.03048615   0.9881   0.32312    
## lag_log1p(hisp_rate)    -0.06631076  0.03690239  -1.7969   0.07235 .  
## lag_log1p(group_pop)     0.01435628  0.00167892   8.5509 < 2.2e-16 ***
## lag_log1p(dens)          0.03521147  0.00476425   7.3908 1.460e-13 ***
## rho                      0.26161661  0.00476485  54.9055 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This also lets us adjust the coefficient standard errors for heterskedasticity:

system.time(SEM_GMMh <- spreg(tr_form, data=df_tracts, listw=as(lw, "CsparseMatrix"), model="error", het=TRUE))
##    user  system elapsed 
##   2.978   0.117   3.134
summary(SEM_GMMh)
## 
##  Generalized stsls
## 
## Call:
## spreg(formula = tr_form, data = df_tracts, listw = as(lw, "CsparseMatrix"), 
##     model = "error", het = TRUE)
## 
## Residuals:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -3.15225 -0.29467  0.01030 -0.00021  0.30090  3.00219 
## 
## Coefficients:
##                        Estimate  Std. Error  t-value  Pr(>|t|)    
## (Intercept)         -6.24579581  0.03996340 -156.288 < 2.2e-16 ***
## log1p(vacancy_rate)  0.35123442  0.03135236   11.203 < 2.2e-16 ***
## log1p(old_rate)      0.41044090  0.00456026   90.004 < 2.2e-16 ***
## log1p(black_rate)    0.54681845  0.01429321   38.257 < 2.2e-16 ***
## log1p(hisp_rate)     0.59276338  0.01559615   38.007 < 2.2e-16 ***
## log1p(group_pop)     0.02040097  0.00083032   24.570 < 2.2e-16 ***
## log1p(dens)          0.03109688  0.00232960   13.349 < 2.2e-16 ***
## rho                  0.27593286  0.00488909   56.438 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
system.time(SDEM_GMMh <- spreg(tr_form, data=df_tracts, listw=as(lw, "CsparseMatrix"), model="error", het=TRUE, Durbin=TRUE))
##    user  system elapsed 
##   3.019   0.162   3.215
summary(SDEM_GMMh)
## 
##  Generalized stsls
## 
## Call:
## spreg(formula = tr_form, data = df_tracts, listw = as(lw, "CsparseMatrix"), 
##     model = "error", het = TRUE, Durbin = TRUE)
## 
## Residuals:
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -3.14259 -0.29469  0.00947 -0.00076  0.29964  2.94460 
## 
## Coefficients:
##                            Estimate  Std. Error  t-value  Pr(>|t|)    
## (Intercept)             -6.13295558  0.06908374 -88.7757 < 2.2e-16 ***
## log1p(vacancy_rate)      0.22118334  0.04042524   5.4714 4.465e-08 ***
## log1p(old_rate)          0.40242167  0.00468697  85.8597 < 2.2e-16 ***
## log1p(black_rate)        0.50798180  0.02743648  18.5148 < 2.2e-16 ***
## log1p(hisp_rate)         0.62911182  0.03381343  18.6054 < 2.2e-16 ***
## log1p(group_pop)         0.01983960  0.00083555  23.7443 < 2.2e-16 ***
## log1p(dens)              0.00825998  0.00410720   2.0111   0.04432 *  
## lag_log1p(vacancy_rate)  0.34365182  0.05502973   6.2448 4.242e-10 ***
## lag_log1p(old_rate)     -0.01192960  0.00729841  -1.6345   0.10214    
## lag_log1p(black_rate)    0.03000758  0.03268209   0.9182   0.35853    
## lag_log1p(hisp_rate)    -0.06644231  0.03895247  -1.7057   0.08806 .  
## lag_log1p(group_pop)     0.01436527  0.00171910   8.3563 < 2.2e-16 ***
## lag_log1p(dens)          0.03521675  0.00491840   7.1602 8.056e-13 ***
## rho                      0.27495975  0.00488766  56.2560 < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R’s sessionInfo()

sessionInfo()
## R version 4.1.2 (2021-11-01)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Fedora 34 (Workstation Edition)
## 
## Matrix products: default
## BLAS:   /home/rsb/topics/R/R412-share/lib64/R/lib/libRblas.so
## LAPACK: /home/rsb/topics/R/R412-share/lib64/R/lib/libRlapack.so
## 
## locale:
##  [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=en_GB.UTF-8        LC_COLLATE=en_GB.UTF-8    
##  [5] LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
##  [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] sphet_1.11       coda_0.19-4      spatialreg_1.1-9 Matrix_1.3-4    
## [5] spData_2.0.1     sf_1.0-4        
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.0         tidyr_1.1.4        jsonlite_1.7.2     splines_4.1.2     
##  [5] gtools_3.9.2       bslib_0.3.1        assertthat_0.2.1   expm_0.999-6      
##  [9] highr_0.9          sp_1.4-6           yaml_2.2.1         LearnBayes_2.15.1 
## [13] pillar_1.6.4       backports_1.3.0    lattice_0.20-45    glue_1.4.2        
## [17] digest_0.6.28      htmltools_0.5.2    spDataLarge_0.5.4  pkgconfig_2.0.3   
## [21] broom_0.7.10       raster_3.5-2       gmodels_2.18.1     s2_1.0.7          
## [25] mvtnorm_1.1-3      purrr_0.3.4        gdata_2.18.0       RSpectra_0.16-0   
## [29] terra_1.4-16       tibble_3.1.5       proxy_0.4-26       generics_0.1.1    
## [33] ellipsis_0.3.2     cli_3.1.0          magrittr_2.0.1     crayon_1.4.2      
## [37] deldir_1.0-6       evaluate_0.14      fansi_0.5.0        nlme_3.1-153      
## [41] MASS_7.3-54        class_7.3-19       tools_4.1.2        lifecycle_1.0.1   
## [45] matrixStats_0.61.0 stringr_1.4.0      compiler_4.1.2     jquerylib_0.1.4   
## [49] e1071_1.7-9        rlang_0.4.12       classInt_0.4-3     units_0.7-2       
## [53] grid_4.1.2         rstudioapi_0.13    rmarkdown_2.11     boot_1.3-28       
## [57] wk_0.5.0           codetools_0.2-18   DBI_1.1.1          R6_2.5.1          
## [61] zoo_1.8-9          knitr_1.36         dplyr_1.0.7        fastmap_1.1.0     
## [65] utf8_1.2.2         spdep_1.1-12       KernSmooth_2.23-20 stringi_1.7.5     
## [69] Rcpp_1.0.7         vctrs_0.3.8        tidyselect_1.1.1   xfun_0.27         
## [73] lmtest_0.9-38
Anderson, Richard G., William H. Greene, B. D. McCullough, and H. D. Vinod. 2008. “The Role of Data/Code Archives in the Future of Economic Research.” Journal of Economic Methodology 15: 99–119.
Anselin, L. 1988a. Spatial Econometrics: Methods and Models. Kluwer Academic Publishers.
———. 1988b. Spatial Econometrics: Methods and Models. Dordrecht: Kluwer Academic Publishers.
Anselin, Luc. 2010. “Thirty Years of Spatial Econometrics.” Papers in Regional Science 89: 3–25.
Bailly, Antoine, Pierre-Henri Derycke, and André Torre. 2012. “50 Ans de Science Régionale Francophone.” l’Association de Science Régionale de Langue Française. http://www.asrdlf.org/telechargements/50ansASRDLF.pdf.
Bartels, Cornelis P. A., and Leen Hordijk. 1977. “On the Power of the Generalized Moran Contiguity Coefficient in Testing for Spatial Autocorrelation Among Regression Disturbances.” Regional Science and Urban Economics 7: 83–101.
Besag, Julian. 1974. “Spatial Interaction and the Statistical Analysis of Lattice Systems.” Journal of the Royal Statistical Society. Series B (Methodological) 36: pp. 192–236.
Billot, Antoine, and Jacques-Francois Thisse. 1992. “Claude Ponsard (1927-1990): A Biographical Essay.” Annals of Regional Science 26: 191–98.
Bivand, R. S. 1984. Regression Modeling with Spatial Dependence: an Application of Some Class Selection and Estimation Methods.” Geographical Analysis 16: 25–37.
———. 2002. “Spatial Econometrics Functions in R: Classes and Methods.” Journal of Geographical Systems 4: 405–21.
———. 2008. “Implementing Representations of Space in Economic Geography.” Journal of Regional Science 48: 1–27.
Bivand, Roger. 2017. “Revisiting the Boston Data Set — Changing the Units of Observation Affects Estimated Willingness to Pay for Clean Air.” REGION 4 (1): 109–27. https://doi.org/10.18335/region.v4i1.107.
Bivand, Roger S. 2012. After ’Raising the Bar’: Applied Maximum Likelihood Estimation of Families of Models in Spatial Econometrics.” Estadística Española 54: 71–88.
Bivand, Roger S., Jan Hauke, and Tomasz Kossowski. 2013. “Computing the Jacobian in Gaussian Spatial Autoregressive Models: An Illustrated Comparison of Available Methods.” Geographical Analysis 45 (2): 150–79. https://doi.org/10.1111/gean.12008.
Bivand, Roger S., and Gianfranco Piras. 2015. “Comparing Implementations of Estimation Methods for Spatial Econometrics.” Journal of Statistical Software 63 (1): 1–36. https://doi.org/10.18637/jss.v063.i18.
Bivand, Roger, Giovanni Millo, and Gianfranco Piras. 2021. “A Review of Software for Spatial Econometrics in R.” Mathematics 9 (11). https://doi.org/10.3390/math9111276.
Bjerkholt, Olav. 2007. “Writing The Probability Approach with Nowhere to Go: Haavelmo in the United States, 1939–1944.” Econometric Theory 23: 775–837.
Boyce, David. 2004. “A Short History of the Field of Regional Science.” Papers in Regional Science 83: 31–57.
Brooks, Mollie E., Kasper Kristensen, Koen J. van Benthem, Arni Magnusson, Casper W. Berg, Anders Nielsen, Hans J. Skaug, Martin Maechler, and Benjamin M. Bolker. 2017. glmmTMB Balances Speed and Flexibility Among Packages for Zero-Inflated Generalized Linear Mixed Modeling.” The R Journal 9 (2): 378–400. https://journal.r-project.org/archive/2017/RJ-2017-066/index.html.
Burridge, P. 1981. “Testing for a Common Factor in a Spatial Autoregression Model.” Environment and Planning A 13: 795–800.
Cliff, A. D., and J. K. Ord. 1969. “The Problem of Spatial Autocorrelation.” In Studies in Regional Science, edited by A. J. Scott, 25–55. London Papers in Regional Science. London: Pion.
———. 1973. Spatial Autocorrelation. London: Pion.
———. 1981. Spatial Processes. London: Pion.
Cliff, A., and J. K. Ord. 1972. “Testing for Spatial Autocorrelation Among Regression Residuals.” Geographical Analysis 4: 267–84.
Cressie, N. A. C. 1993. Statistics for Spatial Data. New York:Wiley.
Dong, Guanpeng, and Richard Harris. 2015. “Spatial Autoregressive Models for Geographically Hierarchical Data Structures.” Geographical Analysis 47 (2): 173–91. https://doi.org/10.1111/gean.12049.
Drèze, Jacques H. 1964. “Some Postwar Contributions of French Economists to Theory and Public Policy: With Special Emphasis on Problems of Resource Allocation.” The American Economic Review 54 (4): 2–64.
Elhorst, J. Paul. 2010. “Applied Spatial Econometrics: Raising the Bar.” Spatial Economic Analysis 5: 9–28.
Fingleton, B. 1999. Spurious spatial regression: Some Monte Carlo results with a spatial unit root and spatial cointegration.” Journal of Regional Science 39: 1–19.
———. 2000. “Spatial Econometrics, Economic Geography, Dynamics and Equilibrium: A ‘Third Way?’.” Environment and Planning A 32: 1481–98.
Fujita, Masahisa, Paul Krugman, and Anthony J. Venables. 1999. The Spatial Economy. Cambridge, MA: MIT Press.
Gaetan, Carlo, and Xavier Guyon. 2010. Spatial Statistics and Modeling. New York: Springer.
Gibbons, Stephen, and Henry G. Overman. 2012. “Mostly Pointless Spatial Econometrics?” Journal of Regional Science 52: 172–91.
Gómez-Rubio, Virgilio, Roger S. Bivand, and Håvard Rue. 2021. “Estimating Spatial Econometrics Models with Integrated Nested Laplace Approximation.” Mathematics 9 (17): 2044. https://doi.org/10.3390/math9172044.
Gómez-Rubio, Virgilio, Roger Bivand, and Håvard Rue. 2015. “A New Latent Class to Fit Spatial Econometrics Models with Integrated Nested Laplace Approximations.” Procedia Environmental Sciences 27: 116–18. https://doi.org/https://doi.org/10.1016/j.proenv.2015.07.119.
Goulard, Michel, Thibault Laurent, and Christine Thomas-Agnan. 2017. “About Predictions in Spatial Autoregressive Models: Optimal and Almost Optimal Strategies.” Spatial Economic Analysis 12 (2-3): 304–25. https://doi.org/10.1080/17421772.2017.1300679.
Graham, Robert E., and J. Thomas Romans. 1971. “Regional Economic Accounting in the u.s. Office of Business Economics — an Appraisal.” Review of Income and Wealth 17: 1–34.
Granger, Clive W. J. 1969. “Spatial Data and Time Series Analysis.” In Studies in Regional Science, edited by A. J. Scott, 1–25. London Papers in Regional Science 1. London: Pion.
Griffith, Daniel A., and Jean H. P. Paelinck. 2007. “An Equation by Any Other Name Is Still the Same: On Spatial Econometrics and Spatial Statistics.” Annals of Regional Science 41: 209–27.
Griffith, Daniel, Roger Bivand, and Yongwan Chun. 2015. “Implementing Approximations to Extreme Eigenvalues and Eigenvalues of Irregular Surface Partitionings for Use in SAR and CAR Models.” Procedia Environmental Sciences 26: 119–22. https://doi.org/https://doi.org/10.1016/j.proenv.2015.05.013.
Griliches, Zvi. 1967. “Distributed Lags: A Survey.” Econometrica 35: 16–49.
Haining, Robert. 2014. “Spatial Data and Statistical Methods: A Chronological Overview.” In Handbook of Regional Science, edited by M. M. Fischer and P. Nijkamp, 1277–94. Berlin: Springer-Verlag.
Halleck Vega, Solmaria, and J. Paul Elhorst. 2015. “The SLX Model.” Journal of Regional Science 55 (3): 339–63. https://doi.org/10.1111/jors.12188.
Hendry, David. 2009. “The Methodology of Empirical Econometric Modelling: Applied Econometrics Through the Looking-Glass.” In Palgrave Handbook on Econometrics: Applied Econometrics, edited by T. C. Mills and K. Patterson, 3–67. Basingstoke: Palgrave Macmillan.
Hepple, Leslie W. 1974. “The Impact of Stochastic Process Theory Upon Spatial Analysis in Human Geography.” Progress in Geography 6: 89–142.
———. 1976. “A Maximum Likelihood Model for Econometric Estimation with Spatial Series.” In Theory and Practice in Regional Science, edited by I. Masser, 90–104. London Papers in Regional Science. London: Pion.
Hordijk, Leen. 1974. “Spatial Correlation in the Disturbances of a Linear Interregional Model.” Regional and Urban Economics 4: 117–40.
Hordijk, Leen, and Jean H. P. Paelinck. 1976. “Some Principles and Results in Spatial Econometrics.” Recherches Economiques de Louvain 42: 175–97.
Kaluzny, S. P., S. C. Vega, T. P. Cardoso, and A. A. Shelly. 1998. S+SpatialStats. New York, NY: Springer.
Kelejian, Harry, and Gianfranco Piras. 2017. Spatial Econometrics. London: Academic Press.
Klaassen, Leo H., Jean H. P. Paelinck, and Sjoerd Wagenaar. 1979. Spatial Systems. Farnborough: Saxon House.
Klein, L. R. 1958. “The Estimation of Distributed Lags.” Econometrica 26: 553–65.
Koenker, Roger, and Achim Zeileis. 2009. “On Reproducible Econometric Research.” Journal of Applied Econometrics 24: 833–47.
Koyck, L. M. 1954. Distributed Lags and Investment Analysis. Amsterdam: North-Holland.
LeSage, J. P. 2014. “What Regional Scientists Need to Know about Spatial Econometrics.” Review of Regional Studies 44: 13–32. https://journal.srsa.org/ojs/index.php/RRS/article/view/44.1.2.
LeSage, James P., and Kelley R. Pace. 2009. Introduction to Spatial Econometrics. Boca Raton, FL: CRC Press.
Lieshout, M. N. M. van. 2019. Theory of Spatial Statistics. Boca Raton, FL: Chapman; Hall/CRC.
Martinetti, Davide, and Ghislain Geniaux. 2017. “Approximate Likelihood Estimation of Spatial Probit Models.” Regional Science and Urban Economics 64: 30–45. https://doi.org/https://doi.org/10.1016/j.regsciurbeco.2017.02.002.
McMillen, D. P. 2010. “Issues in Spatial Data Analysis.” Journal of Regional Science 50: 119–41.
———. 2013. Quantile Regression for Spatial Data. Heidelberg: Springer-Verlag.
Millo, Giovanni, and Gianfranco Piras. 2012. splm: Spatial Panel Data Models in R.” Journal of Statistical Software 47 (1): 1–38.
Morgan, Mary S. 1990. The History of Econometric Ideas. Cambridge: Cambridge University Press.
Mur, Jesús, and Ana Angulo. 2006. “The Spatial Durbin Model and the Common Factor Tests.” Spatial Economic Analysis 1 (2): 207–26. https://doi.org/10.1080/17421770601009841.
Ord, J. K. 1975. “Estimation Methods for Models of Spatial Interaction.” Journal of the American Statistical Association 70: 120–26.
Ord, J. Keith. 2010. “Spatial Autocorrelation: A Statistician’s Reflections.” In Perspectives on Spatial Data Analysis, edited by L. Anselin and S. J. Rey, 165–80. Berlin: Springer-Verlag.
Pace, R. K., and R. P. Barry. 1997a. Fast CARs.” Journal of Statistical Computation and Simulation 59 (2): 123–45.
———. 1997b. Fast spatial estimation.” Applied Economics Letters 4 (5): 337–41.
———. 1997c. Quick computation of spatial autoregressive estimators.” Geographics Analysis 29 (3): 232–47.
———. 1997d. Sparse spatial autoregressions.” Statistics & Probability Letters 33 (3): 291–97.
Pace, RK, and JP LeSage. 2008. “A Spatial Hausman Test.” Economics Letters 101: 282–84.
Paelinck, Jean H. P. 2012. “Specification and Identification in Spatial Econometric Models.” Estadística Española 54: 35–52.
———. 2013. “Some Challenges for Spatial Econometricians.” In Spatial Econometrics and Regional Economic Analysis, edited by Bogdan Suchecki, 292:11–20. Acta Universitas Lodziensis, Folia Oeconomica. Wydawnictwa Uniwersytetu Łódzkiego.
Paelinck, Jean H. P., and Leo H. Klaassen. 1979. Spatial Econometrics. Farnborough: Saxon House.
Paelinck, Jean H. P., and Peter Nijkamp. 1975. Operational Theory and Method in Regional Economics. Farnborough: Saxon House.
Perroux, François. 1950. “Economic Space: Theory and Applications.” Quarterly Journal of Economics 64: 89–104.
Pinheiro, Jose C., and Douglas M. Bates. 2000. Mixed-Effects Models in S and S-Plus. New York: Springer.
Piras, Gianfranco. 2010. sphet: Spatial Models with Heteroskedastic Innovations in R.” Journal of Statistical Software 35 (1): 1–21.
Ponsard, Claude. 1983. History of Spatial Economic Theory. Berlin: Springer-Verlag.
Qin, Duo. 1993. The Formation of Econometrics: A Historical Perspective. Oxford: Oxford University Press.
Renfro, Charles G. 2009a. “Econometric Software.” In Handbook of Computational Econometrics, edited by David A. Belsley and Erricos Kontoghiorghes, 1–53. Chichester, U.K: John Wiley & Sons.
———. 2009b. The Practice of Econometric Theory: An Examination of the Characteristics of Econometric Computation. Berlin: Springer-Verlag.
Ripley, B. D. 1981. Spatial Statistics. New York: Wiley.
———. 1988. Statistical Inference for Spatial Processes. Cambridge: Cambridge University Press.
Sandmo, Agnar. 2011. Evolving Economics: A History of Economic Thought. Princeton: Princeton University Press.
Smith, T. E., and K. L. Lee. 2012. The effects of spatial autoregressive dependencies on inference in ordinary least squares: a geometric approach.” Journal of Geographical Systems 14 (January): 91–124. https://doi.org/10.1007/s10109-011-0152-x.
Smith, Tony E. 2009. “Estimation Bias in Spatial Models with Strongly Connected Weight Matrices.” Geographical Analysis 41 (3): 307–32. https://doi.org/10.1111/j.1538-4632.2009.00758.x.
Suesse, Thomas. 2018. “Marginal Maximum Likelihood Estimation of SAR Models with Missing Data.” Computational Statistics & Data Analysis 120: 98–110. https://doi.org/https://doi.org/10.1016/j.csda.2017.11.004.
Theil, H. 1964. “Some Developments of Economic Thought in the Netherlands.” The American Economic Review 54: pp. 34–55.
Wagner, Martin, and Achim Zeileis. 2019. “Heterogeneity and Spatial Dependence of Regional Growth in the EU: A Recursive Partitioning Approach.” German Economic Review 20 (1): 67–82. https://doi.org/10.1111/geer.12146.
Wall, M. M. 2004. “A Close Look at the Spatial Structure Implied by the CAR and SAR Models.” Journal of Statistical Planning and Inference 121: 311–24.
Waller, Lance A., and Carol A. Gotway. 2004. Applied Spatial Statistics for Public Health Data. Hoboken, NJ: John Wiley & Sons.
Whittle, P. 1954. On Stationary Processes in the Plane.” Biometrika 41 (3-4): 434–49. https://doi.org/10.1093/biomet/41.3-4.434.
Wilhelm, Stefan, and Miguel Godinho de Matos. 2013. Estimating Spatial Probit Models in R.” The R Journal 5 (1): 130–43. https://doi.org/10.32614/RJ-2013-013.
Wilson, Alan G. 2000. Complex Spatial Systems: The Modelling Foundations of Urban and Regional Analysis. Harlow: Prentice Hall.
———. 2012. The Science of Cities and Regions: Lectures on Mathematical Model Design. Dordrecht: Springer.
Yalta, A.Talha, and A.Yasemin Yalta. 2010. “Should Economists Use Open Source Software for Doing Research?” Computational Economics 35: 371–94.