Spatial spillovers: the spreg suite

SAR, SEM, SLX, and the spatial Durbin model — diagnosed, estimated, and read through impacts

Spatial dependence and LISA established that Indian district luminosity clusters in space. This article models that dependence: it runs the Lagrange-multiplier specification tests, estimates the spatial Durbin model of the paper’s convergence regression, decomposes the association into direct and spillover (indirect) impacts, and checks that the conclusion survives alternative weights.

import warnings

warnings.filterwarnings("ignore")

import geometrics as gm

gdf, df, df_dict = gm.data.load_india()
df = gm.set_labels(df, df_dict, set_panel=True)

# The paper's weights: 6 nearest neighbors on plain lon/lat centroids (crs=None)
w = gm.make_weights(gdf, method="knn", k=6, crs=None)

The specification family

Space can enter a regression through three channels, and the classic cross-sectional models are just the on/off combinations. The spatial lag model (SAR), \(y = \rho W y + X\beta + \varepsilon\), lets the outcome respond to neighbors’ outcomes — one district’s growth feeds its neighbors’ growth and echoes back, a global feedback governed by \(\rho\). The spatial error model (SEM), \(y = X\beta + u\) with \(u = \lambda W u + \varepsilon\), puts the dependence in the disturbances instead: spatially correlated omitted factors, with no substantive spillover in the outcome equation. The SLX model, \(y = X\beta + WX\gamma + \varepsilon\), is the local alternative — neighbors’ characteristics matter directly (each \(\gamma\) is itself the spillover), but there is no outcome feedback.

The spatial Durbin model (SDM) combines the lag and SLX channels, \(y = \rho W y + X\beta + WX\gamma + \varepsilon\), and is the workhorse of the convergence-with-spillovers literature precisely because it nests the others and lets the data separate the channels. The built-in explainer:

print(gm.explain("spatial_durbin_model").to_markdown()[:700], "...")

### Spatial Durbin model (SDM)

**What it is.** The Durbin model combines both channels: y = ρWy + Xβ + WXγ + ε — outcomes respond to neighbors' outcomes (global feedback through ρ) *and* to neighbors' characteristics (local spillovers through γ). It nests SAR (γ = 0), SLX (ρ = 0) and, under a parameter restriction, SEM — which makes it the natural robust choice when LM diagnostics are ambiguous. Estimated by ML as a lag model with lagged covariates; interpretation runs entirely through the LeSage-Pace impact decomposition, where the total impact of a covariate is (β + γ)/(1 - ρ).

**When to use it.** The workhorse for regional convergence with spillovers (the specification of the bundled In ...

All of these are one model= switch away in analyze_spatial_model ("ols", "lag", "error", "slx", "durbin", "durbin_error"), estimated by spreg.

Which model do the data ask for?

analyze_spatial_diagnostics estimates the non-spatial OLS benchmark — here the paper’s unconditional convergence regression, growth of luminosity per capita 1996–2010 on its initial level, on the 1996 cross-section — then runs Moran’s I on the residuals and the five LM tests, and applies the Anselin-Florax decision rule:

growth = df.query("year == 1996")

diag = gm.analyze_spatial_diagnostics(
    growth,
    outcome="growth_ntl_pc_9610",
    covariates=["log_ntl_pc_1996"],
    gdf=gdf,
    w=w,
)
diag.df

	test	statistic	df	p
0	moran_residuals	0.583146	NaN	0.000000e+00
1	lm_lag	452.232754	1.0	2.356183e-100
2	lm_error	575.499649	1.0	3.572770e-127
3	robust_lm_lag	1.083930	1.0	2.978200e-01
4	robust_lm_error	124.350825	1.0	7.059146e-29
5	lm_sarma	576.583580	2.0	6.258445e-126

print(f"Recommendation: {diag.recommendation}\n")
print(diag.reasoning)

Recommendation: error

At least one simple LM test rejects at alpha = 0.05, so the robust forms decide: robust LM error remains significant (statistic = 124.35, p = 7.06e-29) while robust LM lag does not (p = 0.298). The Anselin-Florax rule reads this as spatially correlated disturbances, pointing to the spatial error (SEM) model.

Residual Moran’s I of 0.58 means OLS leaves a lot of spatial structure on the table. Both simple LM tests reject overwhelmingly, so the robust forms decide, and robust LM error wins by a wide margin — the mechanical rule points to the error model. The paper (and this article) nonetheless proceeds with the spatial Durbin model: the SDM nests the error model under a common-factor restriction and keeps the spatially lagged covariates that theory suggests, so it is the safe superset when any robust test fires. The diagnostics tell you dependence is real and must be modeled; they do not have the last word on which superset to estimate.

The spatial Durbin model

analyze_spatial_model aligns the cross-section to the geometry and weights, estimates by maximum likelihood, and computes the LeSage-Pace impact decomposition from betas + vm with Monte-Carlo standard errors (n_draws; the result records how many were used):

sdm = gm.analyze_spatial_model(
    growth,
    outcome="growth_ntl_pc_9610",
    covariates=["log_ntl_pc_1996"],
    gdf=gdf,
    w=w,
    model="durbin",
    n_draws=2000,
)
sdm.gt

	Estimate	Std. Error	z	p
Spatial durbin (sdm) model of NTL per capita growth (1996-2010)
ML estimates \| n = 520 \| R² = 0.680 \| AIC = -2291.7
CONSTANT	-0.0161***	(0.0062)	-2.62	0.0089
Log NTL per capita (1996)	-0.0212***	(0.0019)	-11.15	0.0000
W_Log NTL per capita (1996)	0.0168***	(0.0023)	7.25	0.0000
W_NTL per capita growth (1996-2010)	0.7967***	(0.0301)	26.51	0.0000
* p<0.01, p<0.05, * p<0.10 (z-based)
W: 6-nearest-neighbor (geographic centroids), row-standardized, n=520

The coefficient table is only the raw material. The spatial autoregressive parameter is large — with \(\rho \approx 0.8\), every local difference is amplified through the neighborhood multiplier \(1/(1-\rho) \approx 5\) — and the reportable quantities are the impacts:

print(f"rho = {sdm.rho:.3f}   pseudo-R2 = {sdm.r2:.2f}   "
      f"AIC = {sdm.aic:.0f}   n = {sdm.n_obs}")
sdm.impacts.round(4)

rho = 0.797   pseudo-R2 = 0.68   AIC = -2292   n = 520

	term	direct	se_direct	indirect	se_indirect	total	se_total
0	Log NTL per capita (1996)	-0.0257	0.0024	0.0039	0.0067	-0.0217	0.0062

print(sdm.interpret())

A **spatial Durbin (SDM)** model of **growth_ntl_pc_9610** was estimated by ML on 520 units (weights: 6-nearest-neighbor (geographic centroids), row-standardized, n=520).

The spatial autoregressive parameter is **ρ = 0.797** (statistically significant at the 1% level): outcomes move together with neighbors' outcomes, so any local difference is amplified through the neighborhood multiplier 1/(1-ρ) ≈ 4.92 rather than staying put.

For **Log NTL per capita (1996)**, a one-unit higher value is associated with a direct change of -0.0257 in the same unit's growth_ntl_pc_9610 and an indirect (spillover) change of 0.00393 summed across its neighbors — a total association of **-0.0217** (statistically significant at the 1% level).

Model fit: (pseudo-)R² = 0.68, AIC = -2292 (compare AIC across specifications estimated on the same sample).

_These are associations, not causal effects. A causal reading needs a research design — see `explain('correlation_vs_causation')`._

The convergence reading: dimmer districts grow faster (direct impact −0.026), and the total impact of −0.022 — the number that maps to the speed of convergence — folds in the spillover that arrives through the neighborhood. This is the machinery behind analyze_beta_convergence(..., model="sdm") in the Analyze page.

SLX for contrast: local spillovers only

Setting model="slx" keeps the spatially lagged covariates but switches off the outcome feedback (\(\rho = 0\)). Impacts are then analytic — no Monte-Carlo draws needed — because direct = \(\beta\) and indirect = \(\gamma\) exactly, with no multiplier to simulate:

slx = gm.analyze_spatial_model(
    growth,
    outcome="growth_ntl_pc_9610",
    covariates=["log_ntl_pc_1996"],
    gdf=gdf,
    w=w,
    model="slx",
)
slx.impacts.round(4)

	term	direct	se_direct	indirect	se_indirect	total	se_total
0	Log NTL per capita (1996)	-0.0224	0.0029	0.0035	0.0034	-0.0189	0.0017

The contrast is instructive: without the \(\rho W y\) channel, the estimated spillover of initial luminosity is small and statistically indistinguishable from zero, and the model has no way to represent growth diffusing between neighbors. The SDM attributed most of the spatial action to the global feedback loop — exactly the channel SLX rules out by construction. AIC comparisons across sdm.aic and slx.aic (same sample, same W) make the same point.

Is the result an artifact of the weights?

Every impact above is conditional on the 6-nearest-neighbor W. analyze_spatial_model_by_weights re-estimates the same model under alternative specifications and compares the focal covariate’s direct/indirect/total impacts against a baseline — the source paper’s notebook-c07 robustness check:

alt = {
    "knn4": gm.make_weights(gdf, method="knn", k=4, crs=None),
    "knn6": w,
    "knn8": gm.make_weights(gdf, method="knn", k=8, crs=None),
    "queen": gm.make_weights(gdf, method="queen"),
}
robust = gm.analyze_spatial_model_by_weights(
    growth,
    outcome="growth_ntl_pc_9610",
    covariates=["log_ntl_pc_1996"],
    gdf=gdf,
    weights=alt,
    baseline="knn6",
    n_draws=1000,
)
robust.fig

print(robust.interpret())

The spatial Durbin (SDM) model was re-estimated under **4 alternative spatial weights** specifications; the table and figure compare the direct, indirect and total impacts of **log_ntl_pc_1996** against the **knn6** baseline (total = -0.0217).

The total impact keeps the **same sign in all 4 specifications**, ranging from -0.0219 to -0.0208 (spread 0.0011) — the qualitative conclusion does not hinge on the weights choice.

Every specification's 95% interval covers the baseline estimate, so the alternatives are statistically indistinguishable from the baseline.

By AIC the best-fitting specification is **knn6** (AIC = -2292); AIC comparisons are only meaningful across models estimated on the same sample.

_These are associations, not causal effects. A causal reading needs a research design — see `explain('correlation_vs_causation')`._

The dot-whisker figure is the headline: the total impact keeps its sign in all four specifications and every 95% interval covers the baseline — the convergence conclusion does not hinge on the weights choice. The per-W detail (including each w_spec and AIC) lives in robust.df, and robust.gt renders the comparison table.

Coefficients are not impacts

The closing rule of the spreg suite. In any model with \(\rho W y\), a change in one district’s covariate propagates to its neighbors and feeds back, so no single regression coefficient describes the association — reading \(\beta\) off the SDM coefficient table understates or misstates everything the model has to say. The explainer:

print(gm.explain("spatial_impacts").to_markdown()[:800], "...")

### Direct, indirect, and total impacts

**What it is.** In models with ρWy, a change in one unit's covariate propagates to its neighbors and feeds back, so no single coefficient describes the association. The LeSage-Pace decomposition summarizes the full matrix of cross-unit derivatives: the **direct** impact is the average own-unit association (including feedback through neighbors and back); the **indirect** impact is the average spillover to all other units; their sum is the **total** impact, which for the Durbin model equals (β + γ)/(1 - ρ). geometrics computes these from the estimated parameters and simulates their standard errors by Monte-Carlo draws from the ML covariance matrix — the same quantities Stata's estat impact reports.

**When to use it.** Always, when reporting SAR/SDM r ...

Report direct (own-district, feedback included), indirect (spillover to the rest of the map), and total impacts — never raw coefficients — and say which W they are conditional on. Every geometrics result does the second part for you via w_spec.

Where next

The India case study — the full replication arc these pieces come from
Spatial dependence and LISA — the exploratory half: weights, Moran’s I, and cluster maps