Regional growth, convergence, and spatial spillovers — a reproducible view from outer space
This article replicates and extends the analysis of “Regional growth, convergence, and spatial spillovers in India” (Mendez, Kabiraj & Li; building on Chanda & Kabiraj 2020, World Development): 520 Indian districts observed by radiance-calibrated DMSP-OLS nighttime lights between 1996 and 2010, used as a satellite proxy for economic activity.
Three questions organize everything:
Convergence — do dimmer (poorer) districts grow faster than brighter ones?
Spatial dependence — do neighboring districts light up together?
Spillovers — does a neighborhood’s brightness help local growth?
import warnings
warnings.filterwarnings("ignore")

import geometrics as gm

gdf, df, df_dict = gm.data.load_india()
df = gm.set_labels(df, df_dict, set_panel=True)
print(f"{gdf.shape[0]} districts x {df['year'].nunique()} years; "
      f"{df_dict.shape[0]} documented variables")
520 districts x 6 years; 28 documented variables
A view from space
Total district luminosity, classified with Fisher-Jenks. The animation steps through all six satellite years with a pooled classification, so colors are comparable across frames:
The paper’s weights are 6 nearest neighbors (built, like the paper, on plain lon/lat centroids — pass crs=None; the geometrics default would project first):
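To make the weights construction concrete, here is a minimal pure-Python sketch of row-standardized k-nearest-neighbor weights built directly on unprojected (lon, lat) centroids. It is not the geometrics implementation: the function name `knn_weights` and the demo grid are hypothetical, and distances are plain Euclidean distances in degrees, which is what "no projection" amounts to under this assumption.

```python
import math

def knn_weights(centroids, k=6):
    """Row-standardized k-nearest-neighbor weights from (lon, lat) pairs.

    Distances are plain Euclidean distances in degrees (no projection),
    mirroring the unprojected-centroid choice described in the text.
    """
    weights = []
    for i, (xi, yi) in enumerate(centroids):
        # Distance from unit i to every other unit j != i.
        dists = [
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(centroids)
            if j != i
        ]
        dists.sort()  # ties are broken by the lower index j
        # Each of the k nearest neighbors gets weight 1/k, so rows sum to 1.
        weights.append({j: 1.0 / k for _, j in dists[:k]})
    return weights

# Hypothetical centroids on a small grid, for illustration only.
demo_centroids = [(float(x), float(y)) for x in range(4) for y in range(3)]
demo_w = knn_weights(demo_centroids, k=6)
```

Every unit ends up with exactly k neighbors, which is the main appeal of this scheme for irregular district shapes. The resulting weights are generally asymmetric: unit j can be among the nearest neighbors of unit i without the reverse holding.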
Initial luminosity is strongly clustered — the paper reports Moran’s I = 0.73:
print(f"Moran's I (initial log luminosity pc): {lisa_initial.moran_i:.2f} "
      f"(pseudo p = {lisa_initial.p_sim_global:.3f})")
print(f"High-High districts: {lisa_initial.n_hh}, "
      f"Low-Low: {lisa_initial.n_ll}")
Moran's I (initial log luminosity pc): 0.73 (pseudo p = 0.001)
High-High districts: 169, Low-Low: 101
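For readers who want to see what sits behind the global statistic, the sketch below computes Moran's I and a permutation-based pseudo p-value from scratch. It is an illustration, not the geometrics code path: the function names and the four-unit chain are hypothetical. The p-value formula, (count + 1) / (permutations + 1), follows the convention used in PySAL's esda. If the reported value rests on 999 permutations (an assumption, though it is the esda default), then 0.001 is the smallest value that can be attained.

```python
import random

def morans_i(values, weights):
    """Global Moran's I for weights given as a list of {neighbor: weight} dicts."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row.values()) for row in weights)  # total weight
    cross = sum(w_ij * z[i] * z[j]
                for i, row in enumerate(weights)
                for j, w_ij in row.items())
    return (n / s0) * cross / sum(zi * zi for zi in z)

def pseudo_p(values, weights, permutations=999, seed=0):
    """One-sided permutation p-value: (count + 1) / (permutations + 1)."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    at_least = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # break the link between values and locations
        if morans_i(shuffled, weights) >= observed:
            at_least += 1
    return (at_least + 1) / (permutations + 1)

# Four units on a line with row-standardized contiguity weights.
chain_w = [{1: 1.0}, {0: 0.5, 2: 0.5}, {1: 0.5, 3: 0.5}, {2: 1.0}]
smooth = morans_i([1, 2, 3, 4], chain_w)       # similar neighbors: 0.4
alternating = morans_i([1, 4, 1, 4], chain_w)  # dissimilar neighbors: -1.0
p_value = pseudo_p([1, 2, 3, 4], chain_w)
```

Positive values indicate that similar values sit next to each other, which is the pattern reported for initial luminosity. Negative values indicate a checkerboard-like arrangement.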
And so is growth:
growth_lisa = gm.explore_lisa_cluster_map(
    df.query("year == 1996"), "growth_ntl_pc_9610", gdf=gdf, w=w
)
print(f"Moran's I (growth 1996-2010): {growth_lisa.moran_i:.2f} "
      f"(pseudo p = {growth_lisa.p_sim_global:.3f})")
growth_lisa.fig
Moran's I (growth 1996-2010): 0.60 (pseudo p = 0.001)
Convergence: OLS vs the spatial Durbin model
The paper’s dependent variable is the per-capita luminosity growth rate 1996-2010, shipped verbatim by load_india() (an honest per-capita panel is impossible — district population exists only for 1996 and 2001 — so the paper’s pre-computed columns are carried unchanged). To run the paper’s exact cross-section through the panel API, rebuild a two-period panel whose growth reproduces the paper’s dependent variable identically:
The headline finding: spatial spillovers raise the estimated speed of convergence. Part of every district’s catch-up arrives through its neighborhood (the indirect impact), a channel that OLS leaves out entirely.
sdm.fig
print(sdm.interpret())
Across 520 units, the growth of **ntl_pc** over a 14-period window is associated with its initial log level with a total slope of **-0.0217** (SE 0.00611), statistically significant at the 1% level.
The slope is negative — the β-convergence pattern: units that started lower tended to grow faster, narrowing initial gaps.
That slope implies a convergence speed of λ = 0.0259 per period and a half-life of about 26.7 periods — the time for half of an initial gap to close at this pace.
The SDM decomposition splits the total into a direct component of -0.0212 (own initial level) and a spillover (indirect) component of -0.000534 operating through neighboring units, under the weights: 6-nearest-neighbor (geographic centroids), row-standardized, n=520.
The spatial-lag parameter ρ = 0.797 says each unit's growth moves together with its neighbors' growth, so part of the convergence pattern is shared across space rather than purely unit-by-unit.
_These are associations, not causal effects. A causal reading needs a research design — see `explain('correlation_vs_causation')`._
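The convergence speed and half-life quoted in the interpretation can be recovered from the total slope with the standard β-convergence relation β = -(1 - e^(-λT)) / T. The sketch below assumes this is the relation in use (the reported λ is consistent with it) and plugs in the rounded slope, so the results agree with the reported figures only up to rounding. The function names are illustrative, not part of geometrics.

```python
import math

def convergence_speed(beta, periods):
    """Speed lambda implied by beta = -(1 - exp(-lambda * T)) / T."""
    return -math.log(1.0 + beta * periods) / periods

def half_life(speed):
    """Periods needed to close half of an initial gap at a given speed."""
    return math.log(2.0) / speed

beta_total = -0.0217   # total slope from the interpretation (rounded)
window = 14            # 1996-2010
lam = convergence_speed(beta_total, window)
hl = half_life(lam)
print(f"lambda = {lam:.4f}, half-life = {hl:.1f} periods")
```

Because the half-life is inversely proportional to λ, even small changes in the estimated slope (for instance between an OLS and a spatial specification) shift the implied half-life by several periods.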
error
At least one simple LM test rejects at alpha = 0.05, so the robust forms decide: robust LM error remains significant (statistic = 124.35, p = 7.06e-29) while robust LM lag does not (p = 0.298). The Anselin-Florax rule reads this as spatially correlated disturbances, pointing to the spatial error (SEM) model.
Spatial dependence diagnostics for NTL per capita growth (1996-2010)
Is the growth-initial association uniform across India? Geographically weighted regression maps the local convergence coefficient:
gwr = gm.analyze_gwr(
    df.query("year == 1996"),
    outcome="growth_ntl_pc_9610",
    covariates=["log_ntl_pc_1996"],
    gdf=gdf,
)
print(f"adaptive bandwidth: {gwr.bw:.0f} neighbors; local R2 mean "
      f"{gwr.df['local_r2'].mean():.2f}")
gwr.figs["log_ntl_pc_1996"]
adaptive bandwidth: 46 neighbors; local R2 mean 0.40
Distribution dynamics
Beyond the regression slope: how does the whole distribution move? (One district records zero luminosity in some years; log-based and relative measures use the always-positive panel.)
smk = gm.analyze_spatial_markov(pos, "ntl_total", gdf=gdf_pos, w=w_pos, k=4)
print(f"Homogeneity LR test: {smk.lr_stat:.1f} (p = {smk.lr_p:.2g}) — "
      "transition dynamics differ by neighborhood")
smk.fig
Homogeneity LR test: 73.6 (p = 6.3e-07) — transition dynamics differ by neighborhood
print(smk.interpret())
The spatial Markov chain splits **ntl_total**'s 4-state transition matrix by the neighbors' position (spatial lag under 6-nearest-neighbor (geographic centroids), row-standardized, n=519), giving one matrix per neighborhood class (4 classes); values were expressed relative to each period's mean first.
The unconditional matrix keeps regions in place with average probability 0.871.
Conditioning on context, regions surrounded by **low-value neighbors** stay in place with average probability 0.837, while regions surrounded by **high-value neighbors** do so with 0.896 — movement is more common in low-value neighborhoods.
The homogeneity tests (LR = 73.6, p = 6.26e-07; Q = 73.5, p = 6.3e-07; 24 degrees of freedom) are statistically significant at the 1% level: transition dynamics **differ across neighborhood contexts** — a region's mobility is associated with the state of its neighbors.
_These are associations, not causal effects. A causal reading needs a research design — see `explain('correlation_vs_causation')`._
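The idea of conditioning transition matrices on neighborhood context can be sketched in a few lines. The code below is a simplified illustration with hypothetical names and toy data, not the geometrics routine. It assumes each observation has already been discretized into a state, together with the class of its spatial lag, and that the "average probability of staying in place" is the mean of a matrix's diagonal.

```python
def transition_matrix(pairs, k):
    """Row-normalized k x k matrix from (state_t, state_t_plus_1) pairs."""
    counts = [[0] * k for _ in range(k)]
    for origin, destination in pairs:
        counts[origin][destination] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observations are left as zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

def mean_stay_probability(matrix):
    """Average of the diagonal: the typical chance of staying in place."""
    return sum(matrix[s][s] for s in range(len(matrix))) / len(matrix)

def spatial_markov(transitions, k):
    """One transition matrix per neighbor class.

    `transitions` holds (lag_class, state_t, state_t_plus_1) triples, where
    lag_class is the class of the spatial lag at time t.
    """
    by_class = {c: [] for c in range(k)}
    for lag_class, origin, destination in transitions:
        by_class[lag_class].append((origin, destination))
    return {c: transition_matrix(p, k) for c, p in by_class.items()}

# Hypothetical two-state example (0 = low, 1 = high), illustration only.
demo = [(0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0),
        (1, 0, 0), (1, 0, 0), (1, 1, 1), (1, 1, 1)]
conditional = spatial_markov(demo, k=2)
stay_low = mean_stay_probability(conditional[0])   # 0.5
stay_high = mean_stay_probability(conditional[1])  # 1.0
```

In the toy data, units with low-value neighbors move between states half the time, while units with high-value neighbors never move. The homogeneity tests reported above ask whether differences of this kind across the conditional matrices are larger than sampling noise would produce.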
Regional inequality
σ-convergence and the between/within split — how much of district inequality is between states?
The Theil index of **ntl_total** splits additively into inequality **between** the 28 state groups and inequality **within** them (between + within = total, exactly).
In the latest period (2010), the total Theil index is 0.504: about 59% of it lies between state groups and 41% within them — differences across state means are the dominant layer of inequality.
Over the window, the between-group share fell from 62% to 59% — group means are pulling closer together relative to the differences inside groups.
_These are associations, not causal effects. A causal reading needs a research design — see `explain('correlation_vs_causation')`._
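The additive split can be verified by hand. The sketch below assumes the index is Theil's T, (1/n) Σ (y_i/μ) ln(y_i/μ), for which the between and within components sum to the total exactly. The function names and the two-state toy data are hypothetical.

```python
import math

def theil_t(values):
    """Theil T index: mean of (y/mu) * ln(y/mu); values must be positive."""
    mu = sum(values) / len(values)
    return sum((y / mu) * math.log(y / mu) for y in values) / len(values)

def theil_decomposition(groups):
    """Split Theil T into between-group and within-group parts.

    `groups` maps a group label to the list of positive values in that group.
    """
    pooled = [y for ys in groups.values() for y in ys]
    n, mu = len(pooled), sum(pooled) / len(pooled)
    between = within = 0.0
    for ys in groups.values():
        mu_g = sum(ys) / len(ys)
        share = (len(ys) / n) * (mu_g / mu)  # group's share of the total
        between += share * math.log(mu_g / mu)
        within += share * theil_t(ys)
    return theil_t(pooled), between, within

# Hypothetical luminosity values for two states, illustration only.
demo_groups = {"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0]}
total, between, within = theil_decomposition(demo_groups)
print(f"between share: {between / total:.0%}")
```

The logarithm explains why the always-positive panel is needed here: a district with zero luminosity has no defined contribution to the index.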
Convergence clubs
Finally, the Phillips-Sul log(t) machinery asks whether all districts share one steady-state path or sort into clubs:
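The core of the log(t) test is a single regression on the cross-sectional dispersion of relative transition paths. The sketch below computes only the OLS slope b and rests on stated assumptions: the slowly varying function is taken to be L(t) = log(t + 1), the first 30% of periods are discarded, and the function name and synthetic panel are hypothetical. The formal test additionally needs a HAC standard error for b, with convergence rejected when the one-sided t-statistic falls below -1.65. That step and the club-clustering algorithm are omitted here.

```python
import math

def log_t_slope(panel, r=0.3):
    """OLS slope b of the log t regression for a units-by-periods panel.

    Assumes L(t) = log(t + 1); periods are indexed t = 1, ..., T and the
    first r * T periods are discarded before fitting.
    """
    n_units, n_periods = len(panel), len(panel[0])
    h_var = []
    for t in range(n_periods):
        mean_t = sum(row[t] for row in panel) / n_units
        # Cross-sectional variance of the relative transition path h_it.
        h_var.append(sum((row[t] / mean_t - 1.0) ** 2 for row in panel) / n_units)
    start = max(1, int(r * n_periods))
    xs, ys = [], []
    for t in range(start, n_periods + 1):
        xs.append(math.log(t))
        ys.append(math.log(h_var[0] / h_var[t - 1])
                  - 2.0 * math.log(math.log(t + 1)))
    x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx

# Synthetic panel whose dispersion decays so that b = 2 * alpha by design.
alpha = 0.25
deltas = [-0.3, -0.1, 0.1, 0.3]  # sum to zero, so each period's mean is 1
demo_panel = [[1.0 + d / (math.log(t + 1) * t ** alpha) for t in range(1, 21)]
              for d in deltas]
b_hat = log_t_slope(demo_panel)
```

A slope at or above zero is compatible with all units approaching a common path. A clearly negative slope is what sends the procedure looking for clubs of units that converge among themselves.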
Mendez, C., Kabiraj, S., & Li, J. — Regional growth, convergence, and spatial spillovers in India: A reproducible view from outer space (repository, interactive manuscript)
Chanda, A., & Kabiraj, S. (2020). Shedding light on regional growth and convergence in India. World Development, 133.
Data: DMSP-OLS radiance-calibrated nighttime lights (NOAA/NGDC), district boundaries from the 2001 Census geography.