analyze_spatial_markov
analyze_spatial_markov(
    df,
    var,
    *,
    gdf,
    w=None,
    entity=None,
    time=None,
    k=5,
    m=None,
    fixed=True,
    relative=True,
    title=None,
)

Estimate a spatial Markov chain: transitions conditioned on the neighbors’ state.
Rey’s (2001) spatial Markov chain splits the classic transition matrix by the spatial lag of each region, that is, the (weighted) average of its neighbors, discretized into m classes. The result is one k-by-k matrix per neighbor class, which shows whether upward or downward moves happen at different rates in rich versus poor neighborhoods. The Bickenbach-Bode LR and Q tests then ask whether those conditional dynamics differ from the pooled (unconditional) matrix.
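The conditioning described above can be sketched with plain NumPy. This is an illustration only, not the function's implementation: the helper name `conditional_transition_matrices` is hypothetical, the inputs are assumed to be already discretized integer classes, and each transition is assumed to be filed under the neighbor class observed at the start of the transition.

```python
import numpy as np


def conditional_transition_matrices(states, lag_states, k, m):
    """Count and row-normalize transitions, one k-by-k matrix per lag class.

    states, lag_states: integer arrays of shape (n_regions, n_periods)
    holding the own class (0..k-1) and the neighbor class (0..m-1).
    """
    states = np.asarray(states)
    lag_states = np.asarray(lag_states)
    counts = np.zeros((m, k, k))
    for t in range(states.shape[1] - 1):
        # Assumption: condition on the neighbor class at period t.
        index = (lag_states[:, t], states[:, t], states[:, t + 1])
        np.add.at(counts, index, 1)
    row_totals = counts.sum(axis=2, keepdims=True)
    # Rows with no observations stay at zero instead of dividing by zero.
    probs = np.divide(
        counts, row_totals, out=np.zeros_like(counts), where=row_totals > 0
    )
    return counts, probs


# Toy data: 3 regions observed over 3 periods, k = m = 2.
states = np.array([[0, 0, 1], [1, 1, 1], [0, 1, 1]])
lag_states = np.array([[0, 1, 1], [1, 1, 1], [0, 0, 1]])
counts, probs = conditional_transition_matrices(states, lag_states, k=2, m=2)

# Summing the conditional counts over lag classes recovers the pooled counts.
pooled_counts = counts.sum(axis=0)
```

Here `probs[0]` holds the transition probabilities for regions whose neighbors sit in the lower class and `probs[1]` those for the upper class; comparing them with the row-normalized `pooled_counts` is what the homogeneity tests formalize.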
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| df | pd.DataFrame | Long-form panel with entity, time and var columns. The panel must be balanced in var (every entity observed in every period). | required |
| var | str | Numeric variable whose distribution dynamics are analyzed. | required |
| gdf | gpd.GeoDataFrame | Geometry frame carrying the entity ids (see geometrics.read_gdf); the panel is aligned to the weights’ row order through it. | required |
| w | W \| None | libpysal weights aligned to the gdf entity ids. None builds the default weights (queen contiguity for polygons, 6-nearest-neighbor otherwise) with a GeometricsWarning. | None |
| entity | str \| None | Entity (unit) id column of df; defaults to the declared panel. | None |
| time | str \| None | Time id column; defaults to the declared panel. | None |
| k | int | Number of states for the variable itself (default 5). | 5 |
| m | int \| None | Number of classes for the spatial lag (default: same as k). | None |
| fixed | bool | Pool the n*t values into one quantile classification (default True, giddy’s convention); False re-classifies each period separately. | True |
| relative | bool | Divide var by its cross-sectional mean per period first (default True, the distribution-dynamics convention for income data). | True |
| title | str \| None | Figure title (a default naming the variable is used when None). | None |
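To make the effect of relative and fixed concrete, the following sketch reproduces the two preprocessing steps with pandas. It is an approximation under stated assumptions: the helper `classify` is hypothetical, and `pd.qcut` stands in for whatever quantile classifier the function actually uses, so class boundaries may differ in edge cases such as ties.

```python
import pandas as pd


def classify(df, var, time, k, fixed=True, relative=True):
    """Return (values, class codes) after optional per-period normalization."""
    values = df[var].astype(float)
    if relative:
        # Divide each observation by the cross-sectional mean of its period.
        values = values / values.groupby(df[time]).transform("mean")
    if fixed:
        # One set of quantile breaks computed from all n*t pooled values.
        codes = pd.qcut(values, k, labels=False)
    else:
        # Separate quantile breaks for every period.
        codes = values.groupby(df[time]).transform(
            lambda s: pd.qcut(s, k, labels=False)
        )
    return values, codes


# Toy panel: incomes grow tenfold between the two years.
panel = pd.DataFrame(
    {
        "region": list("abcd") * 2,
        "year": [2000] * 4 + [2001] * 4,
        "income": [1, 2, 3, 4, 10, 20, 30, 40],
    }
)

_, pooled_relative = classify(panel, "income", "year", k=2)
_, pooled_raw = classify(panel, "income", "year", k=2, relative=False)
_, per_period_raw = classify(
    panel, "income", "year", k=2, fixed=False, relative=False
)
```

With raw values and pooled breaks, every 2000 observation lands in the lower class and every 2001 observation in the upper one, so the classes track overall growth. Normalizing by the period mean, or re-classifying each period, keeps the classes about relative position instead.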
Returns
| Name | Type | Description |
|---|---|---|
| | SpatialMarkovResult | The long panel with each (entity, period) state and neighbor_state, the unconditional p_global, the tuple of m conditional matrices p_conditional (with steady_states stacking their ergodic distributions), the small-multiple heatmap fig, the homogeneity-test table gt, and the LR / Q statistics with their p-values and dof. |
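The ergodic distribution that steady_states reports for each conditional matrix is the long-run share of regions in each class. As a minimal sketch, assuming an irreducible, aperiodic chain with a unique stationary distribution, it can be recovered as the left eigenvector of the transition matrix for eigenvalue 1. The helper `steady_state` is hypothetical and not necessarily how the library computes the result.

```python
import numpy as np


def steady_state(p):
    """Stationary distribution pi of a row-stochastic matrix (pi @ p == pi)."""
    p = np.asarray(p, dtype=float)
    # Left eigenvectors of p are right eigenvectors of its transpose.
    eigvals, eigvecs = np.linalg.eig(p.T)
    # Pick the eigenvector whose eigenvalue is closest to 1.
    idx = np.argmin(np.abs(eigvals - 1.0))
    pi = np.real(eigvecs[:, idx])
    # Rescale so the entries sum to one (this also fixes the sign).
    return pi / pi.sum()


# Illustrative 2-state matrix, not output from analyze_spatial_markov.
p = np.array([[0.9, 0.1], [0.2, 0.8]])
pi = steady_state(p)
```

For this matrix the chain spends two thirds of the time in the first state and one third in the second. Applying the same function to each matrix in p_conditional and stacking the rows would give an m-by-k array comparable to steady_states.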
Raises
| Name | Type | Description |
|---|---|---|
| | ImportError | If the optional giddy dependency is not installed. |
| | KeyError | If var is not a column of df. |
| | TypeError | If var is not numeric. |
| | ValueError | If k < 2 or m < 2, the panel is unbalanced, fewer than two periods are observed, or the weights ids do not match the geometry. |
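Because an unbalanced panel raises ValueError, it can help to check the data before calling the function. The sketch below is one way to do that with pandas; `check_balanced` is a hypothetical helper, its messages are not the library's, and it does not detect duplicated (entity, period) rows.

```python
import pandas as pd


def check_balanced(df, var, entity, time):
    """Raise ValueError unless every entity has var observed in every period."""
    n_periods = df[time].nunique()
    if n_periods < 2:
        raise ValueError("at least two periods are required")
    # Count the periods in which each entity has a non-missing value.
    observed = df.dropna(subset=[var]).groupby(entity)[time].nunique()
    # Entities whose values are all missing would otherwise vanish here.
    observed = observed.reindex(df[entity].unique(), fill_value=0)
    incomplete = observed[observed < n_periods]
    if not incomplete.empty:
        raise ValueError(
            f"unbalanced panel, incomplete entities: {list(incomplete.index)}"
        )


balanced = pd.DataFrame(
    {
        "region": ["r0", "r1", "r0", "r1"],
        "year": [2000, 2000, 2001, 2001],
        "income": [1.0, 2.0, 1.5, 2.5],
    }
)
check_balanced(balanced, "income", "region", "year")  # passes silently

# Dropping one row leaves r1 without a 2001 observation.
unbalanced = balanced.iloc[:-1]
```

Calling `check_balanced(unbalanced, "income", "region", "year")` raises ValueError and names `r1` as the incomplete entity.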
Examples
A 3x3 lattice where income levels follow a smooth spatial gradient:
import geopandas as gpd
import numpy as np
import pandas as pd
from shapely.geometry import box

from geometrics.distribution_dynamics import analyze_spatial_markov
from geometrics.weights import make_weights

cells = [box(x, y, x + 1, y + 1) for y in range(3) for x in range(3)]
gdf = gpd.GeoDataFrame(
    {"region": [f"r{i}" for i in range(9)]}, geometry=cells, crs="EPSG:4326"
)
w = make_weights(gdf, method="queen")

rng = np.random.default_rng(1)
df = pd.DataFrame(
    [
        {"region": f"r{i}", "year": y, "income": 1.0 + i / 4 + rng.normal(0, 0.3)}
        for y in (2000, 2001, 2002, 2003)
        for i in range(9)
    ]
)

res = analyze_spatial_markov(
    df, "income", gdf=gdf, w=w, entity="region", time="year", k=2, m=2
)
res.p_global.round(2)