analyze_inequality_over_time

analyze_inequality_over_time(
    df,
    var,
    *,
    entity=None,
    time=None,
    measures=('gini', 'theil'),
    gdf=None,
    w=None,
    permutations=99,
    start=None,
    end=None,
    title=None,
)

Track cross-sectional inequality measures over time and test their trend.

For every period, the function computes the requested inequality measures of var across units: the Gini index (:class:inequality.gini.Gini), the Theil index (:class:inequality.theil.Theil) and the coefficient of variation (sample standard deviation over the mean). It then regresses the log of each measure on time (OLS with HC1 standard errors). A negative, significant slope means inequality is narrowing, which is the inequality-narrative complement of σ-convergence.
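The per-period measures and the log-linear trend described above can be sketched with NumPy alone. This is a minimal illustration, not the library's implementation: the helper names (gini, theil, cv, log_trend) are hypothetical, the formulas are the textbook definitions, and time is centered before fitting purely for numerical stability (this does not change the slope).

```python
import numpy as np


def gini(x):
    """Gini index: sum of absolute pairwise differences over 2 * n^2 * mean."""
    x = np.asarray(x, dtype=float)
    diffs = np.abs(x[:, None] - x[None, :]).sum()
    return float(diffs / (2 * x.size**2 * x.mean()))


def theil(x):
    """Theil T index; requires strictly positive values."""
    x = np.asarray(x, dtype=float)
    share = x / x.mean()
    return float(np.mean(share * np.log(share)))


def cv(x):
    """Coefficient of variation: sample std (ddof=1) over the mean."""
    x = np.asarray(x, dtype=float)
    return float(x.std(ddof=1) / x.mean())


def log_trend(times, values):
    """OLS slope of log(values) on time, with its HC1 standard error."""
    t = np.asarray(times, dtype=float)
    y = np.log(np.asarray(values, dtype=float))
    X = np.column_stack([np.ones_like(t), t - t.mean()])  # centered time
    n, k = X.shape
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    meat = X.T @ (X * resid[:, None] ** 2)  # X' diag(e^2) X
    cov = n / (n - k) * bread @ meat @ bread  # HC1 small-sample scaling
    return float(beta[1]), float(np.sqrt(cov[1, 1]))


# Illustrative data: three units observed in three years.
periods = {
    2000: [10.0, 20.0, 40.0],
    2001: [12.0, 21.0, 38.0],
    2002: [14.0, 22.0, 36.0],
}
ginis = [gini(v) for v in periods.values()]
slope, se = log_trend(list(periods), ginis)
print(ginis, slope, se)
```

With these illustrative values the Gini falls in every period, so the fitted slope is negative; whether it counts as significant depends on the standard error, which is not very informative with only three periods.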

When geometry is supplied (gdf, with w optional), the per-period spatial Gini decomposition of Rey & Smith (2013) (:class:inequality.gini.Gini_Spatial) is added. gini_spatial is the component of the overall Gini owed to neighbor pairs under w, so gini_spatial <= gini, with the remainder owed to non-neighbor pairs. gini_spatial_p is the permutation pseudo p-value testing whether the non-neighbor component exceeds its expectation under spatial randomness. Units are aligned to the geometry in each period, and the same entity set is used across periods (the intersection of the per-period complete cases).
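The neighbor / non-neighbor split described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the Gini_Spatial implementation: the helper names are hypothetical, the weights are represented as a dense binary symmetric adjacency matrix, the pseudo p-value is a simple one-sided version, and a local seeded generator is used instead of the global RNG.

```python
import numpy as np


def spatial_gini_parts(x, adj):
    """Return (total Gini, neighbor component, non-neighbor component)."""
    x = np.asarray(x, dtype=float)
    adj = np.asarray(adj, dtype=bool)
    diffs = np.abs(x[:, None] - x[None, :])
    denom = 2 * x.size**2 * x.mean()
    total = diffs.sum() / denom
    neighbor = diffs[adj].sum() / denom  # only pairs linked in adj
    return float(total), float(neighbor), float(total - neighbor)


def pseudo_p(x, adj, permutations=99, seed=0):
    """One-sided permutation pseudo p-value for the non-neighbor component."""
    rng = np.random.default_rng(seed)  # local RNG: an assumption of this sketch
    observed = spatial_gini_parts(x, adj)[2]
    sims = np.array(
        [spatial_gini_parts(rng.permutation(x), adj)[2] for _ in range(permutations)]
    )
    return float((np.sum(sims >= observed) + 1) / (permutations + 1))


# Four units on a line: 0-1, 1-2, 2-3 are neighbors.
adj = np.zeros((4, 4), dtype=bool)
for i, j in [(0, 1), (1, 2), (2, 3)]:
    adj[i, j] = adj[j, i] = True

x = [10.0, 12.0, 30.0, 33.0]
total, neighbor, non_neighbor = spatial_gini_parts(x, adj)
p_value = pseudo_p(x, adj, permutations=99)
print(total, neighbor, non_neighbor, p_value)
```

Because both components share the same denominator, they sum to the overall Gini by construction, which is why the neighbor component can never exceed it.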

Parameters

Name Type Description Default
df pd.DataFrame Long-form panel data frame. required
var str Numeric variable whose cross-sectional inequality is tracked (e.g. GDP per capita in levels). Used as supplied; the Theil index requires strictly positive values. required
entity str | None Name of the entity (unit) identifier column. Defaults to the one declared via :func:geometrics.set_panel. None
time str | None Name of the time identifier column. Defaults to the one declared via :func:geometrics.set_panel. None
measures Sequence[str] Measures to compute per period, from "gini", "theil" and "cv". ('gini', 'theil')
gdf gpd.GeoDataFrame | None Geometry frame (see :func:geometrics.read_gdf) enabling the spatial Gini decomposition. None skips it. None
w W | None libpysal weights aligned to the gdf entity ids (only its neighbor structure is used, as a binary graph). If None while gdf is given, the default weights are built (queen contiguity for polygons) and a :class:~geometrics.GeometricsWarning is issued. None
permutations int Number of permutations for the spatial-Gini inference (0 disables it; gini_spatial_p is then NaN). 99
start Any Optional first period to include (inclusive, on the scale of the time column). Defaults to the earliest period. None
end Any Optional last period to include (inclusive, on the scale of the time column). Defaults to the latest period. None
title str | None Title for the figure. None

Returns

Name Type Description
InequalityOverTimeResult Per-period measures df (time, n_units and one column per measure, plus gini_spatial / gini_spatial_p when spatial); the measures-over-time fig with dashed fitted trends; the per-measure trend table gt / summary (measure, slope, se, pvalue, r2, converging); the fitted trend models; n_periods / n_units; and w_spec describing the weights (None without geometry).

Raises

Name Type Description
KeyError If var is not a column of df.
TypeError If var is not numeric.
ValueError For unknown measures, a Theil request on non-positive values (the offending entities are named), w without gdf, fewer than two periods, or a period with fewer than two complete observations.
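Several of the ValueError conditions above can be checked on the data before calling the function. The sketch below is a hypothetical pre-flight helper written with pandas; preflight and its arguments are not part of the library, and the messages it builds are illustrative rather than the ones the function raises.

```python
import pandas as pd


def preflight(df, var, entity, time, need_positive=True):
    """Hypothetical helper: list data problems likely to trigger ValueError."""
    problems = []
    complete = df.dropna(subset=[var, entity, time])
    counts = complete.groupby(time)[var].size()
    if counts.size < 2:
        problems.append("fewer than two periods")
    thin = counts[counts < 2].index.tolist()
    if thin:
        problems.append(f"periods with fewer than two complete observations: {thin}")
    if need_positive:  # relevant when "theil" is among the measures
        bad = sorted(complete.loc[complete[var] <= 0, entity].unique().tolist())
        if bad:
            problems.append(f"non-positive values for entities: {bad}")
    return problems


panel = pd.DataFrame(
    {
        "region": ["A", "B", "C"] * 2,
        "year": [2000] * 3 + [2001] * 3,
        "gdppc": [10.0, 0.0, 40.0, 12.0, 21.0, 38.0],
    }
)
problems = preflight(panel, "gdppc", "region", "year")
print(problems)
```

An empty list means none of the sketched conditions were found; it does not guarantee that the call succeeds.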

Notes

Gini_Spatial draws permutations from NumPy’s global RNG and has no seed parameter, so the global seed is set to a fixed value (12345) at the start of the spatial loop to make gini_spatial_p reproducible.
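The effect of seeding NumPy's global RNG can be illustrated in isolation. The sketch below does not call the library; reproducible_draws is a hypothetical stand-in for a routine that draws permutations from the global RNG. It also shows one way a caller might preserve their own global RNG state around such a call, using np.random.get_state and np.random.set_state.

```python
import numpy as np


def reproducible_draws(n_units, permutations, seed=12345):
    """Hypothetical stand-in: seed the global RNG, then draw permutations."""
    np.random.seed(seed)  # resets NumPy's global RNG state
    return [np.random.permutation(n_units) for _ in range(permutations)]


first = reproducible_draws(5, 3)
second = reproducible_draws(5, 3)
same = all((a == b).all() for a, b in zip(first, second))
print(same)

# Callers relying on their own global seed can save and restore the state.
state = np.random.get_state()
reproducible_draws(5, 3)
np.random.set_state(state)
```

Saving and restoring the state is optional; it only matters for code that depends on the global RNG sequence after the call.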

Examples

Inequality trend across three regions over three years:

import pandas as pd

from geometrics.regional_inequality import analyze_inequality_over_time

df = pd.DataFrame(
    {
        "region": ["A", "B", "C"] * 3,
        "year": [2000] * 3 + [2001] * 3 + [2002] * 3,
        "gdppc": [10.0, 20.0, 40.0, 12.0, 21.0, 38.0, 14.0, 22.0, 36.0],
    }
)
res = analyze_inequality_over_time(
    df, "gdppc", entity="region", time="year", measures=("gini", "theil")
)
(res.df["gini"].round(3).tolist(), bool(res.summary.loc[0, "converging"]))