analyze_sigma_convergence

analyze_sigma_convergence(
    df,
    var,
    *,
    entity=None,
    time=None,
    start=None,
    end=None,
    min_periods=3,
    vcov='hetero',
    title=None,
)

σ-convergence: track and test the cross-sectional dispersion of a panel variable.

For each period the function measures how spread out the log of var is across units — the standard deviation (the classic σ), the Gini index and the coefficient of variation — and then asks whether that dispersion shrinks over time by regressing the log dispersion on time. A negative trend slope is σ-convergence: the cross-sectional distribution is narrowing (units are becoming more alike). This is the distributional complement to β-convergence (:func:analyze_beta_convergence).

The variable is taken in levels and logged internally (so pass GDP per capita, not log GDP per capita). The panel must be balanced (every unit present in every period) so the dispersion is comparable across periods.
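The balance requirement can be checked before calling the function. The following is a minimal sketch using pandas only; `is_balanced` is a hypothetical helper written for illustration, not part of the library, and the library's own check may differ in detail.

```python
import pandas as pd


def is_balanced(df: pd.DataFrame, entity: str, time: str) -> bool:
    """Return True if every entity appears exactly once in every period.

    Hypothetical helper for illustration; not part of the library.
    """
    counts = df.groupby([entity, time]).size()
    n_entities = df[entity].nunique()
    n_periods = df[time].nunique()
    # Balanced: no duplicate (entity, time) rows and a full entity x period grid.
    return bool((counts == 1).all() and len(counts) == n_entities * n_periods)


panel = pd.DataFrame(
    {
        "region": ["a", "a", "b", "b"],
        "year": [2000, 2001, 2000, 2001],
        "gdppc": [1000.0, 1100.0, 2000.0, 2100.0],
    }
)
balanced = is_balanced(panel, "region", "year")
# Dropping the last row removes region "b" from 2001, so the grid has a hole.
unbalanced = is_balanced(panel.iloc[:-1], "region", "year")
```

Here `balanced` is `True` and `unbalanced` is `False`. Running a check like this first makes it easier to see which units or periods to drop (for example through `start` and `end`) before the function raises.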

Parameters

Name Type Description Default
df pd.DataFrame Long panel data frame. required
var str Numeric, strictly positive variable in levels whose log dispersion is tracked. required
entity str | None Column identifying the panel unit. Defaults to the one declared via :func:geometrics.set_panel. None
time str | None Column identifying the period. Defaults to the one declared via :func:geometrics.set_panel. None
start float | None Optional first period to include. Defaults to the start of the full range; the retained window must still be balanced. None
end float | None Optional last period to include. Defaults to the end of the full range; the retained window must still be balanced. None
min_periods int Minimum number of periods required to estimate a dispersion trend (at least 3). 3
vcov str Standard errors of the trend regressions: "hetero" (HC1, default) or "iid". Does not change the point estimates. 'hetero'
title str | None Title for the dual-axis figure. None

Returns

Name Type Description
SigmaConvergenceResult The per-period dispersion table df (time, n_units, mean, std, gini, cv — all on log values); the dual-axis fig (std on the left axis, Gini on the right, with fitted trend overlays); the trend table gt / summary; the fitted trend models; and the headline trend scalars (std_slope / std_se / std_pvalue / std_r2 plus the gini_* and cv_* counterparts). notes records any degraded measure.

Raises

Name Type Description
KeyError If var is not a column of df.
TypeError If var is not numeric.
ValueError If no usable rows remain, var has non-positive values (the log is undefined), the panel is unbalanced, or there are too few units/periods.

Notes

For a dispersion measure :math:D_t computed cross-sectionally at each period t, the trend is the OLS slope b in :math:\ln D_t = a + b t + \varepsilon_t, so b is approximately the average proportional change in dispersion per period and b < 0 is σ-convergence. The standard deviation uses ddof = 1; the Gini index is the mean absolute difference over twice the mean (that is, half the relative mean absolute difference); the coefficient of variation is the standard deviation over the mean. See Barro & Sala-i-Martin, Economic Growth, ch. 11.
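The definitions above can be sketched with NumPy alone. This is an illustrative reimplementation under stated assumptions, not the library's code: the helper names `gini`, `dispersion` and `trend_slope` are hypothetical, and the Gini sketch averages absolute differences over all n² ordered pairs, which is one common convention and may not be the one the library uses.

```python
import numpy as np


def gini(x) -> float:
    """Mean absolute difference over all ordered pairs, divided by twice the mean."""
    x = np.asarray(x, dtype=float)
    mad = np.abs(x[:, None] - x[None, :]).mean()
    return float(mad / (2.0 * x.mean()))


def dispersion(levels) -> dict:
    """Cross-sectional dispersion of log(levels) for a single period."""
    logs = np.log(np.asarray(levels, dtype=float))
    std = logs.std(ddof=1)  # sample standard deviation, as in the Notes
    return {"std": float(std), "gini": gini(logs), "cv": float(std / logs.mean())}


def trend_slope(t, d) -> float:
    """OLS slope b in ln D_t = a + b t + e_t."""
    b, _a = np.polyfit(np.asarray(t, dtype=float), np.log(d), deg=1)
    return float(b)


# Toy panel: rows are periods, columns are units. The log spread halves each period.
levels = np.exp(np.array([[1.0, 2.0, 3.0], [1.5, 2.0, 2.5], [1.75, 2.0, 2.25]]))
stds = [dispersion(row)["std"] for row in levels]
b_std = trend_slope([0, 1, 2], stds)
```

Because the log standard deviation halves each period in this toy panel, `b_std` equals ln 0.5 (about -0.693) up to floating-point error, which is the σ-convergence case b < 0.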

Examples

Dispersion shrinks by construction (each period, every unit closes 20% of its log gap to the cross-sectional mean):

import numpy as np
import pandas as pd

from geometrics.convergence import analyze_sigma_convergence

ids = [f"r{i}" for i in range(8)]
y0 = np.linspace(1000.0, 8000.0, 8)
frames = [
    pd.DataFrame(
        {
            "region": ids,
            "year": 2000 + t,
            "gdppc": np.exp(
                np.log(y0).mean() + (np.log(y0) - np.log(y0).mean()) * 0.8**t
            ),
        }
    )
    for t in range(5)
]
res = analyze_sigma_convergence(
    pd.concat(frames), "gdppc", entity="region", time="year"
)
res.std_slope < 0
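As a sanity check on the example, the expected trend can be derived by hand: the log deviations are scaled by 0.8**t, so the standard deviation of the logs is multiplied by 0.8 each period and its log falls by ln 0.8 per year. The sketch below reproduces that with NumPy only and does not call the library; assuming the function computes the standard deviation as described in the Notes, `res.std_slope` should be close to the same value.

```python
import numpy as np

y0 = np.linspace(1000.0, 8000.0, 8)
log0 = np.log(y0)
periods = np.arange(5)  # the slope does not depend on the time origin (0 vs 2000)

# Same construction as the example: log deviations from the mean scaled by 0.8**t.
log_std = [
    np.std(log0.mean() + (log0 - log0.mean()) * 0.8**t, ddof=1) for t in periods
]

# Regress ln(std_t) on t; the slope is the per-period proportional change.
slope, intercept = np.polyfit(periods, np.log(log_std), deg=1)
expected = np.log(0.8)  # about -0.223
```

The fitted `slope` matches `expected` up to floating-point error, so the dispersion of the logs shrinks by roughly 20% per year in this constructed panel.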