explore_distribution_over_time
explore_distribution_over_time(
    df,
    var,
    *,
    entity=None,
    time=None,
    relative=False,
    periods=None,
    kind='ridgeline',
    bandwidth=None,
    title=None,
)

Track how the cross-sectional distribution of one variable evolves over time.
A Gaussian kernel density of var is estimated per period (scipy.stats.gaussian_kde) and evaluated on a single grid shared by all periods, so the densities are directly comparable. kind="ridgeline" stacks one filled density per period with a small vertical offset (newest period on top); kind="animated" shows a single density animated over the periods with a play button and slider.
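The per-period estimation on a shared grid can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name densities_on_shared_grid, the n_points argument, and the choice of a grid spanning the pooled minimum and maximum are all assumptions made for the example. Only scipy.stats.gaussian_kde and its bw_method argument come from the description above.

```python
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde


def densities_on_shared_grid(df, var, time, bandwidth=None, n_points=200):
    """Hypothetical helper: one Gaussian KDE per period, all on one grid."""
    # Assumption: the grid spans the pooled range of var over all periods.
    grid = np.linspace(df[var].min(), df[var].max(), n_points)
    frames = []
    for period, values in df.groupby(time)[var]:
        # bw_method=None falls back to scipy's default (Scott's rule).
        kde = gaussian_kde(values.to_numpy(), bw_method=bandwidth)
        frames.append(
            pd.DataFrame({"time": period, "value": grid, "density": kde(grid)})
        )
    # Tidy frame with the columns time, value, density.
    return pd.concat(frames, ignore_index=True)


panel = pd.DataFrame(
    {
        "year": [2000] * 8 + [2010] * 8,
        "gdppc": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
        + [2.0, 2.5, 3.5, 4.5, 5.0, 6.5, 7.0, 7.5],
    }
)
tidy = densities_on_shared_grid(panel, "gdppc", "year")
```

Because every period is evaluated at the same value points, the density curves can be overlaid or stacked without any further interpolation.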
Parameters
| Name | Type | Description | Default |
|---|---|---|---|
| df | pd.DataFrame | Long panel holding var per entity and period. | required |
| var | str | Numeric column of df whose distribution is tracked. | required |
| entity | str \| None | Panel identifiers; default to the ids declared via geometrics.set_panel. A time id is required. | None |
| time | str \| None | Panel identifiers; default to the ids declared via geometrics.set_panel. A time id is required. | None |
| relative | bool | Divide var by its cross-sectional mean per period before density estimation (the distribution-dynamics convention): 1.0 marks the period average and a dashed vertical line is drawn at 1. | False |
| periods | Sequence[Any] \| None | Subset of periods to include (default: all periods in df). Unknown periods raise ValueError. | None |
| kind | Literal['ridgeline', 'animated'] | "ridgeline" (stacked filled densities, one trace per period) or "animated" (one density trace animated over periods with a slider). | 'ridgeline' |
| bandwidth | float \| str \| None | Kernel bandwidth passed to scipy.stats.gaussian_kde as bw_method (a scalar factor or "scott" / "silverman"). None uses scipy's default (Scott's rule). | None |
| title | str \| None | Figure title. Defaults to a description built from the variable label. | None |
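The normalization behind relative=True can be illustrated with plain pandas. This is a sketch under assumptions: the helper name to_relative and the toy panel are hypothetical, and the library may compute the same quantity differently.

```python
import pandas as pd


def to_relative(df, var, time):
    """Hypothetical helper: divide var by its cross-sectional mean per period."""
    # transform("mean") returns the period mean aligned to every row.
    period_mean = df.groupby(time)[var].transform("mean")
    return df[var] / period_mean


panel = pd.DataFrame(
    {
        "year": [2000, 2000, 2010, 2010],
        "gdppc": [1.0, 3.0, 2.0, 6.0],
    }
)
panel["gdppc_rel"] = to_relative(panel, "gdppc", "year")
```

After this step each period averages to 1.0, so a value of 1.5 reads as 50 percent above that period's mean, regardless of how the overall level changed between periods.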
Returns
| Name | Type | Description |
|---|---|---|
| | DistributionOverTimeResult | Frozen result with the tidy evaluation frame df (columns time, value, density), the themed fig, and notes. |
Raises
| Name | Type | Description |
|---|---|---|
| | KeyError | If var is not a column of df. |
| | TypeError | If var is not numeric. |
| | ValueError | If no time id resolves, kind is unknown, a requested period is absent, or a period has fewer than 2 distinct values. |
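A caller who wants to avoid the ValueError for periods with fewer than 2 distinct values could screen the panel beforehand. The sketch below is one possible pre-check, not part of the library: the helper name periods_with_too_few_values and its min_distinct argument are hypothetical.

```python
import pandas as pd


def periods_with_too_few_values(df, var, time, min_distinct=2):
    """Hypothetical pre-check: list periods lacking enough distinct values."""
    # nunique() ignores missing values by default.
    distinct = df.groupby(time)[var].nunique()
    return distinct[distinct < min_distinct].index.tolist()


panel = pd.DataFrame(
    {
        "year": [2000, 2000, 2010, 2010],
        "gdppc": [1.0, 3.0, 5.0, 5.0],
    }
)
bad = periods_with_too_few_values(panel, "gdppc", "year")
# Keep only the periods that pass the check, e.g. to pass them as periods=.
usable = [p for p in panel["year"].unique().tolist() if p not in bad]
```

Here the year 2010 holds a single distinct value, so it is flagged and only 2000 remains in usable.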
Examples
Ridgeline of a small two-period panel:
import pandas as pd
from geometrics.spacetime import explore_distribution_over_time

df = pd.DataFrame(
    {
        "region": list("abcdefgh") * 2,
        "year": [2000] * 8 + [2010] * 8,
        "gdppc": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
        + [2.0, 2.5, 3.5, 4.5, 5.0, 6.5, 7.0, 7.5],
    }
)
res = explore_distribution_over_time(df, "gdppc", entity="region", time="year")
len(res.fig.data)