set_roles

set_roles(df, *, outcome=None, covariates=None)

Declare the main outcome and covariates on df and return it.

The roles are stored under df.attrs["geometrics_roles"] so that subsequent explore_* / analyze_* calls (and the no-code apps) can default to them when their primary variable argument is omitted. Explicit arguments to those functions still take precedence.
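The storage and precedence rules described above can be sketched with plain pandas. This is an illustration only, not the library's implementation: `set_roles_sketch` and `resolve_outcome` are hypothetical helpers, and the dictionary layout under `"geometrics_roles"` (keys `"outcome"` and `"covariates"`) is an assumption, as is the choice that `None` leaves an already stored value untouched.

```python
import pandas as pd

ROLES_KEY = "geometrics_roles"  # attrs key named in the description above


def set_roles_sketch(df, *, outcome=None, covariates=None):
    """Hypothetical stand-in for set_roles; the stored layout is an assumption."""
    roles = dict(df.attrs.get(ROLES_KEY, {}))
    if outcome is not None:  # assumption: None keeps whatever was stored before
        roles["outcome"] = outcome
    if covariates is not None:
        # Accept a single column name or a sequence of names.
        roles["covariates"] = (
            [covariates] if isinstance(covariates, str) else list(covariates)
        )
    df.attrs[ROLES_KEY] = roles  # updates the frame in place
    return df


def resolve_outcome(df, outcome=None):
    """Hypothetical downstream lookup: explicit argument first, stored role second."""
    if outcome is not None:
        return outcome
    return df.attrs.get(ROLES_KEY, {}).get("outcome")


df = pd.DataFrame({"gini": [0.42, 0.35], "log_gdp_pc": [8.1, 9.0]})
df = set_roles_sketch(df, outcome="gini", covariates="log_gdp_pc")

stored = df.attrs[ROLES_KEY]
default_outcome = resolve_outcome(df)  # falls back to the stored role
explicit_outcome = resolve_outcome(df, outcome="log_gdp_pc")  # explicit wins
```

In this sketch a function called without its variable argument picks up `"gini"`, while a call that names a column explicitly ignores the stored role.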

Parameters

Name Type Description Default
df pd.DataFrame The data frame to annotate. It is modified in place: its attrs are updated and the same object is returned. required
outcome str | None Name of the main outcome (dependent) variable, or None to leave it unset. None
covariates str | Sequence[str] | None Name(s) of the main covariate(s), given as a single column name or a sequence of names, or None to leave them unset. None

Returns

Name Type Description
pandas.DataFrame The same df, with df.attrs["geometrics_roles"] updated.
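
Because the frame is updated in place and returned as the same object, other references to it see the roles, while a copy taken earlier does not. The sketch below imitates that with a direct attrs assignment in place of the real call; the value stored under `"geometrics_roles"` is an assumed layout.

```python
import pandas as pd

df = pd.DataFrame({"gini": [0.42, 0.35]})
snapshot = df.copy()  # independent copy taken before any roles are declared
alias = df            # second name for the same object

# Stand-in for the in-place update that set_roles is described as performing.
df.attrs["geometrics_roles"] = {"outcome": "gini"}  # value layout is an assumption

alias_sees_roles = "geometrics_roles" in alias.attrs
snapshot_sees_roles = "geometrics_roles" in snapshot.attrs
```

Here `alias_sees_roles` is true and `snapshot_sees_roles` is false, so roles should be declared before taking copies that are meant to carry them.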

Examples

Declare the key variables once, then explore/analyze without repeating them:

import pandas as pd

import geometrics as gm

df = pd.DataFrame(
    {
        "region": ["A", "A", "B", "B"],
        "year": [2000, 2001, 2000, 2001],
        "gini": [0.42, 0.41, 0.35, 0.34],
        "log_gdp_pc": [8.1, 8.2, 9.0, 9.1],
    }
)
df = gm.set_panel(df, entity="region", time="year")
df = gm.set_roles(df, outcome="gini", covariates=["log_gdp_pc"])