c02_regional_convergence

This notebook examines absolute \(\beta\)-convergence in nighttime luminosity across 520 Indian districts (1996–2010). We regress per capita luminosity growth on initial luminosity levels and visualize the relationship in an annotated scatterplot. This analysis corresponds to the first set of results discussed in the main manuscript.

Setup

In [1]:

# Google Colab: install packages not included in the default environment
try:
    import google.colab
    !pip install statsmodels seaborn -q
except ImportError:
    pass  # Local environment — packages already installed

In [2]:

# Setup
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf

import warnings
warnings.filterwarnings("ignore")

Data

We use district-level radiance-calibrated nighttime lights data from the DMSP-OLS satellites, covering 520 districts.

In [3]:

# Load dataset (local copy first, else download from GitHub)
import os
import urllib.request

fname = "india520.dta"
url = "https://raw.githubusercontent.com/quarcs-lab/project2025s-py/master/data/" + fname
local = os.path.join("..", "data", fname)
path = local if os.path.exists(local) else fname
if path == fname and not os.path.exists(fname):
    urllib.request.urlretrieve(url, fname)
data = pd.read_stata(path)
print("Districts: {}".format(len(data)))

Districts: 520
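Before running the regression, it can help to confirm that the columns used later in this notebook are present and complete. The sketch below is illustrative only: `check_columns` is a hypothetical helper, and the toy table holds made-up values rather than rows from `india520.dta`. Only the three column names come from the cells in this notebook.

```python
import pandas as pd

# Column names taken from the regression and plotting cells of this notebook
REQUIRED = ["district", "log_light96_rcr_cap", "light_growth96_10rcr_cap"]


def check_columns(df, required=REQUIRED):
    """Hypothetical helper: list absent columns and count NaNs in present ones."""
    missing = [c for c in required if c not in df.columns]
    present = [c for c in required if c in df.columns]
    na_counts = df[present].isna().sum().to_dict()
    return missing, na_counts


# Toy stand-in for the real dataset (values are invented for illustration)
toy = pd.DataFrame({
    "district": ["A", "B", "C"],
    "log_light96_rcr_cap": [-5.0, -4.2, None],
})

missing, na_counts = check_columns(toy)
print("Missing columns:", missing)
print("Missing values per column:", na_counts)
```

On the real data, the same call would be `check_columns(data)`. Rows with missing values are typically dropped by the formula interface before fitting, so a mismatch between `len(data)` and the reported number of observations would be a sign that such rows exist.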

Convergence regression

A negative slope on initial luminosity indicates \(\beta\)-convergence: districts with lower initial luminosity grew faster over the period.
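For reference, the specification estimated in the next cell can be written as follows. The symbols \(g_i\) and \(y_{i,t}\) are notation introduced here for district \(i\)'s growth rate and per capita luminosity, and the log-difference definition of \(g_i\) is an assumption: the notebook states only that the dependent variable is an average annual growth rate over \(T = 14\) years.

```latex
\[
g_i \;=\; \alpha + \beta \,\ln y_{i,1996} + \varepsilon_i ,
\qquad
g_i \;=\; \frac{1}{T}\,\ln\!\left(\frac{y_{i,2010}}{y_{i,1996}}\right),
\quad T = 14 .
\]
```

Under this reading, \(g_i\) corresponds to `light_growth96_10rcr_cap`, \(\ln y_{i,1996}\) to `log_light96_rcr_cap`, and absolute \(\beta\)-convergence corresponds to \(\beta < 0\).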

In [4]:

# Basic OLS Regression
model1 = smf.ols("light_growth96_10rcr_cap ~ log_light96_rcr_cap", data=data).fit()
print(model1.summary())

                               OLS Regression Results                               
====================================================================================
Dep. Variable:     light_growth96_10rcr_cap   R-squared:                       0.255
Model:                                  OLS   Adj. R-squared:                  0.253
Method:                       Least Squares   F-statistic:                     177.0
Date:                      Sat, 20 Jun 2026   Prob (F-statistic):           5.89e-35
Time:                              18:28:27   Log-Likelihood:                 974.41
No. Observations:                       520   AIC:                            -1945.
Df Residuals:                           518   BIC:                            -1936.
Df Model:                                 1                                         
Covariance Type:                  nonrobust                                         
=======================================================================================
                          coef    std err          t      P>|t|      [0.025      0.975]
---------------------------------------------------------------------------------------
Intercept              -0.0723      0.007     -9.790      0.000      -0.087      -0.058
log_light96_rcr_cap    -0.0199      0.001    -13.306      0.000      -0.023      -0.017
==============================================================================
Omnibus:                       47.400   Durbin-Watson:                   1.995
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              140.318
Skew:                           0.408   Prob(JB):                     3.39e-31
Kurtosis:                       5.411   Cond. No.                         23.2
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

In [5]:

# Reuse the model fitted in the previous cell for the scatterplot annotation
model = model1
slope = round(model.params["log_light96_rcr_cap"], 3)
rsq   = round(model.rsquared, 3)

The \(\beta\) coefficient implies an annual speed of convergence \(\lambda = -\ln(1 + \beta T)/T\) (Barro & Sala-i-Martin), where \(T = 14\) years (1996–2010) and the dependent variable is the average annual growth rate. The corresponding half-life, \(\ln(2)/\lambda\), is the time needed to close half the gap to the steady state.
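The steps behind these two formulas can be sketched as below. The starting line is the standard log-linearized growth equation from the Barro & Sala-i-Martin framework, taken here as an assumption about the model behind the regression. The symbol \(t_{1/2}\) for the half-life is notation introduced for this sketch.

```latex
\begin{align*}
\beta &= -\frac{1 - e^{-\lambda T}}{T}
  && \text{(coefficient on initial log level)} \\
1 + \beta T &= e^{-\lambda T} \\
\lambda &= -\frac{\ln(1 + \beta T)}{T} \\[4pt]
e^{-\lambda t_{1/2}} &= \tfrac{1}{2}
  \;\Longrightarrow\;
  t_{1/2} = \frac{\ln 2}{\lambda}
\end{align*}
```

The logarithm is defined only when \(1 + \beta T > 0\), so the formula applies to estimates with \(\beta > -1/T\).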

In [6]:

# Implied annual speed of convergence and half-life (Barro & Sala-i-Martin)
T = 14  # 1996-2010; dependent variable is the average annual growth rate
beta = model.params["log_light96_rcr_cap"]
speed = -np.log(1 + beta * T) / T   # annual convergence speed (lambda)
half_life = np.log(2) / speed       # years to close half the gap
print("Convergence coefficient (beta): {:.4f}".format(beta))
print("Annual speed of convergence:    {:.2%}".format(speed))
print("Implied half-life:              {:.1f} years".format(half_life))

Convergence coefficient (beta): -0.0199
Annual speed of convergence:    2.33%
Implied half-life:              29.7 years
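To give a sense of the uncertainty around these figures, the sketch below applies the same formulas to the point estimate and to the 95% confidence bounds printed in the regression summary (-0.023 and -0.017). This is a rough illustration, not part of the original analysis: the bounds are the three-decimal values shown in the table, and the function names are hypothetical.

```python
import math


def convergence_speed(beta, T):
    """Annual convergence speed: lambda = -ln(1 + beta*T) / T."""
    if 1 + beta * T <= 0:
        raise ValueError("1 + beta*T must be positive for the log to be defined")
    return -math.log(1 + beta * T) / T


def half_life(speed):
    """Years needed to close half the gap to the steady state."""
    return math.log(2) / speed


T = 14  # 1996-2010
# Values copied from the printed OLS summary (CI bounds rounded to 3 decimals)
estimates = {"point": -0.0199, "ci_lower": -0.023, "ci_upper": -0.017}

results = {}
for label, beta in estimates.items():
    lam = convergence_speed(beta, T)
    results[label] = (lam, half_life(lam))
    print("{:<9} beta={:.4f}  speed={:.2%}  half-life={:.1f} years".format(
        label, beta, lam, results[label][1]))
```

Because \(\lambda\) decreases monotonically as \(\beta\) rises, the more negative bound yields the faster speed and the shorter half-life.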

Convergence scatterplot

The scatterplot below visualizes the convergence relationship. Labeled districts are those that deviate notably from the overall trend: bright districts that declined (log initial luminosity above -3 with negative growth) and dim districts that grew unusually fast (log initial luminosity below -7 with growth above 0.2).
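The labeling rule used in the next cell is a pair of threshold conditions combined with a logical OR. A minimal sketch on a toy table shows which rows it selects. The district names and numbers here are invented purely to exercise the rule and are not taken from the dataset.

```python
import pandas as pd

# Toy values, made up for illustration only
toy = pd.DataFrame({
    "district": ["bright_declining", "dim_fast", "typical", "bright_growing"],
    "log_light96_rcr_cap": [-2.5, -7.5, -5.0, -2.0],
    "light_growth96_10rcr_cap": [-0.01, 0.25, 0.03, 0.02],
})

# Condition 1: initially bright districts whose luminosity fell
bright_declined = (toy["log_light96_rcr_cap"] > -3) & (toy["light_growth96_10rcr_cap"] < 0)
# Condition 2: initially dim districts with unusually fast growth
dim_fast = (toy["log_light96_rcr_cap"] < -7) & (toy["light_growth96_10rcr_cap"] > 0.2)

labeled = toy.loc[bright_declined | dim_fast, "district"].tolist()
print(labeled)
```

Each comparison is wrapped in parentheses because `&` and `|` bind more tightly than `>` and `<` in Python, so omitting them would change the meaning of the expression.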

In [7]:

# Identify outlier districts for labeling
mask = (
    ((data["log_light96_rcr_cap"] > -3) & (data["light_growth96_10rcr_cap"] < 0))
    | ((data["log_light96_rcr_cap"] < -7) & (data["light_growth96_10rcr_cap"] > 0.2))
)
outliers = data[mask]

# Annotated scatterplot
fig, ax = plt.subplots(figsize=(8, 6))
sns.regplot(
    data=data,
    x="log_light96_rcr_cap",
    y="light_growth96_10rcr_cap",
    ci=95,
    scatter_kws={"alpha": 0.5, "color": "steelblue"},
    line_kws={"color": "black", "linewidth": 0.8},
    ax=ax,
)

# Label outlier districts
for _, r in outliers.iterrows():
    ax.annotate(
        r["district"],
        xy=(r["log_light96_rcr_cap"], r["light_growth96_10rcr_cap"]),
        xytext=(0, 6),
        textcoords="offset points",
        ha="center",
        fontsize=8,
        bbox=dict(boxstyle="round,pad=0.3", fc="white", ec="gray", alpha=0.7),
    )

# Slope / R-squared annotation box (top-right corner)
annotation = "Slope = {}\nR² = {}".format(slope, rsq)
ax.annotate(
    annotation,
    xy=(0.97, 0.97),
    xycoords="axes fraction",
    ha="right",
    va="top",
    fontsize=11,
    bbox=dict(boxstyle="round,pad=0.4", fc="white", ec="black", alpha=0.9),
)

ax.set_xlabel("Log of luminosity per capita in 1996")
ax.set_ylabel("Average annual growth of luminosity per capita, 1996-2010")
sns.despine(ax=ax)
plt.tight_layout()
plt.show()

Figure 1: Regional luminosity convergence across districts in India
Notes: Each point represents one of the 520 districts. The regression line shows the estimated beta-convergence relationship. Outlier districts are labeled.
Source: Data from Chanda and Kabiraj (2020). See Regional convergence notebook for source code.
