c02_regional_convergence

This notebook examines absolute \(\beta\)-convergence in nighttime luminosity across 520 Indian districts (1996–2010). We regress per capita luminosity growth on initial luminosity levels and visualize the relationship in an annotated scatterplot. This analysis corresponds to the first set of results discussed in the main manuscript.

Setup

In [1]:

# Setup
library(haven)
library(ggplot2)

Data

We use district-level radiance-calibrated nighttime lights data from the DMSP-OLS satellites, covering 520 districts.

In [2]:

# Load dataset from GitHub
url <- "https://raw.githubusercontent.com/quarcs-lab/project2025s/master/data/india520.dta"
temp <- tempfile(fileext = ".dta")
download.file(url, temp, mode = "wb")
data <- read_dta(temp)

Convergence regression

A negative slope on initial luminosity indicates \(\beta\)-convergence: districts with lower initial luminosity grew faster over the period.

In [3]:

# Basic OLS Regression
model1 <- lm(light_growth96_10rcr_cap ~ log_light96_rcr_cap, data = data)
summary(model1)


Call:
lm(formula = light_growth96_10rcr_cap ~ log_light96_rcr_cap, 
    data = data)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.110924 -0.022972  0.001766  0.020301  0.198850 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)         -0.072274   0.007382   -9.79   <2e-16 ***
log_light96_rcr_cap -0.019881   0.001494  -13.31   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.03722 on 518 degrees of freedom
Multiple R-squared:  0.2547,    Adjusted R-squared:  0.2533 
F-statistic:   177 on 1 and 518 DF,  p-value: < 2.2e-16

In [4]:

# Compute regression model for scatterplot annotation
model <- lm(light_growth96_10rcr_cap ~ log_light96_rcr_cap, data = data)
slope <- round(coef(model)[2], 3)
rsq   <- round(summary(model)$r.squared, 3)

Convergence scatterplot

The scatterplot below visualizes the convergence relationship. Outlier districts are labeled to highlight cases that deviate notably from the overall trend—either bright districts that declined or dim districts that grew unusually fast.

In [5]:

# Identify outlier districts for labeling
outliers <- data[
  (data$log_light96_rcr_cap > -3 & data$light_growth96_10rcr_cap < 0) |
  (data$log_light96_rcr_cap < -7 & data$light_growth96_10rcr_cap > 0.2),
]

# Annotated scatterplot
p1 <- ggplot(data, aes(x = log_light96_rcr_cap, y = light_growth96_10rcr_cap)) +
  geom_point(alpha = 0.5, color = "steelblue") +
  geom_smooth(method = "lm", color = "black", se = TRUE, linewidth = 0.8) +
  geom_label(data = outliers, aes(label = district),
             size = 3, alpha = 0.7, nudge_y = 0.005) +
  annotate("label",
           x = Inf, y = Inf,
           label = paste("Slope =", slope, "\nR\u00b2 =", rsq),
           hjust = 1.1, vjust = 1.5, size = 4) +
  labs(x = "Log of luminosity per capita in 1996",
       y = "Growth of luminosity per capita 1996-2010") +
  theme_minimal()
p1

`geom_smooth()` using formula = 'y ~ x'

Figure 1: Regional luminosity convergence across districts in India
Notes: Each point represents one of the 520 districts. The regression line shows the estimated beta-convergence relationship. Outlier districts are labeled.
Source: Data from Chanda and Kabiraj (2020). See Regional convergence notebook for source code.

N2: Regional convergence

Setup

Data

Convergence regression

Convergence scatterplot