Regional growth, convergence, and spatial spillovers in India:

A reproducible view from outer space

Author

Anonymous

Abstract

Using satellite nighttime light data as a proxy for economic activity, Chanda and Kabiraj (2020, World Development) studied regional growth and convergence across 520 districts in India. Adopting a reproducible open-science approach, this article builds on their work by extending their main findings on three fronts. First, we illustrate regional convergence patterns using an interactive tool for satellite imagery visualization. Second, we assess the degree of spatial dependence in their main econometric specification. Third, we employ a spatial Durbin model to measure the role of spatial spillovers in the convergence process. Our results indicate that spatial spillovers increase the estimated speed of regional convergence. Overall, the results highlight the role of spatial dependence in regional convergence analyses through the lens of satellite imagery, interactive visualizations, and spillover modeling.

Keywords

Regional convergence, Spatial dependence, Spatial Durbin model, Nighttime lights, Reproducible research, India

Introduction

Regional economic growth and convergence are key concerns in developing countries, particularly in large federal states like India, where spatial inequalities can threaten social cohesion and political stability. However, studying regional convergence in developing countries has historically been challenging because consistent economic data are rarely available at subnational administrative levels. In response, satellite nighttime light data have emerged as a proxy for economic activity, enabling a growing literature on regional growth dynamics at granular geographic scales.

Chanda and Kabiraj (2020) leveraged nighttime light data to document regional convergence across 520 districts in India between 1996 and 2010. Their analysis showed that poorer districts grew faster than richer ones during this period, suggesting a reduction in spatial inequalities. However, their econometric approach did not account for potential spatial spillover effects in the convergence process. Specifically, a district’s growth trajectory might be influenced not only by its own initial conditions but also by those of its neighbors.

This article extends the study of Chanda and Kabiraj (2020) in three key methodological directions. First, we develop an interactive visualization tool that allows researchers to explore spatial and temporal patterns of regional convergence using satellite nighttime light data. This tool helps identify converging regions and growth hotspots that may be difficult to detect in static visualizations. Second, we formally test for spatial dependence in both the dependent and independent variables of the convergence equations. Our tests show that spatial autocorrelation is a relevant feature of satellite data and the regional convergence process. Third, we employ a spatial Durbin model to explicitly account for spatial spillovers and quantify how neighbors can influence the speed of regional convergence.

Our results contribute to the understanding of regional convergence in India on three fronts. First, interactive visualization tools reveal clear spatial patterns in both the initial distribution and subsequent growth of nighttime lights. Second, formal tests of spatial dependence indicate that district-level economic trajectories are not independent of their neighbors. Third, accounting for spatial spillovers through a spatial Durbin model shows that the total convergence effect is considerably larger than previous non-spatial estimates would suggest. Specifically, spatial spillovers appear to accelerate the convergence process by creating additional channels through which lagging regions can catch up.

These findings also have implications for methodology and policy. Methodologically, they suggest that conventional non-spatial approaches may underestimate the speed of regional convergence by failing to account for inter-district spillovers. From a policy perspective, they suggest that the benefits of place-based policies may extend beyond targeted districts through spatial multiplier effects, potentially increasing their cost-effectiveness.

In addition to these methodological contributions, this article adopts a reproducible open-science approach by using Jupyter notebooks and the Quarto publishing system. Jupyter notebooks integrate executable code, narrative text, and computational outputs within a single document, supporting multiple programming languages such as Python, R, and Stata. Quarto is an open-source publishing system that generates multiple output formats—including HTML, PDF, and Word—from a single source file, ensuring consistency across all versions of a document. Together, these tools make every analytical step transparent and let any reader re-execute the full analysis from the raw data.

The rest of this article is organized as follows. Section 2 provides an overview of the data and methods, describing the use of nighttime light data as a proxy for economic activity. It also introduces the methodological extensions related to reproducible open science, interactive visualizations, spatial dependence testing, and spillover modeling. Section 3 presents our empirical results, beginning with an interactive exploration of regional convergence patterns, followed by formal tests of spatial dependence. The section concludes with estimates of direct and indirect convergence effects from the spatial Durbin model. Finally, Section 4 offers some concluding remarks.

Data and methods

Data: Nighttime lights as a proxy for economic activity

One of the key challenges in studying economic growth in developing countries like India is the limited availability and reliability of data on aggregate economic activity below the state level. To address this issue, a growing body of literature, pioneered by Henderson et al. (2012), has utilized satellite nighttime light data as a proxy for economic activity at subnational levels. This approach exploits the strong empirical correlation between the intensity of artificial light observed from satellites and the level of economic output on the ground (Chen and Nordhaus 2011).

Nighttime light (hereafter NTL) data have been widely used to investigate economic growth and convergence across national and subnational regions in various countries. For instance, Adhikari and Dhital (2021) examine the impact of decentralization on regional convergence using NTL data. Pinkovskiy and Sala-i-Martin (2016) use NTL data to adjudicate between national accounts and household surveys. They find that national accounts better capture aggregate economic growth. Similarly, Lessmann and Seidel (2017) use NTL data to estimate GDP per capita and spatial inequality globally.

NTL data have been applied to a range of research questions in India. For example, Cook and Shah (2022) use NTL data to analyze the impact of public welfare programs. Jha and Talathi (2024) examine the effects of colonial institutions. Chanda and Cook (2022) investigate the impact of demonetization. Beyer et al. (2021) employ NTL data to study the effects of COVID-19. In the context of convergence studies, Chakravarty and Dehejia (2017) document significant regional disparities in India using NTL data and caution that the goods and services tax may further exacerbate them.

Our study builds on Chanda and Kabiraj (2020). Following their approach, we use the per capita growth in nighttime lights as the dependent variable and the initial nighttime lights per capita as the primary explanatory variable. For 520 districts in India, these variables are derived from NTL data released by the National Geophysical Data Center (NGDC). The data are based on observations from the DMSP-OLS satellites spanning the period from 1996 to 2010. To mitigate the issue of top-coding in NTL data, the NGDC released “radiance-calibrated” nighttime lights for eight specific years within this period. This dataset employs high magnification settings for low-light regions and low magnification settings for brightly lit areas. For this study, we utilize the “radiance-calibrated” nighttime lights data.1

Jupyter and Quarto for reproducible open science

Jupyter notebooks have become a widely used tool for reproducible open science. They allow researchers to integrate executable code, explanatory narrative, and computational outputs within a single self-contained document (Kluyver et al. 2016). This integration ensures that every step of the analytical workflow—from data ingestion and processing to statistical modeling and visualization—is transparently documented and can be independently verified by other researchers. Unlike traditional workflows that separate code, results, and interpretation across files, Jupyter notebooks preserve the complete chain of reasoning. This unified format can be shared, inspected, and re-executed. In this article, we employ Jupyter notebooks with three distinct computational kernels—Python, R, and Stata—to document our data processing, analysis, and visualization steps. This approach allows readers to trace each result back to the code that produced it. Moreover, Python and R notebooks can be executed in the cloud using Google Colaboratory. While Stata is not open-source software, its integration with the Jupyter environment provides a flexible interactive interface. This integration brings the same benefits of literate programming and transparent documentation to proprietary statistical software (Knuth 1984).

Quarto is an open-source scientific and technical publishing system that extends the capabilities of computational notebooks into a versatile manuscript preparation framework (Allaire et al. 2024). Its single-source publishing approach enables researchers to generate multiple output formats from a single source file. These formats include HTML for interactive web-based dissemination, PDF for formal journal submission, Word documents for collaborative editing, and JATS XML for archival and indexing purposes. This approach ensures consistency across all versions of a document, eliminating discrepancies that can arise when maintaining separate files for different output formats. By providing a unified authoring environment that natively supports cross-references, citations, mathematical notation, and embedded computational results, Quarto lowers the technical barriers to producing publication-quality research.

The combination of Jupyter notebooks and Quarto’s manuscript framework creates an integrated infrastructure for reproducible research. In this article, the figures and tables that appear in the manuscript are programmatically embedded from specific Jupyter notebook cells. This ensures that every empirical finding presented in the text is directly traceable to its underlying computation. All data processing and analysis steps are documented in the notebooks, and we provide access to the raw data and code through a public GitHub repository. By adopting this integrated framework, we aim to promote transparency and enable other researchers to build upon our work (Peng 2011).

Google Earth Engine for interactive spatial visualizations

Interactive visualizations of nighttime lights imagery offer methodological advantages over static representations, especially when analyzing temporal and spatial heterogeneity in economic activity (Donaldson and Storeygard 2016). The dynamic nature of these visualizations allows researchers to simultaneously examine multiple dimensions of the data. These dimensions include temporal variations in light intensity, the spatial distribution of economic activity, and the relationship between nighttime lights and other georeferenced variables. The ability to dynamically adjust visualization parameters allows for more nuanced exploration of economic patterns that might be obscured in static representations.

Google Earth Engine (GEE) lowers the computational and technical barriers to creating such interactive visualizations and provides an accessible platform for analyzing satellite data (Gorelick et al. 2017). The platform’s browser-based integrated development environment supports the creation of interactive web applications without requiring extensive infrastructure or specialized software installation. This capability is especially valuable for reproducible research, as it supports the development of web applications that can be shared with other researchers. The platform’s ability to handle large-scale geospatial computations streamlines the workflow from raw data to interactive visualization. Its extensive catalog of pre-processed nighttime lights datasets, including the DMSP-OLS and VIIRS collections, further reduces the storage and processing burden (Tamiminia et al. 2020).

Interactive visualizations of nighttime lights data are especially useful for identifying and analyzing patterns of regional convergence in luminosity. Through dynamic visualization tools, researchers can track the evolution of light intensity across regions over time. This approach effectively identifies areas that exhibit catch-up growth patterns—where initially dim regions progressively converge toward the luminosity levels of their brighter counterparts.

Regional convergence modeling

In neoclassical growth models, the per capita growth rate is predicted to be negatively correlated with a region’s initial endowment, primarily due to diminishing returns to capital accumulation (Solow 1956). Specifically, poorer regions, assuming similar technology and preferences, are expected to grow faster than their wealthier counterparts. Thus, over the long run, regions with similar characteristics should converge to a common steady state. We examine this convergence hypothesis using a growth regression framework in the tradition of Barro and Sala-i-Martin (1992). Equation 1 presents this convergence process in its simplest (unconditional) form.

\[ \boldsymbol{g_t} = \beta_1 \boldsymbol{x_{t-1}} + \boldsymbol{\varepsilon_t} \tag{1}\]

where \(\boldsymbol{g_t}\) represents an \(N\text{-by-}1\) vector of observations on per-capita NTL growth for each of the \(N\) regions over the period \(t\). The vector \(\boldsymbol{x_{t-1}}\) represents an \(N\text{-by-}1\) vector of observations on the initial (log) level of per-capita NTL. The parameter \(\beta_1\) is a regression coefficient that indicates the direction and strength of regional convergence. A negative value of \(\beta_1\) would suggest that regions with lower initial NTL levels grow faster, consistent with the convergence hypothesis. Finally, \(\boldsymbol{\varepsilon_t}\) represents a vector of idiosyncratic error terms.

As there are no control variables, this simple convergence framework implies that districts converge to a common steady state. However, regions may differ in various aspects such as geography, socio-economic conditions, and policy implementation. To account for these differences, we include state fixed effects to control for state-specific institutions and policies that influence the rate of convergence. Additionally, we incorporate a range of geo-climatic controls alongside district-specific conditions related to demographics, human capital, and infrastructure. Equation 2 summarizes this conditional convergence framework.

\[ \boldsymbol{g_t} = \beta_1 \boldsymbol{x_{t-1}} + \boldsymbol{X_t} \boldsymbol{\alpha} + \boldsymbol{\varepsilon_t} \tag{2}\]

Here, the matrix \(\boldsymbol{X_t}\) is an \(N\text{-by-}k\) collection of observations on control variables (including state fixed effects) for each district in our sample. The \(k\text{-by-}1\) vector \(\boldsymbol{\alpha}\) captures the regression coefficients for these variables. The full list of control variables, together with their descriptions, is available in Table A.1 of Chanda and Kabiraj (2020).
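
To make the estimation of Equations 1 and 2 concrete, the following Python sketch fits both by ordinary least squares on synthetic data. It is an illustration only: the data are simulated, and the function name `convergence_ols`, the five hypothetical states, and the "true" coefficient of \(-0.02\) are assumptions made for the example. An intercept is included in the design matrix, which Equations 1 and 2 do not show explicitly, and state fixed effects enter as dummy variables.

```python
import numpy as np


def convergence_ols(g, x, controls=None):
    """OLS estimate of beta_1 in g = c + beta_1 * x + X alpha + e.

    An intercept is always included (an assumption of this sketch);
    `controls` is an optional (N, k) array of control variables.
    """
    g = np.asarray(g, dtype=float)
    x = np.asarray(x, dtype=float)
    columns = [np.ones_like(x), x]
    if controls is not None:
        columns.append(np.asarray(controls, dtype=float))
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, g, rcond=None)
    return coef[1]  # coefficient on the initial level


# Synthetic data (hypothetical, not the article's dataset)
rng = np.random.default_rng(0)
n = 520
state = rng.integers(0, 5, size=n)            # hypothetical state identifiers
x0 = rng.normal(size=n)                       # initial log NTL per capita
state_effect = np.array([0.00, 0.01, -0.01, 0.02, -0.02])[state]
g = -0.02 * x0 + state_effect + rng.normal(scale=0.005, size=n)

# State fixed effects as dummies; one state is dropped because the
# intercept already absorbs it
dummies = (state[:, None] == np.arange(1, 5)).astype(float)

beta_uncond = convergence_ols(g, x0)           # Equation 1
beta_cond = convergence_ols(g, x0, dummies)    # Equation 2 with state dummies
print(f"unconditional: {beta_uncond:.4f}, conditional: {beta_cond:.4f}")
```

In this simulated setting both estimates recover a negative coefficient close to the assumed value, and adding the state dummies reduces the residual noise around it.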

Spatial dependence testing

The analysis of regional convergence using nighttime light data requires explicit consideration of spatial dependencies across districts. Spatial dependence—the tendency for observations at nearby locations to be more similar than those at distant locations—can arise from spatial spillovers, shared geographic conditions, or inter-regional economic linkages (Anselin 1988). If present and unaccounted for, spatial dependence can lead to biased parameter estimates and invalid inference in standard regression models. To formally test for such spatial relationships, we employ the Global Moran’s I statistic and Local Indicators of Spatial Association (LISA), applied to the main variables of Equation 2.

The Global Moran’s I statistic, originally proposed by Moran (1950), is the most widely used measure of spatial autocorrelation. It quantifies the overall degree of spatial clustering among geographic units and can be expressed as:

\[ I=\frac{n}{\sum_i \sum_j w_{ij}} \cdot \frac{\sum_i \sum_j w_{ij} \, z_i \, z_j}{\sum_i z_i^2} \tag{3}\]

where \(z_i\) represents the deviation of observation \(i\) from the mean, \(w_{ij}\) denotes the spatial connection (weight) between units \(i\) and \(j\), and \(n\) is the total number of observations. The Moran’s I statistic typically ranges from \(-1\) to \(+1\). Positive values indicate positive spatial autocorrelation, where similar values tend to be located near each other. Values near zero suggest spatial randomness, and negative values indicate spatial dispersion, where dissimilar values are neighbors. Statistical significance is assessed through a permutation-based inference approach that compares the observed statistic against a reference distribution generated by randomly reassigning values across locations.
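
Equation 3 and the permutation-based inference can be sketched in a few lines of Python. The helper names (`morans_i`, `morans_i_permutation`, `chain_weights`) and the toy geography of units arranged on a line are assumptions introduced for illustration. The pseudo p-value is computed for the upper tail only, which is one common convention.

```python
import numpy as np


def morans_i(values, w):
    """Global Moran's I as in Equation 3."""
    z = np.asarray(values, dtype=float) - np.mean(values)
    w = np.asarray(w, dtype=float)
    n = z.size
    return (n / w.sum()) * (z @ w @ z) / (z @ z)


def morans_i_permutation(values, w, n_perm=999, seed=0):
    """Observed I and a one-sided pseudo p-value from random permutations."""
    rng = np.random.default_rng(seed)
    observed = morans_i(values, w)
    sims = np.array(
        [morans_i(rng.permutation(values), w) for _ in range(n_perm)]
    )
    p_value = (np.sum(sims >= observed) + 1) / (n_perm + 1)
    return observed, p_value


def chain_weights(n):
    """Binary weights for n units on a line (hypothetical toy geography)."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w


# Four units with values 1..4: Equation 3 gives I = 1/3 by hand
i_small = morans_i([1.0, 2.0, 3.0, 4.0], chain_weights(4))

# Thirty units with a smooth gradient: strong positive autocorrelation
values = np.arange(30, dtype=float)
i_obs, p_value = morans_i_permutation(values, chain_weights(30))
print(f"I = {i_obs:.3f}, pseudo p-value = {p_value:.3f}")
```

With 999 permutations, the smallest attainable pseudo p-value is \(1/1000 = 0.001\), so a reported \(p = 0.001\) means that no permuted statistic was as large as the observed one.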

While the Global Moran’s I provides a single summary measure of overall spatial dependence, it does not reveal where significant clusters or outliers are located. To address this limitation, Anselin (1995) proposed Local Indicators of Spatial Association (LISA), which decompose the global statistic into contributions from each individual observation. The local Moran’s I for observation \(i\) is defined as:

\[ I_i = z_i \sum_j w_{ij} \, z_j \tag{4}\]

where the summation is over the neighbors of \(i\) as defined by the spatial weight matrix. This local statistic classifies each observation into one of four categories based on the relationship between a location’s value and those of its neighbors. High-High (HH) indicates a high-value location surrounded by high-value neighbors, while Low-Low (LL) indicates a low-value location surrounded by low-value neighbors. High-Low (HL) is a spatial outlier where a high-value location is surrounded by low-value neighbors, and Low-High (LH) is the opposite spatial outlier. The HH and LL categories identify spatial clusters, while the HL and LH categories identify spatial outliers. Statistical significance of each local statistic is assessed through conditional permutation tests, with only statistically significant locations (typically at \(p < 0.05\)) reported in the LISA cluster maps.
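
The local statistic of Equation 4 and the four-way classification can be illustrated as follows. This is a minimal sketch on a toy geography of six units on a line, with hypothetical values. It omits the conditional permutation step, so it labels every unit rather than only the statistically significant ones, and it treats a deviation of exactly zero as "high", which is an arbitrary choice.

```python
import numpy as np


def local_moran(values, w):
    """Local Moran's I (Equation 4) and quadrant labels for each unit."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    lag = np.asarray(w, dtype=float) @ z      # weighted sum over neighbors
    local_i = z * lag
    labels = np.where(
        z >= 0,
        np.where(lag >= 0, "HH", "HL"),
        np.where(lag >= 0, "LH", "LL"),
    )
    return local_i, labels


# Toy geography: six units on a line, each linked to its adjacent units
n = 6
w = np.zeros((n, n))
idx = np.arange(n - 1)
w[idx, idx + 1] = 1.0
w[idx + 1, idx] = 1.0
w = w / w.sum(axis=1, keepdims=True)          # row-standardize

values = np.array([10.0, 9.0, 8.0, 1.0, 2.0, 1.0])   # hypothetical data
local_i, labels = local_moran(values, w)

# With row-standardized weights, the local statistics add up to the
# global Moran's I once divided by the sum of squared deviations
z = values - values.mean()
global_i = local_i.sum() / (z @ z)
print(labels.tolist(), round(global_i, 3))
```

The third unit is high-valued but borders the low-valued block, so it falls in the HL quadrant. This is the kind of spatial outlier that the global statistic alone cannot locate.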

The specification of the spatial weights matrix \(\mathbf{W}\) is important for capturing the underlying spatial structure. We employ a \(k\)-nearest neighbors weight matrix with \(k = 6\), whereby for each district, the six geographically closest districts are identified as its neighbors. This specification ensures a fixed number of connections per district regardless of the heterogeneity in district sizes across the country:

\[ \mathbf{W}=\left[\begin{array}{ccccc} w_{11} & w_{12} & w_{13} & \cdots & w_{1n} \\ w_{21} & w_{22} & w_{23} & \cdots & w_{2n} \\ \vdots & \vdots & \vdots & w_{ij} & \vdots \\ w_{n1} & w_{n2} & w_{n3} & \cdots & w_{nn} \end{array}\right] \]

where \(w_{ij} = 1\) if district \(j\) is among the six nearest neighbors of district \(i\), and \(w_{ij} = 0\) otherwise. Following standard practice, we row-normalize the weights matrix so that each row sums to one. This ensures that the spatial lag of a variable, \(\mathbf{W} \mathbf{x}\), represents the average value among a district’s neighbors. The choice of \(k = 6\) provides a balance between capturing local spatial interactions and avoiding the inclusion of geographically distant districts that are unlikely to exert meaningful economic influence.
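
A row-normalized \(k\)-nearest neighbors matrix of this kind can be built directly from district centroids. The sketch below assumes planar centroid coordinates and Euclidean distance, neither of which is specified in the text, and uses randomly generated points instead of actual district centroids. The function name `knn_weights` is hypothetical.

```python
import numpy as np


def knn_weights(coords, k=6):
    """Row-normalized k-nearest-neighbor weights from (N, 2) coordinates."""
    coords = np.asarray(coords, dtype=float)
    n = coords.shape[0]
    # Pairwise Euclidean distances between all centroids
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)            # a unit is not its own neighbor
    neighbors = np.argsort(dist, axis=1)[:, :k]
    w = np.zeros((n, n))
    w[np.arange(n)[:, None], neighbors] = 1.0
    return w / w.sum(axis=1, keepdims=True)   # each row sums to one


rng = np.random.default_rng(42)
coords = rng.uniform(size=(50, 2))            # hypothetical centroids
w = knn_weights(coords, k=6)

x = rng.normal(size=50)
spatial_lag = w @ x                           # average of x over each unit's neighbors
```

Note that nearest-neighbor relations are generally asymmetric: district \(j\) can be among the six nearest neighbors of district \(i\) without the reverse being true, so \(\mathbf{W}\) need not be symmetric.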

The detection of significant spatial dependence would justify the use of spatial econometric techniques. In particular, Ertur and Koch (2007) and Fischer (2011) argue that the spatial Durbin model (SDM) can appropriately account for spatial dependence in the convergence process. This methodological choice allows us to distinguish between direct effects of district characteristics and indirect effects operating through spatial channels (LeSage and Pace 2009).

Spatial spillover modeling

Our spatial spillover modeling builds upon the spatial Solow growth model developed by Ertur and Koch (2007) and Fischer (2011). Their model extends the traditional Solow framework to account for technological interdependence across regions. The model considers an economy of \(N\) subnational regions, each characterized by a Cobb-Douglas production function with constant returns to scale:

\[ Y_{i t}=A_{i t} K_{i t}^{\alpha_{K}} H_{i t}^{\alpha_{H}} L_{i t}^{1-\alpha_{K} -\alpha_{H}} \tag{5}\]

where \(Y_{it}\) represents output, \(K_{it}\) physical capital, \(H_{it}\) human capital, \(L_{it}\) labor force, and \(A_{it}\) the level of technological knowledge for region \(i\) at time \(t\). The parameters \(\alpha_K\) and \(\alpha_H\) denote the output elasticities with respect to physical and human capital, respectively. A key innovation of this framework is the modeling of technological knowledge, which incorporates both internal and external factors:

\[ A_{i t}=\Omega_{t} k_{i t}^{\theta} h_{i t}^{\phi} \prod_{j \neq i}^{N} A_{j t}^{\rho W_{i j}} \tag{6}\]

This specification captures three distinct components of technological progress:

  • An exogenous component (\(\Omega_t\)) representing the common stock of knowledge across regions
  • An embodied component (\(k_{i t}^{\theta} h_{i t}^{\phi}\)) reflecting technology embedded in physical and human capital per worker
  • A spatial component (\(\prod_{j \neq i}^{N} A_{j t}^{\rho W_{i j}}\)) capturing technological interdependence between regions
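
Taking logarithms of Equation 6 makes the spatial structure explicit. The lines below sketch that step. The stacked vectors \(\boldsymbol{a_t}\), \(\boldsymbol{k_t}\), and \(\boldsymbol{h_t}\) (logs of technology, physical capital per worker, and human capital per worker) and the vector of ones \(\boldsymbol{\iota}\) are notation introduced here for illustration. The inverse is assumed to exist, for instance when \(|\rho| < 1\) and \(\boldsymbol{W}\) is row-normalized with a zero diagonal.

\[ \ln A_{it} = \ln \Omega_t + \theta \ln k_{it} + \phi \ln h_{it} + \rho \sum_{j \neq i}^{N} W_{ij} \ln A_{jt} \]

\[ \boldsymbol{a_t} = \ln \Omega_t \, \boldsymbol{\iota} + \theta \boldsymbol{k_t} + \phi \boldsymbol{h_t} + \rho \boldsymbol{W} \boldsymbol{a_t} \quad \Longrightarrow \quad \boldsymbol{a_t} = (\boldsymbol{I} - \rho \boldsymbol{W})^{-1} \left( \ln \Omega_t \, \boldsymbol{\iota} + \theta \boldsymbol{k_t} + \phi \boldsymbol{h_t} \right) \]

Under the same assumption, \((\boldsymbol{I} - \rho \boldsymbol{W})^{-1} = \sum_{r=0}^{\infty} \rho^r \boldsymbol{W}^r\), so a region's technology depends on the capital of its neighbors, of its neighbors' neighbors, and so on, with an influence that decays in \(\rho^r\).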

Based on this theoretical framework, Ertur and Koch (2007) derived a spatial Durbin model that accounts for regional spillovers in the convergence process. Their model can be compactly written in matrix notation as:

\[ \boldsymbol{g_t} = \beta_1 \boldsymbol{x_{t-1}} + \boldsymbol{X_t} \boldsymbol{\alpha} + \beta_2 \boldsymbol{W} \boldsymbol{x_{t-1}} + \boldsymbol{W} \boldsymbol{X_t} \boldsymbol{\gamma} + \lambda \boldsymbol{W} \boldsymbol{g_t} + \boldsymbol{\varepsilon_t} \tag{7}\]

In this model, \(\boldsymbol{g_t}\) represents an \(N\text{-by-}1\) vector of observations on per-capita NTL growth for each of the \(N\) regions over the period \(t\). The vector \(\boldsymbol{x_{t-1}}\) represents an \(N\text{-by-}1\) vector of observations on the initial (log) level of per-capita NTL. The parameter \(\beta_1\) is a regression coefficient that indicates the direction and strength of regional convergence. The matrix \(\boldsymbol{X_t}\) is an \(N\text{-by-}k\) collection of observations on control variables for each region, and the \(k\text{-by-}1\) vector \(\boldsymbol{\alpha}\) captures the regression coefficients for these variables. The matrix \(\boldsymbol{W}\boldsymbol{X_t}\) is an \(N\text{-by-}k\) matrix of spatially lagged controls: each row is a weighted average of the control variables observed in that region’s neighbors. The \(k\text{-by-}1\) vector \(\boldsymbol{\gamma}\) collects the regression coefficients associated with these spatial lags. The terms \(\boldsymbol{W} \boldsymbol{x_{t-1}}\) and \(\boldsymbol{W} \boldsymbol{g_t}\) are \(N\text{-by-}1\) vectors containing the spatial lags of initial (log) per-capita NTL and of per-capita NTL growth, respectively. Their coefficients, \(\beta_2\) and \(\lambda\), measure how a region’s growth responds to the initial conditions and to the growth of its neighbors. Finally, \(\boldsymbol{\varepsilon_t}\) represents a vector of idiosyncratic error terms.
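
Because the spatial lag of growth appears on the right-hand side of Equation 7, the coefficient \(\beta_1\) alone does not measure the full effect of initial conditions. Solving Equation 7 for \(\boldsymbol{g_t}\) gives the matrix of partial derivatives \((\boldsymbol{I} - \lambda \boldsymbol{W})^{-1}(\beta_1 \boldsymbol{I} + \beta_2 \boldsymbol{W})\). In the summary measures of LeSage and Pace (2009), the direct effect is the average diagonal element of this matrix, the total effect is the average row sum, and the indirect (spillover) effect is their difference. The Python sketch below illustrates this calculation. The coefficient values and the ring-shaped toy geography are hypothetical and are not the article's estimates.

```python
import numpy as np


def sdm_effects(w, beta1, beta2, lam):
    """Average direct, indirect, and total effects of the initial level."""
    n = w.shape[0]
    identity = np.eye(n)
    # Partial derivatives of growth with respect to initial levels:
    # (I - lam * W)^{-1} (beta1 * I + beta2 * W)
    s = np.linalg.solve(identity - lam * w, beta1 * identity + beta2 * w)
    direct = np.trace(s) / n                  # average own-region effect
    total = s.sum() / n                       # average row sum
    return direct, total - direct, total


# Toy geography: ten regions on a ring, two neighbors each, row-normalized
n = 10
w = np.zeros((n, n))
idx = np.arange(n)
w[idx, (idx + 1) % n] = 0.5
w[idx, (idx - 1) % n] = 0.5

# Hypothetical coefficients, chosen only for illustration
direct, indirect, total = sdm_effects(w, beta1=-0.02, beta2=-0.01, lam=0.5)
print(f"direct={direct:.4f}, indirect={indirect:.4f}, total={total:.4f}")
```

With a row-normalized \(\boldsymbol{W}\), the total effect simplifies to \((\beta_1 + \beta_2)/(1 - \lambda)\), which equals \(-0.06\) for the hypothetical values used here. A positive \(\lambda\) therefore amplifies the convergence effect relative to \(\beta_1\) alone.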

Results

Regional convergence: An interactive exploration from outer space

Before presenting the regression results, we visually illustrate the concepts of growth and convergence in nighttime lights. First, we create an interactive map of India displaying regional luminosity in 1996 and 2010.2 Second, we examine absolute convergence by constructing a convergence scatterplot. This scatterplot depicts the relationship between per-capita growth rates in nighttime lights (1996–2010) and initial per-capita nighttime light levels (1996) for each district. Finally, we present some case studies to illustrate the nighttime light growth that occurred during our study period. Focusing on some of the poorest regions in the country, we show an increase in nighttime lights that visually aligns with the convergence hypothesis.

Figure 1: Regional luminosity in India: 1996 vs 2010
Notes: Luminosity is measured in radiance-calibrated digital number (DN) values from DMSP-OLS satellites. Interactive web application available at https://bit.ly/india-rc-ntl.
Source: Authors’ visualization using pre-processed luminosity images from the Earth Observation Group (NOAA/NCEI). See View from outer space notebook for source code.

Figure 1 presents static maps (captured from our interactive application) of luminosity for the initial and final years (1996 and 2010). The maps show a noticeable increase in brightness across most parts of India by 2010. Since nighttime lights serve as a proxy for economic activity, this increase in luminosity reflects economic growth over the period. Figure 2 illustrates the relationship between per-capita growth in nighttime lights and the initial level of per-capita nighttime lights. The scatterplot shows an inverse relationship: the estimated \(\beta\)-convergence coefficient of about \(-0.02\) implies an annual speed of convergence of roughly 2.3% and a half-life of about 30 years, consistent with the seminal finding of Barro and Sala-i-Martin (1992).
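
The arithmetic behind the speed of convergence and the half-life can be reproduced in a few lines of Python. The sketch assumes that the coefficient comes from a regression of average annual growth over \(T = 14\) years (1996 to 2010) on the initial log level, so that \(\beta = -(1 - e^{-bT})/T\) as in Barro and Sala-i-Martin (1992). The function names are hypothetical.

```python
import math


def convergence_speed(beta, years):
    """Annual speed b implied by beta = -(1 - exp(-b * years)) / years."""
    return -math.log(1.0 + beta * years) / years


def half_life(speed):
    """Years needed to close half of the gap to the steady state."""
    return math.log(2.0) / speed


speed = convergence_speed(beta=-0.02, years=14)
hl = half_life(speed)
print(f"speed = {100 * speed:.2f}% per year, half-life = {hl:.1f} years")
```

Under these assumptions, a coefficient of \(-0.02\) corresponds to a speed of about 2.3% per year and a half-life of just under 30 years.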

Figure 2: Regional luminosity convergence across districts in India
Notes: Each point represents one of the 520 districts. The regression line shows the estimated beta-convergence relationship. Outlier districts are labeled.
Source: Data from Chanda and Kabiraj (2020). See Regional convergence notebook for source code.

Next, we examine case studies of three economically disadvantaged states in India to illustrate their growth patterns over the study period. Among these, Bihar is the poorest, with a per-capita income at 39.2% of the national average, while Uttar Pradesh and Chhattisgarh have per-capita incomes of 43.8% and 52.3% of the national average, respectively.3 Figure 3 displays the change in luminosity in these three states during our study period. Although these states remain among the poorest, there is a noticeable increase in luminosity over the course of the study.

Figure 3: Some illustrative examples of regional convergence
Notes: Luminosity is measured in radiance-calibrated digital number (DN) values from DMSP-OLS satellites. Interactive web application available at https://bit.ly/india-rc-ntl.
Source: Authors’ visualization using pre-processed luminosity images from the Earth Observation Group (NOAA/NCEI). See View from outer space notebook for source code.

Spatial dependence is a feature of the convergence process

Before proceeding with formal econometric analysis, we examine the spatial distribution of the variables under study. Choropleth maps provide a natural tool for this purpose, as they allow researchers to visualize how a variable of interest varies across geographic units. They also help identify spatial patterns that may not be apparent from summary statistics alone. Figure 4 presents the spatial distribution of initial luminosity in 1996 and the subsequent growth rate of luminosity over the 1996–2010 period across the 520 districts in our sample.

Figure 4: Spatial distribution of initial luminosity and luminosity growth
Notes: Districts are classified into five categories using Fisher-Jenks natural breaks. Panel (a) shows log of luminosity per capita in 1996. Panel (b) shows luminosity growth per capita over 1996–2010.
Source: Data from Chanda and Kabiraj (2020). See Spatial dependence notebook for source code.

The choropleth maps in Figure 4 reveal a distinct spatial pattern. Panel (a) shows that initial luminosity levels are concentrated in specific geographic corridors, with higher values clustered along western coastal areas while large portions of central and eastern India exhibit markedly lower luminosity. Panel (b), which displays the growth rate of luminosity per capita over the study period, presents a pattern that is largely the inverse of the initial distribution. Districts that were initially bright tend to exhibit lower growth rates, whereas districts that were initially dim tend to grow at a faster rate. This spatial inversion provides a first visual indication that regional convergence in India may have an important spatial dimension.

While the choropleth maps visually suggest the presence of spatial structures in both variables, a formal analysis of spatial dependence requires the definition of a neighborhood for each district. This neighborhood structure is specified through a spatial weight matrix, which encodes the connectivity between geographic units. In this study, we adopt a six nearest neighbors weight matrix, whereby for each of the 520 districts, the six geographically closest districts are identified as its neighbors. This connectivity structure is visualized as a network in Figure 5, where each node represents a district centroid and each edge connects a district to one of its six nearest neighbors.

Figure 5: Spatial connectivity structure based on six nearest neighbors
Notes: Each node represents a district centroid. Each edge connects a district to one of its six geographically closest neighbors. The weight matrix is row-standardized.
Source: Data from Chanda and Kabiraj (2020). See Spatial dependence notebook for source code.

With the neighborhood of each district defined, we can formally assess the degree of spatial dependence in the variables. The notion of spatial dependence can be understood through two complementary concepts. The first is the overall degree of spatial clustering in the data, and the second is the specific locations of statistically significant clusters and spatial outliers. The Global Moran’s I statistic captures the first of these concepts. It typically ranges from \(-1\) (indicating perfect spatial dispersion) to \(+1\) (indicating perfect spatial clustering), with values near zero suggesting spatial randomness. In the context of our data, the Moran’s I for initial luminosity per capita is 0.73 (\(p = 0.001\)), indicating a strong degree of positive spatial autocorrelation. Districts with high (low) initial luminosity tend to be surrounded by districts that also exhibit high (low) initial luminosity. Similarly, the Moran’s I for luminosity growth is 0.60 (\(p = 0.001\)), confirming that the growth rates of neighboring districts are also significantly correlated. Together, these statistics provide further evidence that both the initial level and subsequent growth of luminosity exhibit substantial spatial clustering.

While the Global Moran’s I confirms the overall presence of spatial dependence, it does not reveal the location of statistically significant clusters and outliers. To address this limitation, we employ Local Indicators of Spatial Association (LISA). These indicators decompose the global statistic into district-level contributions and classify each district into one of four categories. High-High (HH) indicates a district with a high value surrounded by neighbors with similarly high values, while Low-Low (LL) indicates a low-value district surrounded by low-value neighbors. High-Low (HL) is a spatial outlier where a high-value district is surrounded by low-value neighbors, and Low-High (LH) is the opposite spatial outlier. Figure 6 presents the Moran scatterplot and LISA cluster map for initial luminosity per capita. The cluster map identifies distinct geographic concentrations: HH clusters mark the most luminous regions and their bright neighbors, while LL clusters highlight contiguous areas of low luminosity, predominantly in central and eastern India.

Figure 6: Spatial dependence in the initial level of luminosity
Notes: Panel (a) shows the Moran scatterplot with Global Moran’s I statistic. Panel (b) shows the LISA cluster map with statistically significant clusters at p < 0.05 based on 999 permutations.
Source: Data from Chanda and Kabiraj (2020). See Spatial dependence notebook for source code.

The same LISA analysis applied to luminosity growth rates reveals a geographic pattern that is largely the inverse of the initial luminosity clusters, as shown in Figure 7. Regions that were classified as HH clusters in initial luminosity—the brightest districts and their neighbors—tend to appear as LL clusters in luminosity growth. This indicates that these initially prosperous areas experienced relatively slower growth over the study period. Conversely, districts that formed LL clusters in initial luminosity—the dimmest regions—tend to emerge as HH clusters in growth, reflecting faster catch-up growth in initially lagging areas. This spatial inversion between the initial level and subsequent growth of luminosity is the spatial signature of the convergence process: red regions in the initial luminosity map become blue regions in the growth map, and vice versa. These local spatial patterns provide visual evidence that spatial dependence is a prominent feature of the regional convergence process observed in India. This finding motivates the use of spatial econometric methods to formally account for these strong spatial interdependencies.

Figure 7: Spatial dependence in the growth rate of luminosity
Notes: Panel (a) shows the Moran scatterplot with Global Moran’s I statistic. Panel (b) shows the LISA cluster map with statistically significant clusters at p < 0.05 based on 999 permutations.
Source: Data from Chanda and Kabiraj (2020). See Spatial dependence notebook for source code.

Evidence of spatial spillovers in regional convergence

The regression results of Table 1 provide evidence of both unconditional and conditional convergence across districts in India. Both conventional ordinary least squares (OLS) and spatial econometric approaches indicate significant negative relationships between initial luminosity levels and subsequent growth rates. The direct effects, representing within-district convergence, remain stable across specifications, ranging from -0.020 to -0.026. This consistency across different model specifications and estimation methods suggests that poorer districts are catching up to their wealthier counterparts, even after controlling for various district characteristics and state-level fixed effects.

Table 1: Unconditional and conditional convergence across districts.

| | Model 1 OLS | Model 1 SDM | Model 2 OLS | Model 2 SDM | Model 3 OLS | Model 3 SDM | Model 4 OLS | Model 4 SDM |
|---|---|---|---|---|---|---|---|---|
| Direct | -0.020*** | -0.026*** | -0.022*** | -0.021*** | -0.025*** | -0.026*** | -0.025*** | -0.025*** |
| | (0.002) | (0.002) | (0.003) | (0.002) | (0.003) | (0.002) | (0.003) | (0.002) |
| Indirect | | 0.004 | | -0.001 | | -0.015* | | -0.013* |
| | | (0.006) | | (0.005) | | (0.009) | | (0.007) |
| Total | -0.020*** | -0.022*** | -0.022*** | -0.022*** | -0.025*** | -0.041*** | -0.025*** | -0.037*** |
| | (0.002) | (0.006) | (0.003) | (0.005) | (0.003) | (0.009) | (0.003) | (0.007) |
| Controls | No | No | No | No | Yes | Yes | Yes | Yes |
| State FE | No | No | Yes | Yes | No | No | Yes | Yes |
| AIC | -1945 | -2292 | -2409 | -2468 | -2211 | -2358 | -2465 | -2501 |

The progression from unconditional to conditional specifications shows how additional covariates shape the estimated convergence process. In the OLS specifications, the direct effect strengthens from between -0.020 and -0.022 in the unconditional models (Models 1 and 2) to -0.025 once district characteristics are added (Models 3 and 4), which indicates that omitting these characteristics attenuates the estimated speed of convergence. In other words, once we account for structural differences across districts, such as population density, urbanization, or sectoral composition, the underlying tendency for poorer districts to catch up becomes more pronounced. The spatial Durbin direct effects match this pattern in the conditional models (-0.026 in Model 3 and -0.025 in Model 4). In the unconditional models, however, the impact decomposition is unreliable: the spatial autoregressive parameter is very large (around 0.8 in Model 1), which inflates the Model 1 direct effect to -0.026. The inclusion of state fixed effects, which capture unobserved state-level heterogeneity such as differences in governance and institutional quality, further sharpens the estimates by absorbing variation that might otherwise confound the convergence relationship.
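To see why a large spatial autoregressive parameter moves the direct effect away from the underlying coefficient, it helps to write out the LeSage and Pace (2009) impact decomposition for a spatial Durbin model. The sketch below is illustrative only: the ring-shaped weight matrix and the parameter values (`beta`, `theta`, and the three values of `rho`) are hypothetical and are not the estimates behind Table 1.

```python
import numpy as np

def ring_weights(n):
    """Toy row-standardized weights: each unit's neighbors are the two adjacent units on a circle."""
    w = np.zeros((n, n))
    idx = np.arange(n)
    w[idx, (idx - 1) % n] = 0.5
    w[idx, (idx + 1) % n] = 0.5
    return w

def sdm_impacts(rho, beta, theta, w):
    """Average direct, indirect, and total impacts of one regressor in an SDM."""
    n = w.shape[0]
    # S = (I - rho W)^{-1} (beta I + theta W) holds the partial derivatives dy_i / dx_j
    s = np.linalg.solve(np.eye(n) - rho * w, beta * np.eye(n) + theta * w)
    direct = np.trace(s) / n         # average own-unit effect (includes feedback loops)
    total = s.sum() / n              # average row sum
    return direct, total - direct, total

w = ring_weights(50)
beta, theta = -0.020, -0.005         # hypothetical coefficients on x and Wx
impacts = {rho: sdm_impacts(rho, beta, theta, w) for rho in (0.0, 0.2, 0.8)}
for rho, (direct, indirect, total) in impacts.items():
    print(f"rho={rho:.1f}  direct={direct:.4f}  indirect={indirect:.4f}  total={total:.4f}")
```

With rho equal to zero the direct impact equals beta exactly. As rho grows, feedback through neighbors is added to the diagonal of S, so the direct impact generally departs from beta. For row-standardized weights the total impact equals (beta + theta) / (1 - rho).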

The spatial Durbin model also reveals spatial spillover effects that are not captured by traditional OLS estimations. These indirect effects, which capture the influence of neighboring districts’ initial conditions on a district’s growth rate, follow a notable pattern across specifications. In the unconditional models (Models 1 and 2), the estimated indirect effects are statistically insignificant and erratic—negative in Model 2 but positive in Model 1—reflecting how, without controls, the strong residual spatial dependence confounds the spillover estimates with omitted variables. However, once district-level controls are introduced (Models 3 and 4), the indirect effects become both larger in magnitude and statistically significant at the 10% level, reaching -0.015 in Model 3 and -0.013 in our most comprehensive Model 4. This emergence of significant spillover effects in the conditional specifications indicates that the spatial channels of convergence become discernible only after accounting for district-specific characteristics.

The total impact of initial conditions on growth, combining both direct and spillover effects, is substantially larger when we account for spatial dependence. In our fully specified model (Model 4), the total convergence effect in the spatial Durbin model (-0.037) is approximately 48% larger in magnitude than the OLS estimate (-0.025). Translated into an annual speed of convergence, this raises the implied speed from about 3.0% under OLS to about 5.2% under the SDM, shortening the implied half-life from roughly 23 to 13 years. The gap is even larger in Model 3 (SDM total -0.041 vs. OLS -0.025, or 64%), but because the Model 3 and Model 4 spillover estimates are statistically indistinguishable, we anchor our interpretation on the preferred Model 4. These differences indicate that conventional non-spatial approaches may underestimate the speed of regional convergence by failing to capture the additional convergence channels created through spatial spillovers.
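The mapping from a convergence coefficient to an annual speed and a half-life can be sketched as follows. The sketch assumes the standard Barro and Sala-i-Martin (1992) relationship, b = -(1 - exp(-λT)) / T, with growth measured as an annual average over T = 14 years (1996 to 2010). Both the formula and the value of T are assumptions here and may differ in detail from the authors' computation. Because the inputs are the rounded coefficients from Table 1, the outputs match the figures in the text only approximately.

```python
import math

def convergence_speed(b, years):
    """Annual speed of convergence implied by coefficient b over a span of `years`."""
    return -math.log(1.0 + b * years) / years

def half_life(speed):
    """Years needed to close half of the gap to the steady state."""
    return math.log(2.0) / speed

YEARS = 14                         # assumed span: 1996 to 2010
b_ols, b_sdm = -0.025, -0.037      # rounded Model 4 totals from Table 1

speed_ols = convergence_speed(b_ols, YEARS)
speed_sdm = convergence_speed(b_sdm, YEARS)
relative_gap = b_sdm / b_ols - 1.0

print(f"OLS: speed = {speed_ols:.1%}, half-life = {half_life(speed_ols):.1f} years")
print(f"SDM: speed = {speed_sdm:.1%}, half-life = {half_life(speed_sdm):.1f} years")
print(f"SDM total effect is {relative_gap:.0%} larger in magnitude than OLS")
```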

The model fit statistics further support the spatial econometric approach. The Akaike Information Criterion (AIC) consistently favors the spatial Durbin model over OLS across all four specifications. The most comprehensive model (Model 4 SDM) achieves the lowest AIC value of -2501 compared to -2465 for the corresponding OLS. Notably, the improvement from incorporating spatial structure is most pronounced in the unconditional specification (Model 1). The AIC drops by 347 units when moving from OLS to SDM, reflecting the large amount of spatial dependence left unmodeled by ordinary regressions. Even in the fully specified model, where state fixed effects already absorb much of the spatial heterogeneity, the SDM retains a meaningful advantage. Overall, these results suggest that regional convergence in India operates not only through district-specific factors but also through spatial interactions between neighboring districts. Ignoring these interactions leads to both underestimated convergence speeds and inferior model performance.

Robustness to alternative spatial weight matrices

The spillover results above use a six-nearest-neighbor (6NN) spatial weight matrix. To assess their sensitivity to this choice, we re-estimate the preferred Model 4 under six alternative specifications—four- and eight-nearest neighbors, queen and rook contiguity, and inverse distance and inverse-distance-squared (each applied within a distance band)—and compare them with the 6NN baseline.
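As an illustration of how such alternative matrices differ, the sketch below builds k-nearest-neighbor and distance-band inverse-distance weights from point coordinates. The coordinates are synthetic and the bandwidth rule (the largest nearest-neighbor distance, so that no unit is left without neighbors) is an assumption for the example, not necessarily the authors' choice. Queen and rook contiguity are omitted because they require polygon geometries.

```python
import numpy as np

def pairwise_distances(coords):
    """Euclidean distance matrix with an infinite diagonal (no self-neighbors)."""
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    return dist

def row_standardize(w):
    """Scale each row to sum to one; rows without neighbors stay at zero."""
    sums = w.sum(axis=1, keepdims=True)
    return np.divide(w, sums, out=np.zeros_like(w), where=sums > 0)

def knn_weights(dist, k):
    """Binary k-nearest-neighbor weights, row-standardized."""
    n = dist.shape[0]
    nearest = np.argsort(dist, axis=1)[:, :k]
    w = np.zeros((n, n))
    w[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return row_standardize(w)

def inverse_distance_weights(dist, bandwidth, power=1.0):
    """Weights proportional to 1 / distance**power within the bandwidth, zero outside."""
    w = np.where(dist <= bandwidth, dist ** -power, 0.0)
    return row_standardize(w)

rng = np.random.default_rng(1)
coords = rng.uniform(size=(100, 2))      # hypothetical district centroids
dist = pairwise_distances(coords)
bandwidth = dist.min(axis=1).max()       # assumed rule: every unit keeps at least one neighbor

weights = {
    "4NN": knn_weights(dist, 4),
    "6NN": knn_weights(dist, 6),
    "8NN": knn_weights(dist, 8),
    "ID": inverse_distance_weights(dist, bandwidth, power=1.0),
    "ID2": inverse_distance_weights(dist, bandwidth, power=2.0),
}
for name, w in weights.items():
    print(name, "average number of neighbors:", (w > 0).sum(axis=1).mean())
```

Each matrix can then be passed to the same estimation routine, so that differences in the estimated impacts reflect only the neighbor definition.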

Figure 8: Robustness of the Model 4 spatial impacts of initial luminosity to the choice of spatial weight matrix. Points are Direct, Indirect, and Total impacts; bars are 95% Monte-Carlo confidence intervals. The dashed lines mark the 6NN baseline estimates.

Table 2: Model 4 spatial impacts of initial luminosity under alternative spatial weight matrices (full LeSage–Pace method; Monte-Carlo standard errors in parentheses).

| Weight matrix | Direct | Indirect | Total | AIC |
|---|---|---|---|---|
| 4 nearest neighbors | -0.024*** (0.002) | -0.008 (0.006) | -0.032*** (0.006) | -2468 |
| 6 nearest neighbors (base) | -0.025*** (0.002) | -0.013* (0.007) | -0.037*** (0.007) | -2501 |
| 8 nearest neighbors | -0.025*** (0.002) | -0.011 (0.008) | -0.036*** (0.008) | -2485 |
| Queen contiguity | -0.025*** (0.002) | -0.010** (0.005) | -0.035*** (0.005) | -2463 |
| Rook contiguity | -0.025*** (0.002) | -0.009** (0.005) | -0.034*** (0.005) | -2469 |
| Inverse distance | -0.025*** (0.002) | -0.016** (0.007) | -0.041*** (0.008) | -2486 |
| Inverse distance squared | -0.025*** (0.002) | -0.012* (0.006) | -0.037*** (0.006) | -2485 |

The convergence spillovers are robust to the choice of spatial weights (Figure 8 and Table 2). The direct (within-district) convergence effect is essentially unchanged across all seven specifications, ranging from -0.024 to -0.025 and always significant at the 1% level. The total convergence effect likewise remains negative and significant at the 1% level throughout, ranging from -0.032 to -0.041, with the 6NN baseline (-0.037) near the middle of this range. The indirect (spillover) component is negative in every specification, ranging from -0.008 to -0.016, and is statistically significant at the 10% level or better in five of the seven (all except the four- and eight-nearest-neighbor matrices). This stability indicates that the evidence for spatial spillovers in the convergence process is not an artifact of the particular neighbor definition, but a consistent feature of the data.

Discussion

Beyond the economy: Luminosity and cultural factors

Nighttime lights capture not just GDP but a composite of electrification, urbanization, infrastructure, and broader socioeconomic conditions (Henderson et al. 2012; Mellander et al. 2015). This multi-dimensional nature invites exploration of how luminosity relates to socioeconomic indicators beyond strictly economic output. In this section, we examine the association between luminosity and cultural participation patterns across Indian states.

A growing literature argues that regional cultural attitudes constitute an independent factor in economic development, not merely a by-product of income or institutions. In particular, Tubadji (2025) proposes a Culture-Based Development framework arguing that regional cultural participation patterns—measured through revealed preferences in household expenditure and survey data—predict regional development patterns. For India, the National Sample Survey (NSS) 47th Round (July–December 1991) provides state-level data on six dimensions of cultural participation: live cultural performance, cultural telecast (TV/media), socio-cultural participation, cultural heritage and religion, live cultural shows, and sports.

With nighttime lights data for an adjacent period (1992) now available for 32 Indian states and union territories, we can examine the extent to which luminosity is associated with regional cultural patterns (see Spatial culture notebook for details and extended analyses). Given the small sample size (N = 32) and the potential leverage of small territories at the extremes of the distribution on linear correlation measures, we report Spearman rank correlations throughout this section. Figure 9 presents the relationship between log nighttime lights per capita and the two cultural dimensions that show statistically significant associations.

Figure 9: Relationship between nightlight luminosity and cultural participation across 32 Indian states
Notes: Each point represents one of the 32 Indian states and union territories. Panel (a) shows the bivariate relationship between log nighttime lights per capita and cultural telecast (TV/media); Panel (b) shows the relationship with socio-cultural participation. Solid line is the OLS regression; gray band shows the 95% confidence interval (t-distribution, 30 df). Annotations report regression slope, Spearman rank correlation, and Pearson correlation.
Source: Nighttime lights from CCNL DMSP-OLS (Zhao et al., 2022). Cultural participation from NSS 47th Round (July–December 1991). See Spatial culture notebook for source code.

Cultural telecast (TV/media) exhibits a positive association (Spearman \(\rho\) = 0.370, \(p\) = 0.037) with nighttime light luminosity. That is, states with higher luminosity consume more culture through television and media. This link is intuitive: access to television depends on electrification and urbanization, which are closely tied to luminosity.

In contrast, socio-cultural participation shows a significant negative association (Spearman \(\rho\) = \(-0.404\), \(p\) = 0.022). That is, community-based cultural engagement is stronger in states with less luminosity. This suggests that less urbanized regions rely more on collective, in-person forms of cultural participation rather than on media infrastructure.
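A Spearman rank correlation is a Pearson correlation computed on ranks, which is why a single extreme territory cannot dominate it. The sketch below implements it with the standard library on made-up numbers. It then applies the usual t approximation with N - 2 degrees of freedom to the two reported coefficients. The use of this approximation is an assumption; the text does not state how the p-values were obtained.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

def t_statistic(rho, n):
    """t approximation for testing rho = 0 with n - 2 degrees of freedom."""
    return rho * math.sqrt((n - 2) / (1.0 - rho ** 2))

# Made-up example: the extreme value 100 affects only its rank, not the correlation
toy_rho = spearman_rho([1, 2, 3, 4, 100], [2, 1, 4, 3, 5])

# Reported coefficients with N = 32; the two-sided 5% critical value for 30 df is 2.042
T_CRIT_30DF = 2.042
t_telecast = t_statistic(0.370, 32)
t_participation = t_statistic(-0.404, 32)
print(f"toy rho = {toy_rho:.2f}, t = {t_telecast:.2f} and {t_participation:.2f}")
```

Both t statistics exceed the 5% critical value in absolute terms, which is consistent with the reported p-values of 0.037 and 0.022.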

Figure 10 presents the spatial distribution of these two cultural variables through a LISA analysis. Both dimensions exhibit significant spatial autocorrelation, indicating that cultural participation is not randomly distributed across Indian states. The spatial clustering of cultural telecast broadly mirrors the geographic distribution of economic activity: states in the western and southern regions form high-high clusters. Northeastern and eastern states, in contrast, form distinct clusters of community-based participation.

Figure 10: LISA cluster maps of cultural participation across 32 Indian states
Notes: Panel (a) shows results for cultural telecast (TV/media); Panel (b) for socio-cultural participation. Left subpanels show Moran scatterplots with the Global Moran’s I statistic. Right subpanels show LISA cluster maps with statistically significant clusters at p < 0.05 based on 999 permutations and a 6-nearest-neighbors spatial weights matrix. Region labels are overlaid from the CartoDB Positron basemap.
Source: Cultural participation data from NSS 47th Round (July–December 1991). See Spatial culture notebook for source code.

Together, these two results reveal a contrast in how regions engage with culture: more luminous states favor media-based consumption, while less luminous states sustain community-based participation. The remaining four cultural dimensions—live cultural performance, cultural heritage and religion, live cultural shows, and sports—show no statistically significant association with luminosity. However, these findings are based on a small cross-section of 32 states observed at a single point in time. Further investigation with larger datasets covering more regions and additional time periods is needed to establish whether these patterns are robust.

Better luminosity data from VIIRS

Our analysis, following Chanda and Kabiraj (2020), relies on radiance-calibrated DMSP-OLS nighttime lights data covering the period 1996 to 2010. This dataset has been widely used as a proxy for economic activity (Henderson et al. 2012; Chen and Nordhaus 2011). However, DMSP-OLS data are subject to well-documented limitations, including top-coding in bright urban cores and lack of on-board calibration leading to inter-satellite inconsistencies. Blooming artifacts that spatially blur light sources beyond their true boundaries represent an additional concern (Abrahams et al. 2018). These measurement issues can attenuate the precision of convergence estimates, particularly in rapidly urbanizing districts.

The Visible Infrared Imaging Radiometer Suite (VIIRS), operational since 2012, represents a marked improvement over DMSP-OLS along multiple dimensions. VIIRS offers finer spatial resolution (approximately 750 meters versus 2.7 kilometers) and on-board radiometric calibration that provides consistent quantitative measurements. It also features a wider dynamic range that avoids saturation in urban areas while detecting dim lights in rural settlements (Elvidge et al. 2017). Systematic assessments recommend VIIRS as the preferred product for cross-sectional and recent time-series studies (Gibson et al. 2021). Future extensions of our convergence analysis using VIIRS data could yield more precise estimates of both direct and indirect effects, especially in districts where blooming effects may have distorted the true spillover effects.

A practical challenge for extending long-run convergence studies is the discontinuity between DMSP (1992–2013) and VIIRS (2012–present) sensor eras. Recent harmonization efforts, notably the global harmonized nighttime light dataset by Li et al. (2020), have created consistent long-run time series by calibrating VIIRS observations to DMSP-equivalent units during the overlap period. Such harmonized datasets could enable the extension of our spatial Durbin analysis to more recent periods. This would allow researchers to examine whether the convergence patterns and spatial spillovers documented here have persisted, accelerated, or changed in character as India’s economy has continued to transform.

New research directions

While our analysis documents the average convergence effect and its spatial spillover component, the cross-sectional regression framework does not capture potential heterogeneity in convergence patterns across the income distribution. Distribution dynamics approaches, as pioneered by Quah (1996), could reveal whether Indian districts are converging to a single steady state or forming distinct convergence clubs where districts converge within groups but diverge across them. The regression tree methods developed by Durlauf and Johnson (1995) for identifying multiple growth regimes could be combined with spatial econometric techniques. Such an approach could examine whether geographic clusters of districts follow distinct convergence trajectories, potentially revealing spatial poverty traps or growth poles that are not visible in average convergence estimates (Rey and Montouri 1999). Furthermore, analysis of harmonized luminosity data across Chinese provinces has uncovered complex inequality dynamics that differ markedly across cross-sectional and temporal dimensions (Glawe and Mendez 2024). Extending these insights to the Indian district context could help determine whether the convergence patterns we document reflect a single equilibrium or mask the formation of distinct spatial clubs.

Another important direction concerns the causal identification of the spillover channels that our spatial Durbin model captures in reduced form. While our estimates document significant indirect effects, the model does not identify whether these spillovers operate through infrastructure linkages, labor migration, technology diffusion, or market access channels. Quasi-experimental approaches, such as those employed by Asher and Novosad (2020) to study the causal effects of rural road construction on structural transformation in India, could be embedded within spatial econometric frameworks to isolate specific spillover mechanisms. Understanding which channels drive the indirect convergence effects is important for designing spatially targeted policies that may amplify positive spillovers.

Comparative evidence from other developing economies suggests that spatial dependence in regional convergence is a widespread phenomenon rather than an India-specific feature. Studies have documented significant spatial spillover effects in convergence processes across Thailand (Tipayalai and Mendez 2024), Turkey (Ursavas and Mendez 2023), and Indonesian districts (Miranti and Mendez 2023), with neighbor effects and spatial conditioning factors playing important roles in shaping convergence trajectories. This cross-country regularity strengthens the case for systematic investigation of the specific mechanisms driving spatial spillovers, as the channels may differ across institutional and geographic contexts.

The growing availability of diverse satellite products and machine learning methods opens possibilities for richer measurement of regional economic activity. Jean et al. (2016) showed that combining high-resolution daytime imagery with nighttime lights through deep learning can considerably improve poverty prediction in data-scarce settings. Similarly, Keola et al. (2015) showed that integrating nighttime lights with land cover data improves economic measurement in agricultural areas where lights alone provide weak signals. These alternative data sources allow the construction of multi-dimensional proxies for regional economic activity that go beyond what nighttime lights alone can capture. Recent applications demonstrate the practical potential of these advances for subnational economic measurement. Chen et al. (2024) show that higher-quality VIIRS nighttime lights can predict sectoral GDP composition across Turkish provinces, distinguishing between urban service-oriented and rural agricultural regions in ways that DMSP data cannot. Hussein et al. (2025) employ machine learning methods with multiple remote sensing indicators to predict subnational GDP in Vietnam, achieving accuracy improvements over traditional luminosity-based approaches. At a broader scale, Khoun et al. (2025) combine big data sources, socioeconomic surveys, and machine learning to map multidimensional poverty in Cambodia, illustrating how satellite-derived features can complement conventional survey instruments. Applying similar multi-source approaches to Indian districts could yield richer proxies for economic activity that overcome the well-known limitations of nighttime lights in agricultural and low-density areas.

Research reproducibility and open science

The complexity of satellite-based economic research—involving multi-step data processing pipelines, spatial econometric estimation, and geographic visualization—makes reproducibility both challenging and essential. As Donaldson and Storeygard (2016) note, the processing of satellite data requires careful documentation and sharing of code to ensure that results can be replicated and extended. Each methodological choice in the pipeline, from sensor inter-calibration to spatial weight matrix construction, can affect empirical conclusions. This underscores the need for transparent computational workflows that allow other researchers to verify and build upon published findings.

This article adopts a reproducible research approach through Jupyter notebooks and the Quarto publishing framework. All computational analyses are documented in embedded notebooks that readers can inspect alongside the results they produce. The interactive visualization tool, built on Google Earth Engine, allows researchers to explore the spatial and temporal patterns in satellite nighttime light data.

Open-source tools and cloud computing platforms are rapidly lowering the barriers to reproducible spatial economic research. Cloud-based computational notebooks, such as those developed by Mendez and Patnaik (2024) for processing nighttime lights data, eliminate the need for specialized local software and provide accessible workflows for data ingestion, preprocessing, spatial analysis, and visualization. The combination of open data repositories, version-controlled code, and cloud computing infrastructure creates an ecosystem where the full analytical pipeline—from satellite imagery to econometric results—can be made transparent, verifiable, and extensible by the broader research community.

Concluding remarks

This article re-examines the regional convergence hypothesis across Indian districts using satellite nighttime light data, interactive visualizations, and spatial econometric modeling. Building on the work of Chanda and Kabiraj (2020), we developed an interactive web-based visualization tool that illustrates spatial and convergence patterns across Indian districts. Spatial autocorrelation analyses indicate that spatial dependence is a notable characteristic of satellite data and the regional convergence process in India. Estimates from our spatial Durbin model indicate that incorporating spatial spillovers increases the estimated speed of regional convergence. The total convergence effect in our fully specified model is approximately 48% larger than conventional non-spatial estimates. This finding suggests that non-spatial convergence models may underestimate the speed of regional convergence. Additionally, it suggests that place-based development interventions may have broader impacts, as their benefits can extend to neighboring districts through spatial spillover effects.

Our results also demonstrate the usefulness of satellite nighttime lights for studying economic dynamics in countries where subnational data are scarce, infrequent, or unreliable. Conventional economic statistics at the district level are often unavailable or inconsistent across administrative boundaries, especially in large developing economies. Radiance-calibrated nighttime light data allowed us to analyze 520 Indian districts at a spatial granularity that would be difficult to achieve with traditional national accounts. This data-driven approach is potentially transferable to other data-poor contexts across the developing world. The ongoing transition from DMSP-OLS to the higher-resolution VIIRS sensor should yield more precise measurement of subnational economic activity in future studies.

This study also shows how reproducible open-science practices can strengthen the credibility and reach of scientific research. The entire analytical pipeline is documented in Jupyter notebooks written in Python. Research results are automatically embedded within the manuscript through Quarto’s publishing framework. Moreover, a single manuscript source generates multiple output formats, including HTML, PDF, and Microsoft Word. The HTML version is particularly useful because it allows readers to engage with interactive visualizations, inspect the code of the computational notebooks, and reevaluate the results in light of their source code. By hosting the complete codebase and data in a public GitHub repository and making the Python notebooks executable in the cloud through Google Colaboratory, we aim to encourage broader adoption of reproducible open-science practices.

Acknowledgments

This research project was supported by JSPS KAKENHI Grant Number 24K04884. During the preparation of this manuscript, the authors used Claude Code (Anthropic) to assist with manuscript editing, computational notebook development, and research infrastructure setup. After using this AI tool, the authors reviewed the outputs, confirmed their accuracy, and take full responsibility for the content of this publication.

Conflict of Interest

The authors declare no conflict of interest.

Data and Code Availability

All data and computational code used in this study are available in the project repository: [Repository URL removed for blind review]. The interactive HTML version of this manuscript (https://quarcs-lab.github.io/project2025s-py/) embeds the computational notebooks, allowing readers to inspect the complete analytical pipeline from raw data to published results. All notebooks can be executed in the cloud using Google Colaboratory without requiring local software installation. The computational notebooks are implemented entirely in Python; an earlier implementation of the same analysis used R and Stata, and the present Python results are largely consistent with it—the only material difference being the impact decomposition of the unconditional spatial Durbin model (Model 1), whose total effect is unchanged.

References

Abrahams, Alexei, Christopher Oram, and Nancy Lozano-Gracia. 2018. “Deblurring DMSP Nighttime Lights: A New Method Using Gaussian Filters and Frequencies of Illumination.” Remote Sensing of Environment 210: 242–58. https://doi.org/10.1016/j.rse.2018.03.018.
Adhikari, Bibek, and Saroj Dhital. 2021. “Decentralization and Regional Convergence: Evidence from Night-Time Lights Data.” Economic Inquiry 59 (3): 1066–88. https://doi.org/10.1111/ecin.12967.
Allaire, J. J., Charles Teague, Carlos Scheidegger, Yihui Xie, and Christophe Dervieux. 2024. Quarto. https://quarto.org.
Anselin, Luc. 1988. Spatial Econometrics: Methods and Models. Springer. https://doi.org/10.1007/978-94-015-7799-1.
Anselin, Luc. 1995. “Local Indicators of Spatial Association—LISA.” Geographical Analysis 27 (2): 93–115. https://doi.org/10.1111/j.1538-4632.1995.tb00338.x.
Asher, Sam, and Paul Novosad. 2020. “Rural Roads and Local Economic Development.” American Economic Review 110 (3): 797–823. https://doi.org/10.1257/aer.20180268.
Barro, Robert J., and Xavier Sala-i-Martin. 1992. “Convergence.” Journal of Political Economy 100 (2): 223–51. https://doi.org/10.1086/261816.
Beyer, Robert C. M., Tarun Jain, and Biswa N. Sinha. 2021. “Lights Out? COVID-19 Containment Policies and Economic Activity.” Journal of Asian Economics 74: 101318. https://doi.org/10.1016/j.asieco.2021.101318.
Chakravarty, Praveen, and Vivek Dehejia. 2017. “Will GST Exacerbate Regional Divergence?” Economic and Political Weekly 52 (25–26): 97–102. https://www.epw.in/journal/2017/25-26/notes/will-gst-exacerbate-regional-divergence.html.
Chanda, Areendam, and C. Justin Cook. 2022. “Was India’s Demonetization Redistributive? Insights from Satellites and Surveys.” Journal of Macroeconomics 73: 103438. https://doi.org/10.1016/j.jmacro.2022.103438.
Chanda, Areendam, and Shankha Kabiraj. 2020. “Shedding Light on Regional Growth and Convergence in India.” World Development 133: 104961. https://doi.org/10.1016/j.worlddev.2020.104961.
Chen, Xi, and William D. Nordhaus. 2011. “Using Luminosity Data as a Proxy for Economic Statistics.” Proceedings of the National Academy of Sciences 108 (21): 8589–94. https://doi.org/10.1073/pnas.1017031108.
Chen, Yilin, Ugur Ursavas, and Carlos Mendez. 2024. “Can Higher-Quality Nighttime Lights Predict Sectoral GDP Across Subnational Regions? Urban and Rural Luminosity Across Provinces in Türkiye.” Letters in Spatial and Resource Sciences 17 (1): 1–21. https://doi.org/10.1007/s12076-024-00375-x.
Cook, Justin, and Manisha Shah. 2022. “Aggregate Effects from Public Works: Evidence from India.” Review of Economics and Statistics 104 (2): 201–17. https://doi.org/10.1162/rest_a_00951.
Donaldson, Dave, and Adam Storeygard. 2016. “The View from Above: Applications of Satellite Data in Economics.” Journal of Economic Perspectives 30 (4): 171–98. https://doi.org/10.1257/jep.30.4.171.
Durlauf, Steven N., and Paul A. Johnson. 1995. “Multiple Regimes and Cross-Country Growth Behaviour.” Journal of Applied Econometrics 10 (4): 365–84. https://doi.org/10.1002/jae.3950100404.
Elvidge, Christopher D., Kimberly E. Baugh, Mikhail Zhizhin, Feng-Chi Hsu, and Tilottama Ghosh. 2017. “VIIRS Night-Time Lights.” International Journal of Remote Sensing 38 (21): 5860–79. https://doi.org/10.1080/01431161.2017.1342050.
Ertur, Cem, and Wilfried Koch. 2007. “Growth, Technological Interdependence and Spatial Externalities: Theory and Evidence.” Journal of Applied Econometrics 22 (6): 1033–62. https://doi.org/10.1002/jae.963.
Fischer, Manfred M. 2011. “A Spatial Mankiw-Romer-Weil Model: Theory and Evidence.” Annals of Regional Science 47 (2): 419–36. https://doi.org/10.1007/s00168-010-0384-6.
Gibson, John, Susan Olivia, Geua Boe-Gibson, and Chao Li. 2021. “Which Night Lights Data Should We Use in Economics, and Where?” Journal of Development Economics 149: 102602. https://doi.org/10.1016/j.jdeveco.2020.102602.
Glawe, Linda, and Carlos Mendez. 2024. “Harmonized Luminosity and Economic Activity Across Provinces in China: Cross-Sectional Differences, Regional Time Series, and Inequality Dynamics.” Applied Economics 57: 10677–93. https://doi.org/10.1080/00036846.2024.2439583.
Gorelick, Noel, Matt Hancher, Mike Dixon, Simon Ilyushchenko, David Thau, and Rebecca Moore. 2017. “Google Earth Engine: Planetary-Scale Geospatial Analysis for Everyone.” Remote Sensing of Environment 202: 18–27. https://doi.org/10.1016/j.rse.2017.06.031.
Henderson, J. Vernon, Adam Storeygard, and David N. Weil. 2012. “Measuring Economic Growth from Outer Space.” American Economic Review 102 (2): 994–1028. https://doi.org/10.1257/aer.102.2.994.
Hussein, Suleiman, Minh-Thu Thi Nguyen, and Carlos Mendez. 2025. “Predicting Subnational GDP in Vietnam with Remote Sensing Data: A Machine Learning Approach.” Letters in Spatial and Resource Sciences 18 (1): 1–12. https://doi.org/10.1007/s12076-025-00397-z.
Jean, Neal, Marshall Burke, Michael Xie, W. Matthew Davis, David B. Lobell, and Stefano Ermon. 2016. “Combining Satellite Imagery and Machine Learning to Predict Poverty.” Science 353 (6301): 790–94. https://doi.org/10.1126/science.aaf7894.
Jha, Priyaranjan, and Karan Talathi. 2024. “Impact of Colonial Institutions on Economic Growth and Development in India: Evidence from Night-Lights Data.” Economic Development and Cultural Change 72 (4): 1653–708. https://doi.org/10.1086/725058.
Keola, Souknilanh, Magnus Andersson, and Ola Hall. 2015. “Monitoring Economic Development from Space: Using Nighttime Light and Land Cover Data to Measure Economic Growth.” World Development 66: 322–34. https://doi.org/10.1016/j.worlddev.2014.08.017.
Khoun, Theara, Ate Poortinga, Nyein Soe Thwal, Josue Gonzalez de Alba, Andrea McMahon, and Carlos Mendez. 2025. “Mapping the Dimensions of Poverty Through Big Data, Socioeconomic Surveys and Machine Learning in Cambodia.” Social Indicators Research 180: 1593–618. https://doi.org/10.1007/s11205-025-03718-3.
Kluyver, Thomas, Benjamin Ragan-Kelley, Fernando Pérez, et al. 2016. “Jupyter Notebooks—a Publishing Format for Reproducible Computational Workflows.” Positioning and Power in Academic Publishing: Players, Agents and Agendas, 87–90.
Knuth, Donald E. 1984. “Literate Programming.” The Computer Journal 27 (2): 97–111. https://doi.org/10.1093/comjnl/27.2.97.
LeSage, James P., and R. Kelley Pace. 2009. Introduction to Spatial Econometrics. Chapman and Hall/CRC. https://doi.org/10.1201/9781420064254.
Lessmann, Christian, and André Seidel. 2017. “Regional Inequality, Convergence, and Its Determinants: A View from Outer Space.” European Economic Review 92: 110–32. https://doi.org/10.1016/j.euroecorev.2016.11.009.
Li, Xuecao, Yuyu Zhou, Min Zhao, and Xia Zhao. 2020. “A Harmonized Global Nighttime Light Dataset 1992–2018.” Scientific Data 7 (1): 168. https://doi.org/10.1038/s41597-020-0510-y.
Mellander, Charlotta, José Lobo, Kevin Stolarick, and Zara Matheson. 2015. “Night-Time Light Data: A Good Proxy Measure for Economic Activity?” PLOS ONE 10 (10): e0139779. https://doi.org/10.1371/journal.pone.0139779.
Mendez, Carlos, and Ayush Patnaik. 2024. A Python Notebook for Processing Nighttime Lights Data: Methods and Applications. GitHub Repository. https://github.com/quarcs-lab/project2022p.
Miranti, Cani, and Carlos Mendez. 2023. “Social and Economic Convergence Across Districts in Indonesia: A Spatial Econometric Approach.” Bulletin of Indonesian Economic Studies 59 (3): 421–45. https://doi.org/10.1080/00074918.2022.2071415.
Moran, P. A. P. 1950. “Notes on Continuous Stochastic Phenomena.” Biometrika 37 (1–2): 17–23. https://doi.org/10.2307/2332142.
Peng, Roger D. 2011. “Reproducible Research in Computational Science.” Science 334 (6060): 1226–27. https://doi.org/10.1126/science.1213847.
Pinkovskiy, Maxim, and Xavier Sala-i-Martin. 2016. “Lights, Camera, Income! Illuminating the National Accounts-Household Surveys Debate.” Quarterly Journal of Economics 131 (2): 579–631. https://doi.org/10.1093/qje/qjw003.
Quah, Danny T. 1996. “Twin Peaks: Growth and Convergence in Models of Distribution Dynamics.” Economic Journal 106 (437): 1045–55. https://doi.org/10.2307/2235377.
Rey, Sergio J., and Brett D. Montouri. 1999. “US Regional Income Convergence: A Spatial Econometric Perspective.” Regional Studies 33 (2): 143–56. https://doi.org/10.1080/00343409950122945.
Solow, Robert M. 1956. “A Contribution to the Theory of Economic Growth.” Quarterly Journal of Economics 70 (1): 65–94. https://doi.org/10.2307/1884513.
Tamiminia, Haifa, Bahram Salehi, Masoud Mahdianpari, Lindi J. Quackenbush, Sarina Adeli, and Brian Brisco. 2020. “Google Earth Engine for Geo-Big Data Applications: A Meta-Analysis and Systematic Review.” ISPRS Journal of Photogrammetry and Remote Sensing 164: 152–70. https://doi.org/10.1016/j.isprsjprs.2020.04.001.
Tipayalai, Katikar, and Carlos Mendez. 2024. “Regional Convergence and Spatial Dependence in Thailand: Global and Local Assessments.” Journal of the Asia Pacific Economy 29 (2): 693–720. https://doi.org/10.1080/13547860.2022.2041286.
Tubadji, Annie. 2025. “Cultural Entropy, Innovation, and Growth.” Politics & Policy 53 (4): e70050. https://doi.org/10.1111/polp.70050.
Ursavas, Ugur, and Carlos Mendez. 2023. “Regional Income Convergence and Conditioning Factors in Turkey: Revisiting the Role of Spatial Dependence and Neighbor Effects.” Annals of Regional Science 71 (2): 363–89. https://doi.org/10.1007/s00168-022-01168-0.

Footnotes

  1. Following Chanda and Kabiraj (2020), our sample consists of 520 districts (out of a possible 593). India had 593 districts in the 2001 census and 640 in the 2011 census. To match districts across the two census files, we merged newly split districts in the 2011 census file back into their parent districts. Eight of the 47 new districts were created by splitting areas from multiple parent districts; those new districts, together with the parent districts they were carved from, were dropped from our sample. We also dropped all districts in the state of Assam, where more than 50% of districts were created in that manner.↩︎

  2. The interactive web application is available at https://bit.ly/india-rc-ntl. It was developed using Google Earth Engine, and its source code is available in the "View from outer space" notebook.↩︎

  3. The data are taken from a report by the Economic Advisory Council to the Prime Minister (EAC-PM), released on September 18, 2024.↩︎