Historical Concessions & Geographic RDD

Causal & Institutional Systems | 2019 | Geospatial Causal Prototype

Overview

This project studied whether historical concession boundaries were associated with persistent differences in local outcomes, using a geographic regression-discontinuity design and spatially matched control areas.

The work combined concession boundary data, village-level data, geographic controls, conflict outcomes, public-goods indicators and night-light measures to compare areas inside historical concessions with nearby areas outside the same boundaries.

The project should be read as an applied causal-inference and political-economy prototype, not as a simple descriptive comparison.

Problem

Historical concessions may shape local development through infrastructure, settlement patterns, institutional exposure and conflict dynamics. The empirical challenge is that concession areas were not randomly assigned and may differ geographically from nearby untreated areas.

The goal was to construct a credible comparison between treated villages inside concession boundaries and nearby control villages outside those boundaries, while accounting for geographic confounders.

Identification Strategy

The core design was geographic: compare units near concession boundaries, where inside and outside villages are more likely to share similar geography than units farther away.

villages inside concession boundary
vs.
nearby villages outside concession boundary

The analysis used distance buffers, geographic matching and boundary-based comparisons to reduce imbalance between treated and control locations.
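
As a rough illustration of the buffer logic, the Python sketch below keeps only villages within a chosen distance of the boundary and computes a raw inside-minus-outside gap. It is not the project's code, which was written in MATLAB and Stata and is not public. The `Village` fields, the bandwidths and the toy values are all hypothetical, and the distance to the boundary is assumed to be precomputed from the GIS inputs.

```python
from dataclasses import dataclass


@dataclass
class Village:
    village_id: str
    inside: bool      # True if the village lies inside the concession boundary
    dist_km: float    # unsigned distance to the boundary (assumed precomputed)
    outcome: float    # any local outcome; toy values below


def signed_distance(v):
    """Running variable: positive inside the concession, negative outside."""
    return v.dist_km if v.inside else -v.dist_km


def buffer_sample(villages, bandwidth_km):
    """Keep villages within bandwidth_km of the boundary, on either side."""
    return [v for v in villages if v.dist_km <= bandwidth_km]


def raw_gap(villages, bandwidth_km):
    """Inside-minus-outside difference in mean outcomes within the buffer."""
    sample = buffer_sample(villages, bandwidth_km)
    inside = [v.outcome for v in sample if v.inside]
    outside = [v.outcome for v in sample if not v.inside]
    if not inside or not outside:
        raise ValueError("buffer needs villages on both sides of the boundary")
    return sum(inside) / len(inside) - sum(outside) / len(outside)


# Hypothetical toy data, for illustration only.
villages = [
    Village("a", True, 2.0, 5.0),
    Village("b", True, 12.0, 9.0),
    Village("c", False, 3.0, 4.0),
    Village("d", False, 30.0, 1.0),
]

for bandwidth in (5.0, 15.0, 50.0):
    print(bandwidth, raw_gap(villages, bandwidth))
```

Repeating the comparison at several bandwidths shows how sensitive the gap is to the buffer width. A raw gap of this kind is only a starting point, since it ignores geographic controls and spatial dependence.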

Data Construction

The workflow joined multiple spatial and tabular datasets, including concession indicators, village identifiers, coordinates and local-outcome variables. Boundary and GIS-derived inputs were treated as source data; the main implementation work focused on coordinate-based matching, dataset construction and econometric estimation.

Geographic Balance and Matching

The project included balance checks between treated and nearby control units. Matched datasets were constructed at multiple distance thresholds, using latitude-longitude information and distance-based rules to connect treated locations with comparable nearby controls.
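
One way to picture distance-based matching is the Python sketch below, which pairs each treated location with its nearest control within a distance threshold using great-circle (haversine) distance. This is an assumed, simplified rule (nearest neighbour, with replacement), not necessarily the rule used in the project. The identifiers, coordinates and thresholds are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def match_nearest_control(treated, controls, max_km):
    """Map each treated id to (control id, distance) for the nearest control
    within max_km. Controls can be reused; unmatched treated units are dropped."""
    matches = {}
    for t_id, (t_lat, t_lon) in treated.items():
        best = None
        for c_id, (c_lat, c_lon) in controls.items():
            d = haversine_km(t_lat, t_lon, c_lat, c_lon)
            if d <= max_km and (best is None or d < best[1]):
                best = (c_id, d)
        if best is not None:
            matches[t_id] = best
    return matches


# Hypothetical coordinates (lat, lon), for illustration only.
treated = {"t1": (0.00, 20.00), "t2": (1.00, 21.00)}
controls = {"c1": (0.05, 20.00), "c2": (1.00, 21.30)}

for threshold_km in (10.0, 50.0):
    print(threshold_km, match_nearest_control(treated, controls, threshold_km))
```

A tighter threshold keeps fewer but geographically closer pairs, while a wider one retains more treated units at the cost of comparability. That trade-off is why matched datasets at several thresholds are useful.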

Balance diagnostics compared geographic and baseline variables across inside and outside areas before interpreting outcome regressions. The matching and data-construction logic was developed across MATLAB and Stata, while the final regression workflow was implemented in Stata.
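
A common balance diagnostic is the standardized mean difference, sketched below in Python. The source does not say which statistic the project used, so this is an assumed example. The covariate (elevation) and its values are hypothetical.

```python
import math
from statistics import mean, variance


def standardized_difference(treated, control):
    """Difference in means divided by the pooled standard deviation.
    Uses sample variances, so each group needs at least two observations."""
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        raise ValueError("covariate has no variation in either group")
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical elevation (metres) for inside and outside villages.
elevation_inside = [410, 430, 450, 470]
elevation_outside = [400, 420, 440, 460]

smd = standardized_difference(elevation_inside, elevation_outside)
print(round(smd, 3))
```

Because the statistic is scale-free, it can be compared across covariates measured in different units. A common rule of thumb treats absolute values above roughly 0.1 as a sign of imbalance worth addressing before outcome regressions are interpreted.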

Outcome Models

The analysis considered several outcome families: conflict outcomes, public-goods indicators and night-light measures.

Regression specifications included geographic controls such as latitude and longitude, and alternative count-data models were considered for overdispersed outcomes.
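
To make the overdispersion concern concrete, the Python sketch below computes a simple variance-to-mean ratio for a count outcome. A Poisson model implies a ratio near 1, so a much larger ratio suggests an alternative such as a negative binomial model (an assumption here, since the source does not name the alternatives). The counts are hypothetical, and the check is unconditional; a formal test would condition on covariates.

```python
from statistics import mean, variance


def dispersion_ratio(counts):
    """Sample variance divided by sample mean of a count variable.
    Values well above 1 indicate overdispersion relative to Poisson."""
    m = mean(counts)
    if m == 0:
        raise ValueError("all counts are zero; the ratio is undefined")
    return variance(counts) / m


# Hypothetical conflict-event counts per village, for illustration only.
conflict_events = [0, 0, 0, 1, 0, 2, 0, 0, 9, 0]

ratio = dispersion_ratio(conflict_events)
print(round(ratio, 2))
```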

Spatial Inference

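Because observations are spatially located, nearby villages are likely to have correlated residuals, so conventional standard errors may be too small. The project therefore considered spatial dependence in residuals and standard errors.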

The workflow included references to Conley-style spatial HAC corrections and spatially robust inference as part of the econometric design.
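
The idea behind a Conley-style correction can be sketched for the simplest case: the slope in a regression of an outcome on a single regressor plus an intercept. The Python code below weights cross-products of scores by a Bartlett kernel that declines linearly with distance and reaches zero at a cutoff. It is an illustrative assumption, not the project's implementation. The kernel choice, cutoff, data and function names are hypothetical, and a real application would include controls and use an established routine.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(p, q):
    """Great-circle distance in kilometres between (lat, lon) points p and q."""
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dphi = phi2 - phi1
    dlam = math.radians(q[1] - p[1])
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def ols_residuals(x, y):
    """Residuals from OLS of y on an intercept and a single regressor x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return [yi - intercept - slope * xi for xi, yi in zip(x, y)]


def conley_slope_variance(x, resid, coords, cutoff_km):
    """Spatial HAC variance of the OLS slope with Bartlett distance weights.
    Pairs farther apart than cutoff_km (which must be > 0) get zero weight.
    The estimate is not guaranteed to be positive in small samples."""
    n = len(x)
    x_bar = sum(x) / n
    xd = [xi - x_bar for xi in x]
    sxx = sum(v * v for v in xd)
    scores = [xd[i] * resid[i] for i in range(n)]
    meat = 0.0
    for i in range(n):
        for j in range(n):
            d = haversine_km(coords[i], coords[j])
            weight = max(0.0, 1.0 - d / cutoff_km)
            meat += weight * scores[i] * scores[j]
    return meat / sxx ** 2


# Hypothetical data: x is an inside-concession indicator, y a toy outcome.
x = [1, 1, 1, 0, 0, 0]
y = [5, 6, 7, 3, 4, 2]
coords = [(0.00, 20.00), (0.02, 20.00), (0.04, 20.00),
          (0.00, 20.05), (0.02, 20.05), (0.04, 20.05)]

resid = ols_residuals(x, y)
v_no_spatial = conley_slope_variance(x, resid, coords, 1e-6)
v_10km = conley_slope_variance(x, resid, coords, 10.0)
print(v_no_spatial, v_10km)
```

With a cutoff small enough that no two distinct villages receive weight, the estimator collapses to the usual heteroskedasticity-robust (HC0) variance. Widening the cutoff lets correlated residuals among neighbours change the estimated variance, which is the point of the correction.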

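Implemented Elements

The implemented work covered coordinate-based matching at multiple distance thresholds, construction of the matched analysis datasets, balance checks and econometric estimation.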

Outputs

The project produced cleaned analysis datasets, balance checks and regression tables comparing concession and nearby non-concession areas across several outcomes.

The value of the project lies in the research design: moving from historical geographic boundaries to a structured causal comparison with explicit attention to matching, balance and spatial confounding.

Evaluation Limits

The design is informative but not automatic proof of causality. Historical concession placement, omitted geographic variables and spatial dependence all require caution.

Modern Extension

A modern version of the project would formalize the design in a reproducible geospatial pipeline and strengthen the robustness framework.

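Technologies and Methods Used

MATLAB and Stata; geographic regression discontinuity, distance-based matching, balance diagnostics, count-data models and Conley-style spatial HAC inference.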

Resources

Code and raw data are not public.

An anonymized technical note can be prepared upon request.