Synthetic Control and SDID
Synthetic control builds a shadow treated unit from weighted untreated donors, using pre-policy fit to construct the post-policy counterfactual.
Mechanism Lab
Animation: how donor weights build the synthetic counterfactual
The animation reveals donor units, pushes their weights into the synthetic path, then compares the treated post-policy path with the synthetic counterfactual.
Donor pool
Start with untreated units that are institutionally comparable and unaffected by the policy.
j = 2, ..., J+1
01 / Intuition
Core Intuition
When only one city, school, region, or country is treated, standard difference-in-differences (DID) may lack a natural control group. Synthetic control builds a data-driven comparison unit by weighting the units in a donor pool.
Weights should fit pre-policy outcomes and covariates, not post-policy outcomes.
Credibility comes from pre-fit quality, donor-pool justification, absence of concurrent shocks, placebo tests, and transparent reporting of weights and sample choices.
02 / Math
From donor weights to the post-treatment counterfactual
01 / Panel structure
Unit 1 is treated and units 2, ..., J+1 are untreated donors. T0 is the last pre-treatment period and T is the last observed period.
Y_1t: treated unit outcome
Y_jt: donor unit outcome, j=2,...,J+1
t <= T0: pre-period, t > T0: post-period
02 / Weight constraints
Synthetic-control weights are usually nonnegative and sum to one, making the synthetic unit a convex combination of donor units.
w_j >= 0, sum_{j=2}^{J+1} w_j = 1
03 / Pre-treatment fit
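Let X_1 be the vector of treated-unit pre-policy features, X_0 the donor feature matrix (one column per donor), and V a positive semidefinite matrix that weights the features by importance. Choose weights so that the weighted donors match the treated unit's pre-period features.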
w_hat = argmin_w (X_1 - X_0 w)^T V (X_1 - X_0 w)
s.t. w >= 0, 1^T w = 1
04 / Counterfactual path
After treatment, the weighted donor outcome path estimates what the treated unit would have experienced without treatment.
Y_1t(0)_hat = sum_{j=2}^{J+1} w_hat_j Y_jt, for t > T0
05 / Effect path
The treatment effect at each post-period is observed treated outcome minus synthetic counterfactual.
tau_t_hat = Y_1t - Y_1t(0)_hat
ATT_post = (1/(T-T0)) sum_{t=T0+1}^{T} tau_t_hat
06 / Placebo inference
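Reassign treatment to each donor in turn and rebuild its synthetic control. If the treated unit's gap is large relative to the placebo gaps, the evidence for an effect is stronger. Scaling each unit's post-period root mean squared prediction error (RMSPE) by its pre-period RMSPE keeps poorly fitted placebos from dominating the comparison.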
ratio_i = RMSPE_post,i / RMSPE_pre,i
07 / SDID intuition
Synthetic DID combines donor weights with time weights, blending synthetic-control weighting with DID-style before-after differencing.
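The weighted double difference described here can be sketched in Python. This is a minimal illustration, assuming the donor weights `omega` and the pre-period time weights `lam` are already given; estimating them is not shown. The function name `sdid_double_difference` and all toy numbers are hypothetical.

```python
import numpy as np


def sdid_double_difference(Y1_pre, Y1_post, Y0_pre, Y0_post, omega, lam):
    """Post-period average gap minus the time-weighted pre-period gap.

    Y1_* are 1-D arrays for the treated unit; Y0_* are (periods x donors)
    arrays; omega are donor weights and lam are pre-period time weights.
    """
    gap_pre = Y1_pre - Y0_pre @ omega     # treated minus weighted donors, pre
    gap_post = Y1_post - Y0_post @ omega  # treated minus weighted donors, post
    return gap_post.mean() - lam @ gap_pre


# Toy panel (assumed values): additive unit and time effects for the donors.
rng = np.random.default_rng(0)
T0, T1, J = 6, 3, 4
unit_effects = rng.normal(size=J)
time_effects = rng.normal(size=T0 + T1)
Y0 = unit_effects[None, :] + time_effects[:, None]  # shape (T0 + T1, J)

omega = np.array([0.4, 0.3, 0.2, 0.1])           # assumed donor weights
lam = np.array([0.0, 0.0, 0.1, 0.2, 0.3, 0.4])   # assumed time weights

# Treated unit: weighted donors plus a permanent level shift of 5,
# plus a simulated effect of 2 in the post period.
Y1 = Y0 @ omega + 5.0
Y1[T0:] += 2.0

tau = sdid_double_difference(Y1[:T0], Y1[T0:], Y0[:T0], Y0[T0:], omega, lam)
sc_gap = (Y1[T0:] - Y0[T0:] @ omega).mean()  # post gap without differencing

print("SDID-style estimate:", round(tau, 6))
print("Post gap without pre-period differencing:", round(sc_gap, 6))
```

In this toy setup the level shift of 5 appears in both the pre and post gaps, so the differencing step removes it and leaves the simulated effect, while the undifferenced post gap absorbs the shift.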
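tau_SDID = (Ybar_1,post - omega^T Ybar_0,post) - sum_{t<=T0} lambda_t (Y_1t - omega^T Y_0t)
Here Ybar denotes a post-period average, omega holds the donor weights, and lambda_t are pre-period time weights that sum to one.
03 / Code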
Python code: constrained optimization for synthetic-control weights
This skeleton uses `scipy.optimize.minimize` to estimate nonnegative donor weights that sum to one, then builds the synthetic path, effect path, and pre/post RMSPE. It matches on pre-period outcomes only, so V is effectively the identity matrix.
import numpy as np
import pandas as pd
from scipy.optimize import minimize

# df is assumed to be a long-format DataFrame with columns:
# unit, year, outcome, treated_unit
treated_unit = "City A"
pre_years = list(range(2010, 2020))   # 2010 through 2019
post_years = list(range(2020, 2025))  # 2020 through 2024

# Wide panel: one row per year, one column per unit.
panel = df.pivot(index="year", columns="unit", values="outcome").sort_index()
donors = [unit for unit in panel.columns if unit != treated_unit]
Y1_pre = panel.loc[pre_years, treated_unit].to_numpy()
Y0_pre = panel.loc[pre_years, donors].to_numpy()


def objective(weights):
    # Mean squared pre-period gap between treated and synthetic outcomes.
    synthetic_pre = Y0_pre @ weights
    return np.mean((Y1_pre - synthetic_pre) ** 2)


n_donors = len(donors)
constraints = [{"type": "eq", "fun": lambda w: w.sum() - 1}]  # weights sum to one
bounds = [(0, 1)] * n_donors                                   # weights are nonnegative
start = np.repeat(1 / n_donors, n_donors)                      # start from equal weights
result = minimize(objective, start, method="SLSQP", bounds=bounds, constraints=constraints)
if not result.success:
    raise RuntimeError(f"Weight optimization failed: {result.message}")
weights = pd.Series(result.x, index=donors).sort_values(ascending=False)

# Synthetic path over all years; the effect path is treated minus synthetic.
synthetic_path = panel[donors] @ weights
effect_path = panel[treated_unit] - synthetic_path
pre_rmspe = np.sqrt(np.mean(effect_path.loc[pre_years] ** 2))
post_rmspe = np.sqrt(np.mean(effect_path.loc[post_years] ** 2))
print(weights[weights > 0.01])
print({"pre_rmspe": pre_rmspe, "post_rmspe": post_rmspe})
print(effect_path.loc[post_years])
04 / Case
Case: evaluating a city emissions policy with one treated unit
- Question: did a city-level emissions policy introduced in 2020 reduce pollution?
- The donor pool should include untreated cities with comparable institutions, industrial structure, and no major concurrent shocks.
- The key graph is not only the post-policy gap; it is whether the treated city and synthetic city track closely before policy.
- A credible report includes donor weights, pre-policy RMSPE, effect path, placebo distribution, leave-one-donor-out sensitivity, and donor-pool justification.
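The placebo distribution and leave-one-donor-out checks listed above can be sketched on simulated data. This is an illustration under stated assumptions, not a definitive procedure: the panel is synthetic, the helper names (`fit_weights`, `synthetic_gap`, `rmspe_ratio`) are hypothetical, and the actually treated unit is kept out of the placebo donor pools so that its simulated effect does not contaminate the placebo fits.

```python
import numpy as np
from scipy.optimize import minimize


def fit_weights(y_pre, donors_pre):
    """Nonnegative weights summing to one that minimize pre-period MSE."""
    n = donors_pre.shape[1]
    result = minimize(
        lambda w: np.mean((y_pre - donors_pre @ w) ** 2),
        np.full(n, 1.0 / n),
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
    )
    return result.x


def synthetic_gap(Y, target, pool, n_pre):
    """Gap path (target minus synthetic) using the columns in `pool` as donors."""
    w = fit_weights(Y[:n_pre, target], Y[:n_pre][:, pool])
    return Y[:, target] - Y[:, pool] @ w


def rmspe_ratio(gap, n_pre):
    """Post-period RMSPE divided by pre-period RMSPE."""
    pre = np.sqrt(np.mean(gap[:n_pre] ** 2))
    post = np.sqrt(np.mean(gap[n_pre:] ** 2))
    return post / pre


# Simulated panel (assumed): unit levels + common shocks + noise.
rng = np.random.default_rng(0)
n_pre, n_post, n_units = 10, 5, 8
levels = rng.uniform(0, 10, size=n_units)
levels[0] = levels[1:].mean()  # keep the treated unit inside the donor range
shocks = rng.normal(size=n_pre + n_post)
noise = rng.normal(scale=0.2, size=(n_pre + n_post, n_units))
Y = levels[None, :] + shocks[:, None] + noise
Y[n_pre:, 0] -= 3.0  # simulated policy effect on unit 0

treated = 0
donor_ids = [j for j in range(n_units) if j != treated]

# Treated-unit ratio and in-space placebo ratios.
treated_ratio = rmspe_ratio(synthetic_gap(Y, treated, donor_ids, n_pre), n_pre)
placebo_ratios = np.array([
    rmspe_ratio(
        synthetic_gap(Y, j, [k for k in donor_ids if k != j], n_pre), n_pre
    )
    for j in donor_ids
])

# Rank-based p-value: share of all units with a ratio at least as large.
all_ratios = np.append(placebo_ratios, treated_ratio)
p_value = np.mean(all_ratios >= treated_ratio)

# Leave-one-donor-out: refit without each donor, record the mean post gap.
loo_effects = {
    d: float(
        synthetic_gap(Y, treated, [k for k in donor_ids if k != d], n_pre)[n_pre:].mean()
    )
    for d in donor_ids
}

print("treated ratio:", treated_ratio)
print("placebo ratios:", placebo_ratios)
print("rank-based p-value:", p_value)
print("leave-one-out mean post gaps:", loo_effects)
```

With J donors the smallest attainable rank-based p-value is 1/(J+1), so a small donor pool limits how strong the placebo evidence can be. Leave-one-out estimates that stay close to the full-pool estimate suggest the result does not hinge on a single donor.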
05 / Risks
Common Pitfalls
References
- Abadie, Diamond, and Hainmueller (2010), Synthetic Control Methods for Comparative Case Studies. https://doi.org/10.1198/jasa.2009.ap08746
- Abadie (2021), Using Synthetic Controls. https://doi.org/10.1257/jel.20191450
- Arkhangelsky et al. (2021), Synthetic Difference-in-Differences. https://doi.org/10.1257/aer.20190159