5  Classical Synthetic Control

5.1 The SCM idea

Synthetic Control stops using one control state. Instead, it builds a weighted combination of donor states that matches the treated unit’s pre-period as closely as possible on a chosen set of predictors. The weighted combination is “synthetic California”. The gap between observed California and synthetic California is the estimated effect.

Why it works where DiD failed. Difference-in-Differences against Nevada (chapter 4) needed parallel pre-trends with one neighbour. Synthetic Control needs parallel pre-trends with a data-driven blend of many neighbours. The optimisation does the matching, so the analyst no longer has to pick “the right” control state by hand.

5.2 The four-stage tidysynth pipeline

flowchart LR
    A["1. synthetic_control()<br/>declare treated unit<br/>and intervention time"] --> B["2. generate_predictor()<br/>define matching variables<br/>(one call per time window)"]
    B --> C["3. generate_weights()<br/>optimise donor weights<br/>(quadratic programming)"]
    C --> D["4. generate_control()<br/>build synthetic California<br/>and post-period gap series"]
    D --> E["5. plot_/grab_ helpers<br/>trends, weights,<br/>placebos, MSPE ratio,<br/>Fisher exact p-value"]

    style A fill:#6a9bcc,stroke:#cbd5e0,color:#fff
    style B fill:#6a9bcc,stroke:#cbd5e0,color:#fff
    style C fill:#6a9bcc,stroke:#cbd5e0,color:#fff
    style D fill:#d97757,stroke:#cbd5e0,color:#fff
    style E fill:#00d4c8,stroke:#cbd5e0,color:#141413
Figure 5.1: The four-stage tidysynth pipeline: declare → predictors → weights → control, with a fifth stage of plot/grab helpers.

Stages 1–4 produce the estimate. Stage 5 is a battery of inspection helpers — plot_trends(), plot_differences(), plot_weights(), plot_placebos(), plot_mspe_ratio(), grab_unit_weights(), grab_predictor_weights(), grab_balance_table(), grab_significance() — that turn the fitted object into figures and tables. We use all of them below.

5.3 The equation

Let \(X_1\) be the vector of \(k\) pre-period predictors for the treated unit (California), and let \(X_0\) be the \(k \times J\) matrix holding the same predictors for the \(J = 38\) donor states. The Synthetic Control estimator chooses donor weights \(w\) to minimise the (V-weighted) discrepancy between treated and synthetic on the predictors:

\[w^* \, = \, \arg\min_{w \in \mathcal{W}} \, \big(X_1 - X_0 w\big)^\top V \big(X_1 - X_0 w\big),\]

subject to

\[\mathcal{W} = \big\{w \in \mathbb{R}^J \,:\, w_j \ge 0 \,\, \forall j, \,\, \textstyle\sum_{j=1}^J w_j = 1\big\}.\]

The diagonal matrix \(V\) holds the predictor importance weights — the optimiser can care more about pre-period cigarette sales than about, say, beer consumption (we inspect \(V\) below). Once \(w^*\) is solved, the synthetic California outcome at any year \(t\) is

\[\widehat{Y_{1t}(0)} = \sum_{j=1}^J w_j^* \, Y_{jt},\]

and the ATT over 1989–2000 is the mean post-period gap between observed California and that synthetic counterfactual.
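
As a reference point for the notation, here is a minimal sketch of the objective in R. The names scm_loss, X1, X0, V, and w are hypothetical illustration objects, not part of the chapter's pipeline; tidysynth solves this quadratic programme internally inside generate_weights(), so nothing in this sketch needs to be run.

Code
# V-weighted predictor discrepancy from the equation above, assuming
# X1 is a k-vector, X0 a k x J matrix, V a k x k diagonal matrix, and
# w a candidate weight vector on the unit simplex (non-negative, sums to 1).
scm_loss <- function(w, X1, X0, V) {
  gap <- X1 - X0 %*% w            # (X1 - X0 w), a k x 1 discrepancy
  drop(t(gap) %*% V %*% gap)      # quadratic form (X1 - X0 w)' V (X1 - X0 w)
}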

5.4 Setup and data

Packages. tidyverse covers wrangling and plotting. tidysynth is the workhorse for this chapter: it wraps the classical Abadie-Diamond-Hainmueller synthetic-control optimiser behind a tidy pipeline of synthetic_control() |> generate_predictor() |> generate_weights() |> generate_control() and ships a battery of plot_*() / grab_*() helpers for inspection. The R/table_helpers.R helper provides gt_pretty() for the donor-weight and balance tables.

Code
library(tidyverse)
library(tidysynth)
source("R/table_helpers.R")

set.seed(42)

knitr::opts_chunk$set(dev.args = list(bg = "transparent"))

theme_set(
  theme_minimal(base_size = 12) +
    theme(
      plot.background  = element_rect(fill = "transparent", color = NA),
      panel.background = element_rect(fill = "transparent", color = NA),
      panel.grid.major = element_line(color = "#94a3b8", linewidth = 0.25),
      panel.grid.minor = element_line(color = "#94a3b8", linewidth = 0.15),
      text             = element_text(color = "#94a3b8"),
      axis.text        = element_text(color = "#94a3b8")
    )
)

Dataset. Unlike the ITS, RDD, and DiD chapters — which used only California or only California-plus-Nevada — Synthetic Control uses the full 39-state × 31-year panel. The donor pool is the point: the other 38 states are the raw material the optimiser will blend into “synthetic California”. We don’t pre-filter or restrict to a window here; the chapter-2 outcome cigsale and the covariates lnincome, retprice, age15to24, beer are all in prop99 and feed straight into the pipeline below (cigsale as the outcome in synthetic_control(), the covariates inside generate_predictor()).

Code
prop99 <- read_rds("data/proposition99.rds") |> as_tibble()

The loaded prop99 is a 1,209-row × 7-column tibble: one row per state-year (39 states × 31 years), with columns state, year, cigsale, plus the four covariates. The pipeline in the next section consumes it as-is.
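
A quick structural sanity check of those dimensions (not part of the original pipeline; it only confirms the counts quoted above):

Code
n_distinct(prop99$state)   # expect 39 states
n_distinct(prop99$year)    # expect 31 years
dim(prop99)                # expect 1209 rows x 7 columns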

5.5 Fit the synthetic-control pipeline

Code
prop99_syn <- prop99 |>
  # 1. Declare the panel structure: outcome, unit, time, treated unit
  #    ("California"), and the last full pre-period year (1988).
  #    generate_placebos = TRUE also fits the model treating each donor
  #    state as treated, for the permutation test below.
  synthetic_control(
    outcome  = cigsale, unit = state, time = year,
    i_unit   = "California", i_time = 1988,
    generate_placebos = TRUE
  ) |>
  # 2. Predictors averaged over the full pre-period (1980-1988).
  generate_predictor(
    time_window = 1980:1988,
    lnincome    = mean(lnincome, na.rm = TRUE),
    retprice    = mean(retprice, na.rm = TRUE),
    age15to24   = mean(age15to24, na.rm = TRUE)
  ) |>
  # 2b. beer is sparser, so use a narrower window where data is densest.
  generate_predictor(time_window = 1984:1988,
                     beer = mean(beer, na.rm = TRUE)) |>
  # 2c. Three "lagged outcomes" - cigsale at three pre-period dates.
  #     These pin synthetic California's pre-period trajectory.
  generate_predictor(time_window = 1975, cigsale_1975 = cigsale) |>
  generate_predictor(time_window = 1980, cigsale_1980 = cigsale) |>
  generate_predictor(time_window = 1988, cigsale_1988 = cigsale) |>
  # 3. Solve the constrained QP for donor weights w*. The three IPOP
  #    parameters are tuning knobs for the interior-point optimiser.
  generate_weights(optimization_window = 1970:1988,
                   margin_ipop = .02,
                   sigf_ipop   = 7,
                   bound_ipop  = 6) |>
  # 4. Compute the synthetic California series from w* and donor outcomes.
  generate_control()

Predictor choices. Seven predictors are passed in. Three are covariate averages over 1980–1988 (lnincome, retprice, age15to24). One uses a narrower 1984–1988 window where the data are densest (beer). Three are lagged outcomes — cigarette sales themselves in 1975, 1980, and 1988. The lagged outcomes are the most important trick: anchoring the synthetic control on the treated unit’s own pre-period outcome levels at multiple time points forces the synthetic series to track California’s pre-1988 trajectory closely.
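
The three anchor values are easy to pull straight from the raw panel; this small cross-check simply reads off what generate_predictor() captures at those dates (the same California values reappear in the balance table in section 5.8):

Code
# California's cigarette sales at the three lagged-outcome anchor years.
prop99 |>
  filter(state == "California", year %in% c(1975, 1980, 1988)) |>
  select(year, cigsale)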

5.6 Donor weights and predictor weights

The optimisation produces two weight vectors that drive the entire fit. Both are extractable as tidy tables.

Code
grab_unit_weights(prop99_syn) |>
  arrange(desc(weight)) |>
  head(8) |>
  gt_pretty(decimals = 3) |>
  cols_label(unit = "Donor state", weight = "Weight")
Table 5.1: Donor unit weights (top 8 states).
Donor state Weight
Utah 0.342
Nevada 0.238
Montana 0.209
Colorado 0.149
Connecticut 0.062
New Mexico 0
Idaho 0
Wisconsin 0
Code
grab_predictor_weights(prop99_syn) |>
  arrange(desc(weight)) |>
  gt_pretty(decimals = 3) |>
  cols_label(variable = "Predictor", weight = "V-matrix weight")
Table 5.2: Predictor (V-matrix) weights.
Predictor V-matrix weight
cigsale_1975 0.468
cigsale_1980 0.412
retprice 0.055
cigsale_1988 0.037
beer 0.02
age15to24 0.007
lnincome 0

Two things to notice.

  1. Five states absorb essentially 100% of the donor weight: Utah, Nevada, Montana, Colorado, and Connecticut. Every other state gets effectively zero. California is matched mostly to other Western/sunbelt states with similar age structure and cigarette price levels, plus Connecticut as a smoking-rate counterweight from the east.
  2. The two earliest cigsale levels dominate the V matrix. cigsale_1975 and cigsale_1980 together get roughly 88% of the predictor weight, while the four behavioural and demographic covariates get less than 9% combined. The optimiser has effectively decided that a state’s own pre-period cigarette sales are the best fingerprint for matching: past smoking predicts future smoking better than income, prices, or demographics do. (Both claims are checked numerically just below.)
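
A quick numerical check of both claims, reusing the grab_*() helpers (nothing here goes beyond the two tables above):

Code
# 1. The top-five donors carry essentially all of the unit weight.
grab_unit_weights(prop99_syn) |>
  slice_max(weight, n = 5) |>
  summarise(top5_weight = sum(weight))     # ~1.00

# 2. The two early lagged outcomes carry most of the V-matrix weight.
grab_predictor_weights(prop99_syn) |>
  filter(variable %in% c("cigsale_1975", "cigsale_1980")) |>
  summarise(lagged_share = sum(weight))    # ~0.88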

For a one-line visual of both weight vectors, tidysynth ships a plot_weights() helper:

Code
plot_weights(prop99_syn)
Figure 5.2: Donor unit weights (left) and predictor V-matrix weights (right) from tidysynth.

5.6.1 A closer look at the V matrix

The combined plot_weights() view is convenient, but the V matrix deserves a stand-alone chart because it answers a different question than the donor weights. Donor weights say which states mimic California; the V matrix says which variables the optimiser used to decide what “mimics” means.

Code
predw_df <- grab_predictor_weights(prop99_syn) |>
  mutate(variable = fct_reorder(variable, weight))

ggplot(predw_df, aes(x = weight, y = variable)) +
  geom_col(fill = "#6a9bcc") +
  geom_text(aes(label = sprintf("%.3f", weight)), hjust = -0.12) +
  scale_x_continuous(expand = expansion(mult = c(0, 0.15))) +
  labs(x = "V-matrix weight (predictor importance)", y = NULL) +
  theme_minimal()
Figure 5.3: V matrix: cigsale_1975 and cigsale_1980 dominate; behavioural covariates get nearly zero weight.

Two readings of the same picture, one practical and one cautionary.

  • Practical reading. Two lagged outcomes (cigsale_1975 and cigsale_1980) carry the bulk of the matching information. The optimiser has decided that California’s pre-period cigarette sales — at multiple time points — are the best fingerprint to match.
  • Cautionary reading. The V matrix is not a causal ranking. It tells you which variables were useful for matching the treated unit’s pre-period, not which variables cause the outcome.

Common pitfall. Treating the V matrix as a list of causal drivers. It is a list of good pre-period predictors for one specific unit, not a structural model of smoking.

5.7 The estimate

Code
# grab_synthetic_control() returns a tidy tibble with observed (real_y)
# and synthetic (synth_y) cigsale for every year. We restrict to the
# post-period and compute the per-year gap.
sc_post <- grab_synthetic_control(prop99_syn) |>
  filter(time_unit > 1988) |>
  mutate(dif = real_y - synth_y)

# Average the per-year gap to recover the ATT.
mean(sc_post$dif)
[1] -18.84561

The Synthetic Control ATT is approximately \(-18.85\) packs/capita averaged over 1989–2000. This is the book’s primary causal estimate and within rounding of the canonical Abadie et al. (2010) result.

Code
plot_trends(prop99_syn)
Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
ℹ Please use `linewidth` instead.
ℹ The deprecated feature was likely used in the tidysynth package.
  Please report the issue to the authors.

Figure 5.4: plot_trends(): observed vs synthetic California cigarette sales, 1970–2000.

The pre-period fit is excellent — the synthetic and observed series are nearly indistinguishable through 1988. A substantial gap opens immediately after 1989, widening to roughly 30 packs per capita by 2000.
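
To put a rough number on that widening, here is a quick look at the tail of the per-year gap series, reusing the sc_post tibble built in the previous chunk; it shows nothing the plot does not already show:

Code
# Per-year gaps for the final post-period years; the gap keeps widening
# toward roughly 30 packs per capita by 2000.
sc_post |>
  filter(time_unit >= 1997) |>
  select(time_unit, real_y, synth_y, dif)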

5.8 Predictor balance: did the matching work?

grab_balance_table() shows California, synthetic California, and the unweighted donor average side-by-side on every predictor.

Code
grab_balance_table(prop99_syn) |>
  gt_pretty(decimals = 2)
Table 5.3: Predictor balance: California vs synthetic California vs donor average.
variable California synthetic_California donor_sample
age15to24 0.17 0.17 0.17
lnincome 10.08 9.85 9.83
retprice 89.42 89.39 87.27
beer 24.28 24.22 23.66
cigsale_1975 127.1 126.99 136.93
cigsale_1980 120.2 120.22 138.09
cigsale_1988 90.1 91.39 113.82

On every variable, synthetic California is far closer to California than the unweighted donor average is. The most dramatic improvement is on the lagged outcomes: cigsale_1988 is roughly 90 for California vs ≈91 for the synthetic — a near-perfect match — while the unweighted donor average is around 114. That gap of ~24 packs is exactly the bias the naive pre-post method silently absorbed.
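
The unweighted donor average in that comparison can be cross-checked directly from the raw panel (a small sketch on the loaded prop99 tibble, independent of the fitted object):

Code
# Unweighted mean of 1988 cigarette sales across the 38 donor states;
# compare with the donor_sample entry for cigsale_1988 in Table 5.3 (~114).
prop99 |>
  filter(year == 1988, state != "California") |>
  summarise(donor_avg_cigsale_1988 = mean(cigsale))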

5.9 Visualising the post-period gap

plot_trends() showed both observed and synthetic California on one canvas. The companion helper plot_differences() plots just the gap: \(Y_{1t} - \widehat{Y_{1t}(0)}\), year by year. This isolates the treatment-effect curve in its cleanest form.

Code
plot_differences(prop99_syn)
Ignoring unknown labels:
• colour : ""
• linetype : ""
Figure 5.5: Per-year gap between observed California and synthetic California.

Read the line as the effect of Proposition 99 on California in year \(t\). The pre-period values hover near zero (the matching worked), the line drops sharply after 1989, and it stays negative — steadily widening — throughout the post-period. The 1989–2000 mean of this series is exactly the \(-18.85\) packs ATT reported above.
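
The “hovers near zero before 1989” claim is easy to confirm numerically with a one-line summary of the same series:

Code
# Mean pre-period gap between observed and synthetic California;
# a value near zero is what a successful match should produce.
grab_synthetic_control(prop99_syn) |>
  filter(time_unit <= 1988) |>
  summarise(mean_pre_gap = mean(real_y - synth_y))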

5.10 Inference via placebo permutation

A “standard error” computed as cross-year SD divided by \(\sqrt{N}\) is not a real sampling-distribution-based standard error. The proper Synthetic Control uncertainty quantification is a permutation test.

The recipe. Refit the synthetic-control model treating each donor state as if it had been the treated unit. Compute the post-period gap for each placebo. Compare California’s gap trajectory to those placebo trajectories. If California’s gap is extreme relative to the placebos, the policy probably did something.
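
Before turning to the packaged plot, the recipe’s key quantity can be sketched by hand. This assumes the placebo argument that the grab_*() helpers accept and the .id unit column it adds; plot_placebos() does the equivalent internally:

Code
# Post-period mean gap for every unit, treated and placebo alike.
# California should sit at (or near) the bottom of this ordering.
grab_synthetic_control(prop99_syn, placebo = TRUE) |>
  filter(time_unit > 1988) |>
  group_by(.id) |>
  summarise(post_gap = mean(real_y - synth_y)) |>
  arrange(post_gap)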

Code
plot_placebos(prop99_syn)
Figure 5.6: Placebo permutation: California in orange, donor placebos in grey (badly pre-fit donors pruned by default).

The orange line is California; the grey lines are the donor placebos. By default, plot_placebos() prunes placebos whose pre-period mean squared prediction error (MSPE) exceeds twice California’s — those donors fit their own pre-period so badly that comparing their post-period gap to California’s would be misleading. After pruning, California’s post-period gap sits visibly below every retained placebo, which is the visual signature of a “real” treatment effect.
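
The pruning rule as described above can be reproduced from grab_significance(), which records each unit’s pre-period MSPE (a sketch of the rule, not necessarily the package’s exact internal code):

Code
# Donors whose pre-period MSPE is at most twice California's are the
# placebos retained by the default pruning.
sig    <- grab_significance(prop99_syn)
ca_pre <- sig$pre_mspe[sig$unit_name == "California"]
sig |>
  filter(type == "Donor", pre_mspe <= 2 * ca_pre) |>
  nrow()   # number of retained placebo donors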

The unpruned variant keeps every donor for full transparency:

Code
plot_placebos(prop99_syn, prune = FALSE)
Figure 5.7: Same placebo plot with every donor retained — including badly pre-fit ones — to show the full pool of placebo trajectories.

With pruning off, the grey cloud is messier and a few badly-fit donors swing wildly — but California’s post-period descent still ends up at the bottom of the bundle. The qualitative conclusion does not depend on the pruning rule.

5.11 MSPE ratio and Fisher exact p-value

A sharper version of the same test is the MSPE ratio — the ratio of post-period to pre-period mean squared prediction error. If a unit has a tight pre-period fit and a large post-period gap, the ratio is large.

Code
# grab_significance() returns one row per unit (treated + every placebo)
# with pre_mspe, post_mspe, the post/pre ratio, the unit's rank in that
# ratio, and the Fisher-style p-value (rank / n_units).
grab_significance(prop99_syn) |>
  arrange(desc(mspe_ratio)) |>
  head(5) |>
  gt_pretty(decimals = 3)
Table 5.4: MSPE ratios — top 5 units.
unit_name type pre_mspe post_mspe mspe_ratio rank fishers_exact_pvalue z_score
California Treated 3.166 392.198 123.87 1 0.026 5.324
Georgia Donor 3.786 178.712 47.208 2 0.051 1.702
Indiana Donor 25.171 769.656 30.577 3 0.077 0.916
West Virginia Donor 9.523 284.105 29.832 4 0.103 0.881
Wisconsin Donor 11.134 267.763 24.05 5 0.128 0.607

California’s MSPE ratio is around 124 — more than two and a half times higher than the next-highest unit. California ranks 1st out of 39 units. The Fisher exact \(p\)-value is rank divided by total units, so \(1/39 \approx 0.026\). Under the null hypothesis that Proposition 99 had no effect, the probability of seeing a unit this extreme purely by chance is about 2.6%.
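
That arithmetic is worth making explicit; a two-line sketch using the same significance table:

Code
# Fisher-style exact p-value by hand: California's rank in the MSPE-ratio
# ordering divided by the number of units in the permutation (39).
grab_significance(prop99_syn) |>
  summarise(p_value = rank[unit_name == "California"] / n())   # 1/39 ~ 0.026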

Code
plot_mspe_ratio(prop99_syn)
Ignoring unknown labels:
• colour : ""
Figure 5.8: MSPE-ratio bar chart with California highlighted at rank 1 out of 39 units.

The orange bar at the top is California; every blue bar below it is a placebo donor. The gap between California and the second-place state is enormous. That gap is the visual signature of “a real treatment effect that the donor pool does not naturally replicate”.

5.12 Inspecting the nested tidysynth object

prop99_syn is not a plain data frame — it is a nested tibble with one row per unit (treated unit + every donor refit as a placebo) and list-columns that hold every intermediate output of the optimisation.

This one chunk prints with R’s default nested-tibble formatter on purpose — the <tibble [N × M]> glyphs in the list-columns are the pedagogical point that a styled table would hide.

Code
prop99_syn
# A tibble: 78 × 11
   .id      .placebo .type .outcome .predictors .synthetic_control .unit_weights
   <fct>       <dbl> <chr> <list>   <list>      <list>             <list>       
 1 Califor…        0 trea… <tibble> <tibble>    <tibble [31 × 3]>  <tibble>     
 2 Califor…        0 cont… <tibble> <tibble>    <tibble [31 × 3]>  <tibble>     
 3 Alabama         1 trea… <tibble> <tibble>    <tibble [31 × 3]>  <tibble>     
 4 Alabama         1 cont… <tibble> <tibble>    <tibble [31 × 3]>  <tibble>     
 5 Arkansas        1 trea… <tibble> <tibble>    <tibble [31 × 3]>  <tibble>     
 6 Arkansas        1 cont… <tibble> <tibble>    <tibble [31 × 3]>  <tibble>     
 7 Colorado        1 trea… <tibble> <tibble>    <tibble [31 × 3]>  <tibble>     
 8 Colorado        1 cont… <tibble> <tibble>    <tibble [31 × 3]>  <tibble>     
 9 Connect…        1 trea… <tibble> <tibble>    <tibble [31 × 3]>  <tibble>     
10 Connect…        1 cont… <tibble> <tibble>    <tibble [31 × 3]>  <tibble>     
# ℹ 68 more rows
# ℹ 4 more variables: .predictor_weights <list>, .original_data <list>,
#   .meta <list>, .loss <list>

Each list-column can be flattened with tidyr::unnest() for custom downstream work, or pulled out with one of the grab_*() helpers used above.

Code
# Flatten .outcome into a wide table: one row per (unit, year).
# The actual cigarette-sales column is named after each unit, so we
# select metadata + California's series for a clean preview.
prop99_syn |>
  tidyr::unnest(cols = c(.outcome)) |>
  select(.id, .placebo, .type, time_unit, California) |>
  head(8) |>
  gt_pretty(decimals = 2)
Table 5.5: Unnested .outcome (first 8 rows).
.id .placebo .type time_unit California
California 0 treated 1,970 123
California 0 treated 1,971 121
California 0 treated 1,972 123.5
California 0 treated 1,973 124.4
California 0 treated 1,974 126.7
California 0 treated 1,975 127.1
California 0 treated 1,976 128
California 0 treated 1,977 126.4

For the tidy observed-vs-synthetic table (which is what most analyses want), the dedicated helper is more convenient:

Code
grab_synthetic_control(prop99_syn) |>
  head(8) |>
  gt_pretty(decimals = 2)
Table 5.6: grab_synthetic_control() — observed vs synthetic (first 8 rows).
time_unit real_y synth_y
1,970 123 116.79
1,971 121 118.66
1,972 123.5 124.09
1,973 124.4 125.23
1,974 126.7 126.83
1,975 127.1 126.99
1,976 128 127.73
1,977 126.4 125.59

This is the whole point of the nested-tibble design: every step of the optimisation is introspectable from R, with no need to dig into S4 slots or attr() blobs.

5.13 Recap

Question Answer
What does Synthetic Control estimate? The ATT on California, 1989–2000
What is the point estimate? \(-18.85\) packs/capita per year
What is “synthetic California”? A convex combination of five Western/sunbelt states (Utah, Nevada, Montana, Colorado, Connecticut)
What predictors did the matching? Mostly two lagged outcomes — cigsale_1975 and cigsale_1980
How good is the matching? Excellent — synthetic and observed California are near-identical through 1988
What is the inference statistic? Fisher exact \(p \approx 0.026\) (California ranks 1st out of 39 on the MSPE ratio)
What is the design-time pitfall? Don’t read the V matrix as a list of causal drivers — it is a list of good pre-period predictors

Synthetic Control is the book’s headline causal estimate, and the placebo / MSPE-ratio diagnostics both confirm that California’s post-1989 trajectory is unusual relative to what other states experienced in the same window. In chapter 6 we hand the same donor information to a Bayesian model and ask whether a credible interval (a direct probability statement about the effect) tells the same story.

5.14 Further reading

  • Abadie et al. (2010) — the original synthetic-control treatment of Proposition 99.
  • Abadie (2021) — methodological feasibility, data requirements, and assumptions in a JEL review.
  • Dunford (2024) — documentation for the tidysynth package used here.