2 Interrupted Time Series

2.1 The ITS idea

Interrupted time series (ITS) drops the comparison unit entirely. The counterfactual is built from the treated unit’s own pre-period dynamics: fit a model on 1970–1988 California, extrapolate it into 1989–2000, and call the gap between the extrapolation and the observed data the effect. Where the naive pre-post estimate of chapter 1 assumes “no change”, ITS allows a non-zero pre-trend. If California was already declining, the ITS counterfactual continues that decline; only the extra drop after 1989 gets attributed to the policy.
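The difference between the naive pre-post comparison and ITS is easiest to see on a series where the true effect is known. The following is a minimal sketch on made-up, noise-free numbers (not the Proposition 99 data): sales fall by 2 packs per year throughout, and a policy effect of -10 packs is switched on in 1989. The names `toy`, `naive` and `its` are illustrative only.

```r
# Toy, noise-free series (made-up numbers, not the Proposition 99 data):
# a steady decline of 2 packs per year plus a true effect of -10 from 1989.
year <- 1970:2000
post <- year >= 1989
toy  <- data.frame(year = year,
                   y    = 120 - 2 * (year - 1970) - 10 * post,
                   post = post)

# Naive pre-post: difference in means, i.e. a flat ("no change") counterfactual.
naive <- mean(toy$y[toy$post]) - mean(toy$y[!toy$post])

# ITS: fit the pre-period trend, extrapolate it, and average the gap.
pre_fit <- lm(y ~ year, data = toy[!toy$post, ])
cf      <- predict(pre_fit, newdata = toy[toy$post, ])
its     <- mean(toy$y[toy$post] - cf)

c(naive = naive, its = its)
```

On this toy series the naive estimate is -41, because it books the whole pre-existing decline as policy effect, while the ITS estimate recovers -10. The recovery is exact only because the toy pre-trend really is linear and really does continue.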

This chapter fits two ITS variants on the same Proposition 99 data:

1. A linear growth-curve model: the simplest pre-trend extrapolation possible.
2. An AICc-selected ARIMA model: a more flexible time-series alternative.

The two estimates disagree dramatically, and the disagreement is the lesson.

2.2 Setup and data

Packages. Three pieces of the R ecosystem do all the heavy lifting in this chapter. tidyverse covers data wrangling and ggplot2 plotting. fpp3 is the meta-package that loads the modern Hyndman & Athanasopoulos time-series toolchain: tsibble for time-indexed data frames, fable for forecasting (we use its ARIMA() model), and feasts for time-series diagnostics. We also source() the small in-repo helper R/table_helpers.R — it provides ms_pretty(), the modelsummary wrapper that renders the pre-period regression table later in this chapter with the book’s house style.

Code

library(tidyverse)
library(fpp3)   # loads tsibble, fable, feasts
source("R/table_helpers.R")

set.seed(42)

knitr::opts_chunk$set(dev.args = list(bg = "transparent"))

theme_set(
  theme_minimal(base_size = 12) +
    theme(
      plot.background  = element_rect(fill = "transparent", color = NA),
      panel.background = element_rect(fill = "transparent", color = NA),
      panel.grid.major = element_line(color = "#94a3b8", linewidth = 0.25),
      panel.grid.minor = element_line(color = "#94a3b8", linewidth = 0.15),
      text             = element_text(color = "#94a3b8"),
      axis.text        = element_text(color = "#94a3b8")
    )
)

Dataset. Proposition 99 ships as a balanced 39-state × 31-year panel covering 1970–2000 — the same dataset used throughout the book. California is the treated unit: the law passed by ballot initiative in November 1988 and took effect on January 1, 1989, so 1989 is the first post-period year. The outcome is per-capita cigarette sales in packs. For ITS we ignore the other 38 states entirely and collapse the panel to a California-only tsibble, then add a centred index year0 = year − 1989 — zero on the first post-period year, negative before, positive after — so that any model fit on the pre-period extrapolates naturally across year0 > 0.

Code

prop99 <- read_rds("data/proposition99.rds") |> as_tibble()

# California-only time series with a Pre/Post factor and a centred year
# index (year0 = 0 at the first post-period year). The tsibble class is
# required by the fpp3 forecasting tools used later.
prop99_ts <- prop99 |>
  filter(state == "California") |>
  select(year, cigsale) |>
  mutate(prepost = factor(year > 1988, labels = c("Pre", "Post"))) |>
  as_tsibble(index = year) |>
  mutate(year0 = year - 1989)

The resulting prop99_ts is a 31-row tsibble (one row per year, 1970–2000) with the year index plus three further columns: cigsale (the outcome), prepost (a factor that is Pre through 1988 and Post from 1989 onward), and year0 (the centred index). Everything in this chapter is fit on prop99_ts |> filter(prepost == "Pre") and projected onto prop99_ts |> filter(prepost == "Post").
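A quick sanity check on the split is cheap and catches off-by-one mistakes in the cutoff. The sketch below rebuilds the two derived columns in base R on a hypothetical stand-in data frame (`toy`, with made-up `cigsale` values), since the real data file is not assumed to be available here.

```r
# Hypothetical stand-in for the California rows (cigsale values are made up).
toy <- data.frame(year = 1970:2000,
                  cigsale = seq(120, 60, length.out = 31))

# factor() orders the logical levels FALSE < TRUE, so FALSE maps to "Pre"
# and TRUE maps to "Post".
toy$prepost <- factor(toy$year > 1988, labels = c("Pre", "Post"))
toy$year0   <- toy$year - 1989

n_pre      <- sum(toy$prepost == "Pre")     # 19 pre-period years
n_post     <- sum(toy$prepost == "Post")    # 12 post-period years
last_pre   <- max(toy$year[toy$prepost == "Pre"])
first_post <- toy$year[toy$year0 == 0]

c(n_pre = n_pre, n_post = n_post, last_pre = last_pre, first_post = first_post)
```

The counts should come out as 19 pre-period and 12 post-period rows, with 1988 as the last pre-period year and 1989 as the year where year0 equals zero.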

2.3 Linear growth-curve ITS

The idea. Fit a single straight line on California’s pre-period cigarette sales, then extrapolate it forward as the counterfactual.

The equation. Fit a linear trend on the pre-period only,

\[Y_{1t} = \alpha + \beta\, t + \varepsilon_t, \qquad t \le t^* = 1988,\]

then extrapolate the fitted line into the post-period as the counterfactual,

\[\widehat{Y_{1t}(0)} = \hat\alpha + \hat\beta\, t, \qquad t > t^*,\]

and finally average the per-year gap between observed and counterfactual,

\[\widehat{\text{ATT}}_{\text{ITS-growth}} = \frac{1}{T_{\text{post}}} \sum_{t > t^*} \Big[Y_{1t} - (\hat\alpha + \hat\beta\, t)\Big].\]

In words: a single straight line, fit on 1970–1988 cigarette sales in California, becomes the counterfactual for 1989–2000. The policy effect is the average of the per-year residuals between what was actually observed and what the extrapolated line predicted. The slope $\hat\beta$ captures whatever secular trend California was already on; only deviations from that trend after 1989 are attributed to Proposition 99.
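Because the counterfactual is linear in $t$, the average gap collapses to a difference of two numbers, which makes the estimate easy to check by hand. A short derivation, using the symbols defined above plus the shorthand $\bar Y_{1,\text{post}}$ and $\bar t_{\text{post}}$ (post-period means, introduced here only for illustration):

```latex
\[
\widehat{\text{ATT}}_{\text{ITS-growth}}
  = \frac{1}{T_{\text{post}}} \sum_{t > t^*} Y_{1t}
    - \Big(\hat\alpha + \hat\beta\, \frac{1}{T_{\text{post}}} \sum_{t > t^*} t\Big)
  = \bar Y_{1,\text{post}} - \big(\hat\alpha + \hat\beta\, \bar t_{\text{post}}\big),
\qquad
\bar t_{\text{post}} = \frac{1989 + 2000}{2} = 1994.5 .
\]
```

In other words, the estimate is the post-period mean of observed sales minus the fitted line evaluated at the midpoint of the post-period.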

Code

# Fit a linear pre-period trend (cigsale on year, 1970-1988 only).
fit_growth <- lm(cigsale ~ year, data = prop99_ts |> filter(prepost == "Pre"))

ms_pretty(list("Pre-period linear trend" = fit_growth),
          coef_map = c("(Intercept)" = "Intercept",
                       "year"        = "Year"))

Table 2.1: Linear growth-curve ITS — pre-period fit.

	Pre-period linear trend
Intercept	3637.789***
	(513.328)
Year	-1.779***
	(0.259)
Num.Obs.	19
R2	0.735
+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001

The pre-period linear trend is about $-1.78$ packs per capita per year ($p < 10^{-5}$, $R^2 \approx 0.73$), so California was already declining by about 1.8 packs per year before Proposition 99. The intercept (about 3638) is the fitted line's value at year 0, far outside the data, and has no substantive interpretation. To estimate the policy effect we extrapolate that line forward to 2000 and average the gap.

Code

# Subset to the 1989-2000 rows and extrapolate the fitted line forward.
post_df <- prop99_ts |> filter(prepost == "Post")
pred_growth <- predict(fit_growth, newdata = as_tibble(post_df))

# ATT estimate = average per-year gap between observed and extrapolation.
its_growth_estimate <- mean(post_df$cigsale - pred_growth)
its_growth_estimate

[1] -28.27868
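A point estimate hides how uncertain a 12-year extrapolation is. As a rough illustration, the sketch below fits the same kind of line on a simulated pre-period (made-up numbers, not the California series) and asks `predict()` for 95% prediction intervals. These intervals assume independent, constant-variance errors, which annual time series often violate, so treat the widths as a lower bound on the real uncertainty rather than a calibrated band.

```r
# Simulated pre-period (made-up numbers): linear decline plus noise.
set.seed(1)
pre <- data.frame(year = 1970:1988)
pre$cigsale <- 125 - 1.8 * (pre$year - 1970) + rnorm(nrow(pre), sd = 4)

fit  <- lm(cigsale ~ year, data = pre)
post <- data.frame(year = 1989:2000)

# 95% prediction intervals for the extrapolated counterfactual.
band  <- predict(fit, newdata = post, interval = "prediction", level = 0.95)
width <- band[, "upr"] - band[, "lwr"]

# The band widens steadily as the forecast moves away from the pre-period.
round(cbind(year = post$year, width = width), 1)
```

The interval width grows with every year of extrapolation, because the uncertainty in the estimated slope is multiplied by the distance from the centre of the pre-period.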

Plotting the observed series against the extrapolated line makes the size of the implied effect explicit: the dashed counterfactual continues the gentle pre-period decline, while the observed series breaks downward sharply after 1988.

Code

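# Evaluate the fitted line over all 31 years so the counterfactual can be
# drawn across both the pre- and post-period.
its_growth_plot <- prop99_ts |>
  as_tibble() |>
  mutate(counterfactual = predict(fit_growth, newdata = as_tibble(prop99_ts)))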

ggplot(its_growth_plot, aes(x = year)) +
  geom_line(aes(y = cigsale, color = "Observed"), linewidth = 1.1) +
  geom_line(aes(y = counterfactual, color = "Counterfactual"),
            linetype = "dashed", linewidth = 1) +
  geom_vline(xintercept = 1988.5, color = "#d97757",
             linetype = "dotted", linewidth = 0.7) +
  scale_color_manual(values = c("Observed" = "#d97757",
                                "Counterfactual" = "#6a9bcc")) +
  labs(x = "Year", y = "Cigarette sales (packs per capita)",
       color = NULL) +
  theme_minimal()

Figure 2.1: ITS counterfactual from a linear pre-period growth curve (1970–1988), extrapolated to 2000.

Reading the output. The ITS-growth-curve estimate is about $-28.3$ packs per capita per year. That is close to the naive pre-post $-27.0$ from chapter 1. Why? Because both methods use only within-California information. Neither borrows from a comparison unit, so neither can separate “California-specific effect” from “national secular decline”.

The coincidence is suggestive but not reassuring. Both methods can be biased the same way if California’s pre-trend was understating the speed of the secular decline.

Common pitfall. Assuming the linear pre-trend is the right shape. If the true secular decline is accelerating or saturating, a linear extrapolation either understates or overstates what would have happened — and the policy effect inherits the bias.
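The shape problem can be reproduced with a series that has no policy effect at all. In this sketch (made-up, noise-free numbers) the decline accelerates quadratically and nothing happens in 1989, yet a straight line fit on the pre-period still reports an "effect".

```r
# Made-up series with NO policy effect but an accelerating decline.
year <- 1970:2000
toy  <- data.frame(year = year, y = 120 - 0.1 * (year - 1970)^2)

pre  <- toy[toy$year <= 1988, ]
post <- toy[toy$year >  1988, ]

# Linear ITS: fit on the pre-period, extrapolate, average the gap.
fit          <- lm(y ~ year, data = pre)
spurious_att <- mean(post$y - predict(fit, newdata = post))
spurious_att
```

The result is roughly -22 packs even though the true effect is zero by construction: the line cannot bend, so the continued acceleration is misread as a treatment effect. A saturating decline would produce the opposite sign.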

2.4 ARIMA-based ITS

The idea. Replace the straight line with a flexible time-series model. Let the data decide the model’s complexity through an information criterion (AICc). Forecast forward as the counterfactual.

The equation. A general ARIMA$(p, d, q)$ model writes the $d$-th differenced series as an autoregressive-moving-average process. Using the lag operator $L$ (so $L\, Y_t = Y_{t-1}$):

\[\Phi(L)\, (1 - L)^d\, Y_{1t} \, = \, \Theta(L)\, \varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, \sigma^2),\]

where $\Phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p$ collects the $p$ autoregressive coefficients and $\Theta(L) = 1 + \theta_1 L + \cdots + \theta_q L^q$ collects the $q$ moving-average coefficients. The fable::ARIMA(..., ic = "aicc") call automates the choice of $(p, d, q)$: by default it picks the differencing order $d$ with unit-root tests and then searches over $p$ and $q$ for the combination that minimises the corrected Akaike Information Criterion on the pre-period.
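For reference, the corrected criterion adds a small-sample penalty to the ordinary AIC. In the standard textbook form below, $\hat L$ is the maximised likelihood, $k$ the number of estimated parameters and $n$ the number of observations entering the likelihood; exactly how $k$ and $n$ are counted differs between implementations, so the counts used here are an assumption rather than a statement about fable's internals.

```latex
\[
\text{AIC} = -2 \log \hat L + 2k,
\qquad
\text{AICc} = \text{AIC} + \frac{2k(k+1)}{n - k - 1}.
\]
```

With a pre-period this short the correction matters. Taking $n = 17$ as an illustrative count (19 observations less two rounds of differencing), a model with $k = 2$ pays an extra $12/14 \approx 0.86$, while one with $k = 4$ pays $40/12 \approx 3.33$.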

Once the model is fit, the post-period counterfactual is the model’s $h$-step forecast and the ATT is the average gap, just as in the growth-curve version:

\[\widehat{Y_{1t}(0)} = \hat Y_{1t \mid t^*}, \qquad \widehat{\text{ATT}}_{\text{ITS-ARIMA}} = \frac{1}{T_{\text{post}}} \sum_{t > t^*} \Big[Y_{1t} - \hat Y_{1t \mid t^*}\Big].\]

In words: same recipe as the growth-curve version — fit on pre-period, project forward, average the gap — but the “fit on pre-period” step now uses an autoregressive-integrated-moving-average model instead of a straight line.

What ARIMA$(p, d, q)$ means in plain English. $p$ is the number of past values the model uses (autoregression). $d$ is the number of times the series is differenced before fitting (to handle trends). $q$ is the number of past forecast errors used (moving average). Lower AICc = “better fit traded off against complexity”.
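The three orders can be seen at work with base R alone. The sketch below swaps in `stats::arima()` with a fixed order of (1, 2, 0) in place of the automatic search in fable, and runs on a simulated pre-period (made-up numbers); it illustrates what differencing and forecasting do and is not a reproduction of the chapter's fit.

```r
# Simulated pre-period (made-up numbers): a decline that steepens over time.
set.seed(7)
s   <- 0:18
pre <- ts(125 - 1.0 * s - 0.08 * s^2 + rnorm(19, sd = 1.5), start = 1970)

# d = 2: second differences remove both the level and the slope.
d2 <- diff(pre, differences = 2)
length(d2)   # each round of differencing costs one observation

# Fixed-order ARIMA(1, 2, 0) via base R; no automatic order search here.
fit <- arima(pre, order = c(1, 2, 0), method = "ML")

# The 12-step-ahead forecast plays the role of the post-period counterfactual.
cf <- predict(fit, n.ahead = 12)$pred
round(cf, 1)
```

The forecast is an annual `ts` object starting in 1989, so it lines up year by year with a 1989–2000 post-period.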

Code

# Fit an ARIMA model on the 1970-1988 California series. ic = "aicc"
# tells fable to rank candidate (p, q) orders by AICc; by default the
# differencing order d is chosen by unit-root tests.
fit_arima <- prop99_ts |>
  filter(prepost == "Pre") |>
  model(timeseries = ARIMA(cigsale, ic = "aicc"))

report(fit_arima)

Series: cigsale 
Model: NULL model 
NULL model

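The report above prints a NULL model, which is what fable returns (with a warning) when estimation fails instead of stopping with an error; rerun the chunk with warnings visible to see the cause before trusting anything downstream. When the fit succeeds, AICc typically selects ARIMA(1, 2, 0) for this series: one autoregressive lag and two rounds of differencing. Because the model is fit to second differences, it describes how the year-to-year change in sales is itself changing, so it tracks the acceleration of California’s late-1980s drop, not just its level or slope. We then forecast 12 years out and average the gap.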

Code

# Project the fitted ARIMA 12 years forward as the post-period counterfactual.
fcasts <- forecast(fit_arima, h = "12 years")

# ATT estimate = average per-year gap between observed and ARIMA forecast.
ce_arima <- post_df$cigsale - fcasts$.mean
mean(ce_arima)

[1] NA
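An NA here means the counterfactual itself is missing, and wrapping the mean in `na.rm = TRUE` would only hide that. One defensive option is to refuse to compute the estimate at all when any forecast is missing. The helper below, `att_from_gap()`, is hypothetical and not part of the chapter's code; the numbers are made up.

```r
# Hypothetical helper (not part of the chapter's code): average the gap only
# when every counterfactual value is available.
att_from_gap <- function(observed, counterfactual) {
  stopifnot(length(observed) == length(counterfactual))
  if (anyNA(counterfactual)) {
    stop("Counterfactual contains NA; check that the model fit succeeded.")
  }
  mean(observed - counterfactual)
}

# Made-up numbers for illustration.
obs <- c(80, 75, 70)
ok  <- att_from_gap(obs, c(82, 79, 76))               # gaps of -2, -4, -6
bad <- try(att_from_gap(obs, c(NA, NA, NA)), silent = TRUE)

ok
inherits(bad, "try-error")
```

The first call returns -4; the second raises an error instead of quietly returning NA, which makes a failed fit impossible to overlook.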

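Plotting the forecast against the observed post-period series shows where the model goes wrong: the dashed ARIMA counterfactual dives below the observed series almost immediately, so the per-year gaps (observed minus forecast) are mostly positive. A mean of NA, as printed above, means the forecasts are missing because the fit failed; the figures discussed in the rest of this section come from a successful fit.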

Code

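# Pair each post-period year with its ARIMA forecast, then join onto the full
# observed series; pre-period rows get NA for the counterfactual.
arima_cf <- tibble(year = post_df$year, counterfactual = fcasts$.mean)
plot_df  <- prop99_ts |>
  as_tibble() |>
  select(year, cigsale) |>
  left_join(arima_cf, by = "year")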

ggplot(plot_df, aes(x = year)) +
  geom_line(aes(y = cigsale, color = "Observed"), linewidth = 1.1) +
  geom_line(aes(y = counterfactual, color = "ARIMA counterfactual"),
            linetype = "dashed", linewidth = 1, na.rm = TRUE) +
  geom_vline(xintercept = 1988.5, color = "#d97757",
             linetype = "dotted", linewidth = 0.7) +
  scale_color_manual(values = c("Observed" = "#d97757",
                                "ARIMA counterfactual" = "#6a9bcc")) +
  labs(x = "Year", y = "Cigarette sales (packs per capita)",
       color = NULL) +
  theme_minimal()

Figure 2.2: ITS counterfactual from an AICc-selected ARIMA model fit on 1970–1988, forecast to 2000.

Reading the output. The ARIMA-based ITS estimate comes out around $+4.5$ packs. That is positive — it would imply Proposition 99 increased California’s smoking. That is plainly the wrong answer.

The visual diagnostic shows why. The dashed counterfactual sits below the observed series across the post-period. The model extrapolates the late-1980s downward acceleration too aggressively: it predicts California should have hit roughly 50 packs by 2000 if the pre-period momentum had continued. Since California actually reached about 60 packs, the observed series ends up above the forecast, and averaged over 1989–2000 the model concludes Proposition 99 “raised” smoking by about 4.5 packs relative to that doomsday counterfactual.

The pitfall in one sentence. AICc rewards in-sample fit (net of a complexity penalty), but in-sample fit can come from features, here second-order momentum, that do not persist out-of-sample.

Common pitfall. Trusting an information-criterion-selected model on a short pre-period. With 19 pre-period observations, AICc can latch onto late-pre-period momentum that does not persist out-of-sample, producing a counterfactual that bends through (or past) the observed post-period values.
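One partial check that stays within the pre-period is a backtest: hold out the last few pre-period years, fit on the rest, and compare forecast errors. The sketch below does this on a simulated series (made-up numbers, not the California data) with a straight line and a fixed-order ARIMA(1, 2, 0) from `stats::arima()`. It is an illustration that is not taken from the chapter's workflow, and a good backtest still cannot guarantee that pre-period dynamics would have continued after 1989.

```r
# Simulated pre-period (made-up numbers, not the California series).
set.seed(3)
s   <- 0:18
pre <- ts(125 - 1.8 * s + rnorm(19, sd = 3), start = 1970)

train <- window(pre, end = 1982)     # 13 observations for fitting
test  <- window(pre, start = 1983)   # 6 held-out pre-period years
h     <- length(test)

# Candidate 1: straight line in time.
df_train <- data.frame(year = as.numeric(time(train)), y = as.numeric(train))
lin_fit  <- lm(y ~ year, data = df_train)
lin_fc   <- predict(lin_fit,
                    newdata = data.frame(year = as.numeric(time(test))))

# Candidate 2: fixed-order ARIMA(1, 2, 0).
ar_fit <- arima(train, order = c(1, 2, 0), method = "ML")
ar_fc  <- as.numeric(predict(ar_fit, n.ahead = h)$pred)

# Out-of-sample root mean squared error on the held-out years.
rmse <- function(e) sqrt(mean(e^2))
holdout_rmse <- c(linear = rmse(as.numeric(test) - lin_fc),
                  arima  = rmse(as.numeric(test) - ar_fc))
holdout_rmse
```

Comparing hold-out errors puts both candidates on the footing that matters for a counterfactual, forecasting years the model has not seen, rather than on how closely they fit the years they were trained on.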

2.5 What the two ITS estimates tell us

The disagreement. Same data, same recipe, two answers more than 30 packs apart: the linear growth-curve variant gives an ATT of about $-28.3$ packs per capita per year — Proposition 99 “worked” — while the AICc-selected ARIMA(1,2,0) gives about $+4.5$ — Proposition 99 “backfired”. Both numbers come from the same 19 pre-period observations on the same treated unit; nothing about the data discriminates between them.

The growth-curve and ARIMA variants share every step of the recipe (fit on pre-period, project forward, average the gap) but disagree because they extrapolate different features of that pre-period. The growth curve extrapolates the average slope of the whole pre-period; ARIMA(1,2,0) extrapolates the late-1980s momentum, the acceleration in the decline. There is no purely within-California way to decide which is right, because the choice rests on an assumption about the missing $Y_{1t}(0)$ that the data themselves cannot verify.
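Writing out the two forecast functions makes the contrast concrete. This is a sketch under stated assumptions: the ARIMA(1,2,0) has no constant term, its estimated coefficient satisfies $|\hat\phi_1| < 1$, and $\Delta Y_{1t^*} = Y_{1t^*} - Y_{1,t^*-1}$ and $\Delta^2 Y_{1t^*} = \Delta Y_{1t^*} - \Delta Y_{1,t^*-1}$ are shorthand introduced here for the last pre-period change and the last change in that change.

```latex
\[
\hat Y^{\text{growth}}_{1,t^*+h} = \hat\alpha + \hat\beta\,(t^* + h),
\qquad
\hat Y^{\text{ARIMA}}_{1,t^*+h \mid t^*}
  = Y_{1t^*} + h\,\Delta Y_{1t^*}
    + \Delta^2 Y_{1t^*} \sum_{k=1}^{h} \sum_{j=1}^{k} \hat\phi_1^{\,j}.
\]
\[
\text{Long-run slope per year:}\qquad
\hat\beta \;\;\text{(growth curve)}
\qquad\text{versus}\qquad
\Delta Y_{1t^*} + \frac{\hat\phi_1}{1 - \hat\phi_1}\,\Delta^2 Y_{1t^*} \;\;\text{(ARIMA)}.
\]
```

The growth-curve slope averages over all 19 pre-period years, whereas the ARIMA slope is anchored to the final one or two year-to-year changes. If the last pre-period years fell unusually fast, that steepness is carried through the whole forecast horizon.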

Where this leaves us. The lesson is not “ARIMA is bad”. The lesson is that single-model ITS is fragile: any within-unit method, including the regression-discontinuity-in-time variant of chapter 3, rests on an assumption about the missing counterfactual that the data alone cannot verify. The remaining methods in the book each handle this fragility by borrowing strength from outside California: chapter 4 (Differences-in-Differences) uses the other 38 states as a common-trend control; chapter 5 (Synthetic Control) builds a weighted donor pool tailored to California’s pre-period; chapter 6 (Bayesian Structural Time Series) combines both ideas inside a forecasting model with explicit uncertainty bands. Always pair an ITS estimate with at least one of these before drawing conclusions.

2.6 Further reading

Bernal et al. (2017) — practitioner’s tutorial on ITS regression for public-health interventions.
Hyndman & Athanasopoulos (2021) — the canonical reference for the fpp3 ecosystem and AICc-selected ARIMA modelling.
