Python: Panel Data with Multiple Time Periods#

This example provides a detailed guide on Difference-in-Differences with multiple time periods using the DoubleML package. The implementation is based on Callaway and Sant’Anna (2021).

The notebook requires the following packages:

[1]:
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

from lightgbm import LGBMRegressor, LGBMClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression

from doubleml.did import DoubleMLDIDMulti
from doubleml.data import DoubleMLPanelData

from doubleml.did.datasets import make_did_CS2021

Data#

We will rely on the make_did_CS2021 DGP, which is inspired by Callaway and Sant’Anna (2021) (Appendix SC) and Sant’Anna and Zhao (2020).

We will observe n_obs units over n_periods. Note that the dataframe includes observations of the potential outcomes y0 and y1, such that we can use oracle estimates as comparisons.

[2]:
n_obs = 5000
n_periods = 6

df = make_did_CS2021(n_obs, dgp_type=4, n_periods=n_periods, n_pre_treat_periods=3, time_type="datetime")
df["ite"] = df["y1"] - df["y0"]

print(df.shape)
df.head()
(30000, 11)
[2]:
id y y0 y1 d t Z1 Z2 Z3 Z4 ite
0 0 217.920494 217.920494 218.574270 2025-04-01 2025-01-01 0.708359 0.632899 0.073737 1.361102 0.653776
1 0 227.112458 227.112458 227.885147 2025-04-01 2025-02-01 0.708359 0.632899 0.073737 1.361102 0.772688
2 0 237.970499 237.970499 236.438777 2025-04-01 2025-03-01 0.708359 0.632899 0.073737 1.361102 -1.531722
3 0 246.622811 247.549557 246.622811 2025-04-01 2025-04-01 0.708359 0.632899 0.073737 1.361102 -0.926746
4 0 256.799129 256.482250 256.799129 2025-04-01 2025-05-01 0.708359 0.632899 0.073737 1.361102 0.316880

Data Details#

Here, we slightly abuse the definition of the potential outcomes. \(Y_{i,t}(1)\) corresponds to the (potential) outcome if unit \(i\) had received treatment at time period \(\mathrm{g}\) (where the group \(\mathrm{g}\) is drawn with probabilities based on \(Z\)).

More specifically

\[\begin{split}\begin{align*} Y_{i,t}(0)&:= f_t(Z) + \delta_t + \eta_i + \epsilon_{i,t,0}\\ Y_{i,t}(1)&:= Y_{i,t}(0) + \theta_{i,t,\mathrm{g}} + \epsilon_{i,t,1} - \epsilon_{i,t,0} \end{align*}\end{split}\]

where

  • \(f_t(Z)\) depends on pre-treatment observable covariates \(Z_1,\dots, Z_4\) and time \(t\)

  • \(\delta_t\) is a time fixed effect

  • \(\eta_i\) is a unit fixed effect

  • \(\epsilon_{i,t,\cdot}\) are time-varying unobservables (i.i.d. \(N(0,1)\))

  • \(\theta_{i,t,\mathrm{g}}\) corresponds to the exposure effect of unit \(i\) based on group \(\mathrm{g}\) at time \(t\)

For the pre-treatment periods the exposure effect is set to

\[\theta_{i,t,\mathrm{g}}:= 0 \text{ for } t<\mathrm{g}\]

such that

\[\mathbb{E}[Y_{i,t}(1) - Y_{i,t}(0)] = \mathbb{E}[\epsilon_{i,t,1} - \epsilon_{i,t,0}]=0 \text{ for } t<\mathrm{g}\]
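The structure of the potential outcomes can be illustrated with a small simulation. The sketch below is not the make_did_CS2021 implementation: the form of \(f_t(Z)\), the fixed effects, the post-treatment exposure effect, and the helper name simulate_unit are hypothetical choices made only for illustration. It checks that the average of \(Y_{i,t}(1) - Y_{i,t}(0)\) is close to zero in a pre-treatment period.

```python
import random


def simulate_unit(rng, t, g):
    """Draw (Y0, Y1) for one unit at period t with first treatment period g.

    Hypothetical stand-ins (not the DoubleML DGP):
    f_t(Z) = t * Z, delta_t = t, eta_i ~ N(0, 1).
    """
    z = rng.gauss(0.0, 1.0)     # a single pre-treatment covariate
    eta = rng.gauss(0.0, 1.0)   # unit fixed effect
    eps0 = rng.gauss(0.0, 1.0)  # time-varying unobservable, untreated state
    eps1 = rng.gauss(0.0, 1.0)  # time-varying unobservable, treated state
    # assumed exposure effect: zero before g, growing by one per period afterwards
    theta = float(t - g + 1) if t >= g else 0.0
    y0 = t * z + t + eta + eps0
    y1 = y0 + theta + eps1 - eps0
    return y0, y1


rng = random.Random(42)
n = 20000
pre = [simulate_unit(rng, t=2, g=4) for _ in range(n)]
post = [simulate_unit(rng, t=5, g=4) for _ in range(n)]

mean_ite_pre = sum(y1 - y0 for y0, y1 in pre) / n
mean_ite_post = sum(y1 - y0 for y0, y1 in post) / n
print(round(mean_ite_pre, 2), round(mean_ite_post, 2))
```

In the pre-treatment period the individual differences are pure noise, \(\epsilon_{i,t,1} - \epsilon_{i,t,0}\), so only their average is close to zero.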

The DoubleML Coverage Repository includes coverage simulations based on this DGP.

Data Description#

The data is a balanced panel where each unit is observed over n_periods starting January 2025.

[3]:
df.groupby("t").size()
[3]:
t
2025-01-01    5000
2025-02-01    5000
2025-03-01    5000
2025-04-01    5000
2025-05-01    5000
2025-06-01    5000
dtype: int64

The treatment column d indicates the first treatment period of the corresponding unit, whereas NaT units are never treated.

Generally, never treated units should take on either the value np.inf or pd.NaT, depending on the data type (float or datetime).
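As a minimal sketch of the two encodings (the series names d_datetime and d_float are made up for this illustration), never treated units can be identified as follows:

```python
import numpy as np
import pandas as pd

# datetime-valued group column: never treated units are encoded as pd.NaT
d_datetime = pd.Series(pd.to_datetime(["2025-04-01", "2025-05-01", None]))
never_treated_dt = d_datetime.isna()

# float-valued group column: never treated units are encoded as np.inf
d_float = pd.Series([4.0, 5.0, np.inf])
never_treated_float = np.isinf(d_float)

print(never_treated_dt.tolist(), never_treated_float.tolist())
```

Note that groupby drops missing keys by default, which is why the counts per group are computed with dropna=False.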

The individual units are roughly uniformly divided between the groups, where treatment assignment depends on the pre-treatment covariates Z1 to Z4.

[4]:
df.groupby("d", dropna=False).size()
[4]:
d
2025-04-01    7644
2025-05-01    7164
2025-06-01    7254
NaT           7938
dtype: int64

To get a better understanding of the underlying data and true effects, we will compare the unconditional averages and the true effects based on the oracle values of individual effects ite.

[6]:
# rename for plotting
df["First Treated"] = df["d"].dt.strftime("%Y-%m").fillna("Never Treated")

# Create aggregation dictionary for means
def agg_dict(col_name):
    return {
        f'{col_name}_mean': (col_name, 'mean'),
        f'{col_name}_lower_quantile': (col_name, lambda x: x.quantile(0.05)),
        f'{col_name}_upper_quantile': (col_name, lambda x: x.quantile(0.95))
    }

# Calculate means and 5%/95% quantiles
agg_dictionary = agg_dict("y") | agg_dict("ite")

agg_df = df.groupby(["t", "First Treated"]).agg(**agg_dictionary).reset_index()
agg_df.head()
[6]:
t First Treated y_mean y_lower_quantile y_upper_quantile ite_mean ite_lower_quantile ite_upper_quantile
0 2025-01-01 2025-04 208.630676 198.519642 218.742311 0.018269 -2.200940 2.322330
1 2025-01-01 2025-05 210.506376 200.611306 220.056222 0.001277 -2.266340 2.397205
2 2025-01-01 2025-06 212.547224 202.960971 222.254827 0.010360 -2.336081 2.337262
3 2025-01-01 Never Treated 214.549188 204.438485 224.290850 -0.035466 -2.421229 2.297681
4 2025-02-01 2025-04 208.332856 188.245139 227.873347 0.035797 -2.262409 2.368012
[7]:
def plot_data(df, col_name='y'):
    """
    Create an improved plot with colorblind-friendly features

    Parameters:
    -----------
    df : DataFrame
        The dataframe containing the data
    col_name : str, default='y'
        Column name to plot (will use '{col_name}_mean')
    """
    plt.figure(figsize=(12, 7))
    n_colors = df["First Treated"].nunique()
    color_palette = sns.color_palette("colorblind", n_colors=n_colors)

    sns.lineplot(
        data=df,
        x='t',
        y=f'{col_name}_mean',
        hue='First Treated',
        style='First Treated',
        palette=color_palette,
        markers=True,
        dashes=True,
        linewidth=2.5,
        alpha=0.8
    )

    plt.title(f'Average Values {col_name} by Group Over Time', fontsize=16)
    plt.xlabel('Time', fontsize=14)
    plt.ylabel(f'Average Value {col_name}', fontsize=14)


    plt.legend(title='First Treated', title_fontsize=13, fontsize=12,
               frameon=True, framealpha=0.9, loc='best')

    plt.grid(alpha=0.3, linestyle='-')
    plt.tight_layout()

    plt.show()

So let us take a look at the average values over time.

[8]:
plot_data(agg_df, col_name='y')
../../_images/examples_did_py_panel_16_0.png

Instead, the true average treatment effects can be obtained by averaging the (usually unobserved) ite values.

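The true effect depends only on the exposure time \(e = t - \mathrm{g}\) (in months). It is zero before treatment and equals the number of treated months, including the first one:

\[ATT(\mathrm{g}, t) = \max(t - \mathrm{g} + 1, 0) = \max(e + 1, 0)\]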
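The oracle effects can be tabulated with a few lines of code. This is a sketch under two assumptions: the month numbers 1 to 6 stand in for 2025-01 to 2025-06, and the effect is zero before treatment, one in the first treated month, and grows by one per month afterwards (which agrees with the oracle averages of roughly 1, 2, and 3 reported in the event study section). The helper true_att is hypothetical.

```python
def true_att(g, t):
    """Oracle ATT for first-treated month g at month t (both as month numbers)."""
    return max(t - g + 1, 0)


months = range(1, 7)
for g in (4, 5, 6):
    # one row per group: effect in each calendar month
    print(g, [true_att(g, t) for t in months])
```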
[9]:
plot_data(agg_df, col_name='ite')
../../_images/examples_did_py_panel_18_0.png

DoubleMLPanelData#

Finally, we can construct our DoubleMLPanelData, specifying

  • y_col : the outcome

  • d_cols: the group variable indicating the first treated period for each unit

  • id_col: the unique identification column for each unit

  • t_col : the time column

  • x_cols: the additional pre-treatment controls

  • datetime_unit: unit required for datetime columns and plotting

[10]:
dml_data = DoubleMLPanelData(
    data=df,
    y_col="y",
    d_cols="d",
    id_col="id",
    t_col="t",
    x_cols=["Z1", "Z2", "Z3", "Z4"],
    datetime_unit="M"
)
print(dml_data)
================== DoubleMLPanelData Object ==================

------------------ Data summary      ------------------
Outcome variable: y
Treatment variable(s): ['d']
Covariates: ['Z1', 'Z2', 'Z3', 'Z4']
Instrument variable(s): None
Time variable: t
Id variable: id
No. Observations: 5000

------------------ DataFrame info    ------------------
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 30000 entries, 0 to 29999
Columns: 12 entries, id to First Treated
dtypes: datetime64[s](2), float64(8), int64(1), object(1)
memory usage: 2.7+ MB

ATT Estimation#

The DoubleML package implements estimation of group-time average treatment effects via the DoubleMLDIDMulti class (see model documentation).

Basics#

The class basically behaves like other DoubleML classes and requires the specification of two learners (for more details on the regression elements, see score documentation).

The basic arguments of a DoubleMLDIDMulti object include

  • ml_g: the “outcome” regression learner

  • ml_m: the propensity score learner

  • control_group: the control group for the parallel trend assumption

  • gt_combinations: combinations of \((\mathrm{g},t_\text{pre}, t_\text{eval})\)

  • anticipation_periods: number of anticipation periods

We will construct a dict with “default” arguments.

[11]:
default_args = {
    "ml_g": LGBMRegressor(n_estimators=500, learning_rate=0.01, verbose=-1, random_state=123),
    "ml_m": LGBMClassifier(n_estimators=500, learning_rate=0.01, verbose=-1, random_state=123),
    "control_group": "never_treated",
    "gt_combinations": "standard",
    "anticipation_periods": 0,
    "n_folds": 5,
    "n_rep": 1,
}

The model will be estimated using the fit() method.

[12]:
dml_obj = DoubleMLDIDMulti(dml_data, **default_args)
dml_obj.fit()
print(dml_obj)
================== DoubleMLDIDMulti Object ==================

------------------ Data summary      ------------------
Outcome variable: y
Treatment variable(s): ['d']
Covariates: ['Z1', 'Z2', 'Z3', 'Z4']
Instrument variable(s): None
Time variable: t
Id variable: id
No. Observations: 5000

------------------ Score & algorithm ------------------
Score function: observational
Control group: never_treated
Anticipation periods: 0

------------------ Machine learner   ------------------
Learner ml_g: LGBMRegressor(learning_rate=0.01, n_estimators=500, random_state=123,
              verbose=-1)
Learner ml_m: LGBMClassifier(learning_rate=0.01, n_estimators=500, random_state=123,
               verbose=-1)
Out-of-sample Performance:
Regression:
Learner ml_g0 RMSE: [[1.92455736 1.85401828 1.95508181 2.75158606 3.80009835 1.91238036
  1.89574244 1.9777628  1.8789068  2.70666402 1.9356523  1.87256876
  1.96315271 1.83833416 1.8924527 ]]
Learner ml_g1 RMSE: [[1.87771116 1.85874947 1.87171844 2.81819889 3.94244057 1.88122156
  2.02220894 1.99580893 1.9526094  2.94290198 2.08010132 1.94315077
  1.99049812 2.05746343 2.06708145]]
Classification:
Learner ml_m Log Loss: [[0.65740721 0.66052073 0.667453   0.66634434 0.659713   0.70350879
  0.69087724 0.69517372 0.69727928 0.69557425 0.72618203 0.72833368
  0.72868196 0.73962008 0.73550892]]

------------------ Resampling        ------------------
No. folds: 5
No. repeated sample splits: 1

------------------ Fit summary       ------------------
                                  coef   std err          t     P>|t|  \
ATT(2025-04,2025-01,2025-02) -0.012948  0.121145  -0.106884  0.914881
ATT(2025-04,2025-02,2025-03) -0.093126  0.107265  -0.868189  0.385291
ATT(2025-04,2025-03,2025-04)  0.942551  0.098459   9.573053  0.000000
ATT(2025-04,2025-03,2025-05)  1.788025  0.161759  11.053639  0.000000
ATT(2025-04,2025-03,2025-06)  2.565917  0.216033  11.877421  0.000000
ATT(2025-05,2025-01,2025-02)  0.065768  0.089355   0.736036  0.461709
ATT(2025-05,2025-02,2025-03) -0.079422  0.098252  -0.808352  0.418888
ATT(2025-05,2025-03,2025-04) -0.086222  0.104292  -0.826736  0.408387
ATT(2025-05,2025-04,2025-05)  0.944255  0.090784  10.401119  0.000000
ATT(2025-05,2025-04,2025-06)  1.938790  0.149031  13.009341  0.000000
ATT(2025-06,2025-01,2025-02)  0.007658  0.089526   0.085538  0.931833
ATT(2025-06,2025-02,2025-03) -0.035132  0.091838  -0.382542  0.702059
ATT(2025-06,2025-03,2025-04) -0.080153  0.137984  -0.580885  0.561318
ATT(2025-06,2025-04,2025-05)  0.041015  0.091744   0.447064  0.654829
ATT(2025-06,2025-05,2025-06)  0.962464  0.093322  10.313411  0.000000

                                 2.5 %    97.5 %
ATT(2025-04,2025-01,2025-02) -0.250387  0.224491
ATT(2025-04,2025-02,2025-03) -0.303362  0.117110
ATT(2025-04,2025-03,2025-04)  0.749575  1.135527
ATT(2025-04,2025-03,2025-05)  1.470983  2.105067
ATT(2025-04,2025-03,2025-06)  2.142500  2.989334
ATT(2025-05,2025-01,2025-02) -0.109364  0.240901
ATT(2025-05,2025-02,2025-03) -0.271993  0.113148
ATT(2025-05,2025-03,2025-04) -0.290630  0.118186
ATT(2025-05,2025-04,2025-05)  0.766321  1.122188
ATT(2025-05,2025-04,2025-06)  1.646695  2.230884
ATT(2025-06,2025-01,2025-02) -0.167810  0.183126
ATT(2025-06,2025-02,2025-03) -0.215132  0.144868
ATT(2025-06,2025-03,2025-04) -0.350597  0.190291
ATT(2025-06,2025-04,2025-05) -0.138799  0.220830
ATT(2025-06,2025-05,2025-06)  0.779557  1.145370

The summary displays estimates of the \(ATT(g,t_\text{eval})\) effects for different combinations of \((g,t_\text{eval})\) via \(\widehat{ATT}(\mathrm{g},t_\text{pre},t_\text{eval})\), where

  • \(\mathrm{g}\) specifies the group

  • \(t_\text{pre}\) specifies the corresponding pre-treatment period

  • \(t_\text{eval}\) specifies the evaluation period

The choice gt_combinations="standard" used here estimates all possible combinations of \(ATT(g,t_\text{eval})\) via \(\widehat{ATT}(\mathrm{g},t_\text{pre},t_\text{eval})\), where the standard choice is \(t_\text{pre} = \min(\mathrm{g}, t_\text{eval}) - 1\) (without anticipation).

Note that this includes pre-test effects if \(\mathrm{g} > t_{eval}\), e.g. \(\widehat{ATT}(g=\text{2025-04}, t_{\text{pre}}=\text{2025-01}, t_{\text{eval}}=\text{2025-02})\), which estimates the pre-trend from January to February even though the actual treatment occurred in April.
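To make the rule \(t_\text{pre} = \min(\mathrm{g}, t_\text{eval}) - 1\) concrete, the following sketch enumerates the resulting combinations for the three treated groups and six periods of this example, using month numbers instead of dates. The helper standard_gt_combinations is hypothetical and only mirrors the rule stated above; it is not the DoubleML implementation.

```python
def standard_gt_combinations(groups, periods):
    """Enumerate (g, t_pre, t_eval) with t_pre = min(g, t_eval) - 1."""
    combos = []
    for g in groups:
        for t_eval in periods:
            t_pre = min(g, t_eval) - 1
            # skip combinations whose pre-period lies before the first observed period
            if t_pre >= periods[0]:
                combos.append((g, t_pre, t_eval))
    return combos


combos = standard_gt_combinations(groups=[4, 5, 6], periods=[1, 2, 3, 4, 5, 6])
for combo in combos:
    print(combo)
```

The enumeration yields the same 15 combinations as the rows of the fit summary, e.g. (4, 3, 6) for ATT(2025-04,2025-03,2025-06).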

As usual for the DoubleML-package, you can obtain joint confidence intervals via bootstrap.

[13]:
level = 0.95

ci = dml_obj.confint(level=level)
dml_obj.bootstrap(n_rep_boot=5000)
ci_joint = dml_obj.confint(level=level, joint=True)
ci_joint
[13]:
2.5 % 97.5 %
ATT(2025-04,2025-01,2025-02) -0.361836 0.335939
ATT(2025-04,2025-02,2025-03) -0.402042 0.215789
ATT(2025-04,2025-03,2025-04) 0.658997 1.226105
ATT(2025-04,2025-03,2025-05) 1.322171 2.253878
ATT(2025-04,2025-03,2025-06) 1.943758 3.188076
ATT(2025-05,2025-01,2025-02) -0.191567 0.323104
ATT(2025-05,2025-02,2025-03) -0.362381 0.203536
ATT(2025-05,2025-03,2025-04) -0.386575 0.214131
ATT(2025-05,2025-04,2025-05) 0.682804 1.205706
ATT(2025-05,2025-04,2025-06) 1.509593 2.367987
ATT(2025-06,2025-01,2025-02) -0.250171 0.265487
ATT(2025-06,2025-02,2025-03) -0.299620 0.229356
ATT(2025-06,2025-03,2025-04) -0.477538 0.317231
ATT(2025-06,2025-04,2025-05) -0.223200 0.305231
ATT(2025-06,2025-05,2025-06) 0.693704 1.231223
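The difference between pointwise and joint intervals can be read off the printed numbers. The sketch below backs out the critical value implied by each interval for the first row, assuming both intervals are symmetric around the estimate (coef ± c · std err). The helper implied_critical_value is hypothetical, and the joint value depends on the bootstrap draws, so it will vary slightly between runs.

```python
# values copied from the outputs above for ATT(2025-04,2025-01,2025-02)
coef, se = -0.012948, 0.121145
pointwise = (-0.250387, 0.224491)
joint = (-0.361836, 0.335939)


def implied_critical_value(ci, std_err):
    """Half-width of a symmetric interval divided by the standard error."""
    return (ci[1] - ci[0]) / (2 * std_err)


c_pointwise = implied_critical_value(pointwise, se)
c_joint = implied_critical_value(joint, se)
midpoint_joint = (joint[0] + joint[1]) / 2  # should reproduce the estimate

print(round(c_pointwise, 3), round(c_joint, 3), round(midpoint_joint, 4))
```

The pointwise interval uses the usual normal quantile of about 1.96, while the joint interval uses a larger critical value to cover all 15 effects simultaneously.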

A visualization of the effects can be obtained via the plot_effects() method.

Note that the plot uses joint confidence intervals by default.

[14]:
dml_obj.plot_effects()
/opt/hostedtoolcache/Python/3.12.10/x64/lib/python3.12/site-packages/matplotlib/cbook.py:1719: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
  return math.isfinite(val)
[14]:
(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 2025-04'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-05'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-06'}, xlabel='Evaluation Period', ylabel='Effect'>])
../../_images/examples_did_py_panel_30_2.png

Sensitivity Analysis#

As described in the Sensitivity Guide, robustness checks on omitted confounding/parallel trend violations are available via the standard sensitivity_analysis() method.

[15]:
dml_obj.sensitivity_analysis()
print(dml_obj.sensitivity_summary)
================== Sensitivity Analysis ==================

------------------ Scenario          ------------------
Significance Level: level=0.95
Sensitivity parameters: cf_y=0.03; cf_d=0.03, rho=1.0

------------------ Bounds with CI    ------------------
                              CI lower  theta lower     theta  theta upper  \
ATT(2025-04,2025-01,2025-02) -0.283134    -0.106370 -0.012948     0.080473
ATT(2025-04,2025-02,2025-03) -0.381768    -0.197853 -0.093126     0.011600
ATT(2025-04,2025-03,2025-04)  0.663394     0.823763  0.942551     1.061339
ATT(2025-04,2025-03,2025-05)  1.378974     1.641539  1.788025     1.934511
ATT(2025-04,2025-03,2025-06)  1.972901     2.327642  2.565917     2.804191
ATT(2025-05,2025-01,2025-02) -0.191655    -0.045184  0.065768     0.176721
ATT(2025-05,2025-02,2025-03) -0.349760    -0.187553 -0.079422     0.028709
ATT(2025-05,2025-03,2025-04) -0.373141    -0.201937 -0.086222     0.029493
ATT(2025-05,2025-04,2025-05)  0.688804     0.837518  0.944255     1.050991
ATT(2025-05,2025-04,2025-06)  1.542597     1.784421  1.938790     2.093158
ATT(2025-06,2025-01,2025-02) -0.251260    -0.104641  0.007658     0.119957
ATT(2025-06,2025-02,2025-03) -0.289373    -0.138307 -0.035132     0.068043
ATT(2025-06,2025-03,2025-04) -0.466166    -0.134015 -0.080153    -0.026291
ATT(2025-06,2025-04,2025-05) -0.202953    -0.054682  0.041015     0.136713
ATT(2025-06,2025-05,2025-06)  0.703213     0.856262  0.962464     1.068665

                              CI upper
ATT(2025-04,2025-01,2025-02)  0.309518
ATT(2025-04,2025-02,2025-03)  0.183691
ATT(2025-04,2025-03,2025-04)  1.226446
ATT(2025-04,2025-03,2025-05)  2.209927
ATT(2025-04,2025-03,2025-06)  3.163812
ATT(2025-05,2025-01,2025-02)  0.324760
ATT(2025-05,2025-02,2025-03)  0.190702
ATT(2025-05,2025-03,2025-04)  0.201940
ATT(2025-05,2025-04,2025-05)  1.201608
ATT(2025-05,2025-04,2025-06)  2.343418
ATT(2025-06,2025-01,2025-02)  0.268681
ATT(2025-06,2025-02,2025-03)  0.219767
ATT(2025-06,2025-03,2025-04)  0.137827
ATT(2025-06,2025-04,2025-05)  0.291311
ATT(2025-06,2025-05,2025-06)  1.223161

------------------ Robustness Values ------------------
                              H_0     RV (%)    RVa (%)
ATT(2025-04,2025-01,2025-02)  0.0   0.421132   0.000604
ATT(2025-04,2025-02,2025-03)  0.0   2.672283   0.000370
ATT(2025-04,2025-03,2025-04)  0.0  21.424585  17.754272
ATT(2025-04,2025-03,2025-05)  0.0  30.905381  24.656626
ATT(2025-04,2025-03,2025-06)  0.0  27.860489  23.520878
ATT(2025-05,2025-01,2025-02)  0.0   1.789450   0.000346
ATT(2025-05,2025-02,2025-03)  0.0   2.212517   0.000498
ATT(2025-05,2025-03,2025-04)  0.0   2.244155   0.000629
ATT(2025-05,2025-04,2025-05)  0.0  23.560017  20.007214
ATT(2025-05,2025-04,2025-06)  0.0  31.632586  27.521197
ATT(2025-06,2025-01,2025-02)  0.0   0.207350   0.000665
ATT(2025-06,2025-02,2025-03)  0.0   1.031978   0.000345
ATT(2025-06,2025-03,2025-04)  0.0   4.431327   1.989953
ATT(2025-06,2025-04,2025-05)  0.0   1.297149   0.000493
ATT(2025-06,2025-05,2025-06)  0.0  24.056658  20.444706

In this example, one can clearly distinguish the robustness of the non-zero effects from that of the pre-treatment periods.
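As a reading aid, the following sketch takes a few rows of the summary and checks whether the sensitivity interval [CI lower, CI upper] under the stated scenario (cf_y=0.03, cf_d=0.03) still excludes zero. The selection of rows and the helper excludes_zero are only illustrative; the numbers are copied from the output above.

```python
# (name, CI lower, CI upper) copied from the sensitivity summary
rows = [
    ("ATT(2025-04,2025-01,2025-02)", -0.283134, 0.309518),  # pre-treatment
    ("ATT(2025-04,2025-03,2025-04)", 0.663394, 1.226446),   # post-treatment
    ("ATT(2025-06,2025-03,2025-04)", -0.466166, 0.137827),  # pre-treatment
    ("ATT(2025-06,2025-05,2025-06)", 0.703213, 1.223161),   # post-treatment
]


def excludes_zero(lower, upper):
    """True if the interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


robust = [name for name, lower, upper in rows if excludes_zero(lower, upper)]
print(robust)
```

Only the post-treatment effects remain distinguishable from zero, in line with their much larger robustness values.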

Control Groups#

The current implementation supports the following control groups:

  • "never_treated"

  • "not_yet_treated"

Note that the "not_yet_treated" control group depends on anticipation.

For differences and recommendations, we refer to Callaway and Sant’Anna (2021).
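The dependence on anticipation can be sketched with a simplified membership rule. This is an assumption based on a simplified reading of Callaway and Sant’Anna (2021), not the DoubleML implementation: a unit counts as not yet treated at evaluation period \(t_\text{eval}\) if its first treatment period lies after \(t_\text{eval}\) plus the number of anticipation periods. The helper is_control and the use of month numbers are hypothetical.

```python
import math


def is_control(g_unit, t_eval, control_group, anticipation_periods=0):
    """Simplified control group membership for a unit first treated in g_unit.

    Never treated units are encoded with g_unit = math.inf.
    """
    if control_group == "never_treated":
        return math.isinf(g_unit)
    if control_group == "not_yet_treated":
        # also true for never treated units, since inf exceeds any finite period
        return g_unit > t_eval + anticipation_periods
    raise ValueError(f"unknown control group: {control_group}")


# a unit first treated in month 5, evaluated in month 4
print(is_control(5, 4, "not_yet_treated"))                          # no anticipation
print(is_control(5, 4, "not_yet_treated", anticipation_periods=1))  # one period
```

With one anticipation period, the unit first treated in month 5 may already react in month 4 and therefore no longer qualifies as a control for that period.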

[16]:
dml_obj_nyt = DoubleMLDIDMulti(dml_data, **(default_args | {"control_group": "not_yet_treated"}))
dml_obj_nyt.fit()
dml_obj_nyt.bootstrap(n_rep_boot=5000)
dml_obj_nyt.plot_effects()
[16]:
(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 2025-04'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-05'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-06'}, xlabel='Evaluation Period', ylabel='Effect'>])
../../_images/examples_did_py_panel_35_2.png

Linear Covariate Adjustment#

Note that we relied on boosted trees to adjust for conditional parallel trends, which allows for a nonlinear adjustment. For comparison, we could instead rely on linear learners for a linear adjustment.

Note that the DGP (dgp_type=4) is based on nonlinear conditional expectations, such that the estimates based on linear learners will be biased.

[17]:
linear_learners = {
    "ml_g": LinearRegression(),
    "ml_m": LogisticRegression(),
}

dml_obj_linear = DoubleMLDIDMulti(dml_data, **(default_args | linear_learners))
dml_obj_linear.fit()
dml_obj_linear.bootstrap(n_rep_boot=5000)
dml_obj_linear.plot_effects()
[17]:
(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 2025-04'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-05'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-06'}, xlabel='Evaluation Period', ylabel='Effect'>])
../../_images/examples_did_py_panel_37_2.png

Aggregated Effects#

As in the did R package, the \(ATT\)’s can be aggregated to summarize multiple effects. For details on the different aggregations and their interpretations, see Callaway and Sant’Anna (2021).

The aggregations are implemented via the aggregate() method.

Group Aggregation#

To obtain group-specific effects, one would like to average \(ATT(\mathrm{g}, t_\text{eval})\) over \(t_\text{eval}\). As a sample oracle, we will average all post-treatment ite’s by group \(\mathrm{g}\).

[18]:
df_post_treatment = df[df["t"] >= df["d"]]
df_post_treatment.groupby("d")["ite"].mean()
[18]:
d
2025-04-01    1.949375
2025-05-01    1.515966
2025-06-01    0.991800
Name: ite, dtype: float64

To obtain group-specific effects it is possible to aggregate several \(\widehat{ATT}(\mathrm{g},t_\text{pre},t_\text{eval})\) values based on the group \(\mathrm{g}\) by setting the aggregation="group" argument.

[19]:
aggregated_group = dml_obj.aggregate(aggregation="group")
print(aggregated_group)
_ = aggregated_group.plot_effects()
================== DoubleMLDIDAggregation Object ==================
 Group Aggregation

------------------ Overall Aggregated Effects ------------------
    coef  std err         t  P>|t|    2.5 %   97.5 %
1.396258  0.08413 16.596481    0.0 1.231366 1.561149
------------------ Aggregated Effects         ------------------
             coef   std err          t  P>|t|     2.5 %    97.5 %
2025-04  1.765498  0.142368  12.400909    0.0  1.486461  2.044534
2025-05  1.441522  0.106704  13.509604    0.0  1.232387  1.650657
2025-06  0.962464  0.093322  10.313411    0.0  0.779557  1.145370
------------------ Additional Information     ------------------
Score function: observational
Control group: never_treated
Anticipation periods: 0

/home/runner/work/doubleml-docs/doubleml-docs/doubleml-for-py/doubleml/did/did_aggregation.py:368: UserWarning: Joint confidence intervals require bootstrapping which hasn't been performed yet. Automatically applying '.aggregated_frameworks.bootstrap(method="normal", n_rep_boot=500)' with default values. For different bootstrap settings, call bootstrap() explicitly before plotting.
  warnings.warn(
../../_images/examples_did_py_panel_42_2.png

The output is a DoubleMLDIDAggregation object which includes an overall aggregation summary based on group size.
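The printed numbers can be reproduced by hand. The sketch below assumes that each group effect is the simple average of that group's post-treatment estimates and that the overall effect weights the groups by their number of units (rows per group divided by the six periods). These are assumptions checked only against the output above, not a description of the internal implementation.

```python
# post-treatment estimates per group, copied from the fit summary
post_treatment_atts = {
    "2025-04": [0.942551, 1.788025, 2.565917],
    "2025-05": [0.944255, 1.938790],
    "2025-06": [0.962464],
}
# units per group: rows per group from df.groupby("d").size(), divided by 6 periods
n_units = {"2025-04": 7644 // 6, "2025-05": 7164 // 6, "2025-06": 7254 // 6}

# simple average over the post-treatment evaluation periods of each group
group_effects = {g: sum(atts) / len(atts) for g, atts in post_treatment_atts.items()}

# overall effect: group effects weighted by group size
total_units = sum(n_units.values())
overall_effect = sum(n_units[g] / total_units * group_effects[g] for g in group_effects)

print({g: round(v, 6) for g, v in group_effects.items()}, round(overall_effect, 6))
```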

Time Aggregation#

To obtain time-specific effects, one would like to average \(ATT(\mathrm{g}, t_\text{eval})\) over \(\mathrm{g}\) (respecting group size). As sample oracle values, we average all post-treatment ite’s by time period \(t\):

[20]:
df_post_treatment.groupby("t")["ite"].mean()
[20]:
t
2025-04-01    1.016690
2025-05-01    1.484789
2025-06-01    1.988034
Name: ite, dtype: float64

Setting aggregation="time" aggregates \(\widehat{ATT}(\mathrm{g},t_\text{pre},t_\text{eval})\) based on \(t_\text{eval}\), weighted with respect to group size. This corresponds to the calendar time effects from the did R package.
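As a worked example of the group-size weighting, the sketch below computes calendar time effects for May and June 2025 from the individual estimates of the fit summary. It assumes that only groups already treated at the evaluation period enter the average and that the weights are proportional to the number of units per group (rows per group divided by the six periods). The helper calendar_time_effect is hypothetical.

```python
# units per group: rows per group from df.groupby("d").size(), divided by 6 periods
n_units = {"2025-04": 1274, "2025-05": 1194, "2025-06": 1209}

# post-treatment estimates by evaluation period and group, copied from the fit summary
atts_by_period = {
    "2025-05": {"2025-04": 1.788025, "2025-05": 0.944255},
    "2025-06": {"2025-04": 2.565917, "2025-05": 1.938790, "2025-06": 0.962464},
}


def calendar_time_effect(atts):
    """Average the estimates of all treated groups, weighted by group size."""
    total = sum(n_units[g] for g in atts)
    return sum(n_units[g] / total * att for g, att in atts.items())


effect_may = calendar_time_effect(atts_by_period["2025-05"])
effect_june = calendar_time_effect(atts_by_period["2025-06"])
print(round(effect_may, 6), round(effect_june, 6))
```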

[21]:
aggregated_time = dml_obj.aggregate("time")
print(aggregated_time)
fig, ax = aggregated_time.plot_effects()
================== DoubleMLDIDAggregation Object ==================
 Time Aggregation

------------------ Overall Aggregated Effects ------------------
    coef  std err         t  P>|t|    2.5 %   97.5 %
1.385808 0.091954 15.070635    0.0 1.205581 1.566035
------------------ Aggregated Effects         ------------------
             coef   std err          t  P>|t|     2.5 %    97.5 %
2025-04  0.942551  0.098459   9.573053    0.0  0.749575  1.135527
2025-05  1.379815  0.106729  12.928171    0.0  1.170629  1.589001
2025-06  1.835059  0.118809  15.445464    0.0  1.602198  2.067920
------------------ Additional Information     ------------------
Score function: observational
Control group: never_treated
Anticipation periods: 0

../../_images/examples_did_py_panel_47_2.png

Event Study Aggregation#

To obtain event-study-type effects, one would like to aggregate \(ATT(\mathrm{g}, t_\text{eval})\) over \(e = t_\text{eval} - \mathrm{g}\) (respecting group size). As sample oracle values, we average all ite’s by exposure time \(e\):

[22]:
df["e"] = pd.to_datetime(df["t"]).values.astype("datetime64[M]") - \
    pd.to_datetime(df["d"]).values.astype("datetime64[M]")
df.groupby("e")["ite"].mean()[1:]
[22]:
e
-122 days   -0.009192
-92 days     0.013879
-61 days     0.030282
-31 days    -0.033019
0 days       1.007054
31 days      1.972206
59 days      2.903752
Name: ite, dtype: float64
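The index above is reported in days because the month differences are stored as timedeltas. If the exposure time is needed in whole months, it can be computed from the year and month components instead. The helper exposure_in_months below is a hypothetical illustration using only the standard library.

```python
from datetime import date


def exposure_in_months(t, g):
    """Number of whole months between first treatment g and period t."""
    return (t.year - g.year) * 12 + (t.month - g.month)


g = date(2025, 4, 1)
for t in (date(2025, 1, 1), date(2025, 4, 1), date(2025, 6, 1)):
    # negative values are pre-treatment periods, zero is the first treated month
    print(t.isoformat(), exposure_in_months(t, g))
```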

Analogously, aggregation="eventstudy" aggregates \(\widehat{ATT}(\mathrm{g},t_\text{pre},t_\text{eval})\) based on exposure time \(e = t_\text{eval} - \mathrm{g}\) (respecting group size).

[23]:
aggregated_eventstudy = dml_obj.aggregate("eventstudy")
print(aggregated_eventstudy)
aggregated_eventstudy.plot_effects()
================== DoubleMLDIDAggregation Object ==================
 Event Study Aggregation

------------------ Overall Aggregated Effects ------------------
    coef  std err         t  P>|t|   2.5 %   97.5 %
1.792177  0.12144 14.757748    0.0 1.55416 2.030195
------------------ Aggregated Effects         ------------------
               coef   std err          t     P>|t|     2.5 %    97.5 %
-4 months  0.007658  0.089526   0.085538  0.931833 -0.167810  0.183126
-3 months  0.015003  0.064792   0.231562  0.816878 -0.111986  0.141992
-2 months -0.056631  0.071440  -0.792708  0.427948 -0.196650  0.083388
-1 months -0.046778  0.061729  -0.757804  0.448569 -0.167765  0.074208
0 months   0.949651  0.059132  16.059900  0.000000  0.833755  1.065548
1 months   1.860964  0.127630  14.580936  0.000000  1.610814  2.111114
2 months   2.565917  0.216033  11.877421  0.000000  2.142500  2.989334
------------------ Additional Information     ------------------
Score function: observational
Control group: never_treated
Anticipation periods: 0

[23]:
(<Figure size 1200x600 with 1 Axes>,
 <Axes: title={'center': 'Aggregated Treatment Effects'}, ylabel='Effect'>)
../../_images/examples_did_py_panel_51_3.png

Aggregation Details#

The DoubleMLDIDAggregation objects include several DoubleMLFrameworks which support methods like bootstrap() or confint(). Further, the weights can be accessed via the properties

  • overall_aggregation_weights: weights for the overall aggregation

  • aggregation_weights: weights for the aggregation

To illustrate, consider the eventstudy aggregation:

[24]:
print(aggregated_eventstudy)
================== DoubleMLDIDAggregation Object ==================
 Event Study Aggregation

------------------ Overall Aggregated Effects ------------------
    coef  std err         t  P>|t|   2.5 %   97.5 %
1.792177  0.12144 14.757748    0.0 1.55416 2.030195
------------------ Aggregated Effects         ------------------
               coef   std err          t     P>|t|     2.5 %    97.5 %
-4 months  0.007658  0.089526   0.085538  0.931833 -0.167810  0.183126
-3 months  0.015003  0.064792   0.231562  0.816878 -0.111986  0.141992
-2 months -0.056631  0.071440  -0.792708  0.427948 -0.196650  0.083388
-1 months -0.046778  0.061729  -0.757804  0.448569 -0.167765  0.074208
0 months   0.949651  0.059132  16.059900  0.000000  0.833755  1.065548
1 months   1.860964  0.127630  14.580936  0.000000  1.610814  2.111114
2 months   2.565917  0.216033  11.877421  0.000000  2.142500  2.989334
------------------ Additional Information     ------------------
Score function: observational
Control group: never_treated
Anticipation periods: 0

Here, the overall effect aggregation averages all effects with non-negative exposure (\(e \ge 0\)) using equal weights:

[25]:
print(aggregated_eventstudy.overall_aggregation_weights)
[0.         0.         0.         0.         0.33333333 0.33333333
 0.33333333]

If one would like to consider how the aggregated effect with \(e=0\) is computed, one would have to look at the corresponding set of weights within the aggregation_weights property

[26]:
# the weights for e=0 correspond to the fifth element of the aggregation weights
aggregated_eventstudy.aggregation_weights[4]
[26]:
array([0.        , 0.        , 0.34647811, 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.32472124, 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.32880065])

Taking a look at the original dml_obj, one can see that this combines the following estimates (showing only the month):

  • \(\widehat{ATT}(04,03,04)\)

  • \(\widehat{ATT}(05,04,05)\)

  • \(\widehat{ATT}(06,05,06)\)

[27]:
print(dml_obj.summary["coef"])
ATT(2025-04,2025-01,2025-02)   -0.012948
ATT(2025-04,2025-02,2025-03)   -0.093126
ATT(2025-04,2025-03,2025-04)    0.942551
ATT(2025-04,2025-03,2025-05)    1.788025
ATT(2025-04,2025-03,2025-06)    2.565917
ATT(2025-05,2025-01,2025-02)    0.065768
ATT(2025-05,2025-02,2025-03)   -0.079422
ATT(2025-05,2025-03,2025-04)   -0.086222
ATT(2025-05,2025-04,2025-05)    0.944255
ATT(2025-05,2025-04,2025-06)    1.938790
ATT(2025-06,2025-01,2025-02)    0.007658
ATT(2025-06,2025-02,2025-03)   -0.035132
ATT(2025-06,2025-03,2025-04)   -0.080153
ATT(2025-06,2025-04,2025-05)    0.041015
ATT(2025-06,2025-05,2025-06)    0.962464
Name: coef, dtype: float64
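Combining the printed weights with the printed coefficients reproduces the aggregated effect for \(e=0\). The sketch below copies both vectors from the outputs above and computes their weighted sum; the variable names are chosen for this illustration only.

```python
# weights for e=0, copied from aggregated_eventstudy.aggregation_weights[4]
weights_e0 = [
    0.0, 0.0, 0.34647811, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.32472124, 0.0,
    0.0, 0.0, 0.0, 0.0, 0.32880065,
]
# coefficients, copied from dml_obj.summary["coef"] in the same order
coefs = [
    -0.012948, -0.093126, 0.942551, 1.788025, 2.565917,
    0.065768, -0.079422, -0.086222, 0.944255, 1.938790,
    0.007658, -0.035132, -0.080153, 0.041015, 0.962464,
]

# weighted sum over all group-time estimates; only three weights are non-zero
effect_e0 = sum(w * c for w, c in zip(weights_e0, coefs))
print(round(effect_e0, 6))
```

The result agrees with the "0 months" row of the event study summary up to rounding of the copied values.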

Anticipation#

As described in the Model Guide, one can include anticipation periods \(\delta>0\) by setting the anticipation_periods parameter.

Data with Anticipation#

The DGP allows for anticipation periods via the anticipation_periods parameter. In this case, the observations are “shifted” such that units anticipate the effect earlier and the exposure effect is increased by the number of periods in which the effect is anticipated.
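One possible reading of this shift is sketched below, using month numbers instead of dates. It assumes that without anticipation the true effect equals one in the first treated month and grows by one per month, and that anticipation moves the start of the effect forward by the number of anticipated periods. The helper true_att_with_anticipation is hypothetical and not taken from the DGP implementation.

```python
def true_att_with_anticipation(g, t, anticipation_periods=0):
    """Assumed oracle effect when units react anticipation_periods before g."""
    return max(t - g + 1 + anticipation_periods, 0)


g = 4  # first treated in month 4
for t in range(1, 7):
    # columns: month, effect without anticipation, effect with one anticipated period
    print(t, true_att_with_anticipation(g, t), true_att_with_anticipation(g, t, 1))
```

Under this reading, a group first treated in month 4 already shows an effect in month 3, and the effect in every later month is one unit larger than without anticipation.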

[28]:
n_obs = 4000
n_periods = 6

df_anticipation = make_did_CS2021(n_obs, dgp_type=4, n_periods=n_periods, n_pre_treat_periods=3, time_type="datetime", anticipation_periods=1)

print(df_anticipation.shape)
df_anticipation.head()

(19098, 10)
[28]:
id y y0 y1 d t Z1 Z2 Z3 Z4
1 0 214.984050 214.984050 215.060279 2025-04-01 2025-01-01 0.561679 0.247085 -1.039505 0.936829
2 0 221.275853 221.275853 219.963575 2025-04-01 2025-02-01 0.561679 0.247085 -1.039505 0.936829
3 0 225.462727 223.376615 225.462727 2025-04-01 2025-03-01 0.561679 0.247085 -1.039505 0.936829
4 0 229.498095 226.740252 229.498095 2025-04-01 2025-04-01 0.561679 0.247085 -1.039505 0.936829
5 0 233.536333 230.214728 233.536333 2025-04-01 2025-05-01 0.561679 0.247085 -1.039505 0.936829
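The shift can be checked directly on the rows printed above. The following sketch rebuilds those five rows by hand (values copied from the output) and looks for the first period in which the observed outcome y equals y1. The variable names are chosen for illustration only.

```python
import pandas as pd

# The five rows of unit 0 shown above (columns t, y, y0, y1 only).
rows = pd.DataFrame({
    "t": pd.to_datetime(
        ["2025-01-01", "2025-02-01", "2025-03-01", "2025-04-01", "2025-05-01"]
    ),
    "y": [214.984050, 221.275853, 225.462727, 229.498095, 233.536333],
    "y0": [214.984050, 221.275853, 223.376615, 226.740252, 230.214728],
    "y1": [215.060279, 219.963575, 225.462727, 229.498095, 233.536333],
})

g = pd.Timestamp("2025-04-01")  # value of column d for this unit
anticipation_periods = 1        # as passed to make_did_CS2021 above

# Period from which the effect is expected under one month of anticipation.
expected_start = g - pd.DateOffset(months=anticipation_periods)

# First period in which the observed outcome is the treated potential outcome.
observed_start = rows.loc[rows["y"] == rows["y1"], "t"].min()

print(expected_start.strftime("%Y-%m"), observed_start.strftime("%Y-%m"))
```

For this unit, both dates fall in 2025-03, one month before the first treatment period 2025-04.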

To visualize the anticipation, we again plot the “oracle” values:

[29]:
df_anticipation["ite"] = df_anticipation["y1"] - df_anticipation["y0"]
df_anticipation["First Treated"] = df_anticipation["d"].dt.strftime("%Y-%m").fillna("Never Treated")
agg_df_anticipation = df_anticipation.groupby(["t", "First Treated"]).agg(**agg_dictionary).reset_index()
agg_df_anticipation.head()
[29]:
t First Treated y_mean y_lower_quantile y_upper_quantile ite_mean ite_lower_quantile ite_upper_quantile
0 2025-01-01 2025-04 209.270692 193.022165 225.288455 -0.129132 -2.456083 2.096676
1 2025-01-01 2025-05 210.963424 193.373505 227.919593 0.066544 -2.267666 2.381496
2 2025-01-01 2025-06 212.957702 196.555837 230.538078 -0.025788 -2.283599 2.274034
3 2025-01-01 Never Treated 217.041256 201.170561 233.475554 0.028698 -2.287694 2.313850
4 2025-02-01 2025-04 209.274684 184.694158 232.948257 0.086458 -2.127437 2.316589

One can see that the effect is already anticipated one period before the actual treatment assignment.

[30]:
plot_data(agg_df_anticipation, col_name='ite')
../../_images/examples_did_py_panel_66_0.png

Initialize a corresponding DoubleMLPanelData object.

[31]:
dml_data_anticipation = DoubleMLPanelData(
    data=df_anticipation,
    y_col="y",
    d_cols="d",
    id_col="id",
    t_col="t",
    x_cols=["Z1", "Z2", "Z3", "Z4"],
    datetime_unit="M"
)

ATT Estimation#

Let us first take a look at the estimation that does not account for anticipation.

[32]:
dml_obj_anticipation = DoubleMLDIDMulti(dml_data_anticipation, **default_args)
dml_obj_anticipation.fit()
dml_obj_anticipation.bootstrap(n_rep_boot=5000)
dml_obj_anticipation.plot_effects()
/home/runner/work/doubleml-docs/doubleml-docs/doubleml-for-py/doubleml/double_ml.py:1470: UserWarning: The estimated nu2 for d is not positive. Re-estimation based on riesz representer (non-orthogonal).
  warnings.warn(msg, UserWarning)
/opt/hostedtoolcache/Python/3.12.10/x64/lib/python3.12/site-packages/matplotlib/cbook.py:1719: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
  return math.isfinite(val)
[32]:
(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 2025-04'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-05'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-06'}, xlabel='Evaluation Period', ylabel='Effect'>])
../../_images/examples_did_py_panel_70_2.png

The estimated effects are clearly biased. To account for anticipation, one can set the anticipation_periods parameter. The outcome regression (and the set of not-yet-treated units) is then adjusted accordingly.

[33]:
dml_obj_anticipation = DoubleMLDIDMulti(dml_data_anticipation, **(default_args| {"anticipation_periods": 1}))
dml_obj_anticipation.fit()
dml_obj_anticipation.bootstrap(n_rep_boot=5000)
dml_obj_anticipation.plot_effects()
/opt/hostedtoolcache/Python/3.12.10/x64/lib/python3.12/site-packages/matplotlib/cbook.py:1719: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
  return math.isfinite(val)
[33]:
(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 2025-04'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-05'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-06'}, xlabel='Evaluation Period', ylabel='Effect'>])
../../_images/examples_did_py_panel_72_2.png

Group-Time Combinations#

The default option gt_combinations="standard" includes all group-time values with the specific choice \(t_\text{pre} = \min(\mathrm{g}, t_\text{eval}) - 1\) (without anticipation), which corresponds to the weakest possible parallel trend assumption.
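As a minimal sketch of this rule (not the library's internal code), the combinations can be enumerated by hand. Months are represented as plain integers 1 to 6, and the groups 4, 5 and 6 correspond to the first treatment periods 2025-04 to 2025-06 in the example data. All variable names are illustrative assumptions.

```python
# Months 2025-01 ... 2025-06 as integers; treated groups as in the example data.
periods = range(1, 7)
groups = [4, 5, 6]

# Standard rule without anticipation: t_pre = min(g, t_eval) - 1.
# Evaluation periods without an available pre-period are skipped.
standard = [
    (g, min(g, t_eval) - 1, t_eval)
    for g in groups
    for t_eval in periods
    if min(g, t_eval) - 1 >= min(periods)
]

# Format the tuples in the same way as the coefficient labels.
standard_labels = [
    f"ATT(2025-{g:02d},2025-{t_pre:02d},2025-{t_eval:02d})"
    for g, t_pre, t_eval in standard
]
for label in standard_labels:
    print(label)
```

The resulting labels match the 15 entries of `dml_obj.summary["coef"]` printed earlier in this notebook.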

Other options are available, and it is also possible to specify only selected combinations of \((\mathrm{g},t_\text{pre},t_\text{eval})\).

All combinations#

The option gt_combinations="all" includes all relevant group-time values with \(t_\text{pre} < \min(\mathrm{g}, t_\text{eval})\), including those that require longer parallel trend assumptions. This can result in multiple estimates for the same \(ATT(\mathrm{g},t)\), which rely on slightly different assumptions (length of parallel trends).
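To see why several estimates can refer to the same \(ATT(\mathrm{g},t)\), the condition can be enumerated by hand. This is only a sketch of the inequality as stated, with months as integers 1 to 6 and groups 4, 5 and 6. Which combinations the library treats as "relevant" may differ in detail, so the list is an assumption rather than the exact set used by DoubleML.

```python
from collections import defaultdict

periods = range(1, 7)
groups = [4, 5, 6]

# All (g, t_pre, t_eval) with t_pre < min(g, t_eval).
all_combinations = [
    (g, t_pre, t_eval)
    for g in groups
    for t_eval in periods
    for t_pre in periods
    if t_pre < min(g, t_eval)
]

# Collect the admissible pre-periods for each (g, t_eval) pair.
pre_periods = defaultdict(list)
for g, t_pre, t_eval in all_combinations:
    pre_periods[(g, t_eval)].append(t_pre)

# Group 4 evaluated in period 4 can use pre-periods 1, 2 and 3.
print(pre_periods[(4, 4)])
```

Each additional pre-period for a fixed pair of group and evaluation period yields a separate estimate that relies on parallel trends over a longer window.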

[34]:
dml_obj_all = DoubleMLDIDMulti(dml_data, **(default_args| {"gt_combinations": "all"}))
dml_obj_all.fit()
dml_obj_all.bootstrap(n_rep_boot=5000)
dml_obj_all.plot_effects()
[34]:
(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 2025-04'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-05'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-06'}, xlabel='Evaluation Period', ylabel='Effect'>])
../../_images/examples_did_py_panel_75_1.png

Selected Combinations#

Alternatively, one can pass a list of tuples containing \((\mathrm{g}, t_\text{pre}, t_\text{eval})\) combinations, e.g. only two combinations:

[35]:
gt_dict = {
    "gt_combinations": [
        (np.datetime64('2025-04'),
         np.datetime64('2025-01'),
         np.datetime64('2025-02')),
        (np.datetime64('2025-04'),
         np.datetime64('2025-02'),
         np.datetime64('2025-03')),
    ]
}

dml_obj_all = DoubleMLDIDMulti(dml_data, **(default_args| gt_dict))
dml_obj_all.fit()
dml_obj_all.bootstrap(n_rep_boot=5000)
dml_obj_all.plot_effects()
[35]:
(<Figure size 1200x800 with 2 Axes>,
 [<Axes: title={'center': 'First Treated: 2025-04'}, xlabel='Evaluation Period', ylabel='Effect'>])
../../_images/examples_did_py_panel_77_1.png
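When many combinations are needed, writing out each np.datetime64 value by hand becomes verbose. A small helper could build the tuples from month strings. The name `month_tuple` is hypothetical and only illustrates one way to construct the same list as above.

```python
import numpy as np


def month_tuple(g, t_pre, t_eval):
    """Return (g, t_pre, t_eval) as month-resolution np.datetime64 values."""
    return tuple(np.datetime64(month, "M") for month in (g, t_pre, t_eval))


# The same two combinations as in the cell above.
selected = [
    month_tuple("2025-04", "2025-01", "2025-02"),
    month_tuple("2025-04", "2025-02", "2025-03"),
]

# Basic sanity check: every pre-period precedes its evaluation period.
assert all(t_pre < t_eval for _, t_pre, t_eval in selected)
print(selected)
```

The list `selected` could then be passed as the value of "gt_combinations" in the same way as gt_dict above.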