Note

https://docs.doubleml.org/stable/examples/did/py_rep_cs.ipynb

Python: Repeated Cross-Sectional Data with Multiple Time Periods#

This example provides a detailed guide to Difference-in-Differences with multiple time periods using the DoubleML package. The implementation is based on Callaway and Sant’Anna (2021).

The notebook requires the following packages:

[1]:

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

from lightgbm import LGBMRegressor, LGBMClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression

from doubleml.did import DoubleMLDIDMulti
from doubleml.data import DoubleMLPanelData

from doubleml.did.datasets import make_did_cs_CS2021

Data#

We will rely on the make_did_cs_CS2021 DGP, which is inspired by Callaway and Sant’Anna (2021) (Appendix SC) and Sant’Anna and Zhao (2020).

We will observe approximately n_obs units over n_periods. The parameter lambda_t determines the probability of observing a unit i in time period t. The parameter lambda_t is set to 0.5 for all time periods, which means that each unit has a 50% chance of being observed in each time period.

Note that the DataFrame includes the potential outcomes y0 and y1, so that oracle estimates can be used for comparison.

[2]:

n_obs = 5000
n_periods = 6

df = make_did_cs_CS2021(n_obs, dgp_type=4, include_never_treated=True, n_periods=n_periods, n_pre_treat_periods=3,
                        lambda_t=0.5, time_type="float")
df["ite"] = df["y1"] - df["y0"]

print(df.shape)
df.head()

(29920, 11)

[2]:

	id	y	y0	y1	d	t	Z1	Z2	Z3	Z4	ite
2	0	179.325043	179.325043	178.082399	3.0	2	-1.338960	-2.208789	-2.970926	-1.758140	-1.242643
3	0	166.943112	167.268110	166.943112	3.0	3	-1.338960	-2.208789	-2.970926	-1.758140	-0.324998
4	0	159.707279	154.294812	159.707279	3.0	4	-1.338960	-2.208789	-2.970926	-1.758140	5.412466
5	0	147.042170	143.989120	147.042170	3.0	5	-1.338960	-2.208789	-2.970926	-1.758140	3.053050
6	1	220.276504	220.276504	220.817989	inf	0	-0.159451	0.616464	0.050021	0.566261	0.541485

Data Details#

Here, we slightly abuse the definition of the potential outcomes. \(Y_{i,t}(1)\) corresponds to the (potential) outcome if unit \(i\) had received treatment at time period \(\mathrm{g}\) (where the group \(\mathrm{g}\) is drawn with probabilities based on \(Z\)).

The data set with repeated cross-sectional data is generated on the basis of a panel data set with the following data generating process (DGP). To obtain repeated cross-sectional data, the number of generated individuals is increased to \(\frac{n_{obs}}{\lambda_t}\), where \(\lambda_t\) denotes the probability to observe a unit at each time period (time constant).

More specifically

\[\begin{split}\begin{align*} Y_{i,t}(0)&:= f_t(Z) + \delta_t + \eta_i + \epsilon_{i,t,0}\\ Y_{i,t}(1)&:= Y_{i,t}(0) + \theta_{i,t,\mathrm{g}} + \epsilon_{i,t,1} - \epsilon_{i,t,0} \end{align*}\end{split}\]

where

\(f_t(Z)\) depends on pre-treatment observable covariates \(Z_1,\dots, Z_4\) and time \(t\)
\(\delta_t\) is a time fixed effect
\(\eta_i\) is a unit fixed effect
\(\epsilon_{i,t,\cdot}\) are time-varying unobservables (i.i.d. \(N(0,1)\))
\(\theta_{i,t,\mathrm{g}}\) corresponds to the exposure effect of unit \(i\) based on group \(\mathrm{g}\) at time \(t\)

For the pre-treatment periods the exposure effect is set to

\[\theta_{i,t,\mathrm{g}}:= 0 \text{ for } t<\mathrm{g}\]

such that

\[\mathbb{E}[Y_{i,t}(1) - Y_{i,t}(0)] = \mathbb{E}[\epsilon_{i,t,1} - \epsilon_{i,t,0}]=0 \text{ for } t<\mathrm{g}\]

The DoubleML Coverage Repository includes coverage simulations based on this DGP.

Data Description#

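The data is a repeated cross-section over n_periods time periods (here 0 to 5, since time_type="float"). Each unit is observed in a given period with probability lambda_t, so the number of observations per period is approximately n_obs.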

[3]:

df.groupby("t").size()

[3]:

t
0    5000
1    4996
2    5021
3    4949
4    5011
5    4943
dtype: int64
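The period counts above fluctuate around n_obs because of how the repeated cross-section is drawn. The following is a minimal sketch of that sampling scheme using only the standard library. It assumes, as described in the Data Details, that about n_obs / lambda_t units are generated and that each unit is observed in each period independently with probability lambda_t. The function name simulate_observation_counts and the seed are illustrative and not part of DoubleML.

```python
import random


def simulate_observation_counts(n_obs=5000, lambda_t=0.5, n_periods=6, seed=42):
    """Count how many units are observed per period in a repeated cross-section.

    Hypothetical helper: mimics the sampling idea only, not the DoubleML DGP itself.
    """
    rng = random.Random(seed)
    n_units = int(n_obs / lambda_t)  # number of generated individuals
    counts = [0] * n_periods
    for _ in range(n_units):
        for t in range(n_periods):
            # each unit is observed in period t with probability lambda_t
            if rng.random() < lambda_t:
                counts[t] += 1
    return n_units, counts


n_units, counts = simulate_observation_counts()
print(n_units)
print(counts)
```

Each count is binomial with mean n_units * lambda_t = n_obs, which is why the observed sizes per period are close to, but not exactly, 5000.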

The treatment column d indicates the first treatment period of the corresponding unit, whereas never-treated units take the value inf.

Generally, never-treated units should take either the value np.inf or pd.NaT, depending on the data type (float or datetime).

The individual units are roughly uniformly divided between the groups, where treatment assignment depends on the pre-treatment covariates Z1 to Z4.

[4]:

df.groupby("d", dropna=False).size()

[4]:

d
3.0    7652
4.0    7401
5.0    7194
inf    7673
dtype: int64

Here, the group indicates the first treated period, and units with d equal to inf are never treated.

[5]:

df.groupby("d", dropna=False).size()

[5]:

d
3.0    7652
4.0    7401
5.0    7194
inf    7673
dtype: int64

To get a better understanding of the underlying data and true effects, we will compare the unconditional averages and the true effects based on the oracle values of individual effects ite.

[6]:

# Build named-aggregation specs (mean, 5% and 95% quantiles) for one column
def agg_dict(col_name):
    return {
        f'{col_name}_mean': (col_name, 'mean'),
        f'{col_name}_lower_quantile': (col_name, lambda x: x.quantile(0.05)),
        f'{col_name}_upper_quantile': (col_name, lambda x: x.quantile(0.95))
    }

# Combine the specs for the outcome y and the oracle effects ite
agg_dictionary = agg_dict("y") | agg_dict("ite")

agg_df = df.groupby(["t", "d"]).agg(**agg_dictionary).reset_index()
agg_df.head()

[6]:

	t	d	y_mean	y_lower_quantile	y_upper_quantile	ite_mean	ite_lower_quantile	ite_upper_quantile
0	0	3.0	208.538910	198.427744	218.864450	0.028447	-2.232985	2.184457
1	0	4.0	210.477425	200.259821	220.224655	0.008754	-2.316989	2.300318
2	0	5.0	212.273933	202.777585	222.501765	-0.059191	-2.423249	2.153917
3	0	inf	214.320895	204.336954	224.006578	-0.026536	-2.307181	2.325677
4	1	3.0	208.263539	188.685310	228.239148	-0.024700	-2.326172	2.256290

[7]:

def plot_data(df, col_name='y'):
    """
    Create an improved plot with colorblind-friendly features

    Parameters:
    -----------
    df : DataFrame
        The dataframe containing the data
    col_name : str, default='y'
        Column name to plot (will use '{col_name}_mean')
    """
    plt.figure(figsize=(12, 7))
    n_colors = df["d"].nunique()
    color_palette = sns.color_palette("colorblind", n_colors=n_colors)

    sns.lineplot(
        data=df,
        x='t',
        y=f'{col_name}_mean',
        hue='d',
        style='d',
        palette=color_palette,
        markers=True,
        dashes=True,
        linewidth=2.5,
        alpha=0.8
    )

    plt.title(f'Average Values {col_name} by Group Over Time', fontsize=16)
    plt.xlabel('Time', fontsize=14)
    plt.ylabel(f'Average Value {col_name}', fontsize=14)


    plt.legend(title='d', title_fontsize=13, fontsize=12,
               frameon=True, framealpha=0.9, loc='best')

    plt.grid(alpha=0.3, linestyle='-')
    plt.tight_layout()

    plt.show()

Let us take a look at the average outcome values over time.

[8]:

plot_data(agg_df, col_name='y')


../../_images/examples_did_py_rep_cs_16_1.png

In contrast, the true average treatment effects can be obtained by averaging the (usually unobserved) ite values.

In this DGP, the true effect equals the exposure time, i.e. the number of periods since (and including) the first treated period:

\[ATT(\mathrm{g}, t) = \max(t - \mathrm{g} + 1, 0)\]
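As a small worked example, the sketch below tabulates the effect implied by an exposure-time rule of the form max(t - g + 1, 0) for the groups and periods used here. The helper true_att is hypothetical and is not a DoubleML function; it only mirrors the oracle pattern of this DGP, where the effect is roughly 1 in the first treated period and grows by 1 with each additional period.

```python
def true_att(g, t):
    """Exposure-time effect: zero before treatment, then 1, 2, 3, ..."""
    return max(t - g + 1, 0)


groups = [3, 4, 5]
periods = range(6)

# One row per group, one entry per time period 0..5
att_table = {g: [true_att(g, t) for t in periods] for g in groups}
for g, row in att_table.items():
    print(g, row)
```

The oracle averages of ite computed later in the notebook can be compared against these values.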

[9]:

plot_data(agg_df, col_name='ite')


../../_images/examples_did_py_rep_cs_18_1.png

DoubleMLPanelData#

Finally, we can construct our DoubleMLPanelData, specifying

y_col : the outcome
d_cols: the group variable indicating the first treated period for each unit
id_col: the unique identification column for each unit
t_col : the time column
x_cols: the additional pre-treatment controls
datetime_unit: unit required for datetime columns and plotting (not needed here, since t is numeric)

[10]:

dml_data = DoubleMLPanelData(
    data=df,
    y_col="y",
    d_cols="d",
    id_col="id",
    t_col="t",
    x_cols=["Z1", "Z2", "Z3", "Z4"],
)
print(dml_data)

================== DoubleMLPanelData Object ==================

------------------ Data summary      ------------------
Outcome variable: y
Treatment variable(s): ['d']
Covariates: ['Z1', 'Z2', 'Z3', 'Z4']
Instrument variable(s): None
Time variable: t
Id variable: id
No. Unique Ids: 9862
No. Observations: 29920

------------------ DataFrame info    ------------------
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 29920 entries, 0 to 29919
Columns: 11 entries, id to ite
dtypes: float64(9), int64(2)
memory usage: 2.5 MB

ATT Estimation#

The DoubleML package implements estimation of group-time average treatment effects via the DoubleMLDIDMulti class (see model documentation).

Basics#

The class basically behaves like other DoubleML classes and requires the specification of two learners (for more details on the regression elements, see score documentation).

The basic arguments of a DoubleMLDIDMulti object include

ml_g: “outcome” regression learner
ml_m: propensity score learner
control_group: the control group for the parallel trend assumption
gt_combinations: combinations of \((\mathrm{g},t_\text{pre}, t_\text{eval})\)
anticipation_periods: number of anticipation periods

We will construct a dict with “default” arguments.

For repeated cross-sectional data, we additionally specify the argument

panel=False

[11]:

default_args = {
    "ml_g": LGBMRegressor(n_estimators=500, learning_rate=0.01, verbose=-1, random_state=123),
    "ml_m": LGBMClassifier(n_estimators=500, learning_rate=0.01, verbose=-1, random_state=123),
    "control_group": "never_treated",
    "anticipation_periods": 0,
    "n_folds": 5,
    "n_rep": 1,
    "panel": False,
}

The model will be estimated using the fit() method.

[12]:

dml_obj = DoubleMLDIDMulti(dml_data, **default_args)
dml_obj.fit()
print(dml_obj)

================== DoubleMLDIDMulti Object ==================

------------------ Data summary      ------------------
Outcome variable: y
Treatment variable(s): ['d']
Covariates: ['Z1', 'Z2', 'Z3', 'Z4']
Instrument variable(s): None
Time variable: t
Id variable: id
No. Unique Ids: 9862
No. Observations: 29920

------------------ Score & algorithm ------------------
Score function: observational
Control group: never_treated
Anticipation periods: 0

------------------ Machine learner   ------------------
Learner ml_g: LGBMRegressor(learning_rate=0.01, n_estimators=500, random_state=123,
              verbose=-1)
Learner ml_m: LGBMClassifier(learning_rate=0.01, n_estimators=500, random_state=123,
               verbose=-1)
Out-of-sample Performance:
Regression:
Learner ml_g_d0_t0 RMSE: [[1.8957998  2.8378867  3.8050464  3.79737716 4.01877828 1.91389564
  2.86222106 3.86183959 5.4690558  5.46170477 1.89551067 2.78726655
  3.92536544 5.42068824 6.02697949]]
Learner ml_g_d0_t1 RMSE: [[2.80396726 3.94426663 5.37830153 6.07720872 7.63683228 2.78485661
  3.93163715 5.30932115 6.13714951 7.42708237 2.76756064 3.9344184
  5.14647959 6.14660168 7.48103145]]
Learner ml_g_d1_t0 RMSE: [[1.92818559 2.88711505 3.97161134 3.83887862 3.86879936 1.97632595
  2.87972266 3.92957159 5.02625    5.00467589 1.93028391 2.97506313
  3.78691636 5.2158211  6.24506971]]
Learner ml_g_d1_t1 RMSE: [[2.93469253 3.85355399 5.26828797 6.35444105 7.67611715 2.8109852
  3.9494716  4.94561711 6.31248689 7.93742834 2.94034181 3.84243712
  5.20957758 6.25375305 7.4767101 ]]
Classification:
Learner ml_m Log Loss: [[0.5999758  0.59518886 0.60206662 0.59849123 0.59549877 0.63174826
  0.632665   0.63470619 0.63107687 0.62970104 0.65097171 0.65033372
  0.65490512 0.65963683 0.66321767]]

------------------ Resampling        ------------------
No. folds: 5
No. repeated sample splits: 1

------------------ Fit summary       ------------------
                  coef   std err         t     P>|t|     2.5 %    97.5 %
ATT(3.0,0,1)  0.083860  0.216052  0.388145  0.697909 -0.339595  0.507315
ATT(3.0,1,2) -0.823708  0.315798 -2.608338  0.009098 -1.442661 -0.204755
ATT(3.0,2,3)  1.564619  0.656547  2.383101  0.017167  0.277810  2.851429
ATT(3.0,2,4)  2.020098  0.583089  3.464479  0.000531  0.877266  3.162931
ATT(3.0,2,5)  2.779739  0.656095  4.236792  0.000023  1.493816  4.065661
ATT(4.0,0,1) -0.012009  0.174694 -0.068744  0.945193 -0.354403  0.330385
ATT(4.0,1,2) -0.160959  0.272839 -0.589940  0.555231 -0.695714  0.373796
ATT(4.0,2,3)  0.101214  0.334408  0.302667  0.762143 -0.554213  0.756641
ATT(4.0,3,4)  0.706721  0.421735  1.675747  0.093788 -0.119864  1.533305
ATT(4.0,3,5)  1.756859  0.465480  3.774296  0.000160  0.844535  2.669183
ATT(5.0,0,1) -0.115964  0.157160 -0.737870  0.460593 -0.423992  0.192065
ATT(5.0,1,2) -0.183146  0.240556 -0.761344  0.446452 -0.654628  0.288336
ATT(5.0,2,3)  0.433229  0.337260  1.284555  0.198948 -0.227789  1.094247
ATT(5.0,3,4) -0.141023  0.375952 -0.375110  0.707579 -0.877875  0.595829
ATT(5.0,4,5)  1.051549  0.419270  2.508047  0.012140  0.229795  1.873304

The summary displays estimates of the \(ATT(g,t_\text{eval})\) effects for different combinations of \((g,t_\text{eval})\) via \(\widehat{ATT}(\mathrm{g},t_\text{pre},t_\text{eval})\), where

\(\mathrm{g}\) specifies the group
\(t_\text{pre}\) specifies the corresponding pre-treatment period
\(t_\text{eval}\) specifies the evaluation period

The choice gt_combinations="standard", used here, estimates all possible combinations of \(ATT(g,t_\text{eval})\) via \(\widehat{ATT}(\mathrm{g},t_\text{pre},t_\text{eval})\), where the standard choice is \(t_\text{pre} = \min(\mathrm{g}, t_\text{eval}) - 1\) (without anticipation).

Note that this includes pre-test effects if \(\mathrm{g} > t_{eval}\), e.g. \(\widehat{ATT}(g=3, t_{\text{pre}}=0, t_{\text{eval}}=1)\), which estimates the pre-trend from time period \(0\) to \(1\) even though the actual treatment occurred in time period \(3\).
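The rule \(t_\text{pre} = \min(\mathrm{g}, t_\text{eval}) - 1\) can be checked against the rows of the fit summary. The sketch below enumerates the combinations for the groups 3, 4, 5 and periods 0 to 5 without anticipation. The function standard_gt_combinations is a hypothetical illustration of the rule and is not the library's internal implementation.

```python
def standard_gt_combinations(groups, periods):
    """Enumerate (g, t_pre, t_eval) with t_pre = min(g, t_eval) - 1 (no anticipation)."""
    first_period = min(periods)
    combos = []
    for g in groups:
        for t_eval in periods:
            t_pre = min(g, t_eval) - 1
            # skip evaluation periods without an available pre-period
            if t_pre >= first_period:
                combos.append((g, t_pre, t_eval))
    return combos


combos = standard_gt_combinations(groups=[3, 4, 5], periods=range(6))
for combo in combos:
    print(combo)
```

For evaluation periods before treatment, the reference period moves along with t_eval, which yields the pre-test effects. From the first treated period onwards, the reference period stays fixed at g - 1.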

As usual for the DoubleML-package, you can obtain joint confidence intervals via bootstrap.

[13]:

level = 0.95

ci = dml_obj.confint(level=level)
dml_obj.bootstrap(n_rep_boot=5000)
ci_joint = dml_obj.confint(level=level, joint=True)
ci_joint

[13]:

	2.5 %	97.5 %
ATT(3.0,0,1)	-0.548705	0.716424
ATT(3.0,1,2)	-1.748310	0.100894
ATT(3.0,2,3)	-0.357639	3.486877
ATT(3.0,2,4)	0.312915	3.727281
ATT(3.0,2,5)	0.858805	4.700672
ATT(4.0,0,1)	-0.523483	0.499464
ATT(4.0,1,2)	-0.959786	0.637868
ATT(4.0,2,3)	-0.877874	1.080303
ATT(4.0,3,4)	-0.528046	1.941487
ATT(4.0,3,5)	0.394014	3.119704
ATT(5.0,0,1)	-0.576102	0.344174
ATT(5.0,1,2)	-0.887454	0.521162
ATT(5.0,2,3)	-0.554211	1.420669
ATT(5.0,3,4)	-1.241745	0.959699
ATT(5.0,4,5)	-0.176002	2.279101
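To see how the joint intervals relate to the pointwise ones, the sketch below takes the reported values for ATT(3.0,2,3) and recomputes the pointwise 95% interval from the coefficient and standard error. It then backs out the critical value implied by the joint interval. The numbers are copied from the outputs above; the variable names are illustrative, and the implied critical value is derived from the printed table rather than from the bootstrap itself.

```python
from statistics import NormalDist

# Reported values for ATT(3.0,2,3)
coef, std_err = 1.564619, 0.656547
joint_lower, joint_upper = -0.357639, 3.486877

# Pointwise interval: coef +/- z * std_err with the standard normal quantile
z = NormalDist().inv_cdf(0.975)
pointwise = (coef - z * std_err, coef + z * std_err)

# Critical value implied by the joint interval (half-width divided by std_err)
implied_joint_crit = (joint_upper - joint_lower) / (2 * std_err)

print(pointwise)
print(z, implied_joint_crit)
```

The implied joint critical value is noticeably larger than the pointwise normal quantile, which is why the joint interval for this effect covers zero while the pointwise interval does not.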

A visualization of the effects can be obtained via the plot_effects() method.

Note that the plot uses joint confidence intervals by default.

[14]:

dml_obj.plot_effects()


[14]:

(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 3.0'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 4.0'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 5.0'}, xlabel='Evaluation Period', ylabel='Effect'>])

../../_images/examples_did_py_rep_cs_30_2.png

Sensitivity Analysis#

As described in the Sensitivity Guide, robustness checks on omitted confounding/parallel trend violations are available via the standard sensitivity_analysis() method.

[15]:

dml_obj.sensitivity_analysis()
print(dml_obj.sensitivity_summary)

================== Sensitivity Analysis ==================

------------------ Scenario          ------------------
Significance Level: level=0.95
Sensitivity parameters: cf_y=0.03; cf_d=0.03, rho=1.0

------------------ Bounds with CI    ------------------
              CI lower  theta lower     theta  theta upper  CI upper
ATT(3.0,0,1) -0.626501    -0.271239  0.083860     0.438959  0.800920
ATT(3.0,1,2) -1.884499    -1.341824 -0.823708    -0.305591  0.201297
ATT(3.0,2,3) -0.211977     0.864529  1.564619     2.264709  3.362745
ATT(3.0,2,4)  0.239753     1.177022  2.020098     2.863174  3.850903
ATT(3.0,2,5)  0.740288     1.861701  2.779739     3.697776  4.751530
ATT(4.0,0,1) -0.629908    -0.338846 -0.012009     0.314828  0.601514
ATT(4.0,1,2) -1.097404    -0.648827 -0.160959     0.326909  0.777849
ATT(4.0,2,3) -1.112326    -0.556530  0.101214     0.758958  1.305605
ATT(4.0,3,4) -0.755298    -0.069208  0.706721     1.482649  2.194101
ATT(4.0,3,5)  0.061873     0.829781  1.756859     2.683937  3.452219
ATT(5.0,0,1) -0.698714    -0.439535 -0.115964     0.207607  0.466381
ATT(5.0,1,2) -1.038265    -0.641723 -0.183146     0.275430  0.671900
ATT(5.0,2,3) -0.737241    -0.181523  0.433229     1.047982  1.603349
ATT(5.0,3,4) -1.518613    -0.898270 -0.141023     0.616224  1.235945
ATT(5.0,4,5) -0.517948     0.172062  1.051549     1.931037  2.622943

------------------ Robustness Values ------------------
              H_0    RV (%)   RVa (%)
ATT(3.0,0,1)  0.0  0.716920  0.000666
ATT(3.0,1,2)  0.0  4.726825  1.824477
ATT(3.0,2,3)  0.0  6.579782  2.103635
ATT(3.0,2,4)  0.0  7.037148  3.853744
ATT(3.0,2,5)  0.0  8.807589  5.220398
ATT(4.0,0,1)  0.0  0.111715  0.000652
ATT(4.0,1,2)  0.0  1.000062  0.000462
ATT(4.0,2,3)  0.0  0.467464  0.000597
ATT(4.0,3,4)  0.0  2.736207  0.052016
ATT(4.0,3,5)  0.0  5.608219  3.196097
ATT(5.0,0,1)  0.0  1.085858  0.000612
ATT(5.0,1,2)  0.0  1.209280  0.000612
ATT(5.0,2,3)  0.0  2.123786  0.000605
ATT(5.0,3,4)  0.0  0.565823  0.000413
ATT(5.0,4,5)  0.0  3.576301  1.246093

In this example, one can clearly distinguish the robustness of the non-zero effects from that of the pre-treatment periods: the robustness values of the post-treatment effects are considerably larger.

Control Groups#

The current implementation supports the following control groups:

"never_treated"
"not_yet_treated"

Note that the "not_yet_treated" control group depends on anticipation.

For differences and recommendations, we refer to Callaway and Sant’Anna (2021).
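The difference between the two control groups, and the role of anticipation, can be sketched as a simple membership rule. The function below is a hypothetical illustration following the definitions in Callaway and Sant’Anna (2021): never-treated units have d equal to infinity, and not-yet-treated units are those whose first treatment period lies after the evaluation period plus the anticipation periods. It is an assumption that this matches the selection in DoubleMLDIDMulti in every detail.

```python
import math


def is_control_unit(d_unit, g, t_eval, control_group, anticipation_periods=0):
    """Decide whether a unit with first treatment period d_unit is a control for group g."""
    if d_unit == g:
        return False  # units of the treated group itself are never controls
    if control_group == "never_treated":
        return math.isinf(d_unit)
    if control_group == "not_yet_treated":
        # math.inf passes this check, so never-treated units are included as well
        return d_unit > t_eval + anticipation_periods
    raise ValueError(f"unknown control group: {control_group}")


# Group g=3 evaluated at t_eval=4: is a unit first treated in period 5 a control?
print(is_control_unit(5.0, g=3, t_eval=4, control_group="never_treated"))
print(is_control_unit(5.0, g=3, t_eval=4, control_group="not_yet_treated"))
print(is_control_unit(5.0, g=3, t_eval=4, control_group="not_yet_treated",
                      anticipation_periods=1))
```

With one anticipation period, the unit first treated in period 5 already anticipates treatment in period 4 and therefore drops out of the not-yet-treated control group.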

[16]:

dml_obj_nyt = DoubleMLDIDMulti(dml_data, **(default_args | {"control_group": "not_yet_treated"}))
dml_obj_nyt.fit()
dml_obj_nyt.bootstrap(n_rep_boot=5000)
dml_obj_nyt.plot_effects()


[16]:

(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 3.0'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 4.0'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 5.0'}, xlabel='Evaluation Period', ylabel='Effect'>])

../../_images/examples_did_py_rep_cs_35_2.png

Linear Covariate Adjustment#

So far, we relied on boosted trees to adjust for conditional parallel trends, which allows for a nonlinear adjustment. For comparison, we could instead rely on linear learners.

Note that the DGP (dgp_type=4) is based on nonlinear conditional expectations, such that the estimates with linear learners will be biased.

[17]:

linear_learners = {
    "ml_g": LinearRegression(),
    "ml_m": LogisticRegression(),
}

dml_obj_linear = DoubleMLDIDMulti(dml_data, **(default_args | linear_learners))
dml_obj_linear.fit()
dml_obj_linear.bootstrap(n_rep_boot=5000)
dml_obj_linear.plot_effects()


[17]:

(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 3.0'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 4.0'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 5.0'}, xlabel='Evaluation Period', ylabel='Effect'>])

../../_images/examples_did_py_rep_cs_37_2.png

Aggregated Effects#

As in the did R package, the \(ATT\)’s can be aggregated to summarize multiple effects. For details on the different aggregations and their interpretations, see Callaway and Sant’Anna (2021).

The aggregations are implemented via the aggregate() method.

Group Aggregation#

To obtain group-specific effects, one would like to average \(ATT(\mathrm{g}, t_\text{eval})\) over \(t_\text{eval}\). As a sample oracle, we combine all ite values by group \(\mathrm{g}\).

[18]:

df_post_treatment = df[df["t"] >= df["d"]]
df_post_treatment.groupby("d")["ite"].mean()

[18]:

d
3.0    1.901684
4.0    1.488150
5.0    0.957420
Name: ite, dtype: float64

To obtain group-specific effects it is possible to aggregate several \(\widehat{ATT}(\mathrm{g},t_\text{pre},t_\text{eval})\) values based on the group \(\mathrm{g}\) by setting the aggregation="group" argument.

[19]:

aggregated_group = dml_obj.aggregate(aggregation="group")
print(aggregated_group)
_ = aggregated_group.plot_effects()

================== DoubleMLDIDAggregation Object ==================
 Group Aggregation

------------------ Overall Aggregated Effects ------------------
    coef  std err        t        P>|t|    2.5 %   97.5 %
1.479522 0.238239 6.210247 5.290146e-10 1.012583 1.946462
------------------ Aggregated Effects         ------------------
         coef   std err         t         P>|t|     2.5 %    97.5 %
3.0  2.121485  0.404467  5.245136  1.561674e-07  1.328744  2.914226
4.0  1.231790  0.362754  3.395660  6.846344e-04  0.520805  1.942775
5.0  1.051549  0.419270  2.508047  1.214006e-02  0.229795  1.873304
------------------ Additional Information     ------------------
Score function: observational
Control group: never_treated
Anticipation periods: 0

/opt/hostedtoolcache/Python/3.12.11/x64/lib/python3.12/site-packages/doubleml/did/did_aggregation.py:368: UserWarning: Joint confidence intervals require bootstrapping which hasn't been performed yet. Automatically applying '.aggregated_frameworks.bootstrap(method="normal", n_rep_boot=500)' with default values. For different bootstrap settings, call bootstrap() explicitly before plotting.
  warnings.warn(

../../_images/examples_did_py_rep_cs_42_2.png

The output is a DoubleMLDIDAggregation object which includes an overall aggregation summary based on group size.
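The reported group effects can be reproduced by hand from the post-treatment estimates in the fit summary. The sketch below averages \(\widehat{ATT}(\mathrm{g},t_\text{pre},t_\text{eval})\) over the post-treatment periods of each group and then combines the groups using group shares. The shares are copied from the aggregation weights printed in the Aggregation Details section, and it is an assumption here that they equal the relative group sizes among treated units. Variable names are illustrative.

```python
# Post-treatment estimates from the fit summary, keyed by (g, t_eval)
att = {
    (3.0, 3): 1.564619, (3.0, 4): 2.020098, (3.0, 5): 2.779739,
    (4.0, 4): 0.706721, (4.0, 5): 1.756859,
    (5.0, 5): 1.051549,
}
# Assumed relative group sizes among treated units (from the printed weights)
group_share = {3.0: 0.34395649, 4.0: 0.33267407, 5.0: 0.32336944}

# Group effect: simple average over the post-treatment periods of that group
group_effects = {}
for g in group_share:
    post = [value for (group, _), value in att.items() if group == g]
    group_effects[g] = sum(post) / len(post)

# Overall effect: group effects weighted by group share
overall = sum(group_share[g] * group_effects[g] for g in group_share)

print(group_effects)
print(overall)
```

Up to rounding, this matches the aggregated effects and the overall effect in the summary above.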

Time Aggregation#

To obtain time-specific effects, one would like to average \(ATT(\mathrm{g}, t_\text{eval})\) over \(\mathrm{g}\) (respecting group size). As a sample oracle, we combine all ite values by time period \(t\). As oracle values, we obtain

[20]:

df_post_treatment.groupby("t")["ite"].mean()

[20]:

t
3    0.963619
4    1.454712
5    1.940634
Name: ite, dtype: float64

The estimates \(\widehat{ATT}(\mathrm{g},t_\text{pre},t_\text{eval})\) can be aggregated based on \(t_\text{eval}\), weighted with respect to group size. This corresponds to the Calendar Time Effects from the did R package.

For calendar time effects, set aggregation="time".

[21]:

aggregated_time = dml_obj.aggregate("time")
print(aggregated_time)
fig, ax = aggregated_time.plot_effects()

================== DoubleMLDIDAggregation Object ==================
 Time Aggregation

------------------ Overall Aggregated Effects ------------------
    coef  std err        t        P>|t|    2.5 %   97.5 %
1.606529 0.280935 5.718514 1.074599e-08 1.055907 2.157151
------------------ Aggregated Effects         ------------------
       coef   std err         t         P>|t|     2.5 %    97.5 %
3  1.564619  0.656547  2.383101  1.716747e-02  0.277810  2.851429
4  1.374359  0.400288  3.433429  5.959983e-04  0.589810  2.158909
5  1.880609  0.374350  5.023661  5.069567e-07  1.146896  2.614323
------------------ Additional Information     ------------------
Score function: observational
Control group: never_treated
Anticipation periods: 0

/opt/hostedtoolcache/Python/3.12.11/x64/lib/python3.12/site-packages/doubleml/did/did_aggregation.py:368: UserWarning: Joint confidence intervals require bootstrapping which hasn't been performed yet. Automatically applying '.aggregated_frameworks.bootstrap(method="normal", n_rep_boot=500)' with default values. For different bootstrap settings, call bootstrap() explicitly before plotting.
  warnings.warn(

../../_images/examples_did_py_rep_cs_47_2.png
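The time aggregation can be verified in the same manual way. For each evaluation period, the sketch below averages the estimates of all groups that are already treated, weighted by group share and renormalized over those groups. The overall effect is then the simple average over the three post-treatment periods. The group shares are copied from the aggregation weights printed in the Aggregation Details section and are assumed to equal relative group sizes. Variable names are illustrative.

```python
# Post-treatment estimates from the fit summary, keyed by (g, t_eval)
att = {
    (3.0, 3): 1.564619, (3.0, 4): 2.020098, (3.0, 5): 2.779739,
    (4.0, 4): 0.706721, (4.0, 5): 1.756859,
    (5.0, 5): 1.051549,
}
# Assumed relative group sizes among treated units (from the printed weights)
group_share = {3.0: 0.34395649, 4.0: 0.33267407, 5.0: 0.32336944}

time_effects = {}
for t in (3, 4, 5):
    # groups that are already treated in period t
    cells = {g: value for (g, t_eval), value in att.items() if t_eval == t}
    total_share = sum(group_share[g] for g in cells)
    time_effects[t] = sum(group_share[g] * v for g, v in cells.items()) / total_share

# Overall effect: equal weight for each post-treatment period
overall = sum(time_effects.values()) / len(time_effects)

print(time_effects)
print(overall)
```

Up to rounding, the results agree with the time aggregation summary above.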

Event Study Aggregation#

To obtain event-study-type effects, one would like to aggregate \(ATT(\mathrm{g}, t_\text{eval})\) over \(e = t_\text{eval} - \mathrm{g}\) (respecting group size). As a sample oracle, we combine all ite values by exposure time \(e\). As oracle values, we obtain

[22]:

df_treated = df[df["d"] != np.inf].copy()
df_treated["e"] = df_treated["t"] - df_treated["d"]
df_treated.groupby("e")["ite"].mean().iloc[1:]

[22]:

e
-4.0    0.007194
-3.0    0.002252
-2.0   -0.045895
-1.0   -0.004072
 0.0    0.974712
 1.0    1.931487
 2.0    2.828817
Name: ite, dtype: float64

Analogously, aggregation="eventstudy" aggregates \(\widehat{ATT}(\mathrm{g},t_\text{pre},t_\text{eval})\) based on exposure time \(e = t_\text{eval} - \mathrm{g}\) (respecting group size).

[23]:

aggregated_eventstudy = dml_obj.aggregate("eventstudy")
print(aggregated_eventstudy)
aggregated_eventstudy.plot_effects()

================== DoubleMLDIDAggregation Object ==================
 Event Study Aggregation

------------------ Overall Aggregated Effects ------------------
    coef  std err        t        P>|t|   2.5 %   97.5 %
1.927906 0.313571 6.148239 7.834791e-10 1.31332 2.542493
------------------ Aggregated Effects         ------------------
          coef   std err         t         P>|t|     2.5 %    97.5 %
-4.0 -0.115964  0.157160 -0.737870  4.605932e-01 -0.423992  0.192065
-3.0 -0.096364  0.125683 -0.766724  4.432454e-01 -0.342698  0.149970
-2.0  0.115390  0.132329  0.871999  3.832091e-01 -0.143969  0.374750
-1.0 -0.295251  0.165081 -1.788522  7.369189e-02 -0.618804  0.028302
0.0   1.113307  0.258659  4.304148  1.676293e-05  0.606345  1.620270
1.0   1.890673  0.374477  5.048841  4.444988e-07  1.156712  2.624634
2.0   2.779739  0.656095  4.236792  2.267359e-05  1.493816  4.065661
------------------ Additional Information     ------------------
Score function: observational
Control group: never_treated
Anticipation periods: 0

/opt/hostedtoolcache/Python/3.12.11/x64/lib/python3.12/site-packages/doubleml/did/did_aggregation.py:368: UserWarning: Joint confidence intervals require bootstrapping which hasn't been performed yet. Automatically applying '.aggregated_frameworks.bootstrap(method="normal", n_rep_boot=500)' with default values. For different bootstrap settings, call bootstrap() explicitly before plotting.
  warnings.warn(

[23]:

(<Figure size 1200x600 with 1 Axes>,
 <Axes: title={'center': 'Aggregated Treatment Effects'}, ylabel='Effect'>)

../../_images/examples_did_py_rep_cs_51_3.png

Aggregation Details#

The DoubleMLDIDAggregation objects include several DoubleMLFrameworks which support methods like bootstrap() or confint(). Further, the weights can be accessed via the properties

overall_aggregation_weights: weights for the overall aggregation
aggregation_weights: weights for the aggregation

To illustrate, consider the event study aggregation:

[24]:

print(aggregated_eventstudy)

================== DoubleMLDIDAggregation Object ==================
 Event Study Aggregation

------------------ Overall Aggregated Effects ------------------
    coef  std err        t        P>|t|   2.5 %   97.5 %
1.927906 0.313571 6.148239 7.834791e-10 1.31332 2.542493
------------------ Aggregated Effects         ------------------
          coef   std err         t         P>|t|     2.5 %    97.5 %
-4.0 -0.115964  0.157160 -0.737870  4.605932e-01 -0.423992  0.192065
-3.0 -0.096364  0.125683 -0.766724  4.432454e-01 -0.342698  0.149970
-2.0  0.115390  0.132329  0.871999  3.832091e-01 -0.143969  0.374750
-1.0 -0.295251  0.165081 -1.788522  7.369189e-02 -0.618804  0.028302
0.0   1.113307  0.258659  4.304148  1.676293e-05  0.606345  1.620270
1.0   1.890673  0.374477  5.048841  4.444988e-07  1.156712  2.624634
2.0   2.779739  0.656095  4.236792  2.267359e-05  1.493816  4.065661
------------------ Additional Information     ------------------
Score function: observational
Control group: never_treated
Anticipation periods: 0

Here, the overall effect aggregation averages the effects with non-negative exposure (\(e \ge 0\)), each with equal weight.

[25]:

print(aggregated_eventstudy.overall_aggregation_weights)

[0.         0.         0.         0.         0.33333333 0.33333333
 0.33333333]

If one would like to consider how the aggregated effect with \(e=0\) is computed, one would have to look at the corresponding set of weights within the aggregation_weights property

[26]:

# the weights for e=0 correspond to the fifth element of the aggregation weights
aggregated_eventstudy.aggregation_weights[4]

[26]:

array([0.        , 0.        , 0.34395649, 0.        , 0.        ,
       0.        , 0.        , 0.        , 0.33267407, 0.        ,
       0.        , 0.        , 0.        , 0.        , 0.32336944])

Taking a look at the original dml_obj, one can see that this combines the following estimates:

\(\widehat{ATT}(3,2,3)\)
\(\widehat{ATT}(4,3,4)\)
\(\widehat{ATT}(5,4,5)\)

[27]:

print(dml_obj.summary["coef"])

ATT(3.0,0,1)    0.083860
ATT(3.0,1,2)   -0.823708
ATT(3.0,2,3)    1.564619
ATT(3.0,2,4)    2.020098
ATT(3.0,2,5)    2.779739
ATT(4.0,0,1)   -0.012009
ATT(4.0,1,2)   -0.160959
ATT(4.0,2,3)    0.101214
ATT(4.0,3,4)    0.706721
ATT(4.0,3,5)    1.756859
ATT(5.0,0,1)   -0.115964
ATT(5.0,1,2)   -0.183146
ATT(5.0,2,3)    0.433229
ATT(5.0,3,4)   -0.141023
ATT(5.0,4,5)    1.051549
Name: coef, dtype: float64
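Combining the printed weights with these coefficients reproduces the event study effect for \(e=0\). The sketch below computes the weighted sum by hand and then averages the three effects with non-negative exposure using the overall aggregation weights of one third each. All numbers are copied from the outputs above; the variable names are illustrative.

```python
# Coefficients in the order of dml_obj.summary
coefs = [
    0.083860, -0.823708, 1.564619, 2.020098, 2.779739,
    -0.012009, -0.160959, 0.101214, 0.706721, 1.756859,
    -0.115964, -0.183146, 0.433229, -0.141023, 1.051549,
]
# Aggregation weights for e=0 as printed above
weights_e0 = [
    0.0, 0.0, 0.34395649, 0.0, 0.0,
    0.0, 0.0, 0.0, 0.33267407, 0.0,
    0.0, 0.0, 0.0, 0.0, 0.32336944,
]

# Event study effect at e=0: weighted sum of the group-time estimates
effect_e0 = sum(w * c for w, c in zip(weights_e0, coefs))

# Overall effect: equal weights on the effects for e = 0, 1, 2
effects_post = [effect_e0, 1.890673, 2.779739]
overall = sum(effects_post) / len(effects_post)

print(effect_e0)
print(overall)
```

Only the three estimates evaluated in the first treated period of each group receive non-zero weight for \(e=0\).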

Anticipation#

As described in the Model Guide, one can include anticipation periods \(\delta>0\) by setting the anticipation_periods parameter.

Data with Anticipation#

The DGP allows for anticipation periods via the anticipation_periods parameter. In this case, the observations are “shifted” such that units anticipate the effect earlier, and the exposure effect is increased by the number of periods in which the effect is anticipated.

[28]:

n_obs = 4000
n_periods = 6

df_anticipation = make_did_cs_CS2021(n_obs, dgp_type=4, n_periods=n_periods, n_pre_treat_periods=3, time_type="datetime", anticipation_periods=1)

print(df_anticipation.shape)
df_anticipation.head()

(19093, 10)

[28]:

	id	y	y0	y1	d	t	Z1	Z2	Z3	Z4
8	1	208.864345	208.864345	209.503109	2025-06-01	2025-01-01	-0.551592	-0.837795	-0.138088	0.149087
9	1	209.876413	209.876413	210.401649	2025-06-01	2025-02-01	-0.551592	-0.837795	-0.138088	0.149087
11	1	210.760753	210.760753	210.609504	2025-06-01	2025-04-01	-0.551592	-0.837795	-0.138088	0.149087
12	1	210.356693	210.965322	210.356693	2025-06-01	2025-05-01	-0.551592	-0.837795	-0.138088	0.149087
13	1	211.444939	209.752450	211.444939	2025-06-01	2025-06-01	-0.551592	-0.837795	-0.138088	0.149087

To visualize the anticipation, we will again plot the “oracle” values.

[29]:

df_anticipation["ite"] = df_anticipation["y1"] - df_anticipation["y0"]
agg_df_anticipation = df_anticipation.groupby(["t", "d"]).agg(**agg_dictionary).reset_index()
agg_df_anticipation.head()

[29]:

	t	d	y_mean	y_lower_quantile	y_upper_quantile	ite_mean	ite_lower_quantile	ite_upper_quantile
0	2025-01-01	2025-04-01	207.926944	191.189138	224.573937	0.028952	-2.544807	2.468552
1	2025-01-01	2025-05-01	211.249541	194.139411	227.117890	0.001448	-2.355013	2.146768
2	2025-01-01	2025-06-01	213.326436	195.907301	230.847452	0.044308	-2.262323	2.390176
3	2025-02-01	2025-04-01	207.634566	181.930105	231.737798	-0.003654	-2.267453	2.252344
4	2025-02-01	2025-05-01	210.712727	185.236688	235.576303	-0.018296	-2.274573	2.300010

One can see that the effect is already anticipated one period before the actual treatment assignment.

[30]:

plot_data(agg_df_anticipation, col_name='ite')

../../_images/examples_did_py_rep_cs_66_0.png

Initialize a corresponding DoubleMLPanelData object.

[31]:

dml_data_anticipation = DoubleMLPanelData(
    data=df_anticipation,
    y_col="y",
    d_cols="d",
    id_col="id",
    t_col="t",
    x_cols=["Z1", "Z2", "Z3", "Z4"],
    datetime_unit="M"
)

ATT Estimation#

Let us first take a look at the estimation without accounting for anticipation.

[32]:

dml_obj_anticipation = DoubleMLDIDMulti(dml_data_anticipation, **default_args)
dml_obj_anticipation.fit()
dml_obj_anticipation.bootstrap(n_rep_boot=5000)
dml_obj_anticipation.plot_effects()

/opt/hostedtoolcache/Python/3.12.11/x64/lib/python3.12/site-packages/matplotlib/cbook.py:1719: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
  return math.isfinite(val)

[32]:

(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 2025-04'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-05'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-06'}, xlabel='Evaluation Period', ylabel='Effect'>])

../../_images/examples_did_py_rep_cs_70_2.png

The effects are clearly biased. To account for anticipation, one can set the anticipation_periods parameter. The outcome regression and the set of not-yet-treated units are then adjusted accordingly.

[33]:

dml_obj_anticipation = DoubleMLDIDMulti(dml_data_anticipation, **(default_args | {"anticipation_periods": 1}))
dml_obj_anticipation.fit()
dml_obj_anticipation.bootstrap(n_rep_boot=5000)
dml_obj_anticipation.plot_effects()

/opt/hostedtoolcache/Python/3.12.11/x64/lib/python3.12/site-packages/matplotlib/cbook.py:1719: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
  return math.isfinite(val)

[33]:

(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 2025-04'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-05'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 2025-06'}, xlabel='Evaluation Period', ylabel='Effect'>])

../../_images/examples_did_py_rep_cs_72_2.png
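
As a rough illustration of what this adjustment means, the following sketch uses integer periods and two hypothetical helpers (base_period and is_not_yet_treated, neither of which is part of DoubleML). It assumes the convention of Callaway and Sant’Anna (2021): with \(\delta\) anticipation periods, the last unaffected pre-treatment period of group \(\mathrm{g}\) is \(\mathrm{g} - \delta - 1\), and a unit first treated in \(\mathrm{g}'\) serves as a not-yet-treated control at evaluation period \(t\) only if \(\mathrm{g}' > t + \delta\). The internal implementation may differ in detail.

```python
import math


def base_period(g, anticipation_periods=0):
    """Last period before group g can react to treatment (hypothetical helper)."""
    return g - anticipation_periods - 1


def is_not_yet_treated(g_unit, t_eval, anticipation_periods=0):
    """True if a unit first treated in g_unit is still an unaffected control at t_eval."""
    return g_unit > t_eval + anticipation_periods


# Base period of the group first treated in period 4, without and with anticipation
print(base_period(4), base_period(4, anticipation_periods=1))

# A unit first treated in period 5, used as a control at evaluation period 4
print(is_not_yet_treated(5, 4), is_not_yet_treated(5, 4, anticipation_periods=1))

# Never-treated units (coded as inf) remain valid controls in every period
print(is_not_yet_treated(math.inf, 4, anticipation_periods=1))
```

With one anticipation period, the base period of group 4 moves from 3 to 2, and units first treated in period 5 no longer count as controls at evaluation period 4.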

Group-Time Combinations#

The default option gt_combinations="standard" includes all group-time values with the specific choice \(t_\text{pre} = \min(\mathrm{g}, t_\text{eval}) - 1\) (without anticipation), which corresponds to the weakest possible parallel trends assumption.
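
The rule can be made concrete with a small enumeration. The sketch below is not the library's implementation: standard_gt_combinations is a hypothetical helper that assumes consecutive integer periods, no anticipation, and that a combination is only kept if its base period is actually observed. The period and group values mirror those of the example data (periods 0 to 5, groups first treated in 3, 4, and 5).

```python
def standard_gt_combinations(g_values, t_values):
    """Enumerate (g, t_pre, t_eval) with t_pre = min(g, t_eval) - 1 (hypothetical helper)."""
    t_min = min(t_values)
    combos = []
    for g in g_values:
        for t_eval in t_values:
            t_pre = min(g, t_eval) - 1
            if t_pre >= t_min:  # keep only combinations with an observed base period
                combos.append((g, t_pre, t_eval))
    return combos


combos_standard = standard_gt_combinations(g_values=[3, 4, 5], t_values=range(6))
for combo in combos_standard:
    print(combo)
```

Before treatment, the base period moves along with the evaluation period (\(t_\text{pre} = t_\text{eval} - 1\)); from period \(\mathrm{g}\) onwards it stays fixed at \(\mathrm{g} - 1\).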

Other options are available, and it is also possible to specify only selected combinations of \((\mathrm{g},t_\text{pre},t_\text{eval})\).

All combinations#

The option gt_combinations="all" includes all relevant group-time values with \(t_\text{pre} < \min(\mathrm{g}, t_\text{eval})\), including those that rely on longer parallel trends assumptions. This can result in multiple estimates for the same \(ATT(\mathrm{g},t)\), which rest on slightly different assumptions (length of parallel trends).

[34]:

dml_obj_all = DoubleMLDIDMulti(dml_data, **(default_args | {"gt_combinations": "all"}))
dml_obj_all.fit()
dml_obj_all.bootstrap(n_rep_boot=5000)
dml_obj_all.plot_effects()

[34]:

(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 3.0'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 4.0'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 5.0'}, xlabel='Evaluation Period', ylabel='Effect'>])

../../_images/examples_did_py_rep_cs_75_1.png
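
To see why this option yields several estimates for the same \(ATT(\mathrm{g},t)\), the condition \(t_\text{pre} < \min(\mathrm{g}, t_\text{eval})\) can be enumerated directly. The helper all_gt_combinations below is a hypothetical sketch for integer periods without anticipation, not the library's internal routine.

```python
def all_gt_combinations(g_values, t_values):
    """Enumerate every (g, t_pre, t_eval) with t_pre < min(g, t_eval) (hypothetical helper)."""
    t_values = sorted(t_values)
    return [
        (g, t_pre, t_eval)
        for g in g_values
        for t_eval in t_values
        for t_pre in t_values
        if t_pre < min(g, t_eval)
    ]


combos_all = all_gt_combinations(g_values=[3, 4, 5], t_values=range(6))

# All base periods that target the same effect ATT(g=4, t=4)
base_periods = [t_pre for g, t_pre, t_eval in combos_all if (g, t_eval) == (4, 4)]
print(base_periods)
```

Each listed base period leads to a separate estimate of the same effect; the earlier the base period, the longer the stretch over which parallel trends must hold.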

Universal Base Period#

The option gt_combinations="universal" sets \(t_\text{pre} = \mathrm{g} - \delta - 1\), where \(\delta\) denotes the number of anticipation periods. This corresponds to a universal (constant) comparison or base period.

Note that this implies \(t_\text{pre} > t_\text{eval}\) for all pre-treatment periods (accounting for anticipation). Therefore, these effects do not have the same straightforward interpretation as ATTs.

[35]:

dml_obj_universal = DoubleMLDIDMulti(dml_data, **(default_args | {"gt_combinations": "universal"}))
dml_obj_universal.fit()
dml_obj_universal.bootstrap(n_rep_boot=5000)
dml_obj_universal.plot_effects()

/opt/hostedtoolcache/Python/3.12.11/x64/lib/python3.12/site-packages/matplotlib/cbook.py:1719: FutureWarning: Calling float on a single element Series is deprecated and will raise a TypeError in the future. Use float(ser.iloc[0]) instead
  return math.isfinite(val)

[35]:

(<Figure size 1200x800 with 4 Axes>,
 [<Axes: title={'center': 'First Treated: 3.0'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 4.0'}, ylabel='Effect'>,
  <Axes: title={'center': 'First Treated: 5.0'}, xlabel='Evaluation Period', ylabel='Effect'>])

../../_images/examples_did_py_rep_cs_77_2.png
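
The fixed base period can again be illustrated with a small enumeration. The helper universal_gt_combinations is hypothetical and assumes integer periods; skipping the combination with \(t_\text{eval} = t_\text{pre}\) is a choice made for this sketch and need not match the library's behavior.

```python
def universal_gt_combinations(g_values, t_values, anticipation_periods=0):
    """Enumerate (g, t_pre, t_eval) with the fixed base period t_pre = g - delta - 1."""
    t_values = sorted(t_values)
    combos = []
    for g in g_values:
        t_pre = g - anticipation_periods - 1
        if t_pre < t_values[0]:
            continue  # no observed base period for this group
        for t_eval in t_values:
            if t_eval != t_pre:  # assumption: do not compare the base period with itself
                combos.append((g, t_pre, t_eval))
    return combos


combos_universal = universal_gt_combinations(g_values=[3, 4, 5], t_values=range(6))

# Pre-treatment combinations in which the base period lies after the evaluation period
reversed_order = [combo for combo in combos_universal if combo[1] > combo[2]]
print(reversed_order)
```

The printed combinations are exactly those with \(t_\text{pre} > t_\text{eval}\), which is why the corresponding pre-treatment estimates should not be read as ATTs in the usual sense.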

Selected Combinations#

Alternatively, one can pass a list of tuples containing \((\mathrm{g}, t_\text{pre}, t_\text{eval})\) combinations, e.g. only two combinations:

[36]:

gt_dict = {
    "gt_combinations": [
        (4.0, 1, 2),
        (4.0, 1, 3),
        ]
}

# Use a separate name so that the "all" object from above is not overwritten
dml_obj_selected = DoubleMLDIDMulti(dml_data, **(default_args | gt_dict))
dml_obj_selected.fit()
dml_obj_selected.bootstrap(n_rep_boot=5000)
dml_obj_selected.plot_effects()

[36]:

(<Figure size 1200x800 with 2 Axes>,
 [<Axes: title={'center': 'First Treated: 4.0'}, xlabel='Evaluation Period', ylabel='Effect'>])

../../_images/examples_did_py_rep_cs_79_1.png