Python: GATE Sensitivity Analysis#

In this simple example, we illustrate how the DoubleML package can be used to perform a sensitivity analysis for group average treatment effects (GATEs) in the DoubleMLIRM model.

Data#

For illustration purposes, we will work with generated data where the true individual treatment effects are known, so that we can assess the performance of the effect estimates.

[1]:
import numpy as np
import doubleml as dml

from doubleml.datasets import make_heterogeneous_data
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier

We will add an independent feature V, which is treated as a confounder in the analysis. This is a simple option to test the robustness of the effect estimates against this “confounder” and to evaluate the sensitivity analysis.

[2]:
n_obs = 1000
p = 5

np.random.seed(42)
data_dict = make_heterogeneous_data(n_obs, p, binary_treatment=True, n_x=2)
data = data_dict['data']

# add random covariate
data['V'] = np.random.normal(size=(n_obs, 1))
ite = data_dict['effects']

print("Average Treatment Effect: {:.2f}".format(np.mean(ite)))
Average Treatment Effect: 4.45

As mentioned before, the independent feature 'V' will be added to the DoubleMLData object as a possible confounder.

[3]:
dml_data = dml.DoubleMLData(data, 'y', 'd')
print(dml_data)
================== DoubleMLData Object ==================

------------------ Data summary      ------------------
Outcome variable: y
Treatment variable(s): ['d']
Covariates: ['X_0', 'X_1', 'X_2', 'X_3', 'X_4', 'V']
Instrument variable(s): None
No. Observations: 1000

------------------ DataFrame info    ------------------
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Columns: 8 entries, y to V
dtypes: float64(8)
memory usage: 62.6 KB

ATE Estimation and Sensitivity#

At first, we start with the estimation of the ATE and perform a sensitivity analysis. Throughout this example, we will rely on random forests for nuisance estimation. Further, we use \(5\) repetitions to increase the stability of the results.

[4]:
n_rep = 5

ml_g = RandomForestRegressor(n_estimators=100, min_samples_leaf=20, random_state=42)
ml_m = RandomForestClassifier(n_estimators=100, min_samples_leaf=20, random_state=42)

We explicitly set the weights to None (which corresponds to the ATE weights \(\omega(X) = 1\)) and fit the model.

[5]:
dml_irm_ate = dml.DoubleMLIRM(
    dml_data,
    ml_g,
    ml_m,
    n_rep=n_rep,
    weights=None)
dml_irm_ate.fit()
print(dml_irm_ate.summary)
       coef   std err          t  P>|t|     2.5 %    97.5 %
d  4.452013  0.084314  52.802764    0.0  4.286761  4.617266

The results seem very robust (also because the effect size is very large).

[6]:
dml_irm_ate.sensitivity_analysis()
print(dml_irm_ate.sensitivity_summary)
================== Sensitivity Analysis ==================

------------------ Scenario          ------------------
Significance Level: level=0.95
Sensitivity parameters: cf_y=0.03; cf_d=0.03, rho=1.0

------------------ Bounds with CI    ------------------
   CI lower  theta lower     theta  theta upper  CI upper
d  4.261888     4.400564  4.452013     4.503463  4.642202

------------------ Robustness Values ------------------
   H_0     RV (%)    RVa (%)
d  0.0  88.915819  87.854178
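
The scenario above uses the default sensitivity parameters. These can be adjusted via the arguments of the sensitivity_analysis() method; a minimal sketch of a more conservative scenario (the specific values are chosen purely for illustration):

# assume stronger unobserved confounding than the default cf_y=cf_d=0.03
dml_irm_ate.sensitivity_analysis(cf_y=0.1, cf_d=0.1, rho=1.0)
print(dml_irm_ate.sensitivity_summary)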

Benchmarking the covariate 'V' reveals very low confounding (which is expected, since the covariate 'V' is independent of the treatment and outcome).

[7]:
benchmarking_variable = 'V'
print(dml_irm_ate.sensitivity_benchmark(benchmarking_set=[benchmarking_variable]))
   cf_y  cf_d  rho  delta_theta
d   0.0   0.0  1.0     0.004565
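
For contrast, the same benchmarking can be applied to an observed, relevant feature such as 'X_0', which drives the treatment effect heterogeneity; the resulting benchmark values should be noticeably larger than for 'V':

# benchmark a feature that is actually relevant for the outcome
print(dml_irm_ate.sensitivity_benchmark(benchmarking_set=['X_0']))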

GATE Estimation and Sensitivity#

Next, we will estimate a GATE and perform a sensitivity analysis. Here, we base the group \(G\) on the feature 'X_0'

\[1\{X\in G\} = 1\{X_0 \ge 0.5\},\]

as the treatment effect is heterogeneous in this feature.

[8]:
group = data['X_0'] >= 0.5

true_group_effect = ite[group].mean()
print("True group effect: {:.2f}".format(true_group_effect))
print("Group vector:\n{}".format(group[:4]))
True group effect: 5.81
Group vector:
0     True
1    False
2    False
3     True
Name: X_0, dtype: bool

To identify the GATE, the weights still have to be normalized by the group probability, such that

\[\omega(X) = \frac{1\{X_0 \ge 0.5\}}{P(X_0 \ge 0.5)}.\]

Since the weights only depend on the features \(X\), we can supply them as a numpy vector.

[9]:
weights = group.to_numpy() / group.mean()
print("Weights:\n{}".format(weights[:4]))
Weights:
[1.89035917 0.         0.         1.89035917]
[10]:
dml_irm_gate = dml.DoubleMLIRM(
    dml_data,
    ml_g,
    ml_m,
    n_rep=n_rep,
    weights=weights)
dml_irm_gate.fit()
print(dml_irm_gate.summary)
       coef   std err         t          P>|t|     2.5 %    97.5 %
d  5.736007  0.202005  28.39539  2.305325e-177  5.340084  6.131929
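
Since the true individual effects are known in this simulated setting, the estimate can be compared directly to the true group effect via the coef attribute of the fitted model:

# compare the GATE estimate with the true group effect computed above
print("Estimated GATE: {:.2f}".format(dml_irm_gate.coef[0]))
print("True group effect: {:.2f}".format(true_group_effect))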

Again, we can repeat the sensitivity analysis for the GATE. The results are very similar to the ATE case (with slightly smaller robustness values).

[11]:
dml_irm_gate.sensitivity_analysis()
print(dml_irm_gate.sensitivity_summary)
================== Sensitivity Analysis ==================

------------------ Scenario          ------------------
Significance Level: level=0.95
Sensitivity parameters: cf_y=0.03; cf_d=0.03, rho=1.0

------------------ Bounds with CI    ------------------
   CI lower  theta lower     theta  theta upper  CI upper
d  5.329411     5.663371  5.736007     5.808939  6.139561

------------------ Robustness Values ------------------
   H_0     RV (%)    RVa (%)
d  0.0  87.123736  85.020918

Still, the benchmarking shows little evidence for confounding via the covariate 'V'.

[12]:
benchmarking_variable = 'V'
print(dml_irm_gate.sensitivity_benchmark(benchmarking_set=[benchmarking_variable]))
   cf_y  cf_d  rho  delta_theta
d   0.0   0.0  1.0     0.003377

GATET Estimation and Sensitivity#

Finally, we will estimate a group average treatment effect on the treated (GATET) and perform a sensitivity analysis. Here, we will use the same group as in the previous section.

Instead of considering the effect for all units with \(X\in G\), the GATET only refers to the effect for the treated units, i.e. units with \(X\in G\) and \(D=1\).

[13]:
group = data['X_0'] >= 0.5
group_treated = group & (data['d'] == 1)

true_gatet_effect = ite[group_treated].mean()
print("True GATET effect: {:.2f}".format(true_gatet_effect))
print("Group vector for treated units:\n{}".format(group_treated[:4]))
True GATET effect: 5.69
Group vector for treated units:
0    False
1    False
2    False
3    False
dtype: bool

As for GATEs, the weights have to be normalized by the corresponding group probability to identify the GATET, such that

\[\omega(D,X) = \frac{D\cdot1\{X_0 \ge 0.5\}}{P(D=1, X_0 \ge 0.5)}.\]

As mentioned in the User Guide, the estimation and the sensitivity analysis also rely on the conditional expectation of the weights

\[\bar{\omega}(D,X) = \mathbb{E}[\omega(D,X)|X].\]

Since for GATETs, the weights depend on the treatment \(D\), the conditional expectation \(\bar{\omega}(D,X)\) differs from \(\omega(D, X)\).

Due to the form of the weights, the conditional expectation \(\bar{\omega}(D,X)\) reduces to

\[\bar{\omega}(D,X) = \frac{\mathbb{E}[D|X]\cdot 1\{X_0 \ge 0.5\}}{P(D=1, X_0 \ge 0.5)} = \frac{m_0(X)\cdot 1\{X_0 \ge 0.5\}}{P(D=1, X_0 \ge 0.5)},\]

which can be estimated by plugging in the estimated propensity score \(\hat{m}(X)\).
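
Although this plug-in step is performed internally (see below), a minimal sketch illustrates it, using the cross-fitted propensity scores stored in the predictions attribute of the previously fitted GATE model (first repetition only):

# cross-fitted propensity score estimates m_hat(X) from the fitted model
m_hat = dml_irm_gate.predictions['ml_m'][:, 0, 0]

# plug-in estimate of the conditional expectation of the GATET weights
p_group_treated = group_treated.mean()  # estimates P(D=1, X_0 >= 0.5)
omega_bar = m_hat * group.to_numpy() / p_group_treated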

All the previous steps of estimation are performed automatically by the DoubleMLIRM class if the score is set to 'ATTE'.

Remark: This requires the weights argument to be binary and to refer to the group indicator \(1\{X\in G\} = 1\{X_0 \ge 0.5\}\), not the actual group of treated individuals \(1\{D = 1, X\in G\} = D\cdot 1\{X\in G\}\). Further, the normalization by the group probabilities is then performed automatically by the DoubleMLIRM class.

Consequently, we can just set weights to the group indicator \(1\{X\in G\}\) and fit the model.

[14]:
group_indicator = group.to_numpy()
print("Group indicator:\n{}".format(group_indicator[:4]))
Group indicator:
[ True False False  True]
[15]:
dml_irm_gatet = dml.DoubleMLIRM(
    dml_data,
    ml_g,
    ml_m,
    n_rep=n_rep,
    score='ATTE',
    weights=group_indicator)
dml_irm_gatet.fit()
print(dml_irm_gatet.summary)
       coef   std err          t  P>|t|     2.5 %    97.5 %
d  5.631208  0.121992  46.160429    0.0  5.392108  5.870309
[16]:
dml_irm_gatet.sensitivity_analysis()
print(dml_irm_gatet.sensitivity_summary)
================== Sensitivity Analysis ==================

------------------ Scenario          ------------------
Significance Level: level=0.95
Sensitivity parameters: cf_y=0.03; cf_d=0.03, rho=1.0

------------------ Bounds with CI    ------------------
   CI lower  theta lower     theta  theta upper  CI upper
d  5.399228     5.599857  5.631208      5.66256  5.863285

------------------ Robustness Values ------------------
   H_0     RV (%)    RVa (%)
d  0.0  96.908503  96.324446

As in the general sensitivity analysis, contour plots can be used to visualize the results.

[17]:
dml_irm_gatet.sensitivity_plot()
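
The sensitivity_plot() method returns a plotly figure, which can be stored or customized before display; a short sketch (the file name is only illustrative):

# save the contour plot as a standalone HTML file
fig = dml_irm_gatet.sensitivity_plot()
fig.write_html("gatet_sensitivity_contours.html")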

Again, the benchmarking of the “confounding” feature 'V' shows little evidence for confounding.

[18]:
benchmarking_variable = 'V'
print(dml_irm_gatet.sensitivity_benchmark(benchmarking_set=[benchmarking_variable]))
   cf_y  cf_d  rho  delta_theta
d   0.0   0.0 -1.0    -0.001847