Python: GATE Sensitivity Analysis#

In this simple example, we illustrate how the DoubleML package can be used to perform a sensitivity analysis for group average treatment effects (GATEs) in the DoubleMLIRM model.

Data#

For illustration purposes, we will work with generated data where the true individual treatment effects are known, so that we can assess the performance of the effect estimates.

[1]:
import numpy as np
import doubleml as dml

from doubleml.datasets import make_heterogeneous_data
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier

We will add an independent feature V, which is treated as a confounder in the analysis. This is a simple option to test the robustness of the effect estimates against this “confounder” and to evaluate the sensitivity analysis.

[2]:
n_obs = 1000
p = 5

np.random.seed(42)
data_dict = make_heterogeneous_data(n_obs, p, binary_treatment=True, n_x=2)
data = data_dict['data']

# add random covariate
data['V'] = np.random.normal(size=(n_obs, 1))
ite = data_dict['effects']

print("Average Treatment Effect: {:.2f}".format(np.mean(ite)))
Average Treatment Effect: 4.45

As mentioned before, the independent feature 'V' will be added to the DoubleMLData object as a possible confounder.

[3]:
dml_data = dml.DoubleMLData(data, 'y', 'd')
print(dml_data)
================== DoubleMLData Object ==================

------------------ Data summary      ------------------
Outcome variable: y
Treatment variable(s): ['d']
Covariates: ['X_0', 'X_1', 'X_2', 'X_3', 'X_4', 'V']
Instrument variable(s): None
No. Observations: 1000

------------------ DataFrame info    ------------------
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Columns: 8 entries, y to V
dtypes: float64(8)
memory usage: 62.6 KB

ATE Estimation and Sensitivity#

At first, we start with the estimation of the ATE and perform a sensitivity analysis. Throughout this example, we will rely on random forests for nuisance estimation. Further, we use \(5\) repetitions to increase the stability of the results.

[4]:
n_rep = 5

ml_g = RandomForestRegressor(n_estimators=100, min_samples_leaf=20, random_state=42)
ml_m = RandomForestClassifier(n_estimators=100, min_samples_leaf=20, random_state=42)

We explicitly set the weights to None (which corresponds to the ATE weights \(\omega(X) = 1\)) and fit the model.

[5]:
dml_irm_ate = dml.DoubleMLIRM(
    dml_data,
    ml_g,
    ml_m,
    n_rep=n_rep,
    weights=None)
dml_irm_ate.fit()
print(dml_irm_ate.summary)
       coef   std err          t  P>|t|     2.5 %    97.5 %
d  4.452013  0.084314  52.802764    0.0  4.286761  4.617266

The results seem very robust (also because the effect size is very large).

[6]:
dml_irm_ate.sensitivity_analysis()
print(dml_irm_ate.sensitivity_summary)
================== Sensitivity Analysis ==================

------------------ Scenario          ------------------
Significance Level: level=0.95
Sensitivity parameters: cf_y=0.03; cf_d=0.03, rho=1.0

------------------ Bounds with CI    ------------------
   CI lower  theta lower     theta  theta upper  CI upper
d  4.261888     4.400564  4.452013     4.503463  4.642202

------------------ Robustness Values ------------------
   H_0     RV (%)    RVa (%)
d  0.0  88.915819  87.854178
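
The scenario above uses the default sensitivity parameters. These can be adjusted via the arguments of the sensitivity_analysis() method; a minimal sketch of a more conservative scenario (the specific values are chosen purely for illustration):

# assume stronger unobserved confounding than the default cf_y=cf_d=0.03
dml_irm_ate.sensitivity_analysis(cf_y=0.1, cf_d=0.1, rho=1.0)
print(dml_irm_ate.sensitivity_summary)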

Benchmarking the covariate 'V' reveals very low confounding (which is expected, since the covariate 'V' is independent of the treatment and outcome).

[7]:
benchmarking_variable = 'V'
print(dml_irm_ate.sensitivity_benchmark(benchmarking_set=[benchmarking_variable]))
   cf_y  cf_d  rho  delta_theta
d   0.0   0.0  1.0     0.004565
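
For contrast, the same benchmarking can be applied to an observed, relevant feature such as 'X_0', which drives the treatment effect heterogeneity; the resulting benchmark values should be noticeably larger than for 'V':

# benchmark a feature that is actually relevant for the outcome
print(dml_irm_ate.sensitivity_benchmark(benchmarking_set=['X_0']))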

GATE Estimation and Sensitivity#

Next, we will estimate a GATE and perform a sensitivity analysis. Here, we base the group \(G\) on the feature 'X_0'

\[1\{X\in G\} = 1\{X_0 \ge 0.5\},\]

as the treatment effect is heterogeneous in this feature.

[8]:
group = data['X_0'] >= 0.5

true_group_effect = ite[group].mean()
print("True group effect: {:.2f}".format(true_group_effect))
print("Group vector:\n{}".format(group[:4]))
True group effect: 5.81
Group vector:
0     True
1    False
2    False
3     True
Name: X_0, dtype: bool

To identify the GATE, the weights still have to be normalized by the group probability, such that

\[\omega(X) = \frac{1\{X_0 \ge 0.5\}}{P(X_0 \ge 0.5)}.\]

Since the weights only depend on the features \(X\), we can supply them as a numpy vector.

[9]:
weights = group.to_numpy() / group.mean()
print("Weights:\n{}".format(weights[:4]))
Weights:
[1.89035917 0.         0.         1.89035917]
[10]:
dml_irm_gate = dml.DoubleMLIRM(
    dml_data,
    ml_g,
    ml_m,
    n_rep=n_rep,
    weights=weights)
dml_irm_gate.fit()
print(dml_irm_gate.summary)
       coef   std err         t          P>|t|     2.5 %    97.5 %
d  5.736007  0.202005  28.39539  2.305325e-177  5.340084  6.131929
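
Since the true individual effects are known in this simulated setting, the estimate can be compared directly to the true group effect via the coef attribute of the fitted model:

# compare the GATE estimate with the true group effect computed above
print("Estimated GATE: {:.2f}".format(dml_irm_gate.coef[0]))
print("True group effect: {:.2f}".format(true_group_effect))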

Again, we can repeat the sensitivity analysis for the GATE. The results are very similar to the ATE case (with slightly smaller robustness values).

[11]:
dml_irm_gate.sensitivity_analysis()
print(dml_irm_gate.sensitivity_summary)
================== Sensitivity Analysis ==================

------------------ Scenario          ------------------
Significance Level: level=0.95
Sensitivity parameters: cf_y=0.03; cf_d=0.03, rho=1.0

------------------ Bounds with CI    ------------------
   CI lower  theta lower     theta  theta upper  CI upper
d  5.329411     5.663371  5.736007     5.808939  6.139561

------------------ Robustness Values ------------------
   H_0     RV (%)    RVa (%)
d  0.0  87.123736  85.020918

Still, the benchmarking shows little evidence for confounding via the covariate 'V'.

[12]:
benchmarking_variable = 'V'
print(dml_irm_gate.sensitivity_benchmark(benchmarking_set=[benchmarking_variable]))
   cf_y  cf_d  rho  delta_theta
d   0.0   0.0  1.0     0.003377

GATET Estimation and Sensitivity#

Finally, we will estimate a group average treatment effect on the treated (GATET) and perform a sensitivity analysis. Here, we will use the same group as in the previous section.

Instead of considering the effect for all units with \(X\in G\), the GATET only refers to the effect for the treated units, i.e. units with \(X\in G\) and \(D=1\).

[13]:
group = data['X_0'] >= 0.5
group_treated = group & (data['d'] == 1)

true_gatet_effect = ite[group_treated].mean()
print("True GATET effect: {:.2f}".format(true_gatet_effect))
print("Group vector for treated units:\n{}".format(group_treated[:4]))
True GATET effect: 5.69
Group vector for treated units:
0    False
1    False
2    False
3    False
dtype: bool

As for GATEs, the weights have to be normalized by the corresponding group probability to identify the GATET, such that

\[\omega(D,X) = \frac{D\cdot1\{X_0 \ge 0.5\}}{P(D=1, X_0 \ge 0.5)}.\]

As mentioned in the User Guide, the estimation and the sensitivity analysis also rely on the conditional expectation of the weights

\[\bar{\omega}(D,X) = \mathbb{E}[\omega(D,X)|X].\]

Since for GATETs, the weights depend on the treatment \(D\), the conditional expectation \(\bar{\omega}(D,X)\) differs from \(\omega(D, X)\).

Due to the form of the weights, the conditional expectation \(\bar{\omega}(D,X)\) reduces to

\[\bar{\omega}(D,X) = \frac{\mathbb{E}[D|X]\cdot 1\{X_0 \ge 0.5\}}{P(D=1, X_0 \ge 0.5)} = \frac{m_0(X)\cdot 1\{X_0 \ge 0.5\}}{P(D=1, X_0 \ge 0.5)},\]

which can be estimated by plugging in the estimated propensity score \(\hat{m}(X)\).
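
Although this plug-in step is performed internally (see below), a minimal sketch illustrates it, using the cross-fitted propensity scores stored in the predictions attribute of the previously fitted GATE model (first repetition only):

# cross-fitted propensity score estimates m_hat(X) from the fitted model
m_hat = dml_irm_gate.predictions['ml_m'][:, 0, 0]

# plug-in estimate of the conditional expectation of the GATET weights
p_group_treated = group_treated.mean()  # estimates P(D=1, X_0 >= 0.5)
omega_bar = m_hat * group.to_numpy() / p_group_treated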

All the previous steps of estimation are performed automatically by the DoubleMLIRM class if the score is set to 'ATTE'.

Remark: This requires the weights argument to be binary and to refer to the group indicator \(1\{X\in G\} = 1\{X_0 \ge 0.5\}\), not the actual group of treated individuals \(1\{D = 1, X\in G\} = D\cdot 1\{X\in G\}\). Further, the normalization by the group probabilities is then performed automatically by the DoubleMLIRM class.

Consequently, we can just set weights to the group indicator \(1\{X\in G\}\) and fit the model.

[14]:
group_indicator = group.to_numpy()
print("Group indicator:\n{}".format(group_indicator[:4]))
Group indicator:
[ True False False  True]
[15]:
dml_irm_gatet = dml.DoubleMLIRM(
    dml_data,
    ml_g,
    ml_m,
    n_rep=n_rep,
    score='ATTE',
    weights=group_indicator)
dml_irm_gatet.fit()
print(dml_irm_gatet.summary)
       coef   std err          t  P>|t|     2.5 %    97.5 %
d  5.631208  0.121992  46.160429    0.0  5.392108  5.870309
[16]:
dml_irm_gatet.sensitivity_analysis()
print(dml_irm_gatet.sensitivity_summary)
================== Sensitivity Analysis ==================

------------------ Scenario          ------------------
Significance Level: level=0.95
Sensitivity parameters: cf_y=0.03; cf_d=0.03, rho=1.0

------------------ Bounds with CI    ------------------
   CI lower  theta lower     theta  theta upper  CI upper
d  5.399228     5.599857  5.631208      5.66256  5.863285

------------------ Robustness Values ------------------
   H_0     RV (%)    RVa (%)
d  0.0  96.908503  96.324446

As in the general sensitivity analysis, contour plots can be used to visualize the results.

[17]:
dml_irm_gatet.sensitivity_plot()
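
The sensitivity_plot() method returns a plotly figure, which can be stored or customized before display; a short sketch (the file name is only illustrative):

# save the contour plot as a standalone HTML file
fig = dml_irm_gatet.sensitivity_plot()
fig.write_html("gatet_sensitivity_contours.html")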

Again, the benchmarking of the “confounding” feature 'V' shows little evidence for confounding.

[18]:
benchmarking_variable = 'V'
print(dml_irm_gatet.sensitivity_benchmark(benchmarking_set=[benchmarking_variable]))
   cf_y  cf_d  rho  delta_theta
d   0.0   0.0 -1.0    -0.001847