Note: Download Jupyter notebook: https://docs.doubleml.org/stable/examples/py_double_ml_gate_sensitivity.ipynb
Python: GATE Sensitivity Analysis
In this simple example, we illustrate how the DoubleML package can be used to perform a sensitivity analysis for group average treatment effects (GATEs) in the DoubleMLIRM model.
Data
For illustration purposes, we will work with generated data where the true individual treatment effects are known, in order to assess the performance of the effect estimates.
[1]:
import numpy as np
import doubleml as dml
from doubleml.datasets import make_heterogeneous_data
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
We will add an independent feature V, which is treated as a confounder in the analysis. This is a simple option to test the robustness of the effect estimates against this “confounder” and to evaluate the sensitivity analysis.
[2]:
n_obs = 1000
p = 5
np.random.seed(42)
data_dict = make_heterogeneous_data(n_obs, p, binary_treatment=True, n_x=2)
data = data_dict['data']
# add random covariate
data['V'] = np.random.normal(size=(n_obs, 1))
ite = data_dict['effects']
print("Average Treatment Effect: {:.2f}".format(np.mean(ite)))
Average Treatment Effect: 4.45
As mentioned before, the independent feature 'V' will be added to the DoubleMLData object as a possible confounder.
[3]:
dml_data = dml.DoubleMLData(data, 'y', 'd')
print(dml_data)
================== DoubleMLData Object ==================
------------------ Data summary ------------------
Outcome variable: y
Treatment variable(s): ['d']
Covariates: ['X_0', 'X_1', 'X_2', 'X_3', 'X_4', 'V']
Instrument variable(s): None
No. Observations: 1000
------------------ DataFrame info ------------------
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Columns: 8 entries, y to V
dtypes: float64(8)
memory usage: 62.6 KB
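By default, all remaining columns of the data frame are used as covariates. Alternatively, the covariate set can be specified explicitly via the x_cols argument; a minimal sketch, equivalent to the call above:
# Equivalent construction with an explicit covariate list (including 'V')
dml_data_explicit = dml.DoubleMLData(
    data, 'y', 'd',
    x_cols=['X_0', 'X_1', 'X_2', 'X_3', 'X_4', 'V'])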
ATE Estimation and Sensitivity
First, we start with the estimation of the ATE and perform a sensitivity analysis. Throughout this example, we rely on random forests for nuisance estimation. Further, we use \(5\) repetitions to increase the stability of the results.
[4]:
n_rep = 5
ml_g = RandomForestRegressor(n_estimators=100, min_samples_leaf=20, random_state=42)
ml_m = RandomForestClassifier(n_estimators=100, min_samples_leaf=20, random_state=42)
Set the weights explicitly to None and fit the model.
[5]:
dml_irm_ate = dml.DoubleMLIRM(
    dml_data,
    ml_g,
    ml_m,
    n_rep=n_rep,
    weights=None)
dml_irm_ate.fit()
print(dml_irm_ate.summary)
coef std err t P>|t| 2.5 % 97.5 %
d 4.448923 0.084212 52.82985 0.0 4.284271 4.613574
The results seem very robust (also because the effect size is very large).
[6]:
dml_irm_ate.sensitivity_analysis()
print(dml_irm_ate.sensitivity_summary)
================== Sensitivity Analysis ==================
------------------ Scenario ------------------
Significance Level: level=0.95
Sensitivity parameters: cf_y=0.03; cf_d=0.03, rho=1.0
------------------ Bounds with CI ------------------
CI lower theta lower theta theta upper CI upper
0 4.259395 4.397578 4.448923 4.500267 4.638488
------------------ Robustness Values ------------------
H_0 RV (%) RVa (%)
0 0.0 88.695581 87.625159
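The reported scenario uses the default sensitivity parameters cf_y=0.03, cf_d=0.03 and rho=1.0. These can be varied directly; a small sketch with a more conservative (purely illustrative) choice of confounding strength:
# Re-run the sensitivity analysis with a larger assumed confounding strength
# (values are illustrative only; this overwrites the stored scenario)
dml_irm_ate.sensitivity_analysis(cf_y=0.1, cf_d=0.1, rho=1.0)
print(dml_irm_ate.sensitivity_summary)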
Benchmarking the covariate 'V' reveals very low confounding, which is expected, since the covariate 'V' is independent of the treatment and outcome.
[7]:
benchmarking_variable = 'V'
print(dml_irm_ate.sensitivity_benchmark(benchmarking_set=[benchmarking_variable]))
cf_y cf_d rho delta_theta
d 0.0 0.004392 1.0 0.003779
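The benchmark scenario can also be overlaid on the sensitivity contour plot; a sketch, assuming the benchmarks argument of sensitivity_plot() accepts lists of cf_y and cf_d values with corresponding names:
# Overlay the benchmarking result for 'V' on the contour plot
bench = dml_irm_ate.sensitivity_benchmark(benchmarking_set=['V'])
dml_irm_ate.sensitivity_plot(benchmarks={
    'cf_y': [bench.loc['d', 'cf_y']],
    'cf_d': [bench.loc['d', 'cf_d']],
    'name': ['V']})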
GATE Estimation and Sensitivity
Next, we will estimate a GATE and perform a sensitivity analysis. Here, we base the group \(G\) on the feature 'X_0', as the treatment effect is heterogeneous in this feature.
[8]:
group = data['X_0'] >= 0.5
true_group_effect = ite[group].mean()
print("True group effect: {:.2f}".format(true_group_effect))
print("Group vector:\n{}".format(group[:4]))
True group effect: 5.81
Group vector:
0 True
1 False
2 False
3 True
Name: X_0, dtype: bool
The correct weights to identify the GATE still have to be normalized by the group probability, such that
\[\omega(D, X) = \frac{1\{X \in G\}}{P(X \in G)}.\]
Since the weights only depend on the features \(X\), we can supply them as a numpy vector.
[9]:
weights = group.to_numpy() / group.mean()
print("Weights:\n{}".format(weights[:4]))
Weights:
[1.89035917 0. 0. 1.89035917]
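As a quick sanity check (not part of the original notebook), the weights average to exactly one in the sample, since the normalization uses the empirical group probability:
# By construction, the empirical mean of the weights equals one
print("Mean weight: {:.2f}".format(weights.mean()))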
[10]:
dml_irm_gate = dml.DoubleMLIRM(
    dml_data,
    ml_g,
    ml_m,
    n_rep=n_rep,
    weights=weights)
dml_irm_gate.fit()
print(dml_irm_gate.summary)
coef std err t P>|t| 2.5 % 97.5 %
d 5.739817 0.201796 28.443701 5.830755e-178 5.344305 6.135329
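The estimate is close to the true group effect of 5.81 computed above. As a quick check, the 95% confidence interval covers the true value:
# Compare the confidence interval with the true group effect
print(dml_irm_gate.confint())
print("True group effect: {:.2f}".format(true_group_effect))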
Again, we can repeat the sensitivity analysis for the GATE. The results are very similar to the ATE case (with slightly smaller robustness values).
[11]:
dml_irm_gate.sensitivity_analysis()
print(dml_irm_gate.sensitivity_summary)
================== Sensitivity Analysis ==================
------------------ Scenario ------------------
Significance Level: level=0.95
Sensitivity parameters: cf_y=0.03; cf_d=0.03, rho=1.0
------------------ Bounds with CI ------------------
CI lower theta lower theta theta upper CI upper
0 5.334649 5.668452 5.739817 5.811155 6.141384
------------------ Robustness Values ------------------
H_0 RV (%) RVa (%)
0 0.0 87.164698 85.109811
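The robustness values above refer to the default null hypothesis of a zero effect. To assess robustness with respect to a different hypothesized effect, the null_hypothesis argument can be adjusted; a sketch with an illustrative value:
# Robustness values with respect to H_0: theta = 5 (illustrative choice)
dml_irm_gate.sensitivity_analysis(null_hypothesis=5.0)
print(dml_irm_gate.sensitivity_summary)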
Still, the benchmarking shows little evidence for confounding via the covariate 'V'.
[12]:
benchmarking_variable = 'V'
print(dml_irm_gate.sensitivity_benchmark(benchmarking_set=[benchmarking_variable]))
cf_y cf_d rho delta_theta
d 0.0 0.0 -1.0 -0.000494
GATET Estimation and Sensitivity
Finally, we will estimate a group average treatment effect on the treated (GATET) and perform a sensitivity analysis. Here, we will use the same group as in the previous section.
Instead of considering the effect for all units with \(X\in G\), the GATET only refers to the effect for the treated units, such that \(X\in G\) and \(D=1\).
[13]:
group = data['X_0'] >= 0.5
group_treated = group & (data['d'] == 1)
true_gatet_effect = ite[group_treated].mean()
print("True GATET effect: {:.2f}".format(true_gatet_effect))
print("Group vector for treated units:\n{}".format(group_treated[:4]))
True GATET effect: 5.69
Group vector for treated units:
0 False
1 False
2 False
3 False
dtype: bool
As for GATEs, the correct weights to identify the GATET still have to be normalized by the corresponding group probability, such that
\[\omega(D, X) = \frac{D \cdot 1\{X \in G\}}{P(D=1, X \in G)}.\]
As mentioned in the User Guide, the estimation and sensitivity analysis also rely on the conditional expectation of the weights,
\[\bar{\omega}(D, X) = \mathbb{E}[\omega(D, X)\mid X].\]
Since for GATETs the weights depend on the treatment \(D\), the conditional expectation \(\bar{\omega}(D,X)\) differs from \(\omega(D, X)\). Due to the form of the weights, the conditional expectation reduces to
\[\bar{\omega}(D, X) = \frac{m(X) \cdot 1\{X \in G\}}{P(D=1, X \in G)},\]
which can be estimated by plugging in the estimated propensity score \(\hat{m}(X)\).
All of the previous estimation steps are performed automatically by the DoubleMLIRM class if the score is set to 'ATTE'.
Remark: This requires the weights argument to be binary and to refer to the group indicator \(1\{X\in G\} = 1\{X_0 \ge 0.5\}\), not the actual group of treated individuals \(1\{D = 1, X\in G\} = D\cdot 1\{X\in G\}\). Further, the normalization by the group probability is then performed automatically by the DoubleMLIRM class.
Consequently, we can just set weights to the group indicator \(1\{X\in G\}\) and fit the model.
[14]:
group_indicator = group.to_numpy()
print("Group indicator:\n{}".format(group_indicator[:4]))
Group indicator:
[ True False False True]
[15]:
dml_irm_gatet = dml.DoubleMLIRM(
    dml_data,
    ml_g,
    ml_m,
    n_rep=n_rep,
    score='ATTE',
    weights=group_indicator)
dml_irm_gatet.fit()
print(dml_irm_gatet.summary)
coef std err t P>|t| 2.5 % 97.5 %
d 5.630914 0.121774 46.240813 0.0 5.392242 5.869586
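To make the plug-in step explicit, the estimated conditional weights \(\bar{\omega}(D,X)\) can be reconstructed from the fitted propensity scores; a sketch, assuming the predictions attribute stores the cross-fitted nuisance predictions with shape (n_obs, n_rep, n_coefs):
# Reconstruct bar{omega}(D, X) = m(X) * 1{X in G} / P(D=1, X in G)
m_hat = dml_irm_gatet.predictions['ml_m'][:, :, 0].mean(axis=1)  # average over repetitions
p_treated_group = (group_indicator & (data['d'] == 1)).mean()
omega_bar = m_hat * group_indicator / p_treated_group
print("Mean conditional weight: {:.3f}".format(omega_bar.mean()))  # should be close to one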
[16]:
dml_irm_gatet.sensitivity_analysis()
print(dml_irm_gatet.sensitivity_summary)
================== Sensitivity Analysis ==================
------------------ Scenario ------------------
Significance Level: level=0.95
Sensitivity parameters: cf_y=0.03; cf_d=0.03, rho=1.0
------------------ Bounds with CI ------------------
CI lower theta lower theta theta upper CI upper
0 5.360249 5.560723 5.630914 5.701106 5.901526
------------------ Robustness Values ------------------
H_0 RV (%) RVa (%)
0 0.0 87.003924 84.667985
As in the general sensitivity analysis, contour plots can be used to visualize the results.
[17]:
dml_irm_gatet.sensitivity_plot()
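The method returns a plotly figure, which can be further customized or exported; a brief sketch (value='ci' plots the confidence bounds instead of the point estimates):
# Export the contour plot of the confidence bounds to an HTML file
fig = dml_irm_gatet.sensitivity_plot(value='ci')
fig.write_html("sensitivity_contour_gatet.html")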
Again, benchmarking the “confounding” feature 'V' shows little evidence for confounding.
[18]:
benchmarking_variable = 'V'
print(dml_irm_gatet.sensitivity_benchmark(benchmarking_set=[benchmarking_variable]))
cf_y cf_d rho delta_theta
d 0.0 0.0 1.0 0.002436