doubleml.DoubleMLDID#
- class doubleml.DoubleMLDID(obj_dml_data, ml_g, ml_m=None, n_folds=5, n_rep=1, score='observational', in_sample_normalization=True, trimming_rule='truncate', trimming_threshold=0.01, draw_sample_splitting=True)#
Double machine learning for difference-in-differences models with panel data (two time periods).
- Parameters:
obj_dml_data (DoubleMLData object) – The DoubleMLData object providing the data and specifying the variables for the causal model.
ml_g (estimator implementing fit() and predict()) – A machine learner implementing fit() and predict() methods (e.g. sklearn.ensemble.RandomForestRegressor) for the nuisance function \(g_0(d,X) = E[Y_1-Y_0|D=d, X]\). For a binary outcome variable \(Y\) (with values 0 and 1), a classifier implementing fit() and predict_proba() can also be specified. If sklearn.base.is_classifier() returns True, predict_proba() is used, otherwise predict().
ml_m (classifier implementing fit() and predict_proba()) – A machine learner implementing fit() and predict_proba() methods (e.g. sklearn.ensemble.RandomForestClassifier) for the nuisance function \(m_0(X) = E[D=1|X]\). Only relevant for score='observational'.
n_folds (int) – Number of folds. Default is 5.
n_rep (int) – Number of repetitions for the sample splitting. Default is 1.
score (str) – A str ('observational' or 'experimental') specifying the score function. The 'experimental' score refers to an A/B setting, where the treatment is independent of the pre-treatment covariates. Default is 'observational'.
in_sample_normalization (bool) – Indicates whether to use a slightly different normalization from Sant’Anna and Zhao (2020). Default is True.
trimming_rule (str) – A str ('truncate' is the only choice) specifying the trimming approach. Default is 'truncate'.
trimming_threshold (float) – The threshold used for trimming. Default is 1e-2.
draw_sample_splitting (bool) – Indicates whether the sample splitting should be drawn during initialization of the object. Default is True.
Examples
>>> import numpy as np
>>> import doubleml as dml
>>> from doubleml.datasets import make_did_SZ2020
>>> from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
>>> np.random.seed(42)
>>> ml_g = RandomForestRegressor(n_estimators=100, max_depth=5, min_samples_leaf=5)
>>> ml_m = RandomForestClassifier(n_estimators=100, max_depth=5, min_samples_leaf=5)
>>> data = make_did_SZ2020(n_obs=500, return_type='DataFrame')
>>> obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
>>> dml_did_obj = dml.DoubleMLDID(obj_dml_data, ml_g, ml_m)
>>> dml_did_obj.fit().summary
       coef   std err         t     P>|t|     2.5 %   97.5 %
d -2.685104  1.798071 -1.493325  0.135352 -6.209257  0.83905
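For the A/B setting described under score, the propensity learner can be omitted; a minimal sketch (not part of the original docstring), reusing the objects from the example above:
>>> dml_did_exp = dml.DoubleMLDID(obj_dml_data, ml_g, score='experimental')
>>> _ = dml_did_exp.fit()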
Methods
bootstrap([method, n_rep_boot]) – Multiplier bootstrap for DoubleML models.
confint([joint, level]) – Confidence intervals for DoubleML models.
construct_framework() – Construct a doubleml.DoubleMLFramework object.
draw_sample_splitting() – Draw sample splitting for DoubleML models.
evaluate_learners([learners, metric]) – Evaluate fitted learners for DoubleML models on cross-validated predictions.
fit([n_jobs_cv, store_predictions, ...]) – Estimate DoubleML models.
get_params(learner) – Get hyperparameters for the nuisance model of DoubleML models.
p_adjust([method]) – Multiple testing adjustment for DoubleML models.
sensitivity_analysis([cf_y, cf_d, rho, ...]) – Performs a sensitivity analysis to account for unobserved confounders.
sensitivity_benchmark(benchmarking_set[, ...]) – Computes a benchmark for a given set of features.
sensitivity_plot([idx_treatment, value, ...]) – Contour plot of the sensitivity with respect to latent/confounding variables.
set_ml_nuisance_params(learner, treat_var, ...) – Set hyperparameters for the nuisance models of DoubleML models.
set_sample_splitting(all_smpls[, ...]) – Set the sample splitting for DoubleML models.
tune(param_grids[, tune_on_folds, ...]) – Hyperparameter-tuning for DoubleML models.
Attributes
all_coef – Estimates of the causal parameter(s) for the n_rep different sample splits after calling fit().
all_se – Standard errors of the causal parameter(s) for the n_rep different sample splits after calling fit().
boot_method – The method to construct the bootstrap replications.
boot_t_stat – Bootstrapped t-statistics for the causal parameter(s) after calling fit() and bootstrap().
coef – Estimates for the causal parameter(s) after calling fit().
framework – The corresponding doubleml.DoubleMLFramework object.
in_sample_normalization – Indicates whether the in-sample normalization of weights is used.
learner – The machine learners for the nuisance functions.
learner_names – The names of the learners.
models – The fitted nuisance models.
n_folds – Number of folds.
n_rep – Number of repetitions for the sample splitting.
n_rep_boot – The number of bootstrap replications.
nuisance_loss – The losses of the nuisance models (root-mean-squared errors or log loss).
nuisance_targets – The outcome of the nuisance models.
params – The hyperparameters of the learners.
params_names – The names of the nuisance models with hyperparameters.
predictions – The predictions of the nuisance models in form of a dictionary.
psi – Values of the score function after calling fit(); for models (e.g., PLR, IRM, PLIV, IIVM) with linear score (in the parameter) \(\psi(W; \theta, \eta) = \psi_a(W; \eta) \theta + \psi_b(W; \eta)\).
psi_deriv – Values of the derivative of the score function with respect to the parameter \(\theta\) after calling fit(); for models (e.g., PLR, IRM, PLIV, IIVM) with linear score (in the parameter) \(\psi_a(W; \eta)\).
psi_elements – Values of the score function components after calling fit(); for models (e.g., PLR, IRM, PLIV, IIVM) with linear score (in the parameter) a dictionary with entries psi_a and psi_b for \(\psi_a(W; \eta)\) and \(\psi_b(W; \eta)\).
pval – p-values for the causal parameter(s) after calling fit().
score – The score function.
se – Standard errors for the causal parameter(s) after calling fit().
sensitivity_elements – Values of the sensitivity components after calling fit(); if available (e.g., PLR, IRM) a dictionary with entries sigma2, nu2, psi_sigma2, psi_nu2 and riesz_rep.
sensitivity_params – Values of the sensitivity parameters after calling sensitivity_analysis(); if available (e.g., PLR, IRM) a dictionary with entries theta, se, ci, rv and rva.
sensitivity_summary – Returns a summary for the sensitivity analysis after calling sensitivity_analysis().
smpls – The partition used for cross-fitting.
smpls_cluster – The partition of clusters used for cross-fitting.
summary – A summary for the estimated causal effect after calling fit().
t_stat – t-statistics for the causal parameter(s) after calling fit().
trimming_rule – Specifies the used trimming rule.
trimming_threshold – Specifies the used trimming threshold.
- DoubleMLDID.bootstrap(method='normal', n_rep_boot=500)#
Multiplier bootstrap for DoubleML models.
- DoubleMLDID.confint(joint=False, level=0.95)#
Confidence intervals for DoubleML models.
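A minimal usage sketch for these two methods (illustrative, reusing the fitted dml_did_obj from the example above; joint intervals require a preceding bootstrap() call):
>>> dml_did_obj.confint(level=0.95)                       # pointwise intervals
>>> dml_did_obj.bootstrap(method='normal', n_rep_boot=500)
>>> dml_did_obj.confint(joint=True)                       # joint intervals after bootstrap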
- DoubleMLDID.construct_framework()#
Construct a doubleml.DoubleMLFramework object. Can be used to construct e.g. confidence intervals.
- Returns:
doubleml_framework
- Return type:
doubleml.DoubleMLFramework
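A brief sketch (assuming a fitted model as above; the framework exposes a confint() method analogous to the model-level one):
>>> framework = dml_did_obj.construct_framework()
>>> framework.confint(level=0.95)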
- DoubleMLDID.draw_sample_splitting()#
Draw sample splitting for DoubleML models.
The samples are drawn according to the attributes n_folds and n_rep.
- Returns:
self
- Return type:
object
- DoubleMLDID.evaluate_learners(learners=None, metric=<function _rmse>)#
Evaluate fitted learners for DoubleML models on cross-validated predictions.
- Parameters:
learners (list) – A list of strings which correspond to the nuisance functions of the model.
metric (callable) – A callable function with inputs y_pred and y_true of shape (1, n), where n specifies the number of observations. Note that some models like IRM are not able to provide all values for y_true for all learners and might contain some nan values in the target vector. Default is the root-mean-square error.
- Returns:
dict – A dictionary containing the evaluated metric for each learner.
- Return type:
dict
Examples
>>> import numpy as np
>>> import doubleml as dml
>>> from sklearn.metrics import mean_absolute_error
>>> from doubleml.datasets import make_irm_data
>>> from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
>>> np.random.seed(3141)
>>> ml_g = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
>>> ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
>>> data = make_irm_data(theta=0.5, n_obs=500, dim_x=20, return_type='DataFrame')
>>> obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
>>> dml_irm_obj = dml.DoubleMLIRM(obj_dml_data, ml_g, ml_m)
>>> dml_irm_obj.fit()
>>> def mae(y_true, y_pred):
...     subset = np.logical_not(np.isnan(y_true))
...     return mean_absolute_error(y_true[subset], y_pred[subset])
>>> dml_irm_obj.evaluate_learners(metric=mae)
{'ml_g0': array([[0.85974356]]), 'ml_g1': array([[0.85280376]]), 'ml_m': array([[0.35365143]])}
- DoubleMLDID.fit(n_jobs_cv=None, store_predictions=True, external_predictions=None, store_models=False)#
Estimate DoubleML models.
- Parameters:
n_jobs_cv (None or int) – The number of CPUs to use to fit the learners. None means 1. Default is None.
store_predictions (bool) – Indicates whether the predictions for the nuisance functions should be stored in predictions. Default is True.
store_models (bool) – Indicates whether the fitted models for the nuisance functions should be stored in models. This allows analyzing the fitted models or extracting information like variable importance. Default is False.
external_predictions (None or dict) – If None, all models for the learners are fitted and evaluated. If a dictionary containing predictions for a specific learner is supplied, the model will use the supplied nuisance predictions instead. Has to be a nested dictionary where the outer keys refer to the treatment and the keys of the nested dictionaries refer to the corresponding learners. Default is None.
- Returns:
self
- Return type:
object
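An illustrative sketch of the nested external_predictions layout (not part of the original docstring); the learner keys 'ml_g0', 'ml_g1' and 'ml_m' follow the nuisance names in params_names, and the placeholder values are assumptions:
>>> n_obs = obj_dml_data.n_obs
>>> ext_preds = {'d': {'ml_g0': np.zeros(n_obs),     # placeholder outcome predictions
...                    'ml_g1': np.zeros(n_obs),
...                    'ml_m': np.full(n_obs, 0.5)}}  # placeholder propensities
>>> dml_did_obj.fit(external_predictions=ext_preds)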
- DoubleMLDID.get_params(learner)#
Get hyperparameters for the nuisance model of DoubleML models.
- DoubleMLDID.p_adjust(method='romano-wolf')#
Multiple testing adjustment for DoubleML models.
- Parameters:
method (str) – A str ('romano-wolf', 'bonferroni', 'holm', etc.) specifying the adjustment method. In addition to 'romano-wolf', all methods implemented in statsmodels.stats.multitest.multipletests() can be applied. Default is 'romano-wolf'.
- Returns:
p_val – A data frame with adjusted p-values.
- Return type:
pd.DataFrame
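A minimal sketch (assuming a fitted model; the default 'romano-wolf' method relies on the multiplier bootstrap, so bootstrap() should be called first):
>>> dml_did_obj.bootstrap(n_rep_boot=500)
>>> dml_did_obj.p_adjust(method='romano-wolf')
>>> dml_did_obj.p_adjust(method='bonferroni')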
- DoubleMLDID.sensitivity_analysis(cf_y=0.03, cf_d=0.03, rho=1.0, level=0.95, null_hypothesis=0.0)#
Performs a sensitivity analysis to account for unobserved confounders.
The evaluated scenario is stored as a dictionary in the property sensitivity_params.
- Parameters:
cf_y (float) – Percentage of the residual variation of the outcome explained by latent/confounding variables. Default is 0.03.
cf_d (float) – Percentage gains in the variation of the Riesz representer generated by latent/confounding variables. Default is 0.03.
rho (float) – The correlation between the differences in short and long representations in the main regression and Riesz representer. Has to be in [-1, 1]. The absolute value determines the adversarial strength of the confounding (maximized at 1.0). Default is 1.0.
level (float) – The confidence level. Default is 0.95.
null_hypothesis (float or numpy.ndarray) – Null hypothesis for the effect. Determines the robustness values. If it is a single float, the same null hypothesis is used for all estimated parameters. Otherwise, the array has to be of shape (n_coefs,). Default is 0.0.
- Returns:
self
- Return type:
object
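A short usage sketch (assuming a fitted model as above; the scenario values are illustrative):
>>> dml_did_obj.sensitivity_analysis(cf_y=0.05, cf_d=0.05, rho=1.0)
>>> print(dml_did_obj.sensitivity_summary)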
- DoubleMLDID.sensitivity_benchmark(benchmarking_set, fit_args=None)#
Computes a benchmark for a given set of features. Returns a DataFrame containing the corresponding values for cf_y, cf_d, rho and the change in estimates.
- Returns:
benchmark_results – Benchmark results.
- Return type:
pandas.DataFrame
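An illustrative sketch; 'Z1' is a hypothetical placeholder for a covariate name in the data backend:
>>> dml_did_obj.sensitivity_benchmark(benchmarking_set=['Z1'])  # 'Z1' is a placeholder column name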
- DoubleMLDID.sensitivity_plot(idx_treatment=0, value='theta', rho=1.0, level=0.95, null_hypothesis=0.0, include_scenario=True, benchmarks=None, fill=True, grid_bounds=(0.15, 0.15), grid_size=100)#
Contour plot of the sensitivity with respect to latent/confounding variables.
- Parameters:
idx_treatment (int) – Index of the treatment to perform the sensitivity analysis. Default is 0.
value (str) – Determines which contours to plot. Valid values are 'theta' (refers to the bounds) and 'ci' (refers to the bounds including statistical uncertainty). Default is 'theta'.
rho (float) – The correlation between the differences in short and long representations in the main regression and Riesz representer. Has to be in [-1, 1]. The absolute value determines the adversarial strength of the confounding (maximized at 1.0). Default is 1.0.
level (float) – The confidence level. Default is 0.95.
null_hypothesis (float) – Null hypothesis for the effect. Determines the direction of the contour lines.
include_scenario (bool) – Indicates whether to highlight the scenario from the call of sensitivity_analysis(). Default is True.
benchmarks (dict or None) – Dictionary of benchmarks to be included in the plot. The keys are cf_y, cf_d and name. Default is None.
fill (bool) – Indicates whether to use a heatmap style or only contour lines. Default is True.
grid_bounds (tuple) – Determines the evaluation bounds of the grid for cf_d and cf_y. Has to contain two floats in [0, 1). Default is (0.15, 0.15).
grid_size (int) – Determines the number of evaluation points of the grid. Default is 100.
- Returns:
fig – Plotly figure of the sensitivity contours.
- Return type:
object
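A usage sketch (assuming sensitivity_analysis() has been called; the benchmark entries are illustrative):
>>> benchmarks = {'cf_y': [0.1], 'cf_d': [0.1], 'name': ['hypothetical benchmark']}
>>> fig = dml_did_obj.sensitivity_plot(value='theta', benchmarks=benchmarks)
>>> fig.show()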
- DoubleMLDID.set_ml_nuisance_params(learner, treat_var, params)#
Set hyperparameters for the nuisance models of DoubleML models.
- Parameters:
learner (str) – The nuisance model / learner (see attribute params_names).
treat_var (str) – The treatment variable (hyperparameters can be set treatment-variable specific).
params (dict or list) – A dict with estimator parameters (used for all folds) or a nested list with fold-specific parameters. The outer list needs to be of length n_rep and the inner list of length n_folds.
- Returns:
self
- Return type:
object
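A brief sketch; the learner name 'ml_g0' follows the naming in params_names, and the hyperparameters are illustrative:
>>> dml_did_obj.set_ml_nuisance_params('ml_g0', 'd', {'n_estimators': 200, 'max_depth': 5})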
- DoubleMLDID.set_sample_splitting(all_smpls, all_smpls_cluster=None)#
Set the sample splitting for DoubleML models.
The attributes n_folds and n_rep are derived from the provided partition.
- Parameters:
all_smpls (list or tuple) – The sample splitting:
- If nested list of lists of tuples: The outer list needs to provide an entry per repeated sample splitting (length of list is set as n_rep). The inner list needs to provide a tuple (train_ind, test_ind) per fold (length of list is set as n_folds). test_ind must form a partition for each inner list.
- If list of tuples: The list needs to provide a tuple (train_ind, test_ind) per fold (length of list is set as n_folds). test_ind must form a partition. n_rep=1 is always set.
- If tuple: Must be a tuple with two elements train_ind and test_ind. The only viable option is to set train_ind and test_ind to np.arange(n_obs), which corresponds to no sample splitting. n_folds=1 and n_rep=1 is always set.
all_smpls_cluster (list or None) – Nested list or None. The first level of nesting corresponds to the number of repetitions. The second level of nesting corresponds to the number of folds. The third level of nesting contains a tuple of training and testing lists. Both training and testing contain an array for each cluster variable, which form a partition of the clusters. Default is None.
- Returns:
self
- Return type:
object
Examples
>>> import numpy as np
>>> import doubleml as dml
>>> from doubleml.datasets import make_plr_CCDDHNR2018
>>> from sklearn.ensemble import RandomForestRegressor
>>> from sklearn.base import clone
>>> np.random.seed(3141)
>>> learner = RandomForestRegressor(max_depth=2, n_estimators=10)
>>> ml_g = learner
>>> ml_m = learner
>>> obj_dml_data = make_plr_CCDDHNR2018(n_obs=10, alpha=0.5)
>>> dml_plr_obj = dml.DoubleMLPLR(obj_dml_data, ml_g, ml_m)
>>> # simple sample splitting with two folds and without cross-fitting
>>> smpls = ([0, 1, 2, 3, 4], [5, 6, 7, 8, 9])
>>> dml_plr_obj.set_sample_splitting(smpls)
>>> # sample splitting with two folds and cross-fitting
>>> smpls = [([0, 1, 2, 3, 4], [5, 6, 7, 8, 9]),
...          ([5, 6, 7, 8, 9], [0, 1, 2, 3, 4])]
>>> dml_plr_obj.set_sample_splitting(smpls)
>>> # sample splitting with two folds and repeated cross-fitting with n_rep = 2
>>> smpls = [[([0, 1, 2, 3, 4], [5, 6, 7, 8, 9]),
...           ([5, 6, 7, 8, 9], [0, 1, 2, 3, 4])],
...          [([0, 2, 4, 6, 8], [1, 3, 5, 7, 9]),
...           ([1, 3, 5, 7, 9], [0, 2, 4, 6, 8])]]
>>> dml_plr_obj.set_sample_splitting(smpls)
- DoubleMLDID.tune(param_grids, tune_on_folds=False, scoring_methods=None, n_folds_tune=5, search_mode='grid_search', n_iter_randomized_search=100, n_jobs_cv=None, set_as_params=True, return_tune_res=False)#
Hyperparameter-tuning for DoubleML models.
The hyperparameter tuning is performed using either an exhaustive search over specified parameter values implemented in sklearn.model_selection.GridSearchCV or via a randomized search implemented in sklearn.model_selection.RandomizedSearchCV.
- Parameters:
param_grids (dict) – A dict with a parameter grid for each nuisance model / learner (see attribute learner_names).
tune_on_folds (bool) – Indicates whether the tuning should be done fold-specific or globally. Default is False.
scoring_methods (None or dict) – The scoring method used to evaluate the predictions. The scoring method must be set per nuisance model via a dict (see attribute learner_names for the keys). If None, the estimator's score method is used. Default is None.
n_folds_tune (int) – Number of folds used for tuning. Default is 5.
search_mode (str) – A str ('grid_search' or 'randomized_search') specifying whether hyperparameters are optimized via sklearn.model_selection.GridSearchCV or sklearn.model_selection.RandomizedSearchCV. Default is 'grid_search'.
n_iter_randomized_search (int) – If search_mode == 'randomized_search', the number of parameter settings that are sampled. Default is 100.
n_jobs_cv (None or int) – The number of CPUs to use to tune the learners. None means 1. Default is None.
set_as_params (bool) – Indicates whether the hyperparameters should be set in order to be used when fit() is called. Default is True.
return_tune_res (bool) – Indicates whether detailed tuning results should be returned. Default is False.
- Returns:
self (object) – Returned if return_tune_res is False.
tune_res (list) – A list containing detailed tuning results and the proposed hyperparameters. Returned if return_tune_res is True.
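A sketch of a tuning call (the grids are illustrative; the keys 'ml_g' and 'ml_m' are assumed to match learner_names):
>>> param_grids = {'ml_g': {'n_estimators': [50, 100], 'max_depth': [3, 5]},
...                'ml_m': {'n_estimators': [50, 100], 'max_depth': [3, 5]}}
>>> dml_did_obj.tune(param_grids, search_mode='grid_search')
>>> dml_did_obj.fit().summary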