doubleml.DoubleMLDIDCS

class doubleml.DoubleMLDIDCS(obj_dml_data, ml_g, ml_m=None, n_folds=5, n_rep=1, score='observational', in_sample_normalization=True, dml_procedure='dml2', trimming_rule='truncate', trimming_threshold=0.01, draw_sample_splitting=True, apply_cross_fitting=True)

Double machine learning for difference-in-differences with repeated cross-sections.

Parameters:
  • obj_dml_data (DoubleMLData object) – The DoubleMLData object providing the data and specifying the variables for the causal model.

  • ml_g (estimator implementing fit() and predict()) – A machine learner implementing fit() and predict() methods (e.g. sklearn.ensemble.RandomForestRegressor) for the nuisance function \(g_0(d,t,X) = E[Y|D=d,T=t,X]\). For a binary outcome variable \(Y\) (with values 0 and 1), a classifier implementing fit() and predict_proba() can also be specified. If sklearn.base.is_classifier() returns True, predict_proba() is used; otherwise predict() is used.

  • ml_m (classifier implementing fit() and predict_proba()) – A machine learner implementing fit() and predict_proba() methods (e.g. sklearn.ensemble.RandomForestClassifier) for the nuisance function \(m_0(X) = P(D=1|X)\). Only relevant for score='observational'.

  • n_folds (int) – Number of folds. Default is 5.

  • n_rep (int) – Number of repetitions for the sample splitting. Default is 1.

  • score (str) – A str ('observational' or 'experimental') specifying the score function. The 'experimental' score refers to an A/B setting, where the treatment is independent of the pretreatment covariates. Default is 'observational'.

  • in_sample_normalization (bool) – Indicates whether to use a slightly different normalization from Sant’Anna and Zhao (2020). Default is True.

  • dml_procedure (str) – A str ('dml1' or 'dml2') specifying the double machine learning algorithm. Default is 'dml2'.

  • trimming_rule (str) – A str ('truncate' is the only choice) specifying the trimming approach. Default is 'truncate'.

  • trimming_threshold (float) – The threshold used for trimming. Default is 1e-2.

  • draw_sample_splitting (bool) – Indicates whether the sample splitting should be drawn during initialization of the object. Default is True.

  • apply_cross_fitting (bool) – Indicates whether cross-fitting should be applied. Default is True.
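As a rough illustration of what trimming_rule='truncate' with trimming_threshold=0.01 means, the sketch below clips predicted propensity scores into [threshold, 1 - threshold]. It assumes that truncation is a symmetric clip of this kind; the helper name truncate_propensities is hypothetical and not part of the DoubleML API.

```python
def truncate_propensities(preds, trimming_threshold=0.01):
    """Clip each predicted propensity score into [threshold, 1 - threshold].

    Assumption: 'truncate' is a symmetric clip; this is an illustrative
    helper, not a DoubleML function.
    """
    if not 0.0 < trimming_threshold < 0.5:
        raise ValueError("trimming_threshold must lie in (0, 0.5)")
    lower, upper = trimming_threshold, 1.0 - trimming_threshold
    return [min(max(p, lower), upper) for p in preds]


# Hypothetical propensity predictions, two of them extreme.
m_hat = [0.0004, 0.25, 0.5, 0.9999]
trimmed = truncate_propensities(m_hat)
print(trimmed)
```

Clipping keeps the inverse-propensity weights in the 'observational' score bounded, at the cost of a small bias for observations with extreme predicted propensities.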

Examples

>>> import numpy as np
>>> import doubleml as dml
>>> from doubleml.datasets import make_did_SZ2020
>>> from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
>>> np.random.seed(42)
>>> ml_g = RandomForestRegressor(n_estimators=100, max_depth=5, min_samples_leaf=5)
>>> ml_m = RandomForestClassifier(n_estimators=100, max_depth=5, min_samples_leaf=5)
>>> data = make_did_SZ2020(n_obs=500, cross_sectional_data=True, return_type='DataFrame')
>>> obj_dml_data = dml.DoubleMLData(data, 'y', 'd', t_col='t')
>>> dml_did_obj = dml.DoubleMLDIDCS(obj_dml_data, ml_g, ml_m)
>>> dml_did_obj.fit().summary
       coef   std err         t     P>|t|      2.5 %     97.5 %
d -6.604603  8.725802 -0.756905  0.449107 -23.706862  10.497655
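The columns of the summary row are linked to each other. The following sketch recomputes the t-statistic and the p-value from the printed coef and std err, assuming a two-sided test based on the standard normal distribution (an assumption that is consistent with the printed numbers).

```python
from statistics import NormalDist

# Values copied from the summary row above.
coef, se = -6.604603, 8.725802

# t-statistic: estimate divided by its standard error.
t_stat = coef / se

# Two-sided p-value under a standard normal approximation (assumption).
p_value = 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))

print(f"t = {t_stat:.6f}, p = {p_value:.6f}")
```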

Methods

bootstrap([method, n_rep_boot])

Multiplier bootstrap for DoubleML models.

confint([joint, level])

Confidence intervals for DoubleML models.

draw_sample_splitting()

Draw sample splitting for DoubleML models.

evaluate_learners([learners, metric])

Evaluate fitted learners for DoubleML models on cross-validated predictions.

fit([n_jobs_cv, store_predictions, ...])

Estimate DoubleML models.

get_params(learner)

Get hyperparameters for the nuisance model of DoubleML models.

p_adjust([method])

Multiple testing adjustment for DoubleML models.

sensitivity_analysis([cf_y, cf_d, rho, ...])

Performs a sensitivity analysis to account for unobserved confounders.

sensitivity_benchmark(benchmarking_set)

Computes a benchmark for a given set of features.

sensitivity_plot([idx_treatment, value, ...])

Contour plot of the sensitivity with respect to latent/confounding variables.

set_ml_nuisance_params(learner, treat_var, ...)

Set hyperparameters for the nuisance models of DoubleML models.

set_sample_splitting(all_smpls)

Set the sample splitting for DoubleML models.

tune(param_grids[, tune_on_folds, ...])

Hyperparameter-tuning for DoubleML models.

Attributes

all_coef

Estimates of the causal parameter(s) for the n_rep different sample splits after calling fit().

all_dml1_coef

Estimates of the causal parameter(s) for the n_rep x n_folds different folds after calling fit() with dml_procedure='dml1'.

all_se

Standard errors of the causal parameter(s) for the n_rep different sample splits after calling fit().

apply_cross_fitting

Indicates whether cross-fitting should be applied.

boot_coef

Bootstrapped coefficients for the causal parameter(s) after calling fit() and bootstrap().

boot_method

The method to construct the bootstrap replications.

boot_t_stat

Bootstrapped t-statistics for the causal parameter(s) after calling fit() and bootstrap().

coef

Estimates for the causal parameter(s) after calling fit().

dml_procedure

The double machine learning algorithm.

in_sample_normalization

Indicates whether the in-sample normalization of weights is used.

learner

The machine learners for the nuisance functions.

learner_names

The names of the learners.

models

The fitted nuisance models.

n_folds

Number of folds.

n_rep

Number of repetitions for the sample splitting.

n_rep_boot

The number of bootstrap replications.

nuisance_targets

The outcome of the nuisance models.

params

The hyperparameters of the learners.

params_names

The names of the nuisance models with hyperparameters.

predictions

The predictions of the nuisance models in the form of a dictionary.

psi

Values of the score function after calling fit(); For models (e.g., PLR, IRM, PLIV, IIVM) with linear score (in the parameter) \(\psi(W; \theta, \eta) = \psi_a(W; \eta) \theta + \psi_b(W; \eta)\).

psi_deriv

Values of the derivative of the score function with respect to the parameter \(\theta\) after calling fit(); For models (e.g., PLR, IRM, PLIV, IIVM) with linear score (in the parameter) \(\psi_a(W; \eta)\).

psi_elements

Values of the score function components after calling fit(); For models (e.g., PLR, IRM, PLIV, IIVM) with linear score (in the parameter) a dictionary with entries psi_a and psi_b for \(\psi_a(W; \eta)\) and \(\psi_b(W; \eta)\).

pval

p-values for the causal parameter(s) after calling fit().

rmses

The root-mean-squared-errors of the nuisance models.

score

The score function.

se

Standard errors for the causal parameter(s) after calling fit().

sensitivity_elements

Values of the sensitivity components after calling fit(); If available (e.g., PLR, IRM) a dictionary with entries sigma2, nu2, psi_sigma2 and psi_nu2.

sensitivity_params

Values of the sensitivity parameters after calling sensitivity_analysis(); If available (e.g., PLR, IRM) a dictionary with entries theta, se, ci, rv and rva.

sensitivity_summary

Returns a summary for the sensitivity analysis after calling sensitivity_analysis().

smpls

The partition used for cross-fitting.

smpls_cluster

The partition of clusters used for cross-fitting.

summary

A summary for the estimated causal effect after calling fit().

t_stat

t-statistics for the causal parameter(s) after calling fit().

trimming_rule

Specifies the used trimming rule.

trimming_threshold

Specifies the used trimming threshold.
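The attributes psi_elements, all_dml1_coef and dml_procedure can be related through the linear score \(\psi(W; \theta, \eta) = \psi_a(W; \eta) \theta + \psi_b(W; \eta)\). The sketch below solves the empirical moment condition for \(\theta\) in the two standard ways: per fold and then averaged ('dml1'), or once on the pooled scores ('dml2'). The numbers and the helper solve_linear_score are hypothetical and only illustrate the arithmetic.

```python
from statistics import mean


def solve_linear_score(psi_a, psi_b):
    """Solve mean(psi_a) * theta + mean(psi_b) = 0 for theta."""
    return -mean(psi_b) / mean(psi_a)


# Hypothetical score components for two folds (illustrative numbers only).
folds = [
    {"psi_a": [-1.0, -1.0, -1.0], "psi_b": [0.4, 0.6, 0.5]},
    {"psi_a": [-2.0, -2.0, -2.0], "psi_b": [1.4, 0.6, 1.6]},
]

# 'dml1': solve the moment condition in each fold, then average the estimates.
theta_dml1 = mean(solve_linear_score(f["psi_a"], f["psi_b"]) for f in folds)

# 'dml2': pool the score components over all folds and solve once.
pooled_a = [v for f in folds for v in f["psi_a"]]
pooled_b = [v for f in folds for v in f["psi_b"]]
theta_dml2 = solve_linear_score(pooled_a, pooled_b)

print(theta_dml1, theta_dml2)
```

The two estimates coincide when mean(psi_a) is the same in every fold and folds have equal size; otherwise they generally differ, as in this toy input.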

DoubleMLDIDCS.bootstrap(method='normal', n_rep_boot=500)

Multiplier bootstrap for DoubleML models.

Parameters:
  • method (str) – A str ('Bayes', 'normal' or 'wild') specifying the multiplier bootstrap method. Default is 'normal'.

  • n_rep_boot (int) – The number of bootstrap replications.

Returns:

self

Return type:

object
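A multiplier bootstrap perturbs the estimated score with random weights that have mean zero and unit variance. The sketch below shows one plausible way to draw such weights for the three method names. The specific distributions (centered exponential for 'Bayes', standard normal for 'normal', a two-normal combination for 'wild') are assumptions made for illustration, and draw_multipliers is a hypothetical helper rather than a DoubleML function.

```python
import random
from statistics import mean, pvariance


def draw_multipliers(method, n_obs, rng):
    """Draw n_obs multiplier weights with mean 0 and variance 1 (assumed forms)."""
    if method == "normal":
        return [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    if method == "Bayes":
        # Centered standard exponential draws.
        return [rng.expovariate(1.0) - 1.0 for _ in range(n_obs)]
    if method == "wild":
        # Combination of two independent standard normal draws.
        return [
            rng.gauss(0.0, 1.0) / 2 ** 0.5 + (rng.gauss(0.0, 1.0) ** 2 - 1.0) / 2.0
            for _ in range(n_obs)
        ]
    raise ValueError(f"unknown bootstrap method: {method!r}")


rng = random.Random(42)
moments = {}
for method in ("Bayes", "normal", "wild"):
    weights = draw_multipliers(method, 20000, rng)
    moments[method] = (mean(weights), pvariance(weights))
    print(method, round(moments[method][0], 3), round(moments[method][1], 3))
```

All three choices share the first two moments and differ only in skewness and tail behavior.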

DoubleMLDIDCS.confint(joint=False, level=0.95)

Confidence intervals for DoubleML models.

Parameters:
  • joint (bool) – Indicates whether joint confidence intervals are computed. Default is False.

  • level (float) – The confidence level. Default is 0.95.

Returns:

df_ci – A data frame with the confidence interval(s).

Return type:

pd.DataFrame
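For joint=False, the interval can be reproduced by hand. The sketch below assumes a pointwise interval of the form coef ± z · se with a standard normal quantile z, and checks it against the coef and std err printed in the class-level example (2.5 % and 97.5 % columns). The helper normal_confint is hypothetical.

```python
from statistics import NormalDist


def normal_confint(coef, se, level=0.95):
    """Pointwise interval coef +/- z * se using a standard normal quantile."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie in (0, 1)")
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return coef - z * se, coef + z * se


# coef and std err from the summary table of the class-level example.
lower, upper = normal_confint(-6.604603, 8.725802, level=0.95)
print(f"[{lower:.6f}, {upper:.6f}]")
```

Joint intervals (joint=True) replace the normal quantile by a critical value obtained from the multiplier bootstrap, which this sketch does not cover.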

DoubleMLDIDCS.draw_sample_splitting()

Draw sample splitting for DoubleML models.

The samples are drawn according to the attributes n_folds, n_rep and apply_cross_fitting.

Returns:

self

Return type:

object
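To make the roles of n_folds and n_rep concrete, the following sketch draws a repeated K-fold partition in plain Python and returns it in the nested form (one list per repetition, one (train_ind, test_ind) tuple per fold). It is a simplified stand-in written for illustration, not the library's implementation, and draw_kfold_splits is a hypothetical name.

```python
import random


def draw_kfold_splits(n_obs, n_folds=5, n_rep=1, seed=None):
    """Return n_rep random K-fold partitions as nested lists of (train, test)."""
    rng = random.Random(seed)
    all_smpls = []
    for _ in range(n_rep):
        indices = list(range(n_obs))
        rng.shuffle(indices)
        # Strided slices give folds whose sizes differ by at most one.
        test_folds = [sorted(indices[k::n_folds]) for k in range(n_folds)]
        smpls = []
        for test_ind in test_folds:
            held_out = set(test_ind)
            train_ind = [i for i in range(n_obs) if i not in held_out]
            smpls.append((train_ind, test_ind))
        all_smpls.append(smpls)
    return all_smpls


all_smpls = draw_kfold_splits(n_obs=10, n_folds=5, n_rep=2, seed=42)
for rep, smpls in enumerate(all_smpls):
    print(rep, [test_ind for _, test_ind in smpls])
```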

DoubleMLDIDCS.evaluate_learners(learners=None, metric=<function _rmse>)

Evaluate fitted learners for DoubleML models on cross-validated predictions.

Parameters:
  • learners (list) – A list of strings which correspond to the nuisance functions of the model.

  • metric (callable) – A callable function with inputs y_pred and y_true of shape (1, n), where n specifies the number of observations. Note that some models like IRM are not able to provide all values for y_true for all learners, so the target vector might contain some nan values. Default is the root-mean-square error.

Returns:

dist – A dictionary containing the evaluated metric for each learner.

Return type:

dict

Examples

>>> import numpy as np
>>> import doubleml as dml
>>> from sklearn.metrics import mean_absolute_error
>>> from doubleml.datasets import make_irm_data
>>> from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
>>> np.random.seed(3141)
>>> ml_g = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
>>> ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
>>> data = make_irm_data(theta=0.5, n_obs=500, dim_x=20, return_type='DataFrame')
>>> obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
>>> dml_irm_obj = dml.DoubleMLIRM(obj_dml_data, ml_g, ml_m)
>>> dml_irm_obj.fit()
>>> def mae(y_true, y_pred):
...     subset = np.logical_not(np.isnan(y_true))
...     return mean_absolute_error(y_true[subset], y_pred[subset])
>>> dml_irm_obj.evaluate_learners(metric=mae)
{'ml_g0': array([[0.85974356]]),
 'ml_g1': array([[0.85280376]]),
 'ml_m': array([[0.35365143]])}
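The nan handling in the custom metric above matters because some targets are not observed for every learner. The plain-Python sketch below shows the same masking idea for a root-mean-square error on ordinary lists. It is an illustration of the computation only: rmse_ignore_nan is a hypothetical helper and is not a drop-in value for the metric argument, which receives arrays.

```python
import math


def rmse_ignore_nan(y_true, y_pred):
    """Root-mean-square error over the positions where y_true is observed."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if not math.isnan(t)]
    if not pairs:
        raise ValueError("y_true contains no observed values")
    return math.sqrt(sum((t - p) ** 2 for t, p in pairs) / len(pairs))


# Hypothetical targets with one missing entry.
y_true = [1.0, float("nan"), 3.0, 5.0]
y_pred = [1.0, 2.0, 5.0, 3.0]
score = rmse_ignore_nan(y_true, y_pred)
print(score)
```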

DoubleMLDIDCS.fit(n_jobs_cv=None, store_predictions=True, external_predictions=None, store_models=False)

Estimate DoubleML models.

Parameters:
  • n_jobs_cv (None or int) – The number of CPUs to use to fit the learners. None means 1. Default is None.

  • store_predictions (bool) – Indicates whether the predictions for the nuisance functions should be stored in predictions. Default is True.

  • store_models (bool) – Indicates whether the fitted models for the nuisance functions should be stored in models. This allows the fitted models to be analyzed or information such as variable importance to be extracted. Default is False.

  • external_predictions (None or dict) – If None all models for the learners are fitted and evaluated. If a dictionary containing predictions for a specific learner is supplied, the model will use the supplied nuisance predictions instead. Has to be a nested dictionary where the keys refer to the treatment and the keys of the nested dictionaries refer to the corresponding learners. Default is None.

Returns:

self

Return type:

object

DoubleMLDIDCS.get_params(learner)

Get hyperparameters for the nuisance model of DoubleML models.

Parameters:

learner (str) – The nuisance model / learner (see attribute params_names).

Returns:

params – Parameters for the nuisance model / learner.

Return type:

dict

DoubleMLDIDCS.p_adjust(method='romano-wolf')

Multiple testing adjustment for DoubleML models.

Parameters:

method (str) – A str ('romano-wolf', 'bonferroni', 'holm', etc) specifying the adjustment method. In addition to 'romano-wolf', all methods implemented in statsmodels.stats.multitest.multipletests() can be applied. Default is 'romano-wolf'.

Returns:

p_val – A data frame with adjusted p-values.

Return type:

pd.DataFrame
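Two of the classical adjustments named above are simple enough to write out. The sketch below implements the textbook Bonferroni and Holm corrections in plain Python to show what an adjusted p-value is; it is not the statsmodels or DoubleML code path, and the input p-values are made up.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm(p_values):
    """Holm step-down adjustment, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is scaled by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity of the adjusted values.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


p_raw = [0.01, 0.04, 0.03]  # hypothetical unadjusted p-values
p_bonf = bonferroni(p_raw)
p_holm = holm(p_raw)
print(p_bonf, p_holm)
```

Holm is never more conservative than Bonferroni, which is visible in the second and third entries of this toy input.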

DoubleMLDIDCS.sensitivity_analysis(cf_y=0.03, cf_d=0.03, rho=1.0, level=0.95, null_hypothesis=0.0)

Performs a sensitivity analysis to account for unobserved confounders.

The evaluated scenario is stored as a dictionary in the property sensitivity_params.

Parameters:
  • cf_y (float) – Percentage of the residual variation of the outcome explained by latent/confounding variables. Default is 0.03.

  • cf_d (float) – Percentage gains in the variation of the Riesz representer generated by latent/confounding variables. Default is 0.03.

  • rho (float) – The correlation between the differences in short and long representations in the main regression and Riesz representer. Has to be in [-1,1]. The absolute value determines the adversarial strength of the confounding (maximizes at 1.0). Default is 1.0.

  • level (float) – The confidence level. Default is 0.95.

  • null_hypothesis (float or numpy.ndarray) – Null hypothesis for the effect. Determines the robustness values. If it is a single float uses the same null hypothesis for all estimated parameters. Else the array has to be of shape (n_coefs,). Default is 0.0.

Returns:

self

Return type:

object
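The way cf_y, cf_d and rho combine into bounds on the estimate can be sketched as follows. The formula used here, a maximal bias of |rho| · sqrt(sigma2 · nu2) · sqrt(cf_y · cf_d / (1 - cf_d)), is an assumption taken from the omitted-variable-bias literature on which this kind of analysis is based; sigma2 and nu2 correspond to the entries of sensitivity_elements, and all numeric values and the helper name sensitivity_bounds are hypothetical.

```python
import math


def sensitivity_bounds(theta, sigma2, nu2, cf_y=0.03, cf_d=0.03, rho=1.0):
    """Lower and upper bound on theta under the assumed bias formula."""
    if not (0.0 <= cf_y < 1.0 and 0.0 <= cf_d < 1.0):
        raise ValueError("cf_y and cf_d must lie in [0, 1)")
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    # Strength of confounding implied by the scenario (assumed form).
    confounding_strength = abs(rho) * math.sqrt(cf_y * cf_d / (1.0 - cf_d))
    max_bias = math.sqrt(sigma2 * nu2) * confounding_strength
    return theta - max_bias, theta + max_bias


# Hypothetical estimate and sensitivity elements.
theta_lower, theta_upper = sensitivity_bounds(theta=0.5, sigma2=1.0, nu2=4.0)
print(theta_lower, theta_upper)
```

Setting rho=0.0 collapses the bounds to the point estimate, while |rho| = 1.0 gives the most adversarial scenario for fixed cf_y and cf_d.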

DoubleMLDIDCS.sensitivity_benchmark(benchmarking_set)

Computes a benchmark for a given set of features. Returns a DataFrame containing the corresponding values for cf_y, cf_d, rho and the change in estimates.

Returns:

benchmark_results – Benchmark results.

Return type:

pandas.DataFrame

DoubleMLDIDCS.sensitivity_plot(idx_treatment=0, value='theta', include_scenario=True, benchmarks=None, fill=True, grid_bounds=(0.15, 0.15), grid_size=100)

Contour plot of the sensitivity with respect to latent/confounding variables.

Parameters:
  • idx_treatment (int) – Index of the treatment to perform the sensitivity analysis. Default is 0.

  • value (str) – Determines which contours to plot. Valid values are 'theta' (refers to the bounds) and 'ci' (refers to the bounds including statistical uncertainty). Default is 'theta'.

  • include_scenario (bool) – Indicates whether to highlight the scenario from the call of sensitivity_analysis(). Default is True.

  • benchmarks (dict or None) – Dictionary of benchmarks to be included in the plot. The keys are cf_y, cf_d and name. Default is None.

  • fill (bool) – Indicates whether to use a heatmap style or only contour lines. Default is True.

  • grid_bounds (tuple) – Determines the evaluation bounds of the grid for cf_d and cf_y. Has to contain two floats in [0, 1). Default is (0.15, 0.15).

  • grid_size (int) – Determines the number of evaluation points of the grid. Default is 100.

Returns:

fig – Plotly figure of the sensitivity contours.

Return type:

object

DoubleMLDIDCS.set_ml_nuisance_params(learner, treat_var, params)

Set hyperparameters for the nuisance models of DoubleML models.

Parameters:
  • learner (str) – The nuisance model / learner (see attribute params_names).

  • treat_var (str) – The treatment variable (hyperparameters can be set treatment-variable specific).

  • params (dict or list) – A dict with estimator parameters (used for all folds) or a nested list with fold specific parameters. The outer list needs to be of length n_rep and the inner list of length n_folds.

Returns:

self

Return type:

object
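The two accepted shapes of params can be illustrated with a small helper that expands a single dict into the nested list form and validates a nested list against n_rep and n_folds. The helper expand_params is hypothetical and only mirrors the shape rule stated above; the hyperparameter names are examples.

```python
def expand_params(params, n_rep, n_folds):
    """Return params as a nested list of shape n_rep x n_folds."""
    if isinstance(params, dict):
        # One dict: reuse the same settings for every repetition and fold.
        return [[dict(params) for _ in range(n_folds)] for _ in range(n_rep)]
    if len(params) != n_rep or any(len(inner) != n_folds for inner in params):
        raise ValueError("outer list must have length n_rep, inner lists n_folds")
    return [[dict(p) for p in inner] for inner in params]


# Same settings everywhere.
shared = expand_params({"max_depth": 5}, n_rep=2, n_folds=3)

# Fold-specific settings: here the depth grows with the fold index.
fold_specific = expand_params(
    [[{"max_depth": 2 + k} for k in range(3)] for _ in range(2)],
    n_rep=2,
    n_folds=3,
)
print(shared[0][0], fold_specific[1][2])
```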

DoubleMLDIDCS.set_sample_splitting(all_smpls)

Set the sample splitting for DoubleML models.

The attributes n_folds and n_rep are derived from the provided partition.

Parameters:

all_smpls (list or tuple) –

If nested list of lists of tuples:

The outer list needs to provide an entry per repeated sample splitting (length of list is set as n_rep). The inner list needs to provide a tuple (train_ind, test_ind) per fold (length of list is set as n_folds). If tuples for more than one fold are provided, it must form a partition and apply_cross_fitting is set to True. Otherwise apply_cross_fitting is set to False and n_folds=2.

If list of tuples:

The list needs to provide a tuple (train_ind, test_ind) per fold (length of list is set as n_folds). If tuples for more than one fold are provided, it must form a partition and apply_cross_fitting is set to True. Otherwise apply_cross_fitting is set to False and n_folds=2. n_rep=1 is always set.

If tuple:

Must be a tuple with two elements train_ind and test_ind. No sample splitting is achieved if train_ind and test_ind both cover all observations, i.e. are range(n_obs) with n_obs the number of observations. Otherwise n_folds=2. apply_cross_fitting=False and n_rep=1 are always set.

Returns:

self

Return type:

object

Examples

>>> import numpy as np
>>> import doubleml as dml
>>> from doubleml.datasets import make_plr_CCDDHNR2018
>>> from sklearn.ensemble import RandomForestRegressor
>>> from sklearn.base import clone
>>> np.random.seed(3141)
>>> learner = RandomForestRegressor(max_depth=2, n_estimators=10)
>>> ml_g = learner
>>> ml_m = learner
>>> obj_dml_data = make_plr_CCDDHNR2018(n_obs=10, alpha=0.5)
>>> dml_plr_obj = dml.DoubleMLPLR(obj_dml_data, ml_g, ml_m)
>>> # simple sample splitting with two folds and without cross-fitting
>>> smpls = ([0, 1, 2, 3, 4], [5, 6, 7, 8, 9])
>>> dml_plr_obj.set_sample_splitting(smpls)
>>> # sample splitting with two folds and cross-fitting
>>> smpls = [([0, 1, 2, 3, 4], [5, 6, 7, 8, 9]),
...          ([5, 6, 7, 8, 9], [0, 1, 2, 3, 4])]
>>> dml_plr_obj.set_sample_splitting(smpls)
>>> # sample splitting with two folds and repeated cross-fitting with n_rep = 2
>>> smpls = [[([0, 1, 2, 3, 4], [5, 6, 7, 8, 9]),
...           ([5, 6, 7, 8, 9], [0, 1, 2, 3, 4])],
...          [([0, 2, 4, 6, 8], [1, 3, 5, 7, 9]),
...           ([1, 3, 5, 7, 9], [0, 2, 4, 6, 8])]]
>>> dml_plr_obj.set_sample_splitting(smpls)
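The requirement that tuples for more than one fold "must form a partition" can be checked before passing a hand-built split. The sketch below is a plain-Python check using the index lists from the example above. It additionally assumes that each training set is the complement of its test set, and forms_partition is a hypothetical helper, not part of DoubleML.

```python
def forms_partition(smpls, n_obs):
    """Check that the test sets split range(n_obs) exactly once and that
    every training set is the complement of its test set (assumption)."""
    everyone = set(range(n_obs))
    seen = []
    for train_ind, test_ind in smpls:
        train, test = set(train_ind), set(test_ind)
        if train | test != everyone or train & test:
            return False
        seen.extend(test_ind)
    return sorted(seen) == sorted(everyone)


cross_fit = [([0, 1, 2, 3, 4], [5, 6, 7, 8, 9]),
             ([5, 6, 7, 8, 9], [0, 1, 2, 3, 4])]
single_split = [([0, 1, 2, 3, 4], [5, 6, 7, 8, 9])]

print(forms_partition(cross_fit, n_obs=10))
print(forms_partition(single_split, n_obs=10))
```

The single train/test split fails the check because only half of the observations ever appear in a test set, which corresponds to the case without cross-fitting.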

DoubleMLDIDCS.tune(param_grids, tune_on_folds=False, scoring_methods=None, n_folds_tune=5, search_mode='grid_search', n_iter_randomized_search=100, n_jobs_cv=None, set_as_params=True, return_tune_res=False)

Hyperparameter-tuning for DoubleML models.

The hyperparameter-tuning is performed using either an exhaustive search over specified parameter values implemented in sklearn.model_selection.GridSearchCV or via a randomized search implemented in sklearn.model_selection.RandomizedSearchCV.

Parameters:
  • param_grids (dict) – A dict with a parameter grid for each nuisance model / learner (see attribute learner_names).

  • tune_on_folds (bool) – Indicates whether the tuning should be done fold-specific or globally. Default is False.

  • scoring_methods (None or dict) – The scoring method used to evaluate the predictions. The scoring method must be set per nuisance model via a dict (see attribute learner_names for the keys). If None, the estimator’s score method is used. Default is None.

  • n_folds_tune (int) – Number of folds used for tuning. Default is 5.

  • search_mode (str) – A str ('grid_search' or 'randomized_search') specifying whether hyperparameters are optimized via sklearn.model_selection.GridSearchCV or sklearn.model_selection.RandomizedSearchCV. Default is 'grid_search'.

  • n_iter_randomized_search (int) – Only used if search_mode == 'randomized_search'. The number of parameter settings that are sampled. Default is 100.

  • n_jobs_cv (None or int) – The number of CPUs to use to tune the learners. None means 1. Default is None.

  • set_as_params (bool) – Indicates whether the hyperparameters should be set in order to be used when fit() is called. Default is True.

  • return_tune_res (bool) – Indicates whether detailed tuning results should be returned. Default is False.

Returns:

  • self (object) – Returned if return_tune_res is False.

  • tune_res (list) – A list containing detailed tuning results and the proposed hyperparameters. Returned if return_tune_res is True.
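The difference between the two search modes comes down to which candidate settings are evaluated. The sketch below enumerates a full grid and draws a fixed number of random candidates from the same grid in plain Python. The keys 'ml_g' and 'ml_m' in param_grids are assumed learner names, the hyperparameter values are examples, and the random draw here is with replacement for simplicity, which may differ from scikit-learn's sampler.

```python
import itertools
import random


def grid_candidates(param_grid):
    """All combinations of the listed values (exhaustive search)."""
    names = sorted(param_grid)
    combos = itertools.product(*(param_grid[name] for name in names))
    return [dict(zip(names, values)) for values in combos]


def randomized_candidates(param_grid, n_iter, seed=None):
    """n_iter settings drawn independently from the listed values."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(param_grid[name]) for name in sorted(param_grid)}
        for _ in range(n_iter)
    ]


# One grid per nuisance learner (keys are assumed learner names).
param_grids = {
    "ml_g": {"max_depth": [2, 5], "n_estimators": [10, 100]},
    "ml_m": {"max_depth": [2, 5], "n_estimators": [10, 100]},
}

full_grid = grid_candidates(param_grids["ml_g"])
sampled = randomized_candidates(param_grids["ml_g"], n_iter=3, seed=0)
print(len(full_grid), sampled)
```

With large grids, the randomized mode caps the cost at n_iter_randomized_search evaluations per learner instead of the full product of all value lists.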