2.2.8. doubleml.irm.DoubleMLQTE#

class doubleml.irm.DoubleMLQTE(obj_dml_data, ml_g, ml_m=None, quantiles=0.5, n_folds=5, n_rep=1, score='PQ', normalize_ipw=True, kde=None, trimming_rule='truncate', trimming_threshold=0.01, ps_processor_config: PSProcessorConfig | None = None, draw_sample_splitting=True)#

Double machine learning for quantile treatment effects

Parameters:
  • obj_dml_data (DoubleMLData object) – The DoubleMLData object providing the data and specifying the variables for the causal model.

  • ml_g (classifier implementing fit() and predict_proba()) – A machine learner implementing fit() and predict_proba() methods (e.g. sklearn.ensemble.RandomForestClassifier) for the nuisance elements which depend on preliminary estimation.

  • ml_m (classifier implementing fit() and predict_proba()) – A machine learner implementing fit() and predict_proba() methods (e.g. sklearn.ensemble.RandomForestClassifier) for the propensity nuisance functions.

  • quantiles (float or array_like) – Quantiles for treatment effect estimation. Entries have to be between 0 and 1. Default is 0.5.

  • n_folds (int) – Number of folds. Default is 5.

  • n_rep (int) – Number of repetitions for the sample splitting. Default is 1.

  • score (str) – A str ('PQ', 'LPQ' or 'CVaR') specifying the score function. Default is 'PQ'.

  • normalize_ipw (bool) – Indicates whether the inverse probability weights are normalized. Default is True.

  • kde (callable or None) – A callable object / function with signature deriv = kde(u, weights) for weighted kernel density estimation, where deriv evaluates the density at 0. Default is None, which uses statsmodels.nonparametric.kde.KDEUnivariate with a gaussian kernel and silverman bandwidth selection (see the sketch after this parameter list).

  • trimming_rule (str, optional, deprecated) – (DEPRECATED) A str ('truncate' is the only choice) specifying the trimming approach. Use ps_processor_config instead. Will be removed in a future version.

  • trimming_threshold (float, optional, deprecated) – (DEPRECATED) The threshold used for trimming. Use ps_processor_config instead. Will be removed in a future version.

  • ps_processor_config (PSProcessorConfig, optional) – Configuration for propensity score processing (clipping, calibration, etc.).

  • draw_sample_splitting (bool) – Indicates whether the sample splitting should be drawn during initialization of the object. Default is True.
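
The kde argument accepts a custom density estimator. A minimal sketch, assuming only the documented signature deriv = kde(u, weights) and mirroring the stated default (KDEUnivariate with a gaussian kernel and silverman bandwidth):

from statsmodels.nonparametric.kde import KDEUnivariate

def custom_kde(u, weights):
    # weighted univariate KDE; fft=False is required when weights are supplied
    dens = KDEUnivariate(u)
    dens.fit(kernel='gau', bw='silverman', weights=weights, fft=False)
    # DoubleMLQTE expects the estimated density evaluated at 0
    return dens.evaluate(0)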

Examples

>>> import numpy as np
>>> import doubleml as dml
>>> from doubleml.irm.datasets import make_irm_data
>>> from sklearn.ensemble import RandomForestClassifier
>>> np.random.seed(3141)
>>> ml_g = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=10, min_samples_leaf=2)
>>> ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=10, min_samples_leaf=2)
>>> data = make_irm_data(theta=0.5, n_obs=500, dim_x=20, return_type='DataFrame')
>>> obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
>>> dml_qte_obj = dml.DoubleMLQTE(obj_dml_data, ml_g, ml_m, quantiles=[0.25, 0.5, 0.75])
>>> dml_qte_obj.fit().summary  
          coef   std err         t     P>|t|     2.5 %    97.5 %
0.25  0.274825  0.347310  0.791297  0.428771 -0.405890  0.955541
0.50  0.449150  0.192539  2.332782  0.019660  0.071782  0.826519
0.75  0.709606  0.193308  3.670867  0.000242  0.330731  1.088482
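
Joint confidence intervals across the estimated quantiles require a preceding multiplier bootstrap; a minimal continuation of the example above (output omitted):

>>> _ = dml_qte_obj.bootstrap(method='normal', n_rep_boot=500)
>>> ci = dml_qte_obj.confint(joint=True, level=0.95)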

Methods

bootstrap([method, n_rep_boot])

Multiplier bootstrap for DoubleML models.

confint([joint, level])

Confidence intervals for DoubleML models.

draw_sample_splitting()

Draw sample splitting for DoubleML models.

fit([n_jobs_models, n_jobs_cv, ...])

Estimate DoubleMLQTE models.

p_adjust([method])

Multiple testing adjustment for DoubleML models.

set_sample_splitting(all_smpls[, ...])

Set the sample splitting for DoubleML models.

tune_ml_models(ml_param_space[, ...])

Hyperparameter-tuning for DoubleML models using Optuna.

Attributes

all_coef

Estimates of the causal parameter(s) for the n_rep different sample splits after calling fit() (shape (n_quantiles, n_rep)).

all_se

Standard errors of the causal parameter(s) for the n_rep different sample splits after calling fit() (shape (n_quantiles, n_rep)).

boot_method

The method to construct the bootstrap replications.

boot_t_stat

Bootstrapped t-statistics for the causal parameter(s) after calling fit() and bootstrap() (shape (n_rep_boot, n_quantiles, n_rep)).

coef

Estimates for the causal parameter(s) after calling fit() (shape (n_quantiles,)).

framework

The corresponding doubleml.DoubleMLFramework object.

kde

The kernel density estimation of the derivative.

modellist_0

List of the models for the control group (treatment==0).

modellist_1

List of the models for the treatment group (treatment==1).

n_folds

Number of folds.

n_quantiles

Number of quantiles.

n_rep

Number of repetitions for the sample splitting.

n_rep_boot

The number of bootstrap replications.

normalize_ipw

Indicates whether the inverse probability weights are normalized.

ps_processor

Propensity score processor.

ps_processor_config

Configuration for propensity score processing (clipping, calibration, etc.).

pval

p-values for the causal parameter(s) (shape (n_quantiles,)).

quantiles

The quantiles for treatment effect estimation.

score

The score function.

se

Standard errors for the causal parameter(s) after calling fit() (shape (n_quantiles,)).

smpls

The partition used for cross-fitting.

summary

A summary for the estimated causal effect after calling fit().

t_stat

t-statistics for the causal parameter(s) after calling fit() (shape (n_quantiles,)).

trimming_rule

Specifies the used trimming rule.

trimming_threshold

Specifies the used trimming threshold.
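
After calling fit(), these attributes expose the results directly; a short sketch continuing the class example (exact values depend on the fitted learners):

>>> qte_hat = dml_qte_obj.coef  # point estimates, shape (n_quantiles,)
>>> se_hat = dml_qte_obj.se     # standard errors, shape (n_quantiles,)
>>> pvals = dml_qte_obj.pval    # p-values, shape (n_quantiles,)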

DoubleMLQTE.bootstrap(method='normal', n_rep_boot=500)#

Multiplier bootstrap for DoubleML models.

Parameters:
  • method (str) – A str ('Bayes', 'normal' or 'wild') specifying the multiplier bootstrap method. Default is 'normal'.

  • n_rep_boot (int) – The number of bootstrap replications.

Returns:

self

Return type:

object

DoubleMLQTE.confint(joint=False, level=0.95)#

Confidence intervals for DoubleML models.

Parameters:
  • joint (bool) – Indicates whether joint confidence intervals are computed. Default is False

  • level (float) – The confidence level. Default is 0.95.

Returns:

df_ci – A data frame with the confidence interval(s).

Return type:

pd.DataFrame

DoubleMLQTE.draw_sample_splitting()#

Draw sample splitting for DoubleML models.

The samples are drawn according to the attributes n_folds and n_rep.

Returns:

self

Return type:

object

DoubleMLQTE.fit(n_jobs_models=None, n_jobs_cv=None, store_predictions=True, store_models=False, external_predictions=None)#

Estimate DoubleMLQTE models.

Parameters:
  • n_jobs_models (None or int) – The number of CPUs to use to fit the quantiles. None means 1. Default is None.

  • n_jobs_cv (None or int) – The number of CPUs to use to fit the learners. None means 1. Does not speed up computation for quantile models. Default is None.

  • store_predictions (bool) – Indicates whether the predictions for the nuisance functions should be stored in predictions. Default is True.

  • store_models (bool) – Indicates whether the fitted models for the nuisance functions should be stored in models. This allows analyzing the fitted models or extracting information such as variable importance. Default is False.

  • external_predictions (dict or None) – A nested dictionary of externally supplied predictions for the nuisance functions, used instead of fitting the corresponding learners. Default is None.

Returns:

self

Return type:

object
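
Since n_jobs_cv does not speed up quantile models (see above), parallelization is over the quantile models themselves via n_jobs_models; a minimal sketch continuing the class example with its three quantiles:

>>> _ = dml_qte_obj.fit(n_jobs_models=3)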

DoubleMLQTE.p_adjust(method='romano-wolf')#

Multiple testing adjustment for DoubleML models.

Parameters:

method (str) – A str ('romano-wolf', 'bonferroni', 'holm', etc.) specifying the adjustment method. In addition to 'romano-wolf', all methods implemented in statsmodels.stats.multitest.multipletests() can be applied. Default is 'romano-wolf'.

Returns:

p_val – A data frame with adjusted p-values.

Return type:

pd.DataFrame
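
Continuing the class example, a minimal usage sketch; 'bonferroni' is one of the statsmodels methods mentioned above (the default 'romano-wolf' is bootstrap-based and, as an assumption here, requires a prior bootstrap() call):

>>> p_adj = dml_qte_obj.p_adjust(method='bonferroni')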

DoubleMLQTE.set_sample_splitting(all_smpls, all_smpls_cluster=None)#

Set the sample splitting for DoubleML models.

The attributes n_folds and n_rep are derived from the provided partition.

Parameters:
  • all_smpls (list or tuple) –

    If nested list of lists of tuples:

    The outer list needs to provide an entry per repeated sample splitting (length of list is set as n_rep). The inner list needs to provide a tuple (train_ind, test_ind) per fold (length of list is set as n_folds). test_ind must form a partition for each inner list.

    If list of tuples:

    The list needs to provide a tuple (train_ind, test_ind) per fold (length of list is set as n_folds). test_ind must form a partition. n_rep=1 is always set.

    If tuple:

    Must be a tuple with two elements train_ind and test_ind. Only viable option is to set train_ind and test_ind to np.arange(n_obs), which corresponds to no sample splitting. n_folds=1 and n_rep=1 is always set.

  • all_smpls_cluster (list or None) – Nested list or None. The first level of nesting corresponds to the number of repetitions. The second level of nesting corresponds to the number of folds. The third level of nesting contains a tuple of training and testing lists. Both training and testing contain an array for each cluster variable, which form a partition of the clusters. Default is None.

Returns:

self

Return type:

object

Examples

>>> import numpy as np
>>> import doubleml as dml
>>> from doubleml.plm.datasets import make_plr_CCDDHNR2018
>>> from sklearn.ensemble import RandomForestRegressor
>>> from sklearn.base import clone
>>> np.random.seed(3141)
>>> learner = RandomForestRegressor(max_depth=2, n_estimators=10)
>>> ml_g = learner
>>> ml_m = learner
>>> obj_dml_data = make_plr_CCDDHNR2018(n_obs=10, alpha=0.5)
>>> dml_plr_obj = dml.DoubleMLPLR(obj_dml_data, ml_g, ml_m)
>>> # sample splitting with two folds and cross-fitting
>>> smpls = [([0, 1, 2, 3, 4], [5, 6, 7, 8, 9]),
...          ([5, 6, 7, 8, 9], [0, 1, 2, 3, 4])]
>>> dml_plr_obj.set_sample_splitting(smpls) 
<doubleml.plm.plr.DoubleMLPLR object at 0x...>
>>> # sample splitting with two folds and repeated cross-fitting with n_rep = 2
>>> smpls = [[([0, 1, 2, 3, 4], [5, 6, 7, 8, 9]),
...           ([5, 6, 7, 8, 9], [0, 1, 2, 3, 4])],
...          [([0, 2, 4, 6, 8], [1, 3, 5, 7, 9]),
...           ([1, 3, 5, 7, 9], [0, 2, 4, 6, 8])]]
>>> dml_plr_obj.set_sample_splitting(smpls) 
<doubleml.plm.plr.DoubleMLPLR object at 0x...>
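>>> # a hedged sketch of the tuple option described above: no sample
>>> # splitting, with train_ind = test_ind = np.arange(n_obs)
>>> smpls = (np.arange(10), np.arange(10))
>>> _ = dml_plr_obj.set_sample_splitting(smpls)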
DoubleMLQTE.tune_ml_models(ml_param_space, scoring_methods=None, cv=5, set_as_params=True, return_tune_res=False, optuna_settings=None)#

Hyperparameter-tuning for DoubleML models using Optuna.

The hyperparameter-tuning is performed using Optuna’s Bayesian optimization. Unlike grid/randomized search, Optuna tuning is performed once on the whole dataset using cross-validation, and the same optimal hyperparameters are used for all folds.

Parameters:
  • ml_param_space (dict) –

    A dict with a parameter grid function for each nuisance model (see attribute params_names) or for each learner (see attribute learner_names). Mixed specifications are allowed, i.e., some nuisance models can share the same learner. For mixed specifications, learner-specific settings will be overwritten by nuisance model-specific settings.

    Each parameter grid must be specified as a callable function that takes an Optuna trial and returns a dictionary of hyperparameters.

    For PLR models, keys should be: 'ml_l', 'ml_m' (and optionally 'ml_g' for IV-type score). For IRM models, keys should be: 'ml_g0', 'ml_g1' (or just 'ml_g' for both), 'ml_m'.

    Example:

def ml_l_params(trial):
    return {
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
        'n_estimators': trial.suggest_int('n_estimators', 100, 500, step=50),
        'num_leaves': trial.suggest_int('num_leaves', 20, 256),
        'min_child_samples': trial.suggest_int('min_child_samples', 5, 100),
    }

def ml_m_params(trial):
    return {
        'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
    }

ml_param_space = {'ml_l': ml_l_params, 'ml_m': ml_m_params}
    

    Note: Optuna tuning is performed globally (not fold-specific) to ensure consistent hyperparameters across all folds.

  • scoring_methods (None or dict) – The scoring method used to evaluate the predictions. The scoring method must be set per nuisance model via a dict (see attribute params_names for the keys). If None, the estimator’s score method is used. Default is None.

  • cv (int, cross-validation splitter, or iterable of (train_indices, test_indices)) – Cross-validation strategy used for Optuna-based tuning. If an integer is provided, a shuffled sklearn.model_selection.KFold with the specified number of splits and random_state=42 is used. Custom splitters must implement split (and ideally get_n_splits), or be an iterable yielding (train_indices, test_indices) pairs. Default is 5.

  • set_as_params (bool) – Indicates whether the hyperparameters should be set in order to be used when fit() is called. Default is True.

  • return_tune_res (bool) – Indicates whether detailed tuning results should be returned. Default is False.

  • optuna_settings (None or dict) –

    Optional configuration passed to the Optuna tuner. Supports global settings as well as learner-specific overrides (using the keys from ml_param_space). The dictionary can contain entries corresponding to Optuna’s study and optimize configuration such as:

    • n_trials (int): Number of optimization trials (default: 100)

    • timeout (float): Time limit in seconds for the study (default: None)

    • direction (str): Optimization direction, ‘maximize’ or ‘minimize’. For sklearn scorers, use ‘maximize’ for negative metrics like ‘neg_mean_squared_error’ (since -0.1 > -0.2 means better performance). Can be set globally or per learner. (default: ‘maximize’)

    • sampler (optuna.samplers.BaseSampler): Optuna sampler instance (default: None, uses TPE)

    • callbacks (list): List of callback functions (default: None)

    • show_progress_bar (bool): Show progress bar during optimization (default: False)

    • n_jobs_optuna (int): Number of parallel trials (default: None)

    • verbosity (int): Optuna logging verbosity level (default: None)

    • study (optuna.study.Study): Pre-created study instance (default: None)

    • study_kwargs (dict): Additional kwargs for study creation (default: {})

    • optimize_kwargs (dict): Additional kwargs for study.optimize() (default: {})

    To set direction per learner (similar to scoring_methods):

    optuna_settings = {
        'n_trials': 50,
        'direction': 'maximize',  # Global default
        'ml_g0': {'direction': 'maximize'},  # Per-learner override
        'ml_m': {'n_trials': 100, 'direction': 'maximize'}
    }
    

    Defaults to None.

Returns:

  • self (object) – Returned if return_tune_res is False.

  • tune_res (list) – A list containing detailed tuning results and the proposed hyperparameters. Returned if return_tune_res is True.

Examples

>>> import numpy as np
>>> from doubleml import DoubleMLData, DoubleMLPLR
>>> from doubleml.plm.datasets import make_plr_CCDDHNR2018
>>> from lightgbm import LGBMRegressor
>>> import optuna
>>> # Generate data
>>> np.random.seed(42)
>>> data = make_plr_CCDDHNR2018(n_obs=500, dim_x=20, return_type='DataFrame')
>>> dml_data = DoubleMLData(data, 'y', 'd')
>>> # Initialize model
>>> dml_plr = DoubleMLPLR(
...    dml_data,
...    LGBMRegressor(n_estimators=50, verbose=-1, random_state=42),
...    LGBMRegressor(n_estimators=50, verbose=-1, random_state=42)
... )
>>> # Define parameter grid functions
>>> def ml_l_params(trial):
...     return {
...         'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
...     }
>>> def ml_m_params(trial):
...     return {
...         'learning_rate': trial.suggest_float('learning_rate', 0.01, 0.3, log=True),
...     }
>>> ml_param_space = {'ml_l': ml_l_params, 'ml_m': ml_m_params}
>>> # Tune with TPE sampler
>>> optuna_settings = {
...     'n_trials': 5,
...     'sampler': optuna.samplers.TPESampler(seed=42),
... }
>>> tune_res = dml_plr.tune_ml_models(ml_param_space, optuna_settings=optuna_settings, return_tune_res=True)
>>> print(tune_res[0]['ml_l'].best_params)  
{'learning_rate': 0.03907122389107094}
>>> # Fit and get results
>>> dml_plr.fit().summary 
      coef   std err          t         P>|t|     2.5 %    97.5 %
d  0.57436  0.045206  12.705519  5.510257e-37  0.485759  0.662961
>>> # Example with scoring methods and directions
>>> scoring_methods = {
...     'ml_l': 'neg_mean_squared_error',  # Negative metric
...     'ml_m': 'neg_mean_squared_error'
... }
>>> optuna_settings = {
...     'n_trials': 50,
...     'direction': 'maximize',  # Maximize negative MSE (minimize MSE)
...     'sampler': optuna.samplers.TPESampler(seed=42),
... }
>>> tune_res = dml_plr.tune_ml_models(ml_param_space, scoring_methods=scoring_methods,
...                                   optuna_settings=optuna_settings, return_tune_res=True)
>>> print(tune_res[0]['ml_l'].best_params)  
{'learning_rate': 0.04300012336462904}
>>> dml_plr.fit().summary 
       coef   std err          t         P>|t|     2.5 %    97.5 %
d  0.574796  0.045062  12.755721  2.896820e-37  0.486476  0.663115