Abstract base class that can't be initialized.
Format
R6::R6Class object.
See also
Other DoubleML: DoubleMLIIVM, DoubleMLIRM, DoubleMLPLIV, DoubleMLPLR
Active bindings
all_coef (matrix())
Estimates of the causal parameter(s) for the n_rep different sample splits after calling fit().
all_dml1_coef (array())
Estimates of the causal parameter(s) for the n_rep different sample splits after calling fit() with dml_procedure = "dml1".
all_se (matrix())
Standard errors of the causal parameter(s) for the n_rep different sample splits after calling fit().
apply_cross_fitting (logical(1))
Indicates whether cross-fitting should be applied. Default is TRUE.
boot_coef (matrix())
Bootstrapped coefficients for the causal parameter(s) after calling fit() and bootstrap().
boot_t_stat (matrix())
Bootstrapped t-statistics for the causal parameter(s) after calling fit() and bootstrap().
coef (numeric())
Estimates for the causal parameter(s) after calling fit().
data (data.table)
Data object.
dml_procedure (character(1))
A character(1) ("dml1" or "dml2") specifying the double machine learning algorithm. Default is "dml2".
draw_sample_splitting (logical(1))
Indicates whether the sample splitting should be drawn during initialization of the object. Default is TRUE.
learner (named list())
The machine learners for the nuisance functions.
n_folds (integer(1))
Number of folds. Default is 5.
n_rep (integer(1))
Number of repetitions for the sample splitting. Default is 1.
params (named list())
The hyperparameters of the learners.
psi (array())
Value of the score function \(\psi(W;\theta, \eta)=\psi_a(W;\eta) \theta + \psi_b (W; \eta)\) after calling fit().
psi_a (array())
Value of the score function component \(\psi_a(W;\eta)\) after calling fit().
psi_b (array())
Value of the score function component \(\psi_b(W;\eta)\) after calling fit().
predictions (array())
Predictions of the nuisance models after calling fit(store_predictions = TRUE).
models (array())
The fitted nuisance models after calling fit(store_models = TRUE).
pval (numeric())
p-values for the causal parameter(s) after calling fit().
score (character(1), function())
A character(1) or function() specifying the score function.
se (numeric())
Standard errors for the causal parameter(s) after calling fit().
smpls (list())
The partition used for cross-fitting.
smpls_cluster (list())
The partition of clusters used for cross-fitting.
t_stat (numeric())
t-statistics for the causal parameter(s) after calling fit().
tuning_res (named list())
Results from hyperparameter tuning.
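Because the score is linear in \(\theta\), the estimate can be recovered from psi_a and psi_b by setting the empirical mean of the score to zero. The base-R sketch below illustrates this on simulated residuals. The residual vectors v and u, the partialling-out form of psi_a and psi_b, and the standard error formula are assumptions made for illustration, not values or code taken from the package.

```r
set.seed(1)
n = 500
theta_true = 0.5

# Hypothetical out-of-sample residuals (assumed, not produced by DoubleML):
# v stands in for d - m_hat(x), u stands in for y - l_hat(x)
v = rnorm(n)
u = theta_true * v + rnorm(n)

# Score components of an assumed partialling-out score
psi_a = -v * v
psi_b = v * u

# Solve mean(psi_a) * theta + mean(psi_b) = 0 for theta
theta_hat = -mean(psi_b) / mean(psi_a)

# Score evaluated at the estimate; its mean is zero by construction
psi = psi_a * theta_hat + psi_b

# Plug-in standard error for a linear score (assumed textbook formula)
se_hat = sqrt(mean(psi^2) / mean(psi_a)^2 / n)

print(c(theta_hat = theta_hat, se_hat = se_hat))
```

The same algebra applies per sample split; with n_rep > 1 the fields all_coef and all_se hold one estimate per repetition.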
Methods
Method fit()
Estimate DoubleML models.
Arguments
store_predictions (logical(1))
Indicates whether the predictions for the nuisance functions should be stored in field predictions. Default is FALSE.
store_models (logical(1))
Indicates whether the fitted models for the nuisance functions should be stored in field models if you want to analyze the models or extract information like variable importance. Default is FALSE.
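A minimal sketch of a fit() call that keeps the nuisance predictions, reusing the data generator and learners from the examples on this page. The sample size n_obs = 200 is an arbitrary choice for illustration, and the layout of the predictions field is only inspected with str() because it may differ between package versions.

```r
library(DoubleML)
library(mlr3)

set.seed(2)
# n_obs = 200 is an assumption chosen so the tree learners have enough data
obj_dml_data = make_plr_CCDDHNR2018(n_obs = 200)
dml_plr_obj = DoubleMLPLR$new(obj_dml_data,
  lrn("regr.rpart"), lrn("regr.rpart"))

# Estimate the model and store the nuisance predictions
dml_plr_obj$fit(store_predictions = TRUE)

# Estimates and standard errors are available after fit()
print(dml_plr_obj$coef)
print(dml_plr_obj$se)

# Inspect the stored predictions without assuming their exact layout
str(dml_plr_obj$predictions)
```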
Method bootstrap()
Multiplier bootstrap for DoubleML models.
Method split_samples()
Draw sample splitting for DoubleML models.
The samples are drawn according to the attributes n_folds, n_rep and apply_cross_fitting.
Method set_sample_splitting()
Set the sample splitting for DoubleML models.
The attributes n_folds and n_rep are derived from the provided partition.
Arguments
smpls (list())
A nested list(). The outer list needs to provide an entry per repeated sample splitting (the length of the list is set as n_rep). The inner list is a named list() with names train_ids and test_ids. The entries in train_ids and test_ids must be partitions per fold (the length of train_ids and test_ids is set as n_folds).
Examples
library(DoubleML)
library(mlr3)
set.seed(2)
obj_dml_data = make_plr_CCDDHNR2018(n_obs=10)
dml_plr_obj = DoubleMLPLR$new(obj_dml_data,
  lrn("regr.rpart"), lrn("regr.rpart"))
# simple sample splitting with two folds and without cross-fitting
smpls = list(list(train_ids = list(c(1, 2, 3, 4, 5)),
  test_ids = list(c(6, 7, 8, 9, 10))))
dml_plr_obj$set_sample_splitting(smpls)
# sample splitting with two folds and cross-fitting but no repeated cross-fitting
smpls = list(list(train_ids = list(c(1, 2, 3, 4, 5), c(6, 7, 8, 9, 10)),
  test_ids = list(c(6, 7, 8, 9, 10), c(1, 2, 3, 4, 5))))
dml_plr_obj$set_sample_splitting(smpls)
# sample splitting with two folds and repeated cross-fitting with n_rep = 2
smpls = list(list(train_ids = list(c(1, 2, 3, 4, 5), c(6, 7, 8, 9, 10)),
  test_ids = list(c(6, 7, 8, 9, 10), c(1, 2, 3, 4, 5))),
  list(train_ids = list(c(1, 3, 5, 7, 9), c(2, 4, 6, 8, 10)),
  test_ids = list(c(2, 4, 6, 8, 10), c(1, 3, 5, 7, 9))))
dml_plr_obj$set_sample_splitting(smpls)
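Writing the nested list by hand becomes tedious for larger data sets. The base-R sketch below generates a partition with the structure described for smpls: an outer list with one entry per repetition, each holding train_ids and test_ids with one entry per fold. The helper make_smpls() is hypothetical and not part of DoubleML.

```r
# Hypothetical helper (not part of DoubleML): draws n_rep random partitions
# of 1..n_obs into n_folds folds and returns them in the nested smpls layout.
make_smpls = function(n_obs, n_folds, n_rep) {
  lapply(seq_len(n_rep), function(i_rep) {
    # Randomly assign each observation to one fold
    fold_id = sample(rep(seq_len(n_folds), length.out = n_obs))
    # Test indices per fold, and the complementary training indices
    test_ids = lapply(seq_len(n_folds), function(k) which(fold_id == k))
    train_ids = lapply(test_ids, function(ids) setdiff(seq_len(n_obs), ids))
    list(train_ids = train_ids, test_ids = test_ids)
  })
}

set.seed(2)
smpls = make_smpls(n_obs = 10, n_folds = 2, n_rep = 2)

# Outer length corresponds to n_rep, inner length to n_folds
print(length(smpls))
print(length(smpls[[1]]$train_ids))
```

The resulting object can then be passed on in the same way as the hand-written lists, for example with dml_plr_obj$set_sample_splitting(smpls).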
Method tune()
Hyperparameter-tuning for DoubleML models.
The hyperparameter-tuning is performed using the tuning methods provided in the mlr3tuning package. For more information on tuning in mlr3, we refer to the section on parameter tuning in the mlr3 book.
Arguments
param_set (named list())
A named list with a parameter grid for each nuisance model/learner (see method learner_names()). The parameter grid must be an object of class ParamSet.
tune_settings (named list())
A named list() with arguments passed to the hyperparameter-tuning with mlr3tuning to set up TuningInstance objects. tune_settings has the following entries:
  terminator (Terminator)
  A Terminator object. Specification of terminator is required to perform tuning.
  algorithm (Tuner or character(1))
  A Tuner object (recommended) or key passed to the respective dictionary to specify the tuning algorithm used in tnr(). algorithm is passed as an argument to tnr(). If algorithm is not specified by the user, the default is "grid_search". If set to "grid_search", the additional argument "resolution" is required.
  rsmp_tune (Resampling or character(1))
  A Resampling object (recommended) or option passed to rsmp() to initialize a Resampling for parameter tuning in mlr3. If not specified by the user, the default is "cv" (cross-validation).
  n_folds_tune (integer(1), optional)
  If rsmp_tune = "cv", number of folds used for cross-validation. If not specified by the user, the default is 5.
  measure (NULL, named list(), optional)
  Named list containing the measures used for parameter tuning. Entries in the list must either be Measure objects or keys to be passed to msr(). The names of the entries must match the learner names (see method learner_names()). If set to NULL, default measures are used, i.e., "regr.mse" for continuous outcome variables and "classif.ce" for binary outcomes.
  resolution (integer(1))
  The grid resolution used when algorithm is "grid_search". resolution is passed as an argument to tnr().
tune_on_folds (logical(1))
Indicates whether the tuning should be done fold-specific or globally. Default is FALSE.
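A minimal sketch of a tuning run with the two required pieces: one ParamSet per learner and a tune_settings list containing a terminator. It assumes the mlr3tuning and paradox packages are installed. The search ranges for cp and minsplit, n_evals = 5, and n_obs = 200 are arbitrary illustrative choices, and the learner names are read from learner_names() rather than hard-coded.

```r
library(DoubleML)
library(mlr3)
library(mlr3tuning)
library(paradox)

set.seed(2)
obj_dml_data = make_plr_CCDDHNR2018(n_obs = 200)
dml_plr_obj = DoubleMLPLR$new(obj_dml_data,
  lrn("regr.rpart"), lrn("regr.rpart"))

# One parameter grid per nuisance learner, named by learner_names().
# The ranges below are illustrative assumptions for rpart.
param_set = sapply(dml_plr_obj$learner_names(), function(nm) {
  ps(cp = p_dbl(lower = 0.01, upper = 0.02),
     minsplit = p_int(lower = 10, upper = 20))
}, simplify = FALSE)

# Minimal settings: a terminator plus a grid-search tuner with its resolution
tune_settings = list(
  terminator = trm("evals", n_evals = 5),
  algorithm = tnr("grid_search", resolution = 5))

dml_plr_obj$tune(param_set = param_set, tune_settings = tune_settings)

# Fit with the tuned hyperparameters
dml_plr_obj$fit()
print(dml_plr_obj$coef)
```

Passing a Tuner object follows the recommendation in the argument description; a character key such as "grid_search" together with a resolution entry in tune_settings is the alternative described there.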
Method summary()
Summary for DoubleML models after calling fit().
Method confint()
Confidence intervals for DoubleML models.
Arguments
parm (numeric() or character())
A specification of which parameters are to be given confidence intervals among the variables for which inference was done, either a vector of numbers or a vector of names. If missing, all parameters are considered (default).
joint (logical(1))
Indicates whether joint confidence intervals are computed. Default is FALSE.
level (numeric(1))
The confidence level. Default is 0.95.
Returns
A matrix() with the confidence interval(s).
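For intuition about the level argument, the base-R sketch below computes pointwise intervals from estimates and standard errors using the usual normal approximation. The numbers in coef_hat and se_hat are made up, and the claim that confint() with joint = FALSE uses exactly this formula is an assumption rather than something stated on this page.

```r
# Made-up estimates and standard errors for two treatment variables
coef_hat = c(d1 = 0.48, d2 = -0.10)
se_hat = c(d1 = 0.05, d2 = 0.08)
level = 0.95

# Two-sided critical value of the standard normal distribution
crit = qnorm(1 - (1 - level) / 2)

# Pointwise intervals: estimate plus/minus critical value times standard error
ci = cbind(coef_hat - crit * se_hat, coef_hat + crit * se_hat)
colnames(ci) = c("2.5 %", "97.5 %")
print(ci)
```

Joint intervals replace the normal critical value with one that accounts for all parameters simultaneously, which is where the bootstrapped t-statistics from bootstrap() come in.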
Method learner_names()
Returns the names of the learners.
Returns
character() with names of learners.
Method params_names()
Returns the names of the nuisance models with hyperparameters.
Returns
character() with names of nuisance models with hyperparameters.
Method set_ml_nuisance_params()
Set hyperparameters for the nuisance models of DoubleML models.
Note that in the current implementation, either all parameters have to be set globally or all parameters have to be provided fold-specific.
Usage
DoubleML$set_ml_nuisance_params(
  learner = NULL,
  treat_var = NULL,
  params,
  set_fold_specific = FALSE
)
Arguments
learner (character(1))
The nuisance model/learner (see method params_names()).
treat_var (character(1))
The treatment variable (hyperparameters can be set treatment-variable specific).
params (named list())
A named list() with estimator parameters. Parameters are used for all folds by default. Alternatively, parameters can be passed in a fold-specific way if option set_fold_specific is TRUE. In this case, the outer list needs to be of length n_rep and the inner list of length n_folds.
set_fold_specific (logical(1))
Indicates if the parameters passed in params should be passed in a fold-specific way. Default is FALSE. If TRUE, the outer list needs to be of length n_rep and the inner list of length n_folds.
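The two shapes accepted for params can be sketched in base R as follows. The hyperparameter names cp and minsplit belong to rpart and are used only as an illustration. The commented-out calls show how the lists would be passed, with "ml_m" and "d" as placeholder values for learner and treat_var that should be checked against params_names() and the data.

```r
n_rep = 2
n_folds = 3

# Global parameters: one named list, used for every fold
params_global = list(cp = 0.01, minsplit = 20)

# Fold-specific parameters: outer list of length n_rep,
# each entry an inner list of length n_folds
params_fold_specific = rep(list(rep(list(params_global), n_folds)), n_rep)

# Change a single fold, here repetition 2, fold 3
params_fold_specific[[2]][[3]]$cp = 0.05

# Placeholder calls (learner and treat_var values are assumptions):
# dml_obj$set_ml_nuisance_params(learner = "ml_m", treat_var = "d",
#   params = params_global)
# dml_obj$set_ml_nuisance_params(learner = "ml_m", treat_var = "d",
#   params = params_fold_specific, set_fold_specific = TRUE)

str(params_fold_specific, max.level = 2)
```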
Method p_adjust()
Multiple testing adjustment for DoubleML models.
Arguments
method (character(1))
A character(1) ("romano-wolf", "bonferroni", "holm", etc.) specifying the adjustment method. In addition to "romano-wolf", all methods implemented in p.adjust() can be applied. Default is "romano-wolf".
return_matrix (logical(1))
Indicates if the output is returned as a matrix with corresponding coefficient names.
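For the methods delegated to p.adjust(), the effect of the adjustment can be seen directly with base R. The p-values below are made up for illustration. The "romano-wolf" method is specific to this class and is not covered by this sketch.

```r
# Made-up unadjusted p-values for three treatment variables
pvals = c(d1 = 0.01, d2 = 0.04, d3 = 0.03)

# Bonferroni: multiply each p-value by the number of tests (capped at 1)
p_bonf = p.adjust(pvals, method = "bonferroni")

# Holm: step-down procedure, never more conservative than Bonferroni
p_holm = p.adjust(pvals, method = "holm")

print(rbind(unadjusted = pvals, bonferroni = p_bonf, holm = p_holm))
```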
Method get_params()
Get hyperparameters for the nuisance model of DoubleML models.
Returns
named list() with parameters for the nuisance model/learner.