
Double machine learning data-backend for data with cluster variables
Source: R/double_ml_data.R
Double machine learning data-backend for data with cluster variables.
DoubleMLClusterData objects can be initialized from a
data.table. Alternatively, DoubleML provides
functions to initialize from a collection of matrix objects or
a data.frame. The following functions can be used to create a new
instance of DoubleMLClusterData.
DoubleMLClusterData$new() for initialization from a data.table,
double_ml_data_from_matrix() for initialization from matrix objects,
double_ml_data_from_data_frame() for initialization from a data.frame.
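As an illustration of the data.frame route, here is a minimal sketch. The toy data and its column names (y, d, x1, x2, firm) are hypothetical, and the sketch assumes that passing cluster_cols to double_ml_data_from_data_frame() yields a DoubleMLClusterData object rather than a plain DoubleMLData object.

```r
library(DoubleML)

# Hypothetical toy data: 20 observations grouped into 5 clusters
set.seed(42)
df = data.frame(
  y = rnorm(20),
  d = rnorm(20),
  x1 = rnorm(20),
  x2 = rnorm(20),
  firm = rep(1:5, each = 4)
)

# Assumption: supplying cluster_cols makes the helper build a cluster data-backend
obj = double_ml_data_from_data_frame(df,
  y_col = "y",
  d_cols = "d",
  cluster_cols = "firm")

# Inspect the class of the returned object
class(obj)
```

Because x_cols is left at its default of NULL, the remaining columns (here x1 and x2) are expected to be used as covariates.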
Super class
DoubleML::DoubleMLData -> DoubleMLClusterData
Active bindings
cluster_cols (character())
The cluster variable(s).
x_cols (NULL, character())
The covariates. If NULL, all variables (columns of data) which are neither specified as outcome variable y_col, nor as treatment variables d_cols, nor as instrumental variables z_cols, nor as cluster variables cluster_cols are used as covariates. Default is NULL.
n_cluster_vars (integer(1))
The number of cluster variables.
Methods
Method new()
Creates a new instance of this R6 class.
Usage
DoubleMLClusterData$new(
  data = NULL,
  x_cols = NULL,
  y_col = NULL,
  d_cols = NULL,
  cluster_cols = NULL,
  z_cols = NULL,
  s_col = NULL,
  use_other_treat_as_covariate = TRUE
)
Arguments
data (data.table, data.frame())
Data object.
x_cols (NULL, character())
The covariates. If NULL, all variables (columns of data) which are neither specified as outcome variable y_col, nor as treatment variables d_cols, nor as instrumental variables z_cols are used as covariates. Default is NULL.
y_col (character(1))
The outcome variable.
d_cols (character())
The treatment variable(s).
cluster_cols (character())
The cluster variable(s).
z_cols (NULL, character())
The instrumental variables. Default is NULL.
s_col (NULL, character())
The score or selection variable (only relevant/used for SSM estimators). Default is NULL.
use_other_treat_as_covariate (logical(1))
Indicates whether in the multiple-treatment case the other treatment variables should be added as covariates. Default is TRUE.
Method set_data_model()
Setter function for data_model. The function implements the causal model
as specified by the user via y_col, d_cols, x_cols, z_cols and
cluster_cols and assigns the role for the treatment variables in the
multiple-treatment case.
Arguments
treatment_var (character())
Active treatment variable that will be set to treat_col.
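To illustrate the multiple-treatment case, here is a minimal sketch. The toy data and its column names (y, d1, d2, x1, firm) are hypothetical. The sketch assumes the method is called as obj$set_data_model(treatment_var) and that the treat_col and other_treat_cols bindings inherited from DoubleMLData reflect the assigned roles.

```r
library(DoubleML)
library(data.table)

# Hypothetical toy data with two treatment variables and one cluster variable
set.seed(1)
dt = data.table(
  y = rnorm(20),
  d1 = rnorm(20),
  d2 = rnorm(20),
  x1 = rnorm(20),
  firm = rep(1:5, each = 4)
)

obj = DoubleMLClusterData$new(dt,
  y_col = "y",
  d_cols = c("d1", "d2"),
  cluster_cols = "firm")

# Make d2 the active treatment variable
obj$set_data_model("d2")

# The active treatment variable
obj$treat_col

# With use_other_treat_as_covariate = TRUE, the remaining treatment is kept as a covariate
obj$other_treat_cols
```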
Examples
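library(DoubleML)
# Simulate data with two cluster variables
dt = make_pliv_multiway_cluster_CKMS2021(return_type = "data.table")
# x_cols is left as NULL, so all remaining columns are used as covariates
obj_dml_data = DoubleMLClusterData$new(dt,
  y_col = "Y",
  d_cols = "D",
  z_cols = "Z",
  cluster_cols = c("cluster_var_i", "cluster_var_j"))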