1.2. doubleml.data.DoubleMLClusterData

class doubleml.data.DoubleMLClusterData(data, y_col, d_cols, cluster_cols, x_cols=None, z_cols=None, t_col=None, s_col=None, use_other_treat_as_covariate=True, force_all_x_finite=True)

Backwards compatibility wrapper for DoubleMLData with cluster_cols. This class is deprecated and will be removed in a future version. Use DoubleMLData with cluster_cols instead.

Methods

from_arrays(x, y, d, cluster_vars[, z, t, ...])

Initialize DoubleMLClusterData from numpy.ndarray objects.

set_x_d(treatment_var)

Function that assigns the role for the treatment variables in the multiple-treatment case.

Attributes

all_variables

All variables available in the dataset.

binary_outcome

Logical indicating whether the outcome variable is binary with values 0 and 1.

binary_treats

Series with logical(s) indicating whether the treatment variable(s) are binary with values 0 and 1.

cluster_cols

The cluster variable(s).

cluster_vars

Array of cluster variable(s).

d

Array of the treatment variable. Dynamic: depends on the currently set treatment variable. To get an array of all treatment variables (independent of the currently set treatment variable), call obj.data[obj.d_cols].values.

d_cols

The treatment variable(s).

data

The data.

force_all_d_finite

Indicates whether to raise an error on infinite and/or missing values in the treatment variables d.

force_all_x_finite

Indicates whether to raise an error on infinite and/or missing values in the covariates x.

is_cluster_data

Flag indicating whether this data object is being used for cluster data.

n_cluster_vars

The number of cluster variables.

n_coefs

The number of coefficients to be estimated.

n_instr

The number of instruments.

n_obs

The number of observations.

n_treat

The number of treatment variables.

use_other_treat_as_covariate

Indicates whether in the multiple-treatment case the other treatment variables should be added as covariates.

x

Array of covariates. Dynamic: may depend on the currently set treatment variable. To get an array of all covariates (independent of the currently set treatment variable), call obj.data[obj.x_cols].values.

x_cols

The covariates.

y

Array of the outcome variable.

y_col

The outcome variable.

z

Array of instrumental variables.

z_cols

The instrumental variable(s).

classmethod DoubleMLClusterData.from_arrays(x, y, d, cluster_vars, z=None, t=None, s=None, use_other_treat_as_covariate=True, force_all_x_finite=True)

Initialize DoubleMLClusterData from numpy.ndarray objects. This method is deprecated and will be removed with version 0.12.0; use DoubleMLData.from_arrays with cluster_vars instead.

DoubleMLClusterData.set_x_d(treatment_var)

Function that assigns the role for the treatment variables in the multiple-treatment case.

Parameters:

treatment_var (str) – Active treatment variable that will be set to d.