Generates data from an interactive regression (IRM) model. The data generating process is defined as
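$$d_i = 1\left\{ \frac{\exp(c_d x_i' \beta)}{1 + \exp(c_d x_i' \beta)} > v_i \right\},$$
$$y_i = \theta d_i + c_y x_i' \beta d_i + \zeta_i,$$
with $v_i \sim U(0,1)$, $\zeta_i \sim N(0,1)$ and covariates $x_i \sim N(0, \Sigma)$, where $\Sigma$ is a matrix with entries $\Sigma_{kj} = 0.5^{|j-k|}$. $\beta$ is a dim_x-vector with entries $\beta_j = \frac{1}{j^2}$, and the constants $c_y$ and $c_d$ are given by
$$c_y = \sqrt{\frac{R_y^2}{(1 - R_y^2)\, \beta' \Sigma \beta}},$$
$$c_d = \sqrt{\frac{(\pi^2/3)\, R_d^2}{(1 - R_d^2)\, \beta' \Sigma \beta}}.$$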
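Solving the two definitions of the constants for the R2 parameters gives one way to read R2_y and R2_d. The interpretation as signal-to-total-variance ratios is a reading of the formulas, not a statement taken from the original description:

$$R_y^2 = \frac{c_y^2\, \beta' \Sigma \beta}{c_y^2\, \beta' \Sigma \beta + 1}, \qquad R_d^2 = \frac{c_d^2\, \beta' \Sigma \beta}{c_d^2\, \beta' \Sigma \beta + \pi^2/3}.$$

Here $c^2 \beta' \Sigma \beta$ is the variance of $c\, x_i' \beta$, the 1 in the first denominator equals the variance of $\zeta_i$, and $\pi^2/3$ is the variance of a standard logistic distribution.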
The data generating process is inspired by a process used in the simulation experiment (see Appendix P) of Belloni et al. (2017).
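The process described above can be sketched in base R. This is a minimal illustration written from the formulas, not the package's own implementation; the function name `simulate_irm_sketch` and the column names are assumptions made for this example.

```r
# Hypothetical helper: base-R sketch of the data generating process above.
# It is not the package's implementation.
simulate_irm_sketch <- function(n_obs = 500, dim_x = 20, theta = 0,
                                R2_d = 0.5, R2_y = 0.5) {
  idx <- seq_len(dim_x)
  # Covariance matrix with entries Sigma[k, j] = 0.5^|j - k|
  Sigma <- 0.5^abs(outer(idx, idx, "-"))
  # Coefficients beta_j = 1 / j^2
  beta <- 1 / idx^2
  # Quadratic form beta' Sigma beta used in both constants
  b_sigma_b <- drop(crossprod(beta, Sigma %*% beta))
  c_y <- sqrt(R2_y / ((1 - R2_y) * b_sigma_b))
  c_d <- sqrt((pi^2 / 3) * R2_d / ((1 - R2_d) * b_sigma_b))
  # Draw x_i ~ N(0, Sigma): chol() returns upper-triangular R with R'R = Sigma
  x <- matrix(rnorm(n_obs * dim_x), nrow = n_obs) %*% chol(Sigma)
  xb <- drop(x %*% beta)
  # Treatment: d_i = 1{ exp(c_d x'b) / (1 + exp(c_d x'b)) > v_i }
  v <- runif(n_obs)
  d <- as.numeric(plogis(c_d * xb) > v)
  # Outcome: y_i = theta d_i + c_y x'b d_i + zeta_i
  zeta <- rnorm(n_obs)
  y <- theta * d + c_y * xb * d + zeta
  colnames(x) <- paste0("X", idx)  # assumed column names
  data.frame(x, y = y, d = d)
}

set.seed(3141)
sim <- simulate_irm_sketch(n_obs = 200, dim_x = 5, theta = 0.5)
str(sim)
```

The covariates are drawn through a Cholesky factor so that the sketch needs no packages beyond base R.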
Usage
make_irm_data(
  n_obs = 500,
  dim_x = 20,
  theta = 0,
  R2_d = 0.5,
  R2_y = 0.5,
  return_type = "DoubleMLData"
)
Arguments
- n_obs (integer(1)): The number of observations to simulate.
- dim_x (integer(1)): The number of covariates.
- theta (numeric(1)): The value of the causal parameter.
- R2_d (numeric(1)): The value of the parameter $R_d^2$.
- R2_y (numeric(1)): The value of the parameter $R_y^2$.
- return_type (character(1)): If "DoubleMLData", returns a DoubleMLData object. If "data.frame", returns a data.frame(). If "data.table", returns a data.table(). If "matrix", a named list() with entries X, y, d and z is returned. Every entry in the list is a matrix() object. Default is "DoubleMLData".
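A short usage sketch of the arguments above, assuming the DoubleML package that provides make_irm_data() is installed. The call is skipped otherwise, and the argument values are purely illustrative.

```r
# Only run the call if DoubleML is available; otherwise `dat` stays NULL.
dat <- NULL
if (requireNamespace("DoubleML", quietly = TRUE)) {
  set.seed(3141)
  # Request a plain data.frame instead of the default DoubleMLData object
  dat <- DoubleML::make_irm_data(
    n_obs = 100,
    dim_x = 5,
    theta = 0.5,
    return_type = "data.frame"
  )
  print(head(dat))
}
```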