`R/datasets.R`

`make_plr_turrell2018.Rd`

Generates data from a partially linear regression model used in a blog article by Turrell (2018). The data generating process is defined as

\(d_i = m_0(x_i' b) + v_i,\)

\(y_i = \theta d_i + g_0(x_i' b) + u_i,\)

with \(v_i \sim \mathcal{N}(0,1)\), \(u_i \sim \mathcal{N}(0,1)\), and
covariates \(x_i \sim \mathcal{N}(0, \Sigma)\), where \(\Sigma\)
is a random symmetric, positive-definite matrix generated with
`clusterGeneration::genPositiveDefMat()`. The vector \(b\) has entries
\(b_j=\frac{1}{j}\), and the nuisance functions are given by

\(m_0(x_i) = \frac{1}{2 \pi} \frac{\sinh(\gamma)}{\cosh(\gamma) - \cos(x_i-\nu)},\)

\(g_0(x_i) = \sin(x_i)^2.\)
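
The data generating process above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: in particular, a simple random symmetric positive-definite matrix (`crossprod(A) + diag(dim_x)`) stands in for `clusterGeneration::genPositiveDefMat()`, and the variable names `A`, `Sigma`, `index`, `m_0` and `g_0` are chosen here for readability.

```r
# Sketch of the data generating process in base R (illustrative only).
set.seed(3141)
n_obs <- 100
dim_x <- 20
theta <- 0.5
nu <- 0
gamma <- 1

# Assumption: a simple random symmetric positive-definite matrix replaces
# the one produced by clusterGeneration::genPositiveDefMat().
A <- matrix(rnorm(dim_x^2), dim_x, dim_x)
Sigma <- crossprod(A) + diag(dim_x)

# Draw x_i ~ N(0, Sigma): chol() returns upper-triangular R with
# t(R) %*% R = Sigma, so rows of Z %*% R have covariance Sigma.
Z <- matrix(rnorm(n_obs * dim_x), n_obs, dim_x)
x <- Z %*% chol(Sigma)

# Coefficient vector with entries b_j = 1 / j.
b <- 1 / seq_len(dim_x)

# Nuisance functions m_0 and g_0 as defined above.
m_0 <- function(z) 1 / (2 * pi) * sinh(gamma) / (cosh(gamma) - cos(z - nu))
g_0 <- function(z) sin(z)^2

# Treatment d and outcome y, both driven by the linear index x_i' b.
index <- drop(x %*% b)
d <- m_0(index) + rnorm(n_obs)
y <- theta * d + g_0(index) + rnorm(n_obs)
```

With the default `gamma = 1`, the denominator of `m_0` stays strictly positive because \(\cosh(1) > 1 \geq \cos(\cdot)\).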

```
make_plr_turrell2018(
  n_obs = 100,
  dim_x = 20,
  theta = 0.5,
  return_type = "DoubleMLData",
  nu = 0,
  gamma = 1
)
```

- n_obs (`integer(1)`)
  The number of observations to simulate.
- dim_x (`integer(1)`)
  The number of covariates.
- theta (`numeric(1)`)
  The value of the causal parameter.
- return_type (`character(1)`)
  If `"DoubleMLData"`, returns a `DoubleMLData` object. If `"data.frame"`, returns a `data.frame()`. If `"data.table"`, returns a `data.table()`. If `"matrix"`, a named `list()` with entries `X`, `y` and `d` is returned. Every entry in the list is a `matrix()` object. Default is `"DoubleMLData"`.
- nu (`numeric(1)`)
  The value of the parameter \(\nu\). Default is `0`.
- gamma (`numeric(1)`)
  The value of the parameter \(\gamma\). Default is `1`.

A data object according to the choice of `return_type`.
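
A short usage sketch showing two of the documented `return_type` choices. It assumes the `DoubleML` package and its suggested dependency `clusterGeneration` are installed, and it skips silently otherwise. The seed, the sample sizes, and the names `df` and `mats` are arbitrary choices made for this illustration.

```r
# Usage sketch: runs only if the required packages are available.
if (requireNamespace("DoubleML", quietly = TRUE) &&
    requireNamespace("clusterGeneration", quietly = TRUE)) {
  library(DoubleML)
  set.seed(1234)

  # Return the simulated data as a data.frame.
  df <- make_plr_turrell2018(
    n_obs = 200, dim_x = 10, theta = 0.5,
    return_type = "data.frame"
  )
  print(dim(df))

  # Return a named list of matrices with entries X, y and d.
  mats <- make_plr_turrell2018(
    n_obs = 200, dim_x = 10, theta = 0.5,
    return_type = "matrix"
  )
  print(names(mats))
  print(dim(mats$X))
}
```

The `"matrix"` form can be convenient when the covariates, outcome, and treatment are passed separately to a learner, while the default `"DoubleMLData"` form is meant for direct use with the package's model classes.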

Turrell, A. (2018), Econometrics in Python part I - Double machine learning, Markov Wanderer: A blog on economics, science, coding and data. http://aeturrell.com/2018/02/10/econometrics-in-python-partI-ML/.