3.2.10. doubleml.plm.datasets.make_pliv_multiway_cluster_CKMS2021
- doubleml.plm.datasets.make_pliv_multiway_cluster_CKMS2021(N=25, M=25, dim_X=100, theta=1.0, return_type='DoubleMLData', **kwargs)
- Generates data from a partially linear IV regression model with a multiway cluster sample, as used in Chiang et al. (2021). The data generating process is defined as

\[ \begin{align}\begin{aligned}Z_{ij} &= X_{ij}' \xi_0 + V_{ij},\\D_{ij} &= Z_{ij}' \pi_{10} + X_{ij}' \pi_{20} + v_{ij},\\Y_{ij} &= D_{ij} \theta + X_{ij}' \zeta_0 + \varepsilon_{ij},\end{aligned}\end{align} \]

with

\[ \begin{align}\begin{aligned}X_{ij} &= (1 - \omega_1^X - \omega_2^X) \alpha_{ij}^X + \omega_1^X \alpha_{i}^X + \omega_2^X \alpha_{j}^X,\\\varepsilon_{ij} &= (1 - \omega_1^\varepsilon - \omega_2^\varepsilon) \alpha_{ij}^\varepsilon + \omega_1^\varepsilon \alpha_{i}^\varepsilon + \omega_2^\varepsilon \alpha_{j}^\varepsilon,\\v_{ij} &= (1 - \omega_1^v - \omega_2^v) \alpha_{ij}^v + \omega_1^v \alpha_{i}^v + \omega_2^v \alpha_{j}^v,\\V_{ij} &= (1 - \omega_1^V - \omega_2^V) \alpha_{ij}^V + \omega_1^V \alpha_{i}^V + \omega_2^V \alpha_{j}^V,\end{aligned}\end{align} \]

and \(\alpha_{ij}^X, \alpha_{i}^X, \alpha_{j}^X \sim \mathcal{N}(0, \Sigma)\), where \(\Sigma\) is a \(p_x \times p_x\) matrix with entries \(\Sigma_{kj} = s_X^{|j-k|}\). Further,

\[\begin{split}\left(\begin{matrix} \alpha_{ij}^\varepsilon \\ \alpha_{ij}^v \end{matrix}\right), \left(\begin{matrix} \alpha_{i}^\varepsilon \\ \alpha_{i}^v \end{matrix}\right), \left(\begin{matrix} \alpha_{j}^\varepsilon \\ \alpha_{j}^v \end{matrix}\right) \sim \mathcal{N}\left(0, \left(\begin{matrix} 1 & s_{\varepsilon v} \\ s_{\varepsilon v} & 1 \end{matrix} \right) \right)\end{split}\]

and \(\alpha_{ij}^V, \alpha_{i}^V, \alpha_{j}^V \sim \mathcal{N}(0, 1)\).

- Parameters:
- N – The number of observations (first dimension). 
- M – The number of observations (second dimension). 
- dim_X – The number of covariates. 
- theta – The value of the causal parameter. 
- return_type –
  - If 'DoubleMLData' or DoubleMLData, returns a DoubleMLData object where DoubleMLData.data is a pd.DataFrame.
  - If 'DataFrame', 'pd.DataFrame' or pd.DataFrame, returns a pd.DataFrame.
  - If 'array', 'np.ndarray', 'np.array' or np.ndarray, returns the np.ndarrays (x, y, d, cluster_vars, z).
- **kwargs – Additional keyword arguments to set non-default values for the parameters \(\pi_{10}=1.0\), \(\omega_X = \omega_{\varepsilon} = \omega_V = \omega_v = (0.25, 0.25)\), \(s_X = s_{\varepsilon v} = 0.25\), or the \(p_x\)-vectors \(\zeta_0 = \pi_{20} = \xi_0\) with default entries \((\zeta_{0})_j = 0.5^j\). 
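
The two building blocks of the data generating process above, the covariance matrix \(\Sigma\) and the multiway mixing of cell-level, row-level and column-level draws, can be illustrated with a minimal standard-library sketch. This is not the library's implementation: the helper names `toeplitz_cov` and `multiway_component` are hypothetical, and only the scalar component \(V_{ij}\) with the default weights \((0.25, 0.25)\) is shown.

```python
import random


def toeplitz_cov(p, s):
    """Return the p x p matrix with entries Sigma[k][j] = s ** |j - k|."""
    return [[s ** abs(j - k) for j in range(p)] for k in range(p)]


def multiway_component(N, M, omega=(0.25, 0.25), seed=0):
    """Draw V_ij = (1 - w1 - w2) * a_ij + w1 * a_i + w2 * a_j.

    a_ij, a_i and a_j are independent N(0, 1) draws; a_i is shared
    across row i and a_j across column j, which induces the two-way
    cluster dependence.
    """
    rng = random.Random(seed)
    w1, w2 = omega
    a_i = [rng.gauss(0.0, 1.0) for _ in range(N)]  # first cluster dimension
    a_j = [rng.gauss(0.0, 1.0) for _ in range(M)]  # second cluster dimension
    return [
        [
            (1 - w1 - w2) * rng.gauss(0.0, 1.0) + w1 * a_i[i] + w2 * a_j[j]
            for j in range(M)
        ]
        for i in range(N)
    ]


sigma = toeplitz_cov(4, 0.25)        # s_X = 0.25, small p_x for display
V = multiway_component(N=25, M=25)   # one value per (i, j) cell
```

Cells in the same row share `a_i[i]` and cells in the same column share `a_j[j]`, so two observations are correlated whenever they share at least one cluster index.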
 
- References: Chiang, H. D., Kato, K., Ma, Y. and Sasaki, Y. (2021), Multiway Cluster Robust Double/Debiased Machine Learning, Journal of Business & Economic Statistics, doi: 10.1080/07350015.2021.1895815, arXiv:1909.03489.
 
    
  
  
    