{ "cells": [ { "cell_type": "markdown", "id": "f03c754a", "metadata": {}, "source": [ "# R: Sample Selection Models\n", "\n", "In this example, we illustrate how the [DoubleML](https://docs.doubleml.org/stable/index.html) package can be used to estimate the average treatment effect (ATE) under sample selection or outcome attrition. The estimation is based on a simulated DGP from Appendix E of [Bia, Huber and Lafférs (2023)](https://doi.org/10.1080/07350015.2023.2271071). \n", "\n", "Consider the following DGP:\n", "$$\n", "\\begin{align*}\n", "Y_i &= \\theta_0 D_i + X_i'\\beta_0 + \\varepsilon_i,\\\\\n", "S_i &= \\mathbb{1}\\{D_i + \\gamma_0 Z_i + X_i'\\beta_0 + \\upsilon_i > 0\\}, \\\\\n", "D_i &= \\mathbb{1}\\{X_i'\\beta_0 + \\xi_i > 0\\}\n", "\\end{align*}\n", "$$\n", "where $Y_i$ is observed if $S_i=1$\n", "with\n", "$$X_i \\sim N(0, \\sigma^2_X), \\quad Z_i \\sim N(0, 1), \\quad (\\varepsilon,_i \\nu_i) \\sim N(0, \\sigma^2_{\\varepsilon, \\nu}), \\quad \\xi_i \\sim N(0, 1).$$\n", "\n", "Let $D_i\\in\\{0,1\\}$ denote the treatment status of unit $i$ and let $Y_{i}$ be the outcome of interest of unit $i$.\n", "Using the potential outcome notation, we can write $Y_{i}(d)$ for the potential outcome of unit $i$ and treatment status $d$. Further, let $X_i$ denote a vector of pre-treatment covariates. \n", "\n", "## Outcome missing at random (MAR) \n", "Now consider the first setting, in which the outcomes are missing at random (MAR), according to assumptions in [Bia, Huber and Lafférs (2023)](https://doi.org/10.1080/07350015.2023.2271071). \n", "Let the covariance matrix $\\sigma^2_X$ be such that $a_{ij} = 0.5^{|i - j|}$, $\\gamma_0 = 0$, $\\sigma^2_{\\varepsilon, \\upsilon} = \\begin{pmatrix} 1 & 0 \\\\ 0 & 1 \\end{pmatrix}$ and finally, let the vector of coefficients $\\beta_0$ resemble a quadratic decay of coefficients importance; $\\beta_{0,j} = 0.4/j^2$ for $j = 1, \\ldots, p$. \n" ] }, { "cell_type": "markdown", "id": "25181797", "metadata": {}, "source": [ "### Data\n", "\n", "We will use the implemented data generating process `make_ssm_data` to generate data according to the simulation in Appendix E of [Bia, Huber and Lafférs (2023)](https://doi.org/10.1080/07350015.2023.2271071). The true ATE in this DGP is equal to $\\theta_0=1$ (it can be changed by setting the parameter `theta`). \n", "\n", "The data generating process `make_ssm_data` by default settings already returns a `DoubleMLData` object (however, it can return a pandas DataFrame or a NumPy array if `return_type` is specified accordingly). In this first setting, we are estimating the ATE under missingness at random, so we set `mar=True`.\n", "The selection indicator `S` can be set via `s_col`." ] }, { "cell_type": "code", "execution_count": 1, "id": "a1f2f984", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [], "source": [ "library(DoubleML)\n", "library(mlr3)\n", "library(ggplot2)\n", "\n", "# suppress messages during fitting\n", "lgr::get_logger(\"mlr3\")$set_threshold(\"warn\")" ] }, { "cell_type": "code", "execution_count": 2, "id": "d35090ed", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [ { "data": { "text/plain": [ "================= DoubleMLData Object ==================\n", "\n", "\n", "------------------ Data summary ------------------\n", "Outcome variable: y\n", "Treatment variable(s): d\n", "Covariates: X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13, X14, X15, X16, X17, X18, X19, X20, X21, X22, X23, X24, X25, X26, X27, X28, X29, X30, X31, X32, X33, X34, X35, X36, X37, X38, X39, X40, X41, X42, X43, X44, X45, X46, X47, X48, X49, X50, X51, X52, X53, X54, X55, X56, X57, X58, X59, X60, X61, X62, X63, X64, X65, X66, X67, X68, X69, X70, X71, X72, X73, X74, X75, X76, X77, X78, X79, X80, X81, X82, X83, X84, X85, X86, X87, X88, X89, X90, X91, X92, X93, X94, X95, X96, X97, X98, X99, X100\n", "Instrument(s): \n", "Selection variable: s\n", "No. Observations: 2000" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "set.seed(3141)\n", "n_obs = 2000\n", "df = make_ssm_data(n_obs=n_obs, mar=TRUE, return_type=\"data.table\")\n", "\n", "dml_data = DoubleMLData$new(df, y_col=\"y\", d_cols=\"d\", s_col=\"s\")\n", "dml_data\n" ] }, { "cell_type": "markdown", "id": "79ee9309", "metadata": {}, "source": [ "### Estimation\n", "\n", "To estimate the ATE under sample selection, we will use the `DoubleMLSSM` class. \n", "\n", "As for all `DoubleML` classes, we have to specify learners, which have to be initialized first.\n", "Given the simulated quadratic decay of coefficients importance, Lasso regression should be a suitable option (as for propensity scores, this will be a $\\mathcal{l}_1$-penalized Logistic Regression). \n", "\n", "The learner `ml_g` is used to fit conditional expectations of the outcome $\\mathbb{E}[Y_i|D_i, S_i, X_i]$, whereas the learners `ml_m` and `ml_pi` will be used to estimate the treatment and selection propensity scores $P(D_i=1|X_i)$ and $P(S_i=1|D_i, X_i)$." ] }, { "cell_type": "code", "execution_count": 3, "id": "4a905df9", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [], "source": [ "ml_g = lrn(\"regr.cv_glmnet\", nfolds = 5, s = \"lambda.min\")\n", "ml_m = lrn(\"classif.cv_glmnet\", nfolds = 5, s = \"lambda.min\")\n", "ml_pi = lrn(\"classif.cv_glmnet\", nfolds = 5, s = \"lambda.min\")" ] }, { "cell_type": "markdown", "id": "d90a87ea", "metadata": { "vscode": { "languageId": "r" } }, "source": [ "The `DoubleMLSSM` class can be used as any other `DoubleML` class. \n", "\n", "The score is set to `score='missing-at-random'`, since the parameters of the DGP were set to satisfy the assumptions of outcomes missing at random. Further, since the simulation in [Bia, Huber and Lafférs (2023)](https://doi.org/10.1080/07350015.2023.2271071) uses normalization of inverse probability weights, we will apply the same setting by `normalize_ipw=True`.\n", "\n", "After initialization, we have to call the `fit()` method to estimate the nuisance elements." ] }, { "cell_type": "code", "execution_count": 4, "id": "761ab3d5", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "================= DoubleMLSSM Object ==================\n", "\n", "\n", "------------------ Data summary ------------------\n", "Outcome variable: y\n", "Treatment variable(s): d\n", "Covariates: X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13, X14, X15, X16, X17, X18, X19, X20, X21, X22, X23, X24, X25, X26, X27, X28, X29, X30, X31, X32, X33, X34, X35, X36, X37, X38, X39, X40, X41, X42, X43, X44, X45, X46, X47, X48, X49, X50, X51, X52, X53, X54, X55, X56, X57, X58, X59, X60, X61, X62, X63, X64, X65, X66, X67, X68, X69, X70, X71, X72, X73, X74, X75, X76, X77, X78, X79, X80, X81, X82, X83, X84, X85, X86, X87, X88, X89, X90, X91, X92, X93, X94, X95, X96, X97, X98, X99, X100\n", "Instrument(s): \n", "Selection variable: s\n", "No. Observations: 2000\n", "\n", "------------------ Score & algorithm ------------------\n", "Score function: missing-at-random\n", "DML algorithm: dml2\n", "\n", "------------------ Machine learner ------------------\n", "ml_g: regr.cv_glmnet\n", "ml_pi: classif.cv_glmnet\n", "ml_m: classif.cv_glmnet\n", "\n", "------------------ Resampling ------------------\n", "No. folds: 5\n", "No. repeated sample splits: 1\n", "Apply cross-fitting: TRUE\n", "\n", "------------------ Fit summary ------------------\n", " Estimates and significance testing of the effect of target variables\n", " Estimate. Std. Error t value Pr(>|t|) \n", "d 0.94473 0.03045 31.03 <2e-16 ***\n", "---\n", "Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1\n", "\n", "\n" ] } ], "source": [ "dml_ssm = DoubleMLSSM$new(dml_data, ml_g, ml_m, ml_pi, score=\"missing-at-random\",\n", " normalize_ipw = TRUE)\n", "dml_ssm$fit()\n", "\n", "print(dml_ssm)" ] }, { "cell_type": "markdown", "id": "781e5376", "metadata": {}, "source": [ "Confidence intervals at different levels can be obtained via" ] }, { "cell_type": "code", "execution_count": 5, "id": "1297e329", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 5 % 95 %\n", "d 0.8946549 0.9948104\n" ] } ], "source": [ "print(dml_ssm$confint(level = 0.9))" ] }, { "cell_type": "markdown", "id": "bfe05e01", "metadata": {}, "source": [ "### ATE estimates distribution\n", "\n", "Here, we add a small simulation where we generate multiple datasets, estimate the ATE and collect the results (this may take some time). " ] }, { "cell_type": "code", "execution_count": 6, "id": "80a10b47", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [], "source": [ "n_rep = 100\n", "ATE_estimates = rep(NA, n_rep)\n", "ATE_estimates[1] = dml_ssm$coef" ] }, { "cell_type": "code", "execution_count": 7, "id": "8956cb51", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1] \"Iteration: 20/200\"\n", "[1] \"Iteration: 40/200\"\n", "[1] \"Iteration: 60/200\"\n", "[1] \"Iteration: 80/200\"\n", "[1] \"Iteration: 100/200\"\n", "[1] \"Iteration: 120/200\"\n", "[1] \"Iteration: 140/200\"\n", "[1] \"Iteration: 160/200\"\n", "[1] \"Iteration: 180/200\"\n", "[1] \"Iteration: 200/200\"\n" ] } ], "source": [ "n_rep = 200\n", "ATE = 1.0\n", "\n", "ATE_estimates = rep(NA, n_rep)\n", "\n", "set.seed(42)\n", "for (i_rep in seq_len(n_rep)) {\n", " if (i_rep %% (n_rep %/% 10) == 0) {\n", " print(paste0(\"Iteration: \", i_rep, \"/\", n_rep))\n", " }\n", " dml_data = make_ssm_data(n_obs=n_obs, mar=TRUE)\n", " dml_ssm = DoubleMLSSM$new(dml_data, ml_g, ml_m, ml_pi, score='missing-at-random', normalize_ipw=TRUE)\n", " dml_ssm$fit()\n", " ATE_estimates[i_rep] = dml_ssm$coef\n", "}\n" ] }, { "cell_type": "markdown", "id": "2e8a6094", "metadata": {}, "source": [ "The distribution of the estimates takes the following form" ] }, { "cell_type": "code", "execution_count": 14, "id": "e7cd8060", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAA0gAAANICAMAAADKOT/pAAAAPFBMVEUAAABNTU1oaGh1dfV8fHx/f/+MjIyampqnp6eysrK9vb3Hx8fQ0NDZ2dnh4eHp6enr6+vw8PD/AAD///+Zzx6PAAAACXBIWXMAABJ0AAASdAHeZh94AAAgAElEQVR4nO2di3baOhAA5TQPkuZxU/7/Xy9vZGODLe9KWu3MOW2gBAYjTWwEJWELAKsJpe8AQAsQEoAAhAQgACEBCEBIAAIQEoAAhAQgACEBCEBIAAIQEoAAK0MKR7q379lX2P31+3b31sLzx+VbY6autr8khKf4H7rQDW5zz+3Z0e8ZGm/uyNgdePgt0DQyIe3YzL3CdnpmXm/teey7pif06+kqZz535z+Ht5kQ0vHsjJBmfAs0zeqQDl9+PkJ4X3ytyX//6sLHopBC6O8RX8MmvI5ec/Q2Jm94dh+E5B2ZkLbb7xB+l19r6t+/9vuXRSH1zv7ujvOe4vtDSKCMVEjbzXGX9PW8O8j6PF206cLT4fT32/551NfpCodDqN/zk5rf67Ob/nw/nvt+605PwQZHXoML4g153+3QPuJd5LKQLvf2dLvnw9H34+Z87L58HLf5abexX/Fdu25+tM3gALGQvg/HUn+vT5h2M+pw+nO/gwnnk5eQduUdJ9zf64wfCekzvmqcy/CCeEO63d7o97LccHPDd7Ziz/Xe9kN6P5z9ejt8+Thohnct2vxom8EBYiEdTn4ffkB/Px8nV/e1/X3dH6Q9hb/b/Sx7ihcbfk57oufwc3Nrn+dDu90R4/suik3Yf1NvfxRfMCjh89D0WzSHF4U0vLfn+9x97ve73fHL036/t/8J8B4tjMSbH90KOEA2pM3xicnvfiYffxj/Xg/Srlc4nn89fcPTza19dued1+a0Gvi2/xrP994FgxKON/wZLTfEIY2szw0X9gbXO/3raXPOx63nZ2HRdfqbf/+Rg7aQDekpmpCni/ZfXndPFv7+XK9w/vl9PBh8j27izOZyg8er/Vz3D0d6F/Rn7fmgrrsuNywKaXhvz/f5N7ql05fvz/fnaIvizY9uBRwgGFIXz8jelPs5PJs4PEOPf+a/7hetoyO789W718tP/cvtD3Zs/Qv6Ib1f7sTEk697W7G9vbc9RfzloxvsxeLNj24FHCAW0tfweKY38z7fThM7npT7XVJ8ZDeY0KkhdZfp3N1896xVu8G9nQrpI4Tnzd+fXkijtwIOEAtps39y3UUvjPZn3mE5uBvM+93e6O/4KvX5XMKh3fW50etluWFhSP17OxXSU/gaXNoNXhc+3Qo4QCqk78OMeQtvxzPPtyFFe5XLokLYxEd2IyFtjjc4ttgQXdC75DWcX7v5uiS1PKR4HzgR0mlfG10ab/69m4b2kAlp/xah/bOB78Mywfdl0W17/tm9XwreXA7+wrmep663PHwb0u7AaXNc5f6OrrYdXBBfMT5WvLy7YVFIvXt7WVwf2yN9HBYYr1sUb350K+CA1SHF62yXl0mjvcT+y+nFye78WtDTec3ss/8c4jak+HXX69W2gwviK75HLx99nm99dNUu3lXG/xjd26frssnoc6QjX5e7Fm1+dCvgAJmQnjbnJwc/m/MbaOKZ93V4u8zlp/v30+mpw2+8kxkNKXon0PVq28EF8RW7+EnJ+cyikKJ7ezTeWbXr3r4Oz8nOd+26+dGtgAPKHsJ/8MI/tEHRkHbPJnhTJzRBwZCuz6wArFMwpCde94dm4GUOAAEICUAAQgIQgJAABCAkAAEICUAAQgIQgJAABCAkAAHqC+m/xjz/MnlybU9zAyTjISQ8lYpseQgJT6UiWx5CwlOpyJaHkPBUKrLlISRtWGxw4SEkbQjJhYeQtCEkFx5C0oaQXHgICU+lIlseQsJTqciWh5DwVCqy5SEkPJWKbHkISRsWG1x4CEkbQnLhISRtCMmFh5C0ISQXHkLCU6nIloeQ8FQqsuUhJDyVimx5CAlPpSJbHkLShsUGFx5C0oaQXHgISRtCcuEhJG0IyYWHkPBUKrLlISQ8lYpseQgJT6UiWx5CwlOpyJZnVUj/wWP+lb4DoIdQSCrY+kH0GFbtXHgISRtCcuEhJG0IyYWHkCQJ4fbxtLw9RUW2PIQkRwgvLy9hGJPd7SkssuUhJCHCIaMjvZSMbk95kS0PIUkQV3RMScdzD0Iq6iEkAQYV9UtiscGFh5BWM9wbDQ7vCMmFh5DWMppRtFMiJBceQlrJZEeE5MpDSKsYP6zrlWRqe2oS2fIQ0hruZXQuydL2VCWy5SGkFTzo6FiSoe2pS2TLQ0jpPOyIkPx4CCmZxx0dSmKxwYWHkFKZ0xEhufEQUhp3l+t6JRGSCw8hJTEzo31JhOTCQ0gpzO/oJVjYnipFtjyElMCCjnpvBFeFkIp6CGk5c58fEZIjDyEtZlFGLy9/Mj3EhFTUQ0hLWdjRyz9C8uAhpKUsDinTwR0hFfUQ0kKWdkRIPjyEtIzFHe1CylMSIRX1ENIilnf08oeQPHgIaQkJHRGSDw8hLSExpCwlEVJRDyEtIKUjQvLhIaT5JHW0X2wgpPY9hDSbtI4IyYeHkGazIqQcJRFSUQ8hzSWxI0Ly4SGkuaSG9IeQPHgIaSapHR1DylASIRX1ENI8kjsiJB8eQprH2pD0SyKkoh5CmkV6R8fFBkJq3UNIsyCk/CJbHkKaw4qOziGpl0RIRT2ENAdCKiCy5SGkGazp6LzYQEhtewhpBoRUQmTLQ0iPWdXRJSTtkgipqIeQHkNIRUS2PIT0kHUdXRYbCKlpDyE9hJDKiGx5COkRKzu6hqRcEiEV9RDSIwipkMiWh5AesLaj62KDckmEVNRDSA8gpFIiWx5Cus/qjgjJh4eQ7kNIxUS2PIR0n/Uh/YtOaz7ahFTUQ0h3Wd8RIfnwENJdCKmcyJaHkO4h0BEh+fAQ0j0kQvoTn1F8uAmpqIeQ7iDRESH58BDSHeRDUiyJkIp6COkOhFRSZMtDSNOIdNRbbCCkZj2ENA0hFRXZ8hDSNIRUVGTLQ0jTaISkVxIhFfUQ0iQyHQ0WGwipUQ8hTUJIZUW2PIQ0CSGVFdnyENIUQh0Rkg8PIU0hFRKLDS48hDSBVEfDkNRKIqSiHkKagJBKi2x5CGkCQiotsuUhpAnEQhosNhBSmx5CGkeso5uQtEoipKIeQhqHkIqLbHkIaRxCKi6y5SGkceRCGi42aJVESEU9hDSKXEeE5MNDSKMQUnmRLQ8hjUJI5UW2PIQ0imBIN4sNhNSih5DGEOxoJCSdkgipqIeQxiCkCkS2PIQ0gmRHhOTDQ0gjiIZ0u9hASA16CGkEQqpBZMtDSCMQUg0iWx5CukW0o7GQVEoipKIeQrpFNqSRxQZCas/zeEi7A/EZEfEk5R8/QqpCZMszc0i7wVdFyj9+hFSFyJZn3pB2Nyf0KP74yXY0GpJGSYRU1LMspAwdlX/8hEMaW2wgpOY8s0b0ukPqP0X6r0XCH1n+jf1jKL2VIEJySL1zOpT+QSS8Q2KP5MOzLKSRc+KUfvyyhKRQEiEV9cwZ0O7uWWlKP37SIY0uNhBSa55lITk4tJPuiJB8eJaHpL1yR0ii2yOPrQmey7MgpENF6m9sKP34EVItIlse3ms3QDyk8cUG+ZIIqaiHkAYQUi0iWx5CGkBItYhseQipj3hHhOTDQ0h95EMaX2yQL4mQinoIqQ8hVSOy5SGkPoRUjciWh5B6yHdESD48hNRDIaSJxQbxkgipqIeQehBSPSJbHkKKUeiIkHx4CCmGkCoS2fIQUoxGSFOLDYTUlIeQYrKGJFwSIRX1EFKERkeE5MNDSBGEVJPIloeQIlRCmlxsIKSWPIQUQUg1iWx5CCmCkGoS2fIQUgQh1SSy5SGkKyod3VlskC2JkIp6COkKIVUlsuUhpCuEVJXIloeQrhBSVSJbHkK6ohPS9GIDITXkIaQLOh3dC0m0JEIq6iGkC4RUl8iWh5AuEFJdIlseQjqj1NG9xQZCasdDSGdKhCRZEiEV9RDSGUKqTGTLQ0hnCKkykS0PIZ3Q6ujuYgMhNeMhpBOEVJvIloeQTpQJSbAkQirqIaQThFSbyJaHkE6ohXR3sYGQWvEQ0hG1jgjJh4eQjhBSdSJbHkI6QkjViWx5COmIXkgsNrjwENIBvY4ehCRXEiEV9RDSAUKqT2TLQ0gHCKk+kS0PIR1QDOn+YgMhNeIhpD2KHRGSDw8h7SkYklhJhFTUQ0h7CKlCkS0PIe3RDOnBYgMhteEhpD2EVKHIloeQ9pQMSaokQirqIaQ9hFShyJaHkLa6HT1cbCCkJjyEtCWkOkW2PIS0JaQ6RbY8hLQtHZJQSYRU1ENIW+WQHi42EFILHkJS7oiQfHgIiZAqFdnyEBIhVSqy5SEk7ZBYbHDhISTljgjJh4eQCKlSkS0PIZUPSaYkQirqISTtkB4vNhBSAx5CIqRKRbY8hKTcESH58BASIVUqsuUhJO2QZiw2iJRESEU9hERIlYpseQiJkCoV2fIQEiFVKrLlcR+SdkdzFhtESiKkoh5CIqRKRbY8hERIlYpseQiJkCoV2fIQknZIcxYbCMm8x3tI6h3NC0mgJEIq6iEkbQjJhYeQtCEkFx5C0mbWYgMhWfc4D0m/I0Ly4SEkbeaFtL4kQirqISRtCMmFh5C0mbfYQEjGPYSkDSG58PgOKUNHc0NaXRIhFfUQkjaE5MJDSNrMXGwgJNseQtKGkFx4CEkbQnLhWTV8/xkn/MnAv7nfGEo/HLAYoZBUyPiDKMcOafZiw9pdEnukoh5C0oaQXHgISRtCcuHxHFKWjgjJh4eQtJm7akdIpj2EpA0hufAQkjazQ1pZEiEV9RCSNoTkwuM4pDwdzV9sICTLHkLShpBceAhJG0Jy4SEkbeaHtK4kQirqISRt5i82EJJhDyFpQ0guPISkDSG58PgNKVNHS0JaVRIhFfUQkjYLFhsIya6HkLQhJBceQtKGkFx4CEkbQnLhcRtSWDLB17BksWFNSYRU1ENI2hCSCw8haUNILjxeQwqLnrusgZBceAhJm0UeQrLqISRtCMmFh5C0WeZJHw9CKuohJG0IyYXHaUhh6QRPZ9FiAyFZ9RCSNoTkwkNI2hCSCw8habMspPSSCKmoh5C0WeghJJseQtKGkFx4CEkbQnLh8RlSWD7Bk1nqSR0RQirqISRtFi42EJJNDyFpQ0guPISkDSG58BCSNoTkwuMypMPnntS62JBaEiEV9RCSNoTkwkNI2hCSCw8haUNILjweQwppEzyRpYsNhGTSQ0jaLA4psSRCKuohJG0IyYWHkLQhJBcehyGdPj2/2sWGxJIIqaiHkLQhJBceQtKGkFx4CEkbQnLhISRtli82EJJBDyFpkxBSUkmEVNRDSNoQkguPv5DOvzuWkBKxNcFzeQhJmxQPIZnzEJI2hOTCQ0jaEJILDyFpQ0guPO5COndU9WJDSkmEVNRDSNoQkgsPIWlDSC483kK6dERIqdia4Lk8hKRNmmf5uBBSUQ8haUNILjyEpA0hufAQkjaJnsUDQ0hFPc5CunZU92IDIVnzEJI2hOTCQ0jaEJILDyFpQ0guPISkTapn6cgQUlGPr5CijggpFVsTPJeHkLQhJBceQtKGkFx4XIUUd1T7YgMh2fIQkjaE5MJDSNoQkgsPIWmTGtLSkgipqMdTSL2Oql9sICRTHkLShpBceAhJG0Jy4SEkbdI9y8aGkIp6CEmb5MUGQrLkISRtCMmFx1FI/Y4MhLSsJEIq6iEkbQjJhYeQtFnhISQ7HkLShpBceAhJG0Jy4fET0qAjCyEtKomQinoISZsViw2EZMczY6i6PdFpEe80hHSFkMx45oQ0OKlcktLjN+yIkFKxNcFzeQhJmzUhLSmJkIp6Ho9UNzxNSItY5SEkK54ZIcVPkS5/He+CIcIfi4TSDxvcZVFIUTx290g3OyQbe6QFuyT2SEU9MweKkJIhJBceQtJm1WIDIVnxcGinDSG58MwLaWKxQQVCiiEkI56Z72w4fI1OK0JIPWaXREhFPU7ea3fbkZHFBkIy4iEkbQjJhYeQtCEkFx5C0oaQXHgISRsWG1x4fIQ00pGVkGaXREhFPYSkDSG58LgIaawjQkrF1gTP5SEkbdZ6CMmEh5C0We2ZOUSEVNRDSNoQkguPh5BGOyKkVGxN8FweQtJm7WLD3JIIqaiHkLQhJBeewSA9vX+J3OwKCGkIIRnwDAYphNC9fYrcciqENISQDHgGg/T793XXUnj++yNy6ymIP37jHdlZbCAkC56RQfrcdLuWnkrtlwjphlklEVJRz9gY/WzCYbckIliM9OM30REhpWJrgufy3I7R9+thd/T1HF5FDEshpBsIqX7PcIw+ny9HdaHM0nhrIa1fbCAkA57h8ncIr9/ni7Q/L2gcQrqBkOr3DJe/N9/j35cP4cdvqiNCSsXWBM/lGS5/i9zoKgjpljklEVJRz80Lssev6h8DOU1rIUl4CKl6TzxEXYgQufUUCOkWQqreEw/RR9TRh8itp0BItxBS9Z6JQ7uCENIIM4aFkIp6yoczRPbxm+zI1GIDIVXviUdotztq7jkSIYlja4Ln8hCSNoTkwsOhnTYiIc0oiZCKehoPabojW4sNhFS7ZzhAH912+xW6d5EbT4KQxiCkyj2DAfrYPTn62b8wW64kycfvTkeElIqtCZ7Lc/Pu76/dn4/vQu/83kNIYxBS5Z7bF2Q/w1PRF2ZbC0lmseFxSYRU1DMYny78vIXv/bMkkVtPgZBGIaS6PYPxed9/Htd+h7QRufUUBB+/ex0RUiq2Jnguz3B8NqH73O2YynVESOMQUt2epl9HqiIkIQ8h1e0hJG0IyYXn5tCua+i9di2F9LAkQirqGQzPpqU3rd7tiJBSsTXBc3lulr/L/dfYE62FJLTYQEh1e1r+H7KEpIKtCZ7LMxie11D8A7kIaYIHJRFSUc9gdH6653K/0OWI2ON3vyNCSsXWBM/luf1FY80sNlQSkpiHkGr2tBvSg47shfSgJEIq6im/uDCEkKYgpIo9hKQNIbnw3AzOx+vusO654O+kaC0kscUGQqrZM/xtFE+H50chfIncegpCj9+jjggpFVsTPJdnMDhvYbN/UfZvqV8guyWkO9wtiZCKekbe2XD+UwhCmoSQ6vW0GtLDjgwuNhBSxZ7xQ7tNeBO59RQIaRJCqtczXGw4/XekrtwbhUS263FHFkO6WxIhFfXcDM37UwhPm4JvXSWkaQipWk+jL8hWFJLgYgMh1eshJG0kQ7pXEiEV9fRH5vf9efcE6bXo/5KV2K4ZHRFSKrYmeC5Pb2Q+z5980pV7YwMh3YOQavXEI/MTwtv+TXZfr6Hgf5RtLSRRDyHV6olH5vrq0ZvxjywmJEVsTfBcnnhkunB+9ehn/xspCiGwXXM6shnSnZIIqahn8MuYR07mhpDuQUiVeghJG9HFBkKq1UNI2hCSC0+LIc3qiJBSsTXBc3n6IYUmPkWIkFSxNcFzeRoMaV5HLDakYmuC5/I0+F67pkOaLomQinoISRtCcuFpL6SZHRFSKrYmeC4PIWkju9hASJV6CEkb4ZAmSyKkoh5C0oaQXHiaC2luR4SUiq0JnstDSNqIeyaGjJCKeghJG0Jy4WktpNkdEVIqtiZ4Lg8haUNILjyNhTS/I7OLDYRUpYeQtBEPaaIkQirqISRtCMmFp62QFnRESKnYmuC5PISkjbyHkCr0EJI2hOTCsyqk/2oj/PFAKP0wwxmhkFRY8QNiyQ6JPVIqtvYUuTyEpI38YsN4SYRU1ENI2hCSC09LIS3qiJBSsTXBc3kISRtCcuFpKKRlHRlebBgviZCKeghJG0Jy4SEkbQjJhaedkBZ2ZDqksZIIqaiHkLRRWGwgpPo8hKQNIbnwNBPS0o4IKRVbEzyXh5C0ISQXnlZCWtyR7cWGkZIIqaiHkLQhJBceQtKGkFx4GglpeUeElIqtCZ7LQ0jaqCw2EFJtHkLSRiek25IIqainjZASOiKkVGxN8FweQtKGkFx4CEkbLc9w5AipqKeJkFI6IqRUbE3wXB5C0oaQXHhaCCmpI0JKxdYEz+UhJG2UFhsIqS5PAyGldWQ+pGFJhFTUQ0jaEJILDyFpQ0guPISkjZqHkGry2A8psSNCSsXWBM/lISRtCMmFx3xIqR0RUiq2JnguDyFpo7bYMCiJkIp6rIeU3BEhpWJrgufyEJI2hOTCYzyk9I4IKRVbEzyXh5C0UfT0Bo+QinoISRtCcuGxHdKKjggpFVsTPJeHkLQhJBce0yGt6aiBxYZ+SYRU1ENI2hCSC4/lkFZ1REip2JrguTyEpA0hufAQkjaaHkKqxmM4pHUdNRFSXBIhFfUQkjaE5MJjN6SVHRFSKrYmeC4PIWmjudhASNV4zIa0tiNCSsXWBM/lISRtCMmFx2pIqztqI6SoJEIq6iEkbXQ9hFSJh5C0ISQXHqMhre+IkFKxNcFzeQhJG0Jy4bEZkkBHjSw2XEsipKIeQtKGkFx4TIYk0REhpWJrgufyEJI2yiFdSiKkoh6LIYl01MhiAyFV4iEkbQjJhYeQtCEkFx6DIcl0REip2JrguTyEpI32YsO5JEIq6rEXklBHhJSKrQmey2MuJKmOCCkVWxM8l4eQtCEkFx5rIYl11MxiAyFV4SEkbQjJhcdYSHIdEVIqtiZ4Lg8haaPvCY8fN0lsTfBcHlshCXbUzmIDIdXgISRtCMmFx1RIkh0RUiq2JnguDyFpox/SsSRCKuohJG0yeAipvMdSSKIdEZL8AHn2EJI2OTzh7uOWbYA8ewyFJNsRIYkPkGsPIWmTYbGBkMp77IQk3BEhSQ+Qb4+ZkKQ7IiThAXLuISRtcoS0L4mQinoISZssHkIq7bESknhHhCQ7QN49hKQNIbnwGAlJviNCEh0g9x5C0ibLYsOuJEIq6rERkkJHhCQ5QHgISRtCcuExEZJGR42F9BIIqahnRkjdjvh0d++b19NaSJk8hFTW8zik7vJX9FWR2+1S6YiQ5AYIDyHpQ0guPDOfI3W9L6rcbJdOR4QkNkB4totD6j9F+i8L4Y9p/mXyhDyjATFLQ+od2WVebFDaIbW2aveSbf3V1p4il2dZSLdn5CGkNAipqGfWw9/dOScOIaVBSEU9cx7+rn8qb0haHbW22EBIZT1zXpDtn9ReuSOkNAipqGfG60jnpbpu23+XgxL97VLrqLmQ/uQqydYEz+Wp/b12hDTbQ0glPZWHpNdRc4sNhFTUQ0jaEJILDyFpky2kXMsNtiZ4Lk/dISl2REgSA4TnDCFpk89DSAU9hKRNRk+ewbQ1wXN5qg5JsyNCEhggPBcISRtCcuGpOSTVjhpcbCCkgh5C0oaQXHgqDkm3oxZDylOSrQmey0NI2hCSC0+9ISl31OJiAyGV8xCSNoTkwkNI2hCSC0+1IWl31GRIWUqyNcFzeQhJm5yLDYRUzFNrSOodEdK6AcLTh5C0ISQXnkpD0u+ozZBylGRrgufyEFJbHkIq5CGktjyEVMhDSI159AfU1gTP5akzpAwdEdKaAcqBLQ8haZN3sYGQCnmqDClHR4S0YoCyYMtDSNpkDkm/JFsTPJenxpCydERI6QOUB1seQmrNQ0hFPITUmoeQingqDClPR82GpF6SrQmey0NIzXkIqYSHkLTJvdigXpKtCZ7LU19ImToipFRsTfBcHkLShpBceAhJG0Jy4akupNDsIkA+j+6g2prguTyE1KCHkPJ7agsptDzBc3kIKb+HkBr0EFJ+DyFpk3+xgZAKeCoLKbS3pygQkm5JtiZ4Lg8haUNILjyEpA0hufAQUoseQsruqSukUGjiNefRHFZbEzyXh5Ca9BBSbk9VIYViE685j+K42prguTyEpE2JxQZCyu6pKaQwMiFyTTw9CMmFh5C0ISQXHkLSpkxIiiXZmuC5PBWFFEYnRK6J15qHkPJ6CKlRDyHl9RBSox5CyuupJ6QwPiFyTbzWPISU10NI2hRabNArydYEz+WpJqQwNSFyTTwtCMmFh5C0ISQXHkLSplRIaiXZmuC5PLWEFKYnRK6J15qHkHJ6CKldj9LY2prguTyE1K6HkDJ6CKldDyFl9FQSUrg3IXJNPB2KLTYQUk4PIWlTLiSlkmxN8FyeOkIK9ydEromnAiG58BCSNoTkwlNFSOHBhMg18ZrzqIyurQmey0NILXsIKZuHkJr2aAyvrQmey1NDSOHxhFDAhYeQcnkISZuCiw0vKiXZmuC5PISkDSG58BCSNoTkwlNBSGHOhJCHkBKxNcFzeQipcY/8ANua4Lk85UMadFR64rXmIaQ8HkJq3SM+wrYmeC5P8ZCGHRWfeK15CCmLh5C0KbvY8CJfkq0JnstDSNoQkgtP6ZBuOiKkRAipqIeQtCkeknRJtiZ4Lg8hte8hpAyewiHddlTBxGvNQ0gZPITkwCM7yLYmeC5P2ZBGOqph4rXmISR9DyFpU36xQbgkWxM8l2fVQ/zfSsIfB/wrfQd2hLUjBeMIhbSWsR0Se6RE7m6P5DDb2lPk8hCSNoTkwkNIPjyC42xrgufylAxptKNKJl5rHkJS9hCSE4/cQNua4Lk8BUMa76iWideah5B0PYSkTRWLDS+CJdma4Lk85UKa6IiQEnm4PVJDbWuC5/IQkjaE5MJTLKSpjggpkcfbIzTWtiZ4Lg8h+fEQkqKHkBx5ZAbb1gTP5SEkTx6R0bY1wXN5SoU02VFVE681DyGpeQhJm3oWG15kSrI1wXN5CoU03REhJTJvewTG29YEz+UhJG3qCkmgJFsTPJenTEh3OiKkROZuz+oRtzXBc3kIyZuHkFQ8RUK611F1E685z9ohtzXBc3kIyZ9n5ZjbmuC5PCVCuttRhROvOU9Y99lRUvOgKQ8haVPbYsOBNcNua4Ln8hQI6X5HhJTIsu1ZMe62JnguDyFpU2dIK0qyNcFzefKH9KAjQkpk6fYkP1EKF0TnxQ2E9MAoPCFSwbOwhHM+f6Ib0OyJkB4YxSdEGnhmp3TsZVKklBIh3RdqTIgU8Ox5sEPpJzQpUtktEdJ9oc6EWA6eE6NHZ2EsoXsi+ZYI6a5Pb0IsxPtiQ48wJEXk+ldsZg7pcUeElEiu7TFH4iAAAAdsSURBVMn2y2oJ6Z5uzTjJQkjyItHjO0K6Y1s3TqIQkopILiVCumNbPU5y4FES+fxE16whzemouYnXmifbZyMT0rRMZJyEwKMmkjm8I6RJl9A4yYBHUeTvY78ISRtniw0n3H3sV8aQ5nVESInUFZLA4R0hTZhEx2k9hKQs8vUhK4SkjduQ1u6UCGlcJD5OK8GjL3L0ISuEhEdR5OdDVnKFNLuj5iZea55sH1dESGMapXFaAZ4sIi8fspInpAUdNTfx/C42HEmeYYQ0YtEbp2QIKZModYoR0q1Ec5xSIaRcosRlcEK6ceiOUyKElE+UNMsIaahQH6ck8GQUpeyUCGmo0B+nFPBkFS2faIQ0MGQZp+XgyStaPNMIaWDIM06LwZNZtPTwjpD6glzjVKuHxYYLyyYbIfVuP+M41ekhpCuLZhshxWQdpyo9hBSx5PCOkGLyjlONHkLqMX/CEVJM7nHCU7to9owjpJjs44SndtHczzUmpJj844SnftGslAgppsg44aleNGPeEVJMoXGqyMNiwyiPD/AIKabUONXjIaQpHqRESDEFx6kSDyFNc3e3REgxRcepCg8h3WW6JUKKKT1OeOoXTbRESDEVjBOe+kVjKRFSzPLHtLWJ15pHSXT4Xeq9qUNIMcsf0dYmXmseTVEvJkKKWf5gtjbxWGxYxiUlQopZ/kASUhqthHRZfCCkmOUPIyGl0U5IL8djPEKKWf4YElIaTYW0Z+67xNdCSOvAU7noj8zvdH4IIa0DT+WivSfHXmk0pJvF+EcQEp5KRSePekuDkA4vaB0+s2fRrRASnkpFV49uS1FI54SOLLoVQtKGxQYJj+I8vYQUBp8dt+hWCEkbQhLx6O2VjiGFYUaEVJmHkKQ8SnP1v9GKXgipMg8hiXl09krjFb0QEp42RKMe8ZZCmNyeZbcje7duEHr8FMBTuWjKIzhlDzsjQloHnspFkx6pvdLpkI6Q1oGnctE9j0BKl2dGhLQOFhsqF933rEspXmAgpHUQUuWiR57klAbLdIS0DkKqXPTYk/Rk6Waxm5DWQUiVi2Z5FrY09poRIa0DT+WiuZ65/+MhTLzySkjrwFO5aInnUUxTEd31LJrohISnUtFST7j9aLxt/N+LFnsWTfQZIXU7xk7PYuGDcW+7hMFTuSjNEwas8AiH1F3+6p8mpHmw2GDVQ0hVeQjJqoeQqvIQklVPvpD+g8f8K30HQA+hkFSw9QGb/jzNbVCuz7UjJDxFRLY8hISnUpEtDyHhqVRky0NI2vzL5CGkop4F72zootOK2Hr8HkNILjx5fnPGEmw9fo8hJBceQtKGkFx4CAlPpSJbHkLCU6nIloeQ8FQqsuUhJDyVimx5CEkbFhtceAhJG0Jy4SEkbQjJhYeQtCEkFx5CwlOpyJaHkPBUKrLlISQ8lYpseQgJT6UiWx5C0obFBhceQtKGkFx4CEkbQnLhISRtCMmFh5DwVCqy5SEkPJWKbHkICU+lIlseQsJTqciWh5C0YbHBhYeQtCEkFx5C0oaQXHgISRtCcuGpLyQAgxASgACEBCAAIQEIQEgAAhASgACEBCAAIQEIQEgAAhASgACEBCBAPSHFvzA9wy9Pz8B1E9rYnm03cdoqkgNUTUjd5a/+abtEPxcuf5kmnmst/GAQHSBC0qLbNhZSF/8Et785wgNESHo0FlLLh3a355ZCSHoQUuW4CKmBg3BCqpxu8sxyqgxpv4jSwEgRUuW0H9LwtE0IqXIkt6fKkBqZeIRUOd3oyTQISQ9Cqpxu5FQq1YR0eXG5i07bprv83cb2RBvUUkjdcXxWjlA9IQEYhpAABCAkAAEICUAAQgIQgJAABCAkAAEIqQpCCOevJ/qnby688NEdLpm+6Y8WXvGpH0Kqgc9dHJ/7E8tDOpy5F9K9y0AMHuUaeAuv4e185jzzbwoYTeJhJ4SUBR7lGgjh9zrf54X03oWnj+Ne6njJ7s9reN3+PIXX393lX68hdJvzN2x/30J4+42uCLIQUgV87nZHb8dju+3MkDaHQ7yPXki7dMLfp91fb8eDxR2bc0jd/utTdEWQhZAqYB/R5+XY7hrS4BlR/A8h/Gy/Qhc9R9r383efzt/9uafwd7v9Pke2fd/9+66hj+iKIAohVUB/wWBWSF14++xfd1/I/hjxfAM/n+/Pl5CeDv+0O/K7XhFEIaTynA7Dzsd2sw7tPnfHak8/215I8bnt83Xxb3ttMLoiiEJI5Xk7zfLTsd3MVbvvp9B9TYb0Fp4+Pn9GQrpcEUQhpPJ0++Ox7e/5icvs5e+PXj/9kI5LdYNDu94VQRQe0eJ8nXZFb+G4n5gVUrf75u/BYsO2d+5r+3t9jrTZLzb8Dc/RFUEUQirO5hTQ536yb0cWG8be2XBcxX7f/2M3FtImeoPE7ht+D8vf4Tu6IohCSMW5fFjA6cSskLabLnT7HD7GQ9o/8Xr+2p86fMP253B+G10RRCEkAAEICUAAQgIQgJAABCAkAAEICUAAQgIQgJAABCAkAAEICUAAQgIQgJAABCAkAAH+Bz9z3UtlmI8JAAAAAElFTkSuQmCC", "text/plain": [ "plot without title" ] }, "metadata": { "image/png": { "height": 420, "width": 420 } }, "output_type": "display_data" } ], "source": [ "# Plot density plot of ATE_estimates with ggplot\n", "ggplot(data.frame(ATE_estimates), aes(x = ATE_estimates)) +\n", " geom_density(fill = \"blue\", alpha = 0.5) +\n", " geom_vline(aes(xintercept = ATE), color = \"red\", linetype = \"dashed\") +\n", " labs(title = \"Density Plot of ATE Estimates\",\n", " x = \"ATE Estimates\",\n", " y = \"Density\") +\n", " theme_minimal()" ] }, { "cell_type": "markdown", "id": "645df4f8", "metadata": {}, "source": [ "## Outcome missing under nonignorable nonresponse\n", "Now consider a different setting, in which the outcomes are missing under nonignorable nonresponse assumptions in [Bia, Huber and Lafférs (2023)](https://doi.org/10.1080/07350015.2023.2271071). \n", "Let the covariance matrix $\\sigma^2_X$ again be such that $a_{ij} = 0.5^{|i - j|}$, but now $\\gamma_0 = 1$ and $\\sigma^2_{\\varepsilon, \\upsilon} = \\begin{pmatrix} 1 & 0.8 \\\\ 0.8 & 1 \\end{pmatrix}$ to show a strong correlation between $\\varepsilon$ and $\\upsilon$. Let the vector of coefficients $\\beta$ again resemble a quadratic decay of coefficients importance; $\\beta_{0,j} = 0.4/j^2$ for $j = 1, \\ldots, p$.\n", "\n", "The directed acyclic graph (DAG) shows the the structure of the causal model." ] }, { "cell_type": "markdown", "id": "bcfc2d49", "metadata": {}, "source": [ "
\n",
" \n",
"