{ "cells": [ { "cell_type": "markdown", "id": "e401eb2c", "metadata": {}, "source": [ "# R: Basic Instrumental Variables Calculation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this example we show how to use the DoubleML functionality of Instrumental Variables (IVs) in the basic setting shown in the graph below, where:\n", "\n", "- Z is the instrument\n", "- C is a vector of unobserved confounders\n", "- D is the decision or treatment variable\n", "- Y is the outcome\n", "\n", "So, we will first generate synthetic data using linear models compatible with the diagram, and then use the DoubleML package to estimate the causal effect from D to Y. \n", "\n", "We assume that you have basic knowledge of instrumental variables and linear regression." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "vscode": { "languageId": "r" } }, "outputs": [], "source": [ "library(DoubleML)\n", "library(mlr3learners)\n", "\n", "set.seed(1234)\n", "options(warn=-1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Instrumental Variables Directed Acyclic Graph (IV - DAG)" ] }, { "attachments": { "basic_iv_example_nb.png": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAaMAAACbCAYAAADPy/SAAAAAAXNSR0IArs4c6QAAIABJREFUeF7tnQl0FUUWhi9CWAwEoyAgYAyi7OuAMg4CCsMSUTQuoCyOrIqAgiDg4OACAoqjw6LRIKNsIsIYEWWJIAKCOIgaQWBkF1FkH1EWWeb85XR4Sfq91+/1e73+dY5Hj+nuqvrqvv67qm7dW+jcuXNnROQCYSEBEiABEiABewjMLkQxsoc8ayUBEiABEsglQDGiMZAACZAACdhOgGJk+xCwASRAAiRAAhQj2gAJkMB5Alu2bJEff/xRzp07J0lJSdKwYUP1xx07dsiuXbt0UdWrV0+Sk5PV35YvX657TZkyZaR27drqb6jjhx9+0L2ucePGkpiYKIcOHZKcnBzdaxISEqRcuXJSoUIFdS2LJwhQjDwxjOwECZggMHjwYJk+fbr89NNPeZ5yzTXXyNq1a9X/GzFihIwePVq3luzsbGnVqpX8/PPPSsD0SocOHSQrK0v9qWfPnvLaa6/pXvf1118r0Vq4cKGkpaWF7dXVV18tbdq0ke7du0v9+vXDXs8LHEuAYuTYoWHDSCCOBBYvXqxe4igrV66UrVu3SpUqVdQ/lStXjmPNsXn0zp07Zc+ePbJs2TJ555135PHHH5f09PTYPJxPsYMAxcgO6qyTBOwisGLFCrn//vvlyJEjsm3bNilRooRdTYlLvZ988omMGjVKzfSwNMjiGgIUI9cMFRtKAiYJDBw4UF588UV54IEHZOzYsUGX1ExWY+vtu3fvVjOk77//Xt58801p0aKFre1h5YYJUIwMo+KFJOBiAr1791azhXfffVdat27t4p4YazoENyMjQ9566y256667jN3Eq+wkQDGykz7rJgErCMBBAQ4D2CeCU4JfymOPPSZjxoyRuXPnyu233+6Xbru1nxQjt44c200CRglgyWr//v2+9Dbr0qWLzJw5U3kKli1b1igyXmc9AYqR9cxZIwlYQwBOChdddJE1lTm4llWrVknTpk0d3EI2TUQoRjQDEvAiAZz5SU1NlVmzZvlij8jIGB49elRKly5t5FJeYz0BipH1zFkjCcSfAA6pYgN/+/btnvSai5QglipxmHbp0qW5USUifQavjysBilFc8fLhJGADAeyPpKSkqIgJgwYNsqEFzqyyZs2aat8Ms0UWxxGgGDluSNggEjBJYPz48TJs2DAVnsdrh1rNoMnMzJS+ffuqOHuVKlUy8yjeG3sCFKPYM+UTScBeAm3btpUzZ84IYsaxnCdw8uRJueyyy+SZZ56RPn36EI2zCFCMnDUebA0JmCNw+vRpKVmypDz77LMyYMAAcw/z4N2dOnWSiy++WF566SUP9s7VXaIYuXr42HgSyEfg+PHjKgwOZkeYBbDkJYCAqkhzgeCwLI4iQDFy1HCwMSRAAnElsG7dOiVGiErB4igCFCNHDQcbQwIkQAL+JEAx8ue4s9deJTBnzhzp2LGjytTKQgIuIkAxctFgsakkEJYAxSg0IkTxvvvuu+Xs2bNhWfICSwlQjCzFzcpIIM4EKEYUozibWLweTzGKF1k+lwTsIOAUMfrll19U7iTkUEL21YSEBGnZsqVK7Fe1alXZuHGj8vizunBmZDVxw/VRjAyj4oUk4AICThCj9evXqzNOl19+uQwfPlxq1KghRYoUkcOHD8s777wjTz31lApV1LlzZ8uJUowsR260QoqRUVK8jgTcQMBuMfr222+la9euUrduXXnhhRckMTGxALaPP/5YPv30Uxk6dKjlSClGliM3WiHFyCgpXkcCbiCAMzSYeSxbtszy5iIE0ZNPPilTpkyRrKysoFllEZYH0bPT0tIsb+Pbb7+t0pDT29By9OEqpBiFI8S/kwAJGCOwa9cuueeeeyQ5OVntFeHfTiucGTltRHLbQzFy7NCwYSTgMgJYemvdurUSJCzRMWK4ywbQ3uZSjOzlz9pJILYE4KU2d+5cGTlyZGwfbOBp8JpDem9ExKYYGQDGSwIJUIxoDyTgJQJ27ols3bpVEBUbSewmT54spUqVchza/fv3yzfffCPNmzd3XNt83iCKkc8NgN33GAE7velwtqhfv36ydu1agSjWqlXLcXS5Z+S4IdEaRDFy7NCwYSQQBQE7xQjNxVIdzg/ddNNNKqeSnms3RGvJkiXSrl07KV68eBS9jP4WilH07OJ8J8UozoD5eBKwlIDdYgSXaRxsHTRokNSpU0eGDBkiTZo0kaJFi8qpU6fU+SJkoMWh2LJly1rKBpVRjCxHbrRCipFRUryOBNxAwG4x0hjt3r1bpk2bJv/617/kiy++UNlVEQ4IQUpxvqhYsWK24KQY2YLdSKUUIyOUeA0JuIUAPOnuvPNOHuoMMmCzZs1Sy4g89Oo4i6YYOW5I2CASIIG4EZgwYYKKi7dv37641cEHR0WAYhQVNt5EAiTgWgKbN2+W6tWru7b9Hm04xcijA8tu+ZgAPNXgMNCiRQsfU2DXXUaAYuSyAWNzSSAsgf79+8vq1avl888/D3utny7Ys2ePbNmyRTlSsDiOAMXIcUPCBpGASQLwXmvYsKF8+eWXUq9ePZNP887tEOl58+bJzp071cyRxVEEKEaOGg42hgRiRKB+/fpSpUoV5VrNInLgwAGpWLGijB07VgYOHEgkziNAMXLemLBFJGCewOLFi1Va7w8//JDLUiLSpUsXef/992Xv3r2MJm7evOLxBIpRPKjymSTgBAJ33HGHOnAK77GEhAQnNMmWNmjBY19//XW59957bWkDKw1LgGIUFhEvIAGXEvjxxx8lPT1dhecpV66cS3thrtlwWqhdu7aaJc6ePdvcw3h3PAlQjOJJl88mARKwl8ChQ4dUWgs4LjgxpYW9dBxVO8XIUcPBxpBAnAgsXLhQkG8IHmUsJOBAAhQjBw4Km0QCMScAh4a77rpLOnToIFOmTPG0azNctxGQdeLEidKoUaOYs+QD40KAYhQXrHwoCTiQwLZt21SeIZRx48YpYfJaycrKkq5du0rdunUFQVFTUlK81kWv9odi5NWRZb9IQI/AiRMnVKDQUaNGyXXXXSdvvPGGVK1a1dWwdu3apYQH6SG++uorldQPeZRYXEWAYuSq4WJjSSBGBDBL6t27t2RkZMhVV10lBw8elPnz50tqamqBGpAcT8vIumLFCjl79myBa5CvCLMRlE2bNgWNiq0969ixY7Ju3Trd3lSuXFmuvPJK9becnByBE0JggaDin1tvvVX979tuu01Fm+jRo4f07NlTypcvHyNKfIyFBChGFsJmVSTgOALr169X4oGXP5LeHT58uEAbd+zYIVdccYX6/4UKFdLtQ5s2bWTRokXqbzhgOnPmTN3rtm/frgRPC1mkdxGyxD7//PPqT3DJxn5X/tK8eXNZvny5+t+DBw+Wp59+modZHWddETWIYhQRLl5MAh4igBTgeNkjjt2yZctc2TN4CKL9mJVBDEuWLOnKfrDRQjGiEZCAHwkgqjdmM/A2++CDD1w9q8AS3Y033qj2vhD+KCkpyY9D6vY+U4zcPoJsPwlESgDLW1iSa9q0qbz33ntSrFixSB/huOs3btyo8jdVqlRJzfKSk5Md10Y2KCQBihENhAT8RCA7O1tuvvlmadWqlQoT5KWYdchVhL2kMmXKqP0k/JvFNQQoRq4ZKjaUBEwSQNRqeJ61b99e5syZI0WKFDH5ROfdDgeJZs2aqb2jjz/+2Lcx+Zw3MmFbRDEKi4gXkIAHCGAWhAgMiOQ9Y8YMKVy4sAd6pd8FnDvCDAl9hCs68hixOJ4AxcjxQ8QGkoBJAohWDXfrzp07C9IoBHPPNlmNo27//vvv1QzpzJkzaobESAyOGh69xlCMHD9EbCAJmCAwbdo0+ctf/qIOg77yyiu+ECIN1759+9QMCQdsMUNC5lsWxxKgGDl2aNgwEjBJIDMzU/r06SMPPvigChrqx4J04/Cyw78xQ6pWrZofMbihzxQjN4wS20gCkRKYNGmSSheB6ATPPfdcpLd76npElcA5JCTag5ddrVq1PNU/j3SGYuSRgWQ3SCCXwPjx41WgUITIGTFiBMmIyH//+1/lzo6IDTiHVL9+fXJxFgGKkbPGg60hAXME/va3vykRwmwIsyKW8wSwd4SoExs2bFCRGho3bkw8ziFAMXLOWLAlJGCOAGZDCDA6YcIE6devn7mHefTu48ePq+gTiBiOAKxIo8HiCAIUI0cMAxtBAiYJYH9o8uTJynW7W7duJp/m7dtPnjypolCsWrVKxeWDgwOL7QQoRrYPARtAAiYInDt3TnnMTZ06VR1m7dSpk4mn+efW3377TUWjwHId4vP9+c9/9k/nndlTipEzx4WtIoHwBCBEOEOELKcI74OXK4txAqdPn1ZRKRYsWKDi9Gkp2Y0/gVfGkADFKIYw+SgSsIwAIgsgqsK8efP4IjVBXeM4d+5cCroJjjG4lWIUA4h8BAlYSkD7osd+B5eYzKPXZpjITsulTvM8o3wCxShKcLyNBGwhoO114KwMN99jNwTa3tuUKVPoBBI7rJE8iWIUCS1eSwJ2EtC8wNasWUO35DgNhOaViDh+vXr1ilMtfKwOAYoRzYIE3EBAOx+zfv16HtiM84DhvBaiWCCeH89rxRn2+cdTjCxDzYpIIEoCWuSATZs2MZRNlAwjvQ1RLBDNgpEsIiUX9fUUo6jR8UYSsIDAkSNHpGXLloKEcR999JHUqVPHglpZBQhoMf5GjhwpTzzxBKHElwDFKL58+XQSiJ4Aok1DiJAoDtGma9SoEf3DeGdUBLBUN2DAAHn00Udl3LhxUT2DNxkiQDEyhIkXkYDFBALz8KxevZqJ4SzmH1idlqDQz3mhLMBPMbIAMqsggYgIQIiaNm3KDKURUYvvxYEZc1999dX4VubPp1OM/Dnu7LVTCQSmyv7kk08kJSXFqU31Xbtmz54t99xzjy9TuFsw2BQjCyCzChIwRAB7Q82aNROEqEGKbAqRIWyWXgRBQhimzp07q8OxhQoVsrR+D1dGMfLw4LJrLiIAIfrTn/4khQsXlhUrVkjFihVd1Hp/NRVBVRFgFbMkClLMxp5iFDOUfBAJREkAbtvNmzdXQgRnhXLlykX5JN5mFYH3339fbr31VrnjjjtUPDuMHYspAhQjU/h4MwmYJLB9+3a1NFeyZEm1NEchMgnUwtshSEjb0b59e3n77bcpSObYU4zM8ePdJBCaALKJwjNOr0CIkPa6TJky6hwR/s3iLgLZ2dkqDxIECTmlihQpUqADv/76q1x44YXu6pj1raUYWc+cNfqJAL6cGzRooELLBJYtW7aopTkI0MqVKyU5OdlPWDzVVwgS0pi3atVKsrKy8ggSoqzfeOON0qNHD5UIkSUoAYoRjYME4kUArtnarOiZZ56R4cOHq6ogRPj/lSpVUrHmKETxGgHrnouZbVpamhIeODgkJCSoypHOHA4pV1xxhRp3FooRbYAELCeQnp4u2Fc4deqUqhtBN9u1ayctWrSQ1NRUFX07KSnJ8naxwvgQgPMJZkf40EDSw9tvv12WLFkimB2h/POf/+TsKDh6zoziY5Z8qt8JBM6KAllg7wDBTrG0U6pUKb9j8lz/Me5t27ZVe0SHDh0SZOXVSrVq1WTz5s2e63OMOkQxihFIPoYE8hDIPysK/OPf//53GThwIIl5lACW5rD8evbs2QI9xLmke++916M9N9UtipEpfLyZBHQIYLkGB1hDlczMTBVWhsVbBLp27SqzZs3SFSL0lLOjoONNMfLWT4G9cQIB7BUsWLAgd68oWJv4leyE0YpdG+AxN3Xq1LAP5LjrIqIYhbUcXkACERAwMivC44oWLarEaubMmSqsDIu7Cdx///3yyiuvGOoEZ0cUI0OGwotIwAwBhIeBJ5XmQZf/WcWKFZOTJ0/K3XffLciPE245z0xbeK+1BN544w3BfmBOTo5ccMEFQZfq0Cpc261bN2sb6OzaODNy9viwdW4isGbNGhVRQa9oM6G+ffsqEapZs6abusa2RkBg/vz58uKLL6o08dq457+9evXqsmnTpgie6vlLKUaeH2J20DICd955p+BFFDgrwhdyiRIl5KGHHpJ+/fpJhQoVLGsPK7KXAA67Tp48WYUJ0mbEgS3i7CjP+FCM7DVX1u4VAnqzIgjPoEGD1EwIgsTiTwJff/21TJo0SZAhNnCmxNkRxcifvwj2Oq4EAmdFNWrUkEceeUTFI2MhAY3Anj17lCj94x//UFEZkESRs6Nc++DMiD8VEjBLQJsVIQzMww8/rMLAsJBAMALHjh1Ty3dwdihbtqxs2LCBsERiK0Z79+6V/fv3y5EjR+TcuXOOBYww74iWDEO45JJLHNtOvzXMrfaDrJ9w7UWQTBYSiIQA3MEROgiHZZHt98CBA359f5oTo8OHD6ukUm+99ZbKx6IX/iKSgbHj2tKlS6uMjZ06dVIRdpmx0bpRgP1gcxf2g8RybrQfJMWDO3fHjh1pP9aZjidqQuw67f3pVvu/6KKLVIJB2D+CxJp4f0YvRk899ZSMHDlSihcvrl7k+DpMSUmRSy+91BVJwvAVsm/fPvniiy9UHnskQatSpYpkZGSolwpLfAk8+eST8sQTTxSwH2Q6dcNsNZj94EsXP0oWEghFQLN/zIpw5gxLu3h/utH+EY0cAWKvvPJK5aQR5QpB5GL06aefyn333SdYUnn66afVJm1iYqLrLQ9ZNx977DH1lY4T8dhkZObN2A8r9le6d+8uP/zwg+CDxqv2M2HCBFeIauxHmE8MRQD2jyR7+BCG/ffq1csTnpbbtm2Tv/71r7nvz4kTJ8rFF18ciTFEJkZIEIYKcXJ4/Pjxas/FawVi27t3bzl48KCaQgc7xOi1flvRn9GjR8uIESNoP1bAZh2OI4CPd2T8xQfY2LFjPfmxixkS9k+1LZw//vGPRsfBmBjBIQHr4kuXLvWNKyI2FGfMmCFjxoyRYcOGGQXK63QIwH7g+oxkctOnT5cuXbp4mtOJEyfUCwfRm/HSGTp0qKf7y86FJoAXM+wf70+8Uzp37uxpZMePH1dpMvAx/+yzz8qQIUOM9NeYGCEz5TfffCMffPCBNGrUyMiDPXENQOJF8vzzz6vDiyzREYD9IPQJsp76yX4gREg1jtAwiMDA4k8CiD+4detW9f78wx/+4BsI+JDH1ge2PAYMGBCu3+HFCCqOnO7Y4G/YsGG4B3ru71jXxWbju+++K+3bt/dc/+LdIc1+1q5dqzKc+q3ASQPLM7Qfv4387/31u/1jWRLL84sWLQrnGBZajHA6GM4KyM2SlpbmT2v6v0EhEjOcHOjUYNwM4KWI5Sqw87P9YIl7yZIltB/jpuOJK5HbCA4KtP/ft3i+/fbbUO/P4GKEdf6qVasqt0N4Rvi5IOT/VVddJThhj30AlvAENPvBlyGm6X4uv/76qyAOWbNmzdSeAYv3CcD+U1NTlefcCy+84P0Oh+gh7B9agiMP06ZNC3ZlcDGCRxmW5zAbKFWqlK9hovNYZsHhWOS2v+GGG3zPIxwAfBFmZWXRfv4PCr+l9PR02k84w/HI37EigAjufH/+PqDz5s1TTnDwtgvioawvRj/99JOUL19epdCFsrP8TgDKjigBECSW4AQ0+8EyHROIneekHYaFVyGLdwlo9o9ZgNc9RyMZRdg/UqpgyVqn6IsRvH+w6Yo4SYUKFYqkPk9fi2RZLVu2VAc2cVKaRZ8AliVGjRpF+8mHB+vmiO5B+/H2LwfetziHiXFmOU8gOztb2rVrp2Lw6bw/9cXommuukQYNGhjO6e4X4Aj5jlA1CIM0cOBAv3Q74n42btxYuXC//PLLEd/r5Rs0+4F3Jl29vTvScN9u0qSJiszNcp6AZv+Y6PTv3z8/moJidPr0aUlISFDx2nBwiSUvgbZt2wqCY86dO5dodAho9oMlChwcZslLoHXr1oLgkggQy+I9Apr9++FwazSjh6U6eCTPnj07vBhhww0B75Ay9/rrr4+mPk/fg9TRn332mfqHpSABxKiC5wzOpeGwH0teAn379pX169cLwk6xeI8A3JevvvpqWb16tUQQCsd7IIL0CKGCcnJyFJ98peDMCCLUvHlzta532WWX+QaS0Y5iP23cuHFcDw4CDKHwEXEB6+VwgmHJSwAJ1bCngN8Xi/cIYF8ZUasRCBUZDFjyEsBeGo56fPfdd+HFSHPBc3JyPDsHGPGWkC6DfPRHgfYT2joRFR4pV2g/dv6K41c3lu8Rh47jq8/4zTffVFkRdPgUnBlpMLHZBDc8lrwEND5w8aanYUHroFiH/sVgrwiJyPiy8uabBR8bCBTgxkSRVowI9opwEB76EnaZjmIUeki8LkbYz8CeT1JSUlS2STGiGEVlOA65CQGhK1SoIMnJyVG1iGIUGhvFKCqz0r/JD2IE11SclsZyA/4dyQyZYkQxiuHPzfJHQYxq1aqloq1gOR72D+9io4Vi5CExOnr0qPTs2TOk63SHDh3UGSg7Dp76RYyKFCkicFMtUaKEWlaCMBkJdmqXGG3ZskW186uvvirwa0DGSZx9gncfXi6IE2fXEiuX6Yy+1u25ThOjwoULq6WkokWL5tr/zTffHLZRdokRwuwgdmZgadOmjcycOTM34zD+OzAiRP6/h+1cDC5w1cwIGVYnTZqksq1iuqwVrLEDJjKFIpo4PP7sKH4Ro0C2+EGeOnVKpRHG5iNe6MH42yVGaC/W6TMzM1WmycBzHr/88os6qgAvHriUImMxroHgWl0oRlYTj6w+TYz07B/nw7AfhA+zYPEp7RIjtHfnzp0qfxCcBAYPHqyioBQrViwPgHXr1inbx8F9REOw+jfgKjHavXu3IOJt3bp180CEyzCCbyI/DAzCri9bP4pR4EDAuBHFvHLlymojEsIUmDDMTjFCO7UvRL1Dhz///LNKloi/ZWRk2GJHFKPIxMHqq/XESM/+K1asmGv/mHVrxU4xQhvgUt6nTx+1qjFlypQ8xyu0D3osu9v1DnWVGOkZH85k4DR/jRo1VBrbxMREq200tz5NjLx6gn7Hjh2G02RrwoRxgTDhixHLZHa6vocSIwyitpxXtmxZFc4+cPZthVFpYuRV+7GCYTzr2LNnj+Gszpr945Arlr/wYYYDnXZ70+HDHdFzbrrppjzvS2RbXr58ufqot3pGpI2Zq8UISyyPPvqoOiRo1z5RoPFrYhTPH4Tbnq0t4+HEOfI+4SVvl+tyODE6ceKEPPLII/LSSy8JfrTIMWRl0cTIyjpZV3wJaPaPmJ6wf+Q8s9O1G789LNUhfiaCFkMcEUkcy9RDhgyJ2lMwFhRdK0aAOmHCBAXUzn0iPTHy6jkjuHYHLrsFM0DtqxA/Pm1WVLNmTXHyMp3WFwRqRDpkO+KHcZkuFq+0+D0j3DKdVrNm/1WqVMmdFdWpU0fsXqbT2odlOkT6QLBiLNdt3rxZZRzAKoadxbVi5JR9IorR7wS0HyDCRMGRActy+BoMLBSj0D91ipGdr8LwdYcSI83+4cWr7Zfmjz/nFDFCT7X9I+Rfw4oOgvTaXVwpRgg4iH0ifKXbvU/kZzHCGYvffvtNSpcunetJhNhbwYrTxej48eNq+QJLKUjyhVD/VhaKkZW0I68rvxhp9o9s11juwr4QclIFK04SI7QRqwAI0YV2VatWLXIgMb7DdWLktH0iv4oR1sK1g38422WkOF2MNAcGLK+89tprlq+fU4yMWJF91wSKETb5Nfu/7bbbDDWKYhQak6vECAfNENkYiamC7RMdPnxYYDR2pCjwg2s3vqa06AsQpEiKk8Uo0LV7+vTpYlRgI+l/uGspRuEI2ft3vFeGDRuWa/849B1JoRh5SIzC7RNBiCBUOLBlZKM9EkMycq3XxejQoUPqcGu0xU4xCnboFUtz8LJD+Pp///vfMnHiRBU5O5IwR9HyyH8fxShWJOPznP379wvc/qMtThIj2P3jjz8ur776qlqmQ2JQu85najxdMzPS9onWrl0b0hauu+46teafkpISrc1EfZ/XxShqMP+/0S4xChcOCDaDcC6IOWZnnhmKkVkLc/b9ThEjI+GB7CAZkRgtWLBA/WihqsWLF7ejvY6uU4vvZNc5GkfDERHNfshHf6SwPIhwLNgXZfEegffee09uueUW287ZOZ0otl4efPBBOXbsWP6mFsxn9Pnnn0ujRo1k69atKv04S14CyPKKw7dIz85SkABiXyE8CvikpqYSUT4CY8aMUY4T+H2xeI/AZ599Jtdee62KE2fHyo3TiY4ePVr5AvznP/8JL0ZIF41zJAgbYVcwUicD7devnwp5s3LlSic307a27d27VxC3C4FJr7/+etva4dSK+/btKxs3blTRH1i8RwDhhBC3cdWqVbY4WDmdKFYFsKSO9Oz5SsGZES7Ay2TAgAGGY5Q5HUAs2weniRYtWqjTzSz6BGA/Dz/8sAo9wpKXQIMGDaRVq1by3HPPEY1HCcD+Bw0apMJOseQlUK9ePeVIgRUmQ2KEr384EcDziOU8ge+++04uv/xyfvWHMQqsCWO5Lpwjit9sS7MffjV7e+QfeOAB+fLLL2XNmjXe7miEvcPSNMKHIY1L/sgVIqI/M/rwww/VKeNdu3aply/L7wQQCQL/wP3TbhdJJ49Jdna2Cj1C+8k7SmPHjlXu5bQfJ1uv+bYtXrxYff1jy6N8+fLmH+iRJyCPGM6QBrF/fTHCeQ1Mp+AOi816FlHehTi1j68eBNlkCU5Asx9knkSgRpbz9oNZIxJEsniXAOwfAUmxHIszkSwiOHCOVBv9+/dXCQB1ir4Y4UJsMCHKK5ZaApNH+RUs0ljArRteYvmzJ/qVSah+037y0kHmTYT1p/3449eyaNEiSUtLU8t1+ROF+oNA3l5iDw1nsELYf3AxwqMAE6HH4e6dnJzsR4aqz/B8QnDQqVOnqqRVLMYIIEoGPGfwg0xKSjJ2kwevwrIllm1ef/11FfyXxR8E8M5AHjbsvdP+26o8Z4h2HqSEFiOseWKDBoFxAAADF0lEQVS5DtMrvJALFy7sDysK6CWiQuDcFZKwzZ8/n3tFEViAZj+1a9cWvJD9aj/woLvhhhsEByJZ/EMADivwvvWz/SO7LFbW8DGG6DUhSmgxwo3wisJ5EaTVzczM9I8licjBgwcVSHzVwDMm0qCJvoIVpLOa/XTr1s13+48HDhxQ9oP0G7Qff/4aMCtCzq8+ffpIRkaGryDAUQF9v+SSS9S5qzARfcKLEehhrQ+BJTt27CiILeSHghlRmzZt5OTJk0qQK1So4Idux6WPmv0gHwxiCvqhYHkSX4PIBQX7oVeVH0Zdv494Z8L2kZAS+85+KNjewTJ9BPZvTIwAD+6K6enpgtS6iD9WpkwZzzLFkiTiS8GtHf1GRAoWcwRoP+b48W53E8A7Ex/zeH8uXLjQ03vwge/PpUuXGg1MbFyMYAo5OTnKqQGzheHDh6uAd17yLNu9e7dy254xY4aaFSEC9YUXXujuX4GDWh9oP3DvRGgcr9kPQvbDfjArov04yPgc0BTYP87fwfUb9o9jIl6zfxxbwOwvCvuPTIwwnsh3gzAX8AzCjKFHjx7qHzcHBUSOeHh6IIAf0gvjYCtiKLHEngD24eDmDPtB2JTu3bu73n7w9QfbQURu2A8Otvbu3Tv28PhE1xPIb/+wk/vuu0/Fs3NrQZAE2D8+wmD/CJXWq1evSLsTuRhpNWBDCmdvtJAXCO/QsGFDNQ2F952TIxScOHFCzfLwDwKeYkaUmJgoSC2MF0m5cuUiBcnrIyQA7kOHDi1gPziTgZAhbrEfBISF15RmP/gh2pkvKcJh4OU2EYD946MMUb5R8P6E5x3enz61/+jFSBvDo0ePqgjfEKUNGzaodOA7duywaYiNVQuvjmrVqknNmjWV6zrctnViJRl7GK8yRcCt9lO9enV1yp72Y2r4fX8z7B8HxBGvDdHc4Qrthvcn7D/w/dmkSROzY2lejMy2gPeTAAmQAAn4ngDFyPcmQAAkQAIkYD8BipH9Y8AWkAAJkIDvCVCMfG8CBEACJEAC9hNQYjTH/nawBSRAAiRAAj4mcOZ/DjaH9m+1TRoAAAAASUVORK5CYII=" } }, "cell_type": "markdown", "id": "a3250ef4", "metadata": {}, "source": [ "![basic_iv_example_nb.png](attachment:basic_iv_example_nb.png)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Simulation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This code generates `n` samples in which there is a unique binary confounder. The treatment is also a binary variable, while the outcome is a continuous linear model. \n", "\n", "The quantity we want to recover using IVs is the `decision_impact`, which is the impact of the decision variable into the outcome. " ] }, { "cell_type": "code", "execution_count": null, "id": "5f8b1555", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [], "source": [ "n <- 10000\n", "decision_effect <- -2\n", "instrument_effect <- 0.7\n", "\n", "confounder <- rbinom(n, 1, 0.3)\n", "instrument <- rbinom(n, 1, 0.5)\n", "decision <- as.numeric(runif(n) <= instrument_effect*instrument + 0.4*confounder)\n", "outcome <- 30 + decision_effect*decision + 10 * confounder + rnorm(n, sd=2)\n", "df <- data.frame(instrument, decision, outcome)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Naive estimation" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that if we make a direct estimation of the impact of the `decision` into the `outcome`, though the difference of the averages of outcomes between the two decision groups, we obtain a biased estimate. " ] }, { "cell_type": "code", "execution_count": null, "id": "2d00221a", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [], "source": [ "mean(df[df$decision==1, 'outcome']) - mean(df[df$decision==0, 'outcome'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Using DoubleML" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "DoubleML assumes that there is at least one observed confounder. For this reason, we create a fake variable that doesn't bring any kind of information to the model, called `obs_confounder`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To use the DoubleML we need to specify the Machine Learning methods we want to use to estimate the different relationships between variables:\n", "\n", "- `ml_g` models the functional relationship betwen the `outcome` and the pair `instrument` and observed confounders `obs_confounders`. In this case we choose a `LinearRegression` because the outcome is continuous. \n", "- `ml_m` models the functional relationship betwen the `obs_confounders` and the `instrument`. In this case we choose a `LogisticRegression` because the outcome is dichotomic.\n", "- `ml_r` models the functional relationship betwen the `decision` and the pair `instrument` and observed confounders `obs_confounders`. In this case we choose a `LogisticRegression` because the outcome is dichotomic.\n", "\n", "\n", "Notice that instead of using linear and logistic regression, we could use more flexible models capable of dealing with non-linearities such as random forests, boosting, ... " ] }, { "cell_type": "code", "execution_count": null, "id": "600b8196", "metadata": { "vscode": { "languageId": "r" } }, "outputs": [], "source": [ "df['obs_confounders'] <- 1\n", "\n", "obj_dml_data = DoubleMLData$new(\n", " df, y_col=\"outcome\", d_col = \"decision\", \n", " z_cols= \"instrument\", x_cols = \"obs_confounders\"\n", ")\n", "\n", "ml_g = lrn(\"regr.lm\")\n", "ml_m = lrn(\"classif.log_reg\")\n", "ml_r = ml_m$clone()\n", "\n", "iv_2 = DoubleMLIIVM$new(obj_dml_data, ml_g, ml_m, ml_r)\n", "result <- iv_2$fit()\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "vscode": { "languageId": "r" } }, "outputs": [], "source": [ "result" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that the causal effect is estimated without bias." ] }, { "cell_type": "markdown", "id": "bc9390cd", "metadata": {}, "source": [ "## References\n", "\n", "Ruiz de Villa, A. Causal Inference for Data Science, Manning Publications, 2024." ] } ], "metadata": { "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "4.2.2" } }, "nbformat": 4, "nbformat_minor": 5 }