{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Python: Sensitivity Analysis\n", "\n", "This notebook illustrates the sensitivity analysis tools with the partiallly linear regression model (PLR).
\n", "The DoubleML package implements sensitivity analysis based on [Chernozhukov et al. (2022)](https://www.nber.org/papers/w30302)." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Simulation Example\n", "\n", "For illustration purposes, we will work with generated data. This enables us to set the counfounding strength, such that we can correctly access quality of e.g. the robustness values." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "\n", "from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier\n", "\n", "import doubleml as dml\n", "from doubleml.datasets import make_confounded_plr_data, fetch_401K" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Data\n", "\n", "The data will be generated via `make_confounded_plr_data` and set the confounding values to `cf_y=0.1` and `cf_d=0.1`.\n", "\n", "Both parameters determine the strength of the confounding\n", "\n", "- `cf_y` measures the proportion of residual variance in the outcome explained by unobserved confounders\n", "- `cf_d` measires the porportion of residual variance of the treatment representer generated by unobserved confounders.\n", "\n", "For further details, see [Chernozhukov et al. (2022)](https://www.nber.org/papers/w30302) or [User guide](https://docs.doubleml.org/stable/guide/guide.html).\n", "Further, the true treatment effect is set to `theta=5.0`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "cf_y = 0.1\n", "cf_d = 0.1\n", "theta = 5.0" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(42)\n", "dpg_dict = make_confounded_plr_data(n_obs=1000, cf_y=cf_y, cf_d=cf_d, theta=theta)\n", "x_cols = [f'X{i + 1}' for i in np.arange(dpg_dict['x'].shape[1])]\n", "df = pd.DataFrame(np.column_stack((dpg_dict['x'], dpg_dict['y'], dpg_dict['d'])), columns=x_cols + ['y', 'd'])\n", "dml_data = dml.DoubleMLData(df, 'y', 'd')\n", "print(dml_data)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### DoubleML Object\n", "\n", "We fit a `DoubleMLPLR` object and use basic random forest to flexibly estimate the nuisance elements." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(42)\n", "dml_obj = dml.DoubleMLPLR(dml_data,\n", " ml_l=RandomForestRegressor(),\n", " ml_m=RandomForestRegressor(),\n", " n_folds=5,\n", " score='partialling out')\n", "dml_obj.fit()\n", "print(dml_obj)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The effect estimate is biased due to the unobserved confounding and the corresponding confidence interval does not cover the true parameter `theta=5.0`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(dml_obj.confint())" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Sensitivity Analysis\n", "\n", "To perform a sensitivity analysis with the [DoubleML](https://docs.doubleml.org/stable/index.html) package you can use the `sensitivity_analysis()` method.
\n", "The sensitivity analysis is based on the strength of the confounding `cf_y` and `cf_d` (default values $0.03$) and the parameter `rho`, which measures the correlation between the difference of the long and short form of the outcome regression and the Riesz representer (the default value $1.0$ is conservative and considers adversarial counfounding). To additionally incorporate statistical uncertainty, a significance level (default $0.95$) is used.\n", "\n", "These input parameters are used to calculate upper and lower bounds (including the corresponding confidence level) on the treatment effect estimate. \n", "\n", "Further, the sensitivity analysis includes a null hypothesis (default $0.0$), which is used to compute robustness values. The robustness value $RV$ is defined as the required confounding strength (`cf_y=rv` and `cf_d=rv`), such that the lower or upper bound of the causal parameter includes the null hypothesis. The robustness value $RVa$ defined analogous but additionally incorporates statistical uncertainty (as it is based on the confidence intervals of the bounds). For a more detailed explanation\n", "see the [User Guide](https://docs.doubleml.org/stable/guide/guide.html) or [Chernozhukov et al. (2022)](https://www.nber.org/papers/w30302).\n", "\n", "The results of the analysis can be printed via the `sensitivity_summary` property or directly accessed via the `sensitivity_params` property." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dml_obj.sensitivity_analysis()\n", "print(dml_obj.sensitivity_summary)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dml_obj.sensitivity_params" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "The robustness value $RV$ means that unobserved counfounders would have to account for $34\\%$ of the residual variation (in treatment and outcome) to explain away the treatment effect.\n", "\n", "We can also consider a contour plot to consider different values of confounding `cf_y` and `cf_d`. The contour plot and robustness values are based on the upper or lower bounds based on the null hypothesis (in this case this results in the lower bound).\n", "\n", "Remark that both, the robustness values and the contour plot use the prespecified value of `rho` and the ``null_hypothesis``." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dml_obj.sensitivity_plot()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "To consider a different null hypothesis and add a different scenario to the the contour plot, we can adjust the parameter `null_hypothesis`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dml_obj.sensitivity_analysis(cf_y=cf_y, cf_d=cf_d, null_hypothesis=theta)\n", "print(dml_obj.sensitivity_summary)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dml_obj.sensitivity_plot()" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Application: 401(k)\n", "\n", "In this real-data example, we illustrate how to use the sensitivity analysis from the [DoubleML](https://docs.doubleml.org/stable/index.html) package to evaluate effect estimates from 401(k) eligibility on accumulated assets. The 401(k) data set has been analyzed in several studies, among others [Chernozhukov et al. (2018)](https://arxiv.org/abs/1608.00060), see [Kallus et al. (2019)](https://arxiv.org/abs/1912.12945) for quantile effects.\n", "\n", "**Remark:**\n", "This notebook focuses on sensitivity analysis. For a basic introduction to the [DoubleML](https://docs.doubleml.org/stable/index.html) package and a detailed example of the average treatment effect estimation for the 401(k) data set, we refer to the notebook [Python: Impact of 401(k) on Financial Wealth](https://docs.doubleml.org/stable/examples/py_double_ml_pension.html)." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Data and Effect Estimation\n", "\n", "Define eligiblity as treatment and the net financial assets as outcome to construct a DoubleML object." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "data = fetch_401K(return_type='DataFrame')\n", "\n", "# Set up basic model: Specify variables for data-backend\n", "features_base = ['age', 'inc', 'educ', 'fsize', 'marr',\n", " 'twoearn', 'db', 'pira', 'hown']\n", "\n", "# Initialize DoubleMLData (data-backend of DoubleML)\n", "data_dml = dml.DoubleMLData(data,\n", " y_col='net_tfa',\n", " d_cols='e401',\n", " x_cols=features_base)\n", "print(data_dml)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Use basic random forests to estimate the nuisance elements and fit a [DoubleMLPLR](https://docs.doubleml.org/stable/api/generated/doubleml.DoubleMLPLR.html) model." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "learner_l = RandomForestRegressor(n_estimators=500, max_depth=10,\n", " max_features=5, min_samples_leaf=10)\n", "learner_m = RandomForestClassifier(n_estimators=500, max_depth=10,\n", " max_features=5, min_samples_leaf=10)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(42)\n", "dml_plr_obj = dml.DoubleMLPLR(data_dml,\n", " ml_l = learner_l,\n", " ml_m = learner_m,\n", " n_folds = 5,\n", " n_rep=3)\n", "dml_plr_obj.fit()\n", "print(dml_plr_obj)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Sensitivity Analysis\n", "\n", "In their paper [Chernozhukov et al. (2022)](https://www.nber.org/papers/w30302) argue that confounding should not account for more than $4\\%$ of the residual variation of the outcome and $3\\%$ of the residual variation of the treatment. Consequently, we set `cf_y=0.04` and `cf_d=0.03`. " ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dml_plr_obj.sensitivity_analysis(cf_y=0.04, cf_d=0.03)\n", "print(dml_plr_obj.sensitivity_summary)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Our robustness values are similar to the ones in the paper. The effect estimate is robust in the sense that unobserved confounders (e.g. latent firm characteristics) would have to account for at least $7,1\\%$ of the residual variation (in the outcome and treatment) to reduce the lower bound on the effect to zero. Including estimation uncertainty unobserved confounders still have to explain $5.3\\%$ of the residual variation to render the effect estimate insignificant." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dml_plr_obj.sensitivity_plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Benchmarking Analysis\n", "\n", "Usually, it is quite hard to argue which strength of possible confounding `cf_y` and `cf_d` is reasonable.
\n", "A common approach is to use observed confounders to get an informed guess on the strength of possible confounding.\n", "\n", "This is implemented via the `sensitivity_benchmark()` method.\n", "For a more detailed description see [User Guide](https://docs.doubleml.org/stable/guide/guide.html) or [Chernozhukov et al. (2022)](https://www.nber.org/papers/w30302) Appendix D.\n", "\n", "To benchmark the confounding values, one has to specify a set of observed confounders as `benchmarking_set`. The methods then returns a dataframe with the benchmarked values for `cf_y`,`cf_d`,`rho` and the change in the estimates if the `benchmarking_set` would be omitted from the set of confounders.\n", "To consider the confounding strength of multiple variables one has to repeat the benchmarking procedure for each variable separately (this might take some time as a seperate benchmark model has to be fitted each repetion).\n", "\n", "In this example, we will try to consider the same benchmarks as in [Chernozhukov et al. (2022)](https://www.nber.org/papers/w30302) Appendix D. Therefore, we will take a look on the following variables:\n", " - income `inc`\n", " - participation individual retirement account `pira`\n", " - two-earner household `twoearn`" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "benchmark_inc = dml_plr_obj.sensitivity_benchmark(benchmarking_set=[\"inc\"])\n", "benchmark_pira = dml_plr_obj.sensitivity_benchmark(benchmarking_set=[\"pira\"])\n", "benchmark_twoearn = dml_plr_obj.sensitivity_benchmark(benchmarking_set=[\"twoearn\"])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(benchmark_inc)\n", "print(benchmark_pira)\n", "print(benchmark_twoearn)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can add the scenario to the contour plot via a dictionary (remark that `rho` will be still based on the original value from `sensitivity_params` and can be manually adjusted by calling `sensitivity_analysis()` again)." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "benchmark_dict = {\"cf_y\" : [benchmark_inc.loc[\"e401\", \"cf_y\"], benchmark_pira.loc[\"e401\", \"cf_y\"], benchmark_twoearn.loc[\"e401\", \"cf_y\"]],\n", " \"cf_d\" : [benchmark_inc.loc[\"e401\", \"cf_d\"], benchmark_pira.loc[\"e401\", \"cf_d\"], benchmark_twoearn.loc[\"e401\", \"cf_d\"]],\n", " \"name\" : [\"inc\", \"pira\", \"twoearn\"]}\n", "dml_plr_obj.sensitivity_plot(benchmarks=benchmark_dict)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "### Sensitivity Analysis with IRM\n", "\n", "The sensitivity analysis with the IRM model is basically analogous." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "np.random.seed(42)\n", "dml_irm_obj = dml.DoubleMLIRM(data_dml,\n", " ml_g = learner_l,\n", " ml_m = learner_m,\n", " n_folds = 5,\n", " n_rep = 3)\n", "dml_irm_obj.fit()\n", "\n", "print(dml_irm_obj)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dml_irm_obj.sensitivity_analysis(cf_y=0.04, cf_d=0.03)\n", "print(dml_irm_obj.sensitivity_summary)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dml_irm_obj.sensitivity_plot()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "benchmark_inc = dml_irm_obj.sensitivity_benchmark(benchmarking_set=[\"inc\"])\n", "benchmark_pira = dml_irm_obj.sensitivity_benchmark(benchmarking_set=[\"pira\"])\n", "benchmark_twoearn = dml_irm_obj.sensitivity_benchmark(benchmarking_set=[\"twoearn\"])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(benchmark_inc)\n", "print(benchmark_pira)\n", "print(benchmark_twoearn)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "benchmark_dict = {\"cf_y\" : [benchmark_inc.loc[\"e401\", \"cf_y\"], benchmark_pira.loc[\"e401\", \"cf_y\"], benchmark_twoearn.loc[\"e401\", \"cf_y\"]],\n", " \"cf_d\" : [benchmark_inc.loc[\"e401\", \"cf_d\"], benchmark_pira.loc[\"e401\", \"cf_d\"], benchmark_twoearn.loc[\"e401\", \"cf_d\"]],\n", " \"name\" : [\"inc\", \"pira\", \"twoearn\"]}\n", "dml_irm_obj.sensitivity_plot(benchmarks=benchmark_dict)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "nbsphinx": "hidden" }, "outputs": [], "source": [ "fig = dml_irm_obj.sensitivity_plot()\n", "\n", "fig.update_layout(\n", " autosize=False,\n", " width=500,\n", " height=500)\n", "\n", "fig" ] } ], "metadata": { "kernelspec": { "display_name": "dml_dev", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.2" }, "orig_nbformat": 4 }, "nbformat": 4, "nbformat_minor": 2 }