4. Heterogeneous treatment effects#
Most implemented solutions focus on the IRM or IIVM models, as for the PLR and PLIV models heterogeneous treatment effects can be usually modelled via feature construction.
4.1. Group average treatment effects (GATEs)#
The DoubleMLIRM
and DoubleMLPLR
classes contain the gate()
method, which enables the estimation and construction of confidence intervals
for GATEs after fitting the DoubleML
object. To estimate GATEs, the user has to specify a pandas DataFrame
containing
the groups (dummy coded or one column with strings).
This will construct and fit a DoubleMLBLP
object. Confidence intervals can then be constructed via
the confint()
method. Jointly valid confidence intervals will be based on a gaussian multiplier bootstrap.
4.1.1. GATEs for IRM models#
Group Average Treatment Effects (GATEs) for DoubleMLIRM
models consider the target parameters
where \(G_k\) denotes a group indicator and \(Y(d)\) the potential outcome with \(d \in \{0, 1\}\).
Point estimates and confidence intervals can be obtained via the gate()
and confint()
methods.
Remark that for straightforward interpretation, the groups have to be mutually exclusive.
In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: import doubleml as dml
In [4]: from doubleml.datasets import make_irm_data
In [5]: from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
In [6]: ml_g = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
In [7]: ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
In [8]: np.random.seed(3333)
In [9]: data = make_irm_data(theta=0.5, n_obs=500, dim_x=20, return_type='DataFrame')
In [10]: obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
In [11]: dml_irm_obj = dml.DoubleMLIRM(obj_dml_data, ml_g, ml_m)
In [12]: _ = dml_irm_obj.fit()
# define groups
In [13]: np.random.seed(42)
In [14]: groups = pd.DataFrame(np.random.choice(3, 500), columns=['Group'], dtype=str)
In [15]: print(groups.head())
Group
0 2
1 0
2 2
3 2
4 0
In [16]: gate_obj = dml_irm_obj.gate(groups=groups)
In [17]: ci = gate_obj.confint()
In [18]: print(ci)
2.5 % effect 97.5 %
Group_0 0.286027 0.691136 1.096245
Group_1 -4.954536 -1.107872 2.738793
Group_2 0.135755 0.571707 1.007659
A more detailed notebook on GATEs with DoubleMLIRM
models is available in the example gallery.
4.1.2. GATEs for PLR models#
Group Average Treatment Effects (GATEs) for DoubleMLPLR
models consider a slightly adjusted version of the DoubleMLPLR
model.
Instead of considering a constant treatment effect \(\theta_0\) for all observations, the adjusted model allows for a different effect based on groups.
where \(G_k\) for \(k=1,\dots, K\) denotes a group indicator where the groups can depend on the counfounding features \(X\).
Point estimates and confidence intervals can be obtained via the gate()
and confint()
methods.
Remark that for straightforward interpretation, the groups have to be mutually exclusive.
In [19]: import numpy as np
In [20]: import pandas as pd
In [21]: import doubleml as dml
In [22]: from doubleml.datasets import make_plr_CCDDHNR2018
In [23]: from sklearn.ensemble import RandomForestRegressor
In [24]: ml_g = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
In [25]: ml_m = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
In [26]: np.random.seed(3333)
In [27]: dml_data = make_plr_CCDDHNR2018(alpha=0.5, n_obs=500, dim_x=20)
In [28]: dml_plr_obj = dml.DoubleMLPLR(dml_data, ml_g, ml_m)
In [29]: _ = dml_plr_obj.fit()
# define groups
In [30]: np.random.seed(42)
In [31]: groups = pd.DataFrame(np.random.choice(3, 500), columns=['Group'], dtype=str)
In [32]: print(groups.head())
Group
0 2
1 0
2 2
3 2
4 0
In [33]: gate_obj = dml_plr_obj.gate(groups=groups)
In [34]: ci = gate_obj.confint()
In [35]: print(ci)
2.5 % effect 97.5 %
Group_0 0.497964 0.644113 0.790261
Group_1 0.247617 0.395268 0.542919
Group_2 0.452114 0.593981 0.735848
A more detailed notebook on GATEs with DoubleMLPLR
models is available in the example gallery.
4.2. Conditional average treatment effects (CATEs)#
The DoubleMLIRM
and DoubleMLPLR
classes contain the cate()
method, which enables the estimation and construction of confidence intervals
for CATEs after fitting the DoubleML
object. To estimate CATEs, the user has to specify a pandas DataFrame
containing
the basis (e.g. B-splines) for the conditional treatment effects.
This will construct and fit a DoubleMLBLP
object. Confidence intervals can then be constructed via
the confint()
method. Jointly valid confidence intervals will be based on a gaussian multiplier bootstrap.
4.2.1. CATEs for IRM models#
Conditional Average Treatment Effects (CATEs) for DoubleMLIRM
models consider the target parameters
for a low-dimensional feature \(X\), where \(Y(d)\) the potential outcome with \(d \in \{0, 1\}\).
Point estimates and confidence intervals can be obtained via the gate()
and confint()
methods.
In [36]: import numpy as np
In [37]: import pandas as pd
In [38]: import patsy
In [39]: import doubleml as dml
In [40]: from doubleml.datasets import make_irm_data
In [41]: from sklearn.ensemble import RandomForestRegressor
In [42]: ml_g = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
In [43]: ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
In [44]: np.random.seed(3333)
In [45]: data = make_irm_data(theta=0.5, n_obs=500, dim_x=20, return_type='DataFrame')
In [46]: obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
In [47]: dml_irm_obj = dml.DoubleMLIRM(obj_dml_data, ml_g, ml_m)
In [48]: _ = dml_irm_obj.fit()
# define a basis with respect to the first variable
In [49]: design_matrix = patsy.dmatrix("bs(x, df=5, degree=2)", {"x":obj_dml_data.data["X1"]})
In [50]: spline_basis = pd.DataFrame(design_matrix)
In [51]: print(spline_basis.head())
0 1 2 3 4 5
0 1.0 0.000000 0.191397 0.782646 0.025958 0.0
1 1.0 0.342467 0.653991 0.000000 0.000000 0.0
2 1.0 0.460535 0.511022 0.000000 0.000000 0.0
3 1.0 0.000000 0.456552 0.543358 0.000091 0.0
4 1.0 0.046405 0.778852 0.174743 0.000000 0.0
In [52]: cate_obj = dml_irm_obj.cate(basis=spline_basis)
In [53]: ci = cate_obj.confint(basis=spline_basis)
In [54]: print(ci.head())
2.5 % effect 97.5 %
0 0.139921 0.644182 1.148443
1 -5.521085 -1.707197 2.106691
2 -9.190869 -2.944149 3.302571
3 0.022954 1.016429 2.009904
4 -0.568111 1.130526 2.829162
A more detailed notebook on CATEs for DoubleMLIRM
models is available in the example gallery.
The examples also include the construction of a two-dimensional basis with B-splines.
4.2.2. CATEs for PLR models#
Conditional Average Treatment Effects (CATEs) for DoubleMLPLR
models consider a slightly adjusted version of the DoubleMLPLR
model.
Instead of considering a constant treatment effect \(\theta_0\) for all observations, the adjusted model allows for a different effect based on groups.
where \(\theta_0(X)\) denotes the heterogeneous treatment effect.
Point estimates and confidence intervals can be obtained via the gate()
and confint()
methods.
In [55]: import numpy as np
In [56]: import pandas as pd
In [57]: import patsy
In [58]: import doubleml as dml
In [59]: from doubleml.datasets import make_plr_CCDDHNR2018
In [60]: from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
In [61]: ml_g = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
In [62]: ml_m = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
In [63]: np.random.seed(3333)
In [64]: dml_data = make_plr_CCDDHNR2018(alpha=0.5, n_obs=500, dim_x=20)
In [65]: dml_plr_obj = dml.DoubleMLPLR(dml_data, ml_g, ml_m)
In [66]: _ = dml_plr_obj.fit()
# define a basis with respect to the first variable
In [67]: design_matrix = patsy.dmatrix("bs(x, df=5, degree=2)", {"x":obj_dml_data.data["X1"]})
In [68]: spline_basis = pd.DataFrame(design_matrix)
In [69]: print(spline_basis.head())
0 1 2 3 4 5
0 1.0 0.000000 0.191397 0.782646 0.025958 0.0
1 1.0 0.342467 0.653991 0.000000 0.000000 0.0
2 1.0 0.460535 0.511022 0.000000 0.000000 0.0
3 1.0 0.000000 0.456552 0.543358 0.000091 0.0
4 1.0 0.046405 0.778852 0.174743 0.000000 0.0
In [70]: cate_obj = dml_plr_obj.cate(basis=spline_basis)
In [71]: ci = cate_obj.confint(basis=spline_basis)
In [72]: print(ci.head())
2.5 % effect 97.5 %
0 0.393060 0.549645 0.706231
1 0.434519 0.600934 0.767349
2 0.382684 0.588000 0.793316
3 0.440364 0.572153 0.703942
4 0.421234 0.594241 0.767247
A more detailed notebook on CATEs for DoubleMLPLR
models is available in the example gallery.
Theory: In the model above, it holds
such that
Remark that for the generated \(\sigma\)-agebras \(\sigma(\tilde{D})\subseteq \sigma(D,X)\) implying
and consequently
Consequently, \(\theta_0(X)\) can be estimated by regressing \(\tilde{Y}\) on \(\tilde{D}\):
The DoubleML implementation approximates the effect \(\theta_0(X)\) by a linear projection on a supplied basis \(\phi(X)\):
where \(\beta_0\) are coefficients to be estimated. The coverage of the confidence intervals is meant to include the the approximation \(\beta_0^T\phi(X)\).
4.3. Weighted Average Treatment Effects#
The DoubleMLIRM
class allows to specify weights via the weights
argument in the initialization of the DoubleMLIRM
object.
Given some weights, \(\omega(Y,D,X)\) the model identifies the weighted average treatment effect
The interpretation depends on the choice of weights. The simplest examples include
\(\omega(Y,D,X) = 1\) which corresponds to the average treatment effect (ATE)
\(\omega(Y,D,X) = \frac{1\{X\in G\}}{P(X\in G)}\) which corresponds to the group average treatment effect (GATE) for group \(G\)
\(\omega(Y,D,X) = \pi(X)\) which corresponds to the average value of policy \(\pi\), where \(0\le \pi \le 1\)
where the weights \(\omega(Y,D,X)\) only depend on the features \(X\).
In these cases the weights can be specified as an array via the weights
argument in the initialization of the DoubleMLIRM
object.
In [73]: import numpy as np
In [74]: import pandas as pd
In [75]: import doubleml as dml
In [76]: from doubleml.datasets import make_irm_data
In [77]: from sklearn.ensemble import RandomForestRegressor
In [78]: ml_g = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
In [79]: ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
In [80]: np.random.seed(3333)
In [81]: data = make_irm_data(theta=0.5, n_obs=500, dim_x=20, return_type='DataFrame')
In [82]: obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
In [83]: weights = np.ones(500)
In [84]: dml_irm_obj = dml.DoubleMLIRM(obj_dml_data, ml_g, ml_m, weights=weights)
In [85]: _ = dml_irm_obj.fit()
In [86]: print(dml_irm_obj.summary)
coef std err t P>|t| 2.5 % 97.5 %
d 0.050856 0.664276 0.076559 0.938975 -1.251101 1.352813
If the weights do not only depend on the features \(X\), but e.g. also on the treatment \(D\) estimation becomes more involved. To identifiy the correct parameter not only the weights \(\omega(Y,D,X)\) but also their conditional expectation
has to be specified. A common example is the average treatment effect on the treated (ATTE) which can be identified by setting
\(\omega(Y,D,X) = \frac{D}{P(D=1)}\)
\(\bar{\omega}(X) = \frac{\mathbb{E}[D|X]}{P(D=1)} = \frac{m_0(X)}{P(D=1)}\)
which depends on the propensity score \(m_0(X)\).
In this case the weights can be specified as a dictionary
the weights
argument in the initialization of the DoubleMLIRM
object.
One other important example would be the sensitivity analysis for group average treatment effects on the treated (GATET). In this case the weights would take the following form
\(\omega(Y,D,X) = \frac{1\{D=1, X\in G\}}{P(D=1, X\in G)}= \frac{D \cdot 1\{X\in G\}}{P(D=1, X\in G)}\)
\(\bar{\omega}(X) = \frac{\mathbb{E}[D|X]1\{X\in G\}}{P(D=1, X\in G)} = \frac{m_0(X)1\{X\in G\}}{P(D=1, X\in G)}.\)
To simplify the specification of the weights, the DoubleMLIRM
with score='ATTE'
accepts binary weights, which should correspond to \(1\{X\in G\}\).
This automatically relies on the propensity score \(m(X)\) to construct the weights mentioned above (e.g. for weights equal to one this refers to the average treatment effect on the treated).
A more detailed notebook on weighted average treatment effects for on GATE sensitivity analysis is available in the example gallery.
4.4. Quantiles#
The DoubleML package includes (local) quantile estimation for potential outcomes for IRM and IIVM models.
4.4.1. Potential quantiles (PQs)#
For a quantile \(\tau \in (0,1)\) the target parameters \(\theta_{\tau}(d)\) of interest are the potential quantiles (PQs),
and local potential quantiles (LPQs),
where \(Y(d)\) denotes the potential outcome with \(d \in \{0, 1\}\).
DoubleMLPQ
implements potential quantile estimation. Estimation is conducted via its fit()
method:
In [87]: import numpy as np
In [88]: import doubleml as dml
In [89]: from doubleml.datasets import make_irm_data
In [90]: from sklearn.ensemble import RandomForestClassifier
In [91]: np.random.seed(3141)
In [92]: ml_g = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=10, min_samples_leaf=2)
In [93]: ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=10, min_samples_leaf=2)
In [94]: data = make_irm_data(theta=0.5, n_obs=500, dim_x=20, return_type='DataFrame')
In [95]: obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
In [96]: dml_pq_obj = dml.DoubleMLPQ(obj_dml_data, ml_g, ml_m, treatment=1, quantile=0.5)
In [97]: dml_pq_obj.fit().summary
Out[97]:
coef std err t P>|t| 2.5 % 97.5 %
d 0.553878 0.149858 3.696011 0.000219 0.260161 0.847595
DoubleMLLPQ
implements local potential quantile estimation, where the argument treatment
indicates the potential outcome.
Estimation is conducted via its fit()
method:
In [98]: import numpy as np
In [99]: import doubleml as dml
In [100]: from doubleml.datasets import make_iivm_data
In [101]: from sklearn.ensemble import RandomForestClassifier
In [102]: np.random.seed(3141)
In [103]: ml_g = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=10, min_samples_leaf=2)
In [104]: ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=10, min_samples_leaf=2)
In [105]: data = make_iivm_data(theta=0.5, n_obs=1000, dim_x=20, return_type='DataFrame')
In [106]: obj_dml_data = dml.DoubleMLData(data, 'y', 'd', z_cols='z')
In [107]: dml_lpq_obj = dml.DoubleMLLPQ(obj_dml_data, ml_g, ml_m, treatment=1, quantile=0.5)
In [108]: dml_lpq_obj.fit().summary
Out[108]:
coef std err t P>|t| 2.5 % 97.5 %
d 0.217244 0.636453 0.341336 0.73285 -1.03018 1.464668
4.4.2. Quantile treatment effects (QTEs)#
For a quantile \(\tau \in (0,1)\) the target parameter \(\theta_{\tau}\) of interest are the quantile treatment effect (QTE),
where \(\theta_{\tau}(d)\) denotes the corresponding potential quantile.
Analogously, the local quantile treatment effect (LQTE) can be defined as the difference of the corresponding local potential quantiles.
DoubleMLQTE
implements quantile treatment effect estimation. Estimation is conducted via its fit()
method:
In [109]: import numpy as np
In [110]: import doubleml as dml
In [111]: from doubleml.datasets import make_irm_data
In [112]: from sklearn.ensemble import RandomForestClassifier
In [113]: np.random.seed(3141)
In [114]: ml_g = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=10, min_samples_leaf=2)
In [115]: ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=10, min_samples_leaf=2)
In [116]: data = make_irm_data(theta=0.5, n_obs=500, dim_x=20, return_type='DataFrame')
In [117]: obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
In [118]: dml_qte_obj = dml.DoubleMLQTE(obj_dml_data, ml_g, ml_m, score='PQ', quantiles=[0.25, 0.5, 0.75])
In [119]: dml_qte_obj.fit().summary
Out[119]:
coef std err t P>|t| 2.5 % 97.5 %
0.25 0.274825 0.347310 0.791297 0.428771 -0.405890 0.955541
0.50 0.449150 0.192539 2.332782 0.019660 0.071782 0.826519
0.75 0.709606 0.193308 3.670867 0.000242 0.330731 1.088482
To estimate local quantile effects the score
argument has to be set to 'LPQ'
.
A detailed notebook on PQs and QTEs is available in the example gallery.
4.5. Conditional value at risk (CVaR)#
The DoubleML package includes conditional value at risk estimation for IRM models.
4.5.1. CVaR of potential outcomes#
For a quantile \(\tau \in (0,1)\) the target parameters \(\theta_{\tau}(d)\) of interest are the conditional values at risk (CVaRs) of the potential outcomes,
where \(Y(d)\) denotes the potential outcome with \(d \in \{0, 1\}\) and \(F_{Y(d)}(x)\) the corresponding cdf of \(Y(d)\).
DoubleMLCVAR
implements conditional value at risk estimation for potential outcomes, where the argument treatment
indicates the potential outcome.
Estimation is conducted via its fit()
method:
In [120]: import numpy as np
In [121]: import doubleml as dml
In [122]: from doubleml.datasets import make_irm_data
In [123]: from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
In [124]: np.random.seed(3141)
In [125]: ml_g = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=10, min_samples_leaf=2)
In [126]: ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=10, min_samples_leaf=2)
In [127]: data = make_irm_data(theta=0.5, n_obs=500, dim_x=20, return_type='DataFrame')
In [128]: obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
In [129]: dml_cvar_obj = dml.DoubleMLCVAR(obj_dml_data, ml_g, ml_m, treatment=1, quantile=0.5)
In [130]: dml_cvar_obj.fit().summary
Out[130]:
coef std err t P>|t| 2.5 % 97.5 %
d 1.588364 0.096616 16.43989 9.909942e-61 1.398999 1.777728
4.5.2. CVaR treatment effects#
For a quantile \(\tau \in (0,1)\) the target parameter \(\theta_{\tau}\) of interest are the treatment effects on the conditional value at risk,
where \(\theta_{\tau}(d)\) denotes the corresponding conditional values at risk of the potential outcomes.
DoubleMLQTE
implements CVaR treatment effect estimation, if the score
argument has been set to 'CVaR'
(default is 'PQ'
).
Estimation is conducted via its fit()
method:
In [131]: import numpy as np
In [132]: import doubleml as dml
In [133]: from doubleml.datasets import make_irm_data
In [134]: from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
In [135]: np.random.seed(3141)
In [136]: ml_g = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=10, min_samples_leaf=2)
In [137]: ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=10, min_samples_leaf=2)
In [138]: data = make_irm_data(theta=0.5, n_obs=500, dim_x=20, return_type='DataFrame')
In [139]: obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
In [140]: dml_cvar_obj = dml.DoubleMLQTE(obj_dml_data, ml_g, ml_m, score='CVaR', quantiles=[0.25, 0.5, 0.75])
In [141]: dml_cvar_obj.fit().summary
Out[141]:
coef std err t P>|t| 2.5 % 97.5 %
0.25 0.474731 0.246624 1.924921 5.423921e-02 -0.008642 0.958105
0.50 0.691911 0.143495 4.821855 1.422293e-06 0.410667 0.973156
0.75 1.001714 0.166375 6.020819 1.735369e-09 0.675625 1.327803
A detailed notebook on CVaR estimation for potential outcomes and treatment effects is available in the example gallery.
4.6. Policy Learning with Trees#
Policy Learning considers to find an optimal decision policy. We consider deterministic binary policies, which are defined as mapping
Using the score component \(\psi_b(W_i,\hat{\eta})\) of the IRM score, we can find the optimal treatment policy by solving the weighted classification problem
where \(\Pi\) denotes a policy class, which we define as depth-\(m\) classification trees. Thus, we estimate splits in the features \(X\) that reflect the heterogeneity of the treatment effect and consequently maximize the sum of the estimated individual treatment effects of all individuals by assigning different treatments.
The DoubleMLIRM
class contains the policy_tree()
method, which enables the estimation of a policy tree
using weighted classification after fitting the DoubleMLIRM
object. To estimate a policy tree, the user has to specify a pandas DataFrame
containing
the covariates on based on which the policy will make treatment decisions. These can be either the original covariates used in the
DoubleMLIRM
estimation, or a subset, or new covariates.
This will construct and fit a DoubleMLPolicyTree
object. A plot of the decision rules can be displayed by the
plot_tree()
method. The predict()
method enables the application of the estimated policy on new data.
The depth
parameter, which defaults to 2
, can be used to adjust the maximum depth of the tree.
In [142]: import numpy as np
In [143]: import pandas as pd
In [144]: import doubleml as dml
In [145]: from doubleml.datasets import make_irm_data
In [146]: from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
In [147]: ml_g = RandomForestRegressor(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
In [148]: ml_m = RandomForestClassifier(n_estimators=100, max_features=20, max_depth=5, min_samples_leaf=2)
In [149]: np.random.seed(3333)
In [150]: data = make_irm_data(theta=0.5, n_obs=500, dim_x=20, return_type='DataFrame')
In [151]: obj_dml_data = dml.DoubleMLData(data, 'y', 'd')
In [152]: dml_irm_obj = dml.DoubleMLIRM(obj_dml_data, ml_g, ml_m)
In [153]: _ = dml_irm_obj.fit()
# define features to learn policy on
In [154]: np.random.seed(42)
In [155]: features = data[["X1","X2","X3"]]
In [156]: print(features.head())
X1 X2 X3
0 0.305133 0.497298 -0.811398
1 -0.696770 -1.717860 -2.030087
2 -0.903135 0.174940 0.185585
3 0.095475 -0.653820 -1.800272
4 -0.198953 0.203893 0.204653
# fits a tree of depth 2
In [157]: policy_tree_obj = dml_irm_obj.policy_tree(features=features)
In [158]: policy_tree_obj.plot_tree();
A more detailed notebook on Policy Trees is available in the example gallery.