Type: | Package |
Title: | Optimal Policy Learning |
Version: | 1.0.2 |
Author: | Federico Brogi [aut, cre], Barbara Guardabascio [aut], Giovanni Cerulli [aut] |
Maintainer: | Federico Brogi <federicobrogi@gmail.com> |
Description: | Provides functions for optimal policy learning in socioeconomic applications, helping users learn the most effective policies from data in order to maximize empirical welfare. Specifically, 'OPL' finds "treatment assignment rules" that maximize the overall welfare, defined as the sum of the policy effects estimated over all the policy beneficiaries. The methods behind 'OPL' are documented in Athey et al (2021, <doi:10.3982/ECTA15732>), Kitagawa et al (2018, <doi:10.3982/ECTA13288>), Cerulli (2022, <doi:10.1080/13504851.2022.2032577>), Cerulli (2021, <doi:10.1080/13504851.2020.1820939>) and the book by James et al (2013, <doi:10.1007/978-1-4614-7138-7>). |
License: | GPL-3 |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.1 |
Imports: | stats, dplyr, ggplot2, pander, randomForest, tidyr |
Suggests: | knitr, rmarkdown |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2025-02-27 12:47:38 UTC; UTENTE |
Repository: | CRAN |
Date/Publication: | 2025-02-27 13:00:06 UTC |
OPL: Optimal Policy Learning Package
Description
The OPL package provides tools for estimating and optimizing policy assignment rules based on machine learning and econometric techniques.
Main functions
- make_cate(): Computes conditional average treatment effects.
- opl_tb(): Optimal policy learning for threshold-based policies.
- opl_lc(): Optimal policy learning for linear combination policies.
- opl_dt(): Optimal policy learning for decision tree-based policies.
Installation
To install the package from CRAN, use:
install.packages("OPL")
Acknowledgments
The development of this software was supported by FOSSR (Fostering Open Science in Social Science Research), a project funded by the European Union - NextGenerationEU under the NPRR Grant agreement n. MURIR0000008.
Function to calculate the Causal Treatment Effect
Description
Predicts the conditional average treatment effect (CATE) for a new policy, using a model trained on data from an old policy.
Usage
make_cate(
model,
train_data,
test_data,
w,
x,
y,
family = gaussian(),
ntree = 100,
mtry = 2,
verbose = TRUE
)
Arguments
model |
A |
train_data |
The training dataset. |
test_data |
The test dataset. |
w |
The treatment variable. |
x |
The independent variables for the model. |
y |
The outcome variable. |
family |
The family type for the model (e.g., 'binomial'). |
ntree |
Number of trees for the Random Forest model. |
mtry |
Number of variables to consider at each tree split in the Random Forest model. |
verbose |
Set TRUE to print the output on the console. |
Value
An object containing the estimated causal treatment effect results.
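As a rough illustration of what predicting the CATE on a new sample from a model trained on an old one can look like, here is a minimal base R sketch. It does not call make_cate() and is not the package's implementation: it assumes a simple two-model approach with lm(), simulated data, and a hypothetical output column name cate_hat.

```r
# Hypothetical illustration in base R; not OPL's internal method.
set.seed(1)
n <- 200

# Simulated "old policy" sample: treatment w, covariates x1 and x2, outcome y.
# The true treatment effect is 2 * x2 - 1 by construction.
train <- data.frame(w = rbinom(n, 1, 0.5), x1 = runif(n), x2 = runif(n))
train$y <- 1 + train$x1 + train$w * (2 * train$x2 - 1) + rnorm(n, sd = 0.1)

# Simulated "new policy" sample: covariates only.
test <- data.frame(x1 = runif(50), x2 = runif(50))

# Fit one outcome model per treatment arm on the training data.
fit_treated <- lm(y ~ x1 + x2, data = train[train$w == 1, ])
fit_control <- lm(y ~ x1 + x2, data = train[train$w == 0, ])

# CATE for each new unit: predicted outcome if treated minus if untreated.
# "cate_hat" is a hypothetical column name chosen for this sketch.
test$cate_hat <- predict(fit_treated, newdata = test) -
  predict(fit_control, newdata = test)

head(test)
```

The same pattern applies with any learner that has a predict() method; only the two fitting calls would change.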
References
Athey, S., and Wager, S. 2021. Policy Learning with Observational Data, Econometrica, 89, 1, 133–161.
Cerulli, G. 2021. Improving econometric prediction by machine learning, Applied Economics Letters, 28, 16, 1419-1425.
Cerulli, G. 2022. Optimal treatment assignment of a threshold-based policy: empirical protocol and related issues, Applied Economics Letters, DOI: 10.1080/13504851.2022.2032577.
James, G., Witten, D., Hastie, T., Tibshirani, R. 2013. An Introduction to Statistical Learning: with Applications in R. New York, Springer.
Kitagawa, T., and Tetenov, A. 2018. Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice, Econometrica, 86, 2, 591–616.
Optimal Policy Learning with Decision Tree
Description
Implements ex-ante treatment assignment using a 2-layer fixed-depth decision tree as the policy class, at specific splitting variables and threshold values.
Usage
opl_dt_c(make_cate_result, z, w, c1 = NA, c2 = NA, c3 = NA, verbose = TRUE)
Arguments
make_cate_result |
A data frame resulting from the |
z |
A character vector containing the names of the variables used for treatment assignment. |
w |
A string representing the treatment indicator variable name. |
c1 |
Threshold value c1 for the first splitting variable. Must be between 0 and 1. |
c2 |
Threshold value c2 for the second splitting variable. Must be between 0 and 1. |
c3 |
Threshold value c3 for the third splitting variable. Must be between 0 and 1. |
verbose |
Set TRUE to print the output on the console. |
Value
A list containing:
- W_opt_constr: The maximum average constrained welfare.
- W_opt_unconstr: The average unconstrained welfare.
- units_to_be_treated: A data frame of the units to be treated based on the optimal policy.
The function also produces a plot showing the optimal policy assignment.
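To make the policy class concrete, the sketch below applies a 2-layer tree rule at fixed thresholds in base R. The tree shape (split on the first variable, then on the second or third depending on the branch), the welfare formulas, the simulated data, and the column name cate_hat are all assumptions made for illustration; they are not taken from the package source.

```r
# Hypothetical illustration in base R; not OPL's internal method.
set.seed(2)
n <- 300

# Three splitting variables already scaled to [0, 1], plus an estimated CATE.
d <- data.frame(z1 = runif(n), z2 = runif(n), z3 = runif(n))
d$cate_hat <- d$z1 + d$z2 - 1 + rnorm(n, sd = 0.05)

# User-chosen thresholds, each between 0 and 1.
c1 <- 0.5
c2 <- 0.4
c3 <- 0.8

# Assumed tree: first split on z1; units with z1 >= c1 are treated when
# z2 >= c2, the remaining units are treated when z3 >= c3.
d$treat <- as.integer(ifelse(d$z1 >= c1, d$z2 >= c2, d$z3 >= c3))

# Assumed welfare definitions: the constrained welfare averages the CATE over
# units selected by the tree; the unconstrained welfare treats every unit
# whose estimated CATE is positive.
W_constr <- mean(d$cate_hat * d$treat)
W_unconstr <- mean(d$cate_hat * (d$cate_hat > 0))

c(W_constr = W_constr, W_unconstr = W_unconstr)
```

Under these definitions the unconstrained welfare is an upper bound on what any rule within the tree class can reach.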
References
Athey, S., and Wager, S. 2021. Policy Learning with Observational Data, Econometrica, 89, 1, 133–161.
Cerulli, G. 2021. Improving econometric prediction by machine learning, Applied Economics Letters, 28, 16, 1419-1425.
Cerulli, G. 2022. Optimal treatment assignment of a threshold-based policy: empirical protocol and related issues, Applied Economics Letters, DOI: 10.1080/13504851.2022.2032577.
James, G., Witten, D., Hastie, T., Tibshirani, R. 2013. An Introduction to Statistical Learning: with Applications in R. New York, Springer.
Kitagawa, T., and Tetenov, A. 2018. Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice, Econometrica, 86, 2, 591–616.
User selection on multiple choice
Description
Lets the user select one row when several rows attain the maximum constrained welfare. The function prints the candidate rows and requires user input to select one of them.
Usage
opl_dt_max_choice(nc, col_max, verbose = TRUE)
Arguments
nc |
Number of rows with maximum welfare. |
col_max |
Row index for max constrained welfare. |
verbose |
Set TRUE to print the output on the console. |
Value
Returns the user's selection.
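The tie-breaking situation this helper addresses can be sketched in base R as follows. The function choose_row() and its choice argument are hypothetical stand-ins for the console prompt, so the example runs non-interactively; the welfare values are made up.

```r
# Hypothetical illustration in base R; not OPL's internal method.

# Made-up constrained welfare values for four candidate policies.
welfare <- c(0.12, 0.31, 0.31, 0.08)

# Indices of every candidate that attains the maximum, and how many there are.
col_max <- which(welfare == max(welfare))
nc <- length(col_max)

# choose_row() is a hypothetical helper: `choice` plays the role of the number
# a user would type at the console (for example via readline() in an
# interactive session).
choose_row <- function(col_max, choice = 1L) {
  stopifnot(choice >= 1L, choice <= length(col_max))
  col_max[choice]
}

# Pick the second of the tied candidates.
selected <- choose_row(col_max, choice = 2L)
selected
```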
Linear Combination Based Policy Learning
Description
Implements ex-ante treatment assignment using a linear-combination approach as the policy class, at specific values of the parameters c1, c2, and c3 for the linear combination of variables var1 and var2: c1*var1 + c2*var2 >= c3.
Usage
opl_lc_c(make_cate_result, z, w, c1 = NA, c2 = NA, c3 = NA, verbose = TRUE)
Arguments
make_cate_result |
A data frame containing the input data. It must include
a column named |
z |
A character vector of length 2 specifying the column names of the two threshold variables to be standardized. |
w |
A character string specifying the column name indicating treatment assignment (binary variable). |
c1 |
Parameter c1 for var1 in the linear combination, given by the user or optimized by the function. Must be between 0 and 1. |
c2 |
Parameter c2 for var2 in the linear combination, given by the user or optimized by the function. Must be between 0 and 1. |
c3 |
Third parameter of the linear combination. Must be between 0 and 1. |
verbose |
Set TRUE to print the output on the console. |
Details
The function performs the following steps:
- Standardizes the threshold variables using a min-max scaling technique.
- Determines the optimal treatment assignment based on the linear combination of the threshold variables.
- Performs a grid search to estimate the optimal policy.
- Outputs a plot visualizing the optimal treatment assignments.
- Prints the main results, including the percentage of treated units, the unconstrained and constrained welfare, and the policy parameters.
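The steps above can be sketched in base R as follows. This is an illustrative approximation under stated assumptions, not the package's code: the data are simulated, the grid step of 0.1 is arbitrary, welfare is taken to be the mean estimated CATE over treated units, and cate_hat is a hypothetical column name.

```r
# Hypothetical illustration in base R; not OPL's internal method.
set.seed(3)
n <- 200
d <- data.frame(var1 = rnorm(n, 50, 10), var2 = rnorm(n, 30, 5))

# Step 1: min-max scaling maps each variable onto [0, 1].
minmax <- function(v) (v - min(v)) / (max(v) - min(v))
d$var1_std <- minmax(d$var1)
d$var2_std <- minmax(d$var2)

# Simulated CATE estimates (stand-in for the output of a CATE model).
d$cate_hat <- d$var1_std + d$var2_std - 1 + rnorm(n, sd = 0.05)

# Steps 2 and 3: evaluate the rule c1*var1 + c2*var2 >= c3 on a grid of
# parameter values and keep the combination with the highest welfare.
grid <- expand.grid(c1 = seq(0, 1, by = 0.1),
                    c2 = seq(0, 1, by = 0.1),
                    c3 = seq(0, 1, by = 0.1))
grid$welfare <- mapply(function(c1, c2, c3) {
  treat <- c1 * d$var1_std + c2 * d$var2_std >= c3
  mean(d$cate_hat * treat)
}, grid$c1, grid$c2, grid$c3)
best <- grid[which.max(grid$welfare), ]

# Assignment implied by the best parameters, and the share of treated units.
d$units_to_be_treated <- as.integer(
  best$c1 * d$var1_std + best$c2 * d$var2_std >= best$c3
)
pct_treated <- 100 * mean(d$units_to_be_treated)
best
```

A finer grid gives a better approximation of the optimum at the cost of more evaluations, since the number of grid points grows with the cube of the resolution.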
Value
The function returns a data frame containing the standardized variables and treatment assignments, and prints a summary of the results and a plot showing the optimal policy assignment.
References
Athey, S., and Wager, S. 2021. Policy Learning with Observational Data, Econometrica, 89, 1, 133–161.
Cerulli, G. 2021. Improving econometric prediction by machine learning, Applied Economics Letters, 28, 16, 1419-1425.
Cerulli, G. 2022. Optimal treatment assignment of a threshold-based policy: empirical protocol and related issues, Applied Economics Letters, DOI: 10.1080/13504851.2022.2032577.
James, G., Witten, D., Hastie, T., Tibshirani, R. 2013. An Introduction to Statistical Learning: with Applications in R. New York, Springer.
Kitagawa, T., and Tetenov, A. 2018. Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice, Econometrica, 86, 2, 591–616.
Threshold-based policy learning at specific values
Description
Implements ex-ante treatment assignment using a threshold-based (or quadrant) approach as the policy class, at specific threshold values c1 and c2 for the selection variables var1 and var2, respectively.
Usage
opl_tb_c(make_cate_result, z, w, c1 = NA, c2 = NA, verbose = TRUE)
Arguments
make_cate_result |
A data frame containing the input data. It must include
a column named |
z |
A character vector of length 2 specifying the column names of the two threshold variables to be standardized. |
w |
A character string specifying the column name indicating treatment assignment (binary variable). |
c1 |
Threshold for var1, given by the user or optimized by the function. Must be between 0 and 1. |
c2 |
Threshold for var2, given by the user or optimized by the function. Must be between 0 and 1. |
verbose |
Set TRUE to print the output on the console. |
Details
The function:
- Standardizes the threshold variables to a 0-1 range.
- Identifies the optimal thresholds based on grid search for maximizing constrained welfare.
- Computes and displays key statistics, including average welfare measures and the percentage of treated units.
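A quadrant rule of this kind can be sketched in base R as shown below. The direction of the rule (treat when both standardized variables are at or above their thresholds), the grid step of 0.05, the simulated data, and the column name cate_hat are assumptions made for this example rather than details confirmed by the package.

```r
# Hypothetical illustration in base R; not OPL's internal method.
set.seed(4)
n <- 200

# Two selection variables already standardized to [0, 1], plus simulated
# CATE estimates.
d <- data.frame(z1_std = runif(n), z2_std = runif(n))
d$cate_hat <- d$z1_std + d$z2_std - 1 + rnorm(n, sd = 0.05)

# Assumed constrained welfare: mean CATE over units in the upper-right
# quadrant defined by the thresholds (c1, c2).
welfare_at <- function(c1, c2) {
  mean(d$cate_hat * (d$z1_std >= c1 & d$z2_std >= c2))
}

# Grid search: welfare for every pair of candidate thresholds.
cuts <- seq(0, 1, by = 0.05)
W <- outer(cuts, cuts, Vectorize(welfare_at))

# First pair of thresholds attaining the maximum welfare.
idx <- which(W == max(W), arr.ind = TRUE)[1, ]
c1_opt <- cuts[idx[1]]
c2_opt <- cuts[idx[2]]

# Key statistics: assignment, share treated, and both welfare measures.
d$units_to_be_treated <- as.integer(d$z1_std >= c1_opt & d$z2_std >= c2_opt)
pct_treated <- 100 * mean(d$units_to_be_treated)
W_constr <- max(W)
W_unconstr <- mean(d$cate_hat * (d$cate_hat > 0))
c(c1 = c1_opt, c2 = c2_opt, pct_treated = pct_treated)
```

Passing fixed values for c1_opt and c2_opt instead of searching corresponds to evaluating the policy at user-specified thresholds.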
Value
The function invisibly returns the input data frame augmented with the following columns:
- z[1]_std: Standardized version of the first threshold variable.
- z[2]_std: Standardized version of the second threshold variable.
- units_to_be_treated: Binary indicator for whether a unit should be treated based on the optimal policy.
Additionally, the function:
- Prints the main results summary, including optimal threshold values, average constrained and unconstrained welfare, and treatment proportions.
- Displays a scatter plot visualizing the policy assignment.
References
Athey, S., and Wager, S. 2021. Policy Learning with Observational Data, Econometrica, 89, 1, 133–161.
Cerulli, G. 2021. Improving econometric prediction by machine learning, Applied Economics Letters, 28, 16, 1419-1425.
Cerulli, G. 2022. Optimal treatment assignment of a threshold-based policy: empirical protocol and related issues, Applied Economics Letters, DOI: 10.1080/13504851.2022.2032577.
James, G., Witten, D., Hastie, T., Tibshirani, R. 2013. An Introduction to Statistical Learning: with Applications in R. New York, Springer.
Kitagawa, T., and Tetenov, A. 2018. Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice, Econometrica, 86, 2, 591–616.
Testing overlap between old and new policy sample
Description
Performs overlap analysis between the train and test datasets. The function runs principal component analysis (PCA) on the covariates of both sets and applies the Kolmogorov-Smirnov test to assess overlap.
Usage
overlapping(train_data, test_data, x)
Arguments
train_data |
Training dataset representing the old policy sample. |
test_data |
Test dataset representing the new policy sample. |
x |
Vector of predictor variables. |
Value
The function displays the overlap plot and prints the results of the Kolmogorov-Smirnov test.
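The idea of checking overlap with PCA and a Kolmogorov-Smirnov test can be sketched with functions from the stats package. This is only an approximation of the approach described above: fitting the PCA on the pooled covariates and comparing only the first component are assumptions of this example, and the data are simulated.

```r
# Hypothetical illustration with base R and stats; not OPL's internal method.
set.seed(5)
x <- c("x1", "x2", "x3")

# Simulated old policy (train) and new policy (test) covariates.
train <- as.data.frame(matrix(rnorm(300), ncol = 3,
                              dimnames = list(NULL, x)))
test <- as.data.frame(matrix(rnorm(150, mean = 0.2), ncol = 3,
                             dimnames = list(NULL, x)))

# Assumption: fit the PCA on the pooled covariates so that both samples are
# projected onto the same components.
pooled <- rbind(train[, x], test[, x])
pca <- prcomp(pooled, center = TRUE, scale. = TRUE)

# Scores on the first principal component, split back by sample.
pc1_train <- pca$x[seq_len(nrow(train)), 1]
pc1_test <- pca$x[-seq_len(nrow(train)), 1]

# Two-sample Kolmogorov-Smirnov test: a small p-value suggests that the two
# samples differ in distribution along this component, i.e. weak overlap.
ks <- ks.test(pc1_train, pc1_test)
ks$p.value
```

Overlaying density(pc1_train) and density(pc1_test) in one plot gives a visual counterpart to the test.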