Help for package OPL

Type:

Package

Title:

Optimal Policy Learning

Version:

1.0.2

Author:

Federico Brogi [aut, cre], Barbara Guardabascio [aut], Giovanni Cerulli [aut]

Maintainer:

Federico Brogi <federicobrogi@gmail.com>

Description:

Provides functions for optimal policy learning in socioeconomic applications, helping users learn the most effective policies from data in order to maximize empirical welfare. Specifically, 'OPL' allows users to find "treatment assignment rules" that maximize overall welfare, defined as the sum of the policy effects estimated over all the policy beneficiaries. Background on the methods implemented in 'OPL' is provided by Athey and Wager (2021, <doi:10.3982/ECTA15732>), Kitagawa and Tetenov (2018, <doi:10.3982/ECTA13288>), Cerulli (2022, <doi:10.1080/13504851.2022.2032577>), Cerulli (2021, <doi:10.1080/13504851.2020.1820939>) and the book by James et al. (2013, <doi:10.1007/978-1-4614-7138-7>).

License:

GPL-3

Encoding:

UTF-8

RoxygenNote:

7.3.1

Imports:

stats, dplyr, ggplot2, pander, randomForest, tidyr

Suggests:

knitr, rmarkdown

VignetteBuilder:

knitr

NeedsCompilation:

Packaged:

2025-02-27 12:47:38 UTC; UTENTE

Repository:

CRAN

Date/Publication:

2025-02-27 13:00:06 UTC

OPL: Optimal Policy Learning Package

Description

The OPL package provides tools for estimating and optimizing policy assignment rules based on machine learning and econometric techniques.

Main functions

make_cate(): Computes conditional average treatment effects.
opl_tb(): Optimal policy learning for threshold-based policies.
opl_lc(): Optimal policy learning for linear combination policies.
opl_dt(): Optimal policy learning for decision tree-based policies.

Installation

To install the package from CRAN, use: install.packages("OPL")

Acknowledgments

The development of this software was supported by FOSSR (Fostering Open Science in Social Science Research), a project funded by the European Union - NextGenerationEU under the NPRR Grant agreement n. MURIR0000008.

Function to calculate the Causal Treatment Effect

Description

Predicts the conditional average treatment effect (CATE) for a new policy, using a model trained on data from an old policy.

Usage

make_cate(
  model,
  train_data,
  test_data,
  w,
  x,
  y,
  family = gaussian(),
  ntree = 100,
  mtry = 2,
  verbose = TRUE
)

Arguments

model

A model object used for estimation.

train_data

The training dataset.

test_data

The test dataset.

w

Specifies the treatment variable.

x

Specifies the independent variables (covariates) for the model.

y

Specifies the outcome variable.

family

The family type for the model (e.g., 'binomial').

ntree

Number of trees for the Random Forest model.

mtry

Number of variables randomly sampled as candidates at each split in the Random Forest model.

verbose

Set to TRUE to print the output to the console.

Value

An object containing the estimated causal treatment effect results.
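
As an illustration of the idea of training on an old policy and predicting effects for a new one, the following base R sketch uses a two-model ("T-learner") approach: one outcome model is fitted per treatment arm on the training sample, and the CATE for each test unit is the difference between the two predictions. This is an assumption made for illustration only, not a description of how make_cate() works internally; the function name t_learner_cate and the column cate_hat are hypothetical.

```r
# Illustrative sketch only: t_learner_cate() and cate_hat are hypothetical names,
# not part of the OPL package.
t_learner_cate <- function(train_data, test_data, w, x, y) {
  f <- stats::reformulate(x, response = y)
  # Fit one outcome model per treatment arm on the old-policy (training) sample
  fit1 <- stats::lm(f, data = train_data[train_data[[w]] == 1, , drop = FALSE])
  fit0 <- stats::lm(f, data = train_data[train_data[[w]] == 0, , drop = FALSE])
  # Predict both potential outcomes for the new-policy (test) sample
  mu1 <- stats::predict(fit1, newdata = test_data)
  mu0 <- stats::predict(fit0, newdata = test_data)
  # The estimated CATE is the difference between the two predictions
  test_data$cate_hat <- as.numeric(mu1 - mu0)
  test_data
}

set.seed(1)
n <- 400
train <- data.frame(x1 = runif(n), x2 = runif(n), w = rbinom(n, 1, 0.5))
# Simulated outcome whose true treatment effect is 2 * x1
train$y <- 1 + train$x2 + train$w * 2 * train$x1 + rnorm(n, sd = 0.1)
test <- data.frame(x1 = runif(50), x2 = runif(50))

res <- t_learner_cate(train, test, w = "w", x = c("x1", "x2"), y = "y")
head(res)
```

In this simulated example the estimated effects should track the true effect 2 * x1 closely, because both arm-specific models are correctly specified.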

References

Athey, S., and Wager, S. 2021. Policy Learning with Observational Data, Econometrica, 89, 1, 133–161.
Cerulli, G. 2021. Improving econometric prediction by machine learning, Applied Economics Letters, 28, 16, 1419–1425.
Cerulli, G. 2022. Optimal treatment assignment of a threshold-based policy: empirical protocol and related issues, Applied Economics Letters, DOI: 10.1080/13504851.2022.2032577.
James, G., Witten, D., Hastie, T., and Tibshirani, R. 2013. An Introduction to Statistical Learning: with Applications in R. New York, Springer.
Kitagawa, T., and Tetenov, A. 2018. Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice, Econometrica, 86, 2, 591–616.

Optimal Policy Learning with Decision Tree

Description

Implements ex-ante treatment assignment using a 2-layer fixed-depth decision tree as the policy class, at specific splitting variables and threshold values.

Usage

opl_dt_c(make_cate_result, z, w, c1 = NA, c2 = NA, c3 = NA, verbose = TRUE)

Arguments

make_cate_result

A data frame resulting from the make_cate function, containing the predicted treatment effects (my_cate) and other variables for treatment assignment.

z

A character vector containing the names of the variables used for treatment assignment.

w

A string representing the treatment indicator variable name.

c1

Threshold value c1 for the first splitting variable. This number must be between 0 and 1.

c2

Threshold value c2 for the second splitting variable. This number must be between 0 and 1.

c3

Threshold value c3 for the third splitting variable. This number must be between 0 and 1.

verbose

Set to TRUE to print the output to the console.

Value

A list containing:

W_opt_constr: The maximum average constrained welfare.
W_opt_unconstr: The average unconstrained welfare.
units_to_be_treated: A data frame of the units to be treated based on the optimal policy.
A plot showing the optimal policy assignment.
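
The following base R sketch shows one way a 2-layer tree policy at fixed thresholds could be evaluated. Several points are assumptions made for illustration and are not taken from the package: the splitting variables are rescaled to the 0-1 range, the first variable chooses the branch, the second and third variables are split within the upper and lower branch respectively, a unit is treated when it lies at or above its second-layer threshold, and average welfare counts the CATE of treated units and zero for untreated ones. The names tree_policy_welfare, welfare_constr and welfare_unconstr are hypothetical.

```r
# Illustrative sketch only: not the OPL implementation.
min_max <- function(v) (v - min(v)) / (max(v) - min(v))

tree_policy_welfare <- function(data, z, c1, c2, c3, cate = "my_cate") {
  s <- lapply(data[z], min_max)      # rescale the splitting variables to [0, 1]
  upper <- s[[1]] >= c1              # layer 1: split on the first variable
  # Layer 2 (assumed rule): a different variable and threshold in each branch
  treat <- ifelse(upper, s[[2]] >= c2, s[[3]] >= c3)
  list(
    units_to_be_treated = as.integer(treat),
    # Treated units contribute their CATE, untreated units contribute 0
    welfare_constr = mean(data[[cate]] * treat),
    # Benchmark: treat every unit with a positive estimated effect
    welfare_unconstr = mean(pmax(data[[cate]], 0))
  )
}

set.seed(2)
d <- data.frame(z1 = runif(200), z2 = runif(200), z3 = runif(200))
d$my_cate <- d$z2 + d$z3 - 1 + rnorm(200, sd = 0.1)

out <- tree_policy_welfare(d, z = c("z1", "z2", "z3"),
                           c1 = 0.5, c2 = 0.5, c3 = 0.5)
out$welfare_constr
out$welfare_unconstr
```

Under these assumptions the constrained welfare can never exceed the unconstrained benchmark, since restricting the policy to a tree can only exclude units with positive effects or include units with negative ones.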

References

Athey, S., and Wager, S. 2021. Policy Learning with Observational Data, Econometrica, 89, 1, 133–161.
Cerulli, G. 2021. Improving econometric prediction by machine learning, Applied Economics Letters, 28, 16, 1419–1425.
Cerulli, G. 2022. Optimal treatment assignment of a threshold-based policy: empirical protocol and related issues, Applied Economics Letters, DOI: 10.1080/13504851.2022.2032577.
James, G., Witten, D., Hastie, T., and Tibshirani, R. 2013. An Introduction to Statistical Learning: with Applications in R. New York, Springer.
Kitagawa, T., and Tetenov, A. 2018. Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice, Econometrica, 86, 2, 591–616.

User selection on multiple choice

Description

Allows the user to select one row among the rows that attain the maximum constrained welfare. The function prints the result and requires user input to select the row.

Usage

opl_dt_max_choice(nc, col_max, verbose = TRUE)

Arguments

nc

Number of welfare maxima.

col_max

Row index for the maximum constrained welfare.

verbose

Set to TRUE to print the output to the console.

Value

Returns the user's selection.
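
The tie-breaking idea can be sketched in base R as follows. The helper pick_max_row and its choice argument are hypothetical and are not part of OPL; the sketch assumes that ties are detected by exact equality with the maximum and that, outside an interactive session, the first tied row is used.

```r
# Illustrative sketch only: pick_max_row() is a hypothetical helper.
pick_max_row <- function(welfare, choice = NULL) {
  ties <- which(welfare == max(welfare))   # rows attaining the maximum
  if (length(ties) == 1L) return(ties)     # no tie: nothing to choose
  if (is.null(choice) && interactive()) {
    message("Rows with maximum welfare: ", paste(ties, collapse = ", "))
    choice <- as.integer(readline("Select a row: "))
  }
  if (is.null(choice)) choice <- ties[1]   # non-interactive fallback
  stopifnot(choice %in% ties)              # reject anything that is not a tied row
  choice
}

w <- c(1, 3, 3, 2)
first_tied <- pick_max_row(w)               # rows 2 and 3 are tied
explicit   <- pick_max_row(w, choice = 3)   # caller selects row 3
```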

Linear Combination Based Policy Learning

Description

Implements ex-ante treatment assignment using a linear-combination approach as the policy class, at specific values of the parameters c1, c2, and c3 for the linear combination of variables var1 and var2: c1*var1 + c2*var2 >= c3.

Usage

opl_lc_c(make_cate_result, z, w, c1 = NA, c2 = NA, c3 = NA, verbose = TRUE)

Arguments

make_cate_result

A data frame containing the input data. It must include a column named my_cate representing conditional average treatment effects (CATE) generated using make_cate function.

z

A character vector of length 2 specifying the column names of the two threshold variables to be standardized.

w

A character string specifying the column name indicating treatment assignment (binary variable).

c1

Parameter c1, the coefficient on var1 in the linear combination, given by the user or optimized by the function. This number must be between 0 and 1.

c2

Parameter c2, the coefficient on var2 in the linear combination, given by the user or optimized by the function. This number must be between 0 and 1.

c3

Third parameter of the linear combination. This number must be between 0 and 1.

verbose

Set to TRUE to print the output to the console.

Details

The function performs the following steps:

Standardizes the threshold variables using a min-max scaling technique.
Determines the optimal treatment assignment based on the linear combination of the threshold variables.
Performs a grid search to estimate the optimal policy.
Outputs a plot visualizing the optimal treatment assignments.
Prints the main results, including the percentage of treated units, the unconstrained and constrained welfare, and the policy parameters.
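
The scaling, assignment and grid-search steps above can be sketched in base R as follows. This is a minimal illustration under stated assumptions rather than the package's code: the grid step of 0.1, the function name lc_grid_search, and the definition of average welfare (treated units contribute their CATE, untreated units contribute zero) are all assumptions.

```r
# Illustrative sketch only: lc_grid_search() is a hypothetical function.
min_max <- function(v) (v - min(v)) / (max(v) - min(v))

lc_grid_search <- function(data, z, cate = "my_cate",
                           grid = seq(0, 1, by = 0.1)) {
  s1 <- min_max(data[[z[1]]])        # min-max scaling of the two variables
  s2 <- min_max(data[[z[2]]])
  cand <- expand.grid(c1 = grid, c2 = grid, c3 = grid)
  # Average welfare of the rule c1*var1 + c2*var2 >= c3 for each candidate
  cand$welfare <- vapply(seq_len(nrow(cand)), function(i) {
    treat <- cand$c1[i] * s1 + cand$c2[i] * s2 >= cand$c3[i]
    mean(data[[cate]] * treat)
  }, numeric(1))
  best <- cand[which.max(cand$welfare), ]
  treat <- best$c1 * s1 + best$c2 * s2 >= best$c3
  list(best = best, units_to_be_treated = as.integer(treat))
}

set.seed(3)
d <- data.frame(z1 = runif(200), z2 = runif(200))
d$my_cate <- d$z1 + d$z2 - 1 + rnorm(200, sd = 0.1)

fit <- lc_grid_search(d, z = c("z1", "z2"))
fit$best
mean(fit$units_to_be_treated)   # share of units assigned to treatment
```

Because the grid includes the parameter values that treat nobody (c1 = c2 = 0, c3 > 0) and everybody (c3 = 0), the best welfare found is at least as large as both of those baselines.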

Value

The function returns a data frame containing the standardized variables and treatment assignments. It also prints a summary of the results and displays a plot showing the optimal policy assignment.

References

Athey, S., and Wager, S. 2021. Policy Learning with Observational Data, Econometrica, 89, 1, 133–161.
Cerulli, G. 2021. Improving econometric prediction by machine learning, Applied Economics Letters, 28, 16, 1419–1425.
Cerulli, G. 2022. Optimal treatment assignment of a threshold-based policy: empirical protocol and related issues, Applied Economics Letters, DOI: 10.1080/13504851.2022.2032577.
James, G., Witten, D., Hastie, T., and Tibshirani, R. 2013. An Introduction to Statistical Learning: with Applications in R. New York, Springer.
Kitagawa, T., and Tetenov, A. 2018. Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice, Econometrica, 86, 2, 591–616.

Threshold-based policy learning at specific values

Description

Implements ex-ante treatment assignment using a threshold-based (or quadrant) approach as the policy class, at specific threshold values c1 and c2 for the selection variables var1 and var2, respectively.

Usage

opl_tb_c(make_cate_result, z, w, c1 = NA, c2 = NA, verbose = TRUE)

Arguments

make_cate_result

A data frame containing the input data. It must include a column named my_cate representing conditional average treatment effects (CATE) generated using make_cate function.

z

A character vector of length 2 specifying the column names of the two threshold variables to be standardized.

w

A character string specifying the column name indicating treatment assignment (binary variable).

c1

Threshold for var1, given by the user or optimized by the function. This number must be between 0 and 1.

c2

Threshold for var2, given by the user or optimized by the function. This number must be between 0 and 1.

verbose

Set to TRUE to print the output to the console.

Details

The function:

Standardizes the threshold variables to a 0-1 range.
Identifies the optimal thresholds based on grid search for maximizing constrained welfare.
Computes and displays key statistics, including average welfare measures and the percentage of treated units.

Value

The function invisibly returns the input data frame augmented with the following columns:

z[1]_std: Standardized version of the first threshold variable.
z[2]_std: Standardized version of the second threshold variable.
units_to_be_treated: Binary indicator for whether a unit should be treated based on the optimal policy.

Additionally, the function:

Prints the main results summary, including optimal threshold values, average constrained and unconstrained welfare, and treatment proportions.
Displays a scatter plot visualizing the policy assignment.
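
A quadrant rule at fixed thresholds can be sketched in base R as follows. The sketch mirrors the column naming described above, but the function tb_assign is hypothetical, and the direction of the rule (treat when both standardized variables are at or above their thresholds) and the welfare definition are assumptions, not documented behavior of the package.

```r
# Illustrative sketch only: tb_assign() is a hypothetical function.
min_max <- function(v) (v - min(v)) / (max(v) - min(v))

tb_assign <- function(data, z, c1, c2) {
  std <- paste0(z, "_std")
  data[[std[1]]] <- min_max(data[[z[1]]])   # standardize to the 0-1 range
  data[[std[2]]] <- min_max(data[[z[2]]])
  # Assumed quadrant rule: treat when both variables clear their thresholds
  data$units_to_be_treated <- as.integer(
    data[[std[1]]] >= c1 & data[[std[2]]] >= c2
  )
  data
}

set.seed(4)
d <- data.frame(age = runif(100, 18, 65), income = runif(100, 10, 90))
d$my_cate <- rnorm(100, mean = 0.2)

res <- tb_assign(d, z = c("age", "income"), c1 = 0.4, c2 = 0.6)
share_treated <- mean(res$units_to_be_treated)
# Average welfare, assuming untreated units contribute zero
welfare_constr <- mean(res$my_cate * res$units_to_be_treated)
```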

References

Athey, S., and Wager, S. 2021. Policy Learning with Observational Data, Econometrica, 89, 1, 133–161.
Cerulli, G. 2021. Improving econometric prediction by machine learning, Applied Economics Letters, 28, 16, 1419–1425.
Cerulli, G. 2022. Optimal treatment assignment of a threshold-based policy: empirical protocol and related issues, Applied Economics Letters, DOI: 10.1080/13504851.2022.2032577.
James, G., Witten, D., Hastie, T., and Tibshirani, R. 2013. An Introduction to Statistical Learning: with Applications in R. New York, Springer.
Kitagawa, T., and Tetenov, A. 2018. Who Should Be Treated? Empirical Welfare Maximization Methods for Treatment Choice, Econometrica, 86, 2, 591–616.

Testing overlap between old and new policy sample

Description

Performs overlap analysis between the train and test datasets. The function runs a principal component analysis (PCA) on the covariates of both sets and applies the Kolmogorov-Smirnov test to assess their overlap.

Usage

overlapping(train_data, test_data, x)

Arguments

train_data

Training dataset representing the old policy sample.

test_data

Test dataset representing the new policy sample.

x

Vector of predictor variables.

Value

The function displays the overlap plot and prints the results of the Kolmogorov-Smirnov test.
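
A comparable check can be sketched with base R's stats::prcomp() and stats::ks.test(). The sketch assumes that the covariates of the two samples are pooled, that only the first principal component is compared, and that the plot is omitted; the package may proceed differently. The function name overlap_check is hypothetical.

```r
# Illustrative sketch only: overlap_check() is a hypothetical function.
overlap_check <- function(train_data, test_data, x) {
  pooled <- rbind(train_data[, x, drop = FALSE], test_data[, x, drop = FALSE])
  # PCA on the pooled, standardized covariates
  pca <- stats::prcomp(pooled, center = TRUE, scale. = TRUE)
  pc1 <- pca$x[, 1]
  is_train <- seq_along(pc1) <= nrow(train_data)
  # Two-sample Kolmogorov-Smirnov test on the first principal component
  stats::ks.test(pc1[is_train], pc1[!is_train])
}

set.seed(5)
train   <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
similar <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
shifted <- data.frame(x1 = rnorm(200, mean = 3), x2 = rnorm(200, mean = 3))

ks_similar <- overlap_check(train, similar, x = c("x1", "x2"))
ks_shifted <- overlap_check(train, shifted, x = c("x1", "x2"))
ks_shifted$p.value   # very small: the shifted sample overlaps poorly
```

A small p-value indicates that the two samples differ along the first principal component, which suggests limited overlap between the old and new policy samples.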