Title: | Explainers for Regression Models in HIV Research |
Version: | 1.3.1 |
Maintainer: | Juan Pablo Acuña González <22253567@uagro.mx> |
Description: | A dedicated viral-explainer model tool designed to empower researchers in the field of HIV research, particularly in viral load and CD4 (Cluster of Differentiation 4) lymphocyte regression modeling. It draws inspiration from the 'tidymodels' framework for rigorous model building by Max Kuhn and Hadley Wickham (2020) <https://www.tidymodels.org> and the 'DALEXtra' tool for explainability by Przemyslaw Biecek (2020) <doi:10.48550/arXiv.2009.13248>. It aims to facilitate interpretable and reproducible research in biostatistics and computational biology for the benefit of understanding HIV dynamics. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Suggests: | Cubist, dplyr, earth, kknn, rsample, rules, testthat (≥ 3.0.0), vdiffr |
Config/testthat/edition: | 3 |
Depends: | R (≥ 4.1.0) |
LazyData: | true |
Imports: | DALEX, DALEXtra, parsnip, recipes, stats, workflows |
URL: | https://github.com/juanv66x/viralx |
BugReports: | https://github.com/juanv66x/viralx/issues |
NeedsCompilation: | no |
Packaged: | 2025-07-04 15:58:42 UTC; jp |
Author: | Juan Pablo Acuña González |
Repository: | CRAN |
Date/Publication: | 2025-07-04 21:40:02 UTC |
viralx: Explainers for Regression Models in HIV Research
Description
A dedicated viral-explainer model tool designed to empower researchers in the field of HIV research, particularly in viral load and CD4 (Cluster of Differentiation 4) lymphocyte regression modeling. It draws inspiration from the 'tidymodels' framework for rigorous model building by Max Kuhn and Hadley Wickham (2020) https://www.tidymodels.org and the 'DALEXtra' tool for explainability by Przemyslaw Biecek (2020) doi:10.48550/arXiv.2009.13248. It aims to facilitate interpretable and reproducible research in biostatistics and computational biology for the benefit of understanding HIV dynamics.
Author(s)
Maintainer: Juan Pablo Acuña González 22253567@uagro.mx (ORCID)
See Also
Useful links:
https://github.com/juanv66x/viralx
Report bugs at https://github.com/juanv66x/viralx/issues
Global Visualization of SHAP Values for Cubist Rules Model
Description
This function generates a visualization for the global feature importance of a Cubist Rules (CR) model trained on HIV data with specified hyperparameters.
Usage
glob_cr_vis(vip_featured, hiv_data, cr_hyperparameters, vip_train, v_train)
Arguments
vip_featured |
The name of the response variable to explain. |
hiv_data |
The training dataset containing predictor variables and the response variable. |
cr_hyperparameters |
A list of hyperparameters for the CR model, including neighbors and committees (the two entries set in the Examples). |
vip_train |
The dataset used for training the CR model. |
v_train |
The response variable used for training the CR model. |
Value
A visualization of global feature importance for the CR model.
Examples
## Not run:
library(dplyr)
library(rsample)
library(rules)
library(Cubist)
set.seed(123)
hiv_data <- train2
cr_hyperparameters <- list(neighbors = 5, committees = 58)
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
vip_train <- train2 |>
  dplyr::select(dplyr::all_of(vip_features))
v_train <- train2 |>
  dplyr::select(dplyr::all_of(vip_featured))
glob_cr_vis(vip_featured, hiv_data, cr_hyperparameters, vip_train, v_train)
## End(Not run)
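The predictor/response split built with dplyr in the example above can also be written in base R. The sketch below is only illustrative: `toy` is a hypothetical stand-in for `train2` that reuses the column names and a few values from the vectors listed elsewhere in this manual, so it runs without viralx installed.

```r
# Hypothetical stand-in for train2: same column names, only four rows.
toy <- data.frame(
  cd_2019 = c(824, 169, 342, 423),
  vl_2019 = c(40, 11388, 38961, 40),
  cd_2021 = c(992, 275, 331, 454),
  vl_2021 = c(80, 1690, 5113, 71),
  cd_2022 = c(700, 127, 127, 547),
  vl_2022 = c(0, 0, 53250, 0)
)

vip_featured <- "cd_2022"
# Every column except the response is treated as a predictor.
vip_features <- setdiff(names(toy), vip_featured)

# drop = FALSE keeps a data frame even when a single column is selected,
# which mirrors what dplyr::select() returns.
vip_train <- toy[, vip_features, drop = FALSE]
v_train   <- toy[, vip_featured, drop = FALSE]
```

Both objects stay data frames, so they have the same shape as the `vip_train` and `v_train` objects passed to the functions in the Examples.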
Global Visualization of SHAP Values for K-Nearest Neighbor Model
Description
This function generates a visualization for the global feature importance of a K-Nearest Neighbors (KNN) model trained on HIV data with specified hyperparameters.
Usage
glob_knn_vis(vip_featured, hiv_data, knn_hyperparameters, vip_train, v_train)
Arguments
vip_featured |
The name of the response variable to explain. |
hiv_data |
The training dataset containing predictor variables and the response variable. |
knn_hyperparameters |
A list of hyperparameters for the KNN model, including neighbors, weight_func, and dist_power (the three entries set in the Examples). |
vip_train |
The dataset used for training the KNN model. |
v_train |
The response variable used for training the KNN model. |
Value
A visualization of global feature importance for the KNN model.
Examples
## Not run:
library(dplyr)
library(rsample)
set.seed(123)
hiv_data <- train2
knn_hyperparameters <- list(neighbors = 5, weight_func = "optimal", dist_power = 0.3304783)
vip_featured <- "cd_2022"
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
vip_train <- train2 |>
select(all_of(vip_features))
v_train <- train2 |>
select(all_of(vip_featured))
glob_knn_vis(vip_featured, hiv_data, knn_hyperparameters, vip_train, v_train)
## End(Not run)
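To give a feel for what the `neighbors` and `dist_power` entries control, here is a minimal base-R sketch of K-nearest-neighbor regression with a Minkowski distance. It assumes that `dist_power` plays the role of the Minkowski exponent, as it does in `parsnip::nearest_neighbor()`; the helpers `minkowski()` and `knn_predict()` are hypothetical names, and the sketch averages neighbors with equal weights rather than reproducing the "optimal" kernel used in the Examples.

```r
# Minkowski distance between a query point and every row of a numeric matrix.
minkowski <- function(X, query, p) {
  diffs <- abs(sweep(X, 2, query))   # |x_ij - q_j| for every row i, column j
  rowSums(diffs^p)^(1 / p)
}

# Equal-weight k-nearest-neighbor regression for a single query point.
knn_predict <- function(X, y, query, k, p) {
  d <- minkowski(X, query, p)
  nearest <- order(d)[seq_len(k)]    # indices of the k closest rows
  mean(y[nearest])
}

X <- cbind(x1 = c(1, 2, 3, 10, 11), x2 = c(1, 2, 3, 10, 11))
y <- c(10, 20, 30, 100, 110)

pred_euclid <- knn_predict(X, y, query = c(2, 2), k = 3, p = 2)
pred_small  <- knn_predict(X, y, query = c(2, 2), k = 3, p = 0.3304783)
```

In this toy data the three closest rows are the same for both exponents, so both predictions are the mean of 10, 20, and 30. On real data, changing the exponent can change which rows count as neighbors.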
Global Visualization of SHAP Values for Neural Network Model
Description
The glob_nn_vis function generates a global visualization of SHAP (Shapley Additive Explanations) values for a neural network model. It uses the DALEXtra package to explain the model's predictions and then creates a global SHAP visualization.
Usage
glob_nn_vis(vip_featured, hiv_data, hu, plty, epo, vip_train, v_train)
Arguments
vip_featured |
A character value specifying the featured variable of interest. |
hiv_data |
A data frame containing the HIV research data used for model training. |
hu |
A numeric value specifying the number of hidden units in the neural network model. |
plty |
A numeric value specifying the penalty parameter for the neural network model. |
epo |
A numeric value specifying the number of epochs (training iterations) for the neural network model. |
vip_train |
A data frame containing the training data used to fit the neural network model. |
v_train |
A numeric vector representing the response variable corresponding to the training data. |
Value
A global visualization of SHAP values for the specified neural network model.
Examples
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559,
173, 764, 780, 244, 527, 417, 800,
602, 494, 345, 780, 780, 527, 556,
559, 238, 288, 244, 353, 169, 556,
824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103,
11388, 46, 103, 11388, 40, 0, 11388,
0, 4095, 40, 93, 49, 49, 49,
4095, 6837, 38961, 38961, 0, 0, 93,
40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496,
230, 605, 432, 170, 670, 238, 238,
634, 422, 429, 513, 327, 465, 479,
661, 382, 364, 109, 398, 209, 1960,
992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0,
262, 0, 15089, 13016, 1513, 60, 60,
49248, 159308, 56, 0, 516675, 49, 237,
84, 292, 414, 26176, 62, 126, 93,
80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777,
149, 628, 614, 253, 918, 326, 326,
574, 361, 253, 726, 659, 596, 427,
447, 326, 253, 248, 326, 260, 918,
700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0,
955, 0, 0, 0, 0, 40, 0,
49248, 159308, 56, 0, 516675, 49, 237,
0, 23601, 0, 40, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
rsample::training()
hu <- 5
plty <- 1.131656e-09
epo <- 176
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(dplyr::all_of(vip_features))
v_train <- vi_train |>
  rsample::training() |>
  dplyr::select(dplyr::all_of(vip_featured))
glob_nn_vis(vip_featured, hiv_data, hu, plty, epo, vip_train, v_train)
## End(Not run)
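One common way to turn per-observation SHAP values into a global summary is to average their absolute values per feature. The sketch below illustrates the idea on a linear model, where the Shapley value has a closed form under the marginal formulation (the model is averaged over the background data for the features left out). It is not the package's implementation: the simulated data and the object names are assumptions made for illustration.

```r
set.seed(123)
n <- 50
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- 3 * X$x1 - 1 * X$x2 + rnorm(n, sd = 0.1)

fit  <- lm(y ~ x1 + x2 + x3, data = X)
beta <- coef(fit)[names(X)]

# For a linear model with the training data as background, the Shapley value
# of feature j for observation i is beta_j * (x_ij - mean(x_j)).
centered <- sweep(as.matrix(X), 2, colMeans(X))
phi <- sweep(centered, 2, beta, `*`)

# Global summary: mean absolute contribution of each feature.
global_importance <- sort(colMeans(abs(phi)), decreasing = TRUE)
global_importance
```

Each row of `phi` sums to that observation's prediction minus the average prediction, which is the efficiency property that SHAP-style attributions satisfy.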
Training Data for Explainability of Models
Description
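This dataset contains training data for viral load explainer models. It includes CD4 and viral load measurements for 2019, 2021, and 2022 (columns cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, and vl_2022, as used in the Examples throughout this manual).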
Usage
data(train2)
Format
A tibble (data frame) with 25 rows and 6 columns.
Note
To display more rows of this dataset, call print() with the n argument, for example print(train2, n = 25).
Author(s)
Juan Pablo Acuña González 22253567@uagro.mx
Examples
data(train2)
train2
Explain K-Nearest Neighbors Model
Description
Explains the predictions of a K-Nearest Neighbors (KNN) model for CD4 and viral load data using the DALEX and DALEXtra packages. It provides insights into the specified variable's impact on the KNN model's predictions.
Usage
viralx_knn(vip_featured, hiv_data, knn_hyperparameters, vip_train, vip_new)
Arguments
vip_featured |
The name of the variable to be explained. |
hiv_data |
The data frame containing the CD4 and viral load data. |
knn_hyperparameters |
A list of hyperparameters for the KNN model, including neighbors, weight_func, and dist_power (the three entries set in the Examples). |
vip_train |
The training data used for creating the explainer object. |
vip_new |
A new observation for which to generate explanations. |
Value
A data frame containing explanations for the specified variable.
Examples
## Not run:
hiv_data <- train2
knn_hyperparameters <- list(neighbors = 5, weight_func = "optimal", dist_power = 0.3304783)
vip_featured <- "cd_2022"
vip_train <- hiv_data
vip_new <- vip_train[1,]
viralx_knn(vip_featured, hiv_data, knn_hyperparameters, vip_train, vip_new)
## End(Not run)
Global Explainers for K-Nearest Neighbor Models
Description
This function calculates global feature importance for a K-Nearest Neighbors (KNN) model trained on HIV data with specified hyperparameters.
Usage
viralx_knn_glob(
vip_featured,
hiv_data,
knn_hyperparameters,
vip_train,
v_train
)
Arguments
vip_featured |
The name of the response variable to explain. |
hiv_data |
The training dataset containing predictor variables and the response variable. |
knn_hyperparameters |
A list of hyperparameters for the KNN model, including neighbors, weight_func, and dist_power (the three entries set in the Examples). |
vip_train |
The dataset used for training the KNN model. |
v_train |
The response variable used for training the KNN model. |
Value
A list of global feature importance measures for each predictor variable.
Examples
## Not run:
library(dplyr)
library(rsample)
set.seed(123)
hiv_data <- train2
knn_hyperparameters <- list(neighbors = 5, weight_func = "optimal", dist_power = 0.3304783)
vip_featured <- "cd_2022"
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
vip_train <- train2 |>
select(all_of(vip_features))
v_train <- train2 |>
select(all_of(vip_featured))
viralx_knn_glob(vip_featured, hiv_data, knn_hyperparameters, vip_train, v_train)
## End(Not run)
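A widely used notion of global feature importance is permutation importance: shuffle one predictor, re-score the model, and record how much the loss grows. DALEX's `model_parts()` follows this idea, but whether `viralx_knn_glob()` calls it is an assumption here. The sketch below uses a plain linear model and simulated data so that it runs in base R alone; the `rmse()` helper and the data are hypothetical.

```r
set.seed(123)
n <- 200
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
dat$y <- 2 * dat$x1 + 0.5 * dat$x2 + rnorm(n, sd = 0.2)

fit  <- lm(y ~ x1 + x2 + noise, data = dat)
rmse <- function(truth, estimate) sqrt(mean((truth - estimate)^2))

# Loss of the unmodified model on the same data.
baseline <- rmse(dat$y, predict(fit, dat))

# Importance of a feature = increase in RMSE after shuffling that feature.
perm_importance <- sapply(c("x1", "x2", "noise"), function(feature) {
  shuffled <- dat
  shuffled[[feature]] <- sample(shuffled[[feature]])
  rmse(dat$y, predict(fit, shuffled)) - baseline
})

sort(perm_importance, decreasing = TRUE)
```

Because a single shuffle is random, implementations usually repeat the permutation several times and report the average increase in loss.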
Explain K-Nearest Neighbor Model Using SHAP Values
Description
This function calculates SHAP (SHapley Additive exPlanations) values for a K-Nearest Neighbors (KNN) model trained on HIV data with specified hyperparameters.
Usage
viralx_knn_shap(
vip_featured,
hiv_data,
knn_hyperparameters,
vip_train,
vip_new,
orderings
)
Arguments
vip_featured |
The name of the response variable to explain. |
hiv_data |
The training dataset containing predictor variables and the response variable. |
knn_hyperparameters |
A list of hyperparameters for the KNN model, including neighbors, weight_func, and dist_power (the three entries set in the Examples). |
vip_train |
The dataset used for training the KNN model. |
vip_new |
The dataset for which SHAP values are calculated. |
orderings |
The number of orderings for SHAP value calculations. |
Value
A list of SHAP values for each observation in vip_new.
Examples
## Not run:
set.seed(123)
hiv_data <- train2
knn_hyperparameters <- list(neighbors = 5, weight_func = "optimal", dist_power = 0.3304783)
vip_featured <- "cd_2022"
vip_train <- hiv_data
vip_new <- vip_train[1, ]
orderings <- 20
viralx_knn_shap(vip_featured, hiv_data, knn_hyperparameters, vip_train, vip_new, orderings)
## End(Not run)
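The `orderings` argument suggests a sampling-based SHAP estimate, in which marginal contributions are averaged over random feature orderings (DALEX exposes a comparable setting as `B` in `predict_parts(type = "shap")`). The following base-R sketch shows that technique under stated assumptions: `shap_by_orderings()` and `predict_fun()` are hypothetical names, and the toy model stands in for the fitted KNN workflow.

```r
# Estimate Shapley values for one observation by averaging marginal
# contributions over random feature orderings.
shap_by_orderings <- function(predict_fun, background, new_obs, orderings) {
  features <- names(background)
  phi <- setNames(numeric(length(features)), features)
  for (b in seq_len(orderings)) {
    current  <- background                  # start from the background data
    previous <- mean(predict_fun(current))  # average prediction so far
    for (feature in sample(features)) {     # one random ordering
      current[[feature]] <- new_obs[[feature]]  # fix this feature at new_obs
      updated <- mean(predict_fun(current))
      phi[feature] <- phi[feature] + (updated - previous)
      previous <- updated
    }
  }
  phi / orderings
}

set.seed(123)
background <- data.frame(x1 = rnorm(30), x2 = rnorm(30), x3 = rnorm(30))
new_obs    <- data.frame(x1 = 1, x2 = 2, x3 = 3)

# Toy model with an interaction, so the ordering of x1 and x2 matters.
predict_fun <- function(d) d$x1 * d$x2 + d$x3

phi <- shap_by_orderings(predict_fun, background, new_obs, orderings = 20)
phi
```

Within every ordering the contributions telescope, so `sum(phi)` equals the prediction for `new_obs` minus the average prediction over `background`. More orderings reduce the sampling noise in the individual values at a proportional cost in model evaluations.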
Visualize SHAP Values for K-Nearest Neighbor Model
Description
Visualizes SHAP (Shapley Additive Explanations) values for a KNN (K-Nearest Neighbor) model by employing the DALEXtra and DALEX packages to provide visual insights into the impact of a specified variable on the model's predictions.
Usage
viralx_knn_vis(
vip_featured,
hiv_data,
knn_hyperparameters,
vip_train,
vip_new,
orderings
)
Arguments
vip_featured |
The name of the response variable to explain. |
hiv_data |
The training dataset containing predictor variables and the response variable. |
knn_hyperparameters |
A list of hyperparameters for the KNN model, including neighbors, weight_func, and dist_power (the three entries set in the Examples). |
vip_train |
The dataset used for training the KNN model. |
vip_new |
The dataset for which SHAP values are calculated. |
orderings |
The number of orderings for SHAP value calculations. |
Value
A list of SHAP values for each observation in vip_new.
Examples
## Not run:
set.seed(123)
hiv_data <- train2
knn_hyperparameters <- list(neighbors = 5, weight_func = "optimal", dist_power = 0.3304783)
vip_featured <- "cd_2022"
vip_train <- hiv_data
vip_new <- vip_train[1,]
orderings <- 20
viralx_knn_vis(vip_featured, hiv_data, knn_hyperparameters, vip_train, vip_new, orderings)
## End(Not run)
Explain Multivariate Adaptive Regression Splines Model
Description
Explains the predictions of a Multivariate Adaptive Regression Splines (MARS) model for viral load or CD4 counts using the DALEX and DALEXtra tools.
Usage
viralx_mars(vip_featured, hiv_data, nt, pd, pru, vip_train, vip_new)
Arguments
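vip_featured |
A character value specifying the name of the variable to explain. |
hiv_data |
A data frame containing the HIV data used to train the MARS model. |
nt |
A numeric value |
pd |
A numeric value |
pru |
A character value |
vip_train |
A data frame containing the training data used to build the explainer. |
vip_new |
The new observation for which to generate explanations (the Examples pass the first row of vip_train). |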
Value
A data frame
Examples
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559,
173, 764, 780, 244, 527, 417, 800,
602, 494, 345, 780, 780, 527, 556,
559, 238, 288, 244, 353, 169, 556,
824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103,
11388, 46, 103, 11388, 40, 0, 11388,
0, 4095, 40, 93, 49, 49, 49,
4095, 6837, 38961, 38961, 0, 0, 93,
40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496,
230, 605, 432, 170, 670, 238, 238,
634, 422, 429, 513, 327, 465, 479,
661, 382, 364, 109, 398, 209, 1960,
992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0,
262, 0, 15089, 13016, 1513, 60, 60,
49248, 159308, 56, 0, 516675, 49, 237,
84, 292, 414, 26176, 62, 126, 93,
80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777,
149, 628, 614, 253, 918, 326, 326,
574, 361, 253, 726, 659, 596, 427,
447, 326, 253, 248, 326, 260, 918,
700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0,
955, 0, 0, 0, 0, 40, 0,
49248, 159308, 56, 0, 516675, 49, 237,
0, 23601, 0, 40, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
rsample::training()
nt <- 3
pd <- 1
pru <- "none"
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(dplyr::all_of(vip_features))
vip_new <- vip_train[1,]
viralx_mars(vip_featured, hiv_data, nt, pd, pru, vip_train, vip_new)
## End(Not run)
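MARS models are built from hinge functions, which are zero on one side of a knot and linear on the other. The sketch below shows that building block in base R with the knot fixed by hand; a real MARS fit (for example via the earth package) also searches for the knots and prunes terms. The `hinge()` helper and the data are illustrative assumptions, not part of viralx.

```r
# Hinge (basis) function used by MARS: zero below the knot, linear above it.
hinge <- function(x, knot) pmax(0, x - knot)

x <- seq(0, 10, by = 0.5)
y <- 1 + 2 * hinge(x, 5)   # piecewise-linear signal with a knot at 5

# With the knot known, a MARS-style model is an ordinary linear fit on a
# mirrored pair of hinge terms: max(0, x - 5) and max(0, 5 - x).
fit <- lm(y ~ hinge(x, 5) + hinge(5, x))
round(coef(fit), 6)
```

The fit recovers an intercept of 1, a slope of 2 above the knot, and a slope of 0 below it, matching how `y` was constructed.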
Explain Multivariate Adaptive Regression Splines Using SHAP Values
Description
Explains the predictions of a MARS (Multivariate Adaptive Regression Splines) model using SHAP (Shapley Additive Explanations) values. It utilizes the DALEXtra and DALEX packages to provide SHAP-based explanations for the specified model.
Usage
viralx_mars_shap(
vip_featured,
hiv_data,
nt,
pd,
pru,
vip_train,
vip_new,
orderings
)
Arguments
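vip_featured |
A character value specifying the name of the variable to explain. |
hiv_data |
A data frame containing the HIV data used to train the MARS model. |
nt |
A numeric value |
pd |
A numeric value |
pru |
A character value |
vip_train |
A data frame containing the training data used to build the explainer. |
vip_new |
The new observation for which SHAP values are calculated (the Examples pass the first row of vip_train). |
orderings |
A numeric value specifying the number of orderings used for the SHAP value calculations. |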
Value
A data frame
Examples
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559,
173, 764, 780, 244, 527, 417, 800,
602, 494, 345, 780, 780, 527, 556,
559, 238, 288, 244, 353, 169, 556,
824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103,
11388, 46, 103, 11388, 40, 0, 11388,
0, 4095, 40, 93, 49, 49, 49,
4095, 6837, 38961, 38961, 0, 0, 93,
40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496,
230, 605, 432, 170, 670, 238, 238,
634, 422, 429, 513, 327, 465, 479,
661, 382, 364, 109, 398, 209, 1960,
992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0,
262, 0, 15089, 13016, 1513, 60, 60,
49248, 159308, 56, 0, 516675, 49, 237,
84, 292, 414, 26176, 62, 126, 93,
80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777,
149, 628, 614, 253, 918, 326, 326,
574, 361, 253, 726, 659, 596, 427,
447, 326, 253, 248, 326, 260, 918,
700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0,
955, 0, 0, 0, 0, 40, 0,
49248, 159308, 56, 0, 516675, 49, 237,
0, 23601, 0, 40, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
rsample::training()
nt <- 3
pd <- 1
pru <- "none"
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(dplyr::all_of(vip_features))
vip_new <- vip_train[1,]
orderings <- 20
viralx_mars_shap(vip_featured, hiv_data, nt, pd, pru, vip_train, vip_new, orderings)
## End(Not run)
Visualize SHAP Values for Multivariate Adaptive Regression Splines Model
Description
Visualizes SHAP (Shapley Additive Explanations) values for a MARS (Multivariate Adaptive Regression Splines) model by employing the DALEXtra and DALEX packages to provide visual insights into the impact of a specified variable on the model's predictions.
Usage
viralx_mars_vis(
vip_featured,
hiv_data,
nt,
pd,
pru,
vip_train,
vip_new,
orderings
)
Arguments
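vip_featured |
A character value specifying the name of the variable to explain. |
hiv_data |
A data frame containing the HIV data used to train the MARS model. |
nt |
A numeric value |
pd |
A numeric value |
pru |
A character value |
vip_train |
A data frame containing the training data used to build the explainer. |
vip_new |
The new observation for which SHAP values are calculated (the Examples pass the first row of vip_train). |
orderings |
A numeric value specifying the number of orderings used for the SHAP value calculations. |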
Value
A ggplot object
Examples
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559,
173, 764, 780, 244, 527, 417, 800,
602, 494, 345, 780, 780, 527, 556,
559, 238, 288, 244, 353, 169, 556,
824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103,
11388, 46, 103, 11388, 40, 0, 11388,
0, 4095, 40, 93, 49, 49, 49,
4095, 6837, 38961, 38961, 0, 0, 93,
40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496,
230, 605, 432, 170, 670, 238, 238,
634, 422, 429, 513, 327, 465, 479,
661, 382, 364, 109, 398, 209, 1960,
992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0,
262, 0, 15089, 13016, 1513, 60, 60,
49248, 159308, 56, 0, 516675, 49, 237,
84, 292, 414, 26176, 62, 126, 93,
80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777,
149, 628, 614, 253, 918, 326, 326,
574, 361, 253, 726, 659, 596, 427,
447, 326, 253, 248, 326, 260, 918,
700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0,
955, 0, 0, 0, 0, 40, 0,
49248, 159308, 56, 0, 516675, 49, 237,
0, 23601, 0, 40, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
rsample::training()
nt <- 3
pd <- 1
pru <- "none"
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(dplyr::all_of(vip_features))
vip_new <- vip_train[1,]
orderings <- 20
viralx_mars_vis(vip_featured, hiv_data, nt, pd, pru, vip_train, vip_new, orderings)
## End(Not run)
Explain Neural Network Regression Model
Description
Explains the predictions of a neural network regression model for viral load or CD4 counts using the DALEX and DALEXtra tools.
Usage
viralx_nn(vip_featured, hiv_data, hu, plty, epo, vip_train, vip_new)
Arguments
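vip_featured |
A character value specifying the name of the variable to explain. |
hiv_data |
A data frame containing the HIV data used to train the neural network model. |
hu |
A numeric value specifying the number of hidden units in the neural network model. |
plty |
A numeric value specifying the penalty parameter for the neural network model. |
epo |
A numeric value specifying the number of epochs (training iterations) for the neural network model. |
vip_train |
A data frame containing the training data used to build the explainer. |
vip_new |
The new observation for which to generate explanations (the Examples pass the first row of vip_train). |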
Value
A data frame
Examples
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559,
173, 764, 780, 244, 527, 417, 800,
602, 494, 345, 780, 780, 527, 556,
559, 238, 288, 244, 353, 169, 556,
824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103,
11388, 46, 103, 11388, 40, 0, 11388,
0, 4095, 40, 93, 49, 49, 49,
4095, 6837, 38961, 38961, 0, 0, 93,
40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496,
230, 605, 432, 170, 670, 238, 238,
634, 422, 429, 513, 327, 465, 479,
661, 382, 364, 109, 398, 209, 1960,
992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0,
262, 0, 15089, 13016, 1513, 60, 60,
49248, 159308, 56, 0, 516675, 49, 237,
84, 292, 414, 26176, 62, 126, 93,
80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777,
149, 628, 614, 253, 918, 326, 326,
574, 361, 253, 726, 659, 596, 427,
447, 326, 253, 248, 326, 260, 918,
700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0,
955, 0, 0, 0, 0, 40, 0,
49248, 159308, 56, 0, 516675, 49, 237,
0, 23601, 0, 40, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
rsample::training()
hu <- 5
plty <- 1.131656e-09
epo <- 176
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(dplyr::all_of(vip_features))
vip_new <- vip_train[1,]
viralx_nn(vip_featured, hiv_data, hu, plty, epo, vip_train, vip_new)
## End(Not run)
Global Explainers for Neural Network Models
Description
The viralx_nn_glob function computes global explanations for a neural network model trained on HIV data with the specified hyperparameters.
Usage
viralx_nn_glob(vip_featured, hiv_data, hu, plty, epo, vip_train, v_train)
Arguments
vip_featured |
A character value specifying the variable of interest for which you want to explain predictions. |
hiv_data |
A data frame containing the dataset used for training the neural network model. |
hu |
A numeric value representing the number of hidden units in the neural network. |
plty |
A numeric value representing the penalty term for the neural network model. |
epo |
A numeric value specifying the number of epochs for training the neural network. |
vip_train |
A data frame containing the training data used for generating global explanations. |
v_train |
A numeric vector representing the target variable for the global explanations. |
Value
A list containing global explanations for the specified neural network model.
Examples
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559,
173, 764, 780, 244, 527, 417, 800,
602, 494, 345, 780, 780, 527, 556,
559, 238, 288, 244, 353, 169, 556,
824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103,
11388, 46, 103, 11388, 40, 0, 11388,
0, 4095, 40, 93, 49, 49, 49,
4095, 6837, 38961, 38961, 0, 0, 93,
40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496,
230, 605, 432, 170, 670, 238, 238,
634, 422, 429, 513, 327, 465, 479,
661, 382, 364, 109, 398, 209, 1960,
992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0,
262, 0, 15089, 13016, 1513, 60, 60,
49248, 159308, 56, 0, 516675, 49, 237,
84, 292, 414, 26176, 62, 126, 93,
80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777,
149, 628, 614, 253, 918, 326, 326,
574, 361, 253, 726, 659, 596, 427,
447, 326, 253, 248, 326, 260, 918,
700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0,
955, 0, 0, 0, 0, 40, 0,
49248, 159308, 56, 0, 516675, 49, 237,
0, 23601, 0, 40, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
rsample::training()
hu <- 5
plty <- 1.131656e-09
epo <- 176
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(dplyr::all_of(vip_features))
v_train <- vi_train |>
  rsample::training() |>
  dplyr::select(dplyr::all_of(vip_featured))
viralx_nn_glob(vip_featured, hiv_data, hu, plty, epo, vip_train, v_train)
## End(Not run)
Explain Neural Network Model Using SHAP Values
Description
Explains the predictions of a neural network model using SHAP (Shapley Additive Explanations) values. It utilizes the DALEXtra and DALEX packages to provide SHAP-based explanations for the specified model.
Usage
viralx_nn_shap(
vip_featured,
hiv_data,
hu,
plty,
epo,
vip_train,
vip_new,
orderings
)
Arguments
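vip_featured |
A character value specifying the name of the variable to explain. |
hiv_data |
A data frame containing the HIV data used to train the neural network model. |
hu |
A numeric value specifying the number of hidden units in the neural network model. |
plty |
A numeric value specifying the penalty parameter for the neural network model. |
epo |
A numeric value specifying the number of epochs (training iterations) for the neural network model. |
vip_train |
A data frame containing the training data used to build the explainer. |
vip_new |
The new observation for which SHAP values are calculated (the Examples pass the first row of vip_train). |
orderings |
A numeric value specifying the number of orderings used for the SHAP value calculations. |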
Value
A data frame
Examples
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559,
173, 764, 780, 244, 527, 417, 800,
602, 494, 345, 780, 780, 527, 556,
559, 238, 288, 244, 353, 169, 556,
824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103,
11388, 46, 103, 11388, 40, 0, 11388,
0, 4095, 40, 93, 49, 49, 49,
4095, 6837, 38961, 38961, 0, 0, 93,
40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496,
230, 605, 432, 170, 670, 238, 238,
634, 422, 429, 513, 327, 465, 479,
661, 382, 364, 109, 398, 209, 1960,
992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0,
262, 0, 15089, 13016, 1513, 60, 60,
49248, 159308, 56, 0, 516675, 49, 237,
84, 292, 414, 26176, 62, 126, 93,
80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777,
149, 628, 614, 253, 918, 326, 326,
574, 361, 253, 726, 659, 596, 427,
447, 326, 253, 248, 326, 260, 918,
700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0,
955, 0, 0, 0, 0, 40, 0,
49248, 159308, 56, 0, 516675, 49, 237,
0, 23601, 0, 40, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
rsample::training()
hu <- 5
plty <- 1.131656e-09
epo <- 176
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(dplyr::all_of(vip_features))
vip_new <- vip_train[1,]
orderings <- 20
viralx_nn_shap(vip_featured, hiv_data, hu, plty, epo, vip_train, vip_new, orderings)
## End(Not run)
Visualize SHAP Values for Neural Network Model
Description
Visualizes SHAP (Shapley Additive Explanations) values for a neural network model by employing the DALEXtra and DALEX packages to provide visual insights into the impact of a specified variable on the model's predictions.
Usage
viralx_nn_vis(
vip_featured,
hiv_data,
hu,
plty,
epo,
vip_train,
vip_new,
orderings
)
Arguments
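vip_featured |
A character value specifying the name of the variable to explain. |
hiv_data |
A data frame containing the HIV data used to train the neural network model. |
hu |
A numeric value specifying the number of hidden units in the neural network model. |
plty |
A numeric value specifying the penalty parameter for the neural network model. |
epo |
A numeric value specifying the number of epochs (training iterations) for the neural network model. |
vip_train |
A data frame containing the training data used to build the explainer. |
vip_new |
The new observation for which SHAP values are calculated (the Examples pass the first row of vip_train). |
orderings |
A numeric value specifying the number of orderings used for the SHAP value calculations. |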
Value
A ggplot object
Examples
## Not run:
library(dplyr)
library(rsample)
cd_2019 <- c(824, 169, 342, 423, 441, 507, 559,
173, 764, 780, 244, 527, 417, 800,
602, 494, 345, 780, 780, 527, 556,
559, 238, 288, 244, 353, 169, 556,
824, 169, 342, 423, 441, 507, 559)
vl_2019 <- c(40, 11388, 38961, 40, 75, 4095, 103,
11388, 46, 103, 11388, 40, 0, 11388,
0, 4095, 40, 93, 49, 49, 49,
4095, 6837, 38961, 38961, 0, 0, 93,
40, 11388, 38961, 40, 75, 4095, 103)
cd_2021 <- c(992, 275, 331, 454, 479, 553, 496,
230, 605, 432, 170, 670, 238, 238,
634, 422, 429, 513, 327, 465, 479,
661, 382, 364, 109, 398, 209, 1960,
992, 275, 331, 454, 479, 553, 496)
vl_2021 <- c(80, 1690, 5113, 71, 289, 3063, 0,
262, 0, 15089, 13016, 1513, 60, 60,
49248, 159308, 56, 0, 516675, 49, 237,
84, 292, 414, 26176, 62, 126, 93,
80, 1690, 5113, 71, 289, 3063, 0)
cd_2022 <- c(700, 127, 127, 547, 547, 547, 777,
149, 628, 614, 253, 918, 326, 326,
574, 361, 253, 726, 659, 596, 427,
447, 326, 253, 248, 326, 260, 918,
700, 127, 127, 547, 547, 547, 777)
vl_2022 <- c(0, 0, 53250, 0, 40, 1901, 0,
955, 0, 0, 0, 0, 40, 0,
49248, 159308, 56, 0, 516675, 49, 237,
0, 23601, 0, 40, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0)
x <- cbind(cd_2019, vl_2019, cd_2021, vl_2021, cd_2022, vl_2022) |>
as.data.frame()
set.seed(123)
hi_data <- rsample::initial_split(x)
set.seed(123)
hiv_data <- hi_data |>
rsample::training()
hu <- 5
plty <- 1.131656e-09
epo <- 176
vip_featured <- c("cd_2022")
vip_features <- c("cd_2019", "vl_2019", "cd_2021", "vl_2021", "vl_2022")
set.seed(123)
vi_train <- rsample::initial_split(x)
set.seed(123)
vip_train <- vi_train |>
  rsample::training() |>
  dplyr::select(dplyr::all_of(vip_features))
vip_new <- vip_train[1,]
orderings <- 20
viralx_nn_vis(vip_featured, hiv_data, hu, plty, epo, vip_train, vip_new, orderings)
## End(Not run)