Title: | Hospital Data Analysis Workflow Tools |
Version: | 0.2.2 |
Description: | Hospital data analysis workflow tools, modeling, and automations. This library provides many useful tools to review common administrative hospital data. Some of these include average length of stay, readmission rates, average net pay amounts by service lines just to name a few. The aim is to provide a simple and consistent verb framework that takes the guesswork out of everything. |
License: | MIT + file LICENSE |
Encoding: | UTF-8 |
LazyData: | true |
RoxygenNote: | 7.3.2 |
URL: | https://github.com/spsanderson/healthyR |
BugReports: | https://github.com/spsanderson/healthyR/issues |
Imports: | magrittr, rlang (≥ 0.1.2), tibble, timetk, ggplot2, dplyr, lubridate, graphics, purrr, stringr, writexl, cowplot, scales, sqldf, plotly |
Suggests: | knitr, rmarkdown, pacman, healthyR.data, broom, tidyselect |
VignetteBuilder: | knitr |
Depends: | R (≥ 3.3) |
NeedsCompilation: | no |
Packaged: | 2024-07-01 12:58:29 UTC; steve |
Author: | Steven Sanderson |
Maintainer: | Steven Sanderson <spsanderson@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2024-07-01 13:20:02 UTC |
Pipe operator
Description
See magrittr::%>%
for details.
Usage
lhs %>% rhs
Counts by Category
Description
Get the counts of a column by a particular grouping if supplied, otherwise just get counts of a column.
Usage
category_counts_tbl(.data, .count_col, .arrange_value = TRUE, ...)
Arguments
.data |
The data.frame/tibble supplied. |
.count_col |
The column that has the values you want to count. |
.arrange_value |
Defaults to true, this will arrange the resulting tibble in descending order by .count_col |
... |
Place the values you want to pass in for grouping here. |
Details
Requires a data.frame/tibble.
Requires a value column, a column that is going to counted.
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Data Table Functions:
los_ra_index_summary_tbl()
,
named_item_list()
,
top_n_tbl()
,
ts_census_los_daily_tbl()
,
ts_signature_tbl()
Examples
library(healthyR.data)
library(dplyr)
healthyR_data %>%
category_counts_tbl(
.count_col = payer_grouping
, .arrange = TRUE
, ip_op_flag
)
healthyR_data %>%
category_counts_tbl(
.count_col = ip_op_flag
, .arrange_value = TRUE
, service_line
)
Provide Colorblind Compliant Colors
Description
8 Hex RGB color definitions suitable for charts for colorblind people.
Usage
color_blind()
Details
This function is used in others in order to help render plots for those that are color blind.
Value
A vector of 8 Hex RGB definitions.
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Color Blind:
hr_scale_color_colorblind()
,
hr_scale_fill_colorblind()
Examples
color_blind()
Diverging Bar Chart
Description
Diverging Bars is a bar chart that can handle both negative and positive
values. This can be implemented by a smart tweak with geom_bar()
. But the
usage of geom_bar()
can be quite confusing. That's because, it can be used to
make a bar chart as well as a histogram. Let me explain.
By default, geom_bar()
has the stat set to count. That means, when you
provide just a continuous X variable (and no Y variable), it tries to make
a histogram out of the data.
In order to make a bar chart create bars instead of histogram,
you need to do two things. Set stat = identity
and provide both x
and y
inside aes()
where, x
is either character or factor and y
is numeric.
In order to make sure you get diverging bars instead of just bars, make sure,
your categorical variable has 2 categories that changes values at a certain
threshold of the continuous variable. In below example, the mpg from mtcars
data set is normalized by computing the z score. Those vehicles with mpg
above zero are marked green and those below are marked red.
Usage
diverging_bar_plt(
.data,
.x_axis,
.y_axis,
.fill_col,
.plot_title = NULL,
.plot_subtitle = NULL,
.plot_caption = NULL,
.interactive = FALSE
)
Arguments
.data |
The data to pass to the function, must be a tibble/data.frame. |
.x_axis |
The data that is passed to the x-axis. |
.y_axis |
The data that is passed to the y-axis. This will also equal the
parameter |
.fill_col |
The column that will be used to fill the color of the bars. |
.plot_title |
Default is NULL |
.plot_subtitle |
Default is NULL |
.plot_caption |
Default is NULL |
.interactive |
Default is FALSE. TRUE returns a plotly plot |
Details
This function takes only a few arguments and returns a ggplot2 object.
Value
A plotly
plot or a ggplot2
static plot
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Plotting Functions:
diverging_lollipop_plt()
,
gartner_magic_chart_plt()
,
los_ra_index_plt()
,
ts_alos_plt()
,
ts_median_excess_plt()
,
ts_plt()
,
ts_readmit_rate_plt()
Examples
suppressPackageStartupMessages(library(ggplot2))
data("mtcars")
mtcars$car_name <- rownames(mtcars)
mtcars$mpg_z <- round((mtcars$mpg - mean(mtcars$mpg))/sd(mtcars$mpg), 2)
mtcars$mpg_type <- ifelse(mtcars$mpg_z < 0, "below", "above")
mtcars <- mtcars[order(mtcars$mpg_z), ] # sort
mtcars$car_name <- factor(mtcars$car_name, levels = mtcars$car_name)
diverging_bar_plt(
.data = mtcars
, .x_axis = car_name
, .y_axis = mpg_z
, .fill_col = mpg_type
, .interactive = FALSE
)
Diverging Lollipop Chart
Description
This is a diverging lollipop function. Lollipop chart conveys the same information as bar chart and diverging bar. Except that it looks more modern. Instead of geom_bar, I use geom_point and geom_segment to get the lollipops right. Let’s draw a lollipop using the same data I prepared in the previous example of diverging bars.
Usage
diverging_lollipop_plt(
.data,
.x_axis,
.y_axis,
.plot_title = NULL,
.plot_subtitle = NULL,
.plot_caption = NULL,
.interactive = FALSE
)
Arguments
.data |
The data to pass to the function, must be a tibble/data.frame. |
.x_axis |
The data that is passed to the x-axis. This will also be the
|
.y_axis |
The data that is passed to the y-axis. This will also equal the
parameters of |
.plot_title |
Default is NULL |
.plot_subtitle |
Default is NULL |
.plot_caption |
Default is NULL |
.interactive |
Default is FALSE. TRUE returns a plotly plot |
Details
This function takes only a few arguments and returns a ggplot2 object.
Value
A plotly
plot or a ggplot2
static plot
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Plotting Functions:
diverging_bar_plt()
,
gartner_magic_chart_plt()
,
los_ra_index_plt()
,
ts_alos_plt()
,
ts_median_excess_plt()
,
ts_plt()
,
ts_readmit_rate_plt()
Examples
suppressPackageStartupMessages(library(ggplot2))
data("mtcars")
mtcars$car_name <- rownames(mtcars)
mtcars$mpg_z <- round((mtcars$mpg - mean(mtcars$mpg))/sd(mtcars$mpg), 2)
mtcars$mpg_type <- ifelse(mtcars$mpg_z < 0, "below", "above")
mtcars <- mtcars[order(mtcars$mpg_z), ] # sort
mtcars$car_name <- factor(mtcars$car_name, levels = mtcars$car_name)
diverging_lollipop_plt(.data = mtcars, .x_axis = car_name
, .y_axis = mpg_z)
Diagnosis to Condition Code Mapping file
Description
Diagnosis to Condition Code Mapping file
Usage
data(dx_cc_mapping)
Format
A data frame with 86852 rows and 5 variables
See Also
Other AHRQ:
px_cc_mapping
Gartner Magic Chart - Plotting of two continuous variables
Description
Plot a Gartner Magic Chart of two continuous variables.
Usage
gartner_magic_chart_plt(
.data,
.x_col,
.y_col,
.point_size_col = NULL,
.y_lab = "",
.x_lab = "",
.plot_title = "",
.top_left_label = "",
.top_right_label = "",
.bottom_right_label = "",
.bottom_left_label = ""
)
Arguments
.data |
The dataset you want to plot. |
.x_col |
The x-axis for the plot. |
.y_col |
The y-axis for the plot. |
.point_size_col |
The default is NULL. If you want to size the dots by a column in the data frame/tibble, enter the column name here. |
.y_lab |
The y-axis label (default: ""). |
.x_lab |
The x-axis label (default: ""). |
.plot_title |
The title of the plot (default: ""). |
.top_left_label |
The top left label (default: ""). |
.top_right_label |
The top right label (default: ""). |
.bottom_right_label |
The bottom right label (default: ""). |
.bottom_left_label |
The bottom left label (default: ""). |
Value
A ggplot
plot.
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Plotting Functions:
diverging_bar_plt()
,
diverging_lollipop_plt()
,
los_ra_index_plt()
,
ts_alos_plt()
,
ts_median_excess_plt()
,
ts_plt()
,
ts_readmit_rate_plt()
Examples
library(dplyr)
library(ggplot2)
data_tbl <- tibble(
x = rnorm(100, 0, 1),
y = rnorm(100, 0, 1),
z = abs(x) + abs(y)
)
gartner_magic_chart_plt(
.data = data_tbl,
.x_col = x,
.y_col = y,
.point_size_col = z,
.x_lab = "los",
.y_lab = "ra",
.plot_title = "tst",
.top_right_label = "High RA-LOS",
.top_left_label = "High RA",
.bottom_left_label = "Leader",
.bottom_right_label = "High LOS"
)
gartner_magic_chart_plt(
.data = data_tbl,
.x_col = x,
.y_col = y,
.point_size_col = NULL,
.x_lab = "los",
.y_lab = "ra",
.plot_title = "tst",
.top_right_label = "High RA-LOS",
.top_left_label = "High RA",
.bottom_left_label = "Leader",
.bottom_right_label = "High LOS"
)
Provide Colorblind Compliant Colors
Description
8 Hex RGB color definitions suitable for charts for colorblind people.
Usage
hr_scale_color_colorblind(..., theme = "hr")
Arguments
... |
Data passed in from a |
theme |
Right now this is |
Details
This function is used in others in order to help render plots for those that are color blind.
Value
A gggplot
layer
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Color Blind:
color_blind()
,
hr_scale_fill_colorblind()
Provide Colorblind Compliant Colors
Description
8 Hex RGB color definitions suitable for charts for colorblind people.
Usage
hr_scale_fill_colorblind(..., theme = "hr")
Arguments
... |
Data passed in from a |
theme |
Right now this is |
Details
This function is used in others in order to help render plots for those that are color blind.
Value
A gggplot
layer
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Color Blind:
color_blind()
,
hr_scale_color_colorblind()
Plot LOS and Readmit Index with Variance
Description
Plot the index of the length of stay and readmit rate against each other along with the variance
Usage
los_ra_index_plt(.data)
Arguments
.data |
The data supplied from |
Details
Expects a tibble
Expects a Length of Stay and Readmit column, must be numeric
Uses
cowplot
to stack plots
Value
A patchwork
ggplot2
plot
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Plotting Functions:
diverging_bar_plt()
,
diverging_lollipop_plt()
,
gartner_magic_chart_plt()
,
ts_alos_plt()
,
ts_median_excess_plt()
,
ts_plt()
,
ts_readmit_rate_plt()
Examples
suppressPackageStartupMessages(library(dplyr))
data_tbl <- tibble(
"alos" = runif(186, 1, 20)
, "elos" = runif(186, 1, 17)
, "readmit_rate" = runif(186, 0, .25)
, "readmit_rate_bench" = runif(186, 0, .2)
)
los_ra_index_summary_tbl(
.data = data_tbl
, .max_los = 15
, .alos_col = alos
, .elos_col = elos
, .readmit_rate = readmit_rate
, .readmit_bench = readmit_rate_bench
) %>%
los_ra_index_plt()
los_ra_index_summary_tbl(
.data = data_tbl
, .max_los = 10
, .alos_col = alos
, .elos_col = elos
, .readmit_rate = readmit_rate
, .readmit_bench = readmit_rate_bench
) %>%
los_ra_index_plt()
Make LOS and Readmit Index Summary Tibble
Description
Create the length of stay and readmit index summary tibble
Usage
los_ra_index_summary_tbl(
.data,
.max_los = 15,
.alos_col,
.elos_col,
.readmit_rate,
.readmit_bench
)
Arguments
.data |
The data you are going to analyze. |
.max_los |
You can give a maximum LOS value. Lets say you typically do not see los over 15 days, you would then set .max_los to 15 and all values greater than .max_los will be grouped to .max_los |
.alos_col |
The Average Length of Stay column |
.elos_col |
The Expected Length of Stay column |
.readmit_rate |
The Actual Readmit Rate column |
.readmit_bench |
The Expected Readmit Rate column |
Details
Expects a tibble
Expects the following columns and there should only be these 4
Length Of Stay Actual - Should be an integer
Length Of Stacy Benchmark - Should be an integer
Readmit Rate Actual - Should be 0/1 for each record, 1 = readmitted, 0 did not.
Readmit Rate Benchmark - Should be a percentage from the benchmark file.
This will add a column called visits that will be the count of records per length of stay from 1 to .max_los
The .max_los param can be left blank and the function will default to 15. If this is not a good default and you don't know what it should be then set it to 75 percentile from the
stats::quantile()
function using the defaults, like so .max_los =stats::quantile(data_tbl$alos)[[4]]
Uses all data to compute variance, if you want it for a particular time frame you will have to filter the data that goes into the .data argument. It is suggested to use
timetk::filter_by_time()
The index is computed as the excess of the length of stay or readmit rates over their respective expectations.
Value
A tibble
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Data Table Functions:
category_counts_tbl()
,
named_item_list()
,
top_n_tbl()
,
ts_census_los_daily_tbl()
,
ts_signature_tbl()
Examples
suppressPackageStartupMessages(library(dplyr))
data_tbl <- tibble(
"alos" = runif(186, 1, 20)
, "elos" = runif(186, 1, 17)
, "readmit_rate" = runif(186, 0, .25)
, "readmit_bench" = runif(186, 0, .2)
)
los_ra_index_summary_tbl(
.data = data_tbl
, .max_los = 15
, .alos_col = alos
, .elos_col = elos
, .readmit_rate = readmit_rate
, .readmit_bench = readmit_bench
)
los_ra_index_summary_tbl(
.data = data_tbl
, .max_los = 10
, .alos_col = alos
, .elos_col = elos
, .readmit_rate = readmit_rate
, .readmit_bench = readmit_bench
)
Tibble to named list
Description
Takes in a data.frame/tibble and creates a named list from a supplied grouping
variable. Can be used in conjunction with save_to_excel()
to create a new
sheet for each group of data.
Usage
named_item_list(.data, .group_col)
Arguments
.data |
The data.frame/tibble. |
.group_col |
The column that contains the groupings. |
Details
Requires a data.frame/tibble and a grouping column.
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Data Table Functions:
category_counts_tbl()
,
los_ra_index_summary_tbl()
,
top_n_tbl()
,
ts_census_los_daily_tbl()
,
ts_signature_tbl()
Examples
library(healthyR.data)
df <- healthyR_data
df_list <- named_item_list(.data = df, .group_col = service_line)
df_list
Get the optimal binwidth for a histogram
Description
Gives the optimal binwidth for a histogram given a data set, it's value and the desired amount of bins
Usage
opt_bin(.data, .value_col, .iters = 30)
Arguments
.data |
The data set in question |
.value_col |
The column that holds the values |
.iters |
How many times the cost function loop should run |
Details
Supply a data.frame/tibble with a value column. from this an optimal binwidth will be computed for the amount of binds desired
Value
A tibble of histogram breakpoints
Author(s)
Steven P. Sanderson II, MPH
Modified from Hideaki Shimazaki Department of Physics, Kyoto University shimazaki at ton.scphys.kyoto-u.ac.jp Feel free to modify/distribute this program.
See Also
Other Utilities:
save_to_excel()
,
sql_left()
,
sql_mid()
,
sql_right()
Examples
suppressPackageStartupMessages(library(purrr))
suppressPackageStartupMessages(library(dplyr))
df_tbl <- rnorm(n = 1000, mean = 0, sd = 1)
df_tbl <- df_tbl %>%
as_tibble() %>%
set_names("value")
df_tbl %>%
opt_bin(
.value_col = value
, .iters = 100
)
Procedure to Condition Code Mapping file
Description
Procedure to Condition Code Mapping file
Usage
data(px_cc_mapping)
Format
A data frame with 79721 rows and 5 variables
See Also
Other AHRQ:
dx_cc_mapping
Save a file to Excel
Description
Save a tibble/data.frame to an excel .xlsx
file. The file will automatically
with a save_dtime in the format of 20201109_132416 for November 11th, 2020
at 1:24:16PM.
Usage
save_to_excel(.data, .file_name)
Arguments
.data |
The tibble/data.frame that you want to save as an |
.file_name |
the name you want to give to the file. |
Details
Requires a tibble/data.frame to be passed to it.
Value
A saved excel file
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Utilities:
opt_bin()
,
sql_left()
,
sql_mid()
,
sql_right()
Service Line Grouper Augment Function
Description
Takes a few arguments from a data.frame/tibble and returns a service line augmented to a data.frame/tibble for a set of patients.
Usage
service_line_augment(.data, .dx_col, .px_col, .drg_col)
Arguments
.data |
The data being passed that will be augmented by the function. |
.dx_col |
The column containing the Principal Diagnosis for the discharge. |
.px_col |
The column containing the Principal Coded Procedure for the discharge. It is possible that this could be blank. |
.drg_col |
The DRG Number coded to the inpatient discharge. |
Details
This is an augment function in that appends a vector to an data.frame/tibble
that is passed to the .data
parameter. A data.frame/tibble
is required, along with a principal diagnosis column, a principal procedure column,
and a column for the DRG number. These are needed so that the function can
join the dx_cc_mapping and px_cc_mapping columns to provide the service line.
This function only works on visits that are coded using ICD Version 10 only.
Lets take an example discharge, the DRG is 896 and the Principal Diagnosis code
maps to DX_660, then this visit would get grouped to alcohol_abuse
DRG 896: ALCOHOL, DRUG ABUSE OR DEPENDENCE WITHOUT REHABILITATION THERAPY WITH MAJOR COMPLICATION OR COMORBIDITY (MCC)
DX_660 Maps to the following ICD-10 Codes ie F1010 Alcohol abuse, uncomplicated:
library(healthyR) dx_cc_mapping %>% filter(CC_Code == "DX_660", ICD_Ver_Flag == "10")
Value
An augmented data.frame/tibble with the service line appended as a new column.
Author(s)
Steven P. Sanderson II, MPH
Examples
df <- data.frame(
dx_col = "F10.10",
px_col = NA,
drg_col = "896"
)
service_line_augment(
.data = df,
.dx_col = dx_col,
.px_col = px_col,
.drg_col = drg_col
)
Service Line Grouper Vectorized Function
Description
Takes a few arguments from a data.frame/tibble and returns a service line vector for a set of patients.
Usage
service_line_vec(.data, .dx_col, .px_col, .drg_col)
Arguments
.data |
The data being passed that will be augmented by the function. |
.dx_col |
The column containing the Principal Diagnosis for the discharge. |
.px_col |
The column containing the Principal Coded Procedure for the discharge. It is possible that this could be blank. |
.drg_col |
The DRG Number coded to the inpatient discharge. |
Details
This is a vectorized function in that it returns a vector. It can be applied
inside of a mutate
statement when using dplyr
if desired. A data.frame/tibble
is required, along with a principal diagnosis column, a principal procedure column,
and a column for the DRG number. These are needed so that the function can
join the dx_cc_mapping and px_cc_mapping columns to provide the service line.
This function only works on visits that are coded using ICD Version 10 only.
Lets take an example discharge, the DRG is 896 and the Principal Diagnosis code
maps to DX_660, then this visit would get grouped to alcohol_abuse
DRG 896: ALCOHOL, DRUG ABUSE OR DEPENDENCE WITHOUT REHABILITATION THERAPY WITH MAJOR COMPLICATION OR COMORBIDITY (MCC)
DX_660 Maps to the following ICD-10 Codes ie F1010 Alcohol abuse, uncomplicated:
library(healthyR) dx_cc_mapping %>% filter(CC_Code == "DX_660", ICD_Ver_Flag == "10")
Value
A vector of service line assignments.
Author(s)
Steven P. Sanderson II, MPH
Examples
df <- data.frame(
dx_col = "F10.10",
px_col = NA,
drg_col = "896"
)
service_line_vec(
.data = df,
.dx_col = dx_col,
.px_col = px_col,
.drg_col = drg_col
)
Use SQL LEFT type function
Description
Perform an SQL LEFT() type function on a piece of text
Usage
sql_left(.text, .num_char)
Arguments
.text |
A piece of text/string to be manipulated |
.num_char |
How many characters do you want to grab |
Details
You must supply data that you want to manipulate.
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Utilities:
opt_bin()
,
save_to_excel()
,
sql_mid()
,
sql_right()
Examples
sql_left("text", 3)
Use SQL MID type function
Description
Perform an SQL SUBSTRING type function
Usage
sql_mid(.text, .start_num, .num_char)
Arguments
.text |
A piece of text/string to be manipulated |
.start_num |
What place to start at |
.num_char |
How many characters do you want to grab |
Details
You must supply data that you want to manipulate.
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Utilities:
opt_bin()
,
save_to_excel()
,
sql_left()
,
sql_right()
Examples
sql_mid("this is some text", 6, 2)
Use SQL RIGHT type functions
Description
Perform an SQL RIGHT type function
Usage
sql_right(.text, .num_char)
Arguments
.text |
A piece of text/string to be manipulated |
.num_char |
How many characters do you want to grab |
Details
You must supply data that you want to manipulate.
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Utilities:
opt_bin()
,
save_to_excel()
,
sql_left()
,
sql_mid()
Examples
sql_right("this is some more text", 3)
Tidy eval helpers
Description
-
sym()
creates a symbol from a string andsyms()
creates a list of symbols from a character vector. -
enquo()
andenquos()
delay the execution of one or several function arguments.enquo()
returns a single quoted expression, which is like a blueprint for the delayed computation.enquos()
returns a list of such quoted expressions. -
expr()
quotes a new expression locally. It is mostly useful to build new expressions around arguments captured withenquo()
orenquos()
:expr(mean(!!enquo(arg), na.rm = TRUE))
. -
as_name()
transforms a quoted variable name into a string. Supplying something else than a quoted variable name is an error.That's unlike
as_label()
which also returns a single string but supports any kind of R object as input, including quoted function calls and vectors. Its purpose is to summarise that object into a single label. That label is often suitable as a default name.If you don't know what a quoted expression contains (for instance expressions captured with
enquo()
could be a variable name, a call to a function, or an unquoted constant), then useas_label()
. If you know you have quoted a simple variable name, or would like to enforce this, useas_name()
.
To learn more about tidy eval and how to use these tools, visit Metaprogramming section of Advanced R.
Top N tibble
Description
Get a tibble returned with n records sorted either by descending order (default) or ascending order.
Usage
top_n_tbl(.data, .n_records, .arrange_value = TRUE, ...)
Arguments
.data |
The data you want to pass to the function |
.n_records |
How many records you want returned |
.arrange_value |
A boolean with TRUE as the default. TRUE sorts data in descending order |
... |
The columns you want to pass to the function. |
Details
Requires a data.frame/tibble
Requires at least one column to be chosen inside of the ...
Will return the tibble in sorted order that is chosen with descending as the default
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Data Table Functions:
category_counts_tbl()
,
los_ra_index_summary_tbl()
,
named_item_list()
,
ts_census_los_daily_tbl()
,
ts_signature_tbl()
Examples
library(healthyR.data)
df <- healthyR_data
df_tbl <- top_n_tbl(
.data = df
, .n_records = 3
, .arrange_value = TRUE
, service_line
, payer_grouping
)
print(df_tbl)
Plot ALOS - Average Length of Stay
Description
Plot ALOS - Average Length of Stay
Usage
ts_alos_plt(.data, .date_col, .value_col, .by_grouping, .interactive)
Arguments
.data |
The time series data you need to pass |
.date_col |
The date column |
.value_col |
The value column |
.by_grouping |
How you want the data summarized - "sec", "min", "hour", "day", "week", "month", "quarter" or "year" |
.interactive |
TRUE or FALSE. TRUE returns a |
Details
Expects a tibble with a date time column and a value column
Uses
timetk
for underlying sumarization and plotIf .by_grouping is missing it will default to "day"
A static ggplot2 object is return if the .interactive function is FALSE otherwise a
plotly
plot is returned.
Value
A timetk time series plot
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Plotting Functions:
diverging_bar_plt()
,
diverging_lollipop_plt()
,
gartner_magic_chart_plt()
,
los_ra_index_plt()
,
ts_median_excess_plt()
,
ts_plt()
,
ts_readmit_rate_plt()
Examples
library(healthyR)
library(healthyR.data)
library(timetk)
library(dplyr)
library(purrr)
# Make A Series of Dates ----
data_tbl <- healthyR_data
df_tbl <- data_tbl %>%
filter(ip_op_flag == "I") %>%
select(visit_end_date_time, length_of_stay) %>%
summarise_by_time(
.date_var = visit_end_date_time
, .by = "day"
, visits = mean(length_of_stay, na.rm = TRUE)
) %>%
filter_by_time(
.date_var = visit_end_date_time
, .start_date = "2012"
, .end_date = "2019"
) %>%
set_names("Date","Values")
ts_alos_plt(
.data = df_tbl
, .date_col = Date
, .value_col = Values
, .by = "month"
, .interactive = FALSE
)
Time Series - Census and LOS by Day
Description
Sometimes it is important to know what the census was on any given day, or what
the average length of stay is on given day, including for those patients that
are not yet discharged. This can be easily achieved. This will return one
record for every account so the data will still need to be summarized. If there
are multiple entries per day then those records will show up and you will
therefore have multiple entries in the column date
in the resulting tibble
.
If you want to aggregate from there you should be able to do so easily.
If you have a record where the .start_date_col
is filled in but the corresponding
end_date
is null then the end date will be set equal to Sys.Date()
If a record has a start_date
that is NA
then it will be discarded.
This function can take a little bit of time to run while the join comparison runs.
Usage
ts_census_los_daily_tbl(
.data,
.keep_nulls_only = FALSE,
.start_date_col,
.end_date_col,
.by_time = "day"
)
Arguments
.data |
The data you want to pass to the function |
.keep_nulls_only |
A boolean that will keep only those records that have a NULL end date, meaning the patient is still admitted. The default is FALSE which brings back all records. |
.start_date_col |
The column containing the start date for the record |
.end_date_col |
The column containing the end date for the record. |
.by_time |
How you want the data presented, defaults to day and should remain that way unless you need more granular data. |
Details
Requires a dataset that has at least a start date column and an end date column
Takes a single boolean parameter
Value
A tibble object
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Data Table Functions:
category_counts_tbl()
,
los_ra_index_summary_tbl()
,
named_item_list()
,
top_n_tbl()
,
ts_signature_tbl()
Examples
library(healthyR)
library(healthyR.data)
library(dplyr)
df <- healthyR_data
df_tbl <- df %>%
filter(ip_op_flag == "I") %>%
select(visit_start_date_time, visit_end_date_time) %>%
timetk::filter_by_time(.date_var = visit_start_date_time, .start_date = "2020")
ts_census_los_daily_tbl(
.data = df_tbl
, .keep_nulls_only = FALSE
, .start_date_col = visit_start_date_time
, .end_date_col = visit_end_date_time
)
Create a plot showing the excess of the median value
Description
Plot out the excess +/- of the median value grouped by certain time parameters.
Usage
ts_median_excess_plt(
.data,
.date_col,
.value_col,
.x_axis,
.ggplot_group_var,
.years_back
)
Arguments
.data |
The data that is being analyzed, data must be a tibble/data.frame. |
.date_col |
The column of the tibble that holds the date. |
.value_col |
The column that holds the value of interest. |
.x_axis |
What is the be the x-axis, day, week, etc. |
.ggplot_group_var |
The variable to group the ggplot on. |
.years_back |
How many yeas back do you want to go in order to compute the median value. |
Details
Supply data that you want to view and you will see the excess +/- of the median values over a specified time series tibble.
Value
A ggplot2
plot
See Also
Other Plotting Functions:
diverging_bar_plt()
,
diverging_lollipop_plt()
,
gartner_magic_chart_plt()
,
los_ra_index_plt()
,
ts_alos_plt()
,
ts_plt()
,
ts_readmit_rate_plt()
Examples
suppressPackageStartupMessages(library(timetk))
ts_signature_tbl(
.data = m4_daily
, .date_col = date
) %>%
ts_median_excess_plt(
.date_col = date
, .value_col = value
, .x_axis = month
, .ggplot_group_var = year
, .years_back = 1
)
Time Series Plot
Description
This is a warpper function to the timetk::plot_time_series()
function with
a limited functionality parameter set. To see the full reference please visit
the timetk
package site.
Usage
ts_plt(
.data,
.date_col,
.value_col,
.color_col = NULL,
.facet_col = NULL,
.facet_ncol = NULL,
.interactive = FALSE
)
Arguments
.data |
The data to pass to the function, must be a tibble/data.frame. |
.date_col |
The column holding the date. |
.value_col |
The column holding the value. |
.color_col |
The column holding the variable for color. |
.facet_col |
The column holding the variable for faceting. |
.facet_ncol |
How many columns do you want. |
.interactive |
Return a |
Details
This function takes only a few of the arguments in the function and presets others while choosing the defaults on others. The smoother functionality is turned off.
Value
A plotly
plot or a ggplot2
static plot
Author(s)
Steven P. Sanderson II, MPH
See Also
https://business-science.github.io/timetk/reference/plot_time_series.html
Other Plotting Functions:
diverging_bar_plt()
,
diverging_lollipop_plt()
,
gartner_magic_chart_plt()
,
los_ra_index_plt()
,
ts_alos_plt()
,
ts_median_excess_plt()
,
ts_readmit_rate_plt()
Examples
suppressPackageStartupMessages(library(dplyr))
library(timetk)
library(healthyR.data)
healthyR.data::healthyR_data %>%
filter(ip_op_flag == "I") %>%
select(visit_end_date_time, service_line) %>%
filter_by_time(
.date_var = visit_end_date_time
, .start_date = "2020"
) %>%
group_by(service_line) %>%
summarize_by_time(
.date_var = visit_end_date_time
, .by = "month"
, visits = n()
) %>%
ungroup() %>%
ts_plt(
.date_col = visit_end_date_time
, .value_col = visits
, .color_col = service_line
)
Plot Readmit Rate
Description
Plot Readmit Rate
Usage
ts_readmit_rate_plt(.data, .date_col, .value_col, .by_grouping, .interactive)
Arguments
.data |
The data you need to pass. |
.date_col |
The date column. |
.value_col |
The value column. |
.by_grouping |
How you want the data summarized - "sec", "min", "hour", "day", "week", "month", "quarter" or "year". |
.interactive |
TRUE or FALSE. TRUE returns a |
Details
Expects a tibble with a date time column and a value column
Uses
timetk
for underlying sumarization and plotIf .by_grouping is missing it will default to "day"
Value
A timetk
time series plot that is interactive
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Plotting Functions:
diverging_bar_plt()
,
diverging_lollipop_plt()
,
gartner_magic_chart_plt()
,
los_ra_index_plt()
,
ts_alos_plt()
,
ts_median_excess_plt()
,
ts_plt()
Examples
set.seed(123)
suppressPackageStartupMessages(library(timetk))
suppressPackageStartupMessages(library(purrr))
suppressPackageStartupMessages(library(dplyr))
ts_tbl <- tk_make_timeseries(
start = "2019-01-01"
, by = "day"
, length_out = "1 year 6 months"
)
values <- arima.sim(
model = list(
order = c(0, 1, 0))
, n = 547
, mean = 1
, sd = 5
)
df_tbl <- tibble(
x = ts_tbl
, y = values
) %>%
set_names("Date","Values")
ts_readmit_rate_plt(
.data = df_tbl
, .date_col = Date
, .value_col = Values
, .by = "month"
, .interactive = FALSE
)
Make a Time Enhanced Tibble
Description
Returns a tibble that adds the time series signature from the
timetk::tk_augment_timeseries_signature()
function. All added from a chosen
date column defined by the .date_col
parameter.
Usage
ts_signature_tbl(.data, .date_col, .pad_time = TRUE, ...)
Arguments
.data |
The data that is being analyzed. |
.date_col |
The column that holds the date. |
.pad_time |
Boolean TRUE/FALSE. If TRUE then the |
... |
Grouping variables to be used by |
Details
Supply data with a date column and this will add the year, month, week, week day and hour to the tibble. The original date column is kept.
Returns a time-series signature tibble.
You must know the data going into the function and if certain columns should be dropped or kept when using further functions
Value
A tibble
Author(s)
Steven P. Sanderson II, MPH
See Also
Other Data Table Functions:
category_counts_tbl()
,
los_ra_index_summary_tbl()
,
named_item_list()
,
top_n_tbl()
,
ts_census_los_daily_tbl()
Examples
library(timetk)
ts_signature_tbl(
.data = m4_daily
, .date_col = date
, .pad_time = TRUE
, id
)