Help for package REDCapCAST

Title:

REDCap Metadata Casting and Castellated Data Handling

Version:

25.3.2

Description:

Casting metadata for REDCap database creation and handling of castellated data using repeated instruments and longitudinal projects in 'REDCap'. Keeps a focused data export approach, by allowing to only export required data from the database. Also for casting new REDCap databases based on datasets from other sources. Originally forked from the R part of 'REDCapRITS' by Paul Egeler. See https://github.com/pegeler/REDCapRITS. 'REDCap' (Research Electronic Data Capture) is a secure, web-based software platform designed to support data capture for research studies, providing 1) an intuitive interface for validated data capture; 2) audit trails for tracking data manipulation and export procedures; 3) automated export procedures for seamless data downloads to common statistical packages; and 4) procedures for data integration and interoperability with external sources (Harris et al (2009) <doi:10.1016/j.jbi.2008.08.010>; Harris et al (2019) <doi:10.1016/j.jbi.2019.103208>).

Depends:

R (≥ 4.1.0)

Suggests:

httr, jsonlite, testthat, Hmisc, knitr, rmarkdown, styler, devtools, roxygen2, spelling, rhub, rsconnect, pkgconfig

License:

GPL (≥ 3)

Encoding:

UTF-8

LazyData:

true

RoxygenNote:

7.3.2

URL:

https://github.com/agdamsbo/REDCapCAST, https://agdamsbo.github.io/REDCapCAST/

BugReports:

https://github.com/agdamsbo/REDCapCAST/issues

Imports:

dplyr, REDCapR, tidyr, tidyselect, keyring, purrr, readr, stats, zip, assertthat, forcats, vctrs, gt, bslib, here, glue, gtsummary, shiny, haven, openxlsx2, readODS

Language:

en-US

VignetteBuilder:

knitr

Collate:

'REDCapCAST-package.R' 'utils.r' 'process_user_input.r' 'REDCap_split.r' 'as_factor.R' 'as_logical.R' 'doc2dd.R' 'ds2dd_detailed.R' 'easy_redcap.R' 'export_redcap_instrument.R' 'fct_drop.R' 'html_styling.R' 'mtcars_redcap.R' 'read_redcap_instrument.R' 'read_redcap_tables.R' 'redcap_wider.R' 'redcapcast_data.R' 'redcapcast_meta.R' 'shiny_cast.R'

NeedsCompilation:

Packaged:

2025-03-10 10:39:02 UTC; au301842

Author:

Andreas Gammelgaard Damsbo [aut, cre], Paul Egeler [aut]

Maintainer:

Andreas Gammelgaard Damsbo <agdamsbo@clin.au.dk>

Repository:

CRAN

Date/Publication:

2025-03-10 14:30:10 UTC

REDCapCAST: REDCap Metadata Casting and Castellated Data Handling

Description

Casting metadata for REDCap database creation and handling of castellated data using repeated instruments and longitudinal projects in 'REDCap'. Keeps a focused data export approach, by allowing to only export required data from the database. Also for casting new REDCap databases based on datasets from other sources. Originally forked from the R part of 'REDCapRITS' by Paul Egeler. See https://github.com/pegeler/REDCapRITS. 'REDCap' (Research Electronic Data Capture) is a secure, web-based software platform designed to support data capture for research studies, providing 1) an intuitive interface for validated data capture; 2) audit trails for tracking data manipulation and export procedures; 3) automated export procedures for seamless data downloads to common statistical packages; and 4) procedures for data integration and interoperability with external sources (Harris et al (2009) doi:10.1016/j.jbi.2008.08.010; Harris et al (2019) doi:10.1016/j.jbi.2019.103208).

Author(s)

Maintainer: Andreas Gammelgaard Damsbo agdamsbo@clin.au.dk (ORCID)

Authors:

Paul Egeler paulegeler@gmail.com (ORCID)

Split REDCap repeating instruments table into multiple tables

Description

This will take output from a REDCap export and split it into a base table and child tables for each repeating instrument. Metadata is used to determine which fields should be included in each resultant table.

Usage

REDCap_split(
  records,
  metadata,
  primary_table_name = "",
  forms = c("repeating", "all")
)

Arguments

records

Exported project records. May be a data.frame, response, or character vector containing JSON from an API call.

metadata

Project metadata (the data dictionary). May be a data.frame, response, or character vector containing JSON from an API call.

primary_table_name

Name given to the list element for the primary output table. Ignored if forms = 'all'.

forms

Indicate whether to create separate tables for repeating instruments only or for all forms.

Value

A list of "data.frame"s. The number of tables will differ depending on the forms option selected.

'repeating': one base table and one or more tables for each repeating instrument.
'all': a data.frame for each instrument, regardless of whether it is a repeating instrument or not.

Author(s)

Paul W. Egeler

Examples

## Not run: 
# Using an API call -------------------------------------------------------

library(RCurl)

# Get the records
records <- postForm(
  uri = api_url, # Supply your site-specific URI
  token = api_token, # Supply your own API token
  content = "record",
  format = "json",
  returnFormat = "json"
)

# Get the metadata
metadata <- postForm(
  uri = api_url, # Supply your site-specific URI
  token = api_token, # Supply your own API token
  content = "metadata",
  format = "json"
)

# Convert exported JSON strings into a list of data.frames
REDCapCAST::REDCap_split(records, metadata)

# Using a raw data export -------------------------------------------------

# Get the records
records <- read.csv("/path/to/data/ExampleProject_DATA_2018-06-03_1700.csv")

# Get the metadata
metadata <- read.csv(
  "/path/to/data/ExampleProject_DataDictionary_2018-06-03.csv"
)

# Split the tables
REDCapCAST::REDCap_split(records, metadata)

# In conjunction with the R export script ---------------------------------

# You must set the working directory first since the REDCap data export
# script contains relative file references.
old <- getwd()
setwd("/path/to/data/")

# Run the data export script supplied by REDCap.
# This will create a data.frame of your records called 'data'
source("ExampleProject_R_2018-06-03_1700.r")

# Get the metadata
metadata <- read.csv("ExampleProject_DataDictionary_2018-06-03.csv")

# Split the tables
REDCapCAST::REDCap_split(data, metadata)
setwd(old)

## End(Not run)
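
The splitting idea described above can be illustrated with a minimal base R sketch. This is not the package implementation: the toy `records` and `metadata` data frames and the helper `split_sketch()` are hypothetical, and only the standard REDCap export columns `redcap_repeat_instrument` and `redcap_repeat_instance` are assumed.

```r
# Hypothetical toy export: one base row and two repeating-instrument rows
records <- data.frame(
  record_id = c(1, 1, 1),
  redcap_repeat_instrument = c("", "visits", "visits"),
  redcap_repeat_instance = c(NA, 1, 2),
  age = c(54, NA, NA),
  visit_weight = c(NA, 80, 82)
)

# Hypothetical data dictionary mapping each field to its form
metadata <- data.frame(
  field_name = c("record_id", "age", "visit_weight"),
  form_name = c("baseline", "baseline", "visits")
)

# Illustrative helper (not part of REDCapCAST)
split_sketch <- function(records, metadata, id = "record_id") {
  is_base <- records$redcap_repeat_instrument == ""
  repeating <- unique(records$redcap_repeat_instrument[!is_base])
  # Fields belonging to repeating forms are kept out of the base table
  rep_fields <- metadata$field_name[metadata$form_name %in% repeating]
  base_fields <- setdiff(metadata$field_name, rep_fields)
  out <- list(base = records[is_base, base_fields, drop = FALSE])
  # One child table per repeating instrument, keyed by id and instance
  for (form in repeating) {
    fields <- metadata$field_name[metadata$form_name == form]
    cols <- c(id, "redcap_repeat_instrument", "redcap_repeat_instance", fields)
    rows <- records$redcap_repeat_instrument == form
    out[[form]] <- records[rows, cols, drop = FALSE]
  }
  out
}

tables <- split_sketch(records, metadata)
names(tables)
```

The metadata drives which columns land in which table, which is why both arguments are required by `REDCap_split()`.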

Check if vector is all NA

Description

Check if vector is all NA

Usage

all_na(data)

Arguments

data

vector or data.frame

Value

logical

Examples

rep(NA, 4) |> all_na()

Preserve all factor levels from REDCap data dictionary in data export

Description

Preserve all factor levels from REDCap data dictionary in data export

Usage

apply_factor_labels(data, meta = NULL)

Arguments

data

REDCap exported data set

meta

REDCap data dictionary

Value

data.frame

Apply REDCap field labels to data frame

Description

Apply REDCap field labels to data frame

Usage

apply_field_label(data, meta)

Arguments

data

REDCap exported data set

meta

REDCap data dictionary

Value

data.frame

Convert labelled vectors to factors while preserving attributes

Description

This extends forcats::as_factor() as well as haven::as_factor() by appending the original attributes, except for "class", after converting to factor, to avoid data loss in case of rich formatted and labelled data.

Usage

as_factor(x, ...)

## S3 method for class 'factor'
as_factor(x, ...)

## S3 method for class 'logical'
as_factor(x, ...)

## S3 method for class 'numeric'
as_factor(x, ...)

## S3 method for class 'character'
as_factor(x, ...)

## S3 method for class 'haven_labelled'
as_factor(
  x,
  levels = c("default", "labels", "values", "both"),
  ordered = FALSE,
  ...
)

## S3 method for class 'labelled'
as_factor(
  x,
  levels = c("default", "labels", "values", "both"),
  ordered = FALSE,
  ...
)

## S3 method for class 'data.frame'
as_factor(x, ..., only_labelled = TRUE)

Arguments

x

Object to coerce to a factor.

...

Other arguments passed down to method.

levels

How to create the levels of the generated factor:

* "default": uses labels where available, otherwise the values. Labels are sorted by value.
* "both": like "default", but pastes together the level and value.
* "labels": use only the labels; unlabelled values become 'NA'.
* "values": use only the values.

ordered

If 'TRUE' create an ordered (ordinal) factor, if 'FALSE' (the default) create a regular (nominal) factor.

only_labelled

Only apply to labelled columns?

Details

Please refer to the parent functions for extended documentation. To avoid redundant calls and errors, the functions are copy-pasted here.

Empty variables with empty levels attribute are interpreted as logicals

Examples

# will preserve all attributes
c(1, 4, 3, "A", 7, 8, 1) |> as_factor()
structure(c(1, 2, 3, 2, 10, 9),
  labels = c(Unknown = 9, Refused = 10)
) |>
  as_factor() |>
  dput()

structure(c(1, 2, 3, 2, 10, 9),
  labels = c(Unknown = 9, Refused = 10),
  class = "haven_labelled"
) |>
  as_factor() |> class()
structure(rep(NA,10),
  class = c("labelled")
) |>
  as_factor() |> summary()

rep(NA,10) |> as_factor()
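
The attribute-preserving idea can be sketched in base R as follows. This is a simplified illustration under the assumption that attributes other than "class" and "levels" are simply copied over; `as_factor_sketch()` is a hypothetical helper, not the package method.

```r
# Illustrative helper (not part of REDCapCAST): convert to factor and
# re-attach the original attributes, except "class" and "levels"
as_factor_sketch <- function(x) {
  keep <- attributes(x)
  keep <- keep[setdiff(names(keep), c("class", "levels"))]
  out <- factor(x)
  for (nm in names(keep)) attr(out, nm) <- keep[[nm]]
  out
}

x <- structure(c(1, 2, 3, 2, 10, 9),
  labels = c(Unknown = 9, Refused = 10),
  label = "Answer"
)
f <- as_factor_sketch(x)
attributes(f)
```

A plain `factor(x)` call would return only "levels" and "class", so the "labels" and "label" attributes would be lost.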

Interpret specific binary values as logicals

Description

Interpret specific binary values as logicals

Usage

as_logical(
  x,
  values = list(c("TRUE", "FALSE"), c("Yes", "No"), c(1, 0), c(1, 2)),
  ...
)

## S3 method for class 'data.frame'
as_logical(
  x,
  values = list(c("TRUE", "FALSE"), c("Yes", "No"), c(1, 0), c(1, 2)),
  ...
)

## Default S3 method:
as_logical(
  x,
  values = list(c("TRUE", "FALSE"), c("Yes", "No"), c(1, 0), c(1, 2)),
  ...
)

Arguments

x

vector or data.frame

values

list of values to interpret as logicals. First value is interpreted as TRUE.

...

ignored

Value

vector

Examples

c(sample(c("TRUE", "FALSE"), 20, TRUE), NA) |>
  as_logical() |>
  class()
ds <- dplyr::tibble(
  B = factor(sample(c(1, 2), 20, TRUE)),
  A = factor(sample(c("TRUE", "FALSE"), 20, TRUE)),
  C = sample(c(3, 4), 20, TRUE),
  D = factor(sample(c("In", "Out"), 20, TRUE))
)
ds |>
  as_logical() |>
  sapply(class)
ds$A |> class()
sample(c("TRUE",NA), 20, TRUE) |>
  as_logical()
as_logical(0)
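
A minimal sketch of the matching rule, assuming a vector is converted only when all of its non-missing values fall within one of the supplied pairs. The helper `as_logical_sketch()` is hypothetical and the package may apply additional rules.

```r
# Illustrative helper (not part of REDCapCAST)
as_logical_sketch <- function(
    x,
    values = list(c("TRUE", "FALSE"), c("Yes", "No"), c(1, 0), c(1, 2))) {
  chr <- as.character(x)
  observed <- unique(chr[!is.na(chr)])
  for (pair in values) {
    pair <- as.character(pair)
    # Convert only if every observed value belongs to this pair;
    # the first element of the pair is read as TRUE
    if (length(observed) > 0 && all(observed %in% pair)) {
      return(chr == pair[1])
    }
  }
  x # no pair matched: return the input unchanged
}

as_logical_sketch(c("Yes", "No", NA))
as_logical_sketch(c(1, 2, 2))
as_logical_sketch(c("In", "Out"))
```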

List-based regex case_when

Description

Mimics case_when for list of regex patterns and values. Used for date/time validation generation from name vector. Like case_when, the matches are in order of priority. Primarily used in REDCapCAST to do data type coding from systematic variable naming.

Usage

case_match_regex_list(data, match.list, .default = NA)

Arguments

data

vector

match.list

list of case matches

.default

Default value for non-matches. Default is NA.

Value

vector

Examples

case_match_regex_list(
  c("test_date", "test_time", "test_tida", "test_tid"),
  list(date_dmy = "_dat[eo]$", time_hh_mm_ss = "_ti[md]e?$")
)
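
The priority-ordered matching can be sketched in base R. The helper `regex_case_sketch()` is a hypothetical illustration of the technique, not the package code; it walks the pattern list in reverse so that earlier entries overwrite later ones.

```r
# Illustrative helper (not part of REDCapCAST)
regex_case_sketch <- function(data, match.list, .default = NA) {
  out <- rep(.default, length(data))
  # Reverse order: patterns listed first are applied last and therefore win
  for (nm in rev(names(match.list))) {
    out[grepl(match.list[[nm]], data)] <- nm
  }
  out
}

res <- regex_case_sketch(
  c("test_date", "test_time", "test_tida", "test_tid"),
  list(date_dmy = "_dat[eo]$", time_hh_mm_ss = "_ti[md]e?$")
)
res
```

Here "test_tida" matches neither pattern because both are anchored at the end of the name, so it receives the default value.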

Overview of REDCapCAST data for shiny

Description

Overview of REDCapCAST data for shiny

Usage

cast_data_overview(data)

Arguments

data

list with class 'REDCapCAST'

Value

gt object

Overview of REDCapCAST meta data for shiny

Description

Overview of REDCapCAST meta data for shiny

Usage

cast_meta_overview(data)

Arguments

data

list with class 'REDCapCAST'

Value

gt object

Simple function to generate REDCap choices from character vector

Description

Simple function to generate REDCap choices from character vector

Usage

char2choice(data, char.split = "/", raw = NULL, .default = NA)

Arguments

data

vector

char.split

splitting character(s)

raw

specific values. Can be used for options of same length.

.default

default value for missing. Default is NA.

Value

vector

Examples

char2choice(c("yes/no","  yep. / nope  ","",NA,"what"),.default=NA)
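
REDCap stores choices as "raw, label" pairs separated by "|". A minimal sketch of building such a string from slash-separated text, assuming sequential integers as raw values; `choices_sketch()` is a hypothetical helper and the package may treat edge cases (such as single values) differently.

```r
# Illustrative helper (not part of REDCapCAST)
choices_sketch <- function(data, char.split = "/") {
  vapply(data, function(x) {
    # Missing or empty input yields a missing choice string
    if (is.na(x) || !nzchar(trimws(x))) {
      return(NA_character_)
    }
    parts <- trimws(strsplit(x, char.split, fixed = TRUE)[[1]])
    paste(paste0(seq_along(parts), ", ", parts), collapse = " | ")
  }, character(1), USE.NAMES = FALSE)
}

choices_sketch(c("yes/no", "  yep. / nope  ", "", NA))
```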

Simple function to generate REDCap branching logic from character vector

Description

Simple function to generate REDCap branching logic from character vector

Usage

char2cond(
  data,
  minor.split = ",",
  major.split = ";",
  major.sep = " or ",
  .default = NA
)

Arguments

data

vector

minor.split

minor split

major.split

major split

major.sep

argument separation. Default is " or ".

.default

default value for missing. Default is NA.

Value

vector

Examples

#data <- dd_inst$betingelse
#c("Extubation_novent, 2; Pacu_delay, 1") |> char2cond()

Very simple function to remove rich text formatting from field label and save the first paragraph ('<p>...</p>').

Description

Very simple function to remove rich text formatting from field label and save the first paragraph ('<p>...</p>').

Usage

clean_field_label(data)

Arguments

data

field label

Value

character vector

Examples

clean_field_label("<div class=\"rich-text-field-label\"><p>Fazekas score</p></div>")

clean_redcap_name

Description

Stepwise removal of non-alphanumeric characters and trailing white space, substitution of spaces with underscores, and conversion to lower case. Tries to make up for different naming conventions.

Usage

clean_redcap_name(x)

Arguments

x

vector or data frame for cleaning

Value

vector or data frame, same format as input

Examples

"Research!, ne:ws? and c;l-.ls" |> clean_redcap_name()

Compacting a vector of any length with or without names

Description

Compacting a vector of any length with or without names

Usage

compact_vec(data, nm.sep = ": ", val.sep = "; ")

Arguments

data

vector, optionally named

nm.sep

string separating name from value if any

val.sep

string separating values

Value

character string

Examples

sample(seq_len(4), 20, TRUE) |>
  as_factor() |>
  named_levels() |>
  sort() |>
  compact_vec()
1:6 |> compact_vec()
"test" |> compact_vec()
sample(letters[1:9], 20, TRUE) |> compact_vec()
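
The compacting step amounts to pasting names and values together. A minimal sketch, assuming unnamed vectors are joined by `val.sep` alone; `compact_sketch()` is a hypothetical helper and the exact output format of `compact_vec()` may differ.

```r
# Illustrative helper (not part of REDCapCAST)
compact_sketch <- function(data, nm.sep = ": ", val.sep = "; ") {
  if (is.null(names(data))) {
    return(paste(data, collapse = val.sep))
  }
  paste(paste0(names(data), nm.sep, data), collapse = val.sep)
}

compact_sketch(c(a = 1, b = 2))
compact_sketch(1:3)
```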

Create two-column HTML table for data piping in REDCap instruments

Description

Create two-column HTML table for data piping in REDCap instruments

Usage

create_html_table(text, variable)

Arguments

text

descriptive text

variable

variable to pipe

Value

character vector

Examples

create_html_table(text = "Patient ID", variable = c("[cpr]"))
create_html_table(text = paste("assessor", 1:2, sep = "_"), variable = c("[cpr]"))
# create_html_table(text = c("CPR nummer","Word"), variable = c("[cpr][1]", "[cpr][2]", "[test]"))

DEPRECATED Create zip file with necessary content based on data set

Description

Metadata can be added by editing the data dictionary of a project in the initial design phase. If you want to later add new instruments, this function can be used to create (an) instrument(s) to add to a project in production.

Usage

create_instrument_meta(data, dir = here::here(""), record.id = TRUE)

Arguments

data

metadata for the relevant instrument. Could be from 'ds2dd_detailed()'

dir

destination dir for the instrument zip. Default is the current WD.

record.id

flag to omit the first row of the data dictionary assuming this is the record_id field which should not be included in the instrument. Default is TRUE.

Value

list

Examples

## Not run: 
data <- iris |>
  ds2dd_detailed(
    add.auto.id = TRUE,
    form.name = sample(c("b", "c"),
      size = 6,
      replace = TRUE, prob = rep(.5, 2)
    )
  ) |>
  purrr::pluck("meta")
# data |> create_instrument_meta()

# Carry form names as a column-name prefix, then build the instrument
data <- iris |>
  setNames(glue::glue("{sample(x = c('a','b'),size = ncol(iris),
replace=TRUE,prob = rep(x=.5,2))}__{names(iris)}")) |>
  ds2dd_detailed(form.sep = "__")
data |>
  purrr::pluck("meta") |>
  create_instrument_meta(record.id = FALSE)

## End(Not run)

Cut string to desired length

Description

Cut string to desired length

Usage

cut_string_length(data, l = 100)

Arguments

data

data

l

length

Value

character string of length l

Examples

"length" |> cut_string_length(l=3)

Convert single digits to words

Description

Convert single digits to words

Usage

d2w(x, lang = "en", neutrum = FALSE, everything = FALSE)

Arguments

x

data. Handles vectors, data.frames and lists.

lang

language. Danish ("da") or English ("en"). Default is "en".

neutrum

for numbers depending on counted word

everything

flag to also split numbers >9 to single digits

Value

returns characters in same format as input

Examples

d2w(c(2:8, 21))
d2w(data.frame(2:7, 3:8, 1), lang = "da", neutrum = TRUE)

## If everything=T, also larger numbers are reduced.
## Elements in the list are same length as input
d2w(list(2:8, c(2, 6, 4, 23), 2), everything = TRUE)

Doc table to data dictionary - EARLY, DOCS MISSING

Description

Works well with 'project.aid::docx2list()'. Allows defining a database in a text document (see provided template) for easier database creation. This approach also makes collaboration easier when defining the database. The generic case is a data frame with variable names as values in a column. This format resembles the REDCap data dictionary, but gives a few options for formatting.

Usage

doc2dd(
  data,
  instrument.name,
  col.variables = 1,
  list.datetime.format = list(date_dmy = "_dat[eo]$", time_hh_mm_ss = "_ti[md]e?$"),
  col.description = NULL,
  col.condition = NULL,
  col.subheader = NULL,
  subheader.tag = "h2",
  condition.minor.sep = ",",
  condition.major.sep = ";",
  col.calculation = NULL,
  col.choices = NULL,
  choices.char.sep = "/",
  missing.default = NA
)

Arguments

data

tibble or data.frame with all variable names in one column

instrument.name

character vector length one. Instrument name.

col.variables

variable names column (default = 1), allows dplyr subsetting

list.datetime.format

formatting for date/time detection. See 'case_match_regex_list()'

col.description

descriptions column, allows dplyr subsetting. If empty, variable names will be used.

col.condition

conditions for branching column, allows dplyr subsetting. See 'char2cond()'.

col.subheader

sub-header column, allows dplyr subsetting. See 'format_subheader()'.

subheader.tag

formatting tag. Default is "h2"

condition.minor.sep

condition split minor. See 'char2cond()'. Default is ",".

condition.major.sep

condition split major. See 'char2cond()'. Default is ";".

col.calculation

calculations column. Has to be written exact. Character vector.

col.choices

choices column. See 'char2choice()'.

choices.char.sep

choices split. See 'char2choice()'. Default is "/".

missing.default

value for missing fields. Default is NA.

Value

tibble or data.frame (same as data)

Examples

# data <- dd_inst
# data |> doc2dd(instrument.name = "evt",
# col.description = 3,
# col.condition = 4,
# col.subheader = 2,
# col.calculation = 5,
# col.choices = 6)

(DEPRECATED) Data set to data dictionary function

Description

Creates a very basic data dictionary skeleton. Please see 'ds2dd_detailed()' for a more advanced function.

Usage

ds2dd(
  ds,
  record.id = "record_id",
  form.name = "basis",
  field.type = "text",
  field.label = NULL,
  include.column.names = FALSE,
  metadata = names(REDCapCAST::redcapcast_meta)
)

Arguments

ds

data set

record.id

name or column number of id variable, moved to first row of data dictionary; character or integer. Default is "record_id".

form.name

vector of form names, character string, length 1 or length equal to number of variables. Default is "basis".

field.type

vector of field types, character string, length 1 or length equal to number of variables. Default is "text".

field.label

vector of field labels, character string, length 1 or length equal to number of variables. Default is NULL and is then identical to field names.

include.column.names

Flag to give detailed output including new column names for original data set for upload.

metadata

Metadata column names. Default is the included names(REDCapCAST::redcapcast_meta).

Details

Migrated from stRoke ds2dd(). Fits better with the functionality of 'REDCapCAST'.

Value

data.frame or list of data.frame and vector

Examples

redcapcast_data$record_id <- seq_len(nrow(redcapcast_data))
ds2dd(redcapcast_data, include.column.names = TRUE)

Extract data from stata file for data dictionary

Description

Extract data from stata file for data dictionary

Usage

ds2dd_detailed(
  data,
  add.auto.id = FALSE,
  date.format = "dmy",
  form.name = NULL,
  form.sep = NULL,
  form.prefix = TRUE,
  field.type = NULL,
  field.label = NULL,
  field.label.attr = "label",
  field.validation = NULL,
  metadata = names(REDCapCAST::redcapcast_meta),
  convert.logicals = FALSE
)

Arguments

data

data frame

add.auto.id

flag to add id column

date.format

date format, character string. One of "ymd", "dmy" or "mdy". Default is "dmy".

form.name

manually specify form name(s). Vector of length 1 or ncol(data). Default is NULL and "data" is used.

form.sep

If the supplied dataset has form names as suffix or prefix to the column/variable names, the separator can be specified. If supplied, form.name is ignored. Default is NULL.

form.prefix

Flag to set if form is prefix (TRUE) or suffix (FALSE) to the column names. Assumes all columns have pre- or suffix if specified.

field.type

manually specify field type(s). Vector of length 1 or ncol(data). Default is NULL and "text" is used for everything but factors, which will get "radio".

field.label

manually specify field label(s). Vector of length 1 or ncol(data). Default is NULL and colnames(data) is used or attribute 'field.label.attr' for haven_labelled data set (imported .dta file with 'haven::read_dta()').

field.label.attr

attribute name for named labels for haven_labelled data set (imported .dta file with 'haven::read_dta()'). Default is "label".

field.validation

manually specify field validation(s). Vector of length 1 or ncol(data). Default is NULL and 'levels()' are used for factors or attribute 'factor.labels.attr' for haven_labelled data set (imported .dta file with 'haven::read_dta()').

metadata

redcap metadata headings. Default is names(REDCapCAST::redcapcast_meta).

convert.logicals

convert logicals to factor. Default is FALSE.

Details

This function is a natural development of the ds2dd() function. It assumes that the first column is the ID column and performs no checks. Please always inspect the data dictionary before upload.

Ensure that the data set is formatted with as much information as possible.

'field.type' can be supplied

Value

list of length 2

Examples

## Basic parsing with default options
requireNamespace("REDCapCAST")
redcapcast_data |>
  dplyr::select(-dplyr::starts_with("redcap_")) |>
  ds2dd_detailed()

## Adding a record_id field
iris |> ds2dd_detailed(add.auto.id = TRUE)

## Passing form name information to function
iris |>
  ds2dd_detailed(
    add.auto.id = TRUE,
    form.name = sample(c("b", "c"), size = 6, replace = TRUE, prob = rep(.5, 2))
  ) |>
  purrr::pluck("meta")
mtcars |>
  dplyr::mutate(unknown = NA) |>
  numchar2fct() |>
  ds2dd_detailed(add.auto.id = TRUE)

## Using column name prefix to carry form name
data <- iris |>
  ds2dd_detailed(add.auto.id = TRUE) |>
  purrr::pluck("data")
names(data) <- glue::glue("{sample(x = c('a','b'),size = length(names(data)),
replace=TRUE,prob = rep(x=.5,2))}__{names(data)}")
data |> ds2dd_detailed(form.sep = "__")
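
To make the casting step concrete, the following base R sketch builds a bare data dictionary skeleton from a data frame, using "radio" for factors and "text" otherwise as described under field.type. The helper `dd_sketch()` is hypothetical and far simpler than `ds2dd_detailed()`; the column names follow the standard REDCap data dictionary headings.

```r
# Illustrative helper (not part of REDCapCAST)
dd_sketch <- function(data, form.name = "data") {
  is_fct <- vapply(data, is.factor, logical(1))
  # Factors get REDCap-style choices: "1, level | 2, level | ..."
  choices <- vapply(data, function(col) {
    if (!is.factor(col)) {
      return(NA_character_)
    }
    lv <- levels(col)
    paste(paste0(seq_along(lv), ", ", lv), collapse = " | ")
  }, character(1))
  data.frame(
    field_name = tolower(gsub("[^[:alnum:]]+", "_", names(data))),
    form_name = form.name,
    field_type = ifelse(is_fct, "radio", "text"),
    field_label = names(data),
    select_choices_or_calculations = choices,
    row.names = NULL
  )
}

dd <- dd_sketch(iris)
dd
```

As with the package function, the resulting dictionary should be inspected before any upload.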

Secure API key storage and data acquisition in one

Description

Secure API key storage and data acquisition in one

Usage

easy_redcap(
  project.name,
  uri,
  raw_or_label = "both",
  data_format = c("wide", "list", "redcap", "long"),
  widen.data = NULL,
  ...
)

Arguments

project.name

The name of the current project (for key storage with key_set, using the default keyring)

uri

REDCap database API uri

raw_or_label

argument passed on to read_redcap_tables. Default is "both" to get labelled data.

data_format

Choose the data

widen.data

argument to widen the exported data. [DEPRECATED], use 'data_format' instead.

...

arguments passed on to read_redcap_tables.

Value

data.frame or list depending on widen.data

Examples

## Not run: 
easy_redcap("My_new_project", fields = c("record_id", "age", "hypertension"))

## End(Not run)

Creates zip-file with necessary content to manually add instrument to database

Description

Usage

export_redcap_instrument(data, file, force = FALSE, record.id = "record_id")

Arguments

data

metadata for the relevant instrument. Could be from 'ds2dd_detailed()'

file

destination file name.

force

force instrument creation and ignore different form names by just using the first.

record.id

record id variable name. Default is 'record_id'.

Value

exports zip-file

Examples

# iris |>
#  ds2dd_detailed(
#    add.auto.id = TRUE,
#    form.name = sample(c("b", "c"), size = 6, replace = TRUE, prob = rep(.5, 2))
#  ) |>
#  purrr::pluck("meta") |>
#  (\(.x){
#  split(.x, .x$form_name)
#  })() |>
#  purrr::imap(function(.x, .i){
#  export_redcap_instrument(.x,file=here::here(paste0(.i,Sys.Date(),".zip")))
#  })

# iris |>
#  ds2dd_detailed(
#    add.auto.id = TRUE
#  ) |>
#  purrr::pluck("meta") |>
#  export_redcap_instrument(file=here::here(paste0("instrument",Sys.Date(),".zip")))

Allows conversion of factor to numeric values preserving original levels

Description

Allows conversion of factor to numeric values preserving original levels

Usage

fct2num(data)

Arguments

data

vector

Value

numeric vector

Examples

c(1, 4, 3, "A", 7, 8, 1) |>
  as_factor() |>
  fct2num()

structure(c(1, 2, 3, 2, 10, 9),
  labels = c(Unknown = 9, Refused = 10),
  class = "haven_labelled"
) |>
  as_factor() |>
  fct2num()

structure(c(1, 2, 3, 2, 10, 9),
  labels = c(Unknown = 9, Refused = 10),
  class = "labelled"
) |>
  as_factor() |>
  fct2num()

structure(c(1, 2, 3, 2, 10, 9),
  labels = c(Unknown = 9, Refused = 10)
) |>
  as_factor() |>
  fct2num()
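
For background, the base R pitfall that motivates this kind of helper: calling `as.numeric()` on a factor returns the internal integer codes, not the values shown as levels. The snippet below only demonstrates that general behaviour in base R; it does not reproduce how `fct2num()` handles labelled or non-numeric levels.

```r
f <- factor(c(1, 4, 3, 7, 8, 1))

# Internal integer codes (positions in the sorted levels)
codes <- as.numeric(f)

# Original values, recovered by going through character
values <- as.numeric(as.character(f))

codes
values
```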

Drop unused levels preserving label data

Description

This extends 'forcats::fct_drop()' to natively work across a data.frame and replaces 'base::droplevels()'.

Usage

fct_drop(x, ...)

## S3 method for class 'data.frame'
fct_drop(x, ...)

## S3 method for class 'factor'
fct_drop(x, ...)

Arguments

x

Factor to drop unused levels

...

Other arguments passed down to method.

Examples

mtcars |>
  numchar2fct() |>
  fct_drop()
mtcars |>
  numchar2fct() |>
  dplyr::mutate(vs = fct_drop(vs))

DEPRECATED Helper to import files correctly

Description

DEPRECATED Helper to import files correctly

Usage

file_extension(filenames)

Arguments

filenames

file names

Value

character vector

Examples

file_extension(list.files(here::here(""))[[2]])[[1]]
file_extension(c("file.cd..ks", "file"))

focused_metadata

Description

Extracts limited metadata for variables in a dataset

Usage

focused_metadata(metadata, vars_in_data)

Arguments

metadata

A dataframe containing metadata

vars_in_data

Vector of variable names in the dataset

Value

A dataframe containing metadata for the variables in the dataset

Converts REDCap choices to factor levels and stores in labels attribute

Description

Applying as_factor to the data.frame or variable, will coerce to a factor.

Usage

format_redcap_factor(data, meta)

Arguments

data

vector

meta

vector of REDCap choices

Value

vector of class "labelled" with a "labels" attribute

Examples

format_redcap_factor(sample(1:3, 20, TRUE), "1, First. | 2, second | 3, THIRD")
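
The choices string follows the REDCap convention "raw, label | raw, label". A minimal base R sketch of parsing it into a named vector, assuming the first comma in each item separates raw value from label; `parse_choices_sketch()` is a hypothetical helper, not the package code.

```r
# Illustrative helper (not part of REDCapCAST)
parse_choices_sketch <- function(meta) {
  items <- trimws(strsplit(meta, "|", fixed = TRUE)[[1]])
  # Everything before the first comma is the raw value
  values <- trimws(sub(",.*$", "", items))
  # Everything after the first comma is the label (may itself contain commas)
  labels <- trimws(sub("^[^,]*,", "", items))
  stats::setNames(values, labels)
}

lab <- parse_choices_sketch("1, First. | 2, second | 3, THIRD")
lab
```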

Sub-header formatting wrapper

Description

Sub-header formatting wrapper

Usage

format_subheader(data, tag = "h2")

Arguments

data

character vector

tag

character vector length 1

Value

character vector

Examples

"Instrument header" |> format_subheader()

Retrieve project API key if stored, if not, set and retrieve

Description

Attempts to make secure API key storage so simple that no other way makes sense. Wraps key_get and key_set, using key_list to check whether the key is already in storage.

Usage

get_api_key(key.name, ...)

Arguments

key.name

character vector of key name

...

passed to key_set

Value

character vector
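
The check-then-set pattern described above could look roughly like this with the 'keyring' package. `get_key_sketch()` is a hypothetical helper, shown only to illustrate the flow; `keyring::key_set()` prompts interactively, so the call itself is left commented out.

```r
# Illustrative helper (not part of REDCapCAST); requires the 'keyring'
# package and a working system keyring when actually called
get_key_sketch <- function(key.name, ...) {
  # key_list() returns a data.frame with a 'service' column
  if (!key.name %in% keyring::key_list()$service) {
    # Prompts the user for the secret and stores it
    keyring::key_set(service = key.name, ...)
  }
  keyring::key_get(service = key.name)
}

# token <- get_key_sketch("My_new_project")
```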

Extract attribute. Returns NA if none

Description

Extract attribute. Returns NA if none

Usage

get_attr(data, attr = NULL)

Arguments

data

vector

attr

attribute name

Value

character vector

Examples

attr(mtcars$mpg, "label") <- "testing"
do.call(c, sapply(mtcars, get_attr))
## Not run: 
mtcars |>
  numchar2fct(numeric.threshold = 6) |>
  ds2dd_detailed()

## End(Not run)

Get the id name

Description

Get the id name

Usage

get_id_name(data)

Arguments

data

data frame or list

Value

character vector

Guess time variables based on naming pattern

Description

This is for repairing data with time variables with appended "1970-01-01"

Usage

guess_time_only(
  data,
  validate.time = FALSE,
  time.var.sel.pos = "[Tt]i[d(me)]",
  time.var.sel.neg = "[Dd]at[eo]"
)

Arguments

data

data.frame or tibble

validate.time

Flag to validate guessed time columns

time.var.sel.pos

Positive selection regex string passed to 'guess_time_only_filter()' as sel.pos.

time.var.sel.neg

Negative selection regex string passed to 'guess_time_only_filter()' as sel.neg.

Value

data.frame or tibble

Examples

redcapcast_data |> guess_time_only(validate.time = TRUE)

Try at determining which are true time only variables

Description

This is just a try at guessing data type based on data class and column names hoping for a tiny bit of naming consistency. R does not include a time-only data format natively, so the "hms" class from 'readr' is used. This has to be converted to character class before REDCap upload.

Usage

guess_time_only_filter(
  data,
  validate = FALSE,
  sel.pos = "[Tt]i[d(me)]",
  sel.neg = "[Dd]at[eo]"
)

Arguments

data

data set

validate

flag to output validation data. Will output list.

sel.pos

Positive selection regex string

sel.neg

Negative selection regex string

Value

character vector or list depending on 'validate' flag.

Examples

data <- redcapcast_data
data |> guess_time_only_filter()
data |>
  guess_time_only_filter(validate = TRUE) |>
  lapply(head)
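
The name-based part of the guess can be sketched with the default regex strings from the usage above. The column names are made up for illustration, and the sketch ignores the data-class check that the description also mentions.

```r
# Hypothetical column names
nms <- c("admission_date", "admission_time", "tid_start", "dato")

pos <- grepl("[Tt]i[d(me)]", nms) # looks like a time variable
neg <- grepl("[Dd]at[eo]", nms) # looks like a date variable

time_only <- nms[pos & !neg]
time_only
```

Note that `[d(me)]` is a character class matching a single character, so the positive pattern matches "tid", "tim" and "tie", among others.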

Finish incomplete haven attributes substituting missings with values

Description

Finish incomplete haven attributes substituting missings with values

Usage

haven_all_levels(data)

Arguments

data

haven labelled variable

Value

named vector

Examples

ds <- structure(c(1, 2, 3, 2, 10, 9),
  labels = c(Unknown = 9, Refused = 10),
  class = "haven_labelled"
)
haven::is.labelled(ds)
attributes(ds)
ds |> haven_all_levels()

Change "hms" to "character" for REDCap upload.

Description

Change "hms" to "character" for REDCap upload.

Usage

hms2character(data)

Arguments

data

data set

Value

data.frame or tibble

Examples

data <- redcapcast_data
## data |> time_only_correction() |> hms2character()

Simple html tag wrapping for REDCap text formatting

Description

Simple html tag wrapping for REDCap text formatting

Usage

html_tag_wrap(data, tag = "h2", extra = NULL)

Arguments

data

character vector

tag

character vector length 1

extra

character vector

Value

character vector

Examples

html_tag_wrap("Titel", tag = "div", extra = 'class="rich-text-field-label"')
html_tag_wrap("Titel", tag = "h2")
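
Tag wrapping of this kind reduces to string pasting. A minimal sketch, assuming `extra` is inserted verbatim into the opening tag; `tag_wrap_sketch()` is a hypothetical helper and the package output may differ in detail.

```r
# Illustrative helper (not part of REDCapCAST)
tag_wrap_sketch <- function(data, tag = "h2", extra = NULL) {
  open <- if (is.null(extra)) {
    paste0("<", tag, ">")
  } else {
    paste0("<", tag, " ", extra, ">")
  }
  paste0(open, data, "</", tag, ">")
}

tag_wrap_sketch("Titel", tag = "h2")
tag_wrap_sketch("Titel", tag = "div", extra = 'class="rich-text-field-label"')
```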

Tests for multiple label classes

Description

Tests for multiple label classes

Usage

is.labelled(x, classes = c("haven_labelled", "labelled"))

Arguments

x

data

classes

classes to test

Value

logical

Examples

structure(c(1, 2, 3, 2, 10, 9),
  labels = c(Unknown = 9, Refused = 10),
  class = "haven_labelled"
) |> is.labelled()

Multi missing check

Description

Multi missing check

Usage

is_missing(data, nas = c("", "NA"))

Arguments

data

character vector

nas

character vector of strings considered as NA

Value

logical vector
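
A minimal sketch of such a check, assuming an element counts as missing when it is NA or equals one of the strings in `nas`; `is_missing_sketch()` is a hypothetical helper, not the package code.

```r
# Illustrative helper (not part of REDCapCAST)
is_missing_sketch <- function(data, nas = c("", "NA")) {
  is.na(data) | data %in% nas
}

is_missing_sketch(c("a", "", NA, "NA"))
```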

Test if repeatable or longitudinal

Description

Test if repeatable or longitudinal

Usage

is_repeated_longitudinal(
  data,
  generics = c("redcap_event_name", "redcap_repeat_instrument", "redcap_repeat_instance")
)

Arguments

data

data set

generics

default is "redcap_event_name", "redcap_repeat_instrument" and "redcap_repeat_instance"

Value

logical

Examples

is_repeated_longitudinal(c("record_id", "age", "record_id", "gender"))
is_repeated_longitudinal(redcapcast_data)
is_repeated_longitudinal(list(redcapcast_data))

Completion marking based on completed upload

Description

Completion marking based on completed upload

Usage

mark_complete(upload, ls)

Arguments

upload

output list from 'REDCapR::redcap_write()'

ls

output list from 'ds2dd_detailed()'

Value

list with 'REDCapR::redcap_write()' results

Match fields to forms

Description

Match fields to forms

Usage

match_fields_to_form(metadata, vars_in_data)

Arguments

metadata

A data frame containing field names and form names

vars_in_data

A character vector of variable names

Value

A data frame containing field names and form names

mtcars dataset slightly modified to use for Shiny app upload demonstration

Description

mtcars dataset slightly modified to use for Shiny app upload demonstration

Usage

data(mtcars_redcap)

Format

A data frame with 13 variables:

record_id: ID, numeric
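mpg: Miles per (US) gallon, numeric
cyl: Number of cylinders, numeric
disp: Displacement (cu. in.), numeric
hp: Gross horsepower, numeric
drat: Rear axle ratio, numeric
wt: Weight (1000 lbs), numeric
qsec: 1/4 mile time, numeric
vs: Engine shape (V-shaped or straight), numeric
am: Transmission (automatic or manual), numeric
gear: Number of forward gears, numeric
carb: Number of carburetors, numeric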
name: original rownames, character

Get named vector of factor levels and values

Description

Get named vector of factor levels and values

Usage

named_levels(
  data,
  label = "labels",
  na.label = NULL,
  na.value = 99,
  sort.numeric = TRUE
)

Arguments

data

factor

label

character string of attribute with named vector of factor labels

na.label

character string to refactor NA values. Default is NULL.

na.value

new value for NA strings. Ignored if na.label is NULL. Default is 99.

sort.numeric

sort factor levels if levels are numeric. Default is TRUE

Value

named vector

Examples

structure(c(1, 2, 3, 2, 10, 9),
  labels = c(Unknown = 9, Refused = 10),
  class = "haven_labelled"
) |>
  as_factor() |>
  named_levels()
structure(c(1, 2, 3, 2, 10, 9),
  labels = c(Unknown = 9, Refused = 10),
  class = "labelled"
) |>
  as_factor() |>
  named_levels()

Nav_bar defining function for shiny ui

Description

Nav_bar defining function for shiny ui

Usage

nav_bar_page()

Value

shiny object

Applying var2fct across data set

Description

Individual thresholds for character and numeric columns

Usage

numchar2fct(data, numeric.threshold = 6, character.throshold = 6)

Arguments

data

dataset. data.frame or tibble

numeric.threshold

threshold for var2fct for numeric columns. Default is 6.

character.throshold

threshold for var2fct for character columns. Default is 6.

Value

data.frame or tibble

Examples

mtcars |> str()
## Not run: 
mtcars |>
  numchar2fct(numeric.threshold = 6) |>
  str()

## End(Not run)

Helper to auto-parse un-formatted data with haven and readr

Description

Helper to auto-parse un-formatted data with haven and readr

Usage

parse_data(
  data,
  guess_type = TRUE,
  col_types = NULL,
  locale = readr::default_locale(),
  ignore.vars = "cpr",
  ...
)

Arguments

data

data.frame or tibble

guess_type

logical to guess type with readr

col_types

specify col_types using readr semantics. Ignored if guess_type is TRUE

locale

option to specify locale. Defaults to readr::default_locale().

ignore.vars

specify column names of columns to ignore when parsing

...

ignored

Value

data.frame or tibble

Examples

mtcars |>
  parse_data() |>
  str()

Tests if vector can be interpreted as numeric without introducing NAs by coercion

Description

Tests if vector can be interpreted as numeric without introducing NAs by coercion

Usage

possibly_numeric(data)

Arguments

data

vector

Value

logical

Examples

c("1","5") |> possibly_numeric()
c("1","5","e") |> possibly_numeric()
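
The idea of "without introducing NAs by coercion" can be sketched in base R: coerce the non-missing values and check that none of them became NA. How the package treats values that are already NA is an assumption here, and possibly_numeric_sketch is a hypothetical name.

```r
# Hypothetical stand-in for illustration
possibly_numeric_sketch <- function(data) {
  # Only values that are present can be turned into new NAs by coercion
  present <- data[!is.na(data)]
  converted <- suppressWarnings(as.numeric(present))
  !anyNA(converted)
}

ok <- possibly_numeric_sketch(c("1", "5"))
not_ok <- possibly_numeric_sketch(c("1", "5", "e"))
ok
not_ok
```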

Test if vector can be interpreted as roman numerals

Description

Test if vector can be interpreted as roman numerals

Usage

possibly_roman(data)

Arguments

data

character vector

Value

logical

Examples

sample(1:100, 10) |>
  as.roman() |>
  possibly_roman()
sample(c(TRUE, FALSE), 10, TRUE) |> possibly_roman()
rep(NA, 10) |> possibly_roman()

User input processing

Description

User input processing

Usage

process_user_input(x)

Arguments

x

input

Value

processed input

User input processing character

Description

User input processing character

Usage

## S3 method for class 'character'
process_user_input(x, ...)

Arguments

x

input

...

ignored

Value

processed input

User input processing data.frame

Description

User input processing data.frame

Usage

## S3 method for class 'data.frame'
process_user_input(x, ...)

Arguments

x

input

...

ignored

Value

processed input

User input processing default

Description

User input processing default

Usage

## Default S3 method:
process_user_input(x, ...)

Arguments

x

input

...

ignored

Value

processed input

User input processing response

Description

User input processing response

Usage

## S3 method for class 'response'
process_user_input(x, ...)

Arguments

x

input

...

ignored

Value

processed input

Flexible file import based on extension

Description

Flexible file import based on extension

Usage

read_input(file, consider.na = c("NA", "\"\"", ""))

Arguments

file

file name

consider.na

character vector of strings to consider as NAs

Value

tibble

Examples

read_input("https://raw.githubusercontent.com/agdamsbo/cognitive.index.lookup/main/data/sample.csv")
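
Import by extension is typically a dispatch on the file suffix. The sketch below uses only base R readers and a hypothetical helper, read_by_extension; the set of formats handled by read_input itself is not listed in this section, so the three shown here are assumptions.

```r
# Hypothetical dispatcher: pick a reader from the file extension
read_by_extension <- function(file, consider.na = c("NA", "")) {
  ext <- tolower(tools::file_ext(file))
  switch(ext,
    csv = utils::read.csv(file, na.strings = consider.na),
    tsv = utils::read.delim(file, na.strings = consider.na),
    rds = readRDS(file),
    stop("Unsupported file extension: ", ext)
  )
}

# Round trip through a temporary csv file
tmp <- tempfile(fileext = ".csv")
utils::write.csv(data.frame(id = 1:2, sex = c("male", NA)), tmp, row.names = FALSE)
imported <- read_by_extension(tmp)
imported
```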

Convenience function to download complete instrument, using token storage in keyring.

Description

Convenience function to download complete instrument, using token storage in keyring.

Usage

read_redcap_instrument(
  key,
  uri,
  instrument,
  raw_or_label = "raw",
  id_name = "record_id",
  records = NULL
)

Arguments

key

key name in standard keyring for token retrieval.

uri

REDCap database API uri

instrument

instrument name

raw_or_label

raw or label passed to 'REDCapR::redcap_read()'

id_name

id variable name. Default is "record_id".

records

specify the records to download. Index numbers. Numeric vector.

Value

data.frame

Download REDCap data

Description

Implementation of REDCap_split with a focused data acquisition approach using REDCapR::redcap_read, downloading only the specified fields, forms and/or events using the built-in focused_metadata, including some clean-up. Works with classical and longitudinal projects, with or without repeating instruments. Preserves metadata in the data.frames as labels.

Usage

read_redcap_tables(
  uri,
  token,
  records = NULL,
  fields = NULL,
  events = NULL,
  forms = NULL,
  raw_or_label = c("raw", "label", "both"),
  split_forms = c("all", "repeating", "none"),
  ...
)

Arguments

uri

REDCap database API uri

token

API token

records

records to download

fields

fields to download

events

events to download

forms

forms to download

raw_or_label

raw or label tags. Can be "raw", "label" or "both".

* "raw": Standard redcap_read method to get raw values.
* "label": Standard redcap_read method to get label values.
* "both": Get raw values with REDCap labels applied as labels. Use as_factor to format factors with original labels and use the 'gtsummary' package functions like tbl_summary to easily get beautiful tables with original labels from REDCap. Use fct_drop to drop empty levels.

split_forms

Whether to split "repeating" or "all" forms. Default is "all". Give "none" to export the native semi-long REDCap format.

...

passed on to redcap_read

Value

list of instruments

Examples

# Examples will be provided later
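
The "both" option of raw_or_label keeps raw values and attaches the REDCap labels to them. The base R sketch below illustrates that idea on made-up data; the attribute name "labels" and the coding of sex are assumptions, and the package's own as_factor is not used.

```r
# Raw codes as exported, with a hypothetical code-to-label mapping attached
sex_raw <- c(1, 2, 2, 1)
attr(sex_raw, "labels") <- c(Female = 1, Male = 2)

# Turn the raw codes into a factor using the stored labels
lab <- attr(sex_raw, "labels")
sex_fct <- factor(sex_raw, levels = unname(lab), labels = names(lab))
sex_fct
table(sex_fct)
```

Keeping the raw codes and the labels together means one download serves both analysis (codes) and reporting (labels).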

Transforms list of REDCap data.frames to a single wide data.frame

Description

Converts a list of REDCap data.frames from long to wide format. In essence it is a wrapper for the pivot_wider function applied to REDCap output, either from read_redcap_tables or split manually with REDCap_split.

Usage

redcap_wider(
  data,
  event.glue = "{.value}____{redcap_event_name}",
  inst.glue = "{.value}____{redcap_repeat_instance}"
)

Arguments

data

A list of data frames

event.glue

A glue string for repeated events naming

inst.glue

A glue string for repeated instruments naming

Value

data.frame in wide format

Examples

# Longitudinal
list1 <- list(
  data.frame(
    record_id = c(1, 2, 1, 2),
    redcap_event_name = c("baseline", "baseline", "followup", "followup"),
    age = c(25, 26, 27, 28)
  ),
  data.frame(
    record_id = c(1, 2),
    redcap_event_name = c("baseline", "baseline"),
    gender = c("male", "female")
  )
)
redcap_wider(list1)
# Simple with two instruments
list2 <- list(
  data.frame(
    record_id = c(1, 2),
    age = c(25, 26)
  ),
  data.frame(
    record_id = c(1, 2),
    gender = c("male", "female")
  )
)
redcap_wider(list2)
# Simple with single instrument
list3 <- list(data.frame(
  record_id = c(1, 2),
  age = c(25, 26)
))
redcap_wider(list3)
# Longitudinal with repeatable instruments
list4 <- list(
  data.frame(
    record_id = c(1, 2, 1, 2),
    redcap_event_name = c("baseline", "baseline", "followup", "followup"),
    age = c(25, 26, 27, 28)
  ),
  data.frame(
    record_id = c(1, 1, 1, 1, 2, 2, 2, 2),
    redcap_event_name = c(
      "baseline", "baseline", "followup", "followup",
      "baseline", "baseline", "followup", "followup"
    ),
    redcap_repeat_instrument = "walk",
    redcap_repeat_instance = c(1, 2, 1, 2, 1, 2, 1, 2),
    dist = c(40, 32, 25, 33, 28, 24, 23, 36)
  ),
  data.frame(
    record_id = c(1, 2),
    redcap_event_name = c("baseline", "baseline"),
    gender = c("male", "female")
  )
)
redcap_wider(list4)

list5 <- list(
  data.frame(
    record_id = c(1, 2, 1, 2),
    redcap_event_name = c("baseline", "baseline", "followup", "followup")
  ),
  data.frame(
    record_id = c(1, 1, 1, 1, 2, 2, 2, 2),
    redcap_event_name = c(
      "baseline", "baseline", "followup", "followup",
      "baseline", "baseline", "followup", "followup"
    ),
    redcap_repeat_instrument = "walk",
    redcap_repeat_instance = c(1, 2, 1, 2, 1, 2, 1, 2),
    dist = c(40, 32, 25, 33, 28, 24, 23, 36)
  ),
  data.frame(
    record_id = c(1, 2),
    redcap_event_name = c("baseline", "baseline"),
    gender = c("male", "female")
  )
)
redcap_wider(list5)
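
The default glue strings append the event name (or repeat instance) to each variable name, separated by four underscores. The same naming pattern can be reproduced for a single long data frame with stats::reshape() from base R. This is only an illustration of the resulting column names, not how redcap_wider is implemented.

```r
long <- data.frame(
  record_id = c(1, 2, 1, 2),
  redcap_event_name = c("baseline", "baseline", "followup", "followup"),
  age = c(25, 26, 27, 28)
)

# One row per record; one column per variable and event, joined by "____"
wide <- stats::reshape(
  long,
  idvar = "record_id",
  timevar = "redcap_event_name",
  direction = "wide",
  sep = "____"
)
names(wide)
```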

Data set for demonstration

Description

This is a small dataset from a REDCap database for demonstration purposes. Contains only synthetic data.

Usage

data(redcapcast_data)

Format

A data frame with 27 variables:

record_id: ID, numeric
redcap_event_name: Event name, character
redcap_repeat_instrument: Repeat instrument, character
redcap_repeat_instance: Repeat instance, numeric
cpr: CPR number, character
inclusion: Inclusion date, Date
inclusion_time: Inclusion time, hms
dob: Date of birth, Date
age: Age decimal, numeric
age_integer: Age integer, numeric
sex: Legal sex, character
cohabitation: Cohabitation status, character
con_calc: con_calc
con_mrs: con_mrs
consensus_complete: consensus_complete
hypertension: Hypertension, character
diabetes: diabetes, character
region: region, character
baseline_data_start_complete: Completed, character
mrs_assessed: mRS Assessed, character
mrs_date: Assessment date, Date
mrs_score: Categorical score, numeric
mrs_complete: Complete, numeric
event_datetime: Event datetime, POSIXct
event_age: Age at time of event, numeric
event_type: Event type, character
new_event_complete: Completed, character

REDCap metadata from database

Description

This metadata dataset from a REDCap database is for demonstration purposes.

Usage

data(redcapcast_meta)

Format

A data frame with 18 variables:

field_name: field_name, character
form_name: form_name, character
section_header: section_header, character
field_type: field_type, character
field_label: field_label, character
select_choices_or_calculations: select_choices_or_calculations, character
field_note: field_note, character
text_validation_type_or_show_slider_number: text_validation_type_or_show_slider_number, character
text_validation_min: text_validation_min, character
text_validation_max: text_validation_max, character
identifier: identifier, character
branching_logic: branching_logic, character
required_field: required_field, character
custom_alignment: custom_alignment, character
question_number: question_number, character
matrix_group_name: matrix_group_name, character
matrix_ranking: matrix_ranking, character
field_annotation: field_annotation, character

Replace curly apostrophes and quotes from word

Description

Copied from textclean, which has not been updated since 2018 and is not on CRAN. GitHub: https://github.com/trinker/textclean

Usage

replace_curly_quote(x)

Arguments

x

character vector

Value

character vector

Sanitize list of data frames

Description

Removing empty rows

Usage

sanitize_split(
  l,
  generic.names = c("redcap_event_name", "redcap_repeat_instrument",
    "redcap_repeat_instance"),
  drop.complete = TRUE,
  drop.empty = TRUE
)

Arguments

l

A list of data frames.

generic.names

A vector of generic names to be excluded.

drop.complete

logical to remove generic REDCap variables indicating instrument completion. Default is TRUE.

drop.empty

logical to remove variables with only NAs. Default is TRUE.

Value

A list of data frames with generic names excluded.
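
As no example is given, the following base R sketch shows what removing empty rows from a list of split forms could look like: a row is dropped when every non-generic, non-ID column is NA. The helper drop_empty_rows and its id argument are hypothetical, and sanitize_split may do more than this.

```r
# Hypothetical helper: drop rows holding no data besides ID and generic fields
drop_empty_rows <- function(l,
                            generic.names = c(
                              "redcap_event_name", "redcap_repeat_instrument",
                              "redcap_repeat_instance"
                            ),
                            id = "record_id") {
  lapply(l, function(df) {
    content <- setdiff(names(df), c(id, generic.names))
    if (length(content) == 0) {
      return(df)
    }
    # Keep rows with at least one non-missing content value
    keep <- rowSums(!is.na(df[content])) > 0
    df[keep, , drop = FALSE]
  })
}

forms <- list(
  baseline = data.frame(
    record_id = c(1, 2), redcap_event_name = "baseline", age = c(25, NA)
  ),
  followup = data.frame(
    record_id = c(1, 2), redcap_event_name = "followup", mrs = c(NA, NA)
  )
)
cleaned <- drop_empty_rows(forms)
cleaned
```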

Set attributes for named attribute. Appends if attr is NULL

Description

Set attributes for named attribute. Appends if attr is NULL

Usage

set_attr(data, label, attr = NULL, overwrite = FALSE)

Arguments

data

vector

label

label

attr

attribute name

overwrite

overwrite existing attributes. Default is FALSE.

Value

vector with attribute

Launch the included Shiny-app for database casting and upload

Description

Wraps shiny::runApp()

Usage

shiny_cast(...)

Arguments

...

Arguments passed to shiny::runApp()

Value

shiny app

Examples

# shiny_cast()

Split a data frame into separate tables for each form

Description

Split a data frame into separate tables for each form

Usage

split_non_repeating_forms(table, universal_fields, fields)

Arguments

table

A data frame

universal_fields

A character vector of fields that should be included in every table

fields

A two-column matrix containing the names of fields that should be included in each form

Value

A list of data frames, one for each non-repeating form

Examples

# Create a table
table <- data.frame(
  id = c(1, 2, 3, 4, 5),
  form_a_name = c("John", "Alice", "Bob", "Eve", "Mallory"),
  form_a_age = c(25, 30, 25, 15, 20),
  form_b_name = c("John", "Alice", "Bob", "Eve", "Mallory"),
  form_b_gender = c("M", "F", "M", "F", "F")
)

# Create the universal fields
universal_fields <- c("id")

# Create the fields
fields <- matrix(
  c(
    "form_a_name", "form_a",
    "form_a_age", "form_a",
    "form_b_name", "form_b",
    "form_b_gender", "form_b"
  ),
  ncol = 2, byrow = TRUE
)

# Split the table
split_non_repeating_forms(table, universal_fields, fields)

Extended string splitting

Description

Can be used as a substitute for base strsplit(). Its main advantage is that it makes it easy to split before, after or around the defined delimiter while keeping the delimiter; see the example.

Usage

strsplitx(x, split, type = "classic", perl = FALSE, ...)

Arguments

x

data

split

delimiter

type

Split type. Can be c("classic", "before", "after", "around")

perl

perl param from strsplit()

...

additional parameters are passed to base strsplit handling splits

Value

list

Examples

test <- c("12 months follow-up", "3 steps", "mRS 6 weeks",
"Counting to 231 now")
strsplitx(test, "[0-9]", type = "around")
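
For comparison, splitting before or after a delimiter without dropping it can be done in base strsplit() with zero-width lookaround patterns and perl = TRUE. This sketch shows the underlying technique only; whether strsplitx uses the same patterns internally is an assumption.

```r
x <- "mRS 6 weeks"

# Split at the position just before a digit; the digit stays with the right part
before <- strsplit(x, "(?<=.)(?=[0-9])", perl = TRUE)[[1]]

# Split at the position just after a digit; the digit stays with the left part
after <- strsplit(x, "(?<=[0-9])", perl = TRUE)[[1]]

before
after
```

The lookaround patterns are easy to get wrong, which is the convenience a type argument offers over writing them by hand.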

Transfer variable name suffix to label in widened data

Description

Transfer variable name suffix to label in widened data

Usage

suffix2label(
  data,
  suffix.sep = "____",
  attr = "label",
  glue.str = "{label} ({paste(suffixes,collapse=', ')})"
)

Arguments

data

data.frame

suffix.sep

string to split suffix(es). Passed to strsplit

attr

label attribute. Default is "label"

glue.str

glue string for new label. Available variables are "label" and "suffixes"

Value

data.frame
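
The label transfer can be sketched in base R, assuming widened column names of the form variable____suffix and an existing "label" attribute on each column. The loop below mirrors the default glue.str with sprintf(); it only shows the label update and is not the package implementation.

```r
wide <- data.frame(
  record_id = 1:2,
  age____baseline = c(25, 26),
  age____followup = c(27, 28)
)
attr(wide$age____baseline, "label") <- "Age"
attr(wide$age____followup, "label") <- "Age"

for (nm in names(wide)) {
  parts <- strsplit(nm, "____", fixed = TRUE)[[1]]
  label <- attr(wide[[nm]], "label")
  # Only columns with both a suffix and an existing label are updated
  if (length(parts) > 1 && !is.null(label)) {
    suffixes <- parts[-1]
    attr(wide[[nm]], "label") <- sprintf(
      "%s (%s)", label, paste(suffixes, collapse = ", ")
    )
  }
}
attr(wide$age____followup, "label")
```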

Correction based on time_only_filter function

Description

Correction based on time_only_filter function

Usage

time_only_correction(data, ...)

Arguments

data

data set

...

arguments passed on to 'guess_time_only_filter()'

Value

tibble

Examples

data <- redcapcast_data
## data |> time_only_correction()

Convert vector to factor based on threshold of number of unique levels

Description

This is a wrapper of forcats::as_factor, which sorts numeric vectors before factoring, but levels character vectors in order of appearance.

Usage

var2fct(data, unique.n)

Arguments

data

vector or data.frame column

unique.n

threshold to convert class to factor

Value

vector

Examples

sample(seq_len(4), 20, TRUE) |>
  var2fct(6) |>
  summary()
sample(letters, 20) |>
  var2fct(6) |>
  summary()
sample(letters[1:4], 20, TRUE) |> var2fct(6)

Named vector to REDCap choices ('wrapping compact_vec()')

Description

Named vector to REDCap choices ('wrapping compact_vec()')

Usage

vec2choice(data)

Arguments

data

named vector

Value

character string

Examples

sample(seq_len(4), 20, TRUE) |>
  as_factor() |>
  named_levels() |>
  sort() |>
  vec2choice()
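
REDCap data dictionaries store choices as "code, label" pairs separated by " | ". Assuming a named vector whose names are the labels and whose values are the codes, the conversion can be sketched in base R as follows; the example vector is made up and this is not the package code.

```r
# Hypothetical named vector: names are labels, values are raw codes
lv <- c(Female = 1, Male = 2, Unknown = 99)

# Pair each code with its label, then join the pairs with " | "
choices <- paste(paste0(lv, ", ", names(lv)), collapse = " | ")
choices
```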
