Help for package volker

Type:

Package

Title:

High-Level Functions for Tabulating, Charting and Reporting Survey Data

Version:

3.1.0

Date:

2025-04-05

Description:

Craft polished tables and plots in Markdown reports. Simply choose whether to treat your data as counts or metrics, and the package will automatically generate well-designed default tables and plots for you. Boiled down to the basics, with labeling features and simple interactive reports. All functions are 'tidyverse' compatible.

URL:

https://github.com/strohne/volker, https://strohne.github.io/volker/

BugReports:

https://github.com/strohne/volker/issues

License:

MIT + file LICENSE

Encoding:

UTF-8

RoxygenNote:

7.3.2

LazyData:

true

Imports:

stats, utils, rlang, lifecycle, tibble, dplyr, tidyr, tidyselect, ggplot2 (≥ 2.2.1), scales, base64enc, purrr, magrittr, skimr, broom, knitr, kableExtra, rmarkdown, psych, car, effectsize, heplots

Depends:

R (≥ 4.2)

Suggests:

tidyverse, remotes, usethis, testthat (≥ 3.0.0)

VignetteBuilder:

knitr

Config/testthat/edition:

NeedsCompilation:

Packaged:

2025-04-05 20:30:45 UTC; Jakob

Author:

Jakob Jünger [aut, cre, cph], Henrieke Kotthoff [aut, ctb], Chantal Gärtner [ctb]

Maintainer:

Jakob Jünger <jakob.juenger@uni-muenster.de>

Repository:

CRAN

Date/Publication:

2025-04-05 20:50:02 UTC

volker: High-Level Functions for Tabulating, Charting and Reporting Survey Data

Description

Author(s)

Maintainer: Jakob Jünger jakob.juenger@uni-muenster.de (ORCID) [copyright holder]

Authors:

Henrieke Kotthoff henrieke.kotthoff@uni-muenster.de [contributor]

Other contributors:

Chantal Gärtner chantal.gaertner@uni-muenster.de (ORCID) [contributor]

Pipe operator

Description

See magrittr::%>% for details.

Usage

lhs %>% rhs

Arguments

lhs

A value or the magrittr placeholder.

rhs

A function call using the magrittr semantics.

Value

The result of calling rhs(lhs).

Add an object to the report list

Description

Add an object to the report list

Usage

.add_to_vlkr_rprt(obj, chunks, tab = NULL)

Arguments

obj

A new chunk (volker table, volker plot or character value).

chunks

The current report list.

tab

A tabsheet name or NULL.

Value

A volker report object.

Insert a name-value-pair into an object attribute

Description

Insert a name-value-pair into an object attribute

Usage

.attr_insert(obj, key, name, value)

Arguments

obj

The object.

key

The attribute key.

name

The name of a list item within the attribute.

value

The value of the list item.

Value

The object with new attributes.
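
The behaviour described above could be sketched in base R as follows. This is an illustrative reimplementation under assumptions, not the package's source: the helper name attr_insert_sketch and the attribute key "stats" are hypothetical.

```r
# Hypothetical sketch: store a named value inside a list-valued attribute.
attr_insert_sketch <- function(obj, key, name, value) {
  current <- attr(obj, key, exact = TRUE)
  if (is.null(current)) {
    current <- list()        # start a new list if the attribute is missing
  }
  current[[name]] <- value   # add or overwrite the named list item
  attr(obj, key) <- current
  obj
}

x <- c(1, 2, 3)
x <- attr_insert_sketch(x, "stats", "n", 3L)
x <- attr_insert_sketch(x, "stats", "mean", 2)
str(attr(x, "stats"))
```

Keeping the values in one list per attribute key lets several results travel with the same object without colliding.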

Transfer attributes from one to another object

Description

Transfer attributes from one to another object

Usage

.attr_transfer(to, from, keys)

Arguments

to

The target object.

from

The source object.

keys

A character vector of attribute keys

Value

The target object with the updated attributes.

Get the maximum density value in a density plot

Description

Useful for placing geoms in the center of density plots

Usage

.density_mode(data, col)

Arguments

data

A tibble.

col

A tidyselect column.

Value

The maximum density value.
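
A minimal base R sketch of locating the peak of a kernel density estimate. The help page does not say whether the helper returns the height of the peak or its position on the x axis, so both are shown; the simulated data are purely illustrative.

```r
set.seed(1)
values <- c(rnorm(200, mean = 5), rnorm(50, mean = 9))  # made-up example data

d <- stats::density(values)        # kernel density estimate
peak_height <- max(d$y)            # highest density estimate
peak_at <- d$x[which.max(d$y)]     # x position of that peak

c(height = peak_height, position = peak_at)
```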

Test whether correlations are different from zero

Description

Test whether correlations are different from zero

Usage

.effect_correlations(data, cols, cross, method = "pearson", labels = TRUE)

Arguments

data

A tibble.

cols

The columns holding metric values.

cross

The columns holding metric values to correlate.

method

The correlation coefficient to report: pearson = Pearson's R, spearman = Spearman's rho. The reported R squared value is simply the square of the chosen coefficient.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

Value

A tibble with correlation results.
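
As a rough illustration of the underlying test, the following base R sketch correlates two simulated variables with stats::cor.test and derives R squared by squaring the coefficient. The data and the layout of the result are assumptions; the package's own output columns may differ.

```r
set.seed(42)
x <- rnorm(100)
y <- 0.5 * x + rnorm(100)   # simulated data with a positive relationship

test <- stats::cor.test(x, y, method = "pearson")

result <- data.frame(
  r = unname(test$estimate),
  r_squared = unname(test$estimate)^2,  # squared correlation coefficient
  p = test$p.value
)
result
```

Passing method = "spearman" to stats::cor.test would give Spearman's rho instead.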

Calculate npmi

Description

Calculate npmi (normalized pointwise mutual information)

Usage

.effect_npmi(data, col, cross, labels = TRUE, clean = TRUE, smoothing = 0, ...)

Arguments

data

A tibble.

col

The column holding factor values.

cross

The column to correlate.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

smoothing

Add pseudocount. Calculate the pseudocount based on the number of trials to apply Laplace's rule of succession.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.
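
For orientation, normalized pointwise mutual information for a pair of categories is log(p(x,y) / (p(x) p(y))) divided by -log(p(x,y)). The sketch below computes it for a cross table in base R. It is a simplified stand-in, not the package's code: the function name is hypothetical, and the pseudocount is added as a plain constant per cell rather than derived from the number of trials.

```r
# Hypothetical sketch of NPMI for every cell of a cross table.
npmi_sketch <- function(x, y, smoothing = 0) {
  counts <- table(x, y) + smoothing     # optional pseudocount per cell
  p_xy <- counts / sum(counts)          # joint probabilities
  p_x <- rowSums(p_xy)                  # marginal probabilities
  p_y <- colSums(p_xy)
  pmi <- log(p_xy / outer(p_x, p_y))    # pointwise mutual information
  pmi / -log(p_xy)                      # normalize to the range -1 to 1
}

x <- c("a", "a", "a", "b", "b", "b")
y <- c("u", "u", "u", "v", "v", "v")

res <- npmi_sketch(x, y)                      # empty cells yield NaN
res_smooth <- npmi_sketch(x, y, smoothing = 1)
res
res_smooth
```

Without smoothing, combinations that never occur produce NaN because log(0) is involved; a positive pseudocount avoids this at the cost of shrinking the values.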

Create a factor vector and preserve all attributes

Description

Create a factor vector and preserve all attributes

Usage

.factor_with_attr(x, levels = NULL)

Arguments

x

The source value, usually a character vector

levels

The new levels

Value

A factor vector with the new levels
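
The motivation is that base::factor keeps only names and drops other attributes such as a comment label. A minimal sketch of a wrapper that copies them back, with a hypothetical function name and made-up data:

```r
# Hypothetical sketch: convert to factor and restore the remaining attributes.
factor_with_attr_sketch <- function(x, levels = NULL) {
  old <- attributes(x)
  f <- if (is.null(levels)) factor(x) else factor(x, levels = levels)
  keep <- setdiff(names(old), c("levels", "class", "names"))
  for (key in keep) {
    attr(f, key) <- old[[key]]   # copy attributes such as "comment"
  }
  f
}

x <- c("low", "high", "low")
attr(x, "comment") <- "Usage level"

f <- factor_with_attr_sketch(x, levels = c("low", "high"))
attr(f, "comment")
```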

Get plot size and resolution for the current output format from the config

Description

Get plot size and resolution for the current output format from the config

Usage

.get_fig_settings()

Value

A list with figure settings

Calculate IQR

Description

Calculate IQR

Usage

.iqr(x)

Arguments

x

A numeric vector

Value

The IQR

Knit volker plots

Description

Automatically calculates the plot height from chunk options and volker options.

Usage

.knit_plot(pl)

Arguments

pl

A ggplot object with vlkr_options. The vlkr_options are added by .to_vlkr_plot() and provide information about the number of vertical items (rows) and the maximum label length (maxlab).

Details

Presumptions:

a screen resolution of 72dpi
a default plot width of 7 inches = 504px
a default page width of 700px (vignette) or 910px (report)
an optimal bar height of 40px for 910px wide plots, i.e. a ratio of roughly 0.04
an offset of one bar above and one bar below

Value

Character string containing an HTML image tag, including the base64 encoded image.
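
To make the presumptions listed under Details concrete, the following sketch turns them into a height calculation. It is only a reading of that list, not the package's algorithm: the function name and its arguments are hypothetical, and the real helper also takes chunk options and label lengths into account.

```r
# Hypothetical sketch: derive a plot height from the number of bars.
plot_height_sketch <- function(rows, page_width_px = 910, bar_ratio = 0.04, dpi = 72) {
  bar_px <- page_width_px * bar_ratio   # height of one bar in pixels
  height_px <- (rows + 2) * bar_px      # one bar of offset above and one below
  height_px / dpi                       # height in inches at the given resolution
}

plot_height_sketch(rows = 5)
```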

Prepare markdown content for table rendering

Description

Prepare markdown content for table rendering

Usage

.knit_prepare(x, wrap = FALSE)

Arguments

x

Markdown text.

wrap

Wrap text after the given number of characters.

Value

Markdown text with line breaks and escaped special characters.

Knit volker tables

Description

Knit volker tables

Usage

.knit_table(df, ...)

Arguments

df

Data frame.

Value

Formatted table produced by kable.

Calculate outliers

Description

Calculate outliers

Usage

.outliers(x, k = 1.5)

Arguments

x

A numeric vector.

Value

A list of outliers.
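
The parameter k is not documented above. A common convention, assumed here, is Tukey's rule: values further than k times the IQR below the first or above the third quartile count as outliers. The function name is hypothetical, and the sketch returns a plain vector.

```r
# Hypothetical sketch of Tukey's rule for outliers.
outliers_sketch <- function(x, k = 1.5) {
  q <- stats::quantile(x, probs = c(0.25, 0.75), na.rm = TRUE)
  iqr <- q[[2]] - q[[1]]
  lower <- q[[1]] - k * iqr    # lower fence
  upper <- q[[2]] + k * iqr    # upper fence
  x[!is.na(x) & (x < lower | x > upper)]
}

outliers_sketch(c(1, 2, 3, 4, 5, 100))
```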

Helper function: plot grouped bar chart

Description

Helper function: plot grouped bar chart

Usage

.plot_bars(
  data,
  category = NULL,
  ci = FALSE,
  scale = NULL,
  limits = NULL,
  numbers = NULL,
  orientation = "horizontal",
  base = NULL,
  title = NULL
)

Arguments

data

Data frame with the columns item, value, p, n and optionally p_item. If p_item is provided, the column width is generated according to the p_item value, resulting in a mosaic plot.

category

Category for filtering the data frame.

ci

Whether to plot error bars for 95% confidence intervals. Provide the columns ci.low and ci.high in data.

scale

Direction of the scale: 0 = no direction for categories, -1 = descending or 1 = ascending values.

numbers

The values to print on the bars: "n" (frequency), "p" (percentage) or both.

orientation

Whether to show bars (horizontal) or columns (vertical)

base

The plot base as character or NULL.

title

The plot title as character or NULL.

Value

A ggplot object.

Helper function: plot cor and regression outputs

Description

Helper function: plot cor and regression outputs

Usage

.plot_cor(
  data,
  ci = TRUE,
  base = NULL,
  limits = NULL,
  title = NULL,
  label = NULL
)

Arguments

data

Dataframe with the columns item and value. To plot error bars, add the columns low and high and set the ci parameter to TRUE.

ci

Whether to plot confidence intervals. Provide the columns low and high in data.

base

The plot base as character or NULL.

limits

The scale limits.

title

The plot title as character or NULL.

label

The y axis label.

Value

A ggplot object.

Helper function: plot grouped line chart

Description

Helper function: plot grouped line chart

Usage

.plot_lines(data, scale = NULL, base = NULL, limits = NULL, title = NULL)

Arguments

data

Dataframe with the columns item, value, and .cross

scale

Passed to the label scale function.

base

The plot base as character or NULL.

limits

The scale limits.

title

The plot title as character or NULL.

Value

A ggplot object.

Helper function: scree plot

Description

Helper function: scree plot

Usage

.plot_scree(data, k = NULL, lab_x = NULL, lab_y = NULL)

Arguments

data

Dataframe with the factor or cluster number in the first column and the metric in the second.

k

Provide one of the values in the first column to color points up to this value.

lab_x

Label of the x axis

lab_y

Label of the y axis

Value

A vlkr_plot object

Helper function: plot grouped line chart by summarising values

Description

Helper function: plot grouped line chart by summarising values

Usage

.plot_summary(
  data,
  ci = FALSE,
  scale = NULL,
  base = NULL,
  box = FALSE,
  limits = NULL,
  title = NULL
)

Arguments

data

Dataframe with the columns item, value.

ci

Whether to plot confidence intervals of the means.

scale

Passed to the label scale function.

base

The plot base as character or NULL.

box

Whether to add boxplots.

title

The plot title as character or NULL.

Value

A ggplot object.

Generate a cluster table and plot

Description

Generate a cluster table and plot

Usage

.report_cls(
  data,
  cols,
  cross,
  metric = FALSE,
  ...,
  k = 2,
  effect = FALSE,
  title = TRUE
)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Not yet implemented. Optional, a grouping column (without quotes).

metric

Not yet implemented. When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE, to treat it as metric and calculate correlations.

k

Number of clusters to calculate.

effect

Not yet implemented. Whether to report statistical tests and effect sizes.

title

Add a plot title (default = TRUE).

Value

A list containing a table and a plot volker report chunk.

Generate a factor table and plot

Description

Generate a factor table and plot

Usage

.report_fct(
  data,
  cols,
  cross,
  metric = FALSE,
  ...,
  k = 2,
  effect = FALSE,
  title = TRUE
)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Not yet implemented. Optional, a grouping column (without quotes).

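metric

Not yet implemented. When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE, to treat it as metric and calculate correlations.

k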

Number of factors to calculate.

effect

Not yet implemented. Whether to report statistical tests and effect sizes.

title

Add a plot title (default = TRUE).

Value

A list containing a table and a plot volker report chunk.

Generate an index table and plot

Description

Generate an index table and plot

Usage

.report_idx(
  data,
  cols,
  cross,
  metric = FALSE,
  ...,
  effect = FALSE,
  title = TRUE
)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column (without quotes).

metric

When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE, to treat it as metric and calculate correlations.

effect

Whether to report statistical tests and effect sizes.

title

Add a plot title (default = TRUE).

Value

A list containing a table and a plot volker report chunk.

Split a metric column into categories based on the median

Description

Split a metric column into categories based on the median

Usage

.tab_split(data, col, labels = TRUE)

Arguments

data

A data frame containing the column to be split.

col

The column to split.

labels

Logical; if TRUE (default), use custom labels for the split categories based on the column title. If FALSE, use the column name directly.

Value

A data frame with the specified column converted into categorical labels based on its median value. The split threshold (median) is stored as an attribute of the column.
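
A median split can be sketched in base R as shown below. The labels "Low" and "High", the attribute name "split", and the choice to put values equal to the median into the lower group are assumptions made for illustration.

```r
# Hypothetical sketch: split a numeric vector at its median.
split_at_median <- function(x) {
  threshold <- stats::median(x, na.rm = TRUE)
  groups <- factor(
    ifelse(x > threshold, "High", "Low"),
    levels = c("Low", "High")
  )
  attr(groups, "split") <- threshold   # remember the threshold (assumed attribute name)
  groups
}

age <- c(18, 25, 31, 40, 52, 67)
age_groups <- split_at_median(age)
age_groups
```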

Add vlkr_df class - that means, the data frame has been prepared

Description

Add vlkr_df class - that means, the data frame has been prepared

Usage

.to_vlkr_df(data, digits = NULL)

Arguments

data

A tibble.

Value

A tibble of class vlkr_df.

Add vlkr_list class

Description

Used to collect multiple tables in a list, e.g. from regression outputs

Usage

.to_vlkr_list(data, baseline = TRUE)

Arguments

data

A list.

baseline

Whether to get the baseline.

Value

A volker list.

Add the volker class and options

Description

Add the volker class and options

Usage

.to_vlkr_plot(
  pl,
  rows = NULL,
  maxlab = NULL,
  baseline = TRUE,
  theme_options = TRUE
)

Arguments

pl

A ggplot object.

rows

The number of items on the vertical axis. Will be automatically determined when NULL. For stacked bar charts, don't forget to set the group parameter, otherwise it won't work.

maxlab

The character length of the longest label to be plotted on the vertical axis. Will be automatically determined when NULL.

baseline

Whether to print a message about removed values.

theme_options

Enable or disable axis titles and text, by providing a list with any of the elements axis.text.x, axis.text.y, axis.title.x, axis.title.y set to TRUE or FALSE. By default, titles (=scale labels) are disabled and text (= the tick labels) are enabled.

Value

A ggplot object with vlkr_plt class.

Add the vlkr_rprt class to an object

Description

Adding the class makes sure the appropriate printing function is applied in markdown reports.

Usage

.to_vlkr_rprt(chunks)

Arguments

chunks

A list of character strings.

Value

A volker report object: List of character strings with the vlkr_rprt class containing the parts of the report.

Add vlkr_tbl class

Description

Additionally, removes the skim_df class if present.

Usage

.to_vlkr_tab(data, digits = NULL, caption = NULL, baseline = NULL)

Arguments

data

A tibble.

digits

Set the plot digits. If NULL (default), no digits are set.

caption

The caption printed above the table.

baseline

A base line printed below the table.

Value

A volker tibble.

Calculate lower whisker in a boxplot

Description

Calculate lower whisker in a boxplot

Usage

.whisker_lower(x, k = 1.5)

Arguments

x

A numeric vector.

Value

The lower whisker value.
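
Under the usual boxplot convention, assumed here, the lower whisker ends at the smallest observation that is not further than k times the IQR below the first quartile. A base R sketch with a hypothetical function name:

```r
# Hypothetical sketch of the lower whisker position.
whisker_lower_sketch <- function(x, k = 1.5) {
  x <- x[!is.na(x)]
  q1 <- stats::quantile(x, 0.25, names = FALSE)
  fence <- q1 - k * stats::IQR(x)   # lower fence
  min(x[x >= fence])                # smallest observation inside the fence
}

whisker_lower_sketch(c(-50, 1, 2, 3, 4, 5))
```

The upper whisker follows by symmetry: the largest observation not above the third quartile plus k times the IQR.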

Calculate upper whisker in a boxplot

Description

Calculate upper whisker in a boxplot

Usage

.whisker_upper(x, k = 1.5)

Arguments

x

A numeric vector.

Value

The upper whisker value.

VLKR_WRAP_SEPARATOR

Format

An object of class character of length 1.

Add cluster number to a data frame

Description

Clustering is performed using stats::kmeans.

Usage

add_clusters(data, cols, newcol = NULL, k = 2, method = "kmeans", clean = TRUE)

Arguments

data

A dataframe.

cols

A tidy selection of item columns.

newcol

Name of the new cluster column as a character vector. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "cls_".

k

Number of clusters to calculate. Set to NULL to output a scree plot for up to 10 clusters and automatically choose the number of clusters based on the elbow criterion. The within-sums of squares for the scree plot are calculated by stats::kmeans.

method

The method as character value. Currently, only kmeans is supported. All items are scaled before performing the cluster analysis using base::scale.

clean

Prepare data by data_clean.

Value

The input tibble with an additional column containing cluster values as a factor. The new column is prefixed with "cls_". The new column contains the fit result in the attribute stats.kmeans.fit. The names of the items used for clustering are stored in the attribute stats.kmeans.items. The clustering diagnostics (Within-Cluster and Between-Cluster Sum of Squares) are stored in the attribute stats.kmeans.wss.

Examples

library(volker)
ds <- volker::chatgpt

volker::add_clusters(ds, starts_with("cg_adoption"), k = 3)
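
For readers who want to see the underlying steps without the package, the following base R sketch scales the items and runs stats::kmeans. The iris measurements stand in for survey items, and the column name cls_sketch is hypothetical; attributes and labels handled by add_clusters are left out.

```r
set.seed(123)                                    # kmeans starts from random centers
items <- datasets::iris[, 1:4]                   # stand-in for survey items

fit <- stats::kmeans(base::scale(items), centers = 3)

clustered <- items
clustered$cls_sketch <- factor(fit$cluster)      # cluster membership as a factor

table(clustered$cls_sketch)
c(within = fit$tot.withinss, between = fit$betweenss)
```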

Add PCA columns along with summary statistics (KMO and Bartlett test) to a data frame

Description

PCA is performed using psych::pca using varimax rotation. Bartlett's test for sphericity is calculated with psych::cortest.bartlett. The Kaiser-Meyer-Olkin (KMO) measure is computed using psych::KMO.

Usage

add_factors(data, cols, newcols = NULL, k = 2, method = "pca", clean = TRUE)

Arguments

data

A dataframe.

cols

A tidy selection of item columns.

newcols

Names of the factor columns as a character vector. Must be the same length as k or NULL. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "fct_", postfixed with the factor number.

k

Number of factors to calculate. Set to NULL to calculate eigenvalues for all components up to the number of items and automatically choose k. Eigenvalues and the decision on k are calculated by psych::fa.parallel.

method

The method as character value. Currently, only pca is supported.

clean

Prepare data by data_clean.

Value

The input tibble with additional columns containing factor values. The new columns are prefixed with "fct_". The first new column contains the fit result in the attribute psych.pca.fit. The names of the items used for factor analysis are stored in the attribute psych.pca.items. The summary diagnostics (Bartlett test and KMO) are stored in the attribute psych.kmo.bartlett.

Examples

library(volker)
ds <- volker::chatgpt

volker::add_factors(ds, starts_with("cg_adoption"))

Calculate the mean value of multiple items

Description

Usage

add_index(data, cols, newcol = NULL, cols.reverse, clean = TRUE)

Arguments

data

A dataframe.

cols

A tidy selection of item columns.

newcol

Name of the index as a character value. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "idx_".

cols.reverse

A tidy selection of columns with reversed codings.

clean

Prepare data by data_clean.

Value

The input tibble with an additional column that contains the index values. The column contains the result of the alpha calculation in the attribute named "psych.alpha".

Examples

ds <- volker::chatgpt
volker::add_index(ds, starts_with("cg_adoption"))
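
Conceptually, a mean index is the row-wise mean of the items, and Cronbach's alpha summarises how consistently the items measure the same construct. The base R sketch below uses made-up items and the classic alpha formula; it is not the psych-based calculation stored by the package.

```r
items <- data.frame(
  item_1 = c(1, 2, 3, 4, 5, 4),
  item_2 = c(2, 2, 3, 5, 5, 3),
  item_3 = c(1, 3, 3, 4, 4, 4)
)

# Index: mean of all items per respondent
idx <- rowMeans(items, na.rm = TRUE)

# Cronbach's alpha from item variances and the variance of the sum score
k <- ncol(items)
alpha <- (k / (k - 1)) *
  (1 - sum(apply(items, 2, stats::var)) / stats::var(rowSums(items)))

idx
alpha
```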

Get configured na numbers

Description

Retrieves values either from the option or from the constant.

Usage

cfg_get_na_numbers(default = VLKR_NA_NUMBERS)

Arguments

default

The default na numbers, if not explicitly provided by na.numbers or the options.

Value

A vector with numbers that should be treated as NAs.

ChatGPT Adoption Dataset CG-GE-APR23

Description

A small random subset of data from a survey about ChatGPT adoption. The survey was conducted in April 2023 within the population of German Internet users.

Usage

chatgpt

Format

`chatgpt`

A data frame with 101 rows and 19 columns:

case: A running case number
adopter: Adoption groups inspired by Rogers' innovator typology.
use_: Columns starting with use contain data about ChatGPT usage in different contexts.
cg_activities: Text answers to the question of what the respondents do with ChatGPT.
cg_adoption_: A scale consisting of items about advantages, fears, and social aspects. The scales match theoretical constructs inspired by Rogers' diffusion model and Davis' Technology Acceptance Model.
sd_: Columns starting with sd contain sociodemographics of the respondents.

Details

Call codebook(volker::chatgpt) to see the items and answer options.

Source

Communication Department of the University of Münster (gehrau@uni-muenster.de).

Check whether a column exists and stop if not

Description

Check whether a column exists and stop if not

Usage

check_has_column(data, cols, msg = NULL)

Arguments

data

A data frame.

cols

A tidyselection of columns.

msg

A custom error message if the check fails.

Value

boolean Whether the column exists.

Check whether a column selection is categorical

Description

Check whether a column selection is categorical

Usage

check_is_categorical(data, cols, msg = NULL)

Arguments

data

A data frame.

cols

A tidyselection of columns.

msg

A custom error message if the check fails.

Value

boolean Whether the columns are categorical

Check whether the object is a dataframe

Description

Check whether the object is a dataframe

Usage

check_is_dataframe(obj, msg = NULL, stopit = TRUE)

Arguments

obj

The object to test.

msg

Optional, a custom error message.

stopit

Whether to stop execution with an error message.

Value

boolean Whether the object is a data.frame object.

Check whether a column selection is numeric

Description

Check whether a column selection is numeric

Usage

check_is_numeric(data, cols, msg = NULL)

Arguments

data

A data frame.

cols

A tidyselection of columns.

msg

A custom error message if the check fails.

Value

boolean Whether the columns are numeric.

Check whether a parameter value is from a valid set

Description

Check whether a parameter value is from a valid set

Usage

check_is_param(
  value,
  allowed,
  allownull = FALSE,
  allowmultiple = FALSE,
  stopit = TRUE,
  msg = NULL
)

Arguments

value

A character value.

allowed

Allowed values.

allownull

Whether to allow NULL values.

allowmultiple

Whether to allow multiple values.

stopit

Whether to stop execution if the value is invalid.

msg

A custom error message if the check fails.

Value

logical Whether the value is valid.

Get plot for clustering result

Description

Kmeans clustering is performed using add_clusters.

Usage

cluster_plot(
  data,
  cols,
  newcol = NULL,
  k = NULL,
  method = NULL,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

cols

A tidy selection of item columns or a single column with cluster values as a factor. If the column already contains a cluster result from add_clusters, it is used, and other parameters are ignored. If no cluster result exists, it is calculated with add_clusters.

newcol

Name of the new cluster column as a character vector. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "cls_".

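k

Number of clusters to calculate. Set to NULL to output a scree plot for up to 10 clusters and automatically choose the number of clusters based on the elbow criterion. The within-sums of squares for the scree plot are calculated by stats::kmeans.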

method

The method as character value. Currently, only kmeans is supported. All items are scaled before performing the cluster analysis using base::scale.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

cluster_plot(data, starts_with("cg_adoption"), k = 2)

Get tables for clustering result

Description

Kmeans clustering is performed using add_clusters.

Usage

cluster_tab(
  data,
  cols,
  newcol = NULL,
  k = NULL,
  method = "kmeans",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

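cols

A tidy selection of item columns or a single column with cluster values as a factor. If the column already contains a cluster result from add_clusters, it is used, and other parameters are ignored. If no cluster result exists, it is calculated with add_clusters.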

newcol

Name of the new cluster column as a character vector. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "cls_".

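k

Number of clusters to calculate. Set to NULL to output a scree plot for up to 10 clusters and automatically choose the number of clusters based on the elbow criterion. The within-sums of squares for the scree plot are calculated by stats::kmeans.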

method

The method as character value. Currently, only kmeans is supported. All items are scaled before performing the cluster analysis using base::scale.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker list with three volker tabs: cluster centers, cluster counts, and clustering diagnostics.

Examples

library(volker)
data <- volker::chatgpt

cluster_tab(data, starts_with("cg_adoption"), k = 2)

Get variable and value labels from a data set

Description

Variable labels are extracted from their comment or label attribute. Variable values are extracted from factor levels, the labels attribute, numeric or boolean attributes.

Usage

codebook(data, cols, values = TRUE)

Arguments

data

A tibble.

cols

A tidy variable selection to filter specific columns.

values

Whether to output values (TRUE) or only items (FALSE)

Details

Value

A tibble with the columns:

item_name: The column name.
item_group: First part of the column name, up to an underscore.
item_class: The last class value of an item (e.g. numeric, factor).
item_label: The comment attribute of the column.
value_name: In case a column has numeric attributes, the attribute names.
value_label: In case a column has numeric attributes or T/F-attributes, the attribute values. In case a column has a levels attribute, the levels.

Examples

volker::codebook(volker::chatgpt)
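
To illustrate where variable labels come from, this base R sketch attaches comment attributes to two made-up columns and collects them into a small lookup table. It mirrors only the item_name and item_label columns; the column names and labels are invented for the example.

```r
survey <- data.frame(
  use_private = c(1, 3, 5),
  use_work = c(2, 2, 4)
)

# Attach variable labels as comment attributes
comment(survey$use_private) <- "Usage: in private context"
comment(survey$use_work) <- "Usage: in professional context"

# Collect column names and their labels
labels <- data.frame(
  item_name = names(survey),
  item_label = unname(vapply(
    survey,
    function(col) attr(col, "comment", exact = TRUE),
    character(1)
  ))
)
labels
```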

Convert numeric values to string

Description

Convert numeric values to string

Usage

data_cat(data, cols)

Arguments

data

A data frame containing the items to be converted.

cols

A tidy selection of columns to convert.

Value

A data frame with the converted values

Prepare dataframe for the analysis

Description

Depending on the selected cleaning plan, for example, recodes residual values to NA.

Usage

data_clean(data, plan = "default", ...)

Arguments

data

Data frame.

plan

The cleaning plan. Currently, only "default" is supported. See data_clean_default.

...

Other parameters passed to the appropriate cleaning function.

Details

The tibble remembers whether it was already cleaned, and the cleaning plan is only applied once, in the first call.

Value

Cleaned data frame with vlkr_df class.

Examples

ds <- volker::chatgpt
ds <- data_clean(ds)

Prepare data originating from SoSci Survey or SPSS

Description

Preparation steps:

Remove the avector class from all columns (comes from SoSci and prevents combining vectors)
Recode residual factor values to NA (e.g. "NA nicht beantwortet")
Recode residual numeric values to NA (e.g. -9)

Usage

data_clean_default(data, remove.na.levels = TRUE, remove.na.numbers = TRUE)

Arguments

data

Data frame

remove.na.levels

Remove residual values from factor columns. Either a character vector with residual values or TRUE to use defaults in VLKR_NA_LEVELS. You can also define or disable residual levels by setting the global option vlkr.na.levels (e.g. options(vlkr.na.levels=c("Not answered")) or to disable options(vlkr.na.levels=FALSE)).

remove.na.numbers

Remove residual values from numeric columns. Either a numeric vector with residual values or TRUE to use defaults in VLKR_NA_NUMBERS. You can also define or disable residual values by setting the global option vlkr.na.numbers (e.g. options(vlkr.na.numbers=c(-2,-9)) or to disable options(vlkr.na.numbers=FALSE)).

Details

The tibble remembers whether it was already prepared and the operations are only performed once in the first call.

Value

Data frame with vlkr_df class (the class is used to prevent double preparation).

Examples

ds <- volker::chatgpt
ds <- data_clean_default(ds)
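
The recoding of residual numeric values can be pictured with a few lines of base R. The sketch below replaces the codes -9 and -2 with NA in every numeric column of a made-up data frame; it ignores the label checks and options handling that the package performs.

```r
na_numbers <- c(-9, -2)   # example residual codes

answers <- data.frame(
  sd_age = c(23, -9, 41, 35),
  use_work = c(1, 5, -2, 3)
)

# Recode residual values to NA in all numeric columns
numeric_cols <- vapply(answers, is.numeric, logical(1))
answers[numeric_cols] <- lapply(
  answers[numeric_cols],
  function(col) replace(col, col %in% na_numbers, NA)
)
answers
```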

Convert values to numeric values

Description

Convert values to numeric values

Usage

data_num(data, cols)

Arguments

data

A data frame containing the items to be converted.

cols

A tidy selection of columns to convert.

Value

A data frame with the converted values

Prepare data for calculation

Description

Clean data, check column selection, remove cases with missing values

Usage

data_prepare(
  data,
  cols,
  cross,
  cols.categorical,
  cols.numeric,
  cols.reverse,
  clean = TRUE
)

Arguments

data

Data frame to be prepared.

cols

The first column selection.

cross

The second column selection.

cols.categorical

A tidy selection of columns to be checked for categorical values.

cols.numeric

A tidy selection of columns to be converted to numeric values.

cols.reverse

A tidy selection of columns with reversed codings.

clean

Whether to clean data using data_clean.

Value

Prepared data frame.

Examples

data <- volker::chatgpt
data_prepare(data, sd_age, sd_gender)

Reverse item values

Description

Reverse item values

Usage

data_rev(data, cols)

Arguments

data

A data frame containing the items to be reversed.

cols

A tidy selection of columns to reverse. For example, if you want to calculate an index of the two items "I feel bad about this" and "I like it", both coded with 1=not at all to 5=fully agree, you need to reverse one of them to make the codings compatible.

Value

A data frame with the specified items reversed.
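
Reversing a scale amounts to subtracting each value from the sum of the scale's endpoints. A minimal sketch with a hypothetical function name, assuming the endpoints are known or taken from the observed data:

```r
# Hypothetical sketch: reverse item values on a bounded scale.
reverse_item <- function(x, low = min(x, na.rm = TRUE), high = max(x, na.rm = TRUE)) {
  (low + high) - x
}

feel_bad <- c(1, 2, 4, 5, NA)
reversed <- reverse_item(feel_bad, low = 1, high = 5)
reversed
```

Passing the endpoints explicitly is safer than relying on the observed minimum and maximum, because not every answer option necessarily occurs in the data.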

Remove missings and output a message

Description

Remove missings and output a message

Usage

data_rm_missings(data, cols)

Arguments

data

Data frame.

cols

A tidy column selection.

Value

Data frame.

Remove NA levels

Description

Remove NA levels

Usage

data_rm_na_levels(data, na.levels = TRUE, default = VLKR_NA_LEVELS)

Arguments

data

Data frame

na.levels

Residual values to remove from factor columns. Either a character vector with residual values or TRUE to use defaults in VLKR_NA_LEVELS. You can define default residual levels by setting the global option vlkr.na.levels (e.g. options(vlkr.na.levels=c("Not answered"))).

default

The default na levels, if not explicitly provided by na.levels or the options.

Value

Data frame

Remove NA numbers

Description

Remove NA numbers

Usage

data_rm_na_numbers(
  data,
  na.numbers = TRUE,
  check.labels = TRUE,
  default = VLKR_NA_NUMBERS
)

Arguments

data

Data frame

na.numbers

Either a numeric vector with residual values or TRUE to use defaults in VLKR_NA_NUMBERS. You can also define residual values by setting the global option vlkr.na.numbers (e.g. options(vlkr.na.numbers=c(-9))).

check.labels

Whether to only remove NA numbers that are listed in the attributes of a column.

default

The default na numbers, if not explicitly provided by na.numbers or the options.

Value

Data frame

Remove negatives and output a warning

Description

Remove negatives and output a warning

Usage

data_rm_negatives(data, cols)

Arguments

data

Data frame

cols

A tidy column selection

Value

Data frame

Remove zero values, drop missings and output a message

Description

Remove zero values, drop missings and output a message

Usage

data_rm_zeros(data, cols)

Arguments

data

Data frame.

cols

A tidy column selection.

Value

Data frame.

Output effect sizes and test statistics for count data

Description

The type of effect size depends on the number of selected columns:

One categorical column: see effect_counts_one
Multiple categorical columns: see effect_counts_items

Cross tabulations:

One categorical column and one grouping column: see effect_counts_one_grouped
Multiple categorical columns and one grouping column: see effect_counts_items_grouped (not yet implemented)
Multiple categorical columns and multiple grouping columns: see effect_counts_items_grouped_items (not yet implemented)

By default, if you provide two column selections, the second column is treated as categorical. Setting the metric-parameter to TRUE will call the appropriate functions for correlation analysis:

One categorical column and one metric column: see effect_counts_one_cor (not yet implemented)
Multiple categorical columns and one metric column: see effect_counts_items_cor (not yet implemented)
Multiple categorical columns and multiple metric columns: see effect_counts_items_cor_items (not yet implemented)

Usage

effect_counts(data, cols, cross = NULL, metric = FALSE, clean = TRUE, ...)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column. The column name without quotes.

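metric

When crossing variables, the cross column parameter can contain categorical or metric values. By default, the cross column selection is treated as categorical data. Set metric to TRUE, to treat it as metric and calculate correlations.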

clean

Prepare data by data_clean.

...

Other parameters passed to the appropriate effect function.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

effect_counts(data, sd_gender, adopter)

Test homogeneity of category shares for multiple items

Description

Performs a goodness-of-fit test and calculates the Gini coefficient for each item. The goodness-of-fit-test is calculated using stats::chisq.test.

Usage

effect_counts_items(data, cols, labels = TRUE, clean = TRUE, ...)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble with the following statistical measures:

Gini coefficient: Gini coefficient, measuring inequality.
n: Number of cases the calculation is based on.
Chi-squared: Chi-Squared test statistic.
p: p-value for the statistical test.
stars: Significance stars based on p-value (*, **, ***).

Examples

library(volker)
data <- volker::chatgpt

effect_counts_items(data, starts_with("cg_adoption_adv"))

Correlate the values in multiple items with one metric column and output effect sizes and tests

Description

Not yet implemented. The future will come.

Usage

effect_counts_items_cor(data, cols, cross, clean = TRUE, ...)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The metric column.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble.

Correlate the values in multiple items with multiple metric columns and output effect sizes and tests

Description

Not yet implemented. The future will come.

Usage

effect_counts_items_cor_items(data, cols, cross, clean = TRUE, ...)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The metric target columns.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble.

Effect size and test for comparing multiple variables by a grouping variable

Description

Not yet implemented. The future will come.

Usage

effect_counts_items_grouped(data, cols, cross, clean = TRUE, ...)

Arguments

data

A tibble containing item measures and grouping variable.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column holding groups to compare.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble.

Effect size and test for comparing multiple variables by multiple grouping variables

Description

Not yet implemented. The future will come.

Usage

effect_counts_items_grouped_items(data, cols, cross, clean = TRUE, ...)

Arguments

data

A tibble containing item measures and grouping variable.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The columns holding groups to compare.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble.

Test homogeneity of category shares

Description

Performs a goodness-of-fit test and calculates the Gini coefficient. The goodness-of-fit test is calculated using stats::chisq.test.

Usage

effect_counts_one(data, col, clean = TRUE, ...)

Arguments

data

A tibble.

col

The column holding factor values.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble with the following statistical measures:

Gini coefficient: Gini coefficient, measuring inequality.
n: Number of cases the calculation is based on.
Chi-squared: Chi-Squared test statistic.
p: p-value for the statistical test.
stars: Significance stars based on p-value (*, **, ***).

Examples

library(volker)
data <- volker::chatgpt

data |>
  filter(sd_gender != "diverse") |>
  effect_counts_one(sd_gender)
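
The goodness-of-fit test behind these measures can be reproduced with base R. The following is a minimal sketch, assuming made-up category counts and the default null hypothesis of stats::chisq.test (equal shares for all categories); it does not call the volker function itself, and the names counts and result are chosen for illustration only.

```r
# Hypothetical category counts (made-up numbers, not taken from the chatgpt data)
counts <- c(female = 40, male = 55, other = 5)

# For a plain vector, chisq.test() tests against equal expected shares
fit <- stats::chisq.test(counts)

# Collect the measures listed in the Value section (except Gini and stars)
result <- c(
  n = sum(counts),
  chi_squared = unname(fit$statistic),
  df = unname(fit$parameter),
  p = fit$p.value
)
print(result)
```

A small p-value indicates that the category shares deviate from a uniform distribution.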

Output test statistics and effect size from a logistic regression of one metric predictor

Description

Not yet implemented. The future will come.

Usage

effect_counts_one_cor(data, col, cross, clean = TRUE, labels = TRUE, ...)

Arguments

data

A tibble.

col

The column holding factor values.

cross

The column holding metric values.

clean

Prepare data by data_clean.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Value

A volker tibble.

Output test statistics and effect size for contingency tables

Description

Chi squared is calculated using stats::chisq.test. If any cell contains fewer than 5 observations, the exact-parameter is set.

Usage

effect_counts_one_grouped(data, col, cross, clean = TRUE, ...)

Arguments

data

A tibble.

col

The column holding factor values.

cross

The column holding groups to compare.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_counts.

Details

Phi is derived from the Chi squared value by sqrt(fit$statistic / n). Cramer's V is derived by sqrt(phi / (min(dim(contingency)[1], dim(contingency)[2]) - 1)).
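
As an illustration, both measures can be computed by hand from a contingency table. This is a sketch under stated assumptions: the counts are made up, and Cramer's V follows the textbook definition, in which the term under the root is the Chi squared value divided by n (that is, phi squared) and by min(rows, columns) - 1. Whether the package computes it in exactly this way is not confirmed here.

```r
# Hypothetical 2x3 contingency table (made-up counts for illustration)
contingency <- matrix(
  c(20, 30, 25,
    30, 20, 35),
  nrow = 2, byrow = TRUE,
  dimnames = list(group = c("a", "b"), answer = c("x", "y", "z"))
)

fit <- stats::chisq.test(contingency)
n <- sum(contingency)

# Phi: square root of chi-squared divided by the number of cases
phi <- sqrt(unname(fit$statistic) / n)

# Textbook Cramer's V: chi-squared / n, divided by min(rows, cols) - 1
k <- min(dim(contingency)) - 1
cramers_v <- sqrt((unname(fit$statistic) / n) / k)

print(c(phi = phi, cramers_v = cramers_v))
```

Because the smaller table dimension is 2 in this example, the divisor is 1 and Cramer's V equals phi.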

Value

A volker tibble with the following statistical measures:

Cramer's V: Effect size measuring the association between two variables.
n: Number of cases the calculation is based on.
Chi-squared: Chi-Squared test statistic.
df: Degrees of freedom.
p: p-value for the statistical test.
stars: Significance stars based on p-value (*, **, ***).

Examples

library(volker)
data <- volker::chatgpt

effect_counts_one_grouped(data, adopter, sd_gender)

Output effect sizes and test statistics for metric data

Description

The calculations depend on the number of selected columns:

One metric column: see effect_metrics_one
Multiple metric columns: see effect_metrics_items

Group comparisons:

One metric column and one grouping column: see effect_metrics_one_grouped
Multiple metric columns and one grouping column: see effect_metrics_items_grouped
Multiple metric columns and multiple grouping columns: not yet implemented

By default, if you provide two column selections, the second column is treated as categorical. Setting the metric-parameter to TRUE will call the appropriate functions for correlation analysis:

Two metric columns: see effect_metrics_one_cor
Multiple metric columns and one metric column: see effect_metrics_items_cor
Two metric column selections: see effect_metrics_items_cor_items

Usage

effect_metrics(data, cols, cross = NULL, metric = FALSE, clean = TRUE, ...)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column (without quotes).

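metric

If TRUE, the cross column is treated as metric and the functions for correlation analysis are called. By default (FALSE), the cross column is treated as categorical.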

clean

Prepare data by data_clean.

...

Other parameters passed to the appropriate effect function.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

effect_metrics(data, sd_age, sd_gender)

Test whether a distribution is normal for each item

Description

The test is calculated using stats::shapiro.test.

Usage

effect_metrics_items(data, cols, labels = TRUE, clean = TRUE, ...)

Arguments

data

A tibble containing item measures.

cols

The column holding metric values.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker table containing itemwise statistics:

skewness: Measure of asymmetry in the distribution. A value of 0 indicates perfect symmetry.
kurtosis: Measure of the "tailedness" of the distribution.
W: W-statistic from the Shapiro-Wilk normality test.
p: p-value for the statistical test.
stars: Significance stars based on p-value (*, **, ***).
normality: Interpretation of normality based on Shapiro-Wilk test.

Examples

library(volker)
data <- volker::chatgpt

effect_metrics_items(data, starts_with("cg_adoption"))

Output correlation coefficients for items and one metric variable

Description

The correlation is calculated using stats::cor.test.

Usage

effect_metrics_items_cor(
  data,
  cols,
  cross,
  method = "pearson",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column holding metric values to correlate.

method

The output metrics, pearson = Pearson's R, spearman = Spearman's rho.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker table containing itemwise correlations:

If method = "pearson":

R-squared: Coefficient of determination.
n: Number of cases the calculation is based on.
Pearson's r: Correlation coefficient.
ci low / ci high: Lower and upper bounds of the 95% confidence interval.
df: Degrees of freedom.
t: t-statistic.
p: p-value for the statistical test, indicating whether the correlation differs from zero.
stars: Significance stars based on the p-value (*, **, ***).

If method = "spearman":

Spearman's rho is displayed instead of Pearson's r.
S-statistic is used instead of the t-statistic.

Examples

library(volker)
data <- volker::chatgpt

effect_metrics_items_cor(
  data, starts_with("cg_adoption_adv"), sd_age
)

Output correlation coefficients for multiple items

Description

The correlation is calculated using stats::cor.test.

Usage

effect_metrics_items_cor_items(
  data,
  cols,
  cross,
  method = "pearson",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

method

The output metrics, pearson = Pearson's R, spearman = Spearman's rho.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker table containing correlations.

If method = "pearson":

R-squared: Coefficient of determination.
n: Number of cases the calculation is based on.
Pearson's r: Correlation coefficient.
ci low / ci high: Lower and upper bounds of the 95% confidence interval.
df: Degrees of freedom.
t: t-statistic.
p: p-value for the statistical test, indicating whether the correlation differs from zero.
stars: Significance stars based on the p-value (*, **, ***).

If method = "spearman":

Spearman's rho is displayed instead of Pearson's r.
S-statistic is used instead of the t-statistic.

Examples

library(volker)
data <- volker::chatgpt

effect_metrics_items_cor_items(
  data,
  starts_with("cg_adoption_adv"),
  starts_with("use"),
  metric = TRUE
)

Compare groups for each item by calculating F-statistics and effect sizes

Description

The models are fitted using stats::lm. ANOVA of type II is computed for each fitted model using car::Anova. Eta Squared is calculated for each ANOVA result using effectsize::eta_squared.

Usage

effect_metrics_items_grouped(
  data,
  cols,
  cross,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column holding groups to compare.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker tibble with the following statistical measures:

Eta-squared: Effect size indicating the proportion of variance in the dependent variable explained by the predictor.
Eta: Root of Eta-squared, a standardized effect size.
n: Number of cases the calculation is based on.
F: F-statistic from the linear model.
p: p-value for the statistical test.
stars: Significance stars based on p-value (*, **, ***).

Examples

library(volker)
data <- volker::chatgpt

effect_metrics(data, starts_with("cg_adoption_"), adopter)
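
For a single item, the F-statistic and Eta squared can be approximated in base R. The sketch below assumes made-up scores and swaps in stats::anova for car::Anova and a manual ratio of sums of squares for effectsize::eta_squared; for a model with one grouping factor the results should coincide, but this is not the package's own code.

```r
# Hypothetical item scores for three groups (made-up data)
d <- data.frame(
  score = c(2, 3, 3, 4, 4, 5, 5, 5, 4, 5),
  group = factor(c("a", "a", "a", "a", "b", "b", "b", "c", "c", "c"))
)

fit <- stats::lm(score ~ group, data = d)
tab <- stats::anova(fit)

# Eta squared: between-group sum of squares divided by the total sum of squares
ss <- tab[["Sum Sq"]]
eta_sq <- ss[1] / sum(ss)

print(c(
  eta_squared = eta_sq,
  eta = sqrt(eta_sq),
  F = tab[["F value"]][1],
  p = tab[["Pr(>F)"]][1]
))
```

With only one predictor, Eta squared is identical to the R squared of the linear model.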

Compare groups for each item with multiple target items by calculating F-statistics and effect sizes

Description

Not yet implemented. The future will come.

Usage

effect_metrics_items_grouped_items(data, cols, cross, clean = TRUE, ...)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The grouping items.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker tibble.

Test whether a distribution is normal

Description

The test is calculated using stats::shapiro.test.

Usage

effect_metrics_one(data, col, labels = TRUE, clean = TRUE, ...)

Arguments

data

A tibble.

col

The column holding metric values.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker list object with the following statistical measures:

skewness: Measure of asymmetry in the distribution. A value of 0 indicates perfect symmetry.
kurtosis: Measure of the "tailedness" of the distribution.
W: W-statistic from the Shapiro-Wilk normality test.
p: p-value for the statistical test.
stars: Significance stars based on p-value (*, **, ***).
normality: Interpretation of normality based on Shapiro-Wilk test.

Examples

library(volker)
data <- volker::chatgpt

effect_metrics_one(data, sd_age)
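
The listed measures can be approximated in base R. This is a minimal sketch on simulated data; the moment-based formulas for skewness and excess kurtosis are one common convention and are an assumption here, so the values reported by the package may differ slightly.

```r
# Simulated metric values (not taken from the chatgpt data)
set.seed(42)
x <- stats::rnorm(200, mean = 40, sd = 12)

# Shapiro-Wilk normality test
fit <- stats::shapiro.test(x)
w <- unname(fit$statistic)
p <- fit$p.value

# Moment-based shape measures (assumed convention)
z <- (x - mean(x)) / sqrt(mean((x - mean(x))^2))
skewness <- mean(z^3)
kurtosis <- mean(z^4) - 3  # excess kurtosis, 0 for a normal distribution

print(c(skewness = skewness, kurtosis = kurtosis, W = w, p = p))
```

A non-significant p-value means the test found no evidence against normality.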

Test whether the correlation is different from zero

Description

The correlation is calculated using stats::cor.test.

Usage

effect_metrics_one_cor(
  data,
  col,
  cross,
  method = "pearson",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding metric values.

cross

The column holding metric values to correlate.

method

The output metrics, TRUE or pearson = Pearson's R, spearman = Spearman's rho.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker table containing the requested statistics.

If method = "pearson":

R-squared: Coefficient of determination.
n: Number of cases the calculation is based on.
Pearson's r: Correlation coefficient.
ci low / ci high: Lower and upper bounds of the 95% confidence interval.
df: Degrees of freedom.
t: t-statistic.
p: p-value for the statistical test, indicating whether the correlation differs from zero.
stars: Significance stars based on the p-value (*, **, ***).

If method = "spearman":

Spearman's rho is displayed instead of Pearson's r.
S-statistic is used instead of the t-statistic.

Examples

library(volker)
data <- volker::chatgpt

effect_metrics_one_cor(data, sd_age, use_private, metric = TRUE)
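
The statistics in the Value section map onto the result of stats::cor.test. The following sketch uses made-up vectors x and y and only shows where each number comes from; it is not the package's implementation.

```r
# Hypothetical paired observations (made-up data)
x <- c(21, 25, 32, 38, 44, 51, 57, 63)
y <- c(5, 4, 4, 3, 4, 2, 2, 1)

# Pearson: reports a t-statistic, degrees of freedom and a confidence interval
fit <- stats::cor.test(x, y, method = "pearson")
r <- unname(fit$estimate)
print(c(
  r_squared = r^2,
  n = length(x),
  r = r,
  ci_low = fit$conf.int[1],
  ci_high = fit$conf.int[2],
  df = unname(fit$parameter),
  t = unname(fit$statistic),
  p = fit$p.value
))

# Spearman: reports rho and an S-statistic instead
# exact = FALSE avoids the exact p-value, which is not available with ties
fit_s <- stats::cor.test(x, y, method = "spearman", exact = FALSE)
print(c(rho = unname(fit_s$estimate), S = unname(fit_s$statistic), p = fit_s$p.value))
```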

Output a regression table with estimates and macro statistics

Description

The regression output comes from stats::lm. T-test is performed using stats::t.test. Normality check is performed using stats::shapiro.test. Equality of variances across groups is assessed using car::leveneTest. Cohen's d is calculated using effectsize::cohens_d.

Usage

effect_metrics_one_grouped(
  data,
  col,
  cross,
  method = "lm",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding metric values.

cross

The column holding groups to compare.

method

A character vector of methods, e.g. c("t.test","lm"). Supported methods are t.test (only valid if the cross column contains two levels) and lm (regression results).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker list object containing volker tables with the requested statistics.

Regression table:

estimate: Regression coefficient (unstandardized).
ci low / ci high: lower and upper bound of the 95% confidence interval.
se: Standard error of the estimate.
t: t-statistic.
p: p-value for the statistical test.
stars: Significance stars based on p-value (*, **, ***).

Macro statistics:

Adjusted R-squared: Adjusted coefficient of determination.
F: F-statistic for the overall significance of the model.
df: Degrees of freedom for the model.
residual df: Residual degrees of freedom.
p: p-value for the statistical test.
stars: Significance stars based on p-value (*, **, ***).

If method = t.test:

Shapiro-Wilk test (normality check):

W: W-statistic from the Shapiro-Wilk normality test.
p: p-value for the test.
normality: Interpretation of the Shapiro-Wilk test.

Levene test (equality of variances):

F: F-statistic from the Levene test for equality of variances between groups.
p: p-value for Levene's test.
variances: Interpretation of the Levene test.

Cohen's d (effect size):

d: Standardized mean difference between the two groups.
ci low / ci high: Lower and upper bounds of the 95% confidence interval.

t-test:

method: Type of t-test performed (e.g., "Two Sample t-test").
difference: Observed difference between group means.
ci low / ci high: Lower and upper bounds of the 95% confidence interval.
se: Estimated standard error of the difference.
df: Degrees of freedom used in the t-test.
t: t-statistic.
p: p-value for the t-test.
stars: Significance stars based on p-value (*, **, ***).

Examples

library(volker)
data <- volker::chatgpt

effect_metrics_one_grouped(data, sd_age, sd_gender)
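
For two groups, the t-test and Cohen's d are closely related. The sketch below uses made-up values and assumes equal variances (var.equal = TRUE) and a pooled standard deviation for d; the settings used by the package and by effectsize::cohens_d may differ.

```r
# Hypothetical metric values for two groups (made-up data)
a <- c(23, 25, 31, 36, 40, 44)
b <- c(28, 35, 39, 45, 52, 58)

# Two-sample t-test, assuming equal variances
fit <- stats::t.test(a, b, var.equal = TRUE)

# Cohen's d: mean difference divided by the pooled standard deviation
pooled_sd <- sqrt(
  ((length(a) - 1) * stats::var(a) + (length(b) - 1) * stats::var(b)) /
    (length(a) + length(b) - 2)
)
d <- (mean(a) - mean(b)) / pooled_sd

print(c(
  difference = mean(a) - mean(b),
  d = d,
  t = unname(fit$statistic),
  df = unname(fit$parameter),
  p = fit$p.value
))
```

Under these assumptions, the t-statistic equals d divided by sqrt(1/n1 + 1/n2), which makes d a sample-size independent counterpart of t.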

Select variables by their postfix

Description

See tidyselect::ends_with for details.

Get plot with factor analysis result

Description

PCA is performed using add_factors.

Usage

factor_plot(
  data,
  cols,
  newcols = NULL,
  k = 2,
  method = "pca",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A dataframe.

cols

A tidy selection of item columns. If the first column already contains a pca from add_factors, the result is used. Other parameters are ignored. If there is no pca result yet, it is calculated by add_factors first.

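newcols

Names of the new factor columns as a character vector. Must be the same length as k or NULL. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "fct_", postfixed with the factor number.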

k

Number of factors to calculate. Set to NULL to generate a scree plot with eigenvalues for all components up to the number of items and automatically choose k. Eigenvalues and the decision on k are calculated by psych::fa.parallel.

method

The method as character value. Currently, only pca is supported.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
ds <- volker::chatgpt

volker::factor_plot(ds, starts_with("cg_adoption"), k = 3)

Get tables with factor analysis results

Description

PCA is performed using add_factors.

Usage

factor_tab(
  data,
  cols,
  newcols = NULL,
  k = 2,
  method = "pca",
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A dataframe.

cols

A tidy selection of item columns. If the first column already contains a pca result from add_factors, the result is used. Other parameters are ignored. If there is no pca result yet, it is calculated by add_factors first.

newcols

Names of the new factor columns as a character vector. Must be the same length as k or NULL. Set to NULL (default) to automatically build a name from the common column prefix, prefixed with "fct_", postfixed with the factor number.

k

Number of factors to calculate. Set to NULL to report eigenvalues for all components up to the number of items and automatically choose k. Eigenvalues and the decision on k are calculated by psych::fa.parallel.

method

The method as character value. Currently, only pca is supported.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker list with three volker tabs: loadings, variances and diagnostics.

Examples

library(volker)
ds <- volker::chatgpt

volker::factor_tab(ds, starts_with("cg_adoption"), k = 3)

Filter function

Description

See dplyr::filter for details.

Get number of items and Cronbach's alpha of a scale added by add_index()

Description

TODO: Rename to index_tab, return volker list as in factor_tab()

Usage

get_alpha(data)

Arguments

data

A data frame column.

Value

A named list with the keys "items" and "alpha".

Angle labels

Description

Calculate angle for label adjustment based on character length.

Usage

get_angle(
  labels,
  threshold = VLKR_PLOT_ANGLE_THRESHOLD,
  angle = VLKR_PLOT_ANGLE_VALUE
)

Arguments

labels

Vector of labels to check. The values are converted to characters.

threshold

Length threshold beyond which the angle is applied. Default is 20. Override with options(vlkr.angle.threshold=10).

angle

The angle to apply if any label exceeds the threshold. Default is 45. Override with options(vlkr.angle.value=30).

Value

A single angle value.
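
The rule can be sketched in a few lines of base R. The function name angle_sketch is hypothetical, the defaults 20 and 45 are taken from the argument descriptions above, and returning 0 when no label exceeds the threshold is an assumption.

```r
# Hypothetical re-implementation of the described rule
angle_sketch <- function(labels, threshold = 20, angle = 45) {
  labels <- as.character(labels)
  # Apply the angle only if at least one label is longer than the threshold
  if (any(nchar(labels) > threshold, na.rm = TRUE)) angle else 0
}

print(angle_sketch(c("yes", "no")))
print(angle_sketch(c("yes", "A rather long answer category label")))
```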

Get a formatted baseline for removed zero, negative, and missing cases and include focus category information if present

Description

Get a formatted baseline for removed zero, negative, and missing cases and include focus category information if present

Usage

get_baseline(obj)

Arguments

obj

An object with the missings and focus attributes.

Value

A formatted message or NULL if missings and focus attributes are not present.

Calculate ci values to be used for error bars on a plot

Description

Calculate ci values to be used for error bars on a plot

Usage

get_ci(x, conf = 0.95)

Arguments

x

A numeric vector.

conf

The confidence level.

Value

A named list with values for y, ymin, and ymax.
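
A confidence interval for error bars can be sketched as follows. The function name ci_sketch is hypothetical, and the t-based interval around the mean is an assumption about the method; only the shape of the return value (y, ymin, ymax) is taken from the description above.

```r
# Hypothetical sketch: t-based confidence interval around the mean
ci_sketch <- function(x, conf = 0.95) {
  x <- x[!is.na(x)]
  n <- length(x)
  m <- mean(x)
  se <- stats::sd(x) / sqrt(n)
  margin <- stats::qt(1 - (1 - conf) / 2, df = n - 1) * se
  list(y = m, ymin = m - margin, ymax = m + margin)
}

x <- c(2, 4, 4, 5, 7, 8)
ci <- ci_sketch(x)
print(unlist(ci))
```

A list of this shape is what ggplot2 summary functions expect, for example in stat_summary(fun.data = ...).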

Detect whether a scale is a numeric sequence

Description

From all values in the selected columns, the numbers are extracted. If no numeric values can be found, returns 0. Otherwise, if any positive values form an ascending sequence, returns -1. In all other cases, returns 1.

Usage

get_direction(data, cols, extract = TRUE)

Arguments

data

The dataframe.

cols

The tidy selection.

extract

Whether to extract numeric values from characters.

Value

0 = an undirected scale, -1 = descending values, 1 = ascending values.

Calculate Eta squared

Description

Calculate Eta squared

Usage

get_etasq(fit)

Arguments

fit

A model

Value

A data frame with at least the column Eta2

Calculate the Gini coefficient

Description

Calculate the Gini coefficient

Usage

get_gini(x)

Arguments

x

A vector of counts or other values

Value

The gini coefficient
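
For orientation, one common way to compute a Gini coefficient from a vector of counts is shown below. The function name gini_sketch is hypothetical, and the sorted-values formula is an assumption; get_gini may use a different normalization.

```r
# Hypothetical sketch of a Gini coefficient based on sorted values
gini_sketch <- function(x) {
  x <- sort(x[!is.na(x)])
  n <- length(x)
  if (n == 0 || sum(x) == 0) return(NA_real_)
  # Weighted sum of the sorted values, scaled by n times the total
  sum((2 * seq_len(n) - n - 1) * x) / (n * sum(x))
}

print(gini_sketch(c(25, 25, 25, 25)))  # equal category shares
print(gini_sketch(c(0, 0, 0, 100)))    # all cases in one of four categories
```

With this formula, equal shares yield 0, and the maximum for k categories is (k - 1) / k.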

Get the labels of values from a codebook

Description

Get the labels of values from a codebook

Usage

get_labels(codes, values)

Arguments

codes

The codebook as it results from the codebook() function

values

A vector of labels

Value

The labels. If the values are not present in the codebook, returns the values.

Get the numeric range from the labels

Description

Gets the range of all values in the selected columns by the first successful of the methods listed under Details.

Usage

get_limits(data, cols, negative = TRUE)

Arguments

data

The labeled data frame.

cols

A tidy variable selection.

negative

Whether to include negative values.

Details

Inspect the limits column attribute.
Lookup the value names in the codebook.
Calculate the range from all values in the columns.

Value

A list or NULL.

Get the common prefix of character values

Description

Helper function taken from the Biobase package. Duplicated here instead of loading the package to avoid overhead. See https://github.com/Bioconductor/Biobase

Usage

get_prefix(x, ignore.case = FALSE, trim = FALSE, delimiters = c(":", "\n"))

Arguments

x

Character vector.

ignore.case

Whether to ignore case. By default (FALSE), case matters.

trim

Whether non-alphabetic characters should be trimmed.

delimiters

A list of prefix delimiters. If any of the delimiters is present in the extracted prefix, the part after it is removed from the prefix. Consider the following two items as an example: c("Usage: in private context", "Usage: in work context"). The common prefix would be "Usage: in ", but it makes more sense to break it after the colon.

Value

The longest common prefix of the strings.
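
The delimiter logic can be illustrated with a simplified sketch. The function prefix_sketch is hypothetical and is not the Biobase code; in particular, keeping the delimiter itself at the end of the prefix is an assumption, and the ignore.case and trim options are left out.

```r
# Hypothetical sketch: longest common prefix, cut at the first delimiter
prefix_sketch <- function(x, delimiters = c(":", "\n")) {
  x <- as.character(x)
  shortest <- min(nchar(x))
  len <- 0
  # Grow the prefix while all strings agree on the next character
  while (len < shortest &&
         length(unique(substr(x, len + 1, len + 1))) == 1) {
    len <- len + 1
  }
  prefix <- substr(x[1], 1, len)
  # Drop everything after the first delimiter found in the prefix
  for (d in delimiters) {
    pos <- as.integer(regexpr(d, prefix, fixed = TRUE))
    if (pos > 0) {
      prefix <- substr(prefix, 1, pos)
      break
    }
  }
  prefix
}

items <- c("Usage: in private context", "Usage: in work context")
print(prefix_sketch(items))
print(prefix_sketch(items, delimiters = character(0)))
```

Without delimiters, the sketch returns the raw common prefix including the trailing "in ".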

Get significance stars from p values

Description

Get significance stars from p values

Usage

get_stars(x)

Arguments

x

A vector of p values.

Value

A character vector with significance stars.
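
The mapping from p values to stars can be sketched as below. The function name stars_sketch is hypothetical, and the cutoffs (0.05, 0.01, 0.001) are the conventional ones, assumed here because this manual does not state the thresholds used by get_stars.

```r
# Hypothetical sketch with conventional cutoffs (assumed, not confirmed)
stars_sketch <- function(p) {
  out <- rep("", length(p))
  out[!is.na(p) & p < 0.05] <- "*"
  out[!is.na(p) & p < 0.01] <- "**"
  out[!is.na(p) & p < 0.001] <- "***"
  out
}

print(stars_sketch(c(0.2, 0.03, 0.004, 0.0001)))
```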

Get a common title for a column selection

Description

Get a common title for a column selection

Usage

get_title(data, cols, default = NULL)

Arguments

data

A tibble.

cols

A tidy column selection.

default

A character string used in case no prefix is found.

Value

A character string.

Volker style HTML document format

Description

Based on the standard theme, tweaks the pill navigation to switch between tables and plots. To use the format, in the header of your Markdown document, set output: volker::html_report.

Usage

html_report(...)

Arguments

...

Additional arguments passed to html_document.

Value

R Markdown output format.

Examples

## Not run: 
# Add `volker::html_report` to the output options of your Markdown document:
#
# ```
# ---
# title: "How to create reports?"
# output: volker::html_report
# ---
# ```

## End(Not run)

Deprecated Alias for `add_index`

Description

idx_add() was renamed to add_index().

Usage

idx_add(data, cols, newcol = NULL, reverse = NULL, clean = TRUE)

Details

This function is a deprecated alias for add_index.

Printing method for volker plots when knitting

Description

Printing method for volker plots when knitting

Usage

## S3 method for class 'vlkr_plt'
knit_print(x, ...)

Arguments

x

The volker plot.

...

Further parameters passed to print().

Value

Knitr asis output

Examples

library(volker)
data <- volker::chatgpt

pl <- plot_metrics(data, sd_age)
print(pl)

Wrap labels in plot scales

Description

Wrap labels in plot scales

Usage

label_scale(x, scale)

Arguments

x

The label vector.

scale

A named label vector to select elements that should be wrapped. Prevents numbers from being wrapped.

Value

A vector of wrapped labels.

Set column and value labels

Description

Usage

labs_apply(data, codes = NULL, cols = NULL, items = TRUE, values = TRUE)

Arguments

data

A tibble containing the dataset.

codes

A tibble in codebook format.

cols

A tidy column selection. Set to NULL (default) to apply to all columns found in the codebook. Restricting the columns is helpful when you want to set value labels. In this case, provide a tibble with value_name and value_label columns and specify the columns that should be modified.

items

If TRUE, column labels will be retrieved from the codes (the default). If FALSE, no column labels will be changed. Alternatively, a named list of column names with their labels.

values

If TRUE, value labels will be retrieved from the codes (default). If FALSE, no value labels will be changed. Alternatively, a named list of value names with their labels. In this case, use the cols-parameter to define which columns should be changed.

Details

You can either provide a data frame in codebook format to the codes-parameter or provide named lists to the items- or values-parameter.

When working with a codebook in the codes-parameter:

Change column labels by providing the columns item_name and item_label in the codebook. Set the items-parameter to TRUE (the default setting).
Change value labels by providing the columns value_name and value_label in the codebook. To tell which columns should be changed, you can either use the item_name column in the codebook or use the cols-parameter. For factor values, the levels and their order are retrieved from the value_label column. For coded values, labels are retrieved from both the columns value_name and value_label.

When working with lists in the items- or values-parameter:

Change column labels by providing a named list to the items-parameter. The list contains labels named by the columns. Set the parameters codes and cols to NULL (their default value).
Change value labels by providing a named list to the values-parameter. The list contains labels named by the values. Provide the column selection in the cols-parameter. Set the codes-parameter to NULL (its default value).

Value

A tibble containing the dataset with new labels.

Examples

library(volker)

# Set column labels using the items-parameter
volker::chatgpt %>%
  labs_apply(
    items = list(
      "cg_adoption_advantage_01" = "Allgemeine Vorteile",
      "cg_adoption_advantage_02" = "Finanzielle Vorteile",
      "cg_adoption_advantage_03" = "Vorteile bei der Arbeit",
      "cg_adoption_advantage_04" = "Macht mehr Spaß"
    )
  ) %>%
  tab_metrics(starts_with("cg_adoption_advantage_"))

# Set value labels using the values-parameter
volker::chatgpt %>%
  labs_apply(
    cols = starts_with("cg_adoption"),
    values = list(
      "1" = "Stimme überhaupt nicht zu",
      "2" = "Stimme nicht zu",
      "3" = "Unentschieden",
      "4" = "Stimme zu",
      "5" = "Stimme voll und ganz zu"
    )
  ) %>%
  plot_metrics(starts_with("cg_adoption"))

Remove all comments from the selected columns

Description

Usage

labs_clear(data, cols, labels = NULL)

Arguments

data

A tibble.

cols

Tidyselect columns.

labels

The attributes to remove. NULL to remove all attributes except levels and class.

Value

A tibble with comments removed.

Examples

library(volker)
volker::chatgpt |>
  labs_clear()

Add missing residual labels in numeric columns that have at least one labeled value

Description

Add missing residual labels in numeric columns that have at least one labeled value

Usage

labs_impute(data)

Arguments

data

A tibble

Value

A tibble with added value labels

Replace item value names in a column by their labels

Description

Replace item value names in a column by their labels

Usage

labs_replace(
  data,
  col,
  codes,
  col_from = "value_name",
  col_to = "value_label",
  na.missing = FALSE
)

Arguments

data

A tibble.

col

The column holding item values.

codes

The codebook to use: A tibble with the columns value_name and value_label. Can be created by the codebook function, e.g. by calling codes <- codebook(data, myitemcolumn).

col_from

The tidyselect column with source values, defaults to value_name. If the column is not found in the codebook, the first column is used.

col_to

The tidyselect column with target values, defaults to value_label. If the column is not found in the codebook, the second column is used.

na.missing

By default, the column is converted to a factor with levels combined from the codebook and the data. Set na.missing to TRUE to set all levels not found in the codes to NA.

Value

Tibble with new labels.

Restore labels from the codebook stored in the codebook attribute.

Description

Usage

labs_restore(data, cols = NULL)

Arguments

data

A data frame.

cols

A tidyselect column selection.

Details

You can store labels before mutate operations by calling labs_store.

Value

A data frame.

Examples

library(dplyr)
library(volker)

volker::chatgpt |>
  labs_store() |>
  mutate(sd_age = 2024 - sd_age) |>
  labs_restore() |>
  tab_metrics(sd_age)

Get the current codebook and store it in the codebook attribute.

Description

Usage

labs_store(data)

Arguments

data

A data frame.

Details

You can restore the labels after mutate operations by calling labs_restore.

Value

A data frame.

Examples

library(dplyr)
library(volker)

volker::chatgpt |>
  labs_store() |>
  mutate(sd_age = 2024 - sd_age) |>
  labs_restore() |>
  tab_metrics(sd_age)

Plot regression coefficients

Description

The regression output comes from stats::lm.

Usage

model_metrics_plot(
  data,
  col,
  categorical,
  metric,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The target column holding metric values.

categorical

A tidy column selection holding categorical variables.

metric

A tidy column selection holding metric variables.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker list object containing volker plots.

Examples

library(volker)
data <- volker::chatgpt

data |>
  filter(sd_gender != "diverse") |>
  model_metrics_plot(use_work, categorical = c(sd_gender, adopter), metric = sd_age)

Output a regression table with estimates and macro statistics for multiple categorical or metric independent variables

Description

The regression output comes from stats::lm.

Usage

model_metrics_tab(
  data,
  col,
  categorical,
  metric,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The target column holding metric values.

categorical

A tidy column selection holding categorical variables.

metric

A tidy column selection holding metric variables.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from effect_metrics.

Value

A volker list object containing volker tables with the requested statistics.

Examples

library(volker)
data <- volker::chatgpt

data |>
  filter(sd_gender != "diverse") |>
  model_metrics_tab(use_work, categorical = c(sd_gender, adopter), metric = sd_age)
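
Because the regression output comes from stats::lm, a plain base R call gives a rough idea of the kind of model behind the table. The sketch below is an assumption-laden illustration: it uses the built-in mtcars data and a hand-written formula, and it does not show how the package itself builds the formula or formats the output.

```r
# Built-in example data; cyl is treated as categorical, wt as metric
fit <- stats::lm(mpg ~ factor(cyl) + wt, data = mtcars)

# Estimates per term (dummy-coded categories plus the metric slope)
coef(summary(fit))

# Macro statistics of the model
summary(fit)$r.squared
summary(fit)$adj.r.squared
```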

Mutate function

Description

See dplyr::mutate for details.

Convert a named vector to a list

Description

Convert a named vector to a list

Usage

named.to.list(x)

Arguments

x

A named vector or a list.

Value

Lists are returned as is. Vectors are converted to lists with names as list names.
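
The documented behaviour can be mimicked in a few lines of base R. The helper name named_to_list_sketch is hypothetical and only illustrates the return value described above; it is not the package source.

```r
# Illustrative helper (hypothetical name, not the package source)
named_to_list_sketch <- function(x) {
  if (is.list(x)) x else as.list(x)
}

named_to_list_sketch(c(a = 1, b = 2))  # vector: names become list names
named_to_list_sketch(list(a = 1))      # list: returned as is
```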

Volker style PDF document format

Description

Based on the standard theme, with tweaked TeX headers. To use the format, in the header of your Markdown document, set output: volker::pdf_report.

Usage

pdf_report(...)

Arguments

...

Additional arguments passed to pdf_document.

Value

R Markdown output format.

Examples

## Not run: 
# Add `volker::pdf_report` to the output options of your Markdown document:
#
# ```
# ---
# title: "How to create reports?"
# output: volker::pdf_report
# ---
# ```

## End(Not run)

Output a frequency plot

Description

The type of frequency plot depends on the number of selected columns:

One categorical column: see plot_counts_one
Multiple categorical columns: see plot_counts_items

Cross tabulations:

One categorical column and one grouping column: see plot_counts_one_grouped
Multiple categorical columns and one grouping column: see plot_counts_items_grouped
Two categorical column selections: see plot_counts_items_grouped_items (not yet implemented)

By default, if you provide two column selections, the second selection is treated as categorical. Setting the metric-parameter to TRUE will call the appropriate functions for correlation analysis:

One categorical column and one metric column: see plot_counts_one_cor
Multiple categorical columns and one metric column: see plot_counts_items_cor
Multiple categorical columns and multiple metric columns: see plot_counts_items_cor_items (not yet implemented)

Parameters that may be passed to the count functions (see the respective function help):

ci: Add confidence intervals to proportions.
ordered: The values of the cross column can be nominal (0), ordered ascending (1), or ordered descending (-1). The colors are adjusted accordingly.
category: When you have multiple categories in a column, you can focus on one of the categories to simplify the plots. By default, if a column has only TRUE and FALSE values, the outputs focus on the TRUE category.
prop: For stacked bar charts, displaying row percentages instead of total percentages gives a direct visual comparison of groups.
limits: The scale limits are automatically guessed by the package functions (work in progress). Use the limits-parameter to manually fix any misleading graphs.
title: All plots usually get a title derived from the column attributes or column names. Set to FALSE to suppress the title or provide a title of your choice as a character value.
labels: Labels are extracted from the column attributes. Set to FALSE to output bare column names and values.
numbers: Set the numbers parameter to "n" (frequency), "p" (percentage) or c("n","p"). To prevent cluttering and overlaps, numbers are only plotted on bars larger than 5%.
width: When comparing groups by row or column percentages, by default, the bar or column width reflects the number of cases. You can disable this behavior by setting width to FALSE.

Usage

plot_counts(data, cols, cross = NULL, metric = FALSE, clean = TRUE, ...)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column. The column name without quotes.

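metric

If FALSE (default), the cross column is treated as categorical. Set to TRUE to treat it as metric and call the functions for correlation analysis.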

clean

Prepare data by data_clean.

...

Other parameters passed to the appropriate plot function.

Value

A ggplot2 plot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts(data, sd_gender)
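
The rule that numbers are only plotted on bars larger than 5% can be pictured with a small base R sketch. The data (mtcars) and the label format are assumptions for illustration; whether the package treats exactly 5% as inside or outside the threshold is not stated here.

```r
# Frequencies and shares of a categorical variable
counts <- table(mtcars$carb)
shares <- as.numeric(prop.table(counts))

# Only bars with a share above 5% get a label, the rest stay blank
bar_labels <- ifelse(
  shares > 0.05,
  paste0(as.integer(counts), " (", round(shares * 100), "%)"),
  ""
)

setNames(bar_labels, names(counts))
```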

Output frequencies for multiple variables

Description

Output frequencies for multiple variables

Usage

plot_counts_items(
  data,
  cols,
  category = NULL,
  ordered = NULL,
  ci = FALSE,
  limits = NULL,
  numbers = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

category

Set to FALSE to plot all categories. A character value focuses on the selected category. When NULL (default), only the TRUE category is plotted for boolean values.

ordered

Values can be nominal (0), ordered ascending (1), or ordered descending (-1). By default (NULL), the ordering is automatically detected. An appropriate color scale should be chosen depending on the ordering. For unordered values, colors from VLKR_FILLDISCRETE are used. For ordered values, shades of the VLKR_FILLGRADIENT option are used.

ci

Whether to plot error bars for 95% confidence intervals.

limits

The scale limits, autoscaled by default. Set to c(0,100) to make a 100% plot.

numbers

The values to print on the bars: "n" (frequency), "p" (percentage) or both.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts_items(data, starts_with("cg_adoption_"))

Plot percent shares of multiple items compared by a metric variable split into groups

Description

Plot percent shares of multiple items compared by a metric variable split into groups

Usage

plot_counts_items_cor(
  data,
  cols,
  cross,
  category = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

A metric column that will be split into groups at the median.

category

Summarizing multiple items (the cols parameter) by group requires a focus category. By default, for logical column types, only TRUE values are counted. For other column types, the first category is counted. To override the default behavior, provide a vector of values in the dataset or labels from the codebook.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts_items_cor(
  data, starts_with("cg_adoption_"), sd_age,
  category=c("agree","strongly agree")
)

plot_counts_items_cor(
  data, starts_with("cg_adoption_"), sd_age,
  category=c(4,5)
)
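
Splitting a metric column into groups at the median can be sketched as follows. The group names "low" and "high" and the handling of values equal to the median are assumptions; the package may label the groups and resolve ties differently.

```r
# Hypothetical metric values, e.g. ages
x <- c(18, 22, 25, 31, 40, 57)

med <- median(x)

# Values above the median go to "high", all others to "low"
groups <- ifelse(x > med, "high", "low")

table(groups)
```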

Correlation of categorical items with metric items

Description

Not yet implemented. The future will come.

Usage

plot_counts_items_cor_items(
  data,
  cols,
  cross,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Plot percent shares of multiple items compared by groups

Description

Plot percent shares of multiple items compared by groups

Usage

plot_counts_items_grouped(
  data,
  cols,
  cross,
  category = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column holding groups to compare.

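category

Summarizing multiple items (the cols parameter) by group requires a focus category. By default, for logical column types, only TRUE values are counted. For other column types, the first category is counted. To override the default behavior, provide a vector of values in the dataset or labels from the codebook.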

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt
plot_counts_items_grouped(
  data, starts_with("cg_adoption_"), adopter,
  category=c("agree","strongly agree")
)

plot_counts_items_grouped(
  data, starts_with("cg_adoption_"), adopter,
  category=c(4,5)
)

Correlation of categorical items with categorical items

Description

Not yet implemented. The future will come.

Usage

plot_counts_items_grouped_items(
  data,
  cols,
  cross,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Plot the frequency of values in one column

Description

Plot the frequency of values in one column

Usage

plot_counts_one(
  data,
  col,
  category = NULL,
  ci = FALSE,
  limits = NULL,
  numbers = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding values to count.

category

Set to FALSE to plot all categories. A character value focuses on the selected category. When NULL (default), only the TRUE category is plotted for boolean values.

ci

Whether to plot error bars for 95% confidence intervals.

limits

The scale limits, autoscaled by default. Set to c(0,100) to make a 100% plot. If the data is binary or focused on a single category, by default a 100% plot is created.

numbers

The values to print on the bars: "n" (frequency), "p" (percentage) or both.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts_one(data, sd_gender)
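
A 95% confidence interval for a proportion, as drawn by the error bars, can be approximated in base R. This sketch assumes the built-in mtcars data and uses stats::prop.test; the interval method used by the package may differ.

```r
# Share of one focus category (here: am == 1) among all cases
n_total <- nrow(mtcars)
n_focus <- sum(mtcars$am == 1)
share   <- n_focus / n_total

# 95% confidence interval of the share
ci <- prop.test(n_focus, n_total, conf.level = 0.95)$conf.int

round(100 * c(share = share, lower = ci[1], upper = ci[2]), 1)
```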

Plot frequencies cross tabulated with a metric column that will be split into groups

Description

Plot frequencies cross tabulated with a metric column that will be split into groups

Usage

plot_counts_one_cor(
  data,
  col,
  cross,
  category = NULL,
  prop = "total",
  limits = NULL,
  ordered = NULL,
  numbers = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding factor values.

cross

A metric column that will be split into groups at the median.

category

Set to FALSE to plot all categories. A character value focuses on the selected category. When NULL (default), only the TRUE category is plotted for boolean values.

prop

The basis of percent calculation: "total" (the default), "rows" or "cols". Plotting row or column percentages results in stacked bars that add up to 100%. Whether you set rows or cols determines which variable is in the legend (fill color) and which on the vertical scale.

limits

The scale limits, autoscaled by default. Set to c(0,100) to make a 100% plot.

ordered

The values of the cross column can be nominal (0), ordered ascending (1), or descending (-1). By default (NULL), the ordering is automatically detected. An appropriate color scale should be chosen depending on the ordering. For unordered values, colors from VLKR_FILLDISCRETE are used. For ordered values, shades of the VLKR_FILLGRADIENT option are used.

numbers

The numbers to print on the bars: "n" (frequency), "p" (percentage) or both.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts_one_cor(data, adopter, sd_age)
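
The three bases of percent calculation named by the prop parameter correspond to the margins of a cross table. A base R sketch with the built-in mtcars data (an assumption for illustration) shows the difference:

```r
# Cross table of two categorical variables
tab <- table(cyl = mtcars$cyl, am = mtcars$am)

# "total": all cells together add up to 100%
round(100 * prop.table(tab), 1)

# "rows": each row adds up to 100%
round(100 * prop.table(tab, margin = 1), 1)

# "cols": each column adds up to 100%
round(100 * prop.table(tab, margin = 2), 1)
```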

Plot frequencies cross tabulated with a grouping column

Description

Plot frequencies cross tabulated with a grouping column

Usage

plot_counts_one_grouped(
  data,
  col,
  cross,
  category = NULL,
  prop = "total",
  width = NULL,
  limits = NULL,
  ordered = NULL,
  numbers = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding factor values.

cross

The column holding groups to split.

category

Set to FALSE to plot all categories. A character value focuses on the selected category. When NULL (default), only the TRUE category is plotted for boolean values.

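prop

The basis of percent calculation: "total" (the default), "rows" or "cols". Plotting row or column percentages results in stacked bars that add up to 100%. Whether you set rows or cols determines which variable is in the legend (fill color) and which on the vertical scale.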

width

By default, when setting the prop parameter to "rows" or "cols", the bar or column width reflects the number of cases. You can disable this behavior by setting width to FALSE.

limits

The scale limits, autoscaled by default. Set to c(0,100) to make a 100% plot.

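ordered

The values of the cross column can be nominal (0), ordered ascending (1), or ordered descending (-1). By default (NULL), the ordering is automatically detected. An appropriate color scale should be chosen depending on the ordering. For unordered values, colors from VLKR_FILLDISCRETE are used. For ordered values, shades of the VLKR_FILLGRADIENT option are used.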

numbers

The numbers to print on the bars: "n" (frequency), "p" (percentage) or both.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_counts.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_counts_one_grouped(data, adopter, sd_gender)

Output a plot with distribution parameters such as the mean values

Description

The plot type depends on the number of selected columns:

One metric column: see plot_metrics_one
Multiple metric columns: see plot_metrics_items

Group comparisons:

One metric column and one grouping column: see plot_metrics_one_grouped
Multiple metric columns and one grouping column: see plot_metrics_items_grouped
Multiple metric columns and multiple grouping columns: see plot_metrics_items_grouped_items (not yet implemented)

By default, if you provide two column selections, the second selection is treated as categorical. Setting the metric-parameter to TRUE will call the appropriate functions for correlation analysis:

Two metric columns: see plot_metrics_one_cor
Multiple metric columns and one metric column: see plot_metrics_items_cor
Two metric column selections: see plot_metrics_items_cor_items

Parameters that may be passed to the metric functions (see the respective function help):

ci: Plot confidence intervals for means or correlation coefficients.
box: Visualise the distribution by adding boxplots.
log: In scatter plots, you can use a logarithmic scale. Be aware that zero values will be omitted because their log value is undefined.
method: By default, correlations are calculated using Pearson's R. You can choose Spearman's Rho with the method parameter.
limits: The scale limits are automatically guessed by the package functions (work in progress). Use the limits-parameter to manually fix any misleading graphs.
title: All plots usually get a title derived from the column attributes or column names. Set to FALSE to suppress the title or provide a title of your choice as a character value.
labels: Labels are extracted from the column attributes. Set to FALSE to output bare column names and values.
numbers: Controls whether to display correlation coefficients on the plot.

Usage

plot_metrics(data, cols, cross = NULL, metric = FALSE, clean = TRUE, ...)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column (without quotes).

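metric

If FALSE (default), the cross column is treated as categorical. Set to TRUE to treat it as metric and call the functions for correlation analysis.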

clean

Prepare data by data_clean.

...

Other parameters passed to the appropriate plot function.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_metrics(data, sd_age)

Output averages for multiple variables

Description

Output averages for multiple variables

Usage

plot_metrics_items(
  data,
  cols,
  ci = FALSE,
  box = FALSE,
  limits = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

ci

Whether to plot the 95% confidence interval of the mean.

box

Whether to add boxplots.

limits

The scale limits. Set NULL to extract limits from the labels. NOT IMPLEMENTED YET.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_metrics_items(data, starts_with("cg_adoption_"))
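
The 95% confidence interval of a mean, as controlled by the ci parameter, can be computed in base R. The sketch assumes the built-in mtcars data and a t-based interval from stats::t.test; the package's own calculation is not shown here and may differ.

```r
x <- mtcars$mpg

# t-based 95% confidence interval of the mean
ci <- t.test(x, conf.level = 0.95)$conf.int

c(mean = mean(x), lower = ci[1], upper = ci[2])
```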

Multiple items correlated with one metric variable

Description

Multiple items correlated with one metric variable

Usage

plot_metrics_items_cor(
  data,
  cols,
  cross,
  ci = FALSE,
  method = "pearson",
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column to correlate.

ci

Whether to plot confidence intervals of the correlation coefficient.

method

The method of correlation calculation, pearson = Pearson's R, spearman = Spearman's rho.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_metrics_items_cor(data, starts_with("use_"), sd_age)
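
The two correlation methods and the confidence interval of a coefficient can be tried out with base R. The variables from mtcars are an assumption for illustration only.

```r
x <- mtcars$wt
y <- mtcars$mpg

# Pearson's R: linear association
cor(x, y, method = "pearson")

# Spearman's rho: rank-based, only assumes a monotonic association
cor(x, y, method = "spearman")

# 95% confidence interval of Pearson's R
cor.test(x, y, method = "pearson")$conf.int
```

Spearman's rho is unaffected by monotonic transformations, which is why it can be the better choice for skewed or ordinal items.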

Heatmap for correlations between multiple items

Description

Heatmap for correlations between multiple items

Usage

plot_metrics_items_cor_items(
  data,
  cols,
  cross,
  method = "pearson",
  numbers = FALSE,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables to correlate (e.g. starts_with...).

method

The method of correlation calculation, pearson = Pearson's R, spearman = Spearman's rho.

numbers

Controls whether to display correlation coefficients on the plot.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_metrics_items_cor_items(data, starts_with("cg_adoption_adv"), starts_with("use_"))
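
A heatmap of this kind is built on a matrix of pairwise correlations between two sets of items. The following base R sketch uses columns of mtcars as stand-ins for the two item sets (an assumption) and reshapes the matrix into one row per tile.

```r
# Two hypothetical item sets
items_a <- mtcars[, c("mpg", "qsec")]
items_b <- mtcars[, c("wt", "hp", "disp")]

# Correlation of every item in the first set with every item in the second
cor_mat <- cor(items_a, items_b, method = "pearson")

# Long format: one row per heatmap tile
cor_long <- as.data.frame(as.table(cor_mat))
names(cor_long) <- c("item", "target", "r")

cor_long
```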

Output averages for multiple variables compared by a grouping variable

Description

Output averages for multiple variables compared by a grouping variable

Usage

plot_metrics_items_grouped(
  data,
  cols,
  cross,
  limits = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column holding groups to compare.

limits

The scale limits. Set NULL to extract limits from the labels.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_metrics_items_grouped(data, starts_with("cg_adoption_"), sd_gender)

Correlation of metric items with categorical items

Description

Not yet implemented. The future will come.

Usage

plot_metrics_items_grouped_items(
  data,
  cols,
  cross,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Output a density plot for a single metric variable

Description

Output a density plot for a single metric variable

Usage

plot_metrics_one(
  data,
  col,
  ci = FALSE,
  box = FALSE,
  limits = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding metric values.

ci

Whether to plot the confidence interval.

box

Whether to add a boxplot.

limits

The scale limits. Set NULL to extract limits from the label.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_metrics_one(data, sd_age)

Correlate two items

Description

Correlate two items

Usage

plot_metrics_one_cor(
  data,
  col,
  cross,
  limits = NULL,
  log = FALSE,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The first column holding metric values.

cross

The second column holding metric values.

limits

The scale limits, a list with x and y components, e.g. list(x=c(0,100), y=c(20,100)). Set NULL to extract limits from the labels.

log

Whether to plot log scales.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_metrics_one_cor(data, use_private, sd_age)
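
Why zero values drop out of a plot with log scales can be seen in a short base R sketch. The vectors are made-up data, and filtering before the transformation is an assumption about how one might handle it by hand, not a description of the package internals.

```r
# Hypothetical values, each variable contains one zero
x <- c(0, 1, 10, 100)
y <- c(5, 0, 20, 40)

log10(0)  # -Inf, cannot be placed on a log axis

# Keep only cases where both values are positive
keep <- x > 0 & y > 0

data.frame(x = log10(x[keep]), y = log10(y[keep]))
```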

Output averages for multiple variables

Description

Output averages for multiple variables

Usage

plot_metrics_one_grouped(
  data,
  col,
  cross,
  ci = FALSE,
  box = FALSE,
  limits = NULL,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding metric values.

cross

The column holding groups to compare.

ci

Whether to add error bars with 95% confidence intervals.

box

Whether to add boxplots.

limits

The scale limits. Set NULL to extract limits from the labels.

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from plot_metrics.

Value

A ggplot object.

Examples

library(volker)
data <- volker::chatgpt

plot_metrics_one_grouped(data, sd_age, sd_gender)

Prepare the scale attribute values

Description

Prepare the scale attribute values

Usage

prepare_scale(data)

Arguments

data

A tibble with a scale attribute.

Value

A named list or NULL.

Printing method for volker lists

Description

Printing method for volker lists

Usage

## S3 method for class 'vlkr_list'
print(x, ...)

Arguments

x

The volker list.

...

Further parameters passed to print.

Value

No return value.

Examples

library(volker)
data <- volker::chatgpt

rp <- report_metrics(data, sd_age, sd_gender, effect = TRUE)
print(rp)

Printing method for volker plots

Description

Printing method for volker plots

Usage

## S3 method for class 'vlkr_plt'
print(x, ...)

## S3 method for class 'vlkr_plt'
plot(x, ...)

Arguments

x

The volker plot.

...

Further parameters passed to print().

Value

No return value.

Examples

library(volker)
data <- volker::chatgpt

pl <- plot_metrics(data, sd_age)
print(pl)

Printing method for volker reports

Description

Printing method for volker reports

Usage

## S3 method for class 'vlkr_rprt'
print(x, ...)

Arguments

x

The volker report object.

...

Further parameters passed to print.

Value

No return value.

Examples

library(volker)
data <- volker::chatgpt

rp <- report_metrics(data, sd_age)
print(rp)

Printing method for volker tables.

Description

Printing method for volker tables.

Usage

## S3 method for class 'vlkr_tbl'
print(x, ...)

Arguments

x

The volker table.

...

Further parameters passed to print().

Value

No return value.

Examples

library(volker)
data <- volker::chatgpt

tb <- tab_metrics(data, sd_age)
print(tb)

Create table and plot for categorical variables

Description

Depending on your column selection, different types of plots and tables are generated. See plot_counts and tab_counts.

Usage

report_counts(
  data,
  cols,
  cross = NULL,
  metric = FALSE,
  index = FALSE,
  effect = FALSE,
  numbers = NULL,
  title = TRUE,
  close = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column (without quotes).

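metric

If FALSE (default), the cross column is treated as categorical. Set to TRUE to treat it as metric and call the functions for correlation analysis.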

index

When the cols contain items on a metric scale (as determined by get_direction), an index will be calculated using the 'psych' package. Set to FALSE to suppress index generation.

effect

Whether to report statistical tests and effect sizes. See effect_counts for further parameters.

numbers

The numbers to print on the bars: "n" (frequency), "p" (percentage) or both. Set to NULL to remove numbers.

title

A character providing the heading or TRUE (default) to output a heading. Classes for tabset pills will be added.

close

Whether to close the last tab (default value TRUE) or to keep it open. Keep it open to add further custom tabs by adding headers on the fifth level in Markdown (e.g. ##### Method).

clean

Prepare data by data_clean.

...

Parameters passed to the plot_counts, tab_counts and effect_counts functions.

Details

For item batteries, an index is calculated and reported. When used in combination with the Markdown-template "html_report", the different parts of the report are grouped under a tabsheet selector.

Value

A volker report object.

Examples

library(volker)
data <- volker::chatgpt

report_counts(data, sd_gender)

Create table and plot for metric variables

Description

Depending on your column selection, different types of plots and tables are generated. See plot_metrics and tab_metrics.

Usage

report_metrics(
  data,
  cols,
  cross = NULL,
  metric = FALSE,
  ...,
  index = FALSE,
  factors = FALSE,
  clusters = FALSE,
  effect = FALSE,
  title = TRUE,
  close = TRUE,
  clean = TRUE
)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping or correlation column (without quotes).

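metric

If FALSE (default), the cross column is treated as categorical. Set to TRUE to treat it as metric and call the functions for correlation analysis.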

...

Parameters passed to the plot_metrics, tab_metrics and effect_metrics functions.

index

When the cols contain items on a metric scale (as determined by get_direction), an index will be calculated using the 'psych' package. Set to FALSE to suppress index generation.

factors

The number of factors to calculate. Set to FALSE to suppress factor analysis. Set to TRUE to output a scree plot and automatically choose the number of factors. When the cols contain items on a metric scale (as determined by get_direction), factors will be calculated using the 'psych' package. See add_factors.

clusters

The number of clusters to calculate. Clusters are determined using kmeans after scaling the items. Set to FALSE to suppress cluster analysis. Set to TRUE to output a scree plot and automatically choose the number of clusters based on the elbow criterion. See add_clusters.

effect

Whether to report statistical tests and effect sizes. See effect_metrics for further parameters.

title

A character providing the heading or TRUE (default) to output a heading. Classes for tabset pills will be added.

close

Whether to close the last tab (default value TRUE) or to keep it open. Keep it open to add further custom tabs by adding headers on the fifth level in Markdown (e.g. ##### Method).

clean

Prepare data by data_clean.

Details

For item batteries, an index is calculated and reported. When used in combination with the Markdown-template "html_report", the different parts of the report are grouped under a tabsheet selector.

Value

A volker report object.

Examples

library(volker)
data <- volker::chatgpt

report_metrics(data, sd_age)
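
The idea behind the clusters parameter (kmeans on scaled items, with a scree plot for the elbow criterion) can be sketched in base R. The data (mtcars), the range of cluster numbers and the nstart value are assumptions; the sketch does not reproduce how add_clusters picks the elbow.

```r
set.seed(1)

# Scale the items so that each has mean 0 and standard deviation 1
items <- scale(mtcars[, c("mpg", "hp", "wt")])

# Total within-cluster sum of squares for 1 to 6 clusters
wss <- sapply(1:6, function(k) {
  kmeans(items, centers = k, nstart = 10)$tot.withinss
})

# Plotted against k, these values form the scree plot; the elbow is
# where adding another cluster no longer reduces the value much
round(wss, 1)
```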

Select function

Description

See dplyr::select for details.

A skimmer for boxplot generation

Description

Returns a five point summary, mean and sd, items count and alpha for scales added by add_index(). Additionally, the whiskers are calculated as the minimum and maximum values within 1.5 * IQR, respectively. Outliers are returned in a list column.

Usage

skim_boxplot(data, ..., .data_name = NULL)
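
To make the whisker and outlier definition concrete, here is a small base R sketch on a made-up vector. It assumes the common boxplot convention (fences at 1.5 * IQR beyond the first and third quartile, quartiles from quantile() with its default type); the package may compute quartiles differently.

```r
# Sketch only: whiskers and outliers under the usual 1.5 * IQR rule.
x <- c(2, 3, 4, 5, 6, 7, 8, 30)

q <- quantile(x, c(0.25, 0.5, 0.75), names = FALSE)
iqr <- q[3] - q[1]

# Fences mark the range in which values are not treated as outliers
lower_fence <- q[1] - 1.5 * iqr
upper_fence <- q[3] + 1.5 * iqr

# Whiskers: the most extreme observed values that still lie within the fences
whisker_lo <- min(x[x >= lower_fence])
whisker_hi <- max(x[x <= upper_fence])

# Everything beyond the fences is an outlier
outliers <- x[x < lower_fence | x > upper_fence]
```

With this vector the value 30 is the only outlier, and the whiskers end at 2 and 8.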

Calculate a metric by groups

Description

Calculate a metric by groups

Usage

skim_grouped(data, cols, cross, value = "numeric.mean", labels = TRUE)

Arguments

data

A tibble.

cols

The item columns that hold the values to summarize.

cross

The column holding groups to compare.

value

The metric to extract from the skim result, e.g. numeric.mean or numeric.sd.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

Value

A tibble with each item in a row, a total column and columns for all groups.

A reduced skimmer for metric variables

Description

A reduced skimmer for metric variables. Returns a five point summary, mean and sd, items count and alpha for scales added by add_index().

Usage

skim_metrics(data, ..., .data_name = NULL)

Value

A skimmer, see skim_with.

Examples

library(volker)
data <- volker::chatgpt

skim_metrics(data)

Select variables by their prefix

Description

See tidyselect::starts_with for details.

Output a frequency table

Description

The type of frequency table depends on the number of selected columns:

One categorical column: see tab_counts_one
Multiple categorical columns: see tab_counts_items

Cross tabulations:

One categorical column and one grouping column: see tab_counts_one_grouped
Multiple categorical columns and one grouping column: see tab_counts_items_grouped
Multiple categorical columns and multiple grouping columns: see tab_counts_items_grouped_items (not yet implemented)

By default, if you provide two column selections, the second column is treated as categorical. Setting the metric parameter to TRUE will call the appropriate functions for correlation analysis:

One categorical column and one metric column: see tab_counts_one_cor
Multiple categorical columns and one metric column: see tab_counts_items_cor
Multiple categorical columns and multiple metric columns: see tab_counts_items_cor_items (not yet implemented)

Parameters that may be passed to specific count functions:

ci: Add confidence intervals to proportions.
percent: Frequency tables show percentages by default. Set to FALSE to get raw proportions.
prop: For cross tables you can choose between total, row or column percentages.
values: The values to output: n (frequency) or p (percentage) or both (the default).
category: When you have multiple categories in a column, you can focus on one of the categories to simplify the plots. By default, if a column has only TRUE and FALSE values, the outputs focus on the TRUE category.
labels: Labels are extracted from the column attributes. Set to FALSE to output bare column names and values.

Usage

tab_counts(data, cols, cross = NULL, metric = FALSE, clean = TRUE, ...)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column. The column name without quotes.

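metric

Whether to treat the cross column as metric. By default (FALSE), the cross column is treated as categorical. Set to TRUE to call the functions for correlation analysis.
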
clean

Prepare data by data_clean.

...

Other parameters passed to the appropriate table function.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_counts(data, sd_gender)
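
The parameters listed in the description are passed through tab_counts to the specific table function. The calls below are a sketch of how that might look, using only parameters and columns named in this manual; the tables list and the requireNamespace guard are additions for illustration, so that the block does nothing when volker is not installed.

```r
# Sketch only: passing documented parameters through tab_counts().
tables <- list()

if (requireNamespace("volker", quietly = TRUE)) {
  library(volker)
  data <- volker::chatgpt

  # One column, with confidence intervals for the proportions
  tables$one <- tab_counts(data, sd_gender, ci = TRUE)

  # Cross table with row percentages instead of total percentages
  tables$grouped <- tab_counts(data, adopter, sd_gender, prop = "rows")

  # Treat the second selection as metric, see tab_counts_one_cor
  tables$cor <- tab_counts(data, adopter, sd_age, metric = TRUE)
}
```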

Output frequencies for multiple variables

Description

Output frequencies for multiple variables

Usage

tab_counts_items(
  data,
  cols,
  ci = FALSE,
  percent = TRUE,
  values = c("n", "p"),
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

ci

Whether to compute 95% confidence intervals.

percent

Set to FALSE to prevent calculating percents from proportions.

values

The values to output: n (frequency) or p (percentage) or both (the default).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_counts_items(data, starts_with("cg_adoption_"))

Compare the values in multiple items by a metric column that will be split into groups

Description

Compare the values in multiple items by a metric column that will be split into groups

Usage

tab_counts_items_cor(
  data,
  cols,
  cross,
  category = NULL,
  split = NULL,
  percent = TRUE,
  values = c("n", "p"),
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

A metric column that will be split into groups at the median value.

category

Summarizing multiple items (the cols parameter) by group requires a focus category. By default, for logical column types, only TRUE values are counted. For other column types, the first category is counted. Accepts both character and numeric vectors to override default counting behavior.

split

Not implemented yet.

percent

Proportions are formatted as percent by default. Set to FALSE to get bare proportions.

values

The values to output: n (frequency) or p (percentage) or both (the default).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_counts_items_cor(
  data, starts_with("cg_adoption_"), sd_age,
  category=c("agree", "strongly agree")
)

Correlation of categorical items with metric items

Description

Not yet implemented. The future will come.

Usage

tab_counts_items_cor_items(
  data,
  cols,
  cross,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Compare the values in multiple items by a grouping column

Description

Compare the values in multiple items by a grouping column

Usage

tab_counts_items_grouped(
  data,
  cols,
  cross,
  category = NULL,
  percent = TRUE,
  values = c("n", "p"),
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

The column holding groups to compare.

category

Summarizing multiple items (the cols parameter) by group requires a focus category. By default, for logical column types, only TRUE values are counted. For other column types, the first category is counted. Accepts both character and numeric vectors to override default counting behavior.

percent

Proportions are formatted as percent by default. Set to FALSE to get bare proportions.

values

The values to output: n (frequency) or p (percentage) or both (the default).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt
tab_counts_items_grouped(
  data, starts_with("cg_adoption_"), adopter,
  category=c("agree", "strongly agree")
)
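
The category parameter reduces each item to a focus category before the groups are compared. The base R sketch below shows that idea on made-up data; the column names, the group vector and the counting approach are assumptions for illustration, not the package's code.

```r
# Sketch only: count answers in the focus categories per item and group.
items <- data.frame(
  item_1 = c("agree", "disagree", "strongly agree", "neutral"),
  item_2 = c("agree", "agree", "neutral", "disagree")
)
group <- c("a", "a", "b", "b")
focus <- c("agree", "strongly agree")

# Flag every answer that falls into one of the focus categories
hits <- as.data.frame(lapply(items, function(col) col %in% focus))

# Sum the flags per group: one row per group, one column per item
n_focus <- aggregate(hits, by = list(group = group), FUN = sum)
n_focus
```

Passing several values to category, as in the example above, treats them as one combined focus category.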

Correlation of categorical items with categorical items

Description

Not yet implemented. The future will come.

Usage

tab_counts_items_grouped_items(
  data,
  cols,
  cross,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Output a frequency table for the values in one column

Description

Output a frequency table for the values in one column

Usage

tab_counts_one(
  data,
  col,
  ci = FALSE,
  percent = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding values to count.

ci

Whether to compute 95% confidence intervals using stats::prop.test.

percent

Proportions are formatted as percent by default. Set to FALSE to get bare proportions.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_counts_one(data, sd_gender)
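
The ci parameter refers to stats::prop.test. As a rough sketch of what such intervals look like, the following base R code computes them for a made-up gender variable. It assumes the default settings of prop.test; whether the package uses the same settings is not stated here.

```r
# Sketch only: 95% confidence intervals for proportions via stats::prop.test.
gender <- c(rep("female", 42), rep("male", 55), rep("diverse", 3))
counts <- table(gender)
n <- sum(counts)

# One interval per category; each call returns a lower and an upper bound
ci <- t(sapply(as.integer(counts), function(k) prop.test(k, n)$conf.int))

result <- data.frame(
  value   = names(counts),
  n       = as.integer(counts),
  p       = as.integer(counts) / n,
  ci_low  = ci[, 1],
  ci_high = ci[, 2]
)
result
```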

Count values by a metric column that will be split into groups

Description

Count values by a metric column that will be split into groups

Usage

tab_counts_one_cor(
  data,
  col,
  cross,
  prop = "total",
  percent = TRUE,
  values = c("n", "p"),
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding factor values.

cross

The metric column that will be split into groups at the median.

prop

The basis of percent calculation: "total" (the default), "cols", or "rows".

percent

Proportions are formatted as percent by default. Set to FALSE to get bare proportions.

values

The values to output: n (frequency) or p (percentage) or both (the default).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_counts_one_cor(data, adopter, sd_age)
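
The cross column is split into two groups at the median before counting. A minimal base R sketch of such a median split follows; the data are made up, and assigning values equal to the median to the lower group is an assumption, since the boundary rule is not documented here.

```r
# Sketch only: split a metric column at the median, then cross tabulate.
age <- c(19, 23, 25, 31, 38, 44, 52, 60)
adopter <- c("early", "early", "late", "early", "late", "late", "early", "late")

med <- median(age)

# Assumption: values equal to the median go to the lower group
age_group <- ifelse(age <= med, "low", "high")

tab <- table(adopter, age_group)
tab
```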

Output frequencies cross tabulated with a grouping column

Description

Output frequencies cross tabulated with a grouping column

Usage

tab_counts_one_grouped(
  data,
  col,
  cross,
  prop = "total",
  percent = TRUE,
  values = c("n", "p"),
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding factor values.

cross

The column holding groups to split.

prop

The basis of percent calculation: "total" (the default), "cols", or "rows".

percent

Proportions are formatted as percent by default. Set to FALSE to get bare proportions.

values

The values to output: n (frequency) or p (percentage) or both (the default).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_counts.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_counts_one_grouped(data, adopter, sd_gender)
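
The prop parameter selects the basis of the percentages. The base R sketch below shows the three options on a made-up cross table using prop.table; which table dimension the package treats as rows and which as columns is an assumption here.

```r
# Sketch only: total, row and column proportions of a cross table.
adopter <- c("early", "early", "late", "late", "late", "early")
gender  <- c("f", "m", "f", "f", "m", "m")
tab <- table(adopter, gender)

p_total <- prop.table(tab)              # all cells sum to 1
p_rows  <- prop.table(tab, margin = 1)  # each row sums to 1
p_cols  <- prop.table(tab, margin = 2)  # each column sums to 1
```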

Output a table with distribution parameters

Description

The table type depends on the number of selected columns:

One metric column: see tab_metrics_one
Multiple metric columns: see tab_metrics_items

Group comparisons:

One metric column and one grouping column: see tab_metrics_one_grouped
Multiple metric columns and one grouping column: see tab_metrics_items_grouped
Multiple metric columns and multiple grouping columns: see tab_metrics_items_grouped_items (not yet implemented)

By default, if you provide two column selections, the second column is treated as categorical. Setting the metric-parameter to TRUE will call the appropriate functions for correlation analysis:

Two metric columns: see tab_metrics_one_cor
Multiple metric columns and one metric column: see tab_metrics_items_cor
Two metric column selections: see tab_metrics_items_cor_items

Parameters that may be passed to specific metric functions:

ci: Add confidence intervals for means or correlation coefficients.
values: The output metrics, mean (m), the standard deviation (sd) or both (the default).
digits: Tables containing means and standard deviations round values to one digit by default. Increase the number to show more digits.
method: By default, correlations are calculated using Pearson’s R. You can choose Spearman’s Rho with the method parameter.
labels: Labels are extracted from the column attributes. Set to FALSE to output bare column names and values.

Usage

tab_metrics(data, cols, cross = NULL, metric = FALSE, clean = TRUE, ...)

Arguments

data

A data frame.

cols

A tidy column selection, e.g. a single column (without quotes) or multiple columns selected by methods such as starts_with().

cross

Optional, a grouping column (without quotes).

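metric

Whether to treat the cross column as metric. By default (FALSE), the cross column is treated as categorical. Set to TRUE to call the functions for correlation analysis.
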
clean

Prepare data by data_clean.

...

Other parameters passed to the appropriate table function.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_metrics(data, sd_age)
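
As with tab_counts, additional parameters are forwarded to the specific table function. The calls below sketch this with parameters and columns that appear in this manual; the tables list and the requireNamespace guard are additions for illustration, so that the block does nothing when volker is not installed.

```r
# Sketch only: passing documented parameters through tab_metrics().
tables <- list()

if (requireNamespace("volker", quietly = TRUE)) {
  library(volker)
  data <- volker::chatgpt

  # Group comparison with confidence intervals, see tab_metrics_one_grouped
  tables$grouped <- tab_metrics(data, sd_age, sd_gender, ci = TRUE)

  # Item battery with two digits, see tab_metrics_items
  tables$items <- tab_metrics(data, starts_with("cg_adoption_"), digits = 2)

  # Two metric columns with Spearman's rho, see tab_metrics_one_cor
  tables$cor <- tab_metrics(
    data, use_private, sd_age,
    metric = TRUE, method = "spearman"
  )
}
```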

Output a five point summary table for multiple items

Description

Output a five point summary table for multiple items

Usage

tab_metrics_items(
  data,
  cols,
  ci = FALSE,
  digits = 1,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

cols

The columns holding metric values.

ci

Whether to compute confidence intervals of the mean.

digits

The number of digits to print.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_metrics_items(data, starts_with("cg_adoption_"))

Output a correlation table for item battery and one metric variable

Description

Output a correlation table for an item battery and one metric variable.

Usage

tab_metrics_items_cor(
  data,
  cols,
  cross,
  method = "pearson",
  digits = 2,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

cols

The source columns.

cross

The target columns or NULL to calculate correlations within the source columns.

method

The correlation method: pearson for Pearson's R or spearman for Spearman's rho.

digits

The number of digits to print.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_metrics_items_cor(
  data,
  starts_with("cg_adoption_adv"),
  sd_age,
  metric = TRUE
)

Output a correlation table for item battery and item battery

Description

Output a correlation table for two item batteries.

Usage

tab_metrics_items_cor_items(
  data,
  cols,
  cross,
  method = "pearson",
  digits = 2,
  ci = FALSE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

cols

The source columns.

cross

The target columns or NULL to calculate correlations within the source columns.

method

The correlation method: pearson for Pearson's R or spearman for Spearman's rho.

digits

The number of digits to print.

ci

Whether to calculate 95% confidence intervals of the correlation coefficient.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_metrics_items_cor_items(
  data,
  starts_with("cg_adoption_adv"),
  starts_with("use"),
  metric = TRUE
)

Output the means for groups in one or multiple columns

Description

Output the means for groups in one or multiple columns

Usage

tab_metrics_items_grouped(
  data,
  cols,
  cross,
  digits = 1,
  values = c("m", "sd"),
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

cols

The item columns that hold the values to summarize.

cross

The column holding groups to compare.

digits

The number of digits to print.

values

The output metrics, mean (m), the standard deviation (sd) or both (the default).

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_metrics_items_grouped(data, starts_with("cg_adoption_"), sd_gender)

Correlation of metric items with categorical items

Description

Not yet implemented. The future will come.

Usage

tab_metrics_items_grouped_items(
  data,
  cols,
  cross,
  title = TRUE,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble containing item measures.

cols

Tidyselect item variables (e.g. starts_with...).

cross

Tidyselect item variables (e.g. starts_with...).

title

If TRUE (default) shows a plot title derived from the column labels. Disable the title with FALSE or provide a custom title as character value.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Output a five point summary table for the values in one column

Description

Output a five point summary table for the values in one column

Usage

tab_metrics_one(
  data,
  col,
  ci = FALSE,
  digits = 1,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding metric values.

ci

Whether to calculate 95% confidence intervals of the mean.

digits

The number of digits to print.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_metrics_one(data, sd_age)
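
For the ci parameter, the following base R sketch shows one common way to obtain a 95% confidence interval of the mean, based on the t distribution. That the package uses exactly this method is an assumption; the data are made up.

```r
# Sketch only: t-based 95% confidence interval of the mean.
age <- c(19, 23, 25, 31, 38, 44, 52, 60)

m  <- mean(age)
se <- sd(age) / sqrt(length(age))

# Critical value of the t distribution with n - 1 degrees of freedom
t_crit <- qt(0.975, df = length(age) - 1)
ci <- m + c(-1, 1) * t_crit * se

# The same interval as reported by stats::t.test
ci_ttest <- as.numeric(t.test(age)$conf.int)
```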

Correlate two columns

Description

Correlate two columns

Usage

tab_metrics_one_cor(
  data,
  col,
  cross,
  method = "pearson",
  ci = FALSE,
  digits = 2,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The first column holding metric values.

cross

The second column holding metric values.

method

The correlation method: TRUE or pearson for Pearson's R, spearman for Spearman's rho.

ci

Whether to output confidence intervals.

digits

The number of digits to print.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_metrics_one_cor(data, use_private, sd_age)
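
To illustrate the method and ci parameters, here is a base R sketch with stats::cor.test on simulated data. It only shows the two correlation methods named above; it is not the package's implementation, and the variables x and y are made up.

```r
# Sketch only: Pearson and Spearman correlations with stats::cor.test.
set.seed(42)
x <- rnorm(50)
y <- 0.5 * x + rnorm(50)

pearson  <- cor.test(x, y, method = "pearson")
spearman <- cor.test(x, y, method = "spearman")

# Pearson's test reports a confidence interval for the coefficient
r    <- unname(pearson$estimate)
r_ci <- as.numeric(pearson$conf.int)

# Spearman's test reports rho, but cor.test gives no interval for it
rho <- unname(spearman$estimate)
```

Note that cor.test returns a confidence interval only for Pearson's coefficient.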

Output a five point summary for groups

Description

Output a five point summary for groups

Usage

tab_metrics_one_grouped(
  data,
  col,
  cross,
  ci = FALSE,
  digits = 1,
  labels = TRUE,
  clean = TRUE,
  ...
)

Arguments

data

A tibble.

col

The column holding metric values.

cross

The column holding groups to compare.

ci

Whether to output 95% confidence intervals.

digits

The number of digits to print.

labels

If TRUE (default) extracts labels from the attributes, see codebook.

clean

Prepare data by data_clean.

...

Placeholder to allow calling the method with unused parameters from tab_metrics.

Value

A volker tibble.

Examples

library(volker)
data <- volker::chatgpt

tab_metrics_one_grouped(data, sd_age, sd_gender)

Get, set, and modify the active ggplot theme

Description

See ggplot2::theme_set for details.

Define a default theme for volker plots

Description

Set ggplot colors, sizes and layout parameters.

Usage

theme_vlkr(
  base_size = 11,
  base_color = "black",
  base_fill = VLKR_FILLDISCRETE,
  base_gradient = VLKR_FILLGRADIENT
)

Arguments

base_size

Base font size.

base_color

Base font color.

base_fill

A list of fill color sets or at least one fill color set. Example: list(c("red"), c("red", "blue", "green")). Each set can contain different numbers of colors. Depending on the number of colors needed, the set with at least the number of required colors is used. The first color is always used for simple bar charts.

base_gradient

A color vector used for creating gradient fill colors, e.g. in stacked bar plots.

Value

A theme function.

Examples

library(volker)
library(ggplot2)
data <- volker::chatgpt

theme_set(theme_vlkr(base_size=15, base_fill = list("red")))
plot_counts(data, sd_gender)

Tidy tibbles

Description

See tibble::tibble for details.

Tidy lm results, replace categorical parameter names by their levels and add the reference level

Description

Tidy lm results, replace categorical parameter names by their levels and add the reference level

Usage

tidy_lm_levels(fit)

Arguments

fit

Result of a lm call.

Value

A tibble with regression parameters.

Author(s)

Created with the help of ChatGPT.

Tidy tribbles

Description

See tibble::tribble for details.

Remove trailing zeros and trailing or leading whitespace, colons, hyphens and underscores

Description

Remove trailing zeros and trailing or leading whitespace, colons, hyphens and underscores.

Usage

trim_label(x)

Arguments

x

A character value.

Value

The trimmed character value.

Remove a prefix from a character vector or a factor

Description

If the resulting character values would be empty, the prefix is returned. At the end, all items in the vector are trimmed using trim_label.

Usage

trim_prefix(x, prefix = TRUE)

Arguments

x

A character or factor vector.

prefix

The prefix. Set to TRUE to first extract the prefix.

Details

If x is a factor, the order of factor levels is retained.

Value

The trimmed character or factor vector.

Truncate labels

Description

Truncate labels that exceed a specified maximum length.

Usage

trunc_labels(x, max_length = 20)

Arguments

x

A character vector.

max_length

Maximum length, default is 20. The ellipsis "..." is appended to shortened labels.

Value

A character vector with truncated labels.
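
A minimal base R sketch of such a truncation is shown below. The helper name truncate_sketch is hypothetical, and it assumes that max_length counts the kept characters without the ellipsis; the package may count differently.

```r
# Sketch only: shorten long labels and append an ellipsis.
truncate_sketch <- function(x, max_length = 20) {
  too_long <- nchar(x) > max_length
  # Keep the first max_length characters and mark the cut with "..."
  x[too_long] <- paste0(substr(x[too_long], 1, max_length), "...")
  x
}

labels <- c("short", "a label that is clearly too long")
shortened <- truncate_sketch(labels, max_length = 10)
shortened
```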

Interpolate an alpha value based on case numbers

Description

Interpolate an alpha value based on case numbers

Usage

vlkr_alpha_interpolated(
  n,
  n_min = 20,
  n_max = 100,
  alpha_min = VLKR_POINT_ALPHA,
  alpha_max = 1
)

Arguments

n

Number of cases.

n_min

The case number where the minimum alpha value starts.

n_max

The case number where the maximum alpha value ends.

alpha_min

The minimum alpha value.

alpha_max

The maximum alpha value.

Value

A value between the minimum and the maximum alpha value.

Get colors for discrete scales

Description

If the option ggplot2.discrete.fill is set, gets color values from the first list item that has enough colors and reverses them to start filling from the left in grouped bar charts.

Usage

vlkr_colors_discrete(n)

Arguments

n

Number of colors.

Details

Falls back to scale_fill_hue().

Value

A vector of colors.
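
The selection rule in the description can be sketched in base R as follows. The helper pick_colors and the example palettes are hypothetical; the sketch returns the whole matching set in reversed order and returns NULL where the package would fall back to a default scale. Whether the package also cuts the set down to n colors is not stated here.

```r
# Sketch only: pick the first color set that has enough colors, then reverse it.
palettes <- list(
  c("red"),
  c("red", "blue", "green"),
  c("red", "blue", "green", "orange", "purple")
)

pick_colors <- function(n, sets) {
  # Keep only the sets that provide at least n colors
  enough <- Filter(function(s) length(s) >= n, sets)
  if (length(enough) == 0) {
    return(NULL)  # no set is large enough; a caller would use a fallback scale
  }
  rev(enough[[1]])
}

pick_colors(2, palettes)
```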

Get colors for polarized scales

Description

Creates a gradient scale based on VLKR_FILLPOLARIZED.

Usage

vlkr_colors_polarized(n = NULL)

Arguments

n

Number of colors or NULL to get the raw colors from the config.

Value

A vector of colors.

Get colors for sequential scales

Description

Creates a gradient scale based on VLKR_FILLGRADIENT.

Usage

vlkr_colors_sequential(n = NULL)

Arguments

n

Number of colors or NULL to get the raw colors from the config.

Value

A vector of colors.

Wrap a string

Description

Wrap a string

Usage

wrap_label(x, width = 40)

Arguments

x

A character vector.

width

The number of chars after which to break.

Value

A character vector with wrapped strings.
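
Label wrapping of this kind can be sketched with base R's strwrap. The helper name wrap_sketch is hypothetical, and the exact break positions of the package function may differ from those of strwrap.

```r
# Sketch only: break long strings into lines joined by newline characters.
wrap_sketch <- function(x, width = 40) {
  vapply(
    x,
    function(s) paste(strwrap(s, width = width), collapse = "\n"),
    character(1),
    USE.NAMES = FALSE
  )
}

wrapped <- wrap_sketch("aaaa bbbb cccc dddd", width = 10)
cat(wrapped)
```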

Combine two identically shaped data frames by adding values of each column from the second data frame into the corresponding column in the first data frame using parentheses

Description

Combine two identically shaped data frames by adding values of each column from the second data frame into the corresponding column in the first data frame using parentheses.

Usage

zip_tables(x, y, newline = TRUE, brackets = FALSE)

Arguments

x

The first data frame.

y

The second data frame.

newline

Whether to add a new line character between the values (default: TRUE).

brackets

Whether to set the secondary values in brackets (default: FALSE).

Value

A combined data frame.
