Type: | Package |
Title: | Extension to 'ggplot2' for Plotting Stats |
Version: | 0.10.0 |
Description: | Provides new statistics, new geometries and new positions for 'ggplot2' and a suite of functions to facilitate the creation of statistical plots. |
License: | GPL (≥ 3) |
URL: | https://larmarange.github.io/ggstats/, https://github.com/larmarange/ggstats |
BugReports: | https://github.com/larmarange/ggstats/issues |
Depends: | R (≥ 4.2) |
Imports: | cli, dplyr, forcats, ggplot2 (≥ 3.4.0), lifecycle, patchwork, purrr, rlang, scales, stats, stringr, utils, tidyr |
Suggests: | betareg, broom, broom.helpers (≥ 1.20.0), emmeans, glue, gtsummary, knitr, labelled (≥ 2.11.0), reshape, rmarkdown, nnet, parameters, pscl, testthat (≥ 3.0.0), spelling, survey, survival, vdiffr |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
Config/testthat/edition: | 3 |
Language: | en-US |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2025-07-02 15:59:33 UTC; josep |
Author: | Joseph Larmarange |
Maintainer: | Joseph Larmarange <joseph@larmarange.net> |
Repository: | CRAN |
Date/Publication: | 2025-07-02 22:40:05 UTC |
ggstats: Extension to 'ggplot2' for Plotting Stats
Description
Provides new statistics, new geometries and new positions for 'ggplot2' and a suite of functions to facilitate the creation of statistical plots.
Author(s)
Maintainer: Joseph Larmarange joseph@larmarange.net (ORCID)
See Also
Useful links:
Report bugs at https://github.com/larmarange/ggstats/issues
Augment a chi-squared test and compute phi coefficients
Description
Augment a chi-squared test and compute phi coefficients
Usage
augment_chisq_add_phi(x)
Arguments
x |
a chi-squared test as returned by |
Details
Phi coefficients measure the degree of association between two binary variables.
A value between -1.0 and -0.7 indicates a strong negative association.
A value between -0.7 and -0.3 indicates a weak negative association.
A value between -0.3 and +0.3 indicates little or no association.
A value between +0.3 and +0.7 indicates a weak positive association.
A value between +0.7 and +1.0 indicates a strong positive association.
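As a rough illustration of the quantity being interpreted, the sketch below computes a single phi coefficient for a 2 x 2 table with the textbook formula, using base R only. It is not taken from the package: how augment_chisq_add_phi() computes its values internally is not shown here, and the choice of the Sex by Survived table is only an example.

```r
# 2 x 2 contingency table: Sex by Survived (both binary)
tab <- xtabs(Freq ~ Sex + Survived, data = as.data.frame(Titanic))

# Textbook formula for a 2 x 2 table:
# phi = (n11 * n22 - n12 * n21) / sqrt(product of the four marginal totals)
num <- tab[1, 1] * tab[2, 2] - tab[1, 2] * tab[2, 1]
den <- sqrt(prod(as.numeric(rowSums(tab))) * prod(as.numeric(colSums(tab))))
phi <- num / den
phi

# Sanity check: phi^2 equals the uncorrected chi-squared statistic divided by n
chi2 <- chisq.test(tab, correct = FALSE)$statistic
all.equal(phi^2, unname(chi2) / sum(tab))
```

The sign of phi depends on the order of the factor levels, so only its magnitude should be read against the thresholds above when the level order is arbitrary.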
Value
A tibble.
See Also
stat_cross(), GDAtools::phi.table() or psych::phi()
Examples
tab <- xtabs(Freq ~ Sex + Class, data = as.data.frame(Titanic))
augment_chisq_add_phi(chisq.test(tab))
Connect bars / points
Description
geom_connector() is a variation of ggplot2::geom_step().
Its variant geom_bar_connector() is particularly adapted to connect bars.
Usage
geom_connector(
mapping = NULL,
data = NULL,
stat = "identity",
position = "identity",
width = 0.1,
continuous = FALSE,
na.rm = FALSE,
orientation = NA,
show.legend = NA,
inherit.aes = TRUE,
...
)
geom_bar_connector(
mapping = NULL,
data = NULL,
stat = "prop",
position = "stack",
width = 0.9,
continuous = FALSE,
add_baseline = TRUE,
na.rm = FALSE,
orientation = NA,
show.legend = NA,
inherit.aes = TRUE,
...
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
width |
Bar width (see examples). |
continuous |
Should connect segments be continuous? |
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
... |
Other arguments passed on to
|
add_baseline |
Add connectors at baseline? |
Examples
library(ggplot2)
# geom_bar_connector() -----------
ggplot(diamonds) +
aes(x = clarity, fill = cut) +
geom_bar(width = .5) +
geom_bar_connector(width = .5, linewidth = .25) +
theme_minimal() +
theme(legend.position = "bottom")
ggplot(diamonds) +
aes(x = clarity, fill = cut) +
geom_bar(width = .5) +
geom_bar_connector(
width = .5,
continuous = TRUE,
colour = "red",
linetype = "dotted",
add_baseline = FALSE,
) +
theme(legend.position = "bottom")
ggplot(diamonds) +
aes(x = clarity, fill = cut) +
geom_bar(width = .5, position = "fill") +
geom_bar_connector(width = .5, position = "fill") +
theme(legend.position = "bottom")
ggplot(diamonds) +
aes(x = clarity, fill = cut) +
geom_bar(width = .5, position = "diverging") +
geom_bar_connector(width = .5, position = "diverging", linewidth = .25) +
theme(legend.position = "bottom")
# geom_connector() -----------
ggplot(mtcars) +
aes(x = wt, y = mpg, colour = factor(cyl)) +
geom_connector() +
geom_point()
ggplot(mtcars) +
aes(x = wt, y = mpg, colour = factor(cyl)) +
geom_connector(continuous = TRUE) +
geom_point()
ggplot(mtcars) +
aes(x = wt, y = mpg, colour = factor(cyl)) +
geom_connector(continuous = TRUE, width = .3) +
geom_point()
ggplot(mtcars) +
aes(x = wt, y = mpg, colour = factor(cyl)) +
geom_connector(width = 0) +
geom_point()
ggplot(mtcars) +
aes(x = wt, y = mpg, colour = factor(cyl)) +
geom_connector(width = Inf) +
geom_point()
ggplot(mtcars) +
aes(x = wt, y = mpg, colour = factor(cyl)) +
geom_connector(width = Inf, continuous = TRUE) +
geom_point()
Geometries for diverging bar plots
Description
These geometries are variations of ggplot2::geom_bar() and ggplot2::geom_text() but provide a different set of default values.
Usage
geom_diverging(
mapping = NULL,
data = NULL,
position = "diverging",
...,
complete = "fill",
default_by = "total"
)
geom_likert(
mapping = NULL,
data = NULL,
position = "likert",
...,
complete = "fill",
default_by = "x"
)
geom_pyramid(
mapping = NULL,
data = NULL,
position = "diverging",
...,
complete = NULL,
default_by = "total"
)
geom_diverging_text(
mapping = ggplot2::aes(!!!auto_contrast),
data = NULL,
position = position_diverging(0.5),
...,
complete = "fill",
default_by = "total"
)
geom_likert_text(
mapping = ggplot2::aes(!!!auto_contrast),
data = NULL,
position = position_likert(0.5),
...,
complete = "fill",
default_by = "x"
)
geom_pyramid_text(
mapping = ggplot2::aes(!!!auto_contrast),
data = NULL,
position = position_diverging(0.5),
...,
complete = NULL,
default_by = "total"
)
Arguments
mapping |
Optional set of aesthetic mappings. |
data |
The data to be displayed in this layer. |
position |
A position adjustment to use on the data for this layer. |
... |
Other arguments passed on to |
complete |
An aesthetic for which unobserved values should be completed,
see |
default_by |
Name of an aesthetic determining denominators by default,
see |
Details
- geom_diverging() is designed for stacked diverging bar plots, using position_diverging().
- geom_likert() is designed for Likert-type items, using position_likert() (each bar sums to 100%).
- geom_pyramid() is similar to geom_diverging() but uses proportions of the total instead of counts.
To add labels on the bar plots, simply use geom_diverging_text(), geom_likert_text(), or geom_pyramid_text().
All these geometries rely on stat_prop().
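The difference between "each bar sums to 100%" and "proportions of the total" can be made concrete with base R. The sketch below is not package code: it reads default_by = "x" as "divide within each bar" and default_by = "total" as "divide by the grand total", which is an interpretation of the descriptions above, not a confirmed account of how stat_prop() works.

```r
d <- as.data.frame(Titanic)
tab <- xtabs(Freq ~ Class + Sex, data = d)

# Denominator = grand total (the idea assumed behind default_by = "total"):
# all cells together sum to 1
prop_total <- prop.table(tab)

# Denominator = total of each Class (the idea assumed behind default_by = "x"
# when Class defines the bars): each row sums to 1
prop_by_class <- prop.table(tab, margin = 1)

round(prop_total, 3)
round(prop_by_class, 3)
```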
Examples
library(ggplot2)
ggplot(diamonds) +
aes(x = clarity, fill = cut) +
geom_diverging()
ggplot(diamonds) +
aes(x = clarity, fill = cut) +
geom_diverging(position = position_diverging(cutoff = 4))
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_likert() +
geom_likert_text()
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_likert() +
geom_likert_text(
aes(
label = label_percent_abs(accuracy = 1, hide_below = .10)(
after_stat(prop)
),
colour = after_scale(hex_bw(.data$fill))
)
)
d <- Titanic |> as.data.frame()
ggplot(d) +
aes(y = Class, fill = Sex, weight = Freq) +
geom_diverging() +
geom_diverging_text()
ggplot(d) +
aes(y = Class, fill = Sex, weight = Freq) +
geom_pyramid() +
geom_pyramid_text()
Convenient geometries for proportion bar plots
Description
geom_prop_bar(), geom_prop_text() and geom_prop_connector() are
variations of ggplot2::geom_bar(), ggplot2::geom_text() and
geom_bar_connector() using stat_prop(), with custom default aesthetics:
after_stat(prop) for x or y, and
scales::percent(after_stat(prop)) for label.
Usage
geom_prop_bar(
mapping = NULL,
data = NULL,
position = "stack",
...,
width = 0.9,
complete = NULL,
default_by = "x"
)
geom_prop_text(
mapping = ggplot2::aes(!!!auto_contrast),
data = NULL,
position = ggplot2::position_stack(0.5),
...,
complete = NULL,
default_by = "x"
)
geom_prop_connector(
mapping = NULL,
data = NULL,
position = "stack",
...,
width = 0.9,
complete = "fill",
default_by = "x"
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Additional parameters passed to |
width |
Bar width ( |
complete |
Name (character) of an aesthetic for which statistics should be completed for unobserved values (see example). |
default_by |
If the by aesthetic is not available, name of another
aesthetic that will be used to determine the denominators (e.g. |
See Also
Examples
library(ggplot2)
d <- as.data.frame(Titanic)
ggplot(d) +
aes(x = Class, fill = Survived, weight = Freq) +
geom_prop_bar() +
geom_prop_text() +
geom_prop_connector()
ggplot(d) +
aes(y = Class, fill = Survived, weight = Freq) +
geom_prop_bar(width = .5) +
geom_prop_text() +
geom_prop_connector(width = .5, linetype = "dotted")
ggplot(d) +
aes(
x = Class,
fill = Survived,
weight = Freq,
y = after_stat(count),
label = after_stat(count)
) +
geom_prop_bar() +
geom_prop_text() +
geom_prop_connector()
Alternating Background Color
Description
Add alternating background color along the y-axis. The geom takes default
aesthetics odd and even that receive color codes.
Usage
geom_stripped_rows(
mapping = NULL,
data = NULL,
stat = "identity",
position = "identity",
...,
show.legend = NA,
inherit.aes = TRUE,
xfrom = -Inf,
xto = Inf,
width = 1,
nudge_y = 0
)
geom_stripped_cols(
mapping = NULL,
data = NULL,
stat = "identity",
position = "identity",
...,
show.legend = NA,
inherit.aes = TRUE,
yfrom = -Inf,
yto = Inf,
width = 1,
nudge_x = 0
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
The data to be displayed in this layer. There are three options: If A A |
stat |
The statistical transformation to use on the data for this layer.
When using a
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
xfrom , xto |
limitation of the strips along the x-axis |
width |
width of the strips |
yfrom , yto |
limitation of the strips along the y-axis |
nudge_x , nudge_y |
horizontal or vertical adjustment to nudge strips by |
Value
A ggplot2 plot with the added geometry.
Examples
data(tips, package = "reshape")
library(ggplot2)
p <- ggplot(tips) +
aes(x = time, y = day) +
geom_count() +
theme_light()
p
p + geom_stripped_rows()
p + geom_stripped_cols()
p + geom_stripped_rows() + geom_stripped_cols()
p <- ggplot(tips) +
aes(x = total_bill, y = day) +
geom_count() +
theme_light()
p
p + geom_stripped_rows()
p + geom_stripped_rows() + scale_y_discrete(expand = expansion(0, 0.5))
p + geom_stripped_rows(xfrom = 10, xto = 35)
p + geom_stripped_rows(odd = "blue", even = "yellow")
p + geom_stripped_rows(odd = "blue", even = "yellow", alpha = .1)
p + geom_stripped_rows(odd = "#00FF0022", even = "#FF000022")
p + geom_stripped_cols()
p + geom_stripped_cols(width = 10)
p + geom_stripped_cols(width = 10, nudge_x = 5)
Cascade plot
Description
Usage
ggcascade(
.data,
...,
.weights = NULL,
.by = NULL,
.nrow = NULL,
.ncol = NULL,
.add_n = TRUE,
.text_size = 4,
.arrows = TRUE
)
compute_cascade(.data, ..., .weights = NULL, .by = NULL)
plot_cascade(
.data,
.by = NULL,
.nrow = NULL,
.ncol = NULL,
.add_n = TRUE,
.text_size = 4,
.arrows = TRUE
)
Arguments
.data |
A data frame, or data frame extension (e.g. a tibble). For
|
... |
< |
.weights |
< |
.by |
< |
.nrow , .ncol |
Number of rows and columns, for faceted plots. |
.add_n |
Display the number of observations? |
.text_size |
Size of the labels, passed to |
.arrows |
Display arrows between statuses? |
Details
ggcascade() calls compute_cascade() to generate a data set passed
to plot_cascade(). Use compute_cascade() and plot_cascade() for
more control.
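The two-step workflow described above might look like the following sketch. It only uses the functions and arguments listed in Usage; the conditions and the chosen option values are illustrative.

```r
library(ggstats)

# Step 1: compute the cascade data (one condition per named argument)
d <- ggplot2::diamonds |>
  compute_cascade(
    all = TRUE,
    big = carat > .5,
    "big & ideal" = carat > .5 & cut == "Ideal"
  )
d # inspect (and, if needed, modify) the computed data before plotting

# Step 2: draw the plot from the computed data
p <- plot_cascade(d, .text_size = 3, .arrows = FALSE)
p
```

Splitting the call this way is mainly useful when the intermediate data needs to be checked or adjusted before it is drawn.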
Value
A ggplot2 plot or a tibble.
Examples
ggplot2::diamonds |>
ggcascade(
all = TRUE,
big = carat > .5,
"big & ideal" = carat > .5 & cut == "Ideal"
)
ggplot2::mpg |>
ggcascade(
all = TRUE,
recent = year > 2000,
"recent & economic" = year > 2000 & displ < 3,
.by = cyl,
.ncol = 3,
.arrows = FALSE,
.text_size = 3
)
ggplot2::mpg |>
ggcascade(
all = TRUE,
recent = year > 2000,
"recent & economic" = year > 2000 & displ < 3,
.by = pick(cyl, drv),
.add_n = FALSE,
.text_size = 2
)
Plot model coefficients
Description
ggcoef_model(), ggcoef_table(), ggcoef_dodged(),
ggcoef_faceted() and ggcoef_compare() use broom.helpers::tidy_plus_plus()
to obtain a tibble of the model coefficients,
apply additional data transformation and then pass the
produced tibble to ggcoef_plot() to generate the plot.
Usage
ggcoef_model(
model,
tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
tidy_args = NULL,
conf.int = TRUE,
conf.level = 0.95,
exponentiate = FALSE,
variable_labels = NULL,
term_labels = NULL,
interaction_sep = " * ",
categorical_terms_pattern = "{level}",
add_reference_rows = TRUE,
no_reference_row = NULL,
intercept = FALSE,
include = dplyr::everything(),
group_by = broom.helpers::auto_group_by(),
group_labels = NULL,
add_pairwise_contrasts = FALSE,
pairwise_variables = broom.helpers::all_categorical(),
keep_model_terms = FALSE,
pairwise_reverse = TRUE,
emmeans_args = list(),
significance = 1 - conf.level,
significance_labels = NULL,
show_p_values = TRUE,
signif_stars = TRUE,
return_data = FALSE,
...
)
ggcoef_table(
model,
tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
tidy_args = NULL,
conf.int = TRUE,
conf.level = 0.95,
exponentiate = FALSE,
variable_labels = NULL,
term_labels = NULL,
interaction_sep = " * ",
categorical_terms_pattern = "{level}",
add_reference_rows = TRUE,
no_reference_row = NULL,
intercept = FALSE,
include = dplyr::everything(),
group_by = broom.helpers::auto_group_by(),
group_labels = NULL,
add_pairwise_contrasts = FALSE,
pairwise_variables = broom.helpers::all_categorical(),
keep_model_terms = FALSE,
pairwise_reverse = TRUE,
emmeans_args = list(),
significance = 1 - conf.level,
significance_labels = NULL,
show_p_values = FALSE,
signif_stars = FALSE,
table_stat = c("estimate", "ci", "p.value"),
table_header = NULL,
table_text_size = 3,
table_stat_label = NULL,
ci_pattern = "{conf.low}, {conf.high}",
table_widths = c(3, 2),
table_witdhs = deprecated(),
...
)
ggcoef_dodged(
model,
tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
tidy_args = NULL,
conf.int = TRUE,
conf.level = 0.95,
exponentiate = FALSE,
variable_labels = NULL,
term_labels = NULL,
interaction_sep = " * ",
categorical_terms_pattern = "{level}",
add_reference_rows = TRUE,
no_reference_row = NULL,
intercept = FALSE,
include = dplyr::everything(),
group_by = broom.helpers::auto_group_by(),
group_labels = NULL,
significance = 1 - conf.level,
significance_labels = NULL,
return_data = FALSE,
...
)
ggcoef_faceted(
model,
tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
tidy_args = NULL,
conf.int = TRUE,
conf.level = 0.95,
exponentiate = FALSE,
variable_labels = NULL,
term_labels = NULL,
interaction_sep = " * ",
categorical_terms_pattern = "{level}",
add_reference_rows = TRUE,
no_reference_row = NULL,
intercept = FALSE,
include = dplyr::everything(),
group_by = broom.helpers::auto_group_by(),
group_labels = NULL,
significance = 1 - conf.level,
significance_labels = NULL,
return_data = FALSE,
...
)
ggcoef_compare(
models,
type = c("dodged", "faceted"),
tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
tidy_args = NULL,
conf.int = TRUE,
conf.level = 0.95,
exponentiate = FALSE,
variable_labels = NULL,
term_labels = NULL,
interaction_sep = " * ",
categorical_terms_pattern = "{level}",
add_reference_rows = TRUE,
no_reference_row = NULL,
intercept = FALSE,
include = dplyr::everything(),
add_pairwise_contrasts = FALSE,
pairwise_variables = broom.helpers::all_categorical(),
keep_model_terms = FALSE,
pairwise_reverse = TRUE,
emmeans_args = list(),
significance = 1 - conf.level,
significance_labels = NULL,
return_data = FALSE,
...
)
ggcoef_plot(
data,
x = "estimate",
y = "label",
exponentiate = FALSE,
y_labeller = NULL,
point_size = 2,
point_stroke = 2,
point_fill = "white",
colour = NULL,
colour_guide = TRUE,
colour_lab = "",
colour_labels = ggplot2::waiver(),
shape = "significance",
shape_values = c(16, 21),
shape_guide = TRUE,
shape_lab = "",
errorbar = TRUE,
errorbar_height = 0.1,
errorbar_coloured = FALSE,
stripped_rows = TRUE,
strips_odd = "#11111111",
strips_even = "#00000000",
vline = TRUE,
vline_colour = "grey50",
dodged = FALSE,
dodged_width = 0.8,
facet_row = "var_label",
facet_col = NULL,
facet_labeller = "label_value",
plot_title = NULL
)
Arguments
model |
a regression model object |
tidy_fun |
( |
tidy_args |
Additional arguments passed to
|
conf.int |
( |
conf.level |
the confidence level to use for the confidence
interval if |
exponentiate |
if |
variable_labels |
( |
term_labels |
( |
interaction_sep |
( |
categorical_terms_pattern |
( |
add_reference_rows |
( |
no_reference_row |
( |
intercept |
( |
include |
( |
group_by |
( |
group_labels |
( |
add_pairwise_contrasts |
( |
pairwise_variables |
( |
keep_model_terms |
( |
pairwise_reverse |
( |
emmeans_args |
( |
significance |
level (between 0 and 1) below which a
coefficient is considered to be significantly different from 0
(or 1 if |
significance_labels |
optional vector with custom labels for significance variable |
show_p_values |
if |
signif_stars |
if |
return_data |
if |
... |
parameters passed to |
table_stat |
statistics to display in the table, use any column name
returned by the tidier or |
table_header |
optional custom headers for the table |
table_text_size |
text size for the table |
table_stat_label |
optional named list of labeller functions for the displayed statistic (see examples) |
ci_pattern |
glue pattern for confidence intervals in the table |
table_widths |
relative widths of the forest plot and the coefficients table |
table_witdhs |
|
models |
named list of models |
type |
a dodged plot, a faceted plot or multiple table plots? |
data |
a data frame containing data to be plotted,
typically the output of |
x , y |
variables mapped to x and y axis |
y_labeller |
optional function to be applied on y labels (see examples) |
point_size |
size of the points |
point_stroke |
thickness of the points |
point_fill |
fill colour for the points |
colour |
optional variable name to be mapped to colour aesthetic |
colour_guide |
should colour guide be displayed in the legend? |
colour_lab |
label of the colour aesthetic in the legend |
colour_labels |
labels argument passed to
|
shape |
optional variable name to be mapped to the shape aesthetic |
shape_values |
values of the different shapes to use in
|
shape_guide |
should shape guide be displayed in the legend? |
shape_lab |
label of the shape aesthetic in the legend |
errorbar |
should error bars be plotted? |
errorbar_height |
height of error bars |
errorbar_coloured |
should error bars be colored as the points? |
stripped_rows |
should stripped rows be displayed in the background? |
strips_odd |
color of the odd rows |
strips_even |
color of the even rows |
vline |
should a vertical line be drawn at 0 (or 1 if
|
vline_colour |
colour of vertical line |
dodged |
should points be dodged (according to the colour aesthetic)? |
dodged_width |
width value for |
facet_row |
variable name to be used for row facets |
facet_col |
optional variable name to be used for column facets |
facet_labeller |
labeller function to be used for labeling facets;
if labels are too long, you can use |
plot_title |
an optional plot title |
Details
For more control, you can use the argument return_data = TRUE to
get the produced tibble, apply any transformation of your own and
then pass your customized tibble to ggcoef_plot().
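A minimal sketch of that workflow follows. The filtering step is purely illustrative, and the presence of a column named variable in the returned tibble is an assumption about its layout rather than something stated on this page.

```r
library(ggstats)

mod <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)

# Step 1: get the tibble of coefficients instead of the plot
d <- ggcoef_model(mod, return_data = TRUE)

# Step 2: any transformation of your own; here, as an illustration only,
# keep the rows whose `variable` column equals "Species"
# (the `variable` column name is an assumption about the tibble's layout)
d_species <- d[d$variable == "Species", ]

# Step 3: pass the customized tibble to ggcoef_plot()
p <- ggcoef_plot(d_species)
p
```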
Value
A ggplot2 plot or a tibble if return_data = TRUE.
Functions
- ggcoef_table(): a variation of ggcoef_model() adding a table with estimates, confidence intervals and p-values
- ggcoef_dodged(): a dodged variation of ggcoef_model() for multi-group models
- ggcoef_faceted(): a faceted variation of ggcoef_model() for multi-group models
- ggcoef_compare(): designed for displaying several models on the same plot
- ggcoef_plot(): plot a tidy tibble of coefficients
See Also
vignette("ggcoef_model")
Examples
mod <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)
ggcoef_model(mod)
ggcoef_table(mod)
ggcoef_table(mod, table_stat = c("estimate", "ci"))
ggcoef_table(
mod,
table_stat_label = list(
estimate = scales::label_number(.001)
)
)
ggcoef_table(mod, table_text_size = 5, table_widths = c(1, 1))
# a logistic regression example
d_titanic <- as.data.frame(Titanic)
d_titanic$Survived <- factor(d_titanic$Survived, c("No", "Yes"))
mod_titanic <- glm(
Survived ~ Sex * Age + Class,
weights = Freq,
data = d_titanic,
family = binomial
)
# use 'exponentiate = TRUE' to get the Odds Ratio
ggcoef_model(mod_titanic, exponentiate = TRUE)
ggcoef_table(mod_titanic, exponentiate = TRUE)
# display intercepts
ggcoef_model(mod_titanic, exponentiate = TRUE, intercept = TRUE)
# customize terms labels
ggcoef_model(
mod_titanic,
exponentiate = TRUE,
show_p_values = FALSE,
signif_stars = FALSE,
add_reference_rows = FALSE,
categorical_terms_pattern = "{level} (ref: {reference_level})",
interaction_sep = " x ",
y_labeller = scales::label_wrap(15)
)
# display only a subset of terms
ggcoef_model(mod_titanic, exponentiate = TRUE, include = c("Age", "Class"))
# do not change points' shape based on significance
ggcoef_model(mod_titanic, exponentiate = TRUE, significance = NULL)
# a black and white version
ggcoef_model(
mod_titanic,
exponentiate = TRUE,
colour = NULL, stripped_rows = FALSE
)
# show dichotomous terms on one row
ggcoef_model(
mod_titanic,
exponentiate = TRUE,
no_reference_row = broom.helpers::all_dichotomous(),
categorical_terms_pattern =
"{ifelse(dichotomous, paste0(level, ' / ', reference_level), level)}",
show_p_values = FALSE
)
data(tips, package = "reshape")
mod_simple <- lm(tip ~ day + time + total_bill, data = tips)
ggcoef_model(mod_simple)
# custom variable labels
# you can use the labelled package to define variable labels
# before computing model
if (requireNamespace("labelled")) {
tips_labelled <- tips |>
labelled::set_variable_labels(
day = "Day of the week",
time = "Lunch or Dinner",
total_bill = "Bill's total"
)
mod_labelled <- lm(tip ~ day + time + total_bill, data = tips_labelled)
ggcoef_model(mod_labelled)
}
# you can provide custom variable labels with 'variable_labels'
ggcoef_model(
mod_simple,
variable_labels = c(
day = "Week day",
time = "Time (lunch or dinner ?)",
total_bill = "Total of the bill"
)
)
# if labels are too long, you can use 'facet_labeller' to wrap them
ggcoef_model(
mod_simple,
variable_labels = c(
day = "Week day",
time = "Time (lunch or dinner ?)",
total_bill = "Total of the bill"
),
facet_labeller = ggplot2::label_wrap_gen(10)
)
# do not display variable facets but add colour guide
ggcoef_model(mod_simple, facet_row = NULL, colour_guide = TRUE)
# also works with polynomial terms
mod_poly <- lm(
tip ~ poly(total_bill, 3) + day,
data = tips,
)
ggcoef_model(mod_poly)
# or with different types of contrasts
# for sum contrasts, the value of the reference term is computed
if (requireNamespace("emmeans")) {
mod2 <- lm(
tip ~ day + time + sex,
data = tips,
contrasts = list(time = contr.sum, day = contr.treatment(4, base = 3))
)
ggcoef_model(mod2)
}
# multinomial model
mod <- nnet::multinom(grade ~ stage + trt + age, data = gtsummary::trial)
ggcoef_model(mod, exponentiate = TRUE)
ggcoef_table(mod, group_labels = c(II = "Stage 2 vs. 1"))
ggcoef_dodged(mod, exponentiate = TRUE)
ggcoef_faceted(mod, exponentiate = TRUE)
library(pscl)
data("bioChemists", package = "pscl")
mod <- zeroinfl(art ~ fem * mar | fem + mar, data = bioChemists)
ggcoef_model(mod)
ggcoef_table(mod)
ggcoef_dodged(mod)
ggcoef_faceted(
mod,
group_labels = c(conditional = "Count", zero_inflated = "Zero-inflated")
)
mod2 <- zeroinfl(art ~ fem + mar | 1, data = bioChemists)
ggcoef_table(mod2)
ggcoef_table(mod2, intercept = TRUE)
# Use ggcoef_compare() for comparing several models on the same plot
mod1 <- lm(Fertility ~ ., data = swiss)
mod2 <- step(mod1, trace = 0)
mod3 <- lm(Fertility ~ Agriculture + Education * Catholic, data = swiss)
models <- list(
"Full model" = mod1,
"Simplified model" = mod2,
"With interaction" = mod3
)
ggcoef_compare(models)
ggcoef_compare(models, type = "faceted")
# you can reverse the vertical position of the point by using a negative
# value for dodged_width (but it will produce some warnings)
ggcoef_compare(models, dodged_width = -.9)
Deprecated functions
Description
Usage
ggcoef_multicomponents(
model,
type = c("dodged", "faceted", "table"),
component_col = "component",
component_label = NULL,
tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
tidy_args = NULL,
conf.int = TRUE,
conf.level = 0.95,
exponentiate = FALSE,
variable_labels = NULL,
term_labels = NULL,
interaction_sep = " * ",
categorical_terms_pattern = "{level}",
add_reference_rows = TRUE,
no_reference_row = NULL,
intercept = FALSE,
include = dplyr::everything(),
significance = 1 - conf.level,
significance_labels = NULL,
return_data = FALSE,
table_stat = c("estimate", "ci", "p.value"),
table_header = NULL,
table_text_size = 3,
table_stat_label = NULL,
ci_pattern = "{conf.low}, {conf.high}",
table_witdhs = c(3, 2),
...
)
ggcoef_multinom(
model,
type = c("dodged", "faceted", "table"),
y.level_label = NULL,
tidy_fun = broom.helpers::tidy_with_broom_or_parameters,
tidy_args = NULL,
conf.int = TRUE,
conf.level = 0.95,
exponentiate = FALSE,
variable_labels = NULL,
term_labels = NULL,
interaction_sep = " * ",
categorical_terms_pattern = "{level}",
add_reference_rows = TRUE,
no_reference_row = NULL,
intercept = FALSE,
include = dplyr::everything(),
significance = 1 - conf.level,
significance_labels = NULL,
return_data = FALSE,
table_stat = c("estimate", "ci", "p.value"),
table_header = NULL,
table_text_size = 3,
table_stat_label = NULL,
ci_pattern = "{conf.low}, {conf.high}",
table_witdhs = c(3, 2),
...
)
Arguments
model |
a regression model object |
type |
a dodged plot, a faceted plot or multiple table plots? |
component_col |
name of the component column |
component_label |
an optional named vector for labeling components |
tidy_fun |
( |
tidy_args |
Additional arguments passed to
|
conf.int |
( |
conf.level |
the confidence level to use for the confidence
interval if |
exponentiate |
if |
variable_labels |
( |
term_labels |
( |
interaction_sep |
( |
categorical_terms_pattern |
( |
add_reference_rows |
( |
no_reference_row |
( |
intercept |
( |
include |
( |
significance |
level (between 0 and 1) below which a
coefficient is considered to be significantly different from 0
(or 1 if |
significance_labels |
optional vector with custom labels for significance variable |
return_data |
if |
table_stat |
statistics to display in the table, use any column name
returned by the tidier or |
table_header |
optional custom headers for the table |
table_text_size |
text size for the table |
table_stat_label |
optional named list of labeller functions for the displayed statistic (see examples) |
ci_pattern |
glue pattern for confidence intervals in the table |
table_witdhs |
|
... |
parameters passed to |
y.level_label |
an optional named vector for labeling |
Plotting Likert-type items
Description
Combines several factor variables using the same list of ordered levels (e.g. Likert-type scales) into a single data frame and generates a centered bar plot.
Usage
gglikert(
data,
include = dplyr::everything(),
weights = NULL,
y = ".question",
variable_labels = NULL,
sort = c("none", "ascending", "descending"),
sort_method = c("prop", "prop_lower", "mean", "median"),
sort_prop_include_center = totals_include_center,
factor_to_sort = ".question",
exclude_fill_values = NULL,
cutoff = NULL,
data_fun = NULL,
add_labels = TRUE,
labels_size = 3.5,
labels_color = "auto",
labels_accuracy = 1,
labels_hide_below = 0.05,
add_totals = TRUE,
totals_size = labels_size,
totals_color = "black",
totals_accuracy = labels_accuracy,
totals_fontface = "bold",
totals_include_center = FALSE,
totals_hjust = 0.1,
y_reverse = TRUE,
y_label_wrap = 50,
reverse_likert = FALSE,
width = 0.9,
facet_rows = NULL,
facet_cols = NULL,
facet_label_wrap = 50,
symmetric = FALSE
)
gglikert_data(
data,
include = dplyr::everything(),
weights = NULL,
variable_labels = NULL,
sort = c("none", "ascending", "descending"),
sort_method = c("prop", "prop_lower", "mean", "median"),
sort_prop_include_center = TRUE,
factor_to_sort = ".question",
exclude_fill_values = NULL,
cutoff = NULL,
data_fun = NULL
)
gglikert_stacked(
data,
include = dplyr::everything(),
weights = NULL,
y = ".question",
variable_labels = NULL,
sort = c("none", "ascending", "descending"),
sort_method = c("prop", "prop_lower", "mean", "median"),
sort_prop_include_center = FALSE,
factor_to_sort = ".question",
data_fun = NULL,
add_labels = TRUE,
labels_size = 3.5,
labels_color = "auto",
labels_accuracy = 1,
labels_hide_below = 0.05,
add_median_line = FALSE,
y_reverse = TRUE,
y_label_wrap = 50,
reverse_fill = TRUE,
width = 0.9
)
Arguments
data |
a data frame |
include |
variables to include, accepts tidy-select syntax |
weights |
optional variable name of a weighting variable, accepts tidy-select syntax |
y |
name of the variable to be plotted on |
variable_labels |
a named list or a named vector of custom variable labels |
sort |
should the factor defined by |
sort_method |
method used to sort the variables: |
sort_prop_include_center |
when sorting with |
factor_to_sort |
name of the factor column to sort if |
exclude_fill_values |
Vector of values that should not be displayed
(but still taken into account for computing proportions),
see |
cutoff |
number of categories to be displayed negatively (i.e. on the
left of the x axis or the bottom of the y axis), could be a decimal value:
|
data_fun |
for advanced usage, custom function to be applied to the
generated dataset at the end of |
add_labels |
should percentage labels be added to the plot? |
labels_size |
size of the percentage labels |
labels_color |
color of the percentage labels ( |
labels_accuracy |
accuracy of the percentages, see
|
labels_hide_below |
if provided, values below will be masked, see
|
add_totals |
should the total proportions of negative and positive answers be added to plot? This option is not compatible with facets! |
totals_size |
size of the total proportions |
totals_color |
color of the total proportions |
totals_accuracy |
accuracy of the total proportions, see
|
totals_fontface |
font face of the total proportions |
totals_include_center |
if the number of levels is odd, should half of the center level be added to the total proportions? |
totals_hjust |
horizontal adjustment of totals labels on the x axis |
y_reverse |
should the y axis be reversed? |
y_label_wrap |
number of characters per line for y axis labels, see
|
reverse_likert |
if |
width |
bar width, see |
facet_rows , facet_cols |
A set of variables or expressions quoted by
|
facet_label_wrap |
number of characters per line for facet labels, see
|
symmetric |
should the x-axis be symmetric? |
add_median_line |
add a vertical line at 50%? |
reverse_fill |
if |
Details
You can use gglikert_data() to produce only the dataset to be plotted.
If variable labels have been defined (see labelled::var_label()), they will
be taken into account. You can also pass custom variable labels with the
variable_labels argument.
Value
A ggplot2
plot or a tibble
.
See Also
vignette("gglikert")
, position_likert()
, stat_prop()
Examples
library(ggplot2)
library(dplyr)
likert_levels <- c(
"Strongly disagree",
"Disagree",
"Neither agree nor disagree",
"Agree",
"Strongly agree"
)
set.seed(42)
df <-
tibble(
q1 = sample(likert_levels, 150, replace = TRUE),
q2 = sample(likert_levels, 150, replace = TRUE, prob = 5:1),
q3 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
q4 = sample(likert_levels, 150, replace = TRUE, prob = 1:5),
q5 = sample(c(likert_levels, NA), 150, replace = TRUE),
q6 = sample(likert_levels, 150, replace = TRUE, prob = c(1, 0, 1, 1, 0))
) |>
mutate(across(everything(), ~ factor(.x, levels = likert_levels)))
gglikert(df)
gglikert(df, include = q1:3) +
scale_fill_likert(pal = scales::brewer_pal(palette = "PRGn"))
gglikert(df, sort = "ascending")
gglikert(df, sort = "ascending", sort_prop_include_center = TRUE)
gglikert(df, sort = "ascending", sort_method = "mean")
gglikert(df, reverse_likert = TRUE)
gglikert(df, add_totals = FALSE, add_labels = FALSE)
gglikert(
df,
totals_include_center = TRUE,
totals_hjust = .25,
totals_size = 4.5,
totals_fontface = "italic",
totals_accuracy = .01,
labels_accuracy = 1,
labels_size = 2.5,
labels_hide_below = .25
)
gglikert(df, exclude_fill_values = "Neither agree nor disagree")
if (require("labelled")) {
df |>
set_variable_labels(
q1 = "First question",
q2 = "Second question"
) |>
gglikert(
variable_labels = c(
q4 = "a custom label",
q6 = "a very very very very very very very very very very long label"
),
y_label_wrap = 25
)
}
# Facets
df_group <- df
df_group$group <- sample(c("A", "B"), 150, replace = TRUE)
gglikert(df_group, q1:q6, facet_rows = vars(group))
gglikert(df_group, q1:q6, facet_cols = vars(group))
gglikert(df_group, q1:q6, y = "group", facet_rows = vars(.question))
# Custom function to be applied on data
f <- function(d) {
d$.question <- forcats::fct_relevel(d$.question, "q5", "q2")
d
}
gglikert(df, include = q1:q6, data_fun = f)
# Custom center
gglikert(df, cutoff = 2)
gglikert(df, cutoff = 1)
gglikert(df, cutoff = 1, symmetric = TRUE)
gglikert_stacked(df, q1:q6)
gglikert_stacked(df, q1:q6, add_median_line = TRUE, sort = "asc")
gglikert_stacked(df_group, q1:q6, y = "group", add_median_line = TRUE) +
facet_grid(rows = vars(.question))
Easy ggplot2 with survey objects
Description
A function to facilitate ggplot2
graphs using a survey object.
It will initiate a ggplot and map survey weights to the
corresponding aesthetic.
Usage
ggsurvey(design = NULL, mapping = NULL, ...)
Arguments
design |
A survey design object, usually created with
|
mapping |
Default list of aesthetic mappings to use for plot,
to be created with |
... |
Other arguments passed on to methods. Not currently used. |
Details
Graphs will be correct as long as only weights are required
to compute the graph. However, statistics or geometries requiring
correct variance computation (like ggplot2::geom_smooth()
) will
be statistically incorrect.
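As an illustration of the first point, the weighted proportions displayed by a filled bar chart of the Titanic data depend on the weights only, so they can be checked by hand with base R. This is a minimal sketch for verification purposes, not part of ggsurvey(); the object names tab and prop are illustrative.

```r
# Weighted proportions of survival within each class, using Freq as weights.
d <- as.data.frame(Titanic)
tab <- xtabs(Freq ~ Class + Survived, data = d)  # weighted counts
prop <- prop.table(tab, margin = 1)              # proportions within Class
round(prop, 3)
```

No variance estimate is involved here, which is why such a graph stays correct when only weights are mapped.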
Value
A ggplot2
plot.
Examples
data(api, package = "survey")
dstrat <- survey::svydesign(
id = ~1, strata = ~stype,
weights = ~pw, data = apistrat,
fpc = ~fpc
)
ggsurvey(dstrat) +
ggplot2::aes(x = cnum, y = dnum) +
ggplot2::geom_count()
d <- as.data.frame(Titanic)
dw <- survey::svydesign(ids = ~1, weights = ~Freq, data = d)
ggsurvey(dw) +
ggplot2::aes(x = Class, fill = Survived) +
ggplot2::geom_bar(position = "fill")
Identify a suitable font color (black or white) given a background HEX color
Description
You could use auto_contrast
as a shortcut for
aes(colour = after_scale(hex_bw(.data$fill)))
. You should use !!!
to
inject it within ggplot2::aes()
(see examples).
hex_bw_threshold()
is a variation of hex_bw()
. For values
below
threshold
, black ("#000000"
) will always be returned, regardless of
hex_code
.
Usage
hex_bw(hex_code)
hex_bw_threshold(hex_code, values, threshold)
auto_contrast
Arguments
hex_code |
Background color in hex-format. |
values |
Values to be compared. |
threshold |
Threshold. |
Format
An object of class uneval
of length 1.
Value
Either black or white, in hex-format
Source
Adapted from saros
for hex_bw()
and from
https://github.com/teunbrand/ggplot_tricks?tab=readme-ov-file#text-contrast
for auto_contrast
.
Examples
hex_bw("#0dadfd")
library(ggplot2)
ggplot(diamonds) +
aes(x = cut, fill = color, label = after_stat(count)) +
geom_bar() +
geom_text(
mapping = aes(color = after_scale(hex_bw(.data$fill))),
position = position_stack(.5),
stat = "count",
size = 2
)
ggplot(diamonds) +
aes(x = cut, fill = color, label = after_stat(count)) +
geom_bar() +
geom_text(
mapping = auto_contrast,
position = position_stack(.5),
stat = "count",
size = 2
)
ggplot(diamonds) +
aes(x = cut, fill = color, label = after_stat(count), !!!auto_contrast) +
geom_bar() +
geom_text(
mapping = auto_contrast,
position = position_stack(.5),
stat = "count",
size = 2
)
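The idea of picking black or white text from a background colour can be sketched with base R. The helpers below are hypothetical (contrast_bw and contrast_bw_threshold are not part of the package) and assume a simple perceived-brightness rule with the common 0.299/0.587/0.114 weights and a cut point of 128; the rule actually used by hex_bw() may differ.

```r
# Hypothetical helper: choose black or white text from perceived brightness.
contrast_bw <- function(hex_code) {
  rgb <- grDevices::col2rgb(hex_code)  # 3 x n matrix, values in 0-255
  brightness <- as.vector(c(0.299, 0.587, 0.114) %*% rgb)
  # assumption: light backgrounds get black text, dark ones get white text
  ifelse(brightness > 128, "#000000", "#FFFFFF")
}

# Hypothetical threshold variant: always black when `values` is below
# `threshold`, whatever the background colour.
contrast_bw_threshold <- function(hex_code, values, threshold) {
  ifelse(values < threshold, "#000000", contrast_bw(hex_code))
}

contrast_bw(c("#FFFFFF", "#000000"))
contrast_bw_threshold(c("#000000", "#000000"), values = c(0.01, 0.5), threshold = 0.05)
```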
Label absolute values
Description
Label absolute values
Usage
label_number_abs(..., hide_below = NULL)
label_percent_abs(..., hide_below = NULL)
Arguments
... |
arguments passed to |
hide_below |
if provided, values below |
Value
A "labelling" function, i.e. a function that takes a vector and returns a character vector of the same length giving a label for each input value.
See Also
scales::label_number()
, scales::label_percent()
Examples
x <- c(-0.2, -.05, 0, .07, .25, .66)
scales::label_number()(x)
label_number_abs()(x)
scales::label_percent()(x)
label_percent_abs()(x)
label_percent_abs(hide_below = .1)(x)
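The behaviour can be approximated with a small closure in base R. This is a sketch only: label_abs_sketch is a hypothetical name, formatC() stands in for scales::label_number(), and comparing hide_below against the absolute value is an assumption.

```r
# Hypothetical stand-in for label_number_abs(), built on base R only.
label_abs_sketch <- function(hide_below = NULL, digits = 2) {
  function(x) {
    res <- formatC(abs(x), format = "f", digits = digits)
    if (!is.null(hide_below)) {
      res[abs(x) < hide_below] <- ""  # assumption: compared in absolute value
    }
    res
  }
}

x <- c(-0.2, -.05, 0, .07, .25, .66)
label_abs_sketch()(x)
label_abs_sketch(hide_below = .1)(x)
```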
Extend a discrete colour palette
Description
If the palette returns fewer colours than requested, the list of colours
will be expanded using scales::pal_gradient_n()
. To be used with a
sequential or diverging palette. Not relevant for qualitative palettes.
Usage
pal_extender(pal = scales::brewer_pal(palette = "BrBG"))
scale_fill_extended(
name = waiver(),
...,
pal = scales::brewer_pal(palette = "BrBG"),
aesthetics = "fill"
)
scale_colour_extended(
name = waiver(),
...,
pal = scales::brewer_pal(palette = "BrBG"),
aesthetics = "colour"
)
Arguments
pal |
A palette function, such as returned by scales::brewer_pal, taking a number of colours as input and returning a list of colours. |
name |
The name of the scale. Used as the axis or legend title.
If |
... |
Other arguments passed on to |
aesthetics |
Character string or vector of character strings listing
the name(s) of the aesthetic(s) that this scale works with. This can be
useful, for example, to apply colour settings to the colour and fill
aesthetics at the same time, via |
Value
A palette function.
Examples
pal <- scales::pal_brewer(palette = "PiYG")
scales::show_col(pal(16))
scales::show_col(pal_extender(pal)(16))
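The extension step can be sketched without the scales package. The code below is a hypothetical re-implementation (extend_pal_sketch and pal3 are illustrative names) that uses grDevices::colorRampPalette() for the interpolation instead of scales::pal_gradient_n(); it is not the package's own code.

```r
# Hypothetical re-implementation using grDevices instead of scales.
extend_pal_sketch <- function(pal) {
  function(n) {
    cols <- pal(n)
    cols <- cols[!is.na(cols)]            # drop missing entries, if any
    if (length(cols) >= n) {
      return(cols)
    }
    grDevices::colorRampPalette(cols)(n)  # interpolate between known colours
  }
}

# Toy palette that never returns more than three colours.
pal3 <- function(n) c("#8C510A", "#F5F5F5", "#01665E")[seq_len(min(n, 3))]
extended <- extend_pal_sketch(pal3)(7)
extended
```

Interpolating between neighbouring colours only makes sense when the colours are ordered, hence the restriction to sequential or diverging palettes.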
Stack objects on top of one another and center them around 0
Description
position_diverging()
stacks bars on top of each other and
centers them around zero (the same number of categories is displayed on
each side).
position_likert()
uses proportions instead of counts. This type of
presentation is commonly used to display Likert-type scales.
Usage
position_likert(
vjust = 1,
reverse = FALSE,
exclude_fill_values = NULL,
cutoff = NULL
)
position_diverging(
vjust = 1,
reverse = FALSE,
exclude_fill_values = NULL,
cutoff = NULL
)
Arguments
vjust |
Vertical adjustment for geoms that have a position
(like points or lines), not a dimension (like bars or areas). Set to
|
reverse |
If |
exclude_fill_values |
Vector of values from the variable associated with
the |
cutoff |
number of categories to be displayed negatively (i.e. on the
left of the x axis or the bottom of the y axis), could be a decimal value:
|
Details
It is recommended to use position_likert()
with stat_prop()
and its complete
argument (see examples).
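To make the centering concrete, the following base R sketch computes the lower and upper bound of each stacked segment for a single bar. It is an illustration under stated assumptions, not the package's implementation: diverging_bounds is a hypothetical helper, the default cutoff is taken to be half the number of categories, and a decimal cutoff is assumed to place the corresponding fraction of the next category on the negative side.

```r
# Hypothetical helper: bounds of each stacked segment, centered around zero.
diverging_bounds <- function(p, cutoff = NULL) {
  k <- length(p)
  if (is.null(cutoff)) cutoff <- k / 2   # assumption: same number on each side
  full <- floor(cutoff)                  # categories entirely on the negative side
  frac <- cutoff - full                  # share of the next category put on that side
  left <- sum(p[seq_len(full)])
  if (frac > 0) left <- left + frac * p[full + 1]
  xmax <- cumsum(p) - left               # shift the usual stack by `left`
  data.frame(xmin = xmax - p, xmax = xmax)
}

p <- c(.1, .2, .4, .2, .1)               # proportions of five answer levels
diverging_bounds(p)                      # middle level split around zero
diverging_bounds(p, cutoff = 1)          # only the first level is negative
```

Passing proportions gives the position_likert() flavour, while passing counts corresponds to position_diverging().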
See Also
See ggplot2::position_stack()
and ggplot2::position_fill()
Examples
library(ggplot2)
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_bar(position = "fill") +
scale_x_continuous(label = scales::label_percent()) +
xlab("proportion")
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_bar(position = "likert") +
scale_x_continuous(label = label_percent_abs()) +
scale_fill_likert() +
xlab("proportion")
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_bar(position = "stack") +
scale_fill_likert(pal = scales::brewer_pal(palette = "PiYG"))
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_bar(position = "diverging") +
scale_x_continuous(label = label_number_abs()) +
scale_fill_likert()
# Reverse order -------------------------------------------------------------
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_bar(position = position_likert(reverse = TRUE)) +
scale_x_continuous(label = label_percent_abs()) +
scale_fill_likert() +
xlab("proportion")
# Custom center -------------------------------------------------------------
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_bar(position = position_likert(cutoff = 1)) +
scale_x_continuous(label = label_percent_abs()) +
scale_fill_likert(cutoff = 1) +
xlab("proportion")
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_bar(position = position_likert(cutoff = 3.75)) +
scale_x_continuous(label = label_percent_abs()) +
scale_fill_likert(cutoff = 3.75) +
xlab("proportion")
# Missing items -------------------------------------------------------------
# example with a level not being observed for a specific value of y
d <- diamonds
d <- d[!(d$cut == "Premium" & d$clarity == "I1"), ]
d <- d[!(d$cut %in% c("Fair", "Good") & d$clarity == "SI2"), ]
# by default, the two lowest bars are not properly centered
ggplot(d) +
aes(y = clarity, fill = cut) +
geom_bar(position = "likert") +
scale_fill_likert()
# use stat_prop() with `complete = "fill"` to fix it
ggplot(d) +
aes(y = clarity, fill = cut) +
geom_bar(position = "likert", stat = "prop", complete = "fill") +
scale_fill_likert()
# Add labels ----------------------------------------------------------------
custom_label <- function(x) {
p <- scales::percent(x, accuracy = 1)
p[x < .075] <- ""
p
}
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_bar(position = "likert") +
geom_text(
aes(by = clarity, label = custom_label(after_stat(prop))),
stat = "prop",
position = position_likert(vjust = .5)
) +
scale_x_continuous(label = label_percent_abs()) +
scale_fill_likert() +
xlab("proportion")
# Do not display specific fill values ---------------------------------------
# (but taken into account to compute proportions)
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_bar(position = position_likert(exclude_fill_values = "Very Good")) +
scale_x_continuous(label = label_percent_abs()) +
scale_fill_likert() +
xlab("proportion")
Round to multiple of any number.
Description
Round to multiple of any number.
Usage
round_any(x, accuracy, f = round)
Arguments
x |
numeric or date-time (POSIXct) vector to round |
accuracy |
number to round to; for POSIXct objects, a number of seconds |
f |
rounding function, such as round, floor or ceiling |
Source
adapted from plyr
Examples
round_any(1.865, accuracy = .25)
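For numeric input, rounding to a multiple amounts to dividing by accuracy, applying the rounding function, and multiplying back. The sketch below uses a hypothetical name (round_any_sketch) and covers numeric vectors only, not date-times.

```r
# Minimal numeric-only sketch of rounding to a multiple of `accuracy`.
round_any_sketch <- function(x, accuracy, f = round) {
  f(x / accuracy) * accuracy
}

round_any_sketch(1.865, accuracy = .25)               # 1.75
round_any_sketch(1.865, accuracy = .25, f = floor)    # 1.75
round_any_sketch(1.865, accuracy = .25, f = ceiling)  # 2
```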
Colour scale for Likert-type plots
Description
This scale is similar to other diverging discrete colour scales, but lets you
change the "center" of the scale using the cutoff
argument, as used by
position_likert()
.
Usage
scale_fill_likert(
name = waiver(),
...,
pal = scales::brewer_pal(palette = "BrBG"),
cutoff = NULL,
aesthetics = "fill"
)
likert_pal(pal = scales::brewer_pal(palette = "BrBG"), cutoff = NULL)
Arguments
name |
The name of the scale. Used as the axis or legend title.
If |
... |
Other arguments passed on to |
pal |
A palette function taking a number of colours as input and returning a list of colours (see examples), ideally a diverging palette |
cutoff |
Number of categories displayed negatively (see
|
aesthetics |
Character string or vector of character strings listing
the name(s) of the aesthetic(s) that this scale works with. This can be
useful, for example, to apply colour settings to the colour and fill
aesthetics at the same time, via |
Examples
library(ggplot2)
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_bar(position = "likert") +
scale_x_continuous(label = label_percent_abs()) +
xlab("proportion")
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_bar(position = "likert") +
scale_x_continuous(label = label_percent_abs()) +
xlab("proportion") +
scale_fill_likert()
ggplot(diamonds) +
aes(y = clarity, fill = cut) +
geom_bar(position = position_likert(cutoff = 1)) +
scale_x_continuous(label = label_percent_abs()) +
xlab("proportion") +
scale_fill_likert(cutoff = 1)
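One way to shift the center of a diverging palette is to request a larger symmetric palette and keep only the colours needed on each side. The sketch below illustrates that idea under assumptions: likert_pal_sketch is a hypothetical name, grDevices::hcl.colors() with the "Blue-Red" palette is a stand-in for a diverging palette function, and the palette is assumed able to return as many colours as requested.

```r
# Hypothetical sketch: pick n colours so that the palette midpoint sits at `cutoff`.
likert_pal_sketch <- function(pal, cutoff = NULL) {
  function(n) {
    if (is.null(cutoff)) {
      return(pal(n))
    }
    left <- floor(cutoff)         # levels entirely on the negative side
    center <- cutoff %% 1 > 0     # is one level split around zero?
    right <- n - ceiling(cutoff)  # levels entirely on the positive side
    nc <- 2 * max(left, right) + center  # size of the symmetric palette
    cols <- pal(nc)
    # keep the colours closest to the center on the shorter side
    if (left <= right) cols[(nc - n + 1):nc] else cols[seq_len(n)]
  }
}

pal <- function(n) grDevices::hcl.colors(n, palette = "Blue-Red")
likert_pal_sketch(pal, cutoff = 1)(5)
```

With five levels and cutoff = 1, a palette of eight colours is requested and the last five are kept, so the single negative level receives the lightest colour of its side.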
Significance Stars
Description
Calculate significance stars
Usage
signif_stars(x, three = 0.001, two = 0.01, one = 0.05, point = 0.1)
Arguments
x |
numeric values that will be compared to the |
three |
threshold below which to display three stars |
two |
threshold below which to display two stars |
one |
threshold below which to display one star |
point |
threshold below which to display one point
( |
Value
Character vector containing the appropriate number of
stars for each x
value.
Author(s)
Joseph Larmarange
Examples
x <- c(0.5, 0.1, 0.05, 0.01, 0.001)
signif_stars(x)
signif_stars(x, one = .15, point = NULL)
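The mapping from p-values to symbols can be sketched in base R. signif_stars_sketch is a hypothetical name, and treating a value equal to a threshold as passing it is an assumption about boundary handling.

```r
# Hypothetical sketch: thresholds are applied from the loosest to the strictest,
# so that stricter thresholds overwrite the symbol set by looser ones.
signif_stars_sketch <- function(x, three = 0.001, two = 0.01, one = 0.05,
                                point = 0.1) {
  res <- rep("", length(x))
  if (!is.null(point)) res[x <= point] <- "."
  if (!is.null(one)) res[x <= one] <- "*"
  if (!is.null(two)) res[x <= two] <- "**"
  if (!is.null(three)) res[x <= three] <- "***"
  res
}

p_values <- c(0.5, 0.07, 0.03, 0.005, 0.0001)
signif_stars_sketch(p_values)
signif_stars_sketch(p_values, one = .15, point = NULL)
```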
Compute cross-tabulation statistics
Description
Computes statistics of a 2-dimensional matrix using broom::augment.htest.
Usage
stat_cross(
mapping = NULL,
data = NULL,
geom = "point",
position = "identity",
...,
na.rm = TRUE,
show.legend = NA,
inherit.aes = TRUE,
keep.zero.cells = FALSE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
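The data to be displayed in this layer. There are three options: if NULL (the default), the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, will override the plot data; a function will be called with a single argument, the plot data, and its return value, which must be a data.frame, is used as the layer data. |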
geom |
Override the default connection with
|
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
keep.zero.cells |
If |
Value
A ggplot2
plot with the added statistic.
Aesthetics
stat_cross()
requires the x and the y aesthetics.
Computed variables
- observed
number of observations in x,y
- prop
proportion of total
- row.prop
row proportion
- col.prop
column proportion
- expected
expected count under the null hypothesis
- resid
Pearson's residual
- std.resid
standardized residual
- row.observed
total number of observations within row
- col.observed
total number of observations within column
- total.observed
total number of observations within the table
- phi
phi coefficients, see
augment_chisq_add_phi()
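Most of these variables can be reproduced by hand with base R, which helps when checking a plot. The sketch below is an illustration, not the internal code of stat_cross(): it uses stats::chisq.test() directly, omits phi and the marginal totals, and which variable plays the role of rows (here Class) is an assumption of the sketch.

```r
# Cross-tabulation statistics for Class x Survived, weighted by Freq.
d <- as.data.frame(Titanic)
tab <- xtabs(Freq ~ Class + Survived, data = d)
test <- chisq.test(tab)

# One row per cell; as.vector() on a table follows the same cell order.
res <- as.data.frame(tab, responseName = "observed")
res$prop <- as.vector(prop.table(tab))
res$row.prop <- as.vector(prop.table(tab, margin = 1))
res$col.prop <- as.vector(prop.table(tab, margin = 2))
res$expected <- as.vector(test$expected)
res$resid <- as.vector(test$residuals)    # Pearson residuals
res$std.resid <- as.vector(test$stdres)   # standardized residuals
res
```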
See Also
vignette("stat_cross")
Examples
library(ggplot2)
d <- as.data.frame(Titanic)
# plot number of observations
ggplot(d) +
aes(x = Class, y = Survived, weight = Freq, size = after_stat(observed)) +
stat_cross() +
scale_size_area(max_size = 20)
# custom shape and fill colour based on chi-squared residuals
ggplot(d) +
aes(
x = Class, y = Survived, weight = Freq,
size = after_stat(observed), fill = after_stat(std.resid)
) +
stat_cross(shape = 22) +
scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) +
scale_size_area(max_size = 20)
# custom shape and fill colour based on phi coefficients
ggplot(d) +
aes(
x = Class, y = Survived, weight = Freq,
size = after_stat(observed), fill = after_stat(phi)
) +
stat_cross(shape = 22) +
scale_fill_steps2(show.limits = TRUE) +
scale_size_area(max_size = 20)
# plotting the number of observations as a table
ggplot(d) +
aes(
x = Class, y = Survived, weight = Freq, label = after_stat(observed)
) +
geom_text(stat = "cross")
# Row proportions with standardized residuals
ggplot(d) +
aes(
x = Class, y = Survived, weight = Freq,
label = scales::percent(after_stat(row.prop)),
size = NULL, fill = after_stat(std.resid)
) +
stat_cross(shape = 22, size = 30) +
geom_text(stat = "cross") +
scale_fill_steps2(breaks = c(-3, -2, 2, 3), show.limits = TRUE) +
facet_grid(Sex ~ .) +
labs(fill = "Standardized residuals") +
theme_minimal()
Compute proportions according to custom denominator
Description
stat_prop()
is a variation of ggplot2::stat_count()
that
computes custom proportions according to the by aesthetic defining
the denominator (i.e. all proportions for the same value of by will
sum to 1). If the by aesthetic is not specified, denominators will be
determined according to the default_by
argument.
Usage
stat_prop(
mapping = NULL,
data = NULL,
geom = "bar",
position = "fill",
...,
width = NULL,
na.rm = FALSE,
orientation = NA,
show.legend = NA,
inherit.aes = TRUE,
complete = NULL,
default_by = "total"
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
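The data to be displayed in this layer. There are three options: if NULL (the default), the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, will override the plot data; a function will be called with a single argument, the plot data, and its return value, which must be a data.frame, is used as the layer data. |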
geom |
Override the default connection with |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
width |
Bar width. By default, set to 90% of the |
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
complete |
Name (character) of an aesthetic for which statistics should be completed for unobserved values (see example). |
default_by |
If the by aesthetic is not available, name of another
aesthetic that will be used to determine the denominators (e.g. |
Value
A ggplot2
plot with the added statistic.
Aesthetics
stat_prop()
understands the following aesthetics
(required aesthetics are marked as such):
- x or y (required)
- by
- weight
Computed variables
after_stat(count)
number of points in bin
after_stat(denominator)
denominator for the proportions
after_stat(prop)
computed proportion, i.e.
after_stat(count)
/after_stat(denominator)
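The role of the by aesthetic can be illustrated with base R. The sketch below mimics what the computed variables would contain for by = Class on the Titanic data; it is a hand computation for checking purposes, not the code run by stat_prop().

```r
d <- as.data.frame(Titanic)

# Weighted count for each combination of x (Class) and fill (Survived).
res <- aggregate(Freq ~ Class + Survived, data = d, FUN = sum)
names(res)[names(res) == "Freq"] <- "count"

# by = Class: the denominator is the total count within each Class.
res$denominator <- ave(res$count, res$Class, FUN = sum)
res$prop <- res$count / res$denominator
res
```

Replacing res$Class with res$Survived in ave() would correspond to by = Survived, so that proportions sum to 1 within each survival status instead.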
See Also
vignette("stat_prop")
, ggplot2::stat_count()
. For an alternative
approach, see
https://github.com/tidyverse/ggplot2/issues/5505#issuecomment-1791324008.
Examples
library(ggplot2)
d <- as.data.frame(Titanic)
p <- ggplot(d) +
aes(x = Class, fill = Survived, weight = Freq, by = Class) +
geom_bar(position = "fill") +
geom_text(stat = "prop", position = position_fill(.5))
p
p + facet_grid(~Sex)
ggplot(d) +
aes(x = Class, fill = Survived, weight = Freq) +
geom_bar(position = "dodge") +
geom_text(
aes(by = Survived),
stat = "prop",
position = position_dodge(0.9), vjust = "bottom"
)
if (requireNamespace("scales")) {
ggplot(d) +
aes(x = Class, fill = Survived, weight = Freq, by = 1) +
geom_bar() +
geom_text(
aes(label = scales::percent(after_stat(prop), accuracy = 1)),
stat = "prop",
position = position_stack(.5)
)
}
# displaying unobserved levels with complete
d <- diamonds |>
dplyr::filter(!(cut == "Ideal" & clarity == "I1")) |>
dplyr::filter(!(cut == "Very Good" & clarity == "VS2")) |>
dplyr::filter(!(cut == "Premium" & clarity == "IF"))
p <- ggplot(d) +
aes(x = clarity, fill = cut, by = clarity) +
geom_bar(position = "fill")
p + geom_text(stat = "prop", position = position_fill(.5))
p + geom_text(stat = "prop", position = position_fill(.5), complete = "fill")
Compute weighted y mean
Description
This statistic will compute the mean of the y aesthetic for each unique value of x, taking into account the weight aesthetic if provided.
Usage
stat_weighted_mean(
mapping = NULL,
data = NULL,
geom = "point",
position = "identity",
...,
na.rm = FALSE,
orientation = NA,
show.legend = NA,
inherit.aes = TRUE
)
Arguments
mapping |
Set of aesthetic mappings created by |
data |
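The data to be displayed in this layer. There are three options: if NULL (the default), the data is inherited from the plot data as specified in the call to ggplot(); a data.frame, or other object, will override the plot data; a function will be called with a single argument, the plot data, and its return value, which must be a data.frame, is used as the layer data. |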
geom |
Override the default connection with |
position |
A position adjustment to use on the data for this layer. This
can be used in various ways, including to prevent overplotting and
improving the display. The
|
... |
Other arguments passed on to
|
na.rm |
If |
orientation |
The orientation of the layer. The default ( |
show.legend |
logical. Should this layer be included in the legends?
|
inherit.aes |
If |
Value
A ggplot2
plot with the added statistic.
Computed variables
- y
weighted y (numerator / denominator)
- numerator
numerator
- denominator
denominator
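These three quantities can be reproduced with base R. The sketch below computes the weighted share of survivors by class in the Titanic data; it assumes the numerator is the weighted sum of y and the denominator the sum of weights, and it ignores grouping by other aesthetics.

```r
d <- as.data.frame(Titanic)
d$survived <- as.integer(d$Survived == "Yes")

numerator <- tapply(d$survived * d$Freq, d$Class, sum)  # weighted sum of y
denominator <- tapply(d$Freq, d$Class, sum)             # sum of weights
y <- numerator / denominator                            # weighted mean of y
y
```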
See Also
vignette("stat_weighted_mean")
Examples
library(ggplot2)
data(tips, package = "reshape")
ggplot(tips) +
aes(x = day, y = total_bill) +
geom_point()
ggplot(tips) +
aes(x = day, y = total_bill) +
stat_weighted_mean()
ggplot(tips) +
aes(x = day, y = total_bill, group = 1) +
stat_weighted_mean(geom = "line")
ggplot(tips) +
aes(x = day, y = total_bill, colour = sex, group = sex) +
stat_weighted_mean(geom = "line")
ggplot(tips) +
aes(x = day, y = total_bill, fill = sex) +
stat_weighted_mean(geom = "bar", position = "dodge")
# computing a proportion on the fly
if (requireNamespace("scales")) {
ggplot(tips) +
aes(x = day, y = as.integer(smoker == "Yes"), fill = sex) +
stat_weighted_mean(geom = "bar", position = "dodge") +
scale_y_continuous(labels = scales::percent)
}
library(ggplot2)
# taking into account some weights
d <- as.data.frame(Titanic)
ggplot(d) +
aes(
x = Class, y = as.integer(Survived == "Yes"),
weight = Freq, fill = Sex
) +
geom_bar(stat = "weighted_mean", position = "dodge") +
scale_y_continuous(labels = scales::percent) +
labs(y = "Survived")
Symmetric limits
Description
Expand scale limits to make them symmetric around zero.
Can be passed as argument to parameter limits
of continuous scales from
packages {ggplot2}
or {scales}
. Can also be used to obtain an enclosing
symmetric range for numeric vectors.
Usage
symmetric_limits(x)
Arguments
x |
a vector of numeric values, possibly a range, from which to compute enclosing range |
Value
A numeric vector of length two with the new limits, which are always such that the absolute value of upper and lower limits is the same.
Source
Adapted from the function of the same name in {ggpmisc}
Examples
library(ggplot2)
ggplot(iris) +
aes(x = Sepal.Length - 5, y = Sepal.Width - 3, colour = Species) +
geom_vline(xintercept = 0) +
geom_hline(yintercept = 0) +
geom_point()
last_plot() +
scale_x_continuous(limits = symmetric_limits) +
scale_y_continuous(limits = symmetric_limits)
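The computation itself is short: take the largest absolute value and mirror it. The sketch below uses a hypothetical name (symmetric_limits_sketch), and dropping missing values is an assumption.

```r
# Minimal sketch of symmetric limits around zero.
symmetric_limits_sketch <- function(x) {
  m <- max(abs(x), na.rm = TRUE)  # assumption: missing values are ignored
  c(-m, m)
}

symmetric_limits_sketch(c(-2, 5))  # -5 5
```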
Weighted Median and Quantiles
Description
Compute the median or quantiles of a set of numbers which have weights associated with them.
Usage
weighted.median(x, w, na.rm = TRUE, type = 2)
weighted.quantile(x, w, probs = seq(0, 1, 0.25), na.rm = TRUE, type = 4)
Arguments
x |
a numeric vector of values |
w |
a numeric vector of weights |
na.rm |
a logical indicating whether to ignore |
type |
Integer specifying the rule for calculating the median or
quantile, corresponding to the rules available for |
probs |
probabilities for which the quantiles should be computed, a numeric vector of values between 0 and 1 |
Details
The i
th observation x[i]
is treated as having a weight proportional to
w[i]
.
The weighted median is a value m
such that the total weight of data less
than or equal to m
is equal to half the total weight. More generally, the
weighted quantile with probability p
is a value q
such that the total
weight of data less than or equal to q
is equal to p
times the total
weight.
If there is no such value, then:
- if type = 1, the next largest value is returned (this is the right-continuous inverse of the left-continuous cumulative distribution function);
- if type = 2, the average of the two surrounding values is returned (the average of the right-continuous and left-continuous inverses);
- if type = 4, linear interpolation is performed.
Note that the default rule for weighted.median()
is type = 2
, consistent
with the traditional definition of the median, while the default for
weighted.quantile()
is type = 4
.
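The type = 1 rule described above can be sketched in a few lines of base R. This is a simplified illustration only: wq_type1_sketch is a hypothetical name, it handles a single probability, and it does not reproduce the handling of missing values or of types 2 and 4.

```r
# Hypothetical sketch of a type 1 weighted quantile: the smallest value whose
# cumulative share of the total weight reaches p.
wq_type1_sketch <- function(x, w, p) {
  o <- order(x)
  x <- x[o]
  w <- w[o]
  cw <- cumsum(w) / sum(w)  # cumulative share of the total weight
  x[which(cw >= p)[1]]
}

wq_type1_sketch(c(10, 20, 30), w = c(1, 1, 1), p = 0.5)  # 20
wq_type1_sketch(c(10, 20, 30), w = c(1, 1, 8), p = 0.5)  # 30
```

In the second call most of the weight sits on the largest value, which pulls the weighted median up to 30.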
Value
A numeric vector.
Source
These functions are adapted from their homonyms developed by Adrian
Baddeley in the spatstat
package.
Examples
x <- 1:20
w <- runif(20)
weighted.median(x, w)
weighted.quantile(x, w)
Weighted Sum
Description
Weighted Sum
Usage
weighted.sum(x, w, na.rm = TRUE)
Arguments
x |
a numeric vector of values |
w |
a numeric vector of weights |
na.rm |
a logical indicating whether to ignore |
Value
A numeric vector.
Examples
x <- 1:20
w <- runif(20)
weighted.sum(x, w)