Title: | Graph-Based Permutation Tests for Microbiome Data |
Version: | 0.1.1 |
Author: | Julia Fukuyama [aut, cre] |
Maintainer: | Julia Fukuyama <julia.fukuyama@gmail.com> |
Description: | Provides functions for graph-based multiple-sample testing and visualization of microbiome data, in particular data stored in 'phyloseq' objects. The tests are based on those of Friedman and Rafsky (1979) <http://www.jstor.org/stable/2958919> and are described in more detail in Callahan et al. (2016) <doi:10.12688/f1000research.8986.1>. |
Imports: | ggnetwork (≥ 0.5.1), igraph (≥ 1.1.2) |
Depends: | R (≥ 3.5.0), ggplot2 (≥ 2.2.1), phyloseq (≥ 1.24.0) |
License: | CC0 |
Suggests: | knitr, rmarkdown |
VignetteBuilder: | knitr |
URL: | https://github.com/jfukuyama/phyloseqGraphTest |
RoxygenNote: | 7.2.3 |
NeedsCompilation: | no |
Packaged: | 2024-02-05 16:24:23 UTC; jfukuyam |
Repository: | CRAN |
Date/Publication: | 2024-02-05 19:00:02 UTC |
phyloseqGraphTest: Non-parametric graph-based testing for microbiome data.
Description
This package lets you test for differences between groups of samples with a graph-based permutation test.
Details
The main function in the package is graph_perm_test, which takes a phyloseq object.
The graph used in the test can be visualized using plot_test_network. The permutation distribution and
the test statistic can be visualized with plot_permutations.
format_fortify
Description
A unified function to format a network or igraph object. Copied with
very slight modification from
https://github.com/briatte/ggnetwork/blob/master/R/utilities.R to
fix the same CRAN problem as new_fortify.igraph.
Usage
format_fortify(
model,
nodes = NULL,
weights = NULL,
arrow.gap = 0,
by = NULL,
scale = TRUE,
stringsAsFactors = getOption("stringsAsFactors", FALSE),
.list_vertex_attributes_fun = NULL,
.get_vertex_attributes_fun = NULL,
.list_edges_attributes_fun = NULL,
.get_edges_attributes_fun = NULL,
.as_edges_list_fun = NULL
)
Arguments
model |
a network or igraph object. |
nodes |
a nodes object from a call to fortify. |
weights |
the name of an edge attribute to use as edge weights when
computing the network layout, if the layout supports such weights (see
'Details'). Defaults to NULL. |
arrow.gap |
a parameter that will shorten the network edges in order to
avoid overplotting edge arrows and nodes; defaults to 0. |
by |
a character vector that matches an edge attribute, which will be
used to generate a data frame that can be plotted with
facet_wrap or facet_grid. |
scale |
whether to (re)scale the layout coordinates. Defaults to TRUE. |
stringsAsFactors |
whether vertex and edge attributes should be
converted to factors if they are of class character. Defaults to getOption("stringsAsFactors"), or FALSE if that option is unset. |
.list_vertex_attributes_fun |
a "list vertex attributes" function. |
.get_vertex_attributes_fun |
a "get vertex attributes" function. |
.list_edges_attributes_fun |
a "list edges attributes" function. |
.get_edges_attributes_fun |
a "get edges attributes" function. |
.as_edges_list_fun |
a "as edges list" function. |
Value
a data.frame object.
Performs graph-based permutation tests
Description
Performs graph-based tests for one-way designs.
Usage
graph_perm_test(
physeq,
sampletype,
grouping = 1:nsamples(physeq),
distance = "jaccard",
type = c("mst", "knn", "threshold.value", "threshold.nedges"),
max.dist = 0.4,
knn = 1,
nedges = nsamples(physeq),
keep.isolates = TRUE,
nperm = 499
)
Arguments
physeq |
A phyloseq object. |
sampletype |
A string giving the name of the sample data column to be tested. The column should be a factor with two or more levels. |
grouping |
Either a string with the name of a sample data column or a factor of length equal to the number of samples in physeq. These are the groups of samples whose labels should be permuted and are used for repeated measures designs. Default is no grouping (each group is of size 1). |
distance |
A distance; see distance in phyloseq for a list of the possible methods. |
type |
One of "mst", "knn", "threshold.value", or "threshold.nedges". If "mst", forms the minimum spanning tree of the sample points. If "knn", forms a directed graph with links from each node to its k nearest neighbors. If "threshold.value", forms a graph with edges between every pair of samples within distance max.dist of each other. If "threshold.nedges", forms a threshold graph with the number of edges given by nedges. |
max.dist |
For type "threshold.value", the maximum distance between two samples such that we put an edge between them. |
knn |
For type "knn", the number of nearest neighbors. |
nedges |
If using "threshold.nedges", the number of edges to use. |
keep.isolates |
Whether to keep unconnected points in the returned network. |
nperm |
The number of permutations to perform. |
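The "threshold.value" and "knn" graph types described above can be illustrated with a small base R sketch. The helpers threshold_edges and knn_edges are hypothetical names introduced here and are not part of the package, which may build its graphs differently.

```r
# Hypothetical helpers for illustration; not part of phyloseqGraphTest.
# d is a symmetric matrix of pairwise distances between samples.

# Undirected threshold graph: one row per pair (i < j) with d[i, j] <= max.dist.
threshold_edges <- function(d, max.dist) {
  idx <- which(upper.tri(d) & d <= max.dist, arr.ind = TRUE)
  data.frame(from = unname(idx[, "row"]), to = unname(idx[, "col"]))
}

# Directed k-nearest-neighbor graph: k outgoing links per sample.
knn_edges <- function(d, k = 1) {
  n <- nrow(d)
  do.call(rbind, lapply(seq_len(n), function(i) {
    others <- setdiff(seq_len(n), i)
    nearest <- others[order(d[i, others])][seq_len(k)]
    data.frame(from = i, to = nearest)
  }))
}

# Four samples on a line at positions 0, 1, 3, 7.
pos <- c(0, 1, 3, 7)
d <- as.matrix(dist(pos))
thr <- threshold_edges(d, max.dist = 2)
nn <- knn_edges(d, k = 1)
```

In this toy example the threshold graph keeps only the pairs of samples at distance 2 or less, while the nearest-neighbor graph always has exactly k outgoing links per sample.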
Value
A list with the observed number of pure edges, the vector containing the number of pure edges in each permutation, the permutation p-value, the graph used for testing, and a vector with the sample types used for the test.
Examples
library(phyloseq)
data(enterotype)
gt = graph_perm_test(enterotype, sampletype = "SeqTech", type = "mst")
gt
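To make the returned quantities concrete, the following base R sketch counts pure edges (taken here to mean edges whose two endpoints have the same sample type) and computes a permutation p-value. The helpers count_pure_edges and perm_pvalue are hypothetical, and the p-value convention shown, which counts the observed labeling as one of nperm + 1 arrangements, is an assumption rather than a statement of the package's internal code.

```r
# Hypothetical helpers for illustration; not the package's internal code.

# Number of edges whose two endpoints share the same label.
count_pure_edges <- function(edges, labels) {
  sum(labels[edges$from] == labels[edges$to])
}

# One-sided permutation p-value; the observed statistic is counted as one
# of the nperm + 1 arrangements (an assumed convention).
perm_pvalue <- function(observed, permuted) {
  (sum(permuted >= observed) + 1) / (length(permuted) + 1)
}

# Path graph 1-2-3-4-5-6 with two sample types.
edges <- data.frame(from = 1:5, to = 2:6)
labels <- c("a", "a", "a", "b", "b", "b")
observed <- count_pure_edges(edges, labels)

# Permutation distribution under free shuffling of the labels (no grouping).
set.seed(1)
permuted <- replicate(499, count_pure_edges(edges, sample(labels)))
pval <- perm_pvalue(observed, permuted)
```

A large observed number of pure edges relative to the permutation distribution suggests that samples of the same type sit close together in the graph.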
Fortify method for networks of class igraph
Description
This is copied with very slight modification from https://github.com/briatte/ggnetwork/blob/master/R/fortify-igraph.R, as that version is not on CRAN yet.
Usage
new_fortify.igraph(
model,
data = NULL,
layout = igraph::nicely(),
arrow.gap = ifelse(igraph::is.directed(model), 0.025, 0),
by = NULL,
scale = TRUE,
stringsAsFactors = getOption("stringsAsFactors", FALSE),
...
)
Arguments
model |
an object of class igraph. |
data |
not used by this method. |
layout |
a function call to an igraph layout function. Defaults to igraph::nicely(). |
arrow.gap |
a parameter that will shorten the network edges in order to
avoid overplotting edge arrows and nodes; defaults to 0.025 when the network is directed and to 0 otherwise. |
by |
a character vector that matches an edge attribute, which will be
used to generate a data frame that can be plotted with
facet_wrap or facet_grid. |
scale |
whether to (re)scale the layout coordinates. Defaults to TRUE. |
stringsAsFactors |
whether vertex and edge attributes should be
converted to factors if they are of class character. Defaults to getOption("stringsAsFactors"), or FALSE if that option is unset. |
... |
additional parameters for the layout argument. |
Value
a data.frame object.
Permute labels
Description
Permutes sample labels, respecting repeated measures.
Usage
permute(sampledata, grouping, sampletype)
Arguments
sampledata |
Data frame describing the samples. |
grouping |
Grouping for repeated measures. |
sampletype |
The sampletype used for testing (a column of sampledata). |
Value
A permuted set of labels where the permutations are done over the levels of grouping.
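One way to permute labels while respecting repeated measures is to shuffle one label per group and then copy it to every sample in that group. The sketch below assumes that grouping and sampletype are column names in sampledata and that each group has a single sample type. permute_sketch is a hypothetical name, and the package's permute may differ in its details.

```r
# Hypothetical re-implementation for illustration; permute() in the package
# may differ in its details.
permute_sketch <- function(sampledata, grouping, sampletype) {
  groups <- as.character(sampledata[[grouping]])
  types <- as.character(sampledata[[sampletype]])
  # One label per group (assumes a valid grouping: one sample type per group).
  group_types <- tapply(types, groups, function(x) x[1])
  # Shuffle the labels across groups, then map them back to the samples.
  shuffled <- setNames(sample(as.vector(group_types)), names(group_types))
  unname(shuffled[groups])
}

# Three subjects measured twice each; the label is constant within a subject.
sd_example <- data.frame(
  subject = c("s1", "s1", "s2", "s2", "s3", "s3"),
  status = c("case", "case", "control", "control", "case", "case")
)
set.seed(42)
perm <- permute_sketch(sd_example, "subject", "status")
```

Because labels move between whole groups, samples from the same subject always share a label after permutation, and the number of samples of each type is preserved when groups are of equal size.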
Plots the permutation distribution
Description
Plots a histogram of the permutation distribution of the number of pure edges and a mark showing the observed number of pure edges.
Usage
plot_permutations(graphtest, bins = 30)
Arguments
graphtest |
The output from graph_perm_test. |
bins |
The number of bins to use for the histogram. |
Value
A ggplot object.
Examples
library(phyloseq)
data(enterotype)
gt = graph_perm_test(enterotype, sampletype = "SeqTech")
plot_permutations(gt)
Plots the graph used for testing
Description
When using the graph_perm_test function, a graph is created. This function will plot the graph used for testing with nodes colored by sample type and edges marked as pure or mixed.
Usage
plot_test_network(graphtest)
Arguments
graphtest |
The output from graph_perm_test. |
Value
A ggplot object created by ggnetwork.
Examples
library(phyloseq)
data(enterotype)
gt = graph_perm_test(enterotype, sampletype = "SeqTech")
plot_test_network(gt)
Print psgraphtest objects
Description
Print psgraphtest objects
Usage
## S3 method for class 'psgraphtest'
print(x, ...)
Arguments
x |
A psgraphtest object. |
... |
Not used |
Rescale x to (0, 1), except if x is constant
Description
Copied from https://github.com/briatte/ggnetwork/blob/f3b8b84d28a65620a94f7aecd769c0ea939466e3/R/utilities.R to fix a problem with the CRAN version of ggnetwork.
Usage
scale_safely(x, scale = diff(range(x)))
Arguments
x |
a vector to rescale |
scale |
the scale on which to rescale the vector |
Value
The rescaled vector, coerced to a vector if necessary. If the original vector was constant, all of its values are replaced by 0.5.
Author(s)
Kipp Johnson
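The documented behavior can be sketched in base R as follows. rescale_sketch is a hypothetical name that follows the description above, and it is not a copy of the package source.

```r
# Hypothetical sketch mirroring the documented behavior of scale_safely().
rescale_sketch <- function(x, scale = diff(range(x))) {
  if (scale == 0) {
    # Constant input: no range to divide by, so place every value at 0.5.
    rep(0.5, length(x))
  } else {
    # Shift so the minimum is 0, then divide by the scale.
    as.vector((x - min(x)) / scale)
  }
}

scaled <- rescale_sketch(c(2, 4, 6))
constant <- rescale_sketch(c(3, 3, 3))
```

The special case matters for layouts in which every node has the same coordinate on one axis, where dividing by a zero range would otherwise produce NaN values.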
Check for valid grouping
Description
Grouping should describe a repeated measures design, so this function tests whether all samples within each level of grouping have the same value of sampletype.
Usage
validGrouping(sd, sampletype, grouping)
Arguments
sd |
Data frame describing the samples. |
sampletype |
The sampletype used for testing. |
grouping |
Grouping for repeated measures. |
Value
TRUE or FALSE for valid or invalid grouping.
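The check can be sketched in base R by counting the distinct sample types within each level of grouping. The sketch assumes that sampletype and grouping are column names in sd. valid_grouping_sketch is a hypothetical name and not the package's implementation.

```r
# Hypothetical sketch of the validity check; assumes sampletype and grouping
# are column names in the sample data frame sd.
valid_grouping_sketch <- function(sd, sampletype, grouping) {
  types <- as.character(sd[[sampletype]])
  groups <- as.character(sd[[grouping]])
  # Number of distinct sample types observed within each group.
  n_types <- tapply(types, groups, function(x) length(unique(x)))
  all(n_types == 1)
}

# Valid: each subject has a single status.
ok <- data.frame(
  subject = c("s1", "s1", "s2", "s2"),
  status = c("case", "case", "control", "control")
)
# Invalid: subject s1 appears with two different statuses.
bad <- data.frame(
  subject = c("s1", "s1", "s2", "s2"),
  status = c("case", "control", "control", "control")
)
ok_result <- valid_grouping_sketch(ok, "status", "subject")
bad_result <- valid_grouping_sketch(bad, "status", "subject")
```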