% Generated by roxygen2 (4.1.0): do not edit by hand
% Please edit documentation in R/combinatorialSignificance.R
\name{combinatorialSignificance}
\alias{combinatorialSignificance}
\title{Compute the statistical significance of each replicate combination}
\usage{
combinatorialSignificance(ExpressionSet, replicates,
  TestStatistic = "FlatLineTest", permutations = 1000, parallel = FALSE)
}
\arguments{
\item{ExpressionSet}{a standard PhyloExpressionSet or DivergenceExpressionSet object.}

\item{replicates}{a numeric vector storing the number of replicates within each developmental stage or experiment.
If \code{replicates} stores only one value, the function assumes that each developmental stage or experiment
has the same number of replicates.}

\item{TestStatistic}{a string defining the type of test statistic used to quantify the statistical significance of the present phylotranscriptomic pattern.
Possible values are: \code{TestStatistic} = "FlatLineTest" : Statistical test for the deviation from a flat line.
\code{TestStatistic} = "ReductiveHourglassTest" : Statistical test for the existence of an hourglass shape (high-low-high pattern).}

\item{permutations}{a numeric value specifying the number of permutations to be performed for the \code{\link{FlatLineTest}} or \code{\link{ReductiveHourglassTest}}.}

\item{parallel}{a boolean value specifying whether parallel processing (multicore processing) shall be performed.}
}
\value{
a numeric vector storing the p-values returned by the underlying test statistic for all possible replicate combinations.
}
\description{
If a PhyloExpressionSet or DivergenceExpressionSet stores replicates for each
developmental stage or experiment, this function computes the p-values quantifying
the statistical significance of the underlying pattern for all combinations of replicates.
}
\details{
The intention of this analysis is to validate that no combination of replicates
(among all possible combinations of replicates) results in a non-significant pattern
when the initial pattern with combined replicates was shown to be significant.

A small example:

Assume a PhyloExpressionSet stores 3 developmental stages with 3 replicates measured for each stage.
The 9 replicates in total are denoted as: 1.1, 1.2, 1.3, 2.1, 2.2, 2.3, 3.1, 3.2, 3.3. The function then computes the
statistical significance of each pattern derived from the corresponding combination of replicates, e.g.

\itemize{
\item 1.1, 2.1, 3.1 -> p-value for combination 1

\item 1.1, 2.2, 3.1 -> p-value for combination 2

\item 1.1, 2.3, 3.1 -> p-value for combination 3

\item 1.2, 2.1, 3.1 -> p-value for combination 4

\item 1.2, 2.2, 3.1 -> p-value for combination 5

\item 1.2, 2.3, 3.1 -> p-value for combination 6

\item 1.3, 2.1, 3.1 -> p-value for combination 7

\item 1.3, 2.2, 3.1 -> p-value for combination 8

\item 1.3, 2.3, 3.1 -> p-value for combination 9

\item \dots
}
This procedure yields 27 p-values for the \eqn{3^3} (\eqn{n_{replicates}^{n_{stages}}}{n_replicates^n_stages}) replicate combinations.
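
As a worked illustration of this count: assuming, as in the example above, that each combination picks exactly one replicate per stage, the number of combinations for unequal replicate numbers is the product of the per-stage replicate counts. The symbols \eqn{s} (number of stages) and \eqn{r_i} (number of replicates in stage \eqn{i}) are introduced here for illustration only.

\deqn{\prod_{i=1}^{s} r_i = r_1 \cdot r_2 \cdots r_s}{r_1 * r_2 * ... * r_s}

For example, \code{replicates = c(2,3,2)} would give \eqn{2 \cdot 3 \cdot 2 = 12}{2 * 3 * 2 = 12} combinations, and with the same number of replicates in every stage the product reduces to the power shown above.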

Note that with a large number of stages/experiments and a large number of replicates,
the number of p-values to compute, and hence the computation time, grows as \eqn{n_{replicates}^{n_{stages}}}{n_replicates^n_stages}. For 11 stages and 4 replicates, \eqn{4^{11} = 4194304}{4^11 = 4194304} p-values have to be computed.
Each p-value computation is itself based on a permutation test running with 1000 or more permutations, so this can take a considerable amount of time.

The p-value vector returned by this function can then be plotted to see
whether a critical value \eqn{\alpha} is exceeded or not (e.g. \eqn{\alpha = 0.05}).


The function receives a standard PhyloExpressionSet or DivergenceExpressionSet object and a vector storing the number of replicates present in each stage or experiment. Based on these arguments, the function computes all possible replicate combinations using the \code{\link{expand.grid}} function and performs a permutation test (either a \code{\link{FlatLineTest}} or a \code{\link{ReductiveHourglassTest}}) for each replicate combination. The \code{permutations} argument specifies the number of permutations performed for each permutation test. Once all p-values are computed, a numeric vector storing the p-value for each replicate combination is returned.

In other words, for each replicate combination present in the PhyloExpressionSet or DivergenceExpressionSet object, the TAI or TDI pattern of the corresponding replicate combination is tested for its statistical significance based on the underlying test statistic.

This function is also able to perform all computations in parallel using multicore processing. The underlying statistical tests are written in C++ and optimized for fast computations.
}
\examples{
# load a standard PhyloExpressionSet
data(PhyloExpressionSetExample)

# we assume that the PhyloExpressionSetExample
# consists of 3 developmental stages
# and 2 replicates for stage 1, 3 replicates for stage 2,
# and 2 replicates for stage 3
# FOR REAL ANALYSES PLEASE USE: permutations = 1000 or 10000
# BUT NOTE THAT THIS TAKES MUCH MORE COMPUTATION TIME
p.vector <- combinatorialSignificance(PhyloExpressionSetExample,
                                      replicates = c(2,3,2),
                                      TestStatistic = "FlatLineTest",
                                      permutations = 10, parallel = FALSE)
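
# A minimal sketch (an illustration, not the package's internal code)
# of how the replicate combinations described in the Details section
# can be enumerated with expand.grid(). The object names 'rep.counts'
# and 'combos' are hypothetical and only used for this illustration.
rep.counts <- c(2, 3, 2)

# one column per stage, one row per combination of replicate indices
combos <- expand.grid(lapply(rep.counts, seq_len))

# first few replicate combinations
head(combos)

# number of combinations: the product of the replicate counts
nrow(combos)
prod(rep.counts)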
}
\author{
Hajk-Georg Drost
}
\references{
Drost et al. 2015, Evidence for active maintenance of phylotranscriptomic hourglass patterns in animal and plant embryogenesis.
}
\seealso{
\code{\link{expand.grid}}, \code{\link{FlatLineTest}}, \code{\link{ReductiveHourglassTest}}
}

